Data Science with R: Text Mining
Graham Williams
9th June 2014

Visit http://onepager.togaware.com/ for more OnePageRs.

Text Mining or Text Analytics applies analytic tools to learn from collections of text documents such as books, newspapers, and emails. The goal is similar to humans learning by reading books. Using automated algorithms we can learn from massive amounts of text, far more than a human could read. The material could consist of millions of newspaper articles, which we might summarise to find the main themes and to identify the articles of most interest to particular people.

The required packages for this module include:

    library(tm)            # Framework for text mining.
    library(SnowballC)     # Provides wordStem() for stemming.
    library(RColorBrewer)  # Generate palette of colours for plots.
    library(ggplot2)       # Plot word frequencies.
    library(Rgraphviz)     # Correlation plots.

As we work through this chapter, new R commands will be introduced. Be sure to review each command's documentation and understand what the command does. You can ask for help using the ? command, as in:

    ?read.csv

We can obtain documentation on a particular package using the help= option of library():

    library(help=rattle)

This chapter is intended to be hands-on. To learn effectively, you are encouraged to have R running (e.g., RStudio) and to run all the commands as they appear here. Check that you get the same output, and that you understand the output. Try some variations. Explore.

http://onepager.togaware.com/TextMiningO.pdf
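Since the commands in this module only run once the packages listed above are installed, it can help to check for them first. The following is a minimal sketch, not part of the original module: the variable names `required`, `installed`, and `not_installed` are illustrative, and it relies on base R's requireNamespace() to test for a package without attaching it.

```r
# Packages listed as required for this module.
required <- c("tm", "SnowballC", "RColorBrewer", "ggplot2", "Rgraphviz")

# requireNamespace() returns TRUE if the package can be loaded and FALSE
# otherwise; quietly = TRUE suppresses the usual startup messages.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Report any packages that still need to be installed.
not_installed <- required[!installed]
if (length(not_installed) > 0) {
  message("Not installed: ", paste(not_installed, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

Packages reported as missing can usually be installed with install.packages(). Rgraphviz is distributed through Bioconductor rather than CRAN, so it typically needs Bioconductor's own installation procedure.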