Hi folks, please install Java on your machines. It is needed for rJava and the text analysis packages to run. Also, you'll need a Twitter account to use twitteR. Just saying.
Announcement 1:
Your practice exam is up on LMS. No solution will be provided. Expect a similar Q pattern in the end-term.
Announcement 2:
The famous Wired magazine article 'The Long Tail' by Chris Anderson is the reading for Session 10. It's an excellent article on a new economic paradigm enabled by technology. Please read it before you come. I will discuss both the McKinsey article pre-read for Session 9 and this one in Session 10. Also, a pre-read quiz may be lurking round the corner...
********************************************
Hi all,
This is a hurried post, and will deal mainly with the HW part of session 9.
There are two parts to the Session 9 HW. Part 1 deals with plain descriptive analysis of the Session 2 survey data. Part 2 involves extracting web data from Amazon product reviews and then analyzing it. Here we go:
HW part 1:
- First ensure all the packages required are installed. If you're having trouble, given the paucity of time, run your HW off a friend's machine.
- Open 'Session 9 HW part 1 R code.txt'.
- Open the Excel file 's2 survey data for s9 HW.xls'.
- The first worksheet 'Questions reference' gives the 4 Qs and associated Q numbers that you'd answered.
- The second worksheet contains your data itself as 4 columns (one for each Q)
- Copy-paste each column's data into Notepad and save it as a .txt file. Since survey data from websurveys routinely come in the form of .csv files, the point here is to get you to go from .csv to .txt to R.
- Read the Notepad file into R and run the relevant R code on it.
- Make a corpus-level wordcloud and dendrogram, and see if any broad themes emerge from them. Write a few lines on these.
- Repeat the exercise with the same code for the other 3 columns
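For orientation, here is a rough sketch of the Part 1 flow (text file to term counts to wordcloud and dendrogram). It is not the code in 'Session 9 HW part 1 R code.txt'. The sample responses, the temporary file and the tiny stopword list are made up for illustration, and base R stands in for whatever text packages the course code uses.

```r
# Hypothetical stand-in for one survey column saved from Notepad as a .txt file
responses <- c(
  "I like the camera and the battery life",
  "Battery life is poor but the camera is great",
  "Great screen, great camera",
  "The price is too high for the battery you get"
)
txt_file <- tempfile(fileext = ".txt")
writeLines(responses, txt_file)

# Read the text file into R, one response per element
docs <- readLines(txt_file)

# Basic cleaning: lower-case, replace non-letters with spaces, split on spaces
clean  <- trimws(gsub("[^a-z ]", " ", tolower(docs)))
tokens <- strsplit(clean, " +")

# Drop a tiny, purely illustrative stopword list
stopwords <- c("i", "the", "and", "is", "but", "for", "you", "too")
tokens <- lapply(tokens, function(w) w[!(w %in% stopwords)])

# Corpus-level term frequencies (these drive the wordcloud)
freq  <- sort(table(unlist(tokens)), decreasing = TRUE)
terms <- names(freq)

# Term-by-document count matrix (rows = terms, columns = responses)
tdm <- vapply(
  tokens,
  function(w) as.integer(table(factor(w, levels = terms))),
  integer(length(terms))
)
rownames(tdm) <- terms

# Dendrogram: hierarchical clustering of terms by their usage across responses
hc <- hclust(dist(tdm))
plot(hc, main = "Term dendrogram (illustrative data)")

# Wordcloud, drawn only if the 'wordcloud' package happens to be installed
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(terms, as.numeric(freq), min.freq = 1)
}
```

To repeat this for another column, point `txt_file` at that column's saved .txt file instead of the temporary one.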
HW part 2:
- Open 'Session 9 HW part 2 R code.txt'.
- Check the URL given in the code: does it work, and what page opens in a browser?
- Run the code given to extract the data from the webpage
- Check that the saved data look OK.
- Now perform the downstream analysis as given in the code sent.
- Execute the R code line by line, reading the commentary given, to get a sense of the analysis flow.
- Make a corpus-level wordcloud and dendrogram, and see if any broad themes emerge from them. Write a few lines on these.
- Segment the reviewers.
- Run sentiment analysis for positive and negative words by segment.
- What themes emerge in the sentiment wordclouds by segment? Are the segments distinct in their needs and focus?
- Optional, but additional points for trying: Install twitteR.
- Search for "xbox" on Twitter. Collect some 100 tweets.
- Run descriptive analysis on the tweets.
- Submission guidelines:
- Paste the plots and your analysis on PPT and submit in the relevant dropbox.
- Name the PPT session9HW_yourname.ppt
- Title slide should contain your name and PGID
- Submission deadline is this Thursday midnight.
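For the segmentation and sentiment steps in HW part 2, a minimal sketch of the idea follows. It is not the code sent for the HW. The reviews, the two word lists and the choice of k-means with two segments are all assumptions made for illustration, and the actual HW code may segment and score differently.

```r
# Hypothetical reviews standing in for the extracted Amazon review text
reviews <- c(
  "great sound and great battery",
  "battery died fast, terrible battery",
  "love the sound, excellent bass",
  "poor battery and bad charger",
  "excellent sound, love it",
  "terrible charger, bad battery life"
)

# Tiny illustrative sentiment word lists (assumed, not a real lexicon)
pos_words <- c("great", "love", "excellent")
neg_words <- c("terrible", "poor", "bad")

# Tokenize: lower-case, replace non-letters with spaces, split on spaces
clean  <- trimws(gsub("[^a-z ]", " ", tolower(reviews)))
tokens <- strsplit(clean, " +")
terms  <- sort(unique(unlist(tokens)))

# Document-by-term count matrix (rows = reviewers, columns = terms)
dtm <- t(vapply(
  tokens,
  function(w) as.integer(table(factor(w, levels = terms))),
  integer(length(terms))
))
colnames(dtm) <- terms

# Segment the reviewers with k-means on their word usage (k = 2 is an assumption)
set.seed(1)
segment <- kmeans(dtm, centers = 2, nstart = 10)$cluster

# Count positive and negative words per review, then total them by segment
pos <- vapply(tokens, function(w) sum(w %in% pos_words), integer(1))
neg <- vapply(tokens, function(w) sum(w %in% neg_words), integer(1))
by_segment <- rowsum(cbind(positive = pos, negative = neg), group = segment)
print(by_segment)
```

The positive and negative words found within each segment are what would feed the per-segment sentiment wordclouds.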
Sudhir