Hi folks, please install Java on your machines. It is needed for rJava and the text analysis packages to run. Also, you'll need a Twitter account to use twitteR. Just saying.
Announcement 1:
Your practice exam is up on LMS. No solution will be provided. Expect a similar Q pattern in the end-term.
Announcement 2:
The famous Wired magazine article 'The Long Tail' by Chris Anderson is the reading for Session 10. It's an excellent article on a new economic paradigm enabled by technology. Please read it before class; I will discuss both the McKinsey article (the pre-read for Session 9) and this one in Session 10. Also, a pre-read quiz may be lurking round the corner...
********************************************
Hi all,
This is a hurried post, and will deal mainly with the HW part of session 9.
There are two parts to the Session 9 HW. Part 1 deals with plain descriptive analysis of the Session 2 survey data. Part 2 involves extracting web data from Amazon product reviews and then analyzing it. Here we go:
HW part 1:
- First ensure all the packages required are installed. If you're having trouble, given the paucity of time, run your HW off a friend's machine.
- Open 'Session 9 HW part 1 R code.txt'.
- Open the Excel file 's2 survey data for s9 HW.xls'.
- The first worksheet 'Questions reference' gives the 4 Qs and associated Q numbers that you'd answered.
- The second worksheet contains your data itself as 4 columns (one for each Q).
- Copy-paste each column's data into Notepad and save it as a .txt file. Since survey data from web surveys routinely come in the form of .csv files, the point here is to get you to go from .csv to .txt to R.
- Read the .txt file into R and run the relevant R code on it.
- Make a corpus-level wordcloud and dendrogram, and see if any broad themes emerge from them. Write a few lines on these.
- Repeat the exercise with the same code for the other 3 columns.
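In case it helps to see the shape of the part 1 flow (text file to R to word counts to dendrogram), here is a minimal base-R sketch. It is not the code in 'Session 9 HW part 1 R code.txt'. The sample responses, the temp file and the tokenize() helper are all made up for illustration, and it uses plain word counts and hclust() rather than the packages the HW code relies on.

```r
# Hypothetical answers standing in for one survey column saved from Notepad
responses <- c(
  "I like the course because of the hands-on R sessions",
  "More R sessions and fewer slides please",
  "The course readings are long but the sessions are useful"
)
txt_file <- tempfile(fileext = ".txt")
writeLines(responses, txt_file)

# Read the text file into R, one response per element
docs <- readLines(txt_file)

# Lower-case, replace non-letters with spaces, split on whitespace,
# and keep only words longer than 2 characters
tokenize <- function(x) {
  x <- tolower(gsub("[^[:alpha:][:space:]]", " ", x))
  words <- unlist(strsplit(x, "[[:space:]]+"))
  words[nchar(words) > 2]
}
tokens <- lapply(docs, tokenize)
vocab <- sort(unique(unlist(tokens)))

# Term-document matrix: one row per word, one column per response
tdm <- sapply(tokens, function(w) as.vector(table(factor(w, levels = vocab))))
rownames(tdm) <- vocab

# Corpus-level word frequencies (what a wordcloud would size words by)
freq <- sort(rowSums(tdm), decreasing = TRUE)
print(head(freq))

# Cluster words by how they are spread across responses and draw the dendrogram
hc <- hclust(dist(tdm))
plot(hc, main = "Word dendrogram (illustrative)")
```

With real survey data you would point readLines() at your own saved .txt file instead of the temp file, and you would probably want to drop common stop words before counting.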
HW part 2:
- Open 'Session 9 HW part 2 R code.txt'.
- Check the URL given in the code: does it work, and what page opens in a browser?
- Run the code given to extract the data from the webpage.
- Check that the saved data look OK.
- Now perform the downstream analysis as given in the code sent.
- Execute the R code line by line, reading the commentary given, to get a sense of the analysis flow.
- Make a corpus-level wordcloud and dendrogram, and see if any broad themes emerge from them. Write a few lines on these.
- Segment the reviewers.
- Run sentiment analysis for positive and negative words by segment.
- What themes emerge in the sentiment wordclouds by segment? Are the segments distinct in their needs and focus?
- Optional, but additional points for trying: install twitteR.
- Search for "xbox" on Twitter and collect some 100 tweets.
- Run descriptive analysis on the tweets.
- Submission guidelines:
- Paste the plots and your analysis on PPT and submit in the relevant dropbox.
- Name the PPT session9HW_yourname.ppt
- Title slide should contain your name and PGID
- Submission deadline is this Thursday midnight.
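For the 'segment the reviewers' and 'sentiment by segment' steps in part 2, here is a rough base-R sketch of the idea. It is not the code sent for the HW: the four reviews, the tiny positive and negative word lists, and the choice of hclust() with 2 segments are all assumptions made for illustration.

```r
# Hypothetical reviews standing in for the scraped Amazon data
reviews <- c(
  "great graphics and great games",
  "love the games, excellent graphics",
  "terrible battery and poor support",
  "poor support, battery died, awful"
)

# Tiny made-up sentiment lexicons; a real analysis would use much longer lists
pos_words <- c("great", "love", "excellent")
neg_words <- c("terrible", "poor", "awful")

# Lower-case, replace non-letters with spaces, split on whitespace
tokenize <- function(x) {
  x <- tolower(gsub("[^[:alpha:][:space:]]", " ", x))
  words <- unlist(strsplit(x, "[[:space:]]+"))
  words[nchar(words) > 0]
}
tokens <- lapply(reviews, tokenize)
vocab <- sort(unique(unlist(tokens)))

# Review-by-word count matrix: one row per review, one column per word
dtm <- t(sapply(tokens, function(w) as.vector(table(factor(w, levels = vocab)))))
colnames(dtm) <- vocab

# Segment reviewers by the words they use (assumed method: hierarchical clustering)
segment <- cutree(hclust(dist(dtm)), k = 2)

# Count positive and negative words per review, then total them by segment
count_hits <- function(w, lexicon) sum(w %in% lexicon)
pos <- sapply(tokens, count_hits, lexicon = pos_words)
neg <- sapply(tokens, count_hits, lexicon = neg_words)
by_segment <- rowsum(cbind(positive = pos, negative = neg), group = segment)
print(by_segment)
```

The rows of by_segment are the segments and the columns are the positive and negative word totals; comparing which words drive each total is what the sentiment wordclouds by segment show visually.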
Sudhir
Hi Sir,
Is there a way to capture all the screens when we run the word clouds? Currently, if we click on the screen, it moves to the next screen.
Hi Arvind,
I use 'print screen' and paste the screen into MS Paint, from where I select the desired portion and crop the image.
Hope that helps.
Sudhir
Hi Prof,
Can you please share details on how to access Twitter using R? I am unable to find instructions or the code/authentication details to capture the tweets.
Hi Arvind,
Please look up the classwork files for Session 9. Paste the twitteR code line by line. At one place it generates a URL, which you must copy-paste into a browser to obtain the passcode. Copy-paste the passcode into the R console to activate twitteR and continue.
I'll be in office today 10-12. Drop by in case it still doesn't work.
Sudhir