Tuesday, November 12, 2013

Session 9 Updates

Update:

Hi folks, please install Java on your machines. It is needed for rJava and the text analysis packages to run. Also, you'll need a Twitter account to use twitteR. Just saying.

Announcement 1:

Your practice exam is up on LMS. No solution will be provided. Expect a similar Q pattern in the end-term.

Announcement 2:

This famous article from Wired magazine, 'The Long Tail' by Chris Anderson, is the reading for Session 10. It's an excellent article on a new economic paradigm enabled by technology. Please read it before you come. I will discuss both the McKinsey article (the pre-read for Session 9) and this one in Session 10. Also, a pre-read quiz may be lurking round the corner...

********************************************

Hi all,

This is a hurried post, and will deal mainly with the HW part of session 9.

There are two parts to the Session 9 HW. Part 1 deals with plain descriptive analysis of the Session 2 survey data. Part 2 involves extracting web data from Amazon product reviews and then analyzing it. Here we go:

HW part 1:

  • First ensure all the packages required are installed. If you're having trouble, given the paucity of time, run your HW off a friend's machine.
  • Open 'Session 9 HW part 1 R code.txt'.
  • Open the Excel file 's2 survey data for s9 HW.xls'.
  • The first worksheet 'Questions reference' gives the 4 Qs and associated Q numbers that you'd answered.
  • The second worksheet contains the data itself as 4 columns (one for each Q).
  • Copy each column's data, paste it into Notepad, and save it. Since survey data from web surveys routinely come as .csv files, the point here is to get you to go from .csv to .txt to R.
  • Read the Notepad file into R and run the relevant R code on it.
  • Make a corpus-level wordcloud and dendrogram, and see if any broad themes emerge from them. Write a few lines on these.
  • Repeat the exercise with the same code for the other 3 columns.
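
For those who want to see the shape of the Part 1 flow before opening the HW code file, here is a minimal sketch in base R only. It is not the HW code itself: the sample responses, the temporary file, and the tiny stopword list are made-up stand-ins, and the HW code may well use text analysis packages instead. It reads a text file with one response per line, counts word frequencies (the numbers a wordcloud is drawn from), and clusters the terms for a dendrogram.

```r
# Hypothetical stand-in for one saved survey column. In the HW you would
# point readLines() at the Notepad file you saved instead.
responses <- c(
  "I want to learn data analysis with R",
  "Learn marketing analysis and data tools",
  "I hope to learn text analysis",
  "Marketing data and text tools"
)
tmp <- tempfile(fileext = ".txt")
writeLines(responses, tmp)

# Read the text file into R, one response (document) per line.
docs <- readLines(tmp)

# Lowercase, replace anything that is not a letter or space, split into words.
clean  <- gsub("[^a-z ]", " ", tolower(docs))
tokens <- strsplit(clean, " +")
tokens <- lapply(tokens, function(w) w[nchar(w) > 2])  # drop empty and very short words

# A tiny illustrative stopword list (an assumption, not the list used in class).
stopwords <- c("the", "and", "with", "want", "hope")
tokens <- lapply(tokens, function(w) w[!(w %in% stopwords)])

# Corpus-level word frequencies: the input a wordcloud needs.
freq <- sort(table(unlist(tokens)), decreasing = TRUE)
print(freq)

# Document-term matrix: one row per response, one column per term.
terms <- names(freq)
dtm <- t(vapply(tokens,
                function(w) as.integer(table(factor(w, levels = terms))),
                integer(length(terms))))
colnames(dtm) <- terms

# Cluster the terms by how they are spread across responses.
fit <- hclust(dist(t(dtm)), method = "average")
if (interactive()) plot(fit, main = "Term dendrogram (sketch)")

# If the wordcloud package is installed, the cloud itself could be drawn with:
# wordcloud::wordcloud(names(freq), as.integer(freq), min.freq = 1)
```

To repeat this for the other 3 columns, only the file passed to readLines() needs to change.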

HW part 2:

  • Open 'Session 9 HW part 2 R code.txt'.
  • Check the URL given in the code: does it work, and what page opens in a browser?
  • Run the code given to extract the data from the webpage.
  • Check that the saved data look OK.
  • Now perform the downstream analysis as given in the code sent.
  • Execute the R code line by line, reading the commentary given, to get a sense of the analysis flow.
  • Make a corpus-level wordcloud and dendrogram, and see if any broad themes emerge from them. Write a few lines on these.
  • Segment the reviewers.
  • Run sentiment analysis for positive and negative words by segment.
  • What themes emerge in the sentiment wordclouds by segment? Are the segments distinct in their needs and focus?
  • Optional, but with additional points for trying: install twitteR.
  • Search for "xbox" on Twitter. Collect some 100 tweets.
  • Run descriptive analysis on the tweets.

Submission guidelines:

  • Paste the plots and your analysis on PPT and submit in the relevant dropbox.
  • Name the PPT session9HW_yourname.ppt
  • Title slide should contain your name and PGID
  • Submission deadline is this Thursday midnight.
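
The segmentation and sentiment steps in HW part 2 can be sketched roughly as below, again in base R only. Everything here is an assumption for illustration: the five reviews are invented, the positive and negative word lists are toy lists, and hierarchical clustering with two segments is just one possible way to segment. The HW code file may do each step differently.

```r
# Hypothetical reviews standing in for the extracted web data.
reviews <- c(
  "great graphics and great games",
  "love the games, excellent graphics",
  "terrible price, poor value",
  "bad value and awful price",
  "excellent games but poor price"
)

# Toy sentiment word lists (assumptions, not the lists used in class).
pos_words <- c("great", "love", "excellent", "good")
neg_words <- c("terrible", "poor", "bad", "awful")

# Lowercase, strip punctuation, split each review into words.
tokens <- strsplit(gsub("[^a-z ]", " ", tolower(reviews)), " +")

# Document-term matrix over the non-trivial words.
terms <- sort(unique(unlist(tokens)))
terms <- terms[nchar(terms) > 2 & !(terms %in% c("and", "the", "but"))]
dtm <- t(vapply(tokens,
                function(w) as.integer(table(factor(w, levels = terms))),
                integer(length(terms))))
colnames(dtm) <- terms

# Segment the reviewers: cluster the reviews and cut the tree into 2 groups.
segment <- cutree(hclust(dist(dtm), method = "average"), k = 2)

# Count positive and negative words in each review.
pos_count <- vapply(tokens, function(w) sum(w %in% pos_words), integer(1))
neg_count <- vapply(tokens, function(w) sum(w %in% neg_words), integer(1))

# Total the sentiment word counts within each segment.
by_segment <- aggregate(cbind(positive = pos_count, negative = neg_count),
                        by = list(segment = segment), FUN = sum)
print(by_segment)
```

Comparing the positive and negative totals (and the words behind them) across rows of by_segment is one way to judge whether the segments really differ in what they care about.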

I shall update this post as more info (e.g., queries regarding the project or the end-term) comes in. Watch this space.

Sudhir

4 comments:

  1. Hi Sir,
    Is there a way to capture all the screens when we run the word clouds? Currently, if we click on the screen, it moves to the next screen.

    1. Hi Arvind,

      I use 'Print Screen' and paste the screenshot into MS Paint, where I select the desired portion and crop the image.

      Hope that helps.

      Sudhir

  2. Hi Prof,
    Can you please share details on accessing Twitter using R? I am unable to find instructions or the code/authentication details to capture the tweets.

    1. Hi Arvind,

      Please look up the classwork files for Session 9. Paste the twitteR code line by line. At one point it generates a URL, which you must copy-paste into a browser to obtain the passcode. Copy-paste the passcode into the R console to activate twitteR and continue.

      I'll be in the office today 10-12. Drop by in case it still doesn't work.

      Sudhir

