Thursday, November 28, 2013

Session 2 Updates and HW (Mohali)

Hi all,

Session Recap:
Session 2 got done today. We ventured into psychometric scaling and attempted to measure complex constructs using the Likert scale, among others. We also embarked on a common-sensical approach to survey design.

There were some Qs that "jump ahead" in the sense that I hope to cover them in Session 3 - Questionnaire Design. And there were Qs that seem to want to "force" a 'right' answer in a multiple-choice context. Well, one issue with a lot of MKTR is that it is context-sensitive, so it's hard to proclaim 'right' answers that will hold true in general. "It depends" is usually a better bet. Wherever possible I do try to point out general principles and frameworks, but in many cases the problem context decides whether something is true or not.

For example, "inferring scale reliability via analysis of demographic profiles" raised quite a few Qs [PsyScaling Q7]. Well, there are product categories where demographics alone can explain enough to evaluate scale relibility on their basis. But in an increasing number of product categories, demographics do not explain product choice very much. In such instances, its hard to conclude definitely about scale reliability on their basis. At least, that was the limited point, I was making.

Study Group Formation details:

  • Since there's no project, there's no project group. However, a number of homework activities are group-based, hence pls form HW or study groups.
  • Regarding group formation, pls send an email (only one per group) to the AA in the format prescribed.
  • To help in group formation, pls find the list of all final registrants for MKTR_146 on LMS.
  • If you are unable to find group partners and would like to be allotted to a group, let the AA know.
  • Choose a well-known brand as your group name and write your group name next to your PGID and name in the spreadsheet.
  • Pls complete the group formation exercise well before the start of session 3.

*******************************************

Pls read *any 2* of the articles from the business press below. The following 2 HWs are based on the two articles you choose.

HW Part 1: Reducing an M.P. to a D.P. to an R.O.

For each article,

  • Q.1.1. write a short description of what a mgmt problem (M.P.) may look like.
  • Q.1.2. Write one D.P. corresponding to the M.P.
  • Q.1.3. Write an example or two of R.O.s that correspond to the D.P.
HW Part 2: Construct Analysis

  • Q.2.1. List a few major constructs you find (if any) in each of the two articles that are of MKTR interest.
  • Q.2.2. Pick any one construct you have listed in Q.2.1. and break it down into a few aspects.
  • Q.2.3. Make a table with 2 columns. In the first column, write the names of the aspects you came up with. In the second column, corresponding to each aspect, write a Likert statement that you might use in a Survey Questionnaire to measure that aspect.

Session 2 HW submission format:

  • Use a plain white blank PPT.
  • On the title slide, write your name and PGID.
  • For slide headers, use format "HW1: [Article name]" (and so on for the next article chosen)
  • Pls mention clearly the Question numbers you are solving in the slide body. Use fresh slides for each new article
  • Use a blank slide to separate HW2 from HW1.
  • Save the slide deck as session2HW_yourname.ppt and put it in the dropbox on LMS before the start of session 4.
*******************************************

Session 2 HW part 3 - Survey filling

Pls complete the following two surveys latest by Sunday (01-Dec) midnight. I reckon it'll take you a max of 15 minutes per survey.

Note: Pls answer as truthfully as you can. The data will be used to illustrate MKTR tools and R's analysis prowess in the classroom. It will *not* be shared with anybody outside the classroom.

Survey 1 (for JSM and perceptual mapping)

Survey 2 (Standard psychographic Personality profile questionnaire)

*******************************************

Session 3 Preview:
In Session 3 we cover two broad topics. For the first, we continue the "principles of survey design" part and wade into Questionnaire Design Proper. Be sure to read the pre-read on Questionnaire Design as it covers the basics. It'll thus help lighten my load considerably in class. And who knows if there's another pre-reads quiz lurking somewhere in Session 3 as well...

For the second broad topic, we do Data Reduction via Factor Analysis. For this we'll need R. Two ways to get R:

(i) The easy way is to copy the .exe files for both R and Rstudio put up by Ankit on LMS. First install R by clicking on the .exe file and following the instructions. Then repeat the same for Rstudio.exe.

(ii) The second way is to directly download and install R from the following CRAN link, for both the Windows and Mac versions:

Download and Install R

Download and install Rstudio (only *after R has been installed)

If you have any trouble with R installation, contact IT and let me know. Watch this space for more updates.
That's it for now. See you in the next class.
Sudhir

Tuesday, November 26, 2013

Welcome Message and Session 1 Updates (Mohali)

Hi Co2014 @ Mohali,

Welcome to MKTR.

The first session got done today. We covered some subject preliminaries and the crucial task of problem formulation.

Reading for session 2:
"What is Marketing Research" (HBS note 9-592-013, Aug 1991) in the coursepack.

About pedagogy going forward:
The parts of the pedagogy that make MKTR distinctive: pre-read quick-checks using index cards, in-class reads, this blog, and R.

About this blog:
This is an informal blog that concerns itself solely with MKTR affairs. It's informal in the sense that it's not part of the LMS system and the language used here is more on the casual side. Otherwise, it's a relevant place to visit if you're taking MKTR. Pls expect MKTR course related announcements, R code, Q&A, feedback etc. on here.

Last year, students said that using both LMS and the blog (not to mention email alerts etc.) was inefficient and confusing. I'm hence making this blog the single-point contact for all relevant MKTR course info henceforth. LMS will be used only for file transfers, and email notifications will go out to the class only in emergencies. Each session's blog-post will be updated with later news coming at the top of the post. Kindly bookmark this blog and visit regularly for updates.

About Session 2:
Session 2 deals with psychographic scaling techniques and delves into the intricacies of defining and measuring "constructs" - complex mental patterns that manifest as patterns of behavior. This session also sets the stage for questionnaire design to make an entry in Session 3.

There will be a series of 2 homework assignments for session 2. These merely involve filling up surveys that I will send you (we will use this data later in the course for perceptual mapping and segmentation). Nothing too troublesome, as you can see.

Any Qs etc, pls feel free to email me or use the comments section below.

Sudhir Voleti

Saturday, November 16, 2013

Session 8 HW queries

Hi all,

Some session 8 HW related queries that IMHO merit wider dissemination...

Satish Wrote:

Dear Professor,
I am having trouble interpreting the output of the factor regression and was wondering whether you could help me understand it better...

I understand that we use the factor regression for categorical variables. But in the Session 8 HW, the quant, qualitative etc are not categorical variables but we are forcing them to be categorical – correct? I didn’t understand why we were doing this.. (eg: summary(lm(overall ~ factor(quant1) +factor(quali1) +factor(R1) +factor(HWs1) +factor(blog1))))

Also, how does the interpretation of the results from the factor regression differ from that of regular regression? For example, what does each beta coefficient mean in a factor regression? I understand that ‘high’ is the reference in each of the factors but what exactly does it mean when we say that (for example) increasing the factor(quant) low would decrease the overall rating? (as shown by the negative sign)

Could you please elaborate? Thanks
Best Regards
Satish

My response:

Hi Satish (and Swati, who had a similar Q),

1. True that we use dummy variables (in R, the factor() function makes dummy 0/1 variables out of a categorical variable) for categorical or nonmetric variables, and that the raw data for the Co2014 feedback was metric.

2. The point of the HW was to get you to run a dummy variables regression anyway. I *discretized* the metric X variables into categorical X variables using a Hi/Med/Low scale. Normally we wouldn't do this - metric variables are generally much more informative than nonmetric factors - but for this HW, we did.

3. The interpretation for a factor regression is straightforward - take the High/Med/Low case. By default R chooses 1 of the 3 categories (typically the first, High) as the reference, sets it to zero and measures the effect of the other two factor levels (Med and Low) against this zero baseline. If Med and Low have a higher impact than High, their coefficients are positive; if a lower impact, they are negative; and if about the same impact as High, they come out insignificant.

4. Changing the reference makes no difference to the rest of the regression; it only moves the baseline up or down. For example, if you were to make Low the reference, just subtract the Low coefficient from the coefficients of all three levels (High, Medium and Low) and you have your new set of coefficients.

To test this, just tweak the code slightly: replace 'factor(quali1)' with 'relevel(factor(quali1), ref = "Low")' in the code and then rerun the analysis. Note what happens to the coeffs, and to the overall fit in R-squared terms etc.
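
Here's a minimal sketch of that tweak, just to make it concrete (the data frame and column names below are placeholders - substitute the ones from the HW file):

mydata$quali1 = relevel(factor(mydata$quali1), ref = "Low") # make "Low" the baseline level

summary(lm(overall ~ quali1, data = mydata)) # coefficients are now measured relative to "Low"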

Hope that helps.

Anupama writes:

I have one more query –
I am unable to interpret negative coefficients in the variant of regression when you introduced categories of independent variables in the assignment.

If the overall qualitative rating is on a scale of 1-9….should I understand categories as - Low -> 1-3; Med -> 4-6; High -> 7-9 ?
With the above understanding, should I interpret negative coefficient for qual1-Low as ….
‘decrease in low qualitative rating increase overall rating’ => ‘Increase in quality rating increases overall rating’ ?

Overall implications => Professor should give high importance to qualitative material and low level of quantitative material and rest of the factors(like HW, blog) are not significant enough to affect overall rating?

Please let me know if my above understanding is correct.

Also, it would be great if we can addendum to the Session 8 and provide solution to this assignment.
It would help us in preparation for end-term exam.

My response:

Hi Anupama,

This is correct:

‘decrease in low qualitative rating increase overall rating’ => ‘Increase in quality rating increases overall rating’ ?

The High/Med/Low categories were chosen, I think, based on this rule: High (Low) if the score is more than one stdev above (below) the mean; the rest are Medium.
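
For the curious, a minimal sketch of that rule in R (x is any metric ratings vector; the function name is mine, not necessarily the one used in the HW code):

discretize = function(x){ # High/Med/Low by the one-stdev rule

out = rep("Medium", length(x))

out[x > mean(x) + sd(x)] = "High"

out[x < mean(x) - sd(x)] = "Low"

factor(out) } # discretize() func ends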

Have received a few more such queries; will write a blog post and share my responses.

P.S.
Will put up the session 8 HW solution (actually, a few exemplarily good submissions) on LMS.

Friday, November 15, 2013

Project related Mailbag, Q&A

Hi all,

This post will be a general-purpose one for project-related Q&A (the earlier one was for end-term Q&A).

Priyo wrote to me with these Qs:

Hi Prof,

Requesting some clarity regarding further scope of work for the project submission.

Data Collection Guidelines—sample size, demographic, etc.
Analysis and flow of slides.
Scope of conclusion.
Read the grading criteria on you blog post but couldn’t glean much regarding the above.

Thanks,
Priyo

My response:

Hi Priyo,

>> Data Collection Guidelines—sample size, demographic, etc.

The data collection guidelines are, well, flexible. Am not expecting rigorous adherence to sample size requirements for instance. I'd say, with 4 people in a group, each person collecting some 10-15 survey responses is quite enough.

Regarding demographic, ideally go for offerings targeted at the most abundant demographic in ISB - upper middle class urban youth in the mid to late 20s.

>> Analysis and flow of slides.

These should be, in one word, 'common-sensical'. There's no right or wrong way, just context-dependent implementation, I guess.

>> Scope of conclusion.

Depends on the scope of the DP and ROs. If the ROs are confirmatory, then yes, a yes/no kind of clear decision recommendation would be nice.

If exploratory, mere pointers or indications for further investigation are typically deemed sufficient.

Hope that helps.

Sudhir

P.S.
Will put this up on the blog for wider dissemination.

P.S. watch this space for more such updates. More recent updates at the top of the post.

********************************

Update: Incorporating social network analysis via R for MKTR insights:

I got an interesting Q from a student working on twitteR, on whether and how text analytics relates to social network analysis in general. So I started digging around... And recently discovered that R has a full suite of social media mapping and network analysis applications.

Yes, text analytics is merely the tip of the iceberg. R can go far deeper and far higher (at the same time) than merely text analytics. Now let's talk in terms of *networks* - vertices (or nodes) and edges signifying relations between the nodes...

Am going to demo social network analysis 101 on R using your course feedback 'overall feedback.txt' and the names of the associated students in 'names.txt' (see LMS). Social network analysis would be a full lecture (or perhaps even a full course) by itself, but the major take-aways can be skimmed through rather quickly, I reckon.

Try these classwork examples at home, maybe you may want to do this for your project?

# read-in data first

names = read.table(file.choose()) # 'names.txt'

x = readLines(file.choose()) # 'overall feedback.txt'

library(tm) # tm provides Corpus(), TermDocumentMatrix() etc.

library(RWeka) # RWeka provides NGramTokenizer()

x1 = Corpus(VectorSource(x)) # build corpus

ngram <- function(x1) NGramTokenizer(x1, Weka_control(min = 1, max = 2))

tdm0 <- TermDocumentMatrix(x1, control = list(tokenize = ngram,
tolower = TRUE,
removePunctuation = TRUE,
removeNumbers = TRUE,
stopwords = TRUE,
stemming = TRUE)) # note: tm's control option is 'stemming'. Patience, this takes a minute.
# remove columns with zero sums

dim(tdm0); a0 = NULL;

for (i1 in 1:ncol(tdm0)){ if (sum(tdm0[, i1]) == 0) {a0 = c(a0, i1)} }

if (length(a0) >0) { tdm1 = tdm0[, -a0]} else {tdm1 = tdm0}; dim(tdm1)

inspect(tdm1[1:5, 1:10]) # to view elements in tdm1, use inspect()

# convert tdms to dtms
# dtm weighting from Tf to TfIdf (term freq Inverse Doc freq)
dtm0 = t(tdm1) # docs are rows and terms are cols
dtm = tfidf(dtm0) # new dtm with TfIdf weighting; tfidf() is from the session code - tm's weightTfIdf() is the standard equivalent

# rearrange terms in descending order of TfIDF and view

a1 = apply(dtm, 2, sum); a2 = sort(a1, decreasing = TRUE, index.return = TRUE);

dtm01 = dtm0[, a2$ix]; dtm1 = dtm[, a2$ix];

The above analysis was standard - pretty much what we did in the classwork in session 9. What follows comes with a twist. We'll need to install the package 'igraph' for this.

What we do next is find 'relations' or connections between terms - for our context, I define a 'connection' between two terms as their intra-document co-occurrence, i.e. how often those terms occurred together in a document, across all docs in the corpus. Somewhat like a cluster dendrogram, I guess, but way cooler.

install.packages("igraph") # install once per comp

### --- making social network of top-40 terms --- ###

dtm1.new = inspect(dtm1[, 1:40]); # top-40 tfidf weighted terms

term.adjacency.mat = t(dtm1.new) %*% dtm1.new; dim(term.adjacency.mat)

## -- now invoke igraph and build a social network --

library(igraph)

g <- graph.adjacency(term.adjacency.mat, weighted = T, mode = "undirected")

g <- simplify(g) # remove loops

V(g)$label <- V(g)$name # set labels and degrees of vertices

V(g)$degree <- degree(g)

# -- now the plot itself

set.seed(1234) # set seed to make the layout reproducible

layout1 <- layout.fruchterman.reingold(g)

plot(g, layout=layout1)

You should see something like this. Click for larger image.

The image depicts connections between terms. Of course, one may say that social networks are built among *people*, not among terms.

OK. Sure.

So can we build one among people using a similar procedure? You bet. This time, we'd be connecting people using the common terms they used in the corpus. The code to do that is below:

### --- make similar network for the individuals --- ###

dtm2.new = inspect(dtm1[,]); dim(dtm2.new)

term.adjacency.mat2 = dtm2.new %*% t(dtm2.new); dim(term.adjacency.mat2)

rownames(term.adjacency.mat2) = as.matrix(names)
colnames(term.adjacency.mat2) = as.matrix(names)

g1 <- graph.adjacency(term.adjacency.mat2,
weighted = T, mode = "undirected");

g1 <- simplify(g1) # remove loops

V(g1)$label <- V(g1)$name # set labels and degrees of vertices
V(g1)$degree <- degree(g1)

# -- now the plot itself --

set.seed(1234) # set seed to make the layout reproducible

layout2 <- layout.fruchterman.reingold(g1)

plot(g1, layout=layout2)

And the result will look something like this:

Recall Session 9's segmentation exercise on ice-cream flavor comments? We were trying to cluster together respondents based on similarity or affinity in terms used. k-means scree plots were a poor way of judging the number of clusters. The above graph provides much better insight that way. Seems like there are two big clusters and some 2-3 smaller ones in the periphery.

Sure, important and interesting Qs arise from this kind of analysis.... For instance, "Who is the most representative commentator, i.e. one whose words best represent the class'?", "Who is best connected with majority opinion?" and so on. Marketers routinely ask similar Qs to try to detect "influencers" (along various metrics of node centrality etc). But again, that is a whole other session.
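
That said, igraph's built-in centrality measures give a quick first cut at such Qs on the g1 object built above - a hedged starting point, not a full treatment:

sort(degree(g1), decreasing = TRUE)[1:5] # the 5 best-connected commentators

sort(betweenness(g1), decreasing = TRUE)[1:5] # the 5 biggest 'bridges' between groups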

Sudhir

Wednesday, November 13, 2013

End-term updates

Hi all,

Update: R tutorial announcement:

22-Nov Friday 4-6 pm at AC8 LT (tentatively). Will get a venue confirmation and update here. Would be nice if every group is represented at the tutorial.

Sudhir

1. Exam related Cases:

There are two full-length cases in your course-pack: The 'Coop' case, and 'The Fashion Channel' case.

At least one of these two cases will feature in the end-term exam. I'd rather you not try to read the case for the first time in the exam hall. Pls read them at home beforehand.

Preferably, discuss them in a group on the lines of the following Qs:

  • What is the management problem? Describe also the major symptoms.
  • What are some of the likely decision problems (DPs) that emerge based on the management problem?
  • List a few research objectives (R.O.s) that emerge based on the DPs.

2. End-term exam pattern Notes

  • There are a total of 50 Qs, 2 marks each.
  • The Qs are broken down into 8 Question-sets, each having tables or figures and Qs based on them
  • The Qs are all short-answer - True/False, fill-in-the-blanks, write expression for ...., name these factors, type of stuff.
  • If any Q comes from any pre-read, the concerned pre-read will be specified in the Q itself. So bring your course-pack to the exam. Not all pre-reads are relevant, only the ones I've specifically asked you to read. Properly speaking, those pre-reads are a part of the course.
  • Nothing that was not covered in class will show up anywhere in the exam.
  • At least one question set relates to a full-length case (see above)
  • Time will not be a problem - you'll have 150 minutes for a 120 minute paper.

Pls use the comments section to this post for any Q&A so that it is visible to the class at large.

See you in class for our last classroom meeting. Any feedback you have on how to improve any aspect of the course etc. is welcome at any time.

3. R tutorial and R in your resume:

From the project point of view, if you want an R tutorial at any time between now and 24-Nov (when I leave for Mohali), pls let me know. A quorum of minimum 5 people must sign up for the tutorial, which can go as technical as the attendees want. Ideally, one person from each project group would attend.

If you want to include R in your resume - provided that you have made a good-faith attempt at installing and running R for your HWs, that you intend to continue to invest in R going forward, and that you have been able to read the code sent and get a "sense" of the analysis - pls consider amending and using any relevant subset of the following (as applicable):

  • State what you have done w.r.t R:
  • Have developed a familiarity with the R environment
  • Have used R to apply and analyze a wide variety of Mktg Research tools (from structured hypothesis testing to text analytics and social media analysis)
  • Have gained some understanding of the flexibility and extensibility of the system (installing and using packages and interfaces with external repositories)
  • Have analyzed a full credit course project on the platform
  • State what you will do going forward:
  • Intend to continue investing in and developing greater insights into the R platform
  • Are attracted by the low-cost, license-free unrestricted use terms and rapid-analysis capabilities of the open-source platform
  • Believe in R's promise of substantially expanding enterprise analytics capabilities while keeping a tight lid on costs
  • Believe in the philosophy of collaborative design, rapid rollout and innate scalability
  • Bottomline: are convinced of R's compelling cost/benefit calculus for delivering enormous value to the organization

Again, folks, ensure that *only* that subset that applies to you in a bona fide way is used for your statements of purpose, on the resume etc. Making false claims has a habit of coming back to bite you at inconvenient times.

Good luck for the exam and going fwd for the placements as well.

Sudhir

Tuesday, November 12, 2013

Session 9 Updates

Update:

Hi folks, pls install java on your machines. It's needed for rJava and the text analysis packages to run. Also, you'll need to have a twitter account to use twitteR. Just saying.

Announcement 1:

Your practice exam is up on LMS. No solution will be provided. Expect a similar Q pattern in the end-term.

Announcement 2:

This famous article from Wired magazine, 'The Long Tail' by Chris Anderson, is the reading for Session 10. It's an excellent article on a new economic paradigm enabled by technology. Pls read and come; I will discuss both the McKinsey article pre-read for Session 9 and this one in Session 10. Also, a pre-read quiz may be lurking round the corner...

********************************************

Hi all,

This is a hurried post, and will deal mainly with the HW part of session 9.

There are two parts to Session 9 HW. Part 1 deals with plain descriptive analysis of session 2 survey data. Part 2 involves extracting web data from amazon product reviews, and then analyzing it. Here we go:

HW part 1:

  • First ensure all the packages required are installed. If you're having trouble, given the paucity of time, run your HW off a friend's machine.
  • Open 'Session 9 HW part 1 R code.txt'.
  • Open excel file 's2 survey data for s9 HW.xls'.
  • The first worksheet 'Questions reference' gives the 4 Qs and associated Q numbers that you'd answered.
  • The second worksheet contains your data itself as 4 columns (one for each Q)
  • Copy-paste and save each column's data in a notepad (.txt) file. Since survey data from websurveys routinely comes in the form of .csv files, the point here is to get you to go from .csv to .txt to R.
  • Read the notepad file into R and run the relevant R code on it.
  • Make a corpus-level wordcloud and dendrogram, and see if any broad themes emerge from them. Write a few lines on these (a minimal sketch follows after this list).
  • Repeat the exercise with the same code for the other 3 columns
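
In case you want a quick sanity check before running the HW code, here's a minimal sketch of the wordcloud and dendrogram steps (the file name 'q1.txt' is just a placeholder for whichever column you saved; for the HW itself, use 'Session 9 HW part 1 R code.txt'):

library(tm); library(wordcloud) # install these first if needed

x = readLines(file.choose()) # pick the saved .txt file, e.g. 'q1.txt'

corp = Corpus(VectorSource(x)) # build corpus

tdm = TermDocumentMatrix(corp, control = list(tolower = TRUE, removePunctuation = TRUE, removeNumbers = TRUE, stopwords = TRUE))

freq = sort(rowSums(as.matrix(tdm)), decreasing = TRUE) # term frequencies

wordcloud(names(freq), freq, min.freq = 2) # corpus-level wordcloud

tdm1 = removeSparseTerms(tdm, 0.95) # drop very rare terms before clustering

plot(hclust(dist(scale(as.matrix(tdm1))))) # term dendrogram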

HW part 2:

  • Open 'Session 9 HW part 2 R code.txt'.
  • Check the URL given in the code - does it work - what page opens on a browser?
  • Run the code given to extract the data from the webpage
  • Check if the saved data looks OK, etc.
  • Now perform the downstream analysis as given in the code sent.
  • Execute the R code line by line, reading the commentary given, to get a sense of the analysis flow.
  • Make a corpus-level wordcloud and dendrogram, and see if any broad themes emerge from them. Write a few lines on these.
  • Segment the reviewers.
  • Run sentiment analysis for positive and negative words by segment.
  • What themes emerge in the sentiment wordclouds by segment? Are the segments distinct in their needs and focus?
  • Optional but additional points for trying: Install twitteR
  • Search for "xbox" on twitter and collect some 100 tweets (a minimal twitteR sketch follows after this list).
  • Run descriptive analysis on the tweets.
  • Submission guidelines:
  • Paste the plots and your analysis on PPT and submit in the relevant dropbox.
  • Name the PPT session9HW_yourname.ppt
  • Title slide should contain your name and PGID
  • Submission deadline is this Thursday midnight.
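
For the optional twitteR part, a minimal sketch (you must first authenticate with your own twitter credentials; the exact OAuth handshake call depends on your twitteR package version):

library(twitteR) # assumes twitteR is installed and you have authenticated

tweets = searchTwitter("xbox", n = 100) # collect ~100 tweets on "xbox"

tweets.df = twListToDF(tweets) # flatten the list into a data frame

writeLines(tweets.df$text, "xbox tweets.txt") # save the text for the same descriptive analysis as in part 1
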
Shall update this post as more info (e.g., queries regarding the project or the end-term) comes in. Watch this space.

Sudhir

Friday, November 8, 2013

Session 8 Updates

Hi all,

Session 8's two main topics - (i) hypothesis formulation and testing, and (ii) basic (regression) modeling - have been covered.

A recap of big picture session take-aways:

  • Assumptions, beliefs and conjectures about events of interest - the stuff of ideas, basically - underlie Hypotheses.
  • It's critical that the null and alternate hypotheses be defined carefully so that they present a mutually exclusive, exhaustive and logical range of events.
  • We make it hard to reject the null (i.e., status quo) hypothesis in order to minimize false positives (also called 'Type I error').
  • Tests of differences (t-tests, basically) and tests of association (chi-square tests, primarily) together account for a large proportion of MKTR hypothesis testing in practice.
  • Modeling underlies our every attempt to structure, order and interpret data.
  • Regression modeling is amongst the most common in practice due to its strengths, viz., prediction, description, control, existence and magnitude information, etc.
  • The principal regression variants - quadratic and higher order polynomial terms, log-log forms, interaction effects etc. - allow much flexibility in modeling dependence relations.
  • Last but not least, the ability to rapidly formulate and test conjectures, as well as interpret regression modeling results, is a core (i.e. non-outsourceable) managerial skill.

OK, having beaten the drum on why Session 8's topics are relevant, interesting and important, let me proceed to explaining the Classwork examples we saw in Hypothesis testing.

1. Hypothesis Testing, Select classwork Examples:

Copy-paste the code below. Pls don't do it all at once, but a few lines at a time, after reading the descriptive comments (following the '#' symbol).

rm(list = ls()) # clear workspace

mydata = read.table(file.choose(), header=TRUE)
# 'core course pref data.txt'

attach(mydata) # allows us to call columns in mydata by name

head(mydata) # view top few rows, just in case

summary(mydata) # see summary of all cols

# t-test for differences

# Q: Does preference for INVA equal 4?

t.test(INVA, mu = 4) # t.test() is the core func

# Q: Does preference for INVA exceed 4?

t.test(INVA, mu = 4, alternative = "greater")

# Q: is pref for MGTO > GSB?

t.test(MGTO, GSB, alternative = "greater")

# Q: Did women prefer SAIT more than men did?

xf = SAIT[(male == 0)]
xm = SAIT[(male == 1)]
t.test(xf, xm, alternative = "greater")

# Q: Did men and women prefer SAIT equally?

t.test(xf, xm)

Upon running the above code, each t.test() statement will output results. I will examine only a few here.

The hypothesis being tested is shown as a commented question in the code above. The example here is a one-sample one-tailed test: R's output states the alternative hypothesis, shows the mean of the quantity being tested and yields the p-value.

This is a 2-sample 1-tailed test. The result is again non-significant at 95%.

The above were tests of differences, which deal with metric data. Next, look at tests of association, which deal with relations between non-metric (i.e., nominal, ordinal or categorical) variables. [BTW, the two types of tests are connected. It is certainly possible to recast a test of difference as a test of association by changing the hypothesis and the form of the variables involved.]

# Q: Are higher than avg workex people mainly male? engineers?

# make a yes/no out of 'workex'

hi.workex = (workex > mean(workex)); summary(hi.workex)

mytable = table(engineer, hi.workex) # build crosstab of counts

mytable; chisq.test(mytable) # chisq.test() is the func

mytable = table(male, hi.workex) # build crosstab of counts

mytable; chisq.test(mytable)

Only one illustrative result is shown here, below:

A few Qs had come up regarding the t and chisq distributions. Well, these distributions are sensitive to degrees of freedom, so they implicitly account for adjustments required due to changing sample sizes. Still, the 'higher the sample size, the better the inference' mantra always holds.

2. Regression Modeling, Beer classwork example:

Will discuss only the simple regression model results for the beer dataset ('beer dataset.txt') here. Copy the code below line by line and paste it onto the R console.

mydata = read.table(file.choose(),header=TRUE)

dim(mydata); summary(mydata) # view summary of variables

head(mydata) # view a few data rows

attach(mydata) # enables calling variables by name

summary(lm(volsold ~ # lm() is linear model func, volsold is Y

price + distbn + promo + adspend1 # Mktg mix variables

+ factor(brand2) + bottle + light + amber + golden + lite + reg + sku.size # prod attributes

+ factor(month) )) # control variables, lm() closes

Upon running the code, we get the following results:

The table's columns are 'Estimate' == Coefficient, 'Std error' is, well, standard error, 't value' is the t-statistic computed as Estimate/Std error, and 'Pr(>|t|)' is the p-value.

Some things to note in the results (based on Qs received in class):

1. There are seven brands in the data but only six brand intercepts. The seventh, 'Amstel' is the reference brand and its coefficient is fixed to zero, a priori. Why does this happen? Why only 6 brands? Why fix one brand to zero - isn't that arbitrary?

2. Well, consider this: Suppose we had the gender in the X variables. We could have two columns - 'male' and 'female' represented with 1 and 0. But we can use only one of the two columns in the regression, not both, because male = 1-female and vice versa. In other words, the two columns are linearly dependent. Similarly, having the seventh brand in would make the right hand side of the above regression also linearly dependent and the analysis would not run at all.

3. Further, fixing one brand as the reference is not as arbitrary as it seems. Any brand can be chosen as the reference and the other brands' coefficients would adjust accordingly. For example, if we make Miller the reference brand, then we simply add 3.872*10^4 to the coefficients of all the brands (including Amstel). Then Miller would go to zero and the other brands would have values relative to Miller.

4. Given average values of the X variables, we can multiply them with the coefficient estimates and obtain the average Y value. This is called the predicted Y (or Y-hat). Similarly, we can manipulate some of the X variables and compute the hypothetical Y-hat for that X-vector. This facility is a neat and powerful advantage of the regression modeling approach (a small sketch follows after point 6 below).

5. However, be careful and don't stretch predictions beyond the limits of the regression. For instance, just because 'Promotions' has a positive effect on sales doesn't mean that if we increase promotions to 1000x, we'll end up with 1000x times 1.188*10^2 (the promotion coefficient) in sales... that's probably stretching the model way beyond its reasonable limits.

6. Can't reiterate enough how critical it is that tomorrow's managers in general, and Mktg ones in particular, be comfortable with the regression approach. Certainly expect an exam Q or two on this.
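
Here's the sketch promised in point 4 - a hedged illustration of computing Y-hat via predict(), on a deliberately simplified version of the beer model (the brand and month dummies are dropped purely to keep the example short):

fit = lm(volsold ~ price + distbn + promo + adspend1, data = mydata) # simplified model

newx = data.frame(price = mean(price), distbn = mean(distbn), promo = 1.1*mean(promo), adspend1 = mean(adspend1))

predict(fit, newdata = newx) # Y-hat when promo is 10% above its average, other Xs held at their means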

3. HW for Session 8:

Save the file 'feedback ratings.txt' and use the code given in 'session 8 HW R code.txt'. Solve the following Qs by interpreting analysis results:

Q1. Test of Differences: Test the hypotheses that Quant ratings are significantly (i) greater than quali ratings, (ii) greater than R ratings and (iii) about the same as overall ratings. Identify whether the tests you run are one-sample or two-sample, and one-tailed or two-tailed. Interpret the p-value for inference on significant differences.

For the next Q, do the following:

We now divide the sample into 3 groups - High, medium and low - where 'High' ratings are one standard deviation or more above the mean, 'Low' ratings are one stdev or more below the mean and the rest are in the 'Medium' zone. [I wrote a function in the code that'll do this part automatically]

Q2a. Test of Association: Suppose you conjecture that the people who rate Quant High (Low) also rate (i) R High (Low), (ii) the blog High (Low) and (iii) the HWs High (Low). Test this conjecture. Interpret the results.

Q2b. Some of the HWs are done on R and a lot of explanation for them can be found on the blog. Test the conjecture that folks who rate the HWs High (Low) also rate (i) the blog High (Low) and (ii) R High (Low).

Moving to the modeling section from the hypothesis testing one...

Q3. Basic Regression Modeling: Test the conjecture that the overall rating is a function of the component ratings (for quant, quali, R, blog and HWs). Remember to first write a conceptual model that relates the variables by name, then write an econometric model with the coefficients and the error term thrown in, and then, finally, run the code (an illustration of the conceptual vs. econometric distinction follows after Q3e).

Q3a. The overall test of significance (for the regression as a whole) is given by the F-statistic in the last line of output. Interpret whether the Y actually relates to the set of Xs chosen.

Q3b. The extent to which the X variables explain variation in Y is given by the multiple R squared. What is the extent of unexplained variation in Y in the above regression?

Q3c. Interpret the results (coefficients and inference) of the regression.

The following question is a variant of the regression model that uses only categorical variables in the RHS.

Q3d. Factor regression Modeling: Run a variant of the above regression model. Regress overall rating on the high/med /low categorization for the components ratings. That is, regress overall over quant1, quali1, R1 etc. Interpret the results.

Q3e. What practical implications arise for an instructor of MKTR from the above 2 regression results? What should he/she focus on and what should he/she de-emphasize? Is there sufficient information and evidence to warrant your conclusions? Write a few lines about this.
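
To illustrate the 'conceptual vs. econometric model' distinction asked for in Q3, here's a hedged example (the variable names below are placeholders - use the actual column names in 'feedback ratings.txt'):

# conceptual model: the overall rating depends on the quant, quali, R, blog and HW ratings

# econometric model: overall = b0 + b1*quant + b2*quali + b3*R + b4*blog + b5*HWs + error

# which R estimates via something like: summary(lm(overall ~ quant + quali + R + blog + HWs))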

The relevant data and code files for transfer will be on LMS shortly. The dropbox will be up soon. Sorry about the delay in getting this HW out; am caught up making the exam paper.

The deadline is next week Saturday, but as that'll be exam week, pls don't wait that long - finish it off, like, tomorrow and submit.

Sudhir

Session 7 Updates

Hi all,

Session 7 was about the Experimentation approach in MKTR. Some big picture insights:

  • Experiments are a very powerful confirmatory tool that can be applied in a variety of business situations.
  • Experiments are gaining widespread acceptability in business as the cost of conducting them drops and the benefits derived pile up (i.e., their ROI keeps rising)
  • The colloquial usage of the term 'experiment' often confuses people. In MKTR, and in research in general, 'true' experiments test a treatment against an equivalent control group to cancel out extraneous effects.
  • Experiments rely on logical hypotheses and measurable outcomes.
  • While web-based and services firms were the first to leverage this powerful tool, product based firms are finally getting into the act - testing innovations big and small is now commonplace in even FMCG and engineering goods firms.
  • Many firms go with pseudo- or quasi- experiments when the exacting conditions required for a true experiment may not be justified under the cost/benefit calculations.

Admittedly, this year I'm in general quite happy with my own time-management in class - most classes have had a fair amount of in-class Q&A time and have tended to end on time. However, Session 7 was quite a bit rushed towards the second half and I felt the conjoint portion, for once, could certainly have done with more time.

To make up, here's a lengthy, colloquial blog-post on how you may want to use metric conjoint analysis for your project, for example.

Suppose your project R.O. says:

R.O. - Find customers' attribute preferences for Breakfast Noodles

Since you're asked to find attribute preference (or attribute importance) in a bundle of attributes, this is a clear cut case for conjoint analysis application.

You determine through a qualitative study that the breakfast noodles product has five key attributes along which people tend to evaluate it: (i) Price, (ii) PackSize, (iii) Brand, (iv) whether there are special flavors or not and (v) whether it is 'vitamin fortified' or not. You can make the attributes and attribute-levels table in Excel for MEXL analysis thus:

Notice that while the attributes are clear cut, the use of "High", "medium/low" in the attribute levels is imprecise. Who knows how different respondents may view or understand it? Hence, if your product development and competition benchmarking is fairly advanced, then you should ideally put in hard numbers there as far as possible. For instance, see below:

Clearly, there are two vertically differentiated bundles {assuming people will generally tend to prefer a well-known international brand (Nestle) over a well-known Indian one ('Parle') over an unknown local one 'Desi'}

Now prepare a set of product bundles for respondents to evaluate. They shouldn't include the vertically differentiated bundles as far as possible, in order to better force trade-offs in the purchase decision. Say you choose 8 bundles to show:

The hard 'design' part is over; it's time to program the Qs you have into a websurvey. The bundle rating questions may look like this in qualtrics:

Ensure the bundles are presented to respondents in randomized order to avoid (or rather, to average out) any order effects.

Like I mentioned in class, metric conjoint is practically obsolete now. Firms have moved on to choice based conjoint (or CBC), in which you would present a bouquet of bundle options in each 'choice task' (e.g., like the one below) and the respondent makes a discrete choice - picks one bundle as most preferred in the bouquet. See below for how a CBC choice task might look when programmed into a web survey:

In the above figure, I couldn't figure out how to get Qualtrics to have both the image and the multiple choice question together in one question, so I did them as separate questions and grouped them together.

Alright, I hope that helps clarify the conjoint part of what is going on. Metric conjoint interpretation is fairly straightforward and MEXL's help is quite good, I hear. Should you require specific assistance in interpreting conjoint results for your projects etc., pls let me know and we can go over it individually.
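
And for the R enthusiasts: MEXL isn't the only route. Metric conjoint part-worths drop out of a simple dummy-variable regression of bundle ratings on attribute levels. Here's a minimal sketch with made-up numbers (the ratings and levels below are purely illustrative, loosely mirroring the noodles example; your real input would be each respondent's actual bundle ratings):

ratings = c(7, 5, 6, 3, 8, 4, 6, 2) # one respondent's ratings of 8 bundles (made up)

bundles = data.frame(price = factor(c(25, 35, 25, 35, 25, 35, 25, 35)),
brand = factor(c("Nestle", "Parle", "Desi", "Nestle", "Parle", "Desi", "Nestle", "Parle")),
flavor = factor(c("Special", "Regular", "Special", "Regular", "Regular", "Special", "Regular", "Special")))

summary(lm(ratings ~ price + brand + flavor, data = bundles)) # coefficients are part-worths relative to each reference level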

Sudhir

Wednesday, November 6, 2013

About the FGD Homework

Hi all,

Some quick news and views before I get to the FGD part:

1. On Session 7 Updates and Conjoint exercises:
  • Yesterday, in session 7, we covered an important and powerful MKTR tool - the Experimentation approach - and a famous special case - the conjoint experiment.
  • I ran into a bit of time trouble towards the end and felt that the Conjoint aspect (esp its MEXL implementation) could have done with some more time.
  • Hence, like I promised, a detailed write-up is coming up - on how one may use Metric conjoint in MEXL (for your projects perhaps?) to get to specific results and insights.
  • However, given that there's no HW for Session 7, that post'll have to wait till the weekend.

2.  On HW submissions and progress in general:
  • The AA and I are grading each HW component out of 10. This is an interim measure. After all the HWs are in, we'll normalize the scores and re-weight them to fit in with the HW % weight.
  • If you're interested in seeing your progress with respect to HW scores, kindly meet with the AA Mr Sreenath today during working hrs.
  • Any other concerns, queries etc you may have, pls feel free to write or talk to me (over the phone or in person).

3. Session 4 group HW - the FGD
  •  The group submission for the FGD piece has come in. I am going over the submissions and some interesting facets have come out.
  •  First off, great effort all, in pulling this off. A special mention to five teams: Disney, Nike, Fastrack, Brainjuicers and Happily unmarried - who also submitted snapshots/highlights of their FGD via Youtube links.
  • I think its worthwhile sharing these links (in no particular order) with the class and well, with the world.
Team Brainjuicer Arvind, Naveen, Richa and Swati have put up: https://www.dropbox.com/s/2pkkkhv440ci3sg/FGD-discussion1.mp4

Team Nike with members Srinayana, Apurv, Shaheer and Saptarshi have putup their link here: http://www.youtube.com/watch?v=uaDuMb6i7_o&feature=youtu.be

Team Fastrack with members Bharatam Shivaraman, Anubhuti, Piyush and Pragya have putup the link:  http://youtu.be/cOUNV8EUEcw

Team Disney with members Deepti, Shobit, Shweta and Vaishnavi have putup their FGD highlights here: http://www.youtube.com/watch?v=yjL6UgDEoFI&feature=youtu.be

Team Happily Unmarried with members Anant, Benjamin, Namrata and Sanjit have put up link: http://www.youtube.com/watch?v=RIvvthI1yEM
4. Some quick notes on the FGDs:
  • Not that I claim great expertise in FGD analysis, however, some things did become quite apparent even to my untrained eye.
  • The FGD relies and thrives on interactions among the panelists. It's a semi-unstructured 'discussion', and in the course of those interactions, sometimes, insight emerges.
  • Some of the FGDs I saw gave the impression more of a dialogue between the Mod and the panelists than of a discussion among the panelists themselves.
  • It's not a big issue, however - given the rather challenging FGD topic you had and the fact that many of you were probably running one for the first time - I'm happy that you as MKTR students have gotten first-hand exposure to the challenges one faces in planning, running and analyzing FGDs.
  • I hope this exercise will serve to keep you more grounded, realistic and sceptical about claimed FGD insights than you would otherwise have been.


Alright. That is all for now. Am busy going through some of the feedback you gave y'day and also busy prepping for tomorrow's R-heavy class - "Hypothesis Testing and Modeling secondary data in MKTR".

See you in class tomorrow.

Sudhir
  



Sunday, November 3, 2013

Mailbag (Q&A upto Session 6)

Hi all,

Shall use this post to put up good Q&As that I received over email. Shall update as more Q&As come in (later ones to the top of the post). So do pls watch this space.

#####################################
I received this Q from Satish:

Dear Professor,
I had a couple of questions regarding the constructs and was wondering whether you could help me with them…
- When discussing the Pepsi’s challenge reading (session 2) you mentioned in the notes that ‘it is critical to clearly understanding the nature of the construct”. I was wondering what you meant by the nature of the construct?
- In the same slide, you mentioned that the it is critical to ensure that the measurement tool is aligned with the nature of the construct. But there is no description on how  to make sure that the measurement tool is aligned. Could you please elaborate on this?
Thanks
Satish
My response was as below:

Hi Satish,
I've prefaced your Qs with a ">>" below. My responses follow.

>> When discussing the Pepsi’s challenge reading (session 2) you mentioned in the notes that ‘it is critical to clearly understanding the nature of the construct”. I was wondering what you meant by the nature of the construct?

The context is whether Coke should go for a sip or central location test versus a 'home test' (whole-can consumption experience). Pepsi's challenge frames the debate in terms of Pepsi's strengths and Coke falls for the gambit. IMO, what should have mattered more to Coke is what it is that connects people to Coke, what it is that people find 'satisfying' in their entire soft drink consumption experience etc. - as that ultimately had a larger sales impact than the sip/CLT test. So the "nature" of the construct alludes to the latent meaning of "consumption experience" or "consumption satisfaction". Coke read the nature of the construct they were looking for wrong. The example generalizes to other contexts as well.

>> In the same slide, you mentioned that the it is critical to ensure that the measurement tool is aligned with the nature of the construct. But there is no description on how  to make sure that the measurement tool is aligned. Could you please elaborate on this?

Well, there are no particular rules, only an intuitive understanding of general guidelines in this case. Point is that unless the 'nature' of the constructs was properly pinned down, the measurement tool used was always going to be measuring something other than what was required. In the caselet context, what should have been measured was how *all* consumption factors (including brand names rather than merely the blind test results) combined together in a home or natural consumption environment to produce satisfaction/ loyalty/ repurchase intention responses in the average target segment customer.

Well, I think the Q is important and timely. I'll put up the Q&A on the blog for further dissemination.

Sudhir


Happy Deepavali, Co2014

Have a safe and fun Diwali, folks.
-Sudhir

Friday, November 1, 2013

Session 6 Updates

Hi all,

Session 6 is done. We covered two main ways to map perceptual data - (i) using the attribute ratings (AR) method to create p-maps and joint-space maps (JSMs), and (ii) using the overall similarity (OS) approach to create multidimensional scaling (MDS) maps. We also saw some 101 stuff on positioning, definitional terms, common positioning strategies etc. The point was to get you thinking on how the mapping process could throw up insights on positioning in general, which strategy to adopt based on what criteria, etc.

OK, next, what will follow is the code and snapshots of the plots that emerge from the classwork examples I did. Again, you are strongly encouraged to replicate the classwork examples at home. Copy-paste only a few lines of code at a time, after reading the comments next to each line of code. {P.S. - the statements following a '#' are for documentation purposes only and aren't executed}. So, without further ado, let us start right away:

##########################################

1. Simple Data Visualization using biplots: USArrests example.

We use USArrests data (inbuilt R dataset) to see how it can be visualized in 2 dimensions. Just copy-paste the code below onto the R console [Hit 'enter' after the last line]. Need to install package "MASS". Don't reinstall if you have already installed it previously. A package once installed lasts forever.

rm(list = ls()) # clear workspace

install.packages("MASS") # install MASS package

mydata = USArrests # USArrests is an inbuilt dataset

pc.cr = princomp(mydata, cor=TRUE) # princomp() is core func

summary(pc.cr) # summarize the pc.cr object

biplot(pc.cr) # plot the pc.cr object

abline(h=0); abline(v=0) # draw horiz and vertical axes

This is what the plot should look like. Click on image for larger view.

2. Code for making Joint Space maps:

I have coded a user-defined function called JSM in R. You can use it whenever you need to make joint space maps, just by invoking the function. All it requires to work is a perceptions table and a preference rating table. First copy-paste the entire block of code below onto your R console. Those interested in reading the code, pls copy-paste it line by line. I have put explanations in comments ('#') for what the code is doing.

## --- Build func to run simple perceptual maps --- ##

JSM = function(inp1, prefs){ #JSM() func opens

# inp1 = perception matrix with row and column headers
# brands in rows and attributes in columns
# prefs = preferences matrix

par(pty="s") # set square plotting region

fit = prcomp(inp1, scale.=TRUE) # extract prin compts

plot(fit$rotation[,1:2], # use only top 2 prinComps

type ="n", xlim=c(-1.5,1.5), ylim=c(-1.5,1.5), # plot parms

main ="Joint Space map - Home-brew on R") # plot title

abline(h=0); abline(v=0) # build horiz and vert axes

attribnames = colnames(inp1);

brdnames = rownames(inp1)

# -- insert attrib vectors as arrows --

for (i1 in 1:nrow(fit$rotation)){

arrows(0,0, x1 = fit$rotation[i1,1]*fit$sdev[1],

y1 = fit$rotation[i1,2]*fit$sdev[2], col="blue", lwd=1.5);

text(x = fit$rotation[i1,1]*fit$sdev[1], y = fit$rotation[i1,2]*fit$sdev[2],

labels = attribnames[i1],col="blue", cex=1.1)}

# --- make co-ords within (-1,1) frame --- #

fit1 = fit

fit1$x[,1] = fit$x[,1]/apply(abs(fit$x),2,sum)[1]

fit1$x[,2]=fit$x[,2]/apply(abs(fit$x),2,sum)[2]

points(x=fit1$x[,1], y=fit1$x[,2], pch=19, col="red")

text(x=fit1$x[,1], y=fit1$x[,2], labels=brdnames, col="black", cex=1.1)

# --- add preferences to map ---#

k1 = 2; #scale-down factor

pref = data.matrix(prefs)# make data compatible

pref1 = pref %*% fit1$x[,1:2]

for (i1 in 1:nrow(pref1)){

segments(0, 0, x1 = pref1[i1,1]/k1, y1 = pref1[i1,2]/k1, col="maroon2", lwd=1.25)

points(x = pref1[i1,1]/k1, y = pref1[i1,2]/k1, pch=19, col="maroon2")

text(x = pref1[i1,1]/k1, y = pref1[i1,2]/k1, labels = rownames(pref)[i1], adj = c(0.5, 0.5), col ="maroon2", cex = 1.1)}

# voila, we're done! #

} # JSM() func ends

3. OfficeStar MEXL example done on R

Go to the LMS folder 'Session 6 files'. The file 'R code officestar.txt' contains the code (which I've broken up into chunks and annotated below), and the files 'officestar data1.txt' and 'officestar pref data2.txt' contain the average perceptions (attribute) table and the preferences table respectively.

Step 3a: Read in the attribute table into 'mydata'.

# -- Read in Average Perceptions table -- #

mydata = read.table(file.choose(), header = TRUE)

mydata = t(mydata) #transposing to ease analysis

mydata #view the table read

# extract brand and attribute names #

brdnames = rownames(mydata);

attribnames = colnames(mydata)

Step 3b: Read into R the preferences table into 'prefs'.

# -- Read in preferences table -- #

pref = read.table(file.choose())

dim(pref) #check table dimensions

pref[1:10,] #view first 10 rows

Data reading is done. You should see the data read-in as in the figure above. We can start analysis now. Finally.

Step 3c: Run Analysis

# creating empty pref dataset

pref0 = pref*0; rownames(pref0) = NULL

JSM(mydata, pref0) # p-map without prefs information

The above code will generate a p-map (without the preference vectors). Should look like the image below (click for larger image):

However, to make true joint-space maps (JSMs), wherein the preference vectors are overlaid atop the p-map, run the one line code below:

JSM(mydata, pref)

That is it. That one function call executes the entire JSM sequence. The result can be seen in the image below.

Again, the JSM function is generic and can be applied to *any* dataset in the input format we just saw to make joint space maps from. Am sure you'll leverage the code for animating your project datasets. Let me or Ankit know in case any assistance is needed in this regard.

4. Session 2 survey Data on Core courses:

Look up the LMS folder 'session 6 files'. Save the data and code files to your machine. The data files are 'courses data.txt' for the raw data on perceptions and 'courses data prefs.txt' for the preference data with student names on it. Now let the games begin.

# read in data

mydata = read.table(file.choose()) # 'courses data.txt'

head(mydata)

# I hard coded attribute and brand names

attrib.names = c("will.recommend", "persp.change", "conceptual.value.add", "practical.relevance", "interest.sustained", "difficulty.level");

brand.names = c("GSB", "INVA", "MGTO", "SAIT")

Should you try using your project data or some other dataset, you'll need to enter the brand and attribute names for that dataset in the same order in which they appear in the dataset, separately as given above. I then wrote a simple function, titled 'pmap.inp()' to denote "p-map input", to transform the raw data into a brands-attributes average perceptions table. Note that the code below is specific to the first set of columns being the preferences data.

# construct p-map input matrices using pmap.inp() func

pmap.inp = function(mydata, attrib.names, brand.names){ #> pmap.inp() func opens

a1 = NULL

for (i1 in 1:length(attrib.names)){

start = (i1-1)*length(brand.names)+1; stop = i1*length(brand.names);

a1 = rbind(a1, apply(mydata[,start:stop], 2, mean)) } # i1 loop ends

rownames(a1) = attrib.names; colnames(a1) = brand.names

a1 } # pmap.inp() func ends

a1 = pmap.inp(mydata, attrib.names, brand.names)

And now, we're ready to run the analysis. First the p-map without the preferences and then the full JSM.

# now run the JSM func on data

percep = t(a1[2:nrow(a1),]); percep

# prefs = mydata[, 1:length(brand.names)]

prefs = read.table(file.choose(), header = TRUE) # 'courses data prefs.txt'

prefs1 = prefs*0; rownames(prefs1) = NULL # null preferences doc created

JSM(percep, prefs1) # for p-map sans preferences

Should produce the p-map below:

And the one-line JSM run:

JSM(percep, prefs) # for p-map with preference data

Should produce the JSM below:

Follow the rest of the HW code given to run segment-wise JSMs in the same fashion.

5. Running MDS code with Car Survey Data:

The code is in 'R code HW dataset JSMs.txt' in LMS folder 'session 6 files'. The data are in 'mds car data raw.txt'. Read them in and follow the instructions here.

# --------------------- #
### --- MDS code ---- ###
# --------------------- #

rm(list = ls()) # clear workspace

mydata = read.table(file.choose(), header = TRUE) # 'mds car data raw.txt'

dim(mydata) # view dimension of the data matrix

brand.names = c("Hyundai", "Honda", "Fiat", "Ford", "Chevrolet", "Toyota", "Nissan", "TataMotors", "MarutiSuzuki")

Note that I have hard-coded the brand names into 'brand.names'. If you want to use this MDS code for another dataset (for your project, say), then you'll have to likewise hard-code the brand.names in. Next, I defined a function called run.mds() that takes as input the raw data and the brand names vector, runs the analysis and outputs the MDS map. Cool, or what..

### --- build user define func run.mds --- ###

run.mds = function(mydata, brand.names){

# build distance matrix #

k = length(brand.names)

dmat = matrix(0, k, k)

for (i1 in 1:(k-1)){ a1 = grepl(brand.names[i1], colnames(mydata));

for (i2 in (i1+1):k){a2 = grepl(brand.names[i2], colnames(mydata));
# note use of Regex here

a3 = a1*a2;

a4 = match(1, a3);

dmat[i1, i2] = mean(mydata[, a4]);

dmat[i2, i1] = dmat[i1, i2] } #i2 ends

} # i1 ends

colnames(dmat) = brand.names;

rownames(dmat) = brand.names

### --- run metric MDS --- ###

d = as.dist(dmat)

# Classical MDS into k dimensions #

fit = cmdscale(d,eig=TRUE, k=2) # cmdscale() is core MDS func

fit # view results

# plot solution #

x = fit$points[,1];
y = fit$points[,2];

plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2", main="Metric MDS", xlim = c(floor(min(x)), ceiling(max(x))), ylim = c(floor(min(y)), ceiling(max(y))), type="p",pch=19, col="red");

text(x, y, labels = rownames(fit$points), cex=1.1, pos=1);

abline(h=0); abline(v=0)# horiz and vertical lines drawn

} # run.mds func ends

Time now to finally invoke the run.mds func and get the analysis results:

# run MDS on raw data (before segmenting)

run.mds(mydata, brand.names)

The resulting MDS map looks like this:

OK, that's quite a bit now for classwork replication. Let me know if any code anywhere is not running etc due to any issues.

###################################

6. Session 6 HW:

This HW is also a group submission. You will need to co-operate with the rest of your group to get it done.

  • JSM based homework:
  • Collect basic demographic information about your group mates - #yrs of workex, previous industry, educational qualifications, intended major etc.
  • Run individual level JSM analysis on each of your team mates (and yourself) using the code below (place the appropriate name in student.name = c("") in that code).
  • Compare the JSMs you obtain - what salient similarities and differences do you see?
  • Now, using the demographic data you have collected, speculate on which demographic characteristics are best able to explain at least some of the similarities and differences you see.
  • Place (i) the 4 JSMs, (ii) your list of salient similarities and differences (preferably in tabular form), (iii) the demographic profile of each group member (again, in tabular form) and (iv) the subset of demographic variables that best explain the JSMs in a PPT.
  • MDS based homework:
  • Construct individual level MDS maps for yourself and your group members.
  • Interpret them. In particular try to see what the axes might mean.
  • Interpret the clusters of similar brands (brands bunched close together) in terms of what characteristics are common among them.
  • Collate your (i) MDS plots, (ii) axes interpretation on each plot and (iii) similarity cluster interpretation on each plot into a PPT
  • Submit one PPT for both parts of your homework. Title slide should contain group name and member names + PGIDs. Name the slide as _session6HW.pptx

  • Note: If you don't find your name in the dataset, use a friend's observation (and demographic data) instead.
Use code below to draw individual level JSM plots:
student.name = c("Himanshu") # say, student's name is Himanshu
# retain only that row in the raw data which has name 'Himanshu'

mydata.test = mydata[(rownames(prefs) == student.name),]

# run the pmap.inp() func to build avg perceptions table

a1.test = pmap.inp(mydata.test, attrib.names, brand.names);

percep.test = t(a1.test[2:nrow(a1.test),]);

# introduce a small perturbation lest matrix not be of full rank
percep.test = percep.test + matrix(rnorm(nrow(percep.test)*ncol(percep.test))*0.01, nrow(percep.test), ncol(percep.test));

prefs.test = prefs[(rownames(prefs) == student.name),]; prefs.test

# run analysis on percep.test and prefs.test

JSM(percep.test, prefs.test)

Use the code below to run individual level MDS plots. Just place the appropriate student.name and run.

student.name = c("Himanshu") #change student name as reqd
# retain only that row in raw data with name Himanshu

mydata.test = mydata[(rownames(mydata) == student.name),];

# run analysis and save result by copy pasting onto PPT

run.mds(mydata.test, brand.names)

HW deadline is 9-Nov Saturday midnight. That's it for now. Contact me with queries, if any.

Sudhir