
Thursday, October 24, 2013

Session 4 Updates for the Co2014 Hyderabad

Hi all,
Session 4 Big-picture Recap:
We're done with a readings-heavy Session 4, 'Qualitative Research', today. To recap the four big-picture take-aways from the session, let me use bullet points:
  • We studied Observation Techniques - both the plain-vanilla observation variety (Reading 1 - Museums) and the 'immersive' ethnographic variety (Reading 2 - adidas).
  • We then ventured into deconstructing the powerful habit-formation process and arrived at a 3-step loop framework to describe it for marketing purposes: cue-routine-reward.
  • We saw how the innovative combination of qualitative insight and predictive analytics can lead to windfall $$ profits (Reading 3 - Target and Reading 4 - Febreze).
  • Finally, we saw how unstructured respondent interaction, exemplified by a focus group discussion (FGD), can be a powerful qualitative tool for digging up customer insights.
Session 4 HW details: Pls read the contemporary caselet given below, from last month's Economist magazine. It deals with a story on how to help professional chess gain fans.

A sporting chance - how fans plan to revive Chess
IN LONDON in April, a 22-year-old Norwegian turned cartwheels by the Thames. Magnus Carlsen, the world’s top-ranked chess player (and a model for G-Star RAW, a fashion firm) had just earned the right to challenge for the World Chess Championship in India next month. His battle against Viswanathan Anand, a 43-year-old Indian and world champion since 2007, is a long-awaited spectacle. Match organisers see a chance to turn a struggling sport into a global brand.
Time was when the world stopped for professional chess. Millions watched Bobby Fischer, an American, beat the Soviet Union’s Boris Spassky in 1972. In the 1990s a pair of matches between Garry Kasparov and Deep Blue, a computer, recaptured some of that suspense. Yet despite booming interest in the amateur game, top-level chess has become obscure again, hobbled by squabbles and eccentric leadership.

Enthusiasts spy a comeback. Last year Andrew Paulson, an American businessman based in London, bought rights to stage the game’s most prestigious contests, including November’s duel. For $500,000 the World Chess Federation (FIDE) granted Mr Paulson media and marketing licences for a decade—and the chance to make chess a profitable enterprise.

The game itself has plenty of fans. Research in five countries by YouGov, a pollster, found that more than two-thirds of adults have played at least once. FIDE says 605m do so regularly. In India, where Mr Anand is a national hero, nearly a third of adults claim to play every week. The internet and smartphones mean novices no longer need a friend to play. Susan Polgar, a Hungarian-American grandmaster, says about 35 countries include chess in school curricula.

But grassroots enthusiasm has not raised the profile of the professional game. Critics gripe about mercurial decision-making within FIDE. The sport’s governing body gets by on some $2m a year (FIFA, football’s federation, spent more than $1 billion in 2012) and has had only two presidents in 31 years.



So Mr Andrew Paulson wants to do to world chess what Kerry Packer did to world cricket, perhaps... The challenge clearly is to kickstart the virtuous cycle of more fans --> more advertiser interest --> more ad revenues --> more talent drawn in --> more competition --> more fans... How to get this cycle going is the question. Clearly, the topic isn't as esoteric as it may first look. Football and hockey in India, for instance, suffer too from trying to kickstart this virtuous cycle in the shadow of big brother Cricket.

The FGD then is partly a brainstorming session (from the firm's viewpoint) and partly an opining and evaluation exercise (from a potential fan's viewpoint) on various possibilities. It may involve first trying to define what chess is or means to fans, what more it can come to represent, how to expand and leverage its present connect, etc. Analogies with other sports such as cricket may come in as well, perhaps... I don't want to write more and bias the storylines you may come up with.
A deeper challenge is that watching chess is less fun than playing it. A single game can last six hours; its most riveting moment may be a strategic nuance known as the Yugoslav variation on the Sicilian. “Good chess leads to draws,” says Maurice Ashley, an American grandmaster.
Mr Ashley believes that new game and tournament formats could attract a wider audience. Competitors in blitz chess must finish their games in half an hour. Matches lasting minutes make popular footage online. Yet many players resist fast games, arguing that they reward low-quality chess. FIDE’s enthusiasm for shorter championships in the 1990s and 2000s prolonged the professional game’s split.

Lengthy duels could still flourish if packaged well. Golf’s slow pace does not stop big audiences following four-day tournaments; in the cricket-playing world, witty commentary keeps fans tuned to games that last five days. Lately ESPN, a broadcaster, has turned poker, spelling bees and Frisbee-flinging (see article) into tense, dramatic television.

Mr Paulson, who made a fortune in Russian internet ventures, says chess matches can make “heart-gripping, heart-pounding entertainment”. (He is standing for president of the English Chess Federation on October 12th.) He plans more competitions in big cities beyond Russia and eastern Europe, where many now take place. In March he launched ChessCasting, a web application that offers statistics and commentary on big events as well as discussion boards for amateur pundits. He talks of reporting competitors’ sweating, eye movement and heart rate.

Chess needs deep-pocketed backers to complete this transformation. Mr Paulson thinks firms will want to associate with a game that is “clean, pure and meritocratic”. But he has not yet announced any big new sponsors. “One mistake has been assuming it would be easier,” he says. A cartwheeling world champion might help.



Submission format:
  • Title slide of your PPT should have your group name, member names and PGIDs
  • On the next slide, write your D.P. and R.O.(s) clearly.
  • Third slide, introduce the FGD participants and a line or so on why you chose them (tabular form is preferable for this)
  • Fourth slide, write a bullet-pointed exec summary of the big-picture take-aways from the FGD
  • Fifth slide onward, describe and summarize what happened in the FGD
  • Note if unification and/or polarization dynamics happened in the FGD
  • Name your slide deck groupname_FGD.pptx and drop it in the appropriate dropbox by the start of session 6
Any queries etc., pls feel free to email me.

Update 1: FGD HW guidelines: A few more thoughts and guidelines on the FGD HW for this session. To keep it focussed and brief, lemme use the bullet-point format:
  • The point of the FGD is *not* to 'solve' the problem, but merely to point a likely direction where a solution can be found. So don't brainstorm for a 'solution', that is NOT the purpose of the FGD.
  • Ensure the D.P. and R.O.s are aligned and sufficiently exploratory before the FGD starts. Different R.O.s lead to very different FGD outcomes. For example, an R.O. like "Explore which advertising themes enable high levels of fan-connect" will yield a very different discussion than "Explore potential fan-connect across different chess formats".
  • Keep your D.P. and R.O. tightly focussed, simple and do-able in a mini-FGD format. Having too broad a focus or too many sub-topics will lead nowhere in the 30 odd minutes you have.
  • Start broad: Given an R.O., explore how people connect with or relate to sports in general, their understanding of what constitutes a 'sports fan', their understanding of what constitutes 'excitement', 'memorability', 'social currency' or 'talkability' in a sport, and so on. You might want to start with *sports* in general and not narrow down to chess right away (depending on the constructs you seek, of course).
  • Prep the moderator well: The moderator in particular has a crucial role. Have a broad list of constructs of interest and focus on giving them enough time and traction (without being overly pushy). For example, the mod could start by asking the group: "What do you think connects people to sports?" to get the ball rolling, then steer the discussion to keep it on course.
  • Converge on Chess in detail: After exploring sports in general, explore the particulars of chess as a sport - what is it, how is it viewed or understood, what is the perception of people who play or follow the game, how can it be made more trendy etc.
  • Do some background research on chess and its history first. Know what different game formats have been proposed and tried. E.g., apart from 'blitz chess', there is 'chess by jury' in which two groups of people individually vote for the best next move and the move with the highest votes is played on giant screens, etc.
  • See where people agree in general, change opinions on interacting with other people on any topic, disagree sharply on some topics and stand their ground etc.
  • In your PPT report, mention some of the broad constructs you planned to explore via the FGD.
  • Report (among other things) what directions seem most likely to be fruitful for investigation.
External Speaker talk dates: Would've gone with next week Sunday after the midterms, but it's Diwali. Now I'm thinking of Friday 8-Nov, Sat 9-Nov or Sun 10-Nov evening, between 4 and 6 pm. If you have strong objections to this, pls let me know. Else I will pick 10-Nov Sunday 4-6 pm as the tentative date for the external speaker.

Session 5 preview: We'll cover Segmentation and Targeting in session 5. This is both a theory- and R-heavy session. There'll be illustrative dummy examples as well as live exercises galore.

That's it for now, more later.















Saturday, December 15, 2012

Session 4 HW Help

Hi all,

Thanks to the tireless efforts of one Mr Shaurya Singh, I've put up what is (almost) the solution to your Session 4 HW on LMS.

The data and code required to run all your session 4 HW exercises (due Wednesday 19-Dec midnight) are up in a folder imaginatively named 'session 4 HW' on LMS.

As you know, session 4 HW has three parts to it. The data and code for each part are placed in separate subfolders.

Instead of copy-pasting code from the blog, I urge you to copy-paste blocks of code directly from the 'R code.txt' files in each of the subfolders. The data required to be read in are available directly as .txt files in the same sub-folders.

I now estimate no more than 30 minutes for you to run through the entire R portion of your session 4 HW. Pls ensure you have a good interpretation (or 'story') ready to be written as bullet points onto your HW slides.

My colleagues tell me there is hope that, with Day 1 of placements over, the students may come back to normal for the rest of the term. Well, we'll see. Cheers

Sudhir

Wednesday, December 5, 2012

Session 4 - MDS, Factor Analysis: R code and HW

Update:

Received this email from Ms Priyanka Guha:

Hello Professor,

I think there is a missing data point in row 19 and column AJ of the second data sheet and so, I am getting an error while running the code. Can you please suggest how to go forward about that.

My response:
Hi Priyanka,

Yes, I just checked.

Pls replace it with the median for that column (3, I think). More generally, missing data is a part of most real-world data sets. We usually impute the missing values provided there aren't too many of them. Means, medians etc. are often used.
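For instance, here is a minimal sketch of median imputation in R. The data frame below is a toy stand-in (the actual HW sheet isn't reproduced here); with your data, run the same loop after reading the sheet in.

```r
# Toy data frame with one missing entry (stand-in for the HW sheet)
mydata = data.frame(x = c(1, 2, NA, 4), y = c(5, 6, 7, 8))

# Replace each NA with the median of its column
for (j in 1:ncol(mydata)) {
  miss = is.na(mydata[, j])
  if (any(miss)) mydata[miss, j] = median(mydata[, j], na.rm = TRUE)
}
mydata  # the NA in column x is now 2, the median of 1, 2 and 4
```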


******************************************

Hi all,

Like I said in my opening statement in class today: Session 4 onwards MKTR gets heavy on R and tools and analysis.

In the last post, I discussed, demo-ed and deconstructed Joint Space Maps (JSMs). This post is dedicated to MDS and (if it's not too long) factor analysis. So, w/o further ado, here goes:

5. MDS code

MDS or Multi-dimensional scaling is the way we analyze overall-similarity (OS) data. Recall that you rated your impression of overall similarity among 9 car brands in a series of paired comparisons. We use that data for MDS here. The input is required in a particular format and there's some data cleaning and aggregation required. Luckily, a simple R function now takes care of that problem.

Pls find the data for MDS and subsequently factor analysis input in this google spreadsheet.

This is what the code below is helping you do:

  • Read in data with headers from cells D38 to AM90 into 'inp'.
  • Then read in the brand names vector (cells C39 to C47) into 'names'.
  • Build a dissimilarity (or distance) matrix 'dmat'
# read in the dissimilarities matrix #
inp = read.table(file.choose(),header=TRUE)
# read in brand names vector #
names = read.table(file.choose())

# build distance matrix #
k = nrow(names)
dmat = matrix(0, k, k)

for (i1 in 1:(k-1)){
a1=grepl(names[i1,1], colnames(inp))
for (i2 in (i1+1):k){
a2=grepl(names[i2,1], colnames(inp))
a3 = a1*a2
a4 = match(1, a3)
dmat[i1, i2] = mean(inp[, a4])
dmat[i2, i1] = dmat[i1, i2]
}}

colnames(dmat) = names[,1]
rownames(dmat) = names[,1]
Click on the image to see what the distance matrix 'dmat' looks like. This is the average of the similarity perceptions for the whole class.

First, we do metric MDS. Metric MDS uses metric data as its input. Since you rated similarity-dissimilarity on a 1-7 interval scale, metric MDS will work just fine. Here's the code for metric MDS, just copy-paste onto the R console.

d = as.dist(dmat)
# Classical MDS into k dimensions#
fit <- cmdscale(d,eig=TRUE, k=2)
fit # view results

# plot solution #
x <- fit$points[,1]
y <- fit$points[,2]
plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2", main="Metric MDS", xlim = c(floor(min(x)), ceiling(max(x))), ylim = c(floor(min(y)), ceiling(max(y))), type="p",pch=19, col="red")
text(x, y, labels = rownames(fit$points), cex=1.1, pos=1)
abline(h=0); abline(v=0)
This is my graphical MDS output. Click on image for larger size.

Suppose the similarity-dissimilarity judgments were in yes/no terms rather than in metric ratings. Then metric MDS becomes dicey to use as it relies on interval scaling assumptions. The more robust (but somewhat less efficient) nonmetric MDS then becomes the way to go. I would recommend using nonmetric MDS when in doubt, and certainly for MDS maps for individual respondents.

Nonmetric MDS is just as easy to run. However, make sure you have the MASS package installed before running it.

library(MASS)
d <- as.dist(dmat)
fit <- isoMDS(d, k=2)
fit # view results


# plot solution #
x <- fit$points[,1]
y <- fit$points[,2]
plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2", xlim = c(floor(min(x)), ceiling(max(x))), ylim = c(floor(min(y)), ceiling(max(y))), main="Nonmetric MDS", type="p", pch=19, col="red")
text(x, y, labels = rownames(fit$points), cex=1.1, pos=1)
abline(h=0); abline(v=0)
Click on image for larger size.

5a. Subset analyses with MDS

We can analyze select subsets of interest via MDS. I've defined a generic function 'dmat.subset' to do this with. It requires you to provide the brand names vector 'names', the input matrix 'inp' and a subset selection vector 'k1'. First, just copy-paste the following code onto the R console.

# dmat.subset func define #
dmat.subset <- function(names, inp, k1){

k = nrow(names)
dmat = matrix(0, k, k)
inp1 = inp[k1, ]

for (i1 in 1:(k-1)){
a1=grepl(names[i1,1], colnames(inp1))
for (i2 in (i1+1):k){

a2=grepl(names[i2,1], colnames(inp1))
a3 = a1*a2
a4 = match(1, a3)
dmat[i1, i2] = mean(inp1[, a4])
dmat[i2, i1] = dmat[i1, i2]

}}
colnames(dmat) = names[,1]
rownames(dmat) = names[,1]

dmat }

To select the second row in 'inp', simply define

k1 = 2

To select the second, fifth and tenth rows (say), write

k1 = c(2, 5, 10)

To select rows 2 to 5 and 22-to-39, write

k1 = c(2:5, 22:39)

And so on. Hopefully, the logic is clear enough now.
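If you'd like to sanity-check the selection logic first, here's a tiny stand-alone illustration (a 10-row dummy matrix standing in for the real 'inp'; row ranges shrunk to fit the toy size):

```r
# Dummy 10 x 4 matrix standing in for 'inp'
inp = matrix(1:40, nrow = 10)

inp[2, ]                    # row 2 only, i.e. k1 = 2
inp[c(2, 5, 10), ]          # rows 2, 5 and 10, i.e. k1 = c(2, 5, 10)
inp1 = inp[c(2:5, 8:10), ]  # rows 2-to-5 and 8-to-10
nrow(inp1)                  # 7 rows selected
```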

Suppose you want to use PGID to select. Then first read cells B39 to B90 into a dataset, say 'pgid'. Say you want to select PGID 61310363. Now do the following (useful from your HW point of view):

# read-in pgid vector
pgid = read.table(file.choose())

k1 = (pgid == 61310363) # define subset

Now, having selected the subset of interest into 'k1', we can run MDS on it quite simply as:

# define dist matrix for the subset
dmat = dmat.subset(names, inp, k1)

### --- nonMetric MDS for individuals ---
library(MASS)
d <- as.dist(dmat)
fit <- isoMDS(d, k=2)
fit # view results

# plot solution #
x <- fit$points[,1]
y <- fit$points[,2]
plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2", xlim = c(floor(min(x)), ceiling(max(x))), ylim = c(floor(min(y)), ceiling(max(y))), main="Nonmetric MDS", type="p", pch=19, col="red")
text(x, y, labels = rownames(fit$points), cex=0.85, pos=1)
abline(h=0); abline(v=0)
This is what I got for pgid 363. Notice how an individual's MDS plot differs markedly from the overall average for the class. Which one would you trust more? Why (not)?

5b. HW2 for Session 4 on MDS

Pls use the data on the google spreadsheet mentioned in this post. I have also sent it separately in the worksheet labeled 'MDS HW' in the excel file emailed to you.

Your task is to:

  • Draw an MDS map using your own individual input.
  • Interpret what the axes mean and what you might have had at the back of your mind when you were evaluating the survey Qs.
  • Make a note of whether there are any attributes you had in mind which are not reflected in the MDS. And if so, what these attributes are.
  • Now choose any 2 other students and draw an MDS plot for each of them. Pick students whose maps differ substantially from your own. Interpret what major attributes they may have had in mind when they filled in the survey.

The format for submission:

  • Paste the MDS output on PPT slides.
  • For each plot, write your interpretation of the plot using bullet points on the next slide.
  • Any queries, clarifications etc, pls use the comments box below to reach me.
  • Any help needed with R, pls reach out to the AA Mr Ankit Anand or to me.

And here's the point of it all, the value-add that will accrue once you successfully (and sincerely) navigate this Session 4 HW2.

Good luck. And I hope you have fun doing the HW rather than grind through it in too serious a mood.

6. Factor analysis code

Update: This blog-post contains a detailed explanation of how to interpret factor analysis results using a HW dataset I'd given out in term 5, Hyd.

We will use exploratory (or 'common') factor analysis first on the toothpaste survey dataset. This dataset can be found in cells H4 to P34 of the google spreadsheet mentioned above. You'll need to install the package 'nFactors' (R is case-sensitive, always) to run the scree plot.

# read in the data #
mydata=read.table(file.choose(),header=TRUE)
mydata[1:5,]#view first 5 rows

# install the required package first
install.packages("nFactors")

# determine optimal no. of factors#
library(nFactors) # invoke library
ev <- eigen(cor(mydata)) # get eigenvalues
ap <- parallel(subject=nrow(mydata),var=ncol(mydata),rep=100,cent=.05)
nS <- nScree(ev$values, ap$eigen$qevpea)
plotnScree(nS)

On the scree plot that appears, the green horizontal line represents the eigenvalue = 1 level. Simply count how many green triangles (in the figure above) lie before the black line cuts the green line. That is the optimal no. of factors; here, it is 2. The plot looks intimidating as it is, so pls do not bother with any of the other color-coded information - blue, black or green. Just stick to the instructions above.
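If you'd rather not eyeball the plot, you can cross-check the count in code with the Kaiser rule (eigenvalues of the correlation matrix above 1). A sketch on synthetic data with a built-in two-factor structure (with your data, just reuse the 'ev' object from the scree-plot code above):

```r
# Synthetic dataset with a known 2-factor structure:
# v1-v3 load on factor f1, v4-v6 on factor f2
set.seed(1)
f1 = rnorm(200); f2 = rnorm(200)
mydata = data.frame(v1 = f1 + rnorm(200, 0, 0.3),
                    v2 = f1 + rnorm(200, 0, 0.3),
                    v3 = f1 + rnorm(200, 0, 0.3),
                    v4 = f2 + rnorm(200, 0, 0.3),
                    v5 = f2 + rnorm(200, 0, 0.3),
                    v6 = f2 + rnorm(200, 0, 0.3))

ev = eigen(cor(mydata))  # eigenvalues, as in the scree-plot code
sum(ev$values > 1)       # no. of eigenvalues above 1 - here, 2
```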

k1=2 # set optimal no. of factors
If the optimal no. of factors changes when you use a new dataset, simply change the value of k1 in the line above. Copy-paste the line onto a notepad, change it to 'k1=6' or whatever you get as optimal, and paste onto the R console. The rest of the code runs as-is.

# extracting k1 factors #
# with varimax rotation #
fit <- factanal(mydata, k1, scores="Bartlett",rotation="varimax")
print(fit, digits=2, cutoff=.3, sort=TRUE)

# plot factor 1 by factor 2 #
load <- fit$loadings[,1:2]
par(col="black")#black lines in plots
plot(load,type="p",pch=19,col="red") # set up plot
abline(h=0);abline(v=0)#draw axes
text(load,labels=names(mydata),cex=1,pos=1)
# view & save factor scores #
fit$scores[1:4,]#view factor scores

#save factor scores (if needed)
write.table(fit$scores, file.choose())
Click on image for larger size.

In case you are wondering how the variable loadings onto factors look in R (after factor rotation), here is the relevant R console snapshot.

Clearly, the top 3 variables load onto factor 1 and the bottom 3 onto factor 2. That is all for now. See you soon.

6a. Factor Analysis Session 4-HW3.

Go to the 'HW3 Factor Analysis input data' worksheet. Read-in the dataset from cells B5 to L56. The annotation for the variables V1-V11 can be found in columns N and O.

Your task is to:

  • Run a factor analysis on the psychographics matrix
  • Use the scree plot to identify the optimal no. of factors
  • Assign variables to factors
  • Interpret what the factors may mean given the variables that load on them

Submission format:

  • Again, PPT format. Leave one blank slide to separate HW2 and HW3 in your PPT
  • Paste the scree plot and the actual factor table obtained
  • Write your interpretation of the factors as bullet points on a fresh slide

I will go over the factor analysis part again in class and in the tutorial. In particular, we will cover what factor analysis is, what factor loadings and factor scores are, how to use factor scores in downstream analysis, and so on.

Sudhir


Tuesday, December 4, 2012

Session 4 - Perceptual Maps: R code and HW

Update:

Folks, the Session 4 HW is due on 14-Dec Friday (and not 10-December as was mistakenly stated in my email). Sorry about the confusion.

******************************************

Hi all,

I hope you had the chance to run the R demo of session 3 at home just to see first-hand R's ease-of-use in MKTR.

Before jumping into the R code for Session 4, first, a small admin step - you need to know how to download and install packages in R.

Say, you need the package 'MASS' installed. For the menu-driven option, go to the Packages menu in the menu bar and click on 'Install Package(s)...'; a window will open asking which server to download the package from. Be patriotic and choose 'India' for our sole R repository based in IIT-Chennai. Or choose any other country if the desi server is slow at a particular time.

A second window will open listing all the packages in R (at present, this list grows every month) in alphabetical order. Click on the package you want and sit back. R will automatically download and install the package for you. Might take a minute or two at most.

Well, either that or simply type:

install.packages("MASS")
OK, let's get started with the R codes then.

1. Simple Data Visualization using biplots

We use USArrests data (an inbuilt R dataset) to see how it can be visualized in 2 dimensions. Just copy-paste the code below onto the R console [Hit 'enter' after the last line].

pc.cr <-princomp(USArrests, cor = TRUE)# scale data
summary(pc.cr)
biplot(pc.cr)
This is what the plot should look like. Click on image for larger view.

2. Code for making Joint Space maps

I have coded a user-defined function called JSM in R. You can use it whenever you need to make joint space maps, just by invoking the function. All it requires to work is a perceptions table and a preference-rating table.

First copy-paste the entire block of code below onto your R console. Those interested in reading the code, pls copy-paste line-by-line. I have put explanations in comments ('#') for what the code is doing.

## --- Build func to run simple perceptual maps --- ##

JSM <- function(inp1, prefs){

# inp1 = perception matrix with row and column headers
# brands in rows and attributes in columns
# prefs = preferences matrix

par(pty="s") # set square plotting region

fit = prcomp(inp1, scale.=TRUE) # extract prin compts

plot(fit$rotation[,1:2], # use only top 2 prinComps
type ="n",xlim=c(-1.5,1.5), ylim=c(-1.5,1.5), # plot parms
main ="Joint Space map - Home-brew on R") # plot title

abline(h=0); abline(v=0) # build horiz & vert axes

attribnames = colnames(inp1)
brdnames = rownames(inp1)

# <-- insert attrib vectors as arrows--
for (i1 in 1:nrow(fit$rotation)){
arrows(0,0, x1=fit$rotation[i1,1]*fit$sdev[1], y1=fit$rotation[i1,2]*fit$sdev[2], col="blue", lwd=1.5);
text(x=fit$rotation[i1,1]*fit$sdev[1],y=fit$rotation[i1,2]*fit$sdev[2], labels=attribnames[i1],col="blue", cex=1.1)}

# <--- make co-ords within (-1,1) frame #

fit1=fit
fit1$x[,1]=fit$x[,1]/apply(abs(fit$x),2,sum)[1]
fit1$x[,2]=fit$x[,2]/apply(abs(fit$x),2,sum)[2]
points(x=fit1$x[,1], y=fit1$x[,2], pch=19, col="red")
text(x=fit1$x[,1], y=fit1$x[,2], labels=brdnames,col="black", cex=1.1)

# --- add preferences to map ---#
k1 = 2; #scale-down factor
pref=data.matrix(prefs)# make data compatible
pref1 = pref %*% fit1$x[,1:2]
for (i1 in 1:nrow(pref1)){segments(0,0, x1=pref1[i1,1]/k1,y1=pref1[i1,2]/k1, col="maroon2", lwd=1.25)}
# voila, we're done! #

}

3. OfficeStar MEXL example done on R

All the data for the below JSM examples are stored as tables in this google spreadsheet.

I use as my example the OfficeStar dataset that I also demo in class from MEXL's built-in examples database. To facilitate comparison, I use as the input format in R the same tables that you would otherwise use in MEXL.

Pls open the google doc and copy the average perceptions matrix (cell B5 to cell F10) onto a notepad and read it in the usual way.

Step 3a: Read in the attribute table into 'mydata'.

# -- Read in Average Perceptions table -- #
mydata = read.table(file.choose(), header = TRUE)
mydata = t(mydata) #transposing to ease analysis
mydata #view the table read
# extract brand & attribute names #
brdnames = rownames(mydata)
attribnames = colnames(mydata)

Step 3b: Read into R the preferences table into 'prefs' (cell C15 to F24 on the google spreadsheet) via the notepad.

# -- Read in preferences table -- #
pref = read.table(file.choose())
dim(pref) #check table dimensions
pref[1:10,] #view first 10 rows

Data reading is done. You should see the data read-in as in the figure above. We can start analysis now. Finally.

Step 3c: Run Analysis

JSM(mydata, pref)

That is it. That one function call executes the entire JSM sequence. The result can be seen in the image below. Again, the JSM function is generic and can be applied to *any* dataset in the input format we just saw to make joint space maps from.

4. Session 2 HW - JSM Analysis

This is *your* data from the session 2 HW surveys you had filled. You evaluated term 4 core courses along 5 attributes. Let's see if a JSM can throw more light on the perceptions and preferences of the Co2013 at Mohali in this regard.

First, read-in the data the usual way.

  • Read-in cells M5 to AF55 for the attribute perceptions matrix.
  • Then read-in cells H5 to K55 for the preferences matrix.
  • Note that the data format here is different from the input in the OfficeStar example. There we had the 'Average perceptions matrix'.
  • Here, the data is more 'raw'. It has to be processed and the averages computed and then tabulated. No problemo, we'll get R to process it for us.
  • Hence, I have separately entered the attribute and 'brand' names.
# now read in perceptions data (w/o headers)
mydata = read.table(file.choose())
dim(mydata); mydata[1:4,]

# read-in preferences matrix
prefs = read.table(file.choose())
dim(prefs); prefs[1:4,]

# name brands and attributes involved
attribs = c("PerspChange", "TheoryValue", "PracticalRelevance", "InterestStimulated", "DifficultyLevel")
brands = c("GSB", "INVA", "MGTO", "SAIT")

Note above that the attribute and brand names have no spaces in them.

Now, some more code to manipulate the data and build the input matrix in the right format. On your part, just copy-paste. If you want to understand the code, go line-by-line; else paste it in as a block.

# Build input matrix for Perceptual map
inp1 = matrix(0, nrow=length(attribs), ncol=length(brands))

a1 = apply(mydata, 2, mean)
a2 = seq(from=1, to=20, by=4)

for (i1 in 1:nrow(inp1)){ inp1[i1,] = a1[a2[i1]:(a2[i1]+ncol(inp1)-1)]}

colnames(inp1) = brands; rownames(inp1) = attribs; inp1
inp1 = t(inp1) # transpose for convenience
cor(inp1) # view correlations among the attribs

OK. Now to run the analysis. Again, just invoke JSM and it does the job.
Note: Use 'inp1' and 'prefs' as arguments to the JSM function because that is what we have named our average perceptions matrix and our preference matrix, respectively.
JSM(inp1, prefs)

4. Applying JSM to select Subsets

In Session 4, we see that p-maps can mislead and confuse as much as enlighten. It's important that the transformation (e.g. taking averages) of perceptual data be meaningful. Hence, there may be occasion for us to require that we build JSMs on select subsets of the full dataset.

For instance, if we had reason to believe that males and females may perceive a given brand differently, then could we sort the dataset by gender and draw JSMs by gender?

The below function 'JSMsubset' allows us to sort on any column of interest and plot JSMs for each chosen subset.

### --- func to build inputs to JSM subsets ---- ###
JSMsubset <- function(mydata, prefs, attribs, brands, k1){

a1 = apply(mydata[k1,], 2, mean)
a2 = seq(from=1, to=length(a1), by=length(brands))

# define a new inp1
inp1 = matrix(0, nrow=length(attribs), ncol=length(brands))
for (i1 in 1:nrow(inp1)){ inp1[i1,] = a1[a2[i1]:(a2[i1]+ncol(inp1)-1)]}
colnames(inp1) = brands; rownames(inp1) = attribs; inp1

inp1 = t(inp1) # transpose for convenience
prefs1 = prefs[k1,]

outp = list(inp1, prefs1)

outp }
Henceforth, invoking JSMsubset() will produce an output that then lets us use JSM() proper to draw the maps. Read in the gender variable (cells L5 to L55) into a dataset called 'demog' and follow the code below.
demog=read.table(file.choose())

attach(demog);
fem=(Gender=="Female")
male=(Gender!="Female")

k1 = fem
outp = JSMsubset(mydata, prefs, attribs, brands, k1)
JSM(outp[[1]], outp[[2]])
That was the female students' p-map. The one for male students follows.
k1 = male
outp = JSMsubset(mydata, prefs, attribs, brands, k1)
JSM(outp[[1]], outp[[2]])
OK, so I was quite mystified when I noted that the perceptions from Mohali seemed remarkably similar across attributes and courses. In contrast, the Hyd folks were all over the place. Later, it dawned on me that not all Hyd students had a uniform core-course experience. Different sections had different instructors and so on. Here, one instructor effectively handled all 3 sections.

BTW, how might you use JSMsubset for your coming HW? Well, imagine if you could sort by, say, workex or intended major or any other variable of interest and then draw JSMs. For instance:

k1=(pgid==61310076)
outp = JSMsubset(mydata, prefs, attribs, brands, k1)
JSM(outp[[1]], outp[[2]])

k1 = (workex > 4)
outp = JSMsubset(mydata, prefs, attribs, brands, k1)
JSM(outp[[1]], outp[[2]])

The real power of the above method will come after we go through session 5 on segmentation. Once we start segmenting a respondent base using basis variables of interest, we can use JSMsubset to explore subsets hither and thither.

4a. HW1 for Session 4 involving JSMsubsets.

About the data:

  • Open the HW dataset sent by email to you. Go to the worksheet 'HW1 JSM input data'.
  • Columns A to Z contain your perception and preference matrices. These data are already there in the google spreadsheet.
  • Columns AB to AI contain demographic information - workex (yrs), Gender, Education and Intended major. This is how typical Qualtrics output looks (after cleaning up).

Your task in Session 4 HW1 is to:

  • Plot any one student's JSM. Then plot another student's JSM. By trial and error find two students such that their JSMs show significant differences. Interpret the differences.
  • Split the Dataset into two - Those intending to major in marketing and Those not. Then plot JSMs for each of these 2 groups. Interpret differences if any.
  • Split the Dataset into two - Engineers versus non-engineers. Then plot JSMs for each of these 2 groups. Interpret differences if any.
Submission Format:

  • Create a plain white blank PPT
  • Write your name and PGID on the title slide
  • Copy-paste your JSM plots from R to the PPT (as metafiles)
  • After each JSM plot, explain your interpretation of the plot in bullet points on the next slide
  • Give an informative heading to each slide. E.g., 'JSM for student 10' or 'Interpreting student 10's JSM' etc.
  • Use the same format for HW2 and HW3 on MDS and factor analysis
  • Drop the PPT into the dropbox on LMS well before deadline (next Friday 14-Dec midnight).

In case you are wondering what the value-add from this exercise is, let me point out the learning goals that should all be well within grasp once you successfully navigate this part of your HW.

Click on the image for a larger size picture.

The MDS and factor analysis components of session 4 are in a separate blog-post as this one was getting unwieldy.

Sudhir