Wednesday, October 30, 2013

Session 5 Updates

Hi all,
Yesterday in Session 5 we covered two major topics - Segmentation and Targeting. Sorry about the delay in bringing out this blog post. In this post, I shall lay out the classwork examples (which you might want to try replicating) and their interpretation, and the HW for this session.

There are many approaches to doing cluster analysis and R handles a dizzying variety of them. We'll focus on 3 broad approaches - agglomerative hierarchical clustering (under which we will do basic hierarchical clustering with dendrograms), partitioning (here, we do k-means) and model-based clustering. Each has its pros and cons. Model-based is probably the best around - highly recommended.

1. Cluster Analysis - Data preparation
First read in the data. USArrests is pre-loaded, so no sweat. I use the USArrests dataset example throughout for cluster analysis.
#first read-in data#
mydata = USArrests
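Before prepping the data, it doesn't hurt to eyeball what we just read in. A quick, optional look:

head(mydata) # first few rows
summary(mydata) # note how different the variables' scales are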
Data preparation is required to remove variable scaling effects. To see this, consider a simple example. If you measure weight in Kgs and I do so in Grams - all other variables being the same - we'll get two very different clustering solutions from what is otherwise the same dataset. To get rid of this problem, just copy-paste the following code.
# Prepare Data #

mydata = na.omit(mydata) # listwise deletion of missing values

mydata.orig = mydata # keep an unscaled copy - we'll need it later to profile the clusters

mydata = scale(mydata) # standardize variables
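To see the Kgs-vs-Grams point concretely, here's a throwaway sketch (toy numbers of my own, not part of the assignment):

x.kg = c(60, 72, 85) # three weights, in kilograms
x.g = x.kg * 1000 # the same weights, in grams
dist(x.kg) # pairwise distances on the kg scale
dist(x.g) # same structure, but distances inflate 1000-fold
dist(scale(x.kg)); dist(scale(x.g)) # identical once standardized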

2. Now we first do agglomerative hierarchical clustering, plot dendrograms, slice them at different heights and see what is happening.

# Ward Hierarchical Clustering

d = dist(mydata, method = "euclidean") # distance matrix

fit = hclust(d, method="ward") # run hclust func (note: on R >= 3.1.0, spell this method "ward.D")

plot(fit) # display dendrogram

Click on image for larger size. Eyeball the dendrogram. Imagine horizontally slicing through the dendrogram's longest vertical lines, each of which represents a cluster. Should you cut it at 2 clusters or at 4? How do you know? Sometimes eyeballing is enough to give a clear idea, sometimes not. Various stopping-rule criteria have been proposed for where to cut a dendrogram - each with its pros and cons. I'll go with the subjective, visual criterion for the purposes of this course.
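If you'd rather see the candidate cuts drawn on the tree itself, base R has a helper for exactly this. A small sketch, using the 'fit' object from above:

plot(fit) # re-draw the dendrogram
rect.hclust(fit, k=2, border="red") # boxes around the 2-cluster solution
rect.hclust(fit, k=4, border="blue") # and around the 4-cluster one, for comparison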

Suppose you decide 2 is better. Then set the optimal no. of clusters 'k1' to 2.

k1 = 2 # eyeball the no. of clusters

Note: If for another dataset, the optimal no. of clusters changes to, say, 5 then use 'k1=5' in the line above instead. Don't blindly copy-paste that part. However, once you have set 'k1', the rest of the code can be peacefully copy-pasted as-is.

# cut tree into k1 clusters

groups = cutree(fit, k=k1) # cut tree into k1 clusters
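To see how the units are distributed across the clusters, a one-liner:

table(groups) # cluster sizes under the k1-cluster cut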

3. Coming to the second approach, partitioning, we use the popular k-means method. Again the Q arises: how do we know the optimal no. of clusters? Eyeballing the dendrogram might sometimes help. But at other times, what should you do? MEXL (and most commercial software too) requires you to magically come up with the correct number as input to k-means. R does one better and shows you a scree plot of sorts - how the within-segment variance (a proxy for clustering solution quality) varies with the no. of clusters. So with R, you can actually take an informed call.

# Determine number of clusters #

wss = (nrow(mydata)-1)*sum(apply(mydata,2,var));

for (i in 2:15) wss[i] = sum(kmeans(mydata,centers=i)$withinss);

plot(1:15, wss, type="b", xlab="Number of Clusters", ylab="Within groups sum of squares")
# Look for an "elbow" in the scree plot #

Look for an "elbow" in the scree plot. The interior node at which the angle formed by the 'arms' is the smallest. This scree-plot is not unlike the one we saw in factor-analysis. Again, as with the dendogram, we get either 2 or 4 as the options available. Suppose we go with 2.
# Use optimal no. of clusters in k-means #

k1=2

Note: If for another dataset, the optimal no. of clusters changes to, say, 5 then use 'k1=5' in the line above instead. Don't blindly copy-paste that part. However, once you have set 'k1', the rest of the code can be peacefully copy-pasted as-is.
# K-Means Cluster Analysis

fit = kmeans(mydata, k1) # k1 cluster solution


To understand a clustering solution, we need to go beyond merely IDing which individual unit goes to which cluster. We have to characterize each cluster, interpret what it is that's common among its members, and give each cluster a name, an identity, if possible. Ideally, after this, we should be able to think in terms of clusters (or segments) rather than individuals for downstream analysis.
# get cluster means

aggregate(mydata.orig,by=list(fit$cluster),FUN=mean)

# append cluster assignment

mydata1 = data.frame(mydata, fit$cluster);

mydata1[1:10,]

OK, that is fine. But can I actually, visually, *see* what the clustering solution looks like? Sure. In 2 dimensions, the easiest way is to plot the clusters on the 2 biggest principal components that arise. Before copy-pasting the following code, ensure you have the 'cluster' package installed.
# Cluster Plot against 1st 2 principal components
# vary parameters for most readable graph

install.packages("cluster")
library(cluster)
clusplot(mydata, fit$cluster, color=TRUE, shade=TRUE,labels=2, lines=0)

Two clear-cut clusters emerge. Missouri seems to border the two. Some overlap is also seen. Overall, clusplot puts a nice visualization over the clustering process. Neat, eh? Try doing this with R's competitors... :)

4. Finally, the last (and best) approach - model-based clustering. 'Best' because it is the most general approach (it nests the others as special cases), is the most robust to distributional and linkage assumptions, and because it penalizes for surplus complexity (resolves the fit-complexity tradeoff in an objective way). My thumb-rule: when in doubt, use model-based clustering. And yes, mclust is available *only* on R, to my knowledge. Install the 'mclust' package first, then run the following code.

install.packages("mclust")

# Model Based Clustering

library(mclust)

fit = Mclust(mydata)

fit # view solution summary

The mclust solution has 3 components! Something neither the dendrogram nor the k-means scree plot predicted. Perhaps the assumptions underlying the other approaches don't hold for this dataset. I'll go with mclust simply because it is more general than the other approaches. Remember, when in doubt, go with mclust.
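mclust picks the number of components (and the covariance model) via BIC, and you can eyeball that choice directly. A small sketch (exact plot options vary a little across mclust versions):

plot(fit, what = "BIC") # BIC across no. of components, by covariance model
summary(fit) # chosen model and component sizes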

fit$BIC # lookup all the options attempted

classif = fit$classification # classifn vector

mydata1 = cbind(mydata.orig, classif) # append to dataset

mydata1[1:10,] #view top 10 rows

# Use below only if you want to save the output

write.table(mydata1, file.choose()) # save output

The classification vector is appended to the original dataset as its last column. We can now easily assign individual units to segments. Let's visualize the solution and see how exactly it differs from those of the other approaches.

fit1 = cbind(classif) # coerce the classification vector to a one-column matrix
rownames(fit1) = rownames(mydata)
library(cluster)
clusplot(mydata, fit1, color=TRUE, shade=TRUE, labels=2, lines=0)
Imagine if you're a medium-sized home-security solutions vendor looking to expand into a couple of new states. Think of how much it matters that the optimal solution had 3 segments - not 2 or 4. To help characterize the clusters, examine the cluster means (sometimes also called 'centroids') for each basis variable.
# get cluster means
cmeans = aggregate(mydata.orig, by=list(classif), FUN=mean); cmeans
Seems like we have 3 clusters of US states emerging - the unsafe, the safe and the super-safe. Now, we can do the same copy-paste for any other datasets that may show up in classwork or homework. I'll close the segmentation module here. R tools for the Targeting module are discussed in the next blog post. Any queries or comments, pls use the comments box below to reach me fastest.

###############################

Targeting in R

This is the code for the classwork MEXL example "Conglomerate's PDA". Here's the roadmap for what we are going to do:

  • First we segment the customer base using model based clustering or mclust, the recommended method.
  • Then we randomly split the dataset into training and test samples. The test sample is about one-third of the original dataset in size, following accepted practice.
  • Then we try to establish, via the training sample, how the discriminant variables relate to segment membership. This is where we train the targeting algorithm to learn how discriminant variables relate to segment memberships.
  • Then comes the real test - validate algorithm performance on the test dataset. We compare prediction accuracy across traditional and proposed methods.
  • Since R is happening, there are many targeting algorithms to choose from on R. I have decided to go with one that has shown good promise of late - the randomForest algorithm. Where we had seen decision trees in Session 5, think now of 'decision forests' in a sense... (see the short illustrative sketch after this list).
  • Other available algorithms that we can run (provided there is popular demand) are artificial neural nets (multi-layer perceptrons) and Support vector machines. But for now, these are not part of this course.
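Since randomForest comes up above, here's a minimal, self-contained sketch of what a random-forest targeting run looks like, using R's built-in iris data purely as a stand-in (our PDA variables would slot in the same way; the seed and ntree values are arbitrary choices of mine, not course prescriptions):

# minimal randomForest sketch on toy data
install.packages("randomForest") # run once, if not already installed
library(randomForest)
set.seed(99) # arbitrary seed, just for reproducibility
train = sample(1:nrow(iris), 100) # roughly two-thirds training split, as we do below
rf = randomForest(Species ~ ., data = iris[train, ], ntree = 500)
pred = predict(rf, iris[-train, ]) # predict 'segment' membership on the holdout
mean(pred == iris$Species[-train]) # holdout accuracy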
So without further ado, let me start right away.

1. Segment the customer base.

To read in the data, directly save and use the 'basis' and 'discrim' notepads I have sent you by email. Then ensure you have packages 'mclust' and 'cluster' installed before running the clustering code.
# read-in basis and discrim variables
basis = read.table(file.choose(), header=TRUE)
dim(basis); basis[1:3,]
summary(basis)

discrim = read.table(file.choose(), header=TRUE)
dim(discrim); discrim[1:3,]
summary(discrim)

# Run segmentation on the basis dataset

library(mclust) # invoke library

fit = Mclust(basis) # run mclust

fit # view result

classif = fit$classification

# print cluster sizes

for (i1 in 1:max(classif)){print(sum(classif==i1))}
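The same counts are available in one line, btw:

table(classif) # cluster sizes at a glance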

# Cluster Plot against 1st 2 principal components

require(cluster)

fit1 = cbind(classif)

rownames(fit1)=rownames(basis)

clusplot(basis, fit1, color=TRUE, shade=TRUE,labels=2, lines=0)

The segmentation produces 4 optimal clusters. Below is the clusplot where, interestingly, despite our using 15 basis variables, we see decent separation among the clusters in the top 2 principal components directly.

Click on the above image for larger size.

2. Split the dataset into training & test samples.

Read in the dataset 'PDA case discriminant variables.txt' from LMS for the analysis below:

rm(list = ls()) # clear workspace

# 'PDA case discriminant variables.txt'

mydata = read.table(file.choose(), header=TRUE)

head(mydata)

# build training and test samples using random assignment

train_index = sample(1:nrow(mydata), floor(nrow(mydata)*0.65));

# roughly two-thirds (65%) of the sample is for training

train_index[1:10];

train_data = mydata[train_index, ];

test_data = mydata[-(train_index), ];

train_x = data.matrix(train_data[ ,c(2:18)]);

train_y = data.matrix(train_data[ ,19]);

# for classification we need as.factor

test_x = data.matrix(test_data[ ,c(2:18)]); test_y = test_data[ ,19]
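One note before moving on: sample() draws a fresh random split on every run, so your downstream numbers will differ somewhat from mine. If you want a reproducible split, fix the random seed before sampling. A small sketch (the seed value itself is arbitrary):

set.seed(1234) # fix the RNG state so the train/test split is repeatable
train_index = sample(1:nrow(mydata), floor(nrow(mydata)*0.65))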

Last year, when Targeting was a full lecture session, I used the most popular machine learning algorithms - neural nets, random forests and support vector machines (all available on R, of course) - to demonstrate targeting. Those notes can be found here.

3. Use multinomial logit for Targeting.

We'll need to install the 'textir' package for this one.

###### Multinomial logit using R package textir #######

install.packages("textir")

library(textir)

covars = normalize(mydata[ ,c(2,4,14)], s=sdev(mydata[ ,c(2,4,14)])) # normalize the continuous variables

dd = data.frame(cbind(memb=mydata$memb, covars, mydata[ ,c(3,5:13,15:18)]))

train_ml <- dd[train_index, ];

test_ml = dd[-(train_index), ];

gg = mnlm(counts = as.factor(train_ml$memb), penalty = 1, covars = train_ml[ ,2:18]);

prob = predict(gg, test_ml[ ,2:18]);

head(prob);

pred = matrix(0, nrow(test_ml), 1);

accuracy = matrix(0, nrow(test_ml), 1);

for(j in 1:nrow(test_ml)){

pred[j, 1] = which.max(prob[j, ]);

if(pred[j, 1]==test_ml$memb[j]) {accuracy[j, 1] = 1}

}

mean(accuracy)
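Btw, the loop above can be collapsed into a couple of vectorized lines, and a confusion table is handy for seeing where the misclassifications sit. A sketch, assuming the 'prob' matrix and 'test_ml' from above:

pred = apply(prob, 1, which.max) # most probable segment, per respondent
mean(pred == test_ml$memb) # hit rate - same as mean(accuracy) above
table(predicted = pred, actual = test_ml$memb) # confusion matrix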

You'll see something like this (but not exactly the same thing, because the training and test samples were randomly chosen):

Look at the probabilities table given. The table tells us the probability that respondent 1 (in row 1) belongs to segment 1, 2, 3 or 4. We get maximum probability for segment 1, so we say that respondent 1 belongs to segment 1 with a 61% probability. In some cases, all the probabilities may be less than 50%; if so, just take the maximum and assign the respondent to that segment. Now we let loose the logit algorithm on the test sample. The algo comes back with its predictions. In the real world, we would go by what the machine says. But in this case, since we have the actual segment memberships, we can validate the results. This is what I got when I tried to assess the accuracy of the algo's predictions:

So, the algo is able to predict with 60%-odd accuracy - not bad, considering that random allocation would have given you at best a 25% success rate. Besides, this is simple logit - more sophisticated algos exist that can do better, perhaps even much better.

That's it for now. Will put up the HW for this session in a separate update here (deadline now is 9-Nov, Saturday midnight) - watch this space.

Session 5 HW update:

There will be no HW for session 5. I figure I can combine the segmentation and targeting bits into the session 6 HW.

Sudhir

Some HW and Project related announcements

Hi all,
A few quick admin announcements follow, each separated by a line of hash marks.

###############################################

Mailbag:

I received the following email from a group today. My response is below and is applicable to all groups in the class.
Hello Professor –
Our group has assignments lined up for tomorrow and a 2 hour pre-placement talk by Microsoft as well.
We would like to request you to postpone the focus group assignment submission to Friday morning, if possible. Thanks,
A

My response:
Hi A,
I understand your schedules are very busy. I must point out that the HW was released a week ago. Granting your team an extension wouldn't be fair to other teams that have planned and prepared to meet the deadline on time. Hence, to find some middle ground, I suggest you submit by Saturday noon by accepting a small points penalty. For instance, while we grade HWs out of 10 (later, we will normalize and reweight), late submissions can be graded out of 8 or 9. I would rather folks submit late than not submit at all, or push through a rush-job of suspect quality.

I'll post this message on the blog to share with everybody.

Regards,
Sudhir

###############################################

Re the project proposal submission:

The project proposal submission presents a weird problem, something akin to a chicken-and-egg, in that I'm asking you to plan your project even before you've been exposed to all the MKTR tools in the course. In particular, two tools of interest - perceptual mapping and text analytics - could easily find a place in, and add value to, many of the project proposals that may get submitted by tomorrow. Hence, I would rather wait until you've been exposed to the rather powerful perceptual mapping tool before you rush to submit your project proposals [text analytics comes in only in session 9, and that's too long to wait for]. Your project proposal submission deadline has hence been extended to Saturday midnight - no harm, no foul. I'd rather your project be more thoughtful and directly applicable in business scenarios than would be possible with a constrained proposal in a rushed submission.

Regards,
Sudhir

###############################################

Re the external speaker:

Mr Milind Chigupaker is a former IBM analyst who rose through the ranks to make it first to mid-management and then to the opening rungs of senior management, before he quit to focus on his own venture. I'm honored to be able to say I'm a co-founder in the venture - Modak Analytics - which does (surprise, surprise) high-dimensional predictive analytics. The talk is set for 10-Nov Sunday, 4-6 pm.

Sudhir

###############################################

Re the Session 5 HW:

Sorry about the delay in posting the HW. Henceforth, all HWs will be due on the Saturday of the following week. So Session 5 HW will now be due by 9-Nov, midnight. I would, however, strongly encourage folks not to wait till the 11th hour, if possible.

Sudhir

###############################################

Saturday, October 26, 2013

Project Related Announcements - Proposal Template and Grading Criteria

Hi all,

Please find listed here a quick 5-7 slide template for submission of your project proposals on PPT. To recap, the project proposal PPT is due in its dropbox by the start of Session 6.

1. Slide 1 - Title Slide
Give the project an informative title and list the names & PGIDs of the team members working on it.
For example, for the problem statement we have in Session 2's construct mapping homework, we could give a title like "Assessing E-tailing's appeal among the young and upwardly mobile" or something like that.

2. Slide 2 - Management problem
Pls give a condensed problem statement describing the essential context of the management problem. For instance, the problem statement used for the questionnaire design homework can be used as a base. Further condense it as required to fit it comfortably onto one slide.
More generally, I would encourage teams to base their management problems on real world concerns. Pick up any business magazine or newspaper and chances are around 10% of the articles may describe problem contexts for specific firms or industries that you could modify and adapt for your project.

Here are a few articles I saw from which management problem contexts can be seen emerging:

(a) The appeal of 3D movies can lead to a survey estimating preference for -> willingness to pay for -> likely demand for movies in the 3D format among a particular target segment. Here's the management problem, sourced from the Economist (July 2011): The appeal of 3D movies - Cinema's great hope


(b) Here's an interesting possibility that requires folks to look at reeeally new products - akin to forecasting email's effect on postal services in 1994 - the impact on small-scale manufacturing of 3D printing services. This too is sourced from the Economist (Dec 2011)

(c) Here's a desi innovation that might get a huge fillip in demand as demographics start to favor it. Demand soars for "house-call doctor services" for the elderly and the chronically infirm. Source: Economic Times, 2012.

(d) Here's an interview with the boss of the Cafe Coffee Day chain, in which he describes some interesting-looking initiatives CCD is taking in trying to leverage Facebook and other social media to provide speedy feedback on CCD ops nationwide, etc.

And so on. These are merely a few examples. There's no dearth of good management problems to find.

3. Slide 3 - Decision problem
Condensing the management problem to a decision problem (D.P.) is tricky stuff, as we saw in Session 1. Make appropriate assumptions and achieve this step. State the decision problem chosen in clear words. If possible, list also a few alternative D.P. statements that were considered but not chosen.

4. Slide 4 - Research objectives (R.O.s)
Ensure your set of R.O.s "cover" the D.P. in that they address the central question(s) raised by the D.P. State the R.O. in the prescribed format (lookup session 1 slides for this).

5. Slide 5 - Tools mapping
Map the R.O.s onto particular MKTR tools (or sets of tools in a particular sequence). You can refer to the MKTR toolbox for starters but are free to choose tools from outside that toolbox as well. Just to be clear, "tools" here refers to methods or approaches being followed - e.g., the survey method, secondary analysis or experiments are all "MKTR tools" for us.

Important Note:
I understand that as the course progresses and you come in contact with newer tools, you may want to revise your project proposal a bit here and there. Limited changes to the proposal are permissible. Further, you are free to take any one of the HWs you have done and build upon it, expand it to make your project out of it - perfectly fine. FYI.

*********************************************

In case you are wondering what the grading criteria may be for the project (as that might influence what project you finally choose), then let me outline some thoughts on these criteria based on what I have used in the past. These criteria are indicative only and are not exhaustive. However, they give a fairly good idea of what you can expect. Pls plan such that your chosen problem can yield enough material such that your deliverable of <30 slides, due before the beginning of term 6, doesn't lack substance in these broad areas.
(i) Quality of the management problem context chosen - How interesting, relevant, forward looking and do-able it is within the time and bandwidth constraints we are operating under, in the course.

(ii) Quality of the D.P.s chosen - How well it aligns with and addresses the business problem vaguely outlined in the project scope document; How well can it be resolved given the data at hand. Etc.

(iii) Quality of the R.O.s - How well defined and specific the R.O.s are in general; how well the R.O.s cover and address the D.P.s; how well they map onto specific analysis tools; how well they lead to specific recommendations made to the client in the end. Etc.

(iv) Clarity, focus and purpose in the Methodology -  Flows from the D.P. and the R.O.s. Why you chose this particular series of analysis steps in your methodology and not some alternative. The methodology section would be a subset of a full fledged research design, essentially. The emphasis should be on simplicity, brevity and logical flow.

(v) Quality of Assumptions made - Assumptions should be reasonable and clearly stated in different steps. Was there opportunity for any validation of assumptions downstream, any reality checks done to see if things are fine?

(vi) Quality of results obtained - the actual analysis performed and the results obtained. What problems were encountered and how did you circumvent them. How useful are the results? If they're not very useful, how did you transform them post-analysis into something more relevant and useable.

(vii) Quality of insight obtained, recommendations made - How all that you did so far is finally integrated into a coherent whole to yield data-backed recommendations that are clear, actionable, specific to the problem at hand and likely to significantly impact the decisions downstream. How well the original D.P. is now 'resolved'.

(viii) Quality of learnings noted - Post-facto, what generic learnings and take-aways from the project emerged. More specifically, "what would you do differently in questionnaire design, in data collection and in data analysis to get a better outcome?".

(ix) Completeness of submission - Was sufficient info provided to track back what you actually did, if required - preferably in the main slides, else in the appendices? For instance, were Q no.s provided for the inputs to a factor analysis or cluster analysis exercise? Were links to appendix tables present in the main slides? Etc.

(x) Creativity, story and flow - Was the submission reader-friendly? Does a 'story' come through in an interconnection between one slide and the next? Were important points highlighted, cluttered slides animated in sequence, callouts and other tools used to emphasize important points in particular slides and so on.

OK. That's quite a lot already - I don't want to spook anybody this early in the course (or later in the course, for that matter). However, let none say that they didn't know how the project would be viewed and graded before making their project proposals.

Sudhir

Thursday, October 24, 2013

Session 4 Updates for the Co2014 Hyderabad

Hi all,
Session 4 Big-picture Recap:
We're done with a readings-heavy Session 4, 'Qualitative Research', today. To recap the four big-picture take-aways from the session, let me use bullet points:
  • We studied Observation Techniques - both of the plain vanilla observation (Reading 1 - Museums) and the 'immersive' ethnographic variety (Reading 2 - adidas).
  • We then ventured into deconstructing the powerful habit-formation process and arrived at a 3-step loop framework to describe it for marketing purposes: cue-routine-reward.
  • We saw how the innovative combination of qualitative insight and predictive analytics can lead to windfall $$ profits (Reading 3 - Target, and Reading 4 - Febreze).
  • Finally we saw how unstructured respondent interaction personified by a focus group discussion (FGD) can be a powerful qualitative tool for digging up customer insights.
Session 4 HW details:

Pls read the contemporary caselet given below, from last month's Economist magazine. It deals with a story on how to help professional chess gain fans: A sporting chance - how fans plan to revive chess.
IN LONDON in April, a 22-year-old Norwegian turned cartwheels by the Thames. Magnus Carlsen, the world’s top-ranked chess player (and a model for G-Star RAW, a fashion firm) had just earned the right to challenge for the World Chess Championship in India next month. His battle against Viswanathan Anand, a 43-year-old Indian and world champion since 2007, is a long-awaited spectacle. Match organisers see a chance to turn a struggling sport into a global brand.
Time was when the world stopped for professional chess. Millions watched Bobby Fischer, an American, beat the Soviet Union's Boris Spassky in 1972. In the 1990s a pair of matches between Garry Kasparov and Deep Blue, a computer, recaptured some of that suspense. Yet despite booming interest in the amateur game, top-level chess has become obscure again, hobbled by squabbles and eccentric leadership.

Enthusiasts spy a comeback. Last year Andrew Paulson, an American businessman based in London, bought rights to stage the game's most prestigious contests, including November's duel. For $500,000 the World Chess Federation (FIDE) granted Mr Paulson media and marketing licences for a decade—and the chance to make chess a profitable enterprise.

The game itself has plenty of fans. Research in five countries by YouGov, a pollster, found that more than two-thirds of adults have played at least once. FIDE says 605m do so regularly. In India, where Mr Anand is a national hero, nearly a third of adults claim to play every week. The internet and smartphones mean novices no longer need a friend to play. Susan Polgar, a Hungarian-American grandmaster, says about 35 countries include chess in school curricula.

But grassroots enthusiasm has not raised the profile of the professional game. Critics gripe about mercurial decision-making within FIDE. The sport's governing body gets by on some $2m a year (FIFA, football's federation, spent more than $1 billion in 2012) and has had only two presidents in 31 years.



So Mr Andrew Paulson wants to do to world chess what Kerry Packer did to world cricket, perhaps... The challenge clearly is to kickstart the virtuous cycle of more fans --> more advertiser interest --> more ad revenues --> more talent drawn in --> more competition --> more fans... How to get this cycle going is the Q. Clearly, the topic isn't as esoteric as it may first look. Football and hockey in India suffer too from trying to kickstart this virtuous cycle in the shadow of big brother Cricket, for instance.

The FGD then is partly a brainstorming session (from the firm's viewpoint) and partly an opining & evaluation exercise (from a potential fan's viewpoint) on various possibilities. It may involve first trying to define what chess is or means to fans, what more it can come to represent, how to expand and leverage its present connect, etc. Analogies with other sports such as cricket may come in as well, perhaps... I don't want to write more and bias the storylines you may come up with.
A deeper challenge is that watching chess is less fun than playing it. A single game can last six hours; its most riveting moment may be a strategic nuance known as the Yugoslav variation on the Sicilian. “Good chess leads to draws,” says Maurice Ashley, an American grandmaster.
Mr Ashley believes that new game and tournament formats could attract a wider audience. Competitors in blitz chess must finish their games in half an hour. Matches lasting minutes make popular footage online. Yet many players resist fast games, arguing that they reward low-quality chess. FIDE's enthusiasm for shorter championships in the 1990s and 2000s prolonged the professional game's split.

Lengthy duels could still flourish if packaged well. Golf's slow pace does not stop big audiences following four-day tournaments; in the cricket-playing world, witty commentary keeps fans tuned to games that last five days. Lately ESPN, a broadcaster, has turned poker, spelling bees and Frisbee-flinging (see article) into tense, dramatic television.

Mr Paulson, who made a fortune in Russian internet ventures, says chess matches can make "heart-gripping, heart-pounding entertainment". (He is standing for president of the English Chess Federation on October 12th.) He plans more competitions in big cities beyond Russia and eastern Europe, where many now take place. In March he launched ChessCasting, a web application that offers statistics and commentary on big events as well as discussion boards for amateur pundits. He talks of reporting competitors' sweating, eye movement and heart rate.

Chess needs deep-pocketed backers to complete this transformation. Mr Paulson thinks firms will want to associate with a game that is "clean, pure and meritocratic". But he has not yet announced any big new sponsors. "One mistake has been assuming it would be easier," he says. A cartwheeling world champion might help.



Submission format:
  • Title slide of your PPT should have your group name, member names and PGIDs
  • Next slide write your D.P. and R.O.(s) clearly.
  • Third slide, introduce the FGD participants and a line or so on why you chose them (tabular form is preferable for this)
  • Fourth Slide, write a bullet-pointed exec summary of the big-picture take-aways from the FGD
  • Fifth Slide on, describe and summarize what happened in the FGD
  • Note if unification and / or polarization dynamics happened in the FGD
  • Name your slide groupname_FGD.pptx and drop in the appropriate dropbox by the start of session 6
Any queries etc., pls feel free to email me.

Update 1: FGD HW guidelines:

A few more thoughts and guidelines on the FGD HW for this session. To keep it focussed and brief, lemme use the bullet-point format:
  • The point of the FGD is *not* to 'solve' the problem, but merely to point a likely direction where a solution can be found. So don't brainstorm for a 'solution', that is NOT the purpose of the FGD.
  • Ensure the D.P. and R.O.s are aligned and sufficiently exploratory before the FGD can start. Different R.O.s lead to very different FGD outcomes. For example, defining your R.O. as "Explore which advertising themes enable high levels of fan-connect" versus "Explore potential fan-connect across different chess formats" would take the FGD in quite different directions.
  • Keep your D.P. and R.O. tightly focussed, simple and do-able in a mini-FGD format. Having too broad a focus or too many sub-topics will lead nowhere in the 30 odd minutes you have.
  • Start broad: Given an R.O., explore how people connect with or relate to sports in general, their understanding of what constitutes a 'sports fan', their understanding of what constitutes 'excitement', memorability', 'social currency' or 'talkability' in a sport and so on. You might want to start with *sports* in general and not narrow down to chess right away (depending on the constructs you seek, of course).
  • Prep the moderator well: The moderator in particular has a crucial role. Have a broad list of constructs of interest, and focus on getting them enough time and traction (without being overly pushy). For example, the mod could start by asking the group: "What do you think connects people to sports?" to get the ball rolling, then steer it to keep it on course.
  • Converge on Chess in detail: After exploring sports in general, explore the particulars of chess as a sport - what is it, how is it viewed or understood, what is the perception of people who play or follow the game, how can it be made more trendy etc.
  • Do some background research on chess and its history first. Know what different game formats have been proposed and tried. E.g., apart from 'blitz chess', there is 'chess by jury' in which two groups of people individually vote for the best next move and the move with the highest votes is played on giant screens, etc.
  • See where people agree in general, change opinions on interacting with other people on any topic, disagree sharply on some topics and stand their ground etc.
  • In your PPT report, mention some of the broad constructs you planned to explore via the FGD.
  • Report (among other things) which directions seem most likely to be fruitful for investigation.
External speaker talk dates:

Would've gone with next week Sunday after the midterms, but it's Diwali. Now I'm thinking of Friday 8-Nov, Sat 9-Nov or Sun 10-Nov, evening between 4 and 6 pm. If you have strong objections to this, pls let me know. Else I will pick 10-Nov Sunday, 4-6 pm, as the tentative date for the external speaker.

Session 5 preview:

We'll cover Segmentation and Targeting in session 5. This is both a theory- and R-heavy session. There'll be illustrative dummy examples as well as live exercises galore.

That's it for now, more later.

Friday, October 18, 2013

Session 2 Updates

Hi all,

Session Recap:
Session 2 got done today. We ventured into psychometric scaling and attempted to measure complex constructs using the Likert scale, among others. We also embarked on a common-sensical approach to survey design.

Group Formation details:
  • Regarding group formation, pls send an email (only one per group) to the AA in the format prescribed.
  • To help in group formation, pls find the list of all final registrants for MKTR_145 in this google spreadsheet.
  • If you are unable to find group partners and would like to be allotted to a group, let the AA know.
  • Choose a well-known brand as your group name and write your group name next to your PGID and name in the spreadsheet.
  • Pls complete the group formation exercise well before the start of session 3.


Session 2 Analytical HW format:
  • Use a plain white blank PPT.
  • On the title slide, write the names of the 2 constructs you have chosen, followed by your name and PGID.
  • For each construct chosen, (i) write a few lines explaining the context of the research (e.g., "Volkswagen is considering entering India and wants to set up a plant to manufacture its cars here for the local market. The firm wants to know the best products and target segments to go with.")
  • (ii) Next, in a fresh slide, list the major aspects that you believe make up the construct.
  • (iii) In a fresh slide, make a table with two columns. In the first column, write Likert statements that draw upon the aspects you listed. In the second column, write which aspect(s) the Likert statement helps to elucidate.
  • Save the slide deck as session2HW_yourname.ppt and put it in the dropbox on LMS before the start of session 4.
Session 2 Survey-taking HW:
Pls take these two surveys positively by midnight on Sunday. I don't expect they'll take more than 10-12 minutes each. I will use the responses collected in the sessions to come for segmentation, factor analysis and perceptual-mapping purposes in the classroom on R.
Session 2 HW Survey 1

Session 2 HW Survey 2

Session 3 Preview:
In Session 3 we cover two broad topics. For the first, we continue the "principles of survey design" part and wade into Questionnaire Design Proper. Be sure to read the pre-read on Questionnaire Design as it covers the basics. It'll thus help lighten my load considerably in class. And who knows if there's another pre-reads quiz lurking somewhere in Session 3 as well...

For the second broad topic, we do Data Reduction via Factor Analysis. For this we'll need R. R can be downloaded and installed from the following CRAN link for both the Windows and Mac versions:

Download and Install R

If you have any trouble with R installation, contact IT and let me know. Watch this space for more updates.
That's it for now. See you in class on Tuesday if not in the Atrium for the Dandyia Gala.
Sudhir

Tuesday, October 15, 2013

Welcome to the Class of 2014 (Hyderabad Campus) to MKTR

Hi Co2014,

Welcome to MKTR.

The first session got done yesterday. We covered some subject preliminaries and the crucial task of problem formulation.

Some Qs remained on problem formulation, in particular one on "how exactly does one move from the management problem ('messy reality') to the decision problem or D.P.?"

Well, there are no hard and fast rules and in general, the process is *very* context-dependent. I do intend to cover a few more examples in class to better illustrate this point.

Reading for session 2: "What is Marketing Research" (HBS note 9-592-013, Aug 1991).

About pedagogy going forward
The parts of the pedagogy that make MKTR distinctive: pre-read quick-checks using index cards, in-class reads, this blog, and R.

About this blog
This is an informal blog that concerns itself solely with MKTR affairs. It's informal in the sense that it's not part of the LMS system and the language used here is more on the casual side. Otherwise, it's a relevant place to visit if you're taking MKTR. Pls expect MKTR course-related announcements, R code, Q&A, feedback etc. on here.

Last year, students said that using both LMS and the blog (not to mention email alerts etc.) was inefficient and confusing. I'm hence making this blog the single-point contact for all relevant MKTR course info henceforth. LMS will be used only for file transfers, and email notifications will go to the class only in emergencies. Each session's blog post will be updated with later news coming at the top of the post. Kindly bookmark this blog and visit regularly for updates.

About Session 2
Session 2 deals with psychographic scaling techniques and delves into the intricacies of defining and measuring "constructs" - complex mental patterns that manifest as patterns of behavior. This session also sets the stage for questionnaire design to make an entry in Session 3.

There will be a series of 3 homework assignments for session 2. These concern merely filling up surveys that I will send you (this data we will use later in the course for perceptual mapping and segmentation). Nothing too troublesome, as you can see.

Any Qs etc, pls feel free to email me or use the comments section below.

Sudhir Voleti