Sunday, November 25, 2012

Some project related Q&A

Hi all,

I've been receiving project-specific emails, but some of the answers are general enough to benefit from wider dissemination. So here goes.

Hello Professor,

We are currently working to finish our marketing research assignment. We have a query about the perception output we are getting from our survey.

We got 3 segments using mclust. We still haven’t interpreted them. However, we used this segmented data to draw our perception maps for each segment. I am attaching the outputs with the email. We are not sure how to interpret the perception map for segment 1. Please help us to interpret this output.

Also, please have a look at the segment 2 and 3 outputs as well, and please confirm if they make sense to you.

Many Thanks!
G

My response:
Hi G and team,

The blue arrows should be the attributes and the red dots the brands. Your maps seem to be the other way round.

Please use the transpose of mydata (‘t(mydata)’ in R) as input to the JSM procedure. The other option is to do it in MEXL.

Interpretation should be straightforward after that, I think.
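
As a minimal sketch of what the transpose does: assume mydata holds ratings with attributes in rows and brands in columns. The brand names, attribute names and numbers below are made up for illustration; only t() itself is standard R.

```r
# Hypothetical perception ratings: rows are attributes, columns are brands
mydata <- data.frame(
  BrandA = c(4.1, 3.2, 2.5),
  BrandB = c(2.8, 4.5, 3.9),
  row.names = c("price", "quality", "style")
)

dim(mydata)        # 3 rows (attributes) by 2 columns (brands)

# t() swaps rows and columns and returns a matrix
mydata_t <- t(mydata)
dim(mydata_t)      # 2 rows (brands) by 3 columns (attributes)
rownames(mydata_t) # the brand names are now the row names
```

If the arrows and dots on the map come out swapped, feeding in the transposed table is the quick check to try.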

Sudhir

***************************

Here's my response to one team that ran into clustering issues.

Hi S and team,

Some quick observations from my side:

1. Why factorize the demographics? Usually these would be used as discriminants. But if you’ve good reason to do this and the factors that emerge make sense, then fine, I guess.

2. What psychographics did you use and how many Qs do you have? Ideally, they would revolve around lifestyle habits. If so, and if factorizing them yields useful constructs, then use the factor scores of the factors of interest as downstream variables for your cluster analysis. In R code terms, we’d have:

# view & save factor scores #
fit$scores[1:4,]   # view factor scores

# save factor scores (if needed)
write.table(fit$scores, file.choose())

The fit$scores object contains the factor scores.

3. Now, you needn’t even use all the factors that emerge, just the ones that seem meaningful. For the rest of the variables (those loading onto factors that do not seem interesting), you can use the variables as-is, instead of their factor scores, in the downstream cluster analysis.

4. Typically, one would want to segment on the basis of needs or benefits sought. All other observed characteristics of consumers would become discriminant variables. Demographics usually fit this bill, as does purchase history data if it exists. I would suggest trying a logit:

segment membership = f(demographics, other factors)

to see if there is any systematic relation between the needs-based psychographic segments you got and the observed discriminant variables. Even if there is none, try profiling the segments that emerge by eyeballing the centroids of the demographics of those segments to see if there is any link between the two.

5. From segmentation can flow segment sizes and from segment profiles can emerge campaign ideas for selling to different segments. And so on and so forth.
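
The steps above can be sketched end to end as follows. This is only an illustration on simulated data: the variable names (q1 to q6, age, income) are hypothetical, kmeans() stands in for whatever clustering routine you actually used, and multinom() from the nnet package is one way (an assumption here, not the only way) to fit the logit of segment membership on demographics.

```r
library(nnet)  # for multinom(); nnet ships with standard R distributions

set.seed(1)
n <- 150

# Simulated psychographic items driven by two latent traits (all hypothetical)
f1 <- rnorm(n); f2 <- rnorm(n)
psy <- data.frame(
  q1 = f1 + rnorm(n, sd = 0.5), q2 = f1 + rnorm(n, sd = 0.5),
  q3 = f1 + rnorm(n, sd = 0.5), q4 = f2 + rnorm(n, sd = 0.5),
  q5 = f2 + rnorm(n, sd = 0.5), q6 = f2 + rnorm(n, sd = 0.5)
)
# Simulated demographics, kept aside as discriminant variables
demo <- data.frame(age = round(runif(n, 20, 60)), income = rnorm(n, 50, 10))

# Factor analysis; factanal() only returns scores if they are requested
fit <- factanal(psy, factors = 2, scores = "regression")
fit$scores[1:4, ]   # view factor scores

# Cluster on the factor scores (kmeans used here as a stand-in)
km <- kmeans(fit$scores, centers = 3, nstart = 10)
segment <- factor(km$cluster)
table(segment)      # segment sizes

# Logit: segment membership = f(demographics)
mfit <- multinom(segment ~ age + income, data = demo, trace = FALSE)
summary(mfit)

# Profile segments by eyeballing the demographic centroids
centroids <- aggregate(demo, by = list(segment = segment), FUN = mean)
centroids
```

Because the simulated demographics are unrelated to the factors, the logit here should show no systematic relation, which is exactly the case where eyeballing the centroids is the fallback.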

Hope that helped.

Sudhir

**************************

This one from a team having mlogit issues:

Dear Professor,

We are facing some issues in identifying factors in R using mlogit. We were wondering if we can seek 10 minutes of your time to help us run the code for R to get the result. Please let us know what would be a good time today to have this discussion.

PS: Attached is the file that we are trying to work with. The columns on the left are the factors that impact the right most column “contrib.”. We wanted to run Mlogit to assess which factor has the max impact. But the code on the blog was for discriminant analysis.

Regards,

N

My response:
Folks,

This is straightforward regression, no need to go logistic here.

The dependent variable ‘contrib’ appears to be continuous. So, after you define the conceptual model:

Y = f( X1, X2,…)

Just run a simple linear model for f(.), or perhaps some variation of a linear model: log-log, quadratics, interaction terms and the like. Results too are easy to interpret for regressions.
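
A minimal sketch of those variations with lm(), on simulated data. Only ‘contrib’ comes from your file; the predictors x1 and x2 and all the numbers are made up for illustration.

```r
set.seed(42)
n <- 200

# Hypothetical data: 'contrib' is continuous; x1 and x2 are made-up drivers
dat <- data.frame(x1 = runif(n, 1, 10), x2 = runif(n, 1, 10))
dat$contrib <- exp(0.5 + 0.8 * log(dat$x1) + 0.3 * log(dat$x2) +
                   rnorm(n, sd = 0.2))

# Plain linear model
fit_lin <- lm(contrib ~ x1 + x2, data = dat)

# Log-log model: slope coefficients can be read as elasticities
fit_log <- lm(log(contrib) ~ log(x1) + log(x2), data = dat)

# Quadratic and interaction terms
fit_quad <- lm(contrib ~ x1 + I(x1^2) + x2 + x1:x2, data = dat)

summary(fit_log)   # coefficients, standard errors, R-squared
```

Comparing the summary() output across the three fits (signs, significance, adjusted R-squared) is usually enough to decide which form to report.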

Sudhir

********************************

Another one on mlogit:

Dear Sir,
I am completely stuck at mlogit. I have made my file: the 1st col is the dependent variable (binary), followed by 17 cols of independent variables.
Whenever I try to read the data (be it csv or notepad), it shows the following error:

discrim = read.table(file.choose(), header=TRUE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 1 did not have 13 elements
> dim(discrim); discrim[1:17,]
Error: object 'discrim' not found
> discrim = read.table(file.choose(), header=TRUE)
Error in file.choose() : file choice cancelled
> dim(discrim); discrim[1:17,]
Can you pl suggest what I am doing wrong, and what it means (discrim not found)?
D

My response:
Hi D,

Some causes that come immediately to my mind:

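1. The file wasn't read in properly into 'discrim'. Most likely there are spaces in the column headers, or some variables are character strings containing spaces, so R counts a different number of fields on different lines.

2. In such a scenario, it is always safer to read in from a .csv file (e.g., with read.csv) rather than a .txt one.

3. As long as this error is showing, the file wasn't read in at all. That is also why R then says the object 'discrim' is not found: the object was never created.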

"Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 1 did not have 13 elements"

It's typically good practice to view a few rows of the file just after reading it in, to see if all is well.

discrim[1:5,] etc.

4. 17 X variables in a logit is quite a lot. Ensure you have a sufficient amount of data (#observations) to sustain such an estimation exercise. It might take a few minutes of run-time on R.
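
To make points 1 to 3 concrete, here is a small sketch that reproduces this kind of read error on a made-up file and then reads it cleanly. The file contents and column names are hypothetical; count.fields(), read.table() and read.csv() are standard R.

```r
# Write a small hypothetical CSV whose header contains a space
tmp <- tempfile(fileext = ".csv")
writeLines(c("buy,age,monthly income",
             "1,34,52000",
             "0,45,61000",
             "1,29,48000"), tmp)

# count.fields() shows how many fields R sees on each line for a separator
count.fields(tmp, sep = "")    # whitespace: header and data rows disagree
count.fields(tmp, sep = ",")   # comma: every line has the same count

# read.table() splits on whitespace by default, so it is likely to fail here
# with a "did not have N elements" message; tryCatch() captures the message
res <- tryCatch(read.table(tmp, header = TRUE),
                error = function(e) conditionMessage(e))
res

# read.csv() splits on commas and reads the same file without trouble
discrim <- read.csv(tmp, header = TRUE)
dim(discrim)
discrim[1:3, ]   # view a few rows just after reading in
str(discrim)     # check that each column has the type you expect
```

Note that read.csv() turns the header "monthly income" into the column name monthly.income, so it is worth checking names(discrim) before writing model formulas.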

Good luck. I'm off to Mohali tomorrow but shall be available over email. I'd prefer the blog for such Q&A in future.

Hope this helps. Let me know either way.

Sudhir

More as they come.
