Thursday, December 20, 2012

Session 8 - Annotations

Hi all,

In the Regression Modeling module in Session 8, we talked about how quadratic terms can capture simple curves. Here is an R simulation to demonstrate the same, hopefully with greater clarity.

Here is what the code does. It randomly draws 100 numbers between 0 and 2 (into vector x1), creates a quadratic term vector x2 (the square of x1), and then plots 4 scenarios in which a vector and its quadratic term together give rise to different simple curve shapes. Read the code as you copy-paste it and, as you walk the data through the processing, the point of the exercise should become clear.

## --- Quadratic Terms Effect: Demo --- ##

# draw 100 uniform random numbers between 0 and 2 into vector x1
x1 = runif(100)*2

# x2 is the quadratic term (or square) of x1
x2=x1*x1
x3 = cbind(x1,x2) # x3 is a matrix with columns x1 and x2
x3 = x3[order(x1),] # sort rows of x3 in ascending order of x1, so the line plots draw cleanly

# different variations of outcome y
op <- par(mfrow = c(2,2), oma = c(0, 0, 3, 0))

# when x1 has a +ve and x2 a -ve coeff
y = x3[,1] - 0.5*x3[,2];
plot(y=y,x=x3[,1],type="b", main="(A) First +ve then -ve")

# when x1 has -ve and x2 a +ve coeff
y= -1*x3[,1] + 0.5*x3[,2];
plot(y=y,x=x3[,1],type="b", main="(B) First -ve then +ve")

# when x1 has +ve and x2 a +ve coeff
y= x3[,1] + 0.5*x3[,2];
plot(y=y,x=x3[,1],type="b", main="(C) Both +ve")

# when x1 has -ve and x2 a -ve coeff
y= -1*x3[,1] - 0.5*x3[,2];
plot(y=y,x=x3[,1],type="b", main="(D) Both -ve")

mtext("Capturing simple curves using Quadratic terms", outer = TRUE, cex=1.2)

par(op)

This is the image I got.

Observe how changing the signs changes the curve type captured in panels (A) through (D) of the figure. With coefficient b1 on x1 and b2 on x2, the curve turns at x = -b1/(2*b2); in panel (A), for instance, that is at x = 1, which is why y rises and then falls over [0,2]. In summary, quadratic terms give a lot of flexibility to capture nonlinear shapes, which is useful when modeling secondary data in MKTR. Higher-order polynomials (cubes, etc.) can capture even finer and more complex curve shapes.
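
To connect this to actual estimation (my own add-on, not something we ran in the session): in a regression, the quadratic term simply enters as one more regressor. Here is a minimal sketch with lm() on simulated data mimicking panel (A); the variable names are illustrative.

## --- fitting a quadratic regression with lm(): illustrative sketch --- ##

# simulate noisy data from scenario (A): +ve linear, -ve quadratic coeff
x = runif(100)*2
y = x - 0.5*x^2 + rnorm(100, sd = 0.1)

# I(x^2) makes lm() treat the square as a regressor in its own right
fit = lm(y ~ x + I(x^2))
summary(fit) # estimates should come out near +1 (for x) and -0.5 (for x^2)

# overlay the fitted curve on the raw scatter
plot(x, y, main = "Quadratic fit via lm()")
xs = seq(0, 2, length.out = 100)
lines(xs, predict(fit, newdata = data.frame(x = xs)), col = "red")

The sign of the I(x^2) coefficient tells you which way the curve bends, and its significance whether the curvature is worth keeping in the model.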

***************

There was a Q about the rationale or intuition behind the chi-squared test. I'd say, start with the Wikipedia page on the common Pearson's chi-squared test.
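
Before the simulation, here is a toy sketch of what the test statistic itself is (my illustration, with made-up counts): a sum over cells of (observed - expected)^2 / expected, which under the null follows a chi-squared distribution.

## --- Pearson chi-squared statistic: toy example --- ##

# hypothetical 2x2 table of counts (say, segment x brand choice)
obs = matrix(c(30, 20, 10, 40), nrow = 2)

# expected counts under independence: row total * col total / grand total
expected = outer(rowSums(obs), colSums(obs)) / sum(obs)

# the Pearson statistic: sum of (O - E)^2 / E over all cells
sum((obs - expected)^2 / expected)

# chisq.test() computes the same (correct=FALSE switches off Yates' correction)
chisq.test(obs, correct = FALSE)

The R code below then simulates chi-squared distributions of varying degrees of freedom (from 1 to 8 in this case), showing where that reference distribution comes from.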

## --- chi-squared distribution simulation on R --- ##

# define list Z, each element holding 1000 std normal draws
Z = list()
for (i1 in 1:8){ Z[[i1]] = rnorm(1000) }

# column-bind the 8 draws into a 1000 x 8 matrix Z1
Z1 = Z[[1]]; for (i1 in 2:8) { Z1 = cbind(Z1, Z[[i1]]) }

# create chi-squared draws C for varying deg of freedom:
# column k of C is a sum of k squared std normals, i.e. a ChiSq(k) draw
Z2 = Z1*Z1
C = Z2
for (i1 in 2:8){ C[,i1] = rowSums(Z2[,1:i1]) }

# plot the simulated densities, one colour per deg of freedom
par(mfrow = c(1,1))
plot(density(C[,1]), col = "black", type = "l", main = "ChiSq distbn simulation in R")
for (i2 in 2:ncol(C)){
  lines(density(C[,i2]), col = i2)
}

Compare this with the plot on the Wikipedia page. The point of the exercise was to demonstrate how the chi-squared distribution is constructed - as a sum of squares of standard normal variates. This is strictly part of the stats core and I don't want to go any further into it here. However, folks with more specific Qs or in need of clarification are welcome to drop by and contact me.
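
A quick in-R check of the same point (again, my addition): run this right after the plotting loop above to overlay the exact densities from dchisq() on the simulated curves. Dashed and solid lines of the same colour should roughly coincide.

## --- overlay theoretical densities on the simulated plot --- ##

xs = seq(0.01, 25, length.out = 200)
# dashed lines: exact ChiSq densities for df = 1 to 8
# (the df = 1 curve spikes near zero and may get clipped by the plot limits)
for (df in 1:8) { lines(xs, dchisq(xs, df = df), col = df, lty = 2) }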

Dassit for now. Need to work on your Session 7 and 8 HWs and prepare the end-term paper and so on.

Sudhir
