Problem Set II Solution

For this problem set you have to use the data in the ASCII file yogurt 2018.txt. The data consist of observations on 429 households making 2567 yogurt purchases. Each purchase is of one of three brands. The five variables in the data set are (i) the household id, running from 1 to 430, (ii) the choice made by the household, running from 1 to 3, (iii) the price, in cents, of yogurt brand 1 faced by that household when they made their decision, (iv) the price, in cents, of yogurt brand 2, and (v) the price, in cents, of yogurt brand 3.







Let j index the choice, running from 1 to 3, t index the purchase, running from 1 to Ti, and i the household, running from 1 to 429. The number of purchases made by each household differs. For example, the first two purchases come from household 1, the next two from household 2, and the next eight from the third household.




We focus on a discrete choice model where the utility for individual i associated with choice j, in purchase t is







$$U_{ijt} = \alpha_j + \beta \cdot P_{ijt} + \varepsilon_{ijt},$$







where Pijt is the price of brand j for household i at purchase time t. We assume the εijt are independent across time, choice and household, with an extreme value distribution. Normalize α1 = 0, so that there are three free parameters: α2, α3, and β.







Imbens, Problem Set II, MGTECON640/ECON292 Fall ’18

First, calculate the mean price for each brand, by the choice made. That is, calculate the average price of brand 1, 2 and 3 for households choosing brand 1, calculate the average price of brand 1, 2 and 3 for households choosing brand 2, and calculate the average price of brand 1, 2 and 3 for households choosing brand 3. Do the patterns make sense? That is, are the prices for brand j lowest among the households choosing brand j?
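A minimal sketch of this tabulation in Python with NumPy. The function name `mean_prices_by_choice` and the toy numbers are illustrative assumptions, not part of the problem set. With the real file, the choice column and the three price columns would be passed in instead (for example after `np.loadtxt`, assuming the file is whitespace-delimited and purely numeric).

```python
import numpy as np

def mean_prices_by_choice(choice, prices):
    """Row k of the result holds the mean prices of brands 1, 2, 3
    over the purchases in which brand k+1 was chosen."""
    choice = np.asarray(choice)
    prices = np.asarray(prices, dtype=float)
    return np.vstack([prices[choice == k].mean(axis=0) for k in (1, 2, 3)])

# Toy data (made up, NOT the yogurt file): 4 purchases, prices in cents.
choice = np.array([1, 2, 1, 3])
prices = np.array([[50.0, 60.0, 70.0],
                   [65.0, 55.0, 70.0],
                   [52.0, 62.0, 68.0],
                   [66.0, 61.0, 49.0]])
table = mean_prices_by_choice(choice, prices)
print(table)

# The pattern asked about: for each brand j (column j), is its mean price
# lowest in the group of purchases that chose brand j (row j)?
own_price_lowest = table.argmin(axis=0) == np.arange(3)
print(own_price_lowest)
```

Reading the table column by column answers the question: column j compares the average price of brand j across the three groups of buyers.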




Next we want to estimate the conditional logit model. Because of the independence assumption on the εijt, we can ignore the fact that some purchases come from the same household, and the likelihood function is



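$$L(\alpha_2, \alpha_3, \beta) = \prod_{i=1}^{N} \prod_{t=1}^{T_i} \frac{1\{Y_{it}=1\} \cdot \exp(\beta P_{i1t}) + 1\{Y_{it}=2\} \cdot \exp(\beta P_{i2t} + \alpha_2) + 1\{Y_{it}=3\} \cdot \exp(\beta P_{i3t} + \alpha_3)}{\exp(\beta P_{i1t}) + \exp(\beta P_{i2t} + \alpha_2) + \exp(\beta P_{i3t} + \alpha_3)},$$

where Y_{it} ∈ {1, 2, 3} is the brand chosen by household i at purchase t and 1{·} is the indicator function.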












First show that at

$$\begin{pmatrix} \beta \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} -0.0400 \\ 0.5000 \\ -1.0000 \end{pmatrix}$$

the log likelihood function is equal to -2660.1.
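A sketch of how the log likelihood could be evaluated, assuming the choices are held in an integer array coded 1 to 3 and the prices in an n-by-3 array. The function name `loglik` and the toy data are assumptions made for illustration. The value -2660.1 quoted in the problem can only be checked by running the function on the actual data file.

```python
import numpy as np

def loglik(theta, choice, prices):
    """Conditional logit log likelihood.
    theta = (beta, alpha2, alpha3); alpha1 is normalized to 0."""
    beta, a2, a3 = theta
    choice = np.asarray(choice)
    # Systematic utility of each brand for each purchase, shape (n, 3).
    v = np.array([0.0, a2, a3]) + beta * np.asarray(prices, dtype=float)
    # Log of the denominator via the log-sum-exp trick for stability.
    m = v.max(axis=1)
    log_denom = m + np.log(np.exp(v - m[:, None]).sum(axis=1))
    v_chosen = v[np.arange(len(choice)), choice - 1]
    return float((v_chosen - log_denom).sum())

# Toy data (made up, NOT the yogurt file).
choice = np.array([1, 2, 1, 3])
prices = np.array([[50.0, 60.0, 70.0],
                   [65.0, 55.0, 70.0],
                   [52.0, 62.0, 68.0],
                   [66.0, 61.0, 49.0]])

ll_zero = loglik((0.0, 0.0, 0.0), choice, prices)     # every brand has prob 1/3
ll_point = loglik((-0.04, 0.5, -1.0), choice, prices)  # the point from the problem
print(ll_zero, ll_point)
```

A useful sanity check is that at θ = 0 every brand has probability 1/3, so the log likelihood equals minus the number of purchases times log 3.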




Calculate the analytic first derivatives of the log likelihood function at these values for the parameters. Hint: the first derivative with respect to β is -6948.8.
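As a sketch of the derivation, write p_{ijt} for the model's probability that household i picks brand j at purchase t, and Y_{it} for the brand actually chosen (both symbols are shorthand introduced here). Differentiating the log of the conditional logit likelihood term by term gives the familiar "observed minus expected" form of the score:

```latex
\begin{align*}
p_{ijt} &= \frac{\exp(\alpha_j + \beta P_{ijt})}{\sum_{k=1}^{3} \exp(\alpha_k + \beta P_{ikt})},
\qquad \alpha_1 = 0, \\
\log L &= \sum_{i=1}^{N} \sum_{t=1}^{T_i} \sum_{j=1}^{3} 1\{Y_{it} = j\} \, \log p_{ijt}, \\
\frac{\partial \log L}{\partial \beta}
  &= \sum_{i=1}^{N} \sum_{t=1}^{T_i} \Bigl( P_{i Y_{it} t} - \sum_{j=1}^{3} p_{ijt} P_{ijt} \Bigr), \\
\frac{\partial \log L}{\partial \alpha_k}
  &= \sum_{i=1}^{N} \sum_{t=1}^{T_i} \bigl( 1\{Y_{it} = k\} - p_{ikt} \bigr),
\qquad k = 2, 3.
\end{align*}
```

The β derivative is the price of the chosen brand minus the probability-weighted average price, summed over all purchases.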



Calculate the analytic second derivatives with respect to the parameters. Hint: the second derivative with respect to β is approximately -505130.
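One way to organize both sets of derivatives is to stack, for each brand j, the regressor vector x_j = (P_j, 1{j=2}, 1{j=3}), so that utility is x_j'θ with θ = (β, α2, α3). The gradient is then the sum of x_chosen minus the probability-weighted mean of x, and the Hessian is minus the sum of the probability-weighted covariance matrices of x. The sketch below uses that formulation on simulated data; the helper names and the data are assumptions, and the finite-difference comparison is only a check on the algebra.

```python
import numpy as np

def design(prices):
    """Regressors x_j = (P_j, 1{j=2}, 1{j=3}), stacked to shape (n, 3, 3)."""
    prices = np.asarray(prices, dtype=float)
    x = np.zeros((prices.shape[0], 3, 3))
    x[:, :, 0] = prices
    x[:, 1, 1] = 1.0
    x[:, 2, 2] = 1.0
    return x

def probs(theta, x):
    """Choice probabilities, shape (n, 3); theta = (beta, alpha2, alpha3)."""
    v = x @ np.asarray(theta, dtype=float)
    e = np.exp(v - v.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def loglik(theta, choice, x):
    p = probs(theta, x)
    return float(np.log(p[np.arange(len(choice)), choice - 1]).sum())

def gradient(theta, choice, x):
    p = probs(theta, x)
    x_bar = np.einsum("nj,njk->nk", p, x)            # expected regressors
    x_chosen = x[np.arange(len(choice)), choice - 1]
    return (x_chosen - x_bar).sum(axis=0)

def hessian(theta, x):
    p = probs(theta, x)
    x_bar = np.einsum("nj,njk->nk", p, x)
    second_moment = np.einsum("nj,njk,njl->kl", p, x, x)
    return -(second_moment - x_bar.T @ x_bar)

# Simulated inputs (made up, NOT the yogurt data).
rng = np.random.default_rng(0)
prices = rng.uniform(40.0, 80.0, size=(200, 3))
choice = rng.integers(1, 4, size=200)
x = design(prices)
theta = np.array([-0.04, 0.5, -1.0])

g = gradient(theta, choice, x)
H = hessian(theta, x)

# Central finite differences of the log likelihood as a check on g.
h = 1e-6
g_num = np.array([(loglik(theta + h * e, choice, x)
                   - loglik(theta - h * e, choice, x)) / (2 * h)
                  for e in np.eye(3)])
print(g, g_num)
print(H)
```

Note that the Hessian does not depend on the observed choices, only on the prices and the parameters, which is a standard property of the logit model.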



Next, use the Newton-Raphson algorithm to find the maximum likelihood estimators for β, α2, and α3.
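A sketch of the Newton-Raphson iteration θ ← θ − H(θ)⁻¹ g(θ) for this model, run on simulated data because the yogurt file is not reproduced here. The function names, the simulated "true" parameters, the zero starting values, and the step-halving safeguard are all assumptions added for illustration; the problem itself only asks for the plain Newton-Raphson update.

```python
import numpy as np

def evaluate(theta, choice, x):
    """Log likelihood, gradient and Hessian at theta = (beta, alpha2, alpha3)."""
    rows = np.arange(len(choice))
    v = x @ theta
    e = np.exp(v - v.max(axis=1, keepdims=True))
    p = e / e.sum(axis=1, keepdims=True)
    ll = np.log(p[rows, choice - 1]).sum()
    x_bar = np.einsum("nj,njk->nk", p, x)
    g = (x[rows, choice - 1] - x_bar).sum(axis=0)
    H = -(np.einsum("nj,njk,njl->kl", p, x, x) - x_bar.T @ x_bar)
    return ll, g, H

def newton_raphson(theta0, choice, x, tol=1e-6, max_iter=50):
    """Return the estimate and the number of Newton steps taken."""
    theta = np.asarray(theta0, dtype=float)
    for n_steps in range(max_iter):
        ll, g, H = evaluate(theta, choice, x)
        if np.max(np.abs(g)) < tol:
            break
        step = np.linalg.solve(H, g)
        lam = 1.0
        # Safeguard (an addition): halve the step if the likelihood drops.
        while lam > 1e-4 and evaluate(theta - lam * step, choice, x)[0] < ll - 1e-8:
            lam *= 0.5
        theta = theta - lam * step
    return theta, n_steps

# Simulated data with made-up parameters (NOT the yogurt data).
rng = np.random.default_rng(0)
n = 5000
prices = rng.uniform(40.0, 80.0, size=(n, 3))
x = np.zeros((n, 3, 3))
x[:, :, 0] = prices          # price enters with coefficient beta
x[:, 1, 1] = 1.0             # intercept alpha2 for brand 2
x[:, 2, 2] = 1.0             # intercept alpha3 for brand 3
true_theta = np.array([-0.05, 0.3, -0.8])
utility = x @ true_theta + rng.gumbel(size=(n, 3))
choice = utility.argmax(axis=1) + 1

theta_hat, n_steps = newton_raphson(np.zeros(3), choice, x)
print(theta_hat, n_steps)
```

Because the conditional logit log likelihood is globally concave, the iteration typically converges in a handful of steps from reasonable starting values.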



Next we explore a random coefficients version of the conditional logit model. Rather than use continuous mixtures, as is more common in the literature, we use a computationally simpler version with a binary mixture. We will let the coefficient on price vary by household. Thus, we model the latent utility as



$$U_{ijt} = \alpha_j + \beta_i \cdot P_{ijt} + \varepsilon_{ijt},$$








where

$$\beta_i \in \{\beta_L, \beta_H\}, \quad \text{with } \Pr(\beta_i = \beta_L) = \pi = 0.4.$$







(More generally we may want to estimate the mixture probability π, but we won't do that here.)




Show that the likelihood function for this mixture model is










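$$L(\alpha_2, \alpha_3, \beta_L, \beta_H, \pi) = \prod_{i=1}^{N} \left[ \pi \cdot \prod_{t=1}^{T_i} \frac{1\{Y_{it}=1\} \cdot \exp(\beta_L P_{i1t}) + 1\{Y_{it}=2\} \cdot \exp(\beta_L P_{i2t} + \alpha_2) + 1\{Y_{it}=3\} \cdot \exp(\beta_L P_{i3t} + \alpha_3)}{\exp(\beta_L P_{i1t}) + \exp(\beta_L P_{i2t} + \alpha_2) + \exp(\beta_L P_{i3t} + \alpha_3)} + (1-\pi) \cdot \prod_{t=1}^{T_i} \frac{1\{Y_{it}=1\} \cdot \exp(\beta_H P_{i1t}) + 1\{Y_{it}=2\} \cdot \exp(\beta_H P_{i2t} + \alpha_2) + 1\{Y_{it}=3\} \cdot \exp(\beta_H P_{i3t} + \alpha_3)}{\exp(\beta_H P_{i1t}) + \exp(\beta_H P_{i2t} + \alpha_2) + \exp(\beta_H P_{i3t} + \alpha_3)} \right],$$

where Y_{it} ∈ {1, 2, 3} is the brand chosen by household i at purchase t.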
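A sketch of the argument, using the shorthand p_{it}(b) (introduced here) for the conditional logit probability of the observed choice Y_{it} when the price coefficient equals b. Conditional on βi the errors are independent across purchases, so the household's choice sequence has a product-form probability. Averaging over the two possible values of βi then gives the household's contribution, and independence across households gives the product over i:

```latex
\begin{align*}
p_{it}(b) &= \frac{\sum_{j=1}^{3} 1\{Y_{it} = j\} \exp(\alpha_j + b P_{ijt})}
                  {\sum_{k=1}^{3} \exp(\alpha_k + b P_{ikt})},
\qquad \alpha_1 = 0, \\
\Pr(Y_{i1}, \dots, Y_{iT_i} \mid \beta_i = b) &= \prod_{t=1}^{T_i} p_{it}(b), \\
L_i &= \pi \prod_{t=1}^{T_i} p_{it}(\beta_L) + (1 - \pi) \prod_{t=1}^{T_i} p_{it}(\beta_H), \\
L &= \prod_{i=1}^{N} L_i .
\end{align*}
```

The product over t sits inside the mixture because βi is drawn once per household, not once per purchase. This is also why purchases by the same household can no longer be treated as independent observations.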







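7. Plot the log likelihood function, for π = 0.4, at

$$\begin{pmatrix} \alpha_2 \\ \alpha_3 \\ \beta_L \\ \beta_H \end{pmatrix} = \begin{pmatrix} \hat{\alpha}_2 \\ \hat{\alpha}_3 \\ \hat{\beta} - c \\ \hat{\beta} + c \end{pmatrix},$$

as a function of c, from c = 0 to c = 0.2.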
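A sketch of how the curve could be computed, assuming a household id array alongside the choices and prices. The function names, the simulated data, and the values standing in for the estimates with hats are placeholders, not results from the yogurt data. The arrays `cs` and `values` can be handed to any plotting routine, for example matplotlib's `plt.plot(cs, values)`.

```python
import numpy as np

def log_choice_prob(beta, a2, a3, choice, prices):
    """Log probability of the observed choice for each purchase."""
    v = np.array([0.0, a2, a3]) + beta * prices
    m = v.max(axis=1)
    log_denom = m + np.log(np.exp(v - m[:, None]).sum(axis=1))
    return v[np.arange(len(choice)), choice - 1] - log_denom

def mixture_loglik(a2, a3, beta_l, beta_h, pi, hh, choice, prices):
    """Log likelihood of the binary mixture; products over t become
    sums of logs within each household."""
    _, idx = np.unique(hh, return_inverse=True)
    n_hh = idx.max() + 1
    ll_l = np.bincount(idx, minlength=n_hh,
                       weights=log_choice_prob(beta_l, a2, a3, choice, prices))
    ll_h = np.bincount(idx, minlength=n_hh,
                       weights=log_choice_prob(beta_h, a2, a3, choice, prices))
    # log(pi * exp(ll_l) + (1 - pi) * exp(ll_h)), computed stably.
    return float(np.logaddexp(np.log(pi) + ll_l, np.log1p(-pi) + ll_h).sum())

# Simulated inputs (made up, NOT the yogurt data): 50 households, 6 purchases each.
rng = np.random.default_rng(0)
hh = np.repeat(np.arange(1, 51), 6)
prices = rng.uniform(40.0, 80.0, size=(hh.size, 3))
choice = rng.integers(1, 4, size=hh.size)

# Placeholders for the conditional logit estimates (hypothetical values).
b_hat, a2_hat, a3_hat = -0.04, 0.5, -1.0

cs = np.linspace(0.0, 0.2, 41)
values = np.array([mixture_loglik(a2_hat, a3_hat, b_hat - c, b_hat + c,
                                  0.4, hh, choice, prices) for c in cs])
c_best = cs[values.argmax()]
gain = values.max() - values[0]
print(c_best, gain)
```

At c = 0 the two mixture components coincide, so the curve starts at the ordinary conditional logit log likelihood. The quantity `gain` is the improvement over that baseline at the best grid value of c.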




Compare the value of the log likelihood function at the value of c that maximizes it with the value at c = 0. Does it appear that allowing for heterogeneity in price sensitivity is important?




Estimate the random effects model using the EM algorithm. Report parameter estimates for α2, α3, βL, and βH.
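A sketch of one possible EM implementation with π held fixed at 0.4. The E-step computes each household's posterior probability of being the βL type. The M-step maximizes the resulting weighted conditional logit objective over (βL, βH, α2, α3), here with a few Newton iterations, which is one choice among several. The function names, starting values, and simulated data are assumptions for illustration only.

```python
import numpy as np

def probs(theta, x):
    v = x @ theta
    e = np.exp(v - v.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def em_binary_mixture(hh, choice, prices, theta0, pi=0.4, n_iter=300, tol=1e-6):
    """EM for theta = (beta_L, beta_H, alpha2, alpha3) with pi fixed."""
    n = len(choice)
    rows = np.arange(n)
    _, idx = np.unique(hh, return_inverse=True)
    n_hh = idx.max() + 1
    # Regressors per component, shape (n, 3 brands, 4 parameters).
    x_l = np.zeros((n, 3, 4))
    x_h = np.zeros((n, 3, 4))
    x_l[:, :, 0] = prices            # price loads on beta_L
    x_h[:, :, 1] = prices            # price loads on beta_H
    for x in (x_l, x_h):
        x[:, 1, 2] = 1.0             # alpha2
        x[:, 2, 3] = 1.0             # alpha3
    theta = np.asarray(theta0, dtype=float)
    history = []
    for _ in range(n_iter):
        # E-step: household log likelihoods under each type.
        ll_l = np.bincount(idx, minlength=n_hh,
                           weights=np.log(probs(theta, x_l)[rows, choice - 1]))
        ll_h = np.bincount(idx, minlength=n_hh,
                           weights=np.log(probs(theta, x_h)[rows, choice - 1]))
        log_mix = np.logaddexp(np.log(pi) + ll_l, np.log1p(-pi) + ll_h)
        history.append(log_mix.sum())
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        w = np.exp(np.log(pi) + ll_l - log_mix)[idx]   # posterior weight per purchase
        # M-step: Newton iterations on the weighted logit objective.
        for _newton in range(25):
            g = np.zeros(4)
            H = np.zeros((4, 4))
            for x, wt in ((x_l, w), (x_h, 1.0 - w)):
                p = probs(theta, x)
                x_bar = np.einsum("nj,njk->nk", p, x)
                g += wt @ (x[rows, choice - 1] - x_bar)
                H -= (np.einsum("n,nj,njk,njl->kl", wt, p, x, x)
                      - (x_bar * wt[:, None]).T @ x_bar)
            step = np.linalg.solve(H, g)
            theta = theta - step
            if np.max(np.abs(step)) < 1e-9:
                break
    return theta, np.array(history)

# Simulated data with made-up parameters (NOT the yogurt data).
rng = np.random.default_rng(0)
n_hh, t_per = 400, 8
hh = np.repeat(np.arange(n_hh), t_per)
beta_i = np.where(rng.random(n_hh) < 0.4, -0.10, -0.02)[hh]
prices = rng.uniform(40.0, 80.0, size=(n_hh * t_per, 3))
utility = (np.array([0.0, 0.5, -1.0]) + beta_i[:, None] * prices
           + rng.gumbel(size=prices.shape))
choice = utility.argmax(axis=1) + 1

theta_hat, history = em_binary_mixture(hh, choice, prices,
                                       theta0=[-0.08, -0.03, 0.0, 0.0])
print(theta_hat)
```

A practical check on any EM implementation is that the observed-data log likelihood recorded in `history` never decreases from one iteration to the next.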
