# A linear model that predicts the quality of a bottle of wine

Construct a linear model that predicts the quality of a bottle of wine based on the following features:

1 - fixed acidity
2 - volatile acidity
3 - citric acid
4 - residual sugar
5 - chlorides
6 - free sulfur dioxide
7 - total sulfur dioxide
8 - density
9 - pH
10 - sulphates
11 - alcohol

Output variable (based on sensory data): 12 - quality (score between 0 and 10)

You must turn in a screenshot showing the results of linear regression using the following steps:

1. Download the data from http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/

2. Construct and evaluate separate models for the red and white wines. Specifically, I want you to report the cross-validation R² value for each. To do this you must create a driver program that:

a. Loads/rearranges the data into the proper format. I suggest using the following command as an example:

    ds = dataset('File','winequalityred.csv','delimiter',';');

b. This function returns a dataset object that has a lot of useful functionality. For example, if you want to know what variables are available, type ds.Properties.VarNames

c. So if you want to construct an X matrix using two of the variables, you can use the following command:

    X = [ds.fixedAcidity ds.volatileAcidity];

d. Likewise, you can construct a y vector:

    y = ds.quality

e. Construct an X matrix using all of the features (except quality, of course), and then construct a linear model using LinearModel.fit:

    model = LinearModel.fit(X,y)

f. Just like classification, we need to evaluate the model on data that it hasn’t seen yet, so we need cross-validation. To do this you’ll need to use the following commands:

    cp = cvpartition(length(y),'k',10);
    cvMSE = crossval('mse',X,y,'predfun',@doregression,'partition',cp)
    cvR2 = 1 - cvMSE/mean((y - mean(y)).^2)

g. But in order to run this code you’ll have to define a function called doregression that looks like the following:

    function ypredicted = doregression(xtrain, ytrain, xtest)
    model = LinearModel.fit(xtrain,ytrain); % Create the model on the training folds
    ypredicted = model.predict(xtest); % Run prediction on the held-out test data
    end
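The cvR2 expression in step f is the usual coefficient of determination with the cross-validated mean squared error in place of the training error. As a sketch, writing $\hat{y}_i$ for the prediction made for sample $i$ by the model trained on the folds that exclude it, $\bar{y}$ for the mean of y, and $n$ for the number of samples (this notation is introduced here, not in the assignment, and it ignores any detail of how crossval averages across folds):

$$
\mathrm{cvMSE} \approx \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\qquad
R^2_{\mathrm{cv}} = 1 - \frac{\mathrm{cvMSE}}{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
$$

The denominator is what `mean((y - mean(y)).^2)` computes. A value close to 1 means the model predicts unseen wines much better than always guessing the mean quality; a value near 0 (or below) means it does no better than that baseline.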

3. I want you to turn in screenshot(s) showing the code that you created and the results for both types of wine.
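Putting steps a through g together, a driver might look roughly like the sketch below. It is only one possible layout, under several assumptions: the two files are saved locally under the names `winequality-red.csv` and `winequality-white.csv` (adjust these to match your copies), the first 11 columns are the features and the 12th is quality, and `double(...)` is used to turn the dataset columns into a numeric matrix instead of listing each variable by name.

```matlab
% Driver sketch: cross-validated R^2 for each wine type.
% ASSUMPTION: file names below match your local copies; change them if not.
files = {'winequality-red.csv', 'winequality-white.csv'};

for i = 1:numel(files)
    % Load the semicolon-delimited file into a dataset object.
    ds = dataset('File', files{i}, 'delimiter', ';');

    % ASSUMPTION: columns 1-11 are the features, column 12 is quality.
    X = double(ds(:, 1:11));
    y = ds.quality;

    % 10-fold cross-validation of the regression defined in doregression.
    cp    = cvpartition(length(y), 'k', 10);
    cvMSE = crossval('mse', X, y, 'predfun', @doregression, 'partition', cp);
    cvR2  = 1 - cvMSE / mean((y - mean(y)).^2);

    fprintf('%s: cross-validation R2 = %.4f\n', files{i}, cvR2);
end

% Local function (scripts accept these in R2016b and later);
% on older releases, save it as doregression.m instead.
function ypredicted = doregression(xtrain, ytrain, xtest)
    model = LinearModel.fit(xtrain, ytrain);  % fit on the training folds
    ypredicted = model.predict(xtest);        % predict the held-out fold
end
```

Because cvpartition assigns folds at random, the reported R2 will vary slightly from run to run unless the random seed is fixed first.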
