# STEP BY STEP INSTRUCTIONS

STEP BY STEP INSTRUCTIONS:

The Assignment Data (PopulationPropertyData2014.xls) file, which you can access from the Assessment Information page on the portal contains, in the range A1:I401, real estate sales data for a population of 400 properties around Melbourne in a particular week. In Part I of the assignment you have selected a random sample of 50 properties each containing observations, where appropriate, of the eight variables V1 to V8. In Part II of the assignment you have performed some statistical analyses on a number of these variables using your sample data file, SamplePropertyData.xls. The variables in the data set are as follows:

V1 = Region around Melbourne where property is located (1 = North, 2 = West, 3 = East, 4 = Central)

V2 = Property type (0 = Unit, 1 = House)

V3 = Sale result (1 = Sold at auction, 2 = Passed-in, 3 = Private sale, 4 = Sold before auction). Note that a blank cell for this variable simply indicates that the property did not sell.

V4 = Building type (1 = Brick, 2 = Brick veneer, 3 = Weatherboard, 4 = Vacant land)

V5 = Number of rooms

V6 = Land size (Square metres)

V7 = Sold Price ($000s)

V8 = Advertised Price ($000s)

Column A (PN), contains the property identification numbers for the 400 properties.

ALL WORKING OUT MUST BE SHOWN.

Assignment Tasks (Part III)

Answers to the assignment tasks must be based on the sample data file that you created in Part I of the assignment. As for Part II, most tasks in Part III of the assignment require you to obtain an Excel output prior to performing some analysis. Copy and paste these outputs to your assignment MS Word document immediately preceding any subsequent analysis. Explanations must be precise and to the point. Charts and tables must have appropriate titles and numerical values must be rounded to an appropriate number of decimal places and accompanied by the correct units of measure.

There are four tasks in Part III of the assignment. You must meet all task requirements to receive full marks.

ALL WORKING OUT MUST BE SHOWN. THERE IS A TOTAL OF 4 TASKS TO BE DONE

WHICH IS TASKS 6-9.

4 TASKS (TASKS 6-9)

Task 6 (7 marks)

By reference to numerical summary measures in the Descriptive Statistics table obtained in Task 5, provide three pieces of distinct evidence that might suggest that your sample “Sold Price” data has been obtained from a normally distributed population. What is your conclusion? Note: Make sure your three pieces of distinct evidence only contain one relating to the shape of the sample data.

Sold Prices (V7)

Mean

661.30

Standard Error

57.06

Median

603.00

Mode

#N/A

Standard Deviation

391.16

Sample Variance

153,003.30

Kurtosis

15.29

Skewness

2.88

Range

2,538.00

Minimum

112.00

Maximum

2,650.00

Sum

31,081.05

Count

47

Confidence Level(90.0%)

95.78

One indication that this sample data did not come from a normally distributed population is the difference between the mean and the median value. With the median so far to the left of the mean, the population is likely skewed.

A second indication is the skewness value. A normally distributed population would have a skewness value of 0, or close to it. A value which is close to 3 is fairly high, and indicates a departure from normality.

A third indicator, as a rough approximation, can be found by dividing the value of the range by 6. In a normally distributed data set, this value should be approximately equal to the sample standard deviation. In this case, the range divided by 6 is 2538 / 6 = 423. This value is higher than the actual sample standard deviation of 391.16, indicating that the population that the data is drawn from may not be normally distributed. (Of the three indicators discussed, this one is the weakest in this particular situation.)

Regardless of your conclusion in (a), assume your sample “Sold Price” data have been obtained from a normally distributed population, and calculate, using Standard Normal tables, approximately how many “Sold Price” data values in your sample you would expect to lie within 1.5 standard deviations of the mean (i.e. between z = –1.5 and z = +1.5).

With 47 values in the data set, approximately (0.8664)(47) ≈ 41 values should be within 1.5 standard deviations of the mean.

Use your sorted “Sold Price” sample data from Task 4, and the mean and standard deviation from the Descriptive Statistics table of Task 5, to manually count the number of “Sold Price” data values in your sample that lie within 1.5 standard deviations of the mean. State whether this count matches, approximately, your answer to (b) and hence whether this result confirms (or not) your conclusion in (a).

Task 7 (7 marks)

Use Excel to produce a Descriptive Statistics table for the “Sold Price” variable in your sample suitable for constructing an interval estimate of the population mean “Sold Price”. Hence determine:

(i)A point estimate of the mean “Sold Price” of the population of properties.

(ii) A 90% confidence interval estimate of the mean “Sold Price” of the population of properties.

(b) Make a brief verbal statement explaining the meaning of the confidence interval estimate obtained in (a) in the context of the variable in this task.

(c) If the population mean “Sold Price” is actually 650 ($000s), would you consider the interval estimate obtained in (a), to be satisfactory? Explain why or why not.

Task 8 (6 marks)

By reference to the Descriptive Statistics table obtained in Task 7, determine a 99% confidence interval estimate of the mean “Sold Price” of the population of properties using the following formula:

Comment on the precision of this interval, in particular, compare the precision associated with this interval with that obtained in Task 7. Explain why the direction of the change in precision that you have observed in these two intervals ought to be obvious prior to constructing the two intervals.

Task 9 (7 marks)

(a) Use Excel to produce a Descriptive Statistics table for the brick veneer properties in your sample suitable for constructing an interval estimate of the population proportion of brick veneer properties. Hence determine:

Brick Veneer (V4)

Mean

0.3200

Standard Error

0.0666

Median

0.0000E+0

Mode

0.0000E+0

Standard Deviation

0.4712

Sample Variance

0.2220

Kurtosis

1.5956

Skewness

0.7717

Range

1.0000

Minimum

0.0000E+0

Maximum

1.0000

Sum

16.0000

Count

50

Confidence Level(99.0%)

0.178590526

(i)A point estimate of the proportion of brick veneer properties in the population.

(ii) A 99% confidence interval estimate of the proportion of brick veneer properties in the population.

(b) Make a brief verbal statement explaining the meaning of the confidence interval estimate obtained in (a) in the context of the variable in this task.

(c) If the population proportion of brick veneer properties is actually 42%, would you consider the interval estimate obtained in (a), to be satisfactory? Explain why or why not.

Task 2 (6 marks)

Use Excel to produce a Frequency Column Chart and a Relative Frequency Pie-Chart for your sample to show the number and proportion, respectively, of each building type.

Bld.type

Frequency

Br

12

BrV

16

Wbrd

17

VL

5

How many properties in your sample consist of brick buildings?

Which building type occurs most frequently in your sample?

What proportion of properties in your sample consists of weatherboard buildings?

Task 3 (7 marks)

Use Excel to produce a Frequency Histogram for the “Advertised Price” variable in your sample. Without generating numerical summary measures for this data, use this histogram to describe:

Any tendency for the data to cluster/concentrate around a particular value or group of values.

The variability of the data.

The shape of the data.

Task 4 (7 marks)

Use Excel to sort your sample “Sold Price” data and paste into your MS Word assignment document.

Use the percentile location formula;

The 70th percentile.

The first and third quartiles.

First Quartile:

Briefly explain what the 70th percentile that you have determined informs you about your sample “Sold Price” data.

Determine the Inter-QuartileRange of your sample “Sold Price” data and provide a brief explanation of what information this statistic provides about your sample data.

Task 5 (7 marks)

Use Excel to produce a Descriptive Statistics table for your sample “Sold Price” data and paste into your MS Word assignment document.

(b) Use results from Task 4 to determine manually for this data, the upper and lower inner fence limits;

(c) Based on the limits calculated in (b), choose from the numerical summary measures provided in the Descriptive Statistics table, and/or measures calculated previously in Task 4;

an appropriate measure of central tendency, and,

an appropriate measure of dispersion for your sample “Sold Price” data.

The Assignment Data (PopulationPropertyData2014.xls) file, which you can access from the Assessment Information page on the portal contains, in the range A1:I401, real estate sales data for a population of 400 properties around Melbourne in a particular week. In Part I of the assignment you have selected a random sample of 50 properties each containing observations, where appropriate, of the eight variables V1 to V8. In Part II of the assignment you have performed some statistical analyses on a number of these variables using your sample data file, SamplePropertyData.xls. The variables in the data set are as follows:

V1 = Region around Melbourne where property is located (1 = North, 2 = West, 3 = East, 4 = Central)

V2 = Property type (0 = Unit, 1 = House)

V3 = Sale result (1 = Sold at auction, 2 = Passed-in, 3 = Private sale, 4 = Sold before auction). Note that a blank cell for this variable simply indicates that the property did not sell.

V4 = Building type (1 = Brick, 2 = Brick veneer, 3 = Weatherboard, 4 = Vacant land)

V5 = Number of rooms

V6 = Land size (Square metres)

V7 = Sold Price ($000s)

V8 = Advertised Price ($000s)

Column A (PN), contains the property identification numbers for the 400 properties.

ALL WORKING OUT MUST BE SHOWN.

Assignment Tasks (Part III)

Answers to the assignment tasks must be based on the sample data file that you created in Part I of the assignment. As for Part II, most tasks in Part III of the assignment require you to obtain an Excel output prior to performing some analysis. Copy and paste these outputs to your assignment MS Word document immediately preceding any subsequent analysis. Explanations must be precise and to the point. Charts and tables must have appropriate titles and numerical values must be rounded to an appropriate number of decimal places and accompanied by the correct units of measure.

There are four tasks in Part III of the assignment. You must meet all task requirements to receive full marks.

ALL WORKING OUT MUST BE SHOWN. THERE IS A TOTAL OF 4 TASKS TO BE DONE

WHICH IS TASKS 6-9.

4 TASKS (TASKS 6-9)

Task 6 (7 marks)

By reference to numerical summary measures in the Descriptive Statistics table obtained in Task 5, provide three pieces of distinct evidence that might suggest that your sample “Sold Price” data has been obtained from a normally distributed population. What is your conclusion? Note: Make sure your three pieces of distinct evidence only contain one relating to the shape of the sample data.

Sold Prices (V7)

Mean

661.30

Standard Error

57.06

Median

603.00

Mode

#N/A

Standard Deviation

391.16

Sample Variance

153,003.30

Kurtosis

15.29

Skewness

2.88

Range

2,538.00

Minimum

112.00

Maximum

2,650.00

Sum

31,081.05

Count

47

Confidence Level(90.0%)

95.78

One indication that this sample data did not come from a normally distributed population is the difference between the mean and the median value. With the median so far to the left of the mean, the population is likely skewed.

A second indication is the skewness value. A normally distributed population would have a skewness value of 0, or close to it. A value which is close to 3 is fairly high, and indicates a departure from normality.

A third indicator, as a rough approximation, can be found by dividing the value of the range by 6. In a normally distributed data set, this value should be approximately equal to the sample standard deviation. In this case, the range divided by 6 is 2538 / 6 = 423. This value is higher than the actual sample standard deviation of 391.16, indicating that the population that the data is drawn from may not be normally distributed. (Of the three indicators discussed, this one is the weakest in this particular situation.)

Regardless of your conclusion in (a), assume your sample “Sold Price” data have been obtained from a normally distributed population, and calculate, using Standard Normal tables, approximately how many “Sold Price” data values in your sample you would expect to lie within 1.5 standard deviations of the mean (i.e. between z = –1.5 and z = +1.5).

With 47 values in the data set, approximately (0.8664)(47) ≈ 41 values should be within 1.5 standard deviations of the mean.

Use your sorted “Sold Price” sample data from Task 4, and the mean and standard deviation from the Descriptive Statistics table of Task 5, to manually count the number of “Sold Price” data values in your sample that lie within 1.5 standard deviations of the mean. State whether this count matches, approximately, your answer to (b) and hence whether this result confirms (or not) your conclusion in (a).

Task 7 (7 marks)

Use Excel to produce a Descriptive Statistics table for the “Sold Price” variable in your sample suitable for constructing an interval estimate of the population mean “Sold Price”. Hence determine:

(i)A point estimate of the mean “Sold Price” of the population of properties.

(ii) A 90% confidence interval estimate of the mean “Sold Price” of the population of properties.

(b) Make a brief verbal statement explaining the meaning of the confidence interval estimate obtained in (a) in the context of the variable in this task.

(c) If the population mean “Sold Price” is actually 650 ($000s), would you consider the interval estimate obtained in (a), to be satisfactory? Explain why or why not.

Task 8 (6 marks)

By reference to the Descriptive Statistics table obtained in Task 7, determine a 99% confidence interval estimate of the mean “Sold Price” of the population of properties using the following formula:

Comment on the precision of this interval, in particular, compare the precision associated with this interval with that obtained in Task 7. Explain why the direction of the change in precision that you have observed in these two intervals ought to be obvious prior to constructing the two intervals.

Task 9 (7 marks)

(a) Use Excel to produce a Descriptive Statistics table for the brick veneer properties in your sample suitable for constructing an interval estimate of the population proportion of brick veneer properties. Hence determine:

Brick Veneer (V4)

Mean

0.3200

Standard Error

0.0666

Median

0.0000E+0

Mode

0.0000E+0

Standard Deviation

0.4712

Sample Variance

0.2220

Kurtosis

1.5956

Skewness

0.7717

Range

1.0000

Minimum

0.0000E+0

Maximum

1.0000

Sum

16.0000

Count

50

Confidence Level(99.0%)

0.178590526

(i)A point estimate of the proportion of brick veneer properties in the population.

(ii) A 99% confidence interval estimate of the proportion of brick veneer properties in the population.

(b) Make a brief verbal statement explaining the meaning of the confidence interval estimate obtained in (a) in the context of the variable in this task.

(c) If the population proportion of brick veneer properties is actually 42%, would you consider the interval estimate obtained in (a), to be satisfactory? Explain why or why not.

Task 2 (6 marks)

Use Excel to produce a Frequency Column Chart and a Relative Frequency Pie-Chart for your sample to show the number and proportion, respectively, of each building type.

Bld.type

Frequency

Br

12

BrV

16

Wbrd

17

VL

5

How many properties in your sample consist of brick buildings?

Which building type occurs most frequently in your sample?

What proportion of properties in your sample consists of weatherboard buildings?

Task 3 (7 marks)

Use Excel to produce a Frequency Histogram for the “Advertised Price” variable in your sample. Without generating numerical summary measures for this data, use this histogram to describe:

Any tendency for the data to cluster/concentrate around a particular value or group of values.

The variability of the data.

The shape of the data.

Task 4 (7 marks)

Use Excel to sort your sample “Sold Price” data and paste into your MS Word assignment document.

Use the percentile location formula;

The 70th percentile.

The first and third quartiles.

First Quartile:

Briefly explain what the 70th percentile that you have determined informs you about your sample “Sold Price” data.

Determine the Inter-QuartileRange of your sample “Sold Price” data and provide a brief explanation of what information this statistic provides about your sample data.

Task 5 (7 marks)

Use Excel to produce a Descriptive Statistics table for your sample “Sold Price” data and paste into your MS Word assignment document.

(b) Use results from Task 4 to determine manually for this data, the upper and lower inner fence limits;

(c) Based on the limits calculated in (b), choose from the numerical summary measures provided in the Descriptive Statistics table, and/or measures calculated previously in Task 4;

an appropriate measure of central tendency, and,

an appropriate measure of dispersion for your sample “Sold Price” data.

You'll get a 308.3KB .ZIP file.