Two Plus Two Newer Archives  

#1
10-17-2006, 11:24 PM
KingGordy
Senior Member
Join Date: Jul 2005
Posts: 1,392
Question from my statistics midterm I got wrong

This is the only question on my midterm I got wrong, but I've reviewed it closely and don't understand why. Hopefully someone smarter than me can explain it.

18) A study gathers data on the outside temperature during the winter, in degrees Fahrenheit, and the amount of natural gas a household consumes, in cubic feet per day. Call the temperature x and gas consumption y. The house is heated with gas, so x helps explain y. The least-squares regression line for predicting y from x is y = 1344 - 19x.

The correlation between temperature x and gas usage y is r = -0.7. Which of the following would not change r?

A) Measuring temperature in degrees Celsius instead of degrees Fahrenheit.

B) Removing two outliers from the data used to calculate r.

C) Measuring gas usage in hundreds of cubic feet, so that all values of y are divided by 100.

D) Both A and C.

Any help would be appreciated.
#2
10-18-2006, 03:19 AM
dave6
Senior Member
Join Date: Aug 2006
Location: California
Posts: 145
Re: Question from my statistics midterm I got wrong

I haven't studied this recently, so I could be way off, but wouldn't it have to be D? Intuitively, a measure of how closely the two sets of data are correlated shouldn't depend on what units you use to do the measuring.

The answer can't be B. If you remove outliers, you're changing the data, which could very well change the correlation.

The answer might be C, if the fact that Fahrenheit and Celsius aren't absolute temperature scales makes any difference.

To make a poker analogy, suppose you gather data on your tournament buyins and how long you last before you bust out, and the correlation is, say, -0.1. Measuring your buyins in British pounds shouldn't affect the correlation, and neither should measuring how long you lasted in picoseconds. However, throwing out that one time your aces got cracked on the first hand by a donkey with AT would affect the correlation.

It might help if you posted an equation from your textbook showing how to calculate r given two sets of data. You could then try, for example, multiplying all the numbers in one data set by a constant and seeing if this would affect the correlation. If the equation is based on random variables, you would need to think about how using a different unit would affect the expected value and other relevant properties of the random variable, and therefore the calculation of r.
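For reference, the sample correlation coefficient is usually defined as below. This is the standard textbook form, not necessarily the one in the original poster's textbook; n is the number of data points, and x-bar and y-bar are the sample means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Multiplying every x_i by a positive constant scales the numerator and the denominator by the same factor, which is exactly the check suggested above.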
#3
10-18-2006, 03:30 AM
SumZero
Senior Member
Join Date: Jul 2004
Location: South SF bay area, California
Posts: 1,223
Re: Question from my statistics midterm I got wrong

Shouldn't it be D? Both A and C merely change the units, and thus may change your regression equation (because of the unit scaling), but they shouldn't change the underlying relationship between X and Y. Obviously, throwing out two outliers is very likely to change r: you are changing your data, so the newly calculated r value is unlikely to be the same.

What's the trick?

Maybe the conversion from Fahrenheit to Celsius, because it involves the 5/9 scaling as well as the -32 offset, is a transformation that would change r, and the right answer is C. But I would have thought that what you are measuring is the correlation between two specific things: the temperature, which is a real-world concept regardless of units, and the amount of heating, which again is a real-world amount. So I would have thought that this kind of transformation doesn't change r. You can always make up some data set, do the Celsius/Fahrenheit conversion, and calculate the r value yourself to see.
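The "make up some data and check" experiment can be sketched in a few lines of Python. The data below is invented purely for illustration (it is not the data from the exam's study), and `pearson_r` is a hand-written helper using the standard sample formula, not anything from a textbook or library.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data, for illustration only (not from the exam's study).
temps_f = [20, 25, 30, 35, 40, 45, 50]          # degrees Fahrenheit
gas_cuft = [980, 850, 790, 640, 600, 470, 410]  # cubic feet per day

temps_c = [(f - 32) * 5 / 9 for f in temps_f]   # option A: Fahrenheit -> Celsius
gas_hundreds = [y / 100 for y in gas_cuft]      # option C: divide y by 100

r_original = pearson_r(temps_f, gas_cuft)
r_celsius = pearson_r(temps_c, gas_cuft)
r_hundreds = pearson_r(temps_f, gas_hundreds)

# The three values should agree up to floating-point rounding.
print(r_original, r_celsius, r_hundreds)
```

Any other data set should behave the same way, as long as the unit conversions are applied exactly and nothing is rounded afterwards.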
#4
10-18-2006, 03:48 AM
Siegmund
Senior Member
Join Date: Feb 2005
Posts: 1,850
Re: Question from my statistics midterm I got wrong

It is D. Correlation coefficients are invariant under linear transformations of the data.
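A short derivation of why this holds, assuming the standard sample formula for r and writing the unit changes as x' = a + bx and y' = c + dy with b and d nonzero (the symbols a, b, c, d are introduced here only for illustration):

```latex
\begin{aligned}
\bar{x}' &= a + b\,\bar{x}, \qquad x_i' - \bar{x}' = b\,(x_i - \bar{x}), \\
\bar{y}' &= c + d\,\bar{y}, \qquad y_i' - \bar{y}' = d\,(y_i - \bar{y}), \\
r' &= \frac{\sum b\,d\,(x_i - \bar{x})(y_i - \bar{y})}
          {\sqrt{\sum b^2 (x_i - \bar{x})^2}\,\sqrt{\sum d^2 (y_i - \bar{y})^2}}
    = \frac{b\,d}{|b|\,|d|}\; r
    = \operatorname{sign}(b\,d)\; r .
\end{aligned}
```

For option A, b = 5/9 and a = -160/9; for option C, d = 1/100 and c = 0. Both scale factors are positive, so r is unchanged. A negative scale factor on exactly one of the two variables would flip the sign of r.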
#5
10-18-2006, 04:59 AM
SumZero
Senior Member
Join Date: Jul 2004
Location: South SF bay area, California
Posts: 1,223
Re: Question from my statistics midterm I got wrong

[ QUOTE ]
It is D. Correlation coefficients are invariant under linear transformations of the data.

[/ QUOTE ]

I guess the trick is: does changing from Fahrenheit to Celsius represent a linear transformation? The function that does the conversion isn't linear in the strict sense. I.e., if f(x) turns Fahrenheit into Celsius, so f(32) = 0 and f(212) = 100, then f(x+y) isn't f(x) + f(y), as f(32+212) = f(244) = 117 7/9 while f(212) + f(32) = 100 + 0 = 100. In addition, af(x) isn't f(ax), as f(10 * 32) = f(320) = 160 while 10 * f(32) = 10 * 0 = 0.

But I don't think that is what you meant, as I've verified with numbers that r stays the same when you transform from Celsius to Fahrenheit, so D is correct.
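The arithmetic in the additivity and scaling checks can be reproduced exactly with Python's `fractions` module. This is only a sketch; the helper name `f_to_c` is made up for this example.

```python
from fractions import Fraction

def f_to_c(f):
    """Convert Fahrenheit to Celsius exactly: C = (F - 32) * 5/9."""
    return (Fraction(f) - 32) * Fraction(5, 9)

# Additivity fails because of the -32 offset.
lhs_add = f_to_c(32 + 212)           # f(244) = 1060/9 = 117 7/9
rhs_add = f_to_c(32) + f_to_c(212)   # 0 + 100 = 100

# Homogeneity fails for the same reason.
lhs_scale = f_to_c(10 * 32)          # f(320) = 160
rhs_scale = 10 * f_to_c(32)          # 10 * 0 = 0

print(lhs_add, rhs_add)
print(lhs_scale, rhs_scale)
```

The conversion is affine (a scaling plus a constant offset) rather than linear in the strict algebraic sense. The offset cancels when the mean is subtracted, which is consistent with r staying the same in the numerical check.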
#6
10-18-2006, 05:36 AM
KingGordy
Senior Member
Join Date: Jul 2005
Posts: 1,392
Re: Question from my statistics midterm I got wrong

Thanks for the responses guys. On the actual test I put D. According to my professor the answer is A. I emailed for an explanation and here's what I got:

"C is incorrect because measuring gas usage in hundreds of cubic feet implies that the y values will now be averages, averaging removes some of the variability leading to inflated correlations."

WTF? How does changing the units 'imply' that the y values will be averages?

I probably won't fight this because the prof seems like the type of guy to be hard headed over something like this, and it's not worth that much anyways. I just wanted to confirm my answer was correct for my own peace of mind.
#7
10-18-2006, 06:30 AM
PairTheBoard
Senior Member
Join Date: Dec 2003
Posts: 3,460
Re: Question from my statistics midterm I got wrong

[ QUOTE ]
C) Measuring gas usage in hundreds of cubic feet, so that all values of y are divided by 100.


[/ QUOTE ]

[ QUOTE ]
"C is incorrect because measuring gas usage in hundreds of cubic feet implies that the y values will now be averages."


[/ QUOTE ]

Email him again and ask him how "all values of y are divided by 100" "implies that the y values will now be averages". I'll bet he thinks he asked something else.

Are you sure there wasn't more to the problem?

PairTheBoard
#8
10-18-2006, 09:02 AM
Tiki
Senior Member
Join Date: Aug 2006
Location: A fugitive from the Law of Averages..
Posts: 329
Re: Question from my statistics midterm I got wrong

Perhaps your prof means that 427 cu. ft. becomes 4 hundred cu. ft., i.e. that the values get cut down to whole hundreds. This is of course not what he said.

Enjoy your peace of mind.
#9
10-18-2006, 01:05 PM
alThor
Senior Member
Join Date: Mar 2004
Location: not Vegas
Posts: 192
Re: Question from my statistics midterm I got wrong

[ QUOTE ]
"C is incorrect because measuring gas usage in hundreds of cubic feet implies that the y values will now be averages, averaging removes some of the variability leading to inflated correlations."

[/ QUOTE ]

That's nonsense. If you're going to ace the class anyway, you are wise to choose to forget about it. But yes, you were right. The linear transformation in "C" was just a special case of the affine transformation in "A", especially if the exam explicitly said "y values divided by 100". Is this a professor, or a grad student teaching this course?
#10
10-18-2006, 03:17 PM
dave6
Senior Member
Join Date: Aug 2006
Location: California
Posts: 145
Re: Question from my statistics midterm I got wrong

[ QUOTE ]
[ QUOTE ]
"C is incorrect because measuring gas usage in hundreds of cubic feet implies that the y values will now be averages, averaging removes some of the variability leading to inflated correlations."

[/ QUOTE ]

That's nonsense. If you're going to ace the class anyway, you are wise to choose to forget about it. But yes, you were right. The linear transformation in "C" was just a special case of the affine transformation in "A", especially if the exam explicitly said "y values divided by 100". Is this a professor, or a grad student teaching this course?

[/ QUOTE ]

This is pretty sneaky. The funny thing is, if you're rounding temperatures to the nearest degree, then A is also wrong, and for exactly the same reason.
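A quick way to see the rounding effect is the sketch below. The data is invented for illustration only, and `pearson_r` is a hand-written helper using the standard sample formula. It compares r on the exact values with r after cutting gas usage down to whole hundreds of cubic feet, and with r after rounding Celsius temperatures to the nearest degree.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data, for illustration only.
temps_f = [20, 25, 30, 35, 40, 45, 50]          # degrees Fahrenheit
gas_cuft = [980, 850, 790, 640, 600, 470, 410]  # cubic feet per day

r_exact = pearson_r(temps_f, gas_cuft)

# Cut gas usage down to whole hundreds of cubic feet (e.g. 980 -> 9).
gas_truncated = [y // 100 for y in gas_cuft]
r_truncated = pearson_r(temps_f, gas_truncated)

# Convert to Celsius, then round to the nearest whole degree.
temps_c_rounded = [round((f - 32) * 5 / 9) for f in temps_f]
r_rounded = pearson_r(temps_c_rounded, gas_cuft)

# Neither rounded version is an exact rescaling, so r shifts slightly.
print(r_exact, r_truncated, r_rounded)
```

Rounding discards information, so the rounded values are no longer an exact rescaling of the originals and r is free to move. Whether it moves up or down depends on the data.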
All times are GMT -4.