Regression least squares

Hello

I want to calculate a regression for one independent variable with repeated measurements. An instrument measures a substance level at discrete steps (a calibration curve), and this is repeated several times. In Excel and statistiXL, only two one-dimensional vectors can be correlated. Is there a procedure for repeated measures at distinct levels?

Klaus

Comments

  • Hi Klaus

    I have forwarded your query on to Philip and he should have a reply for you tomorrow.

    Best Regards

    Alan
  • Hello Klaus

    Regression with repeated X measures is not an option in statistiXL (or in other packages that I know of), but the calculations are exactly the same as for a regular regression on the data. The difference is that repeated X measures let you separate the residual sum of squares into the "pure error" sum of squares and the "lack of fit" sum of squares, because the repeats make it possible to estimate the "pure error" sum of squares. Note that each repeat must be a true, independent repeat, e.g. you can't just repeat a measure for the same individual, or remeasure the value for a sample - you have to measure a different individual with the same X value, or measure a different standard of the same X value.
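In symbols, the split described above can be sketched as follows. The notation is an assumption made here for illustration and does not come from the original post: y_ij is the j-th independent repeat at the i-th distinct X level, n_i is the number of repeats at that level, ȳ_i is their mean, and m is the number of distinct X levels.

```latex
\begin{aligned}
SS_{\text{residual}} &= SS_{\text{pure error}} + SS_{\text{lack of fit}} \\
SS_{\text{pure error}} &= \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left( y_{ij} - \bar{y}_i \right)^2
  = \sum_{i=1}^{m} \left[ \sum_{j=1}^{n_i} y_{ij}^2
    - \frac{1}{n_i} \Bigl( \sum_{j=1}^{n_i} y_{ij} \Bigr)^2 \right] \\
df_{\text{pure error}} &= \sum_{i=1}^{m} \left( n_i - 1 \right)
\end{aligned}
```

The second form of the pure error sum of squares is the one that is convenient for hand or spreadsheet calculation: a sum of squared Y values minus a squared sum divided by the number of repeats.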

    So, you can analyse your data with X repeats as a standard regression and get the same result - ASSUMING that the "lack of fit" sum of squares is NOT significant (if it were significant, you should abandon that regression model and seek a better one, e.g. try a quadratic model).

    It isn't too hard to calculate the pure error SS - if you are interested then read my description below.

    Phil Withers

    ********************************************************************************************************************
    To calculate the pure error SS in repeated X regression.
    An excellent description of this is given by Draper & Smith (1998) Applied Regression Analysis,
    whose example I use
    ********************************************************************************************************************

    1. For each set of repeated X values, square each Y value and add the squares together
    e.g. X,Y = (4.0, 2.8), (4.0, 2.8) and (4.0, 2.2), sum of Y^2 = 2.8^2 + 2.8^2 + 2.2^2 = 20.52

    2. Sum the Y values for each of these repeated X values, square this sum, and divide by the number of repeats
    e.g. ((2.8 + 2.8 + 2.2)^2)/3 = 20.28

    3. Subtract the result of step 2 from the result of step 1 to get the sum of squares for this set of repeats. Its number of degrees of freedom is the number of X repeats - 1
    e.g. SS = 20.52 - 20.28 = 0.24, df = 3-1 = 2

    4. Sum all of the sums of squares, and all of the degrees of freedom - this is the PURE ERROR SS and DF
    e.g. total repeat SS = 7.055, total df = 10

    5. Get the residual SS and df from the normal regression analysis
    e.g. residualSS = 15.278, df = 21

    6. The Lack of Fit SS and df are obtained by subtraction from the residual SS and DF
    e.g. LofF SS = 15.278 - 7.055 = 8.223, df = 21 - 10 = 11

    7. Calculate the mean squares in the normal fashion (SS / df)
    e.g. LofF MS = 8.223/11 = 0.748, Pure Error MS = 7.055/10 = 0.7055

    8. Calculate F for LofF MS in the usual way (LofF MS/Pure Error MS)
    e.g. F = 0.748/0.7055 = 1.060

    9. Check significance of this F value:
    IF F is non-significant, proceed with the regression in the conventional way and calculate
    regression F, etc (i.e. just use repeat X values as X values)
    IF F is significant, then stop and rethink your model - maybe a quadratic model is more appropriate
    (look at your residuals)
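The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not code from statistiXL or from Draper & Smith: the function names `pure_error_ss` and `lack_of_fit` are hypothetical, the model is assumed to be a straight line, repeated X values are assumed to be recorded as exactly equal numbers, and the sample data at the bottom is invented.

```python
from collections import defaultdict


def pure_error_ss(ys):
    """Steps 1-3 for one set of repeats: sum of Y^2 minus (sum of Y)^2 / n."""
    return sum(v * v for v in ys) - sum(ys) ** 2 / len(ys)


def lack_of_fit(x, y):
    """Split the residual SS of a straight-line fit into pure error and lack of fit."""
    n = len(x)

    # Step 5: ordinary least-squares line y = intercept + slope * x and its residual SS.
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_df = n - 2

    # Steps 1-4: group Y values by X level, then pool the SS and df over all levels.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    pe_ss = sum(pure_error_ss(ys) for ys in groups.values())
    pe_df = sum(len(ys) - 1 for ys in groups.values())

    # Step 6: lack of fit by subtraction.
    lof_ss = resid_ss - pe_ss
    lof_df = resid_df - pe_df
    if pe_df <= 0 or lof_df <= 0:
        raise ValueError("need repeated X values and more X levels than model parameters")

    # Steps 7-8: mean squares and the F ratio.
    f_ratio = (lof_ss / lof_df) / (pe_ss / pe_df)
    return {
        "residual_ss": resid_ss, "residual_df": resid_df,
        "pure_error_ss": pe_ss, "pure_error_df": pe_df,
        "lack_of_fit_ss": lof_ss, "lack_of_fit_df": lof_df,
        "F": f_ratio,
    }


# Invented calibration data: three X levels, two independent repeats each.
x = [1, 1, 2, 2, 3, 3]
y = [1.0, 1.2, 2.1, 1.9, 3.2, 2.8]
result = lack_of_fit(x, y)
print(result)
```

For step 9, compare the returned F with the critical value of the F distribution for (lack of fit df, pure error df) degrees of freedom, for example from statistical tables. The code stops at the F ratio because a p-value would need an F distribution routine that the Python standard library does not provide.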
  • QUOTE (Philip Withers @ 28 Apr 2005, 23:46)
    Hello Klaus [...]

    Thank you Philip,

    a very simple and thorough description. It solves my problem.

    Klaus