Discrepancies with Rencher in Discriminant Analysis

Hi, first of all, congratulations on the very flexible software and website!

I was trying your sample exercise in the Discriminant Analysis section (Example 1, 2 sample grouping), and it seems you get a different discriminant function from Rencher's in his Example 8.2.

His discriminant function gives: z = -1.633 y1 + 1.820 y2
Yours: z = -1.605 y1 + 1.594 y2

The z-scores obtained are very different (rounded):

Rencher's / Using your discriminant function:
55 / 43
52 / 39
59 / 46
53 / 39
53 / 39
47 / 35
49 / 36
45 / 33
47 / 35
48 / 35
48 / 35
40 / 28

Hence it seems that with your function the separation between the two groups (dotted line) is less clear.

Why is there a discrepancy in the Standardised Discriminant Function Coefficients in the first place?

I would also like to add that I am having some trouble running two tests in a row; I have to exit Excel, load again and run again.

Please help me out with these queries (especially the first one!)

Thanks a lot.


  • Hi Enrique

    Thank you for your comments. We have spent a lot of time getting statistiXL to this stage (we've been developing it for 5 years now - while holding down our day jobs!) and hope that it will prove useful (and affordable) to a wide range of people.

    As to your Discriminant Analysis query, the differences between our Discriminant Function and Rencher's are simply down to rounding errors. As far as I can tell, Rencher worked all of his calculations by hand so that readers could follow his examples. To make this manageable (for him at least ... I wouldn't like to try it with pen and paper), he had to restrict the number of significant digits he worked with. Using a computer, we are obviously not under the same restrictions, so statistiXL carries many more significant digits. If you run the sample through another computer-based statistics package (e.g. SPSS, SAS etc.) you will find that it gives the same Discriminant Function as statistiXL. Similarly, during development we found that if we reduced the number of digits statistiXL used, we got answers very similar to Rencher's.
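
    The effect can be illustrated with a minimal Python sketch. This is not statistiXL's code and does not use Rencher's data: the observations are made up, and the helper names `round_sig` and `discriminant_2var` are hypothetical. It assumes the usual unstandardised two-group form a = S_pl^-1 (ybar1 - ybar2) with a pooled covariance matrix, and compares a full-precision calculation with one where every intermediate value is rounded to two significant digits.

```python
from math import floor, log10


def round_sig(x, digits):
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    return round(x, digits - 1 - floor(log10(abs(x))))


def discriminant_2var(group1, group2, digits=None):
    """Two-group, two-variable discriminant coefficients a = S_pl^-1 (ybar1 - ybar2).

    If `digits` is given, every intermediate result is rounded to that many
    significant digits, mimicking a hand calculation.
    """
    r = (lambda x: round_sig(x, digits)) if digits else (lambda x: x)

    def means(group):
        return [r(sum(row[j] for row in group) / len(group)) for j in range(2)]

    def sscp(group, m, j, k):
        # Sum of squares / cross-products about the group mean.
        return sum((row[j] - m[j]) * (row[k] - m[k]) for row in group)

    m1, m2 = means(group1), means(group2)
    dof = len(group1) + len(group2) - 2

    # Pooled covariance matrix entries.
    s11 = r((sscp(group1, m1, 0, 0) + sscp(group2, m2, 0, 0)) / dof)
    s12 = r((sscp(group1, m1, 0, 1) + sscp(group2, m2, 0, 1)) / dof)
    s22 = r((sscp(group1, m1, 1, 1) + sscp(group2, m2, 1, 1)) / dof)

    # Mean difference and explicit 2x2 inverse.
    d1, d2 = r(m1[0] - m2[0]), r(m1[1] - m2[1])
    det = r(s11 * s22 - s12 * s12)
    a1 = r((s22 * d1 - s12 * d2) / det)
    a2 = r((-s12 * d1 + s11 * d2) / det)
    return a1, a2


# Made-up data: (y1, y2) pairs for two groups of three observations each.
group1 = [(1, 2), (2, 3), (3, 5)]
group2 = [(4, 2), (5, 4), (6, 3)]

full = discriminant_2var(group1, group2)               # approximately (-8.0, 5.0)
rounded = discriminant_2var(group1, group2, digits=2)  # noticeably different

print("full precision:", full)
print("2 significant digits:", rounded)
```

    Even with this tiny data set, rounding the intermediates changes both coefficients, because small errors in the pooled covariance matrix are amplified when it is inverted.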

    Your second query is of more concern to me in that I haven't been able to replicate it. Can you provide any more information on exactly what problems you experience when trying to run consecutive analyses? You should be able to run as many as you want without having to close Excel. What versions of Excel and Windows are you using?


  • Hi Alan, thanks very much for your reply.

    You're right - I ran the data in another software package and got the same results as yours. It seemed strange that rounding errors accounted for so much discrepancy - they even change the conclusions, since following Rencher's calculations one would point to one variable (y2) as being slightly more significant in explaining the variation, whereas with your output both variables are about the same.

    Regarding the second question, I'll look up which version of Excel I used when running the program on another computer. All I recall is that when running it again I got a message about problems in the data and had to restart.

    Best regards
