http://groups.google.com/group/sci.stat.ma...e56eb6616146984

Lance ]]>

I just purchased statistiXL and am getting ready to enter data for a survey that is under way. I don't know how best to handle missing data, or cells that are left blank even though the data is not necessarily missing. I will be running mostly descriptive-type stats, plus some relationships and comparisons. Can you tell me what is best to do, or where I might look for an answer specific to how Excel operates?

Thanks for your help and for this handy program!

]]>
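For the blank-cell question above, one common convention is to leave an unanswered item blank rather than typing 0, because Excel's AVERAGE skips empty cells but counts zeros. The sketch below illustrates the difference in Python; the data are made up, `None` stands in for a blank cell, and the function names are hypothetical.

```python
from statistics import mean

# Hypothetical survey column; None stands in for a blank (unanswered) cell.
responses = [4, 5, None, 3, None, 4]

def mean_skipping_blanks(values):
    """Average only the cells that hold a value, ignoring blanks."""
    present = [v for v in values if v is not None]
    return mean(present)

def mean_blanks_as_zero(values):
    """Average after coding every blank as 0, which pulls the mean down."""
    return mean(0 if v is None else v for v in values)

skipped = mean_skipping_blanks(responses)    # 16 / 4
zeroed = mean_blanks_as_zero(responses)      # 16 / 6
n_missing = sum(v is None for v in responses)

print(skipped, zeroed, n_missing)
```

Reporting the number of blanks alongside each statistic keeps it clear how many responses the figure is based on.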

1. EXCEL'S PROBLEM WITH 65,535 & 65,536

See this article online with sample worksheet at http://news.office-watch.com?556

Excel 2007 and Excel Services 2007 have a problem with a few numbers they don't like ... truly.

A few results around the 65,535 and 65,536 marks will not display properly: instead of the correct result, the cell shows " 100,000". The true result is stored, and most cells that depend on the flawed cell will work out correctly, but any screen display or printout is wrong.

2. WHAT HAPPENS

Some Excel 2007 calculations with a result in the range:

65,534.99999999995 to 65,535

or

65,535.99999999995 to 65,536

will display the characters " 100,000 " instead of the correct result. The right number is stored 'under' the cell so calculations based on the 'bad' cell should be OK (unless that result is in the problem range too).

The problem is made worse because these are relatively low numbers, and two integers that are more likely to turn up in live customer worksheets than a number in the millions or billions. For example: I buy 850 widgets at $77.10 each. Is the total cost $65,535 or the $100,000 Excel tells me?

You don't need fancy formulas to make this happen - the bug will appear with any of the following formulas:

= 77.1 * 850

= 10.2 * 6425

=20.4 * 3212.5

=850 * $77.10

But not all results are affected, for example =32767.5*2 displays the correct result.
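The formulas above land in the problem range because 77.1 has no exact binary representation, so the stored product is a hair below 65,535, while 32767.5 is exactly representable. The Python sketch below shows the stored values only (Python floats are the same IEEE 754 doubles that Excel uses); it does not reproduce the Excel 2007 display bug itself.

```python
# 77.1 cannot be stored exactly as a double, so the product is slightly below 65535.
product = 77.1 * 850

# 32767.5 is exactly representable, so doubling it gives exactly 65535.
exact_case = 32767.5 * 2

print(repr(product))          # prints 65534.99999999999
print(product == 65535)       # prints False
print(exact_case == 65535)    # prints True

# Rounding for display hides the tiny error, which is what a correct display would show.
shown = f"{product:,.2f}"
print(shown)                  # prints 65,535.00
```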

Iterative calculations which 'pass through' the range of problem numbers should be OK because they will work on the real cell value not the displayed value - however if the final result of an iteration is in the range then the displayed value could be wrong.

Conditional formatting still works correctly, though it might seem wrong. That's because Excel conditional formatting works off the actual cell value, not the displayed number. If you have a condition that triggers when a cell equals 65,535, then Excel 2007 may trigger that condition even though the cell is displaying " 100,000 ".

Surprisingly, the TEXT function (which converts a number to text) works off the displayed value not the true cell value.

________

]]>

The statistiXL PCA analysis gives me the '% of Var' at the top of the results page. However, I believe this is the % of Var for PCs of the column X column covariance matrix, while the casewise score PCs are from a row X row covariance matrix. So I believe the '% of Var' does not apply to the first 3 casewise score PCs, and I should transpose the data and re-run the PCA to get the correct % of Var calculation. However, I'm not sure. Any insight? Sorry if this is confusing - please ask me to clarify if necessary. ]]>

I would appreciate your help and I would like to thank you in advance. ]]>

http://www.medscape.com/viewarticle/546515_print

The double entry method is perhaps the most valuable suggestion. I also always look for outliers and off patterns in the data - what Vickers calls "consistency checks."

I don't think Excel can produce programmed analyses (I presume they mean "R" which is a statistical programming language that is becoming standard for many medical journals). Also I don't think Excel can create "log" files. If anyone knows how let me know!

Lance

]]>

Several times I have seen Excel's STDEV function return a negative value as the standard deviation of some collection of numbers. Perhaps I am ignorant, but this seems quite wrong to me. Does anyone know why this happens, what it means, and what, if anything, can be done about it?

Thanks for any help.

Lance ]]>
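A sample standard deviation is the square root of a variance, so by definition it cannot be negative. One way to check a suspicious STDEV result is to recompute it independently from the same numbers. The sketch below does that in Python with made-up data and a hypothetical helper name; it is only a cross-check and does not explain what Excel displayed in the case above.

```python
import math
from statistics import stdev

# Hypothetical data; replace with the values from the worksheet range.
values = [12.1, 11.8, 12.4, 12.0, 11.9]

def two_pass_stdev(xs):
    """Sample standard deviation (n - 1 divisor, like STDEV): mean first, then squared deviations."""
    n = len(xs)
    m = sum(xs) / n
    sum_sq_dev = sum((x - m) ** 2 for x in xs)   # a sum of squares, never negative
    return math.sqrt(sum_sq_dev / (n - 1))

manual = two_pass_stdev(values)
library = stdev(values)

print(manual, library)
```

If an independent calculation like this disagrees with the worksheet, it is worth checking the cell's number format and the range the formula actually refers to.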

I'm trying out the cluster analysis tool after running Principal Component Analysis, extracting the most highly correlated variables, and standardizing their units ((x - mean)/standard dev). I am then using Ward's method with squared Euclidean distances, and looking at the resulting dendrogram. I understand that the distances are somehow related to SSE values, but what specific units would be assigned to the x axis? Keep in mind that the units are standardized prior to cluster analysis.

Thanks!

Kerry ]]>
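On the units question above: a z-score divides out the original units, so distances computed from standardized variables are unitless (squared Euclidean distances are in squared standard deviations). The sketch below shows that step in Python with made-up variables and hypothetical names; how statistiXL scales the dendrogram axis for Ward's method is not stated in the post, so that part is not covered.

```python
from statistics import mean, stdev

# Hypothetical raw data: two variables measured in different units.
raw = {
    "length_cm": [10.0, 12.0, 14.0, 16.0],
    "mass_g": [200.0, 260.0, 240.0, 300.0],
}

def zscores(xs):
    """Standardize one variable: (x - mean) / standard deviation."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]

standardized = {name: zscores(col) for name, col in raw.items()}

def squared_euclidean(i, j):
    """Squared Euclidean distance between cases i and j on the standardized variables."""
    return sum((col[i] - col[j]) ** 2 for col in standardized.values())

d01 = squared_euclidean(0, 1)
print(d01)
```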

Is there any guidance on how to choose the right test? What I mean is something where I answer a series of questions about the type of data I have and what I want to do with it, and finally reach a suggestion on which test is the right one to use and how to read the results.

This question is not only about which statistical test to use, but more generally about whether there is any guidance on statistics anywhere.

Thank you in advance for your help. ]]>

I'm completely new to statistics. Can anyone kindly point me to any references or readings that cover the goodness of fit test? For example, what are degrees of freedom and the p-value? What role do they play in goodness of fit tests?

Or can someone give a quick explanation here, please?

thanks a million ]]>
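As a quick illustration for the question above, here is a chi-square goodness of fit test worked by hand in Python on made-up die rolls. The degrees of freedom are the number of categories minus one, and the p-value is the probability, if the die really is fair, of a statistic at least as large as the one observed. The sketch compares the statistic with the standard table value 11.07 (5 degrees of freedom, 5% level), which is equivalent to asking whether the p-value is below 0.05.

```python
# Hypothetical counts for faces 1..6 from 60 rolls of a die.
observed = [8, 12, 9, 11, 10, 10]

# Under the "fair die" hypothesis every face is expected equally often.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: categories minus one (the total is fixed).
df = len(observed) - 1

# Standard table value for df = 5 at the 5% significance level.
CRITICAL_5PCT_DF5 = 11.07
reject_fair_die = statistic > CRITICAL_5PCT_DF5

print(statistic, df, reject_fair_die)
```

Here the statistic is far below the critical value, so these counts give no evidence against a fair die.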

I used your pca.xls file in order to calculate the scores for each component. I used your equation:

PC1 = 0.207 WDIM + 0.873 CIRCUM + 0.261 FBEYE + 0.326 EYEHD + 0.066 EARHD + 0.128 JAW

PC2 = -0.142 WDIM -0.219 CIRCUM - 0.231 FBEYE + 0.891 EYEHB + 0.222 EARHD - 0.187 JAW

But I did not get the same scores as yours.

Could you tell me what I must input for WDIM, FBEYE, EARHD and JAW in this exercise? I know that those names are the variables, but does that mean the standardized values for each variable?

Many thanks!

Mercedes ]]>
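A mismatch like the one described above often comes from plugging raw values into the equation. Component scores are normally computed from mean-centered values (covariance-matrix PCA) or fully standardized values (correlation-matrix PCA). Which of the two pca.xls uses is not stated in the post, so the sketch below computes both; the PC1 coefficients are taken from the post, while the measurements and the function name are made up.

```python
from statistics import mean, stdev

# PC1 coefficients as quoted in the post.
PC1 = {"WDIM": 0.207, "CIRCUM": 0.873, "FBEYE": 0.261,
       "EYEHD": 0.326, "EARHD": 0.066, "JAW": 0.128}

# Hypothetical measurements for three cases (not the pca.xls data).
data = {
    "WDIM": [13.5, 15.5, 14.5],
    "CIRCUM": [57.0, 58.5, 56.0],
    "FBEYE": [19.5, 21.0, 19.0],
    "EYEHD": [12.5, 12.0, 10.0],
    "EARHD": [14.0, 16.0, 13.0],
    "JAW": [11.0, 12.0, 12.0],
}

def component_scores(coefs, columns, standardize):
    """Score per case: sum of coefficient * deviation from the variable mean.
    With standardize=True each deviation is also divided by the standard deviation."""
    n_cases = len(next(iter(columns.values())))
    scores = []
    for i in range(n_cases):
        total = 0.0
        for name, coef in coefs.items():
            col = columns[name]
            deviation = col[i] - mean(col)
            if standardize:
                deviation /= stdev(col)
            total += coef * deviation
        scores.append(total)
    return scores

centered_scores = component_scores(PC1, data, standardize=False)
standardized_scores = component_scores(PC1, data, standardize=True)
print(centered_scores)
print(standardized_scores)
```

Either way the scores average to zero across cases, which is a quick sanity check against the published scores.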

I want to calculate a regression on one independent variable with repeated measurements. An instrument measures a substance level at discrete steps (calibration curve) and this is repeated several times. In Excel and statistiXL only two one-dimensional vectors can be correlated. Is there a procedure for repeated measures at distinct levels?

Klaus ]]>
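For the calibration question above, one common approach is to stack every repeated reading as its own (level, reading) pair, so the data still form two one-dimensional columns that an ordinary regression can take. The Python sketch below does this with made-up readings and a hypothetical helper; whether simple least squares is appropriate depends on the instrument and is an assumption here.

```python
# Hypothetical calibration data: level -> repeated instrument readings.
replicates = {
    1.0: [2.1, 1.9, 2.0],
    2.0: [4.0, 4.1, 3.9],
    3.0: [6.0, 5.9, 6.1],
}

# Stack the replicates: repeat the level once for each reading.
x, y = [], []
for level, readings in replicates.items():
    for reading in readings:
        x.append(level)
        y.append(reading)

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(x, y)
print(slope, intercept)
```

In a worksheet the same layout is two columns of nine rows each, with the level repeated down the first column.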

Suppose 4 people take a test either after drug 1 or drug 2. And with each drug, they take the test either with a strong or a dim light. How would I enter the data in Excel?

Thanks!

]]>
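For the layout question above, a widely used arrangement is "long" format: one row per measurement, with columns for subject, drug, light and score. The sketch below generates such a sheet as CSV text in Python. It assumes every person is tested under all four drug/light combinations, which the post does not say, and the column names are hypothetical; if each person gets only one drug, the rows for the other drug would simply be absent.

```python
import csv
import io
from itertools import product

subjects = [1, 2, 3, 4]
drugs = ["drug 1", "drug 2"]
lights = ["strong", "dim"]

# One row per subject x drug x light combination; score is left blank to fill in.
rows = [
    {"subject": s, "drug": d, "light": l, "score": ""}
    for s, d, l in product(subjects, drugs, lights)
]

# Write the rows as CSV text that can be opened directly in Excel.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["subject", "drug", "light", "score"])
writer.writeheader()
writer.writerows(rows)
sheet_text = buffer.getvalue()

print(sheet_text)
```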