Sunday, 11 July 2010
Chapter 2. Statistical terms and concepts
In this chapter I shall explain and discuss all the statistical concepts and terms which are involved in the computation of factor analyses. In addition I shall explain the symbols used in the algebra of the technique. For ease of explanation I shall usually assume that scores are test scores, although they could be the measures of any variable such as height or social class.
X refers to a score on any variable, e.g. an intelligence test.
x refers to a deviation score. If a person scores 10 on a test and the average score of his or her group is 15, then the deviation score is -5. Similarly, a score of 21 in that group would yield a deviation score of +6. As will be seen, these deviation scores play an important part in factor analytic computations.
N refers to the number of subjects in a sample.
Σ (capital sigma) means ‘sum of’. Thus ΣX means add together all the Xs – the scores on a test or variable. Similarly Σx means add together all the deviation scores.
There are other symbols used in the computations of factor analysis, but these refer to statistical terms and concepts and will be explicated as the terms arise in the text.
BASIC STATISTICAL TERMS
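The symbols above can be illustrated with a short Python sketch. The five scores are hypothetical, chosen only so that the group average is 15 as in the deviation-score example; the text does not give the actual group.

```python
# Hypothetical group of five test scores (not from the text), chosen so that
# the group average is 15, as in the deviation-score example.
scores = [10, 21, 14, 15, 15]

N = len(scores)        # N: number of subjects in the sample
sum_X = sum(scores)    # ΣX: sum of the raw scores
average = sum_X / N    # average score of the group

# x: deviation score = raw score minus the group average
deviations = [X - average for X in scores]
sum_x = sum(deviations)  # Σx: sum of the deviation scores

print("N =", N, " ΣX =", sum_X, " average =", average)
print("deviation scores:", deviations)
print("Σx =", sum_x)
```

The scores 10 and 21 give deviation scores of -5 and +6, matching the example, and the deviation scores of a group always sum to zero because they are measured from the group's own average.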
The mean is the average score of any group on a test. It is often of interest to compare the mean scores of different groups on a test, e.g. of boys and girls on a reading test. The mean, X̄, is given by
X̄ = ΣX/N (2.1)
where ΣX is the sum of all scores on the test and N is the number in the sample. This is a calculation which most people do at various times in everyday life. The scores are added up and divided by the number of subjects.
The mean indicates the average score of a group and is sometimes referred to in textbooks as a measure of central tendency. It tells us what the group on average is like. However, this is not informative on its own unless the spread or dispersion of the scores in the group is also known. An artificial and exaggerated example will make this point. Suppose that we have two groups, A and B, each of five subjects, who have taken an intelligence test, the scores of which are given in Table 2.1.
Table 2.1. Intelligence test scores
Group A   Group B
10        20
10        1
10        15
9         2
11        12
Applying formula (2.1) in both cases, we can see that the means are identical, i.e. 10. However, the groups are quite different: the scores of subjects in group A in no case overlap the scores of subjects in group B. The difference lies in the dispersion or spread of the scores. In group A the spread is small; in group B it is far larger. Clearly, therefore, it is necessary to have a measure of the dispersion of the scores as well as the mean if the scores of a group are to be properly described. This is given by the standard deviation.
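The comparison of the two groups can be checked numerically. The sketch below applies formula (2.1) to the five scores listed for each group in the chapter's tables; the function name `mean` and the use of the range (highest minus lowest score) as a rough first look at spread are illustrative choices, not part of the text.

```python
# Scores for the two groups, as listed in the chapter's tables.
group_a = [10, 10, 10, 9, 11]
group_b = [20, 1, 15, 2, 12]

def mean(scores):
    """Formula (2.1): sum of the scores divided by the number of subjects."""
    return sum(scores) / len(scores)

for name, scores in (("A", group_a), ("B", group_b)):
    # The range is only a crude indicator of spread, shown here for contrast.
    spread = max(scores) - min(scores)
    print(f"Group {name}: mean = {mean(scores)}, "
          f"lowest = {min(scores)}, highest = {max(scores)}, range = {spread}")
```

Both groups have a mean of 10, yet the scores in group A span only 2 points while those in group B span 19, which is why the mean alone does not describe a group adequately.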
The standard deviation is a measure of dispersion or variation among scores. It is symbolized by either SD or σ (lower case sigma) and is given by
σ = [Σx²/N]^1/2 (2.2)
where Σx² is the sum of the squared deviations and N is the number in the sample. The scores from groups A and B can be used to illustrate this calculation (Table 2.2).
Table 2.2 Standard deviations of intelligence test scores
         Group A                  Group B
 X    X̄     x    x²       X    X̄     x    x²
10    10    0    0       20    10   10   100
10    10    0    0        1    10   -9    81
10    10    0    0       15    10    5    25
 9    10   -1    1        2    10   -8    64
11    10    1    1       12    10    2     4
Σx²              2                       274
Applying formula (2.2) we obtain
Group A SD = (2/5)^1/2 = (0.4)^1/2 = 0.632
Group B SD = (274/5)^1/2 = (54.8)^1/2 = 7.40
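The same arithmetic can be reproduced in a few lines of Python. This is a minimal sketch of formula (2.2) as stated, dividing by N; the function name `standard_deviation` is an illustrative choice. The result is cross-checked against `statistics.pstdev` from the Python standard library, which also divides by N.

```python
import math
import statistics

# Scores for the two groups, as listed in Table 2.2.
group_a = [10, 10, 10, 9, 11]
group_b = [20, 1, 15, 2, 12]

def standard_deviation(scores):
    """Formula (2.2): square root of the mean squared deviation."""
    n = len(scores)
    mean = sum(scores) / n
    # Σx²: sum of the squared deviation scores
    sum_squared_deviations = sum((X - mean) ** 2 for X in scores)
    return math.sqrt(sum_squared_deviations / n)

sd_a = standard_deviation(group_a)
sd_b = standard_deviation(group_b)
print(f"Group A SD = {sd_a:.3f}")
print(f"Group B SD = {sd_b:.2f}")

# Cross-check against the standard library's population standard deviation.
assert math.isclose(sd_a, statistics.pstdev(group_a))
assert math.isclose(sd_b, statistics.pstdev(group_b))
```

Note that many statistics packages default to dividing by N - 1 (the sample estimate), which would give slightly larger values than those obtained from formula (2.2).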
Standard deviations are expressed in the same units as the test. Thus if this is a test with possible scores ranging from 0 to 20, the standard deviation gives us a good indication of the dispersion or spread of scores. As is obvious from these two extreme and tiny samples, group A has a very small