This work was created by Dr Jamie Love and is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Teacher's Study Guide for Lesson Eleven
The Chi-Square is a "Ratio Ruling"

by Dr Jamie Love, 2002 - 2010

The ratios might be slightly "skewed" (a math term for "off") due to the number of individuals you collected or just "bad luck".

P = smooth seeds crossed with wrinkled seeds
F1 = all smooth seeds (so smooth is dominant and wrinkled is recessive)
F2 = 5,474 smooth seeds and 1,850 wrinkled seeds is a ratio of 2.96 : 1

Mendel and the Punnett square tell us that we should have a ratio of 3 : 1 not 2.96 : 1!
It depends upon how close the actual, observed numbers are to the calculated, expected numbers.
To decide, we use the chi-square (abbreviated χ²) test.
The chi-square test, or simply the "chi-square", measures the significance of the data in comparison with what you expect to get. The χ² only requires that you know the number of individuals observed in each category and what numbers you expected them to be.

In chi-square analysis you compare the number of individuals of a certain phenotype (or anything else) that you have found in the experiment to the number you expected to have.
You find the difference between the observed and the expected by simply subtracting one from the other.
Then, you square it (multiply it by itself). That gives you the "squared difference".
Then you divide the "squared difference" by what you expected in the first place in order to give you a "squared difference per expected" for that group.
Take into account all the different types by adding together these "squared differences per expected" values for each class.
The final sum (of the "squared differences per expected") gives you a number called the χ².
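
The steps just described can be sketched in a few lines of Python. This is only an illustration and not part of the original lesson: the function name chi_square is made up for this sketch, it assumes the observed and expected counts are listed in the same order, and the numbers in the example are toy values, not data.

```python
def chi_square(observed, expected):
    """Add up the "squared difference per expected" for every class."""
    total = 0.0
    for o, e in zip(observed, expected):
        difference = o - e                    # observed minus expected
        squared_difference = difference ** 2  # multiply it by itself
        total += squared_difference / e       # divide by the expected, then add up
    return total


# Toy numbers (made up): 60 and 40 observed where 50 and 50 were expected.
print(chi_square([60, 40], [50, 50]))  # prints 4.0
```

The sign of each difference does not matter, because squaring removes it.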

F2 = 5,474 smooth seeds and 1,850 wrinkled seeds for a ratio of 2.96 : 1
We use the "raw numbers" of the data and compare them to the "raw numbers" we expected.

Step 1: calculate the EXPECTED number of each type.
Add together both seed types = a total population of 7,324 seeds.
Of those 7,324 seeds you expected a quarter of them (1 in 4) to be wrinkled.
So, how many of those 7,324 seeds should be wrinkled?
[Divide 7,324 by 4 to get 1,831 wrinkled seeds expected.]
How many smooth seeds should you expect from the total of 7,324 seeds?
[Three times as many smooth as wrinkled so 1,831 times 3 = 5,493.]
We expected 5,493 smooth seeds and 1,831 wrinkled seeds.
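
Step 1 can be checked with a short Python sketch. The helper name expected_counts is hypothetical (it is not from the lesson), and it assumes the ratio is given as whole-number parts such as (3, 1) for 3 : 1.

```python
def expected_counts(total, ratio):
    """Split `total` individuals according to `ratio`, e.g. (3, 1) for 3 : 1."""
    parts = sum(ratio)  # 3 + 1 = 4 equal shares
    return [total * r / parts for r in ratio]


observed = {"smooth": 5474, "wrinkled": 1850}
total = sum(observed.values())                  # 7,324 seeds in all
smooth, wrinkled = expected_counts(total, (3, 1))
print(total, smooth, wrinkled)                  # prints 7324 5493.0 1831.0
```

The same helper should work for any ratio, because it only divides the total into equal shares and hands each class its number of shares.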

Step 2: calculate the "squared differences per expected".
We observed 5,474 smooth seeds but expected 5,493, a DIFFERENCE of 19.
Next we SQUARE THE DIFFERENCE. 19 x 19 (or 19²) = 361.
Next we find the SQUARE OF THE DIFFERENCE PER EXPECTED by dividing that number by the total number of smooth seeds we expected to see (5,493). So that is 361/5,493 = 0.066
Now the next class. We observed 1,850 wrinkled seeds but expected 1,831. That's a difference of 19. Square the difference to get 361. Divide it by the expected number of wrinkled seeds (NOT the expected number of smooth seeds - a common mistake) so that is 361/1,831 = 0.197

Step 3: congratulate yourself for having got through the toughest part!

Step 4: add up the "squared differences per expected" from all the categories.
0.066 + 0.197 = 0.263

This experiment (above) has a chi-square equal to 0.263 (χ² = 0.263).
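
Here is a small Python sketch of Steps 2 and 4 for the smooth and wrinkled seeds, printing each intermediate value so it can be compared with the hand calculation. The variable names are assumptions made for this illustration.

```python
observed = {"smooth": 5474, "wrinkled": 1850}
expected = {"smooth": 5493, "wrinkled": 1831}

chi_square = 0.0
for seed_type in observed:
    # The difference may come out negative; squaring removes the sign.
    difference = observed[seed_type] - expected[seed_type]
    per_expected = difference ** 2 / expected[seed_type]
    print(seed_type, difference, round(per_expected, 3))
    chi_square += per_expected

print(round(chi_square, 3))  # prints 0.263
```

Note that each class is divided by its own expected number, which is the step the lesson warns is easy to get wrong.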

Step 5: compare our chi-square value to the value in a chi-square significance table and determine if our value is significant.
For our work we want to know if these results pass a significance level of 5%.

Your degrees of freedom are one less than the number of categories you have to work with.
We have two categories, smooth and wrinkled, so we have one degree of freedom.
With one degree of freedom we are allowed a chi-square as large as 3.84 at the 5% significance level; up to that value the results are not significantly different from what we expected.

Degrees of Freedom    5% Significance Level
1                     3.84
2                     5.99
3                     7.81
4                     9.49

We would have to get a chi-square value over 3.84 before we would say that our results were so far from a 3 : 1 ratio that we would have to reject that ratio (and Mendel's explanation of how he got that ratio).
With a χ² = 0.263, far below 3.84, the small departure from 3 : 1 is the kind that chance alone easily produces, so we have no reason to reject the 3 : 1 ratio.
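
The table lookup in Step 5 can be written as a short Python sketch. The table values are copied from this lesson; the names CRITICAL_5_PERCENT and fits_ratio are made up for the illustration, and the rule assumed here is the one stated above: reject only when the chi-square is over the table value.

```python
# 5% table values from the lesson, keyed by degrees of freedom.
CRITICAL_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}


def fits_ratio(chi_square, categories):
    """True if the chi-square does not exceed the 5% table value."""
    degrees_of_freedom = categories - 1  # one less than the number of categories
    return chi_square <= CRITICAL_5_PERCENT[degrees_of_freedom]


# Two categories (smooth and wrinkled), chi-square of 0.263.
print(fits_ratio(0.263, categories=2))  # prints True
```

A result of True means the expected ratio is kept; False means the data are too far from it.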

The Chi-square is a kind of "mathematical judge" of probabilities.

Imagine Mendel observed in this experiment 5,493 smooth seeds and 1,831 wrinkled seeds. That is an exact 3 : 1 ratio.
Looking first at the smooth seeds we would see that the difference between the observed and expected is zero!
When we square zero we still get zero. If we divide zero by the expected value we get zero!
The same happens when we calculate the values for the wrinkled seeds too.
Now we would add those two values together (because they are the "squared differences per expected") to get a final χ² = 0.
In other words, when the chi-square equals zero the experimental results are in exactly the ratio expected!

Conversely, the farther the chi-square gets from zero, the less likely it is that the ratio "rule" is being followed.
We could have got a chi-square value as high as 3.84 and still feel that we were close enough to the 3 : 1 ratio to not be worried.
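
For the curious, here is a hedged sketch of where the 3.84 comes from. It is not part of the lesson. For one degree of freedom only, the chance of getting a chi-square at least this large when the ratio really holds can be computed with the complementary error function from Python's standard math module. The function name p_value_one_df is an assumption made for this sketch.

```python
import math


def p_value_one_df(chi_square):
    """Chance of a chi-square this large or larger (one degree of freedom only)."""
    return math.erfc(math.sqrt(chi_square / 2))


print(round(p_value_one_df(3.84), 3))   # about 0.05, the 5% level
print(round(p_value_one_df(0.263), 2))  # about 0.61
print(p_value_one_df(0))                # a chi-square of zero gives 1.0
```

So a chi-square of 3.84 marks the point where only about 5% of honest 3 : 1 experiments would stray that far, while a value like 0.263 is entirely ordinary.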

Here's another set of results from Mendel's monohybrid cross experiments. Let's do the chi-square analysis of it.
P = green seeds crossed with yellow seeds
F1 = all yellow seeds
(So which color is dominant?)
[yellow]
F2 = 6,022 yellow seeds and 2,001 green seeds

First, calculate the expected number of each type.
You have a total of 8,023 seeds (6,022 yellow + 2,001 green = 8,023).
The green seeds should make up a quarter of that population, so dividing 8,023 by 4 gives you 2,005.75.
That means you expected 2,005 and ¾ seeds.
The yellow seeds should make up the rest of the sample so we can find their number by subtracting 2,005.75 from the total 8,023 to get 6,017.25.
So, from the total of 8,023 seeds you expected 6,017.25 to be yellow and 2,005.75 to be green.

Second, calculate the "squared differences per expected".
You expected 2,005.75 greens but observed 2,001 and that is a difference of 4.75 (2,005.75 - 2,001 = 4.75).
Square that number and you get 22.56. Divide it by the number you expected (2,005.75) and you get 0.011.
You expected 6,017.25 yellows but observed 6,022 and that is also a difference of 4.75 (6,017.25 - 6,022 = -4.75, and the sign disappears when you square). Square that number and you get 22.56. Divide it by the number you expected (6,017.25) and you get 0.004.

Third, add up the "squared differences per expected" from all the categories. That's 0.011 + 0.004 = 0.015 so your χ² = 0.015.
We are working with only one degree of freedom, so the 5% limit is 3.84, and 0.015 is far below it.

The ratio observed in this experiment
(6,022 yellow : 2,001 green, or 3.01 : 1)
is not far enough from the 3 : 1 ratio to cause concern.
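
The whole yellow and green calculation, including the fractional expected numbers, can be reproduced with one small function. This is a sketch under stated assumptions: the name chi_square_for_ratio is hypothetical, and the observed counts must be listed in the same order as the ratio (here yellow first, because yellow is the "3").

```python
def chi_square_for_ratio(observed, ratio):
    """Chi-square of observed counts against an expected ratio such as (3, 1)."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]  # 6,017.25 and 2,005.75 here
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Yellow first, then green, to match the 3 : 1 order.
result = chi_square_for_ratio([6022, 2001], (3, 1))
print(round(result, 3))  # prints 0.015
```

Expected numbers do not have to be whole seeds; the arithmetic works just as well with three quarters of a seed.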

Degrees of Freedom    5% Significance Level
1                     3.84
2                     5.99
3                     7.81
4                     9.49

The chi-square formula is χ² = Σ [(O - E)²/E]
"O" is the number observed and "E" is the number expected.
The symbol "Σ" (called "sigma") is used throughout math to mean "sum".

Let's assume the results of a cross were
3,087 yellow seeds and 2,937 green seeds.

There are 6,024 seeds in total. If the 3 : 1 ratio applies then one quarter of them should be green.
That means 1,506 should be green and the rest (4,518) should be yellow.
Doing the greens first, that's (O - E)²/E = (2,937 - 1,506)²/1,506 = 1,359.7
The yellows will be (O - E)²/E = (3,087 - 4,518)²/4,518 = 453.2.
Now add them together (that's what Σ means) to get a χ² = 1,812.9.
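
In code, the sigma is nothing more than a sum over the classes. The sketch below, with variable names chosen only for this illustration, builds one (O - E)²/E term per class and then adds them. Because it adds the unrounded terms, its total comes out at about 1,813.0 rather than the 1,812.9 obtained by adding the two rounded terms. Either way the value is enormous compared with 3.84.

```python
observed = [2937, 3087]  # O: green, yellow
expected = [1506, 4518]  # E: green, yellow under a 3 : 1 ratio

# One (O - E)²/E term per class...
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
# ...and sigma simply means "add them all up".
chi_square = sum(terms)

print([round(t, 1) for t in terms], round(chi_square, 1))
```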

Degrees of Freedom    5% Significance Level
1                     3.84
2                     5.99
3                     7.81
4                     9.49

Our calculated chi-square shows that these experimental results are well outside the acceptable range for a 3 : 1 ratio so we reject the idea that these results represent a 3 : 1 ratio.

What ratio are they close to and how would you test the ratio to see if it is close enough?
[Close to a 1 : 1 ratio.]

Do another chi-square on that same data (3,087 yellow seeds and 2,937 green seeds) and see if it is close enough to a 1 : 1 ratio.

There is a total of 6,024 seeds so a 1 : 1 ratio should show us
3,012 yellow seeds and 3,012 green seeds.
The greens will be (O - E)²/E = (2,937 - 3,012)²/3,012 = 1.867.
The yellows will be (O - E)²/E = (3,087 - 3,012)²/3,012 = 1.867 (again).
Adding them together gives a χ² = 3.734. Compare that with the table.

Degrees of Freedom    5% Significance Level
1                     3.84
2                     5.99
3                     7.81
4                     9.49

The new χ², using a 1 : 1 ratio, is below the 5% limit for one degree of freedom (3.734 is less than 3.84), so we do not reject a 1 : 1 ratio.
Therefore, the results of this experiment are too far from being 3 : 1. I think they are really 1 : 1.
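
Testing the same data against more than one candidate ratio is easy to sketch in Python. The function name, the constant CRITICAL_ONE_DF and the words "keep" and "reject" are assumptions made for this illustration; the 3.84 limit is the lesson's table value for one degree of freedom.

```python
CRITICAL_ONE_DF = 3.84  # 5% table value for one degree of freedom


def chi_square_for_ratio(observed, ratio):
    """Chi-square of observed counts against an expected ratio such as (3, 1)."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [3087, 2937]  # yellow, green
for ratio in [(3, 1), (1, 1)]:
    value = chi_square_for_ratio(observed, ratio)
    verdict = "keep" if value <= CRITICAL_ONE_DF else "reject"
    print(ratio, round(value, 3), verdict)
```

The 3 : 1 ratio is rejected and the 1 : 1 ratio is kept. The unrounded 1 : 1 value is about 3.735; the hand calculation's 3.734 comes from adding 1.867 twice.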

The chi-square is used whenever you want to compare the observed results to the ones you would expect from a certain ratio. That ratio could be 1 : 1 or 1 : 3 or even 9 : 3 : 3 : 1.
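
As a sketch of the four-class case, the example below tests a 9 : 3 : 3 : 1 ratio. The counts are not from this lesson: they are the figures usually quoted for Mendel's dihybrid F2 cross and are used here only as an assumption for illustration. With four categories there are three degrees of freedom, so the limit from the table is 7.81.

```python
CRITICAL_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}  # the lesson's table

# Assumed counts (not from this lesson): round yellow, round green,
# wrinkled yellow, wrinkled green.
observed = [315, 108, 101, 32]
ratio = (9, 3, 3, 1)

total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1  # four categories, so three

print(round(chi_square, 2), degrees_of_freedom)              # prints 0.47 3
print(chi_square <= CRITICAL_5_PERCENT[degrees_of_freedom])  # prints True
```

Nothing in the method changes with more classes; only the number of terms in the sum and the row of the table do.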