Hobbies And Interests

How to Derive the Chi-Square

The chi-square test is a statistical method for comparing an observed distribution with the distribution expected under a null hypothesis. Scientists use it extensively to decide whether an observed pattern could plausibly have arisen by chance. For example, a geographer trying to discover whether the distribution of apple trees over a hillside is related to the type of soil could use the chi-square test.

Instructions

    • 1

      Create a null hypothesis. A null hypothesis states that there is no relationship between the factors you want to investigate. The test then tells you whether your data give you reason to reject that statement.

      For example, suppose your research suggests a link between soil type and where farmers grow strawberries. The null hypothesis would be "The frequency of farms growing strawberries is not related to soil type."

    • 2

      Establish exactly what you need to record and then create a data sheet on which to record it. For example, to investigate soil type and strawberry growing, record each soil type and the number of farms you identified growing strawberries on that soil type. Assign each soil type a "category number." For example, a completed data sheet may look like this:

      1 : Soil Type = Sand : Number of strawberry farms = 15

      2 : Soil Type = Clay : Number of strawberry farms = 5

      3 : Soil Type = Peat : Number of strawberry farms = 12

      4 : Soil Type = Loam : Number of strawberry farms = 7

      5 : Soil Type = Limestone : Number of strawberry farms = 1

    • 3

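      Determine the "expected" frequency. If the null hypothesis is true and soil type makes no difference, the farms should be spread evenly across the categories, so the expected frequency is the total number of observations divided by the number of categories. For example, a total of 40 observations and five categories gives an expected frequency of eight -- 40 / 5 = 8.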

    • 4

      Subtract the expected frequency from the observed frequency for each category.

      For example, an observed frequency of 15 and an expected frequency of eight gives an (observed - expected) value of seven -- 15 - 8 = 7.

    • 5

      Square the (observed - expected) value for each category. For example, if Category 1 has an observed frequency of 15 and an expected frequency of eight, the (observed - expected) value is seven, and seven squared = 49. Negative differences become positive when squared. Using the example data from Step 2, the data would now look like this:

      Soil Type    Observed    Expected    (Obs - Exp)^2
      Sand         15          8           49
      Clay         5           8           9
      Peat         12          8           16
      Loam         7           8           1
      Limestone    1           8           49

    • 6

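      Add the (observed - expected)^2 values together, and then divide the total by the expected frequency. The result is the chi-square value. Using the data from Step 5, the math is (49 + 9 + 16 + 1 + 49) / 8, which resolves to 124 / 8, or 15.5. Dividing the total once works here only because every category has the same expected frequency. If the expected frequencies differ, divide each squared value by its own expected frequency before adding.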

    • 7

      Calculate the "degrees of freedom" value by subtracting one from the number of categories in your investigation. In the example there are five categories of soil, so the degrees of freedom value is four -- 5 - 1 = 4. Use a table of critical values for chi-square to find the value for your degrees of freedom at a probability of 0.05. If the calculated chi-square value is greater than the value in the table, there is less than a 0.05 (5 percent) probability that differences this large would arise by chance if the null hypothesis were true, so you can reject the null hypothesis.

      Using the example data, the critical value at the 0.05 probability level with four degrees of freedom is 9.4877. The calculated value of 15.5 is greater than 9.4877, so there is less than a 5 percent chance that the observed distribution is random. The data support a link between soil type and strawberry farming, although the test by itself does not show that soil type causes the pattern.
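
The arithmetic in Steps 3 through 7 can be checked with a short Python script. This is only a sketch under the article's assumptions: every category shares the same expected frequency, and the critical value 9.4877 is copied from the table lookup in Step 7 rather than computed. The names `observed`, `chi_square_uniform` and `CRITICAL_0_05_DF4` are made up for this illustration.

```python
# Observed counts from the example data sheet in Step 2.
observed = {"Sand": 15, "Clay": 5, "Peat": 12, "Loam": 7, "Limestone": 1}


def chi_square_uniform(counts):
    """Return (chi-square value, degrees of freedom).

    Assumes the null hypothesis predicts the same expected
    frequency for every category, as in Step 3.
    """
    total = sum(counts.values())          # 40 observations in the example
    expected = total / len(counts)        # Step 3: total / number of categories
    # Steps 4 to 6: square each (observed - expected) difference,
    # divide by the expected frequency, and add the results.
    statistic = sum((obs - expected) ** 2 / expected for obs in counts.values())
    degrees_of_freedom = len(counts) - 1  # Step 7: categories minus one
    return statistic, degrees_of_freedom


# Critical value for 4 degrees of freedom at 0.05, taken from Step 7.
# It applies only to this example; other data need their own table lookup.
CRITICAL_0_05_DF4 = 9.4877

statistic, dof = chi_square_uniform(observed)
reject_null = statistic > CRITICAL_0_05_DF4

print(f"chi-square = {statistic:.1f}, degrees of freedom = {dof}")
print("reject the null hypothesis" if reject_null else "do not reject the null hypothesis")
```

For the example data the script reproduces the hand calculation: a chi-square value of 15.5 with four degrees of freedom, which exceeds 9.4877.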

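For reference, the calculation in Steps 3 through 6 can be written as a single formula. The symbols are notation chosen here, not taken from the steps above: O_i is the observed frequency in category i, E_i is the expected frequency, k is the number of categories and N is the total number of observations. The last two expressions assume, as the example does, that every category has the same expected frequency.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = \frac{N}{k},
\qquad \text{degrees of freedom} = k - 1

% Example data: N = 40, k = 5, so E_i = 8 for every category
\chi^2 = \frac{49 + 9 + 16 + 1 + 49}{8} = \frac{124}{8} = 15.5
```
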
