Statistical Analysis
Learning Objectives
Students will be able to:
- Calculate mean, median, minimum, maximum, and range for sensor data
- Use statistical measures to compare different conditions
- Calculate the percentage of time above/below thresholds
- Interpret what statistics tell us about air quality
Beyond the Graph
Graphs show patterns. Statistics tell the story.
While graphs help us see trends, statistical measures give us specific numbers we can compare. Instead of saying "CO2 seemed higher in Room A," we can say "Room A averaged 1,150 ppm vs. 780 ppm in Room B."
Key Statistical Measures
Mean (Average)
What it tells you: The average level across all readings
How to calculate: Sum of all values ÷ number of values
Spreadsheet: =AVERAGE(A1:A10)
Median
What it tells you: The middle value (about half the readings are above it, half below)
How to calculate: Sort the values and take the middle one. With an even number of values, average the two middle values
Spreadsheet: =MEDIAN(A1:A10)
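For classes working in Python rather than a spreadsheet, the sort-and-pick-the-middle rule can be sketched as below. The function name `median_by_hand` is made up for this illustration, and the two short lists are toy values, not sensor data.

```python
def median_by_hand(values):
    """Return the median by sorting and picking the middle value(s)."""
    if not values:
        raise ValueError("need at least one reading")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value
        return ordered[mid]
    # Even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_hand([3, 1, 2]))     # 2
print(median_by_hand([4, 1, 3, 2]))  # 2.5
```

Python's built-in `statistics.median` applies the same rule, so this version is only useful for seeing the steps.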
Minimum & Maximum
What they tell you: The lowest and highest values recorded
Why useful: Shows the extremes your environment reached
Spreadsheet: =MIN(A1:A10) and =MAX(A1:A10)
Range
What it tells you: How much values varied
How to calculate: Maximum − Minimum
Example: A range of 600 ppm means the highest reading was 600 ppm above the lowest, which is a lot of variation for CO2 in one room
Example: Calculating Statistics
Sample data: nine CO2 readings taken over 45 minutes (ppm): 520, 680, 790, 890, 985, 1050, 1120, 1180, 1090
| Statistic | Calculation | Result |
|---|---|---|
| Mean | (520+680+790+890+985+1050+1120+1180+1090) ÷ 9 | 923 ppm |
| Median | Middle of sorted list: 520, 680, 790, 890, 985, 1050, 1090, 1120, 1180 | 985 ppm |
| Minimum | Lowest value | 520 ppm |
| Maximum | Highest value | 1,180 ppm |
| Range | 1,180 − 520 | 660 ppm |
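The same results can be checked in Python. This is a minimal sketch using the standard `statistics` module and the nine readings from the table above. The variable names `readings` and `summary` are chosen for this example only.

```python
from statistics import mean, median

# CO2 readings in ppm, copied from the example table
readings = [520, 680, 790, 890, 985, 1050, 1120, 1180, 1090]

summary = {
    "mean": round(mean(readings)),  # 8305 / 9 is about 922.8, rounded to 923
    "median": median(readings),
    "minimum": min(readings),
    "maximum": max(readings),
    "range": max(readings) - min(readings),
}

for name, value in summary.items():
    print(f"{name}: {value} ppm")
```

The printed values should match the Result column: 923, 985, 520, 1180, and 660 ppm.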
Threshold Analysis
Often we want to know: How often does air quality exceed a threshold?
Calculating Percentage Above Threshold
Formula:
% Above Threshold = (Readings above threshold ÷ Total readings) × 100
Example
Using the data above, how much time was CO2 above 1,000 ppm?
- Values above 1,000: 1050, 1120, 1180, 1090 = 4 readings
- Total readings: 9
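- Percentage: 4 ÷ 9 × 100 ≈ 44%
Interpretation: About 44% of the readings exceeded 1,000 ppm. If the readings were evenly spaced, that means CO2 was above 1,000 ppm for roughly 44% of the class period.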
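The threshold formula translates directly into a few lines of Python. The sketch below assumes "above" means strictly greater than the threshold, which matches the `">1000"` spreadsheet condition. The helper name `percent_above` is hypothetical.

```python
def percent_above(readings, threshold):
    """Percentage of readings strictly greater than the threshold."""
    if not readings:
        raise ValueError("need at least one reading")
    above = sum(1 for value in readings if value > threshold)
    return above / len(readings) * 100


# CO2 readings in ppm from the worked example
readings = [520, 680, 790, 890, 985, 1050, 1120, 1180, 1090]
share = percent_above(readings, 1000)
print(f"{share:.1f}% of readings were above 1,000 ppm")  # 44.4%
```

Counting readings only equals "percentage of time" when the sensor logs at a steady interval. With uneven logging, each reading would need to be weighted by how long it lasted.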
Comparing Conditions
Statistical comparisons help answer research questions.
Example: Windows Open vs. Closed
| Statistic | Windows Closed | Windows Open | Difference |
|---|---|---|---|
| Mean CO2 | 1,150 ppm | 720 ppm | 430 ppm lower |
| Maximum | 1,450 ppm | 890 ppm | 560 ppm lower |
| % Above 1,000 ppm | 72% | 0% | 72 percentage points lower |
Conclusion: Opening windows reduced average CO2 by 430 ppm and eliminated all readings above 1,000 ppm.
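The Difference column is a subtraction of one condition's summary from the other's. As a rough sketch, the table's values can be stored in two dictionaries and compared. The dictionary keys and the `differences` helper are names invented for this example.

```python
# Summary values copied from the table (ppm, except the percentage)
closed = {"mean": 1150, "maximum": 1450, "pct_above_1000": 72}
opened = {"mean": 720, "maximum": 890, "pct_above_1000": 0}


def differences(baseline, comparison):
    """Subtract the comparison value from the baseline for each statistic in baseline."""
    return {name: baseline[name] - comparison[name] for name in baseline}


diff = differences(closed, opened)
print(diff)  # {'mean': 430, 'maximum': 560, 'pct_above_1000': 72}
```

A positive difference here means the windows-open value was lower. The last entry is a difference in percentage points, not a percent change.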
Mean vs. Median: When to Use Which
Use Mean When...
- Data is fairly symmetric
- No extreme outliers
- You want to include all values equally
- Example: Average CO2 over a normal day
Use Median When...
- Data has extreme values (outliers)
- Data is skewed
- You want the "typical" experience
- Example: PM2.5 with occasional spikes from cooking
Example: Why Median Matters
PM2.5 data: 8, 10, 12, 11, 9, 145 (spike from someone cooking)
- Mean: 32.5 μg/m³ — Suggests "Moderate" air quality
- Median: 10.5 μg/m³ — Shows air was usually "Good"
- The median better represents typical conditions; the mean is inflated by one spike
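To see how much one spike moves each measure, the PM2.5 example can be run in Python with and without the 145 reading. This is only an illustration using the six values above. Dropping the spike is done here to show its effect, not as a recommended way to clean data.

```python
from statistics import mean, median

# PM2.5 readings in µg/m³ from the example, including the cooking spike
pm25 = [8, 10, 12, 11, 9, 145]

print(mean(pm25))    # 32.5
print(median(pm25))  # 10.5

# Same readings with the spike left out, for comparison only
without_spike = [value for value in pm25 if value != 145]
print(mean(without_spike), median(without_spike))
```

Without the spike, the mean and the median both come out at 10. The single reading of 145 shifts the mean by more than 20 µg/m³ but moves the median by only 0.5.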
Spreadsheet Functions Quick Reference
| What You Want | Google Sheets / Excel Formula | Example Result |
|---|---|---|
| Average | =AVERAGE(B2:B50) | 923 |
| Median | =MEDIAN(B2:B50) | 985 |
| Minimum | =MIN(B2:B50) | 520 |
| Maximum | =MAX(B2:B50) | 1180 |
| Count readings | =COUNT(B2:B50) | 49 |
| Count above 1000 | =COUNTIF(B2:B50,">1000") | 22 |
| % above 1000 | =COUNTIF(B2:B50,">1000")/COUNT(B2:B50)*100 | 44.9% |
Activity: Analyze Your Data
Statistical Analysis Worksheet
Calculate the following for your collected data:
| Statistic | Your Value | What It Means |
|---|---|---|
| Mean | | |
| Median | | |
| Minimum | | |
| Maximum | | |
| Range | | |
| % Above threshold | | |
If Comparing Conditions
Calculate the same statistics for each condition (e.g., windows open vs. closed) and find the difference.
Key Takeaway
Statistical measures turn raw data into meaningful summaries. The mean tells you the average, the median shows the typical value, and the range reveals variability. Threshold analysis tells you how often conditions exceeded a chosen threshold. Together, these statistics provide the evidence you need to support your conclusions and recommendations.