Topic 1

📊 What is Statistics & Why It Matters

The science of collecting, organizing, analyzing, and interpreting data

Introduction

What is it? Statistics is a branch of mathematics that deals with data. It provides methods to make sense of numbers and help us make informed decisions based on evidence rather than guesswork.

Why it matters: From business forecasting to medical research, sports analysis to government policy, statistics powers nearly every decision in our modern world.

When to use it: Whenever you need to understand patterns, test theories, make predictions, or draw conclusions from data.

💡 REAL-WORLD EXAMPLE

Imagine Netflix deciding what shows to produce. They analyze viewing statistics: what genres people watch, when they pause, what they finish. Statistics transforms millions of data points into actionable insights like "Create more thriller series" or "Release episodes on Fridays."

Two Branches of Statistics

Descriptive Statistics

  • Summarizes and describes data
  • Uses charts, graphs, averages
  • Example: "Average class score is 85"

Inferential Statistics

  • Makes predictions and inferences
  • Tests hypotheses
  • Example: "New teaching method improves scores"

Use Cases & Applications

  • Healthcare: Clinical trials testing new drugs, disease outbreak tracking
  • Business: Customer behavior analysis, sales forecasting, A/B testing
  • Government: Census data, economic indicators, policy impact assessment
  • Sports: Player performance metrics, game strategy optimization

🎯 Key Takeaways

  • Statistics transforms raw data into meaningful insights
  • Two main branches: Descriptive (summarizing the data you have) and Inferential (drawing conclusions beyond it)
  • Essential for decision-making across all fields
  • Combines mathematics with real-world problem solving
Topic 2

👥 Population vs Sample

Understanding the difference between the entire group and a subset

Introduction

What is it? A population includes ALL members of a defined group. A sample is a subset selected from that population.

Why it matters: It's usually impossible or impractical to study entire populations. Sampling allows us to make inferences about large groups by studying smaller representative groups.

When to use it: Use populations when you can access all data; use samples when populations are too large, expensive, or time-consuming to study.

💡 REAL-WORLD ANALOGY

Think of tasting soup. You don't need to eat the entire pot (population) to know if it needs salt. A single spoonful (sample) gives you a good idea, as long as you stirred it well first!

Key Differences

Aspect   | Population                   | Sample
Size     | Entire group (N)             | Subset (n)
Symbol   | N (uppercase)                | n (lowercase)
Cost     | High                         | Lower
Time     | Long                         | Shorter
Accuracy | 100% (if measured correctly) | Has sampling error

⚠️ COMMON MISTAKE

Biased Sampling: If your sample doesn't represent the population, your conclusions will be wrong. Example: Surveying only morning shoppers at a store will miss evening customer patterns.

✅ PRO TIP

For a sample to be representative, use random sampling. Every member of the population should have an equal chance of being selected.
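
As a minimal sketch of simple random sampling, the snippet below draws 100 members from a population of 10,000 so that every member has the same chance of selection. The population of numbered IDs and the seed value are illustrative assumptions, not part of the lesson.

```python
import random

# Hypothetical population: customer IDs 1..10,000 (illustrative only).
population = list(range(1, 10_001))

# A fixed seed makes the draw reproducible; the value 42 is arbitrary.
rng = random.Random(42)

# Simple random sample without replacement: each member is equally likely.
sample = rng.sample(population, k=100)

print(f"N = {len(population)}, n = {len(sample)}")
```

Sampling without replacement means no member can be picked twice, which matches the usual survey setting.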

🎯 Key Takeaways

  • Population (N): All members of a defined group
  • Sample (n): A subset selected from the population
  • Good samples are random and representative
  • Larger samples generally provide better estimates
Topic 3

📈 Parameters vs Statistics

Population measures vs sample measures

Introduction

What is it? A parameter is a numerical characteristic of a population. A statistic is a numerical characteristic of a sample.

Why it matters: We usually can't measure parameters directly (populations are too large), so we estimate them using statistics from samples.

When to use it: Parameters are what we want to know; statistics are what we can calculate.

💡 REAL-WORLD EXAMPLE

You want to know the average height of all students in your country (parameter). You can't measure everyone, so you measure 1,000 students (sample) and calculate their average height (statistic) to estimate the population parameter.

Common Parameters and Statistics

Measure            | Parameter (Population) | Statistic (Sample)
Mean (Average)     | μ (mu)                 | x̄ (x-bar)
Standard Deviation | σ (sigma)              | s
Variance           | σ²                     | s²
Proportion         | p                      | p̂ (p-hat)
Size               | N                      | n

The Relationship

Key Concept

Statistic → Estimates → Parameter

We use statistics (calculated from samples) to estimate parameters (unknown population values).

📊 EXAMPLE

Scenario: A factory wants to know the average weight of cereal boxes.

  • Population: All cereal boxes produced (millions)
  • Parameter: μ = true average weight of ALL boxes (unknown)
  • Sample: 100 randomly selected boxes
  • Statistic: x̄ = 510 grams (calculated from the 100 boxes)
  • Inference: We estimate μ ≈ 510 grams

⚠️ COMMON MISTAKE

Confusing symbols! Greek letters (μ, σ, ρ) refer to parameters (population). Roman letters (x̄, s, r) refer to statistics (sample).

🎯 Key Takeaways

  • Parameter: Describes a population (usually unknown)
  • Statistic: Describes a sample (calculated from data)
  • Greek letters = population, Roman letters = sample
  • Statistics are used to estimate parameters
Topic 4

🔢 Types of Data

Categorical, Numerical, Discrete, Continuous, Ordinal, Nominal

Introduction

What is it? Data comes in different types, and understanding these types determines which statistical methods you can use.

Why it matters: Using the wrong analysis method for your data type leads to incorrect conclusions. You can't calculate an average of colors!

When to use it: Before any analysis, identify your data type to choose appropriate statistical techniques.

Data Type Hierarchy

DATA
  CATEGORICAL: Nominal, Ordinal
  NUMERICAL: Discrete, Continuous

Categorical Data

Represents categories or groups (qualitative)

Nominal

Categories with NO order

  • Colors: Red, Blue, Green
  • Gender: Male, Female, Non-binary
  • Country: USA, India, Japan
  • Blood Type: A, B, AB, O

Ordinal

Categories WITH meaningful order

  • Education: High School < Bachelor's < Master's
  • Satisfaction: Poor < Fair < Good < Excellent
  • Medal: Bronze < Silver < Gold
  • Size: Small < Medium < Large

Numerical Data

Represents quantities (quantitative)

Discrete

Countable, specific values only

  • Number of students: 25, 30, 42
  • Number of cars: 0, 1, 2, 3...
  • Dice roll: 1, 2, 3, 4, 5, 6
  • Number of children: 0, 1, 2, 3...

Can't have 2.5 students!

Continuous

Can take any value in a range

  • Height: 165.3 cm, 180.7 cm
  • Weight: 68.5 kg, 72.3 kg
  • Temperature: 23.4°C, 24.7°C
  • Time: 3.25 seconds

Infinite precision possible

💡 QUICK TEST

Ask yourself:

  1. Is it a label/category? → Categorical
  2. Is it a number? → Numerical
  3. Can you count it? → Discrete
  4. Can you measure it? → Continuous
  5. Does order matter? → Ordinal (else Nominal)

📊 EXAMPLES

Data                          | Type                   | Reason
Zip codes                     | Categorical (Nominal)  | Numbers used as labels, not quantities
Test scores (A, B, C, D, F)   | Categorical (Ordinal)  | Categories with clear order
Number of pages in books      | Numerical (Discrete)   | Countable whole numbers
Reaction time in milliseconds | Numerical (Continuous) | Can be measured to any precision

⚠️ COMMON MISTAKE

Just because something is written as a number doesn't make it numerical! Phone numbers, jersey numbers, and zip codes are categorical because they identify categories, not quantities.

🎯 Key Takeaways

  • Categorical: Labels/categories (Nominal: no order, Ordinal: has order)
  • Numerical: Quantities (Discrete: countable, Continuous: measurable)
  • Data type determines which statistical methods to use
  • Always identify data type before analysis
Topic 5

Measures of Central Tendency

Mean, Median, Mode - Finding the center of data

Introduction

What is it? Measures of central tendency are single values that represent the "center" or "typical" value in a dataset.

Why it matters: Instead of looking at hundreds of numbers, one central value summarizes the data. "Average salary" tells you more than listing every employee's salary.

When to use it: When you need to summarize data with a single representative value.

💡 REAL-WORLD ANALOGY

Imagine finding the "center" of a group of people standing on a field. Mean is like finding the balance point where they'd balance on a seesaw. Median is literally the middle person. Mode is where the most people are clustered together.

Mathematical Foundations

Mean (Average)
μ = Σx / n

Where:

  • μ (mu) = population mean or x̄ (x-bar) = sample mean
  • Σx = sum of all values
  • n = number of values

Steps:

  1. Add all values together
  2. Divide by the count of values
Median (Middle Value)

If odd number of values: Middle value

If even number of values: Average of two middle values

Steps:

  1. Sort values in ascending order
  2. Find the middle position: (n + 1) / 2
  3. If between two values, average them
Mode (Most Frequent)

The value(s) that appear most frequently

Types:

  • Unimodal: One mode
  • Bimodal: Two modes
  • Multimodal: More than two modes
  • No mode: All values appear equally

📊 WORKED EXAMPLE

Dataset: Test scores: 65, 70, 75, 80, 85, 90, 95

Mean:

Sum = 65 + 70 + 75 + 80 + 85 + 90 + 95 = 560

Mean = 560 / 7 = 80

Median:

Already sorted. Middle position = (7 + 1) / 2 = 4th value

Median = 80

Mode:

All values appear once. No mode

When to Use Which?

Use Mean

  • Data is symmetrical
  • No extreme outliers
  • Numerical data
  • Need to use all data points

Use Median

  • Data has outliers
  • Data is skewed
  • Ordinal data
  • Need robust measure

Use Mode

  • Categorical data
  • Finding most common value
  • Discrete data
  • Multiple peaks in data
⚠️ COMMON MISTAKE

Mean is affected by outliers! In salary data like $30K, $35K, $40K, $45K, $500K, the mean is $130K (misleading!). The median of $40K better represents typical salary.

✅ PRO TIP

For skewed data (like income, house prices), always report the median along with the mean. If they're very different, your data has outliers or is skewed!

📝 Worked Example - Step by Step

Problem:

Find the mean, median, and mode of: [12, 15, 12, 18, 20, 15, 12, 22]

Solution:

Step 1:

Calculate the Mean (Average)

Sum = 12 + 15 + 12 + 18 + 20 + 15 + 12 + 22 = 126
Count (n) = 8 values
Mean = Sum ÷ n = 126 ÷ 8 = 15.75

Add all values together, then divide by how many values there are

Step 2:

Find the Median (Middle Value)

Sorted data: [12, 12, 12, 15, 15, 18, 20, 22]
Even number of values (8), so average the middle two
Middle positions: 4th and 5th values = 15 and 15
Median = (15 + 15) ÷ 2 = 15

For even-sized datasets, average the two middle values

Step 3:

Find the Mode (Most Frequent Value)

Frequency count:
• 12 appears 3 times ← Most frequent!
• 15 appears 2 times
• 18, 20, 22 each appear 1 time
Mode = 12

The mode is the value that appears most often

Final Answer: Mean = 15.75, Median = 15, Mode = 12
✓ Check:

The mean (15.75) is slightly higher than the median (15) because the larger values (18, 20, 22) pull it up. The mode (12) is the lowest of the three because the most frequent value sits at the low end of the data.
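
The three steps above can be double-checked with a short sketch that uses Python's standard `statistics` module on the same dataset. The variable names are illustrative; `multimode` is used so that ties, if any, would all be reported.

```python
import statistics

data = [12, 15, 12, 18, 20, 15, 12, 22]

mean = statistics.mean(data)        # sum of values divided by the count
median = statistics.median(data)    # average of the two middle values (even count)
modes = statistics.multimode(data)  # list of every most-frequent value

print(mean, median, modes)
```

For this dataset the results agree with the hand calculation: mean 15.75, median 15, and a single mode of 12.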

💪 Try These:

  1. Find the mean of: [5, 10, 15, 20]
  2. What's the median of: [3, 1, 4, 1, 5]?
  3. Find the mode of: [2, 2, 3, 4, 4, 4, 5]

🎯 Key Takeaways

  • Mean: Sum of all values divided by count (affected by outliers)
  • Median: Middle value when sorted (resistant to outliers)
  • Mode: Most frequent value (useful for categorical data)
  • Choose the measure that best represents your data type and distribution
Topic 6

⚡ Outliers

Extreme values that don't fit the pattern

Introduction

What is it? Outliers are data points that are significantly different from other observations in a dataset.

Why it matters: Outliers can indicate data errors, special cases, or important patterns. They can also severely distort statistical analyses.

When to use it: Always check for outliers before analyzing data, especially when calculating means and standard deviations.

💡 REAL-WORLD EXAMPLE

In a salary dataset for entry-level employees: $35K, $38K, $40K, $37K, $250K. The $250K is an outlier: maybe it's a data entry error (someone added an extra zero) or a special case (CEO's child). Either way, it needs investigation!

Detection Methods

IQR Method

Most common approach:

  • Calculate Q1, Q3, and IQR = Q3 - Q1
  • Lower fence = Q1 - 1.5 × IQR
  • Upper fence = Q3 + 1.5 × IQR
  • Outliers fall outside fences

Z-Score Method

For normal distributions:

  • Calculate z-score for each value
  • z = (x - μ) / σ
  • If |z| > 3: commonly treated as an outlier
  • If |z| > 2: possible outlier
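
As a rough sketch of the IQR method, the function below computes the fences and lists the values outside them, using the entry-level salary figures from the example above (in thousands). Textbooks and software compute quartiles in slightly different ways; the `"inclusive"` method of `statistics.quantiles` is one common convention and is an assumption here, as is the helper name `iqr_outliers`.

```python
import statistics

def iqr_outliers(values):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 x IQR rule."""
    # quantiles() sorts the data itself and returns [Q1, Q2, Q3] for n=4.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers

salaries_k = [35, 38, 40, 37, 250]  # salaries in $K from the example
lower, upper, outliers = iqr_outliers(salaries_k)
print(lower, upper, outliers)
```

With these five values only the 250 falls outside the fences, so it is the one flagged for investigation.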
⚠️ COMMON MISTAKE

Never automatically delete outliers! They might be: (1) Valid extreme values, (2) Data entry errors, (3) Important discoveries. Always investigate before removing.

🎯 Key Takeaways

  • Outliers are extreme values that differ significantly from other data
  • Use IQR method (1.5 × IQR rule) or Z-score method to detect
  • Mean is heavily affected by outliers; median is resistant
  • Always investigate outliers before deciding to keep or remove
Topic 7

Variance & Standard Deviation

Measuring spread and variability in data

Introduction

What is it? Variance measures the average squared deviation from the mean. Standard deviation is the square root of variance.

Why it matters: Shows how spread out data is. Low values mean data is clustered; high values mean data is scattered.

When to use it: Whenever you need to understand data variability, for example in finance (risk), manufacturing (quality control), or research (reliability).

Mathematical Formulas

Population Variance (σ²)
σ² = Σ(x - μ)² / N

Where N = population size, μ = population mean

Sample Variance (s²)
s² = Σ(x - x̄)² / (n - 1)

Where n = sample size, x̄ = sample mean. We use (n-1) for unbiased estimation.

Standard Deviation
σ = √(variance)

Same units as original data, easier to interpret

📊 WORKED EXAMPLE

Dataset: [4, 8, 6, 5, 3, 7]

Step 1: Mean = (4+8+6+5+3+7)/6 = 5.5

Step 2: Deviations: [-1.5, 2.5, 0.5, -0.5, -2.5, 1.5]

Step 3: Squared: [2.25, 6.25, 0.25, 0.25, 6.25, 2.25]

Step 4: Sum = 17.5

Step 5: Variance = 17.5/(6-1) = 3.5

Step 6: Std Dev = √3.5 ≈ 1.87

📝 Worked Example - Step by Step

Problem:

Calculate the variance and standard deviation for the dataset: [4, 8, 6, 5, 3]

Solution:

Step 1:

Calculate the Mean

Sum = 4 + 8 + 6 + 5 + 3 = 26
Mean (x̄) = 26 ÷ 5 = 5.2

First, we need the mean to calculate deviations

Step 2:

Find Deviations from Mean

(4 - 5.2) = -1.2
(8 - 5.2) = 2.8
(6 - 5.2) = 0.8
(5 - 5.2) = -0.2
(3 - 5.2) = -2.2

Subtract the mean from each value

Step 3:

Square Each Deviation

(-1.2)² = 1.44
(2.8)² = 7.84
(0.8)² = 0.64
(-0.2)² = 0.04
(-2.2)² = 4.84

Squaring eliminates negative signs and emphasizes larger deviations

Step 4:

Calculate Variance (sample)

Sum of squared deviations = 1.44 + 7.84 + 0.64 + 0.04 + 4.84 = 14.8
Divide by (n-1) = 5-1 = 4
s² = 14.8 ÷ 4 = 3.7

We use (n-1) for sample variance (Bessel's correction)

Step 5:

Calculate Standard Deviation

s = √s² = √3.7 ≈ 1.92

Standard deviation is the square root of variance

Final Answer: Variance = 3.7, Standard Deviation = 1.92
✓ Interpretation:

A standard deviation of 1.92 means the values typically differ from the mean (5.2) by roughly 1.92 units. This indicates moderate spread in the data.
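
The calculation above can be reproduced with a brief sketch using the standard `statistics` module. It also shows the population version for contrast, so the effect of dividing by n - 1 rather than N is visible; the variable names are illustrative.

```python
import statistics

data = [4, 8, 6, 5, 3]

sample_var = statistics.variance(data)       # divides by n - 1 (Bessel's correction)
sample_sd = statistics.stdev(data)           # square root of the sample variance
population_var = statistics.pvariance(data)  # divides by N instead

print(round(sample_var, 2), round(sample_sd, 2), round(population_var, 2))
```

The sample variance (3.7) is larger than the population variance (2.96) because the same sum of squared deviations, 14.8, is divided by 4 instead of 5.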

💪 Try These:

  1. Calculate the standard deviation of: [2, 4, 6, 8]
  2. Find the variance of: [10, 12, 14, 16, 18]

🎯 Key Takeaways

  • Variance measures average squared deviation from mean
  • Standard deviation is square root of variance (same units as data)
  • Use (n-1) for sample variance to avoid bias
  • Higher values = more spread; lower values = more clustered
Topic 8

🎯 Quartiles & Percentiles

Dividing data into equal parts

Introduction

What is it? Quartiles divide sorted data into 4 equal parts. Percentiles divide data into 100 equal parts.

Why it matters: Shows relative position in a dataset. "90th percentile" means you scored better than 90% of people.

The Five-Number Summary

  • Minimum: Smallest value
  • Q1 (25th percentile): 25% of data below this
  • Q2 (50th percentile/Median): Middle value
  • Q3 (75th percentile): 75% of data below this
  • Maximum: Largest value
💡 REAL-WORLD EXAMPLE

SAT scores: If you score 1350 and that's the 90th percentile, it means you scored higher than 90% of test-takers. Percentiles are perfect for standardized tests!
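
To make the five-number summary concrete, here is a small sketch that computes it for the test scores used in Topic 5. Quartile conventions vary between textbooks and tools; the `"inclusive"` method of `statistics.quantiles` is assumed here, and the function name `five_number_summary` is illustrative.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

scores = [65, 70, 75, 80, 85, 90, 95]
summary = five_number_summary(scores)
print(summary)
```

Under this convention the summary is 65, 72.5, 80, 87.5, 95; other quartile methods can give slightly different Q1 and Q3 values for small datasets.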

🎯 Key Takeaways

  • Q1 = 25th percentile, Q2 = median, Q3 = 75th percentile
  • Percentiles show relative standing in a dataset
  • Five-number summary: Min, Q1, Q2, Q3, Max
  • Useful for understanding data distribution
Topic 9

📦 Interquartile Range (IQR)

Middle 50% of data and outlier detection

Introduction

What is it? IQR = Q3 - Q1. It represents the range of the middle 50% of your data.

Why it matters: IQR is resistant to outliers and is the foundation of the 1.5×IQR rule for outlier detection.

The 1.5 × IQR Rule

Outlier Boundaries
Lower Fence = Q1 - 1.5 × IQR
Upper Fence = Q3 + 1.5 × IQR

Any value outside these fences is considered an outlier

🎯 Key Takeaways

  • IQR = Q3 - Q1 (range of middle 50% of data)
  • Resistant to outliers (unlike standard deviation)
  • 1.5×IQR rule: standard method for outlier detection
  • Box plots visualize IQR and outliers
Topic 10

📉 Skewness

Understanding data distribution shape

Introduction

What is it? Skewness measures the asymmetry of a distribution.

Why it matters: Indicates whether data leans left or right, affecting which statistical methods to use.

Types of Skewness

Negative (Left) Skew

Tail extends to the left

Mean < Median < Mode

Example: Test scores when most students do well

Symmetric (No Skew)

Perfectly balanced

Mean = Median = Mode

Example: Normal distribution

Positive (Right) Skew

Tail extends to the right

Mode < Median < Mean

Example: Income data, house prices

📊 Visual: Types of Skewness

📝 Worked Example - Step by Step

Problem:

Calculate and interpret skewness for dataset: [2, 3, 4, 5, 15]

Solution:

Step 1:

Calculate the Mean

Sum = 2 + 3 + 4 + 5 + 15 = 29
n = 5
Mean (x̄) = 29/5 = 5.8

First, find the average of all values

Step 2:

Calculate Standard Deviation

Deviations from mean: (2-5.8), (3-5.8), (4-5.8), (5-5.8), (15-5.8)
= -3.8, -2.8, -1.8, -0.8, 9.2
Squared: 14.44, 7.84, 3.24, 0.64, 84.64
Variance (sample) = (14.44+7.84+3.24+0.64+84.64)/4 = 110.8/4 = 27.7
SD = √27.7 ≈ 5.26

We need standard deviation for the skewness formula

Step 3:

Calculate Skewness

Cubed deviations: (-3.8)³, (-2.8)³, (-1.8)³, (-0.8)³, (9.2)³
= -54.87, -21.95, -5.83, -0.51, 778.69
Sum = 695.53
Skewness = (695.53/5) / (5.263)³ ≈ 139.11 / 145.78 ≈ 0.95

Skewness formula uses cubed deviations divided by cubed standard deviation

Step 4:

Interpret the Result

Skewness = +0.95 (positive)
Distribution is right-skewed
The value 15 pulls the tail to the right
Most data clustered on left, long tail on right

Positive skewness means tail extends to the right

✓ Final Answer: Skewness = +0.95 (positively skewed, right tail)
Check:

The positive skewness confirms that the outlier (15) creates a long right tail, pulling the mean (5.8) above the median (4).
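
The worked example can be mirrored in a short sketch. It follows the same recipe as the steps above: the average cubed deviation (dividing by n) over the cube of the sample standard deviation (which uses n - 1). Other skewness definitions exist and give somewhat different numbers, so treat this as one convention; the function name `skewness` is illustrative.

```python
import statistics

def skewness(values):
    """Average cubed deviation divided by the cubed sample standard deviation."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)                    # sample SD, divides by n - 1
    m3 = sum((x - mean) ** 3 for x in values) / n   # average cubed deviation
    return m3 / s ** 3

result = skewness([2, 3, 4, 5, 15])
print(round(result, 2))
```

A perfectly symmetric dataset such as [1, 2, 3] gives a skewness of 0 under the same function, since the cubed deviations cancel.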

💪 Try These:

  1. Find skewness of [1, 1, 2, 3, 3]
  2. Data with left tail - positive or negative skew?
  3. If mean < median, what type of skew?

🎯 Key Takeaways

  • Skewness measures asymmetry in distribution
  • Negative skew: tail to left, Mean < Median
  • Positive skew: tail to right, Mean > Median
  • Symmetric: Mean = Median = Mode
Topic 11

🔗 Covariance

How two variables vary together

Introduction

What is it? Covariance measures how two variables change together.

Why it matters: Shows if variables have a positive, negative, or no relationship.

Formula

Sample Covariance
Cov(X,Y) = Σ(xᵢ - x̄)(yᵢ - ȳ) / (n-1)

Interpretation

  • Positive: Variables increase together
  • Negative: One increases as other decreases
  • Zero: No linear relationship
  • Problem: Scale-dependent, hard to interpret magnitude

📊 Visual: Understanding Covariance

📝 Worked Example - Step by Step

Problem:

Find covariance between X=[2, 4, 6, 8] and Y=[1, 3, 5, 7]

Solution:

Step 1:

Calculate the Means

x̄ = (2 + 4 + 6 + 8) / 4 = 20 / 4 = 5
ȳ = (1 + 3 + 5 + 7) / 4 = 16 / 4 = 4

Find the average of each variable

Step 2:

Create Deviation Table

x   | y | (x-x̄) | (y-ȳ) | (x-x̄)(y-ȳ)
2   | 1 | -3    | -3    | 9
4   | 3 | -1    | -1    | 1
6   | 5 | 1     | 1     | 1
8   | 7 | 3     | 3     | 9
Sum |   |       |       | 20

Calculate deviations from means and their products

Step 3:

Calculate Sample Covariance

Cov(X,Y) = Σ(x-x̄)(y-ȳ) / (n-1)
Cov(X,Y) = 20 / (4-1)
Cov(X,Y) = 20 / 3
Cov(X,Y) ≈ 6.67

Use n-1 for sample covariance (Bessel's correction)

Step 4:

Interpret the Result

Cov(X,Y) = 6.67 > 0
Positive covariance indicates:
• X and Y tend to increase together
• When X is above its mean, Y tends to be above its mean
• When X is below its mean, Y tends to be below its mean

Positive covariance shows positive relationship

Final Answer: Cov(X,Y) = 6.67 (positive relationship)
✓ Verification:

The positive covariance confirms that X and Y have a positive linear relationship. As X increases by 2, Y also increases by 2, showing consistent movement together.
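
The deviation table and the n - 1 division can be sketched in a few lines of plain Python. The function name `sample_covariance` is illustrative, and the input check is an added safeguard rather than part of the lesson.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

cov = sample_covariance([2, 4, 6, 8], [1, 3, 5, 7])
print(round(cov, 2))
```

When one variable never changes, as in X = [5, 5, 5], every deviation of X is zero and the covariance is 0 regardless of Y.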

💪 Try These:

  1. Calculate Cov(X,Y) for X=[1, 2, 3] and Y=[2, 4, 6]
  2. If Cov(X,Y) = -5, what does this tell you about the relationship?
  3. Find Cov(X,Y) for X=[5, 5, 5] and Y=[1, 2, 3]. What do you notice?

🎯 Key Takeaways

  • Covariance measures joint variability of two variables
  • Positive: variables move together; Negative: inverse relationship
  • Scale-dependent (unlike correlation)
  • Foundation for correlation calculation
Topic 12

💞 Correlation

Standardized measure of relationship strength

Introduction

What is it? Correlation coefficient (r) is a standardized measure of linear relationship between two variables.

Why it matters: Always between -1 and +1, making it easy to interpret strength and direction of relationships.

Pearson Correlation Formula

Correlation Coefficient (r)
r = Cov(X,Y) / (σₓ × σᵧ)

Covariance divided by product of standard deviations

Interpretation Guide

  • r = +1: Perfect positive correlation
  • r = 0.7 to 1.0: Strong positive
  • r = 0.3 to 0.7: Moderate positive
  • r = 0 to 0.3: Weak positive
  • r = 0: No linear correlation
  • r = -0.3 to 0: Weak negative
  • r = -0.7 to -0.3: Moderate negative
  • r = -1.0 to -0.7: Strong negative
  • r = -1: Perfect negative correlation

📊 Visual: Correlation Strength & Direction (correlation as standardized covariance, Cov / (σ_x σ_y))

💡 REAL-WORLD EXAMPLE

Study hours vs exam scores might show r ≈ 0.7 (strong positive): more study hours go along with higher scores.

📝 Worked Example - Step by Step

Problem:

Calculate correlation coefficient for X=[2, 4, 6, 8] and Y=[1, 3, 5, 7]

Solution:

Step 1:

Use Covariance from Topic 11

From previous calculation:
Cov(X,Y) = 6.67
x̄ = 5, ȳ = 4

We already calculated this in Topic 11

Step 2:

Calculate Standard Deviation of X

Deviations from mean: -3, -1, 1, 3
Squared deviations: 9, 1, 1, 9
Sum of squared deviations = 20
Variance_x = 20 / (4-1) = 20/3 = 6.67
SD_x = √6.67 ≈ 2.58

Standard deviation measures spread of X values

Step 3:

Calculate Standard Deviation of Y

Deviations from mean: -3, -1, 1, 3
Squared deviations: 9, 1, 1, 9
Sum of squared deviations = 20
Variance_y = 20 / (4-1) = 20/3 = 6.67
SD_y = √6.67 ≈ 2.58

Standard deviation measures spread of Y values

Step 4:

Calculate Correlation Coefficient

r = Cov(X,Y) / (SD_x × SD_y)
r = 6.67 / (2.58 × 2.58)
r = 6.67 / 6.66
r ≈ 1.00

Correlation standardizes covariance by dividing by both standard deviations

Step 5:

Interpret the Result

r = 1.00 (perfect positive correlation)
This means:
• X and Y have a perfect linear relationship
• As X increases by 2, Y increases by 2 (exactly)
• All points lie exactly on a straight line
• The relationship is: Y = X - 1

r = 1 indicates perfect positive linear correlation

Final Answer: r = 1.00 (perfect positive linear correlation)
✓ Verification:

Check: If we plot these points, they form a perfect line. When X=2, Y=1; X=4, Y=3; X=6, Y=5; X=8, Y=7. Every pair satisfies Y = X - 1, which is indeed perfectly linear! ✓
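
A minimal sketch of the Pearson calculation follows. Because the (n - 1) terms in the covariance and in the two standard deviations cancel, the function works directly with sums of deviation products; the name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the SDs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    # The (n - 1) factors cancel between numerator and denominator.
    return sxy / math.sqrt(sxx * syy)

r = pearson_r([2, 4, 6, 8], [1, 3, 5, 7])
print(round(r, 2))
```

Working with unrounded sums avoids the small rounding gap seen in the hand calculation (6.67 / 6.66) and returns r = 1 for these points.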

💪 Try These:

  1. If Cov(X,Y) = 10, SD_x = 2, SD_y = 5, find r
  2. What does r = -0.8 indicate about the relationship?
  3. Can correlation be greater than 1? Why or why not?

🎯 Key Takeaways

  • r ranges from -1 to +1
  • Measures strength AND direction of linear relationship
  • Scale-independent (unlike covariance)
  • Only measures LINEAR relationships
Topic 13

💪 Interpreting Correlation

Correlation vs causation and common pitfalls

The Golden Rule

⚠️ CORRELATION ≠ CAUSATION

Just because two variables are correlated does NOT mean one causes the other!

Common Scenarios

  • Direct Causation: X causes Y (smoking causes cancer)
  • Reverse Causation: Y causes X (not the direction you thought)
  • Third Variable: Z causes both X and Y (confounding variable)
  • Coincidence: Pure chance with no real relationship
📊 FAMOUS EXAMPLE

Ice cream sales correlate with drowning deaths.

Does ice cream cause drowning? NO! The third variable is summer weather: more people swim in summer (more drownings) and eat ice cream in summer.

📝 Worked Example - Step by Step

Problem:

Study finds r = -0.75 between hours of TV watched and exam scores. Interpret this result and discuss causation.

Solution:

Step 1:

Analyze the Sign

Negative correlation (r < 0)
As one variable increases, the other decreases
More TV → Lower scores (or vice versa)

The negative sign tells us the direction of the relationship

Step 2:

Analyze the Strength

|r| = |-0.75| = 0.75
Interpretation scale:
• 0.0-0.3 = Weak
• 0.3-0.7 = Moderate
• 0.7-1.0 = Strong
0.75 falls in "Strong" category

The absolute value determines relationship strength

Step 3:

State the Relationship

Strong negative correlation
Students who watch more TV tend to have lower exam scores
Relationship is fairly consistent but not perfect

Combine sign and strength for complete interpretation

Step 4:

Address Causation

Correlation ≠ Causation!
Possible explanations:
a) TV causes lower scores (less study time)
b) Lower-performing students watch more TV (compensating)
c) Third variable: stress causes both TV watching and poor performance
Cannot determine causation from correlation alone

Correlation never proves causation - always consider alternatives

Step 5:

Predict Using Correlation

If we know TV hours, we can predict exam score
But prediction ≠ causation
r² = 0.75² ≈ 0.56 = 56% of variance explained

r² shows percentage of variance in one variable explained by the other

✓ Final Answer: Strong negative correlation (r = -0.75), but does NOT prove TV causes lower scores
Check:

While the correlation is strong, we must resist concluding causation. The relationship could be coincidental, reverse-causal, or due to confounding variables.

💪 Try These:

  1. r = +0.90 between study hours and grades. Interpret.
  2. Can r = 1.5? Why or why not?
  3. If r = 0, does that mean no relationship at all?

🎯 Key Takeaways

  • Correlation shows relationship, NOT causation
  • Always consider third variables (confounders)
  • Need controlled experiments to prove causation
  • Be skeptical of correlation claims in media
Topic 14

🎲 Probability Basics

Foundation of statistical inference

Introduction

What is it? Probability measures the likelihood of an event occurring, ranging from 0 (impossible) to 1 (certain).

Why it matters: Foundation for all statistical inference, hypothesis testing, and prediction.

Basic Formula

Probability of Event E
P(E) = Number of favorable outcomes / Total number of possible outcomes (when all outcomes are equally likely)

Key Rules

  • Range: 0 ≤ P(E) ≤ 1
  • Complement: P(not E) = 1 - P(E)
  • Addition (OR): P(A or B) = P(A) + P(B) - P(A and B)
  • Multiplication (AND): P(A and B) = P(A) × P(B) [if independent]

📊 EXAMPLE

Rolling a die:

P(rolling a 4) = 1/6 ≈ 0.167

P(rolling even) = 3/6 = 0.5

P(not rolling a 6) = 5/6 ≈ 0.833
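
The die example can be sketched by counting favorable outcomes over the six equally likely faces. Exact fractions are used so the results match the hand calculation; the helper name `prob` is illustrative.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of a fair six-sided die

def prob(event):
    """P(E) = favorable outcomes / total outcomes, for equally likely outcomes."""
    favorable = sum(1 for face in outcomes if event(face))
    return Fraction(favorable, len(outcomes))

p_four = prob(lambda face: face == 4)
p_even = prob(lambda face: face % 2 == 0)
p_not_six = 1 - prob(lambda face: face == 6)  # complement rule

print(p_four, p_even, p_not_six)
```

The last line applies the complement rule rather than counting the five non-six faces directly; both routes give 5/6.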

🎯 Key Takeaways

  • Probability ranges from 0 to 1
  • P(E) = favorable outcomes / total outcomes
  • Complement rule: P(not E) = 1 - P(E)
  • Foundation for all statistical inference
Topic 15

🔷 Set Theory

Union, intersection, and complement

Introduction

What is it? Set theory provides a mathematical framework for organizing events and calculating probabilities.

Key Concepts

  • Union (A ∪ B): A OR B (at least one of the events occurs)
  • Intersection (A ∩ B): A AND B (both events occur)
  • Complement (A'): NOT A (event doesn't occur)
  • Mutually Exclusive: A ∩ B = ∅ (can't both occur)

📝 Worked Example - Step by Step

Problem:

In a class of 40 students: 25 like Math, 20 like Science, 10 like both. Find: a) P(Math OR Science), b) P(only Math), c) P(neither)

Solution:

Step 1:

Set Up the Information

Total students: n = 40
P(Math) = 25/40 = 0.625
P(Science) = 20/40 = 0.5
P(Math ∩ Science) = 10/40 = 0.25

Convert all counts to probabilities

Step 2:

Find P(Math ∪ Science) using Addition Rule

Formula: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
P(Math ∪ Science) = 0.625 + 0.5 - 0.25
= 1.125 - 0.25
= 0.875

We subtract the intersection to avoid double-counting

Step 3:

Find P(only Math)

Only Math = Math AND NOT Science
Students in only Math = 25 - 10 = 15
P(only Math) = 15/40 = 0.375

Subtract those who like both from total Math students

Step 4:

Find P(neither)

Neither = NOT (Math OR Science)
P(neither) = 1 - P(Math ∪ Science)
= 1 - 0.875
= 0.125
Or: 40 - 35 = 5 students, so 5/40 = 0.125 ✓

Use complement rule or count directly

✓ Final Answer: a) P(Math OR Science) = 0.875 (87.5%)
b) P(only Math) = 0.375 (37.5%)
c) P(neither) = 0.125 (12.5%)
Verification:

Check: 0.375 (only Math) + 0.25 (both) + 0.25 (only Science) + 0.125 (neither) = 1.0 ✓
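
Python's built-in sets map directly onto union, intersection, and complement, so the class example can be sketched as below. The student IDs are hypothetical and are arranged only so that the counts match the problem (25 Math, 20 Science, 10 both, 40 total).

```python
from fractions import Fraction

# Hypothetical student IDs 0..39, chosen so the group sizes match the problem.
everyone = set(range(40))
likes_math = set(range(0, 25))      # 25 students
likes_science = set(range(15, 35))  # 20 students, 10 of whom also like Math

def prob(group):
    """Probability that a randomly chosen student belongs to the group."""
    return Fraction(len(group), len(everyone))

p_union = prob(likes_math | likes_science)                 # Math OR Science
p_only_math = prob(likes_math - likes_science)             # Math AND NOT Science
p_neither = prob(everyone - (likes_math | likes_science))  # complement of the union

print(p_union, p_only_math, p_neither)
```

The results are 7/8, 3/8, and 1/8, which are the same values as 0.875, 0.375, and 0.125 in the worked solution.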

💪 Try These:

  1. P(A)=0.6, P(B)=0.5, P(A∩B)=0.3. Find P(A∪B)
  2. If P(A∪B)=0.8, P(A)=0.5, P(B)=0.4, find P(A∩B)
  3. 100 students: 60 like pizza, 40 like burgers, 20 like both. How many like neither?

🎯 Key Takeaways

  • Union (∪): OR operation
  • Intersection (∩): AND operation
  • Complement ('): NOT operation
  • Venn diagrams visualize set relationships
Topic 16

🔀 Conditional Probability

Probability given that something else happened

Introduction

What is it? Conditional probability is the probability of event A occurring given that event B has already occurred.

Formula

Conditional Probability
P(A|B) = P(A and B) / P(B)

Read as: "Probability of A given B"

📊 EXAMPLE

Drawing cards: P(King | Red card) = ?

P(Red card) = 26/52

P(King and Red) = 2/52

P(King | Red) = (2/52) / (26/52) = 2/26 = 1/13
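
The card example can be checked by building a standard 52-card deck and counting. This is a sketch: the deck representation and names such as `suit_colors` are illustrative choices.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_colors = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}

# Every (rank, suit) pair: 13 ranks x 4 suits = 52 cards.
deck = list(product(ranks, suit_colors))

red_cards = [card for card in deck if suit_colors[card[1]] == "red"]
red_kings = [card for card in red_cards if card[0] == "K"]

p_red = Fraction(len(red_cards), len(deck))
p_king_and_red = Fraction(len(red_kings), len(deck))
p_king_given_red = p_king_and_red / p_red  # P(A|B) = P(A and B) / P(B)

print(p_king_given_red)
```

Conditioning on "red" effectively shrinks the sample space to the 26 red cards, 2 of which are kings, giving 1/13.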

🎯 Key Takeaways

  • P(A|B) = probability of A given B occurred
  • Formula: P(A|B) = P(A and B) / P(B)
  • Critical for Bayes' Theorem
  • Used in machine learning and diagnostics
Topic 17

🎯 Independence

When events don't affect each other

Introduction

What is it? Two events are independent if the occurrence of one doesn't affect the probability of the other.

Test for Independence

Events A and B are independent if:
P(A|B) = P(A)

OR equivalently:

P(A and B) = P(A) × P(B)

Examples

  • Independent: Coin flips, die rolls, drawing cards with replacement
  • Dependent: Drawing cards without replacement, weather on consecutive days

📝 Worked Example - Step by Step

Problem:

Two dice are rolled. Let A = "first die shows 6" and B = "sum is 7". Are A and B independent?

Solution:

Step 1:

Find P(A)

First die shows 6: one outcome out of 6
P(A) = 1/6 ≈ 0.167

Probability the first die is 6

Step 2:

Find P(B)

Sum equals 7: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)
6 favorable outcomes out of 36 total
P(B) = 6/36 = 1/6 ≈ 0.167

Count all ways to get sum of 7

Step 3:

Find P(A ∩ B)

First die is 6 AND sum is 7
Only possibility: (6,1)
P(A ∩ B) = 1/36 ≈ 0.028

Find where both events occur simultaneously

Step 4:

Test Independence

If independent: P(A ∩ B) = P(A) × P(B)
P(A) × P(B) = (1/6) × (1/6) = 1/36
P(A ∩ B) = 1/36
1/36 = 1/36 ✓ EQUAL!

Compare the two probabilities to test independence

Step 5:

Conclusion

Events A and B ARE independent
Knowing first die is 6 doesn't change probability of sum being 7

When the product rule holds, events are independent

✓ Final Answer: YES, events are independent. P(A∩B) = P(A)×P(B) = 1/36
Check:

We can also verify: P(B|A) = P(A∩B)/P(A) = (1/36)/(1/6) = 1/6 = P(B). Since P(B|A) = P(B), the events are independent.
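The product-rule test can be reproduced by enumerating all 36 outcomes. This is a minimal Python sketch; the helper names prob, first_is_six, and sum_is_seven are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event as (favorable outcomes) / (total outcomes)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def first_is_six(o):
    return o[0] == 6


def sum_is_seven(o):
    return o[0] + o[1] == 7


p_a = prob(first_is_six)
p_b = prob(sum_is_seven)
p_ab = prob(lambda o: first_is_six(o) and sum_is_seven(o))

# Independence test: does P(A and B) equal P(A) * P(B)?
independent = p_ab == p_a * p_b
print(p_a, p_b, p_ab, independent)  # 1/6 1/6 1/36 True
```

Using exact fractions avoids floating-point rounding, so the equality check is reliable.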

💪 Try These:

  1. P(A)=0.3, P(B)=0.4, P(A∩B)=0.12. Independent?
  2. One coin flip: A = Heads, B = Tails. Are A and B independent?
  3. Drawing two cards without replacement. Independent?

🎯 Key Takeaways

  • Independent events don't affect each other
  • Test: P(A and B) = P(A) × P(B)
  • With replacement → independent
  • Without replacement → dependent
Topic 18

🧮 Bayes' Theorem

Updating probabilities with new evidence

Introduction

What is it? Bayes' Theorem shows how to update probability based on new information.

Why it matters: Used in medical diagnosis, spam filters, machine learning, and countless applications.

The Formula

Bayes' Theorem
P(A|B) = [P(B|A) × P(A)] / P(B)
  • P(A|B) = posterior probability
  • P(B|A) = likelihood
  • P(A) = prior probability
  • P(B) = marginal probability
📊 MEDICAL DIAGNOSIS EXAMPLE

A disease affects 1% of the population. The test is 95% accurate (it detects 95% of sick people and correctly clears 95% of healthy people).

You test positive. What is the probability that you have the disease?

P(Disease) = 0.01

P(Positive|Disease) = 0.95

P(Positive|No Disease) = 0.05

P(Positive) = 0.01×0.95 + 0.99×0.05 = 0.059

P(Disease|Positive) = (0.95×0.01)/0.059 ≈ 0.161

Only a 16.1% chance you have the disease!

📝 Worked Example - Step by Step

Problem:

A disease affects 1% of the population. A test is 99% accurate (detects 99% of sick people and correctly identifies 99% of healthy people). You test positive. What's the probability you actually have the disease?

Solution:

Step 1:

Define the Events and Given Information

Let A = has disease
Let B = tests positive
P(A) = 0.01 (1% of population has disease)
P(B|A) = 0.99 (99% true positive rate)
P(B|A') = 0.01 (1% false positive rate)

Set up all known probabilities before applying Bayes' Theorem

Step 2:

Calculate P(B) using Total Probability

P(B) = P(B|A) × P(A) + P(B|A') × P(A')
P(B) = (0.99 × 0.01) + (0.01 × 0.99)
P(B) = 0.0099 + 0.0099 = 0.0198

Find the overall probability of testing positive

Step 3:

Apply Bayes' Theorem

P(A|B) = [P(B|A) × P(A)] / P(B)
P(A|B) = (0.99 × 0.01) / 0.0198
P(A|B) = 0.0099 / 0.0198
P(A|B) = 0.5 = 50%

This is the posterior probability - what we want to find!

✓ Final Answer: Only a 50% chance you have the disease despite testing positive!
Why So Low?

This counter-intuitive result occurs because the disease is so rare (1%). Even with a 99% accurate test, the healthy 99% of the population produce as many false positives (0.0099) as the sick 1% produce true positives (0.0099). Base rates matter!
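Both medical examples follow the same recipe, so they can be wrapped in one small function. The sketch below is an illustration, not a diagnostic tool; the function name posterior and its parameter names are assumptions introduced here.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(Disease | Positive) via Bayes' Theorem.

    prior               = P(Disease)
    sensitivity         = P(Positive | Disease)
    false_positive_rate = P(Positive | No Disease)
    """
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1 - prior)
    # Total probability of a positive test is the sum of both paths.
    return true_positives / (true_positives + false_positives)


p_99 = posterior(prior=0.01, sensitivity=0.99, false_positive_rate=0.01)
p_95 = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(p_99, 3))  # 0.5
print(round(p_95, 3))  # 0.161
```

Changing only the prior (for example to 0.10) is a quick way to see how strongly the base rate drives the answer.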

💪 Try These:

  1. What if the disease affects 10% of the population instead? Recalculate P(A|B)
  2. If the test were 95% accurate instead of 99%, what would P(A|B) be?

🎯 Key Takeaways

  • Updates probability based on new evidence
  • P(A|B) = [P(B|A) × P(A)] / P(B)
  • Critical for medical testing and machine learning
  • Counter-intuitive results common (base rate matters!)
Topic 19

📊 Probability Mass Function (PMF)

Probabilities for discrete random variables

Introduction

What is it? PMF gives the probability that a discrete random variable equals a specific value.

Why it matters: Used for countable outcomes like dice rolls, coin flips, or number of defects.

Properties

  • 0 ≤ P(X = x) ≤ 1 for all x
  • Sum of all probabilities = 1
  • Only defined for discrete variables
  • Visualized with bar charts
📊 EXAMPLE: Die Roll

P(X = 1) = 1/6

P(X = 2) = 1/6

... and so on

Sum = 6 × (1/6) = 1 ✓

🎯 Key Takeaways

  • PMF is for discrete random variables
  • Gives P(X = specific value)
  • All probabilities sum to 1
  • Visualized with bar charts
Topic 20

📈 Probability Density Function (PDF)

Probabilities for continuous random variables

Introduction

What is it? A PDF describes probability for continuous random variables. The probability at any exact point is 0, so we calculate probability over intervals.

Key Differences from PMF

  • For continuous (not discrete) variables
  • P(X = exact value) = 0
  • Calculate P(a < X < b) = area under curve
  • Total area under curve = 1

📊 Visual: PDF vs CDF (Uniform Distribution)

📝 Worked Example - Step by Step

Problem:

Continuous random variable X has uniform distribution on interval [0, 10]. a) Find the PDF f(x), b) Calculate P(3 ≤ X ≤ 7)

Solution:

Step 1:

Understand Uniform Distribution

X is equally likely anywhere between 0 and 10
For uniform on [a, b], PDF is constant
Total area under curve must equal 1

Uniform means constant probability density across the interval

Step 2:

Find PDF Height

Interval length = b - a = 10 - 0 = 10
For area = 1: height × width = 1
height × 10 = 1
height = 1/10 = 0.1
Therefore: f(x) = 0.1 for 0 ≤ x ≤ 10, and 0 otherwise

The constant height must give total area of 1

Step 3:

Calculate P(3 ≤ X ≤ 7)

For continuous uniform: P(c ≤ X ≤ d) = (d - c) × height, for c and d inside the interval
P(3 ≤ X ≤ 7) = (7-3) × 0.1
= 4 × 0.1
= 0.4

Probability is the area of the rectangle

Step 4:

Visualize (Area Under Curve)

Rectangle: width = 4, height = 0.1
Area = 4 × 0.1 = 0.4
This represents probability

The geometric area equals the probability

✓ Final Answer: a) f(x) = 0.1 for x ∈ [0,10]
b) P(3 ≤ X ≤ 7) = 0.4 (40%)
Verification:

P(0 ≤ X ≤ 10) = 10 × 0.1 = 1.0 ✓ (total probability = 1)
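"Probability = area under the curve" can be checked numerically. The following Python sketch approximates the area with a midpoint rule; the function names uniform_pdf and area_under_pdf are illustrative assumptions, and the step count is arbitrary.

```python
def uniform_pdf(x, a=0.0, b=10.0):
    """Density of Uniform[a, b]: 1/(b - a) inside the interval, 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def area_under_pdf(pdf, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the area under pdf between lo and hi."""
    width = (hi - lo) / steps
    return sum(pdf(lo + (i + 0.5) * width) for i in range(steps)) * width


p_3_to_7 = area_under_pdf(uniform_pdf, 3, 7)
total = area_under_pdf(uniform_pdf, 0, 10)
print(round(p_3_to_7, 6), round(total, 6))  # 0.4 1.0
```

For a uniform density the rectangle formula is exact, so the numerical result simply confirms it; the same area_under_pdf helper would also work for densities without a simple shape.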

💪 Try These:

  1. Uniform on [5,15]. Find PDF.
  2. For above, find P(8 ≤ X ≤ 12)
  3. Why is P(X = 7) = 0 for continuous distributions?

🎯 Key Takeaways

  • PDF is for continuous random variables
  • Probability = area under curve
  • P(X = exact point) = 0
  • Total area under PDF = 1
Topic 21

📉 Cumulative Distribution Function (CDF)

Probability up to a value

Introduction

What is it? CDF gives the probability that X is less than or equal to a specific value.

Formula: F(x) = P(X ≤ x)

Properties

  • Always non-decreasing
  • F(-∞) = 0
  • F(+∞) = 1
  • P(a < X ≤ b) = F(b) - F(a)

📝 Worked Example - Step by Step

Problem:

For the uniform distribution from Topic 20 (X ~ Uniform[0,10]), find: a) F(5) = P(X ≤ 5), b) F(12), c) P(2 < X ≤ 8)

Solution:

Step 1:

Recall PDF

f(x) = 0.1 for 0 ≤ x ≤ 10
CDF is cumulative (area from left up to x)

CDF accumulates probability from the left

Step 2:

Find F(5)

F(5) = P(X ≤ 5)
Area from 0 to 5: width = 5, height = 0.1
F(5) = 5 × 0.1 = 0.5

Half of the distribution is below x = 5

Step 3:

Find F(12)

F(12) = P(X ≤ 12)
But X can't exceed 10
All probability is accounted for by x = 10
F(12) = 1.0 (certainty)

CDF plateaus at 1 beyond the support of the distribution

Step 4:

Find P(2 < X ≤ 8)

Using CDF: P(a < X ≤ b) = F(b) - F(a)
F(8) = 8 × 0.1 = 0.8
F(2) = 2 × 0.1 = 0.2
P(2 < X ≤ 8) = 0.8 - 0.2 = 0.6

Subtract lower CDF from upper CDF

Step 5:

General CDF Formula

For uniform [0, 10]:
• F(x) = 0 if x < 0
• F(x) = x/10 if 0 ≤ x ≤ 10
• F(x) = 1 if x > 10

The complete CDF function has three pieces

✓ Final Answer: a) F(5) = 0.5
b) F(12) = 1.0
c) P(2 < X ≤ 8) = 0.6
Check:

F(0) = 0 (no probability below 0), F(10) = 1 (all probability by 10), F is non-decreasing ✓
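The three-piece CDF translates directly into code. This is a small Python sketch under the same Uniform[0, 10] assumption; the function name uniform_cdf is chosen here for illustration.

```python
def uniform_cdf(x, a=0.0, b=10.0):
    """F(x) = P(X <= x) for X ~ Uniform[a, b], written as three pieces."""
    if x < a:
        return 0.0          # no probability below the interval
    if x > b:
        return 1.0          # all probability accounted for
    return (x - a) / (b - a)


f5 = uniform_cdf(5)
f12 = uniform_cdf(12)
p_2_to_8 = uniform_cdf(8) - uniform_cdf(2)   # P(2 < X <= 8) = F(8) - F(2)
print(f5, f12, round(p_2_to_8, 10))  # 0.5 1.0 0.6
```

The rounding in the last line only hides ordinary floating-point noise from the subtraction 0.8 - 0.2.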

💪 Try These:

  1. For uniform [5,15], find F(10)
  2. What is P(X > 7) using the CDF?
  3. If F(x) = 0.75, what does this mean?

🎯 Key Takeaways

  • CDF: F(x) = P(X ≤ x)
  • Works for both discrete and continuous
  • Never decreases; rises from 0 to 1
  • Useful for finding percentiles
Topic 22

🪙 Bernoulli Distribution

Single trial with two outcomes

Introduction

What is it? Models a single trial with two outcomes: success (1) or failure (0).

Examples: Coin flip, pass/fail test, yes/no question

Formula

Bernoulli PMF
P(X = 1) = p
P(X = 0) = 1 - p = q

Mean = p, Variance = p(1-p)

📝 Worked Example - Step by Step

Problem:

Flip a fair coin once. Let X = 1 if Heads, X = 0 if Tails. a) Find P(X=1) and P(X=0), b) Calculate E(X) and Var(X)

Solution:

Step 1:

Identify Bernoulli Trial

Single trial with two outcomes (Success/Failure)
Success = Heads, p = 0.5
Failure = Tails, 1-p = 0.5

This is a classic Bernoulli trial

Step 2:

Find Probabilities

P(X = 1) = p = 0.5 (probability of heads)
P(X = 0) = 1-p = 0.5 (probability of tails)
Check: 0.5 + 0.5 = 1.0 ✓

Probabilities must sum to 1

Step 3:

Calculate Expected Value

Formula: E(X) = p
E(X) = 0.5
Or: E(X) = 0×P(X=0) + 1×P(X=1)
= 0×0.5 + 1×0.5 = 0.5 ✓

Expected value is the probability of success

Step 4:

Calculate Variance

Formula: Var(X) = p(1-p)
Var(X) = 0.5 × 0.5 = 0.25
Standard deviation: σ = √0.25 = 0.5

Variance measures spread of outcomes

Step 5:

Interpret

On average, we get 0.5 heads per flip
Variance measures spread of 0 and 1 outcomes

Expected value represents long-run average

✓ Final Answer: a) P(X=1) = 0.5, P(X=0) = 0.5
b) E(X) = 0.5, Var(X) = 0.25
Check:

For fair coin, p = 0.5 makes sense. Over many flips, we expect half heads (E(X) = 0.5).
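The shortcut formulas Mean = p and Variance = p(1-p) can be confirmed from the definitions of expected value and variance. A minimal Python sketch, with bernoulli_moments as a hypothetical helper name:

```python
def bernoulli_moments(p):
    """Mean and variance of a Bernoulli(p) variable, computed from its PMF."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(x * prob for x, prob in pmf.items())
    variance = sum((x - mean) ** 2 * prob for x, prob in pmf.items())
    return mean, variance


mean, variance = bernoulli_moments(0.5)
print(mean, variance)  # 0.5 0.25
```

Calling the same function with other values of p is one way to explore where the variance is largest.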

💪 Try These:

  1. Biased coin: P(Heads) = 0.7. Find E(X) and Var(X)
  2. Free throw: 80% success rate. Model as Bernoulli
  3. When is Var(X) maximized for Bernoulli?

🎯 Key Takeaways

  • Single trial, two outcomes (0 or 1)
  • Parameter: p (probability of success)
  • Mean = p, Variance = p(1-p)
  • Building block for binomial distribution
Topic 23

🎰 Binomial Distribution

Multiple independent Bernoulli trials

Introduction

What is it? Models the number of successes in n independent Bernoulli trials.

Requirements: Fixed n, same p, independent trials, binary outcomes

Formula

Binomial PMF
P(X = k) = C(n,k) × p^k × (1-p)^(n-k)

C(n,k) = n! / (k!(n-k)!)

Mean = np, Variance = np(1-p)

📊 EXAMPLE

Flip coin 10 times. P(exactly 6 heads)?

n=10, k=6, p=0.5

P(X=6) = C(10,6) × 0.5^6 × 0.5^4 = 210 × 0.000977 ≈ 0.205
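The binomial PMF is short enough to compute directly. The sketch below uses math.comb from the Python standard library for C(n,k); the function name binomial_pmf is an assumption made for this example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_six_heads = binomial_pmf(6, 10, 0.5)
print(round(p_six_heads, 3))  # 0.205

# Sanity check: the PMF over k = 0..10 should sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(total)  # 1.0
```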

🎯 Key Takeaways

  • n independent trials, probability p each
  • Counts number of successes
  • Mean = np, Variance = np(1-p)
  • Common in quality control and surveys
Topic 24

🔔 Normal Distribution

The bell curve and 68-95-99.7 rule

Introduction

What is it? The most important continuous probability distribution: a symmetric, bell-shaped curve.

Why it matters: Many natural phenomena approximately follow a normal distribution. It is the foundation of inferential statistics.

Properties

  • Symmetric around mean μ
  • Bell-shaped curve
  • Mean = Median = Mode
  • Defined by μ (mean) and σ (standard deviation)
  • Total area under curve = 1

The 68-95-99.7 Rule (Empirical Rule)

  • 68% of data within μ ± 1σ
  • 95% of data within μ ± 2σ
  • 99.7% of data within μ ± 3σ
💡 REAL-WORLD EXAMPLE

IQ scores: μ = 100, σ = 15

68% of people have IQ between 85-115

95% have IQ between 70-130

99.7% have IQ between 55-145

📊 Visual: The 68-95-99.7 Rule

📝 Worked Example - Step by Step

Problem:

IQ scores follow Normal distribution with μ = 100, σ = 15. Find: a) P(IQ ≤ 115), b) P(85 ≤ IQ ≤ 115), c) IQ score at 95th percentile

Solution:

Step 1:

Understand Normal Distribution

Bell-shaped, symmetric around mean
μ = 100 (center)
σ = 15 (spread)

Parameters define the shape and location of the curve

Step 2:

Find P(IQ ≤ 115) using z-score

z = (x - μ)/σ = (115 - 100)/15 = 15/15 = 1
P(Z ≤ 1) = 0.8413 (from z-table)
About 84.13% have IQ ≤ 115

Standardize to z-score, then use standard normal table

Step 3:

Find P(85 ≤ IQ ≤ 115)

Lower bound: z₁ = (85-100)/15 = -15/15 = -1
Upper bound: z₂ = (115-100)/15 = 1
This is μ ± 1σ (68-95-99.7 rule)
P(-1 ≤ Z ≤ 1) ≈ 0.68 (approximately 68%)
From the z-table: P(Z≤1) - P(Z≤-1) = 0.8413 - 0.1587 = 0.6826

One standard deviation on each side covers 68% of data

Step 4:

Find 95th Percentile

P(IQ ≤ x) = 0.95
From z-table: z = 1.645 for 95th percentile
x = μ + zσ = 100 + 1.645×15
= 100 + 24.675 = 124.675
IQ ≈ 125

Convert z-score back to original scale using inverse formula

✓ Final Answer: a) P(IQ ≤ 115) = 0.8413 (84.13%)
b) P(85 ≤ IQ ≤ 115) = 0.6826 (68.26%)
c) 95th percentile = IQ of 125
Verification:

Using 68-95-99.7 rule: μ±1σ contains 68% ✓, μ±2σ contains 95%, μ±3σ contains 99.7%. Our answer matches the empirical rule!
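A z-table is not the only way to get these numbers. Python's standard library includes statistics.NormalDist, which provides cdf and inv_cdf. The sketch below reworks parts a) to c); the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

p_le_115 = iq.cdf(115)                   # a) P(IQ <= 115)
p_85_to_115 = iq.cdf(115) - iq.cdf(85)   # b) P(85 <= IQ <= 115)
percentile_95 = iq.inv_cdf(0.95)         # c) IQ at the 95th percentile

print(round(p_le_115, 4))       # 0.8413
print(round(p_85_to_115, 4))    # 0.6827
print(round(percentile_95, 2))  # 124.67
```

Part b) comes out as 0.6827 rather than 0.6826 because the table values 0.8413 and 0.1587 are themselves rounded; the difference is not a mistake in either method.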

💪 Try These:

  1. Find P(IQ > 130) using same distribution
  2. What IQ scores contain the middle 95% of people?
  3. If z = -2, what percentile is this?

🎯 Key Takeaways

  • Symmetric bell curve, parameters μ and σ
  • 68-95-99.7 rule for standard deviations
  • Foundation for hypothesis testing
  • Central Limit Theorem connects to sampling
Topic 25

⚖️ Hypothesis Testing Introduction

Making decisions from data

Introduction

What is it? Statistical method for testing claims about populations using sample data.

Why it matters: Allows us to make evidence-based decisions and determine if effects are real or due to chance.

The Two Hypotheses

  • Null Hypothesis (H₀): Status quo, no effect, no difference
  • Alternative Hypothesis (H₁ or Hₐ): What we're seeking evidence for

Decision Process

  1. State hypotheses (H₀ and H₁)
  2. Choose significance level (α)
  3. Collect data and calculate test statistic
  4. Find p-value or critical value
  5. Make decision: Reject H₀ or Fail to reject H₀
📊 EXAMPLE

Claim: New teaching method improves test scores

H₀: μ = 75 (no improvement)

H₁: μ > 75 (scores improved)

🎯 Key Takeaways

  • H₀ = null hypothesis (status quo)
  • H₁ = alternative hypothesis (what we seek evidence for)
  • We either reject or fail to reject H₀
  • Never "accept" H₀ or claim to have "proved" a hypothesis
Topic 26

🎯 Significance Level (α)

Setting your error tolerance

Introduction

What is it? α (alpha) is the probability of rejecting H₀ when it's actually true (Type I error rate).

Common values: 0.05 (5%), 0.01 (1%), 0.10 (10%)

Interpretation

  • α = 0.05: Willing to accept a 5% chance of rejecting a true H₀
  • Lower α: More stringent, harder to reject H₀
  • Higher α: More lenient, easier to reject H₀
  • Confidence level: 1 - α (e.g., 0.05 → 95% confidence)

📝 Worked Example - Step by Step

Problem:

Explain the difference between α = 0.05 and α = 0.01. Which is more strict? Find critical values for both in a two-tailed test.

Solution:

Step 1:

Understand α = 0.05

α = 0.05 means 5% significance
95% confidence level (1 - 0.05)
P(Type I error) = 5%
Willing to reject a true H₀ 5% of the time
Step 2:

Understand α = 0.01

α = 0.01 means 1% significance
99% confidence level (1 - 0.01)
P(Type I error) = 1%
Only willing to reject a true H₀ 1% of the time
Step 3:

Find Critical Values for α = 0.05

Two-tailed: split α into both tails
Each tail = 0.05/2 = 0.025
z₀.₉₇₅ = 1.96, so the critical values are ±1.96
Reject if |z| > 1.96
Step 4:

Find Critical Values for α = 0.01

Two-tailed: each tail = 0.01/2 = 0.005
z₀.₉₉₅ = 2.576, so the critical values are ±2.576
Reject if |z| > 2.576
Harder to reject (more strict!)
Step 5:

Compare

α = 0.01 is MORE STRICT
Requires stronger evidence to reject H₀
Reduces Type I errors but increases Type II
✓ Final Answer: α = 0.05: z = ±1.96; α = 0.01: z = ±2.576 (more strict)

💪 Practice Problems:

  1. Find critical value for α = 0.10, two-tailed
  2. If we want to be very strict, should we use α = 0.05 or α = 0.001?
  3. What happens to Type II error when α decreases?

🎯 Key Takeaways

  • α = probability of Type I error
  • Common: α = 0.05 (5% error rate)
  • Set before collecting data
  • Trade-off between Type I and Type II errors
Topic 27

📊 Standard Error

Measuring sampling variability

Introduction

What is it? Standard error (SE) is the standard deviation of the sample mean: it measures how much sample means vary around the true population mean from sample to sample.

Formula

Standard Error of Mean
SE = σ / √n

or estimate: SE = s / √n

Key Points

  • Decreases as sample size increases
  • Measures precision of sample mean
  • Lower SE = more precise estimate
  • Used in confidence intervals and hypothesis tests

📝 Worked Example - Step by Step

Problem:

Population has σ = 20. Calculate standard error for sample sizes: n = 4, n = 16, n = 64, n = 100. What pattern do you notice?

Solution:

Step 1:

Recall Standard Error Formula

SE = σ / √n
Where:
- σ = population standard deviation
- n = sample size
SE measures variability of sample means
Step 2:

Calculate SE for n = 4

SE = 20 / √4
SE = 20 / 2
SE = 10
Step 3:

Calculate SE for n = 16

SE = 20 / √16
SE = 20 / 4
SE = 5
Step 4:

Calculate SE for n = 64

SE = 20 / √64
SE = 20 / 8
SE = 2.5
Step 5:

Calculate SE for n = 100

SE = 20 / √100
SE = 20 / 10
SE = 2
Step 6:

Analyze Pattern

n = 4: SE = 10
n = 16: SE = 5 (4× sample → ½ SE)
n = 64: SE = 2.5 (16× sample → ¼ SE)
n = 100: SE = 2 (25× sample → ⅕ SE)

Pattern: Quadruple sample size → Half the SE
Larger samples give more precise estimates!
✓ Final Answer: SE: 10, 5, 2.5, 2. Larger n → Smaller SE (more precision)
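The pattern is easy to tabulate in a few lines of Python. This is a minimal sketch; the function name standard_error is chosen for illustration.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


results = {n: standard_error(20, n) for n in (4, 16, 64, 100)}
for n, se in results.items():
    print(n, se)
# 4 10.0
# 16 5.0
# 64 2.5
# 100 2.0
```

Because n sits under a square root, halving the SE always costs four times as many observations.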

💪 Practice Problems:

  1. If σ = 15 and n = 25, find SE
  2. To cut SE in half, by what factor must we increase n?
  3. Why does larger sample size reduce SE?

🎯 Key Takeaways

  • SE = σ / √n
  • Measures sampling variability
  • Larger samples → smaller SE
  • Critical for inference
Topic 28

Z-Test

Hypothesis test for large samples with known σ

When to Use Z-Test

  • Sample size n ≥ 30 (large sample)
  • Population standard deviation (σ) known
  • Testing population mean
  • Normal distribution or large n

Formula

Z-Test Statistic
z = (x̄ - μ₀) / (σ / √n)

x̄ = sample mean

μ₀ = hypothesized population mean

σ = population standard deviation

n = sample size

📝 Worked Example - Step by Step

Problem:

A factory claims μ = 100. Sample: n = 36, x̄ = 105, σ = 12. Test at α = 0.05 (two-tailed).

Solution:

Step 1:

State Hypotheses

H₀: μ = 100 (claim is true)
H₁: μ ≠ 100 (claim is false)
α = 0.05, two-tailed test
Step 2:

Calculate Standard Error

SE = σ / √n
SE = 12 / √36
SE = 12 / 6
SE = 2
Step 3:

Calculate Z-Statistic

z = (x̄ - μ₀) / SE
z = (105 - 100) / 2
z = 5 / 2
z = 2.5
Step 4:

Find Critical Values

α = 0.05, two-tailed
Critical values: z = ±1.96
Rejection regions: z < -1.96 or z > 1.96
Step 5:

Make Decision

Test statistic: z = 2.5
Critical value: z = 1.96
2.5 > 1.96 → In rejection region

REJECT H₀
Step 6:

Interpret

There IS significant evidence that μ ≠ 100
The sample mean of 105 is statistically different
Factory's claim is likely false
✓ Final Answer: z = 2.5 > 1.96, REJECT H₀ (claim is false)
Check:

P-value = 2 × P(Z > 2.5) = 2 × 0.0062 = 0.0124 < 0.05 ✓ Confirms rejection
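The whole two-tailed z-test fits in one short function. The Python sketch below uses statistics.NormalDist from the standard library; the function name z_test and its return layout are assumptions made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test for a mean with known population sigma."""
    standard_normal = NormalDist()
    se = sigma / sqrt(n)
    z = (sample_mean - mu0) / se
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    reject = abs(z) > z_critical
    return z, z_critical, p_value, reject


z, z_critical, p_value, reject = z_test(105, 100, 12, 36)
print(round(z, 2), round(z_critical, 2), round(p_value, 4), reject)
# 2.5 1.96 0.0124 True
```

The critical-value rule and the p-value rule always agree for the same α, which is why the function can report both.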

💪 Practice Problems:

  1. Test: μ₀ = 50, x̄ = 48, σ = 10, n = 25, α = 0.05
  2. If z = -1.5, α = 0.05, two-tailed, what's your decision?
  3. When should we use z-test vs t-test?

🎯 Key Takeaways

  • Use when n ≥ 30 and σ known
  • z = (x̄ - μ₀) / SE
  • Compare z to critical value or find p-value
  • Large |z| = evidence against H₀
Topic 29

🎚️ Z-Score & Critical Values

Standardization and rejection regions

Z-Score (Standardization)

Z-Score Formula
z = (x - μ) / σ

Converts any normal distribution to standard normal (μ=0, σ=1)

Critical Values

  • α = 0.05 (two-tailed): z = ±1.96
  • α = 0.05 (one-tailed): z = 1.645
  • α = 0.01 (two-tailed): z = ±2.576

📝 Worked Example - Step by Step

Problem:

Find critical z-values for: a) α = 0.05 one-tailed (right), b) α = 0.05 two-tailed, c) α = 0.01 two-tailed. Draw rejection regions.

Solution:

Step 1:

One-Tailed Right (α = 0.05)

All α in right tail
Find z where P(Z > z) = 0.05
P(Z ≤ z) = 1 - 0.05 = 0.95
From z-table: z₀.₉₅ = 1.645

Critical value: z = 1.645
Reject H₀ if z > 1.645
Step 2:

Two-Tailed (α = 0.05)

Split α between both tails
Each tail = 0.05/2 = 0.025
Left tail: P(Z < z) = 0.025 → z = -1.96
Right tail: P(Z > z) = 0.025 → z = +1.96

Critical values: z = ±1.96
Reject H₀ if |z| > 1.96
Step 3:

Two-Tailed (α = 0.01)

More strict test
Each tail = 0.01/2 = 0.005
P(Z < z) = 0.005 → z = -2.576
P(Z > z) = 0.005 → z = +2.576

Critical values: z = ±2.576
Reject H₀ if |z| > 2.576
Step 4:

Visualize Rejection Regions

One-tailed (α=0.05): [______|████] z > 1.645
Two-tailed (α=0.05): [██|________|██] |z| > 1.96
Two-tailed (α=0.01): [█|__________|█] |z| > 2.576

Smaller α → Larger critical values → Harder to reject
✓ Final Answer: a) z = 1.645, b) z = ±1.96, c) z = ±2.576
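Instead of reading a z-table, the critical values can be computed with the inverse CDF of the standard normal. This is a minimal Python sketch; the function name critical_value and its tails parameter are illustrative assumptions.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def critical_value(alpha, tails=2):
    """Positive critical z-value for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Put alpha (one-tailed) or alpha/2 (two-tailed) in the upper tail.
    return standard_normal.inv_cdf(1 - alpha / tails)


z_a = critical_value(0.05, tails=1)
z_b = critical_value(0.05, tails=2)
z_c = critical_value(0.01, tails=2)
print(round(z_a, 3), round(z_b, 3), round(z_c, 3))  # 1.645 1.96 2.576
```

For a left-tailed test the critical value is simply the negative of the one-tailed result.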

💪 Practice Problems:

  1. Find critical value for α = 0.10, one-tailed (left)
  2. If your test statistic is z = 2.0, which tests would reject H₀?
  3. Why are two-tailed critical values larger than one-tailed?

🎯 Key Takeaways

  • Z-score standardizes values
  • Critical values define rejection region
  • |z| > critical value → reject H₀
  • Common: ±1.96 for 95% confidence
Topic 30

💯 P-Value Method

Probability of observing data this extreme if H₀ is true

Introduction

What is it? P-value is the probability of getting results at least as extreme as those observed, assuming H₀ is true.

Decision Rule

  • If p-value ≤ α: Reject H₀ (statistically significant)
  • If p-value > α: Fail to reject H₀ (not significant)

Interpretation

  • p < 0.01: Very strong evidence against H₀
  • 0.01 ≤ p < 0.05: Strong evidence against H₀
  • 0.05 ≤ p < 0.10: Weak evidence against H₀
  • p ≥ 0.10: Little or no evidence against H₀
⚠️ COMMON MISCONCEPTION

P-value is NOT the probability that H₀ is true! It's the probability of observing data at least as extreme as yours IF H₀ were true.

📝 Worked Example - Step by Step

Problem:

Sample of 36 students has mean score x̄ = 78. Population mean claimed to be μ₀ = 75 with σ = 12. Test at α = 0.05 using p-value method.

Solution:

Step 1:

State Hypotheses

H₀: μ = 75 (null hypothesis - no difference)
H₁: μ ≠ 75 (alternative - there is a difference)
Two-tailed test

Set up null and alternative hypotheses

Step 2:

Calculate Test Statistic

z = (x̄ - μ₀) / (σ/√n)
z = (78 - 75) / (12/√36)
z = 3 / (12/6)
z = 3 / 2 = 1.5

Calculate the z-score

Step 3:

Find P-Value

For two-tailed: p-value = 2 × P(Z > |1.5|)
P(Z > 1.5) = 1 - 0.9332 = 0.0668
p-value = 2 × 0.0668 = 0.1336

Multiply by 2 for two-tailed test

Step 4:

Compare with ฮฑ

p-value = 0.1336
α = 0.05
0.1336 > 0.05

Since p-value exceeds α, we fail to reject H₀

Step 5:

Make Decision

Since p-value > α, FAIL TO REJECT H₀
Not enough evidence to conclude mean differs from 75
A p-value of 13.36% means we'd see results at least this extreme
13.36% of the time if H₀ were true

Interpret in context

✓ Final Answer: p-value = 0.1336 > 0.05, Fail to reject H₀
Check:

The result is not statistically significant at α = 0.05 level. We need stronger evidence to claim the mean differs from 75.
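The p-value in this example can be reproduced without a table. The Python sketch below uses statistics.NormalDist; the helper name two_tailed_p_value is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p_value(z):
    """P(|Z| >= |z|) under the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


z = (78 - 75) / (12 / sqrt(36))
p_value = two_tailed_p_value(z)
alpha = 0.05

print(round(z, 2), round(p_value, 4))  # 1.5 0.1336
print("Reject H0" if p_value <= alpha else "Fail to reject H0")  # Fail to reject H0
```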

💪 Try These:

  1. If z = 2.5, α = 0.01, find p-value and decide
  2. When do we reject H₀ using p-value method?

🎯 Key Takeaways

  • P-value = P(data at least this extreme | H₀ true)
  • Reject H₀ if p ≤ α
  • Smaller p-value = stronger evidence against H₀
  • Most common approach in modern statistics
Topic 31

↔️ One-Tailed vs Two-Tailed Tests

Directional vs non-directional hypotheses

Two-Tailed Test

  • H₁: μ ≠ μ₀ (different, could be higher or lower)
  • Testing for any difference
  • Rejection regions in both tails
  • More conservative

One-Tailed Test

  • Right-tailed: H₁: μ > μ₀
  • Left-tailed: H₁: μ < μ₀
  • Testing for specific direction
  • Rejection region in one tail
  • More powerful for directional effects

📝 Worked Example - Step by Step

Problem:

Researcher claims new drug LOWERS blood pressure (μ < 120). Sample of 49: x̄ = 115, σ = 21. Test at α = 0.05. Should this be one-tailed or two-tailed?

Solution:

Step 1:

Analyze the Claim

Claim: drug LOWERS pressure (directional)
Looking for decrease specifically
This requires ONE-TAILED test (left tail)

Directional claim = one-tailed test

Step 2:

Set Up Hypotheses

H₀: μ ≥ 120 (blood pressure not lower)
H₁: μ < 120 (blood pressure IS lower)
Left-tailed test

Alternative hypothesis shows the direction

Step 3:

Calculate Z-Score

z = (x̄ - μ₀) / (σ/√n)
z = (115 - 120) / (21/√49)
z = -5 / (21/7)
z = -5 / 3 ≈ -1.67

A negative z-score means the sample mean is below the hypothesized mean

Step 4:

Find Critical Value (One-Tailed)

For α = 0.05, one-tailed (left)
Critical value: z = -1.645

One-tailed critical value differs from two-tailed

Step 5:

Make Decision

Test statistic: z = -1.67
Critical value: z = -1.645
-1.67 < -1.645 (in rejection region)
REJECT H₀

Falls in rejection region, so reject null

Step 6:

Contrast with Two-Tailed

If two-tailed: critical values ±1.96
Our |z| = 1.67 < 1.96
Would NOT reject H₀ with two-tailed!
This shows importance of choosing correct test

Test choice matters!

✓ Final Answer: Use ONE-TAILED (left). z = -1.67 < -1.645, Reject H₀
Check:

Evidence supports claim that drug lowers blood pressure. One-tailed test was appropriate for directional claim.
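The contrast between the two test types also shows up in the p-values. The following Python sketch computes both for the blood-pressure data; the variable names are illustrative, and the two-tailed value is included only for comparison.

```python
from math import sqrt
from statistics import NormalDist

standard_normal = NormalDist()
alpha = 0.05

z = (115 - 120) / (21 / sqrt(49))   # -5/3, about -1.67

# Left-tailed test: area to the left of z.
p_left = standard_normal.cdf(z)
# Two-tailed test: area in both tails beyond |z|.
p_two = 2 * standard_normal.cdf(-abs(z))

print(round(z, 2))                   # -1.67
print(p_left < alpha)                # True  -> reject H0 (one-tailed)
print(p_two < alpha)                 # False -> fail to reject H0 (two-tailed)
```

The two-tailed p-value is exactly double the one-tailed one, which is why a borderline result like this can flip between the two tests.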

💪 Try These:

  1. Claim: μ > 50. One-tailed or two-tailed?
  2. Claim: μ ≠ 100. Which test?

🎯 Key Takeaways

  • Two-tailed: testing for any difference
  • One-tailed: testing for specific direction
  • Choose before collecting data
  • Two-tailed is more conservative
Topic 32

T-Test

Hypothesis test for small samples or unknown σ

When to Use T-Test

  • Small sample (n < 30)
  • Population σ unknown (use sample s)
  • Population approximately normal

Formula

T-Test Statistic
t = (x̄ - μ₀) / (s / √n)

Same as z-test but uses s instead of σ

Follows t-distribution with df = n - 1

📝 Worked Example - Step by Step

Problem:

Small sample: n = 16, x̄ = 52, s = 8. Test if μ = 50 at α = 0.05. Population σ unknown.

Solution:

Step 1:

Choose Correct Test

n = 16 < 30 (small sample)
σ unknown (use sample s)
Use T-TEST instead of z-test

Small sample + unknown σ = t-test

Step 2:

Calculate T-Statistic

t = (x̄ - μ₀) / (s/√n)
t = (52 - 50) / (8/√16)
t = 2 / (8/4)
t = 2 / 2 = 1.0

Use sample standard deviation s

Step 3:

Find Degrees of Freedom

df = n - 1
df = 16 - 1 = 15

Lose 1 df for estimating mean

Step 4:

Find Critical Value

Two-tailed test, α = 0.05
df = 15
From t-table: t₀.₀₂₅,₁₅ = 2.131, so the critical values are ±2.131

Look up in t-distribution table

Step 5:

Compare and Decide

Test statistic: t = 1.0
Critical values: ±2.131
|1.0| < 2.131
FAIL TO REJECT H₀

Test statistic not in rejection region

Step 6:

Interpret

Not enough evidence that μ ≠ 50
Sample mean of 52 is not significantly different from 50

Interpret in context of problem

✓ Final Answer: t = 1.0, critical = ±2.131, Fail to reject H₀
Check:

The difference between 52 and 50 is not statistically significant at α = 0.05 level with this small sample.
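The t-statistic and degrees of freedom can be computed the same way as the z-statistic. Python's standard library has no t-distribution, so this sketch takes the critical value 2.131 from the t-table used in the example rather than computing it; the function name t_statistic is an illustrative assumption.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic and its degrees of freedom."""
    t = (sample_mean - mu0) / (s / sqrt(n))
    df = n - 1
    return t, df


t, df = t_statistic(52, 50, 8, 16)

# Table value for a two-tailed test, alpha = 0.05, df = 15 (from the worked example).
T_CRITICAL = 2.131
reject = abs(t) > T_CRITICAL

print(t, df, reject)  # 1.0 15 False
```

With a statistics package that provides the t-distribution, the table lookup could be replaced by an inverse-CDF call, but that is outside this standard-library sketch.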

💪 Try These:

  1. n = 25, x̄ = 100, s = 15, test μ = 95 at α = 0.01
  2. Why use t-test instead of z-test?

🎯 Key Takeaways

  • Use when σ unknown or n < 30
  • t = (x̄ - μ₀) / (s / √n)
  • Follows t-distribution
  • Heavier tails than the standard normal (z) distribution
Topic 33

🔓 Degrees of Freedom

Independent pieces of information

Introduction

What is it? Degrees of freedom (df) is the number of independent values that can vary in analysis.

Common Formulas

  • One-sample t-test: df = n - 1
  • Two-sample t-test (pooled, equal variances): df = n₁ + n₂ - 2
  • Chi-squared: df = (rows-1)(cols-1)

Why It Matters

  • Determines shape of t-distribution
  • Higher df → closer to normal distribution
  • Affects critical values

📝 Worked Example - Step by Step

Problem:

Calculate degrees of freedom for: a) Single sample t-test: n = 20, b) Two-sample t-test: n₁ = 15, n₂ = 18, c) Chi-squared test: 3×4 contingency table

Solution:

Step 1:

Single Sample T-Test

Formula: df = n - 1
n = 20
df = 20 - 1 = 19
We "lose" 1 df because we estimate mean from sample

Each parameter estimated reduces df by 1

Step 2:

Two-Sample T-Test (Equal Variances)

Formula: df = n₁ + n₂ - 2
n₁ = 15, n₂ = 18
df = 15 + 18 - 2 = 31
Lose 1 df per sample for estimating each mean

Two samples = two means estimated

Step 3:

Chi-Squared Contingency Table

Formula: df = (rows - 1) × (columns - 1)
3 rows, 4 columns
df = (3 - 1) × (4 - 1)
df = 2 × 3 = 6

Degrees of freedom for independence test

Step 4:

Explain Concept

Degrees of freedom = number of values free to vary
Each parameter estimated reduces df by 1
Higher df → distribution closer to normal

Conceptual understanding

✓ Final Answer: a) df = 19, b) df = 31, c) df = 6
Check:

These df values would be used to find appropriate critical values from respective distribution tables.

💪 Try These:

  1. Sample size 100, find df for t-test
  2. 5×3 table, find df for chi-squared

🎯 Key Takeaways

  • df = number of independent values
  • For t-test: df = n - 1
  • Higher df → distribution closer to normal
  • Critical for finding correct critical values
Topic 34

⚠️ Type I & Type II Errors

False positives and false negatives

The Two Types of Errors

                      H₀ True             H₀ False
Reject H₀             Type I Error (α)    Correct!
Fail to Reject H₀     Correct!            Type II Error (β)

Definitions

  • Type I Error (α): Rejecting true H₀ (false positive)
  • Type II Error (β): Failing to reject false H₀ (false negative)
  • Power = 1 - β: Probability of correctly rejecting false H₀
📊 MEDICAL ANALOGY

Type I Error: Telling healthy person they're sick (false alarm)

Type II Error: Telling sick person they're healthy (missed diagnosis)

📝 Worked Example - Step by Step

Problem:

Drug trial tests H₀: "Drug is safe" vs H₁: "Drug is dangerous". Describe Type I and Type II errors with consequences.

Solution:

Step 1:

Define Type I Error (False Positive)

Type I: Reject H₀ when H₀ is TRUE
In this case: Conclude drug is dangerous when it's actually safe
Probability = α (significance level)
Consequence: Safe drug rejected, patients miss beneficial treatment

False alarm - reject truth

Step 2:

Define Type II Error (False Negative)

Type II: Fail to reject H₀ when H₁ is TRUE
In this case: Conclude drug is safe when it's actually dangerous
Probability = β
Consequence: Dangerous drug approved, patients harmed!

Miss detecting danger

Step 3:

Create Decision Matrix

Reality vs Decision:
If H₀ true (safe) + Reject H₀ (call dangerous) = TYPE I
If H₁ true (dangerous) + Fail to reject = TYPE II
Correct decisions: Fail to reject a true H₀, or reject a false H₀

Four possible outcomes

Step 4:

Calculate Example

If α = 0.05: 5% chance of Type I error
If β = 0.20: 20% chance of Type II error
Power = 1 - β = 0.80 (80% chance of detecting dangerous drug)

Probabilities of each error

Step 5:

Compare Consequences

Type I: Waste safe drug (economic cost)
Type II: Approve dangerous drug (LIFE RISK!)
Type II is more serious here → prioritize a low β (use a larger sample, or a less strict α)

Context determines which error is worse

✓ Final Answer: Type I (α): Reject safe drug
Type II (β): Approve dangerous drug
Type II more dangerous in this case!
Check:

In medical contexts, Type II errors (missing danger) are often considered worse than Type I errors (false alarms).

💪 Try These:

  1. Security scanner: H₀ = "Safe". Describe Type I/II errors
  2. If α = 0.01, what's P(Type I error)?

🎯 Key Takeaways

  • Type I: False positive (α)
  • Type II: False negative (β)
  • Trade-off: for a fixed sample size, decreasing one increases the other
  • Power = 1 - β (ability to detect true effect)
Topic 35

ฯ‡ยฒ Chi-Squared Distribution

Distribution for categorical data analysis

Introduction

What is it? Chi-squared (χ²) distribution is used for testing hypotheses about categorical data and about variances.

Properties

  • Never negative (ranges from 0 to ∞)
  • Right-skewed
  • Shape depends on degrees of freedom
  • Higher df → more symmetric
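
These properties can be checked with a small simulation. The sketch below relies on the standard fact that a chi-squared variable with df degrees of freedom is the sum of df squared standard normal draws. The helper names, the seed, and the sample size of 20000 are illustrative choices, not part of the original material.

```python
import random
from statistics import fmean, pstdev

def chi2_samples(df, size, rng):
    """Draw chi-squared values as sums of df squared standard normals."""
    return [sum(rng.gauss(0, 1) ** 2 for _ in range(df)) for _ in range(size)]

def skewness(xs):
    """Sample skewness: mean cubed z-score (0 for a symmetric distribution)."""
    m, s = fmean(xs), pstdev(xs)
    return fmean(((x - m) / s) ** 3 for x in xs)

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
results = {}
for df in (2, 10, 30):
    xs = chi2_samples(df, 20000, rng)
    results[df] = {"min": min(xs), "mean": fmean(xs), "skew": skewness(xs)}
    print(f"df={df:2d}  min={results[df]['min']:.3f}  "
          f"mean={results[df]['mean']:.2f}  skew={results[df]['skew']:.2f}")
```

The minimum never drops below 0, the sample mean sits close to df, and the skewness falls as df grows, which is the "more symmetric" behaviour listed above.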

Uses

  • Goodness of fit test
  • Test of independence
  • Testing variance

🎯 Key Takeaways

  • Used for categorical data and variance tests
  • Never negative, right-skewed
  • Shape depends on df
  • Foundation for chi-squared tests
Topic 36

✓ Goodness of Fit Test

Testing if data follows expected distribution

Introduction

What is it? Tests whether observed frequencies match expected frequencies from a theoretical distribution.

Formula

Chi-Squared Test Statistic
χ² = Σ [(O - E)² / E]

O = observed frequency

E = expected frequency

df = k - 1 (k = number of categories)

📊 EXAMPLE

Testing if die is fair:

Roll 60 times. Expected: 10 per face

Observed: 8, 12, 11, 9, 10, 10

Calculate χ² and compare to critical value
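
The die example can be worked through in a few lines. This is a minimal sketch using only the counts given above. The significance level α = 0.05 is an assumption (the example does not name one), and the critical value is copied from a standard chi-squared table rather than computed.

```python
observed = [8, 12, 11, 9, 10, 10]    # counts for faces 1-6 from the example
n_rolls = sum(observed)              # 60 rolls in total
expected = [n_rolls / len(observed)] * len(observed)  # 10 per face if fair

# Chi-squared statistic: sum of (O - E)^2 / E over all categories
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1               # k - 1 = 5

# Table value for df = 5, right-tail area 0.05 (alpha is an assumption here)
critical = 11.070

print(f"chi2 = {chi2:.2f}, df = {df}, critical = {critical}")
print("reject H0 (die not fair)" if chi2 > critical else "fail to reject H0")
```

The statistic is far below the critical value, so these rolls give no evidence that the die is unfair.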

🎯 Key Takeaways

  • Tests if observed matches expected distribution
  • χ² = Σ(O-E)²/E
  • Large χ² = poor fit
  • df = number of categories - 1
Topic 37

🔗 Test of Independence

Testing relationship between categorical variables

Introduction

What is it? Tests whether two categorical variables are independent or associated.

Formula

Chi-Squared for Independence
χ² = Σ [(O - E)² / E]

E = (row total × column total) / grand total

df = (rows - 1)(columns - 1)

📊 EXAMPLE

Are gender and color preference independent?

Create contingency table, calculate expected frequencies, compute χ², and test against critical value.
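
The steps in this example can be sketched as follows. The source gives no data, so the 2 × 3 table below is entirely hypothetical (two gender rows, three color columns), and α = 0.05 is assumed. The critical value comes from a standard chi-squared table.

```python
# Hypothetical counts: rows = two gender groups, columns = blue, green, red
observed = [
    [30, 20, 10],
    [20, 20, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E = (row total x column total) / grand total for every cell
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (rows - 1)(columns - 1)

critical = 5.991   # table value for df = 2, right-tail area 0.05 (assumed alpha)

print(f"chi2 = {chi2:.3f}, df = {df}")
print("reject H0 (association)" if chi2 > critical else "fail to reject H0 (independent)")
```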

🎯 Key Takeaways

  • Tests independence of two categorical variables
  • Uses contingency tables
  • df = (r-1)(c-1)
  • Large χ² suggests association
Topic 38

📏 Chi-Squared Variance Test

Testing claims about population variance

Introduction

What is it? Tests hypotheses about population variance or standard deviation.

Formula

Chi-Squared for Variance
χ² = (n-1)s² / σ₀²

n = sample size

s² = sample variance

σ₀² = hypothesized population variance

df = n - 1
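
A short sketch of the formula in use. The ten measurements, the hypothesized variance σ₀² = 4, the right-tailed alternative, and α = 0.05 are all hypothetical choices made for illustration. The critical value is taken from a standard chi-squared table.

```python
from statistics import variance

# Hypothetical sample of n = 10 measurements
data = [48, 52, 47, 53, 50, 46, 54, 49, 51, 50]
sigma0_sq = 4.0                    # hypothesized population variance (assumed)

n = len(data)
s_sq = variance(data)              # sample variance, divides by n - 1
chi2 = (n - 1) * s_sq / sigma0_sq  # test statistic
df = n - 1

# Right-tailed test (H1: variance > sigma0_sq); table value for df = 9, alpha = 0.05
critical = 16.919

print(f"s^2 = {s_sq:.3f}, chi2 = {chi2:.2f}, df = {df}")
print("reject H0" if chi2 > critical else "fail to reject H0")
```

The test assumes the population is normal, so a result like this is only as trustworthy as that assumption.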

🎯 Key Takeaways

  • Tests claims about variance/standard deviation
  • χ² = (n-1)s²/σ₀²
  • Requires normal population
  • Common in quality control
Topic 39

📊 Confidence Intervals

Range of plausible values for parameter

Introduction

What is it? A confidence interval provides a range of values that likely contains the true population parameter.

Why it matters: More informative than point estimates: shows precision and uncertainty.

Formula

Confidence Interval for Mean
CI = x̄ ± (critical value × SE)

For z: CI = x̄ ± z* × (σ/√n)

For t: CI = x̄ ± t* × (s/√n)

Common Confidence Levels

  • 90% CI: z* = 1.645
  • 95% CI: z* = 1.96
  • 99% CI: z* = 2.576
📊 EXAMPLE

Sample: n=100, x̄=50, s=10

95% CI = 50 ± 1.96(10/√100)

95% CI = 50 ± 1.96 = (48.04, 51.96)

(n is large, so z* = 1.96 is used with s; the exact t* for df = 99 is about 1.98.)
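
The same interval can be reproduced in code. This is a minimal sketch of the z-based formula using the example's numbers. `z_confidence_interval` is an illustrative helper name, and treating s as if it were σ is the large-sample approximation the example itself makes.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(mean, sd, n, confidence=0.95):
    """Two-sided z interval: mean +/- z* x sd / sqrt(n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z_star * sd / sqrt(n)
    return mean - margin, mean + margin

low, high = z_confidence_interval(mean=50, sd=10, n=100)
print(f"95% CI = ({low:.2f}, {high:.2f})")
```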

🎯 Key Takeaways

  • CI = point estimate ± margin of error
  • 95% CI most common
  • Wider CI = more uncertainty
  • Larger sample = narrower CI
Topic 40

± Margin of Error

Measuring estimate precision

Introduction

What is it? Margin of error (MOE) is the ± part of a confidence interval, showing the precision of an estimate.

Formula

Margin of Error
MOE = (critical value) × SE

MOE = z* × (σ/√n) or t* × (s/√n)

Factors Affecting MOE

  • Sample size: Larger n → smaller MOE
  • Confidence level: Higher confidence → larger MOE
  • Variability: Higher σ → larger MOE
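
Each of the three factors can be seen by varying one input at a time. The sketch below assumes the z-based formula with a known σ. The values σ = 10 and the sample sizes are hypothetical, and `margin_of_error` is an illustrative helper.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """MOE = z* x sigma / sqrt(n) for a two-sided interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)

# Sample size: quadrupling n halves the margin of error
for n in (25, 100, 400):
    print(f"n={n:3d}  MOE={margin_of_error(10, n):.3f}")

# Confidence level: more confidence means a wider margin
for level in (0.90, 0.95, 0.99):
    print(f"confidence={level:.2f}  MOE={margin_of_error(10, 100, level):.3f}")

# Variability: doubling sigma doubles the margin
print(f"sigma=20  MOE={margin_of_error(20, 100):.3f}")
```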

🎯 Key Takeaways

  • MOE = critical value × SE
  • Indicates precision of estimate
  • Shrinks in proportion to 1/√n (quadruple n to halve MOE)
  • Trade-off between confidence and precision
Topic 41

🔍 Interpreting Confidence Intervals

Common misconceptions and proper interpretation

Correct Interpretation

"We are 95% confident that the true population parameter lies within this interval."

This means: If we repeated this process many times, 95% of the intervals would contain the true parameter.
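
The "repeated many times" reading can be checked by simulation. This is a rough sketch under stated assumptions: a normal population with a known σ, so the z interval applies exactly. The true mean of 50, σ = 10, n = 30, the seed, and the 2000 trials are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(true_mean, sigma, n, trials, confidence=0.95, seed=0):
    """Fraction of simulated z intervals that contain the true mean."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample and build its interval around the sample mean
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        if sample_mean - margin <= true_mean <= sample_mean + margin:
            hits += 1
    return hits / trials

rate = coverage(true_mean=50, sigma=10, n=30, trials=2000)
print(f"{rate:.1%} of the intervals contained the true mean")
```

Each individual interval either contains the true mean or misses it. The 95% describes how often the procedure succeeds across repetitions, so the printed rate should land close to 95%.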

⚠️ COMMON MISCONCEPTIONS
  • WRONG: "There's a 95% probability the parameter is in this interval."
  • WRONG: "95% of the data falls in this interval."
  • WRONG: "We are 95% sure our sample mean is in this interval."

Using CIs for Hypothesis Testing

  • If hypothesized value is INSIDE CI → fail to reject H₀
  • If hypothesized value is OUTSIDE CI → reject H₀
  • 95% CI corresponds to a two-sided α = 0.05 test
✅ PRO TIP
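
The agreement between the interval and the test can be illustrated with a sample of n = 100, mean 50, and standard deviation 10. This is a sketch assuming a two-sided z-test. The hypothesized values 51 and 52 are hypothetical, and `ci_and_p_value` is an illustrative helper.

```python
from math import sqrt
from statistics import NormalDist

def ci_and_p_value(sample_mean, sd, n, mu0, alpha=0.05):
    """Return the (1 - alpha) z interval and the two-sided p-value for H0: mean = mu0."""
    z = NormalDist()
    se = sd / sqrt(n)
    z_star = z.inv_cdf(1 - alpha / 2)
    ci = (sample_mean - z_star * se, sample_mean + z_star * se)
    z_stat = (sample_mean - mu0) / se
    p_value = 2 * (1 - z.cdf(abs(z_stat)))
    return ci, p_value

for mu0 in (51, 52):   # hypothetical null values
    (low, high), p = ci_and_p_value(50, 10, 100, mu0)
    inside = low <= mu0 <= high
    decision = "fail to reject H0" if inside else "reject H0"
    print(f"mu0={mu0}: CI=({low:.2f}, {high:.2f}), p={p:.3f}, {decision}")
```

The value inside the interval has p above 0.05 and the value outside has p below 0.05, so both routes give the same decision.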

Report confidence intervals instead of just p-values! CIs provide more information: effect size AND statistical significance.

🎯 Key Takeaways

  • Correct interpretation: confidence in the method, not the specific interval
  • 95% refers to long-run success rate
  • Can use CIs for hypothesis testing
  • More informative than p-values alone