Hassan Ijaz
Ai, Web & Design
Linear regression
Drag-and-drop interface where users place points and see regression line update, with residual visualization and R-squared display
Concept Overview
Linear regression models the relationship between a continuous response variable and one or more predictor variables using a linear equation. It's one of the most fundamental and widely used statistical techniques.
The Linear Model
Simple: y = β₀ + β₁x + ε
One predictor variable
Multiple: y = β₀ + β₁x₁ + β₂x₂ + ... + βₚxₚ + ε
Multiple predictors
- β₀: Intercept (expected value of y when all predictors equal 0)
- βᵢ: Slope coefficients (expected change in y per unit change in xᵢ, holding the other predictors fixed)
- ε: Random error term (ε ~ N(0, σ²))
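To make the simple model concrete, here is a minimal sketch that simulates data from y = β₀ + β₁x + ε using only the Python standard library. The function name `simulate_simple_linear`, the predictor range of 0 to 10, and the parameter values are illustrative assumptions, not part of the original page.

```python
import random

def simulate_simple_linear(n, beta0, beta1, sigma, seed=0):
    """Draw n points from y = beta0 + beta1 * x + eps, with eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    # The predictor range [0, 10] is an arbitrary choice for illustration.
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Each response is the linear signal plus independent Gaussian noise.
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys

xs, ys = simulate_simple_linear(n=50, beta0=1.0, beta1=2.0, sigma=0.5)
```

Setting `sigma=0.0` removes the error term, so every point falls exactly on the line.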
Key Assumptions
Linearity
The expected value of y is a linear function of the predictors
Independence
Observations are independent of each other
Homoscedasticity
Constant variance of errors across all levels of x
Normality
Errors are normally distributed (needed for exact t- and F-tests in small samples, not for computing the least-squares estimates)
Estimation: Least Squares
Minimize sum of squared residuals:
minimize: Σ(yᵢ - ŷᵢ)²
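- Closed-form solution: β̂ = (X'X)⁻¹X'y
- Unique solution when X'X is invertible (X has full column rank)
- BLUE: Best Linear Unbiased Estimator (Gauss-Markov theorem)
- Has the smallest variance among linear unbiased estimators when errors have mean zero, constant variance, and are uncorrelated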
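For a single predictor, the closed-form solution reduces to two scalar formulas: slope = S_xy / S_xx and intercept = ȳ − slope · x̄. The sketch below implements that special case in plain Python. The function name `fit_simple_ols` and the sample data are assumptions made for illustration.

```python
def fit_simple_ols(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Centered sums of squares and cross-products.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0:
        # All x values identical: X'X is singular, so no unique solution exists.
        raise ValueError("x values must not all be equal")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up data lying close to the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_simple_ols(xs, ys)  # roughly b0 = 0.05, b1 = 1.99
```

With several predictors, the same estimate would normally be computed with a linear algebra library rather than by hand.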
Model Evaluation
R-squared (R²)
R² = 1 - SS_res/SS_tot
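- Proportion of the variance in y explained by the model
- Range: 0 to 1 for a least-squares fit with an intercept, evaluated on the fitting data (higher means a closer fit, but R² never decreases when predictors are added)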
- Adjusted R² penalizes for additional predictors
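The two quantities above can be computed directly from the observed and fitted values. This is a small sketch, assuming the common adjusted R² formula 1 − (1 − R²)(n − 1)/(n − p − 1). The function name `r_squared` and the example numbers are illustrative assumptions.

```python
def r_squared(ys, fitted, p=1):
    """Return (R^2, adjusted R^2) for observed ys, fitted values, and p predictors."""
    n = len(ys)
    y_bar = sum(ys) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation
    ss_tot = sum((y - y_bar) ** 2 for y in ys)              # total variation
    if ss_tot == 0:
        raise ValueError("y values must not all be equal")
    r2 = 1.0 - ss_res / ss_tot
    # The adjustment penalizes each additional predictor; it needs n > p + 1.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, adj_r2

ys = [2.1, 3.9, 6.2, 7.8, 10.1]
# Fitted values from an assumed fit y-hat = 0.05 + 1.99x at x = 1, ..., 5.
fitted = [2.04, 4.03, 6.02, 8.01, 10.00]
r2, adj_r2 = r_squared(ys, fitted, p=1)
```

For this data both values are close to 1, with the adjusted value slightly lower.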
Residual Analysis
- Plot residuals vs fitted values (check homoscedasticity)
- Q-Q plot of residuals (check normality)
- Cook's distance (identify influential points)
- Leverage (identify outliers in x-space)
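Leverage and Cook's distance have simple closed forms when there is one predictor. The sketch below assumes the standard definitions hᵢ = 1/n + (xᵢ − x̄)²/S_xx and Dᵢ = eᵢ² / (k·s²) · hᵢ / (1 − hᵢ)², with k = 2 estimated parameters. The function name `leverage_and_cooks` and the data are hypothetical, and the code requires n > 2 and a fit that is not perfect.

```python
def leverage_and_cooks(xs, ys):
    """Leverage and Cook's distance for each point in a one-predictor fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    k = 2  # estimated parameters: intercept and slope
    s2 = sum(e * e for e in residuals) / (n - k)  # residual variance estimate
    # Leverage grows with the squared distance of x from the mean of x.
    leverages = [1.0 / n + (x - x_bar) ** 2 / s_xx for x in xs]
    # Cook's distance combines residual size with leverage.
    cooks = [(e * e / (k * s2)) * h / (1.0 - h) ** 2
             for e, h in zip(residuals, leverages)]
    return leverages, cooks

# Made-up data with one point far from the others in x.
xs = [1.0, 2.0, 3.0, 4.0, 10.0]
ys = [1.2, 1.9, 3.2, 3.8, 6.0]
leverages, cooks = leverage_and_cooks(xs, ys)
```

In this example the point at x = 10 has the highest leverage and the largest Cook's distance, and the leverages sum to the number of parameters, 2.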
Inference
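Test individual coefficients (H₀: βᵢ = 0; under H₀, t follows a t distribution with n − p − 1 degrees of freedom):
t = β̂ᵢ / SE(β̂ᵢ)
Test overall model significance (H₀: β₁ = ... = βₚ = 0; under H₀, F follows an F distribution with p and n − p − 1 degrees of freedom):
F = (SS_reg/p) / (SS_res/(n-p-1))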
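Both statistics can be computed by hand for a one-predictor fit (p = 1). The sketch below assumes SE(β̂₁) = sqrt(MSE / S_xx) with MSE = SS_res / (n − 2). The function name `slope_t_and_f` and the data are illustrative. It stops at the test statistics, since p-values need a t or F distribution function that the Python standard library does not provide.

```python
import math

def slope_t_and_f(xs, ys):
    """t statistic for the slope and overall F statistic, one predictor."""
    n = len(xs)
    p = 1  # number of predictors
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    ss_reg = ss_tot - ss_res             # variation explained by the regression
    mse = ss_res / (n - p - 1)           # residual mean square
    se_slope = math.sqrt(mse / s_xx)     # standard error of the slope
    t_stat = slope / se_slope
    f_stat = (ss_reg / p) / mse
    return t_stat, f_stat

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
t_stat, f_stat = slope_t_and_f(xs, ys)
```

With a single predictor the two tests agree, because F equals t² up to floating-point rounding.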
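Key Insight: Linear regression is interpretable and provides uncertainty quantification through standard errors, confidence intervals, and tests, but it assumes the mean response is linear in the predictors. Consider transforming variables or using non-linear methods when the pattern is more complex.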
The drag-and-drop interface below lets you place points and see the regression line update in real time. Watch the residuals and R-squared change as you modify the data.
Interactive Visualization
[Interactive visualization: drag-and-drop points with a live regression line, residuals, and R-squared display]