Regression Calculator
Perform simple linear regression on paired X and Y data. Find the best-fit line equation y = mx + b and predict Y values for any given X.
Formula & Calculations
Formula
m = Σ((x − x̄)(y − ȳ)) / Σ(x − x̄)²
b = ȳ − m·x̄
ŷ = mx + b
Where:
- x, y = Paired data points for the independent (X) and dependent (Y) variables
- x̄, ȳ = Means of X and Y respectively
- m = Slope of the regression line (rate of change in Y per unit change in X)
- b = Y-intercept (value of Y when X = 0)
- ŷ = Predicted value of Y for a given X input
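The formulas above translate almost line for line into code. The following is a minimal Python sketch, not the calculator's own implementation; the function names `fit_line` and `predict` and the sample data are assumptions made for illustration.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Return (m, b) for the least-squares line through paired data."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of values")
    if len(xs) < 2:
        raise ValueError("at least two data pairs are required")

    x_bar, y_bar = fmean(xs), fmean(ys)

    # Denominator: sum of squared deviations of X from its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")

    # Numerator: sum of products of the X and Y deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b


def predict(m, b, x):
    """Predicted value y-hat = m*x + b."""
    return m * x + b


# Illustrative data (assumed for this sketch) lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
m, b = fit_line(xs, ys)
print(m, b, predict(m, b, 6))  # prints: 2.0 1.0 13.0
```

The check for identical X values matters because the denominator Σ(x − x̄)² is zero in that case and the slope cannot be computed.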
Assumptions
- The relationship between X and Y is approximately linear.
- Data pairs are independent of each other.
- Least squares regression minimizes the sum of squared vertical distances from data points to the line.
- Enter the X and Y values as comma-separated or space-separated lists, each containing the same number of values.
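The input format in the last item can be handled in a few lines. This is only a sketch of one way to parse such input; how the calculator itself parses values is not documented here, and the helper names `parse_values` and `parse_pairs` are hypothetical.

```python
import re


def parse_values(text):
    """Split a string on commas and/or whitespace and convert to floats."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return [float(t) for t in tokens]


def parse_pairs(x_text, y_text):
    """Parse both inputs and confirm they hold the same number of values."""
    xs, ys = parse_values(x_text), parse_values(y_text)
    if len(xs) != len(ys):
        raise ValueError(
            f"X has {len(xs)} values but Y has {len(ys)}; the lists must match"
        )
    return xs, ys


xs, ys = parse_pairs("1, 2, 3, 4, 5", "3 5 7 9 11")
print(xs)  # prints: [1.0, 2.0, 3.0, 4.0, 5.0]
print(ys)  # prints: [3.0, 5.0, 7.0, 9.0, 11.0]
```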
Calculation Examples
Example 1
The fit gives m = 2 and b = 1. The line y = 2x + 1 passes through all data points exactly. For X = 6, the predicted value is ŷ = 2(6) + 1 = 13.
Example 2
The data points approximately follow y = 2x, with small deviations. The regression line finds the best fit through these points.
Frequently Asked Questions
What is the difference between correlation and regression?
Correlation (r) measures the strength and direction of a linear relationship between two variables. Regression goes further by finding the actual equation of the line (y = mx + b) that best describes the relationship, allowing you to predict Y from X. Correlation is symmetric (r_xy = r_yx); regression is not, because regressing Y on X generally gives a different line than regressing X on Y.
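The symmetry difference can be checked numerically. The sketch below uses small made-up data and hypothetical helper names (`slope`, `pearson_r`); it shows that swapping X and Y leaves r unchanged but changes the slope, and that the product of the two slopes equals r².

```python
from math import sqrt
from statistics import fmean


def _sums(xs, ys):
    """Return (Sxy, Sxx, Syy): sums of products and squares of deviations."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy, sxx, syy


def slope(xs, ys):
    """Least-squares slope when regressing ys on xs."""
    sxy, sxx, _ = _sums(xs, ys)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


# Made-up illustrative data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(pearson_r(xs, ys), pearson_r(ys, xs))  # same value in both directions
print(slope(xs, ys))  # slope of Y regressed on X
print(slope(ys, xs))  # slope of X regressed on Y, generally different
```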
Can I use regression to predict values outside my data range?
Extrapolation (predicting beyond the range of your X data) is risky because the linear relationship may not hold outside your observed range. The regression line is only validated within the range of your sample data. Use predictions cautiously and only within or very near the observed X range.
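One simple safeguard is to flag any prediction whose X lies outside the observed range. The following sketch assumes a slope and intercept have already been fitted; the function name `predict_with_range_check` is hypothetical, and this is not a feature the calculator is known to provide.

```python
def predict_with_range_check(m, b, x, observed_xs):
    """Return (y_hat, is_extrapolation) for a given x.

    is_extrapolation is True when x lies outside [min(observed_xs), max(observed_xs)].
    """
    lo, hi = min(observed_xs), max(observed_xs)
    y_hat = m * x + b
    return y_hat, not (lo <= x <= hi)


# Assumed fit (m = 2, b = 1) and assumed observed X values 1 through 5.
observed_xs = [1, 2, 3, 4, 5]
print(predict_with_range_check(2.0, 1.0, 3, observed_xs))  # prints: (7.0, False)
print(predict_with_range_check(2.0, 1.0, 6, observed_xs))  # prints: (13.0, True)
```

The flag does not say the prediction is wrong, only that the data give no direct evidence that the line still holds at that X.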