Predicting Employee Salary Using Linear Regression

Published: (January 15, 2026 at 03:40 AM EST)
2 min read
Source: Dev.to

Source: Dev.to

Project Overview

In this project I predict an employee’s salary based on their years of experience using a linear regression model.
Linear regression is a statistical method used to model the relationship between a dependent variable and an independent variable.

  • X (Independent Variable) – Years of Experience
  • Y (Dependent Variable) – Salary

Libraries Used

The following Python libraries are used in this project:

  • pandas – handling data frames
  • seaborn & matplotlib – visualization
  • scikit‑learn (sklearn) – data preprocessing, model training and evaluation
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.feature_selection import f_regression

%matplotlib inline

Data Preparation

Read the Excel file that contains the dataset.

Excel Dataset – First 5 rows

Define the X and y variables.
X is stored as a DataFrame (2‑D array) because scikit‑learn expects that shape.

X = df[['YearsExperience']]   # independent variable (must be 2‑D)
y = df['Salary']              # dependent variable

Splitting the Dataset

The data are split into training and testing sets (25 % for testing).

# Train‑test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

Model Validation

To verify that a genuine relationship exists between years of experience and salary, F‑regression is applied.

  • F‑value: 622.5 – measures how well the independent variable explains the dependent variable.
  • p‑value: 0.0 – indicates statistical significance.

(Replace the placeholder URL with the correct image link if needed.)

Conclusion – Linear regression effectively captures the relationship between an employee’s years of experience and their salary, achieving a high R² score and providing an interpretable model (intercept and slope) that can be used for future salary predictions.

Linear Regression Result

Conclusion

This project demonstrates that Linear Regression can effectively model the relationship between years of experience and salary.

Thank you for reading! ❤️

Back to Blog

Related posts

Read more »