Predicting Employee Salary Using Linear Regression
Source: Dev.to
Project Overview
In this project I predict an employee’s salary based on their years of experience using a linear regression model.
Linear regression is a statistical method used to model the relationship between a dependent variable and an independent variable.
- X (Independent Variable) – Years of Experience
- Y (Dependent Variable) – Salary
Libraries Used
The following Python libraries are used in this project:
- pandas – handling data frames
- seaborn & matplotlib – visualization
- scikit‑learn (sklearn) – data preprocessing, model training and evaluation
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.feature_selection import f_regression
%matplotlib inline
Data Preparation
Read the Excel file that contains the dataset.

Define the X and y variables.
X is stored as a DataFrame (2‑D array) because scikit‑learn expects that shape.
X = df[['YearsExperience']] # independent variable (must be 2‑D)
y = df['Salary'] # dependent variable
Splitting the Dataset
The data are split into training and testing sets (25 % for testing).
# Train‑test split
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.25, random_state=42
)
Model Validation
To verify that a genuine relationship exists between years of experience and salary, F‑regression is applied.
- F‑value: 622.5 – measures how well the independent variable explains the dependent variable.
- p‑value: 0.0 – indicates statistical significance.
(Replace the placeholder URL with the correct image link if needed.)
Conclusion – Linear regression effectively captures the relationship between an employee’s years of experience and their salary, achieving a high R² score and providing an interpretable model (intercept and slope) that can be used for future salary predictions.

Conclusion
This project demonstrates that Linear Regression can effectively model the relationship between years of experience and salary.
Thank you for reading! ❤️