How does a machine actually learn from data?

Published: January 14, 2026 at 02:57 AM EST
2 min read
Source: Dev.to

🎯 The Correct Order (Beginner‑Optimal)

Don’t try to master scikit‑learn before you understand:

  • what a model is
  • what loss is
  • what training means
  • what overfitting is

Otherwise, scikit‑learn becomes a black box.
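
To make the first three ideas concrete, here is a minimal sketch in plain NumPy. The data, the function names (model, mse_loss), and the parameter values are made up for illustration; they are assumptions, not something the concepts require.

```python
import numpy as np

# Toy data, made up for illustration: y is roughly 2 * x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

def model(x, w, b):
    """A model is a function with adjustable parameters (here w and b)."""
    return w * x + b

def mse_loss(y_true, y_pred):
    """Loss is a single number that measures how wrong the predictions are."""
    return float(np.mean((y_true - y_pred) ** 2))

# Training means searching for parameters that make the loss smaller.
loss_bad = mse_loss(y, model(x, w=0.0, b=0.0))   # a model that always predicts 0
loss_good = mse_loss(y, model(x, w=2.0, b=0.0))  # a model close to the data
print(loss_bad, loss_good)
```

On this toy data the first loss is roughly 29.7 and the second roughly 0.025, so the second set of parameters is the better model by this measure.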

🧠 Think of scikit‑learn like this

Concepts → why something works
scikit‑learn → how to apply it quickly

If you reverse this order:

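import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0]])  # toy inputs, for illustration only
y = np.array([2.0, 4.0, 6.0])        # toy targets, for illustration only

model = LinearRegression()
model.fit(X, y)  # runs fine, but hides what "fitting" means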

you can run the code, but you don’t know what .fit() actually did: which parameters it chose, or what error it tried to minimize.

✅ What You SHOULD Do Instead (Best Approach)

Step 1️⃣ – Learn the core concepts (no scikit‑learn yet)

Focus on the fundamentals:

  • Supervised learning
  • Regression vs. classification
  • Model = function
  • Loss function
  • Overfitting vs. underfitting
  • Train vs. test behavior

All of this can be explored with basic math intuition and NumPy alone.
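
As a rough illustration of overfitting vs. underfitting and train vs. test behavior, the sketch below fits polynomials of three different degrees to noisy data from a straight line. The data generator, the noise level, and the degrees (0, 1, 7) are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical data: a straight line plus random noise.
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 2.0 * x + 1.0 + rng.normal(0.0, 0.3, size=n)
    return x, y

x_train, y_train = make_data(12)   # small training set
x_test, y_test = make_data(200)    # unseen data from the same source

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

results = {}
for degree in (0, 1, 7):
    # Least-squares polynomial fit on the training data only.
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_err = mse(y_train, np.polyval(coeffs, x_train))
    test_err = mse(y_test, np.polyval(coeffs, x_test))
    results[degree] = (train_err, test_err)
    print(f"degree={degree}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

Training error can only go down as the degree grows, because a higher-degree polynomial can always reproduce a lower-degree one. Test error is the number to watch: degree 0 is too simple for this data (underfitting), while a degree‑7 curve tends to chase the noise in the 12 training points (overfitting). Exact numbers depend on the random seed.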

Step 2️⃣ – Implement Linear Regression from scratch

Use only:

  • NumPy
  • A few lines of math
  • No ML libraries

This answers the question: “How does the model actually learn?”
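
One possible version of this step, assuming a single input feature, mean squared error as the loss, and plain gradient descent as the training procedure. The toy data, learning rate, and number of steps are made-up values that happen to work here, not recommended settings.

```python
import numpy as np

# Toy data, made up for illustration: y = 3x + 1 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=100)
y = 3.0 * x + 1.0 + rng.normal(0.0, 0.05, size=100)

w, b = 0.0, 0.0        # start with a model that knows nothing
learning_rate = 0.5    # step size (an assumed value suited to this data)

for step in range(2000):
    y_pred = w * x + b                   # 1. predict with the current parameters
    error = y_pred - y
    loss = float(np.mean(error ** 2))    # 2. measure the loss (mean squared error)
    grad_w = 2.0 * np.mean(error * x)    # 3. slope of the loss with respect to w
    grad_b = 2.0 * np.mean(error)        #    and with respect to b
    w -= learning_rate * grad_w          # 4. move both parameters downhill
    b -= learning_rate * grad_b

print(f"w={w:.2f}  b={b:.2f}  final loss={loss:.4f}")
```

The loop is the whole story: predict, measure the error, adjust the parameters a little, repeat. The learned w and b should land close to the 3 and 1 used to generate the data.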

Step 3️⃣ – THEN introduce scikit‑learn (lightly)

Once the concept clicks, scikit‑learn becomes:

  • Clean
  • Logical
  • Easy

You’ll instantly understand:

  • .fit()
  • .predict()
  • .score()
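
For reference, a small sketch of those three methods on LinearRegression, using made-up data that follows y = 3x + 1 exactly. The variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data, made up for illustration: exactly y = 3x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 4.0, 7.0, 10.0])

model = LinearRegression()
model.fit(X, y)                            # training: find the slope and intercept
print(model.coef_, model.intercept_)       # the learned parameters

y_new = model.predict(np.array([[4.0]]))   # apply the learned function to a new input
r2 = model.score(X, y)                     # R^2 on the given data; 1.0 is a perfect fit
print(y_new, r2)
```

Because the data lies exactly on a line, the prediction for x = 4 comes out at about 13 and the score at about 1.0. For regressors, .score() returns R², not accuracy.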

❌ What NOT to Do (Common Beginner Mistake)

  • Deep dive into the scikit‑learn API
  • Memorize classifiers and their parameters
  • Jump to advanced models too early

These habits create a fragile understanding.

🧭 Minimal scikit‑learn You May Peek At (Optional)

It’s fine to recognize these utilities without mastering them yet:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

(You’ve likely used them in previous projects.)
Don’t start learning full models until the earlier steps are solid.
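
For recognition only, this is roughly how the two utilities above are commonly used together. The array contents, test_size, and random_state are arbitrary values picked for the demo.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration: 10 samples with 2 features each.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10, dtype=float)

# Hold out 30% of the samples so the model can be checked on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from training data only
X_test_scaled = scaler.transform(X_test)        # reuse those statistics on the test data

print(X_train.shape, X_test.shape)
```

Fitting the scaler on the training split alone keeps information from the test set out of training.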
