Why Mathematics Is Essential for Machine Learning

Published: December 15, 2025, 8:43 AM GMT+9
3 min read
Source: Dev.to

Introduction — The Black Box Myth

Machine Learning is often presented as an essentially algorithmic discipline: you load data, choose a model, train it, and “it works.”
This view is partly true, but fundamentally incomplete.

Behind every Machine Learning algorithm lie precise mathematical structures:

  • notions of distance
  • properties of continuity
  • assumptions of convexity
  • convergence guarantees and theoretical limits that no model can circumvent

👉 Modern Machine Learning is not an alternative to mathematics; it is a direct application of it.

This article sets the general framework for the series: understanding why mathematical analysis is indispensable for understanding, designing, and mastering Machine Learning algorithms.

1. Machine Learning Is Primarily an Optimization Problem

At a fundamental level, almost all ML algorithms solve the same problem:

Minimize a loss function.

Formally, we search for optimal parameters θ* such that

$$\theta^{*} = \arg\min_{\theta} L(\theta)$$

where L(θ) measures the model’s error on the data.

Essential mathematical questions arise immediately:

  • What does it mean to minimize?
  • Does a minimum exist?
  • Is it unique?
  • Can it be reached numerically?
  • At what speed?

These questions are mathematical, not merely algorithmic.
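
The minimization problem above can be sketched in a few lines. This is a minimal illustration, not from the article: the quadratic loss, starting point, learning rate, and iteration count are all illustrative choices.

```python
# Minimal sketch: minimize L(theta) = (theta - 3)^2 by gradient descent.
# The minimizer theta* = 3 is known in closed form, so we can check
# whether the iteration actually reaches it.

def loss(theta):
    return (theta - 3.0) ** 2

def grad(theta):
    return 2.0 * (theta - 3.0)   # derivative of the loss

theta = 0.0    # initial parameter
lr = 0.1       # learning rate (step size)
for _ in range(100):
    theta -= lr * grad(theta)

print(theta)   # approaches the minimizer theta* = 3
```

Even this toy example already raises every question in the list: the minimum exists and is unique here only because the loss is a strictly convex quadratic.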

2. Distance, Norms, and Geometry: Measuring Error Is Not Neutral

Before optimizing anything, we must answer:

How do we measure error?

This leads directly to the notions of distance and norm.

Classic examples:

  • MAE (Mean Absolute Error) ↔ L^1 norm
  • MSE (Mean Squared Error) ↔ L^2 norm
  • Maximum error ↔ L^∞ norm

These choices are not incidental:

  • they change the geometry of the problem
  • they affect robustness to outliers
  • they influence numerical stability
  • they impact gradient descent behavior

👉 Without understanding the geometry induced by a norm, one does not truly understand what the algorithm is optimizing.
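
To see that the choice of norm is not neutral, it is enough to measure the same residuals three ways. A minimal sketch; the data values (including the outlier) are made up for illustration.

```python
# The same prediction errors under three norms. One outlier is enough
# to make the three measures tell very different stories.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 10.0]          # last point is an outlier
residuals = [p - t for p, t in zip(y_pred, y_true)]

# L^1 flavour: average absolute error, relatively robust to the outlier
mae = sum(abs(r) for r in residuals) / len(residuals)
# L^2 flavour: average squared error, the outlier dominates
mse = sum(r * r for r in residuals) / len(residuals)
# L^infinity flavour: worst-case error, only the outlier matters
max_err = max(abs(r) for r in residuals)

print(mae, mse, max_err)   # 1.875, 9.3125, 6.0
```

A single residual of 6 contributes 36 to the squared sum but only 6 to the absolute sum, which is exactly why MSE-trained models chase outliers harder than MAE-trained ones.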

3. Convergence: When Can We Say an Algorithm Works?

A Machine Learning algorithm is often iterative:

$$\theta_{0} \rightarrow \theta_{1} \rightarrow \theta_{2} \rightarrow \dots$$

This raises a crucial question:

Does this sequence converge? And if so, to what?

The answer depends on concepts from analysis:

  • sequences and limits
  • Cauchy sequences
  • completeness
  • continuity

Without these notions, it is impossible to answer practical questions such as:

  • why training diverges
  • why it oscillates
  • why it is slow
  • why two implementations produce different results
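
Divergence and oscillation can both be reproduced with the simplest possible iteration. A sketch, not from the article: gradient descent on L(θ) = θ², whose gradient is 2θ, so each step multiplies θ by (1 − 2·lr) and the sequence converges iff |1 − 2·lr| < 1. The step sizes are illustrative.

```python
# The same update rule converges or diverges depending only on the
# step size: theta_{k+1} = theta_k - lr * 2 * theta_k.

def iterate(lr, steps=50, theta=1.0):
    seq = [theta]
    for _ in range(steps):
        theta = theta - lr * (2.0 * theta)   # gradient of theta^2 is 2*theta
        seq.append(theta)
    return seq

good = iterate(lr=0.1)   # contraction factor 0.8: converges to 0
bad = iterate(lr=1.1)    # factor -1.2: oscillates with growing amplitude

# Successive differences shrink for the convergent (Cauchy-like) sequence
print(abs(good[-1] - good[-2]), abs(bad[-1]))
```

The convergent run is a Cauchy sequence in miniature: consecutive iterates get arbitrarily close. The divergent run answers "why training diverges" with a single constant.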

4. Continuity, Lipschitz Conditions, and Stability

A Machine Learning model must be stable: a small change in the data or parameters should not cause predictions to explode. This stability is formalized by:

  • uniform continuity
  • Lipschitz functions

A function f is L-Lipschitz if, for all x and y,

$$|f(x) - f(y)| \le L\,|x - y|$$

This inequality lies at the core of:

  • model stability
  • learning rate selection
  • convergence guarantees for gradient descent

👉 The Lipschitz constant is not a theoretical detail; it directly controls the speed and stability of learning.
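
The link between the Lipschitz constant and the learning rate can be shown numerically. A sketch with illustrative constants: for a quadratic loss whose gradient is L-Lipschitz, gradient descent is stable only when the learning rate stays below 2/L. Here L = 10, so the threshold is 0.2.

```python
# Gradient descent on 0.5 * L * theta^2, whose gradient theta -> L*theta
# is L-Lipschitz. Each step multiplies theta by (1 - lr*L), so stability
# requires |1 - lr*L| < 1, i.e. lr < 2/L.

L_CONST = 10.0   # Lipschitz constant of the gradient

def grad(theta):
    return L_CONST * theta

def run(lr, steps=100, theta=1.0):
    for _ in range(steps):
        theta -= lr * grad(theta)
    return theta

stable = run(lr=0.19)    # 0.19 < 2/L = 0.2: factor 0.9, converges
unstable = run(lr=0.21)  # 0.21 > 2/L = 0.2: factor -1.1, diverges
print(abs(stable), abs(unstable))
```

A two-percent change in the learning rate, straddling 2/L, is the entire difference between convergence and blow-up: the Lipschitz constant really does control stability directly.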

5. Convexity: Why Some Problems Are Easy… and Others Are Not

Convexity is arguably the most important mathematical property in optimization.

A convex function has:

  • no spurious local minima: every local minimum is global
  • a unique global minimum when the function is strictly convex

This is why methods such as:

  • linear regression
  • support vector machines
  • certain regularization problems

benefit from strong theoretical guarantees.

By contrast, deep neural networks are non‑convex, yet they still work thanks to particular structures and effective heuristics.

👉 Understanding convexity makes it possible to know when guarantees exist — and when they do not.
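
The convex/non-convex distinction can be made tangible by testing the defining inequality on sample points. A minimal sketch checking midpoint convexity on a grid; the two test functions and the interval are illustrative choices.

```python
# Test midpoint convexity f((x+y)/2) <= (f(x)+f(y))/2 on a grid.
# Passing on samples only suggests convexity, but a single violation
# proves non-convexity.

def looks_convex(f, lo=-2.0, hi=2.0, n=21):
    pts = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    for x in pts:
        for y in pts:
            mid = 0.5 * (x + y)
            if f(mid) > 0.5 * (f(x) + f(y)) + 1e-9:
                return False    # inequality violated: not convex
    return True

def quadratic(t):
    return t ** 2               # convex: one global minimum

def double_well(t):
    return t ** 4 - t ** 2      # non-convex: two separate local minima

print(looks_convex(quadratic), looks_convex(double_well))
```

The double-well function is exactly the kind of landscape where gradient descent can settle into either basin depending on initialization, which is why non-convex training comes with no uniqueness guarantee.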

6. Theory vs. Practice: What Mathematics Guarantees (and What It Does Not)

A crucial point to understand from the outset:

Mathematics guarantees properties, not miraculous performance.

It can tell us:

  • whether a solution exists
  • whether it is unique
  • whether an algorithm converges
  • how fast it converges

It cannot guarantee:

  • good data
  • good generalization
  • an unbiased model

Without mathematical insight, we proceed blindly.

Conclusion — Understand Before You Optimize

Modern Machine Learning rests on three fundamental mathematical pillars:

  • Geometry (norms, distances)
  • Analysis (continuity, convergence, Lipschitz conditions)
  • Optimization (convexity, gradient descent)

Ignoring these foundations amounts to:

  • applying recipes without understanding their limits
  • misdiagnosing failures
  • overcomplicating simple problems

👉 Understanding the mathematical analysis of Machine Learning is not theory for theory’s sake; it is about gaining control, robustness, and intuition.
