Why Mathematics Is Essential for Machine Learning
Introduction — The Black Box Myth
Machine Learning is often presented as an essentially algorithmic discipline: you load data, choose a model, train it, and “it works.”
This view is partly true, but fundamentally incomplete.
Behind every Machine Learning algorithm lie precise mathematical structures:
- notions of distance
- properties of continuity
- assumptions of convexity
- convergence guarantees and theoretical limits that no model can circumvent
👉 Modern Machine Learning is not an alternative to mathematics; it is a direct application of it.
This article sets the general framework for the series: understanding why mathematical analysis is indispensable for understanding, designing, and mastering Machine Learning algorithms.
1. Machine Learning Is Primarily an Optimization Problem
At a fundamental level, almost all ML algorithms solve the same problem:
Minimize a loss function.
Formally, we search for parameters θ such that
[ \theta^{*} = \arg\min_{\theta} L(\theta) ]
where L(θ) measures the model’s error on the data.
Essential mathematical questions arise immediately:
- What does it mean to minimize?
- Does a minimum exist?
- Is it unique?
- Can it be reached numerically?
- At what speed?
These questions are mathematical, not merely algorithmic.
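To make this concrete, here is a minimal sketch, assuming a simple least-squares loss and plain gradient descent (neither is prescribed above; the data is synthetic, purely for illustration):

```python
import numpy as np

# Minimal sketch: minimize L(theta) = mean((X @ theta - y)^2)
# with plain gradient descent, on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_theta = np.array([2.0, -1.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=100)

def loss(theta):
    return np.mean((X @ theta - y) ** 2)

def grad(theta):
    # Gradient of the mean squared error.
    return 2 * X.T @ (X @ theta - y) / len(y)

theta = np.zeros(3)
lr = 0.1  # whether this step size is "safe" is itself a mathematical question
for _ in range(500):
    theta -= lr * grad(theta)

print(loss(theta))  # close to the noise floor if the descent converged
```

Every question above is hiding in these few lines: whether the loop converges, how fast, and whether the answer is unique.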
2. Distance, Norms, and Geometry: Measuring Error Is Not Neutral
Before optimizing anything, we must answer:
How do we measure error?
This leads directly to the notions of distance and norm.
Classic examples:
- MAE (Mean Absolute Error) ↔ L¹ norm
- MSE (Mean Squared Error) ↔ L² norm
- Maximum error ↔ L^∞ norm
These choices are not incidental:
- they change the geometry of the problem
- they affect robustness to outliers
- they influence numerical stability
- they impact gradient descent behavior
👉 Without understanding the geometry induced by a norm, one does not truly understand what the algorithm is optimizing.
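The point is easy to verify numerically. A small sketch (with made-up residuals) shows how the same error vector scores very differently under each norm, and how a single outlier dominates L² and L^∞ far more than L¹:

```python
import numpy as np

# Hypothetical residuals: nine small errors and one outlier.
errors = np.array([0.1] * 9 + [5.0])

mae = np.mean(np.abs(errors))      # L1 flavour: barely moved by the outlier
mse = np.mean(errors ** 2)         # L2 flavour: the outlier dominates
max_err = np.max(np.abs(errors))   # L-infinity: only the outlier matters

print(f"MAE: {mae:.3f}, MSE: {mse:.3f}, Max: {max_err:.3f}")
# MAE: 0.590, MSE: 2.509, Max: 5.000
```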
3. Convergence: When Can We Say an Algorithm Works?
A Machine Learning algorithm is often iterative:
[ \theta_{0} \rightarrow \theta_{1} \rightarrow \theta_{2} \rightarrow \dots ]
This raises a crucial question:
Does this sequence converge? And if so, to what?
The answer depends on concepts from analysis:
- sequences and limits
- Cauchy sequences
- completeness
- continuity
Without these notions, it is impossible to answer practical questions such as:
- why training diverges
- why it oscillates
- why it is slow
- why two implementations produce different results
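A one-dimensional sketch (a quadratic chosen purely for illustration) makes these questions tangible: the same update rule converges, oscillates, or diverges depending only on the step size:

```python
# Gradient descent on f(x) = x^2, whose gradient is 2x.
# The iterates satisfy x_{k+1} = (1 - 2*lr) * x_k, so the behaviour
# of the sequence is governed entirely by the factor (1 - 2*lr).
def iterate(lr, x0=1.0, steps=10):
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

print(iterate(0.1))   # factor  0.8: converges smoothly toward 0
print(iterate(0.9))   # factor -0.8: oscillates, but still converges
print(iterate(1.1))   # factor -1.2: the sequence diverges
```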
4. Continuity, Lipschitz Conditions, and Stability
A Machine Learning model must be stable: a small change in the data or parameters should not cause predictions to explode. This stability is formalized by:
- uniform continuity
- Lipschitz functions
A function f is Lipschitz with constant L ≥ 0 if, for all x and y,
[ |f(x) - f(y)| \le L\,|x - y| ]
This inequality lies at the core of:
- model stability
- learning rate selection
- convergence guarantees for gradient descent
👉 The Lipschitz constant is not a theoretical detail; it directly controls the speed and stability of learning.
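As a sketch (a quadratic whose gradient's Lipschitz constant is the largest eigenvalue of its matrix; the numbers are illustrative), the constant dictates the largest safe step size:

```python
import numpy as np

# For f(theta) = 0.5 * theta^T A theta, the gradient A @ theta is
# Lipschitz with constant L = largest eigenvalue of A.
A = np.diag([1.0, 10.0, 100.0])
L = np.max(np.linalg.eigvalsh(A))   # here L = 100

theta = np.ones(3)
lr = 1.0 / L                        # a classic "safe" step size
for _ in range(2000):
    theta -= lr * A @ theta

print(theta)  # all coordinates driven toward 0; any lr above 2/L diverges
```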
5. Convexity: Why Some Problems Are Easy… and Others Are Not
Convexity is arguably the most important mathematical property in optimization.
For a convex function:
- every local minimum is a global minimum, so there are no traps
- if the function is strictly convex, the minimizer is unique
This is why approaches such as:
- linear regression
- support vector machines
- many regularized formulations (e.g., ridge or Lasso)
benefit from strong theoretical guarantees.
By contrast, deep neural networks are non-convex, yet they work in practice thanks to the particular structure of their loss landscapes and effective training heuristics.
👉 Understanding convexity makes it possible to know when guarantees exist — and when they do not.
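A one-dimensional sketch (functions chosen here only for illustration) shows the practical difference: on a convex function, gradient descent reaches the global minimum from any starting point; on a non-convex one, the starting point decides which minimum it finds:

```python
def descend(grad_fn, x0, lr=0.01, steps=5000):
    # Plain gradient descent in one dimension.
    x = x0
    for _ in range(steps):
        x -= lr * grad_fn(x)
    return x

# Convex: f(x) = x^2. Every start reaches the unique global minimum at 0.
convex_grad = lambda x: 2 * x
print(descend(convex_grad, -3.0), descend(convex_grad, 4.0))

# Non-convex: f(x) = x^4 - 3x^2 + x has two distinct local minima,
# so the answer now depends on where we start.
nonconvex_grad = lambda x: 4 * x**3 - 6 * x + 1
print(descend(nonconvex_grad, -2.0), descend(nonconvex_grad, 2.0))
```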
6. Theory vs. Practice: What Mathematics Guarantees (and What It Does Not)
A crucial point to understand from the outset:
Mathematics guarantees properties, not miraculous performance.
It can tell us:
- whether a solution exists
- whether it is unique
- whether an algorithm converges
- how fast it converges
It cannot guarantee:
- good data
- good generalization
- an unbiased model
Without mathematical insight, we proceed blindly.
Conclusion — Understand Before You Optimize
Modern Machine Learning rests on three fundamental mathematical pillars:
- Geometry (norms, distances)
- Analysis (continuity, convergence, Lipschitz conditions)
- Optimization (convexity, gradient descent)
Ignoring these foundations amounts to:
- applying recipes without understanding their limits
- misdiagnosing failures
- overcomplicating simple problems
👉 Understanding the mathematical analysis of Machine Learning is not theory for theory’s sake; it is about gaining control, robustness, and intuition.