๐Ÿต ์ฐจ์™€ ํ•จ๊ป˜ํ•˜๋Š” ๋จธ์‹ ๋Ÿฌ๋‹ ์„ค๋ช… โ€” Zeroโ€‘Knowledge ๋น„์œ 

Published: December 20, 2025, 11:51 PM (GMT+9)
6 min read
Source: Dev.to

โญ 1. ์™„๋ฒฝํ•œ ์ฐจ ํ•œ ์ž” ๋งŒ๋“ค๊ธฐ๋ฅผ ๋ฐฐ์šฐ๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค

์นœ๊ตฌ์—๊ฒŒ ์ฐจ๋ฅผ ๋งŒ๋“ค์–ด ์ฃผ๊ณ  ์‹ถ์Šต๋‹ˆ๋‹ค. ๊ทธ ์นœ๊ตฌ๋Š” ๋งค์šฐ ๊นŒ๋‹ค๋กญ์Šต๋‹ˆ๋‹ค.

  • ์นœ๊ตฌ๋Š” ์™„๋ฒฝํ•œ ์ฐจ ๋ง›์„ ์ •ํ™•ํžˆ ์•Œ๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค โ€” ๋‹น์‹ ์˜ ML ๋ชจ๋ธ์€ ๋ชจ๋ฆ…๋‹ˆ๋‹ค.
  • ๋‹น์‹ ์ด ๋งŒ๋“  ํ•œ ์ž” = ํ•˜๋‚˜์˜ ์˜ˆ์ธก
  • ์นœ๊ตฌ์˜ ๋ง› = ์‹ค์ œ ์ •๋‹ต

โญ 2. ๋น„์šฉ ํ•จ์ˆ˜ = ์ฐจ ๋ง›์ด ์–ผ๋งˆ๋‚˜ ๋‚˜์œ๊ฐ€

์ฒซ ์ž”์„ ๋งŒ๋“ค๋ฉด ์นœ๊ตฌ๊ฐ€ ๋งํ•ฉ๋‹ˆ๋‹ค:

  • โ€œ์„คํƒ•์ด ๋„ˆ๋ฌด ๋งŽ์•„.โ€
  • โ€œ์ฐจ ๊ฐ€๋ฃจ๊ฐ€ ๋ถ€์กฑํ•ด.โ€
  • โ€œ๋„ˆ๋ฌด ๋ฌผ ๊ฐ™์•„.โ€

์ด ํ”ผ๋“œ๋ฐฑ์€ ์ฐจ๊ฐ€ ์™„๋ฒฝํ•œ ๋ง›(์˜ค์ฐจ)์—์„œ ์–ผ๋งˆ๋‚˜ ๋–จ์–ด์ ธ ์žˆ๋Š”์ง€๋ฅผ ์•Œ๋ ค์ค๋‹ˆ๋‹ค.

  • ์ฐจ๊ฐ€ ๋งค์šฐ ๋‚˜์˜๋ฉด, ๋น„์šฉ์ด ๋†’๋‹ค.
  • ์ฐจ๊ฐ€ ๊ฑฐ์˜ ์™„๋ฒฝ์— ๊ฐ€๊น๋‹ค๋ฉด, ๋น„์šฉ์ด ๋‚ฎ๋‹ค.

๋น„์šฉ ํ•จ์ˆ˜ = ์ฐจ ์‹ค์ˆ˜ ์ ์ˆ˜ โ€“ ์ด๊ฒƒ์€ ๋‹ค์Œ์„ ์ธก์ •ํ•ฉ๋‹ˆ๋‹ค:

  • ๋ ˆ์‹œํ”ผ๊ฐ€ ์–ผ๋งˆ๋‚˜ ํ‹€๋ ธ๋Š”์ง€
  • ์ด์ƒ์ ์ธ ๋ง›์—์„œ ์–ผ๋งˆ๋‚˜ ๋–จ์–ด์ ธ ์žˆ๋Š”์ง€
  • ์–ผ๋งˆ๋‚˜ ๊ณ ์ณ์•ผ ํ•˜๋Š”์ง€

โญ 3. Gradient Descent = ์ฐจ๋ฅผ ๋‹จ๊ณ„๋ณ„๋กœ ๊ณ ์น˜๊ธฐ

์™„๋ฒฝํ•œ ๋ ˆ์‹œํ”ผ๋ฅผ ๋ชจ๋ฅธ ์ฑ„, ์ฒœ์ฒœํžˆ ๊ฐœ์„ ํ•ฉ๋‹ˆ๋‹ค:

  • ์„คํƒ•์„ ์กฐ๊ธˆ ์ค„์ธ๋‹ค
  • ์šฐ์œ ๋ฅผ ์กฐ๊ธˆ ๋”ํ•œ๋‹ค
  • ์ฐจ ๊ฐ€๋ฃจ๋ฅผ ์•ฝ๊ฐ„ ๋Š˜๋ฆฐ๋‹ค

๊ฐ๊ฐ์˜ ๋ณ€ํ™”๋Š” ์ž‘์€ ๋ณด์ •์ด๋ฉฐ, ๋‚˜์œ ๋ง›์„ ์ค„์—ฌ์ค๋‹ˆ๋‹ค.

Gradient Descent = ๋งค๋ฒˆ ์‹ค์ˆ˜๋ฅผ ์ค„์ด๋Š” ์ž‘์€ ๊ฑธ์Œ๋“ค์„ ์ทจํ•˜๋Š” ๊ฒƒ

๋ฐ˜๋ณต ๋ฃจํ”„:

  1. ์ฐจ๋ฅผ ๋งŒ๋“ ๋‹ค
  2. ํ”ผ๋“œ๋ฐฑ์„ ๋ฐ›๋Š”๋‹ค
  3. ๋ ˆ์‹œํ”ผ๋ฅผ ์กฐ์ •ํ•œ๋‹ค
  4. ๋ฐ˜๋ณตํ•œ๋‹ค

์ด๋Š” ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์ด ๊ฐ€์ค‘์น˜๋ฅผ ์กฐ์ •ํ•˜๋Š” ๋ฐฉ์‹๊ณผ ์œ ์‚ฌํ•ฉ๋‹ˆ๋‹ค.
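
As a minimal sketch of that loop (my own toy example, with a single invented "sugar" parameter and a squared-error cost):

```python
# Toy example: gradient descent on one hypothetical parameter (sugar).
# Cost is (sugar - ideal)**2, so its gradient is 2 * (sugar - ideal).

def gradient(sugar, ideal=3.0):
    return 2 * (sugar - ideal)  # slope of the cost: which direction is "worse"?

sugar = 8.0    # first cup: far too sweet
alpha = 0.1    # step size (more on this in the next section)
for cup in range(20):
    sugar -= alpha * gradient(sugar)  # brew, get feedback, adjust, repeat

print(round(sugar, 2))  # ~3.06: after 20 cups the recipe is nearly ideal
```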

โญ 4. Learning Rate (ฮฑ) = ๊ฐ ๋ ˆ์‹œํ”ผ ์ˆ˜์ •์ด ์–ผ๋งˆ๋‚˜ ํฐ๊ฐ€

ํ•™์Šต๋ฅ ์€ ๊ฐ ์‹ค์ˆ˜ ํ›„ ์กฐ์ • ํฌ๊ธฐ๋ฅผ ์ œ์–ดํ•ฉ๋‹ˆ๋‹ค.

  • ฮฑ๊ฐ€ ์ž‘์œผ๋ฉด โ†’ ์„คํƒ•์„ ์•„์ฃผ ์กฐ๊ธˆ๋งŒ ์ค„์ž„ โ†’ ์ง„ํ–‰์ด ๋А๋ฆผ.
  • ฮฑ๊ฐ€ ๋„ˆ๋ฌด ํฌ๋ฉด โ†’ ์„คํƒ•์„ ๋„ˆ๋ฌด ๋งŽ์ด ๋นผ์„œ โ†’ ์ฐจ๊ฐ€ ์“ด๋ง›์ด ๋‚˜๊ณ  โ†’ ๊ณผ๋„ํ•˜๊ฒŒ ๋ณด์ •ํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
  • ฮฑ๊ฐ€ ์ ๋‹นํ•˜๋ฉด โ†’ ์ ๋‹นํ•œ ์กฐ์ •์œผ๋กœ ๊พธ์ค€ํžˆ ์™„๋ฒฝํ•œ ๋ง›์— ๋‹ค๊ฐ€๊ฐ‘๋‹ˆ๋‹ค.

ํ•™์Šต๋ฅ  = ๋ ˆ์‹œํ”ผ๋ฅผ ๋ฐฐ์šฐ๋Š” ์†๋„.
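
Reusing the toy loop from the previous section, here is how the three cases play out (illustrative numbers only):

```python
# Same toy update as before, tried with three different step sizes.

def run(alpha, sugar=8.0, ideal=3.0, cups=20):
    for _ in range(cups):
        sugar -= alpha * 2 * (sugar - ideal)
    return round(sugar, 2)

print(run(0.01))  # ~6.34   too small: barely closer after 20 cups
print(run(0.1))   # ~3.06   just right: steadily approaches the ideal 3.0
print(run(1.1))   # ~194.69 too large: each fix overshoots worse (bitter!)
```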

โญ 5. ์ˆ˜๋ ด ์•Œ๊ณ ๋ฆฌ์ฆ˜ = ์–ธ์ œ ์กฐ์ •์„ ๋ฉˆ์ถœ์ง€ ์•Œ๊ธฐ

At first, improvements are large:

  • Cost drops 70 → 50 → 30 → 15

Later, progress becomes tiny:

  • 15 → 14.5 → 14.4 → 14.39

Eventually:

🎉 You can't improve the taste any further.

Extra changes donโ€™t help.

Convergence = the moment your recipe is good enough: stop training.

The convergence algorithm checks:

  • Is improvement tiny?
  • Is cost stable?
  • Should training stop?
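
A minimal version of such a check, continuing the toy example above (the tolerance of 1e-4 is an arbitrary choice of mine):

```python
# Stop training once one more adjustment barely improves the cost.

def train(sugar=8.0, ideal=3.0, alpha=0.1, tol=1e-4, max_cups=1000):
    prev_cost = (sugar - ideal) ** 2
    for cup in range(1, max_cups + 1):
        sugar -= alpha * 2 * (sugar - ideal)   # one small correction
        new_cost = (sugar - ideal) ** 2
        if prev_cost - new_cost < tol:         # improvement is tiny?
            return sugar, cup                  # converged: stop here
        prev_cost = new_cost
    return sugar, max_cups

sugar, cups = train()
print(round(sugar, 3), cups)  # ~3.012 after 27 cups: good enough, stop
```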

โญ 6. ์™œ ์ด๋Ÿฌํ•œ ๊ฐœ๋…๋“ค์ด ํ•จ๊ป˜ ์ž‘๋™ํ•˜๋Š”๊ฐ€ (๋น ๋ฅธ ์ฐจ ์š”์•ฝ)

Concept                  Tea-Making Analogy                       Purpose
Cost Function            "How bad does this taste?"               Measures the error
Gradient Descent         "Let me fix it one step at a time."      Improves gradually
Learning Rate (α)        "How big should each correction be?"     Controls learning speed
Convergence Algorithm    "The taste is perfect now. Stop."        Stops the training

โญ 7. ์„ฑ๋Šฅ ์ง€ํ‘œ = ์ฐจ๋ฅผ ํ‰๊ฐ€ํ•˜๋Š” ๋‹ค์–‘ํ•œ ๋ฐฉ๋ฒ•

๋งŽ์€ ๊ณ ๊ฐ์—๊ฒŒ ์ฐจ๋ฅผ ํŒ๋งคํ•˜๊ณ  ์žˆ๋‹ค๊ณ  ์ƒ์ƒํ•ด ๋ณด์„ธ์š”. ์‚ฌ๋žŒ๋งˆ๋‹ค ํ‰๊ฐ€ ๊ธฐ์ค€์ด ๋‹ค๋ฆ…๋‹ˆ๋‹ค:

  • Accuracy(์ •ํ™•๋„) โ€“ โ€œ๋‚ด ์ฐจ๋ฅผ ์ข‹์•„ํ•œ ๊ณ ๊ฐ์€ ๋ช‡ ๋ช…์ธ๊ฐ€?โ€
  • Precision(์ •๋ฐ€๋„) โ€“ โ€œ์ด ์ปต์ด ์ข‹๋‹ค๊ณ  ๋งํ–ˆ์„ ๋•Œ, ์–ผ๋งˆ๋‚˜ ์ž์ฃผ ๋งž์•˜๋Š”๊ฐ€?โ€
  • Recall(์žฌํ˜„์œจ) โ€“ โ€œ์ข‹์€ ์ฐจ๋ฅผ ์›ํ•  ์‚ฌ๋žŒ๋“ค ์ค‘ ์‹ค์ œ๋กœ ๋ช‡ ๋ช…์—๊ฒŒ ์ œ๊ณตํ–ˆ๋Š”๊ฐ€?โ€
  • F1โ€‘Score โ€“ ์ •๋ฐ€๋„์™€ ์žฌํ˜„์œจ ์‚ฌ์ด์˜ ๊ท ํ˜•: ๋‚ด๊ฐ€ ์ผ๊ด€๋˜๊ฒŒ ์ข‹์€๊ฐ€?
  • ROCโ€‘AUC โ€“ โ€œ์ฐจ๋ฅผ ์ข‹์•„ํ•˜๋Š” ์‚ฌ๋žŒ๊ณผ ๊ทธ๋ ‡์ง€ ์•Š์€ ์‚ฌ๋žŒ์„ ์–ผ๋งˆ๋‚˜ ์ž˜ ๊ตฌ๋ถ„ํ•  ์ˆ˜ ์žˆ๋Š”๊ฐ€?โ€
    • ๋†’์€ AUC โ†’ ๊นŒ๋‹ค๋กœ์šด ์‚ฌ๋žŒ๋“ค์กฐ์ฐจ ๋ง›์˜ ํ’ˆ์งˆ์— ๋™์˜ํ•œ๋‹ค.

โญ 8. ํ•˜๋‚˜์˜ ์ฐจ ์ด์•ผ๊ธฐ ์† ๋ชจ๋“  ๊ฐœ๋…

1๏ธโƒฃ ์ฐจ๋ฅผ ๋งŒ๋“ ๋‹ค โ†’ prediction
2๏ธโƒฃ ์นœ๊ตฌ๊ฐ€ ๋ง›์„ ๋ณธ๋‹ค โ†’ cost function
3๏ธโƒฃ ๋‹น์‹ ์ด ์กฐ์ •ํ•œ๋‹ค โ†’ gradient descent
4๏ธโƒฃ ์–‘์„ ํ˜„๋ช…ํ•˜๊ฒŒ ์กฐ์ ˆํ•œ๋‹ค โ†’ learning rate
5๏ธโƒฃ ์™„๋ฒฝํ•ด์งˆ ๋•Œ ๋ฉˆ์ถ˜๋‹ค โ†’ convergence
6๏ธโƒฃ ๋งŽ์€ ์‚ฌ๋žŒ์—๊ฒŒ ์ œ๊ณตํ•œ๋‹ค โ†’ performance metrics

์ด์ œ ์ฐจ๋ฅผ ํ†ตํ•ด ๋จธ์‹ ๋Ÿฌ๋‹ ๋ชจ๋ธ์ด ํ•™์Šตํ•˜๊ณ  ํ‰๊ฐ€๋˜๋Š” ๊ณผ์ •์„ ๊ทธ๋Œ€๋กœ ์žฌํ˜„ํ–ˆ์Šต๋‹ˆ๋‹ค! ๐Ÿต

🎉 Final Tea Summary

  • Cost Function = the taste error
  • Gradient Descent = improving the recipe step by step
  • Learning Rate (α) = the size of each fix
  • Convergence = stopping when the recipe is right
  • Performance Metrics = judging the tea's quality across many people

Machine learning ≈ learning to brew great tea through feedback and gradual improvement 🍵✨
