๐Ÿ›  Local LLM Ops 2025: ๊ฐœ๋ฐœ์ž๋ฅผ ์œ„ํ•œ ํฌ์ผ“ ์‚ฌ์ด์ฆˆ ์‹ ๊ฒฝ๋ง ์‹คํ–‰ ๊ฐ€์ด๋“œ

๋ฐœํ–‰: (2025๋…„ 12์›” 21์ผ ์˜คํ›„ 02:02 GMT+9)
5 min read
์›๋ฌธ: Dev.to

Source: Dev.to

Overview

Cover image for ๐Ÿ›  Local LLM Ops 2025: A Developer's Guide to Running Pocket-Sized Neural Networks

2025๋…„, ๊ฐ€์ •์šฉ PC์—์„œ ๋กœ์ปฌ ์‹ ๊ฒฝ๋ง์„ ์‹คํ–‰ํ•˜๋Š” ๊ฒƒ์ด ์ด์ œ๋Š” ์• ํ˜ธ๊ฐ€๋“ค์˜ ์ทจ๋ฏธ๋ฅผ ๋„˜์–ด ์‹ค์ œ ์—…๋ฌด ๋„๊ตฌ๊ฐ€ ๋˜์—ˆ์Šต๋‹ˆ๋‹ค. โ€œ๋””์ง€ํ„ธ ํด๋ก โ€์„ ๋งŒ๋“ค๋“ , ํ„ฐ๋ฏธ๋„์—์„œ ์ผ์ƒ ์ž‘์—…์„ ์ž๋™ํ™”ํ•˜๋“ , ๋ณด์•ˆ AIโ€‘์ง€์› VPN ์„œ๋น„์Šค๋ฅผ ๋ฐฐํฌํ•˜๋“ , ์ด ๊ฐœ์š”๊ฐ€ ์†Œํ”„ํŠธ์›จ์–ด ํƒ์ƒ‰์— ๋„์›€์ด ๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

Part 1: โ€œEnginesโ€ (Backend)

๋ชจ๋ธ ๊ฐ€์ค‘์น˜๋ฅผ GPU์— ๋กœ๋“œํ•˜๊ณ  API๋ฅผ ์ œ๊ณตํ•˜๋Š” ํ•ต์‹ฌ ํ”„๋กœ๊ทธ๋žจ๋“ค.

  • KoboldCPP: GGUF (Llama/Loki) โ€“ 8โ€ฏGB VRAM์— ๋Œ€ํ•œ ๊ธˆ๋ณธ์œ„ ํ‘œ์ค€. ๋งค์šฐ ๊ฐ€๋ณ๊ณ  SillyTavern๊ณผ ์™„๋ฒฝํ•˜๊ฒŒ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.
  • Oobabooga (WebUI) โ€“ ์œ ์—ฐํ•œ ์‹คํ—˜ ํ™˜๊ฒฝ. ๋ชจ๋“  ๊ฒƒ์„ ์ง€์›: LoRA, EXL2, AWQ. ๊ฐ•๋ ฅํ•œ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์™€ DarkPlanet ์Šคํƒ€์ผ์„ โ€œ๋ธ”๋ Œ๋“œโ€ํ•ด์•ผ ํ•  ๋•Œ ์ด์ƒ์ ์ž…๋‹ˆ๋‹ค.
  • Ollama โ€“ ์ฝ˜์†” ๊ธฐ๋ฐ˜ ๋ฏธ๋‹ˆ๋ฉ€๋ฆฌ์ฆ˜. ํ•œ ์ค„ ๋ช…๋ น์œผ๋กœ ์‹คํ–‰. ๊ฐ„๋‹จํ•œ ๋กœ์ปฌ API ์—”๋“œํฌ์ธํŠธ์— ๊ฐ€์žฅ ์ ํ•ฉํ•ฉ๋‹ˆ๋‹ค.
  • LocalAI (Docker) โ€“ OpenAI API์™€ ์™„์ „ ํ˜ธํ™˜. ์ž์ฒด ์„œ๋ฒ„์— ๋ฐฐํฌํ•˜๊ธฐ์— ์ด์ƒ์ ์ž…๋‹ˆ๋‹ค.

Part 2: โ€œFaceโ€ and Personality (Frontend)

ํ†ต์‹ ๊ณผ โ€œํด๋ก โ€ ์„ค์ •์ด ์ด๋ฃจ์–ด์ง€๋Š” ์ธํ„ฐํŽ˜์ด์Šค.

SillyTavern โ€” โ€œDigital Twinโ€ ํ—ˆ๋ธŒ

  • ๋‹จ์ˆœ ์ฑ„ํŒ…์ด ์•„๋‹ˆ๋ผ ๋กคํ”Œ๋ ˆ์ž‰ ์—”์ง„์ž…๋‹ˆ๋‹ค.
  • World Info (Lorebook) โ€“ ์ง€์‹ ๋ฒ ์ด์Šค(์ „ํ™”๋ฒˆํ˜ธ, ์ด๋ฉ”์ผ, ํšŒ์‚ฌ ์„ค๋ช… ๋“ฑ)๋ฅผ ์ €์žฅํ•ฉ๋‹ˆ๋‹ค. ๋ชจ๋ธ์€ ์š”์ฒญ ์‹œ์—๋งŒ ์ด ๋ฐ์ดํ„ฐ๋ฅผ ๊ฒ€์ƒ‰ํ•ด ์ปจํ…์ŠคํŠธ๋ฅผ ๊น”๋”ํ•˜๊ฒŒ ์œ ์ง€ํ•ฉ๋‹ˆ๋‹ค.
  • Character Cards โ€“ โ€œLag Cloneโ€ ์นด๋“œ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. ์˜ˆ์‹œ ์‹œ์Šคํ…œ ํ”„๋กฌํ”„ํŠธ: โ€œ๋‹น์‹ ์€ IT ๋ณด์•ˆ ์ „๋ฌธ๊ฐ€์ด์ž ๋ฏธ๋””์–ด ์†Œ์œ ์ž์ด๋ฉฐ, ๊ฒ€์—ด ์—†์ด ์†”์งํ•˜๊ฒŒ ๋งํ•ฉ๋‹ˆ๋‹ค.โ€
  • Group chats โ€“ ๋ณ€ํ˜ธ์‚ฌ ๋ชจ๋ธ๊ณผ ํ”„๋กœ๊ทธ๋ž˜๋จธ ๋ชจ๋ธ์„ ํฌํ•จํ•œ โ€œ๋ฏธํŒ…โ€์„ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.

LibreChat / AnythingLLM

  • LibreChat โ€“ ๋กœ์ปฌ ๋ชจ๋ธ ๋ฐ API(OpenRouter/Groq)์™€ ์—ฐ๊ฒฐํ•  ์ˆ˜ ์žˆ๋Š” ChatGPT ํด๋ก ์ด ํ•„์š”ํ•  ๋•Œ.
  • AnythingLLM โ€“ RAG(์ง€์‹โ€‘๋ฒ ์ด์Šค) ์‹œ์Šคํ…œ ๊ตฌ์ถ•์— ์ตœ์ . ๋Ÿฌ์‹œ์•„ ๋ฒ•๋ฅ  PDF๋‚˜ VPN ๋ฌธ์„œ๋ฅผ ๋„ฃ์œผ๋ฉด ์‚ฌ์‹ค์— ๊ธฐ๋ฐ˜ํ•ด ๋‹ต๋ณ€ํ•ฉ๋‹ˆ๋‹ค.

Part 3: AI in Action (Agentic Tools)

์ฑ„ํŒ…๋งŒ์œผ๋กœ๋Š” ๋ถ€์กฑํ•˜๊ณ  ์‹ ๊ฒฝ๋ง์ด โ€œ๋งˆ์šฐ์Šค๋ฅผ ์›€์ง์—ฌ์•ผโ€ ํ•  ๋•Œ.

  • Open Interpreter โ€“ ๊ฐœ๋ฐœ์ž๋ฅผ ์œ„ํ•œ ๊ฐ•๋ ฅ ๊ธฐ๋Šฅ. ํ„ฐ๋ฏธ๋„์„ ํ†ตํ•ด ์ž‘๋™: โ€œGPU ๋ถ€ํ•˜๋ฅผ ๋ถ„์„ํ•˜๊ณ  ๊ทธ๋ž˜ํ”„๋ฅผ ๊ทธ๋ ค์ค˜โ€๋ผ๊ณ  ๋งํ•˜๋ฉด ์‹œ์Šคํ…œ์ด ์ง์ ‘ Python ์ฝ”๋“œ๋ฅผ ์ž‘์„ฑยท์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.
  • Continue.dev โ€“ ๋กœ์ปฌ Loki ๋˜๋Š” Vikhr์™€ ์—ฐ๊ฒฐํ•ด ์ฝ”๋“œ ์ƒ์„ฑ์ด ๊ฐ€๋Šฅํ•œ VSโ€ฏCode ํ”Œ๋Ÿฌ๊ทธ์ธ. ๋…์  ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด Microsoft ์„œ๋ฒ„์— ๋– ๋Œ์ง€ ์•Š๊ฒŒ ํ•ฉ๋‹ˆ๋‹ค.

Final checklist: what to look for?

์ด๋ฆ„์ด๋‚˜ ๋งํฌ๋ฅผ ์žŠ์–ด๋ฒ„๋ ธ๋‹ค๋ฉด GitHub์™€ Hugging Face์—์„œ ๋‹ค์Œ ํƒœ๊ทธ๋ฅผ ๊ฒ€์ƒ‰ํ•˜์„ธ์š”:

  • Model formats: GGUF (universal), EXL2 (NVIDIA์— ๋น ๋ฆ„), AWQ (์••์ถ•).
  • Where to find models: Hugging Face (์ €์ž Bartowski, mradermacher ๋˜๋Š” abliterated ํƒœ๊ทธ ๊ฒ€์ƒ‰).
  • Key repositories:
    • SillyTavern/SillyTavern
    • LostRuins/koboldcpp
    • KillianLucas/open-interpreter

Tip for 2025: ๋กœ์ปฌ 8B (Loki/Vikhr) ๋ชจ๋ธ์ด โ€œ๋ฉ์ฒญํ•ด ๋ณด์ด๋ฉดโ€, Llamaโ€‘3โ€‘70Bโ€‘Abliterated API ํ‚ค๋ฅผ ํ†ตํ•ด ์—ฐ๊ฒฐํ•ด ๋ณด์„ธ์š”. ๊ฒ€์—ด ์—†์ด ์ž์œ ๋กœ์šด ๋ฐœ์–ธ๊ณผ ํ•จ๊ป˜ GPTโ€‘4 ์ˆ˜์ค€์˜ ์ง€๋Šฅ์„ ์–ป์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

#LocalLLM #SillyTavern #Oobabooga #KoboldCPP #OpenInterpreter #SelfHostedAI #AIops #MachineLearning #Python #GPU #CUDA #LLMops #PrivacyFirst #DigitalTwin #UncensoredAI #ITSecurity #VPN #CloudComputing #Automation

Back to Blog

๊ด€๋ จ ๊ธ€

๋” ๋ณด๊ธฐ ยป

์ดˆ๋ณด์ž๋ฅผ ์œ„ํ•œ AIOps ๊ฐ€์ด๋“œ: IT ํŒ€์ด ์•Œ์•„์•ผ ํ•  ๋‚ด์šฉ

ํ˜„๋Œ€ IT ํ™˜๊ฒฝ์€ ์‹œ๋„๋Ÿฝ๊ณ  ๋ณต์žกํ•˜๋ฉฐ ์–ธ์ œ๋‚˜ ๊ฐ€๋™ ์ค‘์ž…๋‹ˆ๋‹ค. Cloud platforms, microservices, containers, ๊ทธ๋ฆฌ๊ณ  hybrid systems๋Š” ์ธ๊ฐ„์ด ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ๋Š” ๊ฒƒ๋ณด๋‹ค ๋” ๋งŽ์€ ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.

Regression testing workflow: ์œ„ํ—˜์ด ๋จผ์ € ๋ฆด๋ฆฌ์Šค๋ฅผ ์•ˆ์ •์ ์œผ๋กœ ์œ ์ง€ํ•˜๋Š”์ง€ ํ™•์ธ

TL;DR ์›Œํฌํ”Œ๋กœ์šฐ: ์œ„ํ—˜โ€‘์šฐ์„  ํšŒ๊ท€ ๋ฒ”์œ„ ์„ค์ • โ†’ ๊ณจ๋“ โ€‘ํŒจ์Šค ๊ธฐ์ค€์„  โ†’ ํƒ€๊นƒ ํ”„๋กœ๋ธŒ โ†’ ์ฆ๊ฑฐโ€‘๊ธฐ๋ฐ˜ ๊ฒฐ๊ณผ. ์˜ˆ์‹œ ์ƒํ™ฉ: Sworn์ด PC Game Pass์—โ€ฆ

2025๋…„ ์ตœ๊ณ ์˜ ๊ฐœ๋ฐœ์ž AI ๋„๊ตฌ โ€” ์‹ค์ œ ํ”„๋กœ์ ํŠธ์—์„œ ์‹ค์ œ๋กœ ํšจ๊ณผ๊ฐ€ ์žˆ์—ˆ๋˜ ๊ฒƒ

2025๋…„์€ AI ๋„๊ตฌ๊ฐ€ โ€œnice to haveโ€ ์ˆ˜์ค€์„ ๋„˜์–ด ๊ธฐ๋ณธ ๊ฐœ๋ฐœ์ž ์›Œํฌํ”Œ๋กœ์šฐ์˜ ์ผ๋ถ€๊ฐ€ ๋œ ํ•ด์˜€์Šต๋‹ˆ๋‹ค. ์™„๋ฒฝํ•ด์„œ๊ฐ€ ์•„๋‹ˆ๋ผ, ๋Œ€์ฒดํ•œ๋‹ค๋Š” ์ด์œ ๋งŒ์œผ๋กœ๊ฐ€ ์•„๋‹ˆ๋ผโ€ฆ