Use Cursor with Local LLM and LM Studio

Published: January 17, 2026 at 09:07 PM EST
3 min read
Source: Dev.to

Prerequisites

  • Cursor installed
  • LM Studio installed
  • ngrok installed
  • A local machine capable of running an LLM

For this guide we’ll use the model zai-org/glm-4.6v-flash.

Step 1 – Install LM Studio

  1. Download LM Studio from the official site.
  2. Install the application.
  3. Launch LM Studio.

Step 2 – Download a Model

  1. Open LM Studio.
  2. Inside the app, download the model you want to use.
  3. For this article, download zai-org/glm-4.6v-flash.
  4. Wait until the download finishes before proceeding.

[Screenshot: LM Studio model download]
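If you prefer the terminal, LM Studio also ships a CLI called lms. Assuming it is installed and on your PATH, the download can be scripted too (treat the exact subcommand as an assumption and verify it against your LM Studio version):

# Hypothetical CLI equivalent of the in-app download (check `lms --help` first)
lms get zai-org/glm-4.6v-flash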

Step 3 – Install ngrok

ngrok lets you expose a local server to the internet with a public URL.

  • Official site: https://ngrok.com

If you use Homebrew (macOS/Linux), install it with:

brew install ngrok
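To confirm the installation succeeded, print the version:

ngrok version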

Step 4 – Set Up ngrok

  1. Create an ngrok account (if you don’t have one).
  2. Retrieve your auth token from the dashboard.
  3. Authenticate your local installation:
ngrok config add-authtoken {your_token}

Step 5 – Start the Local Server in LM Studio

  1. Open LM Studio.
  2. Enable Developer Mode (Settings → Developer).
  3. Click Start Local Server.

LM Studio will now serve your LLM via an OpenAI‑compatible API.
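Before involving ngrok, it's worth confirming the server responds locally. A minimal check, assuming the default port 1234 and that zai-org/glm-4.6v-flash is loaded:

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zai-org/glm-4.6v-flash",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'

A JSON response containing a choices array means the OpenAI-compatible endpoint is working.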

[Screenshots: enabling Developer mode, loading the model, model loaded]

Step 6 – Expose the Local Server with ngrok

Open a terminal and run:

ngrok http 1234

Note: 1234 must match the port LM Studio uses for its local server (default is 1234).

After the command starts, ngrok will display a public URL on the Forwarding line, e.g. https://something.ngrok-free.app. Copy this URL – you'll need it when configuring Cursor.

Typical ngrok output:

🚪 One gateway for every AI model. Available in early access *now*: https://ngrok.com/r/ai

Session Status                online
Account                       your_account (Plan: Free)
Version                       3.35.0
Region                        United States (us)
Latency                       19ms
Web Interface                 http://127.0.0.1:4040
Forwarding                    https://something.ngrok-free.app → http://localhost:1234

Connections                   ttl     opn     rt1     rt5     p50     p90
                              7       0       0.00    0.00    6.26    263.91

HTTP Requests
-------------

20:10:37.113 EST POST /v1/chat/completions 200 OK
20:06:13.115 EST POST /v1/chat/completions 200 OK
20:04:59.112 EST POST /v1/chat/completions 200 OK
20:04:42.221 EST POST /v1/chat/completions 200 OK
20:03:11.002 EST POST /v1/chat/completions 200 OK
20:03:05.636 EST POST /v1/chat/completions 200 OK
20:02:22.796 EST POST /v1/chat/completions 200 OK

You now have a locally running LLM served by LM Studio and exposed to the internet via ngrok, ready to be wired up in Cursor.
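To verify the tunnel end to end, hit the model listing through the public URL (substitute your own ngrok hostname for the placeholder):

curl https://something.ngrok-free.app/v1/models

It should return the same JSON you'd get from http://localhost:1234/v1/models.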

Step 7 – Open Cursor Settings

Launch Cursor and navigate to:

Settings → Models / OpenAI Configuration

[Screenshot: Cursor settings]

Step 8 – Configure the OpenAI Base URL

  1. Enable OpenAI API Key.
  2. Enter any placeholder value for the API key (e.g., 1234).
  3. Paste the ngrok URL into Override OpenAI Base URL.
  4. Append /v1 to the end of the URL.

Your final URL should look like this:

https://yours.ngrok-free.app/v1

Step 9 – Add a Custom Model

  1. Click Add Custom Model.
  2. Enter a name for your local LLM (e.g., GLM4.6‑local).

⚠️ Windows users:
You must enter the exact model name that LM Studio reports internally.
For this case: zai-org/glm-4.6v-flash
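If you're unsure what name LM Studio reports, ask the server directly – the /v1/models endpoint lists every loaded model by its exact identifier:

curl http://localhost:1234/v1/models

Use the id field from the response verbatim as the custom model name.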

[Screenshot: adding a custom model]

Done! 🎉

That’s it — the setup is complete. You can now open Cursor Chat, enter a prompt, and send it. Cursor will route the request through ngrok to your local LLM running in LM Studio.

Result

Final Thoughts

Using Cursor with a local LLM is a great way to:

  • Reduce API costs
  • Improve privacy
  • Experiment with custom or open‑source models

LM Studio and ngrok make the process surprisingly straightforward. Once configured, it feels almost identical to using a hosted OpenAI model — except everything runs on your own machine.

Happy hacking! 🚀
