Local Inference: Testing an SLM (L3.1-8B-I) on C Pointer Logic

Published: January 31, 2026, 23:40 GMT+8
Original article: Dev.to

Setup

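I recently installed llama-cli on my Debian Linux laptop and run it with native inference, that is, by invoking the llama-cli binary directly with command-line flags. Compiling from source lets me target specific CPU features (AVX-512, for example) to get the most out of my hardware.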

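The model used for this test:

  • Llama 3.1 8B Instruct (Q4_K_M GGUF) – referred to as Llama‑3.1 throughout this article

The goal of the test was to see how the model handles a simple C program that contains pointer logic.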

C Program

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char *buffer = malloc(100 * sizeof(char));

    if (buffer == NULL) {
        return 1;
    }

    strcpy(buffer, "Llama 3.1 is running close to the metal.");
    printf("Buffer content: %s\n", buffer);

    // Intentional bug: No free(buffer) here.
    return 0;
}

Prompt given to the model

Explain the memory allocation here step‑by‑step.

Llama‑3.1’s Response

Here’s a step-by-step explanation of the memory allocation in the code:

char *buffer = malloc(100 * sizeof(char));

  • malloc dynamically allocates a block of memory of the requested size.
  • 100 * sizeof(char) equals 100 bytes because sizeof(char) is 1.
  • The allocated block is stored in the pointer buffer.
  • malloc returns NULL on failure.

if (buffer == NULL) { … }

  • Checks whether the allocation succeeded.
  • If buffer is NULL, the program exits with error code 1.

strcpy(buffer, "Llama 3.1 is running close to the metal.");

  • Copies the string into the allocated memory.
  • The allocated block is 100 bytes, so the copy succeeds, leaving the remaining bytes uninitialized.

free(buffer);

  • Deallocates the memory allocated by malloc.
  • This step is necessary to avoid a memory leak.

Note: In a real program you would typically check the return value of malloc and free the memory when it is no longer needed.

Analysis of the Model’s Output

The Hallucination

The original source code does not contain a free(buffer); call. Llama‑3.1 incorrectly added this line, a classic example of a model “hallucinating” a common best practice.

Accuracy

  • Correctly described malloc and the null‑pointer check.
  • Incorrectly reported a free(buffer); call that does not appear in the source.

Math Check

The model claimed that 86 bytes remain after copying the string.
The actual string "Llama 3.1 is running close to the metal." is 40 characters long; with the terminating null byte it occupies 41 bytes.
Thus the remaining space is 100 – 41 = 59 bytes, not 86.

Takeaways

  • Human verification is essential. Even a well‑behaved 8B model can insert plausible but incorrect code.
  • Prompt engineering matters. Precise prompts help guide the model, but they cannot eliminate hallucinations.
  • Hardware‑side success. Running an 8B model on a CPU with AVX‑512 (or AVX2 on older hardware) shows that useful inference is possible without a GPU, which makes on‑device AI more accessible.

Ben Santora – January 2026
