[Paper] Prompt Less, Smile More: MTP with Semantic Engineering in Lieu of Prompt Engineering

Published: November 24, 2025 at 01:58 PM EST
3 min read

Source: arXiv - 2511.19427v1

Overview

The paper “Prompt Less, Smile More: MTP with Semantic Engineering in Lieu of Prompt Engineering” tackles a growing pain point for developers building AI‑augmented software: the need to hand‑craft prompts for large language models (LLMs). By extending the Meaning Typed Programming (MTP) framework with a lightweight “Semantic Engineering” layer, the authors let developers embed natural‑language context directly in code, dramatically cutting the manual effort usually required for prompt engineering while matching the performance of hand‑tuned prompts and, in some domain‑specific tasks, improving robustness.

Key Contributions

  • Semantic Context Annotations (SemTexts): A language‑level syntax that lets developers attach free‑form natural‑language notes to variables, functions, and data structures.
  • Integration with MTP: Extends the existing automatic prompt generation pipeline to consume SemTexts, turning enriched code semantics into high‑quality LLM prompts.
  • Jac language prototype: Implements SemTexts in the experimental Jac language, demonstrating feasibility without altering the underlying compiler or runtime.
  • Real‑world benchmark suite: Curated tasks that mimic typical AI‑integrated development scenarios (e.g., data cleaning pipelines, conversational agents, code‑assist tools).
  • Empirical validation: Shows that Semantic Engineering matches the accuracy of hand‑crafted prompt engineering across the benchmark while slashing developer time by ~70%.

Methodology

  1. Semantic Enrichment: Developers annotate code constructs with @semtext comments (e.g., @semtext "this function extracts user intent from chat messages"). These annotations are parsed alongside the abstract syntax tree.
  2. Prompt Synthesis: The MTP engine combines static type information (e.g., function signatures, variable types) with the extracted SemTexts to generate a structured prompt that conveys both formal and informal intent to the LLM (a sketch of steps 1–2 follows this list).
  3. Evaluation Pipeline:
    • Benchmarks: 12 tasks covering data transformation, UI generation, and autonomous decision‑making.
    • Baselines: (a) Pure MTP (no annotations), (b) Traditional manual prompt engineering, (c) Zero‑shot LLM usage.
    • Metrics: Task success rate, BLEU/ROUGE for generated text, and a developer effort survey (time spent writing prompts).
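To make steps 1 and 2 concrete, here is a minimal Python sketch of the idea. The paper's implementation lives in the Jac pipeline, so everything below is a stand‑in: the SOURCE snippet, the comment syntax, and the helper names extract_semtexts and synthesize_prompt are hypothetical, and the prompt template is illustrative rather than the one MTP actually emits.

```python
import ast
import re

# Hypothetical source: a function carrying a SemText-style comment.
SOURCE = '''
# @semtext "this function extracts user intent from chat messages"
def extract_intent(message: str) -> str: ...
'''

SEMTEXT_RE = re.compile(r'#\s*@semtext\s+"([^"]+)"')

def extract_semtexts(source: str) -> dict[int, str]:
    """Map each line number to the SemText found on that line."""
    return {i + 1: m.group(1)
            for i, line in enumerate(source.splitlines())
            if (m := SEMTEXT_RE.search(line))}

def synthesize_prompt(source: str) -> str:
    """Combine each function's signature (formal intent) with the SemText
    on the preceding line (informal intent) into one structured prompt."""
    semtexts = extract_semtexts(source)
    parts = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            # First line of the unparsed node is the signature.
            signature = ast.unparse(node).splitlines()[0].rstrip(":")
            note = semtexts.get(node.lineno - 1, "")
            parts.append(f"Function signature: {signature}\n"
                         f"Developer note: {note}\n"
                         f"Task: implement behavior consistent with both.")
    return "\n\n".join(parts)

print(synthesize_prompt(SOURCE))
```

The design point this illustrates is that the formal channel (the signature) and the informal channel (the SemText) are gathered from the same location in the source, so neither can silently drift out of sync with the other.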

Results & Findings

| Approach | Avg. Success Rate | Prompt Quality (BLEU) | Avg. Dev. Time (min) |
| --- | --- | --- | --- |
| Zero‑shot LLM | 48% | 0.31 | 2 |
| Pure MTP | 62% | 0.44 | 3 |
| Manual Prompt Engineering | 78% | 0.68 | 12 |
| MTP + Semantic Engineering | 77% | 0.66 | 4 |

  • Performance parity: The enriched MTP pipeline lands within one percentage point of the manual prompt baseline on success rate (77% vs. 78%) and within 0.02 BLEU (0.66 vs. 0.68).
  • Efficiency gain: Developers spend roughly a third of the time they would need to write full prompts (4 min vs. 12 min on average), thanks to concise natural‑language annotations.
  • Robustness: In tasks requiring domain‑specific reasoning (e.g., medical triage simulation), the semantic annotations helped the LLM avoid common misinterpretations that pure MTP missed.

Practical Implications

  • Faster prototyping: Teams can spin up AI‑driven features (chatbots, code assistants, data pipelines) without a dedicated prompt‑engineering sprint.
  • Maintainability: Since annotations live alongside code, future developers can see the intended LLM behavior directly in the source, reducing knowledge loss.
  • Tooling integration: IDE plugins could surface autocomplete for @semtext blocks, turning prompt design into a first‑class developer activity.
  • Cross‑language potential: While demonstrated in Jac, the concept maps cleanly to any language that supports comments or attributes, opening the door for gradual adoption in mainstream ecosystems (Python decorators, Java annotations, TypeScript JSDoc); a decorator‑based sketch follows below.
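As a sketch of that cross‑language idea, a SemText could ride along as an ordinary Python decorator. The semtext decorator and the _semtext attribute below are hypothetical illustrations, not an API from the paper:

```python
def semtext(note: str):
    """Hypothetical decorator: attach a natural-language note to a
    function so prompt-generation tooling can recover it later."""
    def wrap(fn):
        fn._semtext = note  # stash the informal intent on the function object
        return fn
    return wrap

@semtext("this function extracts user intent from chat messages")
def extract_intent(message: str) -> str: ...

# An IDE plugin or prompt synthesizer could read both channels:
print(extract_intent.__annotations__)  # formal intent: the type hints
print(extract_intent._semtext)         # informal intent: the developer's note
```

Because the note travels with the function object, the same introspection path that already surfaces type hints could feed prompt synthesis without any compiler changes.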

Limitations & Future Work

  • Language support: The current prototype is limited to the experimental Jac language; broader adoption will require language‑agnostic annotation standards.
  • Annotation quality: The approach assumes developers can articulate intent concisely; noisy or ambiguous SemTexts can degrade prompt fidelity.
  • Scalability of benchmarks: The benchmark suite, though realistic, covers a modest number of domains; larger, community‑driven datasets would strengthen external validity.
  • Future directions: The authors plan to (1) develop a language‑neutral annotation schema, (2) explore automated suggestion of SemTexts via LLMs themselves, and (3) evaluate the approach in large‑scale production codebases.

Authors

  • Jayanaka L. Dantanarayana
  • Savini Kashmira
  • Thakee Nathees
  • Zichen Zhang
  • Krisztian Flautner
  • Lingjia Tang
  • Jason Mars

Paper Information

  • arXiv ID: 2511.19427v1
  • Categories: cs.SE, cs.AI
  • Published: November 24, 2025
  • PDF: https://arxiv.org/pdf/2511.19427v1