[Paper] Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection
Source: arXiv - 2603.04337v1
Overview
The paper introduces Pointer‑CAD, a new framework that lets large language models (LLMs) generate and edit full‑featured CAD models. By augmenting the traditional command‑sequence representation with pointer operations that directly select edges, faces, or other B‑rep entities, the system overcomes the long‑standing limitation of “blind” sequence generation and dramatically cuts topological errors caused by discretizing continuous geometry.
Key Contributions
- Pointer‑based command language – Extends the usual CAD command stream with explicit “select‑entity” tokens that let the LLM point to a specific face, edge, or vertex in the current B‑rep.
- Iterative B‑rep conditioning – Each generation step receives both the natural‑language prompt and the up‑to‑date boundary‑representation, enabling context‑aware edits (e.g., chamfer the selected edge).
- Large‑scale annotated dataset – A pipeline that pairs 575 K expert‑level CAD models with high‑quality natural‑language descriptions, providing the training signal needed for pointer prediction.
- Quantization‑error mitigation – By selecting existing geometric entities instead of locating them through discretized coordinates, the method reduces segmentation/topology errors by roughly an order of magnitude or more compared with prior sequence‑only approaches.
- Comprehensive evaluation – Demonstrates reliable generation of complex parts (multiple features, nested operations) and shows near‑zero failure rates on standard CAD benchmarks.
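To make the quantization argument concrete, here is a minimal sketch of why referring to an edge by a discretized coordinate can fail while a pointer cannot. The 8‑bit grid, the 0 to 100 mm range, and the two edge positions are illustrative assumptions, not values from the paper.

```python
def quantize(value: float, lo: float, hi: float, bits: int = 8) -> int:
    """Map a continuous value in [lo, hi] to an integer bin index."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def dequantize(index: int, lo: float, hi: float, bits: int = 8) -> float:
    """Map a bin index back to the continuous value it represents."""
    levels = (1 << bits) - 1
    return lo + index / levels * (hi - lo)


# Two parallel edges of a hypothetical part, identified by x position in mm.
edge_x = {"edge_a": 20.00, "edge_b": 20.15}

# Sequence-only route: each edge is referenced through a quantized coordinate.
step = 100.0 / 255  # grid spacing in mm for an 8-bit grid over 0..100 mm
codes = {name: quantize(x, 0.0, 100.0) for name, x in edge_x.items()}
recovered = dequantize(codes["edge_b"], 0.0, 100.0)

# The edges are closer together than one grid step, so they share a code
# and the decoder cannot tell which one was meant.
print("same code:", codes["edge_a"] == codes["edge_b"])
print("round-trip error (mm):", abs(recovered - edge_x["edge_b"]))

# Pointer route: the edge is referenced by its index in the entity list,
# so the reference is exact regardless of how close the edges are.
entities = list(edge_x)
pointer = entities.index("edge_b")
print("pointer resolves to:", entities[pointer])
```

The round‑trip error of a uniform quantizer is bounded by half a grid step, which is why closely spaced entities are the problematic case.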
Methodology
- Representation – The CAD model is expressed as a command sequence (e.g., `sketch`, `extrude`, `fillet`). Pointer‑CAD adds a new token type, `SELECT <entity_id>`, that points to an element in the current B‑rep (edges, faces, vertices).
- Model Architecture – A transformer‑based LLM (e.g., GPT‑NeoX) is fine‑tuned to predict the next token given:
  - The textual design description.
  - A serialized view of the current B‑rep (encoded as a list of entity features).
  - The previously generated command tokens.

  The pointer prediction is treated as a classification over the set of available entities.
- Training Data Pipeline – Existing CAD repositories are parsed into B‑rep structures, then a semi‑automatic annotator generates paired natural‑language specifications (using GPT‑4 for drafting and human verification). The pipeline also extracts ground‑truth pointers for every operation that involves entity selection.
- Inference Loop – Starting from an empty model, the LLM iteratively emits commands. When a `SELECT` token is produced, the model scores all candidate entities and picks the highest‑scoring one, which is then fed back into the CAD kernel to update the B‑rep before the next step.
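The pointer‑as‑classification step and the inference loop described above can be sketched roughly as follows. This is a toy illustration under stated assumptions: `ToyKernel`, `Entity`, the two‑dimensional feature vectors, and the scripted token stream are all hypothetical stand‑ins for the real LLM and CAD kernel, and dot‑product scoring is one plausible choice rather than the paper's confirmed scoring function.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    entity_id: int
    kind: str        # "face", "edge", or "vertex"
    features: list   # stand-in for learned entity features


@dataclass
class ToyKernel:
    """Hypothetical stand-in for a CAD kernel; it only tracks entities."""
    entities: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def apply(self, command: str, target: Optional[int] = None) -> None:
        self.log.append(command if target is None else f"{command}({target})")
        if command == "extrude":
            # Pretend the extrusion creates one new face and one new edge.
            base = len(self.entities)
            self.entities.append(Entity(base, "face", [0.0, 1.0]))
            self.entities.append(Entity(base + 1, "edge", [1.0, 0.2]))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_entity(query, entities):
    """Classify over the current entity set: score, softmax, take the argmax."""
    if not entities:
        raise ValueError("SELECT emitted but the B-rep has no entities")
    scores = [sum(q * f for q, f in zip(query, e.features)) for e in entities]
    probs = softmax(scores)
    best = max(range(len(entities)), key=probs.__getitem__)
    return entities[best], probs[best]


def run(script, kernel):
    """Greedy loop: SELECT resolves against the live B-rep, other tokens update it."""
    target = None
    for token, query in script:
        if token == "SELECT":
            entity, _prob = select_entity(query, kernel.entities)
            target = entity.entity_id
        else:
            kernel.apply(token, target)
            target = None  # the pointer is consumed by the next command
    return kernel.log


# A scripted "model": in a real system these tokens would come from the LLM.
script = [("sketch", None), ("extrude", None), ("SELECT", [1.0, 0.0]), ("fillet", None)]
log = run(script, ToyKernel())
print(log)
```

Because the candidate set is rebuilt from the kernel after every command, a `SELECT` can only ever resolve to an entity that actually exists at that step.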
Results & Findings
| Metric | Pointer‑CAD | Prior Sequence‑Only (e.g., CAD‑GPT) |
|---|---|---|
| Topological error rate (invalid B‑rep) | 0.3 % | 7.8 % |
| Chamfer/Fillet success on complex parts | 94 % | 62 % |
| Average number of features per generated part | 12.4 | 6.1 |
| Human evaluation (design fidelity) | 4.6 / 5 | 3.8 / 5 |
- Error reduction – The pointer mechanism cuts quantization‑induced segmentation errors by ~10×.
- Feature richness – Models can reliably chain multiple dependent operations (e.g., sketch → extrude → select face → fillet).
- Generalization – On unseen prompts, the system still produces valid B‑reps, indicating that the pointer‑based conditioning learns robust geometric reasoning rather than memorizing fixed command patterns.
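As a quick arithmetic check on the table, the relative gains can be computed directly from the reported numbers. The dictionary keys below are made‑up labels; the values are copied from the table. Note that the ratio of invalid‑B‑rep rates is a different quantity from the ~10× figure quoted for quantization‑induced segmentation errors, so the two need not match.

```python
# (Pointer-CAD, prior sequence-only) pairs taken from the results table.
results = {
    "topological_error_pct": (0.3, 7.8),
    "chamfer_fillet_success_pct": (94.0, 62.0),
    "features_per_part": (12.4, 6.1),
    "human_fidelity_score": (4.6, 3.8),
}

pointer_err, baseline_err = results["topological_error_pct"]
pointer_ok, baseline_ok = results["chamfer_fillet_success_pct"]
pointer_feat, baseline_feat = results["features_per_part"]
pointer_score, baseline_score = results["human_fidelity_score"]

summary = {
    # How many times lower the invalid B-rep rate is.
    "error_rate_ratio": round(baseline_err / pointer_err, 1),
    # Absolute gain in success rate, in percentage points.
    "success_gain_points": round(pointer_ok - baseline_ok, 1),
    # How many times more features an average generated part contains.
    "feature_count_ratio": round(pointer_feat / baseline_feat, 2),
    # Absolute gain on the 5-point human rating scale.
    "fidelity_gain": round(pointer_score - baseline_score, 1),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```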
Practical Implications
- Developer APIs – Pointer‑CAD can be wrapped as a REST service that accepts a natural‑language design brief and returns a standard CAD file (STEP/IGES). This opens the door for “design‑by‑prompt” features in IDE plugins, product configurators, or rapid prototyping tools.
- Interactive CAD assistants – Because the model can point to existing geometry, it can be used for in‑situ editing: a user asks “add a 2 mm fillet to the top edge of the bracket,” and the system instantly selects the correct edge and applies the operation.
- Reduced manual modeling time – Early experiments suggest a 30‑40 % cut in the number of manual steps required to build complex parts, translating into faster iteration cycles for mechanical engineers and hobbyists alike.
- Better downstream simulation – Valid B‑reps mean fewer geometry clean‑up steps before feeding models into finite‑element analysis or 3‑D printing pipelines, improving overall workflow reliability.
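For the REST wrapper idea, a minimal sketch of the request handling might look like the following. Everything here is hypothetical: the paper does not define an API, so the `prompt` and `format` field names, the `handle_generate` function, and the placeholder generator are assumptions made for illustration. The handler is written as a plain function so it can sit behind any HTTP framework.

```python
import json
from typing import Callable


def handle_generate(raw_body: bytes, generate: Callable[[str], str]) -> tuple:
    """Validate a JSON request body and return (HTTP status, response payload).

    `generate` is a hypothetical callable that maps a design brief to the
    text of a CAD file; a real deployment would call the model here.
    """
    try:
        request = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"error": "body must be valid JSON"}

    prompt = request.get("prompt") if isinstance(request, dict) else None
    if not isinstance(prompt, str) or not prompt.strip():
        return 400, {"error": "'prompt' must be a non-empty string"}

    file_format = request.get("format", "step")
    if file_format not in ("step", "iges"):
        return 400, {"error": "'format' must be 'step' or 'iges'"}

    return 200, {"format": file_format, "model": generate(prompt)}


def placeholder_generator(prompt: str) -> str:
    """Stand-in for the model; this is NOT valid STEP or IGES content."""
    return f"PLACEHOLDER CAD DATA FOR: {prompt}"


status, payload = handle_generate(
    b'{"prompt": "L-bracket with two mounting holes"}', placeholder_generator
)
print(status, payload["format"])
```

Keeping validation separate from generation means malformed requests are rejected before any model time is spent.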
Limitations & Future Work
- Scalability of entity set – Pointer prediction currently enumerates all faces/edges, which can become costly for very large assemblies; hierarchical or learned indexing could alleviate this.
- Dataset bias – The 575 K models are sourced mainly from mechanical parts; architectural or organic shapes may need additional training data.
- Fine‑grained parameter control – While pointers remove quantization error from entity selection, continuous parameters (e.g., exact fillet radius) still rely on discretized tokens; future work could integrate differentiable geometry modules to predict real‑valued parameters.
- User intent ambiguity – Vague natural‑language prompts may lead to ambiguous pointer choices; incorporating clarification dialogs or multimodal inputs (sketches, images) is a promising direction.
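One simple way to decide when a clarification dialog is warranted is to look at the gap between the two most likely pointer candidates. The sketch below assumes the model exposes a probability per candidate entity; the `needs_clarification` helper, the entity names, and the 0.2 margin threshold are hypothetical choices, not part of the paper.

```python
def needs_clarification(probs: dict, min_margin: float = 0.2) -> bool:
    """Return True when the top two pointer candidates are too close to call."""
    ranked = sorted(probs.values(), reverse=True)
    if len(ranked) < 2:
        return False  # a single candidate cannot be ambiguous
    return ranked[0] - ranked[1] < min_margin


# "Add a fillet to the top edge": two top edges are nearly tied.
ambiguous = {"top_front_edge": 0.46, "top_back_edge": 0.44, "bottom_edge": 0.10}

# "Add a fillet to the top front edge": one candidate clearly dominates.
confident = {"top_front_edge": 0.90, "top_back_edge": 0.07, "bottom_edge": 0.03}

for label, candidates in (("ambiguous", ambiguous), ("confident", confident)):
    action = "ask the user" if needs_clarification(candidates) else "apply the edit"
    print(f"{label}: {action}")
```

A margin test is cheap because it reuses the probabilities already computed for pointer selection; the threshold would need tuning against real prompts.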
Pointer‑CAD marks a significant step toward truly intelligent CAD generation, bridging the gap between language understanding and precise geometric manipulation. For developers eager to embed generative design capabilities into their products, the paper offers both a solid technical foundation and a roadmap for practical integration.
Authors
- Dacheng Qi
- Chenyu Wang
- Jingwei Xu
- Tianzhe Chu
- Zibo Zhao
- Wen Liu
- Wenrui Ding
- Yi Ma
- Shenghua Gao
Paper Information
- arXiv ID: 2603.04337v1
- Categories: cs.CV, cs.CL
- Published: March 4, 2026