[Paper] An Agentic Framework for Neuro-Symbolic Programming

Published: January 2, 2026 at 11:59 AM EST
4 min read
Source: arXiv - 2601.00743v1

Overview

The paper introduces AgenticDomiKnowS (ADS), a new framework that lets developers describe a neuro‑symbolic task in plain language and automatically generates a complete DomiKnowS program. By turning free‑form prompts into executable code, ADS removes the steep learning curve of the original DomiKnowS library and cuts prototype development time from hours to roughly 10‑15 minutes.

Key Contributions

  • Agentic translation pipeline: An LLM‑driven workflow that parses a natural‑language task, creates each DomiKnowS component (data loaders, symbolic constraints, neural modules), and validates them step‑by‑step.
  • Human‑in‑the‑loop optionality: Developers familiar with DomiKnowS can intervene at any stage to edit or approve generated snippets, blending automation with expert control.
  • Speedup in development: Empirical user studies show a reduction of end‑to‑end coding time from several hours to roughly 10‑15 minutes for both novices and experienced DomiKnowS users.
  • Modular testing harness: Each generated component is unit‑tested against synthetic inputs before being assembled, improving the reliability of the final neuro‑symbolic program (a minimal sketch follows this list).
  • Open‑source reference implementation: The authors release ADS as a Python package with example notebooks, making it easy to plug into existing AI pipelines.
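
The testing‑harness idea can be pictured with a small, self‑contained sketch. Everything below is illustrative: count_sum_constraint and the synthetic cases are hypothetical stand‑ins for a generated component, not actual ADS or DomiKnowS code.

```python
# Hypothetical stand-in for a generated constraint component: the harness
# checks it against synthetic inputs that deliberately satisfy or violate
# the constraint before the component is assembled into the program.

def count_sum_constraint(object_counts, reported_total):
    """Generated fragment: detected object counts must sum to the total."""
    return sum(object_counts) == reported_total

# Synthetic cases paired with the expected verdict.
cases = [
    (([2, 3, 5], 10), True),   # satisfies the constraint
    (([2, 3, 5], 9), False),   # violates it; must be flagged
]

for (counts, total), expected in cases:
    assert count_sum_constraint(counts, total) == expected

print("component passed all synthetic checks")
```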

Methodology

  1. Prompt ingestion – Users provide a free‑form description of the desired neuro‑symbolic task (e.g., “classify images while enforcing that the sum of detected object counts equals the reported total”).
  2. Task decomposition – An LLM (GPT‑4‑style) breaks the description into a structured plan: data acquisition, neural model selection, symbolic constraints, and integration points.
  3. Component generation – For each plan item, ADS invokes a specialized “agent” that emits the corresponding DomiKnowS code fragment (e.g., a SymbolicConstraint class).
  4. Local validation – The generated fragment is run against automatically created test cases (synthetic data that satisfies or violates the constraint). Failures trigger a regeneration loop, sketched in code after this list.
  5. Human‑in‑the‑loop (optional) – If a developer opts in, the intermediate code is displayed for review and manual editing before proceeding.
  6. Program assembly – Validated fragments are stitched together into a full DomiKnowS script, which is then executed on the target dataset.
  7. Feedback loop – Execution logs are fed back to the LLM to fine‑tune prompts for future runs, gradually improving generation quality.
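
Steps 3–4 form a generate‑validate‑regenerate loop. A minimal sketch of its shape follows, under stated assumptions: llm is any callable that returns a code string, and each test case exposes a run method. These interfaces are illustrative, not the paper's actual agents.

```python
# Sketch of the generate-validate-regenerate loop (steps 3-4).
# All interfaces here are assumptions, not the ADS implementation.

def generate_fragment(llm, plan_item, feedback=None):
    """Ask an LLM agent for a DomiKnowS code fragment for one plan item."""
    prompt = f"Write DomiKnowS code for: {plan_item}"
    if feedback:
        prompt += f"\nThe previous attempt failed with: {feedback}"
    return llm(prompt)  # returns a code string

def validate(fragment, test_cases):
    """Run the fragment against synthetic test cases; return (ok, error)."""
    try:
        for case in test_cases:
            case.run(fragment)  # raises on a failing case
        return True, None
    except Exception as err:
        return False, str(err)

def build_component(llm, plan_item, test_cases, max_attempts=3):
    """Regenerate until the fragment validates or attempts run out."""
    feedback = None
    for _ in range(max_attempts):
        fragment = generate_fragment(llm, plan_item, feedback)
        ok, feedback = validate(fragment, test_cases)
        if ok:
            return fragment
    raise RuntimeError(f"could not validate component for: {plan_item}")
```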

The workflow is deliberately modular, allowing each agent to be swapped out (e.g., using a different LLM or a rule‑based parser) without breaking the overall system.
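
One way to realize that swappability (an assumption about the design, not the authors' code) is a shared agent interface: an LLM‑backed agent and a rule‑based template lookup then become interchangeable.

```python
# Illustrative agent interface for the swappable design; names are
# hypothetical and not taken from the ADS codebase.

from typing import Protocol

class Agent(Protocol):
    def emit(self, plan_item: str) -> str:
        """Return a code fragment for one plan item."""
        ...

class LLMAgent:
    """Agent backed by any callable LLM that maps a prompt to code."""
    def __init__(self, llm):
        self.llm = llm

    def emit(self, plan_item: str) -> str:
        return self.llm(f"Write DomiKnowS code for: {plan_item}")

class RuleBasedAgent:
    """Agent backed by fixed templates instead of an LLM."""
    def __init__(self, templates: dict):
        self.templates = templates

    def emit(self, plan_item: str) -> str:
        return self.templates[plan_item]

def assemble(agents: list, plan: list) -> str:
    """Stitch fragments from any mix of agent implementations."""
    return "\n\n".join(agent.emit(item) for agent, item in zip(agents, plan))
```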

Results & Findings

  • Time‑to‑prototype: In a controlled study with 12 participants (6 DomiKnowS experts, 6 novices), average coding time dropped from 3.2 h (manual) to 12 min (ADS).
  • Correctness: 87% of generated programs passed all unit tests on the first pass; the remaining 13% required a single regeneration cycle.
  • User satisfaction: Survey scores (1–5 Likert) averaged 4.6 for ease of use and 4.2 for confidence in the generated code.
  • Scalability: ADS successfully generated programs for three benchmark neuro‑symbolic tasks (visual question answering, physics‑based reasoning, and rule‑guided text classification) without task‑specific tuning.

Practical Implications

  • Rapid prototyping: Teams can spin up neuro‑symbolic pipelines on new datasets without deep expertise in DomiKnowS, accelerating research‑to‑product cycles.
  • Lower barrier to entry: Start‑ups and product engineers can experiment with symbolic constraints (e.g., business rules, safety checks) alongside deep models, fostering more trustworthy AI solutions.
  • Human‑augmented AI development: The optional review step lets senior engineers retain control while delegating boilerplate generation to the agent, improving productivity without sacrificing quality.
  • Integration with existing stacks: Since ADS outputs pure Python/DomiKnowS code, it can be dropped into CI pipelines, containerized services, or Jupyter notebooks with minimal friction (a hypothetical CI check is sketched after this list).
  • Data efficiency: By encouraging the use of symbolic priors, developers can achieve comparable performance with fewer labeled examples—a cost saver for domains where data is scarce (medical imaging, scientific simulation).
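
As a rough illustration of the integration point above, a generated program can be gated in CI with an ordinary pytest check. The module name generated_program and its main entry point are placeholders, not part of ADS.

```python
# Hypothetical pytest gate for ADS output in a CI pipeline. The module
# name "generated_program" and the "main" entry point are placeholders.

import importlib

def test_generated_program_importable():
    module = importlib.import_module("generated_program")
    assert hasattr(module, "main")  # assumed entry point of the output
```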

Limitations & Future Work

  • Dependency on LLM quality: The accuracy of generated code hinges on the underlying language model; out‑of‑domain terminology can still cause mis‑parsing.
  • Limited to DomiKnowS: While the modular design could be extended, ADS currently only supports the DomiKnowS API, restricting adoption for teams using alternative neuro‑symbolic libraries.
  • Scalability of symbolic testing: The unit‑test generation assumes relatively simple constraints; more complex logical formulas may require sophisticated test‑case synthesis.
  • Future directions: The authors plan to (1) add multi‑LLM ensembles for more robust parsing, (2) broaden support to other neuro‑symbolic frameworks (e.g., DeepProbLog, Neuro‑Symbolic Concept Learner), and (3) incorporate reinforcement‑learning‑based self‑debugging to further shorten the regeneration loop.

Authors

  • Aliakbar Nafar
  • Chetan Chigurupati
  • Danial Kamali
  • Hamid Karimian
  • Parisa Kordjamshidi

Paper Information

  • arXiv ID: 2601.00743v1
  • Categories: cs.AI
  • Published: January 2, 2026