Beyond Linting: A Data-Driven Approach to Suggesting Better Code, Not Just Flagging Bad Code

Published: April 24, 2026 at 06:36 AM EDT
4 min read
Source: Dev.to

Intro

Every developer has experienced this loop: you run your linter or static analysis tool, it highlights a dozen issues – long methods, high cyclomatic complexity, tight coupling – and then… you’re on your own. You know what’s wrong. You just don’t know what better looks like in your specific context.

A recently published paper in IET Software tackles this gap head‑on. Titled “A Data‑Driven Methodology for Quality Aware Code Fixing” by Thomas Karanikiotis and Andreas Symeonidis (Aristotle University of Thessaloniki), it presents a system that doesn’t just detect code quality problems – it recommends concrete, higher‑quality alternatives drawn from real‑world code.

The Problem: Detection Without Direction

Static analysis has matured significantly. Tools like SonarQube, ESLint, Pylint, and platforms like Cyclopt can evaluate code across dimensions such as maintainability, security, readability, and reusability. They grade your codebase, flag violations, and prioritize technical debt.

But there’s a disconnect. Once you know that a function has excessive complexity or poor cohesion, refactoring it still requires judgment, effort, and domain knowledge. For junior developers especially, the distance between “this method is too complex” and “here’s how to decompose it properly” can be enormous.

The paper proposes bridging that gap with a recommendation engine built on top of quality‑annotated code snippets.

The Approach: Functional Match + Quality Upgrade

Dataset Construction

The researchers built a rich dataset on top of the CodeSearchNet corpus, enriching each code snippet with static analysis metrics: complexity, coupling, cohesion, documentation quality, coding violations, readability scores, and source‑code similarity metrics.
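To make the enrichment step concrete, here is a minimal sketch of what annotating a snippet with static metrics could look like. This is illustrative only: the paper's pipeline uses full static-analysis tooling over the CodeSearchNet corpus, whereas this toy version approximates two metrics (a cyclomatic-complexity proxy and documentation presence) with Python's `ast` module.

```python
import ast

def enrich(snippet: str) -> dict:
    """Annotate a code snippet with a few toy static-analysis metrics.

    Hypothetical stand-in for the paper's enrichment step: real
    pipelines would add coupling, cohesion, readability, and
    violation counts from dedicated analyzers.
    """
    tree = ast.parse(snippet)
    funcs = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    # Crude cyclomatic-complexity proxy: 1 + number of branching nodes.
    branches = sum(
        isinstance(n, (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp))
        for n in ast.walk(tree)
    )
    return {
        "loc": len(snippet.splitlines()),
        "cyclomatic": 1 + branches,
        "documented": any(ast.get_docstring(f) for f in funcs),
    }

metrics = enrich(
    "def add(a, b):\n    '''Sum two values.'''\n    return a + b"
)
```

A straight-line function with a docstring comes out with complexity 1 and `documented` set to `True`; each extra branch bumps the complexity score.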

Functional Similarity Assessment

When a developer submits a code snippet, the system identifies functionally equivalent alternatives – code that does the same thing, verified through advanced similarity techniques. This step ensures the replacement actually works for the same purpose.
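The paper's "advanced similarity techniques" are out of scope here, but the underlying idea can be sketched with a naive behavioural check: treat two functions as functionally equivalent if they agree on a set of probe inputs. The probe-based `functionally_equivalent` helper and the two `mean_*` variants below are assumptions for illustration, not the paper's method.

```python
def functionally_equivalent(f, g, test_inputs) -> bool:
    """Naive behavioural proxy for functional equivalence:
    two callables agree on every probe input. Real systems use
    far stronger similarity analysis than this sketch."""
    return all(f(*args) == g(*args) for args in test_inputs)

# Two syntactically different implementations of the same behaviour.
def mean_a(xs):
    return sum(xs) / len(xs)

def mean_b(xs):
    total = 0
    for x in xs:
        total += x
    return total / len(xs)

probes = [([1, 2, 3],), ([10.0, 20.0],)]
equivalent = functionally_equivalent(mean_a, mean_b, probes)
```

The point of this step in the pipeline is exactly this guarantee: a candidate only advances to ranking if it does the same job as the submitted code.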

Quality‑Aware Ranking

Among the functionally equivalent candidates, the system ranks them by quality metrics. The top suggestions are snippets that not only match what your code does but also score measurably better on maintainability, readability, and structural quality.

A key design decision: the system also evaluates syntactic similarity, prioritizing alternatives that look similar to the original. This minimizes the cognitive overhead of adopting a suggestion – you’re not replacing your entire approach, just getting a cleaner version of it.
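The ranking idea can be sketched as a weighted blend of a candidate's quality score and its syntactic closeness to the original, here measured with `difflib.SequenceMatcher`. The weights and scoring function are assumptions for illustration; the paper defines its own quality metrics and ranking scheme.

```python
import difflib

def rank_candidates(original: str, candidates: list[tuple[str, float]],
                    w_quality: float = 0.6, w_syntax: float = 0.4):
    """Rank functionally equivalent candidates by a weighted mix of
    quality score and syntactic similarity to the original snippet.
    Weights are illustrative, not taken from the paper."""
    def score(item):
        code, quality = item
        syntactic = difflib.SequenceMatcher(None, original, code).ratio()
        return w_quality * quality + w_syntax * syntactic
    return sorted(candidates, key=score, reverse=True)

original = "def total(xs):\n    s = 0\n    for x in xs: s += x\n    return s"
candidates = [
    # (candidate code, quality score from the enriched dataset)
    ("def total(xs):\n    return sum(xs)", 0.9),
    ("def total(values):\n    acc = 0\n    for v in values:\n"
     "        acc = acc + v\n    return acc", 0.5),
]
best_code, best_quality = rank_candidates(original, candidates)[0]
```

Because syntactic similarity is part of the score, a cleaner snippet that still resembles the original wins over an equally correct but structurally alien rewrite, which is exactly the cognitive-overhead argument the paper makes.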

What Makes It Interesting for Practitioners

  • Language‑agnostic architecture – the methodology isn’t tied to a single language; quality metrics and similarity assessments work across different programming languages, which matters in polyglot codebases.
  • Practical over theoretical – evaluation shows the system produces alternatives that are both functionally equivalent and syntactically close to the originals, meaning they’re actually usable, not academic curiosities.
  • Closes the feedback loop – if you’re already using quality dashboards (e.g., Cyclopt’s quality scoring), this recommendation system turns passive monitoring into active guidance. Instead of a grade, you get a path to a better grade.

The Bigger Picture

This research sits at the intersection of several trends in developer tooling:

  • AI‑assisted coding is everywhere, but most tools focus on generation, not the improvement of existing code.
  • Technical debt management is increasingly data‑driven, yet remediation is still manual.
  • Code reuse from open source is standard practice, but quality filtering is rarely systematic.

The paper argues – and convincingly – that we have enough data in open‑source repositories to build quality‑aware recommendation systems that work. The CodeSearchNet corpus alone contains millions of functions across six languages. Enriching that data with quality metrics transforms it from a search index into a quality improvement engine.

Try the Research Yourself

The paper is published open access under CC BY 4.0:

  • Full paper: DOI: 10.1049/sfw2/4147669
  • Zenodo archive (PDF):
