BiasAwareFeedback: Detecting Textual Bias with NLP (Mini-Research Project)

Published: December 15, 2025 at 10:56 AM EST
3 min read
Source: Dev.to

Bias-Aware Automated Feedback System for Student Writing

Limitations, Reproducibility, and Research Positioning

A. System Limitations

  • Model Dependence – The bias detection component relies on a locally hosted large language model (LLaMA 3 via Ollama). This enables free, offline experimentation but introduces variability in outputs depending on model version, prompt phrasing, and inference temperature.
  • Non-deterministic Outputs – Because large language models are generative, identical inputs may yield slightly different outputs across runs (see the sketch after this list). This limits strict reproducibility of exact results, although trends and qualitative behaviors remain consistent.
  • Synthetic Evaluation Data – Many bias tests rely on synthetically modified text (e.g., demographic swap tests). While common in fairness research, such data may not fully capture real‑world linguistic complexity.
  • Lack of Human Evaluation – The project does not include large‑scale human annotation or expert evaluation of feedback quality; results are primarily machine‑ and prompt‑based.
  • Resource Constraints – The system is designed to run on consumer‑grade hardware (4–8 GB VRAM); consequently, model size and inference depth are limited compared to cloud‑based systems.
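
To make the non-determinism point concrete, here is a minimal Python sketch, assuming Ollama's documented REST endpoint (http://localhost:11434/api/generate) and the llama3 model tag. The prompt text and the helper name generate are illustrative, not taken from the project; fixing temperature and seed reduces, but does not eliminate, run-to-run variation.

```python
# Minimal sketch: probing run-to-run variability of a local LLaMA 3 model via Ollama.
# Assumes an Ollama server is running locally (default port 11434) and that
# `ollama pull llama3` has already been executed. The prompt is illustrative only.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
PROMPT = ("Does the following sentence contain demographic bias? Answer briefly.\n"
          "'The nurse said she would help.'")

def generate(prompt: str, temperature: float = 0.0, seed: int = 42) -> str:
    """Call the local Ollama REST API and return the model's text output."""
    payload = {
        "model": "llama3",
        "prompt": prompt,
        "stream": False,
        # Fixing temperature and seed improves (but does not guarantee) repeatability.
        "options": {"temperature": temperature, "seed": seed},
    }
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["response"]

if __name__ == "__main__":
    first = generate(PROMPT)
    second = generate(PROMPT)
    print("Identical across runs:", first == second)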

B. Reproducibility Strategy

Although full determinism is not guaranteed, the project emphasizes procedural reproducibility, meaning another researcher can follow the same steps and reach comparable conclusions.

Reproducibility is ensured through:

  • Open‑source code hosted on GitHub
  • Explicit dependency listing (requirements.txt)
  • Clear directory structure (src/, paper/, results/)
  • Prompt templates embedded directly in the source code (illustrated in the sketch after this list)
  • Local inference via Ollama (no API keys required)
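
As a hedged illustration of what an embedded template might look like, the sketch below defines a module-level template string with a placeholder for the input text. The wording and the names BIAS_DETECTION_TEMPLATE and build_prompt are hypothetical, not the repository's actual prompts.

```python
# Hypothetical shape of an embedded prompt template (the project's real templates
# live in its src/ directory; this wording is illustrative, not verbatim).
BIAS_DETECTION_TEMPLATE = """You are an impartial writing reviewer.
Analyze the student text below for potential bias related to gender, race,
or socioeconomic status. Report observations analytically; do not judge the author.

Text:
{student_text}

Respond with: (1) whether bias indicators are present, (2) the relevant spans,
and (3) a neutral rewording suggestion if applicable."""

def build_prompt(student_text: str) -> str:
    """Fill the embedded template with a concrete input text."""
    return BIAS_DETECTION_TEMPLATE.format(student_text=student_text)
```

Keeping templates in version-controlled source (rather than ad hoc prompts) is what makes the prompt-based results procedurally reproducible.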

To reproduce the experiments:

  1. Install Ollama and download the LLaMA 3 model.
  2. Clone the GitHub repository.
  3. Run the bias detection module on the provided sample texts.
  4. Observe qualitative differences across biased vs. neutral inputs (a minimal driver is sketched below).
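
Steps 3 and 4 can be approximated with a short driver like the following sketch. The sentence pair and the analyze helper are illustrative stand-ins; the repository's actual module interface and sample texts are not reproduced here.

```python
# Sketch of steps 3-4: qualitative comparison of biased vs. swapped inputs
# against a local LLaMA 3 model. Sentence pairs are synthetic stand-ins for
# the repository's provided sample texts.
import requests

def analyze(text: str) -> str:
    """Ask the local Ollama server whether the text shows demographic bias."""
    payload = {
        "model": "llama3",
        "prompt": f"Identify any demographic bias in this sentence, analytically: {text}",
        "stream": False,
        "options": {"temperature": 0.0},
    }
    r = requests.post("http://localhost:11434/api/generate", json=payload, timeout=120)
    r.raise_for_status()
    return r.json()["response"]

PAIRS = [
    # (original sentence, demographically swapped counterpart)
    ("He is a doctor and she is a nurse.",
     "She is a doctor and he is a nurse."),
]

for original, swapped in PAIRS:
    print("original:", analyze(original))
    print("swapped: ", analyze(swapped))
```

If the model's assessment changes materially between the two versions, that asymmetry is itself evidence of the kind of bias the demographic swap test is designed to surface.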

C. Research Ethics and Safety Considerations

Bias analysis inherently involves sensitive topics such as gender, race, and socioeconomic status. To mitigate harm:

  • No personal data is used.
  • All test sentences are synthetic or anonymized.
  • Outputs are framed as analytical observations, not judgments.
  • The system avoids reinforcing stereotypes by explicitly labeling detected bias.

These practices align with responsible AI research guidelines.

D. Intended Contributions

  • A fully local, free bias analysis pipeline using modern LLMs.
  • A practical demonstration of fairness‑aware NLP principles.
  • A reproducible template for student‑led AI ethics research.
  • A bridge between theory (bias/fairness) and deployment (local inference).

E. Positioning as a Research Mini‑Project

This work is intentionally framed as a research‑style mini project, not a production system. Its value lies in:

  • Clear research motivation.
  • Explicit assumptions and limitations.
  • Structured experimentation.
  • Ethical awareness.
  • Transparent reporting.

These qualities are central to undergraduate research programs and academic evaluation.

F. Future Work

  • Quantitative benchmarking with labeled bias datasets.
  • Human evaluation studies.
  • Prompt optimization experiments.
  • Cross‑model comparisons.
  • Integration with educational writing tools.

Summary

The project is functional, scientifically reasoned, ethically grounded, and procedurally reproducible: key qualities of credible research.
