Nichebench Released to Benchmark AI Models on Drupal 10/11 Code and Knowledge

Tests LLMs on Drupal-specific code generation and knowledge to guide open-source fine-tuning.

Staff Reporter

Sergiu Nagailic Releases Nichebench: A New Benchmark for Testing LLMs on Drupal 10/11

Sergiu Nagailic, Co-founder and CTO of HumanFace Tech, has released Nichebench, a benchmarking framework designed to evaluate how well large language models (LLMs) understand and generate Drupal 10/11 code. The tool assesses both factual knowledge and practical coding ability, two measures that matter when fine-tuning AI models for Drupal’s evolving ecosystem.

Unlike generic benchmarks, Nichebench focuses on niche, domain-specific tasks. It runs two evaluation tracks: a quiz track that tests knowledge with multiple-choice questions, and a code generation track that requires models to produce Drupal implementation code. Each result is assessed by a GPT-5-based LLM-as-a-Judge setup run through DeepEval, which keeps evaluation scalable and consistent.
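As a rough illustration of how a quiz track can be scored, the sketch below computes accuracy over multiple-choice items. It is not Nichebench's code: the `QuizItem` schema, the `quiz_accuracy` function, and the `always_a` stand-in model are all hypothetical names made up for this example, and the real benchmark relies on an LLM judge rather than a plain string comparison.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class QuizItem:
    """One multiple-choice question (hypothetical schema, not Nichebench's)."""
    question: str
    choices: dict[str, str]  # choice key -> choice text
    answer: str              # key of the correct choice


def quiz_accuracy(items: Sequence[QuizItem],
                  ask_model: Callable[[QuizItem], str]) -> float:
    """Return the fraction of items where the model picks the correct key."""
    if not items:
        raise ValueError("need at least one quiz item")
    correct = sum(
        1 for item in items
        if ask_model(item).strip().upper() == item.answer
    )
    return correct / len(items)


def always_a(item: QuizItem) -> str:
    """Stand-in for a real LLM call (assumption): always answers 'a'."""
    return "a"


items = [
    QuizItem("Which file declares a module's services?",
             {"A": "mymodule.services.yml", "B": "mymodule.info.yml"}, "A"),
    QuizItem("Which file declares a module's metadata?",
             {"A": "mymodule.routing.yml", "B": "mymodule.info.yml"}, "B"),
]

print(quiz_accuracy(items, always_a))  # 0.5
```

Passing the model in as a callable keeps the scoring loop independent of any particular LLM client, so the same function could score several models in turn.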

Initial findings show promising accuracy from some open models on the quiz track, including GPT-OSS-120B (90%), Phi-4 14B (88%), and Qwen3-Coder (86%). The code generation tests, however, revealed a much larger performance gap: GPT-5 achieved 75% accuracy, while the best open model, GPT-OSS-120B, peaked at 40%. Most open models struggled to produce structured output in JSON or YAML, a known weakness of smaller models.
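The structured-output problem is easy to picture with a small check. The sketch below uses only Python's standard `json` module to test whether a raw model response parses as a JSON object; the function name `parses_as_json_object` is hypothetical, and the article does not say how Nichebench itself validates output.

```python
import json


def parses_as_json_object(raw: str) -> bool:
    """Return True if raw is valid JSON whose top level is an object."""
    try:
        return isinstance(json.loads(raw), dict)
    except json.JSONDecodeError:
        # Malformed JSON, or JSON wrapped in extra prose.
        return False


print(parses_as_json_object('{"status": "ok"}'))                    # True
print(parses_as_json_object('Here is the JSON: {"status": "ok"}'))  # False
print(parses_as_json_object('[1, 2]'))                              # False
```

The second case reflects a common failure: the model returns correct data but wraps it in conversational text, so a strict parser rejects the whole response.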

Based on the results, Nagailic is now preparing to fine-tune a Drupal-specific LLM, starting from high-performing open models such as GPT-OSS-20B and Qwen3-Coder-30B-A3B. The goal is an open-weight LLM optimized for modern Drupal practices and capable of supporting real-world development tasks such as module creation, architecture advice, and implementation support.

The benchmark's test cases and datasets are not public, to keep them from contaminating future training data, but contributors may request access via the project’s GitHub repository. A companion explainer video is also available on YouTube.

Reference: Nichebench - Benching AIs vs Drupal 10-11 by Sergiu Nagailic (nikro) (22 September 2025)

Drupal 10

Drupal 11

New release

LLM

Disclosure: This content is produced with the assistance of AI.
