Research projects

What we're working on

KenyaNLP runs community research projects focused on advancing language technology for Kenyan languages. Every project is open for contributors.

Active - Phase 1

KenBench

The first comprehensive evaluation suite for Kenyan language NLP - benchmarking models and datasets across 10 languages with a public leaderboard.

45% complete Updated June 2025
Q1/Q2 2026 · 23 contributors · 10 languages

Planning

Language revival project

Collect speech and text data for forgotten Kenyan languages - Ogiek, Kipsigis, Nandi, Kisii - then evaluate and train small models.

10% complete Updated May 2025
Q3 2026 · 4+ languages

Upcoming

Small LM / ASR for Kenyan languages

Training and evaluating small language models and ASR systems purpose-built for Kenyan languages and low-resource constraints.

5% complete Updated May 2025
Q3–Q4 2026 · Research scoping