About KenyaNLP
Every Kenyan language deserves world-class AI tools.
We aim to enable inclusive, locally relevant, and globally impactful AI that represents the linguistic and cultural diversity of Kenya.
Our story
KenyaNLP is a pilot initiative under the Masakhane umbrella, uniting Kenya's language efforts through open research, transparent evaluation, and inclusive cross-sector collaboration.
Our mission is straightforward: conduct impactful research for Kenyan languages, advance the technologies that serve them, and ensure that every language, from widely spoken Swahili and Kikuyu to endangered languages like Dahalo and Ogiek, has a place in the AI future.
Founded in 2025, we have grown rapidly into a collaboration of more than 23 contributors spanning universities, industry labs, and the broader Masakhane community.
Research
Systematic benchmarking, model evaluation, and original research publications targeting Kenyan language NLP tasks.
Community
Knowledge-sharing sessions, reading groups, mentorship programs, and collaborative projects open to all contributors.
Data & infrastructure
Building, curating, and open-sourcing datasets and evaluation pipelines for Kenyan languages.
Language revival
Collecting speech and text data for endangered and under-resourced Kenyan languages such as Kipsigis, Nandi, Ogiek, and Dahalo.
Partnerships
Collaborating with Masakhane, Guild Code, the University of Nairobi, Maseno University, Microsoft Research, KenCorpus, and Mozilla Common Voice.
Journey so far
KenyaNLP founded under the Masakhane umbrella
Grew to 23+ contributors across 6 institutions
KenBench Phase 1 launched: 10 languages, 15+ datasets
Synthesis workshop & AfricaNLP publication (target)