The lack of readily available data is a major obstacle to developing AI models in Africa’s numerous languages. While over 42 languages are currently supported, many more exist with virtually no written resources, hindering the training of large language models (LLMs) like ChatGPT. The African Next Voices project, collecting 9,000 hours of audio recordings from 18 languages across South Africa, Kenya, and Nigeria, offers a solution by creating focused datasets, particularly in areas like health and agriculture. Researchers prioritize accuracy within specialized models, acknowledging potential errors in broader applications. Crucially, understanding cultural context and nuances – including the multiple meanings of symbols and the importance of recognizing slang – is vital to avoid biased or inaccurate AI systems. Furthermore, the absence of codified dictionaries and data centers exacerbates the problem, potentially leading to the “disappearance” of smaller languages if AI development doesn't adapt to their unique challenges.