Microsoft wants more European languages for AI training

Microsoft, through its president and vice chairman Brad Smith, has announced a strengthened commitment to European linguistic and cultural diversity in the digital age. The company recognizes that the predominance of English in large language models (LLMs), the technological basis of generative AI, could leave the continent's languages and cultures behind, with significant cultural and commercial implications.
“Although only 5% of the world’s population speaks English as a first language,” says Smith, “English texts make up half of all web content and dominate the data used to train AI models.”
“An AI that does not understand Europe’s languages, history and values cannot fully serve its citizens, businesses and future.”
To address this imbalance, Microsoft has launched two new initiatives. The first promotes the development of multilingual LLMs through Europe-based teams that will work with European partners, including the University of Strasbourg, to expand the availability of data in multiple languages. The second expands Microsoft's Culture AI program, which aims to safeguard cultural heritage through digital replicas and data collaboration. This fall, in collaboration with the French Ministry of Culture, Microsoft will begin using artificial intelligence technologies to create a digital replica of Notre Dame in Paris, following that of St. Peter's Basilica in 2024.
Other projects include the digitization of nearly 1,500 stage sets from the Opéra National de Paris and the public availability of detailed descriptions of approximately 1.5 million artifacts from the Musée des Arts Décoratifs.
ANSA