Natural Language Processing

AI made in Italy: here is Minerva, the first family of large language models trained "from scratch" for Italian

The models, based on a vast open source corpus of more than 500 billion words, are intended to meet a wide range of application needs, from natural language understanding to text generation, from machine translation to automated customer support. The project was carried out by the Sapienza NLP (Natural Language Processing) research group within FAIR (Future Artificial Intelligence Research), with PNRR funding and in collaboration with CINECA, which provided the Leonardo supercomputer.

The Sapienza NLP (Natural Language Processing) research group, led by Roberto Navigli, Full Professor at the Antonio Ruberti Department of Computer, Control and Management Engineering of Sapienza University of Rome, today announced the release of the Minerva models, a new family of Large Language Models (LLMs) trained from scratch for the Italian language.

Minerva was developed as part of FAIR (Future Artificial Intelligence Research), the project led by the National Research Council to implement the National Strategy for Artificial Intelligence thanks to PNRR funding, in collaboration with CINECA, which provided the Leonardo supercomputer. The Minerva models are available to the FAIR scientific community in preview as of today. Their most advanced version, which will include the ability to converse with the AI in Italian, will be released to the public in the coming weeks.

Minerva is a clear step forward for AI made in Italy, confirming Italian excellence in the field of generative AI. The project is led by Roberto Navigli, winner of two prestigious ERC grants and Fellow of the Association for Computational Linguistics (ACL), together with two brilliant young researchers, Edoardo Barba and Simone Conia.

"The special feature of the Minerva models is that they have been built and trained from scratch using open access texts, unlike the existing Italian models, which are based on the adaptation of models such as LLaMA and Mistral, whose training data are still unknown," says Roberto Navigli. "Specifically, each Minerva model was trained on a vast set of online and documented Italian and English sources, totalling over 500 billion words, the equivalent of over 5 million novels. Not only does transparency in model training strengthen the confidence of users, the scientific community, public bodies and industry, but it also stimulates continuous improvement and is a first step towards rigorous verification processes to ensure compliance with laws and regulations."

With a range of models that vary in size and computational capacity and rely on billions of parameters, the Minerva project aims to provide transparent foundations for artificial intelligence systems that can be applied in various fields, from natural language understanding to text generation, from machine translation to automated customer service. This flexibility will make Minerva a valuable resource for researchers, companies and developers interested in harnessing the potential of artificial intelligence to improve efficiency and interaction.

"This important result, which is unique in Italy, confirms the scientific excellence of the Department of Computer, Control and Management Engineering (DIAG) of Sapienza University of Rome, particularly in the field of Artificial Intelligence, where we have a large group of researchers of absolute excellence at national and international level," says Tiziana Catarci, Director of DIAG.

Another new element of this project is the involvement of the Sapienza NLP group in the creation of new evaluation benchmarks, ad hoc tools designed to test the ability of large language models to respect and better capture the cultural and linguistic nuances of the Italian language. Furthermore, the project will publish comprehensive technical documentation in order to share the engineering process and scientific results, so that the implementation and training of the models can be replicated.

Further Information

Roberto Navigli
Department of Computer, Control and Management Engineering
navigli@diag.uniroma1.it

Tuesday, 23 April 2024

© Sapienza Università di Roma - Piazzale Aldo Moro 5, 00185 Roma - (+39) 06 49911 - CF 80209930587 PI 02133771002