BigScience’s AI Language Model Is Now Available

After more than a year of planning and training, a volunteer-led project has produced an open source language model that its organisers claim is as powerful as OpenAI’s GPT-3, and that is free for anyone to use, provided they have the computing power. The model, dubbed Bloom, is open source, as are the code and datasets used to create it. Hugging Face, a Brooklyn-based AI startup, has also released a free web app that lets anyone try Bloom without having to download it.
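For those who would rather run the model themselves, Bloom’s checkpoints are published on the Hugging Face Hub. Below is a minimal sketch using the `transformers` library; since the full model is far too large for a typical machine, it loads `bigscience/bloom-560m`, one of the smaller published checkpoints, purely for illustration:

```python
# A minimal sketch of querying Bloom locally with the Hugging Face
# `transformers` library. The full Bloom model needs far more memory
# than a typical workstation, so this uses "bigscience/bloom-560m",
# one of the smaller published checkpoints.
from transformers import pipeline

generator = pipeline("text-generation", model="bigscience/bloom-560m")

prompt = "The history of machine translation began"
result = generator(prompt, max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```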

Bloom is the creation of BigScience, an international, community-driven project aimed at making large natural language models widely available for research. Large language models, or “LLMs”, can (more or less) translate, summarise, and write text with humanlike nuance. (See also GPT-3.) However, they have historically been expensive to develop, keeping them out of reach of researchers and firmly in the hands of Big Tech companies like Meta, Google, and Microsoft.

That is finally changing, thanks in part to BigScience’s efforts. The group’s more than 1,000 volunteer researchers, aided by ethicists, philosophers, legal scholars, and engineers from startups and large tech companies alike, worked for months on Bloom, which competes with large-scale LLMs developed by firms such as OpenAI and Alphabet’s DeepMind. One of the largest open source models that work across multiple languages, Bloom is intended for a variety of research applications, including extracting information from historical texts.

BigScience supporters also hope that Bloom will spark new research into ways to combat the issues that plague all LLMs, such as bias and toxicity. LLMs have a tendency to spread misinformation and harbour prejudices against religions, genders, races, and people with disabilities. They also struggle with basic writing principles, frequently changing the subject of a conversation without a segue and endlessly repeating, or even contradicting, themselves. Bloom is the culmination of BigScience’s efforts: it was trained using $7 million worth of publicly funded compute time, provided through grants, on the Jean Zay supercomputer near Paris, France, one of the world’s most powerful machines.
