LLaMA has rapidly risen to prominence in the open-source community over the past month.
Meta's AI research team has built a solid reputation for publishing its models as open source, the most recent being LLaMA, whose weights were made accessible to academics and researchers on request. However, one of those parties leaked the model on GitHub, giving programmers everywhere free access to their first GPT-level LLM.
Since then, the developer community has had a field day with the model: optimising it to run on low-powered devices, extending its functionality, and even building new LLM use cases on top of it. The open-source community is the biggest multiplier for AI research, and developers are the driving force behind it.
When LLaMA was first released, budding LLM enthusiasts discovered that even the 7-billion-parameter version of the model needed more than 16GB of VRAM.
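That figure is easy to sanity-check, and the arithmetic also shows why quantization pays off so handsomely: at 16-bit precision every parameter costs two bytes, so the weights alone come to roughly 13GiB before activations and buffers are counted. A quick back-of-the-envelope calculation:

```python
# Rough memory needed just to hold the 7B model's weights; activations
# and other buffers push the practical requirement higher still.
params = 7e9
for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
    print(f"{name:>5}: {params * bytes_per_param / 2**30:5.1f} GiB")
```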
Methods to shrink the model's memory footprint soon followed. The first optimisation step was a community project called llama.cpp, which rewrote the inference code in C++. Together with a group effort to quantize the weights, this let the model run on a wide variety of hardware; one programmer even got the 7B model producing a token per second on a Google Pixel 5. llama.cpp was later ported to Rust, enabling quicker inference on CPUs, and the community was only just getting started.
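To illustrate what quantizing the weights involves, here is a minimal numpy sketch of blockwise 4-bit quantization in the spirit of llama.cpp's early formats. The real format packs two 4-bit codes per byte and differs in rounding details, so treat this as a conceptual sketch rather than the project's actual scheme:

```python
import numpy as np

def quantize_q4(w, block=32):
    """Blockwise 4-bit quantization: one float scale per block of 32 weights."""
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # map each block onto [-7, 7]
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_q4(q, scale):
    """Recover approximate float weights from codes and per-block scales."""
    return (q * scale).astype(np.float32)

w = np.random.randn(4096 * 32).astype(np.float32)
q, s = quantize_q4(w)
w_hat = dequantize_q4(q, s).reshape(-1)
print("max abs error:", float(np.abs(w - w_hat).max()))
```

Keeping one scale per small block localises the quantization error, which is a large part of why 4-bit weights remain usable in practice.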
Alpaca marked a democratisation of LLMs, bringing LLaMA to the masses. By cutting the fine-tuning cost to a few hundred dollars and open-sourcing the model, Alpaca put the power of LLMs in the hands of developers all over the world, who promptly began adding functionality to it.
The model came from researchers at Stanford University, who built it as a fine-tuned variant of LLaMA 7B. Using more than 50,000 instruction-following examples generated with GPT-3.5, the researchers trained LLaMA to produce results comparable to those of OpenAI's model. Better still, where similar models cost millions of dollars to train, Alpaca's training and inference came to roughly $600.
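Each of those training examples pairs an instruction (and optional input) with a target response, rendered into a fixed prompt template. Here is a minimal sketch of the prompt construction, paraphrasing the template published in the Stanford Alpaca repository; the exact wording may differ slightly:

```python
def build_prompt(example):
    """Render an instruction-following example into an Alpaca-style prompt."""
    if example.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        "### Response:\n"
    )

print(build_prompt({"instruction": "Give three tips for staying healthy.", "input": ""}))
```

During fine-tuning the model learns to continue such prompts with the GPT-3.5-generated response, which is what transfers the instruction-following behaviour so cheaply.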
To lower the entry barrier for LLMs even further, Dalai was introduced: a simple way to get both LLaMA and Alpaca running on any platform with just a command (sketched below). Alpaca's foundation was also used to build a new model dubbed GPT4All, trained on roughly 800,000 GPT-3.5 generations, which strengthened the LLaMA lineage even more. The use cases never stopped coming in.
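For a sense of how short that setup was, here is a sketch of driving Dalai from Python. The npx subcommands mirror the project's README at the time and are normally typed straight into a terminal, so treat the exact arguments as an assumption that may have changed since:

```python
# Hypothetical wrapper around Dalai's npx-based CLI; subcommands
# ("llama install 7B", "serve") follow the project's README at the time.
import subprocess

# Download and set up the 7B LLaMA model.
subprocess.run(["npx", "dalai", "llama", "install", "7B"], check=True)
# Start the local web UI for chatting with the model.
subprocess.run(["npx", "dalai", "serve"], check=True)
```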
Colossal-AI built a ChatGPT alternative by training LLaMA with reinforcement learning from human feedback. The community then put together Llamahub to keep track of all the different ways people can interact with the model. Best of all, everything described here happened within a month of the model's release, demonstrating the open-source community's real power.
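To make the reinforcement-learning step concrete, here is a toy REINFORCE-style sketch of the idea behind RLHF: sample a response from a policy, score it with a reward standing in for human feedback, and push the policy toward higher-reward outputs. This is a schematic illustration, not Colossal-AI's training code; real pipelines use a learned preference model and PPO with a KL penalty to keep the policy close to the original LM:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, hidden = 100, 32

# Toy "language model": predicts the next token from the current one only.
policy = nn.Sequential(nn.Embedding(vocab, hidden), nn.Linear(hidden, vocab))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

def reward(tokens):
    # Stand-in for human feedback: prefer responses with high token ids.
    return sum(tokens) / (len(tokens) * vocab)

for step in range(200):
    tok = torch.tensor([1])          # fixed "prompt" token
    tokens, log_probs = [], []
    for _ in range(8):               # sample an 8-token "response"
        logits = policy(tok)[-1]
        dist = torch.distributions.Categorical(logits=logits)
        sample = dist.sample()
        log_probs.append(dist.log_prob(sample))
        tokens.append(sample.item())
        tok = sample.unsqueeze(0)
    # REINFORCE: scale the log-probability of the response by its reward.
    loss = -reward(tokens) * torch.stack(log_probs).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("mean sampled token id after training:", sum(tokens) / len(tokens))
```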