With OpenChatKit, developers can easily build their own custom chatbot applications, fine-tune the model, retain dialogue context, moderate responses, and more.
Open source and closed source have been at odds for quite some time. After OpenAI released GPT-3 as a closed-source model, EleutherAI launched GPT-Neo, an open-source alternative that has produced comparable results. In a similar vein, Stability AI developed Stable Diffusion, an open-source counterpart to DALL-E 2 at the time of its launch.
We are all aware of ChatGPT and of how eager users are for an open-source version of the concept, so they can build secure applications with more freedom. With ChatGPT you rely on OpenAI's service and machines to carry out every operation, and at the moment the company only offers API access and the option to fine-tune.
Together Computer launched OpenChatKit, an open-source alternative to ChatGPT, on March 10th, 2023. Developers who choose an open-source alternative can modify the chatbot's behaviour to better suit their needs. It is also accessible to a wider spectrum of users and groups, especially those who might not have the means to use proprietary models.
OpenChatKit is a powerful open-source suite of tools for building both general-purpose and specialised chatbot applications.
The model is in its first iteration, and its creators have released a set of tools and processes so it can be improved with community input. Together Computer has released OpenChatKit 0.15 with source code, model weights, and training datasets under an Apache-2.0 licence.
You can test out the OpenChatKit-based model demo on Hugging Face. Similar to ChatGPT, the model responds to your written prompt with an answer, a code block, a table, or plain text.
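If you would rather run the model yourself than use the hosted demo, the snippet below is a minimal sketch using the Hugging Face transformers library. It assumes a GPU with enough memory for the 20B-parameter model (roughly 40 GB in fp16); the `<human>:`/`<bot>:` prompt markers follow the model card on Hugging Face.

```python
# Minimal sketch: querying GPT-NeoXT-Chat-Base-20B locally with transformers.
# Assumes a GPU with enough memory for the 20B model (~40 GB in fp16).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "togethercomputer/GPT-NeoXT-Chat-Base-20B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

# The model card delimits turns with "<human>:" and "<bot>:" markers.
prompt = "<human>: What is a neural network?\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
# Decode only the newly generated tokens, not the echoed prompt.
reply = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```

Sampling with `temperature` and `top_p` rather than greedy decoding is a deliberate choice here, since greedy decoding tends to amplify the repetition issue noted later in this article.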
OpenChatKit ships with the base bot and the building blocks needed to construct customised chatbot applications.
The kit has four parts:
- An instruction-tuned large language model, fine-tuned for conversation from EleutherAI's GPT-NeoX-20B.
- Customisation recipes for fine-tuning the model so that it performs specific tasks with high accuracy.
- An extensible retrieval system for augmenting the bot's answers with information from news feeds, sports scores, or Wikipedia.
- A moderation model, fine-tuned from GPT-JT-6B, that filters which questions the bot responds to (see the sketch after this list).
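To make the moderation piece concrete, here is an illustrative sketch of how a moderation model can gate user input before it reaches the chat model. The model ID comes from the OpenChatKit release, but the label set and few-shot prompt layout below are simplified assumptions; consult the OpenChatKit repository for the exact format.

```python
# Illustrative sketch: gating user input through the OpenChatKit moderation
# model before it reaches the chat model. The label set and prompt layout are
# simplified assumptions; check the OpenChatKit repo for the exact format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODERATION_ID = "togethercomputer/GPT-JT-Moderation-6B"

tokenizer = AutoTokenizer.from_pretrained(MODERATION_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODERATION_ID, torch_dtype=torch.float16, device_map="auto"
)

def moderation_label(user_input: str) -> str:
    """Classify an input with the moderation model (assumed prompt layout)."""
    prompt = (
        "Possible labels:\n"
        "1. casual\n"
        "2. needs caution\n"
        "3. needs intervention\n\n"
        f"Input: {user_input}\n"
        "Output:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=5, do_sample=False)
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ).strip()

user_input = "How do I bake sourdough bread?"
if moderation_label(user_input) == "casual":
    pass  # safe to forward the input to the chat model
else:
    print("This input was flagged by the moderation filter.")
```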
The GPT-NeoXT-Chat-Base-20B language model serves as the foundation for OpenChatKit. It was created by fine-tuning EleutherAI's GPT-NeoX model on over 43 million high-quality conversational instructions, with special attention to tasks such as multi-turn discussion, question answering, classification, extraction, and summarisation.
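Because multi-turn discussion is one of the tuned tasks, the running conversation has to be serialised into a single prompt on every turn. The helper below is a small sketch of that formatting, using the `<human>:`/`<bot>:` markers from the model card; the function itself is a hypothetical convenience, not part of OpenChatKit.

```python
# Sketch of multi-turn prompt formatting for GPT-NeoXT-Chat-Base-20B.
# The "<human>:" / "<bot>:" markers follow the model card; the helper itself
# is a hypothetical convenience, not part of OpenChatKit.
def build_chat_prompt(history, user_message):
    """Serialise (user, bot) turn pairs plus the new user message into one prompt."""
    parts = []
    for user_turn, bot_turn in history:
        parts.append(f"<human>: {user_turn}")
        parts.append(f"<bot>: {bot_turn}")
    parts.append(f"<human>: {user_message}")
    parts.append("<bot>:")  # the model completes the text from here
    return "\n".join(parts)

history = [("What is Python?", "Python is a general-purpose programming language.")]
print(build_chat_prompt(history, "What is it commonly used for?"))
```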
Out of the box, the model offers a solid foundation: it scores better on the HELM benchmark than its base model GPT-NeoX, and GPT-NeoXT-Chat-Base-20B handles question-answering, extraction, and classification tasks well.
Since this is the model's first iteration, there are still errors, bugs, and unsuitable responses. Below are a few areas where the model struggles.
- Knowledge-based: The chatbot can return factually inaccurate answers, a problem it shares with ChatGPT. The team is developing a retrieval system to ground answers in accurate data.
- Code-based: The model was trained on too little source code to produce correct code reliably.
- Context switching: If you change topics mid-conversation, the chatbot will not follow the switch automatically and keeps answering in terms of the earlier subject.
- Repetition: The chatbot occasionally repeats an answer or freezes; refreshing the page resets it.
- Creative answers: Unlike ChatGPT, the chatbot does not produce essays or original stories; it only gives brief responses.
With the assistance of the community, a better chatbot built on OpenChatKit should be available shortly. OpenChatKit is still in its infancy and was trained on a less diverse dataset, so don't expect it to respond like ChatGPT or to deliver consistently great answers.