ChatGPT was released in November 2022, and the world has gone wild for it ever since. People are using it for all sorts of purposes: answering questions on Twitter, getting suggestions for better code, cracking online interviews (interviewers, be careful), and, most of all, having fun being amazed by its capabilities.
But what is ChatGPT? How does it work? Is it really as good and helpful as people say? Is it open source? Can I integrate it with my applications? This blog post aims to answer all of these questions. Please do add comments with your thoughts and inputs.
What is ChatGPT?
ChatGPT (Generative Pre-trained Transformer) is a text-based large language model developed by OpenAI.
ChatGPT was fine-tuned on top of GPT-3.5 using both supervised learning and Reinforcement Learning from Human Feedback (RLHF).
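The reinforcement-learning step can be illustrated with a deliberately tiny, hypothetical sketch: a "policy" holds logits over two canned responses, a stand-in reward model assigns fixed scores, and a single policy-gradient update nudges the policy toward the higher-reward response. The responses, scores, and learning rate here are all invented for illustration; real RLHF operates on token sequences with a learned reward model and PPO.

```python
import math

# Two canned candidate responses for one prompt (illustrative only).
responses = ["helpful answer", "rude answer"]

# Stand-in "reward model": fixed scores (higher = more human-preferred).
reward = {"helpful answer": 1.0, "rude answer": -1.0}

# Policy: logits over the candidate responses, starting indifferent.
logits = {r: 0.0 for r in responses}

def softmax(d):
    m = max(d.values())
    exps = {k: math.exp(v - m) for k, v in d.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}

# One REINFORCE-style update: the exact gradient of expected reward
# w.r.t. each logit under a softmax policy is p_k * (r_k - E[r]).
lr = 0.5
probs = softmax(logits)
baseline = sum(probs[x] * reward[x] for x in responses)
for r in responses:
    logits[r] += lr * probs[r] * (reward[r] - baseline)

probs = softmax(logits)  # policy now favours the higher-reward response
```

After one update the policy already assigns more probability to the human-preferred response; repeated over millions of prompts (and constrained so the model does not drift too far from its supervised starting point), this is the intuition behind the RLHF stage.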
How does the user converse with ChatGPT?
ChatGPT interprets the intent of the user's query and generates an appropriate response based on patterns learned from the data it was trained on.
How was ChatGPT trained?
The underlying OpenAI GPT-3 model was trained on roughly 45 TB of text data drawn from multiple sources, including Wikipedia, books, and much more.
Pros and Cons of ChatGPT
ChatGPT can generate human-like responses to a wide variety of inputs.
The responses generated by ChatGPT are often quite engaging and informative.
It also has the ability to learn and adapt to new contexts. Unlike many other text-based models, ChatGPT can retain information from earlier in a conversation and use it to generate more relevant and personalised responses.
The answers generated by ChatGPT vary from question to question, even when the context remains the same. The answers are sometimes inaccurate or nonsensical; this tends to happen when we ask a question unrelated to the previous one.
ChatGPT requires a significant amount of computational power to run. If new data about products or other information needs to be added, the model has to be fine-tuned again, which is computationally heavy.
ChatGPT is neither free nor open source. It has been reported that Microsoft will acquire a 49% stake in OpenAI, partly to power Bing search, so the cost of using ChatGPT could become substantial in future.
The privacy policy also raises concerns. For example:
“In certain circumstances we may share your Personal Information with third parties without further notice to you”
“Usage data: We may automatically collect information about your use of the Services, such as the types of content that you view or engage with, the features you use and the actions you take, as well as your time zone, country, the dates and times of access, user agent and version, type of computer or mobile device, computer connection, IP address, and the like.”
So ChatGPT may not be a good fit for enterprise solutions where data privacy matters.
What can we do to have performance like ChatGPT?
ChatGPT is fine-tuned from a model in the GPT-3.5 series, and OpenAI has not published many details about GPT-3.5. But we do know BLOOM (BigScience Large Open-science Open-access Multilingual Language Model), an open model with an architecture similar to GPT-3's.
To develop a model similar to ChatGPT, we can start from BLOOM and fine-tune it on a custom dataset prepared with human annotations. The annotators should come from diverse demographic groups and be good at identifying potentially harmful outputs when labelling the dataset.
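One common way to structure such annotations is to have annotators rank several model outputs for the same prompt, then expand each ranking into pairwise (chosen, rejected) records for reward-model training. The sketch below is a minimal illustration; the prompt, responses, and record format are invented for this example.

```python
from itertools import combinations

def ranking_to_comparisons(prompt, ranked_responses):
    """Turn one annotator ranking (best response first) into
    pairwise (chosen, rejected) records for reward-model training.
    A ranking of K responses yields K*(K-1)/2 comparison pairs."""
    records = []
    for better, worse in combinations(ranked_responses, 2):
        records.append({"prompt": prompt, "chosen": better, "rejected": worse})
    return records

# Hypothetical example: three responses ranked best-first by an annotator.
pairs = ranking_to_comparisons(
    "How do I reset my password?",
    ["Click 'Forgot password' on the login page.",  # ranked best
     "Contact support.",
     "I don't know."],                              # ranked worst
)
```

Collecting rankings rather than absolute scores is attractive because annotators tend to agree more on which of two outputs is better than on how good a single output is.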
In theory we can use any T5 model, as long as it has learnt rich features from data in the upstream task. This is called the pre-training stage. After this, we fine-tune the model on a specific task, which in this case is the generation of human-like responses.
For fine-tuning (the downstream task), we remove the model's final embedding layer, add additional layers on top, and train the result as a custom reward model (RM).
The creators of InstructGPT (a sibling model of ChatGPT) used a 6B-parameter reward model. The reward model is trained on a dataset of comparisons between two model outputs on the same input, using a cross-entropy loss function. This teaches the model to use its existing features to score human-preferred responses more highly.
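The comparison loss can be sketched in a few lines: for a single pair, the loss is the negative log of the sigmoid of the score margin between the chosen and rejected outputs, so the loss shrinks as the reward model scores the preferred output higher. The scores below come from a purely hypothetical reward model; only the loss formula reflects the described training setup.

```python
import math

def pairwise_loss(r_chosen, r_rejected):
    """Cross-entropy loss on one comparison:
    -log(sigmoid(r_chosen - r_rejected)).
    Small when the chosen output is scored well above the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# Hypothetical reward-model scores for two outputs on the same input.
good_margin = pairwise_loss(2.0, -1.0)   # chosen scored much higher
bad_margin = pairwise_loss(-1.0, 2.0)    # chosen scored lower
```

Training on many such pairs pushes the reward model to assign a consistent scalar "human preference" score, which the RLHF stage then optimises against.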