GPT-3 is taking the internet by storm, and rightly so. What makes it fascinating is that the same algorithm can perform a wide range of tasks: it can write an email or blog post on a given topic, write code for you, answer factual questions, chit-chat with you, and even play chess with you. It truly is the closest we have ever been to Artificial General Intelligence (AGI).
The way GPT-3 works is that, given just a few sentences, it identifies the context and pattern and generates relevant text for the application. While there are a lot of exciting things that can be done with GPT-3, its practical applications have been hotly debated. Using generative algorithms such as GPT-3 in real-life applications requires a mindset change, and a necessary one if we are to leverage AI to its full potential.
We have to understand that these algorithms are not perfect and that one has to consider both the risks and the benefits of using such algorithms. Striking the right balance between the two is what will enable us to derive exponential gains from AI.
Let us first dive a little deeper into precisely what GPT-3 is. It is a neural network that, given an input sequence of words, is trained to predict the next word. A simple example: given the prompt "The cat sat on the", it would most likely continue with "mat".
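To make the prediction task concrete, here is a toy sketch of next-word prediction. This is not how GPT-3 works internally (GPT-3 is a 175-billion-parameter transformer, not a bigram counter); the corpus and code below are purely illustrative of the task itself: given the words so far, predict the most likely next word.

```python
from collections import Counter, defaultdict

# Toy corpus, hand-written for illustration only.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count which word follows each word in the corpus.
next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def predict_next(word):
    """Return the word most frequently observed after `word`."""
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next("sat"))  # prints "on" — "sat" is always followed by "on"
```

GPT-3 does the same thing in spirit, but conditions on the entire preceding context rather than a single word, and learns its statistics from hundreds of billions of tokens instead of three sentences.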
To build this neural network, OpenAI trained it on 45TB of text (300 billion tokens) drawn from many kinds of data available on the internet. It is a language model trained on facts from the internet, news, social media conversations, code, and much more diverse data, which it uses to generate text. Because of the data it is trained on, it is able to generate text that is grammatically correct and coherent in context. GPT-3 has 175 billion parameters and is leaps ahead of its predecessor, GPT-2, and of any other open model out there.
Given the above context, there are a lot of use cases for GPT-3. We ourselves have used it to generate an email sent to the whole company, write a blog post, write code, and much more. You can find a list of creative examples collated here.
Apart from the algorithm itself being great, one big reason I think it is so progressive is that it is offered as an API. Running a neural network like this is no easy feat: it requires very expensive infrastructure to keep it running, and using it at scale is a different challenge altogether. These kinds of resources are not available to many companies, so being able to use the algorithm through an API lets everyone focus on building applications on top of it, without having to worry about other considerations.
In addition, the input and output pattern is abstracted so that it is not restricted to any particular application. This lets people be creative in their thought process and utilise it for multiple applications beyond the obvious ones.
Given the way GPT-3 has been introduced to the world, even a non-engineer can start playing around with it and thinking of use cases to solve with GPT-3.
All this is great, but not all is perfect with the algorithm. It has its own setbacks and flaws: it can go completely off-topic at times, and it can even get offensive. These drawbacks may make people apprehensive about using the model in production, which leads them to ignore the benefits that come with it.
As we discussed earlier, there is a need to balance the risks and benefits of using generative models. Keeping the flaws in mind, we can mitigate the risks and reap the benefits of the algorithm in the following ways:
- Use it for use cases where the risks don't have major repercussions: Instead of using it in an end-user application, use it for internal use cases that boost productivity. That way you contain the risk but still reap the benefits. At Haptik, we have already been using GPT-2 to augment training data that previously had to be written manually; now it is generated for the team and they only have to validate it, a much easier activity. We can't begin to imagine how much better this will get with GPT-3!
- Build components to control the content: Content can be controlled by architecting additional ML models on top of GPT-3 that flag incorrect or inappropriate output, preventing it from going out when it shouldn't. These components can err on the side of caution and be stricter, because their role is to stop bad content even at the risk of blocking some good content. This may lead to more false positives, but it provides the reassurance that whatever does go out is safe. The generative model stays creative, while the architecture on top keeps things from going off track.
- Adapt it to domain-specific data: OpenAI does provide access to training APIs (on request) that allow you to adapt GPT-3 to a particular task or domain and make it more relevant for the task at hand.
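The first mitigation above, augmenting training data and having a human validate it, can be sketched as a simple generate-then-review loop. Everything below is hypothetical: `generate_variants` stands in for a real GPT-2/GPT-3 call (here it just fills hand-written templates), and the `approve` predicate stands in for the human reviewer.

```python
import random

# Hand-written templates as a stand-in for model-generated paraphrases.
TEMPLATES = [
    "I want to {verb} my order",
    "how do I {verb} my order",
    "please help me {verb} my order",
]

def generate_variants(verb, n=2):
    """Hypothetical generator: in production this would prompt GPT-2/GPT-3."""
    return [t.format(verb=verb) for t in random.sample(TEMPLATES, n)]

def validate(candidates, approve):
    """A human reviewer (modelled here as a predicate) keeps only good candidates."""
    return [c for c in candidates if approve(c)]

candidates = generate_variants("cancel", n=3)
approved = validate(candidates, approve=lambda c: "order" in c)
print(approved)
```

The point of the design is the division of labour: the model does the expensive creative work of producing candidates, and the human's job shrinks to a quick accept/reject pass.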
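The second mitigation, a stricter filter layered over the generator, can be sketched as follows. This is an assumption-laden illustration, not any real GPT-3 moderation API: the blocklist, the retry count, and the stand-in generator are all made up, and a production component would use a trained classifier rather than keyword matching.

```python
# Illustrative blocklist; a real filter would be a trained ML classifier.
BLOCKED_TERMS = {"offensive_word", "slur_placeholder"}

def is_safe(text):
    """Err on the side of caution: reject text containing any blocked term."""
    return set(text.lower().split()).isdisjoint(BLOCKED_TERMS)

def guarded_generate(generate, prompt, retries=3):
    """Call a generator, but only release output that passes the filter."""
    for _ in range(retries):
        candidate = generate(prompt)
        if is_safe(candidate):
            return candidate
    return None  # fall back rather than ship unsafe content

# Demo with a stand-in generator instead of a real GPT-3 call.
fake_outputs = iter(["this contains offensive_word", "a perfectly fine reply"])
result = guarded_generate(lambda p: next(fake_outputs), "hello")
print(result)  # "a perfectly fine reply"
```

Note how the filter sits outside the generator: the creative model is untouched, and the strictness lives entirely in the wrapper, which is exactly the trade-off described above (more false positives, but safe output).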
Overall, OpenAI's GPT-3 is a massive step forward for the AI space, specifically for natural language generation, and it is going to help multiple industries exponentially. Choosing the right application and combining the model with other components can really help leverage it to a great extent, and I am super excited to see everything that people will build on top of it.