In May, OpenAI launched the GPT-4o (Omni) model, offering next-level multimodality. During the launch, OpenAI’s CEO, Sam Altman, discussed a new generative pre-trained transformer that is expected to be a game-changer in the AI field: GPT-5.
OpenAI has started training for its latest AI model, which could bring us closer to achieving Artificial General Intelligence (AGI). OpenAI described GPT-5 as a significant advancement with enhanced capabilities and functionalities.
It is astounding how OpenAI has progressed and made remarkable improvements over the years. Explore what capabilities GPT-5 unlocks for us.
The improvements from GPT-1 to GPT-5 are amazing.
OpenAI describes ChatGPT-5 as “a state-of-the-art language model that makes it feel like you are communicating with a person rather than a machine.”
GPT-5 is the latest in OpenAI’s Generative Pre-trained Transformer models, offering major advancements in natural language processing. This model is expected to understand and generate text more like humans, transforming how we interact with machines and automating many language-based tasks.
Just as GPT-4o was a sizable improvement over its predecessor, you can expect a similar leap with GPT-5. GPT-5 has not launched yet, but here are some predictions circulating in the market, based on various trends.
Altman said the upcoming model is far smarter, faster, and better at everything across the board. It is expected to show exceptional performance at every general task. With new features, faster speeds, and multimodal capabilities, GPT-5 is positioned as the next-generation model intended to outrank all available alternatives.
During a podcast with Bill Gates, Sam Altman discussed how multimodality will be a core focus for GPT over the next five years. Multimodality means the model can handle input types beyond text, such as images, speech, and video, and generate output in those forms as well.
From verbal communication with a chatbot to interpreting images and text-to-video interpretation, OpenAI has steadily improved multimodality. GPT-4o also uses a single neural network to process different inputs: audio, vision, and text.
It allows users to use the device’s camera to show ChatGPT an object and say, “I am in a new country, how do you pronounce that?” The new model will produce results incredibly quickly.
We expect to see an extraordinary advancement in GPT-5.
We cannot say that AI cannot reason. With high computation and calculation power, AI models are capable of generating human-like intelligence and interactions. This capability will be enhanced with the upcoming GPT models.
Sam Altman said they will focus on improving reasoning ability. GPT-4o already offers reasoning capability on par with GPT-4 Turbo, scoring a reported 87.2% on the 5-shot MMLU benchmark.
GPT-5, however, is expected to be trained on even more data and to deliver more accurate results with higher-end computation.
A context window refers to how many tokens a model can process in a single go. A bigger context window means the model can absorb more data from the given input and generate more accurate output. Currently, GPT-4o has a context window of 128,000 tokens, which is smaller than the context window of Google’s Gemini model, at up to 1 million tokens.
According to Alan Thompson’s prediction, there could be a whopping 300x increase in tokens. If that holds, it could shift the comparison with the Gemini model and mark a notable advancement.
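To make the context-window comparison concrete, here is a minimal sketch that checks whether a document would fit in each window quoted above. The 4-characters-per-token ratio is an assumed rule of thumb for English text, not an exact tokenizer count, and the function names are hypothetical.

```python
# Rough context-window check.
# Assumption: about 4 characters per token, a common rule of thumb for
# English text. A real tokenizer will give different counts.
CHARS_PER_TOKEN = 4

# Context window sizes quoted in the article, in tokens.
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "Gemini (up to)": 1_000_000,
}


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` from its character length."""
    # Ceiling division so a partial token still counts as one token.
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, window: int) -> bool:
    """Return True if the estimated token count fits in `window` tokens."""
    return estimate_tokens(text) <= window


if __name__ == "__main__":
    document = "word " * 200_000  # 1,000,000 characters of filler text
    print("estimated tokens:", estimate_tokens(document))
    for model, window in CONTEXT_WINDOWS.items():
        print(model, "fits" if fits(document, window) else "does not fit")
```

Under this assumption, a one-million-character document comes to roughly 250,000 tokens, which overflows a 128,000-token window but fits comfortably in a 1-million-token one.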
Altman said they will improve customization and personalization of GPT for every user. Currently, ChatGPT Plus and other premium users can build and use custom GPTs, personalizing a GPT for a specific task, from teaching a board game to helping kids complete their homework.
In later iterations, the model could draw on a user’s personal data, such as email and calendar, to book appointments and handle similar tasks. Customization is not at the forefront of the next update, GPT-5, but you will still see significant changes.
From GPT-1 to GPT-4, the number of parameters in each model has risen, and GPT-5 is expected to be no exception. The number of parameters affects how well the model can learn from data. OpenAI hasn’t revealed the exact number of parameters for GPT-5, but it’s estimated to have about 1.5 trillion. This is a huge jump from GPT-3’s 175 billion and GPT-2’s 1.5 billion.
AI expert Alan Thompson, who advises Google and Microsoft, thinks GPT-5 might have 2-5 trillion parameters. He bases this on the increase in computing power and training time since GPT-4.
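A quick calculation puts these parameter counts in perspective. The sketch below uses only the figures quoted above (the GPT-5 numbers are estimates, not confirmed values) and assumes 2 bytes per parameter, as with 16-bit weights, to gauge raw storage; that storage figure is an illustration, not a published specification.

```python
# Parameter counts quoted in the article. GPT-5 figures are estimates.
PARAMS = {
    "GPT-2": 1.5e9,
    "GPT-3": 175e9,
    "GPT-5 (estimate)": 1.5e12,
    "GPT-5 (Thompson, low)": 2e12,
    "GPT-5 (Thompson, high)": 5e12,
}

# Assumption: each parameter is stored as a 16-bit (2-byte) number.
BYTES_PER_PARAM = 2


def growth_factor(old: float, new: float) -> float:
    """How many times larger `new` is than `old`."""
    return new / old


def weight_storage_tb(params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Raw storage for the weights alone, in terabytes (10**12 bytes)."""
    return params * bytes_per_param / 1e12


if __name__ == "__main__":
    print("GPT-2 -> GPT-3:", round(growth_factor(PARAMS["GPT-2"], PARAMS["GPT-3"]), 1), "x")
    print("GPT-3 -> GPT-5 (estimate):",
          round(growth_factor(PARAMS["GPT-3"], PARAMS["GPT-5 (estimate)"]), 1), "x")
    for name, count in PARAMS.items():
        print(name, "needs about", weight_storage_tb(count), "TB for weights")
```

On these numbers, GPT-3 is roughly 117 times the size of GPT-2, while the 1.5 trillion estimate would be a further jump of about 8.6 times over GPT-3.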
Improving reliability is another focus for GPT over the next two years, so you can expect more reliable outputs from the GPT-5 model.
In the case of GPT-4, Altman says, “If you ask GPT-4 the same question 10,000 times, one answer will probably be good, but it can’t always pick the best one. You’d want it to give the best response every time, so making it more reliable is important.”
In GPT-4o, reliability improved, reducing AI hallucinations. So, we are hoping GPT-5 will be even more reliable and stable.
GPT-5 is expected to reach the market in 2024. Some suggest that the release is being delayed because of the upcoming U.S. election, with a release date closer to November or December 2024.
Considering the training period of around 4-6 months (double GPT-4’s training time), the new model will also need time for reinforcement learning, red teaming, and further testing before being released.
If OpenAI keeps its usual pricing, using GPT-5 will be expensive. ChatGPT with GPT-4 costs $20/month, while GPT-3.5 is free.
For the API, GPT-4 costs $30 per million input tokens and $60 per million output tokens (double for the 32k version). If GPT-5 is as powerful as expected, it will cost more than previous models.
However, the latest OpenAI model, GPT-4o, is much cheaper. It costs only $5 per million input tokens and $15 per million output tokens. While pricing isn’t a big issue for large companies, this move makes it more accessible for individuals and small businesses.
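The per-token prices above translate into per-request costs as follows. This is a minimal sketch using only the prices quoted in this article; the function name and the example request size are hypothetical, and actual prices may have changed since.

```python
# API prices quoted in the article, in US dollars per million tokens:
# (input price, output price). The 32k figures follow the article's
# "double for the 32k version" note.
PRICES_PER_MILLION = {
    "GPT-4": (30.0, 60.0),
    "GPT-4 32k": (60.0, 120.0),
    "GPT-4o": (5.0, 15.0),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of one request with the given token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 2,000 output tokens.
    for model in PRICES_PER_MILLION:
        print(model, "costs", round(request_cost(model, 10_000, 2_000), 4), "dollars")
```

For that example request, GPT-4 comes to about $0.42 and GPT-4o to about $0.08, which shows how much cheaper the newer model is at the quoted rates.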
As Altman said, we have only scratched the surface of AI, and this is just the beginning. There is much more to explore and improve in AI capabilities. GPT-5 is one more step forward in the AI race.
AGI: The Ultimate Goal
Artificial General Intelligence (AGI) refers to AI that understands, learns, and performs tasks at a human-like level without extensive supervision. AGI has the potential to handle simple tasks, like ordering food online, as well as complex problem-solving requiring strategic planning. OpenAI’s dedication to AGI suggests a future where AI can independently manage tasks and make significant decisions based on user-defined goals.
GPT-5 is an upcoming OpenAI model with the following expected key features:
GPT-5 is expected to be more multimodal than GPT-4, allowing you to provide input beyond text and generate output in various formats, including text, image, video, and audio.
GPT-5 is estimated to be trained on millions of datasets, more than GPT-4, and to have a larger context window. This means the GPT-5 model can draw on more relevant information from its training data to provide more accurate and human-like results in one go.
You can use GPT-5 for content creation, customer service chatbots, language translation, code generation, and more.
It will take time to enter the market, but GPT-5 is expected to be accessible to everyone through OpenAI’s API, and developers will be able to integrate its capabilities into their applications. However, it might come with usage limits and subscription plans for more extensive usage.