GPT-5 Is A Freak

Good morning,

If you’ve been following “tech news” long enough, you know that users HATE change. New interface on Facebook: revolt. Twitter changes its name to X: revolt.

So when Sam Altman and OpenAI launched GPT-5 last week, the uproar was even louder, because along with the launch they deprecated all other models.

That means everyone with long-running chats and projects is now discovering that those don’t really work anymore. Thousands upon thousands of people who think of ChatGPT as their therapist or their girlfriend have found that their ‘significant others’ are basically gone, replaced by someone else.

Reddit was on fire with complaints.

So much so that Altman scurried back and promised that if you’re on the higher subscription tier, you can still use GPT-4o.

Also, people expected AGI. There was so much hype that they expected this to be the final frontier of AI. But what we got is an upgrade. A very good upgrade, yes, but not the big leap everybody expected.

A lot of negativity surrounds this launch. We’re not going to pile on. Instead, let’s look at what this new model actually CAN do.

The headlines called it the world’s most capable model. The benchmarks nodded in agreement: slightly better than April’s o3, which was slightly better than last year’s o1.

At first sight, it’s not the leap the hype machine promised, but the curve is still climbing fast. The length of task a model can handle keeps doubling: two hours of complex human work today could be a full eight-hour shift inside two years if that trend holds.

This is a chart from the Model Evaluation and Threat Research institute (METR).

Each dot is an AI model. The left axis shows how long a set of coding tasks takes a human (seconds → minutes → hours), and the bottom axis is the model’s release date.
A model’s height marks the toughest software tasks it can solve about 50% of the time (higher means tasks that take humans longer), while the bars show uncertainty.

The near-straight dashed line on a log scale means exponential progress: newer models (e.g., GPT-5) are reaching tasks that take people tens of minutes to hours, though this reflects one benchmark, not all coding work.
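
For readers who want to check the arithmetic behind the two-hours-to-eight-hours projection, here is a minimal Python sketch. It assumes the task horizon doubles at a fixed interval. The doubling times tried below (4, 7 and 12 months) are illustrative assumptions, not figures from this article, and the function name is made up for the example.

```python
import math


def months_to_reach(start_hours: float, target_hours: float, doubling_months: float) -> float:
    """Months needed for the task horizon to grow from start_hours to
    target_hours, assuming it doubles every doubling_months months."""
    doublings = math.log2(target_hours / start_hours)
    return doublings * doubling_months


# Assumed doubling times, for illustration only.
for doubling_months in (4, 7, 12):
    months = months_to_reach(2, 8, doubling_months)
    print(f"{doubling_months}-month doubling: {months:.0f} months to go from 2h to 8h")
```

Going from two hours to eight hours is two doublings, so under this simple model the “inside two years” claim holds whenever the doubling time is 12 months or less.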

Around 8 hours, a full human workday, is where AGI lies.

If you want to see what “incremental” looks like when it’s weaponized, you need to watch the test video below this article.

It’s thirty-plus minutes of GPT-5 pulling off things that a year ago would have been UNIMAGINABLE.

From a single zero-shot prompt, it built an interactive beehive simulator complete with sliders for colony size and resource scarcity. Then a neon-lit cyberpunk racing game with working collision physics, generated in 16 seconds. Then a fluid dynamics visualizer you could tinker with like a digital lava lamp.

Ray tracing? It threw a metallic sphere into a 3D street scene and let you tweak its reflectivity on the fly. Photoshop clones, real-time video effects editors, each done in a single shot, often with bugs spotted and fixed by the model itself before you even hit “Run.” It wasn’t perfect (a vignette effect misfired, storybook images struggled with character consistency), but compared to the brittle output of other top models, it was noticeably more error-resistant.

And then it got weird.

Given an anonymized crowd shot at a seaside park, GPT-5 correctly identified the event and location—no metadata, no obvious landmarks. It also cranked out business intelligence reports, medical research syntheses, sports rehab protocols, and high-school physics courses with working simulations. In health-related Q&A, it not only scored far higher than rivals but hallucinated less than 2% of the time compared to 15% or more for some competitors.

If we zoom out a bit, once again, GPT-5 is not the AGI moment some of us thought it was going to be. It’s the step before the step before: incremental, on-trend, slightly better across the board, but most of all, a consolidation play.

The unified-model routing system is a user-friendly shell over multiple internal models, and while OpenAI’s secrecy keeps the exact recipe under wraps, the benchmarks and live demos tell the real story: this is a tool that’s getting sharper in ways that compound.

Watch the full test video and you’ll see why that matters. It’s one thing to read “74.9% on SWE-bench Verified” or “ranked #1 across LM Arena categories.” It’s another to watch GPT-5 casually spin up working physics engines and production-grade dashboards in seconds, or spot it self-correcting its own broken code mid-generation. That’s the kind of capability creep that doesn’t make the front page but changes what’s possible for anyone with a keyboard.

It’s not about GPT-5 being the best model. It’s about how many more of these “slight improvements” it takes before the ground shifts under all of us. And whether we’ll even notice until we’re standing somewhere completely different.

See you next week!

AI leaders only: Get $100 to explore high-performance AI training data.

Train smarter AI with Shutterstock’s rights-cleared, enterprise-grade data across images, video, 3D, audio, and more, enriched by 20+ years of metadata. With 600M+ assets and scalable licensing, we help AI teams improve performance and simplify data procurement. If you’re an AI decision maker, book a 30-minute call. Qualified leads may receive a $100 Amazon gift card.

For complete terms and conditions, see the offer page.

AI News

OpenAI has launched GPT-5, a unified AI model family replacing all prior versions, with smarter, faster performance and access for both free and paid users. It includes GPT-5, GPT-5 Pro, and GPT-5 Mini—delivering state-of-the-art results in writing, coding, math, and more, while reducing hallucinations and deceptive behavior. This marks a major step toward making powerful AI broadly accessible, but competition from Google, Anthropic, and others remains fierce.
Google DeepMind open-sourced an improved version of Perch, an AI tool that helps scientists analyze vast wildlife audio recordings to monitor endangered species. The model can handle complex soundscapes across diverse ecosystems, with tools that detect rare species even with limited data. It's already speeding up conservation efforts, helping scientists act faster to protect biodiversity.
MIT, Harvard, and Broad Institute researchers created PUPS, an AI that predicts the exact location of any protein inside human cells. By combining protein structure data with cell features, it produces precise maps—even for unknown proteins—far beyond what traditional lab work can handle. This breakthrough could transform how we diagnose diseases and discover new drugs.
OpenAI’s GPT-5 launch stumbled badly, with technical issues, botched visuals, and user backlash over the removal of GPT-4o, which many felt had more emotional intelligence. CEO Sam Altman addressed the criticism during a Reddit Q&A, promising fixes, better transparency, and a return of 4o for paid users. The missteps show that even cutting-edge upgrades must account for how people actually use — and connect with — these tools.
Ex-OpenAI researcher Leopold Aschenbrenner raised over $1.5B for a hedge fund based on his AI predictions, despite no finance background, and has already posted a 47% return this year. His fund bets on AI-adjacent sectors like semiconductors and energy, showing that deep technical insight may now matter more than Wall Street experience. It's a sign that those closest to AI development could be the next power players in investing.
Google and NASA are building an AI medical assistant to support astronauts on deep-space missions where help from Earth is delayed or impossible. The system, called CMO-DA, can diagnose injuries with high accuracy and will eventually integrate with tools like ultrasound and biometrics. It may also drive new solutions for remote healthcare here on Earth.
Meta’s FAIR team introduced TRIBE, a 1B parameter model that can predict how the brain responds to movies without needing brain scans, by analyzing video, audio, and text. It won the top spot in the Algonauts 2025 brain modeling challenge and showed high accuracy in regions tied to attention and emotion. While it advances brain science, it also raises questions about how this knowledge might be used to further optimize — or exploit — viewer attention.
OpenAI’s reasoning model earned a gold-level score at the 2025 International Olympiad in Informatics, outperforming all other AIs and ranking sixth overall against human coders. The model wasn’t fine-tuned for programming but still placed in the 98th percentile, highlighting rapid advances in general problem-solving ability. Paired with earlier math competition wins, it suggests AI is quickly approaching — and possibly surpassing — top human skill in complex reasoning tasks.
KAIST researchers developed BInD, an AI model that designs precise cancer drug candidates from scratch, without any prior molecular data. The system creates and evaluates drug molecules in one step, optimizing for safety, effectiveness, and manufacturability all at once. It marks a leap toward faster, more targeted drug discovery — with the potential to revolutionize personalized medicine.
Elon Musk says xAI is suing Apple for allegedly favoring OpenAI’s apps and suppressing rivals like Grok, but the argument quickly turned into an online spat with Sam Altman, who clapped back with sarcasm and bot accusations. Despite Musk’s claims, users pointed out that other AI apps like DeepSeek and Perplexity have topped the App Store. The feud highlights the increasingly personal nature of the AI race, where former allies now trade insults more like influencers than executives.
OpenAI is reportedly backing Merge Labs, a new brain-computer interface startup co-founded by Sam Altman that aims to compete with Elon Musk’s Neuralink. The project will be led by Alex Blania, with OpenAI’s venture arm expected to lead a major funding round at an $850M valuation. It’s a bold new front in the Musk-Altman rivalry — and a clear sign that OpenAI’s ambitions now extend beyond software into futuristic hardware.
AI startup Perplexity reportedly offered $34.5B to buy Google Chrome, even though that’s nearly twice its own valuation. The move comes as Google faces antitrust pressure to spin off the browser, and Perplexity pitched itself as a neutral solution. While the offer likely won’t go anywhere, it’s a savvy PR play that keeps Perplexity in the headlines as it pushes its AI-powered Comet browser.
Apple is reportedly preparing four new AI-powered smart home devices, including a desktop robot and smart display, with launches planned between 2026 and 2027. The products will feature a redesigned Siri with a personality-driven assistant named “Bubbles,” and are being built on a new AI-focused operating system. As rivals like Google and Amazon push ahead in home AI, Apple faces pressure to finally deliver on its long-promised smart assistant upgrades.
In response to backlash over the GPT-5 launch, OpenAI is restoring the GPT-4o model for paid users and increasing GPT-5’s rate limits from 200 to 3,000 queries per week. Users can now choose how GPT-5 behaves—fast, thoughtful, or automatic—and OpenAI is planning a personality update to improve user experience. The changes highlight how much users value consistency and customization, especially for those who preferred 4o’s conversational style over raw performance.
Microsoft is now actively recruiting AI talent from Meta, offering multi-million dollar deals to engineers outside of Meta’s new Superintelligence Labs. The hiring push targets Meta's Reality Labs, GenAI Infra, and core research teams, with fast-track approval systems led by AI heads like Mustafa Suleyman. With Meta reportedly facing internal cultural struggles, Microsoft may find some employees ready for a change.

Quickfire News

OpenAI added GPT-5 to its API and ChatGPT, launching new chat customization features, an improved voice mode, and four distinct chatbot personalities.
Elon Musk confirmed that xAI will open-source its Grok 2 model next week, following OpenAI’s lead in releasing open models.
Microsoft upgraded its Copilot assistant with GPT-5 in a smart mode that auto-selects the model based on the user’s task.
The Browser Company launched a $20/month subscription for its Dia AI browser, offering unlimited chat and skills access to compete with Perplexity’s Comet.
xAI plans to introduce ads into Grok’s responses, with Elon Musk suggesting ads should match the user’s problem for maximum relevance.
MiniMax released Speech 2.5, a voice cloning model supporting 40 languages and capable of replicating accents, age, and emotion in speech.
Donald Trump’s Truth Social platform launched Truth Search AI, a Perplexity-powered tool that pulls results from a curated list of sources.
xAI launched Grok 4 globally for free for a limited time and introduced a new “long press” feature in Grok Imagine that turns images into video.
OpenAI’s o3 model dominated the Kaggle AI chess tournament, winning every match against competitors like DeepSeek R1, Grok-4, and Gemini 2.5 Pro.
Microsoft unveiled Copilot 3D, an AI tool that instantly converts images into 3D models for use in gaming, animation, and virtual or augmented reality.
Roblox open-sourced Sentinel, a new AI model built to detect and block inappropriate chat content, aimed at better protecting children.
SoftBank acquired Foxconn’s U.S. EV plant in Ohio, with plans to turn it into a Stargate data center for AI infrastructure.
Elon Musk announced Tesla is shutting down its Dojo Supercomputer team to focus on AI chip development, as VP Pete Bannon exits the company.
Bloomberg reported that Apple AI researcher Yun Zhu is leaving for Meta’s MSL, becoming the fifth member to exit Apple’s foundation models team.
Chinese AI lab Z AI released GLM-4.5V, an open-source visual reasoning model that leads across more than 40 industry benchmarks.
GitHub CEO Thomas Dohmke is stepping down to start his own company, as GitHub becomes part of Microsoft’s CoreAI division.
The U.S. government is preparing a deal with Nvidia and AMD to claim a 15% share of chip sales to China under new regulatory terms.
Pika Labs rolled out a new HD video model in its social app, enabling lip-sync and audio generation in under six seconds.
Anthropic added memory features to Claude for Max, Team, and Enterprise users, allowing it to recall past conversations (Pro users excluded).
Alibaba upgraded its Qwen3 models with ultra-long context windows supporting up to 1 million tokens.
OpenAI clarified that GPT-5’s reasoning context window is 196,000 tokens, correcting earlier confusion that linked it to the 32,000-token non-reasoning version.
Mistral released Mistral Medium 3.1, an upgraded model with better general performance and enhancements in creative writing.
Skywork introduced Matrix-Game 2.0, an open-source interactive world model that can generate several minutes of playable video at 25 FPS, similar to Genie 3.
Anthropic announced $1 access to its Claude assistant for all three branches of the U.S. federal government, mirroring a recent offer from OpenAI.
Igor Babuschkin announced his departure from xAI and the launch of Babuschkin Ventures, an investment firm focused on AI startups aiming to benefit humanity and explore the universe.
Anthropic is acquiring three co-founders and several team members from Humanloop, a platform focused on enterprise AI evaluation and safety.
The U.S. government is reportedly embedding tracking devices in AI chip shipments from Nvidia and AMD to monitor potential diversions to China.
Tencent released Hunyuan-Vision-Large, a multimodal model now ranked No. 6 on the Vision Arena leaderboard, just behind top models like GPT-4.5 and o4 mini.
Google added temporary chats and memory features to Gemini, allowing it to recall user preferences and past conversations.
Higgsfield AI introduced Draw-to-Video, a tool that lets users sketch out text and visuals on images to guide video creation.
Geoffrey Hinton suggested training AI models with “maternal instincts” toward humans as a way to reduce existential risks posed by the technology.
Liquid AI launched LFM2-VL, a set of open-weight vision-language models optimized for fast use on consumer hardware.

Closing Thoughts

That’s it for us this week.

If you find any value in this newsletter, please pay it forward!

Thank you for being here!

The Blacklynx Brief
