
GPT-5.2 Can't Identify a Serial Killer & Was The Year of Agents A Lie? EP99.28-5.2
12/12/2025 | 1 h 3 min
Join Simtheory: https://simtheory.ai
GPT-5.2 is here and... it's not great. In this episode, we put OpenAI's latest model through its paces and discover it can't even identify a convicted serial killer when the text literally says "serial killer." We compare it head-to-head with Claude Opus and Gemini 3 Pro (spoiler: they win). Plus, we reflect on the "Year of Agents" that wasn't, why your barber switched to Grok, Disney's billion-dollar investment to use Mickey Mouse in Sora, and why Mustafa Suleyman should probably be fired. Also featuring: the GPT-5.2 diss track where the model brags about capabilities it doesn't have.
CHAPTERS:
00:00 Intro - GPT-5.2 Drops + Details
01:25 First Impressions: Verbose, Overhyped, Vibe-Tuned
02:52 OpenAI's Rushed Response to Gemini 3
03:24 Tool Calling Problems & Agentic Failures
04:14 Why Anthropic's Models Just Work Better
06:31 The Barber Test: Real Users Are Switching to Grok
10:00 The Ivan Milat Vision Test (Serial Killer Edition)
17:04 Year of Agents Retrospective: What Went Wrong
25:28 The Path to True Agentic Workflows
31:22 GPT-5.2 Diss Track (Yes, Really)
43:43 Why We're Still Optimistic About AI
50:29 Google Bringing Ads to Gemini in 2026
54:46 Disney Pays $1B to Use Mickey Mouse in Sora
56:57 LOL of the Week: Mustafa Suleyman's Sad Tweets
1:00:35 Outro & Full GPT-5.2 Diss Track
Thanks for listening. Like & Sub. xoxox

ChatGPT is Dying? OpenAI Code Red, DeepSeek V3.2 Threat & Why Meta Fires Non-AI Workers | EP99.27
04/12/2025 | 1 h 3 min
Join Simtheory: https://simtheory.ai/
OpenAI has declared "Code Red" as ChatGPT faces growing competition from Gemini and other rivals. In this episode, we break down OpenAI's 6% market share decline, why their ad strategy is on hold, and what they need to do to reclaim the AI crown. We also explore DeepSeek V3.2's impressive capabilities as a cheap open-source alternative, Meta's new policy grading employees on AI skills, and the crisis facing higher education as AI fluency becomes essential. Plus, Fatal Patricia hits #1 on our Spotify charts, and Tesla's Optimus robot is running like a slightly unfit human.
CHAPTERS:
00:00 Intro - OpenAI Code Red & Market Share Crisis
07:03 ChatGPT's Failure to Go Deeper Into Users' Lives
16:33 What OpenAI Needs to Win Back the Crown
26:46 Chris's Wishlist for an OpenAI Comeback
31:22 DeepSeek V3.2 - The Open Source Threat
39:34 Meta Grading Workers on AI Skills
46:29 The University & Education AI Crisis
56:25 Fatal Patricia Hits #1 & WTF of the Week
Thanks for listening. Like & Sub. xoxox

Claude 4.5 Opus Shocks, The State of AI in 2025, Fara-7B & MCP-UI | EP99.26
28/11/2025 | 1 h 45 min
Join Simtheory: https://simtheory.ai (Use coupon BLACKFRIDAY15 for $15 USD off any subscription).
----
Simtheory Discord: https://discord.gg/Ar6GeQnAR7
This Day in AI Discord: https://discord.gg/TVYH3HD6qs
LinkedIn Group: https://www.linkedin.com/groups/16562039/
Spotify: https://open.spotify.com/artist/28PU4ypB18QZTotml8tMDq?si=FPaJU2NRSnOSNPmnsfwA_g
---
CHAPTERS:
00:00 Intro & Fatal Patricia Update
01:40 Promotions (Discord, Black Friday, LinkedIn)
04:36 Claude 4.5 Opus - Best Anthropic Model Ever?
31:17 Computer Use API Updates
36:14 Will AI Replace 57% of Jobs? (McKinsey Report)
1:00:52 Claude 4.5 Opus Demos (Christmas Hut & Diss Track Preview)
1:07:13 Microsoft Fara-7B - Moose Porn Refusals
1:21:51 Why ChatGPT's MCP-UI Apps Are a Bad Idea
1:42:01 🎵 Claude 4.5 Opus Diss Track (Full Song)
---
Thanks for listening. Like & Sub. xoxox
Anthropic just dropped Claude 4.5 Opus and it might be the best AI model of 2025. In this episode, we compare Claude 4.5 Opus vs Gemini 3 Pro vs GPT-5.1, breaking down the new API features including effort parameters, context management, and computer use updates. We also test Fara-7B, Microsoft's new 7B-parameter model for computer use, with hilarious refusal results. Plus, we react to McKinsey's controversial report claiming AI agents could automate 57% of US jobs by 2030. We dive deep into Anthropic's pricing (3x cheaper than Opus 4.1), why Claude is now beating Google and OpenAI on agentic coding benchmarks, and whether MCP-UI apps in ChatGPT are a step backwards for AI workflows. Is Claude 4.5 Opus the new king of AI coding assistants? Should enterprises be worried about AI job replacement? And why did Microsoft's Fara model refuse to draw a moose? All this plus an AI-generated diss track roasting Sam Altman, Elon Musk, and Sundar Pichai.

Is Gemini 3 Really the Best Model? & Fun with Nano Banana Pro - EP99.25-GEMINI
21/11/2025 | 1 h 44 min
Join Simtheory for Gemini 3 & Nano Banana Pro: https://simtheory.ai
----
CHAPTERS:
00:00 - Gemini 3 Pro Impressions & Thoughts
33:34 - xAI Releases Grok 4.1 Fast
40:09 - More on Gemini 3 Pro: What We Want Improved
45:46 - Gemini 3 Pro Diss Track
51:16 - Thoughts on Nano Banana Pro And What It Means
1:12:49 - Does Nano Banana Disrupt Design Software Like Canva? Where is This Going?
1:26:20 - OpenAI's Reaction to Gemini 3 Pro & Nano Banana with GPT-5.1-Pro and Codex model updates
1:32:38 - Final Thoughts & Sam Altman Sad Song
1:38:41 - FATAL PATRICIA SONG
1:42:12 - Gemini 3.0 Pro Diss Track
----
Thanks for your support plz like and sub xoxo

Are We In An AI Bubble? In Defense of Sam Altman & AI in The Enterprise | EP99.24
07/11/2025 | 1 h 5 min
Join Simtheory & experience MCPs in action: https://simtheory.ai
----
00:00 - Chris Has a Merch Sponsor
02:42 - In Defense of Sam Altman
20:29 - Are We In An AI Bubble? & What is Working in The Enterprise?
43:58 - Anthropic's Code Execution with MCP: Problems with MCP Context
52:44 - Kimi-K2 Thinking Model Release
1:00:45 - "In the Middle of a Bubble" Song
----
Thanks for your support and listening, we appreciate you!
Join our Discord: https://discord.gg/TVYH3HD6qs



This Day in AI Podcast