Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that
A lot just got released in the last 36 hours, and it will all affect hundreds of millions of people. 10 details you would miss if you just read the headlines, from GPT 5.1 regressions, to how Claude hacked Govt Agencies, to SIMA 2, and Musical Turing Tests.
https://assemblyai.com/aiexplained
Chapters:
00:00 - Introduction
00:56 - GPT 5.1 Smarter?
01:47 - Some Regressions
03:22 - Sycophancy?
05:22 - Claude Auto-Hacking
06:16 - Jailbreaking through Granularity
08:22 - This Will be Re-used
09:30 - Hallucinating Hacker
09:57 - Surprisingly Neutral Tone
12:18 - SIMA 2
14:10 - Alpha Parallels
17:24 - AI Music
GPT 5.1 Announcement: https://openai.com/index/gpt-5-1/
System Card: https://cdn.openai.com/pdf/4173ec8d-1229-47db-96de-06d87147e07e/5_1_system_card.pdf
Benchmarks: https://openai.com/index/gpt-5-1-for-developers/
Simple Bench: https://lmcouncil.ai/benchmarks
Auto-Hacking: https://x.com/AnthropicAI/status/1989033793190277618
https://www.anthropic.com/news/disrupting-AI-espionage
Report: https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf
Sima 2 Announcement: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
https://x.com/amoufarek/status/1988986075331858693
Scepticism: https://www.technologyreview.com/2025/11/13/1127921/google-deepmind-is-using-gemini-to-train-agents-inside-goat-simulator-3/
Voyager: https://voyager.minedojo.org/
Reuters Music: https://www.reuters.com/legal/litigation/are-you-listening-bots-survey-shows-ai-music-is-virtually-undetectable-2025-11-12/
--------
18:26
--------
Bubble or No Bubble, AI Keeps Progressing (ft. Relentless Learning + Introspection)
Don’t let headlines about bubbles distract you from the real avenues of progress being explored in AI every week, including what had been thought to be a long-term blocker - continual learning (learning on the fly).
https://app.grayswan.ai/ai-explained
This, plus models introspecting (hesitate before you berate), Nano Banana 2 possibly spotted, Chinese imagen and more.
AI Insiders ($9!): https://www.patreon.com/AIExplained
Chapters:
00:00 - Introduction
01:26 - Continual Learning (Nested Learning / HOPE)
07:00 - Introspection
10:54 - Image-Gen Progress
Nested Learning Post: https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/
Nested Learning Paper: https://abehrouz.github.io/files/NL.pdf
Original Titans Paper: https://arxiv.org/pdf/2501.00663
Siri News: https://www.bloomberg.com/news/articles/2025-11-05/apple-plans-to-use-1-2-trillion-parameter-google-gemini-model-to-power-new-siri
Introspection: https://www.anthropic.com/research/introspection
Full Paper: https://transformer-circuits.pub/2025/introspection/index.html#mechanisms
Earlier Work: https://www.anthropic.com/research/mapping-mind-language-model
https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html
Release Post: https://x.com/AnthropicAI/status/1983584136972677319
https://lmcouncil.ai
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/
--------
12:53
--------
Sora 2 - It will only get more realistic from here
Sora 2 - the start of the infinite slop-feed or a key step to a generalist agent? Better than Veo 3 or over-hyped? I bring out 6 details you may have missed, contrast the announcement to Periodic Labs and even squeeze in some Claude Sonnet 4.5 analysis. Maybe I should make my videos longer…
https://80000hours.org/aiexplained
AI Insiders ($9!): https://www.patreon.com/AIExplained
Chapters:
00:00 - Introduction
00:40 - Two models?
01:15 - Rollout Details
01:43 - Versus Sora 1 / Veo 3
04:30 - Sora App / Social Media
06:40 - Masterplan
09:30 - Generalist Agent? Periodic Labs
12:05 - Claude Sonnet 4.5
13:42 - Future Outlook
Announcement: https://openai.com/index/sora-2/
Launch Video: https://www.youtube.com/live/gzneGhpXwjU
System Card: https://cdn.openai.com/pdf/50d5973c-c4ff-4c2d-986f-c72b5d0ff069/sora_2_system_card.pdf
Sam Altman Blog Post on Sora App: https://blog.samaltman.com/sora-2
Most Intelligent Claim: https://x.com/willdepue/status/1973089331284681110
GTA: https://x.com/AndrewCurran_/status/1973298436536766666
Meta Vibes: https://x.com/alexandr_wang/status/1971295156411433228?s=46
Altman on Regulations: https://www.lesswrong.com/posts/5jjk4CDnj9tA7ugxr/openai-email-archives-from-musk-v-altman
OpenAI Profit: https://www.theinformation.com/articles/openais-first-half-results-4-3-billion-sales-2-5-billion-cash-burn?rc=sy0ihq
Periodic Labs: https://periodic.com/
https://www.nytimes.com/2025/09/30/technology/ai-meta-google-openai-periodic.html
https://x.com/LiamFedus/status/1973055380193431965
https://baincapitalventures.com/insight/we-must-know-we-will-know/?s=09
Sonnet 4.5: https://www.anthropic.com/news/claude-sonnet-4-5
https://simple-bench.com/
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/
--------
15:43
--------
OpenAI Tests if GPT-5 Can Automate Your Job - 4 Unexpected Findings
An OpenAI report released in the last 24 hours is the best look we have at whether 2025 AI can automate your job. I’ll go through 4 unexpected findings, from which model is best at what, to practical tips and massive caveats. Plus UFC robots, a radiologist essay, why you shouldn’t trust videos, and the blockers to the singularity.
Gray Swan: https://app.grayswan.ai/ai-explained
GDPval: https://cdn.openai.com/pdf/d5eb7428-c4e9-4a33-bd86-86dd4bcf12ce/GDPval.pdf
GDP Impact: https://fred.stlouisfed.org/release/tables?rid=331&eid=211
Task List: https://www.onetonline.org/link/summary/11-9141.00
Summers Tweet: https://x.com/LHSummers/status/1971252567981146347
Emad: https://x.com/EMostaque/status/1971254153067593739
Robots: https://x.com/cixliv/status/1967663286679478759
Unitree G1: https://x.com/UnitreeRobotics/status/1970039940022239491
Don’t Trust Video: https://x.com/AISafetyMemes/status/1970453369446871420
AGI Tweet: https://x.com/hyhieu226/status/1968378785709133915
Blockers to the Singularity: https://www.patreon.com/posts/blockers-to-and-139264812
Framework: https://gemini.google.com/share/f4b9c85a6ae9
METR Study (Dev Slowdown): https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Karpathy Tweet: https://x.com/karpathy/status/1971220449515516391
Radiology Essay: https://worksinprogress.co/issue/the-algorithm-will-see-you-now/
Chapters:
00:00 - Introduction
00:55 - OpenAI Report Summary
02:40 - Tipping Point Speed-up
04:11 - Better than Industry Experts?
06:33 - Big Caveat
11:10 - Karpathy and the Radiologist Analogy
13:30 - Outro
--------
14:06
--------
ChatGPT Will Guess your Age, Flirt if Asked, and Can Call the Cops
Sam Altman, CEO of OpenAI, announced a set of new ‘protections’ and ‘privileges’ for ChatGPT users, all of which require a significant amount of trust from users. They range from predicting your age based on your chats, to calling law enforcement if you are at risk of harm, to allowing adults to flirt. But amidst all of these announcements, there are interview snippets you may have missed, as Altman dramatically revises his predictions of AI impact on jobs. Plus a Hassabis backtrack to boot.
https://80000hours.org/aiexplained
Calling the Cops: https://openai.com/index/teen-safety-freedom-and-privacy/
Age Prediction: https://openai.com/index/building-towards-age-prediction/
Not Everyone Will Agree: https://x.com/sama/status/1967955739911364693?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet
Theory 1: NYT Lawsuit: https://openai.com/index/response-to-nyt-data-demands/
Theory 2: FTC Investigation into AI Companions: https://x.com/AndrewCurran_/status/1966167585994764743
YT Does the Same: https://www.cbsnews.com/news/youtube-ai-powered-technology-teen-users/
Carlsen Interview: https://www.youtube.com/watch?v=5KmpT-BoVf4
vs Senate Testimony (70% Jobs): https://www.youtube.com/watch?v=5CWVP8-XVjQ
Hallucinations Paper: https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf
Hassabis Quote 1: https://www.youtube.com/watch?v=toShbNUGAyo
vs Quote 2: https://www.youtube.com/watch?v=Kr3Sh2PKA8Y
Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a free bonus.