Grok 4 is here, but did you know these 10 things about the new model? From benchmark caveats to soloing science, $300 a month secrets to Grok 5 promises, here's 10 new things to know in just under 12 minutes.

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:22 - Benchmark Results
02:11 - Benchmark Caveats
02:59 - ARC-AGI 2
03:35 - SimpleBench
04:49 - ‘Humanity’s Last Exam’
07:20 - SuperGrok Heavy Price
07:58 - API Price
08:12 - Grok 5, Gemini 3.0 Beta, GPT-5
09:12 - System Prompt Change + $1B a month, pollution
10:20 - Not soloing science, helping you solo code

Livestream: https://www.youtube.com/watch?v=1tQ_KrlHgfg&t=1s
Price: https://grok.com/#subscribe
https://x.com/ArtificialAnlys/status/1943166841150644622
Gemini DeepThink: https://blog.google/technology/google-deepmind/google-gemini-updates-io-2025/#deep-think
https://simple-bench.com/
ARC-AGI 2: https://x.com/arcprize/status/1943168950763950555
Humanity’s Last Exam: https://agi.safe.ai/
SmartGPT: https://www.youtube.com/watch?v=hVade_8H8mE
New Power Plant, 1m GPUs: https://www.tomshardware.com/tech-industry/artificial-intelligence/elon-musk-xai-power-plant-overseas-to-power-1-million-gpus
Gemini 3.0 beta: https://web.archive.org/web/20250709174548/https://github.com/google-gemini/gemini-cli/blob/b0cce952860b9ff51a0f731fbb8a7649ead23530/packages/cli/src/ui/utils/errorParsing.test.ts
Pollution: https://www.theguardian.com/technology/2025/apr/24/elon-musk-xai-memphis
https://www.youtube.com/watch?v=C8rU4dv2w8Q
https://www.youtube.com/watch?v=3VJT2JeDCyw
System Prompt: https://github.com/xai-org/grok-prompts/blob/535aa67a6221ce4928761335a38dea8e678d8501/ask_grok_system_prompt.j2
Burn Rate: https://www.bloomberg.com/news/articles/2025-06-17/musk-s-xai-burning-through-1-billion-a-month-as-costs-pile-up
Ron Johnson: https://x.com/jdcmedlock/status/1939814516503847259
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/
--------
11:43
When Will AI Models Blackmail You, and Why?
In the last few days Anthropic have released an impressively honest account of how all models blackmail, no matter what goal they have, and despite prompt warnings and other preventions. But do these models *want* this?

Thanks to Storyblocks for sponsoring this video! Download unlimited stock media at one set price with Storyblocks: storyblocks.com/AIExplained

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
01:20 - What prompts blackmail?
02:44 - Blackmail walkthrough
06:04 - ‘American interests’
08:00 - Inherent desire?
10:45 - Switching Goals
11:35 - Murder
12:22 - Realizing it’s a scenario?
15:02 - Prompt engineering fix?
16:27 - Any fixes?
17:45 - Chekhov’s Gun
19:25 - Job implications
21:19 - Bonus Details

Report: https://www.anthropic.com/research/agentic-misalignment
30 Page Appendices: https://assets.anthropic.com/m/6d46dac66e1a132a/original/Agentic_Misalignment_Appendix.pdf
Announcement: https://x.com/AnthropicAI/status/1936144602446082431?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet
OpenAI Files: https://www.openaifiles.org/
Grok 4 News: https://x.com/RonFilipkowski/status/1936372579607912473
Claude 4 Report Card: https://www-cdn.anthropic.com/6be99a52cb68eb70eb9572b4cafad13df32ed995.pdf
New Apollo Research: https://www.apolloresearch.ai/blog/more-capable-models-are-better-at-in-context-scheming
Interesting Reflections: https://nostalgebraist.tumblr.com/post/785766737747574784/the-void
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
--------
26:19
Apple’s ‘AI Can’t Reason’ Claim Seen By 13M+, What You Need to Know
What to make of those headlines that AI can’t reason, seen by tens of millions? I cover the paper in layman’s terms, what it means and doesn’t mean, and what’s next.

Thanks to Storyblocks for sponsoring this video! Download unlimited stock media at one set price with Storyblocks: https://storyblocks.com/AIExplained

Plus o3-pro and whether it is my current most-recommended model.

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:57 - Viral Post + Headlines
01:42 - Apple Paper Analysis
08:34 - But they do Hallucinate
10:43 - Not Supercomputers
11:18 - o3 Pro and Recommendations

13.7M Tweet: https://x.com/RubenHssd/status/1931389580105925115
Apple Paper: https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf
Guardian Article: https://www.theguardian.com/technology/2025/jun/09/apple-artificial-intelligence-ai-study-collapse
Lisan al Gaib post: https://x.com/scaling01/status/1931854370716426246
Multiplication: https://x.com/yuntiandeng/status/1836114401213989366
The Illusion of the Illusion of Thinking: https://drive.google.com/file/d/1Zx9ikRj0Enc3SB4wA9HlYIlpmO_8QiUO/view
Marcus: https://www.theguardian.com/commentisfree/2025/jun/10/billion-dollar-ai-puzzle-break-down
Prof Rao: https://x.com/rao2z/status/1927707640223719631
AI Job Headlines: https://www.nytimes.com/2025/06/11/technology/ai-mechanize-jobs.html
https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic
Sky News Story: https://news.sky.com/story/can-we-trust-chatgpt-despite-it-hallucinating-answers-13380975
Veo 3 Ad: https://x.com/Kalshi/status/1932891608388681791
Altman Essay: https://blog.samaltman.com/
o3 Original benchmarks: https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b8b6c44-acd6-43b3-b5c6-1a1d5c6c25e4_2486x1388.png
https://pbs.twimg.com/media/GfQ0bfcXQAAQt13.jpg
Alpha Evolve Video: https://www.youtube.com/watch?v=RH4hAgvYSzg
https://simple-bench.com/
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
--------
14:00
AI Accelerates: New Gemini Model + AI Unemployment Stories Analysed
There’s a new best language model, so let’s go through the ups and downs of Gemini 2.5 Pro 06-05. Record-breaking common-sense, but dumb mistakes remain. And it’s not even their best model, which remains behind the scenes - Gemini 2.5 Ultra. Plus Sundar Pichai’s AGI date and an analysis of whether the current AI unemployment headlines are justified, and Elevenlabs v3.

https://emergentmind.com

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
02:04 - Gemini 2.5 Ultra
03:34 - Benchmarks
07:41 - AGI Date and Meaning Pichai
09:13 - Jobs and AI Unemployment Fears
15:28 - Elevenlabs v3

Sundar Pichai Fridman: https://www.youtube.com/watch?v=9V6tWC4CdFQ
Pichai More Jobs (until 2026 at least): https://www.techradar.com/pro/alphabet-ceo-sundar-pichai-says-ai-wont-lead-to-job-cuts-will-be-an-accelerator
Gemini Comparison: https://blog.google/products/gemini/gemini-2-5-pro-latest-preview/
https://x.com/viathebrink/status/1930733154203292121
https://simple-bench.com/
White Collar Bloodbath: https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic
https://fortune.com/2025/05/25/ai-entry-level-jobs-gen-z-careers-young-workers-linkedin/
https://www.nytimes.com/2025/05/19/opinion/linkedin-ai-entry-level-jobs.html
https://www.nytimes.com/2025/03/25/business/economy/white-collar-layoffs.html
College Unemployment: https://www.newyorkfed.org/research/college-labor-market/#--:explore:unemployment
New Scientist AI Hallucinations: https://www.newscientist.com/article/2479545-ai-hallucinations-are-getting-worse-and-theyre-here-to-stay/
Duolingo: https://fortune.com/2025/05/24/duolingo-ai-first-employees-ceo-luis-von-ahn/
Klarna: https://www.forbes.com/sites/quickerbettertech/2025/05/18/business-tech-news-klarna-reverses-on-ai-says-customers-like-talking-to-people/
Sholto Douglas: https://www.reddit.com/r/ClaudeAI/comments/1ktt1rb/anthropics_sholto_douglas_says_by_202728_its/
Figure 02: https://x.com/adcock_brett/status/1930693311771332853
Elevenlabs v3: https://www.youtube.com/watch?v=zv_IoWIO5Ek
Gemini Speech Generation: https://aistudio.google.com/generate-speech
Non-hype Newsletter: https://signaltonoise.beehiiv.com/
--------
16:41
Claude 4: Full 120 Page Breakdown … Is it the Best New Model?
Not only did I get early access and run my own tests, but as per the title I also read the 120-page Claude 4 Opus and Claude 4 Sonnet System Card and the 25-page report on ASL-3 being triggered, plus the 2-hour launch video and surrounding coverage. Ft. coding tests, SimpleBench, Twitter controversies, deep alignment coverage, spiritual bliss and much more!

https://80000hours.org/aiexplained

Chapters:
00:00 - Introduction
01:12 - 3 Quick Controversies
02:42 - Benchmark Results
04:20 - 120 page Card 20 Highlights
10:07 - Coding Test
11:27 - Model Welfare and Spiritual Bliss
13:29 - ASL-3

Claude Card: https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686f4f3b2ff47.pdf?s=09
ASL 3: https://www-cdn.anthropic.com/807c59454757214bfd37592d6e048079cd7a7728.pdf
Tweets: https://x.com/fish_kyle3/status/1925597284546629753
https://x.com/EMostaque/status/1925624164527874452?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet
Cursor Says State of the Art for Coding: https://x.com/cursor_ai/status/1925594428095561941
Benchmarks: https://www.anthropic.com/news/claude-4
Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.