AI Explained Official Podcast

Philip - Host of AI Explained YT

49 episodes

  • AI Explained Official Podcast

    Gemini 3.1 Pro and the Downfall of Benchmarks: Welcome to the Vibe Era of AI

    20/2/2026 | 18 min
    Do we have a new best AI model, or do we have the downfall of benchmarks in general, as a way of capturing machine intelligence? Full breakdown of Gemini 3.1 Pro, guest-starring the new Sonnet 4.6, plus analysis from 7 papers/posts that will give you much-needed context. Oh, and a new record on Simple Bench!

    https://epoch.ai/ai-explained-datacenters

    Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai

    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - Introduction
    00:30 - Post-training Dominance
    04:00 - ARC-AGI 2 Caveat
    05:54 - Simple Bench Record
    08:22 - Hallucination Caveat
    10:05 - Model Card
    11:12 - Exponential Coming
    12:20 - Amodei on Generalizing
    15:10 - One True Benchmark?
    17:02 - Other Metrics…

    Gemini 3.1 Model Card: https://storage.googleapis.com/deepmind-media/Model-Cards/Gemini-3-1-Pro-Model-Card.pdf

    Release: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro/

    Where are Agents deployed?: https://www.anthropic.com/research/measuring-agent-autonomy

    Newsletter Post: https://signaltonoise.beehiiv.com/p/4-ai-numbers-that-surprised-me-this-week

    Hallucination AA: https://artificialanalysis.ai/evaluations/omniscience

    Melanie Mitchell: https://x.com/MelMitchell1/status/2022738363548340526
    ARC-AGI-2: https://x.com/arcprize/status/2024522812728496470/photo/1

    Chollet on Agentic Coding and ML: https://x.com/fchollet/status/2024519439140737442

    METR Caveat: https://metr.org/notes/2026-01-22-time-horizon-limitations/

    Taalas Fast: https://chatjimmy.ai/

    Amodei Interview Continual learning: https://www.dwarkesh.com/p/dario-amodei-2?open=false#%C2%A7002942-is-continual-learning-necessary-how-will-it-be-solved

    Metaculus FutureEval: https://www.metaculus.com/futureeval/

    Next Vid to Watch: https://www.patreon.com/posts/what-you-need-to-150647292

    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    Podcast: https://aiexplainedopodcast.buzzsprout.com/
  • AI Explained Official Podcast

    The Two Best AI Models/Enemies Just Got Released Simultaneously

    06/2/2026 | 19 min
    The two models that you will hear discussed for at least the next two months - Claude Opus 4.6 and GPT 5.3 Codex - just got released within 26 mins of each other. The full breakdown of around 250 pages of reports, with just the most interesting moments, from the battle of which is best, Claude personhood, the surprising misbehaviour of Opus 4.6, and much more.

    https://assemblyai.com/aiexplained

    Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai

    AI Insiders ($9): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - Introduction
    00:54 - Self-improvement?
    02:44 - Knowledge Work
    05:30 - Overly agentic behaviour
    09:12 - Who Shouldn’t Use Claude Opus
    11:39 - Step-change?
    15:09 - Claude’s ‘Personhood’

    Hassabis Roadmap: https://www.patreon.com/posts/hassabis-roadmap-149750869

    Release of Opus 4.6: https://www.anthropic.com/news/claude-opus-4-6
    212 Page System Card: https://www-cdn.anthropic.com/0dd865075ad3132672ee0ab40b05a53f14cf5288.pdf
    Claude Code Tip: https://x.com/bcherny/status/2019475897691124107

    GPT 5.3 Codex: https://openai.com/index/introducing-gpt-5-3-codex/
    System Card: https://openai.com/index/gpt-5-3-codex-system-card/

    Browse Comp: https://arxiv.org/pdf/2504.12516v1
    Finance Agent: https://www.vals.ai/benchmarks/finance_agent
    Terminal Bench 2: https://arxiv.org/pdf/2601.11868
    Vending Bench: https://andonlabs.com/blog/opus-4-6-vending-bench

    My X post: https://x.com/AIExplainedYT/status/2016851303436095647

    Anthropic Apology: https://x.com/ch402/status/2014066134194995256/photo/1

    Altman rebuttal: https://x.com/sama/status/2019139174339928189
    https://x.com/sama/status/2019140276246442089

    4% of GitHub: https://x.com/dylan522p/status/2019490550911766763

    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    Podcast: https://aiexplainedopodcast.buzzsprout.com/
  • AI Explained Official Podcast

    Claude AI Co-founder Publishes 4 Big Claims about Near Future: Breakdown

    28/1/2026 | 22 min
    Anthropic's CEO, who has consistently predicted transformative AI will arrive before 2030, recently published a nearly 20,000-word essay outlining his vision of where AI is heading. The video gives you the highlights. The essay argues that scaling and recursion will advance AI from coding automation to full engineering automation, while warning of economic displacement within 1-2 years and China's trajectory toward AI-enabled totalitarianism. Additionally, Dario Amodei predicts that AI models will increasingly be understood as collections of distinct personas rather than monolithic systems.

    80,000 Hours: https://www.youtube.com/watch?v=B54EQiuO1UU

    Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai

    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - Introduction
    01:10 - Scaling to software engineers
    06:11 - Permanent Underclass
    10:18 - Totalitarian Nightmares
    16:38 - Collection of Personas

    Essay: https://www.darioamodei.com/essay/the-adolescence-of-technology

    Physics Prediction: https://www.quantamagazine.org/is-particle-physics-dead-dying-or-just-hard-20260126/

    Axios: https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic

    World GDP: https://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG?end=2024&start=1961&view=chart

    Demis Hassabis Counter: https://www.youtube.com/watch?v=q6fq4_uP7aM

    Karpathy 80%: https://x.com/karpathy/status/2015883857489522876

    Machines of Loving Grace: https://www.darioamodei.com/essay/machines-of-loving-grace

    Anthropic LessWrong: https://www.lesswrong.com/posts/5aKRshJzhojqfbRyo/unless-its-governance-changes-anthropic-is-untrustworthy#1__In_private__Dario_frequently_said_he_won_t_push_the_frontier_of_AI_capabilities__later__Anthropic_pushed_the_frontier

    Original Constitution: https://www.anthropic.com/news/claudes-constitution

    New Constitution: https://www.anthropic.com/constitution

    Kimi K2.5: https://x.com/Kimi_Moonshot/status/2016024049869324599

    Societies of Thought, Google DeepMind Paper: https://arxiv.org/pdf/2601.10825

    https://lmcouncil.ai/benchmarks

    https://www.patreon.com/posts/our-new-age-of-133960279

    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    Podcast: https://aiexplainedopodcast.buzzsprout.com/
  • AI Explained Official Podcast

    Anthropic: Our AI just created a tool that can ‘automate all white collar work’, Me:

    14/1/2026 | 18 min
    A new tool, with code written by an AI model, has gone omega-viral: Claude Cowork. But is the hype justified? What do the stats say on productivity? Where is the truth in a sea of noise? What is truth? Can we handle the truth? Where's Nemo?

    https://matsprogram.org/s26-aie

    Check out my new app! https://lmcouncil.ai

    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters: 
    00:00 - Introduction
    01:12 - Claude Cowork
    06:48 - Productivity Speed-up + jobs
    09:33 - Comparing Models
    12:00 - Brittle AI Paper

    Cowork Intro: https://x.com/claudeai/thread/2010805682434666759

    'All of it': https://x.com/bcherny/status/2010813886052581538

    'AGI' Claims: https://x.com/deepfates/status/2004994698335879383

    Douglas Interview: https://www.youtube.com/watch?v=TOsNrV3bXtQ&t=2313s

    Job Stats: https://www.oxfordeconomics.com/wp-content/uploads/2026/01/Evidence-of-an-AI-driven-shakeup-of-job-markets-is-patchy.pdf
    Amodei Prediction: https://fortune.com/2025/05/28/anthropic-ceo-warning-ai-job-loss/

    GenAI Traffic: https://x.com/demishassabis/status/2009075877347512545

    Illusion of Insight: https://arxiv.org/pdf/2601.00514
    Entropy Exploration: https://arxiv.org/pdf/2506.14758
    ProRL: https://arxiv.org/pdf/2505.24864

    Genesis Mission: https://www.whitehouse.gov/presidential-actions/2025/11/launching-the-genesis-mission/
    https://deepmind.google/blog/how-were-supporting-better-tropical-cyclone-prediction-with-ai/

    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    Podcast: https://aiexplainedopodcast.buzzsprout.com/
  • AI Explained Official Podcast

    What the Freakiness of 2025 in AI Tells Us About 2026

    23/12/2025 | 33 min
    It’s probably not possible to satisfactorily condense 12 months’ worth of weird progress in AI, as well as predictions for the year to come, into one video. But I’m gonna try anyway because it has been a very strange time.

    http://matsprogram.org/s26-aie

    My new app! https://lmcouncil.ai

    Patreon Interview: https://www.patreon.com/posts/robot-in-your-27-146376094

    Chapters:
    00:00 - Introduction
    00:34 - Reasoning Models … and limits
    02:54 - A playable world
    03:36 - Realism
    03:50 - AI Slop gone mainstream
    05:03 - DolphinGemma
    05:39 - Public Mood
    07:34 - AI Enlisted
    08:30 - GPT-5
    11:05 - Open Weight not out
    13:00 - METR Breakout
    17:30 - VASA-1
    18:28 - Lateral Productivity
    20:15 - 1 or 1000 benchmarks needed?
    24:54 - Continual Learning + Altman on Superintelligence
    28:08 - Automated Information Discovery ft AlphaEvolve

    Hassabis on Generality: https://x.com/demishassabis/status/2003097405026193809
    https://www.youtube.com/watch?v=PqVbypvxDto

    Gemini 3: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
    Reasoning Trade-offs: https://arxiv.org/pdf/2504.13837

    DolphinGemma: https://blog.google/technology/ai/dolphingemma/?s=09

    Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/

    METR Time Horizon: https://arxiv.org/pdf/2503.14499
    https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
    Flaws: https://x.com/ShashwatGoel7/status/2002369517499105443
    https://shash42.substack.com/p/how-to-game-the-metr-plot
    https://x.com/METR_Evals/status/2002203627377574113

    GPT-5 - Altman PhD in everything: https://edition.cnn.com/2025/08/14/business/chatgpt-rollout-problems

    https://simple-bench.com/

    AI Slop: https://www.youtube.com/watch?v=I_3vxoJDD9k
    https://www.theguardian.com/technology/2025/dec/16/boost-for-artists-in-ai-copyright-battle-as-only-3-per-cent-back-uk-active-opt-out-plan

    Survey: https://x.com/SearchlightInst/status/2001057144842387920/photo/1

    Nvidia Nemotron: https://x.com/percyliang/status/2000608134205985169

    OpenAI Compute Flywheel: https://x.com/OpenAI/status/2001363007209914399/photo/1
    Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ

    AI in Govt: https://x.com/jdcmedlock/status/1939814516503847259

    Benchmark Gaming: https://techcrunch.com/2025/04/07/meta-exec-denies-the-company-artificially-boosted-llama-4s-benchmark-scores/

    AlphaEvolve: https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
    https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf?utm_source=deepmind.google&utm_medium=referral&utm_campaign=gdm&utm_content=
    Continual Learning: https://abehrouz.github.io/files/NL.pdf

    Job Risk: https://archive.ph/20250708204527/https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic

    GPT4o: https://x.com/AISafetyMemes/status/1916889492172013989

    Vasa-1: https://www.microsoft.com/en-us/research/project/vasa-1/

    Three Views: https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j/ai-timelines
    Turing Test: https://x.com/tunguz/status/1907185471211422147

    Karpathy Year in Review: https://karpathy.bearblog.dev/year-in-review-2025/

    LLM Brainrot: https://arxiv.org/pdf/2510.13928

    Lateral Productivity: https://www.aisi.gov.uk/frontier-ai-trends-report

    Emotional Quotient: https://arxiv.org/pdf/2511.08394

    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    Podcast: https://aiexplainedopodcast.buzzsprout.com/

    AI Insiders ($9!): https://www.patreon.com/AIExplained

About AI Explained Official Podcast

Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.