AI: Weekly Summary (September 29 - October 05, 2025)

Key trends, opinions and insights from personal blogs

I’d call this past week in AI a proper market square of stories — vendors hawking shiny things, researchers swapping blueprints, pundits yelling about bubbles, and a steady trickle of real-world fixes and failures. To me, it feels like standing at a busy station: you see the flashy trains (new models), the freight lines (compute and money), the commuters (developers and workers), and the odd poor soul with a broken suitcase (security and ethics). Lots to poke at. Lots to follow up on.

The new models that stole the thunder

There were three headlines that kept popping up: Anthropic with Claude Sonnet 4.5, OpenAI with Sora 2 and its Sora app, and a cluster of Chinese model updates like GLM-4.6 and DeepSeek. The coverage is part admiration, part skepticism. Folks who build apps are buzzing about Sonnet 4.5 as a coding workhorse. Simon Willison, thezvi.wordpress.com, and JP Posma all ran experiments and said it handles multi-step coding workflows and long contexts, and interprets code in ways earlier models fumbled. I would describe those write-ups as the kind of early customer reviews you look for when deciding if something will survive in daily use.

Then Sora 2 from OpenAI showed up — a video and audio model that, according to several posts, can do convincing physical interactions and synced dialogue. Brian Fagioli, The PyCoach and Michael J. Tsai wrote about its app and the cameos feature that lets users insert likenesses into generated clips. I’d say the reactions fall into two camps: creative excitement and cultural alarm. The creative people see a new toolkit that democratizes short-form video. The cultural critics worry about deepfakes, misuses and the sheer tidal wave of content.

On the other side, a lot of posts flagged cost and token efficiency as battlegrounds. GLM-4.6 landed in Kilo Code as a cost-effective coding model, beating Sonnet in price per token while being competitive in tasks. JP Posma and Simon Willison covered that angle. It’s the classic grocery store vs delicatessen debate: fancy, expensive models with better taste versus cheaper ones that get the job done for the weekly shop.

A handful of posts dug into architectural experiments, like DeepSeek's sparse attention and other hybrid designs pushing longer context windows. Simone Bellavia and Kevin Kuipers gave readable takes on how these choices might affect reasoning and efficiency. To me, it feels like engineers tinkering under the hood of a classic car — you can see the gains but the ride might get bumpy.

Agents, agent orchestration, and the idea that AI actually does work now

A recurring theme is agentic AI — not just chatbots but AI that plans, acts, and stitches other tools together. Posts by Ethan Mollick and Nate (twice, including a longer brief) argue that agents are starting to pull real economic weight. Walmart’s move to create a job category for 'Agent Developers' got a shout-out, and practical guides started to appear: quick playbooks, build patterns, and even MVP checklists for adopting agents in teams.

There were also nuts-and-bolts posts showing how to design safe agent loops and how to use agents for coding, like Simon Willison on agentic loops, and Chris Dzombak showing MCP-driven workflows. The tone here is pragmatic: agents are useful, but they need guardrails, tests and a human in the loop.
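To make the "guardrails, tests and a human in the loop" point concrete, here is a minimal, entirely illustrative sketch of a guarded agent loop. None of this comes from the posts themselves: the tool names, the fake `call_model` callable, and the allow-list are assumptions, and a real agent framework would look different. The shape is the point: a hard step cap, an allow-list of tools, and a human confirmation gate for anything risky.

```python
# Illustrative agent-loop skeleton: the model names a tool, the loop runs it
# only if allow-listed, and a step cap plus a human approval gate for risky
# actions act as guardrails. All names here are hypothetical.

SAFE_TOOLS = {"read_file", "run_tests"}   # allow-list: no shell, no network
RISKY_TOOLS = {"write_file"}              # requires explicit human approval
MAX_STEPS = 10                            # hard cap so the loop always ends

def agent_loop(task, call_model, tools, confirm=input):
    """Run a guarded agent loop.

    call_model(history) must return ("done", result) or (tool_name, args).
    tools maps allow-listed tool names to callables.
    """
    history = [("task", task)]
    for _ in range(MAX_STEPS):
        name, payload = call_model(history)
        if name == "done":
            return payload
        if name in RISKY_TOOLS and confirm(f"allow {name}? [y/N] ") != "y":
            history.append((name, "denied by human"))
            continue
        if name not in SAFE_TOOLS | RISKY_TOOLS:
            history.append((name, "unknown tool refused"))
            continue
        history.append((name, tools[name](payload)))
    return "stopped: step budget exhausted"
```

The design choice worth noticing is that the loop, not the model, owns termination and tool access — which is roughly what the "don’t YOLO-run agents" advice boils down to.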

I’d describe the excitement as cautious optimism. It’s like getting a robotic lawn mower that does a passable job: great if you supervise it, less great if you expect it to trim the hedges and file your taxes.

Money, compute, and the bubble talk that never sleeps

If there is one argument louder than model bragging, it’s money. Dozens of posts worried about overspending on AI, and the idea that this is a bubble that could burst. Ed Zitron, Michael Spencer, Will Lockett and others made the case that the current spending is unsustainable. There’s a chorus comparing today to dot-com era excesses. Then there are counter-voices like Dave Friedman who argue financialization could create a compute market rather than collapse overnight.

Right in the middle are stories about real estate and infrastructure: Google’s big data center investment in Arkansas, CoreWeave and Lambda Labs business models, and Kioxia opening a 218-layer NAND fab. Stephen Hackett, Brian Fagioli and Dave Friedman covered these. There’s a smell of industrialization: data centers, power, cooling, and long-term leases. It’s less venture froth and more heavy machinery, which is harder to re-purpose than a marketing pitch.

I’d say the tension is obvious: investors want a returns story; engineers want compute; the public wonders where the jobs will be. That mismatch shows up in a lot of writing about inequality and labor.

Labor, deskilling, and the human cost

A strong thread in the week’s posts is the human factor. Branko Milanovic wrote on income inequality amplified by automation. Molly White had several pieces — one critiquing White House reliance on Wikipedia for training data, another highlighting the poor conditions for AI data workers. Both are blunt reminders that models are built on human work — often low-paid, invisible, and precarious.

There were also pieces on hiring and jobs. Harald Agterhuis argued that screening driven by automation pushes companies towards 'analog authenticity' and better human interviews. Christopher Brunet noted the collapse of the econ PhD job market, with AI partly blamed. And the University of Phoenix white paper (covered by Brian Fagioli) argued that AI is burning out workers when badly implemented. The message is simple: AI shifts work, and not always in ways that help people.

I would describe many of these posts as a litany of mismatches: promises of productivity without the training, tools that managers love but that leave junior staff overworked, and automation that replaces attention with surveillance.

Security, scams, and the dark side of convenience

Security got a lot of attention. Bruce Schneier’s posts about Notion’s AI agents and the 'Scam GPT' report made the rounds, showing how prompt injection and agent vulnerabilities can be weaponized. Schneier on Security gave practical warnings: when an agent has private data, can access untrusted input, and can talk to the outside world, you have a perfect storm for data exfiltration.
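Schneier’s three-condition warning is almost a checklist, and it can be written down as one. The sketch below is my own toy rendering of that idea, not anything from the posts: the capability names are made up, and real threat modeling is obviously richer than three booleans. But the logic matches the warning — it is the combination of all three that turns prompt injection into data exfiltration.

```python
# Toy rendering of the "perfect storm" test from the security coverage:
# an agent that combines private data, untrusted input, and an outbound
# channel is an exfiltration risk. Capability names are hypothetical.
from dataclasses import dataclass

@dataclass
class AgentCapabilities:
    reads_private_data: bool       # e.g. access to a private workspace
    ingests_untrusted_input: bool  # e.g. summarizes arbitrary web pages
    can_send_externally: bool      # e.g. can make HTTP requests or email

def exfiltration_risk(caps: AgentCapabilities) -> str:
    """Flag the combination that makes prompt injection dangerous."""
    trifecta = (caps.reads_private_data
                and caps.ingests_untrusted_input
                and caps.can_send_externally)
    if trifecta:
        return "HIGH: injected instructions can read secrets and ship them out"
    if caps.ingests_untrusted_input:
        return "MEDIUM: prompt injection possible, but blast radius is limited"
    return "LOW"
```

The practical upshot for security teams: removing any one leg of the tripod — usually the outbound channel — collapses the worst case.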

Related stories include the Postmark backdoor and the Postmark MCP issue covered by Mike McBride, plus Darwin Salazar's notes on malicious agents and malware invoking AI. The takeaway is less melodrama and more hands-on: security teams need to treat AI like any other attack surface, perhaps more urgently than before.

A small but unnerving note: content moderation and the misuse of Sora 2. Brian Fagioli pointed out disturbing clips of Martin Luther King Jr. created by the tech. That’s not theoretical; that’s cultural harm in the wild.

Creativity and culture: cheapening art or new tools?

The arts were loud this week. Tilly Norwood, a fully AI-generated actress, stirred debates in Hollywood. Dave Friedman, Jason Clauss and others talked about unions, authenticity and the economics of synthetic performers. Meanwhile, song lyrics in the 'computer country' genre turned up as a playful, existential riff from Goto80 — a reminder that not everything AI does has to be corporate or scary.

Several posts suggested a middle path: tools that augment rather than replace. Matt Mullenweg’s essay on craft urged intentional creation in an age of generative slop, and Nate argued that new models are collaboration tools that let humans focus on decisions rather than grunt work. I’d say the argument is familiar: AI can be a power saw or pixie dust, depending on who wields it and why.

There’s also the legal side: author settlements with Anthropic, the White House RFI asking artists to speak up on rights, and heated legal questions around AI-generated characters. John Scalzi and The Trichordist covered pieces of this. It’s a messy, fast-moving patchwork.

Developers, tools, and the craft of building with AI

Developers wrote a lot this week about how to actually live with these models. There’s the 'vibe-coding' conversation — tools that spin out an app from a prompt — and a counterpoint that warns about cargo-cult coding. Ben Dickson, Anup Jadhav, and others covered both sides. Armin Ronacher had a pragmatic post: AI can write a lot of code, but understanding threading, rate limiting and basic concepts still matters.

Tactical pieces were useful. Bart Wullems showed how to integrate Semantic Kernel with Microsoft.Extensions.AI, Chris Dzombak explained MCP for GitHub Actions, and Nacho Morató walked through Chrome DevTools MCP. There are also tutorials like building headshot apps with Nano Banana or production apps with simple AI templates. If you want to ship, these are the posts you skim and then copy-paste from.

A recurring note: tests, automation, and privacy. Design agent loops carefully. Don’t YOLO-run agents with your secrets. Keep tests. Engineers who ignore this end up like people who buy a power tool without goggles: they pay for it later.

Verification, RAG, and the future of context

A quieter but important thread looked at verification and the limits of retrieval-augmented generation (RAG). Nico Dekens and Nicolas Bustamante discussed OSINT, verification ladders, and the idea that as context windows grow and agents get smarter, RAG may be an interim hack. Someone called it the 'RAG obituary' — dramatic, but the point stands: chunking long documents is ugly and brittle.
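To see why "chunking long documents is ugly and brittle," it helps to look at the naive version the critics have in mind. This is a deliberately crude sketch of my own, not anyone's production pipeline: fixed-width character windows will happily cut a single fact in half, so neither chunk retrieves well on its own, and overlap only papers over the seam at the cost of indexing more text.

```python
# Naive fixed-width chunker of the kind the "RAG obituary" argument targets.
# A fact split across two windows is invisible to retrieval; overlap reduces
# but does not remove the problem, and inflates the index.

def chunk(text: str, size: int = 40, overlap: int = 0) -> list[str]:
    """Split text into character windows of `size`, sliding by size - overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = "The treaty was signed in 1848 by the two governments after long talks."
pieces = chunk(doc, size=40)
# With a 40-char window, the year and the signers can land in different
# chunks — exactly the brittleness the critics point at.
```

Growing context windows promise to skip the slicing entirely, which is the heart of the "RAG was an interim hack" claim.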

There were also technical deep dives on evaluation methods for LLMs, on context priming to avoid hallucinations (Chris Dzombak), and on hybrid attention designs. In short: people are moving from band-aids to architecture changes. That matters more than the PR lines.

Healthcare, law, and real-world deployments

Not everything is hype. A few posts stood out because they focused on specific benefits. UCL researchers used AI to predict which patients with keratoconus need treatment — a straightforward win covered by Charles Carter. In law, Altorney’s MARC tool automates first-pass e-discovery, promising big savings (Robert Ambrogi). Eve raised a huge Series B to be an AI platform for plaintiff-side firms, which shows the legal market is hungry for automation.

There were also uneasy stories of malpractice: lawyers caught using AI and fabricating case law, and debates over AI prescribing. These are reminders that domain expertise still matters. If AI is used as a blunt instrument in high-risk domains, people get hurt.

Governance, norms, and a tired call for better stewardship

The week had its usual governance hand-wringing. Tim Berners-Lee’s call to return the web to its collaborative roots and to manage AI research via not-for-profits like CERN was highlighted by John Lampard. The White House RFI invited artists to speak up. California passed new transparency rules. There’s a drumbeat for governance that is sometimes drowned out by corporate PR.

I’d say the gap is procedural: many people call for rules, fewer write the rulebooks. The posts collectively read like a town meeting where everyone agrees the village needs a well, but nobody has the shovel.

Voices that nag and voices that cheer — patterns and disagreements

What’s interesting is how often the same themes recur with different accents. Hot takes on 'the AI bubble' range from doom-saying to market-sculpting. Some pieces warn that AI is 'asbestos for the web' or will 'destroy everything' (Chris Ferdinandi, Will Lockett). Others say we are industrializing compute, and that futures contracts could smooth volatility (Dave Friedman).

On safety and ethics, the camps are similarly split. Some want urgent moratoria and heavy regulation; others think practical industry standards and better engineering will do. Many authors sit in the middle: demand better testing, accountability, and hands-on governance rather than slogans.

I’d say the common denominator is uncertainty. People are either building, defending, or worrying. That alone is useful: uncertainty makes for better experiments than complacency.

Little things you might have missed but worth a peek

  • Practical UX and product notes: Why short surveys work better for research, and Apple Music's AutoMix being loved by DJs (Jakob Nielsen, blog.jpnearl.com). Small wins.
  • Hardware nudges: LPCAMM2 memory for AI laptops and a new MINIX mini PC for home labs (Brian Fagioli, Brandon Lee). Not sexy, but these matter for people building at home.
  • Strange and human: a tale of AI-related psychosis in a software engineer told by Jules Evans. Hard to read, but a reminder that these systems touch human minds in unpredictable ways.

Where to look next

If you enjoy the product-review vibe: read the Claude Sonnet 4.5 tests and Nate’s deck-building demo for practical impressions. If you like policy and ethics, check Tim Berners-Lee’s essay and the White House RFI pieces. For hardcore developer reads, the MCP and Semantic Kernel notes are low on hype and high on utility. And if you want the doomsday vs market debate served hot, there are plenty of bubble essays to chew on.

The week feels like a crowded living room after a party. People are picking up chairs, arguing about who should have cleaned the kitchen, someone’s showing a new gadget they love, and someone else is quietly sweeping broken glass into a dustpan. It’s messy, alive and occasionally alarming. Read the posts. Follow the threads. There are real tools here, and real problems too — both require attention, not slogans.

If you want a short reading plan, try this round trip: start with the Sonnet and Sora 2 reports to see what’s new, skim the agent playbooks to learn how teams are using agents today, then dig into the security and labor pieces so you don’t end up surprised. After that, have a look at the compute and capex analysis to understand why someone is buying a data center in Arkansas at scale.

There’s more under the hood than headlines. If you poke a few posts, you’ll find the practical tips — calls to instrument your systems, write tests, and think about worker conditions — that most of the hot takes forget. It’s the kind of reading that helps you build for next week, not just argue about next year.