X-AI-2026-03-27
Digest
Morning signal
LLM summarization failed. Raw content below.
= X/TWITTER POSTS FROM FOLLOWED PEOPLE =
--- @karpathy ---
@karpathy (Andrej Karpathy): When I built menugen ~1 year ago, I observed that the hardest part by far was not the code itself, it was the plethora of services you have to assemble like IKEA furniture to make it real, the DevOps: services, payments, auth, database, security, domain names, etc…
I am really looking forward to a day where I could simply tell my agent: “build menugen” (referencing the post) and it would just work. The whole thing up to the deployed web page. The agent would have to browse a number of services, read the docs, get all the api keys, make everything work, debug it in dev, and deploy to prod. This is the actually hard part, not the code itself. Or rather, the better way to think about it is that the entire DevOps lifecycle has to become code, in addition to the necessary sensors/actuators of the CLIs/APIs with agent-native ergonomics. And there should be no need to visit web pages, click buttons, or anything like that for the human.
It’s easy to state, it’s now just barely technically possible and expected to work maybe, but it definitely requires from-scratch re-design, work and thought. Very exciting direction!
QT @patrickc: When @karpathy built MenuGen (https://t.co/2OjrUJ3aLS), he said:
“Vibe coding menugen was exhilarating and fun escapade as a local demo, but a bit of a painful slog as a deployed, real app. Building a modern app is a bit like assembling IKEA furniture. There are all these services,… https://x.com/patrickc/status/2037190688950161709 date: Thu Mar 26 16:10:52 +0000 2026 url: https://x.com/karpathy/status/2037200624450936940 ──────────────────────────────────────────────────
@karpathy (Andrej Karpathy): One common issue with personalization in all LLMs is how distracting memory seems to be for the models. A single question from 2 months ago about some topic can keep coming up as some kind of a deep interest of mine with undue mentions in perpetuity. Some kind of trying too hard. date: Wed Mar 25 16:05:14 +0000 2026 url: https://x.com/karpathy/status/2036836816654147718 ──────────────────────────────────────────────────
@karpathy (Andrej Karpathy): (I cycle through all LLMs over time and all of them seem to do this so it’s not any particular implementation but something deeper, e.g. maybe during training, a lot of the information in the context window is relevant to the task, so the LLMs develop a bias to use what is given, then at test time overfit to anything that happens to RAG its way there via a memory feature (?)) date: Wed Mar 25 16:22:08 +0000 2026 url: https://x.com/karpathy/status/2036841069636370467 ──────────────────────────────────────────────────
--- @sama ---
@sama (Sam Altman): The first steel beams went up this week at our Michigan Stargate site with Oracle and Related Digital https://t.co/Hl0NBqwfnS VIDEO: https://pbs.twimg.com/amplify_video_thumb/2037609606734925824/img/tQ8iaeWmXu4KQ6dk.jpg date: Fri Mar 27 19:17:35 +0000 2026 url: https://x.com/sama/status/2037610000122839116 ──────────────────────────────────────────────────
@sama (Sam Altman): The coolest meeting I had this week was with Paul, who used ChatGPT and other LLMs to create an mRNA vaccine protocol to save his dog Rosie. It is an amazing story.
“The chat bots empowered me as an individual to act with the power of a research institute - planning, education, troubleshooting, compliance, and yes, real scientific design work in converting genomic data to a vaccine prescription and designing the treatment protocol around it. But they worked alongside humans at every step. The combination is what made it possible.”
It immediately got me thinking “this should be a company”.
Also, Paul is an extraordinary guy. This should be easy to do, but it is not yet.
QT @paul_conyngham: Article: How I created Rosie’s mRNA Vaccine Protocol https://x.com/paul_conyngham/status/2036940410363535823 date: Fri Mar 27 05:10:30 +0000 2026 url: https://x.com/sama/status/2037396826060673188 ──────────────────────────────────────────────────
@sama (Sam Altman): RT @nanransohoff: The new OpenAI nonprofit just announced that it aims to spend $1B in its *first year* and will be led by two superb human… date: Tue Mar 24 21:39:30 +0000 2026 url: https://x.com/sama/status/2036558551825653810 ──────────────────────────────────────────────────
--- @DarioAmodei ---
@DarioAmodei (Dario Amodei): RT @AnthropicAI: A statement from Anthropic CEO Dario Amodei: https://t.co/WnSFrwI9nI date: Fri Mar 06 00:46:13 +0000 2026 url: https://x.com/DarioAmodei/status/2029720168780169546 ──────────────────────────────────────────────────
@DarioAmodei (Dario Amodei): RT @AnthropicAI: A statement from Anthropic CEO, Dario Amodei, on our discussions with the Department of War.
https://t.co/rM77LJejuk date: Thu Feb 26 22:43:10 +0000 2026 url: https://x.com/DarioAmodei/status/2027152488659394660 ──────────────────────────────────────────────────
@DarioAmodei (Dario Amodei): The Adolescence of Technology: an essay on the risks posed by powerful AI to national security, economies and democracy—and how we can defend against them: https://t.co/0phIiJjrmz date: Mon Jan 26 17:03:45 +0000 2026 url: https://x.com/DarioAmodei/status/2015833046327402527 ──────────────────────────────────────────────────
--- @AmandaAskell ---
@AmandaAskell (Amanda Askell): Tech companies pay millions of dollars for their employees and then stick them in open-plan offices that make it nearly impossible to get work done. Best strategy for poaching employees is probably to just offer them an office with a door. date: Thu Mar 26 16:40:34 +0000 2026 url: https://x.com/AmandaAskell/status/2037208098121933188 ──────────────────────────────────────────────────
@AmandaAskell (Amanda Askell): Maybe the move to remote work actually made this worse for people who don’t like working from home, because working from home is now just assumed to be a viable alternative. date: Thu Mar 26 16:51:13 +0000 2026 url: https://x.com/AmandaAskell/status/2037210778198302907 ──────────────────────────────────────────────────
@AmandaAskell (Amanda Askell): Perhaps I should get married again so that the media has a more recent man they can reference any time they mention me or my work. date: Thu Mar 19 17:58:23 +0000 2026
Evening signal
TL;DR: Agents are eating software—from DevOps automation to autonomous jailbreaking research. The real bottleneck isn’t code anymore; it’s plumbing everything together. Meanwhile, AGI benchmarks are tightening (ARC-AGI-3), robot learning is scaling via human video, and security is becoming a multi-layered problem as agents gain filesystem access.
Agent-Driven Development & Automation
DevOps is the real hard part, not code — Karpathy observes that shipping a real deployed app means assembling services, payments, auth, databases, and domains like IKEA furniture; agents that could autonomously handle this entire lifecycle end-to-end would be genuinely transformative.
Claude Code auto-fix now works in the cloud with PR generation — Agents can now automatically follow CI/CD pipelines and fix issues without human permission, scaling autonomous code work beyond single sessions.
One-shot Mac setup automation shows that blogging creates agent skills — Technical documentation becomes immediately consumable by agents; someone with 4 years of setup blog posts can now feed Claude their entire opinion stack and get executable bash scripts back.
Coding agents replacing broken vendor software — A user woke up to find Claude had written a fully functional Rust webcam app that fixed Canon’s perpetually crashing official software—pure replacement, not workaround.
AgentQL discovered novel jailbreaking algorithms via automated research loops — Claude Code is now being deployed in autoresearch loops to discover adversarial attacks, beating 30+ existing GCG methods; this signals that incremental security research itself is automatable.
Capability & Benchmarking
AI labs remain confident in a sustained scaling trajectory — Leading labs have so far been right about their ability to keep releasing more powerful models, despite last year’s “GPT-5 plateau” narratives.
ARC-AGI-3 measures adaptability, not benchmark-specific engineering — True AGI requires adapting to new problems without handcrafted harnesses; current wins using benchmark-specific prompting don’t indicate generality.
ARC-AGI-3 is its own test with different rules — It measures different dimensions than previous ARC iterations; comparing scores across versions is comparing different tests entirely.
AI still can’t do basic scientific reasoning at scale — If current AI can’t solve ARC-AGI-3 environments (designed to be ultra-simple microcosms of the scientific method), scientific breakthroughs remain several steps away.
Real-World Impact: Biology & Medicine
mRNA vaccine protocols created by LLMs and humans as research partners — Paul used ChatGPT for planning, education, troubleshooting, compliance, and scientific design to create a vaccine protocol for his dog Rosie; Sam Altman flagged this as “should be a company” territory.
Memory & Personalization Problems
LLM personalization overshoots: one old question haunts forever — Models from every lab keep resurfacing single questions from months ago, treating anything that once entered the context window as evidence of a deep, lasting interest.
Memory overfitting likely rooted in training dynamics — Models developed biases to use whatever context is available during training; at test time they overfit to RAG’d memories as if they indicate deep user interests.
Tools & Infrastructure
Context Hub: agents accessing up-to-date API docs via CLI — Open tool that eliminates hallucinated parameters; agents can annotate workarounds and share learnings, with longer-term goal of agents teaching each other across sessions.
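A local doc-and-annotation store along these lines could be sketched as follows; `DocStore`, its JSON file layout, and the `lookup`/`annotate` names are illustrative guesses, not Context Hub's actual interface:

```python
import json
import pathlib


class DocStore:
    """Local store of API doc snippets plus agent-added annotations.

    Hypothetical sketch: a JSON file maps symbol names to a doc string
    and a list of notes (workarounds) that agents append over time.
    """

    def __init__(self, path: str):
        self.path = pathlib.Path(path)
        # Load existing store if present, else start empty.
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def lookup(self, symbol: str) -> dict:
        # Return the doc entry, or an empty shell for unknown symbols.
        return self.data.get(symbol, {"doc": None, "notes": []})

    def annotate(self, symbol: str, note: str) -> None:
        # Record a workaround and persist immediately, so another
        # agent session reading the same file sees the learning.
        entry = self.data.setdefault(symbol, {"doc": None, "notes": []})
        entry["notes"].append(note)
        self.path.write_text(json.dumps(self.data))
```

The point of the sketch is the sharing mechanism: annotations survive the session that wrote them, so a second agent reading the same store inherits the first one's workarounds.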
Agent memory systems enabling multi-session learning — New course on building memory managers that let agents persist knowledge across sessions and semantically retrieve only relevant tools without bloating context.
iMessage now available as Claude Code channel — Multi-modal agent access expanding to SMS/messaging platforms for broader integration.
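The memory-manager pattern described above (persist across sessions, retrieve only what's relevant) could look roughly like this minimal sketch; the token-overlap scorer is a cheap stand-in for real embedding similarity, and all class and method names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    text: str
    session: str  # which session wrote this memory


class MemoryManager:
    """Persists notes across sessions; retrieves only entries
    relevant to the current query, instead of bloating context
    with everything ever remembered."""

    def __init__(self):
        self.entries: list[MemoryEntry] = []

    def remember(self, text: str, session: str) -> None:
        self.entries.append(MemoryEntry(text, session))

    def _score(self, query: str, entry: MemoryEntry) -> float:
        # Toy relevance: Jaccard overlap of lowercased tokens.
        # A real system would use embedding cosine similarity.
        q = set(query.lower().split())
        e = set(entry.text.lower().split())
        return len(q & e) / (len(q | e) or 1)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Rank all memories, keep the top-k with nonzero relevance.
        ranked = sorted(self.entries, key=lambda e: self._score(query, e), reverse=True)
        return [e.text for e in ranked[:k] if self._score(query, e) > 0]
```

The zero-score filter is the part that addresses the "haunting old question" problem from the personalization items above: irrelevant memories simply never re-enter the context.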
Security & Threat Vectors
Filesystem as distributed codebase creates massive attack surface — Every file that could go into agent context becomes an attack vector; credentials can hide in ~/.claude, skill directories, or PDFs in morning briefs; identity theft will dwarf past incidents when agents have filesystem access.
On-demand software requires layered permission guardrails — Gap between “pressing yes mindlessly” and “dangerously skip permissions” will spawn an industry of “de-vibing” tools: audited Software 1.0 watching over Software 3.0 agents; agents need shells, probably many nested layers.
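One way to picture the "many nested layers" idea: a chain of independent guards where any hard deny wins and any uncertainty escalates to a human, so mindless "yes" and blanket "skip permissions" are both avoided. Everything here (guard functions, verdict strings, blocked paths) is an illustrative sketch, not any shipping tool's policy:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    tool: str    # e.g. "shell", "read_file", "write_file"
    target: str  # path or command the agent wants to touch

# A guard inspects one action and votes "allow", "deny", or "ask".
Guard = Callable[[Action], str]


def deny_secrets(action: Action) -> str:
    # Layer 1: hard deny on credential-bearing paths
    # (the ~/.claude / dotfile attack surface above).
    blocked = (".ssh", ".aws", "credentials", ".claude")
    return "deny" if any(b in action.target for b in blocked) else "allow"


def sandbox_paths(action: Action) -> str:
    # Layer 2: writes outside the project sandbox need a human.
    if action.tool == "write_file" and not action.target.startswith("./project/"):
        return "ask"
    return "allow"


def run_guarded(action: Action, guards: list[Guard]) -> str:
    # Combine layers: any deny wins, any ask escalates, else allow.
    verdicts = [g(action) for g in guards]
    if "deny" in verdicts:
        return "deny"
    if "ask" in verdicts:
        return "ask"
    return "allow"
```

Because each guard is plain, auditable Software 1.0, layers can be added independently, which is the "de-vibing tools watching Software 3.0 agents" shape the item above describes.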
Robot Learning at Scale
Humanoid with dexterous 22-DoF hands trained from human video alone — Assembled model cars, operated syringes, sorted cards, folded shirts—all learned from 20,000+ hours of egocentric human video with zero robot-in-the-loop; humans are the most scalable embodiment.
EgoVerse ecosystem enables robot learning without robots — Behavior cloning from human video is replacing teleop; 2026 is about scaling robot learning via egocentric data, tested across 4 labs + 3 industry partners.
Workplace & Culture
Closed-door offices > million-dollar salaries in tech — Companies pay millions then sabotage productivity with open-plan offices; offering an office with a door is probably the best employee retention move.
Remote work normalized the death of focus spaces — WFH became the assumed fallback, making office environments worse for people who need uninterrupted work and eroding the advantage of offering traditional offices.
Creative AI
AI-generated 100M Gaussian splats, but human imagination made it art — A single creator built a cyberpunk world in Marble; technical capability matters less than vision.
Source provenance
- Original title: AI Digest — Mar 28, 2026 Morning
- Original title: AI Digest — Mar 27, 2026 Evening
- Normalized from old import files backed up outside the vault at:
/Users/skypawalker/.hermes/backups/obsidian-digests-pre-normalize-2026-05-10
Navigation
- Previous: X-AI-2026-03-26
- Next: X-AI-2026-03-30