-
Composer: Building a fast frontier model with RL
Cursor has developed Composer, a new agent model for software engineering that achieves frontier coding results four times faster than similar models. Composer is a mixture-of-experts model trained with reinforcement learning to solve real-world software engineering challenges using various tools, including code editing and semantic search. -
Agentic AI and Security
Agentic AI systems have a fundamental security flaw: LLMs cannot separate instructions from data, making them vulnerable to prompt injection attacks where untrusted content contains hidden commands. The risk is greatest when an AI has access to sensitive data, reads untrusted content, and can communicate externally, allowing attackers to steal information by embedding instructions in sources like Jira tickets or web pages. To prevent this, it’s best to run LLMs in sandboxed containers, a… -
Why I code as a CTO
This CTO prioritizes coding despite the conventional wisdom that senior leaders should primarily manage. He divides his coding efforts into experimental projects, critical customer requests, and bug fixes. AI tools help boost his productivity. -
The Programmer Identity Crisis
The rise of AI and LLMs threatens the core identity of programmers as craftspeople. Vibe-coding and specification engineering have turned programmers into operators rather than creative problem-solvers. This shift removes a lot of the joy and fulfillment of programming. -
LLMs Can Get “Brain Rot”!
Continuous exposure to low-quality, “junk” web data can cause a lasting cognitive decline in LLMs. Through controlled experiments, researchers found that pre-training LLMs on data from X selected for high engagement or sensationalism negatively impacted reasoning, long-context understanding, and ethical behavior. The primary error was identified as “thought-skipping,” where models truncated reasoning steps, and standard fine-tuning methods only partially mitigated the damage, showing a persis… -
Claude Code on the Web
Anthropic has launched Claude Code on the web, a beta feature allowing users to delegate coding tasks to Claude directly from their browser. This cloud-based service allows for parallel execution of coding tasks in a sandboxed environment, connecting to GitHub repos and providing real-time progress tracking. -
After the AI boom: what might we be left with
The current AI boom differs from the dotcom era due to its focus on proprietary, short-lived systems rather than open infrastructure. Investment is primarily directed towards specialized GPUs and AI data centers optimized for specific vendors, creating closed ecosystems. However, a potential overbuild could lead to cheaper access to powerful compute for everyone. -
The State of AI Adoption in Engineering Teams 📊
A survey of 435 engineers and engineering managers shows that 77% use AI daily for personal work, mainly for coding and automating repetitive tasks. However, team-level adoption is still largely unstructured, with most organizations providing tool access without establishing shared practices or workflows. -
Effective Context Engineering for AI Agents
Context engineering focuses on optimizing the tokens within an AI model’s limited context window for the best outcomes. Good context engineering means curating and maintaining the most relevant information for each inference, including system instructions, tools, data, and message history. Models can experience “context rot” and lose focus when the context fills with excessive or irrelevant information, so agents like Claude Code use compaction, note-taking, and sub-agents for better context management. -
Only 100 Metrics Matter
Companies track thousands of numbers, but almost none of them change decisions. In reality, about 100 metrics explain 90% of what’s happening. Once you break the business into equations (growth, engagement, revenue), you see which levers actually matter. Everything else lives in the long tail: useful for edge cases, not for running the company. -
The AI Growth Endurance Problem
AI startups are growing faster than any SaaS wave before them, but speed hides cracks. Today’s AI boom is like a sprint run on borrowed stamina: margins are thin, retention untested, and switching costs low. Early growth looks great because customers are new and competition is still forming. The real test comes when renewals hit. Growth endurance, not growth speed, decides who survives once the hype cools. -
The ChatGPT App Store Moment
OpenAI just launched apps inside ChatGPT where you can say “Spotify, make me a playlist” or “DoorDash, order my usual” and it handles everything conversationally. This looks like early iPhone web apps, not true native experiences yet, but the trajectory is clear. As OpenAI expands its SDK to give developers access to user memory, persistent state, and action execution, we’ll see actual native ChatGPT applications designed for conversational interfaces. -
From Agent Hype to Agent Fatigue
Everyone said 2025 would be the year of AI agents, and they were right, just not in the way they expected. Companies raced to rebrand every workflow as “agentic,” but the reality is showing cracks: 40% of agentic projects are predicted to fail by 2027, and employees are burning out under the pressure to use tools they barely understand. -
From managing people to managing AI: How the same leadership skills apply in the age of AI
Julie Zhuo emphasizes how AI turns everyone into managers, requiring traditional management skills to handle AI agents effectively. Her “diagnose with data, treat with design” framework highlights balancing intuition and data in product development. According to Zhuo, AI-driven hypergrowth often leads to poor data infrastructure, but this isn’t a significant obstacle. -
First real samples of Veo 3.1 generated videos
Veo 3.1 marks a noticeable step forward in prompt fidelity and visual/audio quality. It doesn’t have the same issues as Veo 3, like occasional oddities with object proportions. The model demonstrates a better understanding of nuance, creating videos that closely match prompt intent. Traces of the model have appeared in Vertex AI and Google Vids. An official release is likely in the coming weeks.
This Month in Tech: October 2025
TLDR of the TLDR: October 2025 in Tech