This Month in Tech: March 2025

TLDR of the TLDR: March 2025 in Tech

  1. Gemini 2.5: Our most intelligent AI model
    Google DeepMind has launched Gemini 2.5. The first release, Gemini 2.5 Pro Experimental, leads reasoning and coding benchmarks, including topping the LMArena leaderboard. This “thinking model” combines a strong base model with improved post-training, allowing it to handle complex problems and analyze information more effectively.

  2. Open Sora Release (GitHub Repo)
    The Open Sora effort, which has been underway since the original Sora launch, has trained a competitive video model for less than $200k and released all of the code and weights needed to reproduce the results. The generated motion is quite compelling, though not yet state-of-the-art.

  3. Chinese AI lab DeepSeek just released the latest version of its enormous DeepSeek v3 model
    The latest version of DeepSeek’s v3 model, DeepSeek-V3-0324, is now released under an MIT license; the release totals 641 GB of files.

  4. 95% AI-written code? Unpacking the Y Combinator CEO’s developer jobs bombshell
    Y Combinator’s CEO claims that 25% of the accelerator’s startups are launching with 95% AI-written code. This has led to smaller engineering teams, greater pressure to reach profitability quickly, and a shift in human engineers’ roles toward AI oversight and complex problem-solving.

  5. A Deep Dive Into MCP and the Future of AI Tooling
    MCP is an open protocol that allows systems to provide context to AI models in a manner that’s generalizable across integrations.
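    The core idea is small: a server exposes typed “tools” over JSON-RPC-style messages, and any client that speaks the protocol can discover and invoke them. The sketch below illustrates that shape in plain Python; method names follow MCP’s `tools/list` / `tools/call` convention, but the dispatcher and tool registry are simplified stand-ins, not the official MCP SDK.

```python
# Illustrative sketch of the MCP idea: a server registers tools, a client
# discovers them via tools/list and invokes them via tools/call.
# This is a toy dispatcher, not the real MCP SDK.

TOOLS = {
    "get_weather": {
        "description": "Return a canned weather string for a city.",
        "handler": lambda args: f"Sunny in {args['city']}",
    },
}

def handle_message(msg: dict) -> dict:
    """Dispatch one JSON-RPC-style request to the matching protocol method."""
    if msg["method"] == "tools/list":
        result = [
            {"name": name, "description": tool["description"]}
            for name, tool in TOOLS.items()
        ]
    elif msg["method"] == "tools/call":
        tool = TOOLS[msg["params"]["name"]]
        result = tool["handler"](msg["params"]["arguments"])
    else:
        return {"id": msg["id"], "error": "method not found"}
    return {"id": msg["id"], "result": result}

# A model's harness would send messages like these:
listing = handle_message({"id": 1, "method": "tools/list"})
call = handle_message({
    "id": 2,
    "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Paris"}},
})
print(call["result"])  # → Sunny in Paris
```

    Because the discovery and invocation shapes are standardized, the same tool server works for any MCP-capable model, which is the “generalizable across integrations” part.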

  6. The Model is the Product
    AI is evolving towards a model-as-a-product paradigm, where the model itself, rather than the application layer, becomes the primary source of value. This shift is driven by the stalling of generalist scaling, the success of opinionated training, and the plummeting cost of inference, leading to closed AI model providers moving up the value chain and disrupting the application layer. Examples include OpenAI’s DeepResearch and Claude 3.7.

  7. How Cursor (AI IDE) Works
    AI IDEs like Cursor fork VSCode, integrate a chat UI, implement coding-agent tools like read_file and write_file, and optimize internal prompts. Users should simplify tasks for the main LLM agent by using smaller models for sub-tasks, and provide explicit context through “@file” syntax and well-commented code.
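    The tool layer described above can be sketched in a few lines: the LLM emits tool calls, a dispatcher executes them, and results are fed back into the conversation. In this toy version the tool-call sequence is scripted by hand where a real IDE would parse it from model output; the tool names mirror the ones mentioned, but everything else is an illustrative assumption.

```python
import os
import tempfile

# Toy version of an AI IDE's tool layer: real tools, scripted "model".

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def write_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return f"wrote {len(content)} bytes"

TOOLS = {"read_file": read_file, "write_file": write_file}

def run_agent(tool_calls):
    """Execute (tool_name, kwargs) pairs in order, as an agent loop would
    after parsing each LLM response, and collect the tool results."""
    return [TOOLS[name](**kwargs) for name, kwargs in tool_calls]

# Scripted stand-in for what a model might emit for "add a docstring":
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "util.py")
    write_file(path, "def add(a, b):\n    return a + b\n")
    out = run_agent([
        ("read_file", {"path": path}),                      # gather context
        ("write_file", {"path": path, "content":
            '"""Utils."""\ndef add(a, b):\n    return a + b\n'}),
        ("read_file", {"path": path}),                      # verify the edit
    ])
final = out[-1]
print(final.splitlines()[0])  # → """Utils."""
```

    The “@file” syntax is essentially a user-triggered read_file: it pastes the file into the prompt so the model does not have to discover the context itself.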

  8. How OpenAI is building its moat
    OpenAI is shifting focus from models themselves to building comprehensive application and integration layers to maintain its position. This involves improving ChatGPT with features and developing APIs that make LLM application development easier. The company seeks to lock developers into its ecosystem by providing a complete framework.

  9. Introducing Gemma 3: The most capable model you can run on a single GPU or TPU
    Google DeepMind has released Gemma 3, a collection of state-of-the-art open models built from the same technology as Gemini 2.0 and designed to be fast and run directly on devices. Gemma 3 comes in various sizes (1B to 27B) and offers strong performance, multilingual support for 140 languages, advanced reasoning capabilities, and a large 128k-token context window.

  10. Goodbye, Skype: A product insider’s take on the iconic app’s fall
    The downfall of Skype was due to a lack of focus, prioritization of features over quality, and a failure to capitalize on opportunities like the COVID-19 pandemic. Plus, Microsoft’s focus on Teams made Skype redundant and obsolete.

  11. The New AI Risk Curve
    AI startups grow faster but fade faster too. They can reach $50M ARR quickly but might shrink in year four as their lead vanishes. Unlike past tech eras where physical infrastructure (on-premise) or engineering teams (cloud) created barriers to entry, AI tools make it easy for competitors to catch up at any point. This creates a new risk curve where impressive growth numbers no longer guarantee lasting success as technology keeps shifting under everyone’s feet.

  12. Mistral Small 3.1
    The newly released Mistral Small 3.1 model surpasses competitors like Gemma 3 and GPT-4o Mini with enhanced text performance, multimodal understanding, and a 128k token context window. It’s designed for various AI applications, including instruction following, conversational assistance, and image processing, offering fast inference speeds and running efficiently on hardware like an RTX 4090. Available under Apache 2.0, Mistral Small 3.1 can be accessed through Hugging Face, used via API on Mi…

  13. OpenAI adopts rival Anthropic’s standard for connecting AI models to data
    OpenAI will support Anthropic’s Model Context Protocol (MCP) across its products, enhancing AI models’ data-driven capabilities. MCP, an open-source standard, allows AI models to connect with various data sources like business tools and software. Companies like Block, Replit, and Sourcegraph have adopted MCP. OpenAI plans to share more about its integration soon.

  14. Anthropic’s Red Team Warns of AI Security Risks
    AI models are rapidly advancing in cybersecurity and biology, approaching expert-level knowledge in some areas. While current risks remain manageable, the company warns that stronger safeguards will be needed as capabilities progress.

  15. Google’s new robot AI can fold delicate origami, close zipper bags without damage
    Google DeepMind has unveiled Gemini Robotics and Gemini Robotics-ER, AI models enhancing robots’ fine motor skills and adaptability for real-world applications. Gemini Robotics integrates vision, language, and action capabilities, allowing robots to perform complex tasks like origami folding. Early results highlight significant improvements in generalization and dexterity over previous models, and Google is partnering with Apptronik and others for further development.

  16. QwQ 32B reasoning model
    The Qwen team has trained an open-weight, Apache 2.0-licensed model that performs on par with DeepSeek R1 and better than many of the larger distilled models. They found that by building on outcome-based rewards with formal verification and test-case checks, the model can continuously improve at math and code. Also, by mixing in general instruction-following data later in RL training, the model can still align with human preference.
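    The outcome-based reward for code described above can be sketched as a verifier that executes a candidate program against test cases and pays out only on full success. The RL loop itself is omitted; the `solve` entry-point name and the reward values are illustrative assumptions, not Qwen’s actual setup.

```python
# Sketch of an outcome-based reward for code: reward 1.0 only if the
# candidate passes every test case, 0.0 otherwise. (Toy verifier; a real
# RL pipeline would sandbox execution.)

def outcome_reward(candidate_src: str, test_cases) -> float:
    """Run candidate code, then check it against (args, expected) pairs."""
    namespace = {}
    try:
        exec(candidate_src, namespace)      # candidate must define `solve`
        solve = namespace["solve"]
        ok = all(solve(*args) == expected for args, expected in test_cases)
    except Exception:
        ok = False                          # crashes earn no reward
    return 1.0 if ok else 0.0

tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
good = "def solve(a, b):\n    return a + b\n"
bad = "def solve(a, b):\n    return a - b\n"
print(outcome_reward(good, tests), outcome_reward(bad, tests))  # → 1.0 0.0
```

    Because the signal is binary and machine-checkable, it scales without human labeling, which is what makes math and code such natural domains for this style of RL.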