When the history books look back at this week in AI, they will definitely point to DeepSeek as the driving force. But did this week actually change everything, as it seemed it might at the beginning, or was it all a bit overblown? The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief. Quick note: obviously we've had a couple of days of interviews in a row, and there has been a lot going on. So this episode is going to be kind of an extended headlines mixed with the main episode. The big dominant theme continues to be DeepSeek and everything we've learned about it, and so we are going to pick up the story that we left off a couple of days ago. But as you'll see, we're getting into AI earnings season as well, plus a bunch of OpenAI rumors. So we'll connect all the dots. But where we are going to start, like I said, is where we left off when it comes to this DeepSeek conversation.
Now, as we were leaving off, one of the big questions was how legitimate these breakthroughs were. Was this real innovation? Was this stolen innovation? Were the costs actually that low? Were they being subsidized? Was it true that these models were actually trained for what they cost? Or were there secret Nvidia chips?
One of the notable things about this story was how fast it rose up the ranks in the White House. Clearly, there was a geopolitical dimension to this, and perhaps unsurprisingly then, the White House seized upon accusations that DeepSeek hadn't actually been all that innovative. Speaking with Fox News on Tuesday, AI czar David Sacks said: there's a technique in AI called distillation, where one model learns from another. They can essentially mimic the reasoning process that they learn from the parent model. There's substantial evidence that what DeepSeek did here is they distilled the knowledge out of OpenAI's models.
One of the things you're going to see over the next few months is our leading AI companies taking steps to try to prevent distillation. Sacks appears to be referring to reporting from the Financial Times, which quoted OpenAI sources as stating that they had, quote, seen some evidence of distillation, which it suspects to be from DeepSeek.
Now, Sacks explained it pretty well, but basically distillation allows AI labs to train a model on synthetic data created by a larger, more performant model while retaining much of that model's performance. OpenAI has actively encouraged model distillation in the past, even launching their own platform for carrying out the process in October. DeepSeek also documented their model distillation technique in their technical paper but didn't identify the parent model.
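To make the idea concrete, here's a tiny, self-contained sketch of the classic distillation objective, assuming nothing beyond NumPy: a "student" with its own logits is trained to match a "teacher's" temperature-softened output distribution. Real model distillation happens at the scale of billions of parameters and full datasets (or, as in the R1 discussion, on sampled text outputs), but the core idea of learning from another model's outputs rather than raw data is the same.

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()              # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum()

# Toy "teacher": fixed logits over a 3-token vocabulary for one context.
teacher_logits = np.array([4.0, 1.0, 0.5])

T = 2.0                          # temperature softens the teacher's distribution
target = softmax(teacher_logits, T)

# Toy "student": starts knowing nothing, trained to match the teacher.
student_logits = np.zeros(3)
lr = 0.5
for _ in range(500):
    p = softmax(student_logits, T)
    # Gradient of cross-entropy(target, p) with respect to the student logits.
    student_logits -= lr * (p - target) / T

# The student now reproduces the teacher's softened distribution without
# ever seeing the teacher's weights or original training data.
print(np.round(softmax(student_logits, T), 3))
```

The point of the toy example is why distillation is "a substantially easier task": the student gets a dense, already-organized training signal from the teacher instead of having to assemble and learn from trillions of raw tokens.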
Now, OpenAI obviously isn't going to try to pursue an IP lawsuit against a China-based company, but from the perspective of the administration, at least in the short term, it allows for some amount of narrative to downgrade DeepSeek's achievement. The argument goes, basically: training a capable frontier model from scratch is a difficult and time-consuming task, with assembly of the training data being one of the biggest hurdles.
Distilling a new model from the outputs of a more capable model is a substantially easier task. It also implies that R1 only proves that DeepSeek is capable of replicating leading US models, rather than topping the benchmarks with performance breakthroughs. Now, I will say that one of the more common responses to OpenAI raising questions about model distillation was a sort of pot-calling-the-kettle-black argument, given that they are embroiled in a number of copyright lawsuits. But ultimately, like I said, at this point I think that this distillation idea is much more about controlling the narrative than anything else.
The claimed less than $6 million training cost for DeepSeek has been another big part of the conversation. And for some technical folks, the claim seems to stand up. Jack Clark of Anthropic commented, The most surprising part of DeepSeek R1 is that it only takes around 800,000 samples of good reinforcement learning reasoning to convert other models into RL reasoners. Now that DeepSeek R1 is available, people will be able to refine samples out of it to convert any other model into an RL reasoner.
An X account going by Accelerate Harder did the math on how much compute would be needed to build DeepSeek's foundation model, posting: DeepSeek V3 has 37 billion active parameters. They trained on 14.8 trillion tokens. The FLOPs estimate to train 37 billion parameters times 14.8 trillion tokens is 3.3e24 FLOPs, totally achievable with 2.8 million H800 hours. For people who don't buy this, where exactly do you think the extra compute is being spent?
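You can sanity-check that arithmetic yourself. The sketch below uses the standard 6 × parameters × tokens approximation for dense-transformer training FLOPs; the H800 throughput and utilization figures are illustrative assumptions for the conversion to GPU-hours, not DeepSeek's disclosed specs.

```python
# Back-of-envelope check of the quoted numbers. 6 * params * tokens is the
# standard approximation for transformer training FLOPs; for a
# mixture-of-experts model like V3, only the *active* parameters count.
active_params = 37e9          # active parameters per token
tokens = 14.8e12              # training tokens
train_flops = 6 * active_params * tokens
print(f"{train_flops:.2e} FLOPs")            # 3.29e+24 FLOPs

# How many H800-hours would that take? The peak-throughput and utilization
# numbers below are assumptions, chosen for illustration.
h800_bf16_flops = 989e12      # assumed dense BF16 peak, FLOP/s
utilization = 0.40            # assumed model FLOPs utilization
gpu_hours = train_flops / (h800_bf16_flops * utilization * 3600)
print(f"{gpu_hours / 1e6:.1f}M H800-hours")  # 2.3M H800-hours
```

Under those assumptions the total lands in the low millions of H800-hours, in the same ballpark as the 2.8 million hours quoted, which is the post's point: the claimed budget is not obviously implausible.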
Earlier in the week, investor Naval Ravikant had posted: small technical teams are already starting to confirm that the techniques and resulting cost savings are real. And of course, we have a way to actually figure this out, or at least get closer to the truth. Small-scale experiments to replicate and uncover whether the breakthroughs were genuine were spun up basically as soon as the paper was available last Monday. A Berkeley lab led by Jiayi Pan has already completed a tiny proof of concept. They trained an extremely small 1.5 billion parameter reasoning model for just $30 in compute.
Junxian He, an assistant professor at Hong Kong University of Science and Technology, has published a larger replication. His team added reasoning to the Qwen 7B model using 8,000 reinforcement learning samples. Technically, this was actually an independent, concurrent discovery of the process, with his team working on the project for the last two months.
A full-scale replication attempt is currently underway at Hugging Face. Their team is repeating the method described in the DeepSeek paper using their own dataset, as DeepSeek hasn't disclosed theirs. The project uses the Hugging Face science cluster, which contains 768 Nvidia H100s. This should be roughly equivalent to the limited resources claimed by the DeepSeek team.
Elie Bakouch, one of the engineers on the project, said: the R1 model is impressive, but there's no open dataset, experiment details, or intermediate models available, which makes replication and further research difficult. Fully open sourcing R1's complete architecture isn't just about transparency; it's about unlocking its potential.
Whatever the truth of the claims, the fact that some people are skeptical has not at all stopped US-based AI startups from beginning to adopt DeepSeek R1. Generally, there are two categories that we're seeing. The first is AI startups that serve other companies' models through their own UX, and these were, of course, some of the fastest to add R1. Perplexity is a lead example, standing up access earlier in the week. Interestingly, the Perplexity team has managed to set up a version of DeepSeek with their own system prompts to circumvent Chinese content controls.
Practically, that means the model will now explain what happened in Tiananmen Square or why Winnie the Pooh memes were popular in Hong Kong a few years ago. On Wednesday, Microsoft announced that they had made R1 available on Azure AI Foundry and GitHub. Amazon followed suit the next day, adding R1 to AWS Bedrock and SageMaker. Apple didn't make any integration moves, but CEO Tim Cook did say during an earnings call: I think innovation that drives efficiency is a good thing, and that's what you see in that model.
As a whole separate conversation, beyond the scope of what we're doing here, this ability to plug and play different models and switch them out at will highlights how small the moat is in many circumstances for AI model companies. The switching costs are so low that startups and enterprises can quickly plug the latest model into their existing infrastructure.
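As a hypothetical illustration of just how low those switching costs can be: many vendors expose OpenAI-compatible chat-completions endpoints, so "swapping the model" often amounts to changing a base URL and a model name in config. The provider table and model names below are illustrative assumptions, not guaranteed current API details.

```python
# Hypothetical provider registry; URLs and model names are assumptions.
PROVIDERS = {
    "openai":   {"base_url": "https://api.openai.com/v1",   "model": "o1-mini"},
    "deepseek": {"base_url": "https://api.deepseek.com/v1", "model": "deepseek-reasoner"},
}

def build_request(provider: str, prompt: str) -> dict:
    """Assemble the URL and JSON payload for a chat-completions style call."""
    cfg = PROVIDERS[provider]
    return {
        "url": f"{cfg['base_url']}/chat/completions",
        "json": {
            "model": cfg["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Swapping vendors is one argument; the application code is untouched.
for name in PROVIDERS:
    print(build_request(name, "hello")["url"])
```

When the integration surface is this thin, an app can route to whichever model is cheapest or best this week, which is exactly why the moat for model providers looks so small.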
Now, the second type of adoption we're seeing is startups using R1 in their own work. Pat Gelsinger, former Intel CEO and chairman of Gloo, told TechCrunch: my Gloo engineers are running R1 today. They could have run o1, well, they could only access o1 through the APIs. His team is currently working on an AI service called Kallm, which will offer a chatbot and related features.
Gelsinger said that with the help of R1, his team expects to have rebuilt Kallm, quote, with our own foundation model that's all open source. That's exciting. Gelsinger's big-picture view is that R1 proves not only that AI will be affordable enough to be everywhere, but that high-performance AI will be everywhere, commenting: I want better AI in my Oura Ring, I want better AI in my hearing aid, I want more AI in my phone, I want better AI in my embedded devices.
Framing his view on the technology, he wrote on X: wisdom is learning the lessons we thought we already knew. DeepSeek reminds us of three important learnings from computing history. One, computing obeys the gas law: making it dramatically cheaper will expand the market for it. The markets are getting it wrong; this will make AI much more broadly deployed. Two, engineering is about constraints. The Chinese engineers had limited resources, and they had to find creative solutions. Three, open wins. DeepSeek will help reset the increasingly closed world of foundational AI model work.
Box CEO Aaron Levie commented: anyone building enterprise AI applications knows that the cost and quality of AI are the only two factors that matter in AI adoption right now. This is why DeepSeek's breakthroughs are such a big deal. Enterprise AI is the rare category of technology where the use case demand generally far exceeds the ability to satisfy all of those use cases well. This is fantastic; the opposite would mean that there's less demand than the tech is capable of, but of course it's only good news if you can eventually meet the demand. Chipmaking startup Cerebras is using DeepSeek as an opportunity to demonstrate their technology.
They plan to host a version of R1 on their US servers, powered by their wafer-scale hardware. Traditional GPUs are built using a single chip cut out of a larger wafer during the manufacturing process. These are then networked together to construct AI training and inference clusters.
The architecture Cerebras has built allows multiple GPU cores on a larger wafer to function as one large chip, about the size of a manhole cover. This places the networking on the chip, allowing for much faster communication than external wiring. Cerebras says their servers can run the 70B version of DeepSeek R1 57 times faster than GPU-based solutions. This is particularly important for reasoning models, which use significantly more compute at the inference stage to generate responses.
And while R1 seems to be more efficient than some U.S. models, inference demands are still very high. Remember, the market crash at the beginning of the week was largely about the fear that DeepSeek meant demand would drop for AI chips and data centers. It seems most market analysts believed the bulk of the hundreds of billions of spending from AI labs went into infrastructure for training. Meta chief scientist Yann LeCun disputed this idea, commenting: major misunderstanding about AI infrastructure investments. Much of those billions are going into infrastructure for inference, not training.
Running AI assistant services for billions of people requires a lot of compute. Once you put video understanding, reasoning, large-scale memory, and other capabilities in AI systems, inference costs are going to increase. The only real question is whether users will be willing to pay enough directly or not to justify the CapEx and OpEx. So the market reactions to DeepSeek are woefully unjustified.
Indeed, one of the really big takeaways of this week is that much of the AI race is about inference. In other words, models from multiple labs in both China and the US are good enough for many tasks at this stage, and the competition is instead based on who can serve the cheapest, fastest, and most stable AI. Kojo Osei, a partner at Matrix Ventures, writes: an under-discussed DeepSeek implication: if we can turn any decent base model into a powerful reasoning model, compute spend shifts more dramatically to inference.
Meanwhile, Perplexity CEO Aravind Srinivas believes the implications go much deeper.
Another big conclusion being drawn is that everything in the AI stack other than the final user experience is getting commoditized at a breakneck pace. Peter Yang, principal product lead at Roblox, wrote: soon people will care more about their favorite AI apps than the models powering them. I don't care which model is powering Perplexity, Granola, or Replit; I care more that they have high craft and thoughtful UX, lightning-fast speed, and seamless integration into my workflows. It's a great time to build AI apps.
Responding to a revelation that Cursor has 100% adoption among Stanford CS undergrads and Y Combinator founders, Suhail Doshi, the founder of Playground AI, commented: the app layer will win; everything else will get commoditized. You won't even know what model is used under Cursor soon. It's just the best one, because you trust them.
And just as we got some comments from OpenAI, we also got comments from the leadership of Anthropic, specifically Dario Amodei. On Wednesday, he condensed his thoughts into a blog post entitled On DeepSeek and Export Controls. His central premise was that DeepSeek is not significantly ahead of US labs. He noted that the media narrative has latched onto the idea that DeepSeek spent $6 million to achieve a model that would cost US labs billions to train. Amodei disclosed that Claude 3.5 Sonnet didn't cost billions to train; its costs were in the tens of millions of dollars.
The expensive part is the gigantic data centers required to serve inference for the models once they're released to the public. He claimed DeepSeek produced a model close to the performance of US models seven to ten months older, for a good deal less cost, but not anywhere near the ratios people have suggested. Amodei explained that US companies have observed an annual 4x reduction in training costs for several years, adding: DeepSeek V3 is not a unique breakthrough or something that fundamentally changes the economics of LLMs. It's an expected point on an ongoing cost reduction curve.
What's different this time is that the company that was first to demonstrate the expected cost reductions was Chinese. Amodei did acknowledge that some of the compression and optimization techniques presented in the DeepSeek paper are genuine innovations. However, he expects these techniques to now be applied at a much larger scale by leading labs in the US and China, keeping them on the same cost reduction curve.
He concluded: the performance of DeepSeek does not mean the export controls failed. DeepSeek had a moderate to large number of chips, so it's not surprising that they were able to develop and then train a powerful model. They were not substantially more resource-constrained than US AI companies, and the export controls were not the main factor causing them to innovate. They are simply very talented engineers, and they show why China is a serious competitor to the US.
Now, another area where discussion of DeepSeek showed up this week was big tech earnings calls. You heard that Apple's Tim Cook was asked about it, but Meta, who reported this week, could be more impacted than most after the company bet it all on being the leader in open source AI.
However, during Wednesday's earnings call, CEO Mark Zuckerberg didn't seem the least bit concerned about new competition out of China.
With such a short time since the world recognized the pace of Chinese development, Zuckerberg added: it's probably too early to really have a strong opinion on what this means for the trajectory around infrastructure and CapEx and things like that. There are a bunch of trends happening here all at once. Now, Meta has committed to spending $60 billion on new data centers this year, and if anything, DeepSeek increased Zuckerberg's conviction that the investment will pay off. He commented: I continue to think that investing very heavily in CapEx and infra is going to be a strategic advantage over time.
It's possible that we'll learn otherwise at some point, but it's way too early to call that. At this point, I would bet that the ability to build out that kind of infrastructure is going to be a major advantage for both the quality of the service and being able to serve the scale that we want to. Now, internally at Meta, the mood seems a little more urgent. A recording of the company's first all-hands of the year was leaked late this week.
We will skip aggressively over discussions of changes to content policies, the end of the company's DEI training, and impending layoffs. Suffice it to say there seems to be a reasonable level of discontent within the company, but much of the meeting was focused on their AI strategy. Zuckerberg is gunning to get penetration with Llama's free and open source approach, stating: I'm always looking for ways that we can convert the strength of our business model into delivering a higher-quality product to people. We have a model that's competitive with the best models out there, and we offer it for free. We're not charging $20 or $200 a month or whatever.
Now, I think that there might be an opportunity to do even more. We can deliver even higher-quality answers than other people in the industry could deliver, and also make that free. Addressing DeepSeek, he said: whenever I see someone else do something, I'm like, ah, come on, we should have been there, right? We've got to make sure that we're on it. Zuckerberg also tried to assure his team that they weren't going to be replaced by AI. Referencing the company's plan to build a high-quality coding agent, he said: does that mean that we're not going to need engineers? Actually, the opposite.
If an engineer can now do a hundred times more work, I want a lot more engineers, right? I would guess that we're going to be able to train AIs to do a better job than a lot of the human reviewers. It's probably not the case that that kind of flip will happen until next year. Overall, Zuckerberg left with a parting message which I think listeners to this show this week will have no trouble understanding. It's going to be an intense year, he said, so buckle up. We've got a lot to do. I'm excited about it.
Today's episode is brought to you by Vanta. Trust isn't just earned, it's demanded. Whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex.
That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001, centralize security workflows, complete questionnaires up to 5x faster, and proactively manage vendor risk. Vanta can help you start or scale up your security program by connecting you with auditors and experts to conduct your audit and set up your security program quickly.
Plus, with automation and AI throughout the platform, Vanta gives you time back, so you can focus on building your company. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time. For a limited time, this audience gets $1,000 off Vanta at vanta.com slash NLW. That's V-A-N-T-A dot com slash NLW for $1,000 off.
If there is one thing that's clear about AI in 2025, it's that the agents are coming. Vertical agents by industry, horizontal agent platforms, agents per function. If you are running a large enterprise, you will be experimenting with agents next year. And given how new this is, all of us are going to be back in pilot mode.
That's why Super Intelligent is offering a new product for the beginning of this year. It's an agent readiness and opportunity audit. Over the course of a couple quick weeks, we dig in with your team to understand what type of agents make sense for you to test, what type of infrastructure support you need to be ready, and to ultimately come away with a set of actionable recommendations that get you prepared to figure out how agents can transform your business.
If you are interested in the agent readiness and opportunity audit, reach out directly to me at www.bsuper.ai. Put the word agent in the subject line so I know what you're talking about. And let's have you be a leader in the most dynamic part of the AI market.
Hello, AI Daily Brief listeners. Taking a quick break to share some very interesting findings from KPMG's latest AI quarterly pulse survey. Did you know that 67% of business leaders expect AI to fundamentally transform their businesses within the next two years? And yeah, it's not all smooth sailing. The biggest challenges that they face include things like data quality, risk management, and employee adoption. KPMG is at the forefront of helping organizations navigate these hurdles.
They're not just talking about AI, they're leading the charge with practical solutions and real-world applications. For instance, over half of the organization surveyed are exploring AI agents to handle tasks like administrative duties and call center operations. So if you're looking to stay ahead in the AI game, keep an eye on KPMG. They're not just a part of the conversation, they're helping shape it. Learn more about how KPMG is driving AI innovation at kpmg.com slash US.
Now, from a narrative perspective, that would be a wonderful place to close, but we've got to hit a few more stories before we get out of here. First, a couple of quick model updates which, while not being about DeepSeek, are clearly still about trying to advance and being seen to advance. Google appears to be on the cusp of releasing the next iteration of their flagship model, Gemini 2.0 Pro.
The model showed up in a changelog for the Gemini chatbot app. In the model's blurb, the changelog said: whether you're tackling advanced coding challenges like generating a specific program from scratch, or solving mathematical problems like developing complex statistical models, 2.0 Pro Experimental will help you navigate even the most complex tasks with greater ease and accuracy. During this week's hype, many noted that Google already offers a model that's pretty close to DeepSeek R1 and o1-mini in quality. Gemini 2.0 Flash is priced similarly to R1 for API access once the Chinese model's introductory offer expires next month.
This week, Google made it the default model in the Gemini app, and their reasoning mode is also available for free via Google's AI Studio. Back in China, other labs are demonstrating their capabilities as well. Alibaba released Qwen 2.5 Max, claiming to outperform DeepSeek R1, GPT-4o, and Claude 3.5 Sonnet across a range of reasoning and knowledge benchmarks.
Alibaba also highlighted that they use a mixture-of-experts architecture to increase inference efficiency. This is one of the approaches to dealing with resource scarcity that we've seen in DeepSeek's V3 and R1 as well. And then there's OpenAI. Outside of discussions of how their o1 model stacks up against R1, or whether it was the actual progenitor of R1, there were a bunch of other stories surrounding them as well.
Maybe most notably, OpenAI's investors don't seem to be worried about Chinese rivals, with the Wall Street Journal reporting that OpenAI is in early talks to raise up to $40 billion in a round that would see the company valued as high as $300 billion. SoftBank is reportedly leading the round and looking to take the bulk of the deal by investing between $15 and $25 billion.
Curiously, the Wall Street Journal originally published a valuation of $340 billion but later revised the story, commenting: after the Wall Street Journal published that figure in an earlier version of the story, our sources said newer negotiations lowered the proposed valuation to as much as $300 billion.
They also clarified that the $300 billion figure is a post-money valuation, so it seems this was a genuine price drop during negotiations. Still, wherever the figure lands, it's a record-making deal. OpenAI's last round in October raised $6.6 billion at a $157 billion valuation. To roughly double that in just a few months would be extraordinary even by OpenAI's own standards. The deal would make OpenAI the second-most-valuable startup in history, behind only SpaceX.
The Wall Street Journal also reported that the deal is intended to fund OpenAI's $18 billion share in Project Stargate as well as general operations. Part of the justification is that OpenAI's premium subscription seems to be driving a revenue boom. When OpenAI launched their $200 per month pro tier, to many it seemed like a stretch.
The subscription allows unrestricted access to all of OpenAI's models, including the Sora video model and the o1 reasoning model, and also added a pro mode for o1 that gives more extensive answers replicating research reports. This month's release of the Operator agent was also exclusive to the Pro tier. Still, $200 per month is a hefty price tag for all but power users. In fact, Sam Altman complained earlier this month that the Pro tier was actually being sold at a loss because, quote, people use it much more than we expected. Still, according to The Information, the price tag doesn't seem to be turning that many people away.
They reported that revenue from Pro tier subscriptions has now surpassed business team subscriptions, meaning the Pro tier has hit $300 million in annualized revenue. OpenAI also launched ChatGPT Gov, a new version of the chatbot platform designed, as you would expect, for government use. It's similar to the enterprise tier of ChatGPT, allowing users to create custom GPTs and share conversations across a workspace, but it also allows agencies to host a selection of OpenAI models in government cloud infrastructure.
They'll be able to configure their own security, privacy, and compliance standards, and OpenAI says that the tailored product could help expedite the approval of the company's tools to handle non-public sensitive data.
And finally, and relatedly, on Thursday OpenAI announced one of their largest-scope government projects to date. The company will provide access to their o1 reasoning model to US national laboratories, the network of R&D labs operated by the Department of Energy. According to OpenAI, up to 15,000 scientists will use o1 to, quote, accelerate basic science, identify new approaches for treating and preventing diseases, enhance the cybersecurity of the US power grid, and deepen our understanding of the, quote, forces that govern the universe, from fundamental mathematics to high-energy physics.
The most chatted-about part of this is that one of the research programs partnering with OpenAI involves nuclear defense. The company framed the program as being, quote, focused on reducing the risk of nuclear war and securing nuclear material and weapons worldwide, and OpenAI capped off the announcement by stating: this is the beginning of a new era where AI will advance science, strengthen national security, and support U.S. government initiatives.
Still, as you might imagine, a lot of the chatter on X followed this pattern by Pinkmoon Kate, who wrote,
And so friends, that wraps what was a crazy week. Perhaps I should say another crazy week in artificial intelligence. Maybe the craziest part about this is that I think I've probably only said the word agent once or twice. And in any case, I hope you feel now up to date. We will be back over the weekend with the long reads episode and back to our normal approach on Monday. For now, appreciate you listening or watching as always. And until next time, peace.