Well, Casey, the last time we recorded an emergency podcast, you were at gate E8 of the San Francisco airport. And we were talking about OpenAI and how Sam Altman had just been fired. Are you at the airport today? And if not, would you like me to mail you an Auntie Anne's pretzel so you feel more comfortable? Yeah. No, all things being equal, Kevin, it's actually much more comfortable to record here in my home studio and not have to compete with the PA system announcing flights to Houston.
Casey, we are here today to talk about a little company called DeepSeek, which probably most people had not heard of, but that is causing a major series of events in the US stock market and around the US tech industry this week.
That's right. By now, our listeners have probably seen that the stock market dipped on Monday and that some companies whose fortunes are closely tied to AI dipped quite dramatically. But they also might have just noticed it in the App Store, where DeepSeek has hit number one this week, which is a rarity for a Chinese consumer app in the United States. So, yes, suddenly everywhere you look, there are signs of DeepSeek affecting the world.
And we should say, to speak directly to what some listeners may be wondering, why we are interrupting our normal production schedule to do a special emergency episode about DeepSeek. It is not unusual for people in the AI world to start freaking out about some new development or breakthrough or some new model that was released. But I believe that this is the real deal. I think this is a big moment in the history of AI development.
And it is really taking a toll on stock markets in ways that I think are really interesting. I mean, you said dip, but Nvidia's stock, one of the highest-performing stocks on the market over the past few years, and certainly the one most closely correlated with people's feelings about AI, is down about 18% today.
That represents hundreds of billions of dollars wiped off the market cap of just one company by this announcement from DeepSeek. I think this is a broader story than just the stock market. I think it has tons of implications for other companies developing an AI and also for concerns that a lot of people working on AI safety have about how this technology could get out of hand.
Yeah, I'm excited to get into it too, but I will signal that I think there are also some reasons not to freak out. And so I'm going to be trying to bring some of those to the discussion. But today, Kevin, I think we just really want to do three things. One, we want to tell you what DeepSeek is. Two, we want to give you some insight into why people think this is such a big deal. And then three, I think we want to debate a little bit back and forth just how big a deal this really is. So yeah, let's get into it.
All right, so let's start with what DeepSeek is. Kevin, we have mentioned it on the show before, but tell us a little bit about this new model and why it has taken the world by storm. Well, let's talk first about DeepSeek itself. You may remember if you listened to our episode a couple of weeks ago on this, that DeepSeek is a Chinese AI company. It is about a year old and it spun out of a hedge fund.
called High-Flyer. And it was something that I think, outside of China, most people were not paying attention to until late last year, when they released an AI model called V3.
That was, they said, competitive with some of the leading AI models created by American AI companies. It really caught people's attention, not just because it came out of this little-known Chinese AI startup, but because of what DeepSeek said about how it was trained and how much it cost to train.
Yeah, so tell us about what was so interesting about how they did it and what it cost. Yeah, so the first interesting thing about DeepSeek that caught people's attention was that they had managed to make a good AI model at all from China. Because for several years now, the availability of the best and most powerful AI chips has been limited in China by U.S. export controls. You are not allowed, if you are Nvidia or another American company, to export your most powerful AI chips to China. So DeepSeek came out with this paper and they said, well, we actually didn't use your fancy AI chips. We used a kind of second-rate AI chip that was artificially limited in order to be able to be exported to China. We have a bunch of those, and using just those kinds of lesser AI chips, we were able to get a model to perform as well as you American tech companies with all your fancy H100s.
And then the second thing that really caught people's attention was the cost. DeepSeek claimed that they had spent just $5.5 million training V3. And I think there are some reasons to take that number with a grain of salt. But just in terms of the raw cost of the training run for that model, $5.5 million is peanuts relative to what American AI companies spend training their leading models, you know, something on the order of a hundred times cheaper than what an OpenAI model of equivalent performance would cost to train.
Right, and this comes against a backdrop of all the US tech giants saying we're going to spend tens of billions of dollars this year to increase our capacity and data centers and the amount of compute power that we'll have. So this really stood in stark contrast to that. So that tells us a little bit about what V3 is, sort of the training and the cost were maybe more interesting than the model itself, which is just kind of like a chatbot like a lot of us have already used.
But V3 came out around Christmas, Kevin. So why is the market reacting so strongly now? So a couple of things happened in the past week or so that have led to the freak-out that we're seeing now. The first is that last week DeepSeek released another model, R1,
which was its attempt at a so-called reasoning model. Basically, V3, the last model, was kind of similar to things like Claude or Gemini; it was sort of a basic language model. But R1 was more like OpenAI's o1 and o3, which are its newest reasoning models. So that happened early last week. And then a couple of days later, DeepSeek did something else, which was that it released an app where anyone could download DeepSeek and use its model in a very easy way. And this is when people really started to go from being interested in and fascinated by DeepSeek to really panicking about it. Because all of a sudden, millions of Americans were downloading this app, using DeepSeek's models, and realizing, oh, wait, this is as good as or better than ChatGPT. It's free. It doesn't have any ads. It seems to be quite smart, and it does something that OpenAI's models don't do, which is it shows you the internal thought process that it is going through as it is producing these answers. That is something that OpenAI's models do not show the user, but DeepSeek's models do, and I think people found that really compelling.
Yes. And that last point that you mentioned, I think, is really important, because I suspect all the AI companies are going to copy this now. The process of using a chatbot today is you ask a question. I've likened it before to throwing a penny in a fountain, right? You're just sort of making a wish and seeing what you get back. What's really interesting about the DeepSeek app is that as it's answering your question, you're seeing how the computer understood your query.
And so if you want to ask a follow-up question, you now have a much better sense of how the computer understood you. And this actually does seem to be a sort of conceptual breakthrough in product design, just as much as the underlying science.
All right, so that gives us a sense of what DeepSeek is and what its latest models are. Let's talk about why everyone is freaking out and maybe more specifically take a look at who is freaking out. So as we mentioned at the top, one of the big ways people are noticing this is through the decline in the stock market. Kevin, give us a sense of the industry reaction to what the DeepSeek models might mean.
The people who are freaking out the most are investors in the biggest American AI companies as evidenced by all of the tech stocks selling off today. I think you could categorize that as a fear of declining margins and commoditization. That sounds very boring.
But basically, what they're saying is, look, if a Chinese AI company that no one had ever heard of until a few weeks ago can come along and, for a fraction of our costs, develop a model that is as good as or better than the leading models on the market,
with substandard chips, by the way, then the barrier to entry in this market is just not nearly as high as we thought it was. One of the fundamental assumptions over the past few years when it came to AI was that bigger was better, that in order to build the most powerful models, you needed billions of dollars, maybe tens or hundreds of billions of dollars in huge data centers and all of the leading chips.
And that assumption may no longer be true. If what DeepSeek claims is true and checks out, then it may mean that it only costs single-digit millions or double-digit millions of dollars to build a leading model, which would just radically shift what these companies are able to charge for their models, as well as the number of competitors in the market.
Yes, I definitely agree. It changes what companies might be able to charge. But I would also just note that nothing that DeepSeek did was possible without American innovation. One of the reasons that it was cheap for them is because it was expensive for everyone else. And other companies did spend hundreds of billions of dollars figuring this out. So worth saying.
All right, let's talk about a second group of people who have been really rattled by this series of announcements, Kevin, and that is folks who are paying attention to geopolitics. Yeah, so a lot of people who worry about China in general are worried about this DeepSeek announcement, because DeepSeek is obviously a Chinese company.
If you're a person who's sort of worried about Chinese tech dominance or the possibility that Chinese firms could eventually get to something like AGI first, I think you are especially worried about what DeepSeek is doing. And I think we should also say that
The models themselves are recognizably Chinese. Over the weekend, I saw people testing out various queries on DeepSeek's R1, including things like, you know, tell me about what happened at Tiananmen Square. And the model just refuses to answer them. And so I think there is a worry that if Chinese companies do take the lead in AI, then
Chinese values and censorship laws may become embedded into this technology in a way that is very hard to extract. Yeah, I think that's true. I also just always urge caution when people try to use the existence of China as a reason to dramatically accelerate the AI race. A lot of those people have made investments that will pay off handsomely if we find ourselves in some sort of protracted and awful conflict with China. So whenever anyone starts talking about China in the context of AI, my eyebrows arch up a little bit.
All right, now, Kevin, there is one more group of folks that I think is quite justly nervous about what they're seeing out there with DeepSeek and who is that? So the third group of people that I would say are freaking out about DeepSeek are AI safety experts, people who worry about the growing capabilities of AI systems and the potential that they could very soon achieve something like general intelligence or possibly super intelligence and that that could end badly for all of humanity.
And the reason that they're spooked about DeepSeek is that this technology is open source. DeepSeek released R1 to the public.
It's an open-weights model, meaning that anyone can download it and run their own versions of it or tweak it to suit their own purposes. And that goes to one of the main fears that AI safety experts have been sounding the alarm about for years, which is just that this technology, once it is invented, is very hard to control. It is not as easy as stopping something like nuclear weapons from proliferating. And if future versions of this are quite dangerous,
it suggests that it's going to be very hard to keep that contained to one country or one set of companies.
Yeah, I mean, say what you will about the American AI labs, but they do have safety researchers. They do at least have an ethos around how they're going to try to make these models safe. It's not clear to me that DeepSeek has a safety researcher. Certainly they have not said anything about their approach to safety, right? As far as we can tell, their approach is, yeah, let's just build AGI, give it to as many people as possible, maybe for free, and see what happens. And that is not a very safety-forward way of thinking.
So Casey, that is a lot of information that we just dumped on our listeners. But really what I want to know is like, are you freaked out about this? Do you think that this is as big a deal as some people seem to think it is?
I think as I am doing my reading and having conversations with folks this morning, my sense is I am freaking out a bit less than some other folks that I'm talking to. I think this is a big deal and merits discussion, but I also think that people may be getting a bit over their skis when it comes to thinking through the implications here.
So make that case, because all I'm seeing all over my timelines is people saying, this is the Sputnik moment for AI. This is the biggest moment in AI since the release of ChatGPT. Everyone needs to stop what they're doing and pay attention. So what is the case you're seeing that people are hyperventilating a bit over nothing?
Sure. So let's take a few different points. One reason why people are really nervous here is that DeepSeek was able to train this model very cheaply. And I want to be clear, this is a significant technical achievement. At the same time, the cost of training and inference has been falling rapidly in AI for a long time now.
Ethan Mollick, who we've had on the show before, posted a chart on X that showed this decline. And in some cases, for example, running inference on a GPT-4-level model, the cost of that has fallen 1,000-fold over the past couple of years. So things have already been moving in this direction. And I think most people who work in AI expected that it would continue to go there. So that is the first point that I would make. Got it. And if you are Satya Nadella at Microsoft, or Sam Altman at OpenAI, or Sundar Pichai at Google, are you worried that you are going to spend tens or hundreds of billions of dollars building out new data centers and filling them with all the fanciest GPUs, and that some Chinese startup is going to just take everything that you do and copy it three months later for pennies on the dollar?
So this is a great question, which leads me to a second reason why I think at least some folks may be overreacting here. And that is that, in most cases, the money that is being spent to build out the data centers that will handle these giant training runs can be repurposed. The same servers and chips that you would use to do that can also be used to serve what is called inference, so basically actually answering the questions. So as more and more people start to use AI, it will be those giants that actually have the capacity to serve those queries. They will be able to build businesses around that. And by the way, that is another reason why I don't think that DeepSeek is evidence that the export controls failed, because the folks over at DeepSeek would love to have all of these chips, not just to do the big training runs, but also so that they could serve all of the demand that they are currently generating.
So I think that's another important thing to keep in mind as this discussion moves forward. Yeah, that makes a lot of sense to me. I do think the cost dynamics here are very important, because I talked to a person I know who works at one of these big companies, and he said that a lot of their customers are already starting to ask: could we shift over from using the OpenAI APIs and their models to using DeepSeek if it saves us 80% of our costs? And so I think in the short term, there is reason for the American AI companies to worry, because people want these things to be as cheap as possible.
Yeah, and let me just say, as somebody who spent $200 to upgrade to ChatGPT Pro last week so I could try Operator, I'm looking forward to that price going down. But that leads to maybe the third reason that I think people might be overreacting a little bit here, which is
a lot of what we are seeing here is just essentially a fancy ripping off of techniques that were pioneered here in the United States, right? It has long been the case that open source models were just a little bit behind the models made by the big labs, right? You look at Meta's Llama models, which until DeepSeek were seen as the best open-weights models out there, and they weren't as good as what OpenAI or Google or others were doing, right? Where I do think this gets super interesting is that DeepSeek is showing us open source can now catch up faster than it used to, right? The labs used to have a little bit longer of a lead, but now people are just getting cleverer and cleverer about these techniques. And so it is getting harder to build that defensible moat, because this is just one of those technologies where, once you figure out basically how people are doing it, you can just get in there and do it too.
Yeah. Now, Casey, I'm curious what, if anything, you are hearing from inside Meta specifically, because I think this is one of the most fascinating angles. You know, Meta is a company that has spent billions of dollars developing AI models and, unlike most of its competitors, has chosen to release those models freely. And part of what DeepSeek has shown is that you can take a model like Llama 3 or Llama 4 and you can distill it. You can make it smaller and cheaper. And you can do that without sacrificing a lot of the performance. And so there were some reports in recent days that Meta is basically at DEFCON 1 right now. The Information reported that they have four war rooms at Meta headquarters devoted to trying to figure out how to respond to the DeepSeek threat.
Yeah. And by the way, I hope they were the same war rooms that Meta used to use to protect America from election interference. They say, hey, get out of here, we've got something else we've got to figure out. So are you hearing anything from Meta? Because I think that is the company that I would say has the most to worry about when it comes to DeepSeek, because DeepSeek is doing essentially what they do, but at a fraction of the cost.
Yeah. So I do not have my own original reporting to share on this yet, but I do trust The Information's reporting that they are freaking out. And the reason is that Meta is supposed to be the best company at ripping other people off. And so when they find out that some Chinese Johnny-come-latelies are going to be better than they are at ripping things off, they're going to have something to say about it. And so nothing could be more poetic: now that DeepSeek has ripped off all the American companies, Meta is coming back and saying, oh, you think you're good at ripping people off? Just wait until we have plumbed the guts of V3 and R1. We're going to be back on top sooner or later, bucko.
Yes. Now, I want to ask you about one other reaction that I saw on social media, which was from Satya Nadella, the CEO of Microsoft. He went on his X account late last night and posted the following: Jevons paradox strikes again. As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can't get enough of. And then he linked to a Wikipedia article about Jevons paradox. So Casey, did you understand this? And if so, what did you make of it?
Well, I did, because we had just discussed Jevons paradox on this very show, Kevin. It's true. Hugging Face's Sasha Luccioni came on and explained Jevons paradox, which is essentially that as stuff becomes more efficient, you simply increase demand for it, thereby canceling out a lot of the efficiency gains. So when I saw Satya tweet about Jevons paradox, I said, once again, Hard Fork has set the national news agenda. And if you're not listening, fix that.
Yeah, many people are talking about Jevons paradox. I predict that this is going to be something I hear about at every single party I go to for the next six months. Just to connect the dots a little bit, I think what Satya is trying to say here is that DeepSeek is not actually a threat to companies like Microsoft, because as the cost of building and using AI models comes way down, people are just going to want to use them more and more. The overall demand and Microsoft's overall profitability will not change, which could be true, but is also exactly what you would expect the CEO of Microsoft to say on a day when investors were panicking and selling their stock.
It is. It is. Um, all right. Well, Kevin, I think that's a pretty good overview of what DeepSeek is doing, why people are freaking out, and at least some thoughts about exactly how freaked out you should be. There is a lot more to say about this subject, and if you are starving for even more discussion of DeepSeek, I can promise you that we'll have more to say on our regularly scheduled episode of Hard Fork this Friday.
Yes. Casey, I love doing these emergency podcasts. They fill me with a sense of danger, excitement, living on the edge. I love them for that reason. I love them for a second reason, Kevin, which is that I get paid by the episode. So here's to many more emergencies in 2025.
This emergency episode of Hard Fork was produced by Whitney Jones and Rachel Cohn. This episode was edited by Rachel Dry and was engineered by Daniel Ramirez, with original music by Dan Powell. Our executive producer is Jen Poyant. Our audience editor is Nell Gallogly. Special thanks to Paula Szuchman, Pui-Wing Tam, Dalia Haddad, and Jeffrey Miranda. You can email us, as always, at hardfork@nytimes.com.