Groq首席执行官乔纳森·罗斯 - 下一代AI硬件 | "World of DaaS" 中文双语解读

本集简介

乔纳森·罗斯是Groq公司的创始人兼首席执行官，该公司专门开发用于人工智能和机器学习的高性能微芯片。在创立Groq之前，乔纳森发明了谷歌的人工智能处理器TPU。在本期《DaaS世界》中，乔纳森与奥伦讨论了：人工智能时代下的行业未来 AI处理硬件的演变半导体供应链芯片开发中的挑战与创新 AI专家的提示词撰写技巧《DaaS世界》由SafeGraph与Flex Capital联合呈现。如需收听更多期节目，请访问 worldofdaas.buzzsprout.com，并关注我们 @WorldOfDaaS。奥伦·霍夫曼在X平台的账号为 @auren，乔纳森在X平台的账号为 @JonathanRoss321。本集的编辑与后期制作由The Podcast Consultant（https://thepodcastconsultant.com）提供。

双语字幕

仅展示文本字幕，不包含中文音频；想边听边看，请使用 Bayt 播客 App。

Speaker 0

欢迎来到DAS世界，一档为数据爱好者打造的节目。

Welcome to World of DAS, a show for data enthusiasts.

Speaker 0

我是您的主持人奥林·霍夫曼，Safegraft的首席执行官兼Flex Capital的普通合伙人。

I'm your host, Orin Hoffman, CEO of Safegraft and GP of Flex Capital.

Speaker 0

如需获取更多对话、视频和文字稿，请访问safegraft.com/podcasts。

For more conversations, videos, and transcripts, visit safegraft.com/podcasts.

Speaker 1

你好，各位数据达人。

Hello, fellow data darts.

Speaker 1

我今天的嘉宾是乔纳森·罗斯。

My guest today is Jonathan Ross.

Speaker 1

乔纳森是Grok公司的创始人兼首席执行官，该公司专门开发用于人工智能和机器学习的高性能微芯片。

Jonathan is the founder and CEO of Grok, a company that develops high performance microchips purpose built for AI and machine learning.

Speaker 1

在创立Grok之前，乔纳森发明了谷歌的AI处理器TPU。

Part of founding Grok, Jonathan invented Google's AI processor, the TPU.

Speaker 1

乔纳森，欢迎来到DaaS世界。

Jonathan, welcome to World of DaaS.

Speaker 2

谢谢邀请我。

Thanks for having me.

Speaker 2

我很感激能来到这里。

I appreciate being here.

Speaker 1

我现在真的很兴奋。

Now I'm really excited.

Speaker 1

你对数据和计算有一些非常有趣的观点，认为我们正从信息时代过渡到生成时代。

You have some really interesting ideas about data and compute and the idea that kind of we're transitioning from an information age to a generative age.

Speaker 1

这意味什么？这对数据意味着什么？

What does that mean, and what does that mean for data?

Speaker 2

我们第一次能够制作高保真度的数据副本并将其分发到全球时，就进入了信息时代。

We transitioned into the information age the first time we were able to make high fidelity copies of data and distribute them throughout the world.

Speaker 2

随着时间的推移，这改变了我们的商业运作方式。

And, over time, that changed the way that we did business.

Speaker 2

突然间，重点从‘谁在当下拥有答案’变成了‘谁提出了答案’。

All of a sudden, it went from being about who had the answer in the moment to who came up with the answer.

Speaker 2

这个想法最初作为一个概念，并且作为一个可盈利的项目出现。

There was the origination of the idea as a concept and as a monetizable thing.

Speaker 2

随着时间的推移，我们获得了新技术。

And we got new technologies over time.

Speaker 2

我们有了互联网，它能够制作高保真副本并即时分发。

We got the internet, which was about making high fidelity copies and distributing them instantly.

Speaker 2

我们有了移动技术，它让你能随时随地进行操作。

We got mobile, which was about doing it in your hands.

Speaker 2

但这些技术本质上和印刷术是一样的。

But really these technologies were the same as the printing press.

Speaker 2

只是它们变得更好了。

They were just much better.

Speaker 2

这种程度的变化足以在某种程度上打破我们所有的直觉。

And that degree of change was enough to break all of our intuition to some degree.

Speaker 2

生成式人工智能并不是信息时代的技术。

Generative AI is not an information age technology.

Speaker 2

这并不是关于复制数据并进行分发。

It's not about taking copies of data and distributing them.

Speaker 2

而是关于在当下创造事物。

It's about creating things in the moment.

Speaker 2

这是创造性的。

It's creative.

Speaker 2

这是生成性的。

It's generative.

Speaker 2

你会在当下获得一个专属于你的答案，而这需要计算能力。

And you're going to get an answer in the moment that is custom to you, and that requires compute.

Speaker 2

因此，这改变了范式，并打破了我们所有的直觉。

So that changes the paradigm and it breaks all of our intuition.

Speaker 2

不再是谁拥有数据，而是谁拥有计算能力，能够立即为你生成答案。

Instead of it being about who has the data, it's who has the compute, who can create the answer for you right now.

Speaker 1

这意味着数据比以前不那么有价值了，而计算能力更有价值吗？

And does that mean data is less valuable than before and compute is more valuable?

Speaker 1

或者它是分层的。

Or It's stacked.

Speaker 2

没有数据就无法拥有模型，但如果没有能源，也无法拥有信息时代的经济。

You can't have the models without data, but you also can't have an information age economy without energy.

Speaker 2

工业时代实际上是基于能源的。

And the industrial age was really based on energy.

Speaker 2

因此，我们现在有了这种新的分层，但它确实稍微改变了范式。

And so what we have is this new stacking, but it does change the paradigm a bit.

Speaker 2

当我们与人交谈时，这常常很有趣，因为一个人越成熟，就越固守着靠近数据的这种观念。

Oftentimes when we're talking to people, it's so funny because it's almost the more sophisticated someone is, the more entrenched this idea about being close to data is.

Speaker 2

你我第一次见面时，我给你展示过我们当时做的演示吗？

When you and I first met, did I show you the demo at all of what we had?

Speaker 1

是的。

Yeah.

Speaker 1

我想是的。

I think so.

Speaker 2

我们当时离服务器所在地有大约九千英里远，但响应几乎是即时的。

And we were, like, 9,000 miles away, some number of thousands of miles away from where the servers were, and it was almost instant.

Speaker 2

所以，你想想，印度最大的出口品是令牌。

And so, you think about it, the number one export of India is tokens.

Speaker 2

这些令牌是由人类生成的，而不是由计算机生成的。

They're just generated by human beings as opposed to generated by a computer.

Speaker 2

你不需要靠近数据，因为输入的数据量很小，但计算量巨大，能生成答案，然后再发送少量数据回去。

And you don't need locality because you have a small amount of data coming in, you have an enormous amount of compute producing an answer, and then you send a small amount of data back.

Speaker 2

所以我们经常只接收两字节的数据，执行一千八百亿次运算，然后再回传两字节。

So oftentimes, we'll get two bytes, perform 180,000,000,000 operations, and then send two bytes back.

Speaker 1

就像客服人员之类的角色。

Like a customer service agent or something like that.

Speaker 2

没错。

Exactly.

Speaker 2

我们甚至不在服务器里安装硬盘。

We don't even put hard drives in our servers.

Speaker 2

我们完全解耦了，只专注于计算。

We just completely decouple, and we're just focused on the compute.

Speaker 2

数据极其重要。

And data is incredibly important.

Speaker 2

但这不是我们的核心工作。

It's just not what we do.

Speaker 2

我们正在做生成式人工智能这一块，但这是建立在那些专注于数据的人所做工作的基础上的。

We're doing this generative age thing, but we build on top of the work of others who do things in data.

Speaker 1

少谈我们自己的书，多谈谈世界将走向何方。

Less talking about our own book, more talking about where the world is going here.

Speaker 1

在这个即将形成的世界里，这种转变总体上会是什么样子？对于数据提供方、计算方、带宽、能源等不同环节，它们各自的地位将如何变化？

In the world that it's going, what does this transition going to generally look like, and what does it mean for the different, where there's a data provider, there's a compute, there's obviously the bandwidth stuff, there's the energy, where do all these things stack up?

Speaker 1

它们之间相互作用，权力将如何重新分配？

And vis a vis one another, how will the power shift?

Speaker 2

权力的转移会以一种方式发生：历史上，你之所以拥有大量权力，是因为你的地理位置。

Well, power will shift in one way, which is historically you had a lot of power just because of your geographic location.

Speaker 2

而且实际上，你可以从地下开采石油。

And literally, you could pull oil out of the ground.

Speaker 2

然后我们开始建设数据中心。

And then we started building data centers.

Speaker 2

而这种转变在于，你在哪里建设数据中心，就会获得一定的权力。

And a shift there was where you build the data center gives you some power.

Speaker 2

这不仅仅是你在哪里找到资源，而是你在哪里建设它们。

It's not just where you find the resources, but where you build it.

Speaker 1

为什么呢？

Why is that?

Speaker 1

为什么俄勒冈州会比内布拉斯加州更有优势？

Why would Oregon have more power than Nebraska or something?

Speaker 2

在俄勒冈州的情况下，我不确定他们是否真的更有优势，但如果有的话，很可能是因为他们拥有大量的水电资源。

Well, in Oregon's case, I don't know that they do have more power, but if they did, it's probably because they have a lot of hydropower.

Speaker 1

你说过，因为数据中心更近，所以有些人更有优势。

You said somebody has more power because the data center is closer and stuff.

Speaker 2

我是说从技术层面来说。

I mean more technologically.

Speaker 2

所以，这么想吧。

So, think about it this way.

Speaker 2

目前，美国和中国之间存在冲突。

Right now, there's a conflict between The US and China.

Speaker 2

这场冲突主要关乎获取众多技术，尤其是人工智能计算能力。

And that conflict is about largely getting access to many technologies, but in particular, AI compute.

Speaker 2

这里有一种强者愈强的效应，就像石油一样，更擅长开采石油并不会让你拥有更多的石油可采。

And there's a strength leads to strength aspect here, which is with things like oil, being better at pulling oil out of the ground doesn't give you more oil to pull out of the ground.

Speaker 2

它不会形成一种循环：你开采的石油越多，就能得到越多的石油。

It doesn't have this cycle where the more oil you pull out, the more oil you get.

Speaker 2

但对于人工智能这类技术，你越熟练，就越能进一步提升能力。

But with AI and these sorts of technologies, the better you get at it, the more you're able to get better at it.

Speaker 2

因此，你实际上可以拉开与他人的差距。

And so, you could actually pull away from others.

Speaker 2

所以现在各方正在竞相争夺领先地位，而一旦某方取得初始领先，就可能拉开足够大的差距，让其他人难以追赶。

And so there's a bit of a race right now to see who can actually get a lead on this, but whoever has the initial lead could potentially pull away far enough that it'd be hard for others to catch up.

Speaker 1

这涉及到算力方面。

There's this compute side.

Speaker 1

还有芯片这一块。

There's the chip side of it.

Speaker 1

还有，姑且称之为算法层面的东西。

There's, let's say, for lack of a better word, the algorithmic side.

Speaker 1

这就像OpenAI或一些类似Anthropic的模型。

This would be like an OpenAI or some of these anthropic type of models.

Speaker 1

还有其他类型的东西。

There's other types of things.

Speaker 1

整个生态系统中的每一个环节都同等重要吗？还是你觉得有些部分更重要？

Is every single piece of the ecosystem is equally as important, or do you think some are more important than others?

Speaker 2

如果你缺少其中某些部分，可能就无法获得全部，但有些部分可能会被免费提供。

If you're missing some of it, you may not be able to get all of it, but some of these pieces are gonna be just given away for free.

Speaker 2

例如

For example

Speaker 1

伙计们，这不过是开源而已。

Guys, it's just open source.

Speaker 2

特别是开源模型。

Open source models in particular.

Speaker 2

如果你想想看，当年Linux刚出现时，面临巨大阻力，人们根本不相信开源。

If you think about it, Linux one, when it had enormous headwinds, people did not believe in open source at the time of Linux.

Speaker 2

Linux必须证明自己。

Linux had to prove it.

Speaker 2

如今，开源已被视为大多数类似项目最终的默认发展方向。

Now open source is considered the de facto way that things will end up going for most projects like this.

Speaker 2

我们接触的许多人，都默认认为最终会有一个开源模型胜出。

And many of the people that we speak with are just operating on the assumption that eventually an open source model will win.

Speaker 2

问题是，会是哪一个？

And the question is which one?

Speaker 1

Linux 真正开始兴起的一个原因是 dot-com 泡沫破裂之后，此前每个人都购买 Sun 的设备，而且非常乐意为那些昂贵的 Sun 服务器支付巨额费用，而 Linux 服务器的价格却只有 Sun 设备的十分之一左右，突然间成本变得至关重要，企业必须节省开支，于是 Linux 开始迅速普及。

One of the reasons Linux really started taking off in the aftermath of the dot com crash, prior everyone was buying Sun, and they were very happy with paying just extraordinary amounts for those Sun boxes, and then the Linux box was 10 times cheaper or something than the Sunbox, and because all of a sudden money was important and it had to save costs, Linux started taking off.

Speaker 1

这是不是人们现在会开始思考的一个时刻？你是怎么看待这一点的？

Is that a moment that people are going start thinking about here, or how do you think about that?

Speaker 2

这里可能的一个区别是，早期互联网技术的采用速度远低于生成式 AI 的采用速度。

Probably one difference here was the rate of adoption for the early internet stuff was much lower than the rate of adoption for generative AI.

Speaker 2

这其实才刚刚过去一年左右。

It's really only been a year or so.

Speaker 2

从事这一领域的公司增长速度简直疯狂。

And the rate of growth of companies doing this is crazy.

Speaker 2

是的，情况非常不同。

Yeah, it's very different.

Speaker 2

你看到的是，人们已经开始关心诸如‘这是否是一个好产品’这样的问题了。

What you're seeing is people are already starting to care about things like, is this a good product?

Speaker 2

对于早期的互联网来说，人们关心的是‘我能接入哪家互联网服务提供商’。

For the early Internet, it was, this is the Internet provider that I can get access to.

Speaker 2

我会用叔叔鲍勃的网络服务，你当地的ISP。

I'm going to use Uncle Bob Serve, your local ISP.

Speaker 2

那时候你只能选他家。

He was the only one you had.

Speaker 2

哦，这就是我知道的唯一一个该去的网站。

Oh, this is the one website I know to go to this thing.

Speaker 2

现在你看到的是，大家非常关注谁会成为赢家。

Now what you're seeing is there's really a focus on who's going to be the winners.

Speaker 2

所以，如果你有更好的方案，人们现在往往会涌向它。

And so if you have a much better approach, people tend to flock to it now.

Speaker 2

所以，这涉及到成本问题。

So, it's a cost.

Speaker 2

没人想被困在某个专有系统里。

No one wants to get stuck in something proprietary.

Speaker 2

这是我们经常听到的一个重要观点。

This is one of the big ones that we hear.

Speaker 2

而且，人们假设开源会因为过去的经验而发展得更快。

And also, the assumption that open source is going to move faster because that's what people have seen in the past.

Speaker 2

所以，你不想被困在某个专有技术上，虽然它现在领先，但开源最终会赶超。

And so, you don't want to be on some proprietary thing, which, yeah, it's ahead right now, but open source is gonna pull ahead.

Speaker 1

当我们迈向这个以计算驱动的生成式时代时，社会和经济将产生哪些影响？

As we move to this compute driven generative age, what are both the societal and economic implications are gonna happen?

Speaker 2

我无法预测会发生什么，但我们注意到一些事情，而进入一个新时代并称之为不同技术时代的根本原因，正是它打破了我们所有的直觉。

I can't predict what will happen, but there are a couple of things that we've noticed, which the entire point of going into a different age and why you would call it a different technological age is it breaks all of our intuitions.

Speaker 2

我认为被完全打破的一个最有趣的观点是，我们一直认为每项技术都会取代工作。

And one of the most interesting ones that I think is completely broken is we keep thinking of each technology as displacing work.

Speaker 2

可能将要发生的是，我们会为人们创造比人口还多的工作岗位。

One of the things that's probably going to happen is we will probably create more jobs for people than we have people.

Speaker 2

突然之间，会缺乏足够的人手来完成各种工作。

There will suddenly be a lack of supply of people to do things.

Speaker 2

我给你一个具体的例子。

And I'll give you a concrete example.

Speaker 2

过去，人们的文章或普通新闻中几乎不会配有图片。

It used to be that no one would have a graphic in their articles, their random news articles.

Speaker 2

但现在制作图片变得非常容易，大多数文章都包含某种图形。

But now it's so easy to create one, most articles have some sort of graphic.

Speaker 2

大多数博客文章都配有插图。

Most blog posts have an article.

Speaker 2

由于生成这些图片变得如此简单，人类整体上可能花费了更多时间来创建它们。

And people probably spend more hours overall as human beings generating these because it is so easy to generate them.

Speaker 2

这被称为杰文斯悖论。

This is called Jevan's paradox.

Speaker 2

这一现象在19世纪60年代就被一位撰写煤炭相关论文的人注意到，他发现每当蒸汽机变得更高效时，人们反而购买了更多的煤炭。

And this was noticed in the eighteen sixties by someone writing a treatise on coal where what he realized was every time steam engines got more efficient, rather than buying less coal, people bought more coal.

Speaker 2

蒸汽机效率提高了，为什么人们还要买更多煤呢？

So steam engines get more why are they buying more coal?

Speaker 2

因为运营成本下降了，更多资源被投入到其他用途，人们因此做了更多事情。

Well, OpEx goes down, more things go into the money, people do more things.

Speaker 2

因此，对于生成式人工智能让事情变得容易的大多数领域，你实际上会看到人类在这方面的活动增加。

And so what will probably happen is with most of the things that generative AI makes easy, you will actually see an increase in human activity on that.

Speaker 2

总会有人更富有创业精神，找到一种方式将其商业化，让大量人参与其中。

And there's always gonna be someone who's gonna be more entrepreneurial and figure out a way to monetize that, get a whole bunch of people working on it.

Speaker 1

我以前没听说过杰文斯悖论。

I haven't heard of Jebin's paradox before.

Speaker 1

对于普通人来说，这类似吗？

Is that a similar thing for the layman?

Speaker 1

可能就像这样，好吧。

Might be like, okay.

Speaker 1

比如，高速公路从两车道扩到三车道，但交通还是那么堵，因为开车的人更多了？

They made the highway from two lanes to three, but the traffic is still just as bad because more people drive on it?

Speaker 1

或者

Speaker 2

这个想法很好，是的。

That's a good way to think of it, yeah.

Speaker 2

或者用价格弹性来表述也是另一种方式。

Or price elasticity is another way to say it.

Speaker 1

在这个新的世界里，如果你要对公司的前景下注，你认为哪些类型的公司在这个更未来的世界中最具优势？

In this kind of new world, if you had a bet on companies, what types of companies do you think have the biggest advantages in this kind of more future world?

Speaker 2

我稍微调整一下这个问题。

I'm gonna shift that one a little bit.

Speaker 2

我可以说，肯定有一些公司会在这个时代取得成功，还有一些公司可能也会成功，但几乎不可能预测是谁，而他们甚至可能比那些其他公司更成功。

I'm gonna say there are gonna be companies that are definitely gonna be successful in this, and then there's gonna be companies that might be, and it's gonna be almost impossible to predict who, but they might be even more successful than those other companies.

Speaker 2

当信息时代到来时，从事纸张材料生产是绝佳的时机。

When the information age came along, it was a great time to be in material production for paper.

Speaker 2

那么，哪些报纸会成功呢？

Now, which newspapers were going to succeed?

Speaker 2

我不知道。

I don't know.

Speaker 2

还不清楚这背后是如何运作的。

Don't know how that works yet.

Speaker 2

所以这就像卖铲子和锤子的生意，我们知道我们需要能源。

So this is a picks and shovels thing where we know we're gonna need energy.

Speaker 2

我们知道需要更多的数据。

We know we'll need more data.

Speaker 2

这将成为一个关键点。

That's gonna be a thing.

Speaker 2

但我们也需要更多的计算能力。

But we're also gonna need more compute.

Speaker 2

但这些才是真正的铲子和锤子。

But these are the picks and shovels of it.

Speaker 2

接下来的问题是，哪家生成式AI公司会成为下一个谷歌，哪家会成为下一个微软。

Then there's gonna be which generative AI company is gonna be the next Google, which one's gonna be the next Microsoft.

Speaker 1

比如说航空公司之类的，是有些航空公司会表现特别好，而有些则表现不佳，因为有些航空公司会更快地采用这项技术，还是你觉得所有航空公司都会被淘汰，或者都会受益？

Let's say airlines or something, is it just some airlines are gonna do super well and some airlines are not gonna do well because some airlines will adopt it faster than others, or do you think all airlines get wiped out or all airlines do better?

Speaker 2

你并不是真的在说航空公司。

And you don't literally mean airlines.

Speaker 2

你是说

You mean

Speaker 1

是的

Yeah.

Speaker 1

没错。

Exactly.

Speaker 1

这可能是任何行业。

It could be anything.

Speaker 1

医疗、货运，随便你列出哪个行业都行。

Health care, trucking, just go down your list of whatever industry that's is.

Speaker 2

我只能说这么多。

I will say this much.

Speaker 2

每个行业都会被生成式人工智能颠覆。

Every single industry will be disrupted by generative AI.

Speaker 1

你觉得比信息时代还要快吗？

You think faster than the information age?

Speaker 2

多得多，因为最近有一项研究，让医生和人工智能在诊断上进行竞争。

Much, much, much, because there was a recent study where doctors competed against AIs to make diagnoses.

Speaker 2

如果你听到我说，在相同数据下，人工智能的表现优于人类，你可能不会太惊讶。

The fascinating part about this was you're not going to be too surprised if I say the AI outperformed the human being given the same data.

Speaker 2

我们可能会看到人工智能在诊断医学中发挥作用。

We're probably going to see AI involved in sort of diagnostic medicine.

Speaker 2

我只是想不通这怎么会不发生。

I just don't see how that doesn't happen.

Speaker 2

讽刺的是，外科手术的进展会稍微慢一些。

Ironically, it'll be a little slower for surgery.

Speaker 2

我们已经拥有用于外科手术的机器人，但诊断只需要输入数据即可。

We already have robots for surgery, but diagnoses, you can just feed the data.

Speaker 2

它现在就已经能做得更好了。

It could already do better.

Speaker 2

令人震惊的是，当人类医生与人工智能配对时，结果反而比仅由人工智能做出诊断更差。

The shocking part was when they paired a human doctor with the AI, the results were worse than just having the AI give a diagnosis.

Speaker 1

下棋也是如此。

This is true in chess too.

Speaker 1

以前是人类加上AI比AI单独更强，但很快AI就明显优于人类加AI，因为人类会否决AI的建议之类的。

It used to be a human with the AI was better than the AI, and then very quickly the AI was clearly better than the human plus the AI, because the human would overrule it and stuff.

Speaker 2

我预计，医学将变得如此一致，以至于如果你去看病，而常见病症没有得到规范治疗，这就会构成医疗事故，而不是你只是犯了严重错误。

What I expect is we'll get to a point where medicine becomes so consistent that if you go in and you don't get a common issue treated, that will be malpractice, as opposed to you just did something egregiously wrong.

Speaker 1

那执法和国家安全呢？

What about things like law enforcement and national security?

Speaker 1

这种普遍趋势在这些领域会如何体现？

How will this general world play there?

Speaker 2

让我先从国家安全说起。

Let me start with the national security part.

Speaker 2

这里有一个所谓的第一次、第二次和第三次技术优势的概念。

So there's a concept of first, second, and third offset.

Speaker 2

很多人把第一次技术优势归因于火药的运用。

And a lot of people refer to the first offset as when gunpowder was embedded.

Speaker 2

它彻底改变了战争的进行方式。

It totally changed the way war was done.

Speaker 2

第二次抵消是核武器，因为它彻底改变了冲突的运作方式。

The second offset was nuclear because it totally changed the way that conflicts were done.

Speaker 2

现在很多人认为，人工智能很可能成为第三次抵消。

A lot of people are now saying that AI is probably going to be the third offset.

Speaker 2

同样，直觉是完全不同的。

Again, intuition is totally different.

Speaker 2

在核武器时代，如果一个行为体拥有少量武器，你就会不惜一切代价避免冲突。

So with nuclear, if one actor has a small number of weapons, you want to avoid that conflict at all costs.

Speaker 2

尽管它增加了整个人类物种的风险，但却降低了冲突发生的可能性。

While it increased some risk for the species as a whole, it decreased the risk of conflicts happening.

Speaker 1

至少是重大冲突。

At least major conflicts.

Speaker 1

你会看到像越南那样的代理冲突发生。

You have like proxy conflicts like Vietnam or something happening.

Speaker 2

没错。

Exactly.

Speaker 2

它迫使这些冲突变得更小，而且与其说是我的问题，不如说是我的问题。

It forced these conflicts to be smaller and more, it's not really me, but it's me.

Speaker 2

对于人工智能而言，发动一场人工智能虚假信息攻击或类似行动的成本极低，因此你很可能会看到局势升级。

With AI, the cost to launch a sort of AI attack of disinformation or these sorts of things is so low that you're probably going to see an escalation.

Speaker 2

这改变了局面。

And this changes the dynamic.

Speaker 2

你要想成功，所需要的不是某种数量的人工智能能力。

And what you're going to need to be successful is not some amount of AI capacity.

Speaker 2

你需要的是人工智能优势，就像你需要空中优势一样。

You're going to need AI superiority like you need air superiority.

Speaker 2

谁拥有人工智能优势，谁就会处于非常有利的位置。

And whoever has AI superiority will be in a very good position.

Speaker 2

如果一个实体只有另一个实体三分之一的人工智能能力，它们很可能将不断陷入冲突。

If one entity has a third of the AI capacity of another, they will probably be in conflicts all the time.

Speaker 2

你必须极大地压倒任何对手。

You're gonna have to just dramatically overwhelm any sort of adversary.

Speaker 1

在一个即使美国拥有更优进攻能力的世界里，它可能面临更多攻击面，这意味着它需要有显著更强的防御能力——你是怎么看待这个问题的？

In a world where potentially even if The US has better offense, it may have more vectors of attack, which means it would have to have significantly better defense, or how do you think about that?

Speaker 1

在核战争中，我不确定美国的攻击面就一定比其他国家更差。

In nuclear war, I don't know, The US doesn't have necessarily any worse vectors of attack than anybody else.

Speaker 1

每个人都很脆弱，但在AI战争中，美国似乎更容易受到攻击。

Everyone is vulnerable, but in an AI war, it seems like The US is more vulnerable to attack.

Speaker 2

我特别担心选举问题。

I'm particularly concerned about elections.

Speaker 2

我的理念是，只要我们能确保选举正常运作并保持自由，不让外国资金介入选举，

My philosophy is all the concerns that we have about AI, as long as we get the elections to work and remain free, we don't allow foreign money in elections.

Speaker 2

那我们为什么允许外国的算力介入呢？

Why would we allow foreign compute?

Speaker 2

如果我们能保住选举，就能为我们争取时间，及时解决AI暴露出的问题。

And if we can preserve the elections, that'll give us time to fix the things that we discover are issues with AI in time.

Speaker 2

但这是我们基础设施中最令人担忧的部分——选举。

But that is the most concerning piece of our infrastructure, it's the elections.

Speaker 1

在我们深入探讨你正在研究的所有新范式之前，几乎任何听这个的人应该都很清楚，英伟达是一家极其成功的公司。

Now before we get into all the new paradigms that you're working on, pretty much anyone who's listening to this is probably well aware that NVIDIA has been this incredibly successful company.

Speaker 1

为什么过去十年里，英伟达的这些GPU如此成功？

Why have these GPUs, like, from NVIDIA been so successful over the last decade?

Speaker 2

原因有很多。

There's a lot of reasons.

Speaker 2

首先，让我们回溯一点历史，谈谈为什么CPU会取得成功。

First of all, let's go back a little bit in time, and let's talk about why CPUs became successful.

Speaker 2

英特尔这家公司，大多数人应该都听说过。

Company Intel, most people have heard of.

Speaker 2

英特尔最初是一家内存公司，后来不情愿地转向了CPU。

Intel originally was a memory company and they begrudgingly switched to CPUs.

Speaker 2

而CPU成了更赚钱的业务。

And CPUs became a much better moneymaker.

Speaker 2

原因是CPU并不是一个标准。

And the reason is CPUs, they're not a standard.

Speaker 2

因此切换成本更高。

And so switching costs are higher.

Speaker 2

如果你听说过汉密尔顿·海尔默提出的‘七种力量’框架，CPU简直就是它的化身。

If you've ever heard of the framework from Hamilton Helmer called seven powers, CPUs are like the personification of that.

Speaker 2

然后你看到英特尔通过‘Intel Inside’这样的品牌推广，以及其他各种举措。

And then you see Intel even with the Intel Inside going branding, all this other stuff.

Speaker 1

因为它们更难被复制，不像内存那样的商品。

Because they were both harder to copy, not a commodity like the memory types of things.

Speaker 2

你没法轻易替换它吗？

You couldn't swap it?

Speaker 1

我的意思是，AMD推出的x86芯片，我觉得已经很相似了。

I mean, AMD had a x 86 chip that was, I think, pretty similar.

Speaker 1

只是因为建一座工厂要花20亿美元吗？

Was it just because it cost $2,000,000,000 to create a plant?

Speaker 1

为什么英特尔的性能优势这么强？背后的理由是什么？

What was the reasoning why the power was so high for, like, an Intel?

Speaker 2

这里还有另一个方面，当你部署芯片时，真正重要的不是芯片本身的成本，而是整个基础设施的成本。

There's another aspect here, which is when you're deploying a chip, it's not the cost of the chip that matters, it's the cost of all the infrastructure.

Speaker 2

所以，如果你能将芯片的性能提升15%，那么你输入的电力、数据中心地板的混凝土、机架以及所有其他组成部分的单位价值都会相应提升15%，变得更加经济高效。

So if you can increase the performance of a chip, the CPU 15%, that effectively gives you 15% more value of the power that you're sending in, of the concrete in the data center floor, of the racks, of every single part of it gets 15% more economical.

Speaker 2

因此，哪怕只是微小的性能优势，也会对最终结果产生巨大的影响。

And so, that means that a small performance advantage makes an absolutely huge difference in terms of your outcomes.

Speaker 2

你会看到，许多数据中心提供商都会先用AMD芯片搭建系统，只是为了跟英特尔谈价格，实际上根本不会大规模部署这些AMD芯片。

What you would see is a lot of the data center providers would build a system with AMD chips just to try and price negotiate with Intel with no intention of ever deploying them at volume.

Speaker 1

所以，AMD的芯片并不是真正的复制品。

So the AMD's weren't really copies.

Speaker 1

英特尔的芯片就是更好。

Intel chips were just better.

Speaker 1

它们价格更高，但性能确实更优。

They were priced higher, but they were better.

Speaker 2

对。

Correct.

Speaker 2

这是一种优势带来更大优势的情况。

And this was a strength leads to strength kind of thing.

Speaker 2

当你领先时，你就获得了优势。

When when you pulled ahead, you got an advantage.

Speaker 2

但这也形成了一个双向市场，因为你为x86编写代码，然后人们会购买x86系统来运行这些代码。

But there was also a double sided market because you would write code for x86 and then people would buy x86 systems to run that code.

Speaker 2

而且x86里还有很多bug。

And then there was all these bugs in x86.

Speaker 2

这些bug反而成了特性，因为最终所有软件都必须支持这些奇怪的bug，而AMD也必须复制Intel的bug。

And that became a feature because eventually all the software had to support these weird bugs and AMD had to copy the bugs of Intel.

Speaker 2

当时甚至有专门研究x86漏洞的专家，只为确保别人能高效地进行仿制。

And there were literally people who were experts in the bugs in x 86 just to make sure that people could make copies efficiently.

Speaker 2

现在来看NVIDIA，他们做得非常出色的地方至少有两大方面，此外还有一些相关的因素。

Now with NVIDIA, what NVIDIA did very well was there's probably at least two major things, but then there's a bunch of other things around it.

Speaker 2

首先是CUDA，它也是一个双边市场，与之前的情况非常相似，因为如今还没有已知的算法能自动将代码编译到GPU上。

The first was CUDA was a double sided market, very much in the same way, because there's no known algorithm today to automatically compile stuff down to GPUs.

Speaker 1

可能不是每个听众都了解CUDA是什么，你能给一个相对聪明的人解释一下吗？

Not everyone listening to this probably understands what CUDA is, so can you explain it for, like, a relatively smart person?

Speaker 2

CUDA被宣传成什么样子，和它实际上是什么样子，是两回事。

There's what CUDA is advertised as, and then there's what it really is.

Speaker 2

CUDA被宣传为用于为NVIDIA芯片编写代码的开发平台。

So CUDA is advertised as the development platform that you use to write code for NVIDIA chips.

Speaker 2

这一点很容易复制。

That is trivial to replicate.

Speaker 2

这基本上没有任何价值。

That has basically zero value.

Speaker 1

哦，我没想到这一点。

Oh, I didn't realize that.

Speaker 1

所以任何人都可以随便创建另一个东西来实现这个功能？

So anyone could just create a new something else to go do that?

Speaker 2

这是世界上最简单的事情。

That's the easiest thing in the world.

Speaker 2

事实上，正是像.NET、LLVM、Java这样的技术削弱了x86英特尔的主导地位。

In fact, this is one of the things that eroded the x 86 Intel dominance was things like dot net, LLVM, Java.

Speaker 2

Java稍微更粗暴一些。

Java was a little more brute for it.

Speaker 2

你得让人去针对它。

You had to get people to target it.

Speaker 2

但其他这些技术，你的C++可以直接编译到它们上面，突然间切换成本就降低了。

But these other things, your c plus plus would compile to it, and all of a sudden the switching costs went down.

Speaker 1

它可以兼容任何东西。

It would work with anything.

Speaker 2

是的。

Yeah.

Speaker 2

CUDA编程语言也是如此。

And the same is true of the actual CUDA programming languages.

Speaker 2

这些都很容易重新实现。

Those are trivial to reimplement.

Speaker 2

困难的部分是所谓的CUDA内核。

The hard part is what's called CUDA kernels.

Speaker 2

CUDA内核是什么意思呢？比如我写了一个程序，假设我制作了一个视频游戏，我希望它能在GPU上高效运行。

And what a CUDA kernel is, is if I write a program, let's say I create a video game, I want that to run well on a GPU.

Speaker 2

问题是，目前没有已知的算法能将这段代码自动高效地适配到多核系统上。

The problem with that is there is no known algorithm to take that code and to get it to work on a multi core system efficiently.

Speaker 2

因此，人类需要根据你写的代码来编写这些内核。

So human beings write these kernels based on the code that you wrote.

Speaker 2

所以，如果你是一家大型视频游戏设计公司，你会把你的游戏发给NVIDIA，他们会为你编写一些内核。

So if you're a major video game design studio, you send your game over to NVIDIA, and they will write some kernels.

Speaker 2

然后这些内核会被集成到驱动程序中。

And then they're in the driver.

Speaker 2

当有人启动这款游戏时，它的运行速度就会更快。

And when someone loads up that video game, it runs faster.

Speaker 2

就像有人编写汇编语言来让C++程序运行得更快一样。

Just like someone writing assembly to make a c plus plus program run faster.

Speaker 2

现在这是由于多核特性所必需的，而NVIDIA认为这是巨大的优势。

Now this is needed because of the multi core nature, And this is viewed as a huge advantage by NVIDIA.

Speaker 2

当我们在谷歌开发TensorFlow时，我们也必须为NVIDIA的GPU编写内核。

And when we did TensorFlow at Google, we ourselves had to write the kernels for NVIDIA's GPUs.

Speaker 2

否则，TensorFlow就不会有竞争力。

Otherwise, TensorFlow wouldn't have been relevant.

Speaker 2

所以，这带来了双重市场的效应，使其几乎难以撼动。

So, it's got this double sided market feeling to it that makes it almost impenetrable.

Speaker 2

所以，这是第一点。

So, that's one.

Speaker 2

第二点是，NVIDIA的垂直整合超出了大多数人的注意范围。

The second is that NVIDIA has forward integrated beyond what most people have noticed.

Speaker 2

通常，大多数公司会制造芯片、构建系统、开发网络，并做一些软件。

So, typically, most companies, they'll build a chip, they'll build a system, they'll build networking, they'll do some software.

Speaker 2

他们并没有全部自己做。

They don't do all of them.

Speaker 2

NVIDIA的做法是从GPU和CUDA开始，也就是软件层面，而AMD则更多地让其他人替他们开发软件。

And what NVIDIA did was they started with the GPU and CUDA, so the software, whereas AMD was more letting other people write the software for them.

Speaker 2

然后他们推出了完整的DGX系统。

And then they added a system, the whole DGX boxes.

Speaker 2

接着他们收购了Mellanox，以掌握网络技术。

Then they bought Mellanox to bring in networking.

Speaker 2

现在他们正在自建云服务，直接与自己的客户竞争。

Now they're doing their own cloud to compete directly with their customers.

Speaker 2

这就是关键所在。

That's the thing.

Speaker 2

他们一直都在向前延伸，最近我忘了是哪家公司，有一家公司长期为NVIDIA生产显卡。

They've always forward in so there was actually a recent I forget which company it was, there was a company that made NVIDIA graphic cards for a very long time.

Speaker 2

他们80%的收入，或者类似的高比例，都来自于销售NVIDIA的显卡。

80% of their revenue or some large number like that came from selling NVIDIA cards.

Speaker 2

有一天，他们突然宣布：我们不干了。

And one day they just announced, we're done.

Speaker 2

我们放弃了。

We give up.

Speaker 2

NVIDIA 让我们根本没法做下去。

NVIDIA has made it impossible for us.

Speaker 2

他们把我们的利润压得一丝不剩。

They've squeezed out all of our margin.

Speaker 2

他们自己生产显卡。

They make their own cards.

Speaker 2

我们放弃了。

We give up.

Speaker 2

尽管这占了我们80%的业务，我们还是要退出。

Even though it's 80% of our business, we're leaving it.

Speaker 2

所以他们所做的，就是向前整合。

And so what they do is they just forward integrate.

Speaker 2

每次他们把某件事稳定下来，就会开始向前整合到下一个环节。

And every time they get something established, then they start forward integrating into the next part.

Speaker 1

但他们仍然是台积电的大客户。

But they still are a big TSMC customer.

Speaker 2

确实如此，但那将是反向操作。

Absolutely, and that would be going the other way.

Speaker 2

他们倾向于向前整合。

They tend to forward integrate.

Speaker 2

他们不太会倒回去依赖供应商。

They don't tend to go back into their vendors.

Speaker 2

他们会把客户做的事情，自己也做一遍。

They take what their customers do, and they do that themselves.

Speaker 2

他们选择这个方向而不是相反方向是有原因的，因为越往上走，利润空间越大，他们正试图为自己攫取越来越多的利润。

There's a reason why they go that direction rather than the other direction, which is there's more margin the more up the stack you go, and they're trying to capture more and more of that margin for themselves.

Speaker 1

他们不需要把NVIDIA的系统卖给别人再让别人出租，而是可以直接自己出租。

Instead of, like, selling NVIDIA systems to other people who then rent them out, they could just rent them out themself.

Speaker 2

是的

Yeah.

Speaker 2

我的意思是，他们正开始这么做，而且刚刚推出了推理服务，所以他们甚至要更进一步，开始提供所谓的‘令牌即服务’，与他们所有的‘令牌即服务’客户展开竞争。

I mean, that's what they're starting to do, and they just announced an inference service, so they're even gonna go beyond that and start selling, I guess, token as a service and compete with all their token as a service customers.

Speaker 2

当你达到这种级别的影响力时，人们就真的没法说：‘嘿，我要弃用你，因为你跟我竞争。’

So when you get to that level of power, people can't really say, hey, I'm going to drop you because you're going to compete with me.

Speaker 2

他们只能想：‘我别无选择。’

They're like, I have no choice.

Speaker 2

我只能继续购买这个产品。

I got to keep buying this.

Speaker 2

他们每花一美元给英伟达，都会被用来投入更多研发，以取代他们自己。

And every dollar that they send to NVIDIA is just used to develop more R and D to replace them.

Speaker 1

他们显然是最大的GPU客户，而真正能与之竞争的唯一方式，就是采用一种全新的范式。

They're clearly the biggest GPU customer, and obviously the only way to really compete is with a new paradigm.

Speaker 1

你正在研发这种LPU。

You're working on this LPU.

Speaker 1

在当前这种更偏向生成式的世界中，这有什么不同之处？为什么这很重要？

What is different about that and why is that important in this kind of more generative world?

Speaker 2

我们在这些方面与GPU的做法确实非常不同。

We really did those two things very differently that GPUs do.

Speaker 2

首先，我们花了前六个月的时间来开发我们的编译器。

For one, we spent the first six months working on our compiler.

Speaker 2

因此，在我们开始设计芯片时，软件已经可以正常运行了。

So by the time that we started designing our chip, we already had the software working.

Speaker 2

由于这是完全自动化的编译过程，我们完全避开了内核问题。

And we were able to avoid the whole kernel issue altogether because it's just a completely automated compile.

Speaker 2

第二点是，我们从一开始就全面构建了所有这些内容——包括网络和整个系统。

The second thing was we really built with all of that stuff, all the networking, the system, all from the beginning.

Speaker 2

所以，当我们在我们的硬件上运行大语言模型时，实际上是在数百甚至数千个芯片、数百或数千个我们的LPU上运行，而不是像GPU那样只用一两个、八个或类似的数量。

So, when we are running an LLM on our hardware, we're actually running on hundreds or thousands of chips, hundreds or thousands of our LPUs, not one or two or eight or whatever like you do with GPUs.

Speaker 2

我们能做到这一点，是因为我们拥有自己集成的互连技术，能够扩展到更多芯片，并且完全是同步的。

We can do that because we have our own integrated interconnect that allows us to scale up to more chips and it's completely synchronous.

Speaker 2

想象一下，你安排了一大堆会议，这让你能够高效地一整天与很多人会面。

Just imagine that you have a whole bunch of meetings scheduled and that allows you to have a fairly efficient day where you're meeting with a whole bunch of people.

Speaker 2

想象一下，如果你想要见很多人，但却无法安排会议。

Imagine if you were trying to meet with a whole bunch of people and you couldn't schedule a meeting.

Speaker 2

CPU、GPU 或任何其他架构内部的工作方式就是这样。

That's the way it works inside of a CPU, a GPU, or any other architecture.

Speaker 2

我们有调度机制，但我们也必须为互连和网络实现调度功能。

We have the scheduling, but we had to do the interconnect, the networking to be scheduled as well.

Speaker 2

我们还必须开发驱动程序。

We had to do the drivers.

Speaker 1

我们得

We had

Speaker 2

开发编译器。

to do the compiler.

Speaker 2

我们必须从零开始做所有事情，这是唯一的方法。

We had to do everything from scratch, and that was the only way.

Speaker 1

因为你要在这些芯片之间来回传输，所以需要超低延迟。

And because you're going between all these chips, there's a need for this super low latency.

Speaker 2

对。

Correct.

Speaker 2

这确实是关键。

And that's really the key.

Speaker 2

在进行训练时，你并不需要低延迟。

When you're running training, you don't need low latency.

Speaker 2

训练的话，你一个月内能完成就行。

Training, you're gonna finish in a month.

Speaker 2

唯一重要的是让硬件保持忙碌。

All that matters is that you keep the hardware busy.

Speaker 2

但在推理时，情况就完全不同了。

But with inference, it's very different.

Speaker 2

关键是你能多快给出答案。

It's about how quickly you can give an answer.

Speaker 2

这意味着所有事情都需要被调度。

And that means that everything needs to be scheduled.

Speaker 2

想象一下，如果一个任务需要792个人来处理。

Imagine if a task required 792 people to touch it.

Speaker 2

如果他们没有在精确的时刻完成各自的任务，那整个过程就会变得无比漫长。

Well, if they're not each doing their task at an exact moment, it's just gonna take forever.

Speaker 1

你该如何以一种全新的方式设计超低延迟的系统？

How do you design something in a new way for super low latency?

Speaker 1

今天人们必须做哪些过去的人从未做过的事情？

What does one have to do today that people weren't doing in the past?

Speaker 2

实际上，你必须从头开始重新设计整个技术栈。

Really, you just have to redo the entire stack from scratch.

Speaker 2

最大的问题是，大家都在试图用功能而非产品来解决AI计算问题。

The biggest problem is everyone's trying to solve AI compute with features rather than products.

Speaker 2

他们总是说：‘我为整个前向集成栈的某个部分做了一个小小的改动。’

They're like, Oh, I've come up with this one change to one portion of that entire forwardly integrated stack.

展开剩余字幕（还有 352 条）

Speaker 2

这就是我的优势。

And that's my advantage.

Speaker 2

现在来为我写软件吧。

And now come write software for me.

Speaker 2

来为我构建网络吧。

Come build networking around me.

Speaker 2

来为我构建一个系统吧。

Come build a system for me.

Speaker 2

来把你们所有的框架都带过来。

Come bring all your frameworks.

Speaker 2

但没人愿意这么做，因为成本太高，而回报太小。

And no one wants to do that because it's just too high of a cost for too little of a game.

Speaker 2

在我们的情况下，我们只是让它完全兼容PyTorch，因为所有人都在用PyTorch开发。

In our case, what we did was we just made it completely compatible with PyTorch, which is what everyone develops in.

Speaker 2

所以，你知道，PyTorch模型可以直接在我们的硬件上运行。

So, you know, the PyTorch model, it just works on our hardware.

Speaker 2

不需要任何努力。

There's no effort required.

Speaker 2

而且我们的 API 与 OpenAI 兼容，你只需将指向改为 Grok 即可。

And also, our API is compatible with OpenAI, so you just change to point to Grok.

Speaker 1

所以基本上，人们已经写好了这些代码。

So basically, people have already written this code.

Speaker 1

它可以直接运行，因此他们不需要重新编写或做其他事情。

It just works, so they don't have to redo it or something.

Speaker 1

所以你实际上可以立即吸引开发者加入。

So you've kind of already can bring in the developers immediately.

Speaker 2

没错。

Exactly.

Speaker 2

所以我们火了。

So we went viral.

Speaker 2

实际上，现在已经快一个月了。

Actually, it's been almost a month right now.

Speaker 2

我们已经有七万名开发者。

We already have 70,000 developers.

Speaker 2

相比之下，英伟达花了大约七年时间才达到十万开发者。

For comparison, it took NVIDIA about, I think it was seven years to get to 100,000 developers.

Speaker 2

我们预计在七周内就能达到这个数字。

We're on track to get there in seven weeks.

Speaker 1

哇。

Wow.

Speaker 1

苹果通常不会参与关于生成式AI的这些讨论，但也许你认为，他们自研芯片的劣势，反而可能成为优势，或者你认为这是劣势？

It doesn't seem like Apple is usually in these conversations about generative AI, but that disadvantage, maybe you think it's advantage or maybe you think it's a disadvantage that they design their own chips.

Speaker 1

你对苹果持乐观还是悲观态度？

Are you like bullish or bearish on Apple?

Speaker 2

我持乐观态度，但正因为我持悲观态度。

I'm bullish, but because I'm bearish.

Speaker 2

我认为苹果落后其他人太多，这反而会促使他们做出更明智的决策。

I think Apple is so far behind everyone else, it's going to embolden them to make some smart decisions.

Speaker 2

他们实际上会与其他公司合作。

They're actually gonna partner with others.

Speaker 2

他们会做一些事情。

They're gonna do things.

Speaker 2

而我认为，谷歌面临的挑战最大，因为它们领先其他公司太多，以至于难以意识到这一点。

Whereas, I think Google has it the hardest because Google's so far ahead of most others that it's hard for them to recognize.

Speaker 1

他们有自己的芯片。

They have their own chips.

Speaker 1

他们有自己的软件。

They have their own software.

Speaker 2

我的意思是，他们写了《Attention Is All You Need》这篇论文。

I mean, they wrote the attention is all you need paper.

Speaker 2

他们领先太多了。

They're so far ahead.

Speaker 2

他们没有意识到的是，他们的策略已经失效，需要采取其他措施。

And what they're not realizing is that their strategy isn't working and they need to do some other things.

Speaker 2

我认为其他一些超大规模云服务商也是如此。

I would say same for some of the other hyperscalers.

Speaker 2

坦白说，所有这些公司都一样。

Frankly, all of them.

Speaker 2

这听起来可能有点奇怪。

This is going to sound a little weird.

Speaker 2

我认为苹果拥有最大的优势，因为他们可能是唯一意识到自己尚未完全掌控局面的公司。

I think Apple has the best advantage because they're probably the only ones who realize that they haven't locked it down.

Speaker 2

我认为微软、Meta 和谷歌都认为，基于他们目前的行动和对话，自己已经是赢家了。

I think Microsoft, I think Meta, I think Google, all think that they're the winners at this point based on what they're doing, based on conversations.

Speaker 2

但我觉得亚马逊意识到自己稍微落后了，我认为这会给他们带来更多的灵活性。

But I would say Amazon realizes they're a little behind, and I think that's going to give them a little more flexibility.

Speaker 2

这种现象你经常能看到。

And you see this all the time.

Speaker 2

有时候，落后的那方反而能后来居上，因为他们并不觉得自己的做法有多么珍贵。

The folks who are the most behind sometimes end up getting ahead of everyone else because they don't find what they're doing that precious.

Speaker 2

苹果刚刚解散了他们的自动驾驶团队，并表示你们现在转去做生成式AI了。

Apple just canned their self driving team and said, you are now working on generative AI.

Speaker 1

过去，创新最大的开支是工程师的工资。

In the past, the biggest expense for innovation was salaries of engineers.

Speaker 1

而未来，最大的开支可能是硬件和基础设施。

And potentially in the future, the biggest expense is hardware plus infrastructure.

Speaker 1

这将如何改变这个行业？

How does that change the industry?

Speaker 2

这正是Grok启动的部分原因，因为我们希望在AI时代保护人类的自主性。

This is a little bit of why Grok got started because we want to preserve human agency in the age of AI.

Speaker 2

担忧在于，如果一小群人掌握了所有的计算资源，他们就会拥有全部的发言权。

And the concern is that if a small group of people has all of the compute, then they will have all to say.

Speaker 2

我们希望确保每个人都能获得计算资源。

We wanna make sure that everyone gets access to compute.

Speaker 2

这是一个真正的担忧，因为通过为员工、合作伙伴以及其他所有人提供更多的计算能力，你可以让他们变得更加高效。

This is a real concern because you will be able to make your employees and your partners and everyone else more efficient by giving them more compute.

Speaker 2

这跟搜索不一样。

This is not like search.

Speaker 2

搜索是一种信息时代的科技，你建立索引，然后从中检索信息。

With search, which is an information age technology, you build your index and then that index is retrieved on.

Speaker 2

你可以通过在索引中更深地搜索来略微提升质量，但这在质量上并不是颠覆性的改变。

You can improve the quality a little bit by searching deeper in the index, but it's not a game changer in terms of quality.

Speaker 2

你为所有人建立这个索引，所有人都使用相同的索引。

You're building that index for everyone and they all get the same index.

Speaker 2

现在有了大语言模型，你可能也是为所有人构建这个模型，因为构建成本太高了，但事实上，你提供的计算资源越多，结果就越好。

Now with LLMs, you're probably building that model for everyone because it costs so much to build, but actually the more compute you give, the better the results get.

Speaker 1

这种关系有多线性？

How linear is that?

Speaker 1

要获得两个额外的结果，你得提供100倍的算力吗？

To get two extra results, do you have to give a 100 x computer?

Speaker 1

这到底是怎么运作的？

How does it work?

Speaker 2

我不确定这一点是否已经得到充分证实，但让我换种方式说一下。

I don't know that that's well established yet, but let me also put it another way.

Speaker 2

如果你要找一位顾问，而一位顾问比另一位好10%，你会只多付10%的费用吗？

If you were going to work with a consultant and one consultant was 10% better than this other consultant, would you only pay 10% more?

Speaker 1

我会付超过10%的额外费用。

I'd pay more than 10% more.

Speaker 2

对于认知类任务来说，这通常是成立的。

And that's generally true of cognitive tasks.

Speaker 2

所以，即使成本线性增长得非常厉害，你还是会想要更好的结果，因为你很可能把很多赌注押在上面。

And so even if it gets super linearly more expensive, you're gonna want that better result because you're probably banking a lot on it.

Speaker 2

你会做出一些战略性的决策，因此它带来的回报可能会远超投入。

You're gonna be making some strategic decisions, and so it really can have an outsized return.

Speaker 1

在芯片领域，市面上只有少数几种极其昂贵的芯片。

In the chips world, there's a small number of super expensive chips out there.

Speaker 1

其中一些可能用在你的手机里，还有一些是像GPU之类的高端芯片，但绝大多数芯片数量庞大，其中很多甚至是在二三十年前制造的。

Some of them may be going to your phone, some of them are these, let's say, even the GPUs or whatever it might be that are out there, and then there's a very large number of chips, many of which were made even twenty, thirty years ago, that's the vast majority of things.

Speaker 1

我们会看到类似的情况吗？比如，这些模型会被用于处理那些极其重要的问题，而我会在这里用超便宜的模型处理其他事情？或者你认为人们会怎么使用这些模型？

Are we gonna see something similar where you're gonna bifurcate these models will be used for these very, very important questions I can use, but then I'll use the super cheap models over here for other stuff, or how do you think people are gonna be using these things?

Speaker 2

我确信，更困难的问题会分配更多的计算资源，而简单的问题则会使用较少的计算资源。

I definitely believe that more difficult questions will get more compute applied to them and easier questions will have less compute applied to them.

Speaker 2

这种情况的运作方式很像下棋。

The way that this is going to work is a lot like playing chess.

Speaker 2

当你下棋时，你可以下快棋，也可以下慢棋，慢棋需要深思熟虑，但耗时更长。

So when you play chess, you can play speed chess or you can play normal chess where you think a lot, but it takes a lot longer.

Speaker 2

如果你按秒计费硬件成本（实际上就是这样），那么只要可能，你都会想下快棋。

And if you are charged by the second for the hardware, which is effectively what you are, then you're gonna wanna do speed chess wherever you can.

Speaker 2

你会倾向于直接采用模型流畅输出的词元序列。

You're gonna wanna just go with the stream of consciousness output of the tokens.

Speaker 2

你不会希望它深入思考，试图找出更好的答案。

You're not gonna wanna have it think really deeply and try and come up with a better answer.

Speaker 2

此外，正如你所暗示的，还存在规模较小和较大的模型之间的区别。

There's also an element of, as you were alluding to, smaller and larger models.

Speaker 2

这也会对成本和质量产生影响。

That has an effect on the cost and the quality as well.

Speaker 2

但大模型的作用是真正提升了模型的直觉。

But what bigger models do is bigger models actually improve the intuition of the models.

Speaker 2

即使你使用较小的模型，只要给它更多计算资源，让它深入搜索，有时也能得到更好的答案。

Even if you run a smaller model, if you run more compute on it, if you have it search deeper, you can end up sometimes getting better answers.

Speaker 2

因此，对于某些任务，你可能更倾向于使用较小的模型，尤其是当你第一次尝试回答某个领域的问题时，其实并不需要大型模型。

And so for some tasks, you may want a smaller model, especially if it's an area you've never tried to answer something in before, you don't really need a large model.

Speaker 2

你真正需要的是更多的计算周期来找出答案。

What you really need is a lot more of those compute cycles to figure out the answer.

Speaker 2

但另一方面，模型越大，产生的幻觉就越少。

But then there's another side to this, which is the larger the model, the fewer hallucinations you get.

Speaker 2

这是因为，你有没有过这样的经历：你去了某个地方，心想：天啊，导航把我带到这里了。

And the reason that that happens is, have you ever gone somewhere and you're like, gosh, the GPS took me here.

Speaker 2

这根本不是我想去的地方。

This is not where I meant to go.

Speaker 2

这是因为模型中包含了一些与驾驶无关的额外信息，比如关于周边环境和你打算去的地方，你会觉得这不对。

And it's because there's just some extra information unrelated to driving about the neighborhood, about where you're intending to go, and you're like, this is wrong.

Speaker 2

模型越大，维度越高，出现这种错误的可能性就越低。

Well, the bigger the model, the higher dimensional, the harder it is to have one of those mistakes.

Speaker 2

因此，在一段时间内，你可能会看到这些模型继续变大，以降低这些幻觉发生的概率。

And so you're probably for a while going to see these models continuing to get larger in order to reduce the probability of these hallucinations.

Speaker 2

所以，对于任何需要减少幻觉的场景，这都是最佳方式。

And so for anything where you need to reduce hallucinations, that'll be the way.

Speaker 2

但你也可以通过增加计算量来实现这一点。

But you can also do that by applying more compute.

Speaker 2

但这需要投入更多的工作和努力。

It just takes work and effort.

Speaker 1

为什么增加计算量能减少幻觉呢？

Why does the compute reduce the hallucinations?

Speaker 2

我给你举个例子。

I'll give you an example.

Speaker 2

我接下来要说的是哪个词

What is the next word that I'm about to

Speaker 1

说。

Say.

Speaker 2

好的。

Okay.

Speaker 2

所有听这段话的人都在脑子里想到了同一个词。

Everyone listening to this had the same word in their head.

Speaker 2

如果我让你补全这句话：双曲正切函数平方的二阶导数是

If I ask you to complete this sentence, the second derivative of the square of the hyperbolic tangent is

Speaker 1

我不知道。

I don't know.

Speaker 2

你不知道。

You don't know.

Speaker 2

那你怎么会不知道呢？

So how is it that you don't know?

Speaker 1

这不是人们经常谈论的事情。

It's not a common thing that people talk about all the time.

Speaker 2

大型语言模型很像下棋。

Well, large language models are a lot like playing chess.

Speaker 2

这里是一系列的词元，而不是一系列的走法。

There's a sequence of tokens instead of a sequence of moves.

Speaker 2

在每一步，模型都会为所有可能的词元分配一个概率分布，然后按概率从高到低排序。

And what happens is at each point, the model assigns a probability distribution across all the potential tokens and then sorts them into highest probability first and lowest probability last.

Speaker 1

每个听众都熟悉类似自动补全这样的功能。

Everyone listening is is familiar with, like, an autocomplete or something.

Speaker 2

然后，通常算法会选择其中一个最高概率的词元。

And then, typically, the algorithm will pick one of the top tokens.

Speaker 2

它并不总是选择概率最高的那个，因为有时候你并不想总是选择最显而易见的答案，但它会往这里走。

It doesn't always pick the top one because there's reasons why you don't wanna always just pick the most obvious answer, but it goes up here.

Speaker 2

当你这么做时，就像下棋时直接下出第一个冒出来的招数。

Now, when you do that, it's like playing the first move that comes to mind when you're playing chess.

Speaker 2

但如果你仔细想想，模拟一下其中一盘棋，你最终会想出更好的着法。

But if you think about it a little bit and you play out one of those games a little bit, you end up coming up with better moves.

Speaker 2

这就像AlphaGo第二局中的肩冲手。

This is like the shoulder hit in the second game of AlphaGo.

Speaker 2

那其实是一个非常低概率的着法。

It was actually a very low ranked move.

Speaker 2

在一万盘棋中才出现一次，之所以能下出这步，是因为运行它的TPU有足够的算力进行更深入的搜索。

It was played one in 10,000 games, and it was only because the TPUs that it was running on had enough compute to go deeper and find it that it played it out.

Speaker 1

这就是算力发挥作用的地方。

That's where the compute comes in.

Speaker 2

对。

Yeah.

Speaker 2

因为你能够探索更深的搜索空间。

Because you can search a deeper space.

Speaker 2

对于这些词元来说，情况也是一样的。

The same is true of these tokens.

Speaker 2

如果我问你，双曲正切函数平方的二阶导数是什么，并强迫你给出答案，你会胡言乱语。

If I ask you, what is the second derivative of the square of the hyperbolic tangent and I force you to give an answer, you'll give me gobbledygook.

Speaker 2

但如果你能回头尝试大量其他可能性，突然间，其中一种就会开始变得合理。

But if you can go back and try a whole bunch of alternatives, then all of a sudden, one of them is gonna start to make sense.

Speaker 2

这就像当你听到一个合理答案时的感觉，比如如果我告诉你答案，你会说，听起来没错。

And sort of like when you hear something that makes sense, like if I told you the answer, you'd say, that sounds right.

Speaker 2

你可能并不真正懂，但听起来就是对的。

You may not know, but it sounds right.

Speaker 2

而这种辨别好答案的能力，正是你进行搜索所必需的，因此这些模型在直觉判断方面非常出色，再叠加搜索机制，它们的表现就会更好。

And that ability to detect a good answer is really what you need in order to be able to search, and so these models are very good at that sort of intuitive part, and then you layer on that search part and they get better.

Speaker 2

这叫做束搜索。

It's called beam search.

Speaker 2

这是一种技术。

That's the technique.

Speaker 1

现在你的芯片是在美国制造的。

Now your chips are built in The United States.

Speaker 1

能详细说说其中的经济优势吗？你们为什么要这么做？

Walk us through the economic advantage, why are you doing that?

Speaker 1

不只是因为你是爱国者。

Besides just the fact that you're a patriot.

Speaker 2

当时我们做出这个决定是出于很多原因。

At the time, we made the decision for a lot of reasons.

Speaker 2

其中一个原因是当时对台湾有些担忧。

One was there was a little bit of concern about Taiwan at the time.

Speaker 2

但这并不是最主要的因素。

That wasn't the top reason that factored in.

Speaker 2

另一个原因是，我们能够组建一个更积极的团队，因为他们更渴望与我们合作。

The other was we were able to get a better team because they were hungrier for the business to work with us.

Speaker 2

我们与格罗方德合作，他们在纽约州北部为我们制造芯片。

We work with GlobalFoundries and they fab our chips in Upstate New York.

Speaker 2

芯片的封装则在加拿大完成。

They're packaged in Canada.

Speaker 2

我们也在美国制造我们的系统。

And we build our systems in The US as well.

Speaker 2

我们正努力打造一个完整的北美供应链。

And we're trying to do this fully North American supply chain.

Speaker 2

这里的优点在于，为什么我会建议其他人也这么做，尤其是当你刚开始起步、试图成为一家成功企业时，这并不是关于在出现问题时提供解决方案，而是因为事情可能会发生；如果我与你合作，我就无需担心这种可能性。

And the advantage here and why I would recommend that everyone else do this, especially if you're starting to come up and trying to become a successful entity, it's not about being the solution if something happens, it's something could happen And if I work with you, then I don't have to worry about that eventuality.

Speaker 2

我降低了风险。

I reduce my risk.

Speaker 2

因为大家都处于相同的时间区，所以更方便。

It's easier because everyone's in the same time zones.

Speaker 1

你可以更快地迭代。

You can iterate faster.

Speaker 1

在某些情况下，比如从1到n，把事情交给相隔一万英里远的地方做更有意义，但从0到1的阶段，你肯定希望尽可能靠近。

Some ways, like a one to n makes a lot more sense to do something that's 10,000 miles away, but a zero to one, you're gonna want it as close as possible.

Speaker 2

此外，目前还存在地缘政治方面的担忧。

On top of that, there is an element right now of concern geopolitically.

Speaker 2

当时情况并非如此。

This wasn't the case at the time.

Speaker 2

人们在这里下的是象棋，而大多数供应链都横跨众多国家。

People are playing chess here on this, and most supply chains are stretched across so many different countries.

Speaker 2

这些国家中的任何一个都可能否决你的供应链，这是一个巨大的风险。

Any one of those countries could just veto your supply chain, and that's a huge risk.

Speaker 1

此外，许多大型芯片设计公司似乎都在竞争争取台积电的制造产能，彼此竞相抬价或利用自身影响力争取名额，我推测这意味着台积电的产能有限，因此必然会有赢家和输家。

Also, seems like a lot of these big chip designers are competing essentially to get slots for TSMC to manufacture them, and so they're kind of like bidding up each other to get the slot or using their own leverage to get there, and I assume that means TSMC only has a certain amount of capacity, so it means there's going to be winners and losers there.

Speaker 2

在某种程度上，尽管台积电对任何人来说都不是瓶颈。

To some extent, although TSMC is not the bottleneck for anyone.

Speaker 2

真正限制我们的并不是GPU，或者在我们的情况下，也不是LPU本身。

It's not the GPU or, in our case, LPU itself that's the limiter.

Speaker 2

真正受限的是这种叫做HBM的技术，HBM是用于GPU的内存。

It's this technology called HBM, and HBM is the memory that is used in GPUs.

Speaker 1

对我来说，这是个新术语。

That's a new term for me.

Speaker 1

HBM代表什么？

What does HBM stand for?

Speaker 2

它代表高带宽内存。

It stands for high bandwidth memory.

Speaker 2

全世界的HBM产量几乎都来自韩国。

Pretty much the entire world's supply of HBM comes out of Korea.

Speaker 2

两大制造商是SK海力士和三星。

Two major manufacturers are SK Hynix and Samsung.

Speaker 2

现在，美国的美光公司非常希望成为大规模生产HBM的厂商，但通常被认为位居第三。

Now Micron in The US would love to become a manufacturer of scale for that, but they're generally considered the distant number three.

Speaker 2

因此，NVIDIA的做法为我们带来了可能是他们取得巨大成功的第三点：他们实际上是一个买方垄断，也就是垄断的反面。

And so what NVIDIA has done this brings us into probably a third thing that they've done that's made them very successful, is they're effectively a monopsony, which is the opposite of a monopoly.

Speaker 2

他们不是单一的卖家，而是众多尖端零部件的单一买家。

Instead of being a single seller, you are a single buyer of a lot of the cutting edge parts.

Speaker 1

所以他们可以与三星等公司签订长期协议。

So they can sign, like, these long term deals with Samsung and these other companies.

Speaker 2

在SK海力士以及其他所有这些公司。

In SK Hynix and and all these others.

Speaker 2

它们买下了全部的供应量。

And they buy up all of the supply.

Speaker 2

但这不仅仅是HBM。

But it's not just the HBM.

Speaker 2

还有被称为互连基板或CoWoS的东西，HBM就安装在上面，而这种东西的供应也很有限。

It's also this thing called an interposer or co ops, which is what the HBM goes on, which is a limited supply of.

Speaker 2

英伟达是全球最大的超级电容器买家。

NVIDIA is the largest buyer of super capacitors in the world.

Speaker 2

所以它们也牢牢掌控了这一领域。

And so they've got a lock on that too.

Speaker 2

还有这么多其他东西。

And there's all of these things.

Speaker 2

实际上，这挺有意思的。

And actually, it was interesting.

Speaker 2

前几天有人告诉我，AMD下调了他们的出货量预测。

Someone noted the other day to me that AMD had revised their volume projections down.

Speaker 2

这很奇怪，因为他们的需求其实更高。

And that's weird because their demand is higher.

Speaker 1

因为他们根本搞不到这些东西。

Because they can't get this stuff.

Speaker 2

他们根本搞不到这些东西。

They can't get this stuff.

Speaker 2

所以他们实际上根本生产不出来。

So they can't actually produce any.

Speaker 2

所以无论你是从AMD还是NVIDIA购买GPU，你实际上都是在购买三星或SK海力士的HBM。

So it doesn't matter whether you buy a GPU from AMD or Nvidia, you're really buying the HPM from Samsung or SK hynix.

Speaker 2

从一开始，我们就做了一些非常不寻常的事情，并且基于此做出了实际的设计决策：我们彻底摒弃了所有那些高端技术，因为我们知道根本无法获得这些材料。

Of the things that we did that was very unusual from the beginning, and we made actual design decisions on this, we eliminated any of the exotic technology because we knew that we would never be able to get access to this stuff.

Speaker 1

你希望我尽可能多地使用普通、容易获取的材料。

You wanted me to build as much commodity stuff as possible that's easy to access.

Speaker 2

对。

Correct.

Speaker 2

事实上，我们的下一代芯片，其中一个版本原本设计中包含了HBM。

In fact, our next generation chip, one version of it actually had HBM in the design.

Speaker 2

我们确实花了一百万美元购买了HBM，因为你必须提前很久下单，而它本打算用于生产批次。

We actually bought a million dollars of HBM because you have to buy it way in advance, that and was gonna be part of the production run.

Speaker 2

当我开始意识到你需要提前那么久采购时，发现它的价格远高于芯片本身。

And when I started seeing you have to buy it that far in advance, it's way more expensive than the chips are.

Speaker 2

我当时就想，维托，不行。

And I'm just like, Vito, no.

Speaker 2

我们把它删掉了。

We took it out.

Speaker 2

它的成本和风险并没有带来相应的价值，但我们的架构非常独特。

It really didn't add that much for the cost and the risk, but we have a very unusual architecture.

Speaker 2

GPU需要HPM。

GPUs require HPM.

Speaker 2

它们是围绕HPM构建的。

They're built around HPM.

Speaker 1

是的。

Yeah.

Speaker 1

我的意思是，你的芯片采用的是14纳米技术，这已经落后好几代了，但在某些方面，这反而是个优势，而不是缺陷。

I mean, your chips are like 14 nanometer technology, which is several generations old, but in some ways that's a feature, not a bug.

Speaker 2

还有另一个原因。

And here's the other one.

Speaker 2

所有人都关注FLOPS，也就是每块芯片能提供的计算能力。

So everyone's focused on FLOPS, which is the amount of compute that each chip is capable of.

Speaker 2

但实际上，限制因素往往是芯片之间的互连，而不是FLOPS。

But, actually, the limiter tends to be the interconnect between the chips, not the FLOPS.

Speaker 1

如果你能实现更快的通信，那其实就没那么重要了。

If you can have faster communication, then it doesn't really matter.

Speaker 1

对吧？

Right?

Speaker 1

你可以直接把它们加起来。

You can just add them up.

Speaker 2

要理解这一点，最好的方式是用汽车工厂来打个比方：为什么GPU在运行大语言模型时比我们的LPU慢得多。

And the best way to think about this, the reason that GPUs are so slow versus our LPUs when they're running a large language model, let's use a car factory's analogy.

Speaker 2

如果你需要一百万平方英尺的装配线空间来生产汽车，但你只有十分之一大小的仓库，那么你只能搭建出十分之一的装配线，让汽车通过，然后把它们停在停车场里。

If you need a million square feet of assembly line space for the cars to be produced, but you only have a warehouse that's one tenth of that size, then what happens is you set up the one tenth of the assembly line that you can fit, you run the cars through, and then you park them in a parking lot.

Speaker 2

然后你拆掉这条装配线。

You tear down the assembly line.

Speaker 2

再搭建下一个十分之一的装配线，再次让汽车通过。

You set up the next one tenth and then run them through again.

Speaker 2

这叫做批处理。

That's called batching.

Speaker 2

这就是GPU所做的事。

That's what GPUs do.

Speaker 2

这是因为它们在等待高带宽内存（HBM）提供数据。

That's because they're waiting for that HBM, that high bandwidth memory to feed them.

Speaker 2

因为我们有同步互连，并且拥有数百个芯片，这实际上就像一条完整的生产线。

Because we have the synchronous interconnect and we have hundreds of chips, it's actually like that full assembly line.

Speaker 2

所以一个令牌就像一辆汽车，它可以直接从头到尾运行，而无需等待内存加载。

And so a token is like the car and it just goes from beginning to end without ever having to wait for a memory load.

Speaker 2

因此，我们不仅消除了对HBM的需求及其供应链问题，还让整个过程更快了。

And so not only did we get rid of the HBM and our supply chain issues, we actually made it faster.

Speaker 2

现在大多数人担心的是，他们看到后会说：要实现这个需要792个芯片。

Now the concern that most people have, they look at it and they're like, you need 792 chips to do this.

Speaker 2

而其他人只需要八个。

Others only need eight.

Speaker 1

是的。

Yeah.

Speaker 1

但它们的价格只有百分之一左右。

But they're one one hundredth of the price or whatever.

Speaker 1

对吧？

Right?

Speaker 2

但实际上，每个芯片只负责计算中非常小的一部分。

But actually, each chip is only doing a very small part of the computation.

Speaker 2

因此它能非常迅速地继续下去。

So it then moves on very quickly.

Speaker 2

这有点像说，天啊，这座工厂的成本真高。

And it's a little bit like saying, gosh, the factory costs so much.

Speaker 2

从工厂里出来的汽车会比手工打造的汽车贵得多。

Cars that come out of it be much more expensive than hand built cars.

Speaker 2

这是一种人们很难理解的直觉。

And that's an intuition thing that people are really struggling with.

Speaker 1

现在很多人认为我们已经达到了摩尔定律的极限，或者正在接近这个极限。

Now a lot of people think we've reached the limits of Moore's Law or we're reaching those limits.

Speaker 1

你同意吗？这对芯片开发和其他类似的技术有什么影响？

Do you agree, and how does that affect things like chip development and some of these other types of things?

Speaker 2

摩尔定律是一个了不起的推测，我们都把它当作现实来遵循，并竭尽全力维持它的延续。

Moore's Law was an amazing suggestion that we all followed and treated as real and did heroics to keep going.

Speaker 2

我认为我们应该稍微调整一下这个定律，让它仍然成立。

I think we should tweak the law a little bit so that it can still be true.

Speaker 2

与其把它看作是通过缩小晶体管尺寸来容纳更多元件的经济定律，我们不如开始说三维空间中的单位体积，这样我们就能开始堆叠芯片，继续提升密度。

Instead of it being about shrinking the size of the transistor so you could fit more and being an economic law, we should start saying unit volume in three d of space so that now we can start stacking chips and continue to get that density.

Speaker 2

一旦我们完全填满了立方体而不是二维空间，我们就会找到这个定律的下一个漏洞，以继续推进进展。

And then once we fully fill up cube instead of a two dimensional space, then we'll find the next loophole in that law to continue the progress.

Speaker 2

但从功能上讲，这还没有结束。

But functionally it is not done.

Speaker 2

发生变化的不再是芯片本身，而是芯片所在的封装方式，以便实现规模化。

And what's moved, instead of it being about the chip itself, it's become about the packaging that the chip is in, in order to scale that up.

Speaker 2

这可以说是接下来每个人都在玩的游戏。

And that's sort of the next game that everyone is playing.

Speaker 1

总的来说，作为这些大语言模型的消费者和使用者，它们的进步速度确实非常快。

Just in general, as a consumer and a user of these LMs, they do seem like they're improving at like a super fast rate.

Speaker 1

你预计明年会发生什么？

What do you expect to happen, let's say, the next year?

Speaker 2

今年年初我做了一些预测，我的首要预测是，到年底时，会出现一些实际应用，其中几乎不存在幻觉。

So I made some predictions at the start of this year, and my top prediction was that by the end of the year, there would be some deployments where there's effectively no hallucination.

Speaker 2

幻觉总是会存在，因为你不可能做到完美。

There's always gonna be a hallucination because you can't be perfect.

Speaker 2

在某些情况下，幻觉将不再成为问题，因为它会被解决得非常好。

Hallucinations, in some cases, won't be a thing because it will be solved so well.

Speaker 2

并不是每个人都能接触到这种技术，但因为它会被某个人解决——可能是以更高的价格，或者因为他们在使用更多计算资源、更大的模型——所以它会被视为一个可解决的问题。

Not that everyone will have access to this, just that it will be considered a solvable problem because someone will have solved it for maybe a higher price, maybe whatever, because they're doing more compute, they've got a bigger model.

Speaker 2

这将极大地改变现状。

That's gonna change things a lot.

Speaker 1

我的意思是，即使今天，你也可以通过某种方式来定义问题。

I mean, already today, you could define the question in a way.

Speaker 1

你可以找到合适的提示词，从而常常消除幻觉。

You could find the prompt so you can often eliminate the hallucination.

Speaker 1

如果你是个优秀的提示词编写者，你可以大大减少幻觉。

If you're a good prompt writer, you can reduce hallucinations quite a bit.

Speaker 2

很多人已经开始用大语言模型自动将提示重写为更好的提示，因为事实证明，大语言模型本身是非常出色的提示工程师。

A bunch of people have actually taken to rewriting the prompt into a better prompt and automatically doing that with LLMs because it turns out LLMs are really good prompt engineers themselves.

Speaker 1

完美。

Perfect.

Speaker 1

是的。

Yeah.

Speaker 1

那么事情就这样了。

So then there you go.

Speaker 1

你也可以把这个做得更好。

You can make that better as well.

Speaker 2

这会带来延迟问题，因为每多一步操作，都会延迟给出答案。

Now this is a latency issue because every time you do another step, it delays giving an answer.

Speaker 2

说到我们自己的书，我们的延迟要低得多。

Speaking of our own book, we are much lower latency.

Speaker 2

我们非常推崇这类技术，尽管目前它们已经被广泛应用，并且带来了很大帮助。

We're a big fan of these sorts of techniques, but they are currently used and they help a lot.

Speaker 2

事实上，有一种叫做反思的技术，你只需问模型：‘对于这个输出，你怎样才能做得更好？’

In fact, there's a technique called reflection where you just ask the model, hey, on this output, how could you have made it better?

Speaker 2

很好，现在就去做吧。

Great, now do that.

Speaker 1

哦，有意思。

Oh, interesting.

Speaker 1

然后再做一次，再做一次，再做一次。

And then do it again, do it again, do it again.

Speaker 2

通常的经验法则是，每三次反思相当于一代模型的提升，但这是指数级的。

And typically the rule of thumb is every three reflections is a generational model improvement, but it's to the power.

Speaker 2

所以如果你想获得两代提升，就需要九次。

So if you wanna get two generations, it's nine.

Speaker 2

如果你想获得三代提升，就需要二十七次。

If you wanna get three generations, it's 27.

Speaker 2

这就是为什么你在许多展示前沿成果的论文中会看到里面有一个数字。

This is why you see in a lot of these papers showing state of the art results, they'll have a number in there.

Speaker 2

我忘了他们怎么称呼它了，可能是几次提问之类的。

I forget what they refer to it as, but it's shots or whatever.

Speaker 2

多少次迭代？

How many iterations?

Speaker 2

多少次提问？

How many shots?

Speaker 2

他们会运行大量实例，然后不断改进结果。

They do a whole bunch of instances, and then they just improve the results over and over again.

Speaker 2

所以速度对此至关重要。

So speed really matters for that.

Speaker 1

当你自己设计提示时，有没有什么方法让你今天获得了更好的结果？

When you're doing your own prompts, is there something you've done in a way to get better results today?

Speaker 2

对人有效的方法，对大语言模型也有效。

Things that work with people work with LLMs.

Speaker 2

如果你有几个员工，给他们明确的目标通常会非常有帮助。

So you have a few employees, giving a very clear objective is generally very helpful with your employees.

Speaker 2

所以如果你说‘写一个故事’，它就会写一个故事，而这会让你失望。

So if you say, write a story, it'll write a story and that's gonna disappoint you.

Speaker 2

但如果你说‘写一个令人兴奋的故事’，或者定义什么是英雄之旅——即故事中存在紧张感和障碍，到结尾时，你最初面临的紧张和障碍其实并不重要，而出现了一个新的、更重要的障碍，这就是英雄之旅的结构——然后你明确说‘现在创作一个英雄之旅’，它就会表现得好得多。

But if you say, write a story that is exciting, or if you define what a hero's journey is, which is something where there's tension and there's an obstacle, by the end of it, the tension and obstacle that you started with turns out isn't important, and there's a new one that's important, that's kind of the hero's journey arc, and define that and you say, now make a hero's journey, it'll do that much better.

Speaker 2

但此外，先让它列出大纲，再让其将大纲转化为最终内容，这也会有很大帮助。

But then also just asking it to do an outline before it gets started and then having it turn that outline into output, that helps a lot.

Speaker 2

始终思考：我能做些什么来让人更好地完成这件事？

Always think what could I do to get a human being to do this better?

Speaker 2

这会有帮助。

And that'll help.

Speaker 2

一个非常有效的方法是，让它想象一下答案会是什么样子。

One that helps a lot, asking it to imagine what an answer would look like.

Speaker 2

突然间，答案的质量就会大大提高。

And all of a sudden, the answers get much better.

Speaker 1

有意思。

Interesting.

Speaker 1

我也得试着用在我合作的这些人身上。

I gotta try that with the humans I work with too.

Speaker 2

没错。

Exactly.

Speaker 2

我本来想说的是，我从大语言模型身上学到了这一点，然后我把这个方法应用到了人身上，效果非常好。

So what I was about to say is I learned that with LLMs, and then I back ported that to people, and that actually works very well.

Speaker 1

现在你作为CEO，处于一个非常有趣的位置。

Now you're in a very interesting position as a CEO.

Speaker 1

你的员工是全球最抢手、最受激烈争夺的人才。

Your employees are some of the most in demand, most aggressively recruited people in the world.

Speaker 1

在这种情况下，你该如何经营公司？

How do you run a company when you're in that situation?

Speaker 2

从一开始我们就一直面临这种情况，所以我们已经习以为常了。

We've always been in that situation from the start, so we're just used to that.

Speaker 2

而且这又回到了汉密尔顿·赫尔默的七种竞争优势。

And again, it goes to Hamilton Helmer's seven powers.

Speaker 2

如果你只关注经济因素，把它当成一种商品，认为只要多付钱就能留住人，那你会失败。

If you focus purely on economics and it's a commodity thing, and it's like, I'm gonna pay you more, you're gonna lose.

Speaker 2

所以，我们招聘的任何一个人，我们都会尽量把薪资定得低于他们从其他地方拿到的最高报价。

So one thing is anyone that we hire, we always try and be lower than the highest offer that they get elsewhere.

Speaker 2

因为否则我们就得不到足够的信号。

Because otherwise, we don't have enough signal.

Speaker 1

你不想要一个唯利是图的人，你想要真正相信你所做事业的人。

You don't want the mercenary, you want the person who really believes in what you're doing.

Speaker 2

没错。

Exactly.

Speaker 2

接下来一点是，拥有顶尖人才时，人们通常更愿意留在顶尖人才身边。

The next thing up is by having really great talent, people generally wanna stick around really great talent.

Speaker 2

我觉得每个人都认为自己的人才很出色。

I think everyone thinks that their talent is great.

Speaker 2

我记得曾和一位CEO交谈，他特别自豪于自己团队的人才密度，但后来他有机会见到我们的一些人，他说：‘我以为我懂什么是人才密度了。’

I remember talking to the CEO who was particularly proud of their talent density, and then they had an opportunity to meet some of our people, and they're like, I thought I knew what talent density was.

Speaker 2

现在我明白了。

Now I know.

Speaker 2

当你只招聘非常优秀的人时，其他非常优秀的人也想和非常优秀的人共事。

When you just hire really amazing people, other really amazing people wanna work with other really amazing people.

Speaker 2

要挖走他们真的很难。

It's really hard to pull them away.

Speaker 1

过去七年左右，这些员工的市场薪资水平大幅上涨。

The market rate for these employees has gone up dramatically over the last, let's say, seven years.

Speaker 1

每年可能都上涨了20%以上，而相比之下，普通软件工程师的薪资——至少在过去三年里——我认为已经下降了，但大多数与AI相关的工程师，即使他们愿意接受较低的薪资来为你工作，你仍然需要不时加薪以保持竞争力。

It's probably gone up over 20% a year every year, whereas like the average software engineer, at least in the last three years, I believe it's gone down in the last three years, most of these more AI related engineers, even if they're willing to take a reduced salary to come work for you, you still have to do some sort of raises and match it over time too.

Speaker 2

另一点是我们不做AI。

The other thing is we don't do AI.

Speaker 2

我们是卖铲子和铁锹的。

We're picks and shovels.

Speaker 2

实际上，在Grok公司，真正做AI的人非常少。

We actually have very few people who actually do AI at Grok.

Speaker 2

我们是一个赋能者。

We're an enabler.

Speaker 1

你们更多是在和英伟达这样的公司竞争。

You're more competing with, like, the NVIDIA or these other types of places.

Speaker 2

没错。

That's right.

Speaker 2

所以我们这里确实有这些人，但其中一个机会是，当你身处其中并帮助所有人让这些技术运行起来时，你就能学到东西。

And so we do have people here, but one of the opportunities is when you're near this stuff and you're helping everyone get it to work, you get to learn.

Speaker 2

所以，如果你作为一名软件工程师或硬件工程师加入，你将有机会以其他方式无法获得的方式与这些技术互动，因为我们实际上能与最顶尖的用户和客户合作。

And so if you come in as a software engineer or whatever or hardware engineer, you get to interface with this in a way that you wouldn't otherwise because we actually get to work with the best of the best as users and customers of our stuff.

Speaker 2

我们可能比其他人更近距离地看到他们真正关心什么，以及他们的工程实践是什么样子。

We might actually have more of a front row seat to what matters to them and what that engineering looks like than a lot of others.

Speaker 2

但我们自己并不亲自去做这些，因此这大大减轻了竞争压力。

But we're not the ones doing it ourselves, and so that takes a lot of the competitive pressure off.

Speaker 1

好的，我们最后问两个问题。

Alright, two last questions we ask all of our guests.

Speaker 1

第一个问题是：你相信什么样的阴谋论？

First is, what is a conspiracy theory that you believe?

Speaker 2

我是世界上最不擅长相信阴谋论的人。

I am the world's worst conspiracy theory believer.

Speaker 1

我原以为你在这方面还挺在行的。

I would have thought you'd be pretty good at it.

Speaker 2

关于阴谋论，大多数狂热者往往能同时相信两件相互矛盾的事情。

The thing about conspiracy theories is most people who are conspiracy theory junkies, they can believe two things that are incongruent.

Speaker 2

我本人和我们在Grok招聘时所看重的一点，是我们称之为‘现实指数’的东西。

One of the things that I do and that we hire for at Grok is what we call reality quotient.

Speaker 2

我们有一整套提升现实指数的层级。

And we've got a whole bunch of levels of how you improve on reality quotient.

Speaker 2

但最基本的是我们所说的‘可塑性思维’，也就是当事实发生变化时，你的想法也会随之改变。

But the start is what we call a malleable mindset, or when the facts change, your mind changes.

Speaker 2

这与阴谋论格格不入，因为阴谋论一旦被证伪，你就会觉得：‘哦，我错了。’

That doesn't work well with conspiracy theories because conspiracy theories get debunked and then you're like, oh, I was wrong.

Speaker 2

但我想说，我有一些奇怪的信念，而我们的一些奇怪信念竟然成真了。

But I would say it's more that I have some weird beliefs, and we have some weird beliefs that have come true.

Speaker 2

我不确定是否存在什么阴谋论。

I don't know that there'd be conspiracies.

Speaker 2

其中一个令人惊讶的是，我们当时觉得显而易见的是，推理将在市场中占据越来越重要的地位。

And one of them shockingly was that we thought it was obvious that inference would start to become a bigger part of the market.

Speaker 2

所有人都觉得我们疯了。

Everyone thought that we were nuts.

Speaker 2

他们觉得会是训练主导。

It's going be training.

Speaker 2

但我们说，你花钱在训练上，却靠推理赚钱。

And we're like, But you spend money on training and you make money on inference.

Speaker 2

当然，推理的规模会越来越大。

Of course, inference is going to get larger.

Speaker 2

我不确定。

I don't know.

Speaker 2

我对阴谋论不太在行。

I'm bad at conspiracy theories.

Speaker 2

对不起。

I'm sorry.

Speaker 1

那太好了。

Well, that's great.

Speaker 1

我们问所有嘉宾的最后一个问题。

Last question we ask all of our guests.

Speaker 1

你认为哪些普遍接受的智慧或建议其实是糟糕的建议？

What conventional wisdom or advice do you think is generally bad advice?

Speaker 2

我不喜欢给人建议，因为人们不喜欢接受建议。

I hate giving advice because people don't like to take advice.

Speaker 2

所以我要说的是，对我自己和我认识的其他人最有帮助的是尽量变得更加无畏。

And so what I will say is the thing that has been most advantageous for myself and others that I've known is to try and be more fearless.

Speaker 2

问题不在于不够无畏，而在于你没有意识到自己已经变得恐惧，而这正是阻碍你的原因。

And the problem is not being more fearless, it's recognizing that you've become afraid and that's what's stopping you.

Speaker 2

我经常参加一些会议，有人会说，我们不应该因为这个原因做这件事。

I'm in meetings all the time where someone says, we shouldn't do this for this reason.

Speaker 2

但事实上，他们不想做是因为害怕。

But the reality is they don't wanna do it because they're afraid.

Speaker 2

也有一群人完全不害怕，总是惹上麻烦。

There are also groups of people who have no fear and get in trouble all the time.

Speaker 2

但你我经常接触的那些人，往往是比较胆怯的类型。

But the folks that you and I interact with a lot are the kind who are more afraid.

Speaker 1

真的吗？

Is that true?

Speaker 1

还是你觉得典型的创业者性格可能

Or you think maybe a typical founder personality might

Speaker 2

创业者没那么害怕。

A founder is less afraid.

Speaker 2

但我是说，工程领域，因为如果你犯了错，所以你会更谨慎，想把一切都做对。

But I mean, engineering, because if you make a mistake and so you're more cautious, you wanna get it all right.

Speaker 2

这有点像纳西姆·塔勒布的观点，即高风险事物的风险价值其实并未被充分定价。

Sort of like a Nassim Talebism of the value of the risk isn't really priced in on high risk things.

Speaker 2

因此，你真正应该追求的是更高风险的事物，因为生活中那些低风险事物背后真正潜藏的风险，其实也未被定价。

And so you should really be going after higher risk things because the low risk things in life, that real risk that's under there for everything isn't priced in.

Speaker 2

我常举的一个例子是，每个人租用房产时都必须购买火灾保险，但没人需要购买疫情保险。

An example I like to give is everyone has to have fire insurance on any property that they rent or whatever, but no one needs pandemic insurance.

Speaker 2

然而，在过去两百年里，普通上班族因疫情而无法上班的次数，远多于因火灾而无法上班的次数。

Yet over the last two hundred years, the average person working in an office building has been out of an office more often because of pandemics than because of fires for buildings.

Speaker 2

你得去关注那些风险更高的事情，因为那样的话，风险与回报的比值会更好一些。

You got to look at things that are riskier and do those because then the price to value is a little better.

Speaker 2

创业是你能做的最好、风险最低的选择之一，因为如果你在大公司，随时可能被裁员。

Startups are one of the best, lowest risk things you can do because if you're in a big company, you could be in a layoff.

Speaker 1

你的成长速度不会那么快，还有其他各种因素。

You don't grow as fast, you know, all these other types of things.

Speaker 2

所以，去找那些别人都害怕去做的事。

So, look for the things that everyone else is afraid of.

Speaker 2

去做那些事，然后对那些别人都不害怕的东西，稍微保持一点警惕。

Go do those things, and then all the things that no one else is afraid of, be a little fearful of those.

Speaker 2

我觉得这是个门格主义，这样你会好很多。

I think that's a mongerism, and then you'll be much better off.

Speaker 1

这太棒了。

That's great.

Speaker 1

感谢乔纳森·罗斯加入我们的《DAS世界》节目。

Thank you, Jonathan Ross, for joining us on World of DAS.

Speaker 1

这真的非常有趣。

This has been really interesting.

Speaker 1

顺便说一下，我在Twitter上关注了你，JonathanRoss321。

By the way, I follow you JonathanRoss321 three two one on Twitter.

Speaker 1

我强烈建议我们的听众去和你互动。

I definitely encourage our listeners to engage with you there.

Speaker 1

我学到了很多，这真的非常有意思。

I learned a ton, so this has been super interesting.

Speaker 2

谢谢，谢谢邀请我。

Thanks, and thanks for having me.

Speaker 2

我很感激。

I appreciate it.

Speaker 1

如果你是个数据狂热爱好者，去

If you're a super data nerd, go

Speaker 0

worldofdaas.com。

to worldofdaas.com.

Speaker 0

就是daas，worldofdaas.com，注册我们的每周数据即服务简报。

That's daas, worldofdaas.com, and sign up for our weekly data as a service roundup newsletter.

Speaker 0

谢谢收听。

Thanks for listening.

Speaker 0

如果你喜欢这期节目，不妨给这个播客打分并留下评价。

If you enjoyed this show, consider reading this podcast and leaving a review.

Speaker 0

想了解更多World of DAS，DAS是d-a-a-s，你可以在Spotify、Apple Podcasts或任何你收听播客的平台订阅，也可以前往YouTube观看视频。

For more World of DAS, and DAS is d a a s, you can subscribe on Spotify or Apple Podcasts or anywhere you get your podcasts, and also check out YouTube for videos.

Speaker 0

你可以在Twitter上找到我，用户名是@Orin。

You can find me at Twitter at at Orin.

Speaker 0

拼写是u-r-e-n，Orin，我们非常期待你的反馈。

That's a u r e n, Orin, And we'd love to hear from you.

Speaker 0

《World of DAS》由Safegraft赞助播出。

World of DAS is brought to you by Safegraft.

Speaker 0

Safegraft提供实体地点的地理空间数据。

Safegraft is geospatial data for physical places.