OpenAI Podcast - 第八集 - OpenAI与博通及计算的未来 封面

第八集 - OpenAI与博通及计算的未来

Episode 8 - OpenAI x Broadcom and the future of compute

本集简介

OpenAI的Sam Altman和Greg Brockman与博通的Hock Tan及Charlie Kawwas共同探讨了双方的新合作伙伴关系及其对人工智能未来的意义。从定制芯片到全球规模的基础设施,他们分享了计算创新如何塑造通往通用人工智能的道路。

00:00 宣布合作
03:06 AI基础设施的规模
06:03 芯片设计中的协作与创新
08:49 历史背景与未来愿景
12:10 计算在AI发展中的作用
15:01 针对特定工作负载的优化
18:02 通往AGI的旅程
21:00 AI与计算能力的未来
23:50 总结与未来项目

本节目由Acast托管。更多信息请访问acast.com/privacy。

双语字幕

仅展示文本字幕,不包含中文音频;想边听边看,请使用 Bayt 播客 App。

Speaker 0

大家好,我是安德鲁·梅恩,欢迎收听OpenAI播客。今天我们非常激动地要宣布一则关于博通与OpenAI的重磅消息。来自OpenAI的萨姆·阿尔特曼和格雷格·布罗克曼,以及来自博通的陈福阳和查理·卡瓦斯将共同参与本期节目。

Hello, I'm Andrew Mayne, and welcome to the OpenAI Podcast. Today, we're excited to be breaking some news involving Broadcom and OpenAI. Joining me from OpenAI are Sam Altman and Greg Brockman, and from Broadcom, Hock Tan and Charlie Kawwas.

Speaker 1

从当前人工智能基础设施建设的诸多角度来看,可以说这是人类历史上规模最大的联合工业项目。

A lot of ways that you would look at the AI infrastructure build out right now, you would say it's the biggest joint industrial project in human history.

Speaker 2

我们正在定义人类文明的下一代操作系统。

We're defining civilization's next generation operating system.

Speaker 3

与我们的目标相比,这只是沧海一粟。虽然这一粟很大,但

That is a drop in the bucket compared to where we need to go. That's a big drop, but

Speaker 1

确实很大。

it's a big drop.

Speaker 0

那么今天我们要讨论什么话题?是什么把各位聚集在一起?

So what are we talking about today? What brought you all together?

Speaker 1

今天我们宣布博通与OpenAI达成战略合作。过去十八个月里,我们共同设计了一款新型定制芯片。最近我们还开始研发整套定制系统。这些技术已复杂到需要整体解决方案。明年年底我们将开始部署10吉瓦规模的机架系统,搭载我们研发的芯片——这将是服务于全球先进智能需求的庞大计算基础设施。

So today we're announcing a partnership between Broadcom and OpenAI. We've been working together for about the last eighteen months designing a new custom chip. More recently, we've also started working on a whole custom system. These things have gotten so complex, you need the whole thing. And starting late next year, we will be deploying 10 gigawatts of these racks, of these systems and our chip, which is a gigantic amount of computing infrastructure to serve the needs of the world to use advanced intelligence.

Speaker 0

那么这将涉及计算和芯片设计以及扩展?

So this is going to entail both compute and chip design and scaling out?

Speaker 1

这是一个完整的系统。我们曾紧密合作设计了一款专为我们工作负载定制的芯片。当我们意识到世界将需要多大的计算能力——特别是推理能力时,我们开始思考能否专门为这类特定工作负载设计芯片。显然,博通是全球最理想的合作伙伴。而令我们惊讶的是,虽然最初并非如此规划,但随着我们意识到需要整个系统协同支持,事情变得越来越复杂。

This is a full system. So we closely collaborated for a while on designing a chip that is specific to our workloads. When it became clear to us just how much capacity, inference capacity in particular, the world was going to need, we began to think about whether we could do a chip that was meant just for that kind of very specific workload. Broadcom is the best partner in the world for that, obviously. And then, to our great surprise (this was not the way we started), as we realized that we were going to really need the whole system together to support this, it's gotten more and more complex.

Speaker 1

事实证明博通在系统设计方面同样出色。我们正在合作开发整套方案,这将帮助我们进一步提升服务所能提供的计算能力。

It turns out Broadcom is also incredible at helping design systems. So we are working together on that entire package and this will help us even further increase the amount of capacity we can offer for our services.

Speaker 0

那么Hock,这是怎么开始的?你们最初是什么时候讨论合作的?

So Hock, how did this come about? When did this start? When did you guys first talk about working together on this?

Speaker 4

除了Sam和Greg是极好的合作伙伴这个事实外,这合作水到渠成——因为OpenAI始终致力于开发最先进的生成式AI前沿模型。但随之而来的是对计算能力的持续需求,尤其是随着路线图向更优秀的前沿模型和超级智能发展。计算能力是核心要素,这离不开半导体技术——正如Sam所说,甚至超越半导体本身。虽然由我来说可能不太合适,但我们确实是全球最顶尖的半导体企业。更重要的是,AI对我们而言是极其激动人心的机遇,我们的工程师正在不断突破半导体技术的创新边界。

Well, other than the fact that Sam and Greg are great people to work with, it's a natural fit, because OpenAI has been doing, and continues to do, the most advanced models, the frontier models in generative AI out there. But as part of that, you continue to need compute capacity, the best and latest compute capacity, as you progress on the roadmap towards better and better frontier models and towards superintelligence. Compute is a key part, and that comes with semiconductors and, as Sam indicated, more than semiconductors. And we are, even though I say it myself, probably the best semiconductor company out there. More than that, AI is a very, very exciting opportunity for us: my engineers are pushing the innovation envelope with newer and newer generations of semiconductor technology.

Speaker 4

因此对我们来说,与顶尖生成式AI公司合作是顺理成章的事。

So for us, collaborating with the best generative AI company out there is a natural fit.

Speaker 0

这不仅仅是芯片规模化的问题——比如10吉瓦的规模。说实话我甚至难以理解,当你们提到10吉瓦时究竟意味着什么?

And this isn't just chips; this is going out to scale, like 10 gigawatts. And I have trouble even understanding that. What does that even mean when you're talking about 10 gigawatts?

Speaker 1

首先,你提到霍克也谈到了这一点,垂直整合至关重要。我们能够从晶体管蚀刻一直思考到当你向ChatGPT提问时输出的token,并设计整个系统。包括芯片相关的一切、我们设计机架的方式、它们之间的网络连接、我们使用的算法如何适配推理芯片本身,直至最终产品的所有环节。我对此感到兴奋的原因之一在于,通过优化整个技术栈,我们能获得巨大的效率提升,这将带来更优的性能、更快的模型、更经济的模型等等。随着性能提升和模型变得更便宜智能,我们始终观察到一个现象:人们总想用得更多。

First of all, you said it's not just chips; Hock touched on this too, and the vertical integration point is really important. We are able to think from etching the transistors all the way up to the token that comes out when you ask ChatGPT a question, and design the whole system: all of the stuff about the chip, the way we design these racks, the networking between them, how the algorithms that we're using fit the inference chip itself, a lot of other stuff, all the way to the end product. And one of the many reasons I'm so excited about that is that by being able to optimize across that entire stack, we can get huge efficiency gains, and that will lead to much better performance, faster models, cheaper models, all of that. As you get that better performance and cheaper and smarter models, one thing that we have consistently seen is that people just want to use way more.

Speaker 1

过去我们常想,优化10倍就能解决所有问题,但当你优化10倍时,需求却增长了20倍。因此新增的10吉瓦电力,是在我们现有合作伙伴、数据中心和硅基合作基础上的增量。10吉瓦是巨大的容量。然而如果我们能如预期般出色地完成,尽管远超当今世界水平,我们预计高质量智能以极快速度和极低成本交付时,世界将迅速吸收并挖掘出惊人的新用途。那么这一切的希望是什么?

So we used to think, oh, we'll optimize things by 10x and we'll solve all of our problems, but you optimize by 10x and there's 20x more demand. So 10 gigawatts, incremental gigawatts: this is all on top of what we're already doing with other partners and all the other data center and silicon partnerships we've done. 10 gigawatts is a gigantic amount of capacity. And yet, if we do as good a job as we hope, even though it's vastly more than the world has today, we expect that with very high quality intelligence delivered very fast and at a very low price, the world will absorb it super fast and just find incredible new things to use it for. So what is the hope with this?

Speaker 1

希望在于,人们当前用这些算力做的事情——写代码、实现企业自动化、生成视频等等——未来他们将能做得更多,且借助更智能的模型。

The hope is that the kinds of things people are doing now with this compute, writing code, doing more and automating more and more of enterprises, generating videos and so on, whatever it is, they will be able to do that much more of it and with much smarter models.

Speaker 0

这太棒了。格雷格和查理,回顾历史,当人们试图开发芯片或硬件来适配当时计算需求时,你们参考了哪些历史案例来规划未来?是什么启发了你们的思考?

That's amazing. So Greg and Charlie, when you think about historically, when people have tried to develop chips or hardware to suit whatever was the current mode of using computing at that point, what examples have you looked to historically to figure out how to plan forward? What's been inspiring you when you think about this?

Speaker 3

老实说,我认为最关键的是与优秀伙伴合作。显然我们公司无法独自完成所有事,要从零开始为特定工作负载自研芯片,没有霍克、查理和博通的合作是不可能的。借助他们的专业与我们自身对工作负载的理解,这过程非常了不起。有趣的是,OpenAI能在许多方面以不同于行业传统的方式突破,例如我们用自己的模型来设计芯片,这非常酷。

Well, I'd say the number one thing, honestly, is working with good partners. I think it's very clear that we, as a company, are not able to do everything ourselves, and getting into actually building our own chips for our own specific workloads was not something we could do from a total standstill without working with Hock and Charlie and Broadcom. So it's just been really incredible to lean on their expertise together with our understanding of the workload. And it's been actually very interesting to see the places where OpenAI is able to do things very differently from the rest of the industry, or from the way that things would historically be done. For example, we've been able to apply our own models to designing this chip, which has been really cool.

Speaker 3

我们成功压缩了时间表,实现大幅面积缩减——在人类已优化的组件上投入算力后,模型会自行产生优化方案。现在的情况很有趣:我们的优化方案人类设计师并非想不到,但专家们事后查看时常说'这确实在我的清单上,但还有20项优化需要一个月才能完成'。与查理团队合作临近截止日时仍在进行优化,这本身就很有启发性。

We've been able to pull in the schedule, and we've been able to get massive area reductions, where you take components that humans had already optimized, just pour compute into it, and the model comes out with its own optimizations. And it's very interesting. We're at the point now where I don't think any of the optimizations we have are ones that human designers couldn't have come up with. Usually our experts take a look at it later and say, yeah, this was on my list, but it was, like, 20 things down, and it would have taken them another month to get to it. And it was actually really interesting: we were coming up on a deadline working with Charlie's team, and we were running optimizations.

Speaker 3

当时面临选择:是立即检查优化方案,还是持续工作到截止日后再评估?我们当然选择继续推进。因此我们内部积累了该领域的专业能力,这实际上有助于推动整个行业。我认为我们正迈向AI智能帮助人类实现前所未有突破的世界,而这需要尽可能多的算力来支撑。

We had a choice of do we actually take a look at what those optimizations were or do we just keep going until the deadline and then take a look after? And we decided, of course, you gotta just keep going. And so we've really been building up this expertise in house to understand this domain, and that's something we actually think can help lift up the whole industry. But I think that we are heading to a world where AI intelligence is able to help humanity make new breakthroughs that just would not be possible otherwise. And we're going to need just as much compute as possible to power that.

Speaker 3

举个非常具体的例子,我们现在所处的世界中,ChatGPT正从一种需要交互对话的工具,转变为能在后台为你工作的智能体。如果你使用过Pulse等功能,每天早晨醒来时,它都会推送与你兴趣相关的内容。这非常个性化,而我们的目标是将ChatGPT打造成助力实现个人目标的工具。关键在于,我们目前只能向Pro用户开放此功能,因为受限于现有算力资源。理想情况下,每个人都该拥有24小时全天候运行的智能代理,在后台协助达成目标。

One very concrete example is that we are in a world now where ChatGPT is changing from something that you talk to interactively to something that can go do work for you behind the scenes. If you've used features like Pulse, you wake up every morning and it has some really interesting things that are related to what you're interested in. It's very personalized, and our intent is to turn ChatGPT into something that helps you achieve your goals. The thing is, we can only release this to the Pro tier, because that's the amount of compute that we have available. And ideally, everyone would have an agent that's running for them 24/7 behind the scenes, helping them achieve their goals.

Speaker 3

因此最理想的状态是,每个人都拥有持续运行的专属加速器和算力资源。这意味着面对全球100亿人口,我们现有的芯片制造能力还远远不够。在满足全人类实际需求(而不仅是市场需求)之前,我们仍有很长的路要走。

And so ideally, everyone has their own accelerator, their own compute power that's just running constantly. And that means, you know, there are 10 billion humans. We are nowhere near being able to build 10 billion chips. And so there's a long way to go before we are able to saturate not just the demand, but what humanity really deserves.

Speaker 0

查理,作为技术专家,又身处多次引领技术革命的公司,与OpenAI这样的企业合作,并与格雷格共事是怎样的体验?

So, Charlie, being very deeply technical and being with a company that's been at the forefront of a number of these revolutions, what's it been like working with a company like OpenAI, and working with Greg on this?

Speaker 2

对我们而言,这段合作既激动人心又充满新意,因为我们共同聚焦于特定工作负载。最初我们从IP和AI加速器(我们称之为XPU)着手,但很快意识到可以深入到晶体管层级来优化工作负载。正如格雷格所说,通过共同定制平台来适配工作负载,从而打造出世界顶尖的平台。后来我们更意识到,正如山姆早前指出的,不仅需要XPU或加速器,更需要可扩展的网络架构来实现横向和纵向的拓展。

So, for us, it's been absolutely exciting and refreshing because the beauty of the work we do together is focus on a certain workload. We started actually first looking at the IP and AI accelerator, which is what we call the XPU. And then we realized very quickly that we now can actually go to the workload all the way down to the transistor. And as Greg was just explaining, how we can both work together to go customize that platform for your workload, resulting in the best platform in the world. Then we realized, as Sam was saying earlier on, it's not just that XPU or accelerator, actually, it's the networking that needs to go to scale it up, scale it out and scale it across.

Speaker 2

于是我们突然发现,我们实际上能推动下一阶段的标准化和开放化进程——这不仅使我们受益,更将惠及整个生态系统,并加速通用人工智能的发展。我们既为团队的技术实力感到振奋,也为共同愿景和推进速度感到自豪。

And so suddenly we started seeing that we can actually drive the next level of standardization and openness, which not only benefits us; I think it will actually benefit the entire ecosystem, and it gets gen AI to AGI much faster. I'm excited about the technical capabilities of the teams we have, but also the vision and, I think, the speed at which we've been moving.

Speaker 0

我仍在努力理解这个项目的规模——既要设计芯片这样的核心部件,又要实现极致效率,还要考虑整个基础设施的庞大体量。这是个全球性工程。能否从历史中找到类似的对标案例?

I'm still kind of wrapping my head around the scale of it, from trying to design something like a chip, and figuring out how you're going to get the maximum efficiency out of it, to just the size of the infrastructure and what's involved in this. This is a global effort. What comparisons have you been able to draw between this and other examples in history?

Speaker 1

历史类比总是困难的。但据我所知,虽然不清楚修建长城当时占全球GDP的比重,但就当前AI基础设施建设的规模而言,可以说是人类历史上最大的联合工业项目。这需要众多企业、国家和行业通力协作,许多环节必须同步推进。鉴于研究前沿的突破和商业领域创造的价值,整个行业都已认定这是值得投入的豪赌。当你走进一个千兆瓦级数据中心,就能直观感受到这种规模。

I always think the historical analogies are tough. I don't know what fraction of global GDP building the Great Wall was at the time, but in a lot of the ways that you would look at the AI infrastructure build-out right now, you would say it's the biggest joint industrial project in human history. This requires a lot of companies, a lot of countries, a lot of industries to come together. A lot of stuff has to happen at the same time, and we've all got to kind of invest together. But at this point, given everything we see coming on the research front, and given all of the value we see being created on the business front, I think the whole industry has decided this is a very good bet to take. But it is huge. You go to even one of these one-gigawatt data centers and you look at the scale of what's happening there.

Speaker 1

它就像一座微型城市,一个庞大复杂的系统。

It's like a tiny city. It's a big complex thing.

Speaker 3

这规模简直令人难以置信。作为一个大型协作项目,每次我联系查理时,他总在世界各地奔波——确保产能、寻找方法,只为共同实现我们的目标。

So it is just incredible scale. To the point of this being a massive collaborative project, I feel like whenever I call Charlie, he's in a different part of the world trying to secure capacity, trying to find a way to help us build what we're trying to do together.

Speaker 2

确实如此。实际上我最近一直在思考,我们在这个绝佳合作中最酷的事情——我们正在定义下一代文明操作系统。正如你所说,我们从晶体管层面开始构建,新建晶圆厂、制造基地,直到组装机架,最终建成你提到的那些数据中心——那可是10吉瓦规模的数据中心啊。

Exactly. Actually, one of the coolest things I was thinking about is what we're doing together in this wonderful partnership: we're defining civilization's next-generation operating system. And we're doing it, as you're saying, at the transistor level, building new fabs, building new manufacturing sites, all the way to building these racks and ultimately the data centers you're talking about, 10 gigawatts of data centers.

Speaker 0

是的,我认为需要认清一个重要事实:人们往往只盯着芯片本身,这就像认为国家公路工程只是为了卖沥青,或铁路只是为了钢材。实际上,真正重要的是在此基础上实现的可能。你们肯定深有体会,比如当...

Yeah, I think an important thing to keep track of is that people often get fixated just on the chips themselves, and that's kind of like thinking the national highway project was about selling asphalt, or that railroads were about steel. In reality, it's the things that become possible on top of that. And you've probably thought a lot about that, like what happens when

Speaker 4

在我看来,这就像铁路和互联网的结合。它正逐渐成为关键基础设施,不仅是服务于1万家企业的关键设施,山姆,最终将成为全球80亿人的基础服务。这像是另一种形式的工业革命正在兴起。但单靠一方无法完成,我们认为至少需要两方合作,实际上更需要建立广泛的伙伴关系,需要整个生态系统的协作。

Well, I think this is like railroad plus Internet. I think this is becoming, over time, critical infrastructure, a critical utility, and more than just a critical utility for, say, 10,000 enterprises. This is, over time, a critical utility, right, Sam, for 8 billion people globally. I think that's the big thing: it's like an industrial revolution of a different sort coming forth. But it cannot be done with just one party, or even, as we like to think, with two. More than that, it needs a lot of partnerships; it needs collaboration across an ecosystem.

Speaker 4

正因如此,尽管我们强调为特定工作负载、应用和大语言模型开发芯片,但同样需要建立开放透明的通用标准。因为归根结底,我们需要构建完整的基建体系,使之成为全球60亿人的基础服务。说实话我们非常兴奋,这正是我们认为能成为绝佳合作伙伴的原因——我们怀有相同的信念,更重要的是,这关乎扩展计算能力以实现超级智能和模型的突破,是在为未来奠基。

And because of that, much as we talk about developing chips for specific workloads, applications and LLMs, it's also important to create standards that are somewhat open and transparent for all to use, because at the end of the day you need to build up a whole infrastructure that becomes a critical utility for 6 billion people in the world. And we're very excited, frankly, which is why we think we make great partners: I think we share the same conviction. And more than that, it is about scaling computing to create breakthroughs in superintelligence and models. It's building the foundation of that.

Speaker 0

你们手头已经有很多事了,为什么现在还要设计芯片?

You guys have a lot on your plate. Why design chips now?

Speaker 3

这个项目我们已经进行了大约十八个月,进展速度惊人。我们招募了一些非常优秀的人才。通过这段时间,我们深刻理解了工作负载的特性。我们与生态系统中的多方合作,发现市面上有许多令人惊叹的芯片,每种芯片都有其独特的应用场景。

Well, this project, we've probably been working on it for eighteen months now, and it's moved incredibly quickly. We've hired some really amazing people. And I think what we found is that we have a deep understanding of the workload. We work with a number of parties across the ecosystem, and there are a number of chips out there that I think are really incredible. And there's a niche for each one.

Speaker 3

因此我们一直在寻找那些未被充分满足的特定工作负载需求,思考如何构建能够突破现有可能性的解决方案。我认为这种能够为预见性需求提供全栈垂直整合的能力——尤其是在难以通过其他合作伙伴实现的情况下——正是这类项目的典型应用场景。

And so we've really been looking for specific workloads that we feel are underserved, asking how we can build something that will accelerate what's possible. And I think that ability to do the full vertical integration for something we see coming, where it's hard for us to work through other partners, is a very clear use case for this kind of project.

Speaker 4

确实如此,甚至远不止这些。Greg说得很到位,你们自主研发芯片的核心原因在于:计算能力是通往超级智能、打造更强大前沿模型的关键路径。这本质上可以归结为计算问题——但绝非普通计算,而是需要高效能、高性能且尤其注重能效的专项计算。

Yeah. Actually, more than that. And Greg, you put it very well. Really, the reason you want to do your own chip is that computing is a big part of what's gating this journey towards superintelligence, towards creating better and better frontier models. A lot of it really comes down to computing, and not just any computing: computing that is effective, high performance and efficient, especially on power.

Speaker 4

Greg的观点与我们在此的实践发现完全一致。举例来说,训练环节需要设计算力更强的芯片,以T FLOPS衡量的计算能力和网络性能都至关重要——正如Charlie所说,这需要芯片集群协同工作。而推理环节则需要侧重内存容量与计算访问的配比。随着时间推移,我们正在针对不同工作负载和应用场景打造专用芯片,这才是最终催生高效模型的关键。

And what Greg is saying is exactly what we learned and saw here. For instance, if you want to train, you design chips that are much stronger in computing capacity, measured in TFLOPS, as well as in networking, because it's not just one chip that makes it happen; it's a cluster, as Charlie put it. But if you want to do inference, you put in more memory and memory access relative to compute. So over time you are actually creating chips optimised for particular workloads and applications as we go along. And that, at the end of the day, is what will create the most effective models.

Speaker 4

这是一个需要端到端构建的平台体系。

It's a platform that you want to create end to end.

Speaker 3

补充一个历史背景:OpenAI创立初期,我们并未特别关注算力问题,当时认为实现AGI的关键在于创新理念和实验验证。我们以为只要概念框架到位,AGI就会水到渠成。直到2017年——也就是成立约两年后——我们才发现规模化才是取得最佳成果的途径。

And one piece of historical context is that when we started OpenAI, we didn't really have that much of a focus on compute. We felt that the path to AGI was really about ideas, really about trying things out. Eventually we'd put the right conceptual pieces in place, and then, AGI. And about two years in, in 2017, the thing that we found was that we were getting the best results out of scale.

Speaker 3

这并非我们刻意要证明的论点,而是通过实证发现的结果,因为其他方法的效果都相形见绌。最早期的突破来自我们在电子游戏DOTA2中强化学习技术的规模化应用。你们当年有关注过这个DOTA2项目吗?那确实是个非常酷的项目。

It wasn't something we set out to prove; it was something we really discovered empirically, because everything else didn't work nearly as well. And the first results were from scaling up our reinforcement learning in the context of the video game Dota 2. Did you guys pay attention to the Dota 2 project back in the day? Yes. It was a super cool project.

Speaker 3

我们确实看到你们实现了两倍规模扩展,突然间你们的智能体性能也提升了两倍。这就像,好吧,我们必须把这件事做到极致。那时我们开始关注整个生态系统,对吧?当时涌现了许多采用全新架构的芯片初创公司,它们与GPU截然不同。我们开始给予他们大量反馈,告诉他们我们认为行业的发展方向应该是这样的。

And we really saw you scale up by 2x and suddenly your agent is 2x better. It's like, okay, we have to push this to the limit. And at that point, we started paying attention to the whole ecosystem, right? There were all sorts of chip startups with novel approaches that were very different from GPUs. And we started giving them a ton of feedback saying, here's where we think things are going.

Speaker 3

模型必须采用这种形态。说实话,其中很多公司根本不愿听取我们的意见。对吧?所以当你处于这种境地时会非常沮丧——你明明看清了未来应该走向何方,却除了试图影响别人的技术路线图外,我们实际上没有能力真正改变什么。

It needs to be models of this shape. And honestly, a lot of them just didn't listen to us. Right? And so it's very frustrating to be in the position where you say, we see the direction the future should be going, but we have no ability to really influence it besides trying to influence other people's roadmaps.

Speaker 3

因此通过将部分环节内部化,我们感觉自己终于能够真正实现这个愿景。而且我们希望以这样的方式指明方向后,其他人会跟进完善——因为要实现我们对AGI的构想,现有计算能力远远不够。10吉瓦?杯水车薪而已。要实现目标所需,这点算力只是沧海一粟。

So by being able to take some of this in house, we feel like we are able to actually realize that vision. And again, in a way that we hope that we can show a direction and other people will fill in because the amount of compute required to bring our vision of AGI to the world, 10 gigawatts is not enough. That is a drop in the bucket compared to where we need to go. That's a big drop,

Speaker 0

但这个容器确实足够大。当你们为推理和训练自研芯片时,能实现哪些突破?你们能把这件事做到什么程度?

But the bucket's really big. What becomes possible when you're building your own chips for inference and for training? Where can you take this?

Speaker 1

宏观来看,如果把我们整个工作流程简化为:熔化沙子→注入能量→产出智能(当然不是字面意义的熔沙,这是个数字化比喻)。我们追求的是让每单位能量产出最大化的智能。

To zoom out a little bit: simplify what we do in this whole process to, you know, melt sand, run energy through it, get intelligence out the other end. You're not literally melting sand, but it's a nice digital metaphor. What we want is the most intelligence we can get out of each unit of energy.

Speaker 0

而且

And

Speaker 1

因为这终将成为关键瓶颈。我希望整个流程能证明——从模型设计到芯片再到机架,我们能让每瓦特电力催生出更多智能。这样所有以各种创新方式使用这些模型的人,都能从中获得巨大收益。这就是我的期望。

Because that will become the gate at some point. And I hope what this whole process will show is that, from the model we designed to the chip to the rack, we will be able to wring out so much more intelligence per watt. And then everybody that's using these models in all of these incredible ways will do so much more with it. That's what I hope for.

Speaker 4

你掌控着自己的命运。如果自主研发芯片,就能掌握自己的命运。

And you control your own destiny. If you do your own chips, you control your destiny.

Speaker 0

是的,想想我们今天所做的事情相当惊人且卓越,但我们使用的工具并非专为我们当前的使用方式而设计,这很有趣。

Yeah, it's interesting to think about how the things that we're doing today are pretty amazing, remarkable, but we're using stuff that wasn't necessarily designed specifically for the way we're doing it.

Speaker 1

当今的GPU确实令人惊叹,我非常感激,未来我们仍将大量需要它们。这种能让我们快速开展研究的灵活性和能力太棒了。但你说得对,随着我们对未来形态越来越有把握,针对工作负载高度优化的系统能让我们每瓦特电力发挥更大效能,这非常理想。

Oh, I mean, the GPUs of today are incredible, incredible things. I'm very grateful, and we will continue to really need a lot of those. The flexibility and the ability to let us do fast research is amazing. But you are right that as we get more and more confident in what the shape of the future is going to look like, a system very optimized to the workload will let us wring more out per watt. That's great.

Speaker 2

这是个需要数十年的漫长旅程。以霍克的铁路为例,它花了一个世纪才成为关键基础设施;互联网则用了约三十年。这不会在五年内完成,而是需要很长时间。

And it's a long journey that takes decades. If you go back to Hock's example and take railroads, it took about a century to roll them out as critical infrastructure. If you take the Internet, it took about thirty years. This is not going to take five years. It's going to take a long time.

Speaker 2

因此我认为,通过这种合作伙伴关系,当我们共同探索如何从中榨取更多算力时,会发现:对于某些训练或研究,GPU可能很出色;也可能发现与Greg合作的方式——这就像乐高积木平台,可以自由组合输入输出。突然间我们就能获得针对下一代训练、推理或研究的XPU或加速器。

So I think as we collectively, especially with this partnership, continue to figure out ways to wring more tokens out of it, we'll discover that, oh, for this training or research, maybe a GPU is great. Or maybe, you know what, we can take whatever we're doing with Greg: it's actually a platform that, like a Lego block, allows you to take things in and out. And suddenly we can get another XPU or accelerator for the next generation that's targeted at a training or an inference or a research workload.

Speaker 3

确实。正如Sam所说,GPU的发展令人难以置信。2017年我们开始研究其他加速器时,完全无法预见五到十年后的格局。这充分证明了像英伟达和AMD这样的公司,是如何推动GPU持续进步并保持主导地位的。但同时,设计空间依然广阔,对吧?

Yeah. And to the point Sam made, GPUs have really come an incredible way. In 2017, when we started looking at all these other accelerators, it was actually very non-obvious what the landscape would look like in five or ten years. And I think it's really a testament to companies like NVIDIA and AMD how much the GPU has moved forward and continued to be the dominant accelerator. But at the same time, there's a massive design space out there, right?

Speaker 3

我们看到的是现有平台无法满足的工作负载需求,这正是全栈垂直整合的独特价值所在。

And I think that what we see is workloads that are not served through existing platforms. And that's where that full vertical integration is something unique.

Speaker 0

这很有趣,因为将推理功能靠近用户的想法相对较新。我们已理解训练过程,但当你想到每天使用这些产品的用户数量,以及他们需要多少计算资源来完成娱乐或严肃任务时,再考虑到其规模——我们之前讨论过,我总忍不住重提——这确实是件大事。它会发展到哪里?计算资源是否会不断找到新的应用场景?

It's interesting too, because the idea that you'd want to put inference close to the user is relatively new. We've understood training, but then you think about the number of people using these products every day and how much compute they need to do fun things or serious things. And when you start thinking about the scale of it, which we talked about before and I keep coming back to, it's a very big thing. Where does it keep going? Is it just a thing where we're going to continuously find new things to use compute for?

Speaker 1

OpenAI拥有的第一个集群,我记得其能耗规模是两兆瓦。

The first cluster OpenAI had, the first one that I can remember the energy size for was two megawatts.

Speaker 3

真可爱。是啊。都忘了用那两兆瓦做过什么了。

Adorable. Yeah. I forget what we even did with those two megawatts.

Speaker 1

不记得何时达到20兆瓦,但记得突破200兆瓦的时刻。今年我们将以略超2吉瓦收尾,而近期合作将使这个数字接近30吉瓦。世界的发展远超我的预期——事实证明,2吉瓦就能支撑全球10%人口使用ChatGPT,同时进行Sora研究、API服务等多项工作。但想想世界还有多少未实现的潜在需求。

I don't remember when we got to 20. I remember when we got to 200. You know, we will finish this year a little bit over two gigawatts, and these recent partnerships will take us close to 30. And the world has done far more than I thought it was going to do. It turns out you can serve, you know, 10% of the world's population with ChatGPT, and do the research, and do Sora, and do our API and a few other things, on two gigawatts. But think about how much more the world would like to do than it gets to do right now.

Speaker 1

即便现在拥有300亿瓦特电力配合当前模型质量,我认为人类需求仍会迅速饱和,尤其是随着成本降低。但我们反复验证的是:假设GPT-6能比GPT-5实现30点智商飞跃,并能持续工作数日/周/月而非数小时,同时单位token成本持续下降——每次突破带来的经济价值和衍生需求都会呈爆发式增长。就像当ChatGPT刚能编写简单代码时,人们立刻找到了应用场景。

If we had 30 gigawatts today with today's quality of models, I think you would still saturate that relatively quickly in terms of what people would do, especially with the lower cost we'll be able to deliver with this. But the thing we have learned again and again is this: let's say we can push GPT-6 to feel like, you know, 30 IQ points past GPT-5, something big, and it can work on problems not for a few hours but for a few days, weeks, months, whatever. And while we do that, we bring the cost per token down. The amount of economic value, and the sort of surplus demand, that happens each time we've been able to do that goes up a crazy amount. So, to pick what I think is a well-known example at this point: when ChatGPT could write a little bit of code, people actually used it for that.

Speaker 1

用户曾艰难地粘贴代码等待响应,模型虽功能有限但已能处理部分需求。随着模型改进和用户体验优化,现在Codex正以惊人速度成长,能以更高水平完成数小时工作量——这种可能性直接引发了需求激增。

They would like very painfully paste in their code and wait and they would say, do this for me and paste it back in and whatever. And models couldn't do much, but they could do a few things. The models got better, the UX got better, and now we have Codex. Codex is growing unbelievably fast and can now do like a few hours of work at a higher level of capability. And when that's possible, the demand increase is crazy.

Speaker 1

或许下个版本的Codex能以顶尖工程师水平完成数日工作量,也许还需要迭代几个版本,但终将实现。想象这会产生多大需求,然后再将其复制到每个知识工作领域。

Maybe the next version of Codex can do, like, a few days of work at the level of one of the best engineers you know, or maybe that takes a few more versions, whatever; it'll get there. Think how much demand there will be just for that, and then do it for every knowledge-work industry.

Speaker 3

我喜欢这样理解:智能是经济增长的根本驱动力,是提升每个人生活水平的关键。我们通过人工智能所做的,实际上是在增加并放大每个人的智能。随着这些模型不断进步,我认为每个人的生产力都将提高。未来可能实现的成果将与今天截然不同。

And one way I like to think of it is that intelligence is the fundamental driver of economic growth, of increasing the standard of living for everyone. And what we're doing with AI is actually bringing more intelligence and amplifying the intelligence of everyone. And so as these models get better, I think everyone's going to become more productive. The output of what is possible is going to just be totally different from what exists today.

Speaker 0

同样有趣的是,从成本相对较高的GPT-3阶段发展到你们现在所处的GPT-5水平,而且你们能免费向公众提供这种服务。这是否成为你们的动力来源?因为每次创造这些新效能时,都能惠及更多人群。

It's interesting too, going from a point with GPT-3, which was pretty expensive compared to where you're at now at the level of GPT-5, to the fact that you can provide that freely to people. Is that a motivating factor for you, the fact that every time you create these new efficiencies, it benefits so many more people?

Speaker 4

确实如此。从我们硬件和计算能力的角度来看,在某种程度上这是真正见真章的地方。我们肩负着持续优化、突破尖端技术边界的责任。仍有进步空间,即便在2纳米工艺之后,随着我们开发各种新技术,还能继续突破比2纳米更小的极限。

Yes, absolutely. And from our side, on hardware and compute capacity, where to some extent the rubber hits the road on this, it's really incumbent on us to keep optimising, pushing the envelope on leading-edge technology. And there's still room to go. There's room to grow even from where we are, as we go from two nanometers onward to smaller than two nanometers and start doing all kinds of different technology.

Speaker 4

这确实是硬件和半导体行业激动人心的伟大时代。

It is really great and exciting times, especially for the hardware and the semiconductor industry.

Speaker 1

博通公司的成就确实令人惊叹。像我们这样的公司过去根本不敢想象能制造出具有竞争力的芯片——事实上难度太大我们根本不会尝试。我想很多其他公司也同样做不到。这类针对特定工作负载的定制芯片和系统本不该存在于世。

What Broadcom has done here is really quite incredible. It used to be extremely difficult for a company like ours to think about making a competitive chip. In fact, so hard that we just wouldn't have done it, and I think a lot of other companies wouldn't have done it either. And all of this, a chip and system customized to a workload, just wouldn't be a thing in the world.

Speaker 1

但他们如此努力且出色地推进技术,使得企业能够与他们合作,快速实现技术奇迹般的大规模芯片生产。虽然他们也为所有竞争对手提供服务,但希望我们的芯片能成为最出色的。

But they have pushed so hard and so well that a company can partner with them, and they can deliver a miracle of technology, a chip, quickly and at scale. Unfortunately, they do it for all of our competitors too, but hopefully our chip will be the best.

Speaker 0

是的,当然。

Yes, of course.

Speaker 3

这确实令人难以置信。而且我认为,不仅仅是它们今天能为我们做什么,展望未来的发展路线图,那些即将为我们所用的技术种类,实在令人无比兴奋。

It's really quite incredible. And I think also not just what they can do for us today, but looking at the upcoming roadmap, it's just so exciting, the kinds of technologies that they're going to be able to bring to bear for us to be able to utilize.

Speaker 4

嗯,正是这种能够实现联合协作模型的兴奋感,想想GPT-5、6、7乃至更远的版本。每一个都需要不同的芯片、更好的芯片、更先进的芯片,那些我们甚至还没开始研究如何实现的尖端芯片。但我们终将做到。

Well, it's just the excitement of jointly and collaboratively enabling models, GPT-5, 6, 7, on and on. And each of them will require a different chip, a better chip, a more developed chip, an advanced chip that we haven't even begun to figure out how to get to. But we will.

Speaker 3

实际上GPT系列必将成为其中越来越重要的部分。是的,这会非常有趣。

And actually, the GPTs are definitely going to be an increasing part of that. Yes. It will be very interesting.

Speaker 2

我们对此确实充满期待,因为我的软件工程师们已经从软件开发角度开始使用它了,它带来的效率提升相当于数十名工程师的工作量。

We're actually looking forward to that because my software engineers now already use that from a software point of view, and it's delivering efficiencies of dozens of engineers.

Speaker 1

真的吗?

Really?

Speaker 2

是的。

Yes.

Speaker 1

太棒了。

Great.

Speaker 2

在硬件方面,我们尚未达到目标。但好消息是

On the hardware side, we're not there yet. But, you know, the good news

Speaker 4

我们会实现的。

We'll get there.

Speaker 3

我们应该谈谈这个。是的。

We should talk about that. Yes.

Speaker 2

我们绝对应该利用这一点。但我想说的是,在计算方面,当我们开始构建这些XPU时,在800平方毫米的面积内最多只能集成一定数量的计算单元。仅此而已。而现在,我们实际上正在合作将这些计算单元以二维方式集成。接下来我们讨论的是将它们堆叠到同一芯片上。

we should absolutely leverage this. But I was going to say, with respect to compute: when we started building these XPUs, you can build at most a certain amount of compute in 800 square millimeters. That's it. Now, today, we're actually working together to stitch multiple of these together in a two-dimensional space. The next thing we're talking about is stacking these into the same chip.

Speaker 2

现在我们实际上正在向Y维度或Z维度发展,你可以想象成三维空间。最后我们还在讨论的是将光学技术引入其中,这正是我们刚刚宣布的——在同一芯片上集成100太比特的光学交换技术。这些技术将把计算能力、集群规模、整体性能及功耗提升到一个全新水平,我认为至少每六到十二个月就会翻一番。

Now we're actually going into the Y dimension, or the Z dimension if you want to think three-dimensionally. And then the last step we're also talking about is bringing optics into this, which is actually what we just announced: 100 terabits of switching with optics integrated into the same chip. These are the sort of technologies that will take compute, the size of the cluster, and the total performance and wattage of the cluster to a whole new level, and I think it will keep doubling at least every six to twelve months.

Speaker 0

我们讨论的是什么样的时间框架?什么时候能首次看到这种合作关系的成果?

What kind of timeframe are we talking about? When are we going to first start to see what's coming out of this relationship?

Speaker 1

明年年底。之后我们将在未来三年内快速部署。

End of next year. And then we'll deploy very rapidly over the next three years.

Speaker 4

绝对同意。

Absolutely.

Speaker 2

格雷格和我每周至少讨论一次这个问题。我们今天早些时候还就此进行了交谈。

Greg and I are talking about this at least once a week. We just had a chat earlier today on this.

Speaker 3

是的,今天进展不错。没错。不过我们确实很期待硅片能尽快开始生产。

Yes, good progress today. Yes, exactly. But yeah, we're really excited to get silicon back starting soon actually.

Speaker 2

是的,很快就能实现。

Yes, very soon.

Speaker 3

是的,我认为整个项目的难点在于它并不简单,对吧?随口说'哦,10吉瓦'很容易。但当你真正需要设计全新芯片、实现规模化交付、确保端到端全流程运作时,工作量简直是天文数字。我们必须非常认真对待——我们的使命是确保AGI造福全人类。

Yeah, I think that my view of this whole project is it's not easy, right? It's easy to just say, Oh yeah, 10 gigawatts. But when you look at what is required to actually design a whole new chip and to actually deliver this at scale, get the whole thing working end to end, it's just an astronomical amount of work. And I would say that we're very serious. Our mission is to ensure that AGI benefits all of humanity.

Speaker 3

我们真心希望惠及所有人。我们渴望这项技术能服务全球,让全人类受益。通过努力实现算力富足的世界就能体现这点,因为按现状发展,我们很可能面临算力匮乏的局面。

We're very serious about it benefiting everyone. Like, we really want this to be a technology that is accessible to the whole world, that lifts up everyone. And you can really see that in trying to make the world one of compute abundance. Because I think by default, we're heading towards one that is quite compute-scarce.

Speaker 0

你可以问我妻子——当她试图获取更多Sora额度时,她深切感受到了资源紧缺。

You can ask my wife when she's trying to get more Sora credits. It feels very scarce to her.

Speaker 3

不,不。我们对此感受非常具体。OpenAI内部团队,他们的产出直接取决于获得的计算资源。因此,关于谁能获得计算资源分配的竞争激烈程度令人咋舌。我认为我们真正向往的是这样一个世界:只要你有想法、想创造、想构建些什么,背后就有足够的算力支持你实现它。

No, no. We feel it so concretely. For teams within OpenAI, their output is just a direct function of how much compute they get. And so the intensity around who gets the compute allocation is extreme. And so I think what we really want is a world where, if you have an idea, you want to create, you want to go build something, you have the compute power behind you to make it happen.

Speaker 0

先生们,非常感谢你们与我们分享这些。看到未来发展会非常令人兴奋,希望随着项目推进我们能继续讨论这个话题。

Gentlemen, thank you very much for sharing this with us. It's going to be very exciting to see where this goes, and I hope we can keep talking about it as it continues to develop.

Speaker 1

谢谢。感谢各位的合作伙伴关系。

Thank you. Thank you guys for the partnership.

Speaker 2

非常感谢这次合作。

Thank you so much for the partnership.

Speaker 4

我们真的很享受这个过程。

We're really enjoying it.

Speaker 3

我们也是。

We are too.

Speaker 1

谢谢。

Thank you.
