OpenAI COO Brad Lightcap: GPT-5's Capabilities, Why It Matters, and Where AI Goes Next

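Episode description

Brad Lightcap is the COO of OpenAI. Lightcap joins Big Technology to discuss the release of GPT-5, how it works, how it differs from previous models, and whether it counts as AGI. We also cover scaling laws, post-training breakthroughs, enterprise adoption, healthcare applications, pricing strategy, and the company's path to profitability. Hit play for a front-row seat to OpenAI's thinking about the future of AI. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack + Discord? Here's 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Questions? Feedback? Write to: bigtechnologypodcast@gmail.com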

Transcript

Speaker 0

GPT-5 is here, and OpenAI COO Brad Lightcap is with us to break down the new model's capabilities, what it means for the AI business, and what's next for this promising technology. Brad, it's so great to see you. Thank you for joining us on an emergency episode of Big Technology Podcast.

Speaker 1

My pleasure. Thanks for having me.

Speaker 0

Alright. So briefly, I just want you to talk a little bit about what GPT-5 is. Maybe within sixty seconds or so, can you talk about what it is and how it improves on previous OpenAI models?

Speaker 1

Yeah, so GPT-5 is our next-generation flagship model. It does something really interesting, which is that it combines into one model the ability to dynamically choose whether to think hard about a problem and reason about it to give you an answer, or not. You'll remember that previously you had to go deal with the model picker in ChatGPT, everyone's favorite thing. You had to select a model that you wanted to use for a given task, and then you'd run the process of asking a question and getting an answer. Sometimes you'd choose a thinking model, sometimes you wouldn't.

Speaker 1

And that was, I think, a confusing experience for users. GPT-5 abstracts all of that. It makes that decision for you, and it's actually a smarter model, so you're going to get a better answer in all cases, regardless of whether you're using the thinking mode or not. And it's vastly improved on things like writing, coding, and health.

Speaker 1

It's much more accurate, it's much faster. And so all around, we think, a better experience.

Speaker 0

Now, for those of us who've been following the hype, I think we probably imagined you would lead with "this is an explosive increase in intelligence" rather than "there's a switcher on the model that will go to reasoning or non-reasoning when it makes the most sense." So can you explain the disconnect there, and why you lead with usability rather than the intelligence increase?

Speaker 1

Yeah, because intelligence really is a function of how much time the model is going to spend thinking. Depending on how much thinking time you want to allocate to a problem, you're going to get a better answer. Typically, the longer it thinks, the better an answer it can give you. So when we test the model on certain benchmarks and evals and we allow it to think, it will dramatically outperform any of our existing models by far. And even if you don't allow any thinking time, you still typically get a better answer than you would from one of our non-thinking models like GPT-4.1.

Speaker 1

So it is a dramatic improvement in intelligence. It should be, I think, a better-quality model across pretty much all dimensions. But that reasoning time, and being able to use the reasoning time dynamically to think, we think is actually the important part. It makes for a much better user experience.

Speaker 0

Now I'm going to parse your words a little bit. You said that it is a dramatic improvement over previous models. Sam, in a press call, said that GPT-5 is a pretty significant step over 4.0. Simon Willison, who's been using your model for a little bit, says it doesn't feel like a dramatic leap ahead from other LLMs, but it exudes competence. It rarely messes up and frequently impresses me.

Speaker 0

I'm just setting this up because I'm curious whether you would say that this model is an exponential increase in capabilities or an incremental increase in capabilities.

Speaker 1

You know, it's hard to measure it that way. I think we're now into this regime of having to measure intelligence across a lot of different dimensions, which isn't a way to dodge the question so much as it is to explain why GPT-5 is such a special model. Obviously, it's better at the core things that you'd expect it to be better at. It scores better on things like SWE-bench. It scores better on all the academic evals that we put it through.

Speaker 1

With this one in particular, we made a real emphasis on having it score better on certain health benchmarks. So it's better at medical reasoning and other health-related things. But there are a lot of things that go into what makes a model good now, because you have a lot of dimensions to play with depending on how that model is trained and how it can think about problems. If it's faster, for example, we think that's actually indicative of it being better. If it can give you a better answer per unit of time spent thinking, we think that's an improvement, and an important vector to measure as well.

Speaker 1

Whether it can do things like structured thinking, problem solving, and tool use: all these are things we actually measure, and they're kind of invisible to users. If you're just using ChatGPT, you don't necessarily appreciate each of these things happening under the hood. But all those things are better for GPT-5 than they were for our previous models.

Speaker 0

Right. And the reason why I'm asking is because I think a lot of people have pointed to the leaps from the original GPT to GPT-2, GPT-2 to GPT-3, and GPT-3 to GPT-4, and one of the things people have seen is just a general increase in capabilities across the board. There were no caveats (and maybe there's a reason for those caveats) of, you know, there are intelligence increases in this place and that place. It was: we trained a bigger model, I'm pretty sure this is what it was, and it's better across the board. So have things changed?

Speaker 1

They've changed, yeah, from a technical perspective. I think when you go from GPT-2 to GPT-3, and 3 to 4, these were really just exploits of what was and is the scaling paradigm of pretraining bigger and bigger models. It's one vector of training, and you get a better model as a result. That continues to hold true, but we now have this other category of training, which is post-training, and being able to use test-time compute in more interesting ways than we used to, as almost a second stage of training. We think that gives us a bit of a boost, a force multiplier on our ability to push the model toward new intelligence levels, and also to train into it a lot of the things that you want an intelligent model to be able to do.

Speaker 1

So using tools, for example, is something that we think is really important for overall intelligence. GPT-2 and 3 couldn't really do that as well. GPT-4 could do it in a more nascent way. And now with GPT-5, you get that baked in, with the benefit of these multi-step and longer-horizon reasoning processes. So, yeah, we want to abstract that from users.

Speaker 1

Obviously, we don't think that you as a ChatGPT user should have to stop and think about that. In some sense, I think the model picker being a point of frustration for people was an expression of the fact that people don't necessarily want to have to make those decisions every time they talk to an AI model. They kind of want the model to make those decisions for them. And so that's why we think GPT-5 is a big step.

Speaker 0

Going back to increasing the scale of pretraining and that delivering predictable improvements in model performance: yes, now post-training is in the picture, and it's making models better in really impressive ways. But are you of the belief, and is OpenAI of the belief now, that there are diminishing returns from pretraining, given that we're now talking about different forms of training these models?

Speaker 1

Not at all. Our scaling laws still hold. Empirically, there's no reason to believe that there's any kind of diminishing return on pretraining. And on post-training, we're really just starting to scratch the surface of that new paradigm. The o-series of models, which were the previous reasoning models, were really just the beginning of us starting to explore what's possible in that post-training regime.

Speaker 1

And I think that's going to be the dominant theme here for the next year or two: continuing to scale in that dimension and continuing to see the gains that you get there, simply because they're so significant. So now we're pushing on two axes for how to improve models, and we think that's going to tighten and condense the rate of innovation.

Speaker 0

So, do you believe that the vast majority of improvements from here are going to come from scaling or from algorithms?

Speaker 1

I think it'll be a combination. It's always a combination, right? It's always algorithms, scale, compute, and data, right? And so we push on all three. They all play a really important role, I think, in how we look at the future. And then the hard part, obviously, is having them come together.

Speaker 1

Being able to train larger models typically requires that you train on more data and, obviously, with more compute. So there's a delicate balance between those things, because just scaling up doesn't necessarily mean that in all cases you're going to get the same corresponding rate of improvement. You have to be able to bring those other pieces along too. So it's not like we push one button or the other. We make a really conscientious effort to try to pull all of those together.

Speaker 0

Okay. And you're not calling it AGI, and I have to say I've lost a bet on this show, because I was listening to Sam on the Theo Von show. He said GPT-5 is smarter than us in almost every way, and I said, alright, well, that sounds like what you would imagine AGI would be. And then GPT-5 comes out yesterday, and as the release happens, Sam says, "I kind of hate the term AGI because everyone at this point uses it to mean a slightly different thing, but this is clearly a model that is generally intelligent." Help me understand what's going on, because it seems like maybe he wants to call it AGI, but you're not yet.

Speaker 0

So why is this not AGI?

Speaker 1

Well, it is a hard thing to define. The joke here is that if you ask five people what AGI is, you'll get seven answers. And I think the way we look at it is that it's a cumulative process. Right? It's a system.

Speaker 1

And I think you have to define what that system is and what you expect it to be able to do. For me, at least, that's a system that is reliably able to learn new things that are out of distribution, by virtue of its ability to reason, to think, to solve problems, to use tools, to come up with new ideas. So do I think we're at a system that I would call AGI? No. But I think we start to see the traces and the pieces of that overall system for generalized learning start to come together in models like GPT-5, and I suspect in its successors.

Speaker 1

I don't know if we'll have a point where we're like, okay, we've crossed from a non-AGI world into an AGI world. And even if there were, I'm not sure we'd actually realize it until after the fact, because one of the things we've learned working with the models that we have is that the capability overhang is significant. I think when Sam refers to the intelligence of the models and having a PhD in your pocket, we haven't yet really exploited that. In some sense, I think you could pause AI progress right here for ten years, and you'd still have about a decade's worth of new products to get built and new ways for people to figure out how to use the models, even a GPT-5-level model, in interesting products and interesting processes.

Speaker 1

And one of the interesting things is that, I think, as the models get smarter, they almost demand more from a product-building perspective in terms of how you actually plug them into the system. I always roughly analogize it to this: you could have a really, really smart intern, and at the end of the day, they're only capable of doing a few things for you. They can take notes in meetings. They can write summaries. They can pull basic analyses together.

Speaker 1

But if you bring a PhD to work, that person has a tremendous capability set. They may not be totally effective on the job on day one, but your job is to really figure out how to expose them to enough context and enough information, and give them the right tools, to make them really effective later on. And that process of getting them to their full effectiveness actually takes longer than it would for an intern. I think it's going to be similar with AI models. So it is a continuous process, and I don't think it will be linear. But where we are today, I would say we're probably not quite yet at something I would call an AGI-level system.

Speaker 0

Yeah, and it brings up such an interesting question, which is: does it really make sense to try to make the models smarter from here? Or is it about trying to build those ancillary capabilities? I think Sam mentioned this on the media call: GPT-3, he said, was high-school-level intelligence, GPT-4 maybe the level of a college student, and GPT-5 an expert. So I guess I wonder, for OpenAI, is the quest to add more intelligence to the mix, or is it to focus on capabilities other than smarts? Some of the things that you mentioned, like memory and continual learning.

Speaker 1

It's going to be, I think, all of those things. Certainly, there are some unsolved problems. You mentioned a few here, and I would agree with those: things that you'd expect a really smart person to do by default that our models still struggle with. So there's open research there that we still have to do, I think, to close the loop on what I would call the full spectrum of intelligence. But, like we were talking about earlier in the podcast, intelligence expresses itself in a lot of different ways.

Speaker 1

Part of it is just your pure IQ. It's your knowledge of how things work and your ability to recall information. But then it's also your ability to reason about how to use other tools to solve problems. It's your ability to be reflective, to look back on your own chain of thought, your own line of thinking, and actually course-correct when you feel like, you know, I actually went down the wrong path, and maybe I didn't come up with the right strategy to solve this problem. And that's one of the cool things we see: on those vectors, we can reliably measure GPT-5 as better than the previous systems we had.

Speaker 1

And for us, I think one of the things that we really want to understand is how they actually perform in the real world. How do developers use these models? How do enterprises use these models to apply them to existing problems, real-world problems, and do the next models do better than the last models? So for us, I think the real-world benchmark is increasingly becoming important as a sign of intelligence relative to the academic benchmarks.

Speaker 0

And how big of a priority is continual learning within OpenAI?

Speaker 1

We have a lot of priorities. I think certainly that's among them. We feel really good about our research trajectory.

Speaker 0

But middle, low priority?

Speaker 1

It's hard to... You know, the cool thing about OpenAI is the way that we have, I think, systematized being able to do research, and this has really been true from the early days of the company. I joined OpenAI in 2018, and we take this highly exploratory approach to research. We're very much not top-down in how we approach research, where there's one idea and everyone just gloms on to that one idea and we do one thing at a time. What we really do is a lot of open-ended exploration in small teams. We explore different paths and see if those lead to new ideas that we then cycle back into the core idea, the main line of ideas, if they work.

Speaker 1

And if they don't, we recombine those teams into other ideas that seem to be working and then allow new ideas to offshoot from there. So it really is feeling around in the dark a little bit. When you find that patch of grass where you're like, okay, we might be on the right path here, you bring everyone to that point and then let everyone feel around a little more. And I think that's how it has to work. I think it's really hard to know these things a priori.

Speaker 1

I think you can have intuition, and I think our researchers tend to have better intuition than the average, but it really is still scientific exploration.

Speaker 0

Now I want to talk about how your Plus subscribers, or the people who are using these chatbots, will feel the improvements when using ChatGPT. There's an interesting comment from Ethan Mollick, the Wharton professor, who is also experimenting with GPT-5. He says, I think it's a big step forward, but not an unexpected one if you've been following the curve. He says these models got gold at the Math Olympiad this week; I'm losing track of what massive advances mean.

Speaker 0

All the models are improving very quickly right now. The question is, if you have a model that's capable of college-level biology and then it goes to graduate-level biology, the average chatbot user may not feel that, even though it's gotten much smarter. So I guess I'm curious how you think the increased smarts will be reflected in the average user's ChatGPT experience, and in the experience of the Plus users who've been using these reasoning models for a while. Is it going to feel any different for them?

Speaker 1

Yeah. I saw something on X that was akin to what you're describing, where someone basically said: I think for the upper echelon of ChatGPT users, who are probably in the paid tiers, who are very active on a daily basis and are really expert-level at using these systems, it's going to feel like an improvement, but maybe a more subtle improvement. But for the average user, for the free user (and we're bringing GPT-5 to our free tier), it will feel like a dramatic increase. If you actually look at the way free users have used ChatGPT, most of them have not experienced the power of the reasoning models. They are mostly using GPT-4o, and they're mostly using it in these turn-based, very quick, back-and-forth, almost search-like ways that I think don't actually express the full capability of the model.

Speaker 1

So for a lot of people, this will be the first time using a model that has reasoning capability. And not only will it be the first time using it with reasoning, but it'll be the first time that they're experiencing a model making a decision about how long to think about a problem and how good an answer to give relative to how hard the question is. So we expect that, yeah, for the average user, it will feel dramatically different. Maybe for the upper echelon of power users, it may not feel as different. So I would agree with that, and I think that's a natural thing.

Speaker 1

I think that's actually a good thing. If you've been following the rate of AI progress and you're exploiting the frontier at every point, yes, it probably is dizzying, but it starts to feel more continuous than if you're using what is basically the best model from a year or two ago.

Speaker 0

Right. I think you're so spot on that the average user is using it as a version of search. And when they speak to me, they're like, well, what should I use AI for? I'm like, just upload stuff and start talking to it about the things you upload. I had a friend who was uploading pictures of his son's football practice and asking it for coaching tips.

Speaker 0

And he was fairly blown away that this thing was giving some real analysis of positioning. I mean, I wouldn't use it as a football coach, but I do think that as the average user gets into these capabilities, it's going to be fairly mind-blowing.

Speaker 1

Yeah. Everyone's got a little bit of a different entry point, and that's the cool thing about it: it's really personal for everybody. We focused on health a lot with this release because one of the consistently common things that we heard from people, as a starting point for how they've used powerful AI, was navigating a health journey. So we really wanted to make an effort to ensure that if people are going to be using AI systems for health-related things, we can serve them the best possible model. That was a big push in training GPT-5.

Speaker 0

Yeah, you brought up health a couple of times. Do you want this to replace a GP? I mean, a lot of people are really underserved with health care, but I kind of worry about handing them a model that can hallucinate and saying, "This is the substitute now."

Speaker 1

I don't think it'll replace GPs. But what I think it helps people do is have more agency in their journey, a little bit more control over the process of managing care. It also gives people an awareness of the condition. We hear stories all the time of people managing conditions that they didn't really understand because no one actually took the time to explain it to them. And that's not because anyone did anything wrong.

Speaker 1

It's just because the health care system as it's designed doesn't allow time for people to understand what it is that they're managing. So even just giving people that baseline of education: this is the condition you're managing. Is this common? It's going to express in these ways. You're going to feel these types of symptoms.

Speaker 1

That's a huge unlock just in people's psychology for what it means to be managing a disease. I think you still have to work with a GP for care, or a specialist for care. But having something that can handhold you through that journey, I think, is really comforting for a lot of people, and in a lot of cases has actually proven to be helpful. Obviously, we want to make sure that the model is as accurate as possible. So being able to push the model's capability in that domain specifically has been a big area of focus.

Speaker 1

But we think now with GPT-5, and obviously with future models, we've consistently seen the rates of accuracy go up and the rates of hallucination go down. GPT-5, I think it depends on how you measure it, but it's four to five times more accurate than its predecessors. And that may be more accentuated in health; I don't know off the top of my head. So we have a lot of control, I think, and are pushing in the right direction on being able to make them reliable and accurate.

Speaker 0

It's pretty interesting that we're talking about things so far beyond the chatbot. Of course, there's the chat function, but there's coding, there's health, and then of course there's enterprise, or the way that businesses use these models. Businesses are notoriously slow at implementing this technology, and I'm sure there are so many approvals and reviews, and it's tough to get things out the door. But I do think, and this is sort of my belief, that when you have better models, you're able to push that forward much faster and much more effectively. So talk a little bit about what a better model in GPT-5 will enable on the enterprise front or business front.

Speaker 1

Yeah, I would agree with your assessment there. In many ways, I always say we haven't yet seen the ChatGPT moment, I think, in business for AI. I think AI was an amazing tool for consumers, where your search space, so to speak, is more narrow, and you've got a more constrained problem. You've got, obviously, a much more narrow context that you're processing, and you can take things turn by turn with very, very few external dependencies. And you really just let the model's pure intelligence shine.

Speaker 1

Businesses are a different category of difficulty. You've got complex business processes. You've got a lot of multi-user dependency. You've got a lot of context that you have to process. You've got a lot of tools that have to be brought to bear.

Speaker 1

Those tools have to be used in succession, in certain ways, with certain guardrails. And there's not as much fault tolerance for when they don't work. So it goes back to what we were talking about earlier. I think when you look at models like GPT-5 and the impact that they're going to have in business, it is that baseline of capability that's moved up. It's their ability to use tools, to think in a structured way, to solve problems, to recursively correct their own mistakes, to do long-context retrieval, things like that. These little things do matter on the edge.

Speaker 1

普通ChatGPT用户日常感知不到这些,但开发者与企业会逐渐体会。我们与大小企业测试GPT-5时获得大量反馈(如优步、安进、Harvey、Cursor、Lovable、JetBrains等),它们的用例高度依赖模型可靠调用工具、处理长上下文及有效推理的能力。

And you don't feel them every day in ChatGPT as an individual user, but you will start to feel them as a developer or an enterprise. And we see this anecdotally too. I mean, we've worked with large enterprises and small startups and the entire spectrum in between on testing these models, and GPT-5 specifically, before release. And we get a lot of feedback from companies like Uber and Amgen and Harvey and Cursor, Lovable, JetBrains. I mean, all companies that have use cases that are highly sensitive to the model's ability to reliably call tools, to deal with long context, and to problem-solve and reason effectively.

Speaker 1

这将是席卷企业界的浪潮,关键在于合作开发者能否理解这些改进,并将其落实到应用构建中。

And so it's a rising tide, I think, across the enterprise, and it's really going to be on the developers we work with to understand the difference and the improvement and then implement them in the applications that they're building.

Speaker 0

确实很有趣了解到你们已经与许多公司合作,让他们使用GPT-5了。那么是否存在一种统一的、以前模型无法实现而GPT-5现在能做到的能力?还是说这些能力是分散在不同领域的?

Yeah, it is interesting to know that you have already been working with many companies and letting them use GPT-5. So has there been a unified "we couldn't do this with the previous models, but we can do it now with GPT-5," or is it sort of spread out in terms of the capabilities that it's now enabling?

Speaker 1

我认为这是整体水平的全面提升。与我们合作的所有公司现在都已习惯对所有使用模型进行评估和基准测试,但大家都反馈在这些评估中GPT-5的表现有显著且持续的提高。有几个领域表现尤为突出,比如编程。

I would say it's been a rising tide across the board. All the companies that we work with are typically now pretty accustomed to evaluating and benchmarking performance across all the models that they use, and everyone has reported much higher, consistently higher performance on those evals. There are a few areas in particular where we've seen spikes. So one is coding, for sure.

Speaker 1

我提到过像Cursor、JetBrains、Windsurf、Cognition等合作伙伴,他们都表示GPT-5是目前最强大的编程模型,无论是在交互式编程环境还是代理式编程环境中。另外我们注意到它在技术领域的推理和问题解决能力也有显著提升。以Harvey AI为例,它服务于律师事务所,非常依赖其可靠、准确且一致地呈现案件和法律分析的能力,这正是法律分析所需的结构化思维。我预计这种优势会延续到金融服务领域——这个对数据分析、研究和规划要求极高的行业。

I mentioned companies like Cursor, JetBrains, Windsurf, Cognition, and others that we work with who, anecdotally, have all said that GPT-5 now feels like the most capable coding model, whether that's in an interactive coding environment or more of an agentic coding environment. And then one of the things that we see consistently now is that its ability to reason and problem-solve in very technical domains is significantly improved. Harvey is a great example of that: you've got Harvey AI working with law firms, and it is very reliant on the model's ability to reliably, accurately, and consistently portray the cases it's looking at and to provide the level of structured thinking you want when you're doing legal analysis. And so I expect we'll see that carry over. I mean, financial services is a very interesting area: heavy on data analysis, heavy on research, heavy on planning.

Speaker 1

这些都是我们看到进步的领域。随着GPT-5在市场上进一步渗透,我们将获得更多反馈并持续优化这些应用场景。

Those are all areas that we've seen improvement in. And so as we continue to see GPT-5 permeate the market, we'll get more and more of that feedback and can then continue to improve on those use cases.

Speaker 0

关于定价呢?现在输入token的成本是GPT-4o的一半,输出token与GPT-4o相同。这种降价会催生更多应用场景吗?另外,在你们今年融资480亿美元的情况下,降低成本如何与投资者期望的回报相匹配?

And how about pricing? Because an input token is half the cost of GPT-4o, and an output token is the same. Are these lower costs going to help enable more use cases? And on that note, how does lowering costs sync with the fact that you've raised, or announced, something like $48 billion in funding this year? Is it really possible to lower costs and deliver on what the investors are expecting on that front?

Speaker 1

OpenAI历史上每次降低成本后,使用量的增长通常都超过了降价幅度。只要这个趋势持续,我们就会继续降低模型成本。开发者需要在延迟、模型质量与智能度、价格三者间权衡,而GPT-5系列(包括标准版、迷你版和纳米版)正是我们在市场反馈基础上,对这三个维度进行优化后的产品。

Yeah. So in OpenAI's history, every time we've cut costs, we've typically seen some corresponding increase in consumption that usually outweighs the cost cut. And so for as long as that trend holds, we will continue to cut costs on models. We know that there's this complicated dance that developers have to do between latency, model quality and intelligence, and price. And what we've tried to do here, basically, is take the market's feedback on all three of those fronts and place these GPT-5 models, not just the standard model but also the mini model and the nano model, on this frontier of quality, cost, and latency that optimizes for what we think the market needs to be successful.

Speaker 1

我们设定了极具吸引力的价格目标和平均延迟水平,同时保持了GPT-5固有的模型质量与智能。随着我们不断推进这个边界,人们会发掘更多应用场景。这种良性循环让我们倍感幸运,也激励我们努力让它们变得更好。

And so we tried to find a really attractive price target at a very attractive average latency, and then obviously with the built-in model quality and intelligence you get with GPT-5. And so we will continue to push that frontier. And I think the more we push that frontier, typically, the more we see people want to use it for more things. And so for that equation to exist, we're very fortunate, and it motivates us to try and make them better.

Speaker 0

你们未来能实现盈利吗?

Are you ever going to be profitable?

Speaker 1

希望如此。

I hope so.

Speaker 0

好吧,暂且接受这个回答。Brad,结束前让我抢先问一句:GPT-6什么时候发布?

Okay. We'll take it. All right, Brad, before we wrap, let me be the first to ask you: when is GPT-6 coming?

Speaker 1

嗯,你不是第一个这么问的人。可以告诉你,但我

Well, you're not the first to ask. Could tell you, but I

Speaker 0

知道 对,开玩笑的

know Yeah, kidding

Speaker 1

不。推特上对这事反应很快。但我的意思是,就像我说的,我们认为GPT5能力非凡。我们认为未来会有更好的模型。我们知道未来一定会有更好的模型。

No, Twitter is quick on the trigger on that one. But no, I mean, look, like I said, we think GPT-5 is extraordinarily capable. We think there will be better models in the future. We know there will be better models in the future.

Speaker 1

目前,我们只专注于如何让这个模型触达用户?如何支持那些与我们共建的公司使用这个模型?我们仍处于科学探索阶段。我认为最激动人心的部分在于,我们还在起步阶段,连我们自己都还在理解所处的范式。所以这是重要的第一步。

For now, we're just focused on how do we get this in people's hands? How do we support the companies that are building with us using this model? And then we're still in the science of it. I think that's the exciting part: we're in the first inning of it, and we ourselves are just understanding the paradigm we're in. And so this is, I think, an important first step.

Speaker 1

你需要先认清现状,才能明白未来方向。希望从GPT5汲取的经验能让GPT6更出色。

And you kind of have to understand where you are to understand where you're going. And hopefully the learning from this will make GPT-6 much better.

Speaker 0

布拉德,今天能邀请到你真是太棒了,尤其是在GPT5发布日。等GPT6问世时,我们得再聚一次。非常感谢你的参与。

Well, Brad, it's so great to have you on, especially today, on GPT-5 launch day. So whenever GPT-6 comes, we'll have to do it again. Thank you so much for joining.

Speaker 1

期待下次

Look forward to it.

Speaker 0

好了各位,GPT5已上线。你可以在chat.com试用,它将逐步向所有人开放。快去体验吧,明天我和兰詹·罗伊将继续深入讨论本周新闻,特别是GPT5的最新进展。感谢收听,我们下次《大科技播客》再见。

Alright, folks, GPT-5 is out. You can try it on chat.com, and it's going to roll out to everybody, so give it a look. We'll be back to talk more about it tomorrow, when Ranjan Roy and I will break down the week's news, especially the latest on GPT-5. Thanks, everybody, for listening, and we'll see you next time on Big Technology Podcast.
