Big Technology Podcast: OpenAI COO Interview on GPT-5's Technical Breakthroughs and Future Plans

Episode description

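Brad Lightcap is the chief operating officer of OpenAI. Lightcap joins Big Technology Podcast to discuss the launch of GPT-5, how it works, how it differs from previous models, and whether it counts as AGI. We also cover scaling laws, post-training breakthroughs, enterprise adoption, health care applications, pricing, and the company's path to profitability. Hit play for a front-row seat to how OpenAI thinks about the future of AI.

---

Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice.

Want a discount for Big Technology on Substack + Discord? Here's 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b

Questions? Feedback? Write to: bigtechnologypodcast@gmail.com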

Transcript

Speaker 0

GPT-5 is here, and OpenAI COO Brad Lightcap is with us to break down the new model's capabilities, what it means for the AI business, and what's next for this promising technology. Brad, it's so great to see you. Thank you for joining us on an emergency episode of Big Technology Podcast.

Speaker 1

My pleasure. Thanks for having me.

Speaker 0

All right. Briefly, I'd like you to talk a little about what GPT-5 is. Within sixty seconds or so, can you explain what it is and how it improves on previous OpenAI models?

Speaker 1

Yeah. GPT-5 is our next-generation flagship model. It does something really interesting: it combines into one model the ability to dynamically choose whether or not to think hard about a problem and reason about it before giving you an answer. You'll remember that previously you had to deal with the model picker in ChatGPT, everyone's favorite thing. You had to select the model you wanted to use for a given task, and then you'd go through the process of asking a question and getting an answer.

Speaker 1

Sometimes you'd choose a thinking model, sometimes you wouldn't. That was, I think, a confusing experience for users. GPT-5 abstracts all of that away. It makes that decision for you, and it's actually a smarter model, so you're going to get a better answer in all cases, regardless of whether it uses the thinking mode or not.

Speaker 1

And it's vastly improved on things like writing, coding, and health. It's much more accurate and much faster. So all around, we think it's a better experience.
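The routing behavior described above, where one system decides per request whether to spend time reasoning, can be illustrated with a toy sketch. This is not how GPT-5 works internally, which the interview does not describe; the cue words, the threshold, the token budget, and every name here (`estimate_difficulty`, `route`, `Route`) are hypothetical, and a production router would presumably be a learned model rather than a keyword heuristic.

```python
from dataclasses import dataclass


@dataclass
class Route:
    mode: str              # "fast" or "thinking"
    reasoning_budget: int  # hypothetical cap on reasoning tokens


# Hypothetical cue phrases that hint a prompt needs multi-step reasoning.
HARD_CUES = ("prove", "debug", "step by step", "derive", "optimize")


def estimate_difficulty(prompt: str) -> float:
    """Toy difficulty score in [0, 1] built from cue phrases and prompt length."""
    text = prompt.lower()
    cue_hits = sum(cue in text for cue in HARD_CUES)
    length_score = min(len(text.split()) / 200, 1.0)
    return min(0.3 * cue_hits + 0.5 * length_score, 1.0)


def route(prompt: str, threshold: float = 0.25) -> Route:
    """Answer directly for easy prompts; allocate a reasoning budget for hard ones."""
    score = estimate_difficulty(prompt)
    if score < threshold:
        return Route("fast", 0)
    # Harder prompts get a larger (made-up) reasoning budget.
    return Route("thinking", int(1000 + 7000 * score))


if __name__ == "__main__":
    for question in (
        "What is the capital of France?",
        "Prove that the sum of two even numbers is even, step by step.",
    ):
        print(question, "->", route(question))
```

The point of the sketch is only the shape of the decision: the user asks one question of one system, and the choice between a quick answer and a longer reasoning pass happens behind the interface instead of in a model picker.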

Speaker 0

Now, for those of us who've been following the hype, I think we probably imagined you would lead with "this is an explosive increase in intelligence" rather than "there's a switcher on the model that goes to reasoning or non-reasoning when it makes the most sense." Can you explain the disconnect there, and why you lead with usability rather than the intelligence increase?

Speaker 1

Yeah, because intelligence really is a function of how much time the model spends thinking. Depending on how much thinking time you allocate to a problem, you're going to get a better answer. Typically, the longer it thinks, the better the answer it can give you. So when we test the model on certain benchmarks and evals and allow it to think, it dramatically outperforms any of our existing models. And even if you don't allow any thinking time, you still typically get a better answer than you would from one of our non-thinking models like GPT-4.1.

Speaker 1

So it is a dramatic improvement in intelligence. It should be, I think, a better-quality model across pretty much all dimensions. But that reasoning time, and being able to use it dynamically, is what we think is the important part. It makes for a much better user experience.

Speaker 0

Now I'm going to parse your words a little. You said it is a dramatic improvement over previous models. Sam, on a press call, said GPT-5 is a pretty significant step over GPT-4o. Simon Willison, who's been using your model for a little while, says, "It doesn't feel like a dramatic leap ahead from other LLMs, but it exudes competence. It rarely messes up and frequently impresses me."

Speaker 0

I'm setting this up because I'm curious whether we could say, or whether you would say, that this model is an exponential increase in capabilities or an incremental one.

Speaker 1

You know, it's hard to measure it that way. I think we're now in a regime where we have to measure intelligence across a lot of different dimensions, which isn't a way to dodge the question so much as a way to explain why GPT-5 is such a special model. Obviously, it's better at the core things you'd expect it to be better at. It scores better on things like SWE-bench. It scores better on all the academic evals we put it through.

Speaker 1

With this one in particular, we made a real emphasis on having it score better on certain health benchmarks, so it's better at medical reasoning and other health-related things. But a lot goes into what makes a model good now, because you have many dimensions to play with depending on how the model is trained and how it can think about problems. If it's faster, for example, we think that's indicative of it being better. If it can give you a better answer per unit of thinking time, we think that's an improvement and an important vector to measure too.

Speaker 1

Then there are things like structured thinking, problem solving, and tool use. These are all things we actually measure, and they're largely invisible to users. If you're just using ChatGPT, you don't necessarily appreciate each of these things happening under the hood. But all of them are better in GPT-5 than they were in our previous models.

Speaker 0

Right. The reason I'm asking is that a lot of people have pointed to the leaps from the original GPT to GPT-2, GPT-2 to GPT-3, and GPT-3 to GPT-4. One of the things people saw was a general increase in capabilities across the board. There were no caveats, and maybe there's a reason for the caveats now, but there were no caveats like "intelligence increases in this place and that place." It was: we trained a bigger model.

Speaker 0

I'm pretty sure that's what it was, and it was better across the board. So have things changed?

Speaker 1

They've changed, yeah, from a technical perspective. When you go from GPT-2 to GPT-3, and GPT-3 to GPT-4, these were really just exploits of what was, and is, the scaling paradigm of pretraining bigger and bigger models. It's one vector of training, and you get a better model as a result. That continues to hold true, but we now have this other category of training, which is post-training, and being able to use test-time compute in more interesting ways than we used to, as almost a second stage of training. We think that gives us a bit of a boost, a force multiplier on our ability to push the model toward new intelligence levels, and it also lets us train into the model a lot of the things you want an intelligent model to be able to do.

Speaker 1

Using tools, for example, is something we think is really important for overall intelligence. GPT-2 and GPT-3 couldn't really do that well. GPT-4 could do it in a more nascent way. And now with GPT-5 you get that baked in, with the benefit of these multistep, longer-horizon reasoning processes. So, yeah, we want to abstract that from users.

Speaker 1

Obviously, we don't think that you, as a ChatGPT user, should have to stop and think about that. In some sense, the model picker being a point of frustration was an expression of the fact that people don't necessarily want to make those decisions every time they talk to an AI model. They want the model to make those decisions for them, and that's why we think GPT-5 is a big step.

Speaker 0

Going back to increasing the scale of pretraining to deliver predictable improvements in model performance: yes, post-training is now in the picture, and it's making models better in really impressive ways. But are you of the belief, and is OpenAI of the belief, that there are now diminishing returns from pretraining, given that we're talking about different forms of training these models?

Speaker 1

Not at all. Our scaling laws still hold. Empirically, there's no reason to believe there's any kind of diminishing return on pretraining. And on post-training, we're really just starting to scratch the surface of that new paradigm. The o-series models, which were the previous reasoning models, were really just the beginning of us exploring what's possible in that post-training regime.

Speaker 1

I think that's going to be the dominant theme for the next year or two: continuing to scale in that dimension and continuing to see the gains you get there, simply because they're so significant. So now we're pushing on two axes for how to improve models, and we think that's going to tighten and condense the rate of innovation.

Speaker 0

Does OpenAI believe that the vast majority of improvements from here are going to come from scaling or from algorithms?

Speaker 1

I think it'll be a combination. It's always a combination, right? It's always algorithms, scale (meaning compute), and data. We push on all three, and they all play a really important role in how we look at the future. The hard part is having them come together.

Speaker 1

Being able to train larger models typically requires training on more data, obviously, with more compute. There's a delicate balance between those things, because just scaling up doesn't necessarily mean you're going to get the same corresponding rate of improvement in all cases. You have to be able to bring the other pieces along too. So it's not like we push one button or the other. We make a really conscientious effort to pull all of those together.
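The balance between model size, data, and compute described here is often illustrated with a parametric loss curve from the published compute-optimal scaling literature (Hoffmann et al., 2022). The formula below is that public form, shown only as an illustration; the interview does not state what OpenAI's internal scaling laws look like, and the constants are fitted empirically in the literature rather than given here.

```latex
% N = number of model parameters, D = number of training tokens,
% E = irreducible loss; A, B, \alpha, \beta > 0 are fitted constants.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

% Growing the model alone, with the data held fixed, leaves the data term in place:
\lim_{N \to \infty} L(N, D) = E + \frac{B}{D^{\beta}}

% Common approximation for training compute (in FLOPs) of a dense transformer:
C \approx 6\,N\,D
```

Under this form, increasing N with D held fixed shrinks only the first correction term, so the loss flattens toward $E + B/D^{\beta}$. That is one way to read the remark that scaling up alone does not guarantee a matching rate of improvement unless data and compute are brought along too.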

Speaker 0

Okay. And you're not calling it AGI. I have to say I've lost a bet on this show, because I was listening to Sam on the Theo Von show, where he said GPT-5 is smarter than us in almost every way. And I said, all right, that sounds like what you'd imagine AGI to be. Then GPT-5 comes out yesterday, and as the release happens...

Speaker 0

Sam says, "I kind of hate the term AGI because everyone at this point uses it to mean a slightly different thing, but this is clearly a model that is generally intelligent." Help me understand what's going on, because it seems like maybe he wants to call it AGI, but you aren't yet. So why is this not AGI?

Speaker 1

Well, it is a hard thing to define. The joke here is that if you ask five people what AGI is, you'll get seven answers. I think the way we look at it is that it's a cumulative process. Right? It's a system.

Speaker 1

And I think you have to define what that system is and what you expect it to be able to do. For me, at least, it's a system that is reliably able to learn new things that are out of distribution, by virtue of its ability to reason, to think, to solve problems, to use tools, and to come up with new ideas. So do I think we're at a system that I would call AGI? No. But I think we're starting to see the traces and the pieces of that overall system for generalized learning come together in models like GPT-5, and I suspect in its successors.

Speaker 1

I don't know if we'll have a point where we say, okay, we've crossed from a non-AGI world into an AGI world. And even if there were one, I'm not sure we'd realize it until after the fact, because one of the things we've learned working with the models we have is that the capability overhang is significant. When Sam refers to the intelligence of the models as having a PhD in your pocket, we haven't yet really exploited that. In some sense, I think you could pause AI progress right here for ten years, and you'd still have about a decade's worth of new products to build and new ways for people to figure out how to use the models, even at a GPT-5 level, in interesting products and processes.

Speaker 1

One of the interesting things is that as the models get smarter, they almost demand more from a product-building perspective in terms of how you actually plug them into the system. I always roughly analogize it like this: you could have a really, really smart intern, and at the end of the day they're only capable of doing a few things for you. They can take notes in meetings. They can write summaries. They can pull basic analyses together.

Speaker 1

But if you bring a PhD to work, that person has a tremendous capability set. They may not be totally effective on the job on day one, but your job is to figure out how to expose them to enough context and information, and give them the right tools, to make them really effective later on. That process actually takes longer to bring them to full effectiveness than it would with an intern. I think it's going to be similar with AI models. So it is a continuous process, and I don't think it will be linear. But where we are today, I would say we're probably not quite yet at something I would call an AGI-level system.

Speaker 0

Yeah. And it brings up such an interesting question: does it really make sense to try to make the models smarter from here, or is it about building those ancillary capabilities? I think Sam mentioned this on the media call: GPT-3, he said, was high-school-level intelligence, GPT-4 maybe the level of a college student, and GPT-5 an expert. So I wonder, for OpenAI, is the quest to add more intelligence to the mix, or to focus on capabilities other than smarts?

Speaker 0

Some of the things that you mentioned, like memory and continual learning.

Speaker 1

It's going to be, I think, all of those things. Certainly, there are some unsolved problems. You mentioned a few here, and I would agree with those: things you'd expect to come by default for a really smart person, but that our models still struggle with. So there's open research we still have to do to close the loop on what I would call the full spectrum of intelligence. But intelligence, like we were saying earlier in the podcast, expresses itself in a lot of different ways.

Speaker 1

Part of it is just your pure IQ. It's your knowledge of how things work and your ability to recall information, but it's also your ability to reason about how to use other tools to solve problems. It's your ability to be reflective, to look back on your own chain of thought, your own line of thinking, and course correct when you feel like you went down the wrong path or didn't come up with the right strategy for the problem. One of the cool things we see is that, on those vectors, we can reliably measure GPT-5 as better than our previous systems. And for us, one of the things we really want to understand is how the models actually perform in the real world.

Speaker 1

How do developers use these models? How do enterprises apply them to existing, real-world problems, and do the next models do better than the last ones? For us, I think the real-world benchmark is becoming increasingly important as a sign of intelligence relative to the academic benchmarks.

Speaker 0

And how big of a priority is continual learning within OpenAI?

Speaker 1

We have a lot of priorities. I think certainly that's among them. We feel really good about our research. [Speaker 0 interjects: "Middle, low priority?"] The cool thing about OpenAI is the way that I think we've systematized being able to do research, and this has been true from the early days of the company. I joined OpenAI in 2018. We take a highly exploratory approach to research.

Speaker 1

We're very much not top-down in how we approach research, where there's one idea and everyone just gloms on to it and we do one thing at a time. What we really do is a lot of open-ended exploration in small teams. We explore different paths and see if they lead to new ideas, which, if they work, we cycle back into the core, main line of ideas. If they don't, we recombine those teams into other ideas that seem to be working, and then allow new ideas to offshoot from there. So it really is feeling around in the dark a little bit.

Speaker 1

And when you find that patch of grass where you think, okay, we might be on the right path here, you bring everyone to that point and then let everyone feel around a little more. I think that's how it has to work. It's really hard to know these things a priori.

Speaker 1

I think you can have intuition, and I think our researchers tend to have better intuition than average, but it really is still scientific exploration.

Speaker 0

Now I want to talk about how your Plus subscribers, and people using ChatGPT more generally, will feel the improvements. There's an interesting comment from Ethan Mollick, the Wharton professor, who has also been experimenting with GPT-5. He says, "I think it's a big step forward, but not an unexpected one if you've been following the curve." He says, "These models got gold at the Math Olympiad this week. I'm losing track of what massive advances mean."

Speaker 0

All the models are improving very quickly right now. The question is, if you have a model that's capable of college-level biology and then it goes to graduate-level biology, the average chatbot user may not feel that, even though it's gotten much smarter. So I'm curious how you think the increased smarts will be reflected in the average user's ChatGPT experience, and in the experience of Plus users who've been using these reasoning models for a while. Is it going to feel any different for them?

Speaker 1

Yeah. I saw something on X akin to what you're describing. Someone basically said that for the upper echelon of ChatGPT users, who are probably on the paid tiers, are very active on a daily basis, and are really expert-level at using these systems, it's going to feel like an improvement, but maybe a more subtle one. But for the average user, the free user, and we're bringing GPT-5 to our free tier, it will feel like a dramatic increase. If you look at the way free users have used ChatGPT, most of them have not experienced the power of the reasoning models. They're mostly using GPT-4o, and mostly in a turn-based, very quick, back-and-forth, almost search-like way that I don't think expresses the full capability of the model.

Speaker 1

So for a lot of people, this will be the first time using a model that has reasoning capability. And not only that, it'll be the first time they experience a model making a decision about how long to think about a problem, and how good an answer to give relative to how hard the question is. So we expect that for the average user it will feel dramatically different. For the upper echelon of power users, it may not feel as different. I would agree with that, and I think that's a natural thing.

Speaker 1

I think that's actually a good thing. If you've been following the rate of AI progress and you're exploiting the frontier at every point, yes, it's probably dizzying, but it starts to feel more continuous than if you're using what is basically the best model from a year or two ago.

Speaker 0

Right. I think you're so spot on that the average user is using it as a version of search. When they speak to me, they ask, "What should I use AI for?" And I say, "Just upload stuff and start talking to it about the things you upload."

Speaker 0

I had a friend who was uploading pictures of his son's football practice and asking it for coaching tips. He was fairly blown away that this thing was giving some real analysis of positioning. I mean, I wouldn't use it as a football coach, but I do think that as the average user gets into these capabilities, it's going to be fairly mind-blowing.

Speaker 1

Yeah. Everyone has a slightly different entry point, and that's a cool thing about it: it's really personal for everybody. We focused on health a lot with this release because navigating a health journey was one of the most consistently common starting points we heard from people for how they've used powerful AI. So we really wanted to make sure that if people are going to use AI systems for health-related things, we could serve them the best possible model. That was a big push in training GPT-5.

Speaker 0

Yeah, you've brought up health a couple of times. Do you want this to replace a GP? A lot of people are really underserved by health care, but I worry about handing them a model that can hallucinate and saying, "This is the substitute now."

Speaker 1

I don't think it'll replace GPs. What I think it helps people do is have more agency in their journey, a little more control over the process of managing care. It also gives people an awareness of their condition. We hear stories all the time of people managing conditions they didn't really understand because no one took the time to explain it to them. And that's not because anyone did anything wrong.

Speaker 1

It's just that the health care system, as it's designed, doesn't allow time for people to understand what it is they're managing. So even just giving people that baseline of education: this is the condition you're managing, it's this common, it's going to express in these ways, you're going to feel these types of symptoms.

Speaker 1

That's a huge unlock in people's psychology around what it means to be managing a disease. I think you still have to work with a GP or a specialist for care. But having something that can handhold you through that journey is, for a lot of people, really comforting, and in a lot of cases it has proven to be helpful. Obviously, we want to make sure the model is as accurate as possible, so pushing the model's capability in that domain specifically has been a big area of focus.

Speaker 1

But with GPT-5, and obviously with future models, we've consistently seen the rates of accuracy go up and the rates of hallucination go down. GPT-5, depending on how you measure it, is four to five times more accurate than its predecessors. That may be more accentuated in health; I don't know off the top of my head. But we have a lot of control, I think, and are pushing in the right direction on making these models reliable and accurate.

Speaker 0

It's pretty interesting that we're talking about things so far beyond the chatbot. Of course there's the chat function, but there's coding, there's health, and of course there's enterprise, the way businesses use these models. Businesses are notoriously slow at implementing this technology. I'm sure there are so many approvals and reviews, and it's tough to get things out the door. But my belief is that when you have better models, you're able to push that forward much faster and much more effectively. So talk a little about what a better model like GPT-5 will enable on the enterprise or business front.

Speaker 1

Yeah, I would agree with your assessment there. In many ways, and I always say this, we haven't yet seen the ChatGPT moment in business for AI. AI has been an amazing tool for consumers, where your search space, so to speak, is narrower and you have a more constrained problem. You're obviously processing a much narrower context, and you can take things turn by turn with very few external dependencies. You really just let the model's pure intelligence shine.

Speaker 1

企业则是完全不同的难度层级。这里有复杂的业务流程、多用户协作依赖、海量待处理的上下文环境,以及需要调用的各种工具链。

Businesses are a different category of difficulty. So you've got complex business processes. You've got a lot of multi-user dependency. You've got a lot of context that you have to process. You've got a lot of tools that have to be brought to bear.

Speaker 1

这些工具必须按特定顺序在安全范围内使用,系统容错率极低。正如我们之前讨论的,像GPT-5这样的模型对企业的价值在于基础能力的全面提升:工具调用能力、结构化思维、问题解决、自我纠错、长上下文检索等——这些细微改进在边缘场景至关重要。

Those tools have to be used in succession in certain ways, you know, with certain guardrails. And there's not as much fault tolerance for when they don't work. And so, you know, it kind of goes back to what we were talking about earlier. I think when you look at models like GPT-five and the impact that they're going to have in business, it is that baseline of capability that's moved up. It's their ability to use tools, to think in a structured way, to solve problems, to kind of recursively correct their own mistakes, to do long-context retrieval, things like that. These little things do matter on the edge.

Speaker 1

普通用户在ChatGPT中未必能感知这些进步,但开发者和企业会逐渐体会到。我们与各类企业合作测试GPT-5时也观察到这点——从优步、安进到Harvey、Cursor、JetBrains等公司,它们的应用场景高度依赖模型可靠的工具调用、长上下文处理及有效推理能力。

And you don't feel them every day in ChatGPT as an individual user, but you will start to feel them as a developer or an enterprise. And so we see this anecdotally too. I mean, we've worked with large enterprises and small startups and the entire spectrum in between on testing these models, and GPT-five specifically, before release. And we get a lot of feedback from companies like Uber and Amgen and Harvey and Cursor, Lovable, you know, JetBrains. I mean, all companies that have use cases that are highly, highly sensitive to the model's ability to reliably call tools, to deal with long context, to, you know, problem solve and reason effectively.

Speaker 1

这就像企业级市场的涨潮现象,最终要靠合作开发者们理解这些改进,并将它们落实到应用构建中。

And so it's a rising tide, I think, across the enterprise, and it's just really going to be on the developers we work with to be able to kind of, you know, understand the difference and the improvement and then implement them in the applications that they're building.

Speaker 0

你们已与众多公司合作提前使用GPT-5确实很有趣。是否有某种'旧模型无法实现,但GPT-5可以'的统一共识?还是说新能力带来的突破是分散在不同领域的?

Yeah. It is interesting to know that you have already been working with many companies and letting them use GPT-five. So has there been a sort of unified "we couldn't do this with the previous models, but we can do it now with GPT-five"? Or is it sort of spread out in terms of the capabilities that it's now enabling?

Speaker 1

可以说这是一种全面性的水涨船高现象。现在与我们合作的所有公司基本上都已习惯对所有使用的模型进行评估和性能对标。但大家都反馈称,在这些评估中普遍获得了更高、更稳定的性能表现。有几个领域表现尤为突出,编程能力肯定是其中之一。

I would say it's been, you know, a rising tide across the board. So all the companies that we work with are typically now pretty accustomed to evaluating and benchmarking performance across all the models that they use. But everyone has kind of reported, you know, much higher, kind of consistently higher performance on those evals. There are a few areas in particular where we've seen spikes. So one is coding for sure.

Speaker 1

我提到过像Cursor、JetBrains、Windsurf、Cognition等合作公司,他们普遍反映GPT-5现在堪称最强大的编程模型——无论是在交互式编程环境还是智能体编程场景中。另外我们还发现,它在专业技术领域的推理和问题解决能力有了显著提升。以Harvey为例,这家为律师事务所提供服务的AI公司,极其依赖其可靠、准确且稳定呈现案件分析的能力,这正是法律分析所需的结构化思维水平。我预计这种优势将延续到金融服务领域——这个极度依赖数据分析、研究和规划的有趣行业。

I mentioned companies like Cursor, JetBrains, Windsurf, you know, Cognition, and others that we work with who anecdotally have all said that GPT-five now feels like the most capable coding model, whether that's in an interactive coding environment or more of an agentic coding environment. And then also one of the things that we see consistently now is that its ability to reason and problem solve in very technical domains is significantly improved. And so Harvey's a great example of that, where you've got, you know, Harvey AI working with legal firms and law firms, and it's very, very reliant on the model's ability to reliably, accurately, and consistently portray the cases that it's looking at, and to provide that kind of level of structured thinking you want when you're doing legal analysis. And so I expect we'll see that carry over. I mean, financial services is a very interesting area, heavy on data analysis, heavy on research, heavy on planning.

Speaker 1

这些都是我们已看到改进的领域。随着GPT-5持续渗透市场,我们将收获更多反馈并不断优化这些应用场景。

Those are all areas that we've seen improvement in. And so as we continue to kind of see GPT-five permeate the market, we'll get more and more of that feedback and can continue to improve on those use cases.

Speaker 0

那么定价策略呢?现在输入token成本只有GPT-4o的一半,输出token价格保持不变。这种降价会催生更多应用场景吗?另外,在你们今年融资(或宣布融资)480亿美元的情况下,降低成本如何与投资者预期相协调?

And how about pricing? Because an input token is half the cost of GPT-4o, and an output token is the same. Are these lower costs going to help enable more use cases? And on that note, I mean, how does lowering costs sync with the fact that you've raised like $48 billion this year, or announced $48 billion in funding?

Speaker 0

真的可能在降低成本的同时满足投资者在这方面的期待吗?

Is it really possible to lower costs and deliver on what the investors are expecting on that front?

Speaker 1

是的。回顾OpenAI的发展历程,每次降低成本后,通常都会出现相应的使用量增长,且增幅往往超过降价幅度。只要这个趋势持续,我们就会继续降低模型成本。开发者需要在延迟时间、模型质量/智能水平和价格之间进行复杂权衡。我们这次尝试的就是根据市场在这三方面的反馈,将GPT-5系列模型(不仅是标准版,还包括迷你版和纳米版)定位在质量、成本和延迟的最优前沿,以匹配市场成功所需的条件。

Yeah. So, you know, in OpenAI's history, every time we've cut costs, we've seen typically some corresponding increase in consumption that usually outweighs the cost cut. And so, you know, for as long as that trend holds, we will continue to cut costs on models. We know that there's this complicated dance that developers have to do between latency, model quality and intelligence, and price. And I think, you know, what we've tried to do here basically is take the market's feedback on all three of those fronts and really place these models, these GPT-five models, not just the standard model, but also the mini model and the nano model, on this frontier of quality, cost, and latency that kind of optimizes for what we think the market needs to be successful.

Speaker 1

因此我们设定了极具吸引力的价格目标和平均延迟水平,同时保持了GPT-5与生俱来的模型质量和智能优势。我们将持续推动这个前沿边界——通常这个边界推得越远,人们就越愿意开发更多应用场景。能维持这种良性循环我们深感幸运,这也激励我们不断追求进步。

And so we tried to find a really attractive price target at a very attractive average latency, and then obviously with the kind of built-in model quality and intelligence you get with GPT-five. And so we will continue to push that frontier. And I think the more we push that frontier, typically, the more we just see people want to use it for more things. And so for that equation to exist, we're very fortunate, and it motivates us to try and make them better.

Speaker 0

你们究竟什么时候能实现盈利?

Are you ever going to be profitable?

Speaker 1

希望如此。

I hope so.

Speaker 0

好吧,我们接受这个回答。布拉德,在结束前让我第一个问你——GPT-6什么时候发布?

Okay. We'll take it. All right. Brad, before we wrap, let me be the first to ask you, when is GPT-six coming?

Speaker 1

其实你不是第一个问的。我本可以告诉你,但还没确定吧?推特上早就有人追问了。不过就像我说的,我们认为GPT-5已经具备惊人能力。

Well, you're not the first to ask. Could tell you, but I haven't Already? Yeah, no. Twitter is quick on the trigger on that one. But no, I mean, look, like I said, we think GPT-five is extraordinarily capable.

Speaker 1

我们相信未来会有更优模型,也确定会出现。目前我们只专注两件事:如何让现有模型触达用户?如何支持与我们共建的企业?这仍是科学探索的过程。

We think there will be better models in the future. We know there will be better models in the future. For now, we're just focused on how do we get this in people's hands? How do we support the companies that are building with us using this model? And then we're still in the science of it.

Speaker 1

最令人兴奋的是,我们就像处在比赛的第一局,连我们自己都在理解当前的技术范式。这是重要的第一步,只有认清现状才能规划未来。相信这些经验会让GPT-6更出色。

I think that's the exciting part: we're in the first inning of it, and we ourselves are just understanding the paradigm we're in. And so this is, I think, an important first step. And you kind of have to understand where you are to understand where you're going. And, you know, hopefully, the learning from this will make GPT-six much better.

Speaker 0

布拉德,今天能在GPT-5发布日邀请到你太棒了。等GPT-6问世时我们务必再聚。非常感谢你的参与。

Well, Brad, it's so great to have you on, especially today, on GPT-five launch day. So whenever GPT-six comes, we'll have to do it again. Thank you so much for joining.

Speaker 1

期待一下吧。

Look forward to it.

Speaker 0

好了各位,GPT-5已经发布。你可以在chat.com上试用,它将逐步向所有人开放,快去看看吧。明天我们将继续深入讨论,届时Ranjan Roy和我将解析本周新闻,特别是关于GPT-5的最新动态。感谢大家的收听,我们下次在《大科技播客》再见。

Alright, folks, GPT-five is out. You can try it on chat.com, and it's going to roll out to everybody, so give it a look. We'll be back to talk more about it tomorrow, when Ranjan Roy and I will break down the week's news, especially the latest on GPT-five. Thanks, everybody, for listening, and we'll see you next time on Big Technology Podcast.
