Training Data

OpenAI Codex Team: From Coding Autocomplete to Asynchronous Autonomous Agents

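Episode description

Hanson Wang and Alexander Embiricos from OpenAI's Codex team discuss their latest AI coding agent, which can work independently in its own environment for up to 30 minutes and generate full pull requests from simple task descriptions. They explain how they trained the model beyond competitive programming to match real-world software engineering needs, the shift from pairing with AI to delegating to autonomous agents, and their vision of a future where the majority of code is written by agents working on their own computers. The conversation covers the technical challenges of long-running inference, the importance of creating realistic training environments, and how developers at OpenAI are already using Codex to fix bugs and implement features.

Hosted by Sonya Huang and Lauren Reeder, Sequoia Capital.

Mentioned in this episode:
The Culture: sci-fi series by Iain Banks portraying an optimistic view of AI.
The Bitter Lesson: influential essay by Rich Sutton on the importance of scale as a strategic unlock for AI.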

Transcript


Speaker 0

In my opinion, the easier it is to write software, the more software we can have. Right now, I bet if we pull up our phones (well, you folks are investors, but if you're not an investor), most of the apps on your phone are built by large teams for millions of users. There are very few apps built just for us and the specific thing that we need. So I think as it becomes more and more practical to build bespoke software for people or teams, we'll end up having higher and higher demand for software.

Speaker 1

Welcome to Training Data. Today, we're joined by Hanson Wang and Alexander Embiricos from OpenAI's Codex team for a fascinating look at the future of software development. Codex is OpenAI's series of AI coding tools that helps developers delegate tasks to cloud and local coding agents. Unlike the original OpenAI Codex, which was developed in 2021 to autocomplete lines of code, the latest evolution of Codex can complete entire tasks for you autonomously in the background. The key difference between o3 and Codex is that while o3 is great at competitive programming, Codex has been RL-tuned to be great at day-to-day enterprise development tasks. Alexander and Hanson share more about the backstory for Codex and the broader paradigm shift from snappy autocomplete to longer-running background agents. Plus, they share their surprising vision for how developers will interact with AI in the future as sync and async experiences merge. Hint: it might look more like TikTok than your current IDE.

Speaker 2

Thank you guys for joining us. It's wonderful to have you here.

Speaker 3

Thanks for having us. Great to be here.

Speaker 2

We'd love to hear a little bit more about what you guys work on. Tell us about the Codex team and your story.

Speaker 3

Well, yeah. I'm Hanson. I'm one of the researchers that helped train the codex-1 model.

Speaker 0

And I'm Alex, the product lead.

Speaker 3

I think for me, the name Codex is such a great callback to the original Codex model. That was kind of a moment for me when it first came out, because I think GPT-3 was really cool, but Codex was the first moment where I felt like, wow, this can really do something that is gonna change the world. And that's actually how I got into the whole startup space. One of the first couple of demos I did was using Codex to do data analysis. It's actually a funny story: I was here as part of Sequoia's Arc program. That's how I met Lauren. For the demos we did, we actually used OpenAI Codex to do data analysis, and that's how I got started in the startup space. As time went on and the later versions of GPT came out, it became super clear that using AI for agentic use cases was gonna be the future. And so I joined the company to work on agentic coding efforts.

Speaker 0

Yeah. And this was, you know, per standard OpenAI style, where we like the naming to be as easy to follow as possible. This is the Codex of, I think it was, 2021.

Speaker 1

Yeah. This is pre-ChatGPT, right?

Speaker 0

Exactly. So it was actually the model powering GitHub Copilot. And then recently, as we were working on this product, which we'll talk about, we thought, this is a super fun brand and also a very apt name: code, code X, code execution. So we decided to sort of resuscitate the brand and keep using it.

Speaker 1

You said resuscitate. So was Codex dormant for a while, and then you all resuscitated it?

Speaker 0

We haven't used the brand recently.

Speaker 1

Okay. Really cool. Can you tell us a little bit about Codex the agent and what it does?

Speaker 3

Yeah. Basically, Codex is a coding agent that has its own container and its own terminal, fully in the cloud. You give it a task and it comes back to you with a PR, in a sort of one-shot style. We actually experimented with a lot of form factors along the way, but in the end decided to settle on this one.

Speaker 0

Yeah. So, you know, we've been working on a bunch of agents, and we've been working on a bunch of coding products as well. In our minds, Codex is like this thought experiment for how it would work to code with AI, where we put all our effort into thinking about what it would feel like if the AI is working on its own computer, independently from you. So you're delegating to it rather than pairing with it.

Some of the things that we're really proud of with this Codex launch are thinking about the compute environment, how we set it up so that the agent can actually work on its own but be productive, and creating the model, which we have to talk more about: a model that isn't just good at writing code that looks good or is functional, but is also really good at writing code that is useful for professional software engineers and mergeable, ideally without you even touching your own computer.

Speaker 2

So what is the difference between Codex and Codex CLI?

Speaker 0

Yeah. We've definitely gotten some questions about that. I promise this is all gonna make even more sense over time. Basically, Codex for us is our brand for agentic coding. We have this vision that we're gonna have this agent, and mostly the agent will work on its own computer, but it should also be able to meet you in any of the tools that you use, wherever you work, be that your terminal or your IDE or your issue management tool.

So Codex CLI is basically Codex in your terminal. CLI stands for command line interface. In your terminal you can work with Codex; that's your environment. And then Codex, or Codex in ChatGPT, is basically Codex working on its own computer. Today, those are just distinct things. As a brief aside, one of my favorite things about working at OpenAI is how willing we are to cut scope and just launch things quickly. But over time, we'll actually bring those things closer together. So you can really think of it as just Codex, and it can be in ChatGPT or it can be in your CLI.

Speaker 2

Very cool. And so what did you have to do differently for the model to make it useful beyond just writing the next line of code?

Speaker 3

Yeah. So I think one of the most interesting progressions: if you go back to o1, the first reasoning model that we launched, we highlighted how good it is at math and even coding competitions. I used to be a competitive coder, and as of now it's better than me at competitive coding. It's better than almost all people at OpenAI at that. But one of the things that we saw was that, despite being good at these programming competitions, it wasn't actually that good at producing mergeable code. We even highlighted this in the blog post: with models like o3, the code that it generates often isn't quite to the taste or style that a professional software engineer would expect. So a lot of the effort that we spent on training this model was aligning the model to the taste or the preferences of professional software engineers, and that's something that took a lot of, I guess, specialized training.

Speaker 0

Yeah. I have this very product-y analogy that I like, which is: if you take our recent models, they're great at coding, but it's kind of like this really precocious competitive-programmer college grad who doesn't have many years of job experience being a professional software engineer on a team. Right? And so a lot of the work we did to go from o3 to codex-1 was actually the equivalent of those first few years of job experience, where it's like: hey, what does a good PR description look like? PR titles? Do you read the style of the codebase and then make sure your code is in the same style? How do you test well? How do you show that you tested well? Stuff like that.

Speaker 1

What's typically the moment when somebody uses Codex?

Speaker 3

Yeah. I think one of the things we have in the onboarding is "find and fix a bug in the codebase." One of the areas where Codex really shines is bug fixing specifically, because it can independently try not just to see if something looks a bit off, but actually go and verify it: okay, I can try to reproduce this particular issue. Even leading up to the Codex launch, there were a couple of bugs where we were sitting there wondering what was going on. And honestly, sometimes the easiest thing to do is just paste a description of the issue into Codex, and we were surprised how frequently it could actually end up with a usable fix.

Speaker 0

Yeah. Fun story here. Hopefully this doesn't give away too much, but at 1 AM the night before launch, or the morning of launch, we were looking at a bug with an animation, a Lottie animation. And this is the kind of thing where, okay, I guess we could cut it from launch scope. It'd be okay to launch without it. But we really wanted to get it in, and we just couldn't figure this out. So an engineer ended up describing what the bug was and putting it into Codex.

A fun pro tip for anyone who's using Codex: if there's a really hard task, it can be useful to ask Codex to take multiple cracks at it. So they pasted that description in and ran it four times: "Hey, there's this bug. We can't figure out what's going on." Three of those rollouts did not work. And then one of the four was the fix for the bug that we had been stuck on for hours at 1 AM before launch. So they landed the fix, deployed the code, and the animation was in for launch.
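The "take multiple cracks at it" tip can be pictured as a small best-of-N loop: launch several independent attempts at the same task, then keep one that passes some check. The sketch below is only an illustration under stated assumptions. `run_attempt` and `passes_check` are hypothetical callables supplied by the caller (they are not part of any Codex API), and the probability helper assumes attempts succeed independently with the same probability, which real attempts may not.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional


def first_passing_attempt(
    run_attempt: Callable[[str, int], str],
    passes_check: Callable[[str], bool],
    task: str,
    attempts: int = 4,
) -> Optional[str]:
    """Run `attempts` independent tries at `task` in parallel.

    Returns the first candidate (in attempt order) that passes the check,
    or None if no attempt passes. Both callables are hypothetical stand-ins
    for "ask the agent" and "verify the result".
    """
    with ThreadPoolExecutor(max_workers=attempts) as pool:
        candidates: List[str] = list(
            pool.map(lambda i: run_attempt(task, i), range(attempts))
        )
    for candidate in candidates:
        if passes_check(candidate):
            return candidate
    return None


def chance_of_at_least_one_success(p: float, attempts: int) -> float:
    """Probability that at least one of `attempts` tries succeeds,
    assuming each try succeeds independently with probability p."""
    return 1.0 - (1.0 - p) ** attempts
```

Under the independence assumption, a task that succeeds only one time in four per attempt would still yield at least one usable result in roughly two out of three batches of four, which is one way to read why retrying a hard task can pay off.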

Speaker 1

That's awesome. Maybe tell us more about how y'all are using it internally at OpenAI. Is every engineer, is every researcher using Codex now in their workflows?

Speaker 0

Yeah. And actually, can I give you the other kind of magic moment?

Speaker 1

Oh, yeah. Please do. Definitely.

Speaker 0

So one of the interesting things about Codex is that it's a very different form factor from what people may be used to. Right? A lot of the AI products that people are used to, especially in software (maybe GitHub Copilot was the first really good one), are really things that work with you in flow, and you're just seamlessly going back and forth. You're kind of pairing, right? They're flavors of pairing. We think that's awesome, and Codex CLI is a tool that you can use in that way. But for Codex, we really wanted to push this idea that you're delegating. Because in the future, we imagine that the vast majority of coding is actually gonna be done independently from the human working on their computer, who can only do one thing at a time. Basically, it'll be done by agents working on their own computers. And delegating to an agent is a very different thing than pairing with an AI model that's in your tooling, so you have to use it differently.

When we were working on an alpha before launch, we would just give this agent to people and say, hey, just use this however you want. And we noticed that many of the people trying our alpha of Codex were not really finding it super useful. We thought, that's interesting. Let's look at how people at OpenAI are using internal tooling like Codex. And we realized there was a big difference, which is the mindset of using it. The mindset that works really well for Codex is this abundance mindset: hey, let's try anything. Let's try anything, even multiple times, and see what works. It saves me time. So we've shifted the way that we onboard people into the product to try to create this moment, which is running many tasks in parallel. For us, if we see someone trying it out and they've run 20 tasks in a day or an hour, that's amazing. They've probably understood how to use the tool.

Speaker 2

Fascinating. How does that change the role of a human when you have to review all of this code? If two of the three work, then what do you do?

Speaker 3

Yeah. Something we've put a lot of focus on is making the outputs easy for people to review. One of the things that we're proud of, and we haven't seen this in too many other tools, is the ability for the model to cite its own work. Not just the files that have changed, but also the terminal outputs. So if it ran a test and for some reason the test wouldn't work, it actually tells you that, and it tells you: here's the exact terminal command I ran, here's the output. That makes it much easier to verify the outputs. But it is a great point. I think we're shifting to a world where a lot of the time that we'd normally spend coding is gonna shift to reviewing the code.
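To make the idea of citing terminal output concrete, here is a minimal sketch of what such a citation record might hold: the exact command, its exit code, and the captured output. The `CommandCitation` class and `run_and_cite` function are hypothetical names chosen for illustration; the transcript does not describe how Codex actually stores or formats its citations.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class CommandCitation:
    """Hypothetical record of one command a reviewer can inspect."""
    command: List[str]
    exit_code: int
    output: str

    def summary(self) -> str:
        # Report failures as plainly as successes so the reviewer sees both.
        status = "passed" if self.exit_code == 0 else "failed"
        header = f"$ {' '.join(self.command)} ({status}, exit code {self.exit_code})"
        return header + "\n" + self.output


def run_and_cite(command: List[str]) -> CommandCitation:
    """Run a command and keep its exact arguments, exit code, and output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return CommandCitation(command, result.returncode, result.stdout + result.stderr)


if __name__ == "__main__":
    # Usage example: cite a trivial command run with the current interpreter.
    print(run_and_cite([sys.executable, "-c", "print('ok')"]).summary())
```

The design choice worth noting is that a failing command is recorded the same way as a passing one, which matches the point above that the agent says so when a test would not run.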

Speaker 1

Do you need humans to review the code? Because I think of code as one of those things where it compiles or it doesn't. And once it compiles, you can go and check if it does the thing it was supposed to do. Do you even need humans to do the code review?

Speaker 3

I think, yeah. For the foreseeable future at least, I do see that being the case. I think a lot of it is also just building trust with the early users. People really need to have a feeling for what things are working well and what things are not. And there's always some external context about what makes this code correct that might be beyond what you initially provided as context.

Speaker 0

Yeah. If you think of what a developer does (and this is obviously oversimplifying), there's coming up with what things maybe should be done, discussing them with the team, deciding what to do. Call that ideation. Then maybe there's design: okay, what are we actually doing? Then planning: how are we gonna do it? Then there's implementing, and then validating and testing those changes. That's basically a loop, and that small loop of implementing and then testing is what Codex is great at right now, although we can talk about how you can use it for planning too. And then there's actually deploying the code, and then maybe maintaining the code, writing documentation, etc. I forget the exact stat, but the one I remember recently is that engineers spend maybe 35% of their time coding. It's not even the majority of what engineers do.

So the future that we're trying to build towards is one where, if you're a software developer, or really in any profession, all the work that is easily automatable, which is usually the grungier type of work, you're not doing; you're delegating it. And the work that is more interesting, maybe because it's ambiguous or because it's really hard, is the work that you're driving. We're trying to build towards that world, and I think we have to get there iteratively. For example, right now, if you're a human and you write code, another human is gonna review that code. We're not gonna come in and just try to change that. We're like, okay, let's plug into that. So the way the product works right now is that you, the developer, are being accelerated by the tool. You ask for some code to be written, you decide if it's good and you want to push it out to your team, and then your team can review it. Over time, we'll expand what we can do. We'll help more and more with planning, maybe even designing, maybe even thinking about what to do in response to things that are happening in your app or at work. And we'll push to make review easier and easier, as Hanson was describing.

Speaker 3

Yeah. I do think I see a future where you have multiple agents collaborating together. So you have Codex: the Codex agent writes the code, and then maybe the Operator agent is the one that's testing it, and all the different agents that we've been working on at the company can kind of come together.

Speaker 2

That's awesome. Now that you can delegate writing code, have you seen people beyond engineering teams start to use Codex? We're beginning to enter the world of vibe coding. You guys are helping bring us further down that hole.

Speaker 0

Yeah. This is actually super funny. The answer is yes, but I'll tell you a story. We were working on our launch blog post with Lindsay here, and we were talking about which quotes to use from customers. We had a customer that wanted to say: yeah, we on the engineering team love this, and it's also like a power tool for PMs. I remember looking at that quote and thinking, this is a really cool quote, because I'm on the product team and I use it to avoid having to bug an engineer about things, or to answer questions. But I also remember looking at that quote and thinking, do we want that in the launch blog post? Because the target audience for what we're building is specifically professional software engineers, not vibe coders. So I think we ended up not including that exact line. But over time, as we have agents that can help us code, I would expect more and more people to be able to contribute to codebases.

Speaker 1

Do you think the number of professional software developers goes up or down over time?

Speaker 0

This is just my opinion, but I think it goes way up.

Speaker 1

And not vibe coders, professional software developers?

Speaker 0

Yeah. I think so.

Speaker 1

Okay.

Speaker 0

In my opinion, the easier it is to write software, the more software we can have. Right now, I bet if we pull up our phones (well, you folks are investors, but if you're not an investor), most of the apps on your phone are built by large teams for millions of users. There are very few apps built just for us and the specific thing that we need. So I think as it becomes more and more practical to build bespoke software for people or teams, we'll end up having higher and higher demand for software.

Speaker 3

Yeah. As I think about how I use it, it really is a multiplicative factor right now rather than any sort of replacement, especially looking at the patterns of our internal power users. There's a really dramatic difference: the top users of Codex are doing 10-plus PRs every day. It's really such a multiplicative factor that I can't see a world in which... it's lowering the bar to creating software so much.

Speaker 0

That said, I think this is a really important question, and to be completely honest, we don't know. So this is something that we as a company pay a lot of attention to.

Speaker 1

I wanna talk a little bit about what's happening under the hood on the technology side. You mentioned that one of the things that makes the model different from a competitive programming model is that you've made it good at the things a professional software developer would do. Is that the biggest difference on the model side? Should we think of it as a close cousin of o3?

Speaker 3

Yeah. So it's definitely the same model as o3 with additional reinforcement fine-tuning. That said, part of it is these more qualitative aspects of what makes a good software engineer versus simply a good, let's say, coder.

Speaker 1

Yeah.

Speaker 3

You know, like style. Even how it writes comments; I think that's one of the things that people have noticed with other models. On top of that, I also wanna highlight that one of the big challenges was making good environments for the agent to learn in. If you think about real-world software repositories, they're so varied and complicated. Think about how much DevOps has to go into setting up a repository. That's something we're learning the hard way with our environment setups.

Speaker 0

Should we talk about the Multi repo I was showing you yesterday? Oh, yeah. I was showing Hanson the repo for the startup that OpenAI acquired, which is how we joined. We were looking at that repo together, thinking about using it as an environment. And Hanson's like, so where are the unit tests?

Speaker 3

Because the agent uses unit tests to verify its work.

Speaker 0

And I was like, this is a real startup that has no unit tests.

Speaker 1

I mean, all the time is all the same.

Speaker 3

So I can't complain.

Speaker 3

所以是的。

So yeah.

Speaker 3

比如,你有这么多非常混乱的环境。

Like, you have all these, like, really messy environments.

Speaker 3

所以我们是的。

So we yeah.

Speaker 3

在训练过程中,我们必须生成这些非常真实的环境,供代理学习。

Over the course of training, we basically had to generate these really realistic environments for the agent to learn from.

Speaker 3

我认为我们能够成功打造一个端到端产品的原因之一,就是我们在训练中使用的环境,和用于生产服务的容器化基础设施是完全一致的。

And I think one of the reasons we're able to make such an end-to-end product work is that we use the same environments during training and basically the same containerization infrastructure to serve in production.

Speaker 3

所以我们的用户,你知道,我们自己在运行计算环境。

So, you know, we're running our own compute environments.

Speaker 3

当用户使用Codex时,他们运行的环境与我们用于训练的环境完全相同。

When users use Codex, they're running in the exact same environments that we're using for training.

Speaker 1

所以你不会遇到代理说‘但在我机器上能运行’的情况。

So you don't have the agent saying, but it works on my machine.

Speaker 3

没错,是的。

Exactly, yeah.

Speaker 1

好的,很好。

Okay, good.

Speaker 1

我认为这些也是我在OpenAI见过的运行时间最长的代理。

I think these are also the longest running agents I've seen out of OpenAI.

Speaker 1

Deep Research可能是之前运行时间最长的。

Deep Research maybe was the previous one that was longest running.

Speaker 1

据我了解,Codex有时会为不同的任务花费长达三十分钟。

And my understanding is Codex can sometimes spend thirty minutes on different tasks.

Speaker 1

在将推理时间扩展到如此长时间的查询时,你们是否遇到过任何令人惊讶的挑战或问题?

Are there any kind of surprising challenges and things you've encountered just getting inference time to scale up on query for so long?

Speaker 0

也许我先从产品层面说起,模型层面也有很多。

Maybe I'll start with the product side, and then there are many on the model side.

Speaker 0

但在产品层面,我最常思考的是用户意图。

But on the product side, actually the thing that I think the most about is user intent.

Speaker 0

如果你想象有人在IDE中使用自动补全功能,这其实并不特别难。

It's actually, if you imagine someone using autocomplete in their IDE, it's not super hard necessarily.

Speaker 0

我的意思是,这虽然有难度,但预测用户在接下来的微秒内想做什么并不算特别困难。

I mean, it's difficult, but it's not super hard to predict, what are they trying to do right now for the next microsecond.

Speaker 0

但要完成一个耗时三十分钟的任务,帮助用户清晰地描述这个任务实际上相当困难。

But for doing a task that takes thirty minutes, it's actually fairly difficult to help a user describe the task.

Speaker 0

比如,他们自己可能都不清楚三十分钟的工作到底想要什么。

Like they may not even know exactly what they want for thirty minutes worth of work.

Speaker 0

因此,我们花了很多时间讨论这个问题,至今仍在争论:用户应该以怎样的任务粒度交给Codex,以及如何让它变得简单,使Codex能灵活应对,比如用于一行代码的修改。

And so something that we spent a while debating and it's like still a thing we debate is like, what is the right granularity of a task for someone to give to Codex and like, how can we make it easy so that Codex can like be really flexible where you can use it for like one line changes.

Speaker 0

你可以用它来做大型重构,当你清楚知道自己想要什么时,或者用于你已明确目标的更大功能。

You can use it for like big refactors that you know exactly what you want or like larger features where you know what you want.

Speaker 0

或者当你还不清楚自己具体想要什么的时候,也能使用Codex吗?

Or maybe can you use Codex when you don't know exactly what you want?

Speaker 0

所以也许你可以先让Codex为你制定一个计划,然后让它建议具体任务,之后再执行这些任务。

And so maybe you should ask Codex for a plan, and then you can have Codex suggest tasks and then do those tasks afterwards.

Speaker 0

这仍然是我们不断讨论和迭代的一个话题。

So that's still a topic of debate and like iteration for us.

Speaker 3

是的。

Yeah.

Speaker 3

我认为这实际上是一个使用它的良好建议。

I think that's actually like a good pro tip for for using it.

Speaker 3

它在制定自己的计划方面确实非常出色。

It's actually like really good at coming up with its own plans.

Speaker 3

有时候,提前详细指定你想要的一切会非常繁琐。

And then, you know, sometimes it's really tedious to specify everything you want up front.

Speaker 3

这可以说是从事这项工作时的独特挑战之一。

And that's kind of one of the unique challenges about working with it.

Speaker 3

比如,如果你希望它一次运行一小时,那就确实需要提前详细说明很多内容。

Like, if you want it to work for, you know, an hour at a time, then you kind of do have to specify a lot up front.

Speaker 3

这意味着你得花上大概十到二十分钟来制定这个计划。

But which means that you have to spend like, I don't know, like ten, twenty minutes coming up with that.

Speaker 3

但如果你先使用‘询问模式’,生成一个你想要做的事情的高层次计划。

But you can actually use the ask mode to first, you know, generate a high-level plan of what you wanna do.

Speaker 3

然后你可以在正式让它运行一小时之前,和模型一起迭代这个计划。

And then you can like iterate on that with the model before you, you know, send it off for for an hour.

Speaker 1

这真的就像在和一个实习生合作。

It really is like working with an intern.

Speaker 1

是的。

Yeah.

Speaker 1

那从模型的角度来看呢?

What about on the model side?

Speaker 1

当模型运行这么长时间时,有没有什么让你感到意外的行为?

Anything that's surprising in terms of model behavior as it starts to run for so long?

Speaker 3

是的。

Yeah.

Speaker 3

我认为我们的模型在保持任务专注方面有了很大提升,尤其是在这些较长的任务中。

I think our models have gotten a lot better at sticking to the task, especially with these longer rollouts.

Speaker 3

我得说,即使模型的耐心已经相当高,但仍然存在一些情况,它的耐心是有极限的。

I will say there are cases where, you know, there is a limit to the model's patience, even though it's quite high.

Speaker 3

所以有时候会让人感到沮丧,比如它运行了三十分钟,然后就像这样——我们正在努力改进这种情况,就像人类会回来对你说:‘对不起。’

So it can be frustrating sometimes. It goes off for thirty minutes and then, and this is a case that we're working to get better at, it's kind of like a human who comes back to you and says, I'm sorry.

Speaker 3

我不行,我真的觉得这个任务太难了。

I don't... this is too much.

Speaker 3

我其实没有足够的时间来做这件事。

I I don't have enough time to to do this, actually.

Speaker 3

它就是这样说的。

Like, that's one of the things it says.

Speaker 3

这并不是我们的错。

It's it's not ours.

Speaker 1

就像一个实习生。

Just like an intern.

Speaker 3

所以你的行为非常人性化。

So very very human like in in your ways.

Speaker 2

是的。

Yeah.

Speaker 2

我很想知道,你如何看待合适的交互模式,以及它们如何演变,围绕这一领域的整个产品套件如何随时间发展。

I'm curious how you think about the right interaction patterns and how they evolve and how the suite of products around this evolve over time.

Speaker 2

我们有 Codex。

We have Codex.

Speaker 2

我们有 Codex CLI。

We have Codex CLI.

Speaker 2

你认为在工程和产品构建方面,还有哪些设计空间尚未探索?

What else do you think is out there in the design space for engineering and building products?

Speaker 0

是的。

Yeah.

Speaker 0

所以,我们发布时的Codex实际上就像你知道的,只是一个研究预览版。

So the Codex as we launched it is really just like you know, it's a research preview.

Speaker 0

它是一个思想实验,一个有用的实验,但仍然非常早期。

It's a thought experiment, a useful one, but it's still still very early.

Speaker 0

我们对Codex最自豪的是这个模型,以及为计算机环境奠定的基础。

And what we're most proud of with Codex is the model and the beginning of this foundation for computer environments.

Speaker 0

我们发布的界面是我们不断迭代的结果,其中有一些有趣的故事。

And the UI we shipped is one that we iterated towards and there's some fun stories there.

Speaker 0

但它绝对不是最终的产品形态。

But it's definitely not the final form factor.

Speaker 0

对于在听的各位来说,我们发布的界面实际上是ChatGPT中的一个接口,你可以提交任务,让Codex回答你的问题或编写代码。

And for those listening, basically the UI we shipped is an interface in ChatGPT where you can submit a task and ask Codex to either answer your question or write code.

Speaker 0

然后你会看到一个类似待办事项列表的东西,你可以去查看并合并。

And then you have this, something that looks a little bit like a to do list of things that you can go look at merging.

Speaker 0

因此,我们构建这个界面是为了大力推动这种异步代理的概念,你可以将任务委托给它。

So we built that to really lean hard into this idea of an asynchronous agent that you delegate to.

Speaker 0

但我们想要构建的是这样一种环境:你无需思考自己是在委托任务还是与智能体协作。

But what we want to build towards is a setup where you don't have to think about whether you're delegating or whether you're pairing with an agent.

Speaker 0

它应该仅仅像与一位队友合作一样自然,而这位队友能无处不在地融入你使用的每一个工具中。

And it's really it should just feel like working with a teammate and where that teammate is like ubiquitously present in all the tools you work with.

Speaker 0

因此,无论你使用的是终端、IDE、任务管理工具、告警工具、错误监控工具,还是其他显示错误的工具,你都可以随时召唤它寻求帮助。

So you should be able to pull up any tool that you're working in, be it your terminal, your IDE, your issue management tool, maybe your alerting tool, your errors, the tool that shows you errors, and just ask for help.

Speaker 0

甚至在你还没来得及查看之前,Codex 可能已经观察过并形成了自己的看法。

Maybe even Codex has already taken a look before you even got there and it has an opinion there.

Speaker 0

你可以提出任何问题,无论是简短的还是长篇的,它都会恰当地决定投入多少时间来回应你,并帮助你顺利完成这些修改。

And you could be able to ask something, be it a short question or a long question, it'll just like appropriately decide how much time to spend before answering you and just like help you land those changes.

Speaker 0

所以,我们本质上想融合协作与委托这两种理念,但我们最初发布的产品只是最纯粹的思想实验。

So basically, we wanna kind of blend this idea of, like, pairing and delegation, but the first thing we shipped was just, like, the the the purest thought experiment.

Speaker 0

我还要补充一点,关于在 OpenAI 工作的一个独特之处在于,我们正是大多数人所使用的 ChatGPT 这个 AI 系统的创造者。

The other thing I'll add to this is, like, one of the unique things about working at OpenAI is that we are the makers of ChatGPT, which is sort of the AI system that most people use.

Speaker 0

因此,我们并不认为未来你会在一天中不断纠结:是该用 Codex 智能体,还是购物智能体,或者叫车智能体。

And so we don't actually see a future where, as you go about your day, you're deciding whether to use the Codex agent or, I don't know, your shopping agent or taxi-ordering agent.

Speaker 0

顺便说一下,我这里只是随便举些例子。

By the way, I'm just naming random things here.

Speaker 0

或者你的营销智能体。

Or your marketing agent.

Speaker 0

实际上,我们设想的运作方式是:你只需要一个助手,可以向它询问任何问题,它能帮你完成所有需要的事情。

Actually, the way we think this should work is you should just have one assistant that you talk to and you can ask it anything about anything and it can just do the things you need.

Speaker 0

而这个助手就是将成为我们助手的ChatGPT。

And so that's ChatGPT that will become our assistant.

Speaker 0

如果你是某种工具的高级用户,比如你是软件开发者,花很多时间在某些功能工具上,那么你可以进入那个工具,使用专门的界面,里面有按钮、列表等,帮助你高效地完成日常工作。

And then if you're a power user of a certain type of tool, so let's say you're a software developer, you spend a lot of time in certain functional tools, then you can go into that tool and have a bespoke interface with buttons, with lists that you can use to, like, efficiently go about your day.

Speaker 1

你觉得我们还会继续使用IDE吗?

Do you think we'll still use IDEs?

Speaker 0

是的。

Yeah.

Speaker 0

当然。

For sure.

Speaker 0

但它们会演变。

But they'll evolve.

Speaker 0

对吧?

Right?

Speaker 0

比如现在,它们非常专注于编写代码。

Like, right now, they're, like, very focused on writing code.

Speaker 0

正如汉森所说,代理可能会编写越来越多的代码,因此重点将转向代码的部署、审查、验证,或者甚至转向规划更大的任务流程。

And, as Hansen was saying, agents will probably be writing more and more code, and so there'll be a shift in emphasis towards landing code, or reviewing code, or validating it, or maybe even a shift in emphasis towards planning bigger arcs.

Speaker 0

是的。

Yeah.

Speaker 3

我觉得我们已经看到团队中很多人这样做了。

I think we're already seeing a lot of people on the team.

Speaker 3

他们早上一来,第一件事就是...

They kinda, like, first thing in the morning, they come in.

Speaker 3

他们先煮杯咖啡,然后启动几个任务,作为一天的开端。

Like, they make coffee and then they they like kick off a few tasks just to kind of get a starting point.

Speaker 3

然后,你知道的,他们吃完早餐回来,看看生成的任务或拉取请求,接着就会把这些内容带到IDE里——IDE就是你做这些事情的地方,它或许能帮你完成80%的工作,希望甚至更多。

And then, you know, they come back after their breakfast and they look at the tasks, or the PRs that got generated. Then they'll take those, and the IDE is kind of the place where you take them further. Maybe it'll get you 80% of the way there, hopefully, or even more.

Speaker 3

但总还有一段最后的路程,你需要亲自进去,根据自己的直觉做精细调整。

But then there's always this like last mile where you go in and really like fine tune based on kind of like your own vibes.

Speaker 2

你如何看待更广泛的市场演变?

How do you see the broader market evolving?

Speaker 2

比如在OpenAI内部,你们就有这么多不同的策略。

Like, within OpenAI, you have so many different strategies here.

Speaker 2

当你思考异步任务,以及你提到的那些要融入ChatGPT的功能时,我们正看到大量其他工具和专用模型的爆发式增长。

And as you think about async tasks, as you think about some of the things that you mentioned moving into ChatGPT, we're seeing a lot an explosion of other tools and specialized models.

Speaker 2

你显然有偏向,但我很好奇你对整个市场的看法是什么。

You obviously are biased, but I'm curious what your read is of the broader market.

Speaker 0

是的。

Yeah.

Speaker 0

现在对开发者来说,真是个疯狂的时期。

It's it's a crazy time to be a developer right now.

Speaker 0

现在有太多新工具,真的特别有用。

Like, there are just so many new tools that are just so helpful.

Speaker 0

最近有个有趣的故事:我在飞机上,没有Wi-Fi,本来想着要写点代码、做个东西,但没网。

Like, a fun story recently is I was on an airplane and there was no Wi-Fi, and I had thought that I was gonna maybe write some code and, like, build a thing, and there was no Wi-Fi.

Speaker 0

然后我就想,算了。

And I was like, you know what?

Speaker 0

不干了。

Screw it.

Speaker 0

现在连试着写代码都觉得不值得了。

Like, it's just not worth my time to, like, even try to write code anymore.

Speaker 0

而很多年前我创业时,那个创业点子的起源,就是我在没有Wi-Fi的飞机上写代码。

Whereas, the startup that I was working on like many years ago, like part of the genesis of that startup was like me writing some code without WiFi in an airplane.

Speaker 0

但现在我根本不会这么做了,因为市场变化实在太大了。

And I just wouldn't even do that anymore because like the market, it's just like, it's just changed so much.

Speaker 0

我觉得,接下来我们也会在同样短的时间内看到类似的转变。

And I think this, I think we're gonna see like an equivalent shift in an equivalent amount of time.

Speaker 0

所以在未来两年里,编程将会完全不一样。

So like in the next two years, coding will look completely different.

Speaker 0

我认为,现在大多数人觉得最有价值的工具,都是那些能与你紧密协作的工具,基本上就像实时结对编程。

I think right now, most of the tools that people find the most value from are tools that work really closely with you, like in your development environment, basically pairing.

Speaker 0

我认为我们即将看到的转变是,实际上大部分代码将由智能代理编写,尽管我们还得弄清楚这具体会如何实现。

And I think the shift that we're gonna see, but we have to figure out how this will happen, but the shift that we're gonna see is that actually the majority of code will be written by agents.

Speaker 0

这些智能代理不会在你的开发环境中工作,让你一次只做一件事,而是会在它们自己的环境中运行。

And those agents won't be working in your environment where you can do one thing at a time, but they'll be working in their own environments.

Speaker 0

它们不会仅仅因为你想到某个具体任务才被触发,而是会接入你日常使用的工具,在你工作时自动参与。

And they won't just be triggered by you, like, thinking of specific tasks, but they'll be connected into the tools you use doing work there.

Speaker 0

所以我认为,我们将会看到这种向智能代理的转变。

And so I think we'll see basically that shift towards agents.

Speaker 0

就像你之前问到的,我认为我们还得搞清楚很多关于代码审查的事情。

I think we're gonna have to figure out a lot about code review, as you were asking about.

Speaker 0

就我个人而言,我还不完全清楚这会怎么运作,但我知道,即使在OpenAI,现在也有更多代码是由智能代理合并的,而且实际上更多代码是由智能代理生成的——比如人们会启动四次任务,然后选择他们最喜欢的那个实现方案。

Personally, I don't exactly know how that's gonna work, but I do know that even already at OpenAI, we're seeing much more code merged by agents, and actually even more code generated by agents, as folks are, say, kicking off tasks four times to choose their favorite implementation.

Speaker 0

因此,我们甚至还不清楚应该如何管理这些正在被编写的所有代码。

And so it's not 100% clear how we should even manage all this code that is being written.
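
One way to picture the "kick off a task four times and pick a favorite" workflow is as a shortlist step: run an automated check over every attempt and only hand the passing ones to a human for the final choice. This is a rough sketch under stated assumptions, not a description of how OpenAI manages agent-written code. The task (sorting a list), the `passes_checks` function, and the four stand-in attempts are all hypothetical.

```python
from typing import Callable

Candidate = Callable[[list[int]], list[int]]


def passes_checks(candidate: Candidate) -> bool:
    """Hypothetical automated check; a real setup would run the repo's tests."""
    cases = [([3, 1, 2], [1, 2, 3]), ([], []), ([2, 2, 1], [1, 2, 2])]
    try:
        return all(candidate(list(inp)) == expected for inp, expected in cases)
    except Exception:
        # An attempt that crashes is treated the same as one that fails.
        return False


def shortlist(candidates: dict[str, Candidate]) -> list[str]:
    """Keep only the attempts that pass, so a human reviews fewer diffs."""
    return [name for name, fn in candidates.items() if passes_checks(fn)]


# Four stand-in "attempts" at the same task (sort a list of integers).
attempts: dict[str, Candidate] = {
    "attempt_1": lambda xs: sorted(xs),
    "attempt_2": lambda xs: xs,               # forgets to sort
    "attempt_3": lambda xs: sorted(set(xs)),  # silently drops duplicates
    "attempt_4": lambda xs: sorted(xs, reverse=True)[::-1],
}

if __name__ == "__main__":
    print(shortlist(attempts))
```

Here only `attempt_1` and `attempt_4` survive the check. Choosing between them is still a human judgment call, which matches the open question raised above about how to review this volume of code.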

Speaker 0

不过,如果对观众有帮助的话,我想说的是,确实有一些方法可以让你的代码库更便于代理处理。

Some things that I will say though, in case it's useful to the audience is that there are definitely things you can do to your code base to make it more addressable for agents.

Speaker 0

这不一定特别新颖,但显然使用类型安全的语言非常有帮助。

This isn't necessarily particularly novel, but obviously using typed languages is really helpful.

Speaker 0

另一个非常有帮助的做法是拥有更小、测试更充分的模块。

Another thing that's very helpful is having smaller modules that are better tested.

Speaker 0

我们常开玩笑说要有良好的

We joke about Having good

Speaker 3

测试,是的。

tests at all, yeah.

Speaker 0

要有测试。

Having tests.

Speaker 0

我们常拿我的初创公司代码库开玩笑,但我打赌,如果我们今天重写它,肯定会不一样。

We joke about my startup's repo, but I bet you we would have written it differently if we were writing it today.

Speaker 0

甚至还有一些小细节,比如这个项目的代码名是WAM。

And even there's small things like the code name for this project is WAM.

Speaker 0

这是Codex的代码名。

This is the code name for Codex.

Speaker 0

就是w-h-a-m。

It's like w h a m.

Speaker 0

我们在命名时非常有意为之,因为我们知道代码会出现在服务器、网站以及其他各种地方。

And when we named it, we were very intentional in doing so because we knew we would have code, like, in the server, like, for the website, in various other places.

Speaker 0

我们希望让代理能够轻松地搜索到与WAM相关的代码并找到它们。

And we wanted it to be really easy for the agent to, like, search for WAM related code and find it.

Speaker 0

有意思。

Interesting.

Speaker 0

所以我们给项目命名为WAM,然后首先用grep在代码库中查找它出现了多少次。

So we named the project, you know, WAM and we grep the code base first to figure out how often it was there.

Speaker 0

如果我们把它命名为类似code、codex或agent这样的词,你可以想象,对代理来说会很难

Like, if we would have called it something like code or or codex or agent, you can imagine, like, it would have been really hard for the agent to

Speaker 1

如果叫它Codex,那代理肯定会混淆。

You called it Codex, and now the agent's gonna be confused.

Speaker 0

在代码中,这正是我想表达的观点,对吧?

Well, so in the code, this is kind of my point, right?

Speaker 0

这是有意的设计。

Like intentional design.

Speaker 0

在代码中,我们大量使用'wham'这个术语,因为这对代理来说更容易查找。

Like in the code, we use the term wham, like a lot, because that's actually much easier for the agent to find.

Speaker 0

显然,如果我们不用这样的词,代理还是能找到,但需要花更多时间来定位正确的文件。

Obviously, if we didn't use a word like that, the agent could still find its way, but it would have to spend much more time to find the right files.
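
The "grep the code base first" step can be sketched as a small script that counts how often each candidate name already appears in a source tree, so the least ambiguous name can be picked. This is an illustrative sketch only. The function names, the file suffixes, and the case-insensitive substring matching are assumptions, not the team's actual tooling.

```python
import re
from pathlib import Path


def count_mentions(
    root: Path, name: str, suffixes: tuple[str, ...] = (".py", ".ts", ".md")
) -> int:
    """Count case-insensitive substring occurrences of `name` under `root`."""
    pattern = re.compile(re.escape(name), re.IGNORECASE)
    total = 0
    for path in root.rglob("*"):
        # Only scan text-like source files; the suffix list is an assumption.
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += len(pattern.findall(text))
    return total


def rank_candidate_names(root: Path, names: list[str]) -> list[tuple[str, int]]:
    """Sort candidate names from fewest existing mentions to most."""
    counts = [(name, count_mentions(root, name)) for name in names]
    return sorted(counts, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, hits in rank_candidate_names(Path("."), ["wham", "codex", "agent", "code"]):
        print(f"{name}: {hits}")
```

A name with few or no existing hits is easier for both an agent and a human to search for later, which is the design choice described above.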

Speaker 3

是的。

Yeah.

Speaker 3

很有趣的是,很多让代码库对人类更友好的做法,通常也会让代理更容易使用,比如良好的测试,编写清晰的文档就是另一个很好的例子;现在我觉得更有动力去这么做,因为这不仅让你的生活更轻松,也让代理的生活更轻松。

It is it is cool that, you know, like a lot of the things that actually make the code base easier for humans too also tends to make it easier for the agents, like good tests, for example, writing good docs is is another good example where like now I think there's even more of an incentive to to do that because like not only does it make your life easier, it makes the agent's life easier.

Speaker 1

好吧,抱歉当个烦人的风投,但Claude Code和Jules也是其他人的智能编码体验。

Okay, sorry to be the annoying VC, but Claude Code and Jules are also, I think agentic coding experiences from others.

Speaker 1

我想知道,你觉得你今天的经历和以前相比如何?

I'm curious how you think your experiences compare today.

Speaker 1

那么,你认为市场是否会逐渐趋同于对同步和异步编码的同一愿景?

And then do you think the market is probably gonna converge towards the same vision of what sync and async coding look like?

Speaker 1

在这个未来的版本中,你认为OpenAI会在哪些方面胜出?

And in that version of the future, what do you think OpenAI wins on?

Speaker 0

我认为我们会看到各种各样的情况,就像你提到的,有些工具在你的电脑上运行,有些则在它们自己的电脑上运行。

I think we're gonna see a little bit of everything, even in what you mentioned, there's tools that are working on your computer, there's tools that are working on their own computer.

Speaker 0

正如我所说,我认为大部分工作会在代理拥有自己电脑的情况下完成,但我们也仍然非常有必要投资于加速那些在自己电脑上工作的开发者。

As I mentioned, I think we're gonna see the majority of work being written where the agent has its own computer, but it will still be really important for us to invest in accelerating developers who are doing work on their own computer too.

Speaker 0

所以理想情况下,我们能兼得两者之长,但大多数工作将在代理计算环境中完成。

So ideally we get the best of both worlds there, but most work is done in agent compute.

Speaker 3

我认为我的看法是,软件工程中最困难的部分之一,其实是将来自现实世界的所有上下文转化为需求和设计文档。

I think the way I see it as well is that one of the hardest parts of software engineering really is taking all the context from the world and encoding it in these requirements, these design docs.

Speaker 3

而实现部分,正如我们之前提到的,实际上在开发周期中所占的时间并没有那么多。

And then the implementation, as I think we alluded to earlier: not that much of the life cycle is actually spent on the physical coding.

Speaker 3

我认为ChatGPT的亮点在于,它是一个拥有记忆、能够接入你所使用的所有不同工具的助手。

And so I think where ChatGPT shines is that it is this assistant that, you know, has memories now, and it has access to a lot of different connectors to all the different tools you use.

Speaker 3

我们有Operator、深度研究等具备各种不同能力的工具。

We have, like, Operator and Deep Research, which have all these different capabilities.

Speaker 3

因此,我认为当所有这些功能整合在一起时,像Codex这样的工具就能真正大放异彩,因为它能获取所有这些知识。

And so I think the vision where that like all comes together is where you know like a tool like Codex can really shine once it has access to all that knowledge.

Speaker 3

它能够充分利用这些信息。

It's able to like make use of that.

Speaker 3

我认为有了这些能力,它在编码部分就能做得更加高效。

And I think with that, it should be able to do a much more effective job at at, you know, just the coding part.

Speaker 0

是的。

Yeah.

Speaker 0

想象一下,你雇了一名软件工程师,而这名工程师唯一能做的就是接受你的任务并提交一个拉取请求。

Like imagine like hiring a software engineer and the only thing that that software engineer can do is take a task from you and produce a PR.

Speaker 0

或者它只能执行一些非常明确的功能,精确地完成那些特定任务。

Or it has these very well defined features and it can exactly do those things.

Speaker 0

然后你提出一个随机的要求,比如:‘嘿,团队要聚一聚了。’

And then you ask for a random thing like, Oh, hey, the team is getting together.

Speaker 0

你能不能顺便订个会议室并主持一次头脑风暴?

Do you mind also getting a meeting room and leading a brainstorming?

Speaker 0

如果你雇了一个同事,他却拒绝做这类工作,那一定会让你非常沮丧。

It would just be so frustrating if you hired a teammate and they refuse to do that kind of work.

Speaker 0

所以同样地,我认为我们正在朝着一个未来努力,在那个未来中,与你合作的智能体将更加通用。

And so similarly, I think it's really we're building towards a future where agents that you're working with are a little bit more generalized.

Speaker 0

举个例子,汉森提到了Operator和Deep Research,如果你想想,Operator有网页浏览器,Deep Research有另一种网页浏览器,Codex有终端,那么你的队友实际上拥有的工具和人类队友非常相似。

To reference what Hansen was saying about Operator and Deep Research: if you think about it, Operator has a web browser, Deep Research has a different flavor of web browser, and Codex has a terminal, so really your teammate has pretty similar tools to a human teammate.

Speaker 0

因此,我们的最终目标是,选择一些我们想要重点投入的特定用户群体,以实现快速突破。

And so like, the goal for us eventually is to, like, pick places where we wanna really invest in a specific audience to, like, make rapid progress.

Speaker 0

显然,我们在编码领域已经这么做了,比如通过Codex或GPT-4.1,我们为开发者群体专门设计了评估标准,然后打造了更优秀的模型。

So obviously, we're doing that with coding, with Codex or, like, GPT-4.1, where we generated specific evals for that audience and then made a better model for them, for developers.

Speaker 0

但随着时间推移,我们会把这些功能逐步通用化,让每个人都能轻松使用。

But then over time, like generalize these things into like simple things that everyone can use.

Speaker 0

所以我认为,对于我们来说,OpenAI 和 ChatGPT 这样的产品,其形态会与那些仅专注于编程的工具大不相同。

So I think, again, with us, with OpenAI and ChatGPT, I feel like that's a place where the products we build will look very different from something that's very specifically only for coding.

Speaker 1

你认为开发者与Codex交互的主要界面会是什么?

What do you think will be the primary UI that developers use to interact with Codex?

Speaker 1

你觉得会是 ChatGPT、命令行、IDE,还是全部都有?

Like, do you think it'll be ChatGPT, the CLI, the IDE, all of the above?

Speaker 3

是的,确实如此。

Yeah, does.

Speaker 3

我认为是以上所有方式的结合。

I think a mix of all the above.

Speaker 3

我们只是希望在开发者需要的时候,出现在他们所在的地方。

I think we just kinda wanna meet developers where they are in that moment.

Speaker 3

所以它甚至可能不在编辑器或终端里,而是在 Slack 上。

So it might not even be like in the editor or in the terminal, it might be like on Slack.

Speaker 3

有人给你发消息说:嘿,有个 bug,你直接说:去修一下。

Someone messages you like, hey, there's a bug and you're just like, Hey, go fix it.

Speaker 0

我来告诉你我心目中有趣的未来界面是什么样子,但这可能完全不是未来与智能体协作的方式。如果你是一个初创公司的创始人,未来你的团队只有你,或者你和几个联合创始人,再加上很多智能体,那场景看起来会像TikTok。

I'll give you my fun future UI, which is maybe not at all the future of working with agents: if you're a startup founder in the future and you have a team of just you, or you and a couple of co-founders, and many agents, it actually looks like TikTok.

Speaker 0

也许你会看到一个垂直信息流,每个视频都是由智能体生成的,内容可能是这样的:嘿,有客户提出了这个需求,我觉得我们应该修复它。

Maybe you have vertical feed and it's basically an agent has produced video that you can watch with an idea like, Hey, customer wrote in with this request, I think we should fix it.

Speaker 0

然后你向右滑动表示:好的,我们就修这个。

And then you swipe right to say, Yeah, let's fix this.

Speaker 0

就这么办。

Let's do this.

Speaker 0

你向左滑动表示:不,我们不搞。

You swipe left to say, No, we can

Speaker 1

Tinder还是TikTok?

Tinder or TikTok.

Speaker 0

抱歉,是两者的混合体。

Sorry, a hybrid.

Speaker 0

我没说这听起来会很有道理。

I didn't say this was gonna make a lot of sense.

Speaker 0

I

Speaker 1

我喜欢这个。

like it.

Speaker 0

然后你按住以提供反馈。

And then you press and hold to provide feedback.

Speaker 0

你会说,是的,去做吧,但确保字体是斜体。

You'd be like, yes, do it, but make sure the font is in italic.

Speaker 0

基本上,你有这些代理订阅了你公司或团队的信息,它们主动提出想法并执行,然后给你更新,而你只是在筛选正在完成的工作。

And so basically you have all these agents who are subscribed to information at your company or on your team, and they're proactively coming up with ideas and doing them and then giving you updates and you're kind of just curating the work that is being done.

Speaker 1

它们会给你展示一些预览,

And they show you little previews of what

Speaker 2

世界可能会是什么样子。

the world could look like.

Speaker 0

是的。

Yeah.

Speaker 0

当然,这其实是个半开玩笑的说法。

Obviously that's a half joke though.

Speaker 0

我认为,与智能代理协作将会成为一种趋势,而且人们肯定需要能够亲自去完成工作,并与代理配合。

I think that'll be like kind of the arms like working with agents and then there's definitely gonna be really important for people to be able to go do the work themselves and pair with agents in.

Speaker 1

我明白这是个半开玩笑的说法,但这个画面真的很棒,因为我认为大家在概念上都认同这种协作和审查代理所做更改的方式,将与我们今天的编程方式截然不同,但没人真正给我展示过这种场景可能是什么样子。

I get that it's a half joke, but it is like, it's a really cool visual because I think everyone agrees conceptually with this idea of collaborating and reviewing all the different changes an agent makes is gonna look very different from how we code today, but nobody's actually given me a visual of what that might look like.

Speaker 1

所以这个想法真的很棒。

So that's a really cool idea.

Speaker 2

我喜欢这个点子。

I love it.

Speaker 1

我们来个快速问答环节收尾吧?

Should we wrap with lightning round?

Speaker 0

好,来吧。

Let's do it.

Speaker 1

好的。

Okay.

Speaker 1

给AI爱好者的推荐读物或内容。

Recommended piece of content or reading for for AI fans.

Speaker 0

对我来说,这立刻就想到了。

For me, that's like immediate.

Speaker 0

就是伊恩·班克斯的《文化》系列。

It's like The Culture by Iain Banks.

Speaker 1

你读过吗?

Have you

Speaker 0

读过。

read it?

Speaker 2

是的。

Yes.

Speaker 2

太棒了。

It's amazing.

Speaker 0

是的。

Yeah.

Speaker 0

这是一部科幻系列小说,始于八十年代,它对人类和非人类种族在未来太空探索中的生活方式持一种不同寻常的积极看法。

It is a science fiction series, started being written in the eighties, and it is unusually positive in its view of how a future space faring human and non human race could kind of look.

Speaker 0

书中还深入探讨了当我们拥有通用人工智能时,生命的目的和意义究竟是什么。

And there's a lot of questioning about, like, what is the purpose and meaning of life when we have AGI.

Speaker 0

是的。

Yeah.

Speaker 3

对我来说,任何理查德·萨顿的作品都算。

I think for me, it's like anything by Richard Sutton.

Speaker 3

我认为那是我接触强化学习的入门读物。

I think that was, like, my introduction to reinforcement learning.

Speaker 3

在这里,我们几乎每天都会读《痛苦的教训》,这简直成了一个笑话。

And I think it's kind of a joke here that we read The Bitter Lesson, like, every single day.

Speaker 3

这可以说是OpenAI的一种哲学。

That that's like kind of the philosophy of OpenAI.

Speaker 3

比如即使是Codex,我们给它一个终端,它真的会直接使用POSIX工具。

Like I think, you know, even with Codex, like we give it a terminal and like it literally uses POSIX tools.

Speaker 3

这大概是与计算机互动最苦涩的方式了。

That's like the most like bitter lesson way of working with with the computer.

Speaker 2

你最喜欢的AI应用是什么?

And your favorite AI apps?

Speaker 3

肯定是ChatGPT。

Gotta be ChatGPT.

Speaker 3

是的。

Yeah.

Speaker 1

不是ChatGPT。

Not ChatGPT.

Speaker 1

别开玩笑了。

Come on.

Speaker 1

我们太

We're so

Speaker 0

无聊了。

boring.

Speaker 0

我的意思是

I mean

Speaker 2

要么是你们除了Codex之外发布的其他新功能,或者OpenAI之外的东西。

It could be either a new feature that you guys have released other than Codex, or something outside of OpenAI.

Speaker 0

好的。

Okay.

Speaker 0

所以我想我不太会,这挺有趣的。

So I guess I don't, it's funny.

Speaker 0

我其实不太会想到AI应用,但我喜欢生活变得更轻松的时候。

I don't really think of AI apps, but I do like it when my life gets easier.

Speaker 0

所以我喜欢的一些东西是,当你在使用AI时,但它却是无形的。

So some things that I like are like when you're using AI, but it's kind of invisible.

Speaker 0

所以我主要用的是各种产品。

So just I'm in products.

Speaker 0

我经常提交bug,比如Linear有一个非常优雅的集成:当你从Slack对话中提交bug时,它会直接根据Slack对话自动生成bug。

So I often like file bugs and like Linear has a really elegant integration where when you file a bug from a Slack conversation, it just generates the bug from the Slack conversation.

Speaker 0

但他们从不提到AI。

But they never say AI anywhere.

Speaker 0

你甚至根本注意不到它在使用AI。

Just like you actually kind of don't even notice that it's using AI.

Speaker 0

哦,等等,我想起我最喜欢的AI应用了,是Waymo。

Oh wait, I came up with an answer for favorite AI app: Waymo.

Speaker 2

啊,对。

Ah, there

Speaker 3

是的,我认为对我而言,Copilot确实每天都在持续提供价值。

Yeah, I think for me Copilot has definitely been the thing that keeps delivering value every day.

Speaker 1

机器人领域,看涨还是看跌?

Robotics, bullish, bearish?

Speaker 0

看涨?

Bullish?

Speaker 3

是的。

Yeah.

Speaker 2

你认为2025年会爆火的新应用或应用类别是什么?

Which new application or application category do you think will break out in 2025?

Speaker 1

除了编程之外?

Other than coding?

Speaker 3

是的。

Yeah.

Speaker 3

我的意思是,当你请伊萨和乔什来做客时,答案其实差不多,但2025年绝对是智能代理的年份。

I mean, I think when you had Isa and Josh on, it's kind of the same answer, but 2025 is definitely the year of agents.

Speaker 3

我认为我们会看到智能代理在许多领域兴起。

I think we're gonna see agents take off in a lot

Speaker 0

不同的领域。

of different categories.

Speaker 0

对。

Yep.

Speaker 0

我同意这一点。

I have to agree with that.

Speaker 1

你最期待哪种类型的智能代理?

What type of agents are you most excited about?

Speaker 3

除了编程类代理之外?

Aside from coding agents?

Speaker 3

是的。

Yeah.

Speaker 0

这是个好问题。

That's a good question.

Speaker 0

好吧,我的看法是,我知道这本该是快速问答,但我们对智能代理的理解是:你先有推理模型,然后给这些推理模型提供专业工具,再训练它们完成特定任务。

Well, so my take would be, I know this is meant to be rapid fire, but the way we think of agents is you have reasoning models, and then you give those reasoning models access to tools of the trade, and then you figure out how to train that agent to do the sort of specific function.

Speaker 0

所以这不仅仅是关于写作,还涉及新闻业;不仅仅是编程,还涉及软件工程。

So it's not just about writing, it's about journalism, it's not just about coding, it's about software engineering.

Speaker 0

这正是我们正在做的。

So that's kind of what we're doing.

Speaker 0

在我看来,我今年如此期待智能代理的原因是,现在OpenAI和其他公司都已经推出了自己的智能代理产品。

And in my mind, the reason I'm so excited by agents this year is because we now have a few agents shipped from OpenAI and other companies are shipping agents too.

Speaker 0

所以我们开始看到这种形态的轮廓,并开始识别出基本要素。

So we're starting to see what kind of the shape of this is and starting to identify the primitives.

Speaker 0

因此,我特别感兴趣的是,当我们把这些整合起来,你就能创造出一个不需要为每个功能单独配置的智能体——它是一个拥有浏览器和终端的智能体,能够完成多种任务,而无需你明确指定‘你是我的编程智能体’之类的功能。

And so specifically what I've been excited about is as we bring this together and you come up with an agent that you don't have to provision separately for every single function, but it's an agent with a computer that has a browser and has a terminal and it can do multiple things without you having to exactly specify like you are my coding agent or something.

Speaker 1

真的很酷。

Really cool.

Speaker 1

非常感谢你加入我们。

Thank you so much for joining us.

Speaker 1

祝贺你在 Codex 所取得的成就,也感谢你为我们预览了你对编程市场未来发展的看法,以及长期异步智能体体验的运作方式。

Congratulations on what you've built at Codex, and thank you for giving us a preview of how you think the coding market will evolve and also giving us a peek into how long running async agentic experiences will play out.

Speaker 1

非常感谢。

Really appreciate it.

Speaker 1

谢谢。

Thank you.

Speaker 0

非常感谢。

Thanks so much.

Speaker 0

谢谢你们邀请我们。

Thanks for having us.

Speaker 1

谢谢。

Thank you.
