Big Technology Podcast

Are 95% of Businesses Really Getting No Return on AI Investment? — With Aaron Levie

Episode Description

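Aaron Levie is the CEO of Box. Levie joins Big Technology Podcast to discuss the reports that most businesses are not getting a return on their AI investments. He shares his view of those reports, offers a rebuttal, and looks at what is actually happening on the ground. Stay tuned for the second half, where we separate hype from reality in the discussion of AI agents. This episode is a deep look, following the BoxWorks conference, at where AI is headed over the next few years. Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount on Big Technology on Substack + Discord? Get 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Questions or feedback? Write to: bigtechnologypodcast@gmail.com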

Transcript

Speaker 0

Why are the headlines telling us that businesses are getting no return on AI investment, and are AI agents finally ready to get to work? We'll cover it all with Box CEO Aaron Levie, right after this. Oktane is the premier identity event, bringing together the world's leading minds to discuss the future of secure access. Instead of consolidating security into a single platform, a modern identity security fabric is the key to unifying your defenses. At Oktane, you'll learn how to extend that fabric across all types of identities, including the emerging threat of AI agents.

Speaker 0

Join in person in Las Vegas in September, or catch the keynotes and sessions online. To register and see the full agenda, visit okta.com/oktane. That's okta.com/oktane.

Speaker 0

Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the tech world and beyond. Today, we're gonna talk about AI and its application in business, whether it's actually making a difference, and whether AI agents are a real thing. We have the perfect guest to do it, because we have Aaron Levie back with us, fresh off the BoxWorks AI event. Aaron, it's great to see you, as always.

Speaker 3

Thank you, Alex. Good to be here.

Speaker 0

So did I add "AI event" to BoxWorks, or is it just called BoxWorks?

Speaker 3

I like that you call it an AI event. It is just called BoxWorks, but anytime you wanna jam an AI in there, we're good.

Speaker 0

Okay. Sounds good. You had a lot of AI news. We'll get into that in a moment. But since you are talking with a lot of folks about AI applications in business, I wanna run this MIT study by you and get your perspective on what's real and what's not.

Speaker 0

So this is from Axios a couple weeks ago: "MIT study on AI profits rattles tech investors." Wall Street's biggest fear was validated by a recent MIT study indicating that 95% of organizations studied get zero return on their AI investment. They studied 300 public AI initiatives, trying to suss out the no-hype reality of AI's impact on business. 95% of organizations said they found zero return, despite enterprise investment of $30 billion to $40 billion into generative AI.

Speaker 0

This has been a study that everybody in the business world is talking about. Do you think there's any validity to it? You're already shaking your head.

Speaker 3

I'm shaking my head on, actually, like seven dimensions. We can parse each one.

Speaker 0

Let's do it.

Speaker 3

I mean, actually, maybe the first one, and maybe the funniest, is the Wall Street element. Wall Street is completely schizophrenic on this dimension. Obviously, a report like that scares them on one dimension, but there's an equal amount of frenetic Wall Street energy around the idea that AI will be so good that all of software is dead. So there's this very bipolar state of "where are we in AI adoption?" versus "AI is going to be so powerful that there won't even be software business models, because everything will just be delivered by AI." And as with most things that have these extreme polarization elements, I think the reality is just way more nuanced.

Speaker 3

We are still early in the adoption curve of AI. Early in the curve of all of these types of technologies, you have lots and lots of proofs of concept, you have lots of trials of different technologies, and people are trying to figure out which tool works for which use case. So by definition, you're kind of in the Wild West, where there are lots of attempts at trying these technologies with various vendors and technology stacks. And many of those projects and pilots will absolutely fail because, by definition, they're pilots. We're still in the early phases.

Speaker 3

One interesting thing about this study was that they saw a significant delta between companies that tried to effectively DIY their AI stack versus going with applied solutions and use cases. And this is what we tend to find in our customer base. I think there was maybe an initial theory of, well, AI will be relatively easy to get our arms around. We could build our own AI application. We'll do all of the vector embeddings of our data ourselves.

Speaker 3

We'll put it into a vector database. We'll manage the security and permissions of data access ourselves. And before you know it, a company that wanted to deploy AI in a particular workflow in their enterprise might have 10 or 15 different pieces of software that they have to run and manage before a single user could actually interact with AI within that organization. So that's probably an architecture that's not going to work. You need to have purpose-built solutions that solve tailored use cases.
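The do-it-yourself stack described here (embed the data, store the vectors, enforce permissions yourself) can be sketched in a few lines, which also hints at why it grows into many separate pieces at enterprise scale. This is only a minimal illustration under stated assumptions, not anything taken from Box or the MIT study: `Doc`, `TinyVectorStore`, `embed`, and `cosine` are hypothetical names, and the bag-of-words `embed` function is a stand-in for a real embedding model.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    text: str
    allowed_users: set[str] = field(default_factory=set)  # who may read this document


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Assumption: a real stack would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector database with per-document permissions."""

    def __init__(self) -> None:
        self._rows: list[tuple[Doc, Counter]] = []

    def add(self, doc: Doc) -> None:
        self._rows.append((doc, embed(doc.text)))

    def search(self, query: str, user: str, k: int = 3) -> list[str]:
        q = embed(query)
        # Permission check happens before ranking, so hidden documents never reach the model.
        visible = [(cosine(q, vec), doc.doc_id) for doc, vec in self._rows if user in doc.allowed_users]
        visible.sort(key=lambda pair: pair[0], reverse=True)
        return [doc_id for score, doc_id in visible[:k] if score > 0]
```

Even this toy version already has three concerns to maintain (embedding, storage, access control). A production version would also need ingestion, syncing, auditing, and a model-serving layer, which is roughly the pile-up of components the speaker is warning about.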

Speaker 3

Those can be very big use cases, like all of AI coding, but you probably don't wanna be in a position where you have to bootstrap this or build it all out yourselves. And that was one of the recognitions in the survey. But I obviously wholeheartedly disagree with any of the conclusions other than: you have to get your use cases right, you have to target the most effective areas for AI, and you probably shouldn't be building this technology yourselves. It's sort of empirical on our end.

Speaker 3

We get to talk to customers every single day that are seeing the immediate gains. We've talked to customers who have colleagues that can't present the actual expected ROI savings to their board, because the board won't believe the numbers based on how good they are. So they actually have to water them down so they're more pragmatic and believable based on what they're seeing.

Speaker 0

So isn't that a terrible board? I mean, the board can't hear the truth. Well, thank you.

Speaker 3

But the truth is so good that it doesn't sound credible. The ROI is so good that you aren't gonna be believed when you actually explain how this thing's going to work. So we're seeing examples all across the board, at least for our customers. We have the benefit of a very applied use case, which is that we take documents and unstructured data, and then we have AI agents that can operate on that data to do things like extract structured data from your documents. So give us 100,000 contracts, and we'll pull out the structured data fields in those contracts.

Speaker 3

Or give us invoices, and we'll pull out the key details in an invoice so we can help automate a workflow. Those use cases tend to be very high ROI because either you weren't getting that data before or it used to be very expensive to do so. And AI is getting increasingly good at being able to execute that kind of task. And so there's immediate benefit to customers. You can automate workflows much more easily as a result.
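As a rough illustration of what "pulling out the key details in an invoice" produces, the sketch below defines a small schema and fills it from raw text. Everything here is an assumption made for illustration: the field names, the `InvoiceFields` class, and the `extract_invoice_fields` function are hypothetical, and the regular expressions are a plain stand-in for the step where an AI model would be called. It does not describe how Box's agents work.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class InvoiceFields:
    invoice_number: Optional[str]
    total: Optional[float]
    due_date: Optional[str]  # ISO date string, e.g. "2025-10-01"


def extract_invoice_fields(text: str) -> InvoiceFields:
    """Regex stand-in for a model call; returns None for any field it cannot find."""
    number = re.search(r"Invoice\s*(?:No\.?|#)\s*:?\s*([A-Z0-9-]+)", text, re.I)
    total = re.search(r"Total\s*(?:Due)?\s*:?\s*\$?([\d,]+\.\d{2})", text, re.I)
    due = re.search(r"Due\s*Date\s*:?\s*(\d{4}-\d{2}-\d{2})", text, re.I)
    return InvoiceFields(
        invoice_number=number.group(1) if number else None,
        total=float(total.group(1).replace(",", "")) if total else None,
        due_date=due.group(1) if due else None,
    )


sample = "Invoice No: INV-1042\nDue Date: 2025-10-01\nTotal Due: $12,500.00"
fields = extract_invoice_fields(sample)
print(fields)
```

The point of the typed schema is that downstream workflow steps (approvals, payment scheduling) can rely on named fields and on an explicit `None` when a value is missing, instead of re-reading the document.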

Speaker 3

You can lower the cost of operations in some areas. So we tend to see a different set of outcomes based on the AI adoption within our customer base. But if you zoom out and think about all projects across the past couple of years, I do think you're going to get a mixed bag, just as a reality of how early we are in this space.

Speaker 0

Yeah. And it says internal builds fail at double the rate of external partnerships. So spot on there. People trying to piece this together on their own versus doing it externally are having a tough time, which sort of flies in the face of some of the conventional wisdom. I think the conventional wisdom was that you wanted to build internally, maybe with open source, so you could customize to your use case, but it turns out some of the off-the-shelf stuff is actually working quite well.

Speaker 3

Yeah. I think a lot of the challenge with either these types of surveys or even talking about architectures is that you have to separate the tech industry from the non-tech industry, the non-tech industry being the consumers of these types of technologies and the tech industry being the builders. So open source is insanely valuable, but not in the sense that a law firm should go off and build its own AI project using an open source model. That is just a recipe for disaster, if we think that every single company on the planet is gonna go build their own technology to automate their workflows.

Speaker 3

And that has actually been the case for a lot of pilots, because we've been early in the technology and you haven't had applied solutions that you could go deploy. But open source is actually extremely valuable for a company like Box, because we're powering technology for 120,000 customers, and so we actually do have the expertise internally to leverage those kinds of capabilities. So I would say the conclusion from the dimension of open source, as an example, is just that you probably shouldn't expect every company on the planet to DIY their own AI strategy; that's a recipe for not getting the returns and gains from AI adoption. And then maybe the final thing I'd point out is that there really is a decent amount of change management required to get real gains from AI.

Speaker 3

This is not a panacea type of solution where you could take an existing workflow, drop AI directly into it, and all of a sudden that workflow will be three x better. You usually do have to reengineer the work to take advantage of AI. And the conclusion I've come to more and more recently is this: I think we had this feeling maybe two or three years ago that AI was going to learn everything about how we work, adapt to our workflows, and then bring automation to them. And I think realistically, increasingly, we probably will have to modify our work, hopefully incrementally, but in some cases meaningfully, to fully take advantage of AI.

Speaker 3

And that sounds maybe hard on one hand, but for the companies that do that, the ROI is going to be fairly massive. If you think about AI coding as maybe the most obvious example right now where you're seeing productivity gains, the way that AI-first engineers tend to work is pretty different from how you engineered two or three years ago. The engineer really becomes more of a manager. You're deploying agents to go off and work on large parts of the code base, and then they come back with a bunch of work that you go and review. So you have to change your workflow as an engineer to take advantage of background agents: how you give them the right kinds of prompts to actually execute on their task, the new ways you should think about your code base, and how you handle the specifications and rules for what the AI agent should do.

Speaker 3

If you don't do all of that work, you're probably not gonna get a two x or five x gain from AI. And so we will actually have to reengineer some of our business processes to make agents effective, as opposed to thinking agents will just drop into our processes and automate everything that we're doing.

Speaker 0

By the way, you've brought up pilots a couple times, and I think it's important to talk about, because this study was not just pilots. It was 95% of organizations getting zero return on AI investment. The pilot thing is interesting because it's natural that pilots are gonna fail. And in fact, we've had some listeners give me feedback on this, because I talk often about how only 10 to 20% of AI pilots get out the door into production.

Speaker 0

And that might be a good number, because you're obviously gonna have some trial and error in the early days.

Speaker 3

Yeah. And to be clear, I'm using "pilots" colloquially, in the sense that we're just so early in the technology that when we talk to customers, a lot of times what they have deployed so far is the equivalent of a pilot, just because of literally how...

Speaker 0

Organization-wide.

Speaker 3

Yes. Well, organization-wide is hard for one centralized survey taker to represent. That's why, again, I don't wanna... like, the survey is great. It's an interesting conversation starter. But if you actually tried to go assess how the respondent is answering this question, what their way of measuring that productivity is, and whether they have actually surveyed all of the end users that are just using ChatGPT in an unsanctioned way and what they're doing, it's not possible to capture all of that.

Speaker 3

So it tends to represent more the centralized, and I think more likely pilot-oriented, types of projects, again because of how early we are. The word "agents" just came onto the scene less than a year ago. So we're just early in a lot of these spaces. But again, I think it's a fantastic survey because it gets a conversation going. If the takeaway was to slow down using AI, or to do anything other than realize what you should mitigate from a risk standpoint, then the problem would be that all it's gonna do is cause some companies to move even more slowly, and then you'll have other companies just outrun them.

Speaker 3

So the risk is now on the listener to decide what they wanna do about that survey.

Speaker 0

Yeah. And I can tell you one more thing that I found super interesting about this study, which has sort of been underappreciated. It says official LLM purchases cover only 40% of firms, yet 90% of employees use personal AI daily, at least among those surveyed. That's so interesting, because it means there's more personal use and more interest among individuals than among companies in getting this stuff into production.

Speaker 0

Yeah. You obviously have a reaction here, so let's hear it.

Speaker 3

Yeah. Well, I know. I just think that's, like, empirical revealed preference. So you don't even have to survey once you know that. Why are people going off and using AI in a personal productivity sense at that rate?

Speaker 3

It's because they're getting value from it. That is sort of now in the baseline of how people are working. It's unquestionable that if you just eliminated AI today, let's just say, you would notice: wow, okay, I actually have to go and do that three hours of research that I used to be able to kick off as a deep research project and check back in on after five minutes.

Speaker 3

And so it's empirical that we're choosing to use these technologies on a daily basis because they're adding that productivity. And I would argue that what we've seen with AI thus far is barely scratching the surface of what is going to start to happen as you start to deploy these technologies.

Speaker 0

But do you think the use in business could potentially be just individuals using, let's say, ChatGPT on their own, versus scaled enterprise use of large language models? Or do you think it will be some blend in the future? You're obviously watching this happen on the other side of things.

Speaker 3

No. I think that we are in the earliest phases of even the diffusion of the technology itself, of the basic use cases. Like, when you're gonna go research a customer, instead of just saying, okay, this person works at this company, they're interested in these things, and these are the trends of that industry, why not ask an AI system to generate the full account plan?

Speaker 3

That's super powerful, but also relatively basic if you think about how people work and the full scope of workflows that people do. One really interesting example of, again, how early we are: Claude this week announced a new capability that will generate files for you. And even though we're two and a half, nearly three years into the ChatGPT moment, it's the first time that an AI system can, I believe, reliably generate a high-quality document in the form of a Word document or a PowerPoint presentation. So we're nearly three years in, and it's the first time ever that you could generate something that you would look at and say, oh, that looks like a good presentation. We are only at the very, very beginning stages.

Speaker 3

Now imagine a technology like that begins to ripple through corporations; it'll still take a couple of years. And in the future, before you go and present whatever product you're selling to a customer, instead of spending one or two hours doing a bunch of research and making the PowerPoint file that's your presentation, you go to an AI agent and you say, "I'm about to go sell to this customer, generate this presentation for me." You kick that off and, again, three minutes later, it's done for you. This is going to show up in all of our workflows every single day, in almost everything that we're doing.

Speaker 3

So coders are getting the earliest lens into what the future looks like, because they're sort of wired to take advantage of these tools, and AI coding has been the first breakout use case. But that same dynamic, where you go to an interface, you talk to an agent, and it goes and executes multiple steps of work for you, will start to emerge within all of knowledge work over the coming years. I'm actually probably a pragmatist on this, in the sense that it will not be an instant, overnight transformation of work. It will take years of change management. We just hosted our conference this week, as you noted, and it happens to be a crowd that is, by definition, forward leaning and made up of early adopters of technology.

Speaker 3

But that represents a small fraction of the total economy. It will take years before, again, all of the banks, all of the pharma companies, all of the law firms start to get wired up in this AI-first way. Unequivocally, it's going to happen. And there's nothing that will slow that train down.

Speaker 0

All right. Let's talk a little bit more about this use case of using Claude to generate documents. The example that you gave was using one of these to go in and sell to a client. Now, I would imagine most organizations have their PowerPoint templates with the data baked in. So even if I were to go into Claude and upload my pricing spreadsheet, my inventory spreadsheet, and a document about positioning, and say "make a PowerPoint based off of this," I'm sure it would do a good job.

Speaker 0

But how practical is it to then say this is gonna be a way that people do their work, versus something that might look like a party trick, where you're gonna use the other documents that you already have when you actually go out into market?

Speaker 3

Oh, yeah. No. The way that this will actually show up, and I can't represent the exact date that this will happen, is that you'll just go to Box and you'll say, "Here's my sales presentation template. Here's the new client information. Please generate a PowerPoint presentation with that."

Speaker 3

And then you'll just do that with your existing data. This is not some kind of one-off, vibe-coded document. You will use your existing assets as the source material for the next document that you generate, and you'll go and review its work. That'll take you three minutes, but it will have saved you an hour or two of all of the time that it took to do the customer research, move around all the graphics, and put the relevant information in place. That will just be done for you. Multiply that over a million people that do that per day in some sector of the economy, and that's how you'll get tens of millions of hours of productivity gained within the economy.
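The back-of-the-envelope math in this answer can be checked with a short calculation. The sketch below is only an illustration under assumptions drawn from the figures mentioned in the conversation: one million people, a task that used to take one to two hours, and about three minutes of review. The function names are hypothetical, and treating the saving as net of review time is an assumption rather than something the speaker states.

```python
import math


def hours_saved_per_day(people: int, task_minutes: float, review_minutes: float) -> float:
    """Net hours saved per day if each person does the task once a day."""
    return people * (task_minutes - review_minutes) / 60


def days_to_reach(target_hours: float, hours_per_day: float) -> int:
    """Whole days needed to accumulate target_hours at a steady daily rate."""
    return math.ceil(target_hours / hours_per_day)


low = hours_saved_per_day(1_000_000, task_minutes=60, review_minutes=3)
high = hours_saved_per_day(1_000_000, task_minutes=120, review_minutes=3)

print(f"{low:,.0f} to {high:,.0f} hours saved per day")
print(days_to_reach(10_000_000, high), "to", days_to_reach(10_000_000, low), "days to pass 10 million hours")
```

Under these assumptions the net saving is roughly one to two million hours per day, so "tens of millions of hours" is what accumulates over a couple of working weeks rather than in a single day.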

Speaker 0

And how are you feeling about the trustworthiness of these models? Because you've talked a couple times now about how you could use deep research to prepare you for something, or you could use these models to generate a PowerPoint and then spend a couple minutes checking it over. Are you at the point now where you think the outputs of these models are trustworthy enough that that's all it takes?

Speaker 3

I think as long as, and this is where I get very excited, now that what's in the zeitgeist is context engineering, as long as you are really good about what context you're giving the AI, and how you are effectively grounding the AI in trustworthy data, with the right kinds of prompts and a high enough quality model, you can eradicate the vast majority, if not nearly all, of hallucinations or accuracy issues. So in our case, in everything that we do at Box, we think about your existing data as the source material for the AI agent. It's the source context for the AI agent to be effective. And so if I take an existing PowerPoint document that's our sales presentation and I say "modify this for a new customer," and you do that with a frontier model that is a reasoning model, with some degree of thinking mode, I would posit that 99% of the time it's going to make only infinitesimally small errors or failures on that.
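One way to picture the "grounding in trustworthy data" idea is as a prompt that carries the source documents along with the task. The sketch below is a minimal, assumed illustration of that pattern: `Source` and `build_grounded_prompt` are hypothetical names, the instruction wording is invented, and no model is called. It is not a description of Box's implementation.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    text: str


def build_grounded_prompt(task: str, sources: list[Source], max_chars: int = 2000) -> str:
    """Assemble a prompt that restricts the model to the supplied source documents."""
    if not sources:
        raise ValueError("grounding requires at least one source document")
    # Number each source so the model's output can be traced back during review.
    blocks = [f"[{i}] {s.name}\n{s.text[:max_chars]}" for i, s in enumerate(sources, start=1)]
    return "\n\n".join([
        "Use only the numbered sources below. If they do not contain a needed fact, say so instead of guessing.",
        *blocks,
        f"Task: {task}",
    ])


prompt = build_grounded_prompt(
    "Modify the sales presentation for the new customer.",
    [Source("sales_deck.pptx (text export)", "Slide 1: Product overview"),
     Source("customer_notes.txt", "Industry: logistics")],
)
print(prompt)
```

Numbering the sources is a design choice aimed at the review step the speaker describes: a person checking the output for a few minutes can see which document each claim should have come from.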

Speaker 3

That's just a solved problem at this point. And it is still easily worth the five-minute trade-off to go and review its work for the couple hours you save. And we actually have this incredible front row seat in watching what the future looks like with coding. So if you talk to the brand new startups, and I don't know if you do this, but I know that you get to spend your time with the [unclear] of the world and whatnot, go talk to a five-person startup that's brand new.

Speaker 3

And what's exciting is they are working in the craziest ways that I've ever seen in my entire life. I was talking to a nine-person startup the other day that estimates that they're, at a minimum, executing at the size of about a 100-person company. And that was, again, probably conservative when you do the underlying math. It's because each of their engineers now has the output capacity of 5 or 10 or 20 engineers' worth of work, but they are working in a completely different way. They're managers of AI agents.

Speaker 3

They spend their time writing really good specs for what they wanna build. They spend real time on the design and architecture of their software, and then they spend a lot of time reviewing the output of the agent. Not every area of knowledge work will look exactly like that. But imagine that in sales, in marketing, in legal work, your role is to manage agents that are doing a lot of the underlying data preparation, research, and creation work, and then your job is to go review that work and put it together in a broader business process. That will actually be what a lot of work looks like in the future.

Speaker 3

And this idea of hallucinations or errors will be no different from the fact that I sometimes have to review other people's work and other people review my work. I have errors in the presentations that I create that somebody catches: they see a misspelling, or they see that I changed the name of a customer in the wrong way, and they fix it. We will be doing that for AI agents. So it's this flip of the model, where we thought AI agents were going to review our work and incrementally make us more productive. We will be the reviewers of the AI agents' work.

Speaker 3

We will be the editors. We will be the managers. We'll be the orchestrators. And that's actually how you then get the productivity gains. So I'd say watch the AI coding space, watch what startups are doing to get leverage, and then think about that against the broader economy.

Speaker 0

亚伦,这很有趣。上次交谈时你提到有人单用AI编程工具创业。当时我正在写Anthropic达里奥的专题报道(其中引用了你的观点),后来我确实找到个类似案例——有开发者独自用Claude Code构建项目。现在连Anthropic都不得不设置速率限制了,显然这已成趋势。

You know, it's really interesting, Aaron, because the last time we spoke, you told me about this person that you knew who was basically building a company on their own using AI coding tools. And so I was in the process of writing this profile of Dario at Anthropic, which you're quoted in. And I went out and found a developer doing something quite similar, using Claude Code to build on their own. So this is clearly a thing that's happening, I mean, to the point where Anthropic now has to put some rate limits on.

Speaker 3

没错。虽然我喜欢MIT的调研报告,但最大的悲剧莫过于人们忽视你所说的现实案例,进而忽略这些实践对企业的影响——现在是时候重新设计适应AI代理的工作流程了。

Right. And this is the thing. Again, I love the MIT survey. I think it's great. It's a fun conversation topic. But the one travesty would be if people miss that what you just said is actually happening on the ground, and then don't start paying attention to what that's going to mean as it ripples through corporations, and how people should probably start to think about reengineering workflows for a world of AI agents.

Speaker 3

每次技术浪潮都如此:先行者读到Anthropic的报道会确认趋势真实存在,后进者看到MIT报告则自我安慰。有些企业会快速获得早期红利,有些可以等待。有时这意味着被颠覆,但像辉瑞或礼来这样拥有核心竞争力的公司,即便谨慎采用AI也完全没问题。

And this happens in every single technology wave, which is actually why you have early adopters and early innovators and why you have laggards. The early adopters and innovators are gonna read, you know, your Anthropic piece and see, oh, this actually is a real trend. And the laggards are gonna read the MIT piece and say, oh, I've been vindicated. And some companies will then get those early returns at a much faster rate, and other companies can wait. And sometimes that means that your company gets disrupted, and sometimes it doesn't, because you actually have some proprietary capability as an organization. If Pfizer or Eli Lilly took a little bit longer to adopt AI as a result of wanting to be more pragmatic, that'll be totally fine.

Speaker 3

它们不会被颠覆。它们拥有足够的市场地位和分销渠道,完全可以等待这项技术更加成熟。但如果我现在是一家初创公司,我可能会充分利用这一点作为优势,试图绕开那些规模更大的老牌企业。正是这种市场中的微妙张力,在每一波技术变革浪潮中催生了创造性破坏。

They're not gonna get disrupted. They have enough market position, they have enough distribution, they can afford to kind of wait for this technology to be more baked. But if I'm a startup right now, I'm probably gonna use that as my advantage as much as possible to try and run circles around maybe a larger incumbent. And this is what kind of creates this nice tension in the market that creates creative destruction in every wave of technological change.

Speaker 0

好的。我非常想多谈谈代理的定义是什么,以及你们在Box是如何推出它们的,同时也想听听你对GPT-5的看法。我们稍后就讨论这些。

Okay. I definitely wanna speak a little bit more about what the definition of an agent is and how you're rolling them out at Box, and also get your reaction to GPT-5. So let's do that right after this.

Speaker 0

欢迎回到《大科技》播客,本期嘉宾是Box公司CEO亚伦·莱维。亚伦,在我们深入讨论代理和GPT-5之前,让我先问一个基本问题:既然这种技术已经在商业领域应用——也就是让AI自主工作、从不同数据源提取信息并连贯呈现——为什么像亚马逊的Alexa Plus和苹果的Apple Intelligence这样的消费级公司,却难以将其整合成设备端或面向消费者的类似产品?他们都承诺过,但至今仍未完全实现。

And we're back here on Big Technology Podcast with Box CEO Aaron Levie. Aaron, before we get into agents and before we get into GPT-5, let me just start with a basic question. If this is already happening in business, which is basically that you're finding ways to get the AI to do work on its own, pull information from different data sources, and present it coherently, why do you think it's been so difficult for consumer companies like, let's say, Amazon with Alexa Plus and Apple with Apple Intelligence to put this together as something on-device or as a consumer product that does similar activities? Because they've all promised it, but it's not quite there yet.

Speaker 3

是的。我认为技术能够存在的事实与将其实现所需的执行要求是两回事。因此,我们都能前排目睹前沿模型的能力,而有些公司能将它们打包应用于具体场景。但如果你是一家拥有数千万甚至数亿用户产品的公司,消费者对产品有特定期待。那么从前沿模型到如何以可靠、可信且经济的方式交付给终端客户,这中间存在着巨大的执行鸿沟。

Yeah. I think the fact that the technology can exist is still different from the execution requirements to bring it to life. And so we all get to have a front-row seat on what the frontier models can do, and you have companies that can package those up for these applied use cases. But if you're a company with tens of millions or hundreds of millions of users of your product, and consumers that have a certain expectation, there is a lot of execution gap to close to go from the frontier model to delivering that to your end customer in a way that is reliable, trustworthy, and affordable.

Speaker 3

而且,我认为大公司都在经历各自版本的这一过程。考虑到这个领域发展如此迅速,我也能理解某种程度上可能存在的犹豫——今天某个模型领先,明天另一个模型登顶,后天又有新模型突破。因此,你可能希望最终确定的架构是可持续的长期方案。某种程度上,时间站在你这边,因为你会想观望哪些参与者被淘汰,哪些能持续前进。比如你提到的那些公司,我认为它们的领域尚未被彻底颠覆,一旦确定最终架构,它们仍有迎头赶上的机会。

And so I think that the bigger companies are all going through their own version of that motion. I'd also imagine that, given the space is moving so fast, I can sympathize with probably some degree of indecision, where one day a model is on top, and then the next day a different model is on top, and then another day another model kind of breaks through. And so by the time you land on a final architecture, you probably want that to be the sustainable long-term architecture. And so to some extent, time is on your side up to a point, because you might wanna wait to see who falls out and who keeps going. As an example, for the companies you just mentioned, I don't think those spaces have been so utterly disrupted that they can't catch up once they land on a final architecture.

Speaker 3

但具体还要看它们如何执行。

But we'll have to see how they execute through this.

Speaker 0

那么对企业而言,更多是存在更明确的使用场景。比如手机端,如果你试图获取主动通知,面对的是海量数据宇宙,而企业场景更集中?还是说区别在哪里?

And so for business, it's more that there are more prescribed use cases. And I think with a phone maybe if you're trying to get these proactive notifications, you're looking at a massive universe of data, whereas you're more concentrated in business? Or what's the difference?

Speaker 3

实际上,我不认为存在差异。即便在企业领域,我们也处于非常早期阶段。必须清醒认识到这一点。目前突破性进展体现在面向消费者的ChatGPT,以及深度联网工程师群体使用的编程助手。

Well, actually, I wouldn't say there's a difference. I would say even in business, we're insanely early. We have to process how early we are. The breakouts so far have been ChatGPT for consumers. The breakouts have been coding agents for very, very wired-in engineers that are very online.

Speaker 3

这些群体密切关注所有动态。然后是经济各领域的早期采用者。目前企业部署的智能体,多数可能只是快速演示性质。杰弗里·摩尔提出(或至少推广了)技术采纳曲线的概念,将公司或群体划分为多个类别。

They're paying attention to everything going on. And then early adopters across the economy. Most of the agents that are being deployed in the enterprise are being done by... maybe you can flash it up or something. Geoffrey Moore came up with this idea of the technology adoption curve, or at least popularized it. It has multiple categories of where a company or a group of individuals will be.

Speaker 3

有这些创新先驱和早期采用者,接着是跨越鸿沟阶段,之后是实用主义者与早期大众,最后是落后者。我们正处于某些应用场景跨越鸿沟的最早期阶段。但必须意识到这个鸿沟的存在——早期采用者(我们整天交流的那些人)会尝试一切。我们会试戴疯狂眼镜,往头上贴磁铁,做最疯狂的事。我们会戴上Google Glass。

You have these early innovators and early adopters, then you have a chasm, then you basically have pragmatists and the early majority, and then you have laggards. We are in the early adopter phase, kind of the earliest phase of jumping over the chasm on some use cases. But we have to imagine there's this chasm where what happens is the early adopters, the people that we all hang out with and talk to all day, they're gonna try everything. We're gonna try these crazy goggles, and we're gonna put, you know, magnets on our head, and we're gonna do the craziest things. We're gonna wear Google Glass.

Speaker 3

而这实际上几乎无法告诉你该事物是否能跨越鸿沟。你必须真正看到哪些能进入早期大众或那些大规模采用事物的实用主义者手中。因此,像ChatGPT、Cursor这类产品,以及下一代研究代理类工具如Perplexity,显然已在早期大众中取得突破。但就AI代理跨越鸿沟而言,我们仍处于非常早期的阶段。所以有些会成功,有些则不会。

And that actually tells you almost nothing about whether the thing will jump over the chasm. You have to actually see what makes it to the early majority, or those pragmatists that really adopt things at scale. And so the kinds of technologies that have clearly broken through are ChatGPT, products like Cursor, and, let's say, a bunch of these next-gen research agent type things. Perplexity has done well in that kind of early majority. But we are so early in terms of AI agents jumping over the chasm. So some won't make it, some will.

Speaker 3

但我要说的是,商业领域的进展并不比你刚才提到的例子快多少。只是我们能看到的例子很多,但它们通常属于早期采用者那一类。

But I would say that business is not particularly moving faster than the examples you just gave. I just think we can see lots of examples of it, but they're usually in that kind of early adopter type category.

Speaker 0

没错。就在我们谈话的这一周,Box公司正在发布多款不同的代理。是的。但让我先问一个问题来展开讨论:什么是代理?因为这个术语似乎被过度使用,甚至像我这样一直身处其中的人,也不确定这个词的确切含义。

Right. And so the week we're talking, you at Box are releasing a number of different agents. But let me start this discussion by just asking you: what is an agent? Because it does seem like it's an overused term, and even I, who am in this all the time, don't have a handle on what that word actually means.

Speaker 3

我认为我们应该预料到它会被完全滥用。它现在已成为描述为你工作的AI系统的新行话。所以这将成为我们行业未来使用的主要术语,并非因为它是流行词,而是因为它确实有用——它是一个可定义的、为你执行自动化工作的对象。

I think we should anticipate that it's fully overused. It is now the new term of art for talking to an AI system that is doing work for you. So this will be the main term that we use going forward as an industry. And not because it's a buzzword; it's actually a useful term. It's a definable object that is doing automated work for you.

Speaker 3

在某些情况下,这可能简单到只是回答问题。但我想科技界大多数人会认为,它应该完成某种程度的工作,并通过多次循环调用AI模型来实现。这包括像Claude代码、Cursor的代理或Replit的代理——当你给它一个任务(比如'建一个具备这些特性的网站'),它能在十分钟内完成人类数周的工作。这种代理管理整个流程,多次调用模型,跟踪进度并更新记忆,本质上就是代理。

That could be, in some cases, as simple as answering a question. But I think most people in the tech industry would generally argue that it should be doing some degree of work and looping through the AI model multiple times to do that work. And so that could be, very clearly, something like Claude Code, or Cursor has an agent, or Replit has an agent, where you give it a task like "build me a website that has these qualities" and it will go off and do, you know, weeks' worth of human work in ten minutes. And that's an agent that is managing that whole process, looping through the model multiple times, keeping track of what it's doing, and updating its memory in the process. And that's effectively an agent.
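
The loop described here (call the model repeatedly, track progress, update memory) can be sketched roughly as follows. This is only an illustrative toy, not how Claude Code, Cursor, or Replit are implemented: `run_agent`, `AgentState`, and `toy_model` are hypothetical names, and the "model" is a stand-in function rather than a real AI model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    """Hypothetical container for what the agent tracks while it works."""
    task: str
    memory: list[str] = field(default_factory=list)  # notes on steps taken so far
    done: bool = False


def run_agent(task: str,
              model: Callable[[str, list[str]], str],
              max_steps: int = 10) -> AgentState:
    """Loop through the model several times until it reports completion."""
    state = AgentState(task)
    for _ in range(max_steps):
        # Each pass sees the task plus everything remembered so far.
        action = model(state.task, state.memory)
        state.memory.append(action)  # update memory with the latest step
        if action.startswith("DONE"):
            state.done = True
            break
    return state


def toy_model(task: str, memory: list[str]) -> str:
    """Stand-in for a real model: walks a fixed plan, then signals completion."""
    plan = ["draft page layout", "write HTML", "review output"]
    if len(memory) < len(plan):
        return plan[len(memory)]
    return "DONE: " + task


if __name__ == "__main__":
    result = run_agent("build a website", toy_model)
    for note in result.memory:
        print(note)
```

The `max_steps` cap is one simple way to model the point made shortly afterward about how long an agent can run before a person has to intervene.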

Speaker 3

这就是编程领域的代理。我们将在法律、医疗、金融、教育等领域看到相同的代理架构涌现,部署代理为你完成工作。关键问题是:在需要人工干预调整方向前,代理能独立完成多少工作?目前可能只能持续几分钟,但已有案例显示代理能运行数十分钟甚至数小时,持续产出更优质的结果。这就是代理的运作方式,未来几年它们将无处不在。

So that's an agent in coding. And we're going to see that same kind of agent architecture emerge in law, in healthcare, in finance, in education, where you can deploy agents to go off and do work for you. And there will be, you know, a critical axis, which is how much work the agent can do before you have to intervene and modify and kind of repoint it in the right direction. And so a lot of that work right now can be maybe a couple minutes long, but we're seeing examples where agents could be running for tens of minutes or maybe even hours and effectively drive better and higher-quality output. So I think that's a way to think about agents, and these are gonna be very pervasive in the coming years.

Speaker 3

但2025年才是我们真正能严肃讨论它的第一年。我想安德烈·卡帕西说得对——我们不该认为这是'代理之年',而应视为'代理的十年'。这才是正确的思考方式。

But 2025 is really the first year where we could even be talking about it seriously. And I think Andrej Karpathy probably phrased it as: we shouldn't think about this as the year of agents. We should think about it as the decade of agents. That's probably the right way to think about it.

Speaker 0

某种程度上是移动化的。成为移动的十年,但最终我们开始使用移动设备。

This is sort of like mobile. It became the decade of mobile, but then eventually we started using mobile.

Speaker 3

是的。但是但是但是,当你刚才提到移动之年很重要时。对吧?人们是否说过,有些人说那是在2022年,但可能第一次真正实现是在2000年,抱歉,不是2022年。

Yeah. But again, as you just said, the year of mobile mattered, right? Some people said that was in 2022, but probably the first time it could have been realistic was... sorry, not 2022.

Speaker 3

2002年2月。有些人认为,但直到2006年和2007年iPhone出现时才真正现实。所以,你知道,我和其他许多人实际上相信我们已经有了我们的iPhone作为代理。我们不需要任何新的突破性架构。我们已经有了一种架构,可以作为代理的核心框架。

2002. Some people said that, but it wasn't really realistic until 2006 and 2007, when you had the iPhone. So, you know, I, and I think many other people, are actually convinced that we already have our iPhone for agents. We don't need any kind of new breakthrough architecture. We have an architecture that already kind of works as the core scaffolding for agents.

Speaker 3

所以我们现在可以开始这个十年的时钟。但这将是一个完全自动驾驶类型的问题。你知道,显然,Waymo在十年前或十五年前就开始了。而直到今年,它才在硅谷郊区可用。所以这花了十年或十五年?

So we can start the decade clock now. But it will be a full self-driving type of problem. You know, obviously, Waymo got kicked off, I don't know, a decade, decade and a half ago. And only this year is it accessible in suburban Silicon Valley. So what took a decade or a decade and a half?

Speaker 3

这只是大量的工程工作,大量的道路测试,以及系统准确性和智能性的每一个方面的改进。我们将在知识工作中看到同样的事情。这将需要多年时间。早期采用者将获得早期回报。实用主义者将在它基本无需大量手动干预时使用它,而每个人都会处于这个光谱的中间位置。

It was just lots of engineering work, lots of miles on the road, and lots of improving every single dimension of the accuracy and the intelligence of the system. We're going to see the same thing for knowledge work. It's going to take years. The early adopters will get the early returns. The pragmatists will use it once it sort of works without a lot of hand-holding, and everybody will land somewhere in the middle of that spectrum.

Speaker 0

好的。我看了你这周的部分演讲,你谈到的一些代理将帮助公司部署,例如,可以查看申请以参与,也许可以租出公寓,哦,是的。或者查看一些财产记录并在那里执行任务,或者创建报告,查看临床测试并尝试找出问题。所以谈谈创建这些代理的过程。这还处于演示阶段,还是已经实际可用?

Okay. And so I watched a chunk of your presentation this week, and some of the agents that you're talking about enabling companies to deploy will, for instance, take a look at an application to, say, rent an apartment, or look at some property records and then do tasks there, or create reports by looking at clinical tests and trying to pull out issues. So talk a little bit about how the process to create these works. And is this still in the demo phase, or is this actually real?

Speaker 3

所以也许先回答第二个问题。我们这周宣布了一些重大消息。我们宣布的一些产品和功能现在已经全面上市,所以客户可以立即开始使用。其中一些,我们提供了一些关于未来几个季度产品的展望。例如,我们现在有一个AI代理,任何客户都可以去使用,这是一个数据提取代理。

So maybe second question first. We made a number of big announcements this week. Some of the products and capabilities that we announced are fully GA right now, so customers can already start to use them. For some of it, we kinda gave a little bit of a crystal-ball view into the next couple of quarters of the product that we're getting out there. As an example, we have an AI agent right now that any customer can go and use, which is a data extraction agent.

Speaker 3

因此,您可以再次提供给我们合同、发票或医疗数据。然后我们有一个AI代理来处理这些内容,从这些文档中提取关键数据,并让您围绕这些数据自动化工作流程。我们在Boxworks上宣布的是一项名为Box Automate的新功能。Box Automate的理念在于,拥有能够帮助您审查文档、生成提案或根据数据为客户制定销售计划的独立代理非常非常强大。这极其强大。

So you can give us, again, contracts or invoices or medical data. And then we have an AI agent that works through that content, pulls out the critical data from those documents, and then lets you go and automate a workflow around that. What we announced at BoxWorks was a new capability called Box Automate. And the idea of Box Automate is this: it's very, very powerful to have one-off agents that can help you, you know, review a document or generate a proposal or generate a sales plan for a client based on data. That's super powerful.

Speaker 3

但更强大的是,我可以将许多这样的代理整合到一个完整的业务流程中。Box Automate让您能够在Box内实际定义您的业务流程。这可能是客户入职流程,可能是并购尽职审查流程,也可能是医疗保健患者的审查流程。

But what's even more powerful is that I can drop many of those agents into a full business process. So what Box Automate lets you do is actually define your business process within Box. It could be a client onboarding workflow. It could be an M&A due diligence review process. It could be a healthcare patient review process.

Speaker 3

您可以使用Box Automate定义该工作流程。它是一个拖放式的工作流构建器。然后,在流程的任何环节,您都可以引入AI代理来执行该流程中的工作。对于AI代理来说,非常重要的一点是它们需要正确的上下文才能有效工作。因此,我们的系统允许您从企业内容中为代理获取这些上下文。

And you define that workflow within Box Automate. It's a drag-and-drop kind of workflow builder. And then at any point in the process, you can bring in an AI agent to do work within that process. And one thing that is very important with AI agents is that they need the right context to be effective. So our system allows you to get that context to agents from your enterprise content.

Speaker 3

因此,您的营销资产、研究数据、合同、发票等成为代理的重要上下文。Box Automate让您基本上可以按需或在流程中即时构建这些代理,利用您现有的内容。然后,我们可以开始帮助您自动化企业中的一系列知识工作任务。

So your marketing assets, your research data, your contracts, your invoices, that becomes very important context for agents. So Box Automate lets you basically build these agents on demand or on the fly in workflow that leverages your existing content. And then we can start to help you automate a bunch of knowledge work tasks around the enterprise.

Speaker 0

现在,很多关于GPT-5的早期评论认为它某种程度上是为了做这类事情而构建的,或者作为这类工作的基础层。对吧?

Now, a lot of the early reviews of GPT-5 said it was sort of built to do these types of things, or to be a foundational layer for this type of work. Right?

Speaker 3

是的。

Yep.

Speaker 0

我们早期读到的评论是它只是执行任务,而且有人注意到,当您在ChatGPT中使用GPT-5时,几乎每次回答都会问‘我能为您做些什么吗?’。所以,实际上我很好奇,Aaron,您的回应是什么。我们上次交谈是在GPT-5发布之前。您对这些新模型的感受如何?实际上这是一系列模型。我很好奇,您如何看待早期有这么多人感到失望的事实。

The reviews we read early on were that it just does stuff, and there have been people that have noticed that when you're in ChatGPT using GPT-5, you literally can't have an answer where it doesn't say, "Can I do something for you?" So I'm actually curious, Aaron, what your response has been. The last time we spoke was pre-GPT-5. What has your feeling been about this new set of models? Really, it's a set of models. And I'm curious what you make of the fact that so many people were disappointed early on.

Speaker 3

嗯,是的。关于失望情绪或某种网络思潮,有趣的是这种思潮已经发生了相当大的转变,我认为很多人对GPT-5的看法已经更新。而且我觉得Codex最近在编码代理方面表现非常强劲。要知道,过去一年左右我们已经习惯了这些惊人的跳跃和突破。回想一下,我们从GPT-4升级到GPT-4o、o1、o3,再到GPT-4.1。

Well, yeah. So on the disappointment, or the kind of online zeitgeist, which interestingly has already shifted quite a bit, I think, a lot of folks have updated their views on GPT-5. And I think Codex has come out very strong recently on the agentic coding side. You know, I think we have gotten used to, and we've been hooked on, these incredible jumps and breakthroughs over the past year or so. If you think about it, we went from GPT-4 to GPT-4o to o1, o3, and then GPT-4.1.

Speaker 3

每一个版本在不同维度上都实现了相当显著的阶梯式进步。如果直接从GPT四跳到GPT五,看起来会像是指数级飞跃。但正是这些中间版本让我们提前预览了GPT五最终的模样——一个具备思维链的思考模型,拥有更高质量的编程能力和多项关键工作维度的能力。我认为这主要是因为我们获得了通往GPT五的许多渐进式突破,而GPT五正是这些突破的集大成者。

And each of those, on a different axis, was actually a pretty meaningful step function. So if you had just taken GPT-4 and then jumped to GPT-5, it would have looked insanely exponential. But we got these points along the way that effectively gave us an early preview into what GPT-5 would ultimately become, which is a thinking model with chain of thought, with a way higher quality of coding skills and a bunch of capabilities on critical dimensions of work. And so I think it was mostly just driven by the fact that we got lots of incremental steps, or step-function steps, on the path to GPT-5. And then GPT-5 was just the culmination of a lot of those breakthroughs.

Speaker 3

所以我认为这更多是心理感受而非实证结果。如果我们直接从三代跳到四代再到五代,那将是史上最陡峭的增长曲线。但实际上正是那些中间步骤可能引发了部分人的这种反应。在我们的测试中,我们会用各类企业数据(合同、财务文件、研究材料、内部备忘录等)对每个模型进行多维度评估。

So again, I think it's probably more psychological than kind of empirical. Like I think if we had gone from three to four to five, it would be the most vertical axis we've ever seen. But it was really again those steps along the way that maybe caused a little bit of that kind of reaction. In our world, we test every single model on a number of evaluations where we give the model different types of enterprise data, contracts, financial documents, research materials, internal memos, those types of things. And we ask the model a series of questions about that document or data.

Speaker 3

在我们的评估中,GPT五相比GPT四一确实有显著改进。在多项关键测试中都有多个提升点,这些改进直接转化为客户体验的提升。比如医疗机构使用GPT五处理非结构化医疗数据时,会获得更优质的结果;处理合同时同样如此。在需要专业分析的领域(医疗、法律、金融服务)我们都看到了进步。

And we saw meaningful improvements from GPT-5 versus GPT-4.1, as an example, on our evals. So for us it was multiple points of improvement on a number of our key tests. And those improvements then translate into real-life improvements for customers. It'll mean that when you're a healthcare provider using GPT-5 on unstructured healthcare data, you're going to get better results than you got before. Or when you're using it on your contracts, you're going to get better results. And so in a number of spaces where expert analysis was required, in healthcare or law or financial services, we saw improvements.
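
The kind of evaluation described here (give each model a document, ask a question, compare scores) can be sketched as below. Everything in this example is assumed for illustration: the cases, the scoring rule, and the two toy "models" are hypothetical and have no connection to Box's actual test suite or to any real GPT model.

```python
from typing import Callable

# Each case is (document text, question, expected answer). Purely illustrative data.
EvalCase = tuple[str, str, str]

CASES: list[EvalCase] = [
    ("Invoice total: 120 USD", "What is the total?", "120 USD"),
    ("Contract term: 24 months", "What is the term?", "24 months"),
    ("Patient age: 57", "What is the age?", "57"),
    ("Memo author: Dana", "Who is the author?", "Dana"),
]


def accuracy(model: Callable[[str, str], str], cases: list[EvalCase]) -> float:
    """Percentage of cases where the model's answer matches exactly (case-insensitive)."""
    correct = sum(
        1
        for doc, question, expected in cases
        if model(doc, question).strip().lower() == expected.strip().lower()
    )
    return 100.0 * correct / len(cases)


def older_model(doc: str, question: str) -> str:
    """Toy stand-in: returns only the last word, so multi-word answers are cut short."""
    return doc.split()[-1]


def newer_model(doc: str, question: str) -> str:
    """Toy stand-in: returns everything after the field label."""
    return doc.split(": ", 1)[1]


if __name__ == "__main__":
    old_score = accuracy(older_model, CASES)
    new_score = accuracy(newer_model, CASES)
    print(f"older: {old_score:.1f}%  newer: {new_score:.1f}%  "
          f"gain: {new_score - old_score:.1f} points")
```

Exact-match scoring is the simplest possible choice; a real evaluation of free-text answers would likely need a more forgiving comparison.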

Speaker 3

更广义地说,在需要逻辑推理或数学运算的领域,这些维度也都有所提升。

Or, in a more general sense, if you needed logic or reasoning or math, it was an improvement on those dimensions as well.

Speaker 0

能请你快速谈谈对当前AI行业经济状况的直观看法吗?我们上周五节目和Ranjan讨论过,OpenAI到2029年的累计亏损将达到1150亿美元。哦抱歉,是现金消耗。比之前预期高出800亿,今年预计收入100亿,却刚签了3000亿美元的甲骨文协议——这笔交易几乎一夜之间让甲骨文成为近万亿美元公司,使拉里·埃里森超越埃隆·马斯克成为世界首富。这怎么说得通呢?

Can I get a quick gut check from you on the economics of the AI industry right now? I mean, we are talking at a moment where (we just talked about this on the Friday show with Ranjan) OpenAI's losses are now gonna total $115 billion through 2029. Oh, sorry. It's cash burn: $115 billion through 2029, $80 billion higher than it previously expected.

Speaker 0

预期今年能赚100亿,却刚签署了3000亿美元的甲骨文协议,这笔交易几乎让甲骨文一夜成为近万亿美元市值的公司,还让拉里·埃里森超越埃隆·马斯克成为世界首富。这...这合理吗?

It's expected to make, like, $10 billion this year, but it just signed a $300 billion deal with Oracle that, like, turned Oracle into a nearly $1 trillion company almost overnight and made Larry Ellison the richest person in the world, above Elon Musk. How does this make sense?

Speaker 3

我认为,如果你和我以及其他人——比如Jensen、显然还有Sam,甚至Elon——持有相同信念的话,这完全说得通。我相信这是我们可能接触过的最大规模的技术。试想这是第三次工业革命,我们有史以来首次能将自动化引入知识工作领域。稍微思考一下:我们正在将自动化引入知识工作。过去,知识工作的世界始终受限于人类的工作速度。

Well, I think it makes sense if you believe like I do and certainly others, Jensen, clearly Sam, even Elon, I think would believe that this is the single biggest technology that we've probably ever had access to. So if you think about this as sort of a third industrial revolution where for the first time ever, we can bring automation to knowledge work. Just think about that for a second. We're bringing automation to knowledge work. Everything about the world of knowledge work was always basically limited by how fast we as humans could work.

Speaker 3

我们可以向计算机输入数据,系统处理这些数据,他人读取后推动流程运转。知识工作的速度取决于我们打字或阅读信息的速度,以及基于这些数据在现实世界中采取行动的速度。这就是知识工作的节奏。无论是医疗专家解读诊断报告、生命科学专家进行临床研究、律师梳理案件事实或处理知识产权,还是工程师编写代码和阅读产品规格——所有这些工作始终受限于个人独立完成的速度。

We could type into a computer, put data into a system, somebody else reads that data, and it moves along in some kind of process. The speed of knowledge work was about how quickly we could type or read information and then do something in the real world with that data. That was the pace that knowledge work could happen at. And so take every field that we know of in knowledge work: healthcare experts reading medical diagnoses, life sciences experts that are doing research on clinical studies, lawyers that are trying to find facts about a case or working through intellectual property, an engineer trying to generate code and read product specifications. All of that work has always been constrained by how fast we as individuals can do that work ourselves.

Speaker 3

借助AI,我们首次能够将自动化应用于几乎所有这类工作。这种自动化程度可以根据我们投入的计算能力进行调节,当然还取决于数据质量及系统向AI传递数据的效率。在一个通过调节计算能力就能以远低于人工成本获得不同自动化水平和产出的世界里,这堪称后工业时代经济领域最重大的突破。即便为此投入1000亿美元达到技术普及,考虑到医疗、法律、生命科学、金融服务和工程等全行业的规模,这个数字其实微不足道。

For the first time ever, with AI, we can bring automation to effectively all of that work. And that automation can kind of be tuned based on just how much compute we throw at the problem. And then, of course, on how good our data is and how effective our systems are in getting that data to the AI. But in a world where you can toggle compute and then get different levels of automation and effective output in work, done at a way lower cost than what people can do, that is the biggest breakthrough we've ever had in the economy and in the post-industrial world. And so $100 billion of loss, let's say, to get to that point of saturation where that technology is out there is actually a very small number when you think about the size of the economy for all of healthcare, all of law, all of life sciences, all of financial services, all of engineering.

Speaker 3

我认为这就是科技公司对此的评估逻辑。需要明确的是,这些亏损是主动选择的结果——这一点非常明显。他们出于战略考量而选择承受这些损失。

So I think that's how these technology companies are underwriting this. And the losses are a choice, to be clear. That's very obvious. They're choosing to lose that money. They're doing it for a strategic reason.

Speaker 3

这至少是他们的决策考量。战略意义在于:这个市场极具统治价值,他们宁愿先扩建基础设施,甚至通过免费用户层(比如ChatGPT的免费版)补贴使用率,也不愿立即按当前成本全面收费来确保盈利。这是主动选择——他们本可选择全面收费,但会牺牲当下的市场渗透率。

You know, that's at least their decision. The strategic reason is that this is such a valuable market to own and to dominate that they would rather build up capacity and, in many cases, subsidize usage, let's say in free consumer tiers of ChatGPT, than charge for everything at today's rate of cost and make sure everything is profitable. That's a choice. They could decide to charge for everything. They would get less adoption today.

Speaker 3

若选择全面收费,他们立即就能成为更可持续的企业。但足够多的人相信这个奖池足够大,值得投入所有研发费用、数据中心开支以及必要的补贴来推动应用需求。这是一场豪赌——显然,那些极其聪明、经济理性的公司、个人和主权财富基金都认为值得下注。我也倾向于认为值得,毕竟这项技术可能带来的经济影响实在重大。

They would instantly be a more sustainable business. But enough people believe that the prize is big enough that it's worth actually doing all of the research expenses, all of the data center expenses, and the subsidization where necessary to drive that adoption and demand. And it's a go-big-or-go-home type of bet. You know, clearly very, very smart, very economically rational firms, individuals, and sovereign wealth funds believe that that bet is worth it. I'm probably on the side that the bet is worth it because of, again, how material an economic impact this technology can have.

Speaker 3

至于具体某个行业参与者最终表现如何,我们自然会见证其发展轨迹。

And then, obviously, we'll see how it plays out with any individual player in the space.

Speaker 0

各位,您可以在box.com上了解更多关于Box的产品信息。首页正在播放一段视频,详细介绍了今天我和亚伦讨论的许多内容。亚伦,见到你真是太棒了。再次感谢你参加我们的节目。

Folks, you can learn more about Box's offerings at box.com. There's a video playing on the home page right now that talks a lot more about things that Aaron and I have discussed here today. Aaron, so great to see you. Thanks again for coming on the show.

Speaker 3

谢谢,亚历克斯。

Thanks, Alex.

Speaker 0

好了,各位。非常感谢大家的观看。我们下次在《大科技播客》中再见。

Alright, everybody. Thank you so much for watching. We'll see you next time on Big Technology Podcast.
