a16z Podcast - 为开发者和AI智能体构建API 封面

为开发者和AI智能体构建API

Building APIs for Developers and AI Agents

本集简介

Stainless创始人Alex Rattray与a16z合伙人Jennifer Li共同探讨API、SDK的未来以及MCP(模型上下文协议)的兴起。基于他在Stripe的经验——曾协助重新设计API文档并构建代码生成系统——Alex阐释了为何SDK对大多数开发者而言就是API本身,以及为何高质量、符合语言习惯的库不仅对人类至关重要,如今对AI代理同样如此。

他们深入探讨了以下话题:
- SDK生成的演进历程及在Stripe大规模构建的经验教训
- MCP如何将API重塑为大型语言模型的接口
- 为开发者和AI代理同时设计工具与文档的挑战
- 上下文限制、动态工具生成及文档如何塑造代理可用性
- 在"每家公司都是API公司"时代开发者平台的未来

时间戳:
0:00 – 引言:API作为互联网的树突
1:49 – 构建API平台:来自Stripe的经验
3:03 – SDK:开发者的接口
6:16 – MCP模型:面向AI代理的API
9:23 – 为LLM和AI用户设计
13:08 – 解决上下文窗口挑战
16:57 – 强类型SDK的重要性
21:07 – API与代理体验的未来
24:45 – 领先API公司的经验分享
26:14 – 结束语与免责声明

资源:
- 在X上关注Alex:https://x.com/rattrayalex
- 在X上关注Jennifer:https://x.com/JenniferHli
- 保持更新:告诉我们您的想法:https://ratethispodcast.com/a16z
- 在Twitter关注a16z:https://twitter.com/a16z
- 在LinkedIn关注a16z:https://www.linkedin.com/company/a16z
- 在您喜爱的播客应用订阅:https://a16z.simplecast.com/
- 关注主持人:https://x.com/eriktorenberg

请注意,此处内容仅用于信息目的;不应被视为法律、商业、税务或投资建议,也不应用于评估任何投资或证券;且不针对任何a16z基金的现有或潜在投资者。a16z及其关联公司可能持有讨论企业的投资。更多详情请参见a16z.com/disclosures。

双语字幕

仅展示文本字幕,不包含中文音频;想边听边看,请使用 Bayt 播客 App。

Speaker 0

所以我喜欢把API比作互联网的树突。想象一下,大脑是由神经元组成的,通过协同放电实现思考;同样,互联网可以看作是由服务器或运行在服务器上的程序组成的集合,它们通过相互交互来思考或行动。而这种连接——突触,几乎不言自明地,通常就是某种形式的API。要意识到,每家公司都在变成API公司。而每家API公司都需要一个包含SDK、文档和版本控制的API平台。

So I love to say that APIs are the dendrites of the Internet. So if you think about the brain being a collection of neurons that by firing together, you know, think, you can think of the Internet as a collection of servers or programs running on servers that by interacting with each other, think or do. And that connection, the synapse, that's almost tautologically, almost always an API of some kind. Realize that every company is becoming an API company. And every API company needs an API platform with SDKs and docs and API versioning.

Speaker 0

这正是我在Stripe与杰出团队共同协作的无数事项之一。人们希望能专注于核心能力,谨慎地向用户暴露接口,而不必操心底层细节。

And just, you know, a million different things that amazing team that I worked on at Stripe all collaborated on. People want to be able to focus on their core capabilities, be thoughtful in exposing the interfaces to users, and not have to worry about the low level details.

Speaker 1

如果你一直在使用API开发,就会知道它们是现代软件的连接组织,驱动着从支付到云服务的所有功能。但随着API的激增,开发者面临的挑战也在增加——SDK质量、文档、版本控制,以及一个全新领域:不仅面向人类,还要面向AI代理的接口。本期节目中,Stainless创始人Alex与a16z合伙人Jennifer Li对话,分享他在Stripe的经历如何塑造了Stainless的使命:为每家API公司带来世界级的开发者平台、SDK和文档。他们探讨了为何SDK常被视为开发者的API,MCP(模型上下文协议)如何将API重构为大语言模型的接口,以及将AI代理视为一级用户带来的新设计挑战。

If you've been building with APIs, you know they're the connective tissue of modern software, powering everything from payments to cloud services. But as APIs multiply, so do the challenges developers experience: SDK quality, documentation, versioning, and now a new frontier, interfaces not just for humans, but for AI agents. In this episode, Stainless founder Alex joins a16z partner Jennifer Li to share how his time at Stripe shaped Stainless's mission to bring world-class developer platforms, SDKs, docs, and more to every API company. They discuss why the SDK is often the API for developers, how MCP, the Model Context Protocol, reframes APIs as interfaces for large language models, and the new design challenges of treating AI agents as first-class users.

Speaker 1

这是一场关于API未来、代理体验的对话,探讨如何构建健壮、符合语言习惯且具备扩展性的开发者平台。让我们开始吧。

It's a conversation about the future of APIs, agent experience, and what it means to build developer platforms that are robust, idiomatic, and built to scale. Let's get into it.

Speaker 0

我在Stripe时开始从事API工具开发。当时在其他团队工作,亲眼见证了API平台的重要性,于是转岗到该团队,主导了Stripe API文档的重设计——这段经历令人难忘。通过这个过程,我深刻认识到SDK的关键作用。甚至在Stripe内部,人们可能都没完全意识到:对于每天使用Stripe API的开发者而言,SDK就是他们接触的API。

So I started working on API tooling when I was at Stripe. I had been working on other teams, and I just saw how important the API platform was. And so I moved over to that team and built out a redesign of the Stripe API docs, which was an incredible experience. And through that, saw how important the SDKs were. Even internally at Stripe, people, I think, didn't quite see that to the developers who who use the Stripe API every day, the SDK is the API.

Speaker 0

当时那些(开发工具)有些落后了。所以我构建了一个代码生成系统,它能接收Stripe的开放API规范,并将其转化为优质的Python库、TypeScript库等等。构建这个过程非常有趣。作为一个计算机科学家,我在构建编译器相关功能和库设计的细节时感到无比兴奋。离开Stripe后,人们不断问我:'怎样才能获得优秀的SDK?如何为我的API打造出色的开发者体验?'

And those were kind of falling behind. So I built a code generation system there that can take the Stripe OpenAPI spec and turn it into a great Python library and a great TypeScript library and so on. And that was really fun to build. You know, the computer scientist in me just had a phenomenal time building sort of the compiler aspects of that and the library design nitty-gritties. And so after I left Stripe, people kind of kept asking, hey, you know, how do I get good SDKs, or how do I build a great developer experience for my API?

Speaker 0

这其实让我很痛苦,因为当你说'这是个好问题,你们确实需要SDK'时——它已经是开发者体验的核心基础部分了。任何使用你API的人都会期待获得他们所用语言的类型完善SDK。但遗憾的是,这个好问题的糟糕答案是:目前根本没办法实现。

And it was very painful for me, actually, because, you know, you're saying, hey, that's a great question. You really need SDKs. It's a core baseline part of the developer experience at this point. Everybody interacting with your API is gonna expect a well typed SDK in their language. Unfortunately, the bad answer to that great question is that there's no way to do it.

Speaker 0

要知道当时开源的OpenAPI生成器,其输出质量远达不到Stripe的标准。虽然后来投入了大量改进工作,但这本身就是个难题。维护成本、配置过程中的各种麻烦,对于除了像Stripe这样规模的公司(我在那里工作时有2000人)之外,几乎任何企业都难以承受。作为工程师,告诉人们'你们需要这个但就是得不到'实在太令人沮丧了。所以后来我开始问:'如果我为你们构建这个,你们愿意付钱吗?'

You know, the open-source OpenAPI generator at the time was pretty far from the Stripe quality bar in terms of the output it would generate. A lot of great work has gone into it, but it's a very hard problem. The maintenance burden, the headaches involved in trying to set this up, it's just too much for almost any company other than, you know, something like a Stripe at 2,000 people, which was when I worked on that. And so it's very dissatisfying as an engineer to tell people a bad answer, like, you need this thing and also you can't have it. And so eventually, I started saying, hey, would you pay real money for this if I built it for you?

Speaker 0

人们不仅回答'愿意',还问'能否帮我们解决API相关的所有其他问题?'我意识到每家公司都在变成API公司,而每家API公司都需要包含SDK、文档、API版本控制等功能的平台——就像我在Stripe时那个杰出团队共同构建的整套系统。于是我想:'能否把Stripe开发者平台打包成标准化产品,提供给所有企业?'

People both said, hey, yes, I would. And also, can you help us with everything else around our API? And realize that every company is becoming an API company. And every API company needs an API platform with SDKs and docs and API versioning, and just, you know, a million different things that amazing team that I worked on at Stripe all collaborated on. And so, you know, can we bring that Stripe developer platform in a box and bring it to every company?

Speaker 0

这就是我最终创立Stainless API公司的缘起。

And so that's sort of the way that I came to to start Stainless API as a company.

Speaker 2

你之前对2022年的预测或断言如今确实应验了。每家公司都成了API公司,我们也将深入探讨MCP。能否解释一下你们产品中SDK生成部分的设计思路?当然,Stripe的经验对构建这款产品至关重要,因为直到今天,它仍是开发者心目中卓越开发者体验和优质SDK的标杆之一。

That prediction you had, or that statement you had, for 2022 definitely turned out to be true today. Every company is an API company, and we're gonna dive into MCPs too. Can you explain sort of how you approached the SDK generation aspect of the product? And the Stripe experience, of course, is very critical to building this product, because to this day, it's still one of the north stars of what developers look to for great developer experiences and great SDKs.

Speaker 0

是的。我们在Stripe积累了丰富经验,见证过优劣参半的各种案例。要处理好无数细节——无论是打造真正健壮的客户端库,确保当网络异常时SDK不会崩溃或引发问题,还是保证其可扩展性。

Yeah. We had a lot of experience with this at Stripe and, you know, saw a lot of the good and a lot of the bad and a lot of the ugly. There's so many details to get right, whether it's making a client library that's really robust, so that when things get weird on the Internet, your SDK is not the thing that's blowing up or causing problems, and it can handle scale.

Speaker 0

它需要能处理清晰的错误信息,与遥测系统良好协作等。除了健壮性,SDK还需高度精致。这意味着为Python开发者打造的工具要符合他们的语言习惯——比如悬停时能在编辑器中弹出文档说明,无论这些说明来自REST API还是库核心。打磨这些细节本身就是段迷人旅程,也是我个人最热爱的工作。

It can handle great error messages and can work well with telemetry and all this stuff. In addition to being robust, an SDK also needs to be really polished. And that means if you're building something for a Python developer that it's idiomatic to them, it's familiar. If they hover over on a thing, there is documentation that pops up in their editor telling them about that thing, whether it's coming from the REST API or from the core of the library. And crafting that is really quite a journey, you know, and it's something that I just personally love.

Speaker 0

多年前我创造过一门小型编程语言,整个过程就是不断打磨细节,寻找每个微小功能的最佳实现方式。这也是我在Stainless最享受的部分。有趣的是,当我们讨论MCP时,我们打造的Python库就是绝佳范例——毕竟Python是我学的第一门语言(无意冒犯Visual Basic)。

Years ago, I created a small programming language, and it was all about kind of polishing those nitty gritties and finding the best way to do every little detail. And so that's one of the things that I most enjoy at Stainless. There's an interesting thing here where, you know, we're gonna be talking about MCP. And crafting, I think, the Python library that we do is a great example of this because Python was the first programming language that I learned. No offense, Visual Basic.

Speaker 0

当我着手开发Python生成器时,发现这门语言已与我早年使用时大不相同——现在有了类型系统(还不止一套)、异步支持、模式匹配等,演变程度令人惊叹。

It had evolved a lot since I had last used it when I was sitting down to create our Python generator. There's types in Python, and there's multiple type systems. There is async. There's, you know, pattern matching. It's really incredible to see.

Speaker 0

而什么才是符合Python世界习惯的写法?这是个移动靶标。业界花了数年才摸索出如何打造地道的Python库,让Python开发者觉得直观可读。这个标准仍在进化,而我们推动了这个进程,为此感到自豪。来到MCP这个新领域——不同于向编写Python软件的开发者暴露API,通过MCP客户端你可以将API暴露给LLM智能体。

And what's idiomatic through all of this world? And that's a moving target. I think the industry spent years basically figuring out how to make an idiomatic Python library, figuring out what it looks like for a Python library to be intuitive and legible to a Pythonista. And that's continued to evolve, and I think that we've moved the industry forward a little bit, and I'm proud of the work that we've done there. When you come to MCP, which is this kind of new frontier, where instead of exposing your API to a Python developer writing Python software, one of the things that you can do with MCP is expose your API to an LLM agent using an MCP client.

Speaker 0

在这个新世界里,什么对LLM来说是直观的?当AI与庞大API交互时,怎样的设计才算符合语言模型的使用习惯和人体工学?这方面几乎是从零开始的探索,有许多引人入胜的工作等待我们去完成。对我们而言,这就像个充满乐趣的前沿研究课题。

In that world, what's intuitive to an LLM? What is usable and ergonomic and idiomatic to a large language model, to an AI, in interacting with a large API? And the world is starting from scratch there almost. So there's a lot of interesting work to do and develop and figure out. So for us, it's kind of this really fun exploratory research problem.

Speaker 2

或许我们可以拉远视角。你曾将API比作互联网的树突。如今MCP无处不在,占据了像Stainless这类开发者工具公司的大量脑力带宽,你会如何定义MCP?它如何影响了你们的业务?你如何看待这个领域的发展格局?

Maybe zoom out a little bit. You called APIs the dendrites of the Internet. Now that MCPs are everywhere, like, you know, occupying a lot of the brain bandwidth for developer tooling companies like Stainless, what do you call MCPs? And how has that impacted your business? How do you think about the landscape?

Speaker 0

没错,我常说API是互联网的树突。如果把大脑看作通过神经元放电来思考的集合体,那么互联网就是服务器上运行程序的集合体,它们通过相互交互来'思考'或'行动'。而连接这些'神经元'的突触,几乎总是某种API——这是互联网软件生态的关键组成部分。

Yeah. So I love to say that APIs are the dendrites of the Internet. So if you think about the brain being a collection of neurons that by firing together, you know, think, you can think of the Internet as a collection of servers or programs running on servers that by interacting with each other, sort of think or do. And that connection, the synapse, that's almost tautologically, almost always an API of some kind. So it's a really critical part of the Internet software ecosystem.

Speaker 0

看着这种'CPU大脑'扩展到'GPU大脑'真是激动人心的时刻。而你需要合适的接口——MCP就扮演了这个角色。虽然暂时想不出贴切的生物学比喻,或许可以类比为眼睛、耳朵或某种新感官。

And seeing that kind of like CPU brain, you know, extend into the GPU brain is a really exciting moment, I think. And you need the right interface. So MCP kind of is that interface. I don't know if I have a biological metaphor off the top of my head for that. I kinda wanna compare it to eyes or ears or a new sense.

Speaker 0

当一个软件系统内部进行通信时,好吧,很酷,但从某种意义上说,这一切都在你的脑海中。它都在云端。在现实世界中它并没有做任何事情。那么,科技行业过去三十多年乃至更久以来构建的所有软件,是如何在现实世界中变得真实并做出有意义的事情的呢?其中一个长期存在的方式,我称之为通过眼睛,以用户界面的形式实现。

When a software system is communicating internally, okay, cool, but it's all in your head in some sense. It's all in the cloud. It's not doing anything in the real world. And how does, you know, all the software that the technology industry has been building for the last, you know, thirty and beyond years, how does that kind of become real in the real world and do something meaningful? And one of the long standing ways of doing that is sort of, I'm gonna call it through the eyes, in the form of user interfaces.

Speaker 0

比如有人登录一个网络应用,我以前在Stripe工作,所以经常用这个例子。小企业主可能登录他们的Stripe仪表板,给客户退款、创建新付款、将客户添加到系统等等。在API出现之前,一切都是这样操作的。后来我们发现需要自动化处理。也许不再是眼睛,现在我们要用手指了,对吧?

So you'll have someone log in to a web application, you know, I used to work at Stripe and so I use this example a lot. Someone at a small business might log in to their Stripe dashboard, refund a customer, create a new payment, add a customer to the system, anything like that. In the days before APIs, everything was this way. You know, then we saw that we needed to automate things. You know, maybe instead of eyes, I don't know, now we're going with fingers, right?

Speaker 0

在你的某个部分,你的思维与神经元之间存在沟通,与现实世界中的某些具体事物相连,突然间这些计算机可以自动化地做事。我不知道你是否会把MCP和AI称为新的耳朵之类的东西,但它是一种新的感官,或者是一种新的交互方式,从世界获取信息并输出某些东西。所以你就像在生长一个新的肢体。

And there's some part of you where there's a communication between your mind and the neurons and something tactile in the real world, where all of a sudden these computers can do things in an automated fashion. I don't know if you would call MCP and AI the new ears or something like that, but it's a new sense, or it's a new way of interacting, getting information from the world and also putting something out. So you're sort of growing this new limb.

Speaker 2

我确实想象一只手举在空中,好像在说"这里,你可以使用我,我可以做这些事情"。所以绝对是更多的接触点和表面区域让AI去探索,并利用许多现有的软件能力。我非常喜欢这个类比。沿着这个思路,你长期作为开发者为开发者打造工具,但现在我们有了大型语言模型和AI这样的新角色。如何把大型语言模型当作用户来服务,在构建工具时纳入这个新角色,已成为设计中必须考虑的新维度。

I definitely imagine a hand raising up in the air of, like, here, you can use me and, like, I can do these things. So definitely, you know, more of the touch points and surface areas for AI to explore and also to leverage a lot of the existing, you know, capabilities from software. So I like that analogy a lot. And just maybe going down that train of thought, you have been developing for tools for developers and as a developer for a long time, but now we have this like new persona that's large language models and AI. They are part of the design aspect of how to serve large language models as a user and building sort of tools, thinking about this new persona.

Speaker 2

在考虑这种新的界面需求时,设计工具的思维方式和挑战有哪些变化,这些需求如何影响你构建产品的方式?

What are the changes and challenges in just like the thinking of designing tools when taking in this sort of new interface requirements that has impacted how, you know, you're building and building the products?

Speaker 0

是的。如果你把Python库看作是API和Python程序或Python开发者之间的接口,那么我会把MCP看作是API和LLM之间的接口。对我们来说,这些都是SDK。当然,你可以用MCP做其他事情,这些其他事情也很重要。

Yeah. So if you think about a Python library being the interface between an API and a Python program or Python developer, I would think about MCP as being the interface between an API and an LLM. And so to us, it's all SDKs. Right? Now, again, you can do other things with MCP, and these other things are important.

Speaker 0

文档就是一个重要的例子,我也很想谈谈这个。但就这些核心操作而言,比如让我去Stripe仪表板给客户退款,或者让我去查找上次交易发生的时间等等。正如你所说,让这一切工作起来有很多困难和新的挑战。Python开发者可能非常希望异步操作能简单些,Java开发者可能希望他们交互的类是类型安全的,但同时又不希望它们在运行时因为意外的null而崩溃。这是一个非常棘手的问题。

Documentation is really a big example, and and I I'd love to talk about that too. But in terms of sort of these, like, core operational, you know, let me go refund a customer in Stripe dashboard or let me go find out when this last transaction occurred or anything like that. You know, as you say, there's a lot of difficulties and new challenges and problems in making this work. And a Python developer might really wanna make sure that async works without too much rigmarole, or a Java developer might wanna see that the classes that they're interacting with are type safe, but at the same time may not want them to sort of blow up at runtime if something is unexpectedly null. And that's a really hard problem.

Speaker 0

对吧?昨天我和一个曾经在Java团队工作并帮助设计Java语言的人聊天。我告诉他这个,他说‘是的,这是一个无法解决的问题’。

Right? You know, yesterday, I was chatting with someone who was on the Java team back in the day and helped design parts of the Java language. And I was telling him about this. He was like, yeah, that's an unsolvable problem. Right?

Speaker 0

要知道,有人说这也许是个无解的问题,但我们在Stainless攻克了它,我为此感到自豪。对LLM来说确实有很多困难挑战。你问到的这些挑战,很多都归结于上下文大小。目前使用MCP的典型方式是将API(通常用OpenAPI规范描述)通过MCP呈现给LLM,基本上是将API中每个端点、每个操作的JSON模式转换为MCP服务器中的一个工具。比如如果你想在Stripe创建一个收费,就使用这个创建收费工具。

And, you know, that's something that's supposedly an unsolvable problem, but we've tackled it at Stainless, and I'm proud of that. There's really difficult challenges for LLMs. And so some of those challenges that you're asking about, a lot of it comes down to the context size. Right now with MCP, the naive way, the typical way of taking an API, usually described with an OpenAPI spec, and presenting that to an LLM through MCP, is you basically take the JSON schema for every single endpoint, every single operation in your API, and you turn that into a tool in your MCP server. And you say, if you want to create a charge in Stripe, use this create charge tool.

Speaker 0

在这个工具内部,有每个请求参数。比如如何指定收费金额、描述、货币等等。这些都会有和Stripe API文档中相同的描述和文档。所有这些都会出现在工具描述中,你可以在这里获得准确性。

And inside of that tool, here's every request parameter. So here's how you specify the amount of the charge, here's how you specify the description, the currency, so on and so forth. And that's gonna have the same descriptions and documentation that you would see in Stripe API docs. It's all gonna be there in the tool description. You can get accuracy there.
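The "one tool per endpoint" mapping described above can be sketched concretely. The tool name, schema, and the rough token estimate below are all illustrative, not the actual Stripe or Stainless definitions:

```python
import json

# One API operation becomes one MCP tool whose input schema is the
# endpoint's JSON Schema, parameter descriptions included.
create_charge_tool = {
    "name": "create_charge",
    "description": "Create a charge. Mirrors POST /v1/charges.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "amount": {
                "type": "integer",
                "description": "Amount in the smallest currency unit.",
            },
            "currency": {
                "type": "string",
                "description": "Three-letter ISO currency code.",
            },
            "description": {
                "type": "string",
                "description": "Arbitrary text attached to the charge.",
            },
        },
        "required": ["amount", "currency"],
    },
}

def schema_tokens_rough(tool: dict) -> int:
    """Very rough token estimate: ~1 token per 4 characters of JSON."""
    return len(json.dumps(tool)) // 4

# With hundreds of endpoints and dozens of parameters each, these
# definitions alone can consume much of a model's context window.
print(schema_tokens_rough(create_charge_tool))
```

Multiplying even this small tool's footprint by every endpoint in a large API is what produces the context-window problem discussed next.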

Speaker 0

但如果你想用详尽文档记录Stripe API的每个端点及其所有参数,恭喜你,你已经耗尽了整个上下文窗口。这根本行不通。即使没有完全耗尽,大型语言模型(LLM)和MCP客户端在面对过多工具或可用参数时也会不堪重负。这些请求参数中,很多可能是你不需要的。比如某个端点你只需要其中三个参数,而不是全部50个。

But if you wanna describe every single endpoint in the Stripe API with every single parameter fully documented, congratulations, you've just used up your entire context window. And that basically just doesn't work. Even if you don't use up the whole thing, LLMs tend to get overwhelmed, and MCP clients tend to get overwhelmed, at seeing too many tools available. And all these request parameters, you may not need them. Maybe you only need three parameters from this endpoint, not all 50.

Speaker 0

或者你只需要该API的一两个端点,而不是全部500个或50个。因此,如何逐步揭示可用工具、参数及其描述是个巨大挑战。你希望能在尽量减少来回对话的情况下实现这一点,因为每次交互都耗时。

Or maybe you only need one or two endpoints from this API, not all 500 or 50 or what have you. So that's a really difficult challenge to figure out how to gradually unveil what are the tools available, what are the parameters available, what are the descriptions of those parameters. And you wanna be able to do that without too much back and forth because every turn takes time.

Speaker 2

还消耗token。

And takes tokens.

Speaker 0

没错,正是如此。

Yeah. Exactly.

Speaker 2

那么针对上下文窗口限制和工具调用受限的问题,我们目前有解决方案吗?

Yeah. So do we have a solution for the context window challenge as well as the limited tool calls? Is there a solution yet?

Speaker 0

我们目前采取了几项措施,近期还会推出更多。当前最简单的方案是:我们可以生成API所有工具集。当用户启动服务器时,他们可以通过CLI命令行标志或远程URL查询参数来声明——比如"我只想使用Stripe API的支付意图功能",或"仅需客户管理模块",或"仅限读操作/写操作"。

There's several things that we do to help with this today, and there's several things that we're doing soon. But the most simple thing that we do today is we can generate all of the tools in your API. And when someone wants to spin up this server, they can use command line flags in the CLI or query parameters in the remote URL to basically say, hey, I only want to interact with these resources. So only give me the payment intents part of the Stripe API, or only give me the customers part of the Stripe API, or I only want reads, or only writes.

Speaker 0

这能大幅缩减范围。终端用户和API提供商都可以自主决定需求。这种初步筛选非常实用,虽然可能略显笨拙——毕竟你未必能提前确定需要哪些限制。

And that limits things down a lot. And the end user can decide what they want to do, the API provider can decide what they want to do. And that nipping and tucking is really useful as a first approximation. But it can be a little bit clunky; maybe you don't know in advance what limits you want.

Speaker 0

因此我们还提供了更动态的方案:启用动态模式后,原本50或500个工具只会暴露3个。复杂度从O(n)降为O(1)。你会获得三个核心工具:获取可用端点列表的工具、查看单个端点完整参数说明的工具,以及执行端点的工具。

And so a more dynamic approach that we also provide is the ability to say, hey, give me this in dynamic mode. And instead of, say, 50 or 500 different tools, we only expose three. Right? So it scales O(1), not O(n). And you have a tool to get the list of endpoints available, the list of operations available.

Speaker 0

这样工具定义就完全动态化了。当然需要权衡:原本一次完成的操作现在需要三次交互。运行时效率稍低,但节省了上下文窗口。如果你最终没有调用某些功能(比如Stripe API的大部分接口),就几乎不会占用上下文空间,这非常理想。

You have a tool to describe a single endpoint and get all those request parameters. And you have a tool to execute an endpoint. And so you've gone and made these tool definitions totally dynamic. Now there's some trade offs there. It takes three turns to do one operation here instead of just one.
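The three-tool dynamic mode described here can be sketched in a few lines. The endpoint registry and handlers below are illustrative stand-ins, not Stainless's actual implementation:

```python
# Dynamic mode: instead of one tool per endpoint, expose exactly three
# tools -- list, describe, execute -- so the tool count stays O(1)
# no matter how many endpoints the API has.

ENDPOINTS = {
    "list_customers": {
        "params": {"limit": "integer"},
        "handler": lambda p: [{"id": "cus_1", "name": "Jennifer"}],
    },
    "create_charge": {
        "params": {"amount": "integer", "currency": "string"},
        "handler": lambda p: {"id": "ch_1", **p},
    },
}

def list_endpoints() -> list[str]:
    """Tool 1: enumerate the operations available."""
    return sorted(ENDPOINTS)

def describe_endpoint(name: str) -> dict:
    """Tool 2: return the parameter schema for one operation."""
    return ENDPOINTS[name]["params"]

def execute_endpoint(name: str, params: dict):
    """Tool 3: actually invoke the operation."""
    return ENDPOINTS[name]["handler"](params)

# One logical call now takes three turns: discover, inspect, execute.
print(list_endpoints())
print(describe_endpoint("create_charge"))
print(execute_endpoint("create_charge", {"amount": 1000, "currency": "usd"}))
```

This is the trade-off the speaker describes: each operation costs three turns instead of one, but the tool definitions no longer consume context up front.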

Speaker 0

我们近期推出的第三项功能也很有趣,它解决了另一个问题:假设你向API发起请求'查询客户Jennifer的信息'——

And so at runtime, that's a little bit slower, but it saves context window. And if you end up not interacting with, again, say, the Stripe API, then you hardly burn any of your context at all, which is terrific. So that's another thing that we offer today. A third thing that we do currently, which I think is really fun and interesting, is something that we shipped recently. It actually solves a slightly different problem, which is, okay, let's say you make your API request and you say, hey, Stripe, tell me about the customer Jennifer, right?

Speaker 0

它会向客户端点发起检索请求,查找Jennifer——准确说是列表请求,会返回大量数据,对吧?尤其是列表请求时。比如你向Stripe API发起请求,系统试图找到正确的Jennifer。数据库里有很多Jennifer,需要筛选过滤。而且这类API返回的数据量可能非常庞大。

And it'll make a retrieve request to the customer endpoint, or actually a list request, to find Jennifer, and it'll come back with a bunch of data, right? Now, especially if it's a list request. So maybe you make a request to the Stripe API, and it tries to find the right Jennifer. There's a lot of Jennifers in there, so it's gonna have to sort through them. And the size of the API responses here can be very large.

Speaker 0

实际上,特别是像这样的列表请求,数据量通常会大到让你的MCP客户端(比如Cursor或Claude代码工具等)直接拒绝处理。你根本没法操作,即使能处理,数据量也太庞大难以筛选。我们为此给所有请求添加了JQ过滤器。JQ代表JSON查询,是常用的命令行工具,可以从庞大JSON对象中只提取指定属性,从而精简获取的数据。

In fact, especially for a list request like this, they'll typically be so large that your MCP client, like Cursor or Claude Code or anything like this, will just refuse to look at it at all. You literally can't interact with it. Even if it can, it's just too much data to sort through. What we've done is we've added sort of this jq filter to all of these requests. jq stands for JSON query, and it's a commonly used CLI for saying, hey, out of this large JSON object, give me only these properties, and filtering down the data that you're getting.

Speaker 0

类似于SQL中的SELECT子句(熟悉SQL的人会理解)。语言模型擅长JQ查询。赋予它们这个工具后,就能让它们从庞大的API请求中只提取所需内容,比如只需Jennifer的姓名、邮箱和描述,而不必获取其他细节。这样既能减轻上下文窗口负担,又能保持高度聚焦。

So sort of like the SELECT clause in SQL, for those who are familiar with that. LLMs are great at jq. So just giving them that tool has enabled them to say, out of this large API request, I only need Jennifer's name and email and description, and I don't need to know all these other details. And that keeps the impact on the context window light, and it keeps things really focused.
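jq itself is a CLI; this pure-Python stand-in just shows the idea of the filter step, projecting only the requested fields out of a large list response before it ever reaches the model's context. The response shape and field names are illustrative:

```python
def project(response: dict, fields: list[str]) -> list[dict]:
    """Equivalent in spirit to the jq program '.data[] | {name, email}':
    keep only the named fields from each item in a list response."""
    return [{k: item[k] for k in fields if k in item}
            for item in response["data"]]

# A fat list response of the kind an MCP client might refuse to display.
response = {
    "object": "list",
    "data": [
        {"id": "cus_1", "name": "Jennifer", "email": "j@example.com",
         "created": 1700000000, "metadata": {}, "address": None},
        {"id": "cus_2", "name": "Jen", "email": "jen@example.com",
         "created": 1700000001, "metadata": {}, "address": None},
    ],
}

# Only two small fields per customer survive, instead of full objects.
print(project(response, ["name", "email"]))
```

With the real feature, the model writes the jq expression itself and the server applies it before returning the response.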

Speaker 2

这太棒了,听起来非常实用。这让我想到另一个问题:现在有种误解认为有了MCP,大模型访问API时,API或SDK的质量和接口就不重要了。但事实恰恰相反——正因为使用量增加,我们更需要强类型、高度完善的SDK。能详细说说这个观点吗?

That's incredible. That sounds really useful. It reminds me another question of there's a misconception or misconceived notion that now you have MCPs, and that's how LLMs are accessing and interacting with APIs. The quality or interface of APIs or SDKs are not that important anymore, but I think that's like the polar opposite. It's exactly because there's more usage.

Speaker 2

你是否介意进一步阐述这一点?

You need strongly typed, very polished SDKs as well. Do you mind elaborating on that?

Speaker 0

当然可以。今早我正好和某大型金融科技公司的开发者关系负责人聊到这个问题。她提到目前如果让编程代理集成他们公司的REST API,代理第一反应是安装该公司SDK。听起来不错,但她发现代理会安装十个月前的老版本,然后就开始幻觉式编造整个接口。

Yeah, absolutely. So I was chatting just this morning with someone who leads developer relations at a large financial technology company. She was talking about this problem where today, if you ask a coding agent to integrate with this company's REST API, the first thing that coding agent is gonna do is install this company's SDK. That sounds great, but what she's seen is that it's gonna install a version from ten months ago, an old version, and then it's gonna hallucinate the whole interface. Right?

Speaker 0

很多人都有类似经历:让ChatGPT帮忙集成ChatGPT时,它选的SDK版本总是有点偏差。这正是MCP需要解决的问题——我之前提到过,MCP另一个重要用途是辅助编程代理获取文档。我们即将推出重磅功能:让大模型能获取完整的API参考文档。

And I think a lot of people have had this experience where they say, hey, ChatGPT, can you help me integrate with ChatGPT? And it gets the version of the SDK a little bit wrong. Right? And so something that we need to see with MCP, and this is something I alluded to earlier, is that the other big use case for MCP, rather than interacting with an API, is writing code and helping a coding agent basically access documentation. And so something I'm really excited for us to ship very shortly is the ability to get comprehensive reference documentation for the API of your library into the hands of an LLM.

Speaker 0

这样当用户要求集成某金融科技公司服务时,代理会知道正确版本,并掌握该版本SDK的完整API,避免出错。这是我最期待实现的闭环解决方案。需要强调的是,编程代理最初用错SDK版本确实不好,但关键点在于:代理首选仍是使用SDK而非直接发起fetch请求。

So that, you know, when people integrating with this financial technology company say, hey, make me an integration, it'll know the right version, and it'll have access to the full API of that version's SDK, and it won't make mistakes. So that's one of the things that I'm really excited about, all this kind of coming full circle. And I'll note, it sounds bad that the coding agent got the SDK usage wrong in the first place. But it's important to note that the agent preferred to use the SDK. It didn't wanna just make a fetch request, given the option.

Speaker 0

原因在于SDK提供类型检查。如果让大模型一次性构造庞大复杂的API请求,很难保证100%准确。你需要那些红色波浪线提示:这个属性拼写错误,或者你在假定响应中的某个字符串永远是字符串、永远不会为null。对于无类型接口,人工审查者根本发现不了这类错误;他们需要类型检查器快速确认大模型方向正确,这样才能专注审查编码代理产出的业务逻辑。

And the reason is that with the SDK, you have type checking. So if you have an LLM that's trying to one-shot a whole big crazy API request, that's not so easy to do with 100% accuracy. You really want those red squiggly lines saying, hey, this property has a typo, or hey, you're assuming that this string is always gonna be a string and it's never gonna be null in the response. A human reviewer is just not gonna catch a mistake like that for an untyped interface. They need to see their type checker giving the quick thumbs up that the LLM is in the right direction, so that they can quickly review the business logic that was produced by the coding agent.
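A minimal sketch of why a typed SDK helps here: with a typed request object, a misspelled parameter fails immediately (statically in the type checker's red squigglies, and with dataclasses also at construction time), instead of surfacing as a bad API call in production. The names below are hypothetical, not any real SDK's interface:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CreateChargeParams:
    amount: int
    currency: str
    description: Optional[str] = None

# Correct usage constructs fine.
ok = CreateChargeParams(amount=1000, currency="usd")

# A typo'd parameter name is rejected up front: a type checker flags it
# statically, and the dataclass __init__ also rejects it at runtime.
try:
    CreateChargeParams(amount=1000, curency="usd")  # note the typo
except TypeError as e:
    print("caught:", e)
```

An untyped `dict` of request parameters would have let the typo sail through to the API call, exactly the mistake a human reviewer is unlikely to spot.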

Speaker 0

我认为很多人已经意识到:代码越声明式、越简洁干燥、类型越安全,就越能放心让大模型处理编码工作,自己则专注于更高层次的思考,同时保持适度监督。

You know, I think a lot of people are seeing this, that the more declarative and clean and dry the code that you're writing and the more type safe the code that you're writing is, the more you can trust LLMs to do the coding part of your job for you and think about the higher level stuff and keep an appropriate eye on things.

Speaker 2

完全正确。所以这不仅仅是保持语言模型输出的准确性。我的意思是,当探索空间受到更多约束时,这确实有助于提高准确性,但同时也提升了可调试性。比如,当你拥有强类型的SDK、完善的工具,以及配套的参考文档和使用手册时,体验会好得多。这将为AI代理提供一个更全面的环境来真正执行任务。

Totally. So it's not just about keeping the accuracy of the LLM outputs. I mean, it certainly helps with that when you have a more constrained exploration space, but also just debuggability. It's so much better when you have strongly typed SDKs and, I would say, the tools plus the reference docs and manuals in place on the docs side of it. It's going to be a more holistic experience given to the AI agent to really perform the task.

Speaker 0

正是如此。你的编码代理将能够尝试某些操作,比如点击保存,检查是否有类型错误,并自动进行迭代,而不必尝试发起API请求。要知道,后者需要API密钥,而且可能是破坏性的API调用。让LLM在生产环境中测试外部API或系统,这绝不是你想要的做法。

Exactly. Your coding agent is gonna be able to try something out, press save, see if there's a type error, and iterate by itself automatically, rather than trying to make an API request. It'll need an API key for that, and it might be a destructive API request. Having an LLM test in production against a foreign API, a foreign system, is not what you wanna be doing.

Speaker 0

对吧?因此,对于任何希望用户在生产环境中使用其API的公司来说,无论是大型企业用户进行关键集成,还是开发人员追求快速开发,关键在于让集成过程变得简单高效,这样他们就能在其他项目介入前顺利完成,并且充满信心地操作。

Right? And so, you know, for any company that ships an API that they want their users to be using in production, whether it's a big business user, you know, an enterprise doing a key integration, or whether it's a developer trying to move quickly. In either case, it's really critical to make it easy for people to move fast at that integration so that it can happen before some other project comes in and for them to do that with confidence.

Speaker 2

我很喜欢思考未来五到十年API开发者的角色。你既是开发者,也开发过非常知名的API和SDK。如果我们想象未来AI代理将像人类开发者一样普遍,我们将拥有这种代理体验而不仅仅是开发体验。你认为这种体验会是什么样子?目前我们仍在为这两种角色或用户设计。

I love to just think about maybe the role of API developers in five to ten years. You have been a developer, and have also developed really well-known APIs and SDKs. If we imagine the future where agents are gonna be as common as human developers, and we're going to have this agent experience instead of just dev experience, what do you think that experience is gonna look like down the road? Today, we're still continuing designing for both of these two personas or users.

Speaker 2

我仍然称LLM为用户。你认为在那个时间框架内会是什么样子?

I still call it like LLM a user. What do you think it'll look like in that time horizon?

Speaker 0

我认为这是一个非常有趣且令人兴奋的问题。我刚才提到,当LLM编码代理编写代码时,你会希望代码更具声明性和DRY(不要重复自己)原则。这一直是理想状态,但现在快速准确地审查代码变得更为重要。因此,对于构建API的人来说,你需要一种编写后端代码的方式,避免像现在很多人不得不编写REST API时那样混乱。

I mean, I think it's a really interesting and exciting question. What I was saying a minute ago, where an LLM coding agent writing your code means you want your code to be more declarative and more DRY. You've always wanted that, but it becomes more important to be able to quickly and accurately review code. And so for people building APIs, you want a way to write your back end code that is not going to be like spaghetti. And that's how a lot of people unfortunately have to write their REST APIs today.

Speaker 0

此外,你会希望围绕API设计有更清晰和规范的标准,这样LLM可以遵循公司提出的关于如何进行API设计的规则集。LLM可以融入公司使用的API框架,专注于声明端点、声明参数和编写业务逻辑,以便人类能够快速审查。

And furthermore, you're gonna want more clear and prescriptive standards around what API design should look like so that the LLM can follow a set of rules that the company has put forward around how they wanna do their API design. And, the the LM can follow the slot into maybe the API framework that the company is using and just focus on, okay, let's declare the endpoint. Let's declare the parameters. Let's write the business logic so a human can quickly review that.
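A toy sketch of the "declarative API framework" idea: the endpoint and its parameters are declared up front, and the handler holds only the business logic a human needs to review. The framework, decorator, and names here are hypothetical, not any particular product:

```python
from dataclasses import dataclass

# A minimal declarative endpoint registry: routing is declared, not wired.
ROUTES = {}

def endpoint(method: str, path: str):
    """Register a handler under (method, path) instead of wiring it by hand."""
    def register(fn):
        ROUTES[(method, path)] = fn
        return fn
    return register

@dataclass
class RefundParams:
    charge_id: str
    amount: int

@endpoint("POST", "/v1/refunds")
def create_refund(params: RefundParams) -> dict:
    # Business logic only; validation, serialization, and transport
    # would live in the framework, following the company's API rules.
    return {"refunded": params.charge_id, "amount": params.amount}

handler = ROUTES[("POST", "/v1/refunds")]
print(handler(RefundParams(charge_id="ch_1", amount=500)))
```

In this style, an LLM slots into the framework and prescriptive design rules, declares the endpoint and parameters, and writes only the reviewable business logic.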

Speaker 2

我在想象那些品味极高的开发者。他们精心设计端点,既设计API本身,又详细设计文档和SDK。但现在我们有了许多代码生成工具,更先进的框架,以及更高层次的工具,你可以定义规范——或许我不会称之为自然语言,而更像是一种规范化的方式,而不必过于担心细节的打磨。这正是我认为Stainless的工作所做的,它让开发者能够更多地思考高层次的任务,关注功能需求和用户交互,而不是错误、细节、分页、可调试性或可观测性等细微之处。

I'm imagining really high-taste developers, you're kind of crafting the endpoints, designing both the APIs themselves and, you know, the documentation with the SDKs in a very detailed way. But now we have a lot of, let's say, codegen tools, and we have even more advanced frameworks, and also much higher level tooling where you can define the specs. Maybe I wouldn't call it natural language, but just more of, like, a spec-out way, but not having to worry too much about, like, polishing the details. And that's exactly, you know, what I feel like Stainless's work is doing: maybe elevating the developers to think about more of the higher level tasks, because of what you want the functions to be and what we want the users to interface with, instead of, like, caring about the errors and the details and pagination or, like, debuggability, observability, these types of nuances.
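A tiny illustration of the spec-driven generation being described: from one spec entry, generate a client method that hides cursor pagination so nobody hand-writes that loop. The spec shape, field names, and transport are all made up for this sketch, not any real tool's format.

```python
# Hypothetical spec: one entry per API operation.
SPEC = {
    "listCustomers": {"path": "/customers", "page_param": "starting_after"}
}

def fake_transport(path, params):
    # Stand-in for HTTP: serves items 0..3 in pages of two.
    start = int(params.get("starting_after", 0))
    items = list(range(start, min(start + 2, 4)))
    return {"data": items, "has_more": bool(items) and items[-1] < 3}

def generate_list_method(name, transport):
    """Generate a list method from the spec; pagination is invisible to callers."""
    cfg = SPEC[name]
    def method():
        cursor, out = None, []
        while True:
            params = {} if cursor is None else {cfg["page_param"]: cursor}
            page = transport(cfg["path"], params)
            out.extend(page["data"])
            if not page["has_more"]:
                return out
            cursor = page["data"][-1] + 1  # advance the cursor past the last item
    return method

list_customers = generate_list_method("listCustomers", fake_transport)
```

The generated `list_customers()` walks every page and returns `[0, 1, 2, 3]`; the caller never sees the cursor, which is the kind of detail the conversation says developers shouldn't have to care about.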

Speaker 0

构建API最痛苦和困难的部分——不幸的是,今天对于向客户、用户和合作伙伴提供API的人来说,有许多痛苦和困难的部分。但最糟糕的可能是设计部分,处理所有的琐碎细节。你只想说:‘我们需要一个API来做这件事,管理这种资源。’正如你所说,人类应该思考业务需求的高层次,而让机器人处理那些不那么令人兴奋的部分。现在,你可能希望AI编写业务逻辑,但肯定不希望它编写平台代码。

There are many painful and difficult parts around building APIs today, unfortunately, for people who are shipping APIs to their customers and their users and their partners. But one of the worst is sort of the design part, you know, dealing with all the bike-shedding. You just wanna be able to say: we need an API that does this thing, right, that manages this kind of resource. The humans should be thinking about, as you say, the high levels of what the business needs, and letting the robots do all the less exciting stuff. Now, you want AI to be writing maybe your business logic, but you don't really want it to be writing your platform code.

Speaker 0

那些作为所有其他部分之间粘合剂的核心功能。为此,你需要一个好的平台、好的框架和好的库。这正是Stainless希望为API领域带来的——让人类专注于高层次的需求,LLM构建中间层,而Stainless则处理底层基础设施。

Those core things that sort of operate as the glue between everything else. For that, you want a good platform: you want a good framework, and you want good libraries. And that's something that Stainless is looking to bring to everything around the API, so that the humans can say the high level stuff, the LLMs can build that middle layer, and Stainless can take care of the low level infrastructure.

Speaker 2

我想问的最后一个问题是,您与一些最前沿的品牌和客户合作,如OpenAI、Anthropic、Cloudflare。在不泄露任何机密信息的前提下,能否分享一些值得观众借鉴的合作经验?比如他们如何看待SDK设计、API平台构建,以及他们希望为开发者提供什么。

The last question I wanna ask is: you work with some of the most cutting-edge logos and customers, like OpenAI, Anthropic, Cloudflare. Without disclosing anything confidential, are there any learnings worth sharing with the audience from these collaborations, about how they're thinking about their SDK design, their API platforms, and what they want to offer to developers?

Speaker 0

是的。我认为每个API团队都在反复学习许多经验教训。这个领域的每个人都明白深思熟虑的重要性,理解用户需求的重要性。具体考量方式会有些变化,就像人们从编写JavaScript转向TypeScript,或者从慢速网络连接过渡到高速互联网时那样,但基本原则我认为仍然相当通用。看到人们深入思考如何开放他们的能力是令人兴奋的——无论这些能力是能处理各种文本和音频的大型语言模型,还是能在云端即时启动新服务器,或是实现全球资金流动的功能。

Yeah. I mean, I think there's so many lessons that every API team learns again and again. Everyone in this space sees how important it is to be thoughtful, how important it is to, you know, consider the user. The exact way you consider that is changing a little bit, you know, just like it did when people moved from writing JavaScript to writing TypeScript, or when people went from having slow connections to fast connections on the Internet. The fundamentals, I think, are still fairly universal. It's exciting to see people really think deeply about how to expose their capabilities, whether those capabilities are a large language model that can do all sorts of, you know, text and audio, or whether that's the ability to spin up new servers in the cloud instantly, or whether that's the ability to move money around the world.

Speaker 0

人们希望能够专注于核心能力,在向用户开放接口时保持深思熟虑,而不必担心底层细节。

People wanna be able to focus on their core capabilities, be thoughtful in exposing the interfaces to users, and not have to worry about the low level details.

Speaker 2

非常感谢,Alex。这真是太棒了。

Thank you so much, Alex. This is wonderful.

Speaker 0

谢谢你,Jennifer。这次交流很愉快。

Thank you, Jennifer. This has been fun.

Speaker 1

随着API逐渐成为开发者和AI代理的核心基础设施,SDK、文档和接口的质量将愈发重要。如果您喜欢本次对话,请在您选择的平台上为播客评分并分享给您的社交网络。敬请期待,我们即将推出更多关于软件、基础设施和AI未来的精彩讨论。请注意,此处内容仅供信息参考,不应视为法律、商业、税务或投资建议,也不应用于评估任何投资或证券,且不针对任何A16Z基金的现有或潜在投资者。

As APIs evolve into core infrastructure for both developers and AI agents, the quality of SDKs, docs, and interfaces will only grow in importance. If you enjoyed this conversation, please rate and review the podcast on your platform of choice and share it with your network. Stay tuned. We've got more great discussions on the future of software, infrastructure, and AI coming soon. As a reminder, please note that the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund.

Speaker 1

更多详情请参见a16z.com/disclosures。

For more details, please see a16z.com/disclosures.
