本集简介
双语字幕
让我们简要地从你的背景开始。
Let's go ahead and start off briefly with your background.
所以你本科和硕士都在帕多瓦大学就读。
So you went to the University of Padua for undergrad and masters.
也许可以跟我们说说,你是如何最初对计算机科学产生兴趣的。
Maybe tell us a little bit about how you first got interested in computer science.
是的。
Yeah.
我在意大利的帕多瓦大学学习。
So I studied in Italy at the University of Padua.
实际上,我并没有学习计算机科学。
I actually did not study computer science.
嗯。
Mhmm.
我学的是电气工程。
I studied electrical engineering.
嗯。
Mhmm.
所以,实际上是控制理论。
So, actually, control theory.
嗯。
Mhmm.
我觉得这有点像是人工智能的前身。
I think it's kind of like a precursor of AI.
它就像是人工智能的简单版本。
It's kinda like the the simple version of of of AI.
我当时在做,是的,最优控制。
I was doing, yeah, optimal control.
所以这有点像非常基础的决策形式,可能是在不确定性下的,涉及大量优化和信号处理。
So it it it is kinda like a very basic form of decision making, perhaps under uncertainty, a lot of optimization, signal processing.
所以我觉得我学到了很多重要的数学工具,嗯。
So I think I learned a lot of important mathematical tools Mhmm.
通过那段经历,我基本上为攻读博士学位做好了准备。
Through that experience that kinda, like, set me up to go for a PhD.
我决定转向一个稍微不同的方向,因为我感觉我已经完成了本科和硕士阶段的学习。
And I decided to go into a slightly different direction just because I felt like, you know, I did a, an undergrad and master.
我对电气工程已经有了一定的了解。
I kinda, like, know electrical engineering reasonably well.
嗯。
Mhmm.
如果我要读博士,不如干脆选一个邻近的领域。
If I do a PhD, I might as well go into a neighboring field.
嗯。
Mhmm.
我看到计算机科学领域正在发生更令人兴奋的事情。
And I was seeing more exciting things happening in, in, computer science.
所以,是的,我进入了康奈尔大学的计算机科学项目攻读博士学位。
And so, yeah, I went into a computer science program for my PhD at Cornell University.
嗯。
Mhmm.
你知道,这基本上就是我进入计算机科学领域的方式,我几乎立刻就开始了人工智能方面的研究。
And, you know, that's basically how I got into into into CS, and I started doing research in in AI, almost immediately.
嗯。
Mhmm.
是的,这大致就是我的经历。
And, yeah, that's kind of like my journey.
是的。
Yeah.
你在康奈尔大学的博士论文题目是《在有限信息和高维度条件下的决策与推理》。
So your thesis at Cornell was titled "Decision Making and Inference Under Limited Information and High Dimensionality."
你的导师是卡拉·戈梅斯,她当然非常出色,还有巴特,他也同样出色。
And you were advised by Carla Gomes, who's obviously incredible and Bart who's also incredible.
也许你能跟我们讲讲,你的博士研究的大致方向,以及你是如何将你在帕多瓦的电气工程背景与康奈尔的研究联系起来的?
Maybe tell us a little bit about, like, the general research direction of your PhD and how you ended up bridging your background in electrical engineering from Padova to Cornell.
是的。
Yeah.
所以最初,我一直对人工智能非常着迷,比如建造能够思考、能够自动化那些复杂过程的机器,例如证明定理或为我们编写代码。
So initially, I mean, I was always very excited about the artificial intelligence, like building machines that can that can think, that can automate, you know, processes, the hard processes, proving theorems or, like, writing code for us.
嗯。
Mhmm.
当我进入博士阶段时,我立刻开始与巴特和卡拉合作,因为他们当时大量研究自动定理证明、约束满足和组合优化。
And when I got into my PhD, I immediately started working with Bart and Carla because they were working a lot on automatic theorem proving, constraint satisfaction, combinatorial optimization.
嗯。
Mhmm.
让机器能够自动证明一些简单的定理,并形式化验证某些类型的结果,这个想法让我很着迷。
The idea of having machines essentially prove simple theorems and formally verify certain kinds of results.
因为我认为,这正是构建自我改进系统的路径,也就是能够自我提升的系统,嗯。
Because I thought that that was kind of like the path to build self improving actual systems, so systems that can maybe improve themselves Mhmm.
并且以可验证的方式编写代码,从而让系统变得更好。
And and write code that would then, you know, make them better and in a verifiable way.
我当时觉得,这将是通往构建真正智能机器的道路。
And I kinda, like, thought that would be the path towards building a truly intelligent machine.
嗯。
Mhmm.
然后,我才逐渐意识到,这些技术并没有很好地扩展,因为在定义什么是正确的目标函数、它意味着什么、程序到底应该做什么等方面,存在很多困难。
Then, along the way, I kinda, like, realized that those techniques were not scaling particularly well, that there were a lot of difficulties in kinda, like, defining what is the right objective function, what it means, what should the program even do.
嗯。
Mhmm.
我越来越对数据驱动的方法感兴趣,开始思考:如何利用数据来找出正确的事情,以及如何为机器的改进定义目标。
And I got more and more interested in more data-driven methods and, kind of like, okay, how do you use data to figure out what is the right thing, and how do you define the objectives that the machines should improve on.
于是,我把我在逻辑推理和确定性世界中的所有经验,转化到了更偏向概率性的环境中。
So I kind of, like, translated all that expertise that I had on what you would call logical reasoning and, more like, decision making in a deterministic world towards more, like, probabilistic settings.
所以,图模型当时是人们主要使用的概率方法。
So graphical models which were kind of like the main probabilistic method that people were using back then.
到了我博士后期,深度学习就开始席卷全球。
And that, you know, around towards the end of my PhD, then deep learning started to take over the world.
我越来越对深度学习与生成模型、图模型或概率模型的交叉领域感兴趣。
And I got more and more interested in the intersection of deep learning and generative models and graphical models or probabilistic models.
这就是我如何开始接触生成模型的,从那以后我就一直在研究这个方向。
And that's kind of like how I got into, into generative models, which is what I've been working on since, since then.
是的。
Yep.
非常令人兴奋。
Very exciting.
你毕业后不久就成为斯坦福大学的助理教授了,让我们回溯到2014年左右吧。
So you started off as an assistant professor at Stanford right after graduating, let's call this maybe all the way back in 2014.
你刚从康奈尔大学毕业,就立即在斯坦福开始了教授生涯。
So, you know, you had graduated from Cornell and now you had started as a professor at Stanford.
我发现一件特别有趣的事,那就是你成为了伍兹环境研究所的研究员。
And one thing I found really interesting is you you became a fellow with the Woods Institute for the Environment.
我特意提到这一点,因为我觉得这显然是你工作的主线之一,即注重社会福祉,尤其是环境应用方面。
And I'm gonna point this out specifically because I I think it's it's definitely sort of a theme of your work, which is obviously this, you know, sort of application for societal benefit, but specifically for the environmental applications.
所以我们最近联系了你的一位前学生罗汉·穆尼维,他现在即将在伯克利与谢尔盖·莱文一起进行第一轮轮转。
So we recently actually had one of your former students, Rohan Munvi, who's now gonna be at Berkeley with Sergey Levine in his first rotation.
所以我们翻阅了他的一些论文,但也许你能跟我们多讲讲,你最初是如何开始关注环境影响的,以及过去十一年来你的研究重点是如何变化的?
And so, you know, we went through some of his papers, but maybe you can tell us a little bit more about sort of the genesis of your work in in environmental impact and also just like how your sort of focus has changed over the past eleven years?
是的。
Yeah.
所以我开始涉足这个领域的原因,其实可以追溯到我的博士阶段。
So the reason, I I started working in in that space, I mean, it goes back again to my PhD.
嗯。
Mhmm.
我的导师卡拉·戈梅斯,她获得了美国国家科学基金会的一项重大奖项,嗯。
So my adviser, Carla Gomes, she won one of these, like, big awards from the NSF Mhmm.
比如计算领域的探索性项目,他们实际上会提供大量资金来开创一个新领域。
Like an Expeditions in Computing grant, where they basically give you a lot of money to effectively start a new field.
是的。
Yeah.
像是那种高风险、高回报的研究项目。
Like, very high high risk, high reward kind of, like, research projects.
是的。
Yeah.
她基本上创建了一个她称之为计算可持续性的领域。
And she essentially created this field that she called computational sustainability.
嗯。
Mhmm.
其理念是应用你从计算机科学、优化和运筹学中获得的一些技术。
Where the idea was to apply some of the techniques that you have been, you know, like from computer science, from optimization, from operations research.
并找出如何将它们用于造福整个社会,朝着可持续性及类似问题努力。
And figure out ways to use them for the benefit of society at large, kind of like towards sustainability and similar kind of problems.
其愿景是,想想私营公司和企业,它们已经从这些技术中获得了巨大收益。
The vision was: think about private companies and enterprises, they had benefited enormously from these kinds of techniques.
但当你思考涉及如何以最佳方式管理生态系统的决策时,这些方法的应用却并不多。
But there was not a whole lot of applications of these methods when you think about decisions that involve, you know, how do you manage ecosystems in best possible way?
你怎么看待环境保护?
How do you think about environmental conservation?
当你想到可持续发展目标,以及监测实现这些目标的进展时,你会发现有很多机会可以引入计算工具、计算思维、更数据驱动的方法和更优化导向的方法。
When you think about, you know, just sustainable development goals, monitoring progress towards these targets, you could see that there are a lot of opportunities for bringing in computational tools, computational thinking, more data-driven methods, more optimization-driven methods.
这些领域当时还处于人们所使用的分析工具相对不够成熟的阶段。
The fields were still at a stage where people were kind of like not very sophisticated in the kind of tools that they were using to analyze the problems.
嗯。
Mhmm.
因此,感觉正是引入计算机科学、应用数学和运筹学中这些新技术的绝佳时机。
So it felt very ripe, kind of like a perfect moment, for bringing in some of these new techniques from CS and applied math and operations research.
因此,我在攻读博士学位期间就一直在研究这些问题。
And so I had been working on some of those problems during my PhD.
嗯。
Mhmm.
加入斯坦福大学计算机科学系教职后,我继续了这项工作。
And then I continued that after joining the the faculty in the computer science department at Stanford.
但我立刻就认识了校园里其他更专注于应用方面的人,他们正在研究那些拥有有趣数据集、适合应用最先进机器学习方法或受新数据来源和新问题启发而开发新方法的可持续性问题。
But I immediately kinda, like, got to know folks across campus that were working more on the applied side, thinking about sustainability problems that had interesting data sets, interesting opportunities for applying the best machine learning methods or developing new methods that were inspired by new challenges that you see once you start looking at new sources of data, new kinds of problems.
是的。
Mhmm.
这就是我如何参与进来的,它一直是我研究问题和新研究想法的非常令人兴奋且极具启发性的来源。
And that's kind of like how I got involved, and it's been a very exciting, very interesting source of inspiration for problems and new research ideas, effectively.
太棒了。
Amazing.
让我们稍微深入聊聊一些我觉得非常令人兴奋的论文吧。
Let's get maybe a little bit into some of your papers that I think are very exciting.
所以你的研究触角几乎无处不在。
So you kind of have your, your hands everywhere.
你几乎在每一个领域都产生了影响。
You made an impact kind of literally everywhere.
我们之前邀请过丹·富(Dan Fu)做客播客,他是 FlashAttention 论文的第二作者,与特里·道(Tri Dao)一同署名,而你则是第三作者。
So, you know, we had Dan Fu on the podcast earlier, who is also, I think, second author along with Tri Dao, and, of course, you're the third author on the FlashAttention paper.
你研究过生成对抗模仿学习。
You worked on generative adversarial imitation learning.
你研究过基于分数的扩散模型。
You worked on score based diffusion models.
所以你涉猎了很多领域。
So you worked on a lot of things.
我们在这段短暂的播客中可能无法涵盖你所有的工作,但你能简单概述一下,到目前为止你职业生涯的亮点是什么吗?
We probably can't cover everything you've worked on, in this short, short podcast, but maybe give us like a brief overview, like what are the highlights of your career thus far?
我确信未来几十年你还会发明更多令人兴奋的东西,但就过去十一年而言,有哪些是你最引以为豪的五项成果?
I'm I'm sure there's gonna be more exciting things that you invent in the next few decades, but like maybe for the past eleven years, what are like the five things that you're the most proud of?
是的。
Yeah.
我觉得你已经提到一些了。
I think you mentioned some of them.
是的。
Yeah.
总的来说,我过去十年一直在研究生成模型,也就是现在所说的生成式人工智能。
Overall, I've been working on generative models, what's now called generative AI, for a decade.
是的。
Yeah.
我还记得在2014年、2015年左右,我们写关于生成模型的论文时,实际上很难发表。
I still remember back in, let's say 2014, 2015, that we were writing papers about generative models and it was actually hard to publish those papers.
我记得审稿人会问:为什么我要关心一个能生成新图像的模型?
Like I remember reviewers would say, why do I care about building a model that can generate new images?
而且这有什么用?
And then why is this useful?
你不得不在引言和其他实验中写一大段文字,说明这或许可以用于无监督表征学习。
And you have to write this kind of like text in the introduction and other experiments showing maybe this can be used to do unsupervised representation learning.
你把一个生成模型输入数据,就能得到一些特征,这些特征可以帮助你做半监督学习之类的事情。
You feed a generative model, then you get features that can help you maybe do semi supervised learning or like these kind of things.
但我一直觉得,这种能力是一种至关重要的、真正会有巨大用处的特性。
But I always felt that that was like a critical kind of like capability that would be really, really useful to build.
因此,过去十年我一直在研究各种类型的生成模型。
And so I've been working on different kinds of generative models for the past ten years.
而且是以一种全栈的方式,深入思考什么是正确的概率建模形式。
And in what you would call like a full stack manner, like really thinking about what is the right probabilistic formulation.
它应该是自回归的吗?
Should it be autoregressive?
它是从左到右生成的吗?
Would it kind of like generate left to right?
它应该基于对抗的方法吗?比如生成对抗网络和我与乔纳森合作过的生成对抗模仿学习。
Should it be based on an adversarial kind of like approach, like generative adversarial networks and generative adversarial imitation learning that I I worked on with with Jonathan.
我和杨松很早就开始研究扩散模型,当时是2019年,人人都在用生成对抗网络来生成图像和其他连续模态的数据。
I did very early work on diffusion models with Yang Song, kind of, like, back then in 2019; everybody was using GANs, generative adversarial networks, to generate images and other kinds of, like, continuous modalities.
那正是当时的前沿技术。
That was the state of the art.
我和杨松挑战了这一观点,向学界证明了有更好的方法,比如使用扩散技术——训练神经网络去噪图像,可以获得更高质量的结果。
With Yang, we kind of, like, challenged that, and we showed the community that there are better techniques for doing that, that you can use this diffusion approach, where you train a neural network to denoise images, and that can give you higher quality results.
这是一件大事,因为我们当时在和那些由许多人经过大量优化和微调的大型生成对抗网络竞争。
It was a big thing because we were kind of, like, competing with these very large generative adversarial networks that had been, you know, optimized and fine-tuned by a lot of people.
业界投入了大量资源到这种技术上,但我们仍然证明了,通过更严谨的方法,你可以获得很多优势,比如生成更高品质的样本、计算似然性、实现稳定的训练。
There were a lot of resources that were put into that kind of technology from industry and still we were able to show that with a more principled approach, you know, you could get a lot of benefits, you could get higher quality samples, you can get likelihoods, you can get stable training.
这成为了潜在扩散模型以及Midjourney背后技术的基础。
And that kind of, like, became the foundation for latent diffusion models and the technology behind Midjourney.
所以,如今人们用来生成图像、视频的许多扩散模型,都根植于我们与杨合作的那篇原始论文,我们当时证明了确实可以训练这样一个去噪网络,通过从纯噪声开始,逐步去除噪声,最终得到一个清晰的样本。
So, like, a lot of the diffusion models that people are using today to generate images and video are kind of, like, rooted in that original paper that we had with Yang, kind of, like, showing that indeed you can train this denoising network and you can use it to generate images by starting from pure noise and then gradually removing it in order to get a clean sample at the end.
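The sampling loop described here — start from pure noise, then gradually remove it — can be sketched numerically. This is a hypothetical toy, not the actual model: the "denoiser" below is a hand-written interpolation toward a made-up target vector standing in for a trained neural network.

```python
import numpy as np

# Toy sketch of diffusion-style sampling: begin from pure Gaussian noise
# and repeatedly apply a "denoiser" that nudges the sample toward clean
# data. A real diffusion model learns the denoiser with a neural network;
# here TARGET is an invented stand-in for "clean data".
TARGET = np.array([1.0, -2.0, 0.5])

def toy_denoiser(x, step, num_steps):
    # interpolate a growing fraction of the way toward the clean sample,
    # mimicking how each denoising step removes part of the noise
    alpha = 1.0 / (num_steps - step)
    return x + alpha * (TARGET - x)

def sample(num_steps=50, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(3)           # pure noise at the start
    for step in range(num_steps):        # gradually remove the noise
        x = toy_denoiser(x, step, num_steps)
    return x

print(sample())  # lands very close to TARGET
```

The shape of the loop is the point: every step applies the same denoiser at a decreasing noise level, which is the structure a real score-based model shares.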
我专注于优化,从架构层面进行思考。
I worked on, you know, optimizing, like thinking about at the architecture level.
是的。
Yeah.
归根结底,我们研究的是深度生成模型。
Like, you know, the end of the day, we're thinking about deep generative models.
这背后有一个概率框架。
There's a probabilistic formulation.
在幕后,总有一些深度学习的魔法,你需要使用神经网络来参数化统计模型。
Then there is always some deep learning magic under the hood where you need to use a neural network to parameterize statistical models.
因此,我一直在思考不同的架构。
And so I've been thinking about different architectures.
我一直在思考像状态空间模型这样的东西——那是与我和Chris Ré共同指导的Tri一起做的;也在思考如何让Transformer更高效,同样是与Tri合作的FlashAttention论文。
I've been thinking about things like, you mentioned, state space models with Tri, who I co-advised with Chris Ré, thinking about ways to make transformers more efficient with, again, with Tri on the FlashAttention paper.
我研究过对齐问题。
I worked on alignment.
我最近与Rafael、Chris Manning和Chelsea合作发表了一篇关于DPO的论文。
My recent paper that I had with with Rafael and Chris Manning and Chelsea on on DPO.
也就是直接偏好优化,结合了我们从强化学习和偏好收集中理解的一些技术,找出一种更稳健的RLHF方法,而无需实际使用强化学习。
So direct preference optimization, kind of, like, combining some of the techniques that we understand from RL and preference elicitation, and figuring out what is a more robust way of doing RLHF that does not actually involve reinforcement learning.
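The core of the DPO idea mentioned here — RLHF-style alignment without reinforcement learning — is a simple loss over preference pairs. The sketch below follows the published DPO loss formula; the log-probability numbers in the example are invented for illustration.

```python
import math

# Sketch of the per-example DPO loss: given log-probabilities of a
# preferred completion (y_w) and a dispreferred one (y_l) under the
# policy and under a frozen reference model, the loss is
#   -log sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l)))
# No reward model or RL rollout is needed.

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    ratio_w = logp_w - ref_logp_w    # log pi(y_w|x) - log pi_ref(y_w|x)
    ratio_l = logp_l - ref_logp_l    # log pi(y_l|x) - log pi_ref(y_l|x)
    margin = beta * (ratio_w - ratio_l)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid

# Made-up numbers: the policy favors y_w more than the reference does,
# so the loss falls below log(2), its value at zero margin.
print(dpo_loss(-10.0, -12.0, -11.0, -11.5))
```

In practice the log-probabilities come from summing token logits of a language model, and the loss is averaged over a dataset of human preference pairs.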
所以做了不少事情。
So quite a few things.
我最近最兴奋的是扩散语言模型。
The most recent one that I'm very excited about is diffusion language models.
你知道,正如我们都知道的,语言模型无处不在,每个人都对此非常兴奋。
You know, as we all know, language models are everywhere and everybody is very excited about that.
但大多数现有模型都是自回归的。
But most existing models are autoregressive.
它们基本上是逐个令牌从左到右生成的。
They they kinda like generate left to right one token at a time.
这可以说是最简单、在很多方面也是最差的生成模型。
And that's the kind of like simplest and in many ways the worst kind of generative model.
至少,当我教授斯坦福大学的深度生成模型课程时,我们就是这样看待它的。
At least that's kinda like how we we frame it when I teach the deep generative model class at Stanford.
我的意思是,我们讨论了很多自回归模型的缺点,没错。
I mean, we talk a lot about the downsides that you have, yeah,
一旦你试图构建一个自回归模型。
The moment you try to build an autoregressive model.
是的。
Yeah.
人们利用这项技术取得了令人惊叹的成果,这确实令人兴奋,但我认为我们实际上可以做得更好。
It's exciting that people have been able to get amazing results using that technology, but I think we can actually do better.
嗯哼。
Mhmm.
因此,我一直在探索如何使用扩散模型来生成非图像或视频的离散数据,比如代码、文本,甚至DNA。
And so I've been exploring ways to use diffusion models to generate not images or video, but more discrete kind of data, like code, text, perhaps even DNA.
因此,我一直在乐趣满满地训练基于扩散的新类型语言模型,而且确实如此。
And so I've been having a lot of fun training new kinds of language models that are based on diffusion and Yeah.
它们更快、更便宜、更可控。
They they they are faster, they're cheaper, more controllable.
所以,通过创新并真正走出一条少有人走的路,构建前人未曾探索过的新东西,你会获得许多有趣的特性。
So you get a lot of interesting properties by trying to innovate and really kinda like go going a little bit off the beaten path and and building new things that people have not explored before.
对吧?
Right?
还有大量待摘的果实,有更多优化和发现新事物的机会。
There's a lot more low-hanging fruit, a lot more opportunities for optimizing things, discovering new things.
所以这一直令人兴奋。
So it's always been exciting.
我觉得,这大概是我研究生涯中一个共同的主题,
I think, kinda like, a common theme across my research career is
是的。
Yes.
尝试追求高风险、高回报的想法。
Trying to go for high risk, high reward kind of ideas.
对。
Yeah.
对。
Yeah.
你总是需要保持平衡。
You you always need to balance.
你总是需要有一点组合,但嗯。
You always need to have a little bit of a portfolio, but Mhmm.
我总是鼓励我的学生要有雄心壮志,去追求那些真正能改变整个领域的想法。
I always try to encourage my students to be ambitious and try to go for the for the ideas that could really change the field.
是的。
Yeah.
嗯,这就说得通了。
Yeah, that makes sense.
我一直对你们实验室的运作方式印象深刻,尤其是你们每年能产出如此大量的研究成果。
I think one thing that's always impressed me about your lab and and the way that you run research is the volume of research that you guys are able to pump out every year.
这真的非常令人印象深刻。
It's, like, incredibly impressive.
我想举一个NeurIPS 2024的例子。
I think I just like one example in NeurIPS twenty twenty four.
我记得你们有十几篇论文被接收,超过十篇,大概有13篇不同的论文。
I think you guys had a dozen, more than a dozen, like 13 different papers that were accepted.
当然,你在这么多不同领域所取得的成就确实非常了不起。
You know, obviously, it's really impressive the work that you've done across so many different fields.
你是怎么管理你的实验室,以便同时开展这么多高质量、有影响力的研究项目的?
How do you sort of manage your lab to be able to work on so many sort of high caliber impactful projects at the same time?
是的。
Yeah.
我尽量让学生们负责自己的项目,拥有主导权。
I try to have students basically have ownership Mhmm.
基本上就是对自己的项目拥有主导权。
Of of, you know, their own projects, basically.
我认为这对他们作为研究者的发展很重要,也能真正获得认可,比如成为论文X的第一作者。
I think it's important for them to develop as a researcher and really also get recognized as, like, I'm the first author of paper X.
对。
Yeah.
这很重要。
That that's important.
所以,攻读博士学位的一个优缺点是,你对工作拥有更多主导权,但团队规模较小,因此研究工作会比大型工业实验室更加分散和碎片化——在工业实验室里,会有庞大的团队共同专注于同一个项目。
And so it's a it's a pro and a con of of of doing a PhD that you have more ownership of the work, the teams are a little bit smaller and so it becomes a little bit more fragmented and fractional compared to what would happen, for example, in a big industry lab where you have like very large teams all working on the same thing.
但作为这一做法的副产品,我们确实能够探索许多不同的想法,因为作为教授和导师,确保学生对自己的项目拥有主导权、彼此不干涉、并能明确归属项目贡献者,这一点非常重要。
But as a side product of that, yeah, we tend to be able to explore many different ideas because for me as a as a professor, as an adviser, it's important to make sure that the students have ownership of their individual projects and they don't step on each other's toes and there is clear credits that can be assigned to whoever led the project.
但当你为他们写推荐信时,我有很多素材可以称赞学生,比如可以说:他们主导了项目X或项目Y,并取得了出色的成绩。
But, you know, whenever you write a recommendation letter for them, you know, I have material to be able to say amazing things about the students because I can say, okay, they led project x or project y, and they did amazing things.
所以,这是博士项目的一个优势。
So that's a good thing about PhD programs.
我想在播客开始前我们就讨论过这一点,我认为导师和学生之间的激励机制非常一致。
I think we were talking about this before this podcast started that I think the incentives between the advisor and the students are very aligned.
学生成功了,我才算成功。
I succeed if the students succeed.
因此,我一直非常关注这一点,确保学生拥有自己的空间,能够表达自我。
And so it's been, always, something that I pay a lot of attention to to make sure that, students have their own space and they can, you know, express themselves.
他们可以发挥创造力。
They can be creative.
他们可以探索各种各样的想法。
They can explore all kinds of ideas.
简直太棒了。
Absolutely incredible.
那么我们来谈谈生成模型这一块。
So getting into the generative model side of things.
我们之前简单聊过扩散语言模型,你知道,很多传统的其他方法都只是从左到右处理。
So, you know, we talked briefly about diffusion language models, you know, how a lot of these traditional other approaches just look left to right.
一个非常令人兴奋的进展是,当然,如果我们不提一下你创办的杰出公司Inception Labs,那就太遗憾了,这里每个人都应该去申请。
So one really exciting advancement is, of course, you know, we would be remiss to not mention your wonderful company Inception Labs that everybody here should go apply to.
所以能不能简单跟我们讲讲你的创业历程?
So may maybe just tell us a little bit about, you know, your founding journey.
你和弗拉基米尔·库列绍夫以及阿迪亚·格罗弗共同创办了这家公司,这两位也是了不起的人才,我记得他们都曾在斯坦福获得博士学位。
So you've founded it with Vladimir Kuleshov and Aditya Grover as well, two other amazing people who also got, I believe, their PhDs at Stanford.
我现在知道阿迪亚现在是加州大学洛杉矶分校的教授,而弗拉基米尔在康奈尔科技学院。
I'm gonna say Aditya is at UCLA right now as a professor, and Vladimir is at Cornell Tech.
你们三位正好覆盖了美国三大主要城市。
So you guys are sort of hitting all three of the big three cities.
你有帕洛阿尔托的斯坦福,纽约市的康奈尔科技,还有洛杉矶的加州大学洛杉矶分校。
You got Palo Alto with Stanford, New York City with Cornell Tech, and even LA with UCLA.
你们是怎么管理这种跨城市的协作的?
How are you guys, you know, managing this multicity effort?
你们是怎么招募团队的?
How'd you guys go about recruiting a team?
你们最初是怎么认识的?
What sort of the the journey of how you guys first met?
是的。
Yeah.
阿迪亚和沃洛实际上是我在斯坦福的博士生,所以我们认识很久了。
So Aditya and Volo were actually my PhD students at Stanford, so we've known each other for a very long time.
明白了。
Okay.
阿迪亚和沃洛确实是我最优秀的博士生之一。
Aditya and Volo, yeah, were some of my best PhD students.
他们曾是办公室同事。
They were office mates.
他们彼此也认识很久了。
They've known also each other for for for a very long time.
他们合作得非常好。
They worked really well together.
所以我们知道,如果我们能一起合作,一定会非常成功。
And so we we knew we we could be very successful as a team if we were to to work together on this.
他们也一直是生成模型领域的领军人物,Volodymyr——我们叫他Volo——的实验室发表了许多关于扩散模型和扩散语言模型的论文。
And they also had been leaders in generative models, and Volodymyr, or Volo as we call him, had a bunch of papers around diffusion models and diffusion language models in his lab.
因此,我们都对这个领域非常感兴趣,觉得现在是创业的绝佳时机,能够获得所需的资源来扩大规模,真正证明这项技术不仅仅是一篇酷炫的研究论文,而是能够应用于现实世界,解决真实问题和客户痛点,承载生产流量。
So we were all kind of like very interested in this space and we felt like the time was right for starting a company and getting the kind of resources we needed to scale things up and really demonstrate that this technology is not just like a cool research paper, but it could actually be used in the real world to solve real problems, real customer pain points, serve production traffic.
这一切始于一篇ICML论文,该论文荣获最佳论文奖,首次证明扩散语言模型在性能上已能与自回归模型竞争,尽管仍处于小规模阶段,比如GPT-2级别,但它们已具备相当的似然性和样本质量。
And so it all started with an ICML paper, which won the best paper award, kind of showing that for the first time diffusion language models were competitive with autoregressive models, still at a small scale, like at the GPT-2 scale, but they were getting competitive likelihoods, competitive sample quality.
真正让我兴奋的是,它们的速度快得多,对吧?
And what really got me excited was that they were much faster, right?
因为扩散模型在每次神经网络评估时基本上可以输出多个标记。
Because the diffusion model can essentially output multiple tokens for each neural network evaluation.
因此,与传统自回归模型相比,这带来了显著的速度提升——传统模型每次神经网络评估只能生成一个标记。
And so that was enabling some really big speedups compared to traditional autoregressive models, where you only get one token per neural network evaluation.
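The latency argument can be made concrete with a back-of-the-envelope count of network calls. The block size and step count below are invented for illustration — they are not Mercury's actual configuration.

```python
import math

# Why emitting many tokens per network call helps latency: an
# autoregressive model needs one forward pass per generated token,
# while a diffusion model runs a fixed number of denoising steps
# over a whole block of tokens at once.

def autoregressive_calls(num_tokens: int) -> int:
    # one forward pass per generated token
    return num_tokens

def diffusion_calls(num_tokens: int, tokens_per_block: int,
                    steps_per_block: int) -> int:
    # a fixed number of denoising steps refines a whole block at once
    blocks = math.ceil(num_tokens / tokens_per_block)
    return blocks * steps_per_block

# Generating 1024 tokens with 64-token blocks and 8 denoising steps each:
print(autoregressive_calls(1024))    # 1024 forward passes
print(diffusion_calls(1024, 64, 8))  # 128 forward passes
```

With these assumed numbers the diffusion sampler needs 8x fewer forward passes; each pass does more work, but the reduced sequential depth is what drives the wall-clock speedups.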
是的。
Mhmm.
看到这些结果让我非常兴奋。
So seeing those results really got me excited.
我真的很想看看,如果我们训练更大的模型会发生什么。
I really wanted to see what happens if we were to train bigger models.
我们真的需要花心思优化整个推理栈,比如运用 FlashAttention 之类的思路。
We really needed to put some care into actually optimizing the whole inference stack, with ideas like FlashAttention and things like that.
当然。
Of course.
因此,这促使我创办了一家公司,组建一支顶尖的工程团队来优化整个技术栈。
And so that's kinda like what prompted me to to actually start a company to put together a really top notch engineering team to optimize the whole whole stack.
而且这是一个全新的栈。
And and it's a new stack.
它基于一种非常不同的技术。
It's based on a very different technology.
因此这让我感到非常兴奋,因为我们还有很多研发工作要做。
And so that's what makes it very exciting to me because there is a lot of R and D that we need to do.
还有很多想法有待开发。
There is a lot of ideas to be to be developed.
还有很多低垂的果实,我得说。
And there is still quite a lot of low-hanging fruit, I have to say.
比如,以前没人真正构建过这些模型,因此有很多有待发现的东西,很多需要优化的地方,很多新算法,以及大量开展有趣研究工作的机会。
Like, nobody has actually built these models before, and so there is a lot to be discovered, a lot to be optimized, a lot of new algorithms, lots of opportunities for doing interesting research work, essentially.
嗯。
Mhmm.
是的。
Yeah.
当然。
Absolutely.
我们不妨聊聊你学到的一些东西。
Let's maybe talk a little bit about some of the things that you learned.
你知道,你显然有着深厚的研究背景,康奈尔大学的博士,斯坦福大学任教近十年,现在你开始组建团队,处理招聘、公司扩张、融资等所有运营事务。
So, you know, you're coming from obviously this very heavy research background, you know, a PhD from Cornell, you know, being a professor at Stanford for around a decade, and now you go into, you know, assembling a team, all this sort of operational overhead that comes with hiring people, scaling a company, raising from investors, etcetera, etcetera, etcetera.
你是怎么应对这一切的呢?
How'd you sort of go about doing all of that?
阿迪亚或沃洛有没有之前创业的经验,让你能从他们的经历中学习一些东西?
Did, you know, Aditya or Volo, maybe they had previous companies, so you're able to learn a little bit from their experience.
团队是怎么一步步走到这一步的?
Like, how did the team sort of move there?
是的。
Yeah.
我确实需要学习很多东西,但这也是这段经历中非常有趣的一部分。
I certainly had a lot of learning to to do, and that was also actually a fun part of the of the of the experience.
我觉得是的。
I felt like Yeah.
你知道,我当教授已经很久了,是时候做一些稍微不同的事情了,这样我可以成为一个更好的管理者,学习一些新技能。
You know, I'd been a professor for a long time, and it was time to actually do something slightly different so that I could, you know, become a better manager, learn some new skills
是的。
Yeah.
拓展我的人脉。
Expand my network.
幸运的是,Volo(弗拉基米尔)之前创办过一家名为Afresh的初创公司。
Luckily, Volo, Vladimir had a startup before called Afresh.
嗯。
Mhmm.
那是一家在不同领域非常成功的初创公司,他们专注于供应链优化。
A very successful startup in a different space; they do supply chain optimization.
但他曾是这家公司的创始人兼首席技术官。
But he was a founder, CTO of the company.
因此,他带来了大量关于如何创业的专业知识和建议。
And so he brought a lot of expertise, a lot of advice on how to go about starting a company.
正如你提到的,比如融资、组建团队,还有围绕工程团队建设的各种流程。
As you mentioned, like raising funds, building a team, just processes around building out the engineering team.
所以,创始团队里能有一位有过创业经验的连续创业者,真的非常有帮助。
So it's been super helpful to have a repeat founder on the founding team.
当然。
Of course.
是的。
Yeah.
在你创立、启动并逐步扩大公司规模的过程中,有没有什么你认为学到的重要经验?
Are there any lessons that you feel like you've learned in your sort of journey of, like, founding inception and scaling it over time?
现在还为时过早。
It's still early.
我觉得,我的意思是,我感觉一切都取决于人。
I feel like, I mean, it feels like it's all about the people.
这是最重要的一课。
That's the most important lesson.
我看到这些项目和想法成功或失败,主要原因是人才,拥有卓越的才能似乎是最关键的因素。
Like, the I see the projects and and ideas succeed or fail, and it's mostly due to people, like having exceptional talent seems to be the most important thing.
因此,我们在招聘上投入了大量时间,努力招募最优秀的人才,帮助我们实现这一宏伟愿景——在未来几年里,让扩散语言模型成为下一代、也是最顶尖的语言模型。
And that's why we're spending so much time on recruiting and really getting the best people to help us achieve this very ambitious vision that we have of getting diffusion language models to be the next, and the best, kind of language model in the next few years.
是的。
Yeah.
当然。
Of course.
完全合理。
Totally makes sense.
这也正是在听的各位应该去了解Inception并申请的原因,如果你感兴趣的话。
And that is also why people who are listening should definitely go look at inception and and definitely apply if you're interested.
让我们聊聊扩散语言模型的一些应用场景吧。
Let let's get maybe into some of the applications of diffusion language models.
所以,你的第一个产品是Mercury,它是世界上首个商业化规模的扩散语言模型。
So, you know, your first product is Mercury, which is like the world's first commercial scale diffusion language model.
我们之前讨论过,这种模型有大量不同的应用场景。
You know, there are tons of different applications of this that we've talked about.
其中之一是自动补全系统。
One of them is sort of auto complete systems.
我想我们也聊过一些其他的应用场景。
I I think we've also talked about some of the other applications.
也许你能简要概述一下,你对扩散语言模型所有潜在应用的愿景是什么?
Maybe give us like a brief overview, like where do you sort of see the vision of all potential applications of diffusion language models?
比如,你能列出三个你最希望用Inception实现的顶级应用吗?
Like, maybe give us, like, three of the top applications that are sort of on your bucket list to hit with inception.
是的。
Yeah.
从某种意义上说,扩散语言模型可以直接替代传统的自回归模型。
So diffusion language models are, in some sense, a drop-in replacement for traditional autoregressive models.
所以API是文本输入,文本输出。
So the API is the same: text in, text out.
因此,原则上,任何可以基于自回归语言模型构建的应用,也可以基于扩散语言模型构建。
And so in principle, anything that can be built on top of an autoregressive language model can also be built on top of a diffusion language model.
目前,我们拥有的模型速度非常快。
Right now, the models that we have are really, really fast.
因此,我们在那些对延迟要求非常高的应用中看到了很大的吸引力。
And so we're seeing a lot of traction in applications where latency is very important.
正如你提到的,类似于自动补全类型的问题,特别是在代码生成领域。
So as you mentioned, kind of like auto complete types of problems, especially in the code generation space.
我们看到了一些非常出色的结果,比如Mercury在Copilot Arena中的评估表现。
We're seeing some really, really good results, like Mercury as evaluated by Copilot Arena.
这可以说是代码生成模型的LLM竞技场。
This is kind of like the LLM arena for code generation models.
它在质量方面排名第一,与少数几个模型并列,但仍然是第一。
It's the number one LLM for quality, tied with a few others, but still number one.
是的
Yeah.
而且在获取结果的速度方面,它也是最快的。
And it's the fastest in terms of, like, the the speed at which you can get results.
对
Mhmm.
它在下一步编辑方面也非常出色,这是一种高级版本的自动补全,可以根据上下文修改多个内容。
It is also really good at next edit, which is kind of like a fancy version of autocomplete where you're able to modify many things based on the context.
对
Mhmm.
我们刚刚宣布了与Continue的合作。
We were just, we just announced a partnership with Continue.
因此,他们的Next Edit系统现在完全基于扩散语言模型。
So now their Next Edit system is based entirely on diffusion language models.
它基于Mercury。
It's based on Mercury.
所以如果你想试试,去下载 Continue,使用它的 NextEdit 功能,就能亲身体验扩散语言模型的强大之处。
So if you wanna try it out, go download Continue, use the Next Edit feature there, and you can see the power of diffusion language models in action.
这真的很酷,因为你可以利用前后文来判断如何进行修改。
It's really cool because, you know, you can actually use context to the left and to the right to figure out how to make edits.
我们看到很多这类应用的机会,比如需要修改代码或文本的场景。
We're seeing a lot of opportunities in the in this kind of applications where we need to essentially modify code or modify text.
嗯。
Mhmm.
另一个获得大量关注的应用是智能代理,这类场景需要与用户进行实时互动。
Another one where we're getting a lot of traction is, you know, like, agents, where there is a need to, you know, interact with people in real time.
举个例子,语音代理,或者在循环中使用大语言模型来决定该问什么问题,或者调用哪些工具来获取提供客户服务所需的信息。
So again, think of voice agents or maybe there is an LLM in the loop that is used to figure out what kind of questions to ask or figure out what kind of tools to call to get the information that you need to provide customer support.
这同样是一个对延迟要求极高的应用场景。
Again, it's an application where people really care about latency.
是的。
Yeah.
扩散语言模型速度极快、质量高且可控。
And the diffusion language models are extremely fast, high quality, controllable.
因此,我们看到许多应用程序基于Mercury模型构建,用于客户支持、呼叫中心自动化等场景。
And so we're seeing a lot of apps being built on top of Mercury models for customer support, call center automation, things like that.
嗯。
Mhmm.
明白了。
Gotcha.
完全合理。
Totally makes sense.
那么,现在让我们切换一下话题,在我们最后的几分钟里给出一些建议。
So maybe switching gears now to some advice in our last few minutes here.
你知道,观众中的很多人,比如博士生和研究生,都在思考他们接下来想做什么。
So, you know, a lot of the people in the audience are, of course, PhD students, grad students that are thinking about what they want to do next.
他们可能在考虑实习,或者甚至考虑成为一名教授。
Maybe they're thinking about internships or they're thinking about maybe even becoming a professor.
看起来你显然在学术界有一些经验,毕竟你当了十多年教授,但你也对业界方面有一定了解。
It seems like you obviously have experience in academia, having been a professor for over a decade, but you also have some experience with the company side of things, the industry side of things.
你对当今的研究与产业之间的关系有什么看法?对于研究生,你有什么建议吗?
What's sort of your outlook on research versus industry today and maybe any advice that you would give to grad students?
我不认为这必然是一种非此即彼的关系,因为业界也在进行大量研究,所以这并不是研究与产业之间的对立。
I mean, a lot of research is happening in industry as well, so I don't think it's necessarily research versus industry.
如果你考虑学术研究与产业研究之间的区别,那确实存在一些更有趣的权衡。
If you think about academic research versus industry research, there you have more interesting trade-offs.
是的。
Yeah.
我觉得这和我们之前讨论的内容是相关的。
I think they tie back to the discussion we were having before.
我认为博士生涯的美妙之处在于,你会拥有极大的自由度。
I think the beauty of a PhD is that you're gonna have a lot of freedom.
你将有机会建立属于自己的个人品牌,真正拥有自己的想法和项目,这非常重要。
You're gonna have a lot of opportunity to build a personal brand and really have ownership of your ideas and your projects, which is really important.
如果你考虑成为一名教授,或者基于博士期间的研究创办自己的公司。
That matters if you're thinking about becoming a professor, or perhaps starting your own company based on the research that you developed during the PhD.
因此,这仍然是一个极其宝贵的机会,可以提升你自己作为一个人、一名研究者和一位学者的能力。
So it's still an extremely valuable, I think, opportunity to improve yourself as a person, as a researcher, as a scholar.
嗯哼。
Mhmm.
而其中的权衡在于,如今工业界在人工智能研究上投入了大量资源。
The trade-off is that, of course, there are now a lot of resources being invested in AI research in industry.
因此,如果你选择攻读博士而不是加入初创公司或工业界的研究实验室,实际上可能错失了大量收入。
And so you could be leaving a lot of money on the table, essentially, by going into a PhD instead of joining a startup or an industry research lab.
我认为,这是最大的一个权衡因素,至少从我与正在面临这一抉择的学生和人们的交流中可以看出。
I think that is the biggest trade-off, at least from talking to students and people that are facing this decision right now.
这有点像是机会成本。
It's kind of like the opportunity cost.
而且,你知道,这种感觉是,这可能是历史上一个独特的时期,等博士毕业几年后,这种机会可能就不复存在了。
And there's the feeling that this might be a unique time in history, and that the opportunity might not be there in a few years when the PhD is over.
是的
Yeah.
我的一贯理念是着眼于长期发展,而攻读博士学位是对自身的一种极佳投资。
My philosophy has always been to optimize for the long term, and a PhD is a really good investment in yourself.
所以我仍然认为值得去做。
So I still think it's worth doing.
未来会有更多机会,我不认同那种‘现在不干就永远没机会了’的说法。
There's gonna be more opportunities and I don't believe in this, oh no, it's now or never.
如果你现在不抓住这个机会,我确信未来还会有其他技术、其他机会,让你去创造酷炫的东西,而拥有博士学位,
And, you know, if you don't take this opportunity now, I'm sure there are going to be other technologies, other opportunities for developing really cool things, and having a PhD
会为你铺平成功之路。
will set you up for success.
是的
Yeah.
最后两个问题。
Two last questions.
所以第一个问题,你认为什么造就了一名成功的研究员?
So number one, what do you think makes a successful researcher?
换句话说,你提到Volo和Aditya是你带过的最优秀的博士生之一。
In other words, you know, you mentioned Volo and Aditya are, like, two of the best PhD students you've ever had.
你认为他们身上有哪些具体的品质使他们如此出色?
What do you think are some of the concrete qualities that make them, like, really exceptional students?
而且,在你这十年的教学生涯中,你积累了大量的数据来分析这个问题。
And, obviously, over your decade of advising, you have a lot of training data to work with here.
有太多太多优秀的学生了。
Tons and tons of incredible students.
当然,部分要归功于你的指导,但究竟是什么让优秀的学生和顶尖的学生区别开来呢?
Obviously thanks in part to your advising, but what makes the good ones versus the great ones?
然后第二个问题,你现在最看好的研究领域是什么?
And then second, like, what are the fields that you're most excited about right now?
显然,我们有扩散语言模型。
So, obviously, we have diffusion language models.
这是一个进步。
That's one advancement.
但根据你的经历,你总是在同时进行很多不同的事情。
But, given your history, you're always working on a bunch of different things.
所以我相信你的学生们现在正默默从事着各种不同的项目。
So I'm sure you have your students that are quietly working on, you know, different projects right now.
有没有哪些突破让你更加看好?
Are there any sort of breakthroughs that you are more bullish on?
比如,很多人对人工智能用于科学领域非常感兴趣。
Like, I think one area a lot of people are excited about is AI for science.
所以能否简要谈谈这个领域,以及你未来几年最期待的一些工作?
So maybe talk briefly about that and some of the work that you're excited about in the next few years?
嗯。
Mhmm.
是的。
Yeah.
关于什么造就了一位优秀的研究者,我认为这其实是一个多方面的组合,人们在攻读博士时往往带着不同的优势和不足。
In terms of what makes for a great researcher, I think it's really a mix of things, and I think people come into their PhD with different kinds of strengths and weaknesses.
我的意思是,这其实是很直观的。
I mean, it's the mix you would imagine.
没有什么神秘的秘诀或特别的东西。
Like, there's no secret sauce or anything special.
就是说,人们需要有创造力。
It's just like, you know, people need to be creative.
他们需要聪明。
They need to be smart.
他们需要非常努力地工作,并且能够快速迭代想法,判断哪些可行、哪些不可行。
They need to work really hard and be able to iterate fast on ideas and see what works and what doesn't.
是的。
Mhmm.
他们还需要有条理。
They need to be organized.
他们还需要是优秀的工程师,善于编写代码和做实验,并且能够妥善跟踪实验进展。
They need to be, you know, just good engineers: good at writing code, running experiments, and keeping track of experiments.
要擅长数学,推导定理,而且不同的人在进入博士阶段时技能水平各异,比如沟通能力、科学表达能力,能否清晰地表达和传达自己的想法。
Good at math, proving theorems. And, you know, people come into the PhD with different levels of skill, maybe even in communication, scientific communication, the ability to express your ideas and communicate them clearly.
然后他们会随着时间推移不断进步。
And then they get better over time.
我认为我们是通过不断接触其他研究者、与聪明的学生合作、跟随教授工作、与同龄人交流而逐渐成长的。
I think you grow through exposure: you get exposed to other researchers, you work with other smart students, you work with professors, with your peers, and you get better over time.
正是这些因素的综合作用,才真正决定了你作为一名研究者能取得多大的成功。
And it's a mix of those things that really determines how successful you're going to be as a researcher.
关于研究方向,我觉得有很多选择。
In terms of research directions, I think there's quite a few.
人工智能赋能科学绝对是一个让我感到非常兴奋的方向。
AI for science is definitely one that I think is really exciting.
我们可以将大量现有知识融入模型中,还有很多有趣的应用领域正等待我们去探索。
There's lots of existing knowledge to bring into the models, and lots of interesting application areas waiting for us.
由于工业研究实验室可能无法接触到当前学术研究机构中那样的专家,因此那里还存在更多容易实现的成果。
There's more low-hanging fruit there, just because industry research labs may not have access to the kind of experts that are available right now on academic research campuses.
对。
Right.
所以我认为那里还有很多工作可做。
So I think there is a lot to be done there.
嗯。
Mhmm.
另一个更具体的领域,是词元化器仍然需要大量工作。
Another one, yeah, more concretely, you know, there's still quite a lot of work to be done on tokenizers.
对。
Right.
针对不同类型生成模型的高效推理。
Efficient inference for different kinds of generative models.
我认为那里还有很多工作可做。
I think there's a lot of work to be done there.
嗯。
Mhmm.
好的。
Okay.
太棒了。
Awesome.
非常感谢你抽出时间,斯特凡诺。
Well, thank you so much for your time, Stefano.
谢谢你的邀请。
Thanks for having me.
是的。
Yeah.
再见。
Bye bye.