
走进NotebookLM:与Raiza Martin和Steven Johnson的对话

Inside NotebookLM with Raiza Martin and Steven Johnson

本集简介

NotebookLM是一款由Gemini驱动的研究助手,擅长将专业知识转化为引人入胜的叙述方式。用户可上传自有文档生成洞见、解释说明,近期还新增了播客制作功能。这项被称为"音频概览"的特色功能激发了全球数百万人的创造力,他们利用简历、个人日记、销售提案等素材制作了数千档精彩播客,内容涵盖个人故事与教育科普。

本期节目由谷歌实验室的Raiza Martin和Steven Johnson担任向导,带领主持人Hannah Fry探索实现NotebookLM的技术突破。他们将探讨"趣味性"的本质、自然语音生成的挑战,以及即将问世的全新交互模式。

延伸阅读:
立即体验NotebookLM
了解音频概览背后的语音生成技术:https://deepmind.google/discover/blog/pushing-the-frontiers-of-audio-generation/

特别鸣谢(包括但不限于):
主持人:Hannah Fry教授
系列制片:Dan Hardoon
剪辑:Rami Tzabar(TellTale Studios)
监制&制片:Emma Yousif
作曲:Eleni Shaw
摄影指导&视频剪辑:Daniel Lazard
音频工程师:Perry Rogantin
演播室制作:Nicholas Duke
视频剪辑:Alex Baro Cayetano, Daniel Lazard
视频美术设计:James Barton
视觉标识设计:Eleanor Tomlinson

谷歌DeepMind出品。若喜欢本期节目,请在Spotify或苹果播客留下评价。我们始终期待听众的反馈、创意或嘉宾推荐!

本节目由AdsWizz旗下Simplecast平台托管。个人信息收集及广告用途详见pcm.adswizz.com

双语字幕

仅展示文本字幕,不包含中文音频。

Speaker 0

欢迎收听Google DeepMind播客。我是汉娜·弗莱教授。今天我想或许不同寻常地,从另一档播客的片段开始。请听这个。

Welcome to Google DeepMind, the podcast. I'm professor Hannah Fry. Now I wanna start today, unusually perhaps, with a clip from another podcast. Listen to this.

Speaker 1

这里的核心信息是什么?是社会评论、艺术表达,还是只是一个精心设计的玩笑?

What's the overall message here? Is it social commentary, artistic expression, or just a really elaborate joke?

Speaker 2

我认为这正是这个作品的魅力所在。它难以简单归类,存在于语言与非语言、艺术与荒诞之间的阈限空间。

That's the beauty of this piece, I think. It defies easy categorization. It exists in this liminal space between language and non language, between art and absurdity.

Speaker 0

这场讨论非常有趣——你可能已经猜到了——是由AI生成的。但这段音频最值得注意的,除了两位播客主持人从未真实存在过之外,是他们关于人性和我们与艺术关系的微型论述,竟源自最不寻常的提示词。这档播客由名为'音频概述'的新功能创建,该功能属于Google实验室推出的个性化AI研究助手NotebookLM。NotebookLM由Gemini驱动,允许你上传各种来源——从PDF到视频——来生成见解、解释,当然还有播客。我们常认为AI只是处理数据并吐出答案,但NotebookLM借鉴了讲故事的专长,以引人入胜的方式呈现信息。

This is a very interesting discussion, which, as you might have guessed, is AI generated. But what is notable about this particular clip, aside from the fact that neither of the two podcast hosts have ever existed, is that their conversation, a mini treatise on human nature and our relationship with art, was generated from the most unusual of prompts. The podcast itself was created by a new feature called audio overview, part of NotebookLM, a personalized AI research assistant from Google Labs. Now NotebookLM, it's powered by Gemini, and it lets you upload your sources, anything from PDFs to videos to generate insights, explanations, and, of course, podcasts. We often think of AI as just crunching through data and spitting out answers, but NotebookLM draws on expertise from storytelling to present information in an engaging way.

Speaker 0

我们想看看,当你要求NotebookLM分析大多数人会认为是无意义的内容时会发生什么——一份只重复两个词(卷心菜和水坑)一千次的文档。这就是结果。

And we wanted to see what happens when you ask NotebookLM to analyze what most people would consider to be nonsense, a single document containing just two words repeated a thousand times over, cabbage and puddle. And here is the result.

Speaker 1

我得承认,起初我在想:这到底是怎么回事?但越想越觉得好奇,你知道吗?

So I have to admit, at first, I was like, what is going on here? But the more I think about it, the more curious I get. You know?

Speaker 2

这很迷人,不是吗?我们就像在拼一个只有一块的拼图,对吧?

It is fascinating, isn't it? We're, like, dealing with this one piece puzzle. Right?

Speaker 3

对。

Right.

Speaker 2

我们正在试图弄清楚,嗯,这件作品告诉了我们什么?你觉得呢?你的第一印象是什么?

And we're trying to figure out, well, what does this piece tell us? What what do you think? What's your first impression?

Speaker 1

说实话,这几乎像是催眠一样。就像你真的凝视着一滩水,只看到这些卷心菜在周围漂浮。

Honestly, it's, it's almost like hypnotic or something. Like, if you were really staring into a puddle and all you saw were these cabbages floating around

Speaker 2

我能想象到那个画面。

I can see it.

Speaker 1

有点令人不安,但也挺有趣的。

It's a little unsettling, but but also kind of funny.

Speaker 0

几分钟的智力分析,充斥着看似相关却根本不在原始文档中的想法。实际上相当令人印象深刻。今天和我一起的是两位深度参与撰写NotebookLM故事的人。来自旧金山的史蒂文·约翰逊,NotebookLM的编辑总监,同时也是《纽约时报》畅销书作者。以及在加州山景城的瑞扎·马丁,谷歌实验室的人工智能高级产品经理,领导着Notebook LM背后的团队。

Several minutes of intellectual analysis packed to the brim with seemingly relevant ideas that are nowhere in the original document. It's actually quite impressive, really. Well, I am joined today by two people who are deeply involved in writing NotebookLM's story. Joining us from San Francisco is Steven Johnson, NotebookLM's editorial director and also a New York Times bestselling author. And in Mountain View, California, Raiza Martin is a senior product manager for AI at Google Labs who leads the team behind NotebookLM.

Speaker 0

欢迎两位来到播客。好的,现在我想从大家都在讨论的功能开始,就是这个音频概述。而且,我了解到你们有一段小片段想播放给我听。

Welcome to the podcast, both of you. Okay. Now I I wanna start with the feature that everybody's been talking about, this audio overview. And, well, I understand that you've got a little clip that you wanna play me.

Speaker 4

是的。让我们来播放这段剪辑吧。汉娜,我想你会喜欢这个的。

Yes. Let's let's play the clip. I think you will enjoy this, Hannah.

Speaker 0

好的。开始了。

Okay. Here we go.

Speaker 2

欢迎大家回来。准备好进行另一次深入探讨了吗?今天,我们要缩小,缩小很多。可以说是微观层面?没错。

Welcome back, everyone. Ready for another deep dive? Today, we're shrinking down, way down. Microscopic, you might say? Exactly.

Speaker 2

想想那些微小的水滴,你知道的,就像你刚洗过的车上看到的那种。

Think about those tiny little droplets of water, you know, like the ones you see on a freshly washed car.

Speaker 1

哦,是的。

Oh, yeah.

Speaker 2

但想象一下那些水滴附着在飞机机翼上。

But imagine those droplets clinging to an airplane wing.

Speaker 1

或者在植物叶子上。

Or on a plant leaf.

Speaker 2

没错。被喷洒杀虫剂。

Right. Being sprayed with pesticides.

Speaker 1

哇。这太迷人了。

Wow. That's fascinating.

Speaker 2

今天我们要深入探讨一些严肃的研究。哦。一篇博士论文探讨了这个隐藏的世界

We're diving into some serious research today. Oh. A PhD thesis that tackles this hidden world

Speaker 0

运用数学。

Using math.

Speaker 2

数学和计算机模型。

Math and computer models.

Speaker 1

当然。

Of course.

Speaker 2

但别担心。我们会跳过那些复杂的方程式,直接进入精彩的部分。

But don't worry. We're gonna skip the heavy equations and just get to the good stuff.

Speaker 1

那是最精彩的部分。

That's the best part.

Speaker 0

那是我的博士论文。我博士论文的第一页。太了不起了。说实话,除了那些复杂的方程式,里面真没什么好东西。好吧。

That was my PhD. The first page of my PhD thesis. Extraordinary. I mean, frankly, there is no good stuff apart from heavy equations in there. Okay.

Speaker 0

这里面有很多值得注意的地方。首先,他们把它描述得比实际情况激动人心得多。

Lots of things to notice about that. For starters, they made it sound much more exciting than it actually is.

Speaker 4

重点就在于此。

That's the point.

Speaker 0

但还有一点,那种你来我往的互动。我的意思是,那两个声音在那里互相接话完成句子。感觉非常流畅,一语双关。非常非常自然。

But also, though, the sort of back and forth. I mean, the two voices there are finishing each other's sentences. It felt very fluid, pun intended. Very, very natural.

Speaker 4

想象一下现在为你的论文答辩。你可以直接播放这段播客,然后差不多就这样了,我觉得,要是你当时就有这个就好了。

Imagine defending your dissertation now. You could just play the podcast and kinda leave it at that, I think, if you'd only had that at your disposal back then.

Speaker 0

Raiza,人们对这个的反应让你感到惊讶吗?因为,我的意思是,它的反响相当热烈,不是吗?

Raiza, have you been surprised by people's reaction to this? Because, I mean, it's had really quite serious uptake, hasn't it?

Speaker 3

是的。我认为最让我惊讶且同样令人欣喜的是人们如何使用它。我曾设想过他们可能会如何使用,但推出一个如此备受关注的产品最美妙之处在于,你会看到一个全新的世界,看到大家都在尝试各种有趣、娱乐、鼓舞人心或真正有意义的事情。这简直不可思议。实际上,我每天可能花相当多的时间,大约三分之一,就是在听

Yes. And I think the most surprising thing to me and really equally delightful is how people are using it. I think I imagined how they might, but I think the beautiful thing about launching something with this much sort of excitement around it is you see a whole new universe of what everybody has been trying from things that are funny, things that are entertaining, things that are inspiring or are really meaningful. It's just been incredible. I actually probably spend a good chunk of my day, a third of my day just listening

Speaker 0

这些内容。你们还专门建立了一个Discord服务器,不是吗,就是为了让人们分享他们使用它的故事?都出现了哪些类型的内容?

to these. You actually set up a Discord server, didn't you, just to let people share stories about the ways that they're using it? What kind of things have come up?

Speaker 4

所以,我的意思是,播放你的论文答辩就是一个有趣的例子,因为我认为真正让我们惊讶的事情之一是人们会把他们的简历放进去。这几乎像是一个小小的激励机器。比如,如果你对自己感到沮丧,你会听一段十分钟的音频对话,由两位非常热情的主持人进行。你会想,哇,史蒂文在他的职业生涯中确实做了很多事。

So, I mean, that was an interesting example playing your dissertation because one of the things that I think genuinely surprised us is people would put their CVs and their resumes in there. And it was almost like a little, like, hype machine. Like, if you were feeling down about yourself, you would you would listen to, like, a ten minute audio conversation between two very enthusiastic hosts. You're like, wow. Steven has really done a lot in his career.

Speaker 4

这非常令人印象深刻。但更严肃一点的版本是,我的意思是,那很有趣也很好玩,但人们在使用它时,就像是在对你正在做的事情进行研讨。所以你可以上传你正在写的短篇小说,然后说,嘿,给我一些建设性的批评意见。然后你听人们谈论你的作品,他们非常擅长挑出那些有趣的转折点,或者关注那些特别引人入胜或不那么吸引人的角色。所以这是一种获得反馈的方式,几乎就像是一个小型的焦点小组,对你正在做的事情进行讨论,这真的很棒。

It's very impressive. But actually, a more serious version of that, I mean, that's kind of fun and and playful, but people are using it like you can kind of workshop things you're you're working on. And so you can upload a short story you're working on and say, Hey, you know, give me some constructive criticism on this. And you get, you know, you listen to people talking about your work, and they're very good at pulling out the kind of interesting twists or focusing on the characters that are particularly compelling or not. And so it's a way of getting a little kind of like it's almost like a little focus group for stuff that you're working on, which is which is really amazing.

Speaker 0

我想,听到人们真正大声谈论它,也增加了那种额外的,我不知道,几乎是客观性的层面。

I guess also hearing people actually talk about it out loud adds that kind of extra layer of, I don't know, objectivity almost.

Speaker 3

这真的很令人惊讶,因为如果我们想一想,很多内容或内容生成,如果你只是以文本形式呈现,并不新鲜。对吧?就像如果我上传我的简历,然后让一个大语言模型输出一些东西,比如,哦,这是 Raiza 的职业生涯。对吧?某种总结。

It's been really surprising because if we think about it, a lot of the content or content generation, if you just render it in text, is not new. Right? It's like if I upload my CV and then I have an LLM spit out something that says like, oh, here's Raiza's career. Right? A summary of sorts.

Speaker 3

也许它会在这里或那里挑出一些有趣的小细节。这在两年前是新颖的。每个人都对此感到兴奋。但我认为增加那一层新的东西,或者说那种新的模态,即非常像人类的声音,我认为它以一种非常不同的方式与人产生联系。对吧?

Maybe there's a few interesting tidbits that it pulls out here and there. That was novel two years ago. Everybody was excited by that. But I think adding that new layer or that new modality of just very human like voices, I think it connects with people in a very different way. Right?

Speaker 3

我个人认为,我把这类技术称为类人技术,就是你某种程度上能识别出它与自己非常相似,因此它能以不同的方式与你产生共鸣。我第一次听我的简历时,虽然知道会听到什么,但当我真正听到时,内心还是会有那种‘哇’的激动感。我觉得这就是新模态的魅力所在。

I think like personally, like I call this type of technology human like, where you sort of recognize it as being very similar to you, and it resonates with you in a different way as a result. And I think the first time I listened to my CV, I knew what to expect. But when I heard it, I still felt that, like, that bubble inside of me, like that, whoo. And I think that's the magic of of new modalities.

Speaker 4

我认为另一点在于,人类通过对话学习和交换信息已有数十万年的历史。而通过页面上的结构化文本学习只有大约五百年,通过屏幕上的结构化文本学习更是只有三十年左右。所以当你激活那种真实类人对话的感觉时,它触动了我们内心深处古老而原始的组成部分,我想这就是为什么人们第一次听到时会如此兴奋的原因之一。

I think that, you know, the other point on this is that, like, human beings have been learning and exchanging information through conversation for hundreds of thousands of years. We've been learning by reading structured text on a page for, you know, five hundred years and structured text on a screen for, you know, thirty years. And so when you activate that sense of, like, a genuine human like conversation, it's just a deep, ancient kind of ancestral part of who we are that I think that's one of the reasons why it just lights up people when they they hear it for the first time.

Speaker 0

还有一个有趣的点是,你们决定设置两位主持人,而不是让一个人独自对着空气说话,这我觉得正好印证了你刚才说的观点,Steven。

Also interesting, I think, that you you decided to have two hosts rather than just one person sort of talking into space, as it were, which I guess it speaks to the point that you're making, Steven.

Speaker 4

是的。这是完全不同的形式。如果只有一个人,感觉就像文本转语音。对吧?我们以前都听过文本转语音。

Yeah. It's just a very different format. If you just have one person, it feels like text to speech. Right? We've heard text to speech before.

Speaker 4

就像是计算机把它刚写的文本转换成你可以听的东西,这很棒。而且我们也在探索如何以其他形式实现这一点。但要让对话显得自然,我们需要深入细节,比如所有这些细微之处都必须处理得当。没人想听两个机器人对话。那样的话,三十秒后就会让人听不下去,彻底失败。

You're just like, the computer is turning the text that it just wrote into something I can listen to, which is great. And, you know, we're interested in trying to figure out ways we can do that in other formats. But to get the conversation right, and we can dive into this in more detail, there are all these, like, subtle things that you have to, like, make work. Nobody wants to listen to two robots talk to each other. Like, that will fail and be unlistenable, like, after thirty seconds.

Speaker 4

你必须掌握人们在对话中所有那些非常微妙、奇怪的小细节,才能让它成功。

You have to master all these very subtle, weird things that people do in conversation for it to work.

Speaker 0

就像你说的,Raiza,要让它真正类人。我稍后想再回到那些特性,谈谈音频概述,因为我还想讨论NotebookLM的起源。它是怎么来的,Raiza?

To make it human like exactly as you said, Raiza. I wanna come back to those features a little bit later, to the audio overview, because I also wanted to discuss the origins of this, of NotebookLM. How did it come about, Raiza?

Speaker 3

首先,我认为很多人觉得NotebookLM很新是因为它的音频概述功能。我们迎来了大量用户涌入,人们都在惊叹,这是什么?谷歌推出的全新产品。但实际上,我们研发NotebookLM已经超过一年了。我们最初在去年的Google I/O大会上以Project Tailwind的名义宣布了它。

For one, I think a lot of people think that NotebookLM is new because of the audio overview feature. We had such a massive influx of people and people are like, wow, what is this? A brand new thing from Google. But actually, we've been working on Notebook LM for over a year. We first announced it at Google IO last year as Project Tailwind.

Speaker 3

在那之前,我们其实一直在Google Labs内部孵化它。实际上这就是我和史蒂文相识的契机。史蒂文最初是以什么职位加入的,史蒂文?

And before then, we actually had been incubating it inside of Google Labs. It's actually how Steven and I met. Steven was brought in what was your original title, Steven?

Speaker 4

我是访问学者。是的,没错。

I was visiting scholar. Yeah. Yes.

Speaker 3

我当时

I had Then

Speaker 4

然后我成为了编辑总监。

I then I became editorial director.

Speaker 3

没错,他晋升了。当时,现在负责Google Labs的副总裁乔什·伍德沃德告诉我,他说希望我打造一个新的AI业务。我心想,这到底需要怎么做?但我想说的是,我早期的灵感之一就是观察史蒂文的工作方式。

That's right. He was promoted. And at the time, Josh Woodward, who now leads Google Labs, he's the vice president, told me, he was like, I want you to build a new AI business. And I and I thought to myself, what does it take to actually do that? But what I'll say is one of my early inspirations was just watching Steven work.

Speaker 3

说实话,就是理解他如何完成他的工作,我当时就觉得,哇,如果能把这种能力赋予人们,那将会是一种真正的超能力。

Honestly, just, like, understanding how he does, what he does, I was like, wow. That could be a real superpower if you could give that to people.

Speaker 4

当时的情况是,Steven的研究习惯有些与众不同,但我们或许能将其转化为某种主流追求。我有着悠久的著书历史,Josh读过我的一些书,也了解我正在撰写的关于思维工具的内容,基本上就是如何使用软件来辅助思考、发展观点和进行研究。那时是2022年,语言模型正处在风口浪尖。于是他联系我说,嘿,有没有兴趣来谷歌帮忙打造一款你一直想要的工具,基于语言模型来帮助人们学习和组织想法?而Ryza和我几乎从一开始就——我想我是在加入谷歌的第二天就认识了Ryza。

It was a mix of Steven is abnormal in his research habits, but maybe we could turn this into a mainstream pursuit somehow. I had had this long history writing books, and Josh had read some of those books and had read some things I was writing about tools for thought, basically, like how do you use software to help you think and help you develop your ideas and research. This is 2022, so language models were at the top of the list then. And so he kind of reached out to me and said, hey, any chance you would wanna come to Google and help build the tool that you have always wanted to help people learn and organize their ideas, now built on top of language models? And what Raiza and I kind of like, right from the beginning, like, I think I met Raiza at, like, day two at Google.

Speaker 4

我们就想,让我们打造点新东西吧。

We were like, let's build something new.

Speaker 0

这发生在大语言模型成为焦点的时候。在早期的讨论中,你们是如何看待这与仅仅上传文档到Gemini让它为你总结有根本区别的?

This came about at a time when large language models were at the top of the agenda. In those early conversations, how did you see this as being fundamentally different to just, I don't know, like uploading a document on Gemini and getting it to summarize it for you?

Speaker 4

从一开始,我们就称之为源 grounding(源基础),这是我们描述它的方式。就像你提供想要处理的源信息,可能是你正在写的故事、研究的书籍、你的日记,或是正在处理的营销文档。将这些上传到模型后,就会创建一个个性化的AI,它精通于你关心的信息。在2022年年中,还没人讨论这个,所以我们最先构建的就是这个——我们上传了我的一本书的部分内容,然后可以和模型进行非常粗糙的对话,完全不像现在你看到的文本或音频交互。但你能初步体验到,与自己正在处理的所有想法对话,而不是仅仅与一个只有通用知识的开放模型交流,实际上拥有了那种个性化知识。

From the very beginning, we call it source grounding, that's the way we would describe it. Like, you supply the source information that you wanna work with. It might be the story you're writing, it might be the book you're researching, it might be your journals, it might be the marketing documents you're working with. And uploading that to the model then creates a kind of personalized AI that is an expert in the information that you care about. And no one was talking about that in the middle of 2022, so the first thing we built was, like I mean, we uploaded part of, like, one of my books, and I could, like, have this very crude conversation with the model that was not at all, like, what you see now in text or with audio. But you could get a little taste of, like, what it would be like to have all the ideas you were working with, instead of just talking to an open ended model that just had its general knowledge, actually had that personalized knowledge.

Speaker 4

这很棒,因为它还减少了幻觉,使其更基于事实,你可以进行事实核查。你可以回溯查看原始源材料。这是整个NotebookLM体验的重要部分。那就是起点,我们所做的一切都建立在这个平台上,而音频概览只是在此基础上,将我提供的源转化为另一种形式。

And it was great because it also, like, reduced hallucinations, it made it more factual, you could fact check it. You could go back and see the original source material. That's a big part of the whole NotebookLM experience. That was the beginning of it, and everything we've done is built on that platform, and audio overviews is just, okay, take that insight of I supply my sources, and now I turn it into something else.

Speaker 4

在这种情况下,它变成了音频对话。

In this case, it's an audio conversation.

Speaker 0

因为我猜这里的关键区别在于,它非常专注于你提供的源及与之相关的任何内容,而不是像你说的那样只是一个通用模型。

Because I guess the real key difference here is that it's very focused on the sources that you're giving it and anything that's connected to that rather than just, as you say, this general model.

Speaker 3

是的。我也想说,我认为这种范式起步会稍微困难一些,因为它太新了。对吧?首先,你是在和AI对话;其次,你需要自带材料。

Yeah. I'll say too that what we've seen is I think it's a little bit harder to get started with this paradigm because it's it's so new. Right? The idea that one, you're talking to an AI. Two, you have to bring your own stuff.

Speaker 3

所以我觉得这里有一层障碍,就是你必须说服别人这么做是值得的。但一旦让他们跨过这个坎,它就变得极其有用。因为,你知道,我想到我每天做的工作,Steven每天做的工作,以及世界上许多每天使用电脑的人,我们都在处理非常具体的信息集,与他人共享的上下文。对吧?比如,我们做研究。

So I think there's a little bit of a layer where it's like, you have to convince somebody that it's worth doing. But once you can get somebody over that hump, it's just massively useful. Because, you know, I think about the work that I do every day, the work Steven does every day, and many people around the world that work on computers every day, we are working with very specific sets of information, shared context that we have with others. Right? Like, we do research.

Speaker 3

我们收集信息,并想从中提炼出自己的见解。我认为这正是NotebookLM真正特别之处,也是它从一开始就与众不同的原因。

We pull it in. We wanna sort of extract our own insights from it. I think that's what makes NotebookLM really special and has made it special from the beginning.

Speaker 0

那么它确实也包含这些文本元素,因为正如你所说,播客部分是最引人注目的亮点。

So it does include these text elements too then because as as you say, the podcast part is is the bit that's sort of most notable.

Speaker 3

没错。播客功能是Notebook LM最新的发展,但我们实际上一年前就推出了,当时主要是聊天功能。所以你是在用自己的资料与系统聊天,并且它总是会引用回它使用的具体内容片段。

That's right. So the podcast thing is the most recent development in Notebook LM, but we actually launched a year ago where it was primarily a chat feature. So you're chatting with the system using your sources, and it's always referencing back to exactly what pieces of your content that it used.

Speaker 0

那么Steven,给我举一些更日常的例子,说明人们是如何在日常生活中使用它的。

So give me some more mundane examples of how people are using this, like, a day to day level then, Steven.

Speaker 4

是的。我的意思是,我们实际上看到产品的文本功能使用量非常大。对吧?突然间,你就有了这个惊人的资源,可以回答关于数百页文档的任何问题,而且在文本版本中你还能获得引用等等。这其实非常学术化,Hannah,你会喜欢的。

Yeah. So, I mean, we actually see a huge amount of usage of the product just with the text features. Right? And suddenly, you just you have this amazing resource that can answer any question about all, you know, hundreds of pages of documents, and in the text version you get citations and everything. It's a very scholarly thing actually, Hannah, you would appreciate it.

Speaker 4

比如,你得到答案后,模型说的每个事实都会有一个内联脚注,你可以直接点击那个脚注去阅读原文段落。作家、记者显然都在使用它。这某种程度上源于我对这个项目的参与。我有一个笔记本,里面记录了我多年来阅读书籍的数千条引文,还有我写的很多书的内容。这个笔记本基本上像是用AI捕捉了我的大脑。

Like, you get your answers back, and every fact that the model says has a little in line footnote, and you can click directly on that footnote and go and read the original passage. Writers, journalists, obviously, are using it. This comes a little bit out of my, like, my involvement for the project. I have, like, one notebook that has thousands and thousands of quotes from books that I've read over the years, plus a lot of the text of books that I've written. That notebook basically has, like, my brain kind of captured in AI.

Speaker 4

所以每当我处理任何新想法时,我就会进入那个笔记本,问它,嘿,你觉得这个想法怎么样?然后,AI会说,嘿,Steven,你七年前读过相关的内容。看看这段怎么样?所以它真正扩展了我的记忆。就是这类事情。

And so whenever I work on anything that's kind of a new idea for something, I'll go into that notebook and be like, hey, what do you think about this idea? And, you know, the AI will say, hey, Steven, you read something related to that, like, seven years ago. What about this passage? And so it's a true, like, extension of my memory. So that kind of stuff.

Speaker 4

另外,最后一点要说的是,我们不会用这些信息训练模型。所以你的信息是安全的、私密的,不会进入模型的通用知识库被别人使用。你可以放入私人信息。当你把几年的日记输入这样的大型上下文模型时,你可以获得惊人的洞察,还可以把它们转换成音频概述,听两个人谈论你自己;或者直接问:去年五月我在想什么?给我一个当时所有事情的概述。二十秒后,你就会拥有一份关于自己生活的精彩记录。

The other thing, the last thing I'll say is, we're not training the model on this information. So your information is secure, it's private, it's not going to get into the general knowledge of the model and be used by somebody else. So you can put private information in there. And when you put, you know, a couple of years of your journal in a large context model like this, you can get these amazing insights, and you can turn them into audio overviews and kind of listen to two people talk about yourself. Or you can just be like, what was I thinking about, like, last May? You know, give me an overview of, like, all the stuff that was going on then, and twenty seconds later, you'll have this amazing kind of document of your own life.

Speaker 0

不过,它不仅仅是记录东西,它真的能对你的日记产生深刻的见解吗?

Rather than just recording stuff, though, can it actually be insightful in terms of your own journals?

Speaker 3

我会说是的,因为我就是为这个目的使用它。上传每周日记后,我喜欢问它,你知道,随着时间的推移我改变了多少?这真的很了不起。它能为我提取出我自己无法观察到的有趣细微差别。它能说出这样的话,嘿,你倾向于将很多负面情绪与这个特定话题联系起来。

I would say yes because I've used it for that purpose. And, one of the things that I like to ask it after uploading I do these weekly journals, is I say, you know, how much have I changed over time? And it's really remarkable. It's been able to pull out for me really interesting nuances that I haven't been able to observe about myself. It's been able to say things like, Hey, you you tend to associate a lot of negativity with this particular topic.

Speaker 3

你对这个话题有很多积极的联想。这真的很有趣,因为我认为你之前关于平凡和日常用例的问题,我们看到更多这样的情况,就像人们试图处理他们每天做的工作。例如,销售团队经常用它来互相分享知识。这很有道理。有很多技术性、复杂且不断变化的文档。

You associate a lot of positivity with this topic. And it's just really interesting because I think your earlier question around the mundane and mundane use cases, I think we see a lot more of those, which is like just people trying to take the work that they're doing every day. For example, like sales teams use this a lot to share knowledge with each other. Makes a lot of sense. There's a lot of technical, complex, changing documentation.

Speaker 3

有一个AI伙伴真的很好。我认为这与当今许多AI系统的工作方式非常不同。对吧?比如,你知道,我什么都用。我用所有现有的东西,我写的提示词非常庞大。

It was really nice to have an AI partner. I think that's really different from how a lot of AI systems work today. Right? Like, you know, I use everything. I use everything that's out there and the prompts that I write are massive.

Speaker 3

对吧?就像我首先写的是,你是一个某某。这就是我们要做的事情。这里是相关的文档。我认为NotebookLM在某种程度上就是把这个过程简化了。

Right? Like the first thing that I write is, You are a blah. This is what we are doing. Here are the documents that are relevant. And I think for NotebookLM, it sort of just shortcuts it.

Speaker 3

它只是一个项目空间。它知道你在谈论什么。你可以永远进行对话。它能处理多达2500万个单词。它的上下文容量相当庞大。

It's just a project space. It knows what you're talking about. You can have a conversation forever. It takes up to 25,000,000 words. It's just sort of contextually quite massive.

Speaker 4

我认为其中一个有趣且可能有点独特的地方是,关于这个产品为何有效或无效的许多问题,与其说是技术问题,不如说是编辑风格问题,比如什么样的回答才是合适的?当你得到一个有效的音频概述时,这些对话应该采用什么样的内部风格?它们应该定位在什么层次?这些不是技术问题,而是语言问题。这就是语言模型时代的疯狂现实——所有这些曾经主要是‘让我们把编程做对’的问题,现在更多地变成了修辞学的问题。

I think one of the things that was interesting and maybe a little bit distinctive about it was so many of the questions about, like, what makes this product work or not work are not so much technological questions as they are editorial stylistic questions, like what is the right kind of answer? When you get an audio overview that works, what's the house style for those conversations? What level should they be pitched at? And those are not technological questions, those are language questions. And that's the crazy reality of of the language model age is that all these things that, you know, used to be just mostly a question of let's get the programming right now become more about the rhetoric of it all.

Speaker 0

嗯,实际上我想更深入地探讨一些内部风格的问题。你们为什么决定做音频概述?是什么启发了这个想法?我的意思是,说实话,播客已经相当多了。

Well, I do actually, I wanna dig into some of the house style a little bit more, I guess. Why did you decide to go into audio overview? What was it that inspired that? I mean, there are already quite a lot of podcasts. Let's be honest.

Speaker 4

音频概述真正开始是一个实验室结构运作良好的绝佳例子。它是实验室内部另一个小团队,专注于这个的音频版本。部分想法是,它不是为了与播客竞争,而是存在大量内容,为其制作播客的经济效益根本说不通。但如果你能自动生成一个,可能只有五个人想听,或者一个人,或二十个人,但不是二十万人。比如,我们想基于上周的团队会议创建一个播客以便回顾。

Audio overviews really began it was it was a great example of the labs structure, I think, really working well. It was it was another small team inside of labs that were just kind of focused on the audio version of this. And part of the idea of it was that it was not so much to, like, compete with podcasts, but rather that there was a whole universe of content that you would never the economics of generating a podcast for it would never make any sense. But if you could generate one automatically, you might have, you know, five people that would wanna listen to it or, like, one person who would wanna listen to it or 20 people, but not, you know, 200,000. So that's like, you know, we wanna create a podcast based on our, like, team meetings from the last week so we can review them.

Speaker 4

比如,那不会是一个商业业务。没有人会请你主持那个,汉娜,但实际上可能对那个团队有用。所以他们开始开发这个东西,瑞扎和我大概在三月或四月听到了它。就像每个初次听到音频概述的人一样,我们都惊呆了。我刚才听到了什么?

Like, that's not gonna be a commercial business. Like, no one's gonna ask you to host that, Hannah, but it actually might be useful for that team. And so they had started developing this thing, and Raiza and I heard it, I don't know, probably in, like, March or April. And like everyone who's heard an audio overview, initially we were just like, wow. What did I just hear?

Speaker 4

太棒了。但我们很早就意识到,NotebookLM的部分使命是构建一个帮助人们理解事物的工具。突然之间,我们想,哦,等等。当人们以两个聪明人之间引人入胜的对话形式听到某些东西时,他们真的能理解、记住并投入注意力。我们在夏天向谷歌员工内部发布了它,我想那时我们开始觉得这会大受欢迎,因为你能看到人们对它的喜爱。

That was amazing. But we realized pretty early on that part of our mission with NotebookLM was to build a tool that helps people understand things. And suddenly, we were like, oh, wait. People really understand and remember and, you know, pay attention when they hear something in the form of an engaging conversation between two smart people. We released it internally to Googlers over the summer, and that was, I think, when we started to think this is gonna be a hit, because you could just see the delight that people had with it.

Speaker 4

所以,虽然我们对它如此疯狂的程度感到惊讶,但我们知道我们确实发现了一些重要的东西。

So while we were surprised that it took off quite as crazily as it did, we knew we were onto something.

Speaker 0

我记得上一季我们听到了WaveNet的演示,它当然是最早生成这种类人语音的AI模型之一。我的意思是,当时它已经相当令人印象深刻了,但我想,自那以后必然发生了技术进步,才使得像音频概述这样的东西成为可能。

Now I remember in the last season, we got to hear a demo of WaveNet, which, of course, is one of the first AI models to generate this human like speech. And, I mean, it was quite impressive back then, but, I mean, presumably, there have been technological advancements that have happened since that have been necessary to make something like audio overview possible.

Speaker 3

我认为,Notebook LM的基础模型是Gemini 1.5 Pro,它确实创造了在我看来非常不可思议的内容。我们使用的语音模型、音频模型本身就是一个突破。我想你说的就是这个,也就是我们听到的类人声音的真实感,对吧。再结合我们所采取的方法——Steven也可以多谈谈这一点——对内容进行编辑性处理,思考如何为你创造出既真正有用又真正有趣且引人入胜的东西。

I think there's you know, the the underlying model for Notebook LM is Gemini 1.5 Pro, and that just creates, really, to me, incredible content. The voice models, the audio model that we use, that by itself is a breakthrough. And I think that's what you're talking about, which is the realism, right, of the human like voices that we hear. And pair that with the approach that we've taken, and Steven can speak more to this too, of editorializing the content, thinking about how do we create something really useful and really fun for you that's engaging.

Speaker 4

是的。我…这实际上是一个很好的过渡,正好引向我想说的一点,关于趣味性。

Yeah. I I that's that's a great segue actually to something I was gonna say, which is about interestingness.

Speaker 3

嗯。

Mhmm.

Speaker 4

好的。西蒙是音频方面的负责人之一,他有时会为音频概述提出一个口号,比如,‘让任何东西都变得有趣’。所以就像是,不管是什么,比如,让你的论文变得有趣。我敢肯定它本来就有趣,

Okay. So Simon, who's one of the leads on the audio side, he sometimes has a slogan for audio overviews, which is, like, make anything interesting. So you're like, whatever whatever, like, make your dissertation interesting. I'm I'm sure it was interesting,

Speaker 0

实际上。它并不有趣。但是

actually. It wasn't. But

Speaker 4

所以这是一个很好的例子,就像是几种不同技术或突破的融合,让神奇的事情发生。Gemini本身(它在文本方面也能做到这一点)非常擅长从你提供的材料中提取有趣的事实、想法或故事。我经常这样做,上传一些新内容然后说:告诉我这里面最有趣的东西,仅限文本。以前的计算机根本做不到这一点。

So it's a great example of, like, a convergence of a couple of three different technologies or kind of breakthroughs that make something magical happen. Gemini itself, and it can do this with text as well, is incredibly good at pulling out interesting facts or ideas or stories from the material you give it. So I I do this all the time. I upload something new and say, tell me the most interesting things from this just in text. Computers could never do that before.

Speaker 4

就像你无法用Ctrl+F搜索‘有趣度’。这不是你能执行的搜索查询。

Like, you couldn't command f for interestingness. Like, this was not a search query you could do.

Speaker 0

那你到底是怎么定义它的?我的意思是,这具体意味着什么?

Well, how are you defining it even? I mean, what does it mean?

Speaker 4

我认为它源于语言模型的基本理念,即它们是预测性的。对吧?给定这段文本,它们预测接下来会发生什么。所以有趣度本质上是一种受控的惊喜。

I believe that it comes out of the basic idea behind language models, which is that they're predictive. Right? They're like, given this string of text, here's the next thing I expect to happen. Right? And so what interestingness is is kind of controlled surprise.

Speaker 4

我以为会是这样的情况,但实际上这里有一些我没想到的新信息。所以语言模型擅长这一点是有道理的,因为它们的基本机制就是预测。它们根据训练数据寻找所有信息,然后找出其中新颖或令人惊讶、违背它们预期的部分。

I thought this was gonna be the case, but actually there's some new information here that I wasn't expecting. And so it makes sense in a way that the language models would be good at this, because their basic circuitry is, like, prediction. And so they're looking through all this information given their training data: what in this information is novel or surprising, what defies their expectations.

Speaker 4

所以它非常擅长这一点。这是Gemini的一个底层特性。对吧?而节目主持人的任务就是找到有趣的材料,并以吸引人的方式呈现给用户。

So it's very good at that. So that's an underlying Gemini thing. Right? And the hosts of the show are instructed to, like, find the interesting material and present it to the user in an engaging way. Right?

Speaker 4

这是第一个能力。第二个很酷的地方是,指令会为生成的脚本添加噪音,也就是所谓的‘不流畅成分’。比如所有人类说话时实际会有的结巴、‘嗯’、‘啊’之类的插入语。事实证明你需要这些,因为如果没有这些噪音,听起来就太机器人化了。

So that's one capability. The second thing that is really cool about this is that the instructions take the script that is generated and they add noise to the script. So they add what are called disfluencies. So all the stammers and the likes and the interjections that humans actually have when they speak. And it turns out you need that because if you don't have that noise, it sounds too robotic.
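As a toy illustration of that second step, here is what injecting disfluencies into a clean script line might look like. The filler list, the insertion rate, and the function name are all invented for this sketch; the conversation only says that noise is added to the script, not how:

```python
import random

# Hypothetical sketch of the "add noise to the script" step: sprinkle
# human-sounding fillers into a clean generated line so it reads less robotic.
FILLERS = ["um,", "uh,", "like,", "y'know,"]

def add_disfluencies(line: str, rate: float = 0.15, seed: int = 0) -> str:
    """Insert a filler before each word with probability `rate`."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    out = []
    for word in line.split():
        if rng.random() < rate:
            out.append(rng.choice(FILLERS))
        out.append(word)
    return " ".join(out)

clean = "The paper argues that attention changed machine translation."
print(add_disfluencies(clean))
```

With `rate=0` the line passes through untouched and sounds robotic; pushing the rate toward 1 makes every word stumble, so the real system presumably has to balance somewhere in between.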

Speaker 3

嗯。

Mhmm.

Speaker 4

最后,还有音频声音本身。它们所做的是所有这些微妙的事情。比如在英语中,你知道,说话者如果对自己所说的内容不太确定,会稍微提高声音;或者为了强调,他们会放慢语速。所有这些我们本能去做、甚至从未思考过的事情,在此之前没有任何计算机能做到。而正是这一部分让整个体验鲜活起来,这背后是那个一年前还不存在的底层语音模型、音频模型。

And then finally, there's the audio voices themselves. And what they do is all these subtle things. Like in English, you know, speakers will raise their voice a little bit if they're not sure about what they're saying, or for emphasis, they will slow down what they're saying. All these things that we do natively, we never even think about it, but no computer could do that until now. And that's the part of it that just, like, lights up, and that's the underlying vocal model, audio model that didn't exist a year ago.

Speaker 0

是声音的调制。对吧?就像是,是的。我记得多年前在BBC,他们教我们如何让内容听起来有吸引力,他们会给你一份《小熊维尼》的文本来读。对吧?

It's the voice modulation. Right? It's like Yeah. I I remember years ago at the BBC being taught to make content sound engaging, and they give you a copy of Winnie the Pooh to read. Right?

Speaker 0

然后他们说,好的。像新闻播音员那样读,你就会读得非常非常平淡。然后他们说,像对孩子那样读,你就会注意到,正如你所说,史蒂文,你的声音在某些点上扬,在其他点下沉。你所拥有的音域和速度完全改变了。但你已经将所有这些方面都融入了这个系统。

And then they say, okay. Read it as you would a news reader, and you sort of read it very, very flat. And then they say, read it as you would to a child, and you notice exactly as you say, Steven, that your voice goes up at certain points, and it goes down low at other points. The range that you have and the speeds completely changes. But you have you you built all of those aspects into this.

Speaker 0

我的意思是,你们到底是怎么做到的?

I mean, how on earth do you do that?

Speaker 4

是的。我们应该澄清一下,Raiza和我要明确说明:我们并没有构建这个声音模型。

Yeah. We should, Raiza and I should make it very clear: we did not build the vocal model.

Speaker 0

你们是借用的。

You borrowed it.

Speaker 4

我们完全不知道它是怎么构建的。谷歌内部的天才们创造了它,我们继承了这项技术,并一直在运用它、展示它的用处,但我们并非最初的构建者。人们常问的一个问题是,目前它只支持英语,大家都非常期待它能支持更多语言,我们也同样迫切,因为我们拥有出色的国际受众。但这并非易事,因为每种语言的语调及各种对话小习惯都各不相同。所以你不能简单地换成西班牙语词汇就按下播放键。

We have no idea how we built it at all. Geniuses inside of Google built that, and we inherited that technology, and we have been running with it and showing how it could be useful, but we did not build it. One of the questions that people have is, you know, it's English only right now, and people are very eager for it to come into different languages, and we are very eager for that too because we have a wonderful international audience. But it's not something you can do easily because the intonations and all those little conversational tics are different in every language. And so you can't just be like, change the words into Spanish and, you know, press play.

Speaker 3

我正想补充一点,DeepMind最近发布了一篇关于音频模型的博客文章,详细介绍了它的构建过程、构建者以及背后的研究论文。我认为,如果可以的话,我们应该分享出来。

I was just going to add that DeepMind actually has a recent blog post about the audio model and how it was built and who built it and all the research papers underneath it. I think, you know, if we could share that, we should.

Speaker 0

是的,没错。绝对是这样。我觉得使用这个工具时一个很明显的特点是,它对你提供的不同类型数据都非常灵活适应。所以,Steven,按你的描述,你似乎是在编码所有的口语不流畅特征。

Yeah. No. Absolutely. I think one thing that's really noticeable playing around with this is how it is very versatile across different types of data that you give it. And so, I mean, the way that you're describing this, Steven, is that you're sort of coding in all of the disfluencies.

Speaker 0

但你怎么防止这东西每次听起来都像一堆陈词滥调呢?Raiza?

But how do you stop this thing from just sounding like a bunch of cliches every single time? Raiza?

Speaker 3

其实我觉得很难让它每次都避免听起来像陈词滥调。因为试图标准化‘有趣性’,这实际上相当困难。所以‘有趣’听多了往往感觉都一样。这就是为什么我们在此次发布中引入了第一个改进:我们让用户——我称之为给主持人递纸条——你可以悄悄给他们一点指示,比如‘少来点老套的,在这个话题上深入一点’,这会改变他们谈论你提供的任何内容的方式。

I actually think it's hard to get it not to sound like a bunch of cliches every time. I think because of trying to standardize interestingness, that's really actually quite difficult. And so interestingness tends to sound the same after hearing it enough times. And that's why we actually introduced the first improvement to this particular launch, which is we're letting users I call it pass a note to the hosts where you could slip them a little instruction on, hey, you know what, maybe less of the cliche, go deeper on this topic, and it will change the way that they talk about whatever content you've given them.

Speaker 0

我是不是该把这想象成好像有各种不同的调节旋钮?比如,也许调高‘古怪’旋钮,或者调高‘历史事实’旋钮,或者…我该怎么理解这个呢?

Should I sort of imagine this as almost that you have like different kind of dials? Like, maybe you turn up the quirky dial, and maybe you sort of turn up the the kind of historical fact dial, or or how how can I how can I think of this?

Speaker 4

嗯,想象一下我很感兴趣的一点:如果能为每位主持人设定不同的专业领域会怎样?目前他们基本上是可互换的,没有明确的世界观。只是其中一人主导对话,我们随机来回切换。

Well, imagine one one thing that I'm very interested in. What if you could give each of the hosts a different kind of field of expertise? Right now, they basically are kind of interchangeable. They don't have defined perspectives on the world. They're just kind of one takes the lead in the conversation, and we switch back and forth randomly.

Speaker 4

但如果你这样想,好吧,我是一名城市规划师,正在设计这个新城镇广场,我希望其中一个是环保活动家,另一个是经济学家。然后让他们进行对话和辩论。突然间他们就有了不同的视角,因为你知道,这是我多年来在书中多次写到的一点:当人们拥有多样化的专业知识来帮助他们做出选择或提出想法时,他们会更有创造力,做出更好的决策。我的意思是,这也是我们2025年路线图的一部分。

But what if you were like, okay, I'm a city planner, and I'm working on this, you know, design for this new town square, and I want one of them to be an environmental activist, and I want one of them to be an economist. And now let's have a conversation, and let's have a debate. And suddenly they have different perspectives. Because, you know, this is something I've written about a lot in my books over the years: people are more creative and make better decisions when they have a diverse pool of expertise helping them make the choices or come up with the ideas they're trying to do. I mean, that's also on our road map for 2025.

Speaker 0

将来我真的能与这些主持人互动吗?比如,我不知道,打断他们并加入他们的对话?

Will I actually be able to interact with these hosts in in the future? Like, I don't know, interrupt them and and join their conversation?

Speaker 4

实际上,我们在IO大会——也就是谷歌开发者大会上展示了这个版本,这是我们首次推出这个功能并宣布的地方,他们采用了音频播客的形式。然后实验室负责人Josh Woodward在演示中打断说,嘿。他们正在讨论物理学,然后他说,嘿,你能用篮球的比喻来解释吗?

Well, we actually showed a version of this at I/O, the big Google developers conference where we first rolled out this feature and announced it, and they do their kind of audio podcast format. And then Josh Woodward, the head of Labs, interrupts in the demo. They're talking about physics, and he says, hey, can you use a kind of basketball metaphor here?

Speaker 4

因为我儿子在听。他们就说,哦,太好了。好吧。基本上就像有人打电话进节目,然后他们说,让我们用篮球的比喻来做吧。所以这已经公开了,是我们想做的事情的一部分,你可以想象我们非常渴望将这一点带给用户。

Because my son is listening. And they're like, oh, great. Okay. Someone called into the show, basically, and they're like, you know, let's do it in a basketball metaphor. So that has been, publicly, part of what we wanted to do, and you can imagine we're very eager to bring that to people.

Speaker 0

我的意思是,你描绘了一个非常吸引人的画面。不过我也想知道,这里是否存在危险,可能会抓住文本语料库中的一个次要细节,然后把它放大到比实际更重要的事情?我的意思是,我们现在仍然处于大型语言模型可能会产生幻觉、不一定能正确强调报告不同部分的情况。

I mean, you paint a really, like, a a really compelling picture. I do also wonder, though, is there the danger here that you could have, you know, picking up on a minor detail in in the corpus of text and then sort of make it into a much bigger thing than it necessarily is? I mean, we're still at the situation where large language models can kind of hallucinate, can not necessarily put the right emphasis on different parts of of what it's reporting.

Speaker 4

在早期,也就是三周前,我们测试这个定制功能,就是Raiza提到的那个"制作人递纸条"功能时,我上传了一篇我几年前写的文章,并指示它们像吐槽大会上的毒舌喜剧演员那样,对这篇文章进行毫不留情的批评。因为你知道,它们通常被指示要热情洋溢。所以我上传了这篇文章,结果很酷,它们立刻就说:Johnson到底有什么毛病?

In the early days, early days of, like, three weeks ago, when we were testing this customization, pass-a-note-from-the-producers feature that Raiza was talking about, I uploaded an article I'd written a couple years ago, and I gave them the instructions to, like, give me relentless criticism of this piece in the style of, like, an insult comic at a roast. Because, again, they're kind of instructed to be enthusiastic. So I uploaded this piece, and it was cool. They immediately were like, what is Johnson's problem?

Speaker 4

比如,他到底有没有为这篇文章做任何研究?但他们也抓住了一个批评点,老实说,我不是因为是我写的就护短,但这个批评有点不对。他们有点误解了文章,我不太确定是因为我指示他们太极端,还是他们只是——我经常对人说,他们并不像第一代模型那样真正产生幻觉,只是有时候会像人类一样感到困惑或误解某些东西,然后他们的看法就有点偏差。

Like, did he even do any research for this piece? But they also kinda reached for a criticism of it that, genuinely, I'm not just saying this because I wrote it and I'm defensive, was kinda wrong. Like, it kinda misread it a little bit, and I couldn't quite tell whether it was because I'd instructed them to be so extreme or whether they just, it's almost like I keep saying this to people, they don't really hallucinate in the way that the first-generation models did. It's just that they sometimes get confused or they misinterpret something in the way that humans do, and their take is just a little bit off.

Speaker 0

那么幽默呢?我是说,我们正在讨论所有这些不同类型的例子,对吧?它们有没有让你笑过?

What what about humor, though? I mean, we're we're talking about all of these different types of examples. Right? Have they ever made you laugh?

Speaker 3

是的,是的。实际上,我得说它们确实让我笑了,通过那种聪明和幽默感,以及对他人探索的呈现,因为我自己嘛,我觉得我可能想不出那些有趣的案例。但光是看到人们在外部世界用这项技术尝试的东西,就真的很有趣。有人上传了一份文档到NotebookLM,文档里只写了‘poop’和‘fart’这两个词。

Yes. Yes. Actually, I will say that they have made me laugh through the cleverness and the humor and sort of the exploration of other people because I myself like, I don't think I could have come up with the funny cases on my own. But just seeing what people have tried, you know, in the outside world with the technology, that's been really funny. Somebody uploaded a document to NotebookLM, and the document just said the words poop and fart in it.

Speaker 3

当我看到内容就是这样时——那个人在推特上发了帖,他们说‘这就是全部内容,听听播客吧’——我当时想,天啊,这到底是要讲什么?但真的超级搞笑。

And when I saw that that's what it was, the person posted it on Twitter, they're like, that's all this is. Listen to the podcast. I was like, oh, dear. What is this about to be? But it was hilarious.

Speaker 3

太棒了。你知道,让它如此有趣的是,有些时刻真的非常滑稽,然后又会转入‘但这到底意味着什么?’的思考。

It was so good. And there were you know, the thing that makes it so funny is that there were moments that were truly hilarious, and then it would dip into, like, but what does it really mean?

Speaker 4

然后

And then

Speaker 3

它会变得深思熟虑,会变得古怪,会发人深省。我当时想,我真的在认真听这个吗?非常严肃的那种。

it would be thoughtful. It would be bizarre. It would be thought provoking. And I'm like, am I am I really listening to this? Like, very seriously.

Speaker 3

太棒了。

It was great.

Speaker 0

是的。不过我觉得在某些方面,这其实挺搞笑的,因为AI似乎完全没有意识到它被设定的挑战有多么荒谬。

Yeah. I guess in some ways, though, that's sort of hilarious in the way that the AI is kind of oblivious to how absurd the challenges it's been set are.

Speaker 3

关于那个,他们提到有人试图骗我们只说一堆关于大便和放屁的内容。我当时就觉得,好像是这样的。

I on that one, they mentioned, like, somebody trying to trick us into just saying a bunch of poop and fart. And I was like, I think so.

Speaker 0

我也认为,你知道,更传统的幽默形式——不仅仅是嘲笑AI的迟钝——很多在我看来是关于紧张感的积累和释放。所以,我的意思是,这有点像类似的东西,对吧?比如你预测一个句子会往某个方向发展,但它却走向了不同的方向。你觉得它将来能做到这一点吗?因为我认为它现在并不太擅长这个。

I do also think that that, you know, the more traditional forms of humor, so not just laughing at how oblivious the AI is, but a a lot of that seems to me like it's about the buildup and release of tension. So, I mean, it's a kind of similar thing, right, about, like, you're making a prediction of where you're expecting a sentence to go, and then it goes in a in a different direction. Is this something that you think that it will be able to do in future? Because I I don't think it's particularly good at it now.

Speaker 4

其实在早期,也就是它刚推出的头几周,我真的有这种感觉。我简单写过这个,就是它们其实不太擅长幽默。它们能闲聊,但不太会讲好笑话,或者真正有趣的东西。然后就像Reiza说的,用户们能够推动它们变得真正有趣。它们需要被置于一个有趣的情境中,可以这么说。

I actually had this sense in the early days, the first couple of weeks, really, that it was out. I actually wrote about this briefly, which was, like, that they actually weren't very good at humor. They had banter, but they didn't really, like, crack good jokes or have genuinely funny things. And then it turned out, as Raiza said, you know, that users were able to kind of push them into being genuinely funny. They had to be put in a funny situation, as it were.

Speaker 4

比如,我们收到了这份关于大便放屁的文件。另一个例子是一篇看起来完全连贯的科学论文,有图表、脚注,发表得一本正经,但论文里的每一个词都是‘鸡’。就是鸡、鸡、鸡、鸡、鸡、鸡。每个脚注也是鸡、鸡、鸡、鸡。所有图表也都是鸡。

Like, we've been given this poop fart document. Another one was a completely coherent looking scientific paper with charts and graphs and, you know, published in with footnotes and everything, except that every word in the paper was chicken. Just chicken, chicken, chicken, chicken, chicken, chicken. And every footnote was chicken, chicken, chicken, chicken. All the charts were chicken.

Speaker 4

所以他们把这些给了AI,然后那是我第一次真的笑出来。它们就像在说,这到底是怎么回事?然后它们还讲了一些有趣的笑话,所以它们似乎需要被一个不寻常的情境以一种奇怪的方式刺激才能做到。

And so they gave them that, and they just were that was the first part where I actually really laughed. They were just like, what is even happening? Like and they made some funny jokes, and so it's like they have to be prodded into it by an unusual situation in a weird way.

Speaker 0

你刚才提到了一点,实际上我想接着问一下。有人批评这项技术,说它对播客界构成威胁,认为它可能会用大量通用的、低质量的AI生成播客淹没播客界。你对这种说法有什么回应吗?

You did mention something there, actually, that I wanna pick up on. There are people who have made a criticism of this technology, saying that it's a threat to the podcasting world, that you could be flooding the podcasting world with lots of kind of generic, low-quality, AI-generated podcasts. I mean, is there a response that you have to that?

Speaker 3

最有趣且微妙的一点是,我们发现人们正在创作的内容,可能原本就没有相关的播客。我不想说是平凡无奇,但这些确实是没人会专门为之制作一整个节目的东西。我认为这很有趣。我觉得我们正在赋予人们力量,让他们能够创作出他们想要、但通常无法获得的内容。关于低质量内容的第二部分,我得说,我在网上听到的大部分内容,比如人们在Discord上发布的,质量其实相当高。

What is most interesting and nuanced about it is that what we've found is that people are creating content about things that probably don't have a podcast to begin with. It really is, I don't wanna say mundane, but it really is things that nobody is going to make, like, a whole show about. And I think that is interesting. I think we're putting power in people's hands to create content that they want that they ordinarily wouldn't have access to. The second piece of this, around sort of the low-quality content, you know, I would say that most of the content that I've heard, the ones on the Internet, just people posting on the Discord, the quality is quite high.

Speaker 3

第三点,NotebookLM生成的所有内容也都带有SynthID水印。因此,我们采取了非常负责任和谨慎的方法,确保在我们创建或推出能够生成非常类似人类音频输出的机制时,我们会通过水印技术来谨慎处理。

I think on the third note, all of the generations from NotebookLM are also watermarked with SynthID. And so we've taken a very responsible and cautious approach to making sure that, you know, as we create or launch machinery where you can create audio outputs that are very human-like, we approach that with watermarking.

Speaker 4

这里还有一点很有趣,我觉得你的问题也稍微触及到了——我们正在人格化这些角色。它们听起来确实像人类,我们做了所有这些努力让它们听起来像人,对吧?有趣的是,实际上,在音频概述之前,我们的产品理念是:Notebook LM的文本版本并不试图听起来特别人性化。它非常注重事实,在某种程度上并不试图成为你的朋友。

One of the other things that's interesting here, which I think you're kind of getting at a little bit in this line of questioning, is, you know, we are personifying these hosts. Like, they do sound human, and we do all these things to make them sound human. Right? And the interesting thing about this is that the philosophy we'd had with the product, up until audio overviews, was that the text version of NotebookLM actually does not try to sound particularly human. It's very kind of factual, and it doesn't try to be your friend on some level.

Speaker 0

是的,相当冷淡。没错。

Yeah. It's quite cold. Yeah.

Speaker 4

几乎是冷淡的,你知道吗?这某种程度上是我们的想法,是我们的风格,但用声音你做不到这一点。这一点变得非常清楚,就像我们第一次听到这些时立刻意识到的那样:你不能通过对话传达这个,却又不像人类。不要假装是一个人。

It's almost cold. You know? And that was a bit of the idea, the house style. But you can't do that with voice. That's the thing that became very clear, like, the second we first heard these: you can't say, convey this through a conversation, but don't sound human. Don't pretend to be a person.

Speaker 4

就像,根本没有地方能让人类的耳朵容忍那种情况。

Like, there's just no there's no place where the the human ear will, like, tolerate that.

Speaker 0

不过,我确实对此感到疑惑,因为,我的意思是,这样一来,正如你所说,你们正朝着与我和Google DeepMind多次讨论的许多其他对话不同的方向倾斜——即应该尽量避免拟人化。应该避免将它们视为“他们”。我的意思是,我们整个对话中一直将播客主持人描述为“他们”。我的意思是,对这些角色进行拟人化是否存在危险或担忧呢?

I do wonder about that, though, because, I mean, in in that way, you are, as you say, leaning in a different direction to I mean, lots of the other conversations that I've had with with Google DeepMind about how you should try and avoid anthropomorphization. You should avoid trying to think of them as as they. I mean, we've been describing the podcast host as they the entire conversation. I mean, are are there dangers or concerns that are associated with with anthropomorphization of these characters right there?

Speaker 3

我认为,通过像我们这样在一定程度上将其拟人化,比如为它们描述事物的方式增添质感,使其听起来更人性化,这是一种让信息更易于消化、更易于使事物变得更有用的方式。而且我认为,现实是如果我们相信这些方法有足够价值,我们或许不应该抵制它们,而我确实这么认为。我真的觉得——我不知道你是否在TikTok上看到过,很多人上传他们的学习资料,然后说‘哇,我可以学得快多了’。我想到这样的情况时,就会想:这些人受到伤害了吗?实际的危险到底是什么?

I think that by personifying them to a certain extent in the way that we have, like adding texture to the way that they describe things, making them sound more human like, I think it's a way to make information easier to consume and easier to to make something more useful. And I I think that the reality is that we probably shouldn't resist these types of approaches if we believe that there is enough value associated with them, and I really do. I really think that I don't know if you've seen on TikTok, all of these people uploading their study materials and they're like, wow, I can study so much faster. I think about the cases like that where I'm like, are these people being harmed? Like what is the actual danger?

Speaker 3

我这么说并不是要显得理所当然,好像这对社会显然有益。但我真的在思考,他们在这个过程中失去了什么?我认为这与其说是主持人本身的拟人化或人格化,不如说是:好吧,通过听而不是读,你失去了什么?也许问题就在这里。

And I'm not saying this to sort of be like, well clearly, right? Like it's good for society. But I really am thinking what are they losing as part of this experience? And I think that it's less about the personification or the anthropomorphization of the hosts themselves and more about, okay, what did you lose by listening instead of reading? Maybe that's it.

Speaker 4

是的。Raiza,这一点说得很好。我想补充的是,事实证明,通过对话、提出后续问题并将焦点引导到你在一部复杂作品中需要了解的内容上,是一种非常强大的学习和理解方式。但如果你想就一本书进行对话并真正深入探讨,大多数人无法接触到书的作者,也无法找到理解书中复杂性的专家导师。而现在有了AI,这种对话式的探索成为可能。

Yeah. And that's a great point, Raiza. And the other thing that I would add to that is, it turns out that a very powerful way to learn and to understand is through dialogue, through asking follow-up questions and steering the focus towards the things that you need to know in a complex body of work. But that kind of dialogue, if you wanted to have a conversation about a book and really engage with it, most people don't have access to the author of the book, and most people don't have access to an expert tutor that understands the complexities of the book. But now with AI, those kinds of conversational explorations are possible.

Speaker 0

这确实是一种更古老的方式来探索事物,正如你所描述的。不过我想知道,你在这里提到人们无法接触到作者,但如果有人上传一本书,而你其实并不希望他们与作者对话,那该怎么办?我想到的是,比如上传《我的奋斗》或《无政府主义者食谱》。

It's kind of a much more ancient way to to to explore things exactly as you describe. I do wonder, though. I mean, you're you're talking about here, like, people don't have access to the author. But what's to stop somebody from uploading a book where, actually, you really don't want them to have a conversation with the author? I'm thinking here, like like, putting in Mein Kampf or The Anarchist Cookbook.

Speaker 4

是的。我的意思是,我们——谷歌花了很多时间研究底层安全层,DeepMind也投入了大量精力。所以如果存在明显冒犯性或危险的内容,你是可以拦截的。更棘手的是政治方面的问题。

Yeah. I mean, we there there's a kind of underlying safety layer that Google, you know, spent a lot of time working on. DeepMind has spent a lot of time working on. So if there are obviously offensive, dangerous things, that you you can catch. The trickier thing is, like, what happens in terms of politics?

Speaker 4

所以如果你上传的内容在常规政治讨论范围内,但可能更偏右翼或左翼,主持人应该如何回应?因此我们特别加入了指示,说明:如果感觉涉及政治,你应该采取的态度是‘嘿,我们不会在这件事上选边站队。我们只是就这份文件的内容进行对话,不会以这种方式支持或批评它。’我们认为这是处理这类复杂政治立场的最佳折衷方案。

So if you upload something that's within the bounds of, you know, conventional political discussion, but it may be more right wing or more left wing, how should the host respond to that? And so we specifically included instructions that say, listen, if it feels political, then you should adopt the attitude of, like, hey, we're not taking sides in this. We are just going to have a conversation about what this document says, and we're not going to endorse it or critique it in that way. And we figured that that was the best compromise for those kind of complicated political stances.

Speaker 3

我认为还有一个有趣的分界线,对吧?一方面是安全担忧,另一方面是审查担忧。实际上在早期,安全过滤器还远不如现在复杂时,我们经常遇到这种情况:人们研究困难的话题,研究历史上发生的涉及大量暴力、种族主义的事件,对吧?这些都是敏感话题。但我认为,如果不考虑用户的意图就一概屏蔽内容,这样的工具是不对的。对吧?

I think there's also the interesting sort of line, right, where I think there's the sort of safety concern and then I think there's censorship concern. And actually in the early days we ran into this a lot before the safety filters were much more sophisticated, where people study difficult topics, people study things that happened in history that have quite a bit of violence, racism, right? Like these are topics that are fraught. But I think it would be wrong to create a tool that blocks sort of content generically without a thought around the intent of the user. Right?

Speaker 3

因此,我们不希望用户创建有害内容。但同时,如果我们的用户大多数是学习者、教育者,比如你在学习历史,你肯定会遇到安全过滤机制。

So that we're not allowing users to sort of create harmful content. But at the same time, like, if most of our users, especially in the beginning, were learners, educators, like if you're studying history, you are definitely gonna run into a safety filter.

Speaker 4

嗯,这正是我的问题。我上一本书其实——你提到了《无政府主义者食谱》——其中一部分是关于无政府主义的历史以及早期无政府主义世界中恐怖主义的根源。所以我当时在用NotebookLM帮助我研究这本书,结果它总是说:‘抱歉,我无法回答这个问题,因为你显然是个恐怖分子,史蒂文。’

Well, that was my problem. The last book that I wrote was actually, you've mentioned the Anarchist Cookbook. It was about part of it is about the history of anarchism and the kind of roots of terrorism in the early anarchist world. And so I was using NotebookLM to, like, help me research that book as I was writing it, and it was constantly like, I'm sorry. I can't answer that question because you are obviously a terrorist, Steven.

Speaker 4

然后我就说,不,不是的。

And I'm like, no. No.

Speaker 0

史蒂文,你肯定上了某个名单了,我敢肯定。

You're definitely on a list somewhere, Steven. I'm sure.

Speaker 3

太棒了。

That's great.

Speaker 4

真了不起。我还有工作呢。

Amazing. I still have a job.

Speaker 0

还有一个关于个人数据的问题。我知道这在大型语言模型中一直备受讨论,人们上传文件时会担心数据被用于下一代模型的训练。那么,在Notebook LM中,你如何确保上传的信息可以保持私密,正如你所说的那样?

There is also this question about personal data. I know that this is something that has been, you know, really subject to a lot of discussion with large language models and people uploading documents to it and and being concerned about it kind of feeding into the next generation of models. So how do you make sure in Notebook LM, as you said, that that the information that you upload can be private and remain so?

Speaker 4

是的。所以这实际上是一个机会来解释我认为这里非常重要的一个概念,那就是模型的上下文窗口。上下文窗口实际上就像是语言模型的短期记忆。长期记忆就像是它的训练数据,也就是它对世界的一般知识。而上下文是你提问时随查询一起输入的内容。

Yeah. So this actually is an opportunity to explain something that I think is really important here, which is the idea of the context window of the model. So a context window is effectively like the short-term memory of a language model. The long-term memory is like its training data, its general knowledge of the world. And the context is the stuff you put in with your query when you ask a question.

Speaker 4

上下文窗口中的任何内容都是临时的。一旦你关闭会话,它就会消失,从模型的记忆中清除。这也意味着这就是为什么它是私密的,对吧?我们不会用你的信息来训练模型。我们所做的只是将其放入模型的短期记忆中,让模型回答问题,然后当你关闭会话时,模型就好像完全忘记了你提供给它的任何信息。

And anything in the context window is transitory. The second you close your session, it disappears, gets wiped from the memory of the model. What that also means is that's why it's private, right? We're not training the model on your information. All we're doing is putting it in the short term memory of the model, letting the model answer questions, and then when you close the session, it's like the model has completely forgotten anything that you've given to it.
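Steven's distinction between long-term memory (the model's weights) and short-term memory (the context window) can be sketched as a session object whose uploaded sources live only until the session closes. This is purely a conceptual illustration of the privacy model he describes, with invented names; it is not NotebookLM code:

```python
# Conceptual sketch: uploaded sources live in a transitory per-session
# context (short-term memory), never in the model's weights (long-term
# memory), so closing the session discards everything the user provided.
class Session:
    def __init__(self):
        self.context = []  # the "context window": short-term memory

    def upload(self, source: str) -> None:
        self.context.append(source)  # goes into the prompt, not the weights

    def ask(self, question: str) -> str:
        # A real system would send context + question to the model here.
        return f"answer to {question!r} grounded in {len(self.context)} source(s)"

    def close(self) -> None:
        self.context.clear()  # session over: the model "forgets" everything

s = Session()
s.upload("my private journal")
print(s.ask("summarize the trip"))
s.close()
print(s.context)  # [] -- nothing retained after the session
```

The key design point the transcript makes is that answering happens by reading the context at query time, not by training on it, which is why wiping the context at session close is enough to make the data transient.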

Speaker 0

那么关于这个产品的未来,我的意思是,这还是一个相当年轻的产品。你们希望在其中加入哪些功能呢?

So in terms of the future of this, I mean, this is still quite a young product. What are the things are you hoping to include on it?

Speaker 3

我认为我们已经看到了大家对音频功能的极大热情。所以我们可以肯定地承诺,这将在未来的路线图中实现。正如我之前提到的,更多的控制选项、更多的声音、更多的人设、更多的语言。我认为这对我们来说是一个非常令人兴奋的前景。

I think we've seen so much excitement about the audio feature. And so I think we can definitely commit to that, to being on the future road map. I think as I alluded to, more controls, more voices, more personas, more languages. I think that's just such an exciting horizon for us.

Speaker 4

我特别兴奋地思考的一个方面,我们才刚刚开始触及皮毛,就是有很多工具可以用来提问和听取解释,但如果能利用这些资源进行写作呢?比如,我们如何在一个基于来源的环境中写作?作为一个写作者,我认为这将是一件非常棒的事情。所以我们有一些非常非常酷的东西正在开发中。

The one that I'm so excited to think about, you know, which we've just started to scratch the surface of, is: there's a lot of tools for asking questions and listening to explanations of things, but what about writing with these sources at your disposal? Like, how do we write in a source-grounded environment? And so, just as a writer myself, I think that's gonna be an amazing thing. So we have some really, really cool things in the works.

Speaker 0

我也在想不同的模态。我的意思是,你们已经转向了音频,但想必在某个时候也可以转向视频。

I do also wonder about different modalities. I mean, you've gone to audio, but but presumably you could go to video at some point too.

Speaker 3

是的。实际上,我们有一个关于视频的有趣想法,就是,好吧。我们还不是在谈论完全生成式视频,但想象一下,即使你能做一些非常基础的事情。比如,你上传这些幻灯片。它们里面有图表。

Yeah. And, actually, there's a there's a fun idea we have for video, which is like, okay. We're not talking about, like, fully generative video yet, but imagine if you could do even something really basic. Like, you upload these slide decks. They have charts.

Speaker 3

他们有图表,有论文的PDF。只需提取已有的内容。而NotebookLM在这方面已经非常出色,这得益于我们的引文模型。我们能确切知道答案的每一部分来自哪里。

They have diagrams. You have PDFs of papers. Just take the content that's already there. And NotebookLM is already incredible at this because of our citations model, the fact that we know exactly where every piece of the answer comes from.

Speaker 3

我们用它生成音频概述,用它生成文本回答。我认为,用它基于你自己的内容生成短视频也不会是太大的跨越。

We use it to generate audio overviews. We use it to generate textual answers. I think it wouldn't be that big of a leap to generate short videos using your own content.

Speaker 0

史蒂文,我真的很喜欢你经常这样描述它:用它制作那些别人都不愿意做的播客。对吧?但我想这里的重点是,你并不是要取代所有播客。想必有些事情你预期NotebookLM永远也做不到。

I do really like, Steven, how you're describing this often as the thing that you use to make the podcast that nobody else would wanna make. Right? But the point here, I guess, is that you're not trying to replace all podcasts. There are presumably things that you expect that NotebookLM will never be able to do.

Speaker 4

是的。我认为人们总体上还是会更喜欢听两个真实的人讨论一个话题。如果一个话题有足够的经济价值或热情来制作播客,真人对话仍然是首选。只是,事实证明存在着一大片未知领域,比如,以前没人想过基于家庭阿拉斯加之行制作播客,因为租个录音棚来做这个没意义,但现在你可以把每个人的日记和照片上传到NotebookLM,就能拥有一个关于你家旅行的播客。所以我认为,这正是地图上我们刚刚开始探索的、尚未开发的空白地带。

Yeah. People, I think, will generally always prefer to hear two actual humans talking about a topic. If there is enough economics or passion to generate a podcast on a topic, humans actually talking to each other will be the choice. It's just that it turns out there's this, like, vast uncharted territory. No one ever thought about making a podcast based on, you know, the family trip to Alaska, because it just didn't make sense to, like, rent a studio to do that. But now you can just take everybody's journal entries and photos and upload them to NotebookLM, and you can have a podcast based on your family trip. And so I think that's where it turns out there's just all this untapped kind of blank space on the map that we've just started to explore.

Speaker 0

你是否认为,人类内容创作中存在一些AI很难捕捉、或者也许永远无法捕捉的元素?

Do you think that there are elements of, I don't know, like human content creation that are really hard to capture with AI, or that AI will maybe never be able to capture?

Speaker 4

是的。这正是我们试图弄明白的事情。我的意思是,我有一个非常感兴趣的想法:这些模型在思考和开发真正长篇的想法方面能力如何?比如写书,对吧?当你构思一本书时,这是一个极其长期的过程,你在构思一个要持续300页的信息呈现方式,它会演化出所有这些复杂性,所有这种叙事复杂性。

Yeah. That's the thing we're trying to figure out. I mean, you know, one idea that I'm really interested in is, like, how capable are these models at thinking and developing ideas that are really long-form? So, book writing, right? When you're coming up with the idea for a book, it's, for one, an incredibly long-term process, and you're thinking about a presentation of information that's gonna go on for 300 pages and evolve all this complexity, all this narrative complexity.

Speaker 4

而目前你完全无法用语言模型来处理这个。你可以处理其中的小部分,对吧?你可以说,‘好吧,我正在尝试设置这个场景’,或者‘我正在试图理清叙事应该是什么’,但你无法真正地想象出整个作品。这目前仍然是人类独有的能力,我认为在很长一段时间内都会是,也许永远都是,但谁知道我们最终会走向何方呢。

And you couldn't approach that at all with a language model right now. You could work on little bits of it, right? You could say, Okay, I'm trying to set up this scene, or I'm trying to figure out what the narrative should be, but you can't actually, like, imagine the whole thing. That right now is just a human exclusive capability, and I think it will be for a long time, and it may always be, but who knows where we're gonna end up.

Speaker 0

既见树木,也见森林。

Both the wood and the trees simultaneously.

Speaker 4

是的。我认为它们是希望的种子。虽然有一些积极的信号,但以写书为生的人可以放心,他们未来依然能继续从事这份工作。

Yeah. And I think they're the kind of seeds of that. There's some promising signals, but people who write books for a living, I think, can feel confident that they will continue to be able to do that.

Speaker 0

没错。尽管以写书为生是最折磨人的职业之一。作为一个正在尝试写书的人,我真希望你们能加快进度。好了,非常感谢两位的参与,这真是一场非常精彩的讨论。

Yeah. Although writing books for a living is one of the most torturous professions there is. And as someone who's trying to write one at the moment, I want you guys to hurry up, please. Well, thank you both for joining me. That was a really, really fascinating discussion.

Speaker 0

非常感谢。

Appreciate it.

Speaker 4

谢谢邀请。

Thanks for having us.

Speaker 3

谢谢。感谢邀请我们。

Thank you. Thanks for having us.

Speaker 0

你知道吗,我觉得Notebook LM捕捉人们想象力的方式其实很暖心。一方面,这项技术运用了最尖端的人工智能模型,处于技术可能性的最前沿。它旨在解决这个非常现代的问题:我们常常被大量密集且可能枯燥的信息压得喘不过气。而他们找到了一个如此人性化、古老又吸引人的解决方案——聆听两个兴奋又投入的人之间的对话。

You know, I think there's actually something quite heartwarming about the way that Notebook LM has captured people's imagination. Because on the one hand, you've got this technology that is operating at the absolute cutting edge of what is possible with some of the most sophisticated AI models out there. And it's something that's designed to deal with this very modern problem about how we are often overwhelmed with having to process these large amounts of often quite dense and maybe quite boring information. And they've hit upon a solution that is so innately human, so ancient and appealing. The idea of listening in to a conversation between two excitable and interested people.

Speaker 0

当然,让人类竖起耳朵并集中注意力的最快方式就是八卦。这就像围坐在篝火旁,而AI利用这一技巧帮你消化25页枯燥乏味的讲座系列。我的意思是,这么说吧,如果它能让我那博士论文听起来有趣,那这工具有可能相当强大。您刚才收听的是由我——汉娜·弗莱教授主持的《谷歌DeepMind播客》。如果您喜欢这一集,请订阅我们的YouTube频道,您也可以在您喜爱的播客平台上找到我们。

And, of course, the fastest way to make a human prick up their ears and pay attention is through gossip. And this is like sitting around a fire while an AI uses that very trick to help you digest 25 pages of a snorefest lecture series. I mean, put it this way, if it can make my PhD thesis sound interesting, then this has the potential to be quite a powerful tool. You have been listening to Google DeepMind the podcast with me, professor Hannah Fry. If you enjoyed that episode, then do subscribe to our YouTube channel, and you can also find us on your favorite podcast platform.

Speaker 0

当然,我们还有更多涵盖广泛主题的剧集即将推出,也请务必关注。下次见。

And, of course, we have got plenty more episodes on a whole range of topics to come, so do check those out too. See you next time.
