本集简介
双语字幕
在人工智能时代,发现是什么样子的?
What does discovery look like in the age of AI?
它是改变了一切,还是一点都没变?
Does it change everything or does it change nothing?
你知道吗,我最近被问了很多次,比如,当配送免费时,我们还需要做发现吗?
You know, I've been getting asked a lot, like, when delivery is free, do we still need to do discovery?
实际上,我认为当配送免费时,发现反而变得更加重要。
And I actually think when delivery is free, discovery becomes more important.
在今天的节目中,
In today's episode,
我与特蕾莎·托雷斯坐下来交谈,她是《持续发现习惯》的传奇作者。
I sat down with Teresa Torres, the legendary author of Continuous Discovery Habits.
这本书我自己读过很多遍,也做过很多标记。
This is a book that I myself have read multiple times, marked up multiple times.
许多产品经理都在进行客户访谈,但他们的产品和功能却依然失败。
So many PMs are doing customer interviews, yet their products and their features fail.
为什么?
Why?
她已在全球100多个国家与超过17,000名产品经理合作,因此她能为你提供改进发现流程的洞见——不仅适用于普通功能,也适用于AI功能。
She has worked with over 17,000 PMs across the world in over 100 countries, so she brings the insight you need to improve your discovery, not just for regular features, but for AI features.
我看到的提示工程面临的挑战是:
Here's the challenge I see with prompt engineering.
我们都曾有过类似与ChatGPT或Claude聊天的经历,那是一种对话过程。
We all have experience like chatting with ChatGPT or Claude, and we're in a conversation.
如果我们第一个提示写错了,可以立即调整。
If we get that first prompt wrong, we can immediately refine.
但当你在构建产品时,这个提示无法由你亲自调整。
But when you're building a product, the prompt can't be refined by you.
一旦它上线到你的产品中,就再也没有调整的机会了。
Once it's live in your product, there's no refinement.
这是一次性机会。
It's a one shot.
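The one-shot point can be made concrete with a small sketch. Everything below (the template text, the function name) is an invented illustration, not something from the episode: in a shipped product, the prompt is a fixed template filled in per request, with no human in the loop to refine a bad first attempt.

```python
# In a chat session you can refine a prompt turn by turn; a shipped product
# fills a fixed template per request, and whatever it renders is what the
# model sees -- no one is there to catch a bad first attempt.
PROMPT_TEMPLATE = (
    "You are a support assistant for {product}. "
    "Answer the customer's question in two sentences or fewer.\n\n"
    "Question: {question}"
)

def build_prompt(product: str, question: str) -> str:
    """Render the one-shot prompt the product will send to the model."""
    return PROMPT_TEMPLATE.format(product=product, question=question)
```

This is why prompt templates for products need upfront testing against many real inputs, rather than the interactive trial-and-error we use in a chat window.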
那么,产品经理们进行虚假探索的迹象有哪些?
So what are the signs that PMs are doing fake discovery?
他们的待办事项列表里没有任何变化。
Nothing in their backlog changes.
他们从不放弃任何想法。
They don't kill any ideas.
市面上有很多所谓的‘探索表演’。
There's a lot of Discovery Theater out there.
在你离开之前,特蕾莎,我必须问一下,特蕾莎·托雷斯的业务规模有多大?
Before you go, Teresa, I have to ask, how big is the business of Teresa Torres?
是的。
Yeah.
所以我觉得
So I'm like
特蕾莎,欢迎来到这个播客。
Teresa, welcome to the podcast.
谢谢你的
Thanks for
光临。
having me.
我很兴奋能参加这次访谈。
I'm excited to do this.
就像我在录音前说的,你和马蒂·卡根一样,是我嘉宾名单中的顶级人物。
As I was saying off air, you are on my S-tier of guests along with Marty Cagan.
当我梦想创办这个播客时,我最想邀请的两位嘉宾就是你们两位。
You are the two guests I wanted most when I dreamed of starting this podcast.
这是因为我认为,你可能比世界上任何人都更广泛地指导过产品经理们进行需求探索。
And that's because I think you have probably advised more PMs on Discovery than anyone else in the world.
你现在觉得这个数字大概是多少?
What would you say the number is at now?
是的。
Yeah.
这取决于我们如何计算。
It kind of depends on how we count.
当我直接指导团队时,我每年会与大约30个团队合作。
So when I was coaching teams directly, I would work with about 30 teams a year.
我这样做了十多年,所以可能指导过300多个团队。
I did that for over a decade, so probably over 300 teams.
这意味着我会与他们进行持续数月的每周通话。
And what that means is like weekly calls for multiple months.
所以我当时是深度参与的。
So I was sort of in-depth.
通过Product Talk学院,我们有超过17,000名学生,这相当惊人。
Through the Product Talk Academy, we have over 17,000 students, which is pretty mind blowing.
他们来自100多个国家。
And they come from over 100 countries.
哇。
Wow.
一百个?我都不知道产品经理在一百个国家都有实践。
A hundred? I didn't even know PM was practiced in a hundred countries.
所以你真的见识过整个产品发现的世界。
So you have really seen the whole world of product discovery.
这很有趣,因为这么多产品经理都在做用户访谈,但他们的产品和功能还是失败了。
And it's interesting because so many PMs are doing customer interviews, yet their products and their features fail.
为什么?
Why?
是的,这是一个复杂的话题。
Yeah, this is a complicated topic.
我认为原因有很多。
I think there's a lot of reasons for this.
我认为最主要的原因是我们不太擅长访谈。
I think the primary reason is that we're not that good at interviewing.
很多团队在进行访谈时,目的是探索他们的解决方案并获取关于该方案的反馈。
So a lot of teams, they go into interviews with the intent of exploring their solution and getting feedback on their solution.
这并不是获取我们解决方案反馈的最佳方式。
That's not really the best way to get feedback on our solutions.
我们在客户访谈中的目标应该是了解我们的客户。
Our goal in our customer interview should be to learn about our customers.
即使我们知道访谈的目标是了解客户,而不谈论我们的解决方案,我们仍倾向于提出非常不可靠的问题,比如:你对不同事物喜欢和不喜欢的是什么?
Even if we know that's the goal of the interviews and we don't talk about our solutions, we tend to ask really unreliable questions like, what do you like and dislike about different things?
能跟我谈谈你的整体经历吗?
Tell me about your experience broadly.
因此,我所教授的内容之一,也是我在书中提出的概念,并在我们所有的课程中传授的,就是基于故事的访谈方法。
And so one of the things that I teach, I introduce this idea in the book, we teach it through all of our programs, is this idea of story based interviewing.
那么,我该如何与你交谈,收集你过去经历的可靠故事呢?
So how do I talk to you and collect a reliable story about your past experience?
我要了解你实际做了什么,而不是你以为自己做了什么,也不是你希望做什么,而是现实中你最近做了什么?
I learn about what you actually do, and not what you think you do, not what you aspire to do, but like in reality, what did you do recently?
这样我才能确保我打造的产品能融入你的真实生活。
So that I can make sure that I'm building a product that fits in your lived world.
所以听起来,人们问了太多假设性的问题。
So it sounds like people are asking too many hypothetical questions.
这是一个原型,你会想使用它吗?
Here's a prototype, would you like to use this?
他们应该用什么更好的方式来展开这种对话?
What would be the better way for them to approach that conversation?
是的,让我们从几个层次来看这个问题。
Yeah, so let's look at this in tiers.
首先,很多人会展示一个解决方案,然后问:你会使用这个吗?
So the first is a lot of people present a solution and say, would you use this?
这是糟糕且不可靠的反馈。
That's terrible, unreliable feedback.
我们并不擅长预测自己未来的行为。
We're not good at predicting our future behavior.
而且人类都希望表现得友善,所以我们会说:当然,我会用的。
Also humans wanna be nice, so we're gonna say, yeah, of course I would use that.
即使我们认为自己很诚实,实际上我们对未来的时间和自己可能做的事情持乐观态度。
And even if we think we're being honest, it's actually we're optimistic about our future time and about what we might do in the future.
所以我们可能真的以为自己会使用它,但这并不意味着我们真的会用。
So we might actually genuinely think we're gonna use it, but it doesn't mean we are.
其实有更好的方法来测试我们的解决方案。
There's actually better ways to test our solutions.
在评估解决方案时,我非常推崇假设检验,这与访谈是完全不同的活动。
So I really like assumption testing when we're evaluating solutions, which is a whole different activity from interviewing.
如果你愿意,我们可以深入探讨这一点。
We can get into that if you want.
我用访谈的主要目的是了解你。
What I like to use interviews for is let me just learn about you.
所以接下来更进一步,人们会明白,我应该问一个开放性问题。
And so what I wanna do sort of the next level, like people learn, okay, I should ask an open ended question.
于是他们会说:谈谈你使用我产品的体验吧。
So they'll be like, tell me about your experience with my product.
这种问题虽然开放,但存在挑战。
The challenge with that type of question, it is open ended.
我可能会了解到很多关于你的事,但我知道的只是你认为自己会怎么做,而不是你实际上会怎么做。
I might learn a lot about you, but what I'm gonna learn is what you think you do, not necessarily what you actually do.
因此,为了解决这个问题,我想问你:你能谈谈上次使用这个产品的情况吗?或者更好的是,谈谈你上次解决这个产品旨在解决的问题的经历。
And so to fix that, I wanna ask you, tell me about the last time you used the product or even better, tell me about the last time you solved the problem the product was designed to solve.
这正是我通过反复试验、阅读和重读这本书所学到的。
That is exactly what I've learned through hard trial and error and reading and rereading this book.
这真的有效,朋友们。
It actually works folks.
你刚才简要提到了假设测试。
And you briefly mentioned assumption testing.
你能给我们简要介绍一下这是什么吗?
Can you give us the thirty second overview of what that is?
是的,这指的是我们容易陷入大型创意测试的陷阱,这意味着我们必须在前期完成所有设计,然后进行可用性测试,或者必须先构建出来再进行A/B测试。
Yeah, so it's this idea of we tend to fall into the trap of big idea testing, which means we have to do all the design upfront and then usability test it, or we have to build it and AB test it.
这些策略的挑战在于,它们确实是我们工具箱里很好的工具,我们最终希望使用它们。
The challenge with those strategies, they're great to have in our toolbox, we wanna do them eventually.
但当我们在探索阶段使用它们时,是在完成所有工作之后才去验证这个想法是否可行。
But when we're doing them in discovery, we're learning after we did all the work if the idea would work or not.
我更倾向于在投入全部工作之前,先了解某件事是否可行。
I prefer to learn if something's gonna work or not before I do all the work.
因此,关键在于将这个想法分解成其底层的假设。
And so the key to that is to break the idea down into its underlying assumptions.
那么,为了让这个想法成立,哪些前提必须为真?
So what needs to be true in order for this idea to work?
然后去验证这些具体的假设。
And then to test those particulars.
我们可以更快地测试这些假设。
And we can tend to test assumptions much quicker.
我们不必完成所有设计工作。
We don't have to do all the design work.
我们当然也不必完成所有的工程工作。
We certainly don't have to do all the engineering work.
我们可以在完成所有工作之前,就开始收集数据来验证这个想法是否可行。
And we can start to collect data on whether an idea would work or not before we do all the work.
好的。
Okay.
我认为这与持续发现习惯体系是相契合的,当然如果你觉得不对可以纠正我。
And I think this comes together, although you can correct me if I'm wrong, in the Continuous Discovery Habits system.
你能具体解释一下什么是持续发现吗?
Can you break down what exactly is Continuous Discovery?
是的,这个框架是我开发出来的,旨在帮助新获得自主权的产品团队弄清楚到底该做什么。
Yeah, so this framework I developed to help newly empowered product teams figure out what in the world to do.
我们来稍微谈一谈这一点。
So let's talk about that for a second.
我认为,传统上产品团队被要求在特定日期前交付特定功能,我们称之为路线图。
I think historically product teams have been asked to deliver specific features usually by specific dates, we call these roadmaps.
有时,团队会自己创建这些路线图,不得不思考该把什么内容放进自己的路线图里。
Sometimes teams are creating those roadmaps themselves and they've had to deal a little bit with what do I put on my roadmap.
但对于许多产品经理来说,是他们的利益相关者把内容加到他们的路线图上,而他们的工作仅仅是执行交付。
But for a lot of product managers, their stakeholders are putting things on their roadmap and their job is literally to just deliver.
因此,我看到企业开始从关注产出转向关注成果。
And so what I saw happen is companies started to shift from an output focus to an outcome focus.
现在他们说:好吧,团队,我们明白未来充满不确定性,我们不知道你们该开发什么,但你们需要降低流失率,或提高留存率,或提升用户参与度。
So now they're saying, okay, teams, we get it, the future is uncertain, we don't know what you should build, but you need to reduce churn, or you need to increase retention, or you need to drive engagement.
这些团队却说:什么?
And these teams are like, what?
你一直告诉我该开发什么,我现在不知道该怎么做了。
You've always told me what to build, I don't know how to do this.
因此,持续发现习惯就是为了解决这类永恒的开放性挑战,比如提升参与度和留存率,甚至是用户获取。
And so the Continuous Discovery Habits are about how do we answer these evergreen wide open challenges like engagement and retention, or even customer acquisition.
它始于明确你的目标是什么。
And so it starts with having a clear idea of what your outcome is.
这通常是一个指标,比如提升参与度。
That's typically a metric, can be something like increase engagement.
通常它会更具体一些,我们会定义我们想要的参与类型,但目前这样就够了。
Usually it's more specific than that, we're defining the types of engagement we want, but that's good enough for now.
然后左边的习惯是进行访谈,我们每周都进行访谈,目标是了解我们的客户。
And then the habit on the left here is we're interviewing, and we're interviewing week over week with the goal of understanding our customers.
所以我们在访谈中不是测试我们的解决方案,也不是评估我们的解决方案。
So we're not testing our solutions, we're not evaluating our solutions in our interviews.
我们真正想弄清楚的是:我们为谁而建。
We're really trying to understand who we are building for.
我们正在建立一个良好的节奏。
We're setting up a good cadence.
然后中间的图示——我本该从这里开始讲——是一个机会解决方案树。
And then the visual in the middle, I should have started there, is an opportunity solution tree.
这是我设计的一种可视化工具,帮助人们跟踪他们杂乱无章的探索工作。
This is a visual that I designed to help people keep track of their messy discovery work.
所以它从顶部的结果开始。
So it starts with an outcome at the top.
我们通过访谈来发现机会空间。
We're interviewing to uncover the opportunity space.
机会是未被满足的客户需求、痛点和愿望。
Opportunities are unmet customer needs, pain points and desires.
因此,在访谈过程中,当我们了解人们过去的行为时,我们会发现摩擦点。
So as we interview, as we learn about what people did in the past, we're going to uncover friction.
我们会将所有这些信息记录在机会空间中。
We're going to capture all that in the opportunity space.
最终,希望不会太久,我们会选择一个目标机会。
Eventually, hopefully not after too long, we're going to choose a target opportunity.
我非常主张在评估解决方案时进行对比决策。
I really am an advocate of compare-and-contrast decisions when evaluating solutions.
因此,我们会为同一个目标机会考察多种解决方案,然后通过假设检验将这些解决方案分解为底层假设,并通过假设检验来评估哪一个更有可能成功。
So we're looking at multiple solutions for the same target opportunity, and then we're using assumption testing to break those solutions down into their underlying assumptions, and then assumption testing to evaluate which one looks like a winner.
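The structure she describes (outcome at the top, opportunities underneath, multiple solutions per target opportunity, each broken into testable assumptions) can be sketched as a small data structure. The node labels below are invented examples, not from the episode:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    kind: str  # "outcome", "opportunity", "solution", or "assumption"
    children: list["Node"] = field(default_factory=list)

# An outcome sits at the top; interviews populate the opportunity layer;
# each target opportunity gets multiple candidate solutions, and each
# solution is broken down into testable assumptions.
tree = Node("Increase weekly engagement", "outcome", [
    Node("Users forget to come back after onboarding", "opportunity", [
        Node("Weekly digest email", "solution", [
            Node("Users open marketing-style emails from us", "assumption"),
        ]),
        Node("Push-notification reminders", "solution", [
            Node("Users grant notification permission", "assumption"),
        ]),
    ]),
])
```

The point of the shape is the compare-and-contrast step: because sibling solutions hang off the same opportunity, the team evaluates them against each other rather than judging one idea in isolation.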
所以这个图表是永恒的。
So this diagram is timeless.
我仍然会推荐别人回到核心的持续发现循环。
I still refer people back to the core Continuous Discovery loop.
但人工智能是如何改变持续发现的呢?
But how has AI changed Continuous Discovery?
我认为这取决于你问谁,我对这个问题其实很矛盾。
I think it depends on who you ask, and I really am conflicted about this.
让我们谈谈人工智能影响这一过程的两条不同路径。
So let's talk about two different paths of how AI impacts this.
第一条路径是:人工智能如何影响我日常工作的执行方式?
There's the path of how does AI affect how I do my day to day job?
第二条路径是:当我构建人工智能产品时,我的工作会发生怎样的变化?
And then there's the path of how does my job change when I'm building AI products?
在第一条路径中,我非常喜爱把人工智能当作思维伙伴。
So in the first path, I love AI as a thought partner.
所以,即使我在定义成果时,也可能和Claude或ChatGPT聊天,讨论如何更好地表述或衡量这些成果。
So like even if I'm defining outcomes, I might be chatting with Claude or ChatGPT about my outcome and how I might better frame it or how I could better measure it.
它可能不会告诉我我的成果是什么,因为这需要大量的业务背景知识。
It's probably not telling me what my outcome is because it needs a lot of business context for that.
我的意思是,现在人们在启动项目时会输入所有业务背景,也许AI能在这方面提供帮助。
I mean, now people set up projects with all their business context and maybe it could help with that.
但通常我们的成果都来自我们的高管。
But typically our outcomes are coming from our executives.
所以,也许我们只是用它来优化这一过程。
And so maybe it's we're using it to refine that activity.
我知道有一些团队已经开始尝试让AI代替他们采访客户。
I do know some teams that are starting to experiment with having AI interview their customers for them.
实际上,我认为我们生活在一个这样做完全可行的世界里。
I actually think we live in a world where this is very possible.
但我有点担心这对我们的客户意味着什么。
I'm a little concerned about what it says to our customers.
我们真的很想了解你,但又不想花自己的时间去做这件事。
We really want to learn about you, but not enough to spend our own time doing it.
我也对访谈带来的一个超越客户反馈的益处有些担忧。
I also have some concern about one of the benefits of interviewing beyond what we learn from our customers.
仅仅通过直接接触客户,就能帮助我们建立对他们的同理心。
Just this act of having firsthand exposure to our customers helps us build empathy for them.
它能帮助我们看清我们自己的想法与客户想法之间的差距。
It helps us see the gap in how we think and how they think.
我不认为仅仅阅读访谈文字记录就能获得这种好处。
And I don't think like reading a transcript is going to get you that benefit.
所以我仍然是人类与人类直接交流的坚定支持者。
So I'm still a big advocate of humans talking to humans.
在综合分析这一领域,我其实非常矛盾。
One area where I'm really torn on is on synthesis.
我知道很多团队已经开始使用人工智能进行综合分析。
So I know a lot of teams are starting to use AI for synthesis.
我喜欢这一点。
Here's what I like about this.
我知道很多团队根本不做综合分析。
I know a lot of teams that do no synthesis.
他们进行访谈,笔记存进文件夹,之后就再也没看过。
They conduct interviews, the notes go into a folder, they never look at them again.
如果你只是这样做,我认为AI可以帮上忙。
If that's what you're doing, I think AI can help.
我的顾虑是,我大量实验过我自己做的综合分析与Claude或ChatGPT的综合分析,后者遗漏了很多内容。
My caution is I've experimented a lot with my synthesis versus Claude's synthesis or ChatGPT's synthesis, and it misses a lot.
你必须付出很大努力,才能让AI精准地帮助你识别机会。
You have to work really hard to get it specific to help it really identify opportunities.
我一直在进行这些实验,最终可能会发布这个领域的工具。
I've been running these experiments, I will probably release tools in this space eventually.
它大约有60%到80%的准确率,但我担心那20%到40%的缺失会带来什么损失。
It's like 60 to 80% good, and I worry about what we lose in that 20 to 40%.
我还担心当人类不在数据中时,我们会失去什么。
I also worry about what we lose when humans aren't in the data.
我认为,当我们花大量时间沉浸在数据中时,我们的大脑会发生变化。
I think our brains change when we spend that much time in our data.
所以我不会完全否定合成这一部分;事实上,我目前的思考方式是:对于日常琐碎的工作,也许我们不需要深入分析,AI合成已经足够应对;但对于那些真正让我们在市场中脱颖而出的差异化部分,我们仍需深入。
So I'm not going to say never on that synthesis part. In fact, the way I'm starting to think about it is: for the everyday mundane things that we work on, maybe we don't need to go deep, and AI synthesis is more than adequate. But for our differentiators, for the things that really differentiate us from our competitors in the market,
我可能还是会深入下去,继续进行人工合成。
I'm probably going to go deep and still do my human synthesis.
我知道大卫·布兰德正在大量实验如何利用AI帮助你识别假设,甚至帮助你设计假设验证测试。
And then I know David Bland is experimenting a ton with AI in helping you identify assumptions and even helping you identify assumption tests.
我认为这里也有巨大的潜力。
And I think there's a ton of potential there too.
我始终坚持的一点是:我坚信AI应该增强人类的工作,而不是取代人类的工作。
I just, what I fall back on is I'm a big believer in AI augmenting human work, not replacing human work.
因此,我强烈提醒人们,当他们在研究中使用AI时,一定要思考它如何加速你的工作,而不是取代你正在做的事情。
And so I really caution people that when they start to use AI in discovery to really think about how does it accelerate, but not replace what you're doing.
是的,这正是我一直建议人们的做法。
Yeah, that's what I keep advising people.
每个人都问我,阿卡什,我应该用AI来综合和汇报我的发现吗?
Everybody keeps asking me, Aakash, should I be using AI to synthesize and report out on my discovery?
我觉得这样做会阻碍你提升自己作为大语言模型的思维能力。
And I feel like what it's doing is it's preventing you from improving your own brain as an LLM.
对吧?
Right?
我们总想让大脑充满正确的上下文。
We keep wanting to load our brain up with the right context.
而持续发现的精髓在于,你通过与用户实际交流,为大脑注入了来自用户的丰富上下文。
And the magic of Continuous Discovery was that you were loading it up with this beautiful context from your users, from actually talking to your users.
如果你把这一切都外包出去,尤其是访谈部分,可能会失去一些上下文的构建过程。
And if you're outsourcing all of that, especially the interviewing part, you might lose some of that context building.
构建。
Building.
是的。
Yeah.
我知道,尤其是访谈整理非常耗时,需要大量工作。
I know that like especially interview synthesis is so time consuming and it takes a lot of work.
我对这个问题领域非常感兴趣,那就是如何让人工智能加速这项工作?
I'm actually really interested in this as a problem space of like how can AI speed up that work?
我担心的是完全用人工智能来自动化这项工作。
What I'm wary of is using AI to completely automate that work.
这背后是有充分理由的,对吧?
And there's a good reason for this, right?
我们都使用同样的人工智能工具,如果我们全都把工作外包给同样的工具,最终的差异化在哪里?
Like we all have access to the same AI tools, like what's eventually going to be your differentiator if we're just all outsourcing it to the same tools.
你可能会说,如果你进行了更优质的访谈并拥有更好的输入,那确实是一种差异化,但我还认为,人类在做这些工作时会有一些独特的价值。
Now you could argue if you're conducting better interviews and you have better inputs, yes, that is a differentiator, but I also think there's something that humans will do.
不仅仅是人类比人工智能做得更好,而是当我们亲自做这些工作时,人类自身会发生怎样的变化。
Not just that humans will do it better than AI, but how humans change when we do the work.
这是我最担心会失去的部分。
That's the part I'm most worried about losing.
所以我对那些能帮助我们做得更好、更快,同时仍让我们亲自动手、参与工作的工具很感兴趣。
So what I'm interested in is like what are the tools that help us do it better and faster, but still enable us to get our hands dirty and to do the work.
是的,也许AI能完成60%到80%的工作,但我们要确保之后亲自介入,做一些上下文构建,来提升效果并创造差异化的洞察。
Yeah, maybe AI can do 60 to 80% of it, but let's make sure that we then go in on top and do some of the work, do some of the context building to enhance it and also to create differentiated insights.
是的,完全正确。
Yeah, absolutely.
所以这就是第一条路径,即AI如何影响我们的工作流程。
So that's the first path, how AI affects our workflow.
你想聊聊当我们在构建AI产品时,探索方式会发生怎样的变化吗?
Do you wanna get into like how does discovery change when we're building AI products?
我马上就谈到那个,但我想再花一秒讨论工作流程,因为还有另一个关于AI的方面我想谈,那就是AI原型设计。
I wanna get there in just a second, but I want to stay on the workflow one more second because there's one other element of AI I want to talk about, which is AI prototyping.
一些播客嘉宾称之为功能工厂的黄金时代,因为高管们可以随便想出一个原型,发给你,然后说:‘你能做几轮用户测试,下周就把这个上线吗?’
And some podcast guests have called it the golden age of the feature factory because executives, you know, they can just come up with a prototype, they can send it your way, and they can say, hey, can you please do a couple user tests and then ship this into production next week?
是的。
Yeah.
这确实是最近我所接触的产品经理们面临的现实。
And that's actually a reality for PMs I'm talking to these days.
那么,你该如何正确地将AI原型设计融入这些发现流程中呢?
So how do you correctly pull AI prototyping into these discovery workflows?
是的。
Yeah.
好的。
Okay.
首先,我非常推崇AI原型设计。
So I'm gonna start by I am a huge fan of AI prototyping.
它真的让人感觉像魔法一样。
It really feels like magic.
我最近刚采访了17位产品经理,了解他们如何在工作中使用Lovable,这篇博客文章将于8月13日发布。
I actually just interviewed 17 product people about how they're using Lovable at work and we'll be publishing this blog post on August 13.
所以我们要深入探讨一下,人们在工作中是如何使用AI原型设计的。
So a huge deep dive on what are people doing at work with AI prototyping.
我最近被问了很多次:如果交付是免费的,我们还需要做需求探索吗?
I've been getting asked a lot like when delivery is free, do we still need to do discovery?
但我认为,当交付变得免费时,需求探索反而变得更加重要,因为如果我们什么都做、什么都往产品里塞,最终会让客户崩溃。
And I actually think when delivery is free, discovery becomes more important because if we're building anything and everything and just shoving it into our product, we are going to drive our customers nuts.
产品会失去一致性,用户会感到疲惫,功能也会过度膨胀。
There's gonna be no product coherence, they're gonna have change fatigue, we're gonna have feature bloat.
我想到的是《辛普森一家》里霍默自己设计汽车的那一集吗?
And what comes to mind is have you seen the Simpsons episode where Homer got to design his own car?
哦,我得听听这个故事,我还没看过。
Oh, I have to hear this story, I haven't.
是的,这是我对此最贴切的比喻。
Yeah, this is the best analogy I have for this.
我记不清细节了,因为可能二十、三十年前看过,但霍默·辛普森确实自己设计了一辆汽车。
So I forget the details because I saw this probably twenty, thirty years ago, but Homer Simpson got to design his own car.
你可以想象,那就是荷马·辛普森,对吧?
And you can imagine it's Homer Simpson, right?
所以他的椅子像个大躺椅,到处都是啤酒杯架,还有甜甜圈架。
So his chair is like a big recliner and he's got cup holders everywhere for his beers and he's got donut holders.
这简直是一辆荒谬的车。
And like, it's just a ridiculous car.
这看起来傻里傻气的。
It just looks silly.
这车可能根本开不了。
It probably doesn't even drive.
他只是在考虑所有他认为重要的东西,但实际上那些都不重要。
Like he's just thinking about all the things that like he thinks matter, but actually don't matter.
因为对于一辆车来说,最重要的是从A点到B点的出行功能。
Because actually getting from point A to point B is what matters most in the car.
我认为这正是如果我们允许任何人随意给产品添加功能时,可能会发生的情况的最佳比喻。
And I actually think this is the biggest, the best analogy for what might happen if we just let anybody tack on features to our product.
所以我认为,AI原型设计最棒的地方在于,我们现在可以极其迅速地进行假设验证,简直是按一下按钮就能完成。
So I think what I love about AI prototyping is we can now assumption test so quickly, like literally at the push of a button.
我们可以非常快速地进行高保真的假设验证,但我认为这并没有改变我们的探索工作,只是加速了一个非常重要的步骤。
We can do much higher fidelity assumption testing really quickly, but I don't think it changes our discovery work other than it speeds up a really important step.
我认为我们仍然需要谨慎地决定哪些内容放入待办列表,哪些真正要发布给客户。
I think we still want to be really considerate about what do we put in our backlog and what are we actually releasing to customers.
因为变化带来的最大影响是针对客户的。
Because the biggest impact of change is on our customers.
而不是针对我们的工程师。
It's not on our engineers.
当然,过去我们做探索可能是为了节省工程师构建错误功能的时间,但探索的真正目标是避免用错误的功能和频繁的变更让客户感到疲惫。
Like sure, we used to maybe do discovery to like save our engineers time from building the wrong thing, but the real goal of discovery was to not exhaust our customers with the wrong features and constant change.
所以给我讲讲,我想是整个生命周期,对吧?
So take me through, I guess the life cycle, right?
通常我们会先探索问题空间,提出我们的假设,验证这些假设,然后设计师可能会花时间制作一个原型,再把原型展示给客户。
Typically, we would explore the problem space, we would come up with our assumptions, we would test our assumptions, and then finally it might be worth the designer's time to create a prototype, and we'd put that prototype in front of customers.
现在,人们直接跳到原型设计阶段了。
Nowadays, people are just jumping straight to prototyping.
正确的生命周期应该是怎样的?
What is the actual life cycle it should be?
你该在什么时候将AI原型设计引入流程?
When should you bring the AI prototyping into the process?
你究竟应该如何使用它?
And how exactly should you use it?
本期节目由Miro赞助。
Today's episode is brought to you by Miro.
我问你一个问题。
Let me ask you something.
为了把一个项目顺利完成,你正在同时使用多少个工具?
How many tools are you juggling just to get a single project across the finish line?
一个用于头脑风暴,另一个用于规划,再用别的工具来跟踪任务。
One for brainstorming, another for planning, something else for tracking tickets.
这就是Miro的用武之地。
That's where Miro comes in.
它成为一个一体化的协作工作空间。
It becomes an all in one collaboration workspace.
无论你是整合多次用户访谈的研究成果,开发和提炼产品简报或线框图,还是管理项目开发,Miro都能让所有人置身于同一空间。
Whether you're consolidating user research from several interviews, developing and synthesizing product briefs or a wireframe, or project managing development, Miro brings everyone into the same space.
它快速、直观,且功能齐全,提供项目模板、双向Jira同步,以及与draw.io和PlantUML等软件的集成。
It's fast, intuitive, and fully loaded with features like project templates, two way Jira sync, and integration with software like draw.io and PlantUML.
Miro的AI功能可以将看板中的元素进行整合,几秒钟内生成一份可直接审阅的产品需求文档。
Miro's AI features can be used to synthesize elements in a board to develop a ready to review product requirements document in seconds.
如果你厌倦了标签过多和分散的工作流程,试试Miro吧。
If you're tired of tab overload and scattered workflows, try Miro.
前往miro.com,看看为什么超过九千万用户选择Miro来引导从想法到成果的全过程。
Head to miro.com and see why over 90 million users choose Miro to guide from idea to outcome.
本集节目由Jira产品发现版赞助。
Today's episode is brought to you by Jira Product Discovery.
如果你和大多数产品经理一样,你可能正在Jira中跟踪任务并管理待办列表。
If you're like most product managers, you're probably in Jira, tracking tickets and managing the backlog.
但交付之前的所有工作又该如何处理呢?
But what about everything that happens before delivery?
Jira产品发现功能帮助你将发现、优先级排序甚至路线图规划工作从电子表格转移到专为产品团队设计的工具中。
Jira Product Discovery helps you move your discovery, prioritization, and even roadmapping work out of spreadsheets and into a purpose built tool designed for product teams.
收集洞察,优先处理重要事项,并创建可轻松调整以适应任何受众的路线图。
Capture insights, prioritize what matters, and create roadmaps you can easily tailor for any audience.
由于它与Jira无缝集成,从想法到交付的所有内容都保持联动。
And because it's built to work with Jira, everything stays connected from idea to delivery.
Canva、Deliveroo甚至《经济学人》的产品团队都在使用它,立即访问atlassian.com/productdiscovery免费试用吧。
Used by product teams at Canva, Deliveroo, and even The Economist, check out why and try it for free today at atlassian.com/productdiscovery.
网址是atlassian.com/productdiscovery。
That's atlassian.com/productdiscovery.
Jira产品发现,打造正确的产品。
Jira Product Discovery, build the right thing.
是的
Yeah.
我喜欢在假设验证的语境下使用它。
I like it in the context of assumption testing.
那意味着什么?
So what does that mean?
这意味着我已经有了一个预期结果。
It means I already have an outcome.
我已经进行了三到四次访谈。
I've already conducted, let's say three to four interviews.
我草拟了我的机会空间。
I have a draft of my opportunity space.
随着我继续访谈,它还会继续演变。
It's going to continue to evolve as I keep interviewing.
我已经选定了一个目标机会。
I've chosen a target opportunity.
理想情况下,我已经为这个目标机会头脑风暴出了多个解决方案。
Ideally, I've brainstormed multiple solutions for that target opportunity.
所以我们正在建立一个良好的对比决策框架。
So we're setting up a good compare and contrast decision.
现在有了AI原型设计,我可以说,特蕾莎,我其实一天内就能做出这三个原型。
Now with AI prototyping, you might say, Teresa, I can actually prototype all three in a day.
那我为什么不直接这么做呢?
So why wouldn't I just do that?
整个想法测试面临的挑战在这里。
Here's the challenge with whole idea testing.
好的,太棒了。
Okay, great.
我们基本上已经把原型制作三个想法的成本降到了几乎为零。
We just basically made it free, or almost free, to prototype three ideas.
所以我有了三个高保真、可交互的原型,那为什么不能直接去测试它们呢?
So I've got three high fidelity interactive prototypes, so why can't I just go test those?
首先,用客户来测试一个完整的想法需要很长时间。
Well, first of all, testing a whole idea with a customer takes a long time.
这是一次大规模的可用性测试。
It's a big usability test.
所谓的长时间,可能每个客户要花二十分钟,但这确实很长,我稍后会解释原因。
And by a long time, I mean it might be twenty minutes per customer, but that's a long time, and I'll explain why in a second.
所以我们已经在让客户感到疲惫了,因为我们要进行二十分钟的访谈,而且还要做很多次。
So we're already fatiguing our customers because we're going through twenty minute sessions, we're doing a lot of them.
那么在这些反馈环节中,我们能得到什么呢?
And then what do we get in that feedback session?
我们得到的是一些零散的反馈,关于想法的不同部分,取决于哪些部分引起了客户的共鸣或没有引起共鸣。
We're getting like random bits of feedback on different parts of the idea based on what resonated or didn't resonate with that customer.
这种反馈并不系统。
It's not very structured.
但如果我将我的想法拆解成底层的假设呢?
If instead I take my ideas and I break them down into underlying assumptions.
让我举个最近我在开发的一个例子。
So let's say that, I'll give you a recent example from something I'm building.
我正在开发一个面试教练工具,它可以给我的学生提供关于他们面试表现的反馈。
I'm building an interview coach, it gives my students feedback on how they conducted an interview.
在我构建这个解决方案时,遇到了一些技术上的挑战。
When I was like building out this solution, had some like technical challenges.
我的学生在课堂上通过Zoom的分组会议室进行面试,但Zoom不允许我录制每个分组会议室,这本身就是一个技术难题。
So my students conduct interviews in a Zoom breakout room during class, and Zoom doesn't let me record every breakout room; that's already a technical challenge.
我需要录制分组会议室的内容,然后将录音转成文字,再把文字内容传给教练系统。
I need recordings of the breakout room, I need to go from recording to transcript, and then I need to go from transcript to the coach.
所以当我测试这个想法是否可行时。
So when I'm like testing this idea, will it work?
我需要测试多个步骤。
I have a number of steps that I have to test.
学生会不会在主会议室停留足够长的时间,让我有权限启动录制?
Will students like stay in the main room long enough for me to give them permission to record?
当他们进入分组会议室时,会记得录制吗?
When they go to their breakout room, will they remember to record?
他们能顺利在电脑上找到录制文件吗?
Will they be able to find the recording on their computer?
他们愿意把文件提交给我的SRT文件提取工具吗?
Will they be willing to submit it to my SRT-extraction tool?
他们会然后复制吗?
Will they then copy?
这个过程中有很多障碍,简直是一场噩梦。
Like there's a lot of friction in this process, it's kind of a nightmare.
谢谢,Zoom。
Thank you, Zoom.
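The recording-to-transcript step in her pipeline is mechanical enough to sketch. This is a guess at what such a tool might do, not Torres's actual implementation: it strips SRT cue numbers and timestamp lines, leaving plain text to hand to the coach.

```python
import re

def srt_to_transcript(srt_text: str) -> str:
    """Collapse an SRT subtitle file into plain transcript text."""
    spoken = []
    # SRT cues are blank-line-separated blocks: index, timestamp, text lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        for line in block.splitlines():
            line = line.strip()
            # Drop cue indices ("12") and timestamp lines ("00:01 --> 00:04").
            if not line or line.isdigit() or "-->" in line:
                continue
            spoken.append(line)
    return " ".join(spoken)
```

Even a small utility like this is one of the "particulars" she tests in isolation: if students never get a usable transcript out of it, the coach downstream never gets exercised at all.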
因此,在测试这个想法时,我不会直接发布出去然后指望他们完成所有步骤,因为我不知道是哪个环节出了问题。
And so when I'm testing this idea, I don't just release it out there and hope they do all the steps, because I wouldn't get visibility into which step broke down.
但我可以逐一测试每个假设。
But I can test each individual assumption.
所以,我可以先和一组人说:好吧,今天我们来录制练习环节,以下是具体操作方式。
So I can first, like, with one group just be like, okay, we're gonna record our practice sessions today, here's how it's gonna work.
然后我会观察,是否需要把人从分组会议室里拉回主会场,因为只有在主会场我才能给他们授权。
And I'm going to see like, did I have to go pull people out of their breakout rooms to get them back to the main room because that's the only place I can give them permission.
对吧?
Right?
所以我正在系统地测试每一个环节,这样就能精确知道问题出在哪个步骤,以及我需要改进什么。
And so I'm methodically testing each of these individual pieces, so it tells me exactly where the breakdown is and exactly what I might need to iterate on.
而如果我只是直接发布产品,结果学生根本没有提交字幕,我就完全不知道流程中哪里出了问题,尤其是我还在用第三方工具,无法对Zoom进行埋点监控。
Versus if I just released a product and students didn't actually submit a transcript, I would have no idea where in that process it failed, especially because I'm using third-party tools; I can't instrument Zoom.
所以,尽管现在AI原型制作已经变得非常容易,我仍然希望为每个特定环节单独做原型,以便清楚知道流程中具体哪个环节出了问题。
So even though AI prototyping now makes it really easy to create a prototype, I still want to create a prototype for each particular, so I know exactly where in my process the breakdown happened.
因此,我正在单独测试每一个环节。
So I'm testing each particular in isolation.
好的。
Okay.
所以总结一下,看看我有没有理解对:我们使用AI原型的时机是在已经确定了机会空间、头脑风暴出多个解决方案之后,但并不想直接原型化整个解决方案。
So to recap that back to see if I got it right, where we want to use AI prototypes is after we already have identified the opportunity space, after we brainstorm several solutions, and we don't wanna just prototype the whole solution.
我们只想原型化该解决方案中的某个特定部分,以便围绕这个部分进行假设验证。
We wanna prototype a particular element of that solution in order to assumption test around that element of the solution.
没错。
Exactly.
人们在使用AI原型工具时,通常做错了什么?
And what are people doing wrong with AI prototyping tools?
你最常看到他们做什么?
What do you most often see them doing?
首先,很多人一开始就只有解决方案的点子。
Well, the first thing is a lot of people are just starting with solution ideas.
他们根本不清楚自己要解决的是什么客户问题。
They don't have any idea what customer problem they're solving.
这在所有产品管理中都非常普遍,对吧?
And this is really rampant across all product management, right?
我们很多人把这称为炫技综合征。
A lot of us, we think of it as shiny object syndrome.
我的工作是提出创意点子。
My job is to have a creative idea.
但我认为这是对产品管理的巨大误解——我们的工作是解决客户的需求。
And this I think is a really big misunderstanding of product management, like our job is to solve customer needs.
因此,我们始终应该在客户问题、客户痛点或客户需求的背景下开展工作。
And so we should always be working in the context of a customer problem, a customer pain point, a customer desire that we're working to address.
我认为,AI原型设计让人们更容易陷入炫技综合征,因为我只需简单输入一个半生不熟的想法,就能立刻把它变成现实。
I think AI prototyping makes it even easier to fall prey to shiny object syndrome because I can literally just type like, I had this half baked idea and now you can make it real.
我认为,保持一点自律很重要——停下来问问,我们这么做究竟是为了什么?
I think it takes a little bit of discipline to be like, no, hold on, what are we doing this for?
那么,继续围绕任何功能的持续发现这一基本点,PM们在非假设性访谈中常问的最糟糕的问题有哪些?
So staying a little bit more on these fundamentals of Continuous Discovery for any feature, what are the most common bad interview questions that PMs ask outside of the hypotheticals?
你的访谈教练,或者你在指导他人时,最常帮他们纠正的错误是什么?
What are the common mistakes that your interview coach or you when you're coaching people are fixing for them?
是的,假设我们正在进行一个基于故事的访谈。
Yeah, so let's assume we're conducting a story based interview.
所以人们已经知道要收集故事。
So people know already to collect a story.
因此他们一开始会问:'你能说说上次使用 Lovable 的经历吗?'
So they're starting with something like, tell me about the last time you used Lovable.
困难在于,人们并不擅长自然地讲述自己的故事。
What's hard is that people aren't good at telling their stories naturally.
因此,人们低估了'挖掘故事'这项技能所需付出的努力。
And so people underestimate how much work goes into the skill of, we use the verb excavating the story.
我喜欢这个动词,因为它是一个非常贴切的比喻,对吧?
So I love this verb because it's a really good analogy, right?
通常我们用小型手工工具,轻轻从发掘现场取出文物。
Like typically we excavate artifacts with like small hand tools and like gently pull out the artifact from like a dig site.
这实际上是对访谈的绝佳比喻,因为在对话中存在着一种五五分的默契。
This is actually a great analogy for interviewing because in a conversation there's this fifty-fifty norm.
我说一句,你说一句,我说一句,你说一句。
I say something, you say something, I say something, you say something.
在访谈中,我们实际上希望打破这种规范。
In an interview, we actually want to break that norm.
我们希望参与者多说一些。
We want the participant doing most of the talking.
所以我们必须传达:不,我们真的需要所有细节,对吧?
And so we have to communicate, no, we really want all the detail, right?
如果我说:‘讲讲你上次看Netflix的经历’,你可能会说:‘昨晚晚饭后我看了一部电影。’
So if I say like, tell me about the last time you watched Netflix, you're going to be like, I watched a movie last night after dinner.
而我必须把你带回到那个时刻:‘晚饭后,对吧?那我们回到你刚吃完晚饭的那一刻,告诉我你是怎么决定要看点什么的。’
And I have to actually situate you in that moment like, it was after dinner, okay, so let's put you back in that moment when you're finishing dinner, tell me about how you decided to watch something.
然后让我了解整个过程:你是怎么从餐桌走到客厅,打开电视,用了什么设备?
And then walk me through the whole narrative of like, you went from the dining room table to the living room and turned on the TV and what device did you use?
我会把整个故事完整地挖掘出来。
And like, I'm going to pull out the whole narrative.
而人们最大的错误是,他们一开始就问故事类的问题,但由于不知道如何挖掘故事、如何提取叙述,只能猜测发生了什么。
Whereas the biggest mistake people make is they start with the story based question, but because they don't know how to excavate the story, they don't know how to pull out that narrative, they start guessing what happened.
因此他们的提问都是在猜测:比如,你观看的时候是不是出了什么问题?
And so their questions are guesses of, oh, did something go wrong while you were watching?
或者你是不是看了推荐内容?
Or did you look at recommendations?
对吧?
Right?
而不是仅仅使用时间线索来引导。
Instead of just using like temporal prompts.
你最先做了什么?
What did you do first?
接下来发生了什么?
What came next?
因此,这种挖掘故事的技巧,我认为很多人严重低估了。
So there's sort of this skill of digging out a story that I think a lot of people underestimate.
当他们完成一次访谈后,如何正确地提炼出这次访谈的洞察?
And once they're done with an interview, what is the right way to capture the insights from that particular interview?
是的,好的。
Yeah, okay.
我想澄清一下,我教授的是一种访谈形式——基于故事的访谈。
So I want to clarify, I teach one form of interviewing, story based interviewing.
我之所以教授这种形式,是因为虽然它是一种需要练习的技能,但其核心理念很简单,所以我们都能理解,对吧?
The reason why I teach this is while it is a skill and it does require practice, the idea is simple, so we can all grok it, right?
而且我认为,这也是进行持续访谈的最佳方式。
It's also I think the best form of interviewing for continuous interviews.
对于每周都要访谈客户的产品团队来说,这种方式让我们能够持续深化对客户的理解。
For a product team that is interviewing a customer every week, it's allowing us to continuously invest in our understanding of customers.
不过,我之所以要补充这一点,是因为当我们开始讨论如何整合这些访谈内容时,这种整合的目标是为了支持周复一周的产品工作。
Now, the reason why I had to caveat that is when we start talking about how do we synthesize this interview, the goal of this synthesis is supporting week over week product work.
因此,这与用户研究团队可能进行的访谈,或者他们一次性整合十几份访谈的方式是不同的。
So this is different from interviews that your user research team might conduct or how they might synthesize a dozen interviews at once.
所以我想澄清一下这一点。
So I want to just clarify that.
但在这个情境下,如果我身处一个产品团队,每周都进行一次访谈,我希望做的就是记录下每次访谈中学到的内容。
But in this context, if I'm on a product team, I'm doing an interview every week, what I want to be doing is I want to be capturing what I'm learning from each interview.
因此,我区分单次访谈的总结和跨访谈的综合分析。
So I distinguish between single interview synthesis and then a cross interview synthesis.
我们正在看的这个访谈快照是一个单页模板,旨在帮助你总结单次访谈的内容。
So this interview snapshot that we're looking at, it's just a one page template and it's designed to help you synthesize a single interview.
我们在这里做了一些事情。
And we're doing a few things here.
我们从这里最重要的部分开始,也就是底部的经验地图。
We're starting with the biggest thing here is at the bottom, it's actually this experience map.
所以,当我收集一个故事时,我想找出这个故事中的关键时刻,并通过经验地图来记录它们。
So if I'm collecting a story, I want to identify what are the key moments in that story and how do I capture them in an experience map.
这将帮助我长期记住这个故事。
This is going to help me remember the story over time.
它能让我一眼看过去就能想起:哦,对,我记得这个故事。
It's going to allow me at a glance to look at the snapshot and be like, Oh yeah, I remember that story.
它还能帮助我发现不同故事之间的模式,并为机会解决方案树中的机会空间提供结构。
It's also going to help me find patterns across stories, and it's going to help me give structure to the opportunity space on an opportunity solution tree.
我认为这个快照中第二重要的部分是机会列表。
I'd say the next most important thing on this snapshot is the list of opportunities.
这正是让我们的访谈具有可操作性的关键。
This is what's making our interviews actionable.
那么,在这次访谈中,我听到了哪些机会、哪些未被满足的客户需求、痛点和愿望呢?
So what opportunities, what unmet customer needs, pain points, and desires did I hear in this interview?
我会尽可能多地记录下来。
And I'm capturing as many as I can.
而这里剩下的元素主要是帮助我快速获取一些基本信息,比如我的客户群体数据。
And then the rest of the elements here are just helping with like the quick facts are sort of my customer segment data.
那么,我该如何把这个具体的故事放在正确的背景下呢?
So how do I put this specific story in the right context?
这是一个活跃的客户吗?
Is this an engaged customer?
这是一个新客户吗?
Is it a new customer?
他们是大型企业吗?
Are they a large enterprise?
他们是中小企业吗?
Are they a small business?
这些快速信息会因企业而异,但你会确定大约五六条,最多十几条关键信息,帮助你将这个故事置于特定的背景中。
These quick facts are going to vary business to business, but you're settling on half a dozen, maybe up to a dozen kind of quick facts that help you situate the story in a specific context.
照片和引述再次起到记忆辅助的作用。
The photo and the quote, again, are memory aids.
我喜欢提取一句突出的引述,提醒我在这次访谈中学到了什么。
So I like to pull out a salient quote that just reminds me what did I learn in this interview.
这有点像一种记忆技巧。
This is kind of a memory trick.
就像我会告诉你,我仍然记得多年前做过的访谈,就是因为那些突出的引述。
Like I will tell you, I still remember interviews I did years ago because of the salient quote.
所以这只是另一种快速查看访谈摘要的方式,让人立刻想起:对,我记得这个人。
So it's just another way to quickly look at the interview snapshot and be like, yeah, I remember that person.
当然,这些洞察力是我们记录下来的其他笔记,但它们还不能直接付诸行动。
And then of course, the insights is just other notes that we've captured that they're not quite actionable yet.
我们还不确定它们该用在哪里,但我们不想丢失它们。
We're not sure where they go, but we don't want to lose them.
所以这是针对单次访谈的。
So this is for a single interview.
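As an editorial aside, the snapshot template described above can be pictured as a small data structure. This is a hedged sketch only: the field names and example values are illustrative assumptions, not Teresa's actual template.

```python
from dataclasses import dataclass, field

@dataclass
class InterviewSnapshot:
    """One-page summary of a single story-based interview (illustrative fields)."""
    participant: str
    quick_facts: dict                                   # segment data: engaged? new? company size?
    salient_quote: str                                  # memory aid for the whole interview
    experience_map: list = field(default_factory=list)  # key moments of the story, in order
    opportunities: list = field(default_factory=list)   # unmet needs, pain points, desires
    insights: list = field(default_factory=list)        # notes that aren't actionable yet

# Hypothetical example, echoing the Netflix story from earlier in the episode.
snapshot = InterviewSnapshot(
    participant="P-042",
    quick_facts={"segment": "small business", "tenure": "new customer"},
    salient_quote="By the time we pick something, I'm too tired to watch it.",
    experience_map=["finished dinner", "turned on the TV", "browsed titles", "picked a movie"],
    opportunities=["Choosing what to watch takes too long"],
)
```

The experience map and opportunities are the actionable core; the quote and quick facts exist mainly so the snapshot is memorable at a glance.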
另一个场景下会是什么样子?
How does it look for the other scenario?
是的,当我们跨访谈进行综合时,我喜欢在机会解决方案树的框架下进行这项工作。
Yeah, when we're synthesizing across interviews, this is where I like to do this in the context of my opportunity solution tree.
所以我会从我的目标开始。
So I'm starting with my outcome.
我正在浏览我的访谈快照,但并非所有快照中的机会都会进入我的树状图,我的结果会起到过滤作用。
I'm looking across my interview snapshots, but not all the opportunities from my snapshots go on my tree; my outcome acts as a filter.
那么,这些机会中哪些有可能推动结果的实现呢?
So which of these opportunities do I think have the potential to drive the outcome?
这些就是我会移到树状图中的机会。
And those are the opportunities that I'm moving over to my tree.
然后,我会使用体验地图为机会空间提供结构。
And then I'm using the experience maps to give structure to the opportunity space.
所以我教团队的是,一旦你完成了三到四次访谈,就可以开始查看每个故事的体验地图了。
So what I teach teams is once you have three or four interviews done, you can start to look at your experience maps for the individual stories.
接着,你要寻找的是能涵盖所有故事的体验地图。
And then what you're looking for is the experience map that encompasses all of the stories.
这就像是一个超级体验地图。
So it's like a super experience map.
而这个超级体验地图中的各个时刻,对应着你的顶层机会。
And then the moments in that super experience map map to your top level opportunities.
这非常重要,因为它确保了你树状图中的每个分支都是独立的,因为每个分支代表故事中的一个不同时刻,因此出现的需求和痛点都会与该时刻紧密相关。
And this is really important because it guarantees that each branch in your tree is distinct, because each branch represents a different moment in a story, and so the needs and the pain points that show up will be specific to that moment.
这有助于减少我们各个机会之间的依赖关系,并让我们能够一次专注于一个机会。
This helps to reduce dependencies between our opportunities, and allows us to work on one opportunity at a time.
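The structure Teresa describes, an outcome at the top, one branch per key moment of the "super" experience map, and opportunities filtered by the outcome, can be sketched roughly. All labels below are invented examples, not from the episode.

```python
# Minimal opportunity solution tree: an outcome, and one branch per key moment
# of the "super" experience map. All labels are made-up examples.
tree = {
    "outcome": "Increase weekly viewing sessions",
    "branches": {
        "deciding what to watch": ["I can't choose when I'm tired"],
        "watching": ["Subtitles drift out of sync"],
    },
}

def add_opportunity(tree: dict, moment: str, opportunity: str, drives_outcome: bool) -> bool:
    """The outcome acts as a filter: only opportunities that could drive it go on the tree."""
    if not drives_outcome:
        return False  # heard in an interview, but kept off the tree
    tree["branches"].setdefault(moment, []).append(opportunity)
    return True

add_opportunity(tree, "deciding what to watch", "Recommendations feel stale", True)
add_opportunity(tree, "watching", "Wants a dark mode for the billing page", False)
```

Because each branch is anchored to a distinct moment in the story, opportunities land in exactly one branch, which is what keeps the branches independent.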
你谈到的这一点非常微妙,正是人们在持续开展需求发现过程中会逐步培养的一项技能——如何更新机会解决方案树。
You're speaking to a part that I think is very nuanced and is part of the skill that people are going to develop as they do more Continuous Discovery, which is how to update the opportunity solution tree.
你能再为我们详细解释一下吗?
So can you go in a little bit more depth for us?
比如,我们是如何在这里添加内容的?
Like, how are we adding things here?
我们又是如何将它们划掉的?
How are we crossing them off?
优化这个机会解决方案树的常规节奏是怎样的?
What is the typical cadence looking like to refine this OST?
是的,我要强调的第一点是,这确实是一个动态更新的文档。
Yeah, so the first thing I'll highlight is this really is a living document.
这并不是一次性的活动。
It's not a one time activity.
它有前置条件,你需要一个成果。
It does have prerequisites, you need an outcome.
我建议你先进行至少三到四次以故事为基础的客户访谈,并为这些访谈做快照,这样你已经完成了体验映射,并且已经识别出了机会。
I recommend you start with at least three to four story based customer interviews with snapshots for those interviews, so you've already done your experience mapping, you've already identified your opportunities.
然后你可以开始起草你的机会空间,我再怎么强调也不为过,这只是一个初稿。
And then from there, you can start to work on your first draft of the opportunity space, and I cannot emphasize enough it is a first draft.
因此,我们再次根据故事中的关键时刻来发现这些顶层机会,接着可以把所有与我们的成果相关的个体机会,归类到它们发生的时刻之下。
And so again, we're finding those top level opportunities based on the key moments across our stories, we can then take all the individual opportunities we're hearing that are related to our outcome and just group them under the moment in which they occurred.
所以我们可以问:对于这个机会,它是在故事的哪个时刻出现的?
So we can say for this opportunity, what moment in the story did it emerge?
然后对于每个分支,我们将开始寻找机会之间的关系,哪些是子节点,哪些是父节点,我们可能需要添加一些父节点来为机会空间提供结构。
And then for each branch, we're going to start to look for relationships between the opportunities, which ones are children, which ones are parents, and we probably add some parents to give structure to the opportunity space.
我们来谈谈理想的时间线:如果我在一个季度初启动一个新的成果,第一周我会集中进行三到四次访谈,到第一周末,我就有了机会空间的初稿。
Let's talk about an ideal timeline: if I'm starting a quarter with a new outcome, in week one I'm probably front-loading three to four interviews, so by the end of week one, I have a draft of my opportunity space.
这让我在第二周就能确定一个目标机会,开始头脑风暴解决方案,并在第二周就开始验证假设。
That sets me up in week two, I can choose a target opportunity, I can start brainstorming solutions, I can start assumption testing as soon as week two.
我会每周持续进行访谈,每三到四次访谈后,我会重新审视我的机会空间。
I'm going to continue to interview every week, and every three to four interviews, I'm going to revisit my opportunity space.
如果我按季度工作,每周进行一次访谈,那么我大约每三到四周就会修订一次机会空间。
So if I'm working on a quarterly basis, if I'm doing one interview a week, then I'm roughly revising my opportunity space every three to four weeks.
太棒了。
Beautiful.
我觉得没多少人真正理解这一点。
I don't think enough people are grokking that.
它是一个迭代性的文档,同时也是与利益相关者沟通的工具。
It's an iterative document and it's also a communication tool for your stakeholders.
完全正确。
Absolutely.
那么,产品经理在进行虚假探索时有哪些迹象?
So what are the signs that PMs are doing fake discovery?
他们待办列表中的内容没有任何变化。
Nothing in their backlog changes.
他们不会放弃任何想法。
They don't kill any ideas.
他们最终总是会构建最初的想法。
They always end up building what they started with.
他们没有考虑多种解决方案。
They're not considering multiple solutions.
我的意思是,外面有很多虚假的探索行为,其中一部分只是因为缺乏专业知识。
I mean, there's a lot of discovery theater out there, which some of this is just lack of know how.
我不认为团队是故意装模作样的。
I don't think teams are like, like intentionally putting up a facade.
我实际上从未见过有团队在欺骗系统。
I've actually never seen a team that's cheating the system.
我认为更多是因为我们需要个人做出改变,才能让这一切真正奏效。
I think it's more like, there's personal work we have to do to make this work.
我们必须能够稍微放下自己的自尊。
We have to be able to set aside our ego a little bit.
我们必须与自己最钟爱的想法保持距离。
We have to distance ourselves from our favorite idea.
我们必须带着好奇心去进行访谈,尤其是在同一个主题上进行第七次访谈时。
We have to come and do an interview with curiosity, especially when it's the seventh interview on the same topic.
我们必须认识到每个客户都是独特的,并积极挖掘这位客户独有的特点。
We have to recognize that each customer is unique and actively dig for what's unique about this customer.
我认为,许多产品经理以及产品三人组中的其他成员,并没有获得他们完成这项工作所需的指导或支持。
And I think plenty of product managers and everybody else on the product trio aren't getting the coaching they need or the support they need to do that work.
许多组织环境也没有为他们提供这样的空间。
Plenty of organizational contexts don't give them the space to do that.
我们仍然奖励那些被认为正确且持有强烈观点的人。
We still reward people for being right and for having strong opinions.
因此,我们并没有奖励他们的好奇心,也没有奖励那些表明我们真正学到了新东西的失败。
And so we're not rewarding them for being curious and we're not rewarding failures that indicate we actually learned something new.
所以我不把这一切完全归咎于产品经理。
So I don't put this entirely on product managers.
我认为我们的许多组织环境并不支持这种工作方式,但我确实认为,即使组织不支持,个人也能取得比他们想象中更多的进展。
I think a lot of our organizational context doesn't support this way of working, but I do think individuals can make more progress than they think they can, even if their organization doesn't support it.
你的AI产品是否在消耗预算却毫无成果?
Is your AI product burning budget without results?
被各种框架淹没,却无法提升LLM性能?
Drowning in frameworks, but can't improve your LLM performance?
这正是Parlance Labs存在的原因,今天的播客合作伙伴。
That's exactly why Parlance Labs exists, today's podcast partner.
他们是一家由Hamel Husain领导的纯工程师AI咨询公司。
They're an engineer-only AI consulting firm led by Hamel Husain.
没有幻灯片,只有切实可行的解决方案。
No slide decks, just practical solutions.
他们已帮助30多家公司,如Honeycomb、dbt Labs,甚至LangChain,通过系统性评估显著提升了AI能力。
They've helped 30 plus companies like Honeycomb, dbt Labs, and even LangChain dramatically improve their AI through systematic evaluation.
这就是他们的独特之处。
Here's what makes them different.
他们不希望被依赖。
They don't want dependency.
他们会教你团队如何评估AI系统并优化性能,以便你准备好时可以解雇他们。
They teach your team to evaluate AI systems and optimize performance so you can fire them when ready.
准备好停止猜测,开始提升你的AI了吗?
Ready to stop guessing and start improving your AI?
访问 parlancelabs.com。
Visit parlancelabs.com.
那就是 parlancelabs.com。
That's parlancelabs.com.
或者发送邮件至 consulting@parlancelabs.com。
Or email consulting@parlancelabs.com.
在我们深入之前,先聊聊每个产品经理都会面对的问题:如何在产品决策上达成共识。
Before we dive deeper, let's talk about something every PM faces: getting alignment on product decisions.
你有没有过这种感觉:当你试图向工程师解释用户流程,或向管理层说明设计决策时,只能用手比划着描述?
You know that feeling when you're trying to explain a user flow to engineering, or justify a design choice to leadership, and you're just describing it with your hands?
这就是Mobbin的用武之地。
That's where Mobbin comes in.
Mobbin是全球最大的真实移动和网页应用设计库,涵盖Airbnb、Uber和Pinterest等业界领先应用的设计。
Mobbin is the world's largest library of real world mobile and web app designs from industry leading apps like Airbnb, Uber, and Pinterest.
你无需花数小时截图或四处寻找灵感,可以立即找到成功产品如何处理注册流程、付费墙、结账流程等你面临的任何问题。
Instead of spending hours taking screenshots or hunting for inspiration, you can instantly find how successful products handle onboarding, paywalls, checkout flows, whatever you're facing.
超过170万产品从业者使用Mobbin来对标行业顶尖产品,并向团队展示经过验证的解决方案。
Over 1,700,000 product builders use Mobbin to benchmark against best in class products and show their teams proven solutions.
无论你是需要说服利益相关者采用更好的用户激活方式,还是研究顶级应用的功能发现方式,Mobbin都能为你提供视觉证据,支撑你的产品决策。
Whether you need to convince stakeholders there's a better way to handle user activation or research a top app's approach to feature discovery, Mobbin gives you the visual proof to back up your product decisions.
访问mobbin.com/aakash,也就是mobbin.com/aakash,享受首年8折优惠。
Check out mobbin.com/aakash. That's mobbin.com/aakash, and get 20% off your first year.
如果你每周不与客户交流,你真的在做真正的产品管理吗?
If you're not talking to customers weekly, are you really doing real product management?
是的
Yeah.
过去,我曾说不,而且对此持非常非黑即白的态度,但我的观点已经软化了。
In the past, I've said no, and I've been very black and white about this, but I've softened my answer.
我认为这是因为我们行业现在充斥着许多有毒的信息:如果你没有做所有正确的事,你就不是真正的产品经理。
And I think it's because there's a lot of toxic messages in our industry right now where if you're not doing all the right things, you're not a real product manager.
这是我现在的答案:如果你在做公司期望你做的事,你就是一名产品经理。
Here's my answer now, if you're doing what your company expects you to do, you are a product manager.
从根本上说,这是我们的底线。
Fundamentally, that's our floor.
你的工作就是做公司期望你做的事,仅此而已。
Your job is to do what your company expects you to do, period.
我们能做得更多吗?
Can we do more than that?
我认为可以,而且我真心觉得每个人拥有的自主权都比他们意识到的要多。
I think so, and I really think everybody has more agency than they realize.
所以我希望鼓励人们即使没有得到组织的支持,也要勇敢地迈出这一步,但我并不想告诉他们他们做的是错的。
And so I want to encourage people to step into this even if they're not getting organizational support for it, but I don't want to tell them what they're doing is wrong.
我觉得最近发生了一些变化,感觉每个人都在打造个人品牌,而LinkedIn已经变成了一个充斥着各种声音的有毒平台,大家都在告诉你你把工作做错了。
I feel like something has changed recently, I feel like it's everybody's building their personal brand and LinkedIn has become this toxic dump of everybody telling you you're doing your job wrong.
我真的不想助长这种风气。
And I really don't want to contribute to that.
我认为我们每个人都有提升的空间,还有更多我们可以去做的事情。
I think we could, all of us have room for improvement and there's more that we could be doing.
但从根本上说,我们的工作就是完成公司对我们的期望。
But fundamentally, our job is to do what our company expects us to do.
我喜欢这个观点。
I love that message.
现在有太多人在告诉你,如果你想成为优秀的PM而不是普通的PM,就像本·霍洛维茨二十五年前写的一篇很棒的文章,当时确实很有共鸣,但我们现在能不能别再重复这种说法了?
There's way too many people telling you, if you wanna be a great PM versus a good PM. I think it's like, yes, Ben Horowitz wrote a great piece of content twenty-five years ago and it resonated then, but can we please move on from that messaging?
现在我想谈谈AI和产品发现的另一面。
So I want to touch on the other side now of AI and Product Discovery.
AI功能的产品探索是什么样的?
What does Product Discovery look like for an AI feature?
是的,我本人也在学习这个。
Yeah, I'm just learning about this myself.
所以我来分享一下,现在是七月下旬。
So I'll share that, let's see, it's late July.
我刚开始构建我的第一个AI产品,已经四个月了,这完全让我意识到AI产品的项目管理有多么不同。
I am about four months into building my first AI product, and it's completely opened my eyes to just how different product management is for an AI product.
我们先来看看有哪些不同之处,然后再讨论探索在这个过程中如何发挥作用。
So let's first start with the differences of what's different and then we can get into where does Discovery fit in there.
我认为有几个方面是不同的,而这个话题让我感到困难的是,我还在努力区分哪些是产品工作,哪些是工程工作。
So I think there's a few things that are different and I think what's hard for me about this topic is I'm still struggling to separate what's product work and what's engineering work.
我确实认为AI产品会让我们的角色比以往更加模糊,这其实让我非常兴奋,因为我喜欢角色界限的融合。
And I really think AI products are going to blur our roles a lot more than they have in the past, which I'm actually really excited about because I love it when our roles blur.
我一向是个跨越边界的人,所以我希望看到更多这样的融合。
I've always been a boundary spanner, and so I think I want to see more of that.
但我们来谈谈这个。
But let's talk about this.
我实际上正在写一篇博客文章,主要是为了帮助自己理清思路,试图理解AI产品与确定性产品或功能相比有哪些关键的不同之处。
I'm actually working on a blog post where I'm just doing this for my own sense making, trying to understand what are the big components of an AI product that are different from a deterministic product or feature.
所以第一个是,我知道人们开始使用‘上下文工程’这个术语,我觉得这个说法比‘提示工程’稍微好一点。
And so the first one is, I know people are starting to use this term context engineering, which I like a little bit better than prompt engineering.
我觉得‘提示工程’的问题在于,我们都有过与ChatGPT或Claude聊天的经验,那是一种对话过程,如果我们第一个提示没写好,可以立即调整。
Here's the challenge I see with prompt engineering: we all have experience chatting with ChatGPT or Claude, and we're in a conversation; if we get that first prompt wrong, we can immediately refine.
因此,当我们想到‘提示工程’时,就会觉得:哦,我们每天都在用这些工具聊天,自然就知道怎么做了。
And so when we think prompt engineering, we're like, oh, we know how to do that because we talk to these tools all day every day.
但当你在构建一个产品时,提示无法由用户在使用过程中进行调整——当然,在开发阶段你可以调整,但一旦产品上线,就无法再修改了。
But when you're building a product, the prompt can't be refined by you. I mean, it can be as you're developing, but once it's live in your product, there's no refinement.
这是一次性使用的。
It's a one shot.
所以这个提示必须有效。
So that prompt has to work.
所以,这里有一个技能,就是我们该如何编写提示词?
So there's this skill of like, how do we write prompts?
我们该如何指导大语言模型,在成千上万次实例中可靠地完成相同任务?
How do we instruct an LLM to reliably do the same thing over thousands and thousands of instances?
我认为人们低估了这个问题的难度,这和我只是和ChatGPT聊天是完全不同的问题。
And I think people underestimate how hard this is, and this is a very different problem than just chatting with ChatGPT.
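To make the contrast concrete, here is a hedged sketch of what a "one shot" production prompt looks like: a fixed template filled per request, with no human there to refine it. The wording, function name, and rules are illustrative assumptions, not Teresa's actual prompt.

```python
# A chat prompt gets refined turn by turn; a production prompt runs once per
# request, unseen, across thousands of inputs, so it must anticipate bad input.
COACH_PROMPT = """You are an interview coach.
Grade the transcript below on ONE dimension: {dimension}.
Rules:
- Quote at least one excerpt from the transcript verbatim.
- Return exactly two sections: "What worked" and "What to improve".
- If the input is empty or not an interview transcript, say so and stop.

Transcript:
{transcript}
"""

def build_prompt(dimension: str, transcript: str) -> str:
    # No human in the loop after this point: the template is the whole interface.
    return COACH_PROMPT.format(dimension=dimension, transcript=transcript)

prompt = build_prompt("temporal prompts", "Interviewer: what did you do first?")
```

The guardrail rules baked into the template are doing the work that, in a chat session, you would do yourself on the second turn.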
所以我认为这是其中一部分,我们可以深入探讨,以及如何通过探索来指导这一点。
So I think that's one piece and we can get into that and how discovery can inform that.
我认为第二部分是,在产品工作中,我们始终面临分解任务。
I think a second piece is always in product work, we have this task of decomposition.
我们有一个宏伟功能的愿景,该如何逐步实现它?
We have this vision of this big feature and how are we going to get there over time?
但我认为,对于AI产品来说,分解任务是双重的。
But I think with AI products, decomposition task is twofold.
我们该如何逐步实现它?
How are we going to get there over time?
但同时,我们如何协调任务,如何将任务拆分,以便让大语言模型能够很好地完成它们?
But also, how are we orchestrating, how do we break up the tasks so that LLMs can be good at it?
在我的学习过程中,我经历了这一点:我的面试教练最初只是一个提示,现在已扩展为七个提示,而且还在快速增长。
So I went through this on my own learning journey, my interview coach started as one prompt and now it's seven prompts, and it's quickly growing.
这就引出了协调的问题,这有点像工程,但我认为它也是产品的一部分。
And so then it introduces this orchestration question, which is a little bit engineering, but I think it's also product.
在这里,我们会涉及到诸如应该使用什么工具、服务器,以及我们是否在使用RAG等问题。
And this is where we get into things like which tools should we use, servers, are we using RAG?
这有点混乱,我就直接称之为协调。
There's sort of this messy, I'm just going to call it orchestration.
然后是第三部分:可观测性,我们是否在收集追踪数据?是否以符合伦理的数据实践方式在做?是否告知了我们的用户?
And then there's a third piece of observability, are we collecting traces, are we doing that in an ethical data practice way, are we informing our customer?
但我们必须存储追踪数据。
But we have to store traces.
对于不了解的人,追踪数据就是大语言模型的提示加上其响应,也就是大语言模型与最终用户之间的所有交互过程。
And for those that don't know, a trace is just a LLM prompt plus the response, so all the back and forth between the LLM and the end user.
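A trace, as defined here, is easy to picture in code. A minimal sketch follows; the field names and the `log_trace` helper are assumptions for illustration, not any particular vendor's schema.

```python
import time

def log_trace(store: list, prompt: str, response: str, model: str) -> dict:
    """Record one LLM interaction: the input, the output, and when it happened."""
    trace = {
        "ts": time.time(),
        "model": model,
        "prompt": prompt,
        "response": response,
    }
    # In production this would go to a trace store, ideally with user consent.
    store.append(trace)
    return trace

traces = []
log_trace(traces, "Grade this transcript on: temporal prompts",
          "What worked: ...", model="example-model")
# Error analysis later means reading these traces to see where the LLM broke down.
```

The point of keeping both sides of every exchange is that, unlike a deterministic feature, you can't reproduce a failure without the exact input that caused it.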
然后我认为,这部分的最后一点是如何评估质量,这就是评估(Evals)发挥作用的地方。
And then I think the last piece of this is like how are we evaluating quality, and this is where Evals come in.
因此,我认为总体来看,一个功能的持续维护与确定性功能非常不同。
And so I think broadly, there's also the ongoing maintenance of a feature, which I think looks very different than a deterministic feature.
所以,这是我开始思考的五个方面,构成了我们AI产品差异化的框架。
So those are the five buckets that I've started to noodle on in this framework of how our AI products are different.
我认为,探索过程可以影响这五个方面。
And I think discovery can inform each of those.
但在深入讨论之前,让我先停一下。
But let me pause there before I get into that.
让我总结一下。
So let me recap.
上下文工程、编排、可观测性、质量和维护。
Context engineering, orchestration, observability, quality, and maintenance.
我总结得对吗?
Did I get that right?
那我们先深入探讨一下上下文工程。
So let's start with a little bit of a deeper dive on context engineering.
这有什么不同?发现过程应该如何影响它?
How does that change, and how should discovery be playing into that?
是的。
Yeah.
所以我想先简单说明一下背景,我正在开发的AI产品是一个面试教练。
So just to give a little bit of context, the AI product that I'm building is an interview coach.
在我们的课程中,人们练习面试,提交他们的转录文本,AI教练会给出详细的反馈。
So in our courses, people practice interviews, they submit their transcripts and an AI coach gives them really detailed feedback.
比如提取出关键片段,提供指导建议。
Like it pulls out excerpts, it gives them coaching tips.
当我刚开始构建这个产品时,最初做实验时,我完全是在Claude项目里操作。
And when I started building this, when I first started experimenting, I was literally in a Claude project.
我上传了大量的课程内容。
I uploaded a lot of my course content.
我们课程中早已有一个评分标准,用于让学生互相对彼此的面试进行反馈。
We already had a rubric in the course that we gave to students so they could give feedback on each other's interviews.
我有了一个非常重要的洞察:在LLM产品中,上下文工程与教授人类技能非常相似,对吧?
And I had this really big insight that context engineering in the context of a LLM product is very similar to teaching humans a skill, right?
因此,教LLM做一件事,与教人类做一件事非常相似。
So to teach an LLM to do a thing is very similar to teach a human how to do a thing.
关键在于你何时给予它们什么样的上下文,以便它们能够很好地掌握这项技能。
It's really about what context do you give them and when, so they're able to do the skill well.
因此,我现在非常感兴趣的是,如何让我们的所有自然教师参与进来,帮助我们构建AI产品,但这稍微偏离了主题。
And so now I'm really interested in how do we get all of our natural teachers involved in helping us build AI products, but that's a little bit of a side tangent.
我们很多人一开始都以为:哦,我要把所有可能需要的东西都给它。
It's really like a lot of us start out with like, oh, I'm just going to give it everything it could possibly need.
但这样做的问题是,LLM会感到困惑。
And the challenge with that is that LLMs get confused.
我们知道人们经常谈论百万级token的上下文窗口,或者20万token的上下文窗口,但当你接近这个限制,甚至在还没达到这个限制之前,LLM就很难有效遵循上下文中的所有内容。
I know we talk about like million token context windows or 200,000 token context windows, but when you start to bump up against that limit, or even long before you hit that limit, the LLM is not good at following everything in that context.
因此,这一部分的关键在于,我该如何在正确的时间提供正确的上下文?
And so one of the keys of this first piece is, how do I give it the right context at the right time?
这就是为什么MCP服务器、RAG以及所有这些帮助我们在正确时间注入正确上下文的工具正变得越来越普遍。
And this is why like MCP servers and RAG and all these tools that are helping us feed in the right context at the right time are becoming more pervasive.
这同时也是智能体流程的重要部分:如何让智能体通过工具、MCP服务器或甚至将RAG作为工具来获取它所需的信息。
And this is also a big part of the agentic flows, is how do you equip the agent to pull in the right information it needs, either through tools or through MCP servers or even RAG as a tool.
如果人们对这些术语还不熟悉,四个月前,这些对我来说也全是缩写迷宫。
And if people aren't familiar with those terms, like four months ago, all of this was like acronym soup in my brain.
实际上,MCP服务器就是一个工具,智能体可以向另一个工具请求信息,对吧?
Really, a tool is just something an agent can request information from, right?
这个工具可以是本地系统中的,也可以是通过第三方提供的,这就是MCP服务器发挥作用的地方。
And that tool could be local in your system, or it could be through a third party, which is where an MCP server comes in.
而RAG就是,你可以有一个文档数据库,它通过搜索来检索相关信息。
And then RAG is just, like, you could have a database of documents and it's retrieving from them; think of it like a search.
它只是为该查询或输入检索出正确的上下文。
It's just retrieving the right context for that query, for that input.
因此,在上下文工程这一步中,我会把所有这些信息都纳入其中,以确保提供正确的信息,帮助大语言模型出色地完成这项任务。
And so I put all of those in that context engineering step, of like, what is the right information to help the LLM do this task really well.
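As an aside, the retrieval idea above, giving the model the right context at the right time rather than everything at once, can be sketched in a few lines. Real RAG systems rank by embedding similarity; plain keyword overlap stands in for that here, and the document names and contents are made up.

```python
import re

# Toy retrieval step: rather than stuffing every document into the context
# window, pick only what is relevant to this request.
DOCS = {
    "rubric-story": "A good interview opens with a story prompt: tell me about the last time.",
    "rubric-temporal": "Use temporal prompts: what did you do first, what came next.",
    "billing-faq": "Invoices are sent on the first day of the month.",
}

def tokens(text: str) -> set:
    return set(re.findall(r"[a-z']+", text.lower()))

def retrieve(query: str, docs: dict, k: int = 2) -> list:
    """Return the names of the k documents sharing the most words with the query."""
    q = tokens(query)
    ranked = sorted(docs.items(), key=lambda item: -len(q & tokens(item[1])))
    return [name for name, _ in ranked[:k]]

context_ids = retrieve("how should I prompt for what came next in the story", DOCS)
```

Only the two rubric documents are selected; the irrelevant billing FAQ never reaches the model, which is the whole point of the retrieval step.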
你接下来提到的步骤是编排,但很少有人谈论这一点。
The next step you highlighted is orchestration, and not many people talk about this.
就连我也不太清楚,发现阶段在这里究竟处于什么位置?
Even I'm a little unclear in terms of where does discovery come in there?
是的。
Yeah.
好的。
Okay.
实际上,我们在提示部分并没有深入探讨发现环节,但我认为发现环节对提示和编排都有重要影响:如果我不了解客户的需求,就无法教会大语言模型如何帮助客户。
So actually, we didn't really get into discovery on the prompt piece, and I think where discovery informs the prompt piece, and this will also apply to orchestration, is I can't teach an LLM how to do a thing that's supposed to help a customer if I don't know what the customer needs.
所以我以我的面试教练为例。
So fundamentally, I'll use my interview coach as an example.
如果我不知道学生在面试中常犯哪些错误,我又该如何给他们反馈呢?
How do I give a student feedback on their interviews if I don't know what mistakes students are making?
对吧?
Right?
所以,我有我的教学经验,我知道一场好的面试是什么样子的,我可以从这里开始。
So like I have my teaching knowledge, I know what a good interview looks like, and I can start there.
但一旦我接触到真实数据,开始看到学生真实的面试表现,我就开始发现:哦,他们犯的是这些错误,而不是那些错误。
But as soon as I hit real data and I start seeing real student interviews, I start to learn like, oh, they're making these mistakes, not these mistakes.
所以现在,我的教练需要了解这些类型的错误,并准备好针对这些错误提供反馈。
So now my coach needs to know about those types of mistakes and needs to be prepared to give feedback on those types of mistakes.
因此,即使是在这个上下文工程环节中指导教练的方式,也深受我在面试中或通过观察学生作业时发现的机会所影响。
So even just how I'm instructing the coach in that context engineering piece is really informed by the opportunities I'm identifying in my interviews or by looking at students work.
当我们进入编排阶段时,我建议人们从最简单的方案开始,然后通过可观测性(这是下一步),当他们开始发现错误时,这可能会对编排产生影响。
When we get into orchestration, this is where I recommend people start with the simplest solution to start, and then through their observability, which is the next step, when they start to identify errors, then that is going to maybe affect their orchestration.
所以我来举个例子。
So I'll give an example of this.
我的面试教练最初真的就是一个提示。
My interview coach started literally as one prompt.
它只是一个很长的文档,说明了我们如何评分面试。
It was just one long document of here's how we grade an interview.
我有一个评分量表,从七个不同维度对面试进行评分,但我发现教练在这些维度之间开始感到困惑。
And I have a rubric where I grade interviews on seven different dimensions, and what I found was that the coach was starting to get confused across the dimensions.
比如,它在评估一个维度时,却引用了另一个维度的指示。
So like it'd be grading one dimension, but it would be pulling in instructions from another dimension.
我当时就想,好吧,我的提示虽然能放进上下文窗口里,但它变得太庞大了,大语言模型无法有效组织和理清内容。
And I was like, okay, it's great that my prompt fits in the context window, but it's getting so big, the LLM isn't keeping it organized and keeping it straight.
我认为这是我经常看到人们提到的一个普遍现象:当我们给大语言模型分配简单任务时,它们的表现会更好。
I think this is a general thing that I see people write about a lot is that LLMs perform better when we give them a simple task.
所以我决定,好吧,我有七个维度,那就把每个维度拆分成独立的大语言模型调用。
So I was like, okay, I have seven dimensions, I'm going to break each of those dimensions into their own LLM call.
现在我有了一个工作流程。
So now I have a workflow.
原本只是一个提示,现在变成了一个工作流程:我将同一份转录文本发送给七个不同的大语言模型调用,然后需要整合它们的响应。
So what started as one prompt now becomes a workflow where I take the same transcript and I send it to seven different LLM calls, and then I have to orchestrate the response.
所以在把回复发给学生之前,我会先整理这些回应。
So I'm collating the responses before I send it to the student.
所以,工作流就是一系列有组织的LLM调用。
So, workflow is just a series of orchestrated LLM calls.
这就是工作流协调的一个例子。
So, that's one example of orchestration.
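The one-prompt-to-workflow move Teresa describes can be sketched as follows. `call_llm` is a stand-in for a real model API call, and the dimension names are invented for illustration, not her actual rubric.

```python
# Sketch of the workflow above: the same transcript goes to one small, focused
# LLM call per rubric dimension, and the responses are collated before being
# sent to the student.
DIMENSIONS = [
    "opening with a story prompt",
    "using temporal prompts",
    "letting the participant talk",
    "avoiding hypotheticals",
    "avoiding leading questions",
    "digging for specifics",
    "closing the interview",
]

def call_llm(prompt: str) -> str:
    # Stand-in: a real implementation would call a model API here.
    first_line = prompt.splitlines()[0]
    return f"Feedback on {first_line}"

def grade(transcript: str) -> str:
    responses = []
    for dim in DIMENSIONS:  # one focused call per dimension, not one giant prompt
        prompt = f"dimension: {dim}\nGrade this transcript:\n{transcript}"
        responses.append(call_llm(prompt))
    return "\n\n".join(responses)  # collate before it reaches the student

report = grade("Interviewer: tell me about the last time you watched Netflix.")
```

Each call sees a task small enough that the model can't confuse one dimension's instructions with another's, which was the failure mode the single long prompt ran into.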
另一个协调的例子是,从长远来看,我可能会转向代理模型,让代理足够智能,能够查看转录文本,也许还了解一些我的学生的背景信息。
Another example of orchestration is I could see in the long run, like I might move to an agent model, where the agent is smart enough to look at the transcript, maybe knows a little bit about my student's history.
也许我会通过类似RAG或工具的方式调取学生的背景信息,综合两者来判断哪个维度最值得给予反馈。
Maybe I'm going to pull in like the student's history through something like RAG or a tool, and it's going to look at both and make a decision about what's the dimension that's most important to give feedback on.
它不会对所有维度都提供反馈,而是只针对对学生最重要的那个维度给出反馈。
And instead of giving feedback on all dimensions, it gives feedback on the dimension that's most important to the student.
我会把这种做法归入协调的范畴,也就是我如何协调LLM对这个输入的回应方式?
I would put that in the orchestration bucket, like how am I orchestrating how the LLM is going to respond to this input?
那么,探索在这个过程中又如何发挥作用呢?
And so where does discovery come into play there?
我认为编排和上下文工程是紧密相关的。
I think orchestration and context engineering are really closely tied.
我们要理解自己想解决哪些机会,并为大语言模型提供合适的上下文和足够小的任务,以便它能清晰且充分地应对这些机会。
It's how do we understand what opportunities we're trying to address and how are we giving the LLM the context and small enough tasks that it can meet those opportunities clearly and adequately.
好的,我听到的重点是错误分析,看起来在构建AI功能时,发现和错误分析几乎是同义词。
Okay, and what I'm hearing here is a really important focus on error analysis then, where discovery and error analysis, it seems like, are almost synonymous in the case of building AI features.
当你建立良好的可观测性,进行质量评估,并维护系统时,你会回到上下文工程和编排中去实际调整这些内容。
And as you set up this good observability, and you're doing your Evals for quality, and you're maintaining the system, you're going back into the context engineering and the orchestration to actually change things.
这个总结准确吗?
Is that a fair summary?
是的,让我们进入第三个层面:可观测性。
Yeah, so let's get into this third bucket of observability.
人们可能没有意识到这一点,因为我以前也没意识到,但我觉得这是一个重大的伦理问题。
People may not realize this, because I didn't realize this, and I think this is a huge ethical concern.
大多数AI产品都在记录你的调用轨迹。
Most AI products are logging your traces.
那么,这意味着什么?
So, what does that mean?
这意味着当你与AI功能交互时,系统会记录下输入和输出。
It means when you interact with the AI feature, they are creating a record of the inputs and outputs.
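A trace, in other words, is just a stored input/output pair. A minimal logging sketch might look like the following; the field names and sampling knob are illustrative assumptions, not any particular product's schema.

```python
import random
import time

# Minimal trace logging: record the inputs and outputs of each AI call so
# that some percentage of them can be reviewed later. SAMPLE_RATE and the
# record fields are illustrative choices, not a real product's schema.
SAMPLE_RATE = 1.0  # 1.0 logs every trace; lower it to sample a percentage

def log_trace(store: list, user_input: str, model_output: str) -> None:
    if random.random() < SAMPLE_RATE:
        store.append({
            "timestamp": time.time(),
            "input": user_input,
            "output": model_output,
        })

traces: list = []
log_trace(traces, "How do I open an interview?", "Try a story-based question...")
```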
这对开发者来说是好事,实际上我认为我们应该在这方面更加透明。
This is good for the developer, I actually think we need to be much more transparent about this.
我认为,这正是我们数据实践中关于坦诚说明存储内容的发现最佳实践所适用的领域。
So I think this is a part where our discovery best practices of being upfront about what we're storing in our data practices is very relevant here.
我认为我们需要明确告知用户这一点,而不是将其埋藏在服务条款中,这是一个关键步骤。
I think we need to be informing people this, and maybe not buried in the terms of service, but this is a critical step.
我们必须观察这些非确定性工具的行为。
We do have to observe what these non deterministic tools are doing.
因此,我们需要一种方式来记录至少一部分,如果不是全部的话,我们的追踪数据,以便能够开始查看它们。
So we need a way to log at least a percentage of our traces, if not all of our traces, so we can start to look at them.
你提到了错误分析,这正是我们要查看部分追踪数据,并让人类逐条审查和标记其中错误的地方。
And you mentioned error analysis, this is where we're going to look at our traces, some percentage of our traces, and we're going to literally have humans review them and tag where there are errors.
我一直在为我的面试教练做这件事,我会仔细阅读对话记录,查看教练的回复,并写下作为讲师我会如何做得不同。
And I have been doing this a ton for my interview coach, where I literally look at the transcript, I look at the coach's response, and I write notes about what would I as the instructor have done differently.
对话记录很长,所以我通常一次处理50条左右,完成一批后,我会查看我的标注,找出常见的错误。
Transcripts are long, so I typically work in batches of like 50. And then once I've done a batch of 50, I'm looking across my annotations and I'm looking for what are my common errors.
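The annotate-then-aggregate loop she describes can be sketched in a few lines of Python; the trace records and error tags below are hypothetical examples, not her actual annotation schema.

```python
from collections import Counter

# Sketch of the error-analysis loop: review a batch of traces, tag each with
# the errors found, then count tags across the batch to surface the common
# ones. The records and tag names here are made up for illustration.
batch = [
    {"trace_id": 1, "notes": "cited another dimension's instructions",
     "tags": ["wrong_dimension"]},
    {"trace_id": 2, "notes": "looks fine", "tags": []},
    {"trace_id": 3, "notes": "praised and criticized the same excerpt",
     "tags": ["contradictory_feedback", "wrong_dimension"]},
]

def common_errors(annotated_batch: list) -> list:
    """Count error tags across a reviewed batch, most frequent first."""
    counts = Counter(tag for record in annotated_batch for tag in record["tags"])
    return counts.most_common()

top_errors = common_errors(batch)
```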
这让我意识到,这正是我反馈循环的一部分:我是否需要调整我的上下文工程?
And that's telling me, that's my feedback loop into, do I need to change my context engineering?
我是否需要更新我的提示?
Do I need to update my prompts?
还是我需要优化我的编排流程?
Or do I need to update my orchestration?
实际上,一些错误已经促使我做出了这两种类型的调整。
And I've actually had errors that have driven both of those types of changes.
最终,分析会导向评估。
And then eventually, analysis leads to evals.
有些错误我只需修改提示就能解决,然后它们就消失了。
So some of my errors I can fix by just making prompt changes and they go away.
有些错误,我可能改了提示词后会有所改善,但它们又会重现。
Some of my errors, maybe I change a prompt and it fixes a little bit, but then they come back.
所以对于那些持续存在的错误,我会编写一个评估测试。
So for errors that are kind of persistent, I'm going to write an eval.
评估测试就是代码,它可以是代码、数据集,也可以是另一个大语言模型。
And an eval is just code, it can be code, it can be a dataset, it can be another LLM.
这是一种方式,用来确定这类错误在我的追踪记录中出现的频率。
It's a way to say, how do I know how often this error is appearing in my traces?
因此,这是一种自动化的方法,用来判断这类错误在我的追踪记录中出现的频率。
So it's an automated way of saying how do I know how often this error is showing up in my traces?
我更倾向于使用基于代码的评估或使用大语言模型作为评判的评估。
I prefer to use code-based evals or LLM-as-judge evals.
这意味着,一旦我识别出某个错误,如果它持续存在,且不容易通过简单修改消除,我就会编写一个评估来检测这个错误。
And so what that means is once I identify an error, if it persists, if it's not easy to make it go away, I write an eval to detect the error.
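As a sketch, a code-based eval is just a function that takes a trace and reports whether the error appears. Everything below — the specific error being checked, the phrase heuristics — is a hypothetical stand-in; an LLM-as-judge eval would replace the string checks with a grading call to another model.

```python
# Hypothetical code-based eval: detect whether feedback sections both praise
# and criticize the same excerpt across dimensions. The heuristic phrases
# are stand-ins; an LLM-as-judge version would swap them for a second LLM
# call that returns pass/fail.

def excerpt_conflict_eval(sections: dict, excerpt: str) -> bool:
    """Return True if this trace exhibits the error."""
    praised = any(excerpt in text and "great question" in text
                  for text in sections.values())
    criticized = any(excerpt in text and "problematic" in text
                     for text in sections.values())
    return praised and criticized

# Example trace: two dimensions quote the same excerpt with opposite verdicts.
trace = {
    "dimension_1": 'The excerpt "tell me more" is a great question here.',
    "dimension_2": 'The excerpt "tell me more" is problematic in this context.',
}
error_present = excerpt_conflict_eval(trace, '"tell me more"')
```

Run over a sample of traces, a function like this answers the question she poses: how often is this error showing up?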
这样做的好处是,我现在可以对不同改动进行A/B测试。
And then what that allows me to do is now I can AB test changes.
所以我可以有一组对话记录,用来进行实验。
So I can have a set of transcripts that I use to run experiments.
我可以检测到一个错误,然后改变我的上下文工程或编排方式,接着用旧方法和新方法分别运行这组对话记录,由评估系统来判断新方法是否更好。
I can detect an error, and then I can change either my context engineering or my orchestration, and then I can run that set of transcripts on the old way and the new way, and the eval grades whether the new way is better.
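That loop — a fixed set of transcripts, an old and a new pipeline, and an eval grading both — can be sketched roughly like this; the pipelines and the eval here are toy stand-ins, not her actual harness.

```python
# Toy A/B harness: run the same transcripts through the old and new
# pipeline, score each output with the eval, and compare error rates.
# The pipelines and the eval below are stand-ins for illustration.

def eval_has_error(output: str) -> bool:
    # Stand-in eval: flags outputs containing a known failure marker.
    return "CONTRADICTION" in output

def error_rate(pipeline, transcripts) -> float:
    return sum(eval_has_error(pipeline(t)) for t in transcripts) / len(transcripts)

def ab_test(old_pipeline, new_pipeline, transcripts) -> dict:
    old = error_rate(old_pipeline, transcripts)
    new = error_rate(new_pipeline, transcripts)
    return {"old_error_rate": old, "new_error_rate": new, "improved": new < old}

transcripts = ["transcript A", "transcript B", "transcript C"]
result = ab_test(
    lambda t: f"CONTRADICTION in feedback for {t}",  # old orchestration
    lambda t: f"clean feedback for {t}",             # new orchestration
    transcripts,
)
```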
我说了很多,如果大家有问题,我很乐意暂停一下回答。
That was a lot, so I'm happy to pause for questions if there are some questions.
是的。
Yeah.
但由于时间有限,朋友们,我们之前做过一期关于评估的节目,嘉宾是哈梅尔·胡赛因和什蕾娅·尚卡尔。
Well, because we have limited time, folks, we did an episode with Hamel Husain and Shreya Shankar on evals.
如果你想更深入地了解这个话题,可以去听那期节目。我特别想问一下特蕾莎,因为她最近一直在写关于Claude Code的内容。
If you wanna go a little bit deeper on that, check that episode out. I wanna ask Teresa really badly, because she's been writing about Claude Code.
对。
Yeah.
能不能简单给我们介绍一下Claude Code?
To just give us the thirty second introduction to Claude Code.
为什么?它是什么?
Why what is it?
大多数产品经理都害怕使用它。
Most PMs are really scared of using it.
他们能用吗?
Can they use it?
是的,好的,我们现在是星期一在录制。
Yeah, okay, so we're recording this on a Monday.
我告诉你,我已经用了七天的Claude Code了。
I will tell you, I am seven days into using Claude Code.
但让我告诉你在这七天里我做了什么。
But let me tell you what I did in those seven days.
大约两周前,我注意到我的面试教练出了问题。
So about two weeks ago, I noticed an error in my interview coach.
这个错误是一个编排错误。
The error was an orchestration error.
所以,通过将我的提示拆分成七个独立的提示,这意味着每个提示都没有其他维度的上下文。
So by splitting my prompts into seven individual prompts, what it meant is the prompt, like one prompt didn't have any context for the other dimensions.
而我看到的情况是,它们都在使用相同的摘录。
And what I was seeing was they were all using the same excerpts.
因此,一个部分会说,在我这个维度的背景下,这是一个很好的问题。
And so, one section would say, this is a great question in the context of my dimension.
而在另一个部分,它又说,在我这个维度的背景下,这是一个有问题的问题,这导致了给客户的反馈非常混乱。
And then in another section, it'd be like, this is a problematic question in the context of my dimension, and it was leading to really confusing feedback to my customers.
所以我首先得写一个评估来检测这个错误,看看这种错误发生的频率,然后我得找出解决方案,再对这个方案进行A/B测试。
And so first I had to write an eval to detect the error, how often was this error happening, and then I had to identify a solution, and then I had to AB test the solution.
好吧,一周前,我所有的评估都是在Jupyter Notebook里做的,我的A/B测试也是临时随意进行的。
Okay, before a week ago, I did all of my evals in a Jupyter Notebook, and I did my own AB testing ad hoc.
好吧,从上个星期一开始,我用Claude Code做了以下事情。
Okay, here's what I did starting last Monday with Claude Code.
所以我安装了Claude Code,你可以在终端里安装它,和使用Claude一样与它对话,但区别在于,在终端里它可以看到你所有的文件。
So I installed Claude Code, you install it in your terminal, you talk to it just like it's Claude, but the difference is in your terminal, it can see all your files.
所以我一周前也开始使用 VS Code,这真是忙碌的一周。
So I also started using VS Code a week ago, it's been a big week.
所以我开始使用一个正式的 IDE,我在开发环境(即 VS Code)中使用 Claude。
So I started using a proper IDE, so I'm using Claude in the context of my development environment, which is VS Code.
我基本上就是说,我发现了我面试教练的一个问题,而它能看到我所有的面试教练代码。
And I basically said, I've identified a problem with my interview coach and it can see all my interview coach code.
我首先想做的是检测这个问题。
And the first thing I want to do is detect it.
所以帮我设计一个针对这个问题的评估方案吧。
And so like help me design an eval for this.
它给了我一个非常复杂的方案,我卡在那里,花了前三天试图解决这个问题。
And it gave me sadly a really complex design, which I got stuck on and spent the first three days trying to work on that problem.
然后我去睡觉了,第二天早上醒来时,我想出了一个简单得多的解决方案。
And then I went to sleep and I woke up the next morning and I came up with a way simpler solution.
我对 Claude 说:嘿,我们为什么不试试这个更简单的方案呢?
I was like, hey Claude, why don't we do this simpler solution?
这真是个好主意。
It was like, great idea.
谢谢,马屁精大语言模型。
Thank you, sycophant LLM.
它真的为我写好了所有代码。
It literally wrote all the code for me.
现在,我懂得编程,也懂得阅读代码,因此当我让Claude帮我写代码时,我的原则是仔细审查所有代码,确保我完全理解每一部分,并确认它确实实现了预期功能,因为Claude写的代码也会出错。
Now, I know how to code and I know how to read code, and so my rule when I let Claude code for me is I review all the code and I make sure I understand all of it and that it's actually doing what it's supposed to do, because Claude Code does make mistakes.
所以你得像个保姆一样盯着,但它真的为我的评估脚本、这个评估的所有代码,以及我提出的解决方案都写好了全部代码。
So like you have to be a babysitter, but it literally wrote all of the code for my evals, for that eval, it wrote all of the code for my proposed fix.
它还为AB测试写了一个测试框架。
It also wrote code like a testing harness for the AB test.
它简直包办了一切。
Like it just did everything.
我觉得我这一周根本没写过一行代码。
I don't think I wrote a line of code all week.
我审查了很多代码,扮演了架构师的角色,因为Claude的解决方案通常非常复杂。
I reviewed a lot of code and I was like the architect because Claude's solutions were often very complex.
但我想,是的,我可能一行代码都没写,却发布了我的面试教练的重大更新和一个全新的评估系统。
But if I think, yeah, I don't think I wrote a single line of code, but I released a major update to my interview coach and a brand new eval.
这是我第一次做跨越LLM调用的评估。
And it was the first time I did an eval that spanned LLM calls.
所以这是一个比我之前做过的更复杂的评估,而我一行代码都没写。
So it was like a more complicated eval than I had done, and I didn't write a line of code.
好了,各位,就是这样。
There you go, guys.
她一周前刚开始用,现在已经做出了巨大的改变。
She started with it a week ago, and she's already making huge changes.
我再怎么强调也不为过,如果你在听这个播客,一定要克服‘这玩意儿在终端里’这个心理障碍。
I cannot emphasize enough if you are listening to this podcast to get over the barrier of, hey, it's in the terminal.
即使你不像特蕾莎那样擅长理解和阅读代码,也赶紧开始用这个工具吧。
Even if, unlike Teresa, you're not great at understanding and reading code, and just get started with this tool.
我觉得这改变了游戏规则。
I think it's a game changer.
我觉得它甚至比Cursor还要好。
I think it's even better than Cursor.
在你离开之前,特蕾莎,我必须问一下,因为你知道,会有许多创作者听这个播客,他们一直仰慕你。
Before you go, Teresa, I have to ask because, you know, there's so many creators who are gonna be listening to this podcast who have been looking up to you.
也许你正是激励他们成为产品创作者的原因之一。
Maybe you were one of the inspirations for them to even become a product creator.
特蕾莎·托雷斯的业务规模有多大?
How big is the business of Teresa Torres?
是的,我是一个人的公司,勉强算两个人的公司。
Yeah, so I'm a company of one, kind of a company of two.
我在菲律宾有一位全职行政人员。
I have a full time admin in The Philippines.
她从技术上讲是外包人员,所以从美国的角度来看,我是一个人的公司,但我有一些外包人员帮助我打理业务。
She's technically a contractor, so from a US standpoint, I'm a company of one, but I have a number of contractors who help with my business.
在典型的年份里,我们的课程每年有2000到3000名学生。
In a typical year, we have 2,000 to 3,000 students a year in our courses.
我知道在Maven的时代,这听起来可能不算大,但我们每门课的学生人数都在20到50人之间。
I know like in the era of Maven that might not sound very big, but all of our courses are 20 to 50 students each.
我们尽量保持小班教学,希望实现高的师生比例。
We try to keep them small, we want a high instructor to student ratio.
对我来说,这真的不是关于我接触过多少人——我知道人们总说,我在发现方面合作过的人比任何人都多,诸如此类。
And for me, it's not really about like, I mean, I know people talk about I've worked with more people in discovery than anybody else, blah, blah, blah.
这不关乎规模,真正重要的是影响力。
It's not about size, it really is about impact.
我知道培养发现习惯很难,我们所有的课程都是为改变行为而设计的。
I know it's hard to build discovery habits, and all of our programs we've designed them to change behavior.
我认为这和市面上一些所谓的‘教育娱乐’产品非常不同。
And I think it's very different from some of these edutainment products that are out there.
我知道Maven上有很多出色的课程,我并不是在贬低Maven。
I know there's amazing classes on Maven, I'm not dissing Maven.
我参加了Hamel和Shreya在Maven上开设的AI评估课程,非常喜欢,强烈推荐。
I took the AI Evals class on Maven with Hamel and Shreya and I absolutely loved it and I strongly recommend it.
但我们为我们的小组课程采用了一点不同的流程。
But we follow a little bit of a different process with our cohort courses.
我们非常注重实际操作和大量支持。
We focus a lot on really hands-on practice and a lot of support.
目前我们的学生人数已经超过17,000人。
So we're at, it looks like just over 17,000 students.
太棒了。
Amazing.
这些课程的费用是多少?
And how much does one of those courses cost?
我们有几种不同的形式。
We have a few different formats.
比如我们的产品发现基础课程,涵盖了所有的发现习惯,但可以把它看作是一门入门课程。
So we have our like product discovery fundamentals course, which covers all of the discovery habits, but think about it as an introductory course.
当你涵盖所有习惯时,我们无法深入任何一个习惯。
When you cover all the habits, we can't go deep in any habit.
这是一门为期六周的课程,包含12次直播教学,售价1,795美元。
That's a six week course, it has 12 sessions of live instruction, and it's $1,795 US.
我们还有一系列深入探讨单一习惯的进阶课程。
We then have a series of deep dive courses that are each on a single habit.
这些是我们的技能类课程,旨在快速提升技能。
And so those are our skill based classes, they're designed to build skill really quickly.
例如,我们有一门关于持续访谈的课程,届时会有访谈教练参与。
So for example, we have one on continuous interviewing, this is where the interview coach shows up.
你会一轮又一轮地练习访谈,我们的目标是让你上完课后能自信地进行真实的客户访谈。
And it's you round after round after round of practicing interviews, our goal is for you to leave the class feeling really comfortable to do a real customer interview.
这些课程的价格都是799美元。
Those courses are all $799.
然后我们正开始将部分课程内容转化为点播课程。
And then we are starting to convert some of our curriculum into on demand courses.
我们今天有一个名为《持续发现中的客户招募》的课程。
We have one today called Customer Recruiting for Continuous Discovery.
我们即将推出第二个课程——基于故事的客户访谈,预计九月上线,售价259美元。
We're about to launch our second one, story-based customer interviewing, probably by September, and those are $259.
好的,各位。
Okay, folks.
如果你想知道她的生意规模,可以自己算一算。
You can do the math on that if you want to figure out what her business is.
特蕾莎,正如我开头所说,这真是梦想成真,我觉得你给出的回应远远超出了预期。
Teresa, as I said at the beginning, this is a true dream come true, and I think you over delivered on the responses.
你对人工智能的精通令人惊叹。
Your fluency about AI is mind blowing.
非常非常感谢你做客这个播客。
Thank you so so much for being on the podcast.
谢谢你们邀请我。
Thanks for having me.
这真是太有趣了。
This was a lot of fun.
我始终相信为未来许愿,所以我要在这里向大家表达一下我的愿望。
And I always believe in wishing in the future, so I'm gonna go ahead and put it out there to everybody.
我希望这期节目能有第二期,这样特蕾莎就必须再回来一次。
I hope we have a round two of this episode, so Teresa just has to come back.
感谢大家的观看,我们下一期再见。
Thank you all for watching, and see you in the next episode.
关于 Bayt 播客
Bayt 提供中文+原文双语音频和字幕,帮助你打破语言障碍,轻松听懂全球优质播客。