Issue 2026-W09 Highlights

Episode summary

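Choice is great, but sometimes you may want a little help! Our first two highlights showcase approaches you can take to guide your next large language model selection for analysis and your open-source license selection, plus how you can leave your mark in version control history.

Episode links
This week's curator: Sam Parmar - @parmsam@fosstodon.org (Mastodon) & @parmsam_ (X/Twitter)
How to choose the best LLM using R and vitals
Choose a license, not just any license
Git commits: mark your changes!
Entire issue available at rweekly.org/2026-W09

Supplement resources
rollama - an R wrapper for Ollama: https://jbgruber.github.io/rollama/
Oops, Git! How to recover from common mistakes (workshop): https://r-posts.com/oops-git-how-to-recover-from-common-mistakes-workshop/

Supporting the show
Use the contact page at https://serve.podhome.fm/custompage/r-weekly-highlights/contact to send us your feedback
R-Weekly Highlights on Podcastindex.org: you can send a boost to the show directly in the Podcast Index. First, top up with Alby, then head to the R-Weekly Highlights podcast entry on the index.
A new way to think about value: https://value4value.info

Get in touch with us on social media
Eric Nantz: @rpodcast@podcastindex.social (Mastodon), @rpodcast.bsky.social (BlueSky) and @theRcast (X/Twitter)
Mike Thomas: @mike_thomas@fosstodon.org (Mastodon), @mike-thomas.bsky.social (BlueSky) and @mike_ketchbrook (X/Twitter)

Music credits powered by OCRemix
Seven Pipes to Heaven - Super Mario Land - Nostalvania - https://ocremix.org/remix/OCR03256
You Are Not Confined - Final Fantasy 9 - Sonicade - https://ocremix.org/remix/OCR01064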

Transcript

Speaker 0

Hello friends, we are back with episode 222 (that's a row of twos, if I dare say so myself) of the R Weekly Highlights podcast.

Speaker 0

This is the weekly show where we talk about the latest happenings and highlights we are sharing every single week at rweekly.org.

Speaker 0

My name is Eric Nantz, and I'm delighted that you've joined us from wherever you are around the world. Unfortunately, my co-host is still digging out from a lot of snow in his neck of the woods: Mike Thomas.

Speaker 0

Mike, I'm so sorry, man.

Speaker 0

I feel like you just went through this, but you had to go through it again.

Speaker 1

It's okay.

Speaker 1

It's almost a daily occurrence at this point, it seems like.

Speaker 1

So we've got some snow mountains and hills in our yard that are taller than me, but it makes for fun igloos and fort making on the warm days.

Speaker 0

That is true.

Speaker 0

We had a good share of that back when we had our batch here, but I do admit I am a happy camper about something else.

Speaker 0

Sunday morning was a glorious morning, because now the country we reside in, the U.S., can be considered the hockey capital for the next four years.

Speaker 0

Both the men's and women's U.S. teams have brought home the gold. They were exciting games, both went to overtime, and I haven't been that excited watching a game since probably the mid-90s. So good stuff all around, and yeah, I'm gonna savor that for a bit.

Speaker 1

Definitely, that three-on-three overtime will make you grab your seat.

Speaker 0

Yes, and not everybody's happy about that.

Speaker 0

That's a different matter altogether, but the talent was there on all sides, and everybody knows the rules beforehand.

Speaker 0

So I'm gonna bask in that glory for a little bit longer, although the regular season just started again last night.

Speaker 0

It's very difficult watching regular hockey after that glorious two-week fest, so it's going to take a bit of adjustment for us hockey fans.

Speaker 0

Those of our listeners north of the border, you have my condolences, but we needed one after forty-six years, so I hope you're happy sharing the wealth a little bit.

Speaker 1

Can Connor McDavid win the big one?

Speaker 0

Eventually he will; it just may not be where he's at now.

Speaker 1

He was trying to in overtime.

Speaker 0

He sure was, yeah.

Speaker 0

I felt bad for him, but it is what it is.

Speaker 0

That's the beauty of sports.

Speaker 0

But we're not talking about sports here.

Speaker 0

We've got R Weekly to talk about, with its own share of great fun here.

Speaker 0

And this week's issue has been curated by Sam Parmar.

Speaker 0

And as always, he had tremendous help from our fellow R Weekly curators and contributors like all of you around the world, with your pull requests and other great suggestions.

Speaker 0

Yes, we are living in the world of large language models and using them for many tasks in our daily work. Maybe we're just getting started on our journeys and we don't quite know what to expect because, just like anything in this world of tech, you've got choices when you start out: which frontier provider to use, which actual models from these frontier providers to use, and what the benefits and trade-offs are for your particular task.

Speaker 0

It sure would be nice to have a way to take a principle that we've applied in the past in computer science and statistics and run a little set of experiments (you might say benchmarking) on how these models actually perform.

Speaker 0

Well, on the R side of things, there is very much a package that can help you do just that. To walk us through it with a very comprehensive case study, we have Sharon Machlis, who has been world renowned for her coverage of R for many years with InfoWorld. Now she is, I guess, kind of retired, but she's still blogging away, picking and choosing her spots.

Speaker 0

But she has been pushing the limits in her adventures with AI.

Speaker 0

I've been watching her ever since she started adopting this and writing about it, and she has a great blog post that we're going to talk about in this first highlight.

Speaker 0

And yes, as I mentioned, the vitals package is the star of this show, and it has been inspired by Python's Inspect framework.

Speaker 0

If you're on the Python side, you've been doing large language models for a while.

Speaker 0

The vitals package is authored by the team at Posit and, as it says on the tin, what it is supposed to do is integrate with another package we talk about quite a bit, ellmer, as your one-stop shop to interface with a mix of different large language models in R.

Speaker 0

What vitals lets you do is set up these experiments, if you will, and judge the performance of different models based on criteria that you specify.

Speaker 0

And how do you specify those criteria?

Speaker 0

Well, you create a data frame that, at a minimum, must have two columns.

Speaker 0

First is the input column, which holds the request that you want to send to the LLM provider. The other column is target, which is what you expect the large language model to respond with for that request.

Speaker 0

So as you can imagine, this can be very detailed or very general, but she does mention throughout this post that the more specific you can get, the better off you'll probably be.

Speaker 0

You can add some additional metadata to this if you wish, but this is one of the easiest ways you could start doing this.

Speaker 0

Pull out your spreadsheet editor of choice and just start populating different rows corresponding to the different types of tasks you want to do.

Speaker 0

Maybe you give each one a unique ID. In the case of this blog post, to start with, she's got three different areas that she wants to benchmark.

Speaker 0

One of which is to have the LLM create a bar chart with ggplot2 using its own sample data, with some instructions on what to do with the axis labels and that the bars should be sorted. Pretty straightforward.

Speaker 0

The other task is a sentiment analysis of text, where she asks the LLM to rate it as either positive, negative, or mixed.

Speaker 0

And she has an example here of desktop processor reviews, which I know run the full gamut depending on where you look.

Speaker 0

Intel's had some quirks the past couple of years, and boy, did that enter the blogosphere recently.

Speaker 0

Last but not least, she has another interesting one where she asks the LLM to make a haiku, which is very straightforward, and she is very direct with the target: it must be a three-line poem with five syllables in the first line, seven in the second, and five in the third.
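
To make the shape of that table concrete, here is a minimal sketch. vitals is an R package and works with a data frame; plain Python is used here only to illustrate the rows, not the vitals API. The prompt and target wording is paraphrased from the discussion rather than quoted from the post, and the `missing_columns` helper is hypothetical.

```python
# Illustrative only: each row mirrors one of the three tasks discussed above.
# The input/target wording is paraphrased, not quoted from the blog post.
REQUIRED_COLUMNS = ("input", "target")

dataset = [
    {
        "id": "barchart",
        "input": "Write R code that builds a sorted ggplot2 bar chart from sample data you invent.",
        "target": "Runnable ggplot2 code; bars sorted; axes labeled as instructed.",
    },
    {
        "id": "sentiment",
        "input": "Classify this desktop processor review as positive, negative, or mixed.",
        "target": "One label: positive, negative, or mixed.",
    },
    {
        "id": "haiku",
        "input": "Write a haiku.",
        "target": "A three-line poem following the haiku syllable pattern.",
    },
]


def missing_columns(rows):
    """Return (row_index, column) pairs for every required value that is absent or empty."""
    problems = []
    for index, row in enumerate(rows):
        for column in REQUIRED_COLUMNS:
            if not str(row.get(column, "")).strip():
                problems.append((index, column))
    return problems
```

Checking for empty `input` or `target` values before running anything is cheap, and it catches the most common spreadsheet mistake: a row that was started but never finished.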

Speaker 0

So once you have that data frame ready to go, it's time to boot up vitals and ellmer. You then define a task via the Task R6 object, I believe: you call new and feed in your dataset, and you're ready to start associating it with some other things. It's one thing to define what your input and target are; now you've got to tell vitals how you want to solve it.

Speaker 0

And this is where ellmer comes into play.

Speaker 0

The solver is usually a way for you to interface with a chat object from ellmer to actually carry out the task.

Speaker 0

There may be cases where you need a custom solver for more specific needs, but you get the ability to go either way.

Speaker 0

And yeah, being familiar with ellmer is definitely helpful here, but the other thing to note is that this is the part where you're going to interact with a model.

Speaker 0

So if you are using a frontier provider, make sure you've got your API key and all your authentication ready to go.

Speaker 0

That's a non-issue for local models, which we'll get to later, but at least for the frontier ones, you've got to get all that squared away in ellmer first.

Speaker 0

Last but certainly not least, it's one thing to send that query out there, but you've got to actually benchmark it, right?

Speaker 0

That brings in the third component of vitals: the scorer.

Speaker 0

One popular method is to let yet another LLM play the role of judge, and this is available right out of the box.

Speaker 0

They have a function called model_graded_qa, which basically uses that judge LLM to grade how the question or task was solved.

Speaker 0

You can also give it different ways to judge the answer, such as requiring an exact match to what was specified, or requiring that certain keywords be part of the response.

Speaker 0

There are lots of ways you can customize that, but again, you want to be diligent about how specific you get in your criteria.

Speaker 0

The more vague you are, the more that can cause issues later on, depending on the model you're using.

Speaker 0

You can also decide whether you want it to be a binary type of thing, where the answer either is 100% accurate to your target or is not.

Speaker 0

Or you can let it give partial credit.

Speaker 0

If the answer got close, there's a parameter called partial_credit that lets you be a little more lenient.
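
The difference between strict and lenient grading can be sketched in a few lines. This is plain Python for illustration, not the vitals implementation: the grade letters (C, P, I) and the choice to count a partial answer as half credit are assumptions made for this sketch.

```python
# Assumed grade letters: C = correct, P = partially correct, I = incorrect.
# Counting P as half credit is an assumption made for this sketch.
FULL_SCALE = {"C": 1.0, "P": 0.5, "I": 0.0}


def grade_to_score(grade, partial_credit=False):
    """Convert a judge's letter grade to a number between 0 and 1."""
    if grade not in FULL_SCALE:
        raise ValueError(f"unknown grade: {grade!r}")
    if grade == "P" and not partial_credit:
        # In strict mode a near miss is treated as a miss.
        return 0.0
    return FULL_SCALE[grade]


def accuracy(grades, partial_credit=False):
    """Mean score over a non-empty list of letter grades."""
    scores = [grade_to_score(g, partial_credit) for g in grades]
    return sum(scores) / len(scores)
```

With grades `["C", "P", "I", "C"]`, strict grading gives 0.5 and lenient grading gives 0.625, so the same set of answers can look noticeably better or worse depending on this one switch.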

Speaker 0

And she's explored different models to be this judge.

Speaker 0

It seems like Opus 4.6 is better at this, but you are paying more for that as opposed to, say, the Sonnet model that came previously.

Speaker 0

But, again, there may be other models you can use, such as GitHub Models or some from different providers, depending on how extensive you want that grading task to be, so to speak.

Speaker 0

So once you set all this up, it's time to evaluate it, and that's simply the eval method in vitals.

Speaker 0

And with that, you are going to be able to run the whole thing.

Speaker 0

It actually runs five processes, or methods, under the hood.

Speaker 0

The idea is sending the request out there, scoring it, logging it, and getting the measures, and you then get a way to view the different tasks in more detail if you wish, as well as looking at the log file.

Speaker 0

And this is where we put our stats hat on. That might be just one execution, but just like anything in statistics, you might see odd variation depending on how many runs you do, so that's where a parameter called epochs comes in.

Speaker 0

This is akin to iterations, if you do a lot of parallel processing or resampling.

Speaker 0

So you might want to bump this up to, say, 10, maybe more. But again, the higher you bump this number, the more you may be paying out of your pocket, because you are paying these frontier providers if you're going that way.
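
Why repeat a task at all? A rough sketch with made-up numbers shows what repeated runs buy you. The scores below are hypothetical (1 = pass, 0 = fail), and the `summarize` helper is for illustration only, not part of vitals.

```python
from statistics import mean, pstdev

# Hypothetical scores (1 = pass, 0 = fail) for each sample across five epochs.
scores_by_sample = {
    "barchart": [1, 0, 1, 1, 0],
    "sentiment": [1, 1, 1, 1, 1],
    "haiku": [0, 1, 0, 0, 1],
}


def summarize(scores):
    """Return the pass rate and its spread across repeated runs."""
    return {
        "epochs": len(scores),
        "pass_rate": mean(scores),
        "spread": pstdev(scores),
    }


summary = {sample: summarize(scores) for sample, scores in scores_by_sample.items()}
```

A single run of the bar chart sample would have reported either a clean pass or a clean fail. Five runs show that it passes only some of the time, which is the more useful thing to know.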

Speaker 0

She did a little cost estimate: it was about 14¢ to use Sonnet as the judge in these cases versus 27¢ for Opus on, say, 11 runs of this.

Speaker 0

Again, mileage may vary, but it's always good to keep an eye on how this might perform.
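
For a feel of how those judge costs scale, here is a small worked calculation. It assumes the quoted amounts are totals over the quoted 11 runs and that every run costs about the same; both are assumptions, and real token usage will vary from run to run.

```python
# Figures quoted in the episode: total judge cost in cents over the quoted runs.
RUNS = 11
TOTAL_CENTS = {"sonnet": 14, "opus": 27}


def cents_per_run(total_cents, runs):
    """Average cost of a single run."""
    return total_cents / runs


def projected_cents(total_cents, runs, planned_runs):
    """Linear projection: assumes every run costs about the same."""
    return cents_per_run(total_cents, runs) * planned_runs


per_run = {judge: cents_per_run(total, RUNS) for judge, total in TOTAL_CENTS.items()}
ratio = TOTAL_CENTS["opus"] / TOTAL_CENTS["sonnet"]
```

Under those assumptions the pricier judge costs a little under twice as much per run, and ten times as many runs would cost roughly ten times as much for either one.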

Speaker 0

Now, that invocation may use just one model provider.

Speaker 0

Of course, the idea behind this is that you want to compare multiple models.

Speaker 0

There are easy ways, which she outlines in this post, to simply swap in another LLM provider if you wish to augment that existing task, or you can start fresh.

Speaker 0

There may be some benefits to either approach.

Speaker 0

In the end, if you're using a new model, again, you need to make sure you've authenticated correctly, and you'll be off to the races.

Speaker 0

So maybe you've done multiple tasks, or maybe you've iterated on a similar task.

Speaker 0

It's time to actually look at the results, and that's where the vitals package has a handy function called vitals_bind, which in essence lets you row-bind the multiple tasks that you ran for the benchmarking. You get a nice tidy data frame with a list column for the metadata, and you'll see the task itself, the unique ID of the subtask (such as the bar chart, sentiment, or haiku creation), and then, for each epoch or iteration, the score.

Speaker 0

And that score falls into different categories.

Speaker 0

We have, I believe, I for incorrect and C for correct.

Speaker 0

If you use partial credit, it might have an additional level there too. In the end you can take a look at that and see how things did, and you can even be really slick and do some visualizations or interrogate the metadata and results if you want to present this a little differently. That metadata can be useful for specific analyses: when she wanted to look at the bar charts themselves that the LLM created, she was able to extract that from the metadata column and do some visualizations, or at least some hands-on inspection.
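
The row-binding idea can be sketched without any particular library. The records, model names, and helper functions below are hypothetical and only mimic the shape described above (task, id, epoch, score); this is not the vitals implementation.

```python
from collections import defaultdict

# Hypothetical per-model results: one record per sample per epoch.
results_by_model = {
    "model_a": [
        {"id": "barchart", "epoch": 1, "score": "C"},
        {"id": "barchart", "epoch": 2, "score": "I"},
        {"id": "haiku", "epoch": 1, "score": "C"},
    ],
    "model_b": [
        {"id": "barchart", "epoch": 1, "score": "I"},
        {"id": "barchart", "epoch": 2, "score": "I"},
        {"id": "haiku", "epoch": 1, "score": "C"},
    ],
}


def bind_rows(results):
    """Stack every model's rows into one list, tagging each row with its task label."""
    return [{"task": name, **row} for name, rows in results.items() for row in rows]


def share_correct(rows):
    """Fraction of rows graded C for each (task, id) pair."""
    counts = defaultdict(lambda: [0, 0])  # [correct, total]
    for row in rows:
        key = (row["task"], row["id"])
        counts[key][0] += row["score"] == "C"
        counts[key][1] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}


combined = bind_rows(results_by_model)
by_task = share_correct(combined)
```

Once everything is in one long table, comparing models on a given subtask is a grouped summary, and the same table feeds directly into a chart.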

Speaker 0

All of that was set in the context of using the frontier models, maybe to compare Opus with Gemini or Copilot models and whatnot.

Speaker 0

But you know what?

Speaker 0

There's another side to this, especially if you wanna save some money, and that's the local large language model side of it.

Speaker 0

And this is where things can really get interesting, because with the local models there are definitely still some gains to be had, but I think they're getting a bit better.

Speaker 1

Sure.

Speaker 1

And when you're using local models, as Sharon points out here, you are gonna be limited by the power of the computer that you're running them on.

Speaker 1

So she was leveraging models that fit into her PC's 12 gigabytes of GPU RAM, which is somewhat limited.

Speaker 1

A lot of these different local models, or open-weights models if you will, have multiple versions that are published.

Speaker 1

Some are smaller than others, and then some are kind of medium sized.

Speaker 1

And then, obviously, the ones that are gonna perform the best are the giant ones, which you would need some serious computing horsepower to run.
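
A back-of-the-envelope calculation shows why model size matters on a 12 GB card. The sketch below counts only the memory needed for the weights; the bits-per-weight values and the 2 GB headroom for everything else are assumptions for illustration, and real memory use depends on the runtime and context length.

```python
def weight_memory_gb(parameters_billion, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = parameters_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


def fits(parameters_billion, bits_per_weight, vram_gb, headroom_gb=2.0):
    """True if the weights plus an assumed headroom fit in the given VRAM."""
    needed = weight_memory_gb(parameters_billion, bits_per_weight) + headroom_gb
    return needed <= vram_gb
```

For example, a 14-billion-parameter model needs about 28 GB for its weights at 16 bits per weight but only about 7 GB at 4 bits, which is why smaller or more compressed versions are the ones that fit on consumer hardware.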

Speaker 1

So one really interesting thing is that vitals supports this Ollama project, which, if you've never heard of it, is sort of a one-stop shop for leveraging local models: you can essentially pull down these local models in a very Docker-ish way.

Speaker 1

There are two commands, ollama pull and ollama run, to pull down the model you want and then to actually run that model for inference on your local machine.

Speaker 1

And something that I didn't know is that there is actually an R package that's a wrapper around Ollama, called rollama, which is really cool to check out.

Speaker 1

It's a project that's been around since, I think, 2025, maybe a little bit longer.

Speaker 1

That's just from looking at some of the issues that are out there and how long it's been worked on.

Speaker 1

It's a really interesting project by Johannes Gruber and Maximilian Weber, it looks like.

Speaker 1

It's one to check out if you're interested in using local models.

Speaker 1

And Sharon leveraged three for testing purposes.

Speaker 1

Ministral 3, which I think is a 14 billion parameter model.

Speaker 1

And then she also leveraged Gemma from Google and Phi, which I think is a Microsoft offering.

Speaker 1

And all three of the local models that she leveraged did a great job at the sentiment analysis tasks that she put to them, and they all did really poorly on the bar chart.

Speaker 1

It sounds like some of the code produced bar charts but didn't do the axis flipping that she had asked for or the descending-order sorting of the bars.

Speaker 1

And some of the other code just really didn't work at all.

Speaker 1

And she puts in a couple of nice, I think, gt tables throughout this blog post that showcase the performance and the accuracy rates that she got.

Speaker 1

In that previous task, she again used Opus as the LLM judge, which cost her 39¢.

Speaker 1

And obviously, running the local models costs you zero, which is the big benefit there, besides the compute that you need to have.

Speaker 1

Yep.

Speaker 1

So the next task that she did was to have these local models try to extract a few different data points from plain text, using a function from vitals called generate_structured, which requires a chat object and a defined data type that you want the model to return, which is nice.

Speaker 1

And I guess, as of this blog post, you actually need the development version of vitals in order to use this function. That's just a gotcha for anybody who hears this, goes out and installs vitals, and is looking to do the same type of thing with that particular function.

Speaker 1

But she creates a nice data frame here, with an ID column that has really two values.

Speaker 1

Either we're extracting basic entities, or we're doing a more complex entity extraction.

Speaker 1

Then she has a column called input that contains two observations as well, essentially the text that she wants the LLM to pull information from, and then a final column called target.

Speaker 1

And she provides some really nice examples of how to define the data structure that you want the LLM to return, using ellmer's type_object function.
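
The point of declaring a structure up front is that a reply can be checked mechanically. Here is a minimal sketch of that idea in plain Python. The field names are hypothetical (the post's actual schema is not quoted in the episode), and `check_structure` is an illustrative helper, not ellmer's type_object.

```python
# Hypothetical schema: field name -> required Python type.
SCHEMA = {"person": str, "organization": str, "year": int}


def check_structure(payload, schema=SCHEMA):
    """Return a list of problems; an empty list means the payload matches the schema."""
    problems = []
    for name, expected in schema.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected):
            actual = type(payload[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    for name in payload:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems
```

A reply such as `{"person": "Ada", "year": "1843"}` would be flagged twice: the organization is missing, and the year came back as text instead of a number. That kind of check is much easier to score than free-form prose.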

Speaker 1

And, to her surprise actually, the local model Gemma scored 100%.

Speaker 1

She did a little more digging to see if that was a fluke, and essentially came to the conclusion that it actually failed on two out of the 20 tasks that she gave it thereafter, but those were the easier tasks.

Speaker 1

And I guess the takeaway here is to try to be as specific as possible in your instructions to these models.

Speaker 1

And I think maybe, inherently, when we give these large language models more complicated tasks, we do a better job of providing specific instructions to manage all of the complexities of that task.

Speaker 1

So it's a really interesting use case here from Sharon on leveraging both these frontier models and the open-weights models.

Speaker 1

I think with some of the advances since December (Claude's Opus 4.6, all the updates with Codex on the OpenAI side, and obviously Gemini 3 and now 3.1 coming out), there hasn't been as much talk about the local models lately, at least in the circles that I've seen.

Speaker 1

I'm excited to see somebody talking about them again because I think they solve a lot of problems around security when you're trying to deploy solutions that leverage these types of large language models in a secure, highly regulated environment, if you will.

Speaker 1

So, yeah, great write-up by Sharon.

Speaker 1

It's good to see that she's clearly not fully retired, and we're grateful for it.

Speaker 0

We're here for it.

Speaker 0

Yeah.

Speaker 0

Exactly.

Speaker 0

Really great post here.

Speaker 0

And I'd venture to say (and this may be Eric going into a bit of a hot-take segment of the show) that with the advent of what has set a lot of the tech world on fire, so to speak, this OpenClaw initiative, a lot of people are now going to want to save a bit of money if they're going to deploy OpenClaw on their own infrastructure to help with various mundane tasks in their day-to-day life.

Speaker 0

And if Ollama models, or some of the newer ones coming out, can do the job locally, that's going to drive the market a little bit, I think.

Speaker 0

I'm not sure how much just yet, but I'm sure it's going to get the attention of the big players here. And if it's gonna bring a boom that benefits the models as a whole, that can only help those of us on the data science side who may want to run local models for very well-defined tasks.

Speaker 0

Like you said, I do catch myself, if I'm trying to interrogate a very esoteric problem with my Shiny app testing, being very specific in my prompt to help Opus, or whichever model, try to help me troubleshoot it.

Speaker 0

And spoiler alert, it kind of failed the last time I did it, but that's a story for another day.

Speaker 0

But for these kinds of text extraction tasks, maybe it's helping you consolidate certain texts into a more concise summary, or maybe you need to digest some other sources of information.

Speaker 0

I dare say the local models will end up doing well for these smaller-scope tasks, and maybe you can put them in a multi-agent-ish pipeline or string these things together.

Speaker 0

We've seen efforts in the R community to bring these multiple workflows together and let you pick the right model for the job.

Speaker 0

And that's where I think vitals can help you a lot.

Speaker 0

Maybe you know you have a task, but you're not sure which direction to go for that task.

Speaker 0

You can use vitals to benchmark that, and then maybe move on to a more complicated model for another task.

Speaker 0

But you get the power to objectively measure this in a world where it seems like everything in this landscape is changing every week.

Speaker 0

There are certain things you can control, and one of those is the approach that you take to benchmarking these things.

Speaker 0

As always, it's great to see the Posit team bring this tooling to us, but also to see Sharon really putting it through its paces so we can learn from her experiences.

Speaker 1

Absolutely.

Speaker 0

And up next in our highlights today: we did talk about leveraging local models in the spirit of keeping things both, you might say, cheap, but also, I like to say, in the open as well.

Speaker 0

They're open source for a reason.

Speaker 0

Guess what?

Speaker 0

When we build packages in the R community, or even in other popular languages like Python, you most likely want the rest of the community to benefit from your package.

Speaker 0

And with that, there is a task that often gets overlooked, especially if you're new to the world of development, but it is a pretty big deal depending on which direction you go, and that is picking a license for your project.

Speaker 0

And there is a lot of choice in this as well.

Speaker 0

It can befuddle a lot of people who are new to this, but in our next highlight we have a great roundup, from the perspective of an R developer in data science, of what you should be looking for and where you can find out more.

Speaker 0

So this blog post comes to us from Pete Nagraj over on the Paired Ends blog run by Stephen Turner. In this post he talks about what you are looking for: if you see something out there that's free or available, it may not always be what it seems.

Speaker 0

What do you typically look for?

Speaker 0

It's probably not the license, honestly; I'm probably looking at whether the package does what I want it to do. But when you work in an enterprise, or you are building software that incorporates open source software, you do indeed need to pay attention to licenses.

Speaker 0

即使作为包的作者，你可能一开始并没有意识到，但你正在开发的项目有很大可能性会成为更大项目的一部分，被社区成员甚至企业所采用。

And even as a package author, you may not know it right away, but there is the high chance that maybe the thing you're building is going to be part of maybe a larger project adopted by both those just in the community and perhaps even those by enterprises.

Speaker 0

当然，R语言本身采用的是GPL许可证，即GNU通用公共许可证。

Of course, the R language itself is under a license, and that is the GPL, the GNU General Public License.

Speaker 0

但如果你希望将一个包发布到CRAN，其要求是必须提供你包的源代码。

But the mandate, if you want to bring a package to be available on CRAN, is that you must, of course, make the source code of your package available.

Speaker 0

话虽如此，这并不意味着所有包都会使用相同的许可证，因为在社区中分享代码时存在多种不同的许可证。

Now with that said, that doesn't always mean that every package will have the same license because there's a lot of different licenses when you share your code out in the community.

Speaker 0

因此，这篇博客文章的第一部分可视化内容分析了所有发布在CRAN上的包，是否存在某种模式，即哪些许可证最为常见？

So the first visual in this blog post looks at all the packages deployed on CRAN: is there kind of a pattern in terms of which licenses are the most frequent?

Speaker 0

对于一些听众来说，这可能并不令人惊讶，但让我感到有趣的是，高达68%的包使用的是某种形式的GPL许可证，这或许是因为R本身采用的就是GPL许可证；但实际上GPL有多种变体，我们可以深入探讨哪种许可证在不同情况下更优或更劣。

This may not come as much of a surprise to some of you listeners, but it was interesting to me that, by a wide margin, about 68% use some form of the GPL license. That may be because R itself is under GPL, but there are actually different variants of GPL, and we could go down many rabbit holes about which one is better or worse for different cases.

Speaker 0

排在第二的是占24%的MIT许可证，它通常被认为更加宽松和自由。

The next one up at 24% is the MIT license, which is often viewed as a little more permissive.

Speaker 0

而在尾部则是一些有趣的许可证，例如Apache、Creative Commons，以及BSD。

And then you'll see at the tail some interesting ones such as the Apache, Creative Commons, and yes BSD.

Speaker 0

向所有BSD的爱好者致意。

Shout out to all the BSD fans out there.

Speaker 0

我知道你们就在那里。

I know you're out there.

Speaker 0

我不确定在数据科学方面有多少，但我知道在我的Linux播客中，我会听一些。

Not sure how much on the data science side of it, but I know on my Linux podcast, I do listen to a few of them.
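
The license tally described above can be sketched in a few lines. This is a minimal illustration, not the analysis from the blog post: the `license_family` helper, the family list, and the sample strings are all assumptions made for the example, and the sample is made up rather than taken from CRAN.

```python
import re
from collections import Counter

# Families are matched at the start of the License field only
# (a simplifying assumption; real License fields can be more varied).
FAMILY = re.compile(r"(LGPL|AGPL|GPL|MIT|Apache|BSD|CC)", re.IGNORECASE)

def license_family(license_field):
    """Map a DESCRIPTION-style License string to a coarse family name."""
    m = FAMILY.match(license_field.strip())
    return m.group(1).upper() if m else "OTHER"

# Made-up sample for illustration; these are not the CRAN figures from the post.
sample = [
    "GPL-3", "GPL (>= 2)", "MIT + file LICENSE", "Apache License (== 2.0)",
    "BSD_3_clause + file LICENSE", "LGPL-3", "CC0", "Unlimited",
]

counts = Counter(license_family(s) for s in sample)
for family, n in counts.most_common():
    print(family, n)
```

Matching only at the start of the string is deliberate: a plain substring test would, for example, find "MIT" inside "Unlimited".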

Speaker 0

关于许可证，另一件需要注意的是它们是否允许商业使用，或者你可能称之为 copyleft，但并非所有许可证都允许。

And the other thing to pay attention to about licenses is whether they permit commercial use, or what you might call copyleft, and not all of them do that.

Speaker 0

如果你想要使用 GPL 许可证，而你打算基于它开发商业产品，那么你很可能必须将你的产品也以相同的 GPL 许可证发布。

If you want, for instance, a GPL license, if you're going to make commercial thing out of that, you must license your product as the same GPL most likely.

Speaker 0

别太当真我说的这些，但我很确定情况就是这样。

Don't quote me all on that, but I'm pretty sure that's the case.

Speaker 0

而其他一些许可证对商业使用和保持相同许可证的限制较少，MIT 通常就被视为如此。

Whereas other ones are not as restrictive about commercial use and staying under the same license; MIT is typically cited as that.

Speaker 0

你知道，最后我们会在这里提供一些资源，这可能是一个有点困难的选择，但有一些方法可以帮助你做出决定。

And you know, in the end, we'll get to resources at the end here, it can be a somewhat difficult choice, but there are ways to help you make that point.

Speaker 0

我提到了企业视角。

I mentioned the enterprise perspective.

Speaker 0

当许可证发生变化时，不同供应商可能会引发一些问题。

There are things that can happen with different vendors that can cause havoc when licenses change.

Speaker 0

所以这篇帖子中提到了其中一个情况。

So one of those is talked about in this post.

Speaker 0

Anaconda 项目引发了一点风波，这可能让不少人感到意外。

A little bit of drama for the Anaconda project and this bit a few people, you know, probably by surprise.

Speaker 0

对于不熟悉的人，Anaconda 是一个 Python 包的软件仓库，使用起来相当方便。今年以及过去几年都曾出现过一些争议，但最近的一些更新表明，如果你的组织员工超过 200 人，就必须购买许可证才能使用 Anaconda。

And Anaconda for those who aren't familiar, this is kind of a software repository for Python packages and there are some convenience involved with it and there was a little bit of a stink that happened this year as well as previous years, but there has been some updates that basically say if you have an organization of 200 employees or more, you need to purchase a license to actually use Anaconda.

Speaker 0

这次的情况并不像过去那样对所有人都一视同仁。

It wasn't quite the same for everybody like it was in the past.

Speaker 0

这引发了社区的强烈反弹，以至于在 2025 年 7 月，官方更新了一项政策，规定某些非营利组织和学术机构可以免于这一商业许可要求。

This caused a lot of blowback in the community, such that in 2025 there was a bit of an update in July that said that certain nonprofit organizations and academic institutions may be exempt from that commercial requirement.

Speaker 0

但到那时，距离最初设定 200 人门槛的变更已经过去了五年，损害可能已经造成。

But by that point, that was five years after this initial change with that 200 limit, the damage may have been done.

Speaker 0

因此，当这些许可证从更宽松转向更严格时，往往会引发巨大混乱。Anaconda 在这次转变中的未来如何，我们拭目以待，但当时确实引起了广泛关注。

So a lot of times when these licenses change, especially when they go from a more permissive to a non permissive, this can cause a lot of havoc and we'll see what the future holds for Anaconda in this shift, but it did get a lot of attention from a lot of people back then.

Speaker 0

这让我们更清楚地看到目前存在的选择，但也许已经到了这样的地步：当你在开发一个包时，会开始纠结：我到底该选哪条路？

So this is illuminating in terms of what choice is out there, but it may be getting to the point of you may be developing a package, you're like, which way do I go here?

Speaker 0

当然，如果你正在基于其他包构建依赖，你可能需要查看这些包的许可证，确认是否存在使用限制，以免措手不及。

Definitely, you may want to look at things like if you are building upon other packages as your dependencies, you may want to look at the licenses they have, see if there's any restrictions if you use one of those packages as a dependency and making sure you're not caught by surprise there.

Speaker 0

但这里末尾也提供了一些很棒的资源供你参考。

But there's also some great resources you can look at that are linked at the end here.

Speaker 0

一份不错的入门指南叫做《开发者开源许可证指南》。

A nice primer is called the Dev's Guide to Open Source Licenses.

Speaker 0

GitHub 也有一个关于许可证的 README 项目，可能也很有帮助。

GitHub also has a README project that they talk about licenses, that might be a good one.

Speaker 0

GitHub 还提供了一个网站，可以帮助你通过不同的决策树来选择合适的许可证。

GitHub also has a mechanism for, I should say, a website that lets you walk through the different decision trees that you can do for choosing a license.

Speaker 0

如果你想深入了解 GNU GPL 许可证的细节，自由软件基金会（FSF）维护了一份常见问题解答。

If you want to get into the GNU GPL license, nuts and bolts, there's an FAQ maintained by the FSF, the Free Software Foundation.

Speaker 0

最后但同样重要的是，还有一个名为《选个许可证吧，随便哪个》的有趣资源，它对许可证能带来的影响持较为尖锐的看法，但或许能帮你获得更全面的视角，了解在选择过程中可能遇到的一些陷阱或雷区。

And then last but not least, there is an interesting one called Pick a License, Any License, and it honestly is a bit of a harsher take on what licenses can do for your choice, but it may be good to get a well rounded perspective on some of the pitfalls or some gotchas you might experience as you're navigating through this.

Speaker 0

至于我本人，是的，我确实会留意这些事情。

Now as for me, yeah, I do pay some attention to it.

Speaker 0

说实话，根据我在社区中听到的建议，我通常默认选择MIT许可证。

I'll be honest from recommendations I've heard in the community, I kind of default to MIT.

Speaker 0

它看起来确实很管用。

It just seems to work.

Speaker 0

无论是我的个人项目，还是我公司内部启动的项目，我都未曾遇到任何问题。

I haven't had any trouble with that, both for my personal projects, as well as projects that started at my company.

Speaker 0

之后，我们还通过了开源披露流程，将这些项目向更广泛的社区开放。

And then we went through an open source disclosure process to bring those out to the broader community.

Speaker 0

MIT是我们的组织批准用于开源使用的六七种许可证之一。

MIT is one of the like six or seven licenses my organization approves for open source use.

Speaker 0

因此，使用MIT许可证我从未遇到过任何麻烦，但这仍然是一个重要的选择，尤其是当你希望你的包能融入更大的生态系统，或者未来可能被用于商业用途时。

So I haven't had anything bad happen to me with using MIT, but again, it is an important choice, especially if you expect your package to be part of either larger ecosystems, maybe it might be used for commercial use down the road.

Speaker 0

这是一篇很棒的博客文章，能帮助你了解R语言领域的情况，以及在为新包选择许可证时可能需要考虑的一些因素。

It's a great blog post to kind of get you grounded on what's happening out there in the world of R as well as some considerations that you might have in mind as you're beginning your journey of picking that license for your new package.

Speaker 1

是的，这很有趣。

Yeah, it's interesting.

Speaker 1

这似乎是个热门话题。

This seems like a hot topic.

Speaker 1

我知道乔西亚·佩里最近发表了一篇关于几乎完全相同主题的博客文章。

I know Josiah Parry had a blog post recently on almost the same exact topic.

Speaker 1

我还听了查理·马什的分享，他在Python这边创建了uv和Ruff。

I was also listening to Charlie Marsh, who created uv and Ruff on the Python side.

Speaker 1

他最近上了《播客播客》节目，谈到了UV自带Python这一点。

He was on Podcasts Podcast recently, and he was talking about the fact that UV ships Python itself.

Speaker 1

我认为uv不仅仅管理你的Python包版本，还完全管理了你项目所使用的Python版本，这跟R这边的renv不一样。

And I think uv actually goes beyond versioning your Python packages and fully manages your project's version of Python as well, which is, you know, different from renv on the R side.

Speaker 1

我认为这在Positron周围是个挺重要的事情。

And I think this was kind of a big thing around Positron.

Speaker 1

对吧？

Right?

Speaker 1

他们对Positron的授权方式跟RStudio有所不同。

They licensed Positron sort of differently than RStudio.

Speaker 1

我认为这可能是因为过去有第三方把RStudio几乎原封不动地拿去转售。

And I think that's because maybe in the past with RStudio, there were third parties that were essentially reselling RStudio right out of the box.

Speaker 1

所以我完全能理解这一点。

So I can fully understand that.

Speaker 1

这确实是个棘手的问题。

And it's it's a tricky thing.

Speaker 1

我认为在发布自己的包之前，大家一定要花时间仔细考虑一下这件事。

And I think it's something that folks should definitely take time to look at before they publish their package.

Speaker 1

我知道我自己就犯过这种错误。

I know I'm guilty of it.

Speaker 1

当你在开发一个 R 包或 Python 包，终于完成时。

You know, you're working on an R package or a Python package, and you finally finish it.

Speaker 1

你感觉它状态很好，想把它分享给全世界。

You feel like it's in a great state, you wanna share it with the rest of the world.

Speaker 1

你可能会匆匆选择 MIT 许可证，或者为了尽快发布而选择最简单的方式。

You know, just kind of quickly jumping to that MIT license or, you know, maybe the easy thing to do just to get it out the door.

Speaker 1

但这实际上可能是你整体产品的重要组成部分，取决于你开发的软件是否与你所工作的组织有关，或者你希望谁使用它、如何使用它，诸如此类的问题。

But it can actually be an important part of your entire offering, you know, depending on whether that piece of software you're developing has ties to an organization that you work for, or who you want to use it, how you want it to be used, things like that.

Speaker 1

所以我建议你务必花足够的时间和精力关注这些方面，而不是轻易跳过它们。

So I guess make sure that you are, you know, paying enough time and attention to those types of things and not just sort of skipping over them.

Speaker 1

而且我认为，最好的工具之一大概就是那种能给出许可证建议的工具。

And I think probably one of the best tools is one that recommends a license.

Speaker 1

它是GitHub出品的，有点像访谈式的问答风格。

It's GitHub's, sort of interview style.

Speaker 1

也就是那个“选择一个开源许可证”（Choose an Open Source License）网站，它或许能帮助你理清所有那些不同的决策点，以及你对希望别人如何使用你的包的种种考虑。

You know, the Choose an Open Source License site, which can probably help you navigate all those different decision points and things that you may have in mind for how you want your package to be used.

Speaker 1

这是一篇很棒的博客文章。

So awesome blog post.

Speaker 1

我觉得这是一个被讨论得不够充分的话题。

You know, I think it's a topic that doesn't get talked about enough.

Speaker 1

正如我们所见，如果忽视了这一点，没有认真思考，确实会引发大量问题。

And as we've seen, if it does get overlooked and you're not thinking through it, it can certainly create a whole lot of issues.

Speaker 0

是的

Yep.

Speaker 0

再次向那些在企业一线奋斗的人致敬。

And again, shout out to those in the enterprise trenches on this.

Speaker 0

如果你的处境和我一样，这一点也可能非常重要：我去年发布的shinystate包，最初就源于我的日常工作，目的是增强Shiny的可书签状态功能。

This can also be very important if you're in my situation with the shinystate package I released last year; that certainly started with my actual day-to-day job and trying to enhance Shiny's bookmarkable state.

Speaker 0

这并不是像我做这个小小的播客那样，是业余时间的副业，它实际上是我日常工作中的一个重要需求。

It wasn't like something I was doing on the side, like I do this humble little podcast here, it actually had a significant need for my day job.

Speaker 0

因此，我逐渐发现，这个东西最初是专有的，比如最终成为包代码的部分，但最初只是我日常的R脚本。

And with that, I got to a point where this was a proprietary thing, like what ended up being the package code, but just starting in my typical R scripts.

Speaker 0

后来，我对自己说：‘等等，这个东西我应该分享给社区。’

And then eventually when I was saying to myself, you know what, this is something that I need to share with the community.

Speaker 0

正是在那时，我得到了管理层的支持。

That's where I've got support from my management.

Speaker 0

但随后，我们公司有一套流程来审查：如果你打算公开发布，我们会进行一系列审核；正如我所说，我们有一份获准的许可证清单，我确实在发布前仔细核对了所有细节。

But then I have, we have a process at our company that looks at, okay, if you're going to disclose this, we go through kind of a check process, but then as I said, there's a list of licenses that we have on our approved list and yeah, I definitely dotted my I's and crossed my T's on that before I release that out there.

Speaker 0

所以，如果你只是在做个人爱好项目，通常在选择许可证时会有更大的自由度。

So if you're just doing something as your hobby project, typically you get a little more leeway in which license you use.

Speaker 0

但当你在企业环境中想要开源某些东西时，就必须关注这一点，希望你能获得治理部门或法务部门的支持，他们能为你指明正确的方向。

But when you're in the enterprise, you want to open source something, you do have to pay attention to that and hopefully you have access to either a governance type of department, a legal department that can point you in the right direction.

Speaker 0

我知道一些组织，尤其是生命科学领域的，至今还没有为这个问题做好充分准备，但情况正在改变：以我带有偏见的眼光来看，随着pharmaverse的蓬勃发展以及R在生命科学中的更多应用，开源属性已经成为一个关键议题。

I do know some organizations, especially those in life sciences, just haven't been as equipped for this issue yet, but that's changing now with, you know, my biased view on how the pharmaverse is taking off and R being more involved in life sciences; the open source nature of it has become a critical issue.

Speaker 0

但在某些行业，这还是个新事物。

But some industries, this is a new thing.

Speaker 0

因此，一定要查看这些外部资源——这篇博文链接了它们——以及你所在组织的内部资源，具体取决于你所在的机构。

So definitely look at those resources you have both out here that this blog post links to, as well as your internal resources depending on which organization you're at.

Speaker 0

今天我们将通过一段关于版本控制的旅程来总结我们的重点，我在这档节目中多次说过：如果你不使用版本控制，将来一定会后悔。

And we're gonna round out our highlights today by having a journey to the world of version control, which I've always said many times on this show, you will regret it if you do not use it.

Speaker 0

天哪，就在刚才几个小时，当Claude Opus把测试脚本搞得一团糟时，我庆幸自己用了版本控制，否则就得从头再来。

And boy, oh, boy, was I glad I used it literally the last few hours when Claude Opus wreaked havoc on a testing script, and I had to, nope, go back to square one.

Speaker 0

不过，这并不是这篇博文的重点。

Anyway, that's not the point of this post here.

Speaker 1

经历过。

Been there.

Speaker 0

你经历过。

You've been there.

Speaker 0

是的。

Yeah.

Speaker 0

我敢说你经历过不止一次，更别提今年早些时候了。

I'm sure multiple times, no less, and don't get me started earlier in the year.

Speaker 0

但在许多情况下，Git 都能救你一命，不过使用 Git 既是一门科学，也是一门艺术，尤其是在涉及实际提交内容的基本原理时。

But Git can save your behind on many of these situations, but there is kind of both, you might say, a science and an art to effective use of Git, especially when you get to the fundamentals of your actual commits themselves.

Speaker 0

这篇有趣的帖子标题很棒：《Git 提交，请标记好你的针脚》。

And this, this interesting post that has a great title, Git commits, please mark your stitches.

Speaker 0

这篇帖子由前 rweekly 编辑 Maelle Salmon 撰写，她从不避讳分享自己在学习 Git 技术细节和应对常见问题方面的经验。

And this has been authored by former rweekly curator, Maelle Salmon, who has not been shy about her experiences in learning both the nuts and bolts of Git as well as practical advice on certain issues you might encounter with Git.

Speaker 0

在这篇文章中，她开篇提到自己不仅在学习软件开发方面的技术，还在学习编织，这非常值得敬佩。

And in this post, she opens by noting that she is learning not just technical things on the software development side; she's learning how to crochet, which, big respect for that.

Speaker 0

我担心自己在尝试使用这张图里展示的针具时会戳到手指，但她对初学钩针时的经历有一些很好的见解，比如实际用不同的标记来标注你编织的线圈，这些标记依据你必须绕线圈的频率而定，而且你还需要遵循不同的图案，比如连续线圈之间需要多少针之类的——我这里说得不够准确，但我相信如果你也走过这条路，你就明白我们在说什么。

I'm afraid I poke my fingers out trying to work with the needles that are shown in this picture here, But she has some great insights into her early days of learning crochet, such as actually marking in these strings that you're putting together literally different markers based on how often you have to kind of fold the loops around and there are different patterns that you have to adhere to in terms of like the number of stitches in between consecutive loops or whatever and again I'm doing this horrible justice here, but I'm sure if you've been down this road, you know what we're talking about.

Speaker 0

那么，这和 Git 有什么关系呢？

Now, what does this have to do with Git?

Speaker 0

好问题，但正是在这里，我完全能体会到她在文章其余部分所表达的意思。

Great question, but this is where I definitely relate to what she says here in the rest of this post.

Speaker 0

你所做的这些提交，如果处理得当，完全可以作为代码库中时间点的标记。

These commits that you make, if you treat them the right way, can very much function as these markers in time to your code base.

Speaker 0

我常常把它看作是一种开发者日记，每个提交都有一个简短的标题，说明那次代码更新的内容；如果你在未来，比如一年后，甚至几周后回头看，只需一眼就能明白那个提交的具体含义。

I often think of it as like a developer diary of sorts that you get a tagline of what that code update was in that commit, and then if you read that in the future, say a year from now or even a couple weeks from now, you will be able to tell at that quick glance what that particular commit was all about.

Speaker 0

如果最糟糕的情况发生——你已经提交了代码，但后来发现出现了回归问题，必须找出这个回归到底发生在哪一步，这种时候就特别有用。

And this can be really handy if kind of the worst case scenario happens, you've done your commits, you find out sometime in the future there's been a regression, and you've got to figure out where that regression actually happened.

Speaker 0

如果你的提交数量很多，而这些提交又没有写得清晰简洁，那排查起来可能会非常痛苦，而解决这个问题有许多不同的方法。

And if you have a lot of commits, that could be quite painful if you didn't write these commits in a clear and concise way, and there are many approaches to this.

Speaker 0

但正如我们之前在节目中提到的，Maëlle开发了一个非常实用的 R 包，每次我说出这个名字，都感觉自己像走在冰面上一样。

But, as we mentioned before on the show, Maëlle created this handy R package, and every time I say the name, I feel like I'm walking on ice here.

Speaker 0

saperlipopette，大概是这么念的，我尽力了。

saperlipopette, something to that effect, I tried.

Speaker 0

这里面包含了一些练习，让你在Git出现异常情况时练习操作。

And this has exercises inside to let you practice working with Git when things aren't going as expected.

Speaker 0

在这种情况下，精心撰写的提交信息就显得尤为重要，因为有些练习可能涉及如何浏览提交历史或日志。

And this is a case where having these thoughtful commit messages come to play because there could be some exercises related to when you are trying to traverse the history or the log of your commit messages.

Speaker 0

我是什么时候删除某个文件的？

When did I delete a certain file?

Speaker 0

我是在哪里删除了那一行代码？

Where did I delete that certain line?

Speaker 0

我是什么时候添加了某一行或某个文件的？

Did I, you know, when did I add a specific line or file?

Speaker 0

我需要查看这个应用或项目在某个版本或时间点的样子，以及是哪个提交引入了我正在处理的这个bug。

I need to see what this app or project looked like at a certain version or time and which commit introduced this bug that I'm wrestling with.

Speaker 0

这些都涉及Git底层的不同操作。

Each of those look at different Git operations under the hood.

Speaker 0

特别是最后一个，你会进入一种叫做 Git Bisect 的功能。

The last one in particular is where you go into something called Git Bisect.

Speaker 0

当你进入这个模式时，你会突然查看提交信息日志，试图确定在哪个时间点需要回溯并检查你的代码库状态。

And when you're in that mode, you're suddenly looking at your log of commit messages to try and figure out at what point do you want to traverse back in time and inspect what your code base looks like.
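
The idea behind git bisect is a binary search over the commit history. The sketch below shows only that idea in plain Python and is not how Git implements it. It assumes the history is linear, the first commit is known to be good, the last is known to be bad, and the bug stays present once introduced. The commit ids and the `is_bad` check are hypothetical.

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first commit where the bug is present.

    commits: list ordered oldest to newest; commits[0] is known good
             and commits[-1] is known bad.
    is_bad:  callable that tests one commit and returns True if the bug shows up.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # bug already present: the culprit is at mid or earlier
        else:
            lo = mid  # still good: the culprit is after mid
    return commits[hi]

history = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8"]  # hypothetical commit ids
bug_from = history.index("e5")  # pretend the regression landed in e5
print(first_bad_commit(history, lambda c: history.index(c) >= bug_from))
```

With eight commits the search needs three checks instead of up to six, and clear commit messages make it easier to pick the known-good and known-bad endpoints in the first place.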

Speaker 0

她在本文中展示的，特别是最后几张图里提到的是，你可以在提交信息开头使用一个所谓的单字关键词。

Well, what she shows in this post here, one of the pictures at the end here, is that you can lead off your commit with what is called kind of a one word keyword at the beginning.

Speaker 0

这种关键词可以表明提交的类型。

It kind of tells about the type of commit.

Speaker 0

这在一种叫做约定式提交的规范中被提及，自从大约一年前起，我就完全采用了这种规范。

This is mentioned in this paradigm called conventional commits, which I have now fully adopted ever since about a year ago.

Speaker 0

我一直在提交时使用这种方式，效果非常好。

I use this all the time in my commits and it is awesome.

Speaker 0

只是需要很多自律才能记得这样做，而你需要在提交信息开头使用诸如 fix（修复）、docs（文档）、feature 或 feat（新增功能）这样的关键词，当你想添加新功能时。

It just takes a lot of discipline to remember to do it, but this is where you lead off that commit with a keyword such as fix, docs for documentation, or feature or feat when you wanna build something new.

Speaker 0

还有其他一些关键词，比如 CI（持续集成相关）、testing（测试），或者 chore（日常事务），这类任务就像一些琐碎的、必须完成的事情，就像家务活一样，正如我孩子每天抱怨要做家务一样。

There could be other ones like CI if it's related to continuous integration, could be testing, you know, or one called chore, where it's kind of like this mundane thing you just have to do, just like your household chores, just like my kids whine every day about doing their household chores.
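
A commit header in this style is easy to check mechanically. The following is a minimal sketch assuming the `type(scope): description` header shape from the Conventional Commits convention; the `parse_header` helper and the exact set of allowed types are illustrative choices, not part of any official tool.

```python
import re

# Commit types mentioned in the episode, plus "test"; a team can extend this set.
KNOWN_TYPES = {"fix", "docs", "feat", "ci", "test", "chore"}

# Header shape: type, optional (scope), optional "!" for breaking changes, then ": description".
HEADER = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<desc>.+)$"
)

def parse_header(line):
    """Return (type, scope, description), or None if the header does not match."""
    m = HEADER.match(line.strip())
    if m is None or m.group("type") not in KNOWN_TYPES:
        return None
    return m.group("type"), m.group("scope"), m.group("desc")

print(parse_header("fix(bookmarks): restore state after reload"))
print(parse_header("updated stuff"))
```

A check like this could run in a commit hook, so that a header without a recognized keyword is flagged before it reaches the shared history.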

Speaker 0

我们经常也得对提交做这样的事，我该说什么呢？

We often have to do that with commits too, what am I to say?

Speaker 0

因此，配合一条有信息量的提交信息，将有助于更轻松地找到需要回溯的时间点，从而定位bug的引入位置。

So having that along with an informative commit message will make finding that point in time when you need to find just where that bug was introduced, hopefully a lot easier.

Speaker 0

正如我所说，她的 R 包 saperlipopette 已经更新了一些练习，帮助大家熟悉这些不同的任务。

So as I said, her package, saperlipopette, has been updated with some exercises to look at these various tasks.

Speaker 0

如果你希望在R的友好环境中、在一个非生产项目里学习，我建议你去试试看，因为通过这种方式并有意识地撰写提交信息，这里不会发生任何坏事。

I invite you to check it out if you want a way to learn from the friendly confines of R in a non production project, because the learnings you have through this and being intentional, I should say, on your commit messages, there is nothing bad that will happen here.

Speaker 0

这需要一些自律，但未来的你会感谢现在的你，当你越来越坚持这些原则时。

It's going to take some discipline, but future you will thank you later the more you adhere to a lot of these principles.

Speaker 0

这是Eric的看法。

That's Eric's take anyway.

Speaker 0

Mike，我这么说完全错了吗？还是你也喜欢Maëlle在这里做的这些内容？

Mike, am I completely off base, or do you like what Maëlle's cooking up here?

Speaker 1

不，我超爱Maëlle在这类博客文章里总是捣鼓出来的东西。

No, I love what Maëlle's always cooking up in these types of blog posts.

Speaker 1

而且，是的，Git 在刚开始时确实有点棘手，就像我以前在这档播客中多次提到的那样，它既难教也难学。

And yeah, Git can be a tricky one at first; as I think I've said many, many times on this podcast before, it can be a tricky language to teach and a tricky language to learn.

Speaker 1

但归根结底，它能带来一种绝对无价的安心感。

But it does provide this peace of mind at the end of the day that is absolutely invaluable.

Speaker 1

当我招聘时，Git 是我最看重的技能。

And when I go to hire, Git is the most important skill that I hire for.

Speaker 1

在 R、Python、SQL 之前，Git 是必不可少的。

Before R, before Python, SQL, Git is imperative.

Speaker 1

如果没有 Git，我们就无法完成日常的工作、协作，也无法确保不丢失我们的成果，更无法谨慎而安全地引入新功能之类的东西。

We couldn't do our day to day work and collaborate and make sure that, you know, we don't lose our work and carefully and safely introduce, you know, new features and things like that without Git.

Speaker 1

所以，感谢 Git 本身，也感谢 Maëlle 创造了这个非常棒的类比：把钩针编织与 Git 联系起来，让你明白编织时的记号扣能帮你防止迷失位置，并帮助你计数复杂的图案。

So thank you to Git itself, and thank you to Maëlle for making this really cool analogy between, you know, crocheting and Git itself, and showing that these stitch markers, when you crochet, prevent you from losing your place and help you count complex patterns.

Speaker 1

这听起来确实很像 Git 提交，它们就像你工作主线上的一个个标记点。

And that certainly sounds a whole lot like Git commits that act as markers along your strand of work, if you will.

Speaker 1

这带来了许多心理上的益处。

A lot of psychological benefits.

Speaker 1

你知道，频繁地做提交可能会让人觉得有点烦人。

You know, it can feel like doing these commits early and often can be kind of annoying.

Speaker 1

但当你需要回溯时，它简直就是救星。

But, when you have to go back, it is an absolute lifesaver.

Speaker 1

我首先要承认，我并没有充分利用那些非常酷的各类 Git 命令。

And I'll be the first to admit that I don't take advantage of all of the kind of pretty cool different Git commands that are out there.

Speaker 1

我通常只用 git status、git pull、git push、git switch branch，除非需要调试问题，比如出了严重错误必须修复时，我才会深入那些更复杂的操作。

I am git status, git pull, git push, git switch to a branch, and not a whole lot more, unless I need to debug something and, you know, something has gone horribly wrong that we need to fix; then I can dive into the trickier things.

Speaker 1

但令人惊讶的是，Git 并不像是一个拥有大量炫酷功能的语言，但它其实确实有。

But there are shockingly, you know, Git doesn't strike me as a language that, has a lot of really awesome bells and whistles, but it kinda does.

Speaker 1

你知道，像 git show、git log、git blame 和 bisect 这些命令，正如我们发现的那样，都是非常强大且实用的工具。

You know, some of these commands here, like git show, git log, git blame, and git bisect, as we've come to find out, are really powerful and really useful tools.

Speaker 1

我非常感谢 Maëlle 提到了 Git 中一些被忽视、未被充分利用的功能。

And I really appreciate the fact that Maëlle highlights some of these, you know, underused, underserved functions from Git itself.

Speaker 1

我真的非常喜欢这篇博客文章。

So really love this blog post.

Speaker 1

Maëlle 还提到，她将在三月下旬举办一场名为《哎呀，Git：如何从常见错误中恢复》的工作坊，我相信你们能从中学到更多关于这些内容的知识。

Maëlle also mentions that she's hosting a workshop later in March, titled Oops, Git! How to Recover from Common Mistakes, where I'm sure you can learn even more about this stuff.

Speaker 0

哦，这太棒了。

Oh, that's awesome.

Speaker 0

是的。

Yeah.

Speaker 0

我一定会关注这个活动。

I'll definitely be looking out for that.

Speaker 0

看起来这是‘乌克兰工作坊系列’的一部分。

It looks like that's part of the, yeah, workshops for Ukraine series.

Speaker 0

在这一领域已经分享了大量优质内容，预计将在3月19日举行。

There's been a lot of great content that's been shared in that space coming up looks like March 19.

Speaker 0

如果你们中有任何人想从一个非常权威的来源学习这些概念，我会在本集的节目说明中附上链接。

So I will put a link to that in the episode show notes if any of you are interested in learning from a very, very authoritative source on these concepts.

Speaker 0

我还要说，除了在命令行层面练习Git之外，我不讳言，这种情况下，拥有一个直观的图形界面确实非常有帮助。

I do say that along with, you know, practicing at the command line level with Git, I'm not gonna lie, this is one of those cases where having an intuitive GUI with it is extremely helpful.

Speaker 0

市面上有很多不同的工具。

There are many different ones out there.

Speaker 0

我承认我日常经常使用的一个叫 GitKraken，它是跨平台的。

I admit there is one I use a lot day to day called GitKraken, and it's cross-platform.

Speaker 0

它能很好地可视化我的提交历史，以及暂存区中哪些内容已更改、哪些未更改。

It does a great job of visualizing both the history of my commits and my staging area: what's been changed and what's not been changed.

Speaker 0

我经常遇到这种情况：写完一个提交并推送后，才意识到天啊，我忘了包含那一个额外的更改。

Often what happens to me is that I'll write a commit, I'll put it in, and then I realize, oopsies, I forgot to include that one additional change.

Speaker 0

这时候就可以用 git commit --amend 命令来修改提交的标题或内容。

That's where there's a git commit --amend command that'll let you update the title and/or what you include in that commit.

Speaker 0

像 GitKraken 这样的工具和其他一些工具，能让你通过点选来判断这是要修改提交，还是要新建一个提交。

And that's where a tool like GitKraken, and some other ones out there, will let you point and click: hey, is this an amend or is this a fresh one?

Speaker 0

这个功能多次救了我，因为有时候我会进行大量文档撰写或重写，添加很多 roxygen 标签，然后刷新帮助文档时才发现，糟了，我漏掉了那一个。

That one has saved my behind more than once because there are times when I do an extensive like doc write, rewrite, and I'm putting all these roxygen tags, and then I refresh the man pages of those, and I realize, oh, rats, I forgot to include that one.

Speaker 0

这时候我就可以用这个功能来修正。

Then I can do that.

Speaker 0

所以，不要害羞，去找任何能帮助你使用 Git 的工具。

So don't be shy about finding whatever tool in Git helps you on that journey.

Speaker 0

没人会对此评头论足。

Nobody's judging on that.

Speaker 0

正如你所说，迈克，当我分享同样的经历时，我在和同事合作项目时，更关注的是有效使用 Git 的基本原理。

As you said, Mike, and I share the same view: when I work with colleagues on projects, it's more about the fundamentals of effective use of Git.

Speaker 0

只要你掌握了这些基本功，无论使用哪种语言，你都能处于有利位置，并根据需要选择合适的工具。

If you have a good handle on that, no matter which language you're working with, you're gonna be in good position and use all the tools for the job that you see fit.

Speaker 0

我完全同意，甚至要加一千个赞。

So I'm plus 1,000 on that one.

Speaker 0

在 RWeekly 里，你可以为很多东西加一千个赞。

And you can plus a thousand a lot of things in rweekly itself.

Speaker 0

R 社区不断推出各种新包，还有 R 在组织和领域中的优秀应用，甚至还有很多精彩内容。

There's the full gamut of new packages that are coming out in the R community, as well as great uses of R in organizations and other areas, and there's even a great selection of this.

Speaker 0

我的生命科学领域的同行们肯定会密切关注这些内容，以及其他更多内容。

Definitely my peers in life sciences will be paying attention to a lot along with much more.

Speaker 0

还有一件事，我还没来得及读，但正如我曾经关注的一位著名摔跤手罗迪·派珀常说的：每当你以为自己有了答案，我就把问题改了。

And one thing I didn't have time to read this yet, but it definitely got my attention just as a famous wrestler I used to follow, Roddy Piper, would say, every time you have the answers, I change the questions.

Speaker 0

还有关于大型语言模型的另一个方面，是我之前不了解的，或者说是AI领域的一般性内容，詹姆斯·韦德正在研究这个。

Well, there's yet another part of large language models that I did not know about or AI in general that James Wade is looking into.

Speaker 0

他在R和LLM领域的研究一直处于前沿，那就是递归语言模型的应用。

He's been in the forefront of a lot of this research in R and LLMs and that is the use of recursive language models.

Speaker 0

我得好好琢磨一下这个概念，看看那里到底在发生什么，这只是一个例子，说明你在rweekly的其他内容中能找到什么。

I'm gonna have to wrap my head around that one and see what's going on there, so that's just one example of what you can find in the rest of the R Weekly issue.

Speaker 0

所以这个领域从不缺乏精彩，但我们的项目并非由AI驱动，而是完全由人工精选，我们需要你的帮助。

So never, never a dull moment in this space, but this project, we're not AI driven, it's all human curated and we need your help.

Speaker 0

如果你看到一篇很棒的博客文章或值得推荐的亮点，希望被纳入下一期内容，只需在 rweekly.org 提交一个拉取请求（pull request）即可。

If you see that great blog post or highlight that should be included in the issue, we are a pull request away at rweekly.org.

Speaker 0

点击右上角那个小的Octocat图标，就会直接跳转到GitHub，你可以编辑本周或下周刊的草稿。

Just hit that little Octocat in the upper right corner and you'll be taken to GitHub directly to edit the draft of this week's or the upcoming week's issue.

Speaker 0

当你提交PR时，我们会提供一些很棒的模板指南供你参考。

We have some great, you know, template guidelines when you fill out that PR.

Speaker 0

这非常简单直接，但我们依赖于您的贡献。

It's very easy, very straightforward, but we live off of your contribution.

Speaker 0

所以在这方面别太害羞。

So don't be shy with that side of it.

Speaker 0

也别不好意思联系我们。

And don't be shy about getting in contact with us.

Speaker 0

您可以通过播客播放器中的节目联系页面或节目说明中的节目联系页面找到我们。

You can find us through the episode contact page or the show contact page in the episode show notes right here in your podcast player of choice.

Speaker 0

您也可以通过这些社交媒体与我们联系。

You can also get in touch with us on these social medias.

Speaker 0

我在 BlueSky 上的账号是 @rpodcast.bsky.social。

I am at @rpodcast.bsky.social on BlueSky.

Speaker 0

我是 rpodcasts，哇，我话说不出来了。

I'm rpodcasts wow I can't talk.

Speaker 0

我在 Mastodon 上的账号是 @rpodcast@podcastindex.social；如果你想在 LinkedIn 上找到我，直接搜索我的名字就可以了。迈克，听众们在哪里能找到你？

I am @rpodcast@podcastindex.social on Mastodon, and if you want me on LinkedIn, just search my name and you'll find me there. And Mike, where can the listeners find you?

Speaker 1

你可以在 BlueSky 上找到我，账号是 @mike-thomas.bsky.social；也可以在 LinkedIn 上搜索 Ketchbrook Analytics（K E T C H B R O O K），就能看到我在做什么。

You can find me on BlueSky at @mike-thomas.bsky.social, or on LinkedIn if you search Ketchbrook Analytics, K E T C H B R O O K, you can see what I'm up to.

Speaker 0

很棒，而且我们希望能在我们这边继续开展更多疯狂的冒险，也会像往常一样，随时向大家更新 R Weekly 本身的动态。

Great stuff, and yeah, we're hopefully going to be up to more crazy adventures on our side of things, and we shall keep you updated, as always, on R Weekly itself.

Speaker 0

再次感谢大家收听本期节目，我们很快就会带来下一期的 R Weekly Highlights。

So thank you again for tuning in to this week's episode, and we'll be back with another edition of R Weekly Highlights soon.

Speaker 0

再见，各位。

Bye everybody.