NYT Reading Selection / Should AI Models Be "Open Source"? The Tech World Is Sharply Divided

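There is not yet an agreed-upon definition of what open source AI actually means. Some accuse AI companies of "openwashing": using the term "open source" disingenuously to make themselves look better. (The New York Times)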

Some AI Companies Face a New Accusation: 'Openwashing'

有些AI公司面临新指控 遭控「洗白式开放」

There's a big debate in the tech world over whether artificial intelligence models should be "open source". Elon Musk, who helped found OpenAI in 2015, sued the startup and its CEO, Sam Altman, on claims that the company had diverged from its mission of openness. The Biden administration is investigating the risks and benefits of open source models.

关于人工智慧模型是否应该「开放原始码」,科技界存在着一场大规模争论。2015年协助创立OpenAI的马斯克,对这间新创公司和其执行长奥特曼提起诉讼,声称该公司偏离其开放的使命。拜登政府正调查开源模型带来的风险和益处。

Proponents of open source AI models say they're more equitable and safer for society, while detractors say they are more likely to be abused for malicious intent. One big hiccup in the debate? There's no agreed-upon definition of what open source AI actually means. And some are accusing AI companies of "openwashing": using the "open source" term disingenuously to make themselves look good. (Accusations of openwashing have previously been aimed at coding projects that used the open source label too loosely.)

开源人工智慧模型提倡者表示,它们对社会而言更公平且更安全,而批评者表示,它们更可能被恶意滥用。但这场辩论会不会有个大问题?关于开源人工智慧的实际意义,目前尚未有一致定义。而且有些人指控人工智慧公司「漂开」(洗白式开放),不诚实地使用「开放原始码」一词,让他们的形象看起来好一点。(关于漂开的指控,先前主要针对过于宽松地使用开放原始码标签的程式编码专案)

In a blog post on Open Future, a European think tank supporting open sourcing, Alek Tarkowski wrote, "As the rules get written, one challenge is building sufficient guardrails against corporations’ attempts at 'openwashing'." Last month the Linux Foundation, a nonprofit that supports open-source software projects, cautioned that "this 'openwashing' trend threatens to undermine the very premise of openness — the free sharing of knowledge to enable inspection, replication and collective advancement."

在支持开放原始码的欧洲智库「开启未来」部落格贴文中,塔科沃斯基写道:「随着规范被制定,一个挑战是建立充足的护栏防止企业尝试『洗白式开放』。」支持开源软体专案的非营利组织Linux基金会上月提醒,「这种『洗白式开放』趋势可能破坏开放的前提,即自由分享知识以实现检视、复制和集体进步」。

Organizations that apply the label to their models may be taking very different approaches to openness. For example, OpenAI, the startup that launched the ChatGPT chatbot in 2022, discloses little about its models (despite the company’s name). Meta labels its LLaMA 2 and LLaMA 3 models as open source but puts restrictions on their use. The most open models, run mainly by nonprofits, disclose the source code and underlying training data, and use an open source license that allows for wide reuse. But even with these models, there are obstacles to others being able to replicate them.

将这个标签应用在其模型的组织,可能会采取截然不同的方式开放。例如,在2022年发表了聊天机器人ChatGPT的OpenAI,几乎没有公开其模型(尽管其公司名称)。Meta将其LLaMA 2和LLaMA 3标记为开放原始码,但限制其使用。最开放的模型主要由非营利组织运行,会公布原始码和作为基础的训练数据,并使用允许广泛重复使用的开放原始码授权。但即使有这些模型,其他人复制它们时还是有障碍。

The main reason is that while open source software allows anyone to replicate or modify it, building an AI model requires much more than code. Only a handful of companies can fund the computing power and data curation required. That's why some experts say labeling any AI as "open source" is at best misleading and at worst a marketing tool.

主因是尽管开源软体允许任何复制和修改,但建立人工智慧模型需要的不仅是程式码。只有一些公司能为所需的算力和资料应用提供资金。这就是为何有些专家说,将任何人工智慧贴上「开放原始码」标签,往好的状况看是种误导,往坏的状况看是种行销工具。

Efforts to create a clearer definition for open source AI are underway.

创建开源人工智慧更清楚定义的努力正在进行。

By Sarah Kessler / Translated by 罗方妤