
Meta Warns Its Latest Large Language Model ‘May Not Be Suitable’ for Non-English Use


Slator



Meta is back with a new large language model (LLM): Llama 2, which was released via a research paper published in July 2023. Llama 2 is the successor to Llama 1, which Meta introduced in February 2023.

In a move that Yann LeCun, VP and Chief AI Scientist at Meta, described as “huge” in a July 18 tweet, the tech giant has chosen to make Llama 2 both open source and available for free for research and commercial use. Users will therefore be able to start building on Llama 2.

“This is going to change the landscape of the LLM market,” LeCun stated.

Llama 2, like Llama 1 before it, takes its name from “Large Language Model Meta AI.”

According to Meta, Llama 2 is trained on 40% more data than Llama 1. Its pre-trained models are trained on no less than two trillion tokens, while its fine-tuned models have been trained on more than one million human annotations.

One might therefore presume, with all that training data, that Llama 2 could well have an edge (or at least a use) in machine translation and other multilingual applications.

Apparently, not so. As Meta explained in the research paper, “Most data is in English, meaning that Llama 2 will perform best for English-language use cases.” It also warned, “A training corpus with a majority in English means that the model may not be suitable for use in other languages.”

[Figure: Llama 2's language distribution in pretraining data, with percentages]

According to the paper, the model’s pretraining data is nearly 90% English. Other languages, including German, French, Chinese, Spanish, Dutch, Italian, Japanese, Polish, Portuguese, and others, collectively make up less than 2% of Llama 2’s training data, while the language is “unknown” for more than 8% of training data. (This includes programming code data.)

Llama 2’s lack of language diversity is somewhat surprising given that Meta has focused heavily on the need to improve coverage for low-resource languages (and poured significant R&D efforts into this area) in recent years.

Or perhaps, after its self-proclaimed “breakthrough” in machine translation for low-resource languages in July 2022, Meta’s attention is beginning to shift to new and shinier areas of language research.
