The AI Book

    An in-depth guide to the Meta LLaMa language model and LLaMa 2

    25 July 2023


    As artificial intelligence evolves, the research community’s access to AI generative tools, such as language models, is critical to innovation. However, today’s AI models often live behind proprietary walls, which stifles innovation. Meta’s release of LLaMA 2 is designed to democratize this space, empowering researchers and commercial users worldwide to explore and push the boundaries of what AI can achieve.

    In this article, we will explain the Meta LLaMa model and its latest version, LLaMa 2.

    What is LLaMa?

    In February 2023, Meta announced LLaMA, which stands for Large Language Model Meta Artificial Intelligence. This large language model (LLM) was trained at several model sizes, ranging from 7 billion to 65 billion parameters. LLaMa models vary by parameter count:[1]

    • 7B parameters (trained on 1 trillion tokens)
    • 13B parameters (trained on 1 trillion tokens)
    • 33B parameters (trained on 1.4 trillion tokens)
    • 65B parameters (trained on 1.4 trillion tokens)
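To get a feel for what these parameter counts mean in practice, here is a back-of-the-envelope estimate of the memory needed just to store the weights. The function name and the 2-bytes-per-parameter figure (16-bit floating point) are illustrative assumptions, not details from the article; activations and other runtime state would add more on top.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes),
    assuming 16-bit (2-byte) parameters."""
    return num_params * bytes_per_param / 1e9

# Rough weight-storage footprint for each LLaMa size in fp16:
for size in (7e9, 13e9, 33e9, 65e9):
    print(f"{size / 1e9:.0f}B params -> ~{weight_memory_gb(size):.0f} GB")
```

By this rough estimate, the 7B model needs about 14 GB for weights alone, while the 65B model needs about 130 GB, which is one reason smaller models are easier to run and retrain.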

    Meta AI presents LLaMa as a smaller language model that is easier to retrain and fine-tune. This is a benefit because smaller models require less computing power, making them more accessible to organizations that want to adapt a model to specific applications.

    To learn how to fine-tune LLMs for enterprise purposes, take a look at our guide.

    Unlike many powerful language models, which are typically only available with limited APIs, Meta AI has chosen to make LLaMA’s model weights available to the research AI community under a non-commercial license. Initially, access was provided selectively to academic researchers, government institutions, civil society organizations and individuals associated with academic institutions around the world.

    How does LLaMa work?

    Like other large language models, LLaMA works by taking a sequence of words as input and predicting the next word, generating text iteratively.
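The next-word loop above can be sketched with a toy model. The lookup table below is a made-up stand-in for the real neural network, which scores every token in its vocabulary at each step rather than following a fixed table:

```python
# Hypothetical bigram "model": maps each word to its most likely successor.
# A real LLM replaces this table with a neural network.
next_word = {
    "the": "cat",
    "cat": "sat",
    "sat": "on",
    "on": "the",
}

def generate(prompt: str, steps: int) -> str:
    """Greedy autoregressive generation: predict the next word,
    append it, and repeat."""
    words = prompt.split()
    for _ in range(steps):
        words.append(next_word[words[-1]])
    return " ".join(words)

print(generate("the", 4))  # -> "the cat sat on the"
```

Each generated word is fed back in as part of the input for the next prediction, which is what "iteratively" means here.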

    Training this language model prioritized text from the 20 languages with the most speakers, focusing on those written in the Latin and Cyrillic scripts.

    LLaMa’s training data comes mostly from large public websites and forums, such as:[2]

    • Web pages crawled by CommonCrawl
    • GitHub’s open source repositories
    • Wikipedia in 20 different languages
    • Public domain books from Project Gutenberg
    • LaTeX source code for scientific papers uploaded to ArXiv
    • Questions and answers from Stack Exchange websites

    How does LLaMa compare to other major language models?

    According to the creators of LLaMA, the 13-billion-parameter model outperforms GPT-3 (which has 175 billion parameters) on most natural language processing (NLP) benchmarks.[3] In addition, their largest model effectively competes with higher-end models such as PaLM and Chinchilla.

    Figure 1. LLaMa vs other LLMs on a reasoning task (Source: LLaMa research paper)

    Truthfulness and bias

    • LLaMa performs better than GPT-3 on the truthfulness test used to measure both LLMs. However, as the results show, LLMs still need improvement in terms of accuracy.
    Figure 2. LLaMa vs GPT-3 on the truthfulness test (Source: LLaMa research paper)
    • LLaMa with 65B parameters produces less biased responses than other large LLMs such as GPT-3.
    Figure 3. LLaMa vs GPT-3 and OPT response bias (Source: LLaMa research paper)

    What is LLaMa 2?

    On July 18, 2023, Meta and Microsoft jointly announced support for the LLaMa 2 family of large language models on the Azure and Windows platforms.[4] Both companies are committed to democratizing AI and making AI models widely available, and Meta has taken an open stance with LLaMa 2: for the first time, the model is open for both research and commercial use.

    LLaMa 2 is designed to help developers and organizations build generative AI tools and experiences. It gives developers the freedom to choose the kinds of models they want to build with, supporting both open and frontier models.

    Who can use LLaMa 2?

    • Users of Microsoft’s Azure platform can fine-tune and deploy LLaMa 2 models with 7B, 13B, and 70B parameters.
    • It is also available through Amazon Web Services, Hugging Face, and other providers.[5]
    • LLaMa 2 is also optimized to run efficiently in a local Windows environment. Windows developers can use LLaMa 2 through the ONNX Runtime with the DirectML execution provider.
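As a rough sketch of what execution-provider selection looks like in that Windows setup, the helper below is hypothetical and not from the article; in a real script, the list of available providers would come from ONNX Runtime’s `onnxruntime.get_available_providers()` call.

```python
# Hypothetical helper: prefer the DirectML execution provider (GPU
# acceleration on Windows) when ONNX Runtime reports it is available,
# otherwise fall back to the CPU provider.
def choose_provider(available: list[str]) -> str:
    return (
        "DmlExecutionProvider"
        if "DmlExecutionProvider" in available
        else "CPUExecutionProvider"
    )

# Example with an assumed provider list:
print(choose_provider(["DmlExecutionProvider", "CPUExecutionProvider"]))
```

The chosen provider name would then be passed when constructing an ONNX Runtime inference session, so the same script can run on machines with or without a DirectML-capable GPU.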

    If you have questions or need help finding vendors, don’t hesitate to contact us.

    1. “Introducing LLaMA: A Foundational, 65-Billion-Parameter Language Model.” Meta AI, 24 Feb. 2023, https://ai.facebook.com/blog/large-language-model-llama-meta-ai/. Accessed 24 July 2023.
    2. “LLaMA.” Wikipedia, https://en.wikipedia.org/wiki/LLaMA. Accessed 24 July 2023.
    3. “LLaMA: Open and Efficient Foundation Language Models.” arXiv, 13 June 2023, https://arxiv.org/pdf/2302.13971.pdf. Accessed 24 July 2023.
    4. “Microsoft and Meta Expand Their AI Partnership with Llama 2 on Azure and Windows.” The Official Microsoft Blog, 18 July 2023, https://blogs.microsoft.com/blog/2023/07/18/microsoft-and-meta-expand-their-ai-partnership-with-llama-2-on-azure-and-windows/. Accessed 24 July 2023.
    5. “Meta and Microsoft Introduce the Next Generation of Llama.” Meta AI, 18 July 2023, https://ai.meta.com/blog/llama-2/. Accessed 24 July 2023.


    Cem has been the principal analyst at AIMultiple since 2017. AIMultiple informs hundreds of thousands of businesses (per Similarweb data) every month, including 55% of the Fortune 500.

    Cem’s work has been cited by leading global publications including Business Insider, Forbes, and the Washington Post, by global firms such as Deloitte and HPE, by NGOs such as the World Economic Forum, and by supranational organizations such as the European Commission. You can see more reputable companies and resources that referenced AIMultiple.

    Throughout his career, Cem has worked as a tech consultant, tech buyer, and tech entrepreneur. For more than ten years, he advised enterprises on technology solutions at McKinsey & Company and Altman Solon. He also published a McKinsey report on digitalization.

    He led technology strategy and acquisitions for a telecom company, reporting to its CEO. He also led the commercial growth of the deep-tech company Hypatos, which reached 7-figure annual recurring revenue and a 9-figure valuation from zero within 2 years. Cem’s work at Hypatos has been covered by leading technology publications like TechCrunch and Business Insider.

    Cem regularly speaks at international technology conferences. He graduated from Bogazici University with a degree in computer engineering and holds an MBA from Columbia Business School.
