Llama 2: our thoughts on the ground-breaking new large language model.
Large language models (LLMs) like ChatGPT have brought generative AI into the mainstream, making it possible for anyone to use AI to generate content. But as Aiimi CEO Steve Salvin explained in our recent blog on ChatGPT for business, we’ve identified significant risks when it comes to using cloud-based generative AI in business settings. Firstly, you’re sending corporate data to a centralised cloud, potentially leaving you open to loss of IP. The second (and in some cases even greater) risk is what the service can learn about you from your interactions with it. For example, a security agency analyst might read lots of publicly available information online – while the information itself isn’t proprietary, the service could start to understand the analyst’s intent (and therefore the security agency’s objectives) through their dialogue. For those looking to use generative AI for business, we think these risks far outweigh the benefits of trying something new.
Enter Llama 2
In July, Meta (the company behind Facebook) launched Llama 2, a game-changing LLM with a difference. With performance roughly on a par with GPT-3.5, it’s not quite as powerful as ChatGPT, but it marks an enormous step forward from previous open-source LLMs for two reasons. Not only did Meta release Llama 2 publicly, but they also took the unprecedented step of giving it a very permissive commercial license, at no cost and with few restrictions for most commercial uses. To have a model that’s this powerful with such open access is quite a profound and liberating thing. And here’s the clincher: because it’s downloadable, you can run Llama 2 privately and securely offline.
A previous version of the Meta model leaked earlier this year, but its license was limited to research and academia only, and it was a much earlier-stage model – more akin to GPT-1 or GPT-2. Since then, we’ve seen other open-source models like GPT-2, but nothing close to ChatGPT. Until now! There are three different strengths of Llama 2: one with 7bn parameters (a medium-sized model by today’s standards), one with 13bn parameters, and one with 70bn parameters.
Llama 2 for business: Our observations so far
When Llama 2 launched last month, we were among those racing to download it and start exploring its capabilities. We quickly integrated it with our Aiimi Insight Engine and are already seeing a striking improvement in the quality of responses compared to other open-source models. We’re now running the 7bn-parameter version and using it in an open-book scenario, meaning we give it a set of documents and ask it questions about those documents; we might ask it to summarise sentiment or tone, or to improve a document with more emphasis on a certain aspect of the content. Here are our observations from just a few weeks of using Llama 2:
Are Llama 2 outputs high quality?
We’re finding the output from Llama 2 to be extremely high quality, with meaningful and correct content. We’ve also found Llama 2 to have good reasoning ability, meaning it’s very difficult to fool it with trick questions. Obviously running the 70bn-parameter model takes more compute power, but the result is a very sophisticated, clever model.
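To give a concrete feel for the open-book setup we described above, here’s a minimal sketch of how a set of documents and a question can be packed into a single prompt. The function name and prompt wording are our own illustration, not a format prescribed by Llama 2 – any instruction-following LLM can be driven this way.

```python
# Minimal open-book prompt builder: the documents go directly into the
# prompt, and the model is asked to answer only from them.
def build_open_book_prompt(documents: list[str], question: str) -> str:
    context = "\n\n".join(
        f"[Document {i + 1}]\n{doc}" for i, doc in enumerate(documents)
    )
    return (
        "Answer the question using only the documents below. "
        "If the answer is not in the documents, say so.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

docs = [
    "The pump is serviced every six months.",
    "Valve V-3 feeds the pump.",
]
prompt = build_open_book_prompt(docs, "How often is the pump serviced?")
print(prompt)
```

Because the model is constrained to the supplied context, this pattern keeps answers grounded in your own content rather than whatever the model absorbed during training.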
How consistent are Llama 2's generated answers?
Llama 2 feels much more like ChatGPT in terms of the way it derives its answers and gives you insight. But the answers provided by Llama 2 seem to be much more consistent and predictable compared to ChatGPT. The latter overtly tries to be creative, so when you repeat a question, you’ll get a different answer. Whereas if I give Llama 2 a page of text and ask it to summarise it, it’ll pretty much give me the same answer every time. This is important when considering how to adopt generative AI for business, where consistency is key – in a customer service setting, for example, a customer should get the same response when asking the same question. That’s one of the appeals of extractive technology, which provides the source of the answer so there’s no ambiguity. You don’t get that with generative AI, so the fact that Llama 2’s answer is pretty much the same every time is compelling.
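This consistency comes down to how the next token is chosen at generation time. The toy sketch below (our own illustration, not Llama 2’s actual decoder) shows the idea: at temperature 0 the highest-probability token is always picked, so repeated runs give identical output, while at higher temperatures the choice is drawn at random and varies.

```python
import math
import random

def sample_token(logits: dict[str, float], temperature: float) -> str:
    """Pick the next token. temperature == 0 means greedy (deterministic)."""
    if temperature == 0:
        return max(logits, key=logits.get)  # always the same choice
    # Softmax with temperature, then draw a token at random.
    scaled = {t: math.exp(v / temperature) for t, v in logits.items()}
    r = random.uniform(0, sum(scaled.values()))
    for token, weight in scaled.items():
        r -= weight
        if r <= 0:
            return token
    return token

logits = {"yes": 2.0, "no": 1.0, "maybe": 0.5}
greedy = [sample_token(logits, 0) for _ in range(5)]
print(greedy)  # five identical picks
```

Running any LLM with greedy (or low-temperature) decoding is what makes a repeated question return the same answer – a deliberate trade of creativity for predictability.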
Is Llama 2 secure?
When you run Llama 2 in your own business environment, you mitigate the two major risks we identified with ChatGPT. You can literally run Llama 2 from your laptop with the internet switched off, so it’s not able to talk to anything else online. With the model safely inside your world, you’re able to leverage generative AI for business in a secure way. So Llama 2 shows a lot of promise for private use within corporate organisations.
How toxic is Llama 2 compared to other models?
We’ve found it very hard to get Llama 2 to introduce any political bias or toxicity. In fact, in benchmark testing, the percentage of toxic generations shrinks to effectively 0% for Llama 2-Chat of all sizes – the lowest toxicity level among all compared models. Meta has commented that the model is specifically trained to eliminate toxicity. It’s not that other language models are inherently toxic, but that Llama 2 has been fine-tuned to ensure toxic language doesn’t creep in.
An assistant, not a replacement
Based on these initial observations, we’re embracing Llama 2 as an effective tool to make generative AI for business a reality. Why? Because it offers organisations a way to explore and benefit from the technology, without posing risks to IP or exposing your intentions, and it’s cost effective to run securely within your organisation. Here are a couple of examples of how we can use it to deliver value for our customers:
- A security analyst receives feeds from news agencies overnight and arrives to 40 new articles each morning, all related to the topic they’re researching. Using Llama 2, we can summarise all the content overnight – we can ask it to pull out key points, ask what it would suggest investigating further, or have it group the articles according to how trustworthy their source is. LLMs are very good at understanding whether a piece of content is subjective or objective, left or right leaning, positive or negative in sentiment. In this way, the technology becomes a valuable assistant to an intelligence analyst in a geopolitical setting (or anyone else trying to find information about a topic), rather than their replacement.
- In infrastructure engineering, large operational manuals provide information on assets in the field. With Llama 2 and Aiimi Insight Engine, an engineer can use voice activation to ask how to replace a part on a numbered piece of equipment. It’d find the answer, create a piece of text, and read the instructions back to them, without them having to trawl the whole document. Even if the answer is spread across different sections of the manual, the LLM reads the whole thing, pulls together the relevant information, then creates its own cohesive answer.
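The manual example above hinges on a retrieval step before the LLM ever sees the text: the right sections of the manual have to be found and handed over as context. Here’s a rough, self-contained sketch of that step, using plain word overlap for scoring where a real deployment (such as an insight engine) would use semantic search – the section names and manual text are invented for illustration.

```python
# Toy retrieval: split a manual into titled sections, score each section
# by word overlap with the engineer's question, and keep the best matches
# to pass to the LLM as open-book context.
def top_sections(manual: dict[str, str], question: str, k: int = 2) -> list[str]:
    q_words = set(question.lower().split())
    scored = sorted(
        manual.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [title for title, _ in scored[:k]]

manual = {
    "Safety": "Isolate the power supply before any maintenance work.",
    "Replacing the filter": "To replace the filter on unit 7, unscrew the housing.",
    "Storage": "Store spare parts in a dry location.",
}
best = top_sections(manual, "How do I replace the filter on unit 7?")
print(best)
```

Only the retrieved sections then need to fit in the model’s context window, which is what lets this approach scale to manuals far larger than any single prompt.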
We’re already working closely with several of our customers to explore and deploy this technology, from practical guidance on getting the requisite compute power in place to run Llama 2 for business, through to advising more widely on a strategic plan for safely adopting AI technologies in business. It all starts with how you discover and map the information you already have, to get a clear view of your entire data picture. Because while the release of public LLMs has reignited the conversation about the use of AI in business, many organisations still need to get these essential foundations in place before they can get any valuable output. And once your enterprise data is in good shape, there’s also a wealth of simpler, risk-free things you can do with AI too.
Our final thoughts (for now)
Llama 2 isn’t as technically powerful as ChatGPT, but actually, does that matter? ChatGPT is a gigantic model running on a massive platform, with a megaton of compute resources. But as far as an enterprise business is concerned, there are certain characteristics that make Llama 2 a safe, near-term option to explore and exploit the technology with virtually no risk to information security. Because you can run it inside your infrastructure in a completely disconnected way, it’s not subject to the same risks as cloud services, so it overcomes those barriers to corporate adoption. For us, the benefits of being able to run a model securely outweigh the additional power you get with a cloud-based service. And arguably, would an enterprise business get additional value from the extra power anyway? From what we’ve seen, Llama 2 is more than powerful enough for users to start exploring the technology in a safe way.
And the technology is only going to get better. Following the release of Llama 2, we’ll see the pace of innovation in the open-source world accelerate to catch up with closed private models. When we talk about ethical use of AI, one advantage of open-source models is that they’re transparent – you can hide what you’re doing, but not how the model works, so it drives more ethical use in the first instance. For now, we see Llama 2 as a secure stepping-off point to adopt generative AI for business safely, while the tech continues to mature at an astonishing pace. It’s also a prompt to get your data foundations in place and a strategy for AI adoption mapped out, to avoid being left behind or exposed to risk.
Get in touch with our experts to find out more about what generative AI is capable of and get practical advice on how to safely deploy it in your business.