Why is everyone in AI talking about Llamas?

Meta’s Llama model, Llama Index, and more Llama named things for developers

Logan Kilpatrick
Jul 28, 2023

Someone please remind me why we are talking about Llamas?

Edit: This story was updated on July 28th, 2023 to reflect that Llama Index was the first to use this naming before Meta.

Keeping up with the pace of new generative AI research, news, products, and open source projects can be pretty exhausting. Every day there is some new thing demanding the attention of those in the generative AI space.

If you have been following the news closely, you have likely heard the word Llama more than at any other time in your life. And no, it’s not because the new San Francisco trend is to own a Llama; it is because of the AI products named after the animal.

In this post we will explore some of the many Llama related generative AI projects, where the naming comes from, and more.

As always, you are reading my personal blog so you guessed it, these are my personal views. Let’s dive in!

The LLaMA foundation model

Back in February of 2023, Meta (Facebook) released the now famous LLaMA foundation model in response to the excitement spurred by ChatGPT a few months prior. At the time, it was one of the largest models released to date, so the developer community was very excited.

Let’s get it out of the way first: why did Meta decide to name it Llama? Is it because their ML model has a thick coat of fur or an oddly long neck? Sadly, the real reason is a boring acronym: Large Language Model Meta AI.

Taking a small step back, Meta releasing Llama is actually, in my opinion, a very positive thing. I think their commitment to open science is admirable and something I am happy to see. For those who don’t know, I am an advisor at NASA supporting the open science initiative there, which has now been adopted by more than 10 US government agencies, so open science is a topic that in many contexts is deeply aligned with my world view.

Quick interruption: my brother Chandler is working on a project where he creates custom hardcover AI art coffee table books for people based on the theme they want, and it is so fricken cool! Check it out to support him:

So what made Llama so exciting?

Much of the initial excitement about Llama was hampered by the non-commercial license, which prevented companies from building products with it. This was true until the weights (the brain of a deep learning model) were leaked online. Another aspect that got developers excited was the range of model sizes. For many, being able to fine-tune a smaller model is critical for their use case, and a large 65 billion parameter model alone would likely cost too much to use in practice:

We trained LLaMA 65B and LLaMA 33B on 1.4 trillion tokens. Our smallest model, LLaMA 7B, is trained on one trillion tokens.

More broadly speaking, my impression of the excitement around Llama was that developers were happy Meta was taking an open source approach to developing large language models, but the current versions did not help much given the licensing and infrastructure required to run the model on your own.

Enter Llama 2, the next generation of Meta’s open source large language model

Since the initial release, developers have been waiting to see when Meta would release the next version. To many people’s surprise, the 2nd iteration of Llama came on July 18th, 2023 with an unexpected partner: Microsoft.

Llama 2 solves the two main problems that held the original Llama model back:

  • Llama 2 is free for research and commercial use
  • Llama 2 is available in the Azure AI model catalog, enabling developers using Microsoft Azure to build with it

Both of these are wins for the developer ecosystem. I will say that historically Meta has had a dicey track record and is a frequent target of criticism over privacy and other shortcomings. But there are also a lot of really smart people working hard to make these models work, so I wish them all well!

So why are other people using the word Llama?

Edit: This section has been updated to reflect that Llama Index was the first to use the Llama name, here’s a quote from Llama Index founder Jerry Liu:

A few friends and I were brainstorming cute animal prefixes. Llama sounded nice because it had LLM in the name and evoked the image of a cute, friendly animal (later we found out that alpacas were in fact much friendlier than llamas but that’s beside the point)

I can only imagine the validation and marketing + SEO boost Llama Index got from Meta planting the Llama flag in the AI ecosystem. I expect that these two early adopters will spawn a whole new ecosystem of tools built around this brand.

My general impression is that, much like GPT (generative pre-trained transformer) has become a popular term in the AI space, Llama fills a gap by giving people some naming optionality while still making it clear you are in the generative AI space.

There was a single moment in my mind that crystallized Llama as a term: Hugging Face’s AI meetup in SF, where they had a real Llama present:

Generally speaking, the Llama naming schema passes my viral product checklist simply because it has an available emoji, a critical angle in today’s product space.

What is Llama Index?

Besides the Llama foundation model, Llama Index is probably the 2nd most popular Llama project out there. It is designed as a data framework for large language models that lets you seamlessly connect different data stores (like a database, your email, etc.) to a large language model. This is incredibly useful when you are building a project because it means you don’t need to build connectors for all of these sources from scratch yourself.
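To make the “data framework” idea concrete, here is a toy sketch in plain Python. To be clear, this is not the Llama Index API; the `Document`, `load_sources`, and `retrieve` names are invented for illustration. It just shows the pattern such a framework implements: pull text out of heterogeneous sources, normalize it, and retrieve the most relevant pieces to hand to a language model.

```python
from dataclasses import dataclass

@dataclass
class Document:
    source: str  # where the text came from (e.g. "email", "database")
    text: str    # the raw content

def load_sources() -> list[Document]:
    # Stand-in for real connectors (database rows, email bodies, files).
    return [
        Document("email", "Meeting moved to Thursday at 3pm."),
        Document("database", "Customer 42 upgraded to the pro plan."),
    ]

def retrieve(docs: list[Document], query: str, top_k: int = 1) -> list[Document]:
    # Naive keyword-overlap scoring; a real framework would use embeddings.
    def score(doc: Document) -> int:
        return len(set(query.lower().split()) & set(doc.text.lower().split()))
    return sorted(docs, key=score, reverse=True)[:top_k]

docs = load_sources()
context = retrieve(docs, "meeting thursday")
# The retrieved text would then be prepended to the LLM prompt.
print(context[0].source)  # -> email
```

The value a real framework adds over this sketch is exactly the connector and retrieval plumbing: dozens of prebuilt loaders and embedding-based retrieval instead of keyword overlap.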

Llama Index also connects with other tools like LangChain, which serve more as the application layer than the data layer. I wrote up some thoughts on LangChain in another post:

Llama Index actually does many of the same things as LangChain, with support for agents, chat bots, data sources, and more tooling to make working with large language models easier.

For embeddings, there are a bunch of helpful tools that enable changing the batch size of the embeddings, switching vector database providers, and more.

import chromadb
from llama_index.vector_stores import ChromaVectorStore
from llama_index import StorageContext

# Create a ChromaDB client that persists data to disk
chroma_client = chromadb.PersistentClient()
# Create a collection to hold the documents and their embeddings
chroma_collection = chroma_client.create_collection("quickstart")
# Wrap the Chroma collection in a Llama Index vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
# Build a storage context that Llama Index indexes can use
storage_context = StorageContext.from_defaults(vector_store=vector_store)

The above code is a simple example of using the ChromaDB provider right inside Llama Index. This is helpful because you can test multiple providers (as long as they support the embedding format you are using) without needing to write custom code for each of them.
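On the batch-size point mentioned above, here is a small illustrative sketch, again not Llama Index code, of why frameworks expose that knob: embedding providers limit request sizes, so texts get split into fixed-size batches before being sent. The `batched` and `embed_all` helpers are invented names, and the fake length-based "embedding" stands in for a real provider call.

```python
def batched(texts: list[str], batch_size: int) -> list[list[str]]:
    # Split the input into consecutive slices of at most batch_size items.
    return [texts[i : i + batch_size] for i in range(0, len(texts), batch_size)]

def embed_all(texts: list[str], batch_size: int = 2) -> list[list[float]]:
    vectors = []
    for batch in batched(texts, batch_size):
        # Stand-in for a real provider call; here we fake a one-dimensional
        # "embedding" from each text's length.
        vectors.extend([[float(len(t))] for t in batch])
    return vectors

print(len(embed_all(["a", "bb", "ccc"], batch_size=2)))  # -> 3
```

Tuning the batch size is a throughput/latency trade-off: bigger batches mean fewer round trips to the provider, at the cost of larger individual requests.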

I am working on a more in-depth article on Llama Index, so I will save the rest for later!

Who else is building with Llama?

There are so many interesting projects around the Llama models, a few are as follows:

And many others port the Llama model to different languages or support training architectures that the original model did not.

Llama Llama Llama

In general, simply from a naming perspective, I am glad that Llama seems to be helping push people away from calling everything GPT. While I think GPT has the added boost of its relation to ChatGPT, it is a bit overwhelming that it’s used so frequently.

I hope that this post was useful to get just a little more context about why everyone is talking about Llamas. I will be interested to see how Meta’s approach to releasing large language models changes over time as the competition heats up. 🦙