Decoding AI Adoption
December 30, 2024
How region, industry and company size impact open-weight AI use by developers.
Gemini, O1, Grok, Claude, Llama, Yi and Mistral. The number of Large Language Models (LLMs) seems to have grown exponentially since OpenAI’s ChatGPT burst into the public consciousness in late 2022. It is estimated that US$154 billion was spent on AI by businesses in 2023, while the most recent McKinsey global survey on AI reveals that up to 65 percent of firms were regularly using generative AI (GenAI) in their operations. But beyond the hype and billion-dollar figures, it’s still a little unclear where and how the technology is really being used.
Our research aimed to go beyond surveys of business managers and included real-world data on how AI is being adopted. To achieve this, we went to the source to analyse the activities of software developers and how they were interacting with some of the leading open-weight LLMs. These are freely available LLMs that can be accessed, modified and redistributed with minimal restrictions.
We found significant variations in the adoption of AI based on geography, company size and industry. Perhaps unsurprisingly, start-ups and technology firms were the largest users, with US companies way ahead when it comes to embracing the potential of AI. Educational institutions were another key locus of AI activity, with particularly high adoption of Llama in this segment. What also emerged was the dominance of the United States, with Mark Zuckerberg’s Llama and Elon Musk’s Grok way ahead of other open-weight LLMs in market share.
Open access data
Our search for concrete usage data led us to the popular developer platform GitHub, an important repository for AI code and other resources. This includes a number of notable open-weight LLMs such as Grok, Meta’s Llama, Yi by the Chinese group 01.AI and Mistral by its eponymous French AI start-up.
The beauty of GitHub is the ability to identify which developers have downloaded (“forked”) which LLM code to power AI applications, which in turn offers insights into which LLMs are proving popular among developers. As developers are the ones pushing technological boundaries, this may also offer insights into AI trends.
Equally important, the open nature of GitHub also allowed us to identify, for a decent percentage of those developers, their country of residence, employer sector and employer size. While not an exact science, our data provide a snapshot of how developers from different countries, company sizes and industries are adopting LLMs. We benchmarked AI adoption numbers against the use of TensorFlow, one of the most widely used machine learning developer tools, released by Google back in 2015.
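As a rough illustration of the tallying step described above, the sketch below shows how fork records might be turned into percentage shares by sector or region. It is a minimal sketch under stated assumptions: the function name `fork_shares`, the field names and the sample records are all hypothetical and are not the study's actual code or data. Records with an unknown attribute are simply left out of the denominator, mirroring the fact that only part of the developer population could be identified.

```python
from collections import Counter


def fork_shares(forks, key):
    """Return each category's percentage share of forks for one attribute.

    Records where the attribute is missing or empty are excluded,
    so shares are computed over identifiable developers only.
    """
    known = [f[key] for f in forks if f.get(key)]
    total = len(known)
    if total == 0:
        return {}
    counts = Counter(known)
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Made-up records for illustration only; NOT the study's data.
sample_forks = [
    {"model": "Llama", "region": "North America", "sector": "Technology"},
    {"model": "Llama", "region": "Europe", "sector": "Education"},
    {"model": "Grok", "region": "North America", "sector": "Technology"},
    {"model": "Mistral", "region": "Europe", "sector": None},  # sector unknown
]

sector_shares = fork_shares(sample_forks, "sector")
region_shares = fork_shares(sample_forks, "region")
```

The same function can be applied to any attribute attached to a fork record, such as company size, which is how a single dataset can be cut in several ways.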
Industry and company differences in adoption
With AI adoption at a relatively early stage, we find significant variation across different sectors. As you can see from Table 1, technology firms lead the way (48.3 percent of forks), but the education industry also accounts for a relatively high percentage of adoptions of LLMs (26.3 percent).
The strong showing of educational institutions reflects the extent of research around LLMs and GenAI. At INSEAD we certainly see deep interest among our faculty and students to understand their capabilities and limitations. Given the extraordinary costs involved in developing modern LLMs, researchers benefit from the availability of these open-weight models rather than relying on the proprietary offerings from the likes of OpenAI and Anthropic.
We found a much lower uptake of LLMs in traditional sectors, especially those producing and selling physical goods – a reminder of the important role of tech and higher education in driving innovation in the economy.
We expected that smaller start-ups would be at the vanguard of open-weight AI adoption. After all, they are traditionally more agile and therefore much quicker to adopt new technology than larger firms, especially if it might give them a significant market advantage. While start-ups do lead the way, there is considerable activity across all size categories. This contrasts with the TensorFlow benchmark, where large companies are the biggest users by far.
Regional differences
Not surprisingly, North America was the dominant location for developer LLM activity on GitHub, with just over 50 percent of the forks originating from there. Nonetheless, its dominance in LLMs is less pronounced than in TensorFlow, where North America has over 60 percent of the forks. In Table 2, we highlight those geographical differences by breaking down our data by region.
While the pattern by company size was mostly consistent between regions, there were notable differences. Start-ups have the leading share of forks in every region except for North America, where the largest companies (those with over 10,000 employees) hold the leading share. This could well reflect the greater maturity of AI adoption in the US and Canada, where firms have had longer to get to know and understand the benefits this technology can bring to an organisation.
What is perhaps more striking is the dominance of US LLMs. Llama has an outsized share across all the regions, with Grok firmly in second spot. Our data doesn’t seem to suggest that developers express any regional loyalties towards local LLMs, such as Mistral in Europe.
Musk vs. Zuckerberg
Meta is clearly benefiting from its decision to embrace an open-weight strategy for its LLMs early on, starting with the release of the weights for Llama 2 in July 2023. Llama’s leadership position also benefits from the backing of the Meta brand. As Merouane Debbah (distinguished AI researcher and leader of the team that developed the Falcon LLM in Abu Dhabi) puts it: “Developers need to have confidence in the staying power of an open model for them to build their applications on top of it.”
With his Grok models, Musk is looking to challenge Meta for leadership in the open-weight segment, while also taking on former collaborator Sam Altman, CEO of OpenAI. Since the release of Grok 2 in August 2024, Musk’s xAI has adopted a hybrid strategy where its latest model remains private, though it publishes the weights of its prior models.
It is interesting to note Grok’s strong performance despite being late to the game, especially in China and the rest of the world. Its close association with X may help, as might the star appeal of Musk. It will be interesting to see if his new role in the US government will impact adoption levels in the future.
Beyond region, our data (see Table 3) suggests other differences in the user base of these two models. Grok has a slightly heavier weight in the tech sector while Llama seems more popular among those in the education sector. Finally, Llama is slightly more popular among large organisations, while Grok is more associated with start-ups.
Musk, who described Grok as a “maximum truth-seeking AI”, has previously commented on his desire for tough regulations around AI. Meanwhile, Zuckerberg has made numerous statements about the need for open-source AI to become the industry standard. It will be interesting to see the impact of their LLMs going head-to-head on the future of the open-weight approach to GenAI.
Looking ahead
We are still in the early days of the development and adoption of LLMs – expect to be surprised! Divining the future is highly challenging given the intense competition among the top players, not to mention geopolitical pressures. Will a more assertive Trump administration impact the ability of the US to dominate GenAI?
Our analysis, rooted in actions by real-life developers, provides insights into the current state of play. Importantly, it gives us a benchmark for future analysis and the opportunity to spot trends on developers’ preferences for specific LLMs.
We are particularly interested in following the adoption dynamics behind open-weight models. Will they be able to compete effectively with fully proprietary models? Might Meta or xAI shift away from their commitment to open weights? Currently, open-weight models face constraints ranging from data access to cost, but they seem to offer the greatest opportunities for significant innovation.
Co-founder of Near.ai and co-author of the landmark paper “Attention Is All You Need”, Illia Polosukhin states: “the future of AI should be open and accessible to everyone. As developers continue to push the boundaries of what’s possible with this technology, permissionless accessible models will be the foundation upon which new breakthroughs are built.”
The picture will undoubtedly change and develop over time. We hope that this benchmark will allow us to further map adoption trends and provide better indicators of where we might be headed on this revolutionary AI journey.