Which companies will capture value in AI? (2024)

Humans are about to give up their 200,000 year monopoly on thinking. As we transition from meat- to silicon-based cognition, new empires of wealth will be built. Who will be the Standard Oil and US Steel of this revolution?

To tackle this question, we need to delve into the basic economic and market dynamics of AI. Here are the key conclusions:

  • Task-engaged AIs will dominate value creation. Current state-of-the-art AIs (LLMs) are oracles — they respond to requests with output streams of text or media but have no direct engagement with any task or the result of the task (other than a crude thumbs up or down). Task-engaged AIs, on the other hand, have direct connections to task context, actions, and results — and are part of the feedback loop (see the sketch after this list). We should expect transformative economic impacts when task-engaged AIs mature, not before.

  • The AI value chain will become commoditized everywhere except proprietary data. In the near term, winners may emerge based on advantages in access to compute, energy, and R&D talent, as these remain severely supply constrained, but, in the long term, proprietary data is the only sustainable differentiator. Early leaders will have to leverage their advantages into a more durable data moat.

  • Proprietary data for AI comes from owning the interface. Companies that own the interface to a given task will own the end-usage feedback loop and thus generate the proprietary data necessary to improve an AI model. I call this loop control — and companies that have it will capture the majority of the value in AI. For many tasks there is a clear interface owner, usually the operating system developer — e.g. Apple, Google, and Microsoft. In others, the interface and feedback loop is less obvious: backend software engineering, for example, involves the production of simple text, but getting feedback on how good that text is involves lots of different data sources — error rates, logs, user experience metrics, etc. Companies that can close feedback loops for tasks into a single interface will capture the data necessary to build leading AIs.

    • Example: Tesla owns the end-usage feedback for its self-driving AI; the OpenAI GPT-4 API does not.

  • Companies that control an interface (like Apple and Google with phone interfaces) will develop AIs that subsume all tasks amenable to that interface. AIs will be demarcated by their training data, and their training data will be determined by their interface; AIs will expand until they hit the limit of their data availability.

    • Example: The vehicle interface

      • Interface input: video, lidar, etc.

      • Interface output: steering wheel, pedal movements, etc.

      • Tasks: passenger vehicle driving, commercial truck driving, food delivery, package delivery — all possible driving tasks

  • Each interface will end up dominated by a single interface-dominant AI. Because of the flywheel of loop control (more usage → more data → better AI → better product → more usage), the first viable company that develops a lead in usage for a given interface will accelerate away from its competitors and own all tasks in that interface’s domain.

    • Example: Tesla will be the interface-dominant AI for the “driving interface” described above — it has the most data and the most users and other companies will struggle to catch up once the AI flywheel starts. (Tesla could misstep, of course.)

  • Interface-dominant AIs will own huge portions of the economy in a way few companies ever have. Just as Google and Facebook, with their data feedback loops and super-aggregator dynamics, captured so much of the internet’s value, interface-dominant AIs will do the same for these and other markets, but go much further. They will aggregate all demand for all tasks amenable to their interface, commoditizing their suppliers. The extent of their dominance will depend on the complexity and depth of knowledge required for relevant tasks, and how easily this knowledge can be commoditized (via simulation, for instance).

    • Example: Tesla will capture a significant share of the total economic value of all driving activity, wherever their AI is legally allowed to drive. This holds true only while state-of-the-art driving knowledge remains un-commoditized.

  • The Default Interface (or digital interface) is the universal personal assistant interface. It can interact with humans via text, audio, and video to accomplish any digital task.

  • The interface-dominant AI for the default interface will be the most important AI, by far. I refer to it as the Default AI. Think Google, but capable of completing any task for you, not just answering any question.

  • By the nature of aggregation and the demands of the data flywheel, the Default AI will necessarily be open-access, affordable, and commercial so as to aggregate the most data and usage and amortize reinvestment in the AI over the largest revenue base. Government and open-source projects will lack the resources or justification for the unprecedented investments required.

  • Open source AIs will excel in use cases with commoditized knowledge. The problem is that interface-dominant AIs will provide this commoditized knowledge for free as well, along with any proprietary knowledge, and with the additional resources to provide a better user experience. The market dictates that interface-dominant AIs will give away huge amounts of value for free to retain loop control and monopolize end-usage feedback, much like how Google today gives away billions of dollars’ worth of operating systems, browsers, and even dollars themselves (i.e. the Apple search deal) to maintain access to users’ queries. Open source won’t have any chance of competing for users.
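To make the oracle vs. task-engaged distinction from the first bullet concrete, here is a minimal sketch. Everything in it (`call_model`, `TaskEnvironment`, the field names) is a hypothetical placeholder rather than a real API; the point is only the shape of the loop: an oracle answers and is done, while a task-engaged AI acts, observes the outcome, and accumulates end-usage feedback.

```python
# Hypothetical sketch: oracle vs. task-engaged AI. All names are
# illustrative placeholders, not any real API.

def call_model(prompt: str) -> str:
    """Stand-in for any text-in, text-out model call."""
    return "model output for: " + prompt

def oracle(prompt: str) -> str:
    # Oracle: one-shot request/response. The model never observes what
    # happened after its answer was used, so no end-usage data accrues.
    return call_model(prompt)

class TaskEnvironment:
    """Placeholder for a real task interface (IDE, vehicle, OS, ...)."""
    def observe(self) -> str:
        return "current task state"

    def act(self, action: str) -> dict:
        # Executing the action yields a concrete result: errors, logs, metrics.
        return {"result": "ok", "errors": 0}

    def done(self) -> bool:
        return True

def task_engaged(env: TaskEnvironment, goal: str) -> list:
    # Task-engaged: the model acts, observes the outcome, and the outcome is
    # recorded as training signal -- the end-usage feedback loop (loop control).
    feedback_log = []
    while True:
        state = env.observe()
        action = call_model(f"goal: {goal}; state: {state}")
        outcome = env.act(action)
        feedback_log.append({"state": state, "action": action, "outcome": outcome})
        if env.done():
            return feedback_log  # proprietary end-usage data
```

The asymmetry is what matters economically: only the task-engaged loop produces the proprietary end-usage feedback described above.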

In a future post I will catalog the major interfaces and which companies are likely to dominate each of them. Here, we’ll continue to focus on the default interface (as the most important one) and the driving interface (as a simpler, illustrative example).


AI as an economic good

AI as a commercial product manifests as a trained model generating output, i.e. making predictions or choosing actions. This output can be anything — text, audio, video, API calls, motor actuations, etc. Such an AI has low marginal costs (just the energy and GPU utilization needed to generate output), high fixed investments in R&D, infrastructure, and model training, and, therefore, strong economies of scale: once you have trained a large AI model and spun it up on a large GPU cluster, it takes very little extra work to clone another such cluster.
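As a rough illustration of this cost structure, with hypothetical numbers chosen only to show the shape of the curve, the average cost per output collapses toward the marginal cost as volume grows:

```python
# Hypothetical cost figures, for illustration only.
FIXED_COST = 1_000_000_000   # R&D + training + infrastructure, in dollars
MARGINAL_COST = 0.001        # energy + GPU time per model output, in dollars

def cost_per_output(n_outputs: int) -> float:
    """Average cost per output: fixed cost amortized over volume, plus marginal cost."""
    return FIXED_COST / n_outputs + MARGINAL_COST

for n in (10**6, 10**9, 10**12):
    print(f"{n:>15,} outputs -> ${cost_per_output(n):,.4f} per output")
# At a million outputs the fixed cost dominates; at a trillion, the average
# cost approaches the marginal cost -- economies of scale.
```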

This means AI will replicate a lot of the market dynamics of internet-age super aggregators like Google and Facebook, which face the same near-zero marginal costs, high fixed investments, and instant global distribution that imply a winner-take-all dynamic.

Although technically the GPUs and energy used to produce AI output are rival and it would be possible to make AI excludable, in practice, AI will present as a public good, just as internet search does today, thanks to the returns to scale inherent to AI. We’ll discuss this more below.

The AI supply chain

AI has four primary inputs:

  1. Compute - GPU chips to run the model

  2. Energy - to run the chips

  3. Data - to train the neural network

  4. R&D - to design, build, and manage the AI system

And one intermediate and one final output:

  1. Neural network (model) - intermediary output

  2. Predictions - the final output of the model


Compute

We can assume that, in the medium run, compute will become commoditized, as it has in previous computing revolutions. For now, though, there are only a few players in the AI compute game and significant barriers to entry, with Nvidia as the only independent supplier with real volume. All of the big cloud players are looking to build vertically integrated AI chip offerings, but only Google has developed something competitive so far. All are supply constrained, and Nvidia has pre-bought a big portion of supplier capacity. Every step of the GPU supply chain is human- and physical-capital intensive, and scaling capacity will take time. But with the amount of money at stake, expect the market to find creative solutions, and eventually oversupply, as tends to happen in technology revolutions.

Energy

Energy is generally commoditized, but there can still be strong local effects — placing your training data center next to a hydro facility, for instance. Energy is not a constraint on the production of AI at the moment, but it will become one soon as AI grows and the chip crunch eases. We should expect to see dedicated nuclear reactors and solar farms powering AI data centers in the future.

R&D

Building a cutting-edge AI will likely continue to require large capital expenditures on compute and R&D. GPT-4 reportedly cost $50-100M in compute to train, and it seems plausible that GPT-5 will cost upwards of $1B. Those costs are matched by the high cost of the specialized experts who research and develop the models and systems, ones that have never been built before. Part of why OpenAI has a lead in AI models is that it committed early and aggressively to large-scale transformer training runs, but part is what appears to be superior R&D: other companies have attempted to outspend it and have come up short in performance. Nonetheless, thanks in part to America’s (and specifically California’s) laissez-faire employment regulation, these R&D advantages tend to dissipate rapidly as talent moves between companies.

Data

Data is the heart of any AI. Despite 75 years of AI research into clever methods, it’s almost entirely the “unreasonable effectiveness” of data and the “bitter lesson” of exponential compute that have brought us to where we are today. In the end it is the size and quality of your data (and having enough compute to digest it all) that ultimately determines the quality of your AI, full stop. As compute becomes commoditized and oversupplied, data will be the only long-term differentiator.

Current state-of-the-art AIs (LLMs) are trained on large corpora of text, most of which is open or semi-open access data — internet text, web forums, books, research papers. We’ll call all of this public data. It is undifferentiated and freely accessible.

Long-term, the value of a given AI, if it is to have any non-commoditized value at all, will necessarily come from its access to proprietary data.

In conclusion, the AI value chain will be fully dominated long term by proprietary data. In the near term, winners may emerge based on advantages in access to compute and R&D talent, as these remain severely supply constrained.

AI in the market

If we accept the following assumptions:

  1. AI is a general purpose technology — it is widely useful to a large share of economic activity,

  2. The cost of developing a leading AI is very large, both in compute and R&D (as discussed above),

  3. End-usage feedback (proprietary data) is the dominant factor in determining an AI’s performance at various tasks,

  4. Task performance (on economically important tasks) does not have a low ceiling that is reached quickly by cheap or small AIs,

then this implies that AI will be:

Monopolistic - The AI with the largest user base will gather more end-usage feedback data, which will give it a better model, which will provide a better user experience, which will attract more users, in a reinforcing flywheel of data, usage, and money (a toy simulation of this dynamic follows these four properties).

Commercial - Leading AI will necessarily be developed by private companies with maximum (global) reach. Governments will be constrained to their own citizens and will lack commercialization instincts, limiting their user base and disadvantaging themselves in the competition for feedback data. Open source models will not be able to support the speed of development and the focus on best-in-world user experience required to stay first, nor will they have the resources to entice and retain users.

Open-access - Again, because of the low marginal cost, and the need to maximize user data and amortize investments over the greatest amount of revenue, AI will be made available to everyone it is legally allowed to serve.

Affordable - To further maximize users, AI will provide its services mostly free of charge, with revenue generation tied to task completions, and subscriptions for premium features or services.
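A toy simulation of the flywheel described under "Monopolistic" above, with entirely made-up parameters and the single assumption that most new users choose the highest-quality product, shows how a small initial lead compounds:

```python
# Toy model of the loop-control flywheel: usage -> data -> quality -> usage.
# Every number here is made up; the sketch only illustrates the compounding
# dynamic under the assumption that most new users pick the best product.

def simulate(initial_users, rounds=20, new_users_per_round=1000.0, leader_share=0.9):
    users = list(initial_users)
    data = [0.0] * len(users)
    for _ in range(rounds):
        data = [d + u for d, u in zip(data, users)]   # more usage -> more data
        quality = [d ** 0.5 for d in data]            # more data -> better AI (diminishing returns)
        leader = quality.index(max(quality))
        laggard_share = (1.0 - leader_share) / (len(users) - 1)
        for i in range(len(users)):                   # better AI -> more usage
            users[i] += new_users_per_round * (leader_share if i == leader else laggard_share)
    return users

# Firm A starts with only a 10% larger user base than firm B...
final_a, final_b = simulate([1100.0, 1000.0])
print(f"Firm A ends with {final_a / (final_a + final_b):.0%} of all users")
# ...but its data lead never closes, so it pulls away round after round.
```

Under these assumptions the early leader's advantage compounds indefinitely, which is the winner-take-all dynamic in miniature.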

Because of these properties, AI has the potential to be a great equalizing force for the world. Differences in intellectual capital, which today explain a lot of inter- and intra-country inequality in outcomes, will be leveled, globally. A 500-year-old trend of increasing returns to intellectual capital in humans will be nullified.

Ways this is wrong

Although we’ve outlined why companies that own interfaces are likely to dominate AI, there are ways our logic could fail:

AI plateaus quickly. This may be what we are already seeing with helpful oracles — open-source models that can run on your laptop competing with proprietary models with 20 times the parameters, all of them producing high-quality output. In a couple of years, will there really be much room to grow? The issue here is that helpful oracles are just not that valuable relative to the task-competent AIs that are coming soon, and those can’t just be trained on open data. There is no foreseeable limit to the scope and skill that task-competent AIs will need to tackle, so a plateau seems unlikely.

Model scraping is very effective. Many of the best open-source models and fine-tunes today are trained on examples generated with GPT-4, what I call “model scraping”. If you can sample enough of the problem space you care about with a commercial model, and use its output to train your own problem-specific model, it will be hard for the commercial model to defend against this, since it is bound by the open-access market requirement discussed above. You could see companies getting aggressive about detecting and punishing this behavior, but, as with scrapers on the modern internet, it is a cat-and-mouse game. Nonetheless, legal issues would likely restrict the relevance of any such scraped model. More damning for this approach is that the sample space for important problems may simply be too large for it to be practical.
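For concreteness, here is a minimal sketch of the model-scraping workflow. The `query_commercial_model` function and the file layout are hypothetical placeholders (no real vendor API is assumed): sample prompts covering the problem space you care about, record the commercial model's completions, and use the pairs as supervised fine-tuning data for your own model.

```python
import json

def query_commercial_model(prompt: str) -> str:
    """Hypothetical placeholder; in practice this would call a vendor's API."""
    return "completion for: " + prompt

# Prompts sampled from the problem space the scraped model should cover.
prompts = [
    "Summarize this bug report: ...",
    "Write a SQL query that ...",
    # ...enough coverage of the target task distribution
]

# "Model scraping": record (prompt, completion) pairs from the commercial
# model, then use them as supervised fine-tuning data for an open model.
with open("scraped_finetune_data.jsonl", "w") as f:
    for prompt in prompts:
        completion = query_commercial_model(prompt)
        f.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")
```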

Users or governments force data sharing. Users could demand that their end-usage feedback data be owned by them and that they can resell or share it. Or anti-trust law could expand to include restrictions on loop control. (We will discuss policies like this much more in the future, they will be very important to get right.) The default is for users to just take the subsidized offerings provided by interface-dominant AI companies since it is win-win for them and the company. This is exactly how the internet played out with Google and Facebook, we shouldn’t expect anything different here unless we get ahead of it with consumer awareness and regulation. But how do you regulate against consumer choice in a free country? This is a problem at the heart of aggregation dynamics. Even if you could find the right regulations, you must go jurisdiction by jurisdiction and competition makes such coordination difficult.

Synthetic / simulated data is sufficient. It’s possible that in certain domains simulated or synthetic end-usage data is sufficient for training state-of-the-art task-competent AIs. We’ve seen some of this already with model scraping. You could imagine, for instance, open-source driving AIs trained in simulated environments. Although no one will have access to the huge volume of real-world data that Tesla has, developing a simulated driving environment (as self-driving companies already have) and training your AI in that environment would be much cheaper and simpler. But this means your AI is only as good as your simulation, and how can you be sure your simulation is high fidelity in the rare circumstances that matter? Only by doing a lot of real-world driving! So you are back to your original problem. This is a complex topic worthy of its own post — how easy is it to simulate the world? — one I’ll tackle shortly.
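A sketch of what the simulation route looks like, with hypothetical environment and policy classes (no real simulator or training framework is assumed): a standard simulate-and-train loop whose value is capped by how faithfully the environment reproduces the rare real-world cases that matter.

```python
import random

class SimulatedDrivingEnv:
    """Hypothetical stand-in for a driving simulator; not a real framework."""
    def reset(self):
        return [0.0, 0.0, 0.0]                      # toy sensor observation

    def step(self, action):
        obs = [random.random() for _ in range(3)]   # next simulated observation
        reward = -abs(action - obs[0])              # toy reward: track the "road"
        done = random.random() < 0.05               # ~5% chance the episode ends
        return obs, reward, done

class DrivingPolicy:
    """Hypothetical stand-in for the driving model being trained."""
    def act(self, obs):
        return obs[0]                               # placeholder decision rule

    def update(self, obs, action, reward):
        pass                                        # placeholder learning step

# Standard simulate-and-train loop. Its ceiling is the simulator's fidelity:
# the policy only learns about situations the simulation knows how to produce.
env, policy = SimulatedDrivingEnv(), DrivingPolicy()
for episode in range(100):
    obs, done = env.reset(), False
    while not done:
        action = policy.act(obs)
        next_obs, reward, done = env.step(action)
        policy.update(obs, action, reward)
        obs = next_obs
```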

Super AI is so powerful it negates all other market advantages. The first company to develop a sufficiently advanced engineering AI could potentially enter a self-reinforcing feedback loop of increasingly better-designed AI systems, dominating not just the software engineering market but all tasks across all markets. While this idea of runaway AI progress is popular, how likely is it? The three other essential inputs to AI besides intelligence — compute, energy, and knowledge (valuable data) — are expensive to obtain and require interaction with slow physical and human systems. These systems are not easily manipulated by superior intelligence alone (intelligence is not magic; it is neither omniscience nor omnipotence: humans cannot conquer the common flu virus, despite its indisputably low intelligence). This does not mean that machine intelligence won’t eventually out-compete human intelligence in all domains (it surely will), but as a competitive market advantage, super intelligence will matter less than the social, institutional, and physical aspects of market control. That said, we have never encountered super machine intelligence of this sort before, and it could evolve in unexpected ways. We would be foolish to rule out the possibility that super intelligence, as an R&D input to AI, could dominate all other inputs in importance.

My hunch, though, is that intelligence is mostly about knowledge, and knowledge is mostly about modeling the real world, and modeling the real world is mostly about observing it and interacting with it and seeing what happens. The companies that are best positioned to do that are companies with interfaces in the wild, with millions or billions of unpaid data collectors (customers) helping them model the world for free. Think Apple, Google, Tesla, and Microsoft. That these are already the largest companies in the world, with the money and people to lead in AI without any data advantage at all, is simply icing on the cake.
