Vertical AI: Where Capital Actually Compounds

Every AI pitch deck in 2025 and 2026 has some version of the same thesis: "We're applying AI to X — a massive market that is still running on legacy systems." This is true for a very large number of industries. It is not, however, a sufficient basis for investment. The question is always: where does the value of the AI application accrue? To the buyer of the software? To the end user? To the model provider? To the application layer? The answer varies significantly by sector and by business model.

After two years of looking at AI investments seriously — and passing on most of them — we have converged on three patterns that we believe represent the most durable compounding dynamics for capital.

Pattern 1: Data Moats That Are Hard to Replicate

The canonical AI business model assumes that the application layer is where the defensibility lives. Train a model on proprietary data, build a better workflow on top of it, and the customer lock-in flows from the model performance. In theory, this is correct. In practice, the model performance of most vertical AI applications depends on foundation models that are accessible to any competitor — and the proprietary data advantage is often smaller than it appears.

A vertical's data moat is durable only if the data is expensive to collect, not just private to the customer. If the data is siloed behind customer relationships but can be collected by a determined competitor, the moat is not real.

The durable moat forms when the data is both proprietary and expensive to replicate. In logistics, this means real-time operational data from an active freight network — cargo movements, carrier performance, exception events — collected over years of operation in specific trade lanes. No competitor can collect this data without running the operations. It does not exist in any public dataset. It compounds with every shipment.

Pattern 1

Proprietary Operational Data in Physical Industries

Where the data is generated by the physical movement of goods, materials, or people — and cannot be faked, purchased, or scraped from public sources. Logistics, industrial inspection, precision agriculture, healthcare delivery. The AI application improves with every unit of real-world operation, and that accumulation is genuinely hard to replicate.

Pattern 2: Workflow Integration That Changes the Unit Economics

The difference between AI as a feature and AI as a product is whether the AI fundamentally changes the unit economics of the workflow it operates in, or whether it makes an existing workflow marginally better. The former is investable. The latter is a commoditised SaaS business dressed up in AI language.

Workflow integration that changes unit economics typically involves the AI making a decision that was previously made by an expensive human, at a cost and speed that opens up a market segment that was previously unserviceable. A quality inspection AI that replaces a team of QC engineers at one plant is a feature improvement. An AI that makes quality inspection cheap enough to deploy at every node in a supply chain that previously had only spot-check inspection is a change in unit economics.

Pattern 2

AI That Makes Previously Unserviceable Segments Serviceable

When the AI application enables a workflow to operate in a market segment that was previously unserviceable due to cost or speed constraints. The key test: does the AI change what is economically viable, or does it just make an existing viable workflow slightly better?
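The key test can be made concrete with back-of-the-envelope arithmetic. The sketch below is illustrative only: the node names, inspection values, and cost figures are all hypothetical assumptions, not data from any portfolio company. It compares which supply-chain nodes justify inspection at a human QC cost versus an AI cost, and lists the nodes that become serviceable only at the lower cost.

```python
from dataclasses import dataclass


@dataclass
class InspectionNode:
    """One node in a supply chain (all figures are hypothetical)."""
    name: str
    annual_value_of_inspection: float  # value of defects caught per year


def viable_nodes(nodes, annual_cost_per_node):
    """Return the names of nodes where inspection pays for itself."""
    return [
        n.name
        for n in nodes
        if n.annual_value_of_inspection > annual_cost_per_node
    ]


# Hypothetical supply chain: one large plant and several smaller nodes.
nodes = [
    InspectionNode("main plant", 400_000),
    InspectionNode("tier-2 supplier", 60_000),
    InspectionNode("regional warehouse", 45_000),
    InspectionNode("last-mile depot", 25_000),
]

HUMAN_QC_COST = 250_000  # assumed annual cost of a QC team at one node
AI_QC_COST = 20_000      # assumed annual cost of AI inspection at one node

human_viable = viable_nodes(nodes, HUMAN_QC_COST)
ai_viable = viable_nodes(nodes, AI_QC_COST)

# Nodes that were uneconomic to inspect before and are viable now.
newly_serviceable = [name for name in ai_viable if name not in human_viable]

print("Viable with human QC:", human_viable)
print("Newly serviceable with AI:", newly_serviceable)
```

If newly_serviceable is empty, the AI is only lowering the cost of a workflow that was already viable, which is the feature case. If it is non-empty, the AI is expanding what is economically viable, which is the case this pattern describes.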

Pattern 3: Decision Systems With High-Stakes Feedback Loops

AI applications that improve with use are not automatically defensible. The improvement creates a moat only if the feedback loop is proprietary, the data quality is high, and the gains are large enough that early entrants build a lasting capability lead.

The highest-stakes feedback loops are in decisions where the cost of a wrong answer is large and the data on outcomes is clean and immediate. Logistics routing in contested lanes: the cost of a wrong routing decision is high (delayed cargo, missed connections, customer penalties), and the outcome data is available within days. Medical imaging diagnosis: the cost of a wrong diagnosis is very high, but the outcome data requires longitudinal follow-up, so the loop closes more slowly. These are both businesses we have looked at and, in some cases, invested in.

Pattern 3

High-Stakes Decisions With Immediate, Proprietary Feedback

Where the AI system's recommendations produce measurable, attributable outcomes in a short time horizon — and that outcome data flows back into the model through a closed loop that competitors cannot access. The combination of high stakes and clean feedback creates genuine performance differentiation over time.

What We Have Not Invested In

We have passed on AI businesses in sectors where the data moat is customer-relationship data rather than operational data — because customer data can be replicated by a competitor with the same relationships. We have passed on AI businesses where the workflow integration does not change the unit economics — because the defensibility is then in the relationship, not the technology, and relationship-based businesses do not deserve software multiples. We have passed on AI businesses in sectors where the feedback loop is slow or noisy — because the compounding advantage is too weak to create durable differentiation.

The result is a concentrated AI portfolio: Helix Industrial AI in manufacturing quality, Quanta Sigma in financial decision systems. Both fit all three patterns. We are actively looking for a third position in healthcare delivery.
