Live · Open · AI · Senior

Head of Inference at an AI infrastructure provider.

Head of Inference, reporting to the VP Platform. The Head of Inference owns model serving at production scale for frontier-lab customers.

About the firm

Anonymised by default.

An AI infrastructure provider. The firm operates compute clusters for frontier labs and large applied-AI customers. Around 200 staff. Profitable in core compute, and scaling inference platform services as the inference share of customer spend grows. Series A funded, with strategic investors on the cap table.

We name a client on the mandate page only where the firm has approved publication. Where not named, the description above is intended to be enough for a senior candidate to recognise whether their pattern fits the brief.

The brief

What this seat exists to do.

Head of Inference, reporting to the VP Platform. Newly created seat as the firm's inference business has grown into a standalone line.

Direct line management of around 20 platform engineers and SREs across three squads — serving runtime, throughput optimisation, and customer integration. Hiring two further senior engineers in the first nine months.

The 12-month remit is to lift the firm's inference platform from a beta with three frontier-lab customers to a generally available product with measured cost-and-latency SLAs.

Remote within US or EU, with quarterly off-sites at the firm's primary engineering hub.

What we are assessing for

A current or recent leader of an inference platform programme at a peer infrastructure provider, frontier lab, or hyperscaler.
Comfort with the dual remit — customer-facing on integration and SLAs, platform-facing on serving and optimisation.
An engineering instinct that survives translation to economics. The seat is the named owner of inference unit economics for the firm.
Patterns: a current inference platform lead from a peer compute provider; a senior platform engineer from a frontier lab's serving team; an SRE lead from a hyperscaler's AI platform ready for a leadership seat.

The firm's read

Spectrum's read on this search.

The Head of Inference owns model serving at production scale for frontier-lab customers. Latency, throughput, and the cost curve that decides whether inference earns margin are the brief. The remit runs end to end — the serving stack, the SLAs the customer signs against, and the unit economics underneath. Pricing the platform, shaping the product roadmap, and carrying the inference revenue line come with it, as inference takes the larger share of customer spend. Core compute is already profitable. Inference has to become the second profitable line.

— Peter Wood, Partner & Chief Strategy Officer

Introduce yourself

Against this brief.

Spectrum holds first introductions in confidence. Your approach is read by the partner running this search before any external check. We do not share candidate identities with the client until you have agreed to be put forward.

Not the right brief?

See other current searches: live mandates across digital assets and AI.

Introduce yourself in confidence: for senior candidates whose pattern fits the scope but no current brief matches.
