Staff ML Infrastructure Engineer

San Francisco

USA

Permanent

Artificial Intelligence

Location: United States (West Coast preferred, remote considered)

About the Company

We are a fast-growing AI company building next-generation large language models. Our mission is to bring powerful, reliable AI systems into production environments used by thousands of customers. We value technical excellence, deep collaboration, and engineers who thrive on solving real-world problems at scale.

Role Overview

We are seeking a Staff / Principal ML Infrastructure Engineer to lead the design, deployment, and scaling of our large language model infrastructure. This role sits at the intersection of machine learning, systems engineering, and platform design, enabling teams to train, serve, and monitor models efficiently and reliably.

This is not a prompt engineering role – it is focused on building robust, production-grade ML infrastructure and operational pipelines.

Responsibilities

Design, implement, and maintain high-performance infrastructure for training and serving LLMs
Optimize model pipelines for efficiency, latency, and cost at scale
Collaborate with ML researchers, platform engineers, and product teams to deploy models safely into production
Build monitoring, alerting, and tooling to ensure reliability and observability of large-scale ML systems
Evaluate and integrate new frameworks, tools, and architectures to improve ML workflows
Provide technical leadership and mentorship to other engineers on the team

Qualifications

7+ years of software engineering experience, including 3+ years building production ML systems
Deep experience with distributed training and inference frameworks (e.g., PyTorch, JAX, TensorFlow)
Familiarity with model serving technologies and orchestration (e.g., Triton, Ray, Kubernetes)
Strong understanding of GPU/TPU infrastructure, performance optimization, and scalability challenges
Proven experience solving reliability, latency, and cost trade-offs in production ML systems
Excellent collaboration, communication, and problem-solving skills
Experience mentoring or leading engineering teams is a plus

Why You’ll Enjoy This Role

Work on cutting-edge LLM infrastructure at scale
Influence the design of systems that power real-world AI applications
Collaborate with some of the most talented engineers in AI
Flexible work arrangements and competitive compensation

Darwin Recruitment is acting as an Employment Agency in relation to this vacancy.

Reece Waldon

Phone

+44 1277 287285
