Staff LLM Systems Engineer
New York
USA
Permanent
Artificial Intelligence
Location: United States (West Coast preferred, remote considered)
About the Company
We are a rapidly growing AI company delivering large language models at scale. Our mission is to ensure models not only perform well in research but also serve real-world applications reliably and efficiently. We are looking for engineers who enjoy solving high-scale inference and systems challenges.
Role Overview
We are seeking a Senior / Staff LLM Systems Engineer to lead the development, optimization, and deployment of large language model inference pipelines. This role focuses on high-throughput, low-latency serving and production reliability, bridging ML research and platform engineering.
This is not a training-focused role – the emphasis is on serving models at scale, optimizing systems, and enabling production ML reliability.
Responsibilities
- Design, implement, and optimize inference pipelines for large language models
- Improve throughput and latency of model serving in production environments
- Collaborate closely with infrastructure, platform, and ML research teams to ensure smooth deployment
- Build monitoring, observability, and alerting systems for inference performance and reliability
- Identify and solve scaling challenges across GPUs, TPUs, or distributed environments
- Evaluate and adopt new technologies, frameworks, and architectures to improve inference efficiency
- Mentor other engineers and contribute to technical strategy for production ML systems
Qualifications
- 5+ years of software engineering experience, including hands-on ML systems experience
- Strong background in distributed systems, performance tuning, and low-latency architectures
- Experience with model serving frameworks (e.g., Triton, vLLM, Ray, TorchServe)
- Familiarity with GPU/TPU infrastructure, multi-node deployment, and system-level optimization
- Understanding of ML workloads and trade-offs between accuracy, latency, and cost
- Proven ability to deliver production-grade ML systems at scale
- Excellent collaboration and problem-solving skills
Why You’ll Enjoy This Role
- Work on cutting-edge LLM inference systems at scale
- Solve technically challenging, high-impact engineering problems
- Collaborate with top ML researchers and platform engineers
- Competitive compensation and flexible work arrangements
Darwin Recruitment is acting as an Employment Agency in relation to this vacancy.

Reece Waldon
