Apply to the open roles at Delphi Ventures' portfolio companies.

ML Ops Engineer (Full Time)

Nous Research

Operations
Posted on May 9, 2025

Architect and implement efficient ML inference pipelines for large language models.

Responsibilities:

  • Design and implement high-performance inference pipelines
  • Optimize model serving for throughput, latency, and cost across different workloads
  • Collaborate with research and product teams to integrate inference into real-world applications
  • Help enhance and manage the deployment pipeline and monitor production clusters
  • Debug production inference issues
  • Stay up to date with the latest inference techniques and open-source frameworks

Qualifications:

  • Deep experience developing and tuning LLM inference frameworks (e.g. vLLM)
  • Solid communication skills; ability to work independently and within a team
  • Experience with cloud infrastructure (AWS, GCP, Azure) and Kubernetes
  • Passion for AI and practical ML systems
  • Experience building, deploying, and operating highly available, scalable, distributed cloud services