Job Details

Senior Site Reliability Engineer / HPC - Pre-IPO Tech Leader

2025-10-20 Andiamo San Francisco, CA

Description:

We are seeking a highly skilled Senior DevOps Engineer to drive the automation, scalability, and reliability of the infrastructure powering a $2.5B pre-IPO technology leader. Our systems support large-scale AI, data, and next-generation software platforms, and we are committed to building world-class DevOps practices that accelerate innovation while ensuring security and resilience.

In this role, you will design and manage CI/CD pipelines, develop infrastructure-as-code, optimize container orchestration, and lead automation initiatives across engineering teams. You will play a critical role in enabling high-velocity development, efficient deployments, and rock-solid production systems.

What You'll Do

Own CI/CD: Build and maintain continuous integration and delivery pipelines that ensure rapid, reliable deployments.
Infrastructure as Code: Develop scalable infrastructure using Terraform, Ansible, or similar frameworks.
Cloud & Hybrid Operations: Operate and optimize infrastructure across AWS, GCP, or Azure combined with on-premise systems.
Containerization & Orchestration: Drive adoption and performance tuning of Kubernetes, Docker, or similar platforms.
Automation First: Eliminate manual processes by building robust automation for provisioning, scaling, monitoring, and recovery.
Observability: Implement monitoring, logging, and alerting systems (Prometheus, Grafana, ELK, OpenTelemetry) to ensure system visibility and reliability.
Collaboration: Partner with engineering, product, and SRE teams to align infrastructure with application and business requirements.
Reliability & Security: Embed security, compliance, and resilience into every stage of the software lifecycle.

What We're Looking For

Experience: 6+ years of professional experience in DevOps, infrastructure engineering, or systems engineering roles.
Programming Skills: Proficiency in Python, Go, or Bash for automation and tooling.
Cloud Expertise: Hands-on experience with AWS, GCP, or Azure, including networking, storage, and security services.
Containers & Orchestration: Advanced knowledge of Kubernetes, Helm, and Docker.
CI/CD Mastery: Experience with Jenkins, GitLab CI/CD, ArgoCD, or similar tools.
Infrastructure as Code: Strong background with Terraform, Ansible, or Pulumi.
Monitoring & Observability: Skilled with Prometheus, Grafana, ELK stack, or OpenTelemetry.
Mindset: Passion for automation, performance optimization, and enabling high-velocity development teams.
Education: Bachelor's degree in Computer Science, Engineering, or a related technical field.

This is a rare opportunity to join a pre-IPO technology leader valued at $2.5B, where DevOps is at the heart of engineering scale and velocity. As a Senior DevOps Engineer, you will architect and automate systems that enable global innovation in AI, cloud, and large-scale data. You'll have the chance to tackle complex infrastructure challenges, work alongside world-class engineers, and shape the DevOps culture at a company preparing for IPO and hyper-growth.

About Andiamo

Andiamo is a globally recognized staffing and consulting firm specializing in placing the top 2% of technology and go-to-market professionals with the world's largest and most well-known companies.

We are an equal opportunities employer and welcome applications from all qualified candidates.
