Director of Cloud Infrastructure & SRE - Radiant Dev : Job Details

Director of Cloud Infrastructure & SRE

Radiant Dev

Job Location: Golden, CO, USA

Posted on : 2025-07-31T18:18:28Z

Job Description :

The Team

In this role, you will be part of the Engineering Leadership team and report to the EVP, Technology.

Office Expectation

This role is HYBRID, with an expectation of 3 days a week in-office in Golden, CO. Core hours are 8 a.m. to 5 p.m.

Compensation

Targeting a $175K-$185K base salary with a 20% bonus

The Overview:

Our client is seeking a strong Director of Cloud Infrastructure & SRE to lead the design, implementation, and optimization of their multi-cloud platform supporting real-time, high-volume financial data flows. This is a hands-on leadership role where you'll shape cloud strategy, define best practices, and scale mission-critical systems that power modern banking infrastructure.

You'll head global teams of DevOps, SRE, and CloudOps engineers, champion Infrastructure-as-Code and AI/ML-based automation, and work with tools like Kubernetes, Terraform, CI/CD pipelines, Prometheus, and ELK. The ideal candidate brings deep technical expertise in Azure & AWS, strong security and compliance knowledge (SOC2, GDPR, NIST), and a passion for building resilient, secure, and scalable systems.

What you'll own:

  • Lead the DevOps & Cloud Infrastructure transformation in a rapidly growing, multi-team organization, delivering on business priorities while collaborating with development leaders and executives to define and advance best practices
  • Oversee cloud infrastructure, Site Reliability Engineering (SRE), and CloudOps for our large-scale connected-products ecosystem

Cloud Infrastructure & SRE Strategy

  • Define and execute global cloud operations and SRE strategies, ensuring 99.99%+ uptime for mission-critical financial services applications
  • Architect, implement, and optimize multi-cloud infrastructure to support financial services applications with low-latency data processing, scalability, and high availability
  • Drive cost optimization strategies while balancing performance, redundancy, and financial efficiency across cloud platforms (Azure & AWS)
  • Develop automated deployment, monitoring, and recovery systems using technologies like Kubernetes, Terraform, Ansible, and CI/CD pipelines

Reliability, Performance & Incident Management

  • Establish and refine SLOs, SLIs, and KPIs for service reliability, performance, and capacity planning
  • Build and optimize incident management, disaster recovery, and resilience engineering frameworks
  • Leverage AI/ML-driven automation for proactive failure detection and remediation
  • Implement robust security practices, ensure cloud security and compliance with standards such as SOC2, GDPR, and NIST, and oversee the zero-trust security model
  • Collaborate with security and compliance teams to manage risk and ensure regulatory adherence across cloud platforms

Team Leadership & Cross-Functional Collaboration

  • Lead and mentor a team of DevOps Engineers, SREs, Escalation Engineers, and software professionals, fostering a culture of continuous learning and innovation
  • Partner with product management, software engineering, and customer support to optimize software scalability and performance
  • Collaborate with executive leadership to develop long-term cloud investment strategies

Requirements

Necessary Qualifications:

  • 10+ years in Computer Science, Engineering, or a related field
  • 10+ years of experience in Cloud Operations, SRE, or Infrastructure Engineering, with 8+ years in technical leadership roles
  • Experience managing medium- and large-scale distributed cloud environments supporting millions of data connections per day
  • Deep professional experience in Azure and AWS cloud platforms including networking, storage, compute, and database services
  • Experience in Kubernetes, Terraform, CI/CD pipelines, and Application Monitoring & observability tools (e.g., Prometheus, Grafana, ELK, etc.)
  • Experience in large-scale systems design and architecture, with a focus on reliability, performance, and scalability of cloud-native platforms
  • Hands-on experience with Infrastructure-as-Code (IaC) tools like Terraform, CloudFormation, Ansible, CDK, and Pulumi, and with managing cloud-native architectures
  • Strong background in AI/ML-driven automation for cloud infrastructure monitoring, self-healing, and optimization
  • Solid understanding of security-first cloud architectures, DevSecOps, and compliance standards (PCI, SOC2, GDPR, NIST)
  • Proven ability to manage teams across multiple global time zones, ensuring operational excellence and driving performance in large, distributed environments
  • Expertise in incident management, disaster recovery, and building resilience engineering frameworks
  • Ability and desire to review code, system designs, and engage in system engineering discussions and decisions
  • Expertise in serverless architecture and edge computing
  • Strong financial acumen in cloud cost management and forecasting
  • Familiarity with regulatory compliance frameworks such as SOC2, GDPR, PCI, and ISO 27001
  • Relevant certifications in Azure or AWS Cloud Practices