Reliability Engineer

monday.com

Job Location : all cities, IL, USA

Posted on : 2025-08-12T00:54:34Z

Job Description :

We are monday.com, a global software company transforming how businesses run. Our product suite can adapt to the needs of diverse industries and use cases within one powerful platform, empowering ~245,000 customers worldwide to reimagine how work gets done, drive greater efficiency, and scale like never before.

With over 2,500 employees across the globe, we grow by prioritizing transparency and knowledge sharing. We care about the impact you make, not the hours you clock, so we encourage initiative, ownership, and fresh thinking. We back our people with flexible work, wellness and mental health support, and a work environment built on collaboration.

About The Role

We're looking for a Reliability Engineer to join our Reliability team. This role will be integral in ensuring the robustness and dependability of our platform, impacting millions of users globally.

  • Maintain a comprehensive understanding of our service architecture and its dependencies.
  • Identify and mitigate risks associated with tightly coupled services and complex interconnections.
  • Lead service re-architecture initiatives to improve reliability and scalability.
  • Review new services and ensure they meet our reliability standards.
  • Advocate for Chaos Engineering: collaborate with R&D teams, build tools and environments, and improve system resilience.
  • Manage the full lifecycle of reliability tools and services, adhering to the architectural guidelines.
  • Collaborate with teams to define and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs) that align with business goals and user expectations.
Your Experience & Skills
  • Proven experience with Kubernetes (k8s) and with Linux administration and internals.
  • Proven experience with microservice architectures and reliability engineering.
  • Deep understanding of reliability concepts (e.g., SLOs, SLIs, and service interconnections).
  • Strong background in incident response and resilience efforts.
  • Ability to collaborate across teams to drive reliability improvements.
  • Proficiency in a programming language (e.g., Node.js, TypeScript, Go) with the ability to design and implement reliability tooling, including microservices and/or microfrontends.
  • (Nice-to-have): Prior experience with chaos engineering.

We believe in equal opportunity. monday.com is an equal opportunity employer and bans discrimination and harassment of any kind. We are committed to creating a workplace free of discrimination and harassment. All qualified applicants will be considered regardless of personal characteristics. We encourage candidates from all backgrounds to apply, regardless of race, religion, national origin, ethnicity, sexual orientation, gender identity, age, marital status, family or parental status, physical or mental disability, or any other protected status.

monday.com is committed to providing access and reasonable accommodations for applicants with disabilities. If you require accommodation during the recruitment process, please contact ...@monday.com. All requests are confidential.

Meet the R&D team

The R&D Team is passionate about building innovative, lovable products, and tackling complex engineering problems at scale. We're accountable for bringing the company's vision to life through flawless execution and encouraging full ownership and independence in projects.
