Sr Site Reliability Engineer at Voloridge Investment Management, LLC
Voloridge Investment Management is an award-winning, rapidly growing quantitative investment management firm based in Florida, managing over $9B in assets. We implement sophisticated machine learning techniques to solve the challenging problem of modeling and predicting financial markets. We build our own software to drive and execute trade decisions.
We maintain a collaborative, entrepreneurial environment where everyone contributes to the design and implementation of our software platform. The atmosphere at Voloridge is fast-paced but casual. We are seeking a Senior Site Reliability Engineer to join our talented development team.
Objectives of the role
- Define Service Level Agreements and Objectives (SLAs/SLOs) with internal customers
- Measure Service Level Indicators (SLIs) and increase Mean Time Between Failures (MTBF)
- Measure and decrease Mean Time to Recovery (MTTR) and Mean Down Time (MDT)

Summary Of Job Functions
- Determine the best hardware, software and configuration for meeting objectives
- Strong incident management and SRE-aligned thinking (e.g., proactive issue identification)
- Detect issues related to ingress, processing, storage and egress of market data
- Work with IT to prepare disaster recovery plans
- Keep the system up and reliable, providing insights to leadership on KPIs related to platform usage, uptime, health, and performance of components
- Understand service level indicators and utilize service level objectives to proactively resolve issues
- Collaborate with software engineers and teams to design, develop, test, and implement solutions for availability, reliability, and scalability

Minimum Requirements
- At least 10 years of experience in a Site Reliability Engineer or relevant role
- BSc in Computer Science, Software Engineering, or related discipline
- Proficiency in programming (scripting and OOP) using languages like Python, C, C#, etc.
- Experience writing shell scripts
- Experience with job schedulers: cron, JAMS, Tidal, Control-M, etc.
- Experience with containerized applications and infrastructure: Docker, Podman, Kubernetes, etc.
- Understanding of software engineering principles, SDLC, version control
- Experience in an agile environment
- Experience with alerting tools like OpsGenie or PagerDuty
- Ability to work onsite in Jupiter, FL