Reliability Engineer - Ansible and DataDog - WFH - 1099 / C2C ok - Datamanagementgroup : Job Details

Datamanagementgroup

Job Location : Atlanta, GA, USA

Posted on : 2025-05-13T00:53:16Z

Job Description :

We are looking for an experienced Reliability Engineer to support critical projects for our Technology, Infrastructure & Operations teams. This is a work-from-home position, with work done primarily on US Eastern Time.

Minimum of 7 years of performance engineering and performance testing experience
MUST HAVE 3+ years of recent work with Ansible
MUST HAVE 4+ years of work with DataDog
Excellent English communication skills, verbal and written (idiomatic English)
Experience managing performance engineering efforts for applications strongly preferred
Knowledge of developing scripts for monitoring using PowerShell, Python, and Shell scripting
5 years of Splunk programming proficiency is highly preferred
5-6 years of experience with .NET and Java applications and with application monitoring tools such as AppDynamics or DataDog is highly preferred
Proficiency in performance tuning is preferred
Good understanding of the UI, Middleware, and backend Databases
BA/BS degree in Information Technology, Computer Science, or related field of study

Duties include:

Develop and maintain comprehensive monitoring solutions for cloud-based services and applications
Configure monitoring tools and systems to collect relevant metrics, logs, and traces
Create custom monitoring dashboards and reports using Splunk, DataDog, DynaTrace, or other tools, to provide real-time insights into system performance and health
Continuously monitor the cloud infrastructure's performance and capacity, anticipating and addressing potential scalability issues
Proactively suggest and implement improvements to enhance the system's reliability, resilience, and fault tolerance
Work on automating tasks to streamline operational processes and reduce manual intervention
Collaborate with cross-functional teams to investigate and resolve critical incidents, ensuring minimal impact on end-users
Work with the Problem Management team to complete post-mortem analysis of incidents, identify root causes, and implement preventive measures
Understand the overall architecture of our systems to identify gaps in monitoring and troubleshoot issues
Configure and maintain custom dashboards and alerts in various monitoring tools
Create custom reports and deliver report presentations to various stakeholders
Develop scripts for monitoring using PowerShell, Python, and Shell scripting
Develop metrics for both the business and technical teams to determine the health of systems
Provide on-call support as needed
Lead and coordinate performance engineering for medium to large initiatives
Collect and document expected system performance and operational characteristics
Collect and/or prepare test data for test execution
Develop and execute performance tests including load, stress, endurance, fail-over, and interoperability
Conduct technical analysis of performance test results and production systems, and provide recommendations on performance tuning, systems, and infrastructure. Identify, report, and review defects in assessing system performance and stability
Define the strategy for enabling performance diagnostics and monitoring using an Application Performance Management (APM) tool, other monitoring tools, and diagnostic techniques
Collaborate with developers to promote performance engineering during all phases of the SDLC, so that performance issues are detected and corrected earlier in the lifecycle
Lead peer reviews to ensure the completeness of all test assets created
Resolve performance and stability issues in the performance test environment
Develop a performance engineering work plan structure and project schedule
Review architectural design for performance risks and potential issues
Prepare capacity analysis when applicable
