We invite you to apply for the Big Data Engineer - Healthcare Analytics role at Molina Healthcare, where you'll be part of an innovative healthcare data team.
Job Summary As a Big Data Engineer, you will play a crucial role in the design, development, and management of large-scale data systems tailored for healthcare analytics. Your mission will be to architect and maintain robust, scalable, and secure data pipelines that are essential for critical decision-making across our organization. This position demands a deep technical expertise in modern Big Data tools, with a focus on real-time and batch data integration while ensuring compliance with healthcare data governance.
Knowledge/Skills/Abilities
- Design and implement scalable, high-performance Big Data solutions accommodating both structured and unstructured data.
- Create and manage batch and real-time data ingestion/extraction pipelines utilizing tools such as Kafka, Spark Streaming, and Talend.
- Develop reusable and efficient ETL frameworks through Python/Scala to facilitate high-volume data transformation and movement.
- Optimize data models to support analytical and operational needs, including healthcare claims and utilization data.
- Work collaboratively with cross-functional teams, including data scientists, analysts, and business partners, translating requirements effectively into robust data solutions.
- Deploy, monitor, and troubleshoot Hadoop-based infrastructure using management tools like Cloudera Manager and Ambari.
- Uphold data quality, security, and compliance standards through established protocols using tools like Kerberos and Ranger.
- Implement web services and APIs (REST/SOAP) for seamless application and visualization platform integrations.
- Contribute to data governance initiatives, including metadata management and quality assurance.
Job Qualifications
Required Qualifications
- A minimum of 3 years of hands-on experience in Big Data engineering, data integration, and pipeline development.
- Proficiency in Python, Java, or Scala specifically for data transformation and system scripting.
- Expertise in a variety of Big Data tools including Spark, Hive, Impala, and Hadoop (HDFS, YARN).
- Experience in building real-time stream-processing systems employing Kafka or similar technologies.
- Strong knowledge of both NoSQL databases and traditional RDBMS like PostgreSQL and Oracle.
- Skilled in ETL design and execution using tools such as Talend.
- Demonstrated experience in deploying and monitoring big data infrastructure using Ambari.
- A solid understanding of data warehousing concepts, data validation, and data governance.
Preferred Qualifications
- Over 5 years of progressive experience in Big Data engineering or analytics.
- Prior experience in the healthcare industry, familiar with clinical, claims, or care management data.
- Exposure to cloud platforms (AWS, Azure) and containerization tools (Docker, Kubernetes).
Technical Environment
- Big Data Ecosystem: Hadoop, Spark, Hive, Kafka
- Streaming & Messaging: Kafka, Spark Streaming
- ETL & Integration: Talend, Python/Scala-based ETL
- Programming Languages: Python, Java, Scala, SQL
- Databases: HBase, MemSQL, PostgreSQL, Oracle
- Cloud & DevOps: AWS, Azure, Docker, Kubernetes
- Security & Governance: Kerberos, Ranger
- Monitoring Tools: Ambari, Cloudera Manager
- APIs: REST, SOAP
Join Molina Healthcare and be part of a team that values diverse ideas and the drive for continuous improvement. We offer a competitive benefits and compensation package that reflects our commitment to innovation and excellence in healthcare.