Iron Systems is an innovative, customer-focused provider of custom-built computing infrastructure platforms such as network servers, storage, OEM/ODM appliances & embedded systems. For more than 15 years, customers have trusted us for our innovative problem solving combined with holistic design, engineering, manufacturing, logistics and global support services.
Job Title: Software Engineer III
Location: US - NY - New York
Job Description:
- Refresh of request 60754-1. See notes on the old request:
- Resume feedback: The resumes were more focussed on deploying AI/ML solutions or maybe some fine-tuning vs training a large-scale model from scratch.
- Make sure resumes are detailed in the experience we're looking for, providing specific repository names or links to projects. This will help candidates stand out.
- Interview feedback: There was a large volume of candidates who were suspected of using AI assistance during their interview.
- Make sure to screen candidates for this, and let them know that if AI assistance is suspected, they will be automatically disqualified.
- Main winning points of the candidate we offered were his knowledge of a wide variety of model architectures and his experience with profiling tools and low-level kernel debugging.
- The winning candidate's resume showcased a good knowledge of GPU and distributed systems, as well as acceleration, and the candidate provided clear and detailed descriptions of past projects.
- Here's what we need: knowledge of LLM architectures, experience implementing ML pipelines, PyTorch proficiency, large-scale distributed GPU training, and LLM fine-tuning.
Job Description: Systems / ML Engineer
- Meta is seeking a strong Systems / Machine Learning Engineer to join our Fundamental AI Research (FAIR) team, an organization focused on making research breakthroughs in AI.
- Responsibilities include developing deep learning libraries that support large-scale distributed training, open sourcing high quality code and reproducible results for the community, and bringing the latest research to Meta products for connecting billions of users.
- The chosen candidate will work with a diverse and highly interdisciplinary team of scientists, engineers, and cross-functional partners, and will have access to cutting edge technology, resources, and research facilities.
Responsibilities
- Engineer, design, implement, and improve highly-scalable machine learning systems and tools for enabling research
- Apply knowledge of relevant research domains, along with expert coding skills, to platform and framework development projects
- Write clean and robust machine learning code
Minimum Qualifications
- Degree in Computer Science, Computer Engineering or relevant technical field
- 5+ years experience with deep learning
- Experience developing machine learning algorithms or machine learning infrastructure in Python or C/C++
Preferred Qualifications
- Demonstrated software engineering experience via work experience, coding competitions, or widely used contributions in open source repositories (e.g. GitHub)
- Experience in open-source development
Top 3 Skill Sets:
- PyTorch
- Machine Learning
- Python
Top 3 Nice to Have:
- Building open source libraries for machine learning; distributed training for ML models
- Experience with Machine Learning Research, publishing papers
- Experience with Python backends and APIs; experience in software design and development
Mandatory Skills: PyTorch, Machine Learning, Python