Generative AI Company - Research Scientist, Foundation Models
Salary: $250k–$350k + significant equity
Location: New York (Hybrid)
We are working with an exciting start-up in the generative AI space that has just closed a landmark Series A funding round, backed by a consortium of leading investors and a founding team with a history of disrupting industries. Their mission is to unlock unprecedented creativity and intelligence through cutting-edge generative AI, empowering individuals and industries to achieve what was once unimaginable. They are building a team of the brightest minds to revolutionize how we interact with and create content, and they want you to be a key part of this journey.
Are you an exceptional Research Scientist with a profound passion for foundation models and their application in generative AI? Do you dream of pushing the boundaries of what's possible with large language models, multi-modal generation, and beyond?
As a Research Scientist, Foundation Models with our client, you'll be at the forefront of developing the next generation of generative AI capabilities, driving innovation that will captivate and inspire.
In this high-impact role, you will:
- Develop and refine novel deep learning architectures and training methodologies to significantly enhance the performance and capabilities of large-scale foundation models for text, image, audio, and multi-modal generation.
- Design and implement cutting-edge machine learning techniques for fine-tuning, adaptation, and personalization of generative models for a diverse range of downstream applications.
- Lead the curation and construction of massive, high-quality datasets, focusing on aspects like human preference learning, alignment, and ethical considerations in generative AI.
- Collaborate closely with the product and engineering teams to transform research prototypes into robust, scalable, and deployable generative AI products that reach millions.
- Contribute to research publications and actively participate in the wider AI community, sharing insights and fostering open innovation.
Who we're looking for:
- Education: An MS or PhD in Computer Science, Artificial Intelligence, Machine Learning, or a relevant quantitative field.
- Experience: A minimum of 3 years of hands-on research and development experience, with a strong focus on one or more of these key areas: Foundation Models, Large Language Models, Multi-modal AI, or Distributed Training for large-scale models.
- Technical Proficiency: Deep expertise in the architecture and training of large language and multi-modal foundation models, including practical experience with model parallelism, distributed training, and optimization techniques.
- Coding Ability: Exceptional command of deep learning frameworks such as PyTorch or JAX, and advanced proficiency in Python for research and rapid prototyping.
- Research Influence: A proven track record of impactful publications in top-tier machine learning and AI conferences (e.g., NeurIPS, ICML, ICLR, AAAI, ACL, EMNLP).
Bonus points if you bring experience in:
- MLOps practices and platforms, specifically for large models.
- Scalable cloud computing environments (AWS, GCP, Azure) and orchestration with Kubernetes/Docker.
- Big data processing tools (Spark, Ray, Dask) and GPU optimization techniques (CUDA).
- Human-in-the-loop systems for model refinement.
This is a phenomenal opportunity to join a fast-paced, intellectually stimulating environment where your work will directly influence the future of generative AI. Come build the impossible with us!