Job Location: New York, NY, USA
We are seeking a highly skilled Machine Learning Engineer with hands-on experience in diffusion models, deep generative modeling, and deployment of AI systems using TensorFlow or PyTorch. The ideal candidate will work on cutting-edge projects involving denoising diffusion probabilistic models (DDPMs), latent diffusion models (LDMs), or related architectures to develop state-of-the-art generative AI applications.
This role requires a deep understanding of machine learning, model optimization, and efficient large-scale inference. The candidate will also work closely with data engineers, research scientists, and software engineers to bring production-ready AI solutions to market.
Key Responsibilities
Design, develop, and optimize diffusion models (DDPMs, LDMs) for tasks such as image generation, text-to-image synthesis, or image denoising.
Implement and fine-tune deep learning models using PyTorch or TensorFlow for generative AI applications.
Develop scalable and efficient ML pipelines for training and inference using multi-GPU, TPU, or distributed computing environments.
Optimize models for latency, memory efficiency, and performance through techniques such as quantization, pruning, distillation, and mixed-precision training.
Integrate diffusion models into production systems, including API endpoints, cloud-based inference, and real-time processing.
Collaborate with research teams to experiment with new architectures and improvements in generative AI.
Utilize cloud services (AWS, GCP, Azure) and MLOps tools (MLflow, Kubeflow, ONNX, TensorRT) to deploy and monitor models.
Stay up to date with state-of-the-art generative modeling research and apply promising new methods to ongoing projects.
Required Qualifications
Experience with Diffusion Models:
Strong knowledge of denoising diffusion probabilistic models (DDPMs), Stable Diffusion, latent diffusion models (LDMs), or similar generative AI techniques.
Hands-on experience implementing diffusion models from research papers and deploying them in real-world applications.
Deep Learning & Model Optimization:
Expertise in deep learning architectures (CNNs, VAEs, GANs, Transformers, or ResNets) for generative modeling.
Proficiency in TensorFlow or PyTorch with experience in writing custom training loops, fine-tuning, and debugging large models.
Understanding of latent space representation, noise scheduling, and generative priors in deep generative models.
Efficient Model Training & Deployment:
Experience with multi-GPU/TPU training, data parallelism, model parallelism, and distributed training frameworks.
Knowledge of model acceleration techniques (e.g., ONNX, TensorRT, quantization, mixed-precision training, JIT compilation, XLA optimization).
Software Engineering & MLOps:
Strong proficiency in Python, experience with containerized deployment (Docker, Kubernetes), and familiarity with serving frameworks such as FastAPI or Flask.
Experience working with cloud services (AWS, GCP, or Azure) for training and deployment.
Familiarity with MLOps workflows, versioning, and monitoring tools such as MLflow, Kubeflow, or Weights & Biases.