ML Ops Engineer

MARZ Office, 1220 Dundas St E, Toronto, Ontario, Canada ● Virtual ● Requisition No. 7
February 26, 2025

Who we need

 

As an ML Ops Engineer at Lipdub AI, you will develop and maintain end-to-end ML pipelines that ensure the seamless deployment, monitoring, and optimization of our AI models. You will collaborate with AI researchers, data scientists, and software engineers to deploy state-of-the-art ML models for real-time video and audio applications.

 


What you'll do



  • Design, develop, and optimize ML pipelines for training, validation, and inference.

  • Automate deployment of deep learning and generative AI models for real-time applications.

  • Implement model versioning, reproducibility, and rollbacks for seamless updates.

  • Deploy and manage ML models on cloud platforms (AWS, GCP, Azure) using containerized solutions.

  • Optimize real-time inference performance (TensorRT, ONNX Runtime, PyTorch).

  • Work with GPU acceleration, distributed computing, and parallel processing for high-performance AI workloads.

  • Fine-tune models to reduce latency and improve scalability in real-time AI-driven applications.

  • Build and maintain CI/CD pipelines for ML models (GitHub Actions, Jenkins, ArgoCD).

  • Automate model retraining, validation, and deployment to ensure continuous improvement.

  • Develop monitoring solutions for model drift, data integrity, and inference performance.

  • Ensure compliance with security, data privacy, and AI ethics standards.



We are looking for an engineer with the following experience and skills:

 

  • 3+ years of experience in ML Ops, DevOps, or AI model deployment.

  • Strong proficiency in Python and ML frameworks (TensorFlow, PyTorch, ONNX).

  • Experience deploying ML models using Docker, Kubernetes, and serverless architectures.

  • Hands-on experience with ML pipeline tools (Argo Workflows, Kubeflow, MLflow, Airflow).

  • Expertise in cloud platforms (AWS, GCP, or Azure) for AI/ML model hosting.

  • Experience with GPU-based inference acceleration (CUDA, TensorRT, NVIDIA DeepStream).

  • Proficiency in CI/CD workflows and automated testing for ML models.

  • Solid understanding of real-time inference optimization and scalable ML infrastructure.

  • Excellent Technical Judgment: You can design and implement elegant, clean solutions that meet today's requirements while allowing for tomorrow's growth. You know how to pick the right tool for the job.

  • Strong Automation Focus: You seek to script and automate as much of your work as possible.

  • Proven understanding of distributed systems and computing architectures.

  • Motivation: You are self-driven and work well independently.

  • You have working experience with Kubernetes, Docker, or microservices in general.

  • BS or MS in Computer Science or equivalent work experience.

 

Nice to have:

  • Some experience with CUDA Programming.

  • Experience working with LLMs and generative AI models in production.

  • Basic networking knowledge.

  • Knowledge of distributed computing frameworks (Ray, Horovod, Spark).

  • Experience with edge AI deployment (Triton Inference Server, TFLite, CoreML).



Other details

  • Pay type: Salary