About the Role:
We’re looking for a Machine Learning Infrastructure Engineer to help us design and optimize the backbone of our AI platforms. You’ll work at the intersection of ML, DevOps, and cloud engineering, keeping our training and inference pipelines fast, reliable, and scalable.
Responsibilities:
- Build and maintain scalable ML infrastructure for training and deployment.
- Collaborate with data scientists and ML engineers to optimize workflows.
- Automate the model lifecycle, from training through deployment and monitoring.
- Evaluate and implement distributed computing frameworks (Kubernetes, Ray, etc.).
Requirements:
- Strong experience with cloud infrastructure (AWS/GCP/Azure).
- Proficiency in Python and infrastructure-as-code tools (Terraform, Helm).
- Familiarity with containerization and orchestration tools (Docker, K8s).
- Experience with ML workflow tools (MLflow, Kubeflow, or similar) is a plus.
