Job Description
Role Overview
We are building a European AI Infrastructure & Platform Operations team responsible for operating large-scale AI infrastructure environments powered by NVIDIA GPUs, high-performance networking, Kubernetes, and next-generation platform technologies.
What You Will Do
Monitor, operate, and support production AI infrastructure platforms. Investigate and resolve infrastructure, networking, hardware, and platform-related incidents.
Requirements
- 3+ years of experience in infrastructure operations, platform operations, network operations, site reliability engineering, cloud operations, datacenter operations, or related technical roles
- Strong Linux administration and troubleshooting skills
- Good understanding of networking concepts and experience diagnosing infrastructure-related issues
- Working knowledge of Kubernetes in production environments
- Experience supporting production infrastructure and services
- Strong analytical and problem-solving skills
- Experience working within structured operational and incident management processes
- Excellent communication and collaboration skills
- Ability to work within a shift-based operational environment
Benefits
- Work with some of the most advanced AI infrastructure environments in production today
- Gain exposure to NVIDIA GPU technologies, Kubernetes platforms, and high-performance networking environments
- Help define how next-generation AI infrastructure is operated and supported
- Be part of a team shaping the future of AI-powered operations through k0rdent AI
- Join a growing organisation investing heavily in AI infrastructure and platform services
Originally posted on Himalayas