Introduction to the position
Join Haldorix, a startup studio turning industrial challenges into scalable ventures powered by AI.
We’re building NITRA, an intelligent industrial vision system for real-time monitoring in textile production. As we scale to more sites, we’re transitioning from cloud-based processing to Edge AI to improve performance, cut latency, and lower cost.
We’re looking for an MLOps Engineer – Edge AI Specialist to lead the design and deployment of our on-premises inference infrastructure. You’ll play a pivotal role in enabling our deep learning and generative models (YOLOv8, Stable Diffusion, BERT) to run efficiently on embedded GPU hardware, delivering reliable, low-latency insights directly on the factory floor.
Your role
Architecture & Infrastructure:
Design and deploy a hybrid edge/cloud architecture optimized for real-time video analytics. Define hardware specs (Jetson Orin, RTX A2000, Intel NUC) and ensure reliable communication between edge servers and the cloud.
Model Optimization:
Convert and optimize deep learning models for embedded GPUs using ONNX Runtime and TensorRT. Apply quantization (INT8, FP16) and pruning techniques to reduce latency and memory footprint.
MLOps Pipeline:
Build and maintain a CI/CD pipeline tailored for edge deployment — containerized models, version control, automated OTA updates, and proactive performance monitoring.
Orchestration & Deployment:
Deploy and manage fleets of edge servers using K3s/MicroK8s. Implement declarative deployments (Argo CD/Flux) and centralized management via KubeEdge or AWS IoT Greengrass.
Security & Compliance:
Enforce full data locality, end-to-end encryption (TLS/mTLS), and anonymization pipelines to ensure GDPR compliance.
Monitoring & Reliability:
Set up comprehensive dashboards (Prometheus, Grafana, Loki) to track inference performance, GPU utilization, and uptime against a target above 99%.
LLM Integration:
Support deployment of a centralized LLM server (Claude, GPT-4, or an open-source model) powering RAG-based analytics and real-time conversational interfaces for clients.
Field Operations:
Conduct on-site installations, validations, and troubleshooting sessions with client teams. Train local technicians and maintain up-to-date documentation for reproducibility and scalability.
Your team
You’ll join a multidisciplinary engineering team focused on bringing real-time AI to industrial environments. You’ll collaborate closely with computer vision, backend, and infrastructure engineers and report to the Technical Lead, who oversees deployment strategy.
Our culture values autonomy, precision, and hands-on problem solving. Every team member contributes to the full lifecycle, from architecture to on-site deployment.
Your qualifications
Required:
- 3–5 years of experience deploying AI models in production
- Strong expertise in MLOps, edge computing, and embedded GPU environments
- Proven track record with TensorRT, ONNX Runtime, quantization (INT8/FP16), and model pruning
- Proficiency in Python (PyTorch, TensorFlow, FastAPI) and DevOps tools (Docker, CI/CD, Ansible)
- Solid understanding of Kubernetes/K3s, networking, and Linux administration
- Experience with Prometheus, Grafana, and GPU performance profiling
- Excellent documentation and troubleshooting skills
Nice to Have:
- Familiarity with NVIDIA Jetson and other embedded AI hardware
- Experience with fleet management systems (AWS IoT Greengrass, KubeEdge, Balena)
- Knowledge of Stable Diffusion and LLM pipelines (RAG, Pinecone, Weaviate, ChromaDB)
- Background in industrial computer vision, IoT, or real-time systems
- Understanding of GDPR compliance and data anonymization for on-prem AI systems
Benefits
- Join a startup studio scaling high-impact AI ventures from prototype to production
- Work on cutting-edge Edge AI systems deployed across industrial sites
- Collaborate with an agile, expert team blending AI, hardware, and DevOps engineering
- Gain hands-on experience with inference optimization, GPU benchmarking, and large-scale orchestration
- Be part of a project delivering tangible cost and performance breakthroughs in manufacturing AI
Recruitment process
- Jobzyn AI interview (25–45 min)
- Technical interview (1h) with the Lead Developer or Technical Architect
- Practical test (2–3h) simulating a real-world MLOps deployment case
- Final interview with the NITRA team and Haldorix partners