Artificial Intelligence Engineer
- To be advised
- Permanent, Full Time
- Closing in 27 days
- Danucore Limited
Key responsibilities:
Design and implement high-performance distributed inference systems for running large language models and multi-modal AI models at scale.
Optimise model serving infrastructure for maximum throughput, minimal latency, and optimal power efficiency.
Develop and maintain deployment pipelines for efficient model serving and monitoring in production.
Research and implement cutting-edge techniques in model optimisation, including pruning, quantisation, and sparsity methods.
Design, build and configure experimental hardware setups for model serving and optimisation.
Design and implement robust testing frameworks to ensure reliable model serving.
Collaborate with the team to build and improve our distributed inference platform, making it more accessible and efficient for users.
Monitor, optimise and document system performance metrics, including latency, throughput, power consumption and benchmark scores.
We are looking for someone with a passion for AI and experience in building, training, fine-tuning and deploying state-of-the-art LLMs and other deep-learning models on large multi-GPU clusters, and in applying advanced optimisation techniques to maximise accuracy-per-watt and inference speed.
Experience in optimising across the hardware/software boundary: profiling CPU/GPU utilisation, implementing various parallelism techniques, and minimising cross-node latency through careful network-layer choices.
Good working knowledge of leading AI runtimes: PyTorch, vLLM, TensorRT, ONNX Runtime, llama.cpp.
Experience with distributed inference engines and cluster schedulers: Ray Serve, Triton Inference Server, vLLM, Slurm.
Knowledge of AI compilers: OpenXLA, torch.compile, OpenAI's Triton, MLIR, Mojo, TVM, MLC-LLM.
Good working knowledge of inter-process communication: message queues, MPI, NCCL, gRPC.
Good working knowledge of high-performance networking: RDMA, RoCE, InfiniBand, NVIDIA GPUDirect, NVLink, NVIDIA DOCA, Magnum IO, DPDK, SPDK.
Experience with model quantisation, pruning, and sparsity techniques for performance optimisation.
It would be a bonus if you have a homelab, blog, or collection of git repos showcasing your talents and interests, or if you have contributed to open-source projects or publications in the field of AI/ML systems optimisation.
We're looking for future experts with curious minds and a growth mindset.
If you're ready to challenge the norm and think outside the box, we'd love to hear from you.
Email your cover letter and CV to jobs@danucore.com with subject "AI Engineer - Isle of Man".
In your cover letter, please say which parts or technologies mentioned in this job advert you have experience with and can add value in, and include links to any public work, e.g. GitHub profile, blogs or papers.