inference systems using frameworks like vLLM, TGI, and Ollama. Optimize GPU usage (CUDA, cuDNN, VRAM-aware batching). Maintain... of quantization techniques (GGUF/GPTQ/AWQ). Experience working with GPU optimization and the CUDA stack. Ability to build solutions...
Senior Compiler Engineer
for GPGPU architectures via HIP, CUDA, or OpenCL. - Foundational understanding of GEMM (General Matrix Multiply) execution...
Luxoft · Mon, 22 Jun 2026 22:09:59 GMT