Project description
Work on GPU support for OpenAI/Triton, a language and compiler for writing highly efficient custom deep-learning primitives. Collaborate with the open-source community to analyze, develop, test, and deploy performance improvements for neural networks implemented with Triton on GPUs with ROCm.
Responsibilities
SKILLS
Must have
Nice to have
• Basic understanding of ML technologies
• Experience with GPGPU (general-purpose GPU) computing (HIP, CUDA, OpenCL, etc.)
• Experience with PyTorch
• Experience with LLVM and MLIR compiler infrastructure, including implementing analyses or optimizations
• Knowledge of ROCm infrastructure
• Experience with CMake and make/ninja build systems
• Understanding of GEMM performance fundamentals
• Experience with Docker