Machine Learning Compiler Engineer

at DeepRec.ai
Location Dar es Salaam, Tanzania, United Republic of
Date Posted May 10, 2024
Category Engineering
Job Type Full-time
Currency TZS

Description

About the job

On behalf of a high-growth, Series A deep-tech company with over $50 million in funding, we're seeking a Machine Learning Compiler Engineer to join their growing team.

Responsibilities:

Lower deep learning graphs - translate graphs from common frameworks (PyTorch, TensorFlow, Keras, etc.) down to an intermediate representation (IR) for training, with a particular focus on ensuring reproducibility

Write novel algorithms - transform compute-graph IRs between different operator representations.

Take ownership of two of the following compiler areas:

  • Front-end: Integrate common Deep Learning Frameworks with our internal IR, and implement transformation passes in ONNX to adapt IR for middle-end consumption.
  • Middle-end: Design compiler passes for training-based compute graphs, integrate reproducible Deep Learning kernels into the code generation stage, and debug compilation passes and transformations.
  • Back-end: Translate IR from the middle-end to GPU target machine code.
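To give a flavor of the middle-end work described above, here is a minimal, purely illustrative sketch in Rust (the company's stated default language) of a rewrite pass over a toy compute-graph IR: it fuses a `Mul` node feeding an `Add` node into a single `FusedMulAdd` node. All type and pass names are hypothetical, not the company's actual IR.

```rust
// Toy compute-graph IR: nodes refer to their producers by index.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Input,
    Mul,         // elementwise multiply
    Add,         // elementwise add
    FusedMulAdd, // a * b + c in one operator
}

#[derive(Debug, Clone)]
struct Node {
    op: Op,
    inputs: Vec<usize>, // indices of producer nodes
}

/// A middle-end-style pass: rewrite Add(Mul(a, b), c) into
/// FusedMulAdd(a, b, c). Returns the number of fusions performed.
fn fuse_mul_add(nodes: &mut Vec<Node>) -> usize {
    let mut fused = 0;
    for i in 0..nodes.len() {
        if nodes[i].op != Op::Add {
            continue;
        }
        // Does either operand of this Add come from a Mul?
        let pos = nodes[i].inputs.iter().position(|&j| nodes[j].op == Op::Mul);
        if let Some(pos) = pos {
            let mul_idx = nodes[i].inputs[pos];
            // New operand list: the Mul's inputs (a, b) ...
            let mut new_inputs = nodes[mul_idx].inputs.clone();
            // ... followed by the Add's remaining operand (c).
            new_inputs.extend(
                nodes[i]
                    .inputs
                    .iter()
                    .enumerate()
                    .filter(|&(k, _)| k != pos)
                    .map(|(_, &j)| j),
            );
            nodes[i].op = Op::FusedMulAdd;
            nodes[i].inputs = new_inputs;
            fused += 1;
            // A production pass would also erase the dead Mul node
            // after verifying it has no other users.
        }
    }
    fused
}

fn main() {
    // Graph: out = (x * w) + b
    let mut graph = vec![
        Node { op: Op::Input, inputs: vec![] },   // 0: x
        Node { op: Op::Input, inputs: vec![] },   // 1: w
        Node { op: Op::Input, inputs: vec![] },   // 2: b
        Node { op: Op::Mul, inputs: vec![0, 1] }, // 3: x * w
        Node { op: Op::Add, inputs: vec![3, 2] }, // 4: (x * w) + b
    ];
    let n = fuse_mul_add(&mut graph);
    println!("fused {} node(s); node 4 is now {:?}", n, graph[4].op);
}
```

A real pass in this space would additionally handle use counts, graph erasure, and pattern matching over many operator pairs, but the shape - traverse, match a subgraph, rewrite in place - is the same.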

Required Skills:

  • Fundamental knowledge of traditional compilers (e.g., LLVM, GCC) and graph traversals necessary for compiler code development.
  • Strong software engineering skills, demonstrated by contributing to and deploying production-grade code.
  • Understanding of parallel programming, particularly concerning GPUs.
  • Willingness to learn Rust, as it is our company's default programming language.
  • Ability to work with high-level IR (Clang/LLVM) through middle-end optimization, and/or low-level IR (LLVM targets and target-specific optimizations), especially optimizations specific to GPUs.
  • Highly self-motivated with excellent verbal and written communication skills.
  • Comfortable working independently in an applied research environment.

Bonus Skills:

  • Thorough understanding of computer architectures specialized for training neural network graphs (e.g., Intel Xeon CPU, GPUs, TPUs, custom accelerators).
  • Experience in systems-level programming with Rust.
  • Contributions to open-source Compiler Stacks.
  • In-depth knowledge of compilation in relation to High-Performance Computer architectures (CPU, GPU, custom accelerator, or a heterogeneous system).
  • Strong foundation in CPU and GPU architectures, numeric libraries, and modular software design.
  • Understanding of recent architecture trends and fundamentals of Deep Learning, along with experience with machine learning frameworks and their internals (e.g., PyTorch, TensorFlow, scikit-learn, etc.).
  • Exposure to Deep Learning compiler frameworks such as TVM, MLIR, Tensor Comprehensions, Triton, and JAX.
  • Experience in writing and optimizing highly-performant GPU kernels.