Quantization Research Engineer

As part of the data science team, you will focus on model optimization for a custom GPNPU architecture. You will research, prototype, and implement novel quantization algorithms tailored to its hardware constraints. In addition to applying existing techniques, you will develop custom low-precision methods to maximize performance on the proprietary GPNPU. Your contributions will directly shape the quantization capabilities of the platform's SDK and influence future hardware design features.

This engineering role is primarily based in-office at our California Bay Area location. We prioritize strong technical collaboration and rapid iteration through in-person problem-solving. The team also gathers periodically for onsite meetings and offsite events to align strategic priorities.

Responsibilities

Design statistically rigorous experiments to compare Post-Training Quantization (PTQ), Quantization-Aware Training (QAT), and mixed-precision schemes on vision, language, and multimodal models.
Implement custom quantization algorithms, whether written from scratch, adapted from existing techniques, or developed as novel approaches, to match the unique architectural features and numerical formats of the GPNPU.
Build calibration datasets and develop Python-based tools (notebooks/dashboards) to track trade-offs between accuracy, latency, power, and memory.
Perform layer-level error analysis to guide the selection of numerical formats.
Partner with the compiler team to integrate research findings into turnkey SDK flows and reference configurations.
Publish internal white papers and external benchmarks, and present technical results to customers and at industry events.
Monitor academic literature on model compression and efficient inference, translating promising research into reproducible prototypes.

Requirements

Education & Experience: M.S. or Ph.D. in CS, EE, Applied Math, or a related field, with 5+ years of experience in ML model optimization or data-science-driven research.
Technical Expertise: Deep understanding of fixed-point arithmetic, quantization theory, numerical analysis, and statistical calibration.
Implementation Skills: Strong ability to implement quantization algorithms from first principles rather than relying solely on existing frameworks.
Software Proficiency: Fluent in Python, deep learning frameworks (PyTorch or TensorFlow), data analysis libraries (NumPy/Pandas/SciPy), and visualization tools.
Hardware Interfacing: Experience implementing custom quantizers and understanding how they interact with hardware constraints such as bit-width, numerical format, and supported operations.
Toolkits: Hands-on experience with at least one quantization toolkit (e.g., PyTorch FX, TensorFlow Lite, ONNX Runtime, TVM, or MLIR Quant) and the ability to extend its functionality.
Model Knowledge: Working knowledge of CNNs, Transformers, and modern DNN architectures.

Bonus Qualifications

Experience with custom hardware accelerators, Digital Signal Processors (DSPs), or specialized neural processing units.