✨ About The Role
- The role involves optimizing deep learning models for deployment using frameworks like PyTorch, ONNX, and TensorRT.
- Responsibilities include developing techniques for model quantization and compression to enhance performance.
- The engineer will collaborate with AI researchers and developers to integrate performance optimization techniques into production systems.
- Analyzing and improving existing model architectures for better efficiency and performance is a key task.
- The position requires working with the production engineering team on on-prem deployments.
⚡ Requirements
- A bachelor's or master's degree in Computer Science, Electrical Engineering, or a related field is essential for this role.
- The ideal candidate will have experience implementing modern deep learning architectures such as transformers and CNNs.
- Strong software development skills are necessary to succeed in this position.
- Familiarity with machine learning frameworks and tools such as PyTorch, ONNX, and TensorRT is crucial.
- Candidates should have at least 2 years of industry experience preparing machine learning models for production.