✨ About The Role
- Lead critical work on the "Engine" service at OpenAI, which powers model inference for GPT-4 and ChatGPT
- Own substantial portions of the inference stack and ensure current and future models run efficiently at high scale
- Coordinate inference needs across OpenAI's teams and products, while hiring and growing a world-class AI systems engineering team
- Create a diverse, equitable, and inclusive culture that fosters radical candor and challenges groupthink
- Work with core technologies such as Python, PyTorch, CUDA, Triton, Redis, InfiniBand, NCCL, and NVLink in a large-scale GPU deployment across Kubernetes clusters
⚡ Requirements
- Experienced engineering manager with a track record of leading high-scale distributed systems and ML systems
- Deep understanding of ML systems and modern LLMs, with a focus on highly available, reliable, production-grade systems at scale
- Proven ability to build inclusive teams and close competitive candidates in a challenging hiring market
- Strong communication skills, with the ability to convey a compelling vision of the future and share learnings effectively
- Comfortable with ambiguity and rapid change, and able to add structure and order when needed