Job Title: ML Networking Performance Engineer
-----------------------------------
Overview:
In today's fast-paced digital landscape, ensuring the optimal performance of Machine Learning (ML) workloads is crucial for businesses to stay ahead. As a Principal Network Development Engineer in our team, you will be responsible for designing and implementing systems that can intelligently measure and baseline ML network performance without direct visibility into customer applications.
Key Responsibilities:
The ideal candidate will take ownership of ML network performance that depends on the EC2 interface, a critical capability that directly impacts customers' ability to train and deploy ML models efficiently. In the immediate term, they'll tackle one of our most pressing challenges: building a comprehensive understanding of network performance for ML workloads in production. This means developing new ways to identify and classify network traffic patterns from ML training and building systems that can automatically tune network configurations based on observed workload characteristics.
Required Skills and Qualifications:
To succeed in this role, you will need:
• A Master's Degree in Computer Science or Engineering, or equivalent experience
• Excellent IP networking fundamentals and extensive experience in the application of IP protocols
• Expertise with major Internet routing protocols; specifically, BGP, OSPF, MPLS, RSVP, and IS-IS
• Expert-level network analysis fundamentals and robust troubleshooting skills; specifically, network performance analysis
• Ability to lead teams of engineers to deliver large scale solutions
• Excellent written and verbal communication skills and an ability to interact effectively with peers and customers
Benefits:
We offer a dynamic and inclusive work environment that values innovation, collaboration, and employee growth. Our team members enjoy a range of benefits, including professional development opportunities, flexible working arrangements, and a competitive compensation package.
Why Join Us?
We are committed to creating a workplace where every individual feels valued, respected, and empowered to contribute their best work. We believe that diversity and inclusion are essential to driving business success and making a positive impact in the world. If you share our passion for innovation, excellence, and social responsibility, we encourage you to explore this opportunity further.