Quantitative Development - DevOps Engineer
Job Function Summary
The Central Liquidity Strategies (CLS) business manages a number of portfolios and products designed to optimize the firm's trading and execution approach by providing internal liquidity solutions for portfolio managers on both a risk and agency basis.
We are seeking a highly driven, results-oriented, and opinionated DevOps leader with experience managing research infrastructure, deploying critical applications, and working with large volumes of data. The role is to build battle-tested infrastructure and improve the research development experience.
Principal Responsibilities
* Leadership: the candidate will design and implement infrastructure, and will advise on and enforce best practices to maximize research and development velocity
* Research: the candidate is expected to keep up with state-of-the-art tools used in the field and to continuously evaluate which tools and practices best fit our use cases
* Machine Learning Operations (MLOps) + Developer Experience (DevEx):
  * Create dependable and reproducible polyglot (Python, native extensions, CUDA) environments for rapidly iterating research projects that can be easily deployed to production
  * Enforce best practices and packaging standards for large research codebases
  * Use cloud infrastructure to help scale research jobs
* Infrastructure automation:
  * Develop CI/CD pipelines for research processes and live trading apps
  * Develop robust monitoring solutions for infrastructure and deployed applications
  * Automate recurring jobs with tools like Airflow/Prefect
* Performance engineering:
  * Be familiar with best practices for profiling and performance monitoring to assist with performance investigations
  * Develop solutions that empower researchers and developers to understand the performance of their code
Qualifications/Skills Required
* Experience: 10+ years of research-focused DevOps (HPC, ML research, quant research), plus experience with high-availability production deployments
* Strong communication skills and ability to work with many stakeholders in a team environment
* Leadership skills: ability to work with constraints, make decisions under time pressure, and own your work
* Development skills: experience writing clean, robust, and testable code to automate infrastructure management and deployment processes
* Systems knowledge:
  * Familiarity with Linux internals
  * Understanding of package management and how software is deployed on systems
* Python:
  * Strong understanding of Python internals
  * Familiarity with the latest standards and tools in the packaging ecosystem (e.g., uv) and build tools like hatchling
  * Familiarity with tools like Nix, Conda, and Pixi, and with containers