Job Opportunity
As a Senior Research Engineer, you will drive innovation in architecture development for cutting-edge models of various scales, including small, large, and multimodal systems. This role requires expertise in video generation model architectures and a hands-on, research-driven approach. Your mission is to explore and implement novel techniques and algorithms that lead to groundbreaking advancements. This includes curating data, strengthening baselines, and identifying and resolving pre-training bottlenecks to push the limits of model performance.
Pioneer multimodal and video-centric research that moves fast and breaks ground, contributing directly to usable prototypes and scalable systems.
Design and implement novel AI architectures for multimodal language models, integrating text, visual, and audio modalities.
Develop scalable training and inference pipelines optimized for large-scale multimodal datasets and distributed systems spanning thousands of GPUs.
Optimize systems and algorithms for efficient data processing, model execution, and pipeline throughput.
Build modular tools for preprocessing, analyzing, and managing multimodal data assets (e.g., images, video, text).
Collaborate cross-functionally with research and engineering teams to translate cutting-edge model innovations into production-grade solutions.
Prototype generative AI applications that showcase new capabilities of multimodal foundation models in real-world products.
Develop benchmarking tools to rigorously evaluate model performance across diverse multimodal tasks.