Pika

Research Scientist, Foundation Model

Research | Palo Alto HQ | On-Site | Full-Time | Posted May 16, 2026

USD 18500k–40000k/yr

About the Role

 

At Pika, we are pioneering the next generation of creative infrastructure built around real-time, multimodal generation and intelligent agentic platforms. We are seeking accomplished Research Scientists in Foundation Models with expertise in pre-training and mid-training large-scale multimodal foundation models to advance our mission of making agentic, real-time generative technology accessible and transformative for millions of creators. This is a staff- or lead-level opportunity.

 

As a key member of our research team, you will design and implement core technologies, develop new methodologies for large-scale multimodal pre-training and mid-training (text, image, audio, and video), and drive innovative approaches to foundation model architecture. You will collaborate closely with engineering and product teams, shaping the future of real-time creative and agentic platforms at scale.

 

What You’ll Do

 
  • Lead research and development on pre-training and mid-training of multimodal foundation models at scale.

  • Design and prototype novel algorithms and architectures for high-fidelity, real-time multimodal synthesis and interaction across modalities.

  • Build scalable data curation pipelines and model training strategies for broad, diverse, and sensory-rich datasets.

  • Advance state-of-the-art techniques in diffusion, autoregressive, and other generative models for large-scale pre-training and fine-tuning.

  • Identify, create, and leverage large, high-quality cross-modal datasets.

  • Bring research advancements into production-ready systems in collaboration with engineering and product teams.

  • Publish work in top-tier conferences and journals, and clearly communicate research both internally and externally.

  • Stay at the forefront of foundation model and real-time multimodal AI research.

 

What We’re Looking For

 
  • 5+ years of research experience in large-scale pre-training/mid-training of multimodal foundation models (LLMs, VLMs, Audio LMs, or similar), ideally at the staff or lead scientist level.

  • Track record as a first author on major publications in top conferences or journals (e.g., NeurIPS, ICML, ICLR).

  • Extensive hands-on experience with large-scale multimodal model design, training, and deployment.

  • Deep understanding of, and implementation experience with, generative architectures (diffusion, autoregressive, cross-modal, etc.).

  • Expertise in high-throughput, scalable dataset curation and model pipeline optimization for multimodal applications.

  • Strong programming and prototyping skills (Python, PyTorch, TensorFlow, etc.) and experience deploying research into production systems.

  • Excellent communication and collaboration skills, and a passion for building creativity-enabling technology.

 

What We Offer

 
  • Competitive salary and substantial equity in a high-growth startup

  • Full health benefits + 401k matching and more

  • Collaborative, mission-driven team environment with major growth opportunities

  • Flexible on-site/remote hybrid (HQ in Palo Alto, CA)

 

About Pika

 

Pika empowers creators by building state-of-the-art agentic and multimedia platforms. Our vision is to break down technical barriers to creativity, making real-time generation and intelligent orchestration accessible to all. Join us and help shape the next evolution of creative technology!

 

If you are a leading researcher excited to build and scale real-time multimodal foundation models, we want to hear from you.