
Liquid AI

Member of Technical Staff - Post Training, Applied (Audio)

Applied ML · San Francisco · Remote · Full-time · Posted 1 week ago

About the role

About Liquid AI

Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.

The Opportunity

LFM2.5-Audio is Liquid's end-to-end multimodal speech and text language model. At 1.5B parameters, it handles speech-to-speech conversation, ASR, and TTS without requiring separate components, making it uniquely suited for real-time, on-device deployment.

We're now bringing this model to enterprise customers. The core challenge: teaching audio models to understand user intents and translate them into structured tool calls. Think voice-driven function calling, where a spoken request triggers the right API, extracts the right parameters, and confirms back to the user in natural speech.
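The flow described above can be sketched as a structured tool call validated against a schema. This is an illustrative toy in Python; the tool names, JSON format, and `validate_tool_call` helper are hypothetical, not Liquid's actual function-calling interface.

```python
import json

# Hypothetical tool schema -- illustrative only, not Liquid's actual format.
TOOLS = {
    "set_thermostat": {"params": ["temperature", "unit"]},
    "play_music": {"params": ["artist", "playlist"]},
}

def validate_tool_call(raw: str) -> dict:
    """Parse a model-emitted tool call and check it against the schema."""
    call = json.loads(raw)
    schema = TOOLS[call["name"]]            # the "right API"
    unknown = set(call["arguments"]) - set(schema["params"])
    if unknown:
        raise ValueError(f"unexpected parameters: {unknown}")
    return call

# A spoken request like "set the temperature to 21 degrees" might yield:
model_output = '{"name": "set_thermostat", "arguments": {"temperature": 21, "unit": "celsius"}}'
call = validate_tool_call(model_output)
print(call["name"], call["arguments"])
```

In a real system, the validated call would be executed and its result rendered back to the user as natural speech for the confirmation step.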

This role sits at the intersection of frontier audio models and real-world deployment. You'll own the applied post-training work that adapts LFM2.5-Audio for customer use cases end-to-end, from data generation through delivery. Unlike most roles that force a trade-off between customer impact and foundational work, this one gives you both: deep ownership over how audio models are adapted, evaluated, and shipped, and a direct line into the evolution of Liquid's post-training and audio stacks.


If you care about data quality, evaluation, and making models actually work in production, this is a chance to shape how applied audio AI is done at a foundation model company.

What We’re Looking For

We need someone who:

  • Takes ownership: Owns customer post-training projects end-to-end for audio workloads, from requirements through delivery and evaluation.

  • Thinks end-to-end: Can reason across audio data pipelines, speech-text alignment, model adaptation, and evaluation as a connected system.

  • Is pragmatic: Optimizes for model quality and customer outcomes over publications or theory.

  • Thrives under constraints: On-device, low-latency, memory-limited audio systems excite you. You see constraints as design parameters, not blockers.

The Work

  • Act as the technical owner for enterprise audio post-training engagements.

  • Translate customer requirements into concrete post-training specifications and workflows for LFM2.5-Audio and future audio models.

  • Design and build function calling capabilities for audio models: training models to map spoken user intents to structured tool calls (API invocations, parameter extraction, confirmation flows).

  • Design and execute data generation pipelines for speech-to-speech and text-to-text training, including synthetic dialogue, function calling examples, and intent-action pairs.

  • Run supervised fine-tuning, preference alignment, and reinforcement learning workflows on audio language models.

  • Design task-specific evaluations for audio function calling (intent recognition accuracy, parameter extraction, end-to-end task completion) and feed learnings back into core post-training pipelines.
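The evaluation bullet above can be made concrete with a minimal scoring sketch: intent recognition accuracy plus parameter-level exact match. The records and metric names here are hypothetical examples, not a real Liquid evaluation set.

```python
def evaluate(predictions, references):
    """Score predicted tool calls against references on two axes:
    did the model pick the right tool, and did it extract each parameter?"""
    intent_hits = 0
    param_hits = param_total = 0
    for pred, ref in zip(predictions, references):
        if pred["name"] == ref["name"]:
            intent_hits += 1
        for key, value in ref["arguments"].items():
            param_total += 1
            if pred.get("arguments", {}).get(key) == value:
                param_hits += 1
    return {
        "intent_accuracy": intent_hits / len(references),
        "param_exact_match": param_hits / param_total,
    }

refs = [
    {"name": "set_thermostat", "arguments": {"temperature": 21}},
    {"name": "play_music", "arguments": {"artist": "Miles Davis"}},
]
preds = [
    {"name": "set_thermostat", "arguments": {"temperature": 21}},
    {"name": "play_music", "arguments": {"artist": "John Coltrane"}},
]
print(evaluate(preds, refs))  # {'intent_accuracy': 1.0, 'param_exact_match': 0.5}
```

End-to-end task completion would sit on top of metrics like these, checking whether the full spoken-request-to-confirmation loop succeeded rather than individual fields.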

Desired Experience

Must-have:

  • Hands-on experience with post-training for language models (SFT, preference alignment, and/or RL).

  • Experience with data generation and evaluation pipelines for LLM or audio model training.

  • Strong intuition for data quality and evaluation design.

  • Familiarity with function calling, tool use, or structured output training for language models.

Nice-to-have:

  • Experience with speech or audio language models (speech-to-speech, ASR, TTS, or multimodal audio-text systems).

  • Prior exposure to customer-facing or applied ML delivery environments.

  • Experience with alignment or RL techniques beyond basic supervised fine-tuning.

  • Familiarity with on-device or low-latency inference constraints.

What Success Looks Like (Year One)

  • Independently owns and delivers enterprise audio post-training projects with minimal oversight.

  • Has built and shipped function calling capabilities that reliably translate spoken user intents into tool calls for production use cases.

  • Is trusted by customers as the technical owner, demonstrating strong judgment and delivery quality.

  • Has made durable contributions to Liquid's general-purpose post-training and audio pipelines by feeding applied learnings back into baseline model development.

What We Offer

  • Real ML work: You will fine-tune audio and speech models, build audio data pipelines, and ship solutions to enterprise customers under real-time on-device constraints.

  • Compensation: Competitive base salary with equity in a unicorn-stage company.

  • Health: We pay 100% of medical, dental, and vision premiums for employees and dependents.

  • Financial: 401(k) matching up to 4% of base pay.

  • Time Off: Unlimited PTO plus company-wide Refill Days throughout the year.