Electrical engineer, ML researcher

I'm Xavier, an electrical engineer based in New York City. I work on core machine learning algorithms, with research spanning model architecture, loss functions, ensembling, multi-source adaptation, and transformers.

A short summary

I received my M.Sc. (2004) and Ph.D. (2010) in Electrical Engineering from Universitat Ramon Llull in Barcelona, Spain.

I worked as an Assistant Researcher at La Salle (2003-2008) before joining Phonetic Arts Ltd in Cambridge, UK, a company focused on high-quality synthetic speech for video games. After three years, we were acquired by Google. At Google London, I led the speech research team, then moved to Google NY to join the Research and Machine Intelligence team.

I previously led the AutoML team, working on ML efficiency, ensembles, multi-source adaptation, and neural architecture search. I'm now focused on LLM efficiency and learning without training. My research explores in-context learning dynamics, how prompts can be transmuted into model weights, and the equivalence between context and parameter updates in transformers, with the aim of improving long-context handling and optimizing memory and token usage at the model level.

This website is a personal platform, and any views expressed here are solely mine and do not represent the opinions of Google.

Work

  1. Company
    Google NY
    Role
    Research scientist
    Date
  2. Company
    Google UK
    Role
    Research scientist
    Date
  3. Company
    Phonetic Arts, Cambridge, UK
    Role
    Research engineer
    Date
  4. Company
    La Salle
    Role
    Research engineer
    Date
  5. Company
    La Salle
    Role
    Intern engineer
    Date

Education

  1. Degree
    PhD - Information technology
    Title
    A hybrid speech synthesis using HMM and concatenative speech generation
    Year
  2. Degree
    MSc - Electrical engineering
    Title
    A dialogue system using Reinforcement learning
    Year
  3. Degree
    BSc - Electrical engineering
    Title
    Speaker detection using an array of microphones
    Year

Interesting projects

These are some of the projects I've worked on during my career.

  • LLMs at Google

    Worked on two key innovations in large language models: Deep fusion, an architecture growth approach for improving efficiency, and Learning without training, a memory-based approach to handling long context.

  • AutoML and model architecture optimization

    Led the development of advanced ML models using efficient techniques. Created AdaNet, an ensemble system, and extended it to general model architecture search, with a particular focus on speech technology applications. Skilled in automating ML models, including feature engineering and multi-objective optimization.

  • High-quality Google TTS

    Spearheaded Google TTS system development from research to production (backend and on-device), implementing the hybrid approach proposed in my PhD for all voices and languages. The main focus was on high-quality, expressive parametric synthesis and hybrid systems for optimal voice naturalness.

  • TTS for videogames

    Led research at Phonetic Arts to incorporate advanced TTS models into gaming, focusing on: a) achieving natural audio processing and expressivity, meeting user expectations, and b) creating the first gaming-specific on-device TTS systems.

  • SALERO project

    SALERO stands for Semantic Audio-visual Entertainment Reusable Objects. I was responsible for building a TTS system in English and Spanish, exposed as an API capable of handling metadata.

    cordis.europa.eu

  • Meteosam project

    As a research engineer, I developed a weather forecast TTS system for television broadcasting, built as a limited-domain unit selection system. It used a curated database of speech segments to produce accurate, natural speech output. The system excelled at conveying weather details but was less versatile than general-purpose TTS systems.