Artificial intelligence & robotics

Lingjuan Lyu

Developing Sony’s first-ever vision-centric federated learning platform.

Year Honored
2024

Organization
Sony Research

Region
China

Hails From
China

Lingjuan Lyu’s current research focuses on developing responsible, low-cost, high-performance vision foundation models and visual generative AI models while addressing critical issues such as data privacy, security, and copyright. She has led her team in conducting both fundamental and business-driven research in these fields.

Her team has developed Sony’s first vision foundation model, the industry’s most cost-effective diffusion transformer model, an end-to-end privacy toolbox, and a vision-centric federated learning platform. Her team has also deployed frontier on-device privacy and on-device multi-task solutions on multiple edge devices, including the world’s first smart vision sensor with edge processing capabilities, positively impacting millions of users.

Notably, her team developed a 100-million-parameter vision foundation model (VFM) in eight months. This model supports 17 practical vision tasks and outperforms many well-known VFMs and multimodal foundation models. Using only 8 H100 GPUs for 2.5 days and only legally obtained data, her team also trained MicroDiT, a model of Stable Diffusion v1/v2 quality, from scratch. It is currently the cheapest image generation model trained from scratch: its training cost of just $1,890 is 118 times lower than that of Stable Diffusion.
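
To put the training figures in perspective, the short Python sketch below works through the arithmetic they imply. The inputs (8 GPUs, 2.5 days, $1,890, a 118-fold cost ratio) come from the text; the per-GPU-hour rate and the baseline cost are derived here as rough estimates and are not figures stated by Sony or the profile.

```python
# Figures quoted in the profile.
NUM_GPUS = 8
TRAINING_DAYS = 2.5
TRAINING_COST_USD = 1890
COST_RATIO = 118  # stated cost advantage over Stable Diffusion

# Derived values (assumptions: continuous use of all GPUs, flat hourly rate).
gpu_hours = NUM_GPUS * TRAINING_DAYS * 24
implied_rate = TRAINING_COST_USD / gpu_hours
implied_baseline = TRAINING_COST_USD * COST_RATIO

print(f"GPU-hours: {gpu_hours:.0f}")
print(f"Implied rate: ${implied_rate:.2f} per GPU-hour")
print(f"Implied baseline cost: ${implied_baseline:,.0f}")
```

Under these assumptions, the run amounts to 480 GPU-hours at roughly $3.94 per GPU-hour, and the 118-fold ratio implies a baseline training cost of about $223,000.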