
Artificial intelligence & robotics

Guohao Dai

Significantly improving the computational and energy efficiency of AGI.

Year Honored
2024

Organization
Shanghai Jiao Tong University; Infinigence AI

Region
China

Hails From
China

The rapid advancement of artificial intelligence, especially large language models, is pushing humanity into the era of AGI. However, the enormous computational demands that come with this progress have created computing-power shortages and soaring energy consumption, which have emerged as core challenges for the continued development of the AI industry.

Guohao Dai has long been committed to research on sparse computing and software–hardware co-design. His approach rests on three core ideas: prior-knowledge-driven structured sparsity, machine-learning-driven dynamic compilation, and fine-grained parallel sparse architectures. By reducing the amount of computation and improving hardware utilization, his methods enable hardware built on modest manufacturing processes, with limited peak performance, to match high-end hardware. This yields an order-of-magnitude increase in effective computing power, significantly boosting both the computational and energy efficiency of future AGI systems.
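To make the structured-sparsity idea concrete, the sketch below shows a block-sparse matrix multiply in NumPy that skips weight tiles known in advance to be all-zero, so the work scales with the number of non-zero tiles rather than the full matrix size. This is an illustrative sketch of the general principle only, not Dai's published kernels or hardware designs; the tile size, the mask, and the function names are assumptions chosen for this example.

```python
# Minimal illustration of prior-knowledge-driven structured sparsity:
# a block-sparse matrix multiply that skips weight tiles known to be zero.
# (Illustrative sketch only; tile size, mask, and names are assumptions.)
import numpy as np

BLOCK = 64  # tile size, chosen arbitrarily for the example

def block_sparse_matmul(x, w_blocks, nonzero_mask, out_dim):
    """x: (batch, in_dim); w_blocks[(i, j)]: (BLOCK, BLOCK) weight tile;
    nonzero_mask[i][j] is True only for tiles that actually carry weights."""
    batch, in_dim = x.shape
    y = np.zeros((batch, out_dim), dtype=x.dtype)
    for i in range(in_dim // BLOCK):              # tile over input features
        xi = x[:, i * BLOCK:(i + 1) * BLOCK]
        for j in range(out_dim // BLOCK):         # tile over output features
            if not nonzero_mask[i][j]:
                continue                          # structured sparsity: skip zero tiles entirely
            y[:, j * BLOCK:(j + 1) * BLOCK] += xi @ w_blocks[(i, j)]
    return y

# Example: a 512x512 weight matrix in which only ~25% of the 64x64 tiles are non-zero,
# so roughly 75% of the multiply-accumulate work is skipped.
rng = np.random.default_rng(0)
in_dim = out_dim = 512
mask = rng.random((in_dim // BLOCK, out_dim // BLOCK)) < 0.25
w_blocks = {(i, j): rng.standard_normal((BLOCK, BLOCK)).astype(np.float32)
            for i in range(in_dim // BLOCK)
            for j in range(out_dim // BLOCK) if mask[i][j]}
x = rng.standard_normal((8, in_dim)).astype(np.float32)
y = block_sparse_matmul(x, w_blocks, mask, out_dim)
print(y.shape, f"non-zero tiles: {mask.sum()} / {mask.size}")
```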

In 2023, Guohao co-founded Infinigence AI with the mission of commercializing these sparse computing acceleration technologies to meet the growing demands for large-scale computing in real-world applications. Building on his foundational research in software–hardware co-design, he has extended his vision toward scalable, heterogeneous industrial systems to increase the overall available computational power in the AI era. He has since introduced a portfolio of intelligent solutions for both edge and cloud environments. On the edge side, this includes the Megrez-3B-Omni multimodal understanding model, the SpecEE dynamic sparse engine, and the first custom inference LPU IPs, FlightLLM for large language models and FlightVGM for video generation models. On the cloud side, the offerings include the FlashDecoding++ inference engine, the Semi-PD semi-decoupled inference scheduling system, and the FlashOverlap communication acceleration scheme for inference systems. By enabling efficient and coordinated deployment of large-model algorithms across diverse hardware in both edge and cloud environments, Dr. Dai has provided critical technological foundations for democratizing computing power and ensuring the sustainable development in the AGI era.