Dr. Zhang has long aimed to make cutting-edge artificial intelligence accessible to everyone. To this end, he has made outstanding contributions in key areas such as AI efficiency and open-source development, breaking down technological barriers and promoting the democratization of AI.
To address the enormous computational and storage demands of deep learning models, he proposed a parameterized hypercomplex multiplication (PHM) framework. It overcomes the dimensional limitations of traditional hypercomplex operations (e.g., quaternions, which are fixed at four dimensions), allowing a layer's parameters to be compressed to roughly 1/n of a standard fully connected layer for an arbitrary choice of n, without sacrificing model expressiveness, thus significantly boosting the efficiency of AI models.
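The core idea can be sketched in a few lines: a dense weight matrix is replaced by a sum of n Kronecker products between small "rule" matrices and low-rank factors, so the parameter count falls toward d·k/n. The sizes below are hypothetical, chosen only for illustration; this is a minimal sketch of the construction, not the paper's implementation.

```python
import numpy as np

def phm_weight(A, S):
    """Build a PHM-style weight as a sum of Kronecker products.

    A: (n, n, n) array -- n small "rule" matrices, analogous to the
       multiplication rules of a hypercomplex algebra.
    S: (n, d//n, k//n) array -- n factor matrices holding the bulk
       of the parameters.
    Returns a dense (d, k) weight assembled on the fly from only
    n**3 + d*k/n learnable values instead of d*k.
    """
    n = A.shape[0]
    return sum(np.kron(A[i], S[i]) for i in range(n))

# Hypothetical sizes for illustration (not from the paper).
n, d, k = 4, 8, 12
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n, n))
S = rng.standard_normal((n, d // n, k // n))

W = phm_weight(A, S)           # dense (8, 12) weight
params_phm = A.size + S.size   # n**3 + d*k/n = 64 + 24 = 88
params_dense = d * k           # 96
```

For realistic layer sizes (d, k in the thousands, n small) the d·k/n term dominates, so the savings approach the full factor of n.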
In multimodal reasoning, to overcome the limitation that Chain-of-Thought (CoT) research had focused mainly on the language modality, Dr. Zhang proposed Multimodal-CoT in 2023. This method innovatively integrates the language (text) and vision (image) modalities within a two-stage framework that separates rationale generation from answer inference. Answer inference can then draw on higher-quality rationales grounded in multimodal information, effectively reducing hallucinations and accelerating model convergence. On benchmarks such as ScienceQA, models with fewer than 1 billion parameters achieved leading performance at the time, offering important insights for complex multimodal reasoning tasks.
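The two-stage separation can be sketched as a simple pipeline. The two model functions below are illustrative stubs standing in for fine-tuned multimodal models (in the actual work, encoder-decoder language models fused with vision features); they are assumptions for the sketch, not the authors' implementation.

```python
def rationale_model(question, context, image_features):
    # Stage 1: generate a rationale conditioned on BOTH text and
    # vision inputs (stub for a fine-tuned multimodal model).
    return f"rationale for: {question}"

def answer_model(question, context, rationale, image_features):
    # Stage 2: infer the answer conditioned on the question AND the
    # stage-1 rationale (stub for a second fine-tuned model).
    return f"answer given ({rationale})"

def multimodal_cot(question, context, image_features):
    # Key design choice: the rationale is produced first, as its own
    # supervised task, then fed to answer inference -- rather than
    # asking one model to emit rationale and answer in a single pass.
    rationale = rationale_model(question, context, image_features)
    return answer_model(question, context, rationale, image_features)

result = multimodal_cot("Which force moves the sled?", "context text", None)
```

The benefit claimed in the text follows from this structure: because stage 2 consumes a rationale grounded in the image, it is less likely to hallucinate from text alone.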
Furthermore, as a core contributor to Llama 3 and Llama 4, Meta's open-source large language models, he played a key role in advancing large-scale AI accessibility. He not only participated in the models' pre-training and post-training but also led the research and development of ultra-long-context (10M-token) capabilities, and made significant contributions to data preprocessing, multimodal capability integration, and inference optimization. Upon release, the Llama models quickly became a focal point for the global developer community; their openness and strong performance provide a credible alternative to proprietary frontier models.