The trend towards hyperscale AI started a few years ago, and many companies and institutions have been increasing their investment in GPU cluster systems and clouds. Gangwon Jo co-founded Moreh in 2020 to make it easier to build and utilize AI infrastructure at scale. He believes that many infrastructure-level challenges stem from the limitations of legacy AI software stacks, specifically deep learning frameworks and parallel computing platforms. Acting on this insight, he leads the development of the MoAI platform, a set of fully integrated software components ranging from deep learning primitive libraries to application-level APIs. The platform bridges AI applications and underlying accelerators in a more efficient, scalable, and flexible way.
The MoAI platform provides 100% PyTorch/TensorFlow compatibility and ensures the same user experience as traditional AI software stacks. Numerous existing AI applications can run on the platform without any modification. However, the internal structure of the platform is completely different from that of existing deep learning frameworks. The platform adopts new techniques, including runtime IR construction and just-in-time compilation. It first records the behavior of an application as a computational graph. The just-in-time compiler then finds the optimal way of executing the application based on this bird's-eye view of what the user wants to do.
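The record-then-compile idea can be illustrated with a toy sketch. This is not the MoAI implementation; every name here (Node, Lazy, trace, compile_graph) is hypothetical. It only shows the general pattern: operations are recorded into a graph instead of running immediately, and a later compile step, which can see the whole graph, prunes work that never reaches the output.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """One recorded operation in the toy IR (hypothetical structure)."""
    op: str                          # "input", "const", "add", or "mul"
    inputs: Tuple[int, ...] = ()     # indices of producer nodes
    value: Optional[float] = None    # payload for "const" nodes
    name: Optional[str] = None       # payload for "input" nodes


class Graph:
    def __init__(self):
        self.nodes = []

    def add(self, node):
        self.nodes.append(node)
        return len(self.nodes) - 1


class Lazy:
    """Tensor stand-in that records operations instead of executing them."""

    def __init__(self, graph, idx):
        self.graph, self.idx = graph, idx

    def _record(self, op, other):
        if not isinstance(other, Lazy):
            const = Node("const", value=float(other))
            other = Lazy(self.graph, self.graph.add(const))
        node = Node(op, (self.idx, other.idx))
        return Lazy(self.graph, self.graph.add(node))

    def __add__(self, other):
        return self._record("add", other)

    def __mul__(self, other):
        return self._record("mul", other)


def trace(fn, *names):
    """Run fn once on Lazy placeholders to build the graph at runtime."""
    graph = Graph()
    args = [Lazy(graph, graph.add(Node("input", name=n))) for n in names]
    return graph, fn(*args).idx


def compile_graph(graph, out_idx):
    """Toy 'JIT' step: keep only nodes the output depends on."""
    live, stack = set(), [out_idx]
    while stack:
        i = stack.pop()
        if i not in live:
            live.add(i)
            stack.extend(graph.nodes[i].inputs)
    # Nodes are appended after their inputs, so index order is topological.
    order = sorted(live)

    def run(**inputs):
        vals = {}
        for i in order:
            n = graph.nodes[i]
            if n.op == "input":
                vals[i] = inputs[n.name]
            elif n.op == "const":
                vals[i] = n.value
            elif n.op == "add":
                vals[i] = vals[n.inputs[0]] + vals[n.inputs[1]]
            elif n.op == "mul":
                vals[i] = vals[n.inputs[0]] * vals[n.inputs[1]]
        return vals[out_idx]

    run.num_nodes = len(order)
    return run


def model(x, w):
    unused = x + 1        # recorded, but never reaches the output
    return x * w + 2


graph, out = trace(model, "x", "w")
run = compile_graph(graph, out)
print(len(graph.nodes), run.num_nodes, run(x=3.0, w=4.0))
```

A real compiler would go much further (operator fusion, memory planning, device placement), but the sketch shows why seeing the whole graph before executing anything creates room for such decisions.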
The platform features accelerator portability, single device abstraction, and application-level virtualization. First, it can run AI applications on non-NVIDIA GPUs and other types of accelerators such as NPUs, so AI infrastructure can be built on the most cost-effective hardware without concern for software compatibility. Second, the platform encapsulates a large cluster system as a single device. Users can easily implement AI applications, especially those that deal with large AI models such as GPT-3, without considering parallelization across multiple accelerators and nodes. Lastly, the platform can provide users with virtual devices instead of directly exposing physical accelerators. The mapping between virtual and physical devices is managed solely by the platform. This enables more flexible AI cloud services and drastically improves the average utilization of accelerators.
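The virtual-to-physical mapping can be pictured with a small sketch. It is an illustration under assumed names (VirtualDevicePool, acquire, release), not the platform's actual API or scheduling policy. The point is only that a virtual device holds a physical accelerator while it has work to run, so more virtual devices can exist than physical ones.

```python
class VirtualDevicePool:
    """Toy mapping layer between virtual devices and physical accelerators."""

    def __init__(self, physical_ids):
        self.free = list(physical_ids)   # idle physical accelerators
        self.bound = {}                  # virtual name -> physical id

    def acquire(self, virtual_name):
        """Bind a physical device when the virtual device starts work."""
        if virtual_name in self.bound:
            return self.bound[virtual_name]
        if not self.free:
            raise RuntimeError("no physical accelerator is idle")
        self.bound[virtual_name] = self.free.pop(0)
        return self.bound[virtual_name]

    def release(self, virtual_name):
        """Unbind when the virtual device goes idle."""
        self.free.append(self.bound.pop(virtual_name))


# Two physical accelerators serve three users' virtual devices.
pool = VirtualDevicePool(["gpu0", "gpu1"])
pool.acquire("user-a")
pool.acquire("user-b")
pool.release("user-a")           # user-a goes idle
print(pool.acquire("user-c"))    # user-c reuses the freed accelerator
print(pool.bound)
```

Because users only ever see the virtual names, the pool is free to change which physical accelerator backs each one, which is what lets idle hardware be handed to whoever has work.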