Research and development (R&D) costs for academic institutions are rising at an unprecedented rate. In 2022 alone, $97.8 billion was spent on R&D – an increase of $8 billion from 2021. A significant factor behind these growing expenses has been the increasing cost of essential hardware, such as large-scale data storage and high-performance computing (HPC) systems. As a result, research budgets are often strained, pushing institutions to seek more cost-effective technological solutions.
This substantial financial demand is one of the reasons why markets like cloud infrastructure are predominantly controlled by a few large corporations. In response to this challenge, a consortium of private and public institutions across Massachusetts, called the Mass Open Cloud (MOC) Alliance, was established in 2013. Backed by seed funding and support from the Massachusetts Technology Collaborative, this nonprofit initiative is dedicated to creating cloud computing infrastructure that supports a broad industry and research community. It brings together universities, government, and businesses with the aim of creating an open computing marketplace.
This platform allows both academic and industry users to access cutting-edge, low-cost cloud resources with predictable billing models.
Jon Stumpf, Strategic Engagement Coordinator at the MOC Alliance, elaborates: “The beauty of the MOC’s model lies in the fact that we don’t charge data egress fees. This can lower academic researchers’ costs by up to half, making us an especially attractive option.”
Since its inception, the MOC has served thousands of researchers and open-source developers, emerging as a laboratory for cloud research and innovation. It has facilitated valuable contributions to academic research and the development of open-source software.
With the aim of fostering an active partnership between higher education, medical research centers, government, and industry, this alliance provides the structure and resources needed to facilitate close collaboration between research, development, and operations across a range of interconnected projects.
Powering AI Research
In recent years, Artificial Intelligence (AI) has become a key driver of innovation in research and development across academic institutions. AI is valued for its ability to enhance other technologies like HPC and simulation tools, enabling researchers to conduct more advanced analysis and modeling.
The MOC Alliance initially provided a Central Processing Unit (CPU)-only cluster for general research purposes. As plans for more intensive AI research took shape, the demand for greater computational capability, particularly Graphics Processing Units (GPUs), became impossible to ignore.
As GPUs have become invaluable for accelerating workloads and AI exploration, they have also become significantly more expensive and more supply-constrained than traditional CPUs. To meet the growing demand for GPU resources while keeping costs manageable, the MOC Alliance needed to find a solution that would allow it to scale its infrastructure without compromising its budget. Maintaining a balance between cost-efficiency and the computational power required for AI research became a central challenge for the alliance.
“While we had a well-established set of compute resources, we lacked GPU nodes. We got around this by running workloads that would traditionally be done on GPUs on CPUs, but, obviously, this wasn’t ideal. As GPU computing started taking off across more and more domains, we needed to respond to the demand with dedicated GPU resources,” says Nancy Clinton, Managing Director, Mass Open Cloud Alliance.
The MOC Alliance explored various options to provide its research clients with access to GPUs and found Lenovo to be an ideal partner. Lenovo TruScale GPU as a Service (GPUaaS) provides scalable GPU resources without upfront costs, allowing the MOC Alliance to drive innovation and address a crucial need for cutting-edge computational power in a cost-effective way.
Energy-Efficient Approach
With the growing focus on advancing AI while maintaining net-zero goals, industries are looking for solutions that can mitigate the additional power requirements. In its search for innovative, energy-efficient options, the MOC Alliance found the Lenovo TruScale GPUaaS offering particularly compelling because of its lower power consumption and scalability.
The MOC Alliance found that Lenovo’s Neptune™ liquid cooling technology addresses a critical challenge in high-performance computing: managing the heat generated by increasingly powerful hardware. Traditional air-based cooling systems often struggle to keep up with the thermal demands of modern data centers, leading to inefficiencies in both performance and energy consumption. Neptune™ offers a more effective solution, using liquid cooling to enhance thermal management and reduce power consumption by up to 40%.
Unlike conventional systems that rely solely on air conditioning and fans for air circulation, Neptune™ directs liquid coolant (typically de-ionized water) to the components generating the most heat, such as CPUs and GPUs. This method provides a 3.5x improvement in thermal efficiency compared to traditional air-cooled systems, allowing data centers to maintain energy-efficient operating temperatures even under heavy computational loads.
Beyond energy savings, Neptune™ enables organizations to maximize their computing performance without the risk of overheating. This is particularly important for research institutions and enterprises, such as the MOC Alliance, that run resource-intensive tasks like large-scale simulations, AI training, or data analysis.
By implementing Lenovo TruScale GPUaaS, the MOC Alliance was able to optimize its AI workloads while working towards its sustainability goals. Lenovo Professional Services has also delivered data-driven insights into key challenges related to power, cooling, and energy efficiency. This ensured the smooth implementation, deployment, and maintenance of the advanced liquid cooling technologies that are critical for the success of academic research. For example, Lenovo Power and Cooling Services helped optimize demanding AI workloads, boosting computational power for generative AI while also aligning with sustainability objectives.
Jon Stumpf, Strategic Engagement Coordinator, Mass Open Cloud Alliance, said: “We’re very impressed with Lenovo’s Neptune water-cooling technology. The density that it permits is astounding. We can fit 192 GPUs across just two server racks, which really helps us from a floor space perspective. Lenovo Power and Cooling Services helps us to maintain the data center technology, ensuring optimal performance and allowing us to focus on more strategic initiatives.”
Future of Academic Research
By creating a flexible and scalable infrastructure, the collaboration between Lenovo and the MOC Alliance has made high-performance GPU resources more accessible to universities across the Boston area, including Boston University, MIT, Northeastern University, Harvard University, and the University of Massachusetts, with further national and international partnerships on the horizon.
Lenovo TruScale GPUaaS enables quick scale-ups and resource allocations for new initiatives and is already being used to support AI-driven research, such as drug discovery efforts, at Boston University. The system’s flexibility also fosters collaboration among universities, allowing students and faculty to share resources and knowledge, accelerating breakthroughs in scientific computing, data analytics, and AI.
Linda Yao, VP AI Solutions and Services, Lenovo, commented: “The collaboration between Lenovo and the MOC Alliance shows how Lenovo TruScale GPU-as-a-Service lets a range of institutions get access to powerful tools for research and modelling without being burdened by high up-front investment costs. By breaking down these barriers to access, we ensure that researchers and innovators worldwide are not priced out of being able to explore how AI can improve our world.”
The MOC Alliance’s decision to team with Lenovo helped to address the challenges traditionally associated with high-end computing resources, demonstrating how universities and research institutions can leverage advanced technology to foster innovation, enhance collaboration, and support the development of solutions to global challenges.
By adopting a scalable and flexible GPU infrastructure through Lenovo TruScale GPUaaS, the MOC Alliance has positioned itself as a leader and innovator. This project not only empowers academic research but also sets a new standard for resource sharing and sustainability in AI-driven scientific research. It has supported work on the latest AI advancements, including large language models and AI research platforms like Red Hat’s OpenShift.ai and InstructLab. The partnership is a model of how industry and academia can collaborate to drive progress in AI.
Visit https://techtoday.lenovo.com/us/en/truscale to learn more about Lenovo TruScale.
LENOVO, NEPTUNE and TRUSCALE are trademarks of Lenovo. All other trademarks are the property of their respective owners. ©2024 Lenovo Group Limited.