Rapidly deploy AI inferencing from pocket to cloud with Lenovo and Intel

Lenovo offers proven, powerful, scalable, and energy-efficient infrastructure solutions, powered by 5th Gen Intel® Xeon® processors, to help organizations of all sizes accelerate their AI journey and improve business outcomes. Organizations today have a growing need for a hybrid mix of personal, private, and public AI for both training and inferencing deployments.

Lenovo and Intel have teamed up to deliver purpose-built solutions designed specifically for AI inferencing applications. With the reduced processing requirements and lower barriers to entry, AI inferencing is opening the doors for organizations and businesses of all sizes to harness the power of AI for a wide range of applications, from pocket to cloud.

AI inferencing involves taking existing, pre-trained models and applying them to new proprietary data sets for application-specific tasks. The outcomes and insights are then adapted for new applications that are tailored to deliver more precise and relevant experiences. Inferencing builds off the learning already accomplished, so the processing demands of generating predictions and insights are significantly lower than what was required during the initial training.

Get started by extending your existing infrastructure

Generative AI has the potential to transform and reinvent virtually every aspect of business from customer experience to business operations to employee engagement, yet it can often be a daunting task to determine just how to get started. Companies interested in starting the journey on Generative AI can extend their existing infrastructure.

Research conducted by Intel and Lenovo showed that the Lenovo ThinkSystem SR650 V3, accelerated by 4th or 5th Gen Intel® Xeon® processors, can help organizations achieve revolutionary business impacts without having to invest in dedicated (and often costly) GPU accelerators. One study showed how a single-node SR650 V3 with 4th Gen Intel Xeon processors delivered a highly performant, scalable solution for Generative AI.

Another study showed that a cluster of Lenovo ThinkSystem SR650 V3s, with 5th Gen Intel Xeon processors, delivers a highly performant, scalable solution for Generative AI. Test results demonstrated response times that were perceived as instantaneous, providing the necessary performance to support a variety of use cases, including real-time chatbots. Deploying Generative AI use-cases on a cluster with Red Hat® OpenShift® container platform allows for ease of deployment, usability, and scalability with containers and services managed by Kubernetes. Red Hat OpenShift supports hardware acceleration for inference use cases, a broad ecosystem of AI/ML and application development tools, and integrated security and operations management capabilities.

A variety of batch sizes were used to simulate concurrent users, and token lengths of 1024 and 2048 were chosen to represent a typical enterprise chatbot scenario. A latency of 100 ms or less is perceived as instantaneous for most conversational AI and text summarization applications. Test results demonstrated that this solution could meet that target and provide the necessary performance to support a variety of use cases, including real-time chatbots.
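As a rough illustration of how such a latency target could be checked, the sketch below compares average measured latency against the 100 ms threshold for each combination of batch size and token length. The function names and the sample measurements are hypothetical and are not taken from the Lenovo and Intel studies; how latency is actually measured (for example, per token or per response) is also an assumption left to the reader.

```python
from statistics import mean

LATENCY_TARGET_MS = 100.0  # threshold cited in the text


def meets_target(latencies_ms, target_ms=LATENCY_TARGET_MS):
    """Return True if the average measured latency is at or below the target."""
    return mean(latencies_ms) <= target_ms


def summarize(results, target_ms=LATENCY_TARGET_MS):
    """Map each (batch_size, token_length) pair to a pass/fail flag."""
    return {key: meets_target(samples, target_ms) for key, samples in results.items()}


# Hypothetical measurements in milliseconds, NOT results from the study.
example_results = {
    (1, 1024): [42.0, 45.5, 44.1],
    (8, 2048): [97.0, 103.0, 110.0],
}

report = summarize(example_results)
for (batch, tokens), ok in report.items():
    print(f"batch={batch} tokens={tokens} within_target={ok}")
```

A check like this makes it easy to see at which level of simulated concurrency a given configuration stops feeling instantaneous.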

Streamline deployment with Lenovo AI Innovators

IT leaders looking to implement AI solutions in their organization can depend on the Lenovo AI Innovators Program to provide state-of-the-art enterprise AI solutions for multiple industries, enabling faster, safer, and more efficient deployments. Our infrastructure solutions are optimized for the complexities and challenges of delivering AI for all, from edge to cloud.

Because organizations don’t need to develop the foundational models themselves, development accelerates dramatically and they can move toward real applications more quickly. Additionally, because inferencing works from proprietary data and is less demanding on processing performance, insights don’t have to be generated in the data center. That means inferencing can happen where the data is collected, including at the edge.

AI in retail: Understand customer behavior to drive business value

In partnership with Intel and Sensormatic Solutions, Lenovo provides retailers with computer vision technologies that enable frictionless and personalized experiences for shoppers. With Lenovo ThinkEdge servers powered by Intel Xeon processors, Sensormatic can deploy its computer vision applications in retail stores for rapid data processing and real-time analytics at the edge. These compact, ruggedized servers ensure fast and reliable performance for compute-intensive workloads. As the system collects data, the resulting insights can help retailers understand shoppers’ in-store journeys and optimize merchandise placement. Further, the system can look for suspicious behaviors, such as shoplifting, entering unauthorized areas, and even loitering in groups. Equipped with this information, retailers can reduce risk, optimize labor, and enhance the shopper experience, helping to drive sales.

AI in manufacturing: Optimize performance at the edge

Today’s manufacturers are under constant pressure to increase efficiency and productivity while improving product quality and worker safety. Lenovo, Intel, and byteLAKE developed an innovative AI-assisted Visual Inspections Solution to deliver smarter what-if scenarios, with prediction accuracy of up to 93%. By combining Lenovo ThinkEdge servers with the Intel® Distribution of OpenVINO™ toolkit and byteLAKE technologies, manufacturers can improve quality, enhance process monitoring, and provide predictive maintenance to transform their business.

AI in food service: Increase efficiency at quick-service restaurants

Lenovo and Intel partnered with Sunlight.io to deliver a solution that consolidates restaurant infrastructure to run existing and new applications in VMs or containers on a highly available and fault-tolerant platform. The solution reduces the hardware complexity in each restaurant and delivers centralized management and deployment infrastructure to all locations, ultimately reducing costs and increasing innovation speed. This innovative Edge AI solution uses Lenovo ThinkEdge and ThinkSystem servers and Intel CPUs combined with the Sunlight.io Edge Hyper-Converged Infrastructure to provide a platform that accelerates the digital transformation of the restaurant. Further, it enables simple deployment of more applications, such as multichannel ordering and video analytics. AI-based applications can massively speed up ordering and fulfillment processes inside restaurants and at drive-through and curbside pick-up points.

Looking ahead

Discover how Lenovo and Intel can help put the power of AI inferencing to work for your business today. Join us at Intel Vision in Phoenix, AZ on April 8-9, 2024, or visit us online to learn more.
