As AI solutions transform organizations’ day-to-day operations, demand for the resources AI requires grows with them. It is no secret that deep learning and machine learning often require considerable memory and compute power to run successfully. Graphics processing units (GPUs) and batch processing have become two common buzzwords in AI, reflecting a widespread need to run AI efficiently in production. Modzy tackles this not-so-simple problem with ease, providing a seamless process for data scientists and developers.
Understanding CPUs vs. GPUs in Deep Learning
To fully understand batch processing, we must first understand the difference between CPUs and GPUs. A central processing unit (CPU) is a processor designed to handle a wide variety of tasks very quickly. However, a CPU is limited in the number of concurrent tasks it can handle. CPUs are commonly referred to as the “smart” one of the two.
A GPU is designed to quickly process large quantities of data or high-resolution imagery and video concurrently. GPUs are not as “smart” as CPUs, and as a result are less versatile when executing a wide range of tasks. However, GPUs are what enable batch processing to work. To put this into perspective, a server with a few CPUs may contain up to 50 very fast, smart processing cores. Adding several GPUs to the same server can increase the number of cores to tens of thousands. This sheer quantity of processors gives GPUs an unmatched ability to process large datasets and handle extensive scientific computations in parallel. Large core counts and parallel execution are the foundation of batch processing.
Making the Case for Batch Processing
Ask any data scientist and they will explain how pivotal batch processing is for both deep learning training and inference. Training times drop dramatically, and production inference applications run more efficiently. Both scenarios can benefit from batch processing, particularly when large quantities of data are involved. Two example scenarios illustrate the point.
- A financial institution monitors retail parking lots to gauge consumer activity, track economic trends, and gain a better understanding of geographic tendencies. In this case, the organization receives hours of drone footage per day and runs analysis once a week. Having a batch processing capability for several computer vision models is paramount to successful and continuous analysis.
- A retail company performs extensive social media analysis following a new product release. In doing so, the organization collects millions of data points from several sources per hour. Leveraging several Natural Language Processing (NLP) models, this analysis helps forecast short-term sales revenues and provides real-time feedback. At this volume, these computations would not be practical without access to a batch processing capability.
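To put rough numbers on the efficiency argument, the sketch below counts how many model invocations a workload needs at different batch sizes. The workload of 1,000,000 inputs and the function name `invocations_needed` are illustrative assumptions, not figures taken from either scenario. Fewer invocations do not guarantee a proportional speedup, since that depends on the model and the hardware.

```python
def invocations_needed(num_items: int, batch_size: int) -> int:
    """Return how many model calls are needed to cover num_items inputs."""
    if num_items < 0:
        raise ValueError("num_items must not be negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Integer ceiling division: a final, partially filled batch still costs one call.
    return (num_items + batch_size - 1) // batch_size


NUM_ITEMS = 1_000_000  # hypothetical workload size, chosen for illustration

for batch_size in (1, 8, 16, 32, 64):
    calls = invocations_needed(NUM_ITEMS, batch_size)
    print(f"batch size {batch_size:>2}: {calls:,} model invocations")
```

With a batch size of 64, the same workload needs 15,625 invocations instead of 1,000,000.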
Enter Modzy’s Solution
Implementing Modzy’s batch processing solution on your model is simple:
- Include model code in your container that supports batching (e.g., built-in code in popular frameworks such as PyTorch or TensorFlow that processes images in batches and runs inference)
- Set a maximum batch size (e.g., 8, 16, 32, or 64) that your model can handle based on available resources
- Watch your model’s performance improve as you submit large quantities of data
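The first two steps can be sketched in plain Python. This is a minimal illustration and not Modzy's API: `make_batches`, `run_batched`, `MAX_BATCH_SIZE`, and `toy_predict_batch` are hypothetical names, and the toy function stands in for a real model's batched inference call.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")

MAX_BATCH_SIZE = 32  # assumed limit; choose it based on available memory


def make_batches(items: Sequence[T], max_batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of items, each holding at most max_batch_size entries."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield items[start:start + max_batch_size]


def run_batched(
    items: Sequence[T],
    predict_batch: Callable[[Sequence[T]], List[R]],
    max_batch_size: int = MAX_BATCH_SIZE,
) -> List[R]:
    """Run predict_batch once per batch and return the results in input order."""
    results: List[R] = []
    for batch in make_batches(items, max_batch_size):
        outputs = predict_batch(batch)
        if len(outputs) != len(batch):
            raise ValueError("predict_batch must return one output per input")
        results.extend(outputs)
    return results


def toy_predict_batch(batch: Sequence[str]) -> List[int]:
    """Hypothetical stand-in for a model: returns the length of each text."""
    return [len(text) for text in batch]


texts = ["great product", "too expensive", "love it", "meh", "would buy again"]
print(run_batched(texts, toy_predict_batch, max_batch_size=2))
```

In a real deployment, `toy_predict_batch` would be replaced by a framework call that stacks the batch into a single tensor and runs one forward pass, so that the GPU processes every item in the batch in parallel.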
Implementing batch processing in a production setting causes headaches for even the best, most highly trained data scientists. When developing Modzy, we set out to build a product that would reduce those headaches and allow teams to focus on the core of their work. The easy-to-use, intuitive capability built into Modzy integrates quickly into the model deployment process. Batch processing unlocks the ability to process data efficiently and cost-effectively and is crucial in deploying successful deep learning applications.
