High-performance computing with Machine Learning at Intel

One of the most important and exciting computing topics at the moment is high-performance computing (HPC). This is no surprise given the current data deluge, which puts ever greater pressure on IT infrastructure to cope with unprecedented volumes of data. We talk about data all the time nowadays, but the word is something of a bandwagon term that hypes what is, in the end, essentially information: structured or unstructured, with various origins and purposes, it is mostly information that is accessible to someone or to everyone.

It might make sense to distinguish data from information precisely in terms of accessibility. Information is the more ubiquitous concept: it encompasses everything about all kinds of systems interacting with each other, providing a scaffolding of reality as we perceive it. Data is already somewhat more structured; it really means information that is easy to access and process.

Never mind the possible controversy in the paragraphs above; let’s move on to today’s post. Today I share a link and a video from insideHPC, a publication on high-performance computing to which I subscribe and which I read daily; it is a good resource on this important field of computing and IT. In the link we learn what the microprocessor company Intel is doing to implement machine learning, and in particular deep learning networks, on its high-performance platforms. The video features Ananth Sankaranarayanan speaking at the Intel HPC Developer Conference, which took place recently in Salt Lake City, USA:

Accelerating Machine Learning on Intel Platforms


“Availability of big data coupled with fast evolution of better hardware, smarter algorithms and optimized software frameworks are enabling organizations create unique opportunities for machine learning and deep learning analytics for competitive advantage, impactful insights, and business value. Caffe is one of most popular open source frameworks developed by Berkeley Vision Learning Center (BVLC) for deep learning application for image recognition, natural language processing (NLP), automatic speech recognition (ASR), video classification and other domains in artificial intelligence. Intel has extensively contributed to an optimized fork of Caffe for Intel Xeon, Xeon Phi, and Xeon+FPGA CPUs. Convolutional Neural Networks (CNNs) are extensively used in image recognition for deep learning training and building an accurate model, which then can be used for scoring in applications such as Advanced Driver Assistance System (ADAS) in the automotive industry for driverless vehicles, in medicine, finance, etc. Deep learning training is highly compute intensive and can take a very long time from multiple weeks to days on large datasets. For meaningful impact and business value, organizations require that the time to train a deep learning model be reduced from weeks to hours. In this talk, we will present the details of the optimization and characterization of Intel-Caffe and the support of new deep learning convolutional neural network primitives in the Intel Math Kernel Library. We will present performance data for deep learning training for image recognition achieving >24X speedup performance with a single Xeon Phi 7250 compared to BVLC Caffe. In addition, we will also present performance data that shows training time is further reduced by 40X speedup with a 128-node Xeon Phi cluster over Omni-Path Architecture. Furthermore, we will also present data that shows >17X speedup for image scoring with 2P Xeon E5-2699 v4. 
These performance results were critical components of KNL, generating very strong interest in Xeon / Xeon Phi for ML/DL using Intel-Caffe and displacing Nvidia as the only performant solution.”
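To put the quoted figures in perspective, here is a small back-of-the-envelope sketch in Python. It only uses the numbers stated in the abstract (24X on a single Xeon Phi 7250, 40X on a 128-node cluster). Everything else is my assumption: that the 40X is measured relative to a single node, that the two speedups simply multiply, and the two-week baseline, which is a hypothetical figure chosen for illustration and not one from the talk. The function names are mine as well.

```python
# Back-of-the-envelope arithmetic on the speedups quoted in the abstract.
# Assumptions (not from the talk): the 40X cluster speedup is relative to
# one node, speedups compose multiplicatively, and the baseline training
# time below is a made-up example value.

SINGLE_NODE_SPEEDUP = 24.0   # ">24X" vs. BVLC Caffe, taken as exactly 24
CLUSTER_SPEEDUP = 40.0       # "40X" on the 128-node cluster
CLUSTER_NODES = 128


def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Fraction of ideal linear scaling: speedup divided by node count."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return speedup / nodes


def reduced_time_hours(baseline_hours: float, speedup: float) -> float:
    """Training time after applying a speedup factor to a baseline."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_hours / speedup


# Hypothetical baseline: two weeks of training on unoptimized Caffe.
HYPOTHETICAL_BASELINE_HOURS = 14 * 24

efficiency = parallel_efficiency(CLUSTER_SPEEDUP, CLUSTER_NODES)
single_node_hours = reduced_time_hours(HYPOTHETICAL_BASELINE_HOURS,
                                       SINGLE_NODE_SPEEDUP)
cluster_hours = reduced_time_hours(single_node_hours, CLUSTER_SPEEDUP)

if __name__ == "__main__":
    print(f"Scaling efficiency on {CLUSTER_NODES} nodes: {efficiency:.1%}")
    print(f"Single optimized node: {single_node_hours:.1f} hours")
    print(f"128-node cluster: {cluster_hours * 60:.0f} minutes")
```

Under those assumptions, a 40X speedup on 128 nodes corresponds to roughly 31% of ideal linear scaling (40 / 128), which is a reminder that adding nodes to deep learning training does not come for free. It also shows how a "weeks to hours" claim can follow from the single-node optimization alone, before any cluster is involved.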












To what is shared above I would add a couple of points. The first is the obvious reason why HPC and big data applications are a close match: the compute-intensive nature of deep learning frameworks such as Caffe or TensorFlow. The second concerns the belief that these frameworks, being open source, could provide competitive advantage and business value across different domains of expertise and applications. The evidence for that is still wide open for us all to see. But I am also an optimist.

Body text image: TensorFlow Disappoints – Google Deep Learning falls shallow

Featured Image: IBM’s Blue Gene/P supercomputer at Argonne National Laboratory, from the Supercomputer Wikipedia page

