How to develop, train and deploy TensorFlow and Spark ML AI models from Jupyter Notebook to production


Today I would like to post a more technical, pure engineering topic. The heart of the matter in Artificial Intelligence (AI) is more practical and empirical than theoretical, even though the conceptual framework is undoubtedly important. To get a good grasp of the real work involved in setting up all the apparatus for a machine learning/deep learning or AI model or project, we need to get our hands dirty, so to speak.

The video below may offer the right frame of mind for this goal. It features a talk by research scientist Chris Fregly from PipelineIO, a machine learning and AI start-up from San Francisco, US. Chris starts by presenting the GitHub repository called fluxcapacitor/pipeline. Although the talk is from January 2017 (and less than six months can be a long time in software development, enough for work to become outdated), I think the presentation preserves its relevance. It manages to put together a plethora of software developments, such as Kubernetes orchestration tools, Docker containers, Apache Spark ML, TensorFlow and the Jupyter Notebook, all in one bundled development stack. Quite impressive…

[Embedded video of the talk]

The YouTube video description of the talk is also worth reading through. I share it here:


In this completely demo-based talk, Chris Fregly from PipelineIO will demo the latest 100% open source research in high-scale, fault-tolerant model serving using Tensorflow, Spark ML, Jupyter Notebook, Docker, Kubernetes, and NetflixOSS Microservices.


This talk will discuss the trade-offs of mutable vs. immutable model deployments, on-the-fly JVM byte-code generation, global request batching, microservice circuit breakers, and dynamic cluster scaling – all from within a Jupyter notebook.


Chris Fregly is a Research Scientist at PipelineIO – a Machine Learning and Artificial Intelligence Startup in San Francisco.


Chris is an Apache Spark Contributor, Netflix Open Source Committer, Founder of the Advanced Spark and TensorFlow Meetup, Author of the upcoming book, Advanced Spark, and Creator of the upcoming O’Reilly video series, Deploying and Scaling Distributed TensorFlow in Production.


Previously, Chris was a Distributed Systems Engineer at Netflix, Data Solutions Engineer at Databricks, and a Founding Member of the IBM Spark Technology Center in San Francisco.
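To make one of the terms above concrete: a microservice circuit breaker trips after repeated downstream failures and then fails fast for a cool-down period instead of letting requests pile up. Here is a toy Python sketch of the pattern (my own illustration, not PipelineIO's or NetflixOSS's actual implementation):

```python
# Toy sketch of the microservice circuit breaker pattern: after too many
# consecutive failures the breaker "opens" and calls fail fast until a
# cool-down period has elapsed. Not production code.
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # timestamp when the breaker tripped; None = closed

    def call(self, fn, *args, **kwargs):
        # While open, fail fast until the cool-down expires.
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call

        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()  # trip the breaker
            raise
        else:
            self.failures = 0  # any success closes the breaker again
            return result
```

In the NetflixOSS stack this role is played by Hystrix; the point for model serving is that a failing model endpoint degrades gracefully instead of cascading failures through the pipeline.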


This talk was part of a meetup held by the Advanced Spark and TensorFlow Meetup and anticipates the StartupML Conference to be held in San Francisco later in August. These are cutting-edge software developments around data engineering, big data pipelines, machine learning and Artificial Intelligence compute engines. Then there is the hardware side of all this multitude of developments, which has also recently witnessed some important milestones. Chris' talk manages to give us a picture of the relationship between these software and hardware developments within the Spark ML and TensorFlow AI model stacks.
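As a concrete taste of the notebook-to-production workflow the talk demonstrates, here is a minimal sketch (my own illustration, not code from the talk) of exporting a TensorFlow 1.x model in the SavedModel format that TensorFlow Serving loads; the model, tensor names and paths are all assumptions:

```python
# Minimal sketch: export a (trivial) TensorFlow 1.x model as a SavedModel,
# the on-disk format that TensorFlow Serving picks up. Names are illustrative.
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 4], name="x")
w = tf.Variable(tf.zeros([4, 1]), name="w")
b = tf.Variable(tf.zeros([1]), name="b")
y = tf.add(tf.matmul(x, w), b, name="y")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training steps would run here ...
    tf.saved_model.simple_save(
        sess,
        "./export/my_model/1",  # numeric subfolder = model version for Serving
        inputs={"x": x},
        outputs={"y": y},
    )
```

TensorFlow Serving watches the export directory and hot-loads new numbered versions, which fits the immutable, versioned deployment style discussed in the talk.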

PipelineIO: Spark ML and TensorFlow AI Platform


One other interesting aspect of the talk is how the Jupyter Notebook, a tool that originated in Python software development, is seamlessly integrated with the various pipelines, providing a flexible environment that contributes to a more productive setup overall.
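For instance, a Spark ML pipeline can be built and persisted directly from a notebook cell. A minimal sketch, with illustrative data and column names (not code from the talk):

```python
# Minimal sketch: assemble, train and persist a Spark ML pipeline, the kind
# of cell one might run in a Jupyter notebook. Data and names are illustrative.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("notebook-demo").getOrCreate()

df = spark.createDataFrame(
    [(1.0, 0.5, 0.0), (0.2, 1.3, 1.0), (0.9, 0.1, 0.0)],
    ["f1", "f2", "label"],
)

assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")

model = Pipeline(stages=[assembler, lr]).fit(df)
model.write().overwrite().save("./export/spark_lr_model")  # persist for serving
```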

My final word goes to the part of the talk where Chris gave his view about model deployments and rollbacks and the Graph Transform tool he has been working on. This seemed to be the heart of the matter in the talk. The way modern AI and machine learning/deep learning models are deployed deserves greater attention. Of special interest is the interplay between mutable and immutable deployments, with the rollback option always available. Docker containers and images are playing an increasingly crucial role in this development, so knowing as much as possible about them is a plus. On the Graph Transform tool, Chris recommends simplification procedures in order to make sense of what can easily become a deeply convoluted graph display, as well as to optimize the serving runtime environment.
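The Graph Transform tool ships with TensorFlow and can also be driven from Python. Here is a minimal sketch of the kind of graph simplification Chris describes, applied to a frozen TensorFlow 1.x graph (the file paths and tensor names are assumptions):

```python
# Sketch: simplify and optimize a frozen TensorFlow 1.x graph for serving
# with the Graph Transform tool. Paths and tensor names are illustrative.
import tensorflow as tf
from tensorflow.tools.graph_transforms import TransformGraph

# Load a frozen GraphDef from disk.
graph_def = tf.GraphDef()
with tf.gfile.GFile("frozen_model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

transforms = [
    "strip_unused_nodes",                  # drop nodes unneeded at inference
    "remove_nodes(op=Identity)",           # prune no-op Identity nodes
    "fold_constants(ignore_errors=true)",  # pre-compute constant subgraphs
    "fold_batch_norms",                    # fold batch norms into weights
]

optimized = TransformGraph(graph_def, ["input"], ["output"], transforms)

with tf.gfile.GFile("optimized_model.pb", "wb") as f:
    f.write(optimized.SerializeToString())
```

A smaller, folded graph is both easier to inspect visually and cheaper to serve, which matches the two benefits highlighted in the talk.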


featured image: Continuously Train & Deploy Spark ML and Tensorflow AI Models from Jupyter Notebook to Production
