Today’s post is about probabilistic programming, a welcome development in probability and statistics as they pertain to inference with computational models. Its applications may be wide, but the field is quite recent, so there isn’t yet much to say about real-world applications beyond scientific computing and Bayesian inference models.
But the video I present today tells the story of an application of probabilistic programming that at first might not seem a reasonable fit, yet it is: algorithmic trading and computational finance. This blog has posted about computational finance before, but in the context of blockchain technology. Today it is the turn of algorithmic trading, in the context of one of its many active, vibrant communities, the one gathered around the website Quantopian. Thomas Wiecki is one of the usual suspects in that community, and in this talk, given at PyData London 2016 in May this year, he presents his work on applying probabilistic programming to algorithmic trading on the Quantopian platform.
This is a nice presentation, but I would encourage everyone interested to dig deeper into the subject. It isn’t an easy task, and a video presentation like this one only scratches the surface of the relevant issues. One of the main points, of course, is the open source nature of these developments, which gives anyone the ability (the only barrier being the effort and passion required) to acquire the necessary skills to implement their own trading strategies and try to make money trading the financial markets. The whole process is competitive, so only the best strategies will in the end see the light of day. Anyway, this is definitely interesting stuff, and loser mindsets need not apply…
Here is the video, with some transcribed highlights:
Probabilistic programming is a new paradigm that greatly increases the number of people who can successfully build statistical models and machine learning algorithms, and makes experts radically more effective. This talk will provide an overview of PyMC3, a new probabilistic programming package for Python featuring intuitive syntax and next-generation sampling algorithms.
Machine learning is the driving force behind many recent revolutions in data science. Comprehensive libraries provide the data scientist with many turnkey algorithms that have very weak assumptions on the actual distribution of the data being modeled. While this blackbox property makes machine learning algorithms applicable to a wide range of problems, it also limits the amount of insight that can be gained by applying them.
The field of statistics, on the other hand, often approaches problems individually and hand-tailors statistical models to specific problems. To perform inference on these models, however, is often mathematically very challenging, and thus requires time spent deriving equations as well as simplifying assumptions (like the normality assumption) to make inference mathematically tractable.
Probabilistic programming is a new programming paradigm that provides the best of both worlds and revolutionizes the field of machine learning. Recent methodological advances in sampling algorithms like Markov Chain Monte Carlo (MCMC), as well as huge increases in processing power, allow for almost complete automation of the inference process. Probabilistic programming thus greatly increases the number of people who can successfully build statistical models and machine learning algorithms, and makes experts radically more effective. Data scientists can create complex generative Bayesian models tailored to the structure of the data and specific problem at hand, but without the burden of mathematical tractability or limitations due to mathematical simplifications.
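To make the idea of sampling-based inference mentioned above a little more concrete, here is a minimal sketch of my own. It is not taken from the talk and it is not PyMC3 code: it is a hand-written random-walk Metropolis sampler (one of the simplest MCMC algorithms) in plain Python, estimating the mean daily return of a hypothetical strategy. The returns are synthetic, and the model (a Normal likelihood with a known volatility and a Normal prior on the mean) as well as every name and number in it are assumptions chosen purely for illustration.

```python
import math
import random


def log_posterior(mu, data, sigma=0.01, prior_sd=0.01):
    """Unnormalized log posterior of the mean return mu.

    Illustrative assumptions: Normal(mu, sigma) likelihood with known
    sigma, and a Normal(0, prior_sd) prior on mu.
    """
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_likelihood = sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
    return log_prior + log_likelihood


def metropolis(data, n_samples=5000, step=0.002, seed=0):
    """Random-walk Metropolis sampler for mu; returns the full chain."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio), done in log space.
        # 1.0 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


# Synthetic daily returns for a made-up strategy (not real market data).
data_rng = random.Random(42)
returns = [data_rng.gauss(0.001, 0.01) for _ in range(250)]

# Drop the first 1000 draws as burn-in, then summarize the rest.
draws = metropolis(returns)[1000:]
posterior_mean = sum(draws) / len(draws)
prob_positive = sum(d > 0 for d in draws) / len(draws)

print(f"posterior mean of daily return: {posterior_mean:.5f}")
print(f"P(mean return > 0): {prob_positive:.2f}")
```

The point of the sketch is that the answer comes out as a whole distribution over the mean return rather than a single number, so a question like "how likely is it that this strategy has a positive mean return?" can be read straight off the samples. As I understand it, a probabilistic programming package takes over the sampling loop, so that only the model itself has to be written down.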
From this transcription I would underscore the notion of probabilistic programming as, in a sense, a revolutionary way to do machine learning and data science (that may be an overstatement, but this is a field where disruptive ideas are to be expected), and the belief that, with increased processing power and the combination of an automated inference process with an open community of developers, data science challenges will be readily solved, with benefits emerging in a wide array of applications. So be it!