Bayadera Bayes + Clojure + GPU

Dragan Djuric

dragandj@gmail.com

A little background

Clojure
GPU computing
Bayesian Data Analysis

A Bayesian Data Analysis Primer

How to know something we cannot observe?

Probabilities of all possible answers (prior)
Probability of the measured data (evidence and likelihood)
Calculate "backwards" using Bayes' rule to get the posterior probabilities of the answers

$\begin{equation*} \Pr(H|D) = \frac{\Pr(D|H)\times \Pr(H)}{\Pr(D)} \end{equation*}$
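
As a minimal sketch of the rule for a finite set of hypotheses: the two coins and their numbers below are made up for illustration, and `posterior` is a hypothetical helper, not a Bayadera function. Ratios are used so the arithmetic stays exact.

```clojure
(defn posterior
  "Applies Bayes' rule to a discrete set of hypotheses.
   `priors` maps each hypothesis H to Pr(H);
   `likelihoods` maps each hypothesis H to Pr(D|H)."
  [priors likelihoods]
  (let [;; numerator of Bayes' rule for every hypothesis: Pr(D|H) * Pr(H)
        joint    (into {} (for [[h p] priors] [h (* p (likelihoods h))]))
        ;; Pr(D): the numerators summed over all hypotheses
        evidence (reduce + (vals joint))]
    (into {} (for [[h j] joint] [h (/ j evidence)]))))

;; Illustrative data D: a single toss that landed heads.
(posterior {:fair 1/2 :trick 1/2}     ; Pr(H), assumed equal
           {:fair 1/2 :trick 9/10})   ; Pr(D|H), assumed values
;; => {:fair 5/14, :trick 9/14}
```

The evidence is only a normalizing constant here: it is the sum of the numerators, so the posterior probabilities add up to 1.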

Hello World: Fair or Trick coin?

Coin bias, the tendency to land heads: θ
We cannot measure θ directly
We can only know sample data (D)
θ is general; θ₁, θ₂, …, θₙ are specific values
D is general; 3 heads out of 10 is a specific value

By Bayes' rule:

$\begin{equation*} \Pr(\theta|D) = \frac{\Pr(D|\theta)\times \Pr(\theta)}{\Pr(D)} \end{equation*}$

Easy computation

Q: What is Pr(0.4<θ<0.6)?

A: Pr(θ<0.6) - Pr(θ<0.4)

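(let [a-prior 5                      ; Beta(5, 5) prior on θ
      b-prior 5
      z 1                            ; heads observed
      N 10                           ; tosses
      a-post (+ a-prior z)           ; conjugate update: Beta(a + z, b + N - z)
      b-post (+ b-prior (- N z))]
  ;; Pr(0.4 < θ < 0.6) as a difference of posterior CDF values
  (- (beta-cdf a-post b-post 0.6)
     (beta-cdf a-post b-post 0.4)))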

0.15985465263144316
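
The update of a-post and b-post in the code above works because a Beta prior is conjugate to the binomial likelihood. A short derivation, assuming a Beta(a, b) prior and z heads in N tosses (the symbols a and b stand for a-prior and b-prior):

$\begin{equation*} \Pr(\theta|D) \propto \theta^{z}(1-\theta)^{N-z} \times \theta^{a-1}(1-\theta)^{b-1} = \theta^{(a+z)-1}(1-\theta)^{(b+N-z)-1} \end{equation*}$

The right-hand side is the kernel of Beta(a + z, b + N − z), which is Beta(6, 14) for the values used here, so the answer reduces to two evaluations of a known CDF.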

HARD to compute

Usually:

$\begin{equation*} \Pr(\vec{h}|\vec{d}) = \frac{\prod_i f(\vec{d_i},\vec{h})\times g(\vec{h})}{\idotsint \prod_i f(\vec{d_i},\vec{h})\times g(\vec{h})\,d \vec{h}} \end{equation*}$

computationally:

$\begin{equation*} answer = \frac{hard\times acceptable}{impossible} \end{equation*}$
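
One way to see why the denominator earns the label "impossible" is to approximate it on a grid. The sketch below does this for the one-dimensional coin example and then counts how many density evaluations the same grid would need in more dimensions. The function names are hypothetical, and the midpoint rule is just one convenient choice of quadrature.

```clojure
(defn evidence-1d
  "Midpoint-rule approximation of the integral of `unnormalized`
   (likelihood times prior) over (0, 1), using k equal cells."
  [unnormalized k]
  (let [w (/ 1.0 k)]
    (* w (reduce + (map #(unnormalized (* w (+ % 0.5))) (range k))))))

(defn coin-unnormalized
  "θ^z (1-θ)^(N-z) times the Beta(5, 5) kernel, for z = 1 and N = 10,
   i.e. θ^5 (1-θ)^13. Normalizing constants of the prior are omitted."
  [theta]
  (* (Math/pow theta 5) (Math/pow (- 1.0 theta) 13)))

(defn grid-evaluations
  "Density evaluations needed by a grid with k points per axis in d dimensions."
  [k d]
  (reduce *' (repeat d k)))

;; One dimension is cheap: 1000 evaluations, result close to 1/162792.
(evidence-1d coin-unnormalized 1000)

;; The same resolution per axis grows exponentially with dimension.
(grid-evaluations 100 3)    ;; => 1000000
(grid-evaluations 100 10)   ;; => 100000000000000000000N
```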

Markov Chain Monte Carlo (MCMC)

a family of simulation algorithms
draws samples from unknown probability distributions
(enough) samples approximate the distribution

$\begin{equation*} \Pr(\vec{h}|\vec{d}) \propto \exp \left(\sum_i \log f(\vec{d_i},\vec{h}) + \log g(\vec{h})\right) \end{equation*}$

computationally:

$\begin{equation*} answer \propto zillions \times (hard\times acceptable) \end{equation*}$
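
To make the idea concrete, here is a minimal random-walk Metropolis sketch in plain Clojure on the CPU, applied to the coin example (Beta(5, 5) prior, z = 1, N = 10). It is an illustration of the MCMC family only, not Bayadera's implementation. All function names, the step size, the burn-in length, and the seed are assumptions made for this example.

```clojure
(import 'java.util.Random)

(defn log-unnormalized-posterior
  "log f(D|θ) + log g(θ), up to an additive constant, for z heads in N
   tosses and a Beta(a, b) prior. Returns -Infinity outside (0, 1)."
  [a b z N theta]
  (if (< 0.0 theta 1.0)
    (+ (* (+ z a -1) (Math/log theta))
       (* (+ (- N z) b -1) (Math/log (- 1.0 theta))))
    Double/NEGATIVE_INFINITY))

(defn metropolis
  "Random-walk Metropolis with a Gaussian proposal of scale `step`.
   Returns a vector of `n` samples from the density whose log, up to a
   constant, is computed by `log-p`."
  [log-p ^Random rng start step n]
  (loop [x start, lp (log-p start), acc (transient [])]
    (if (= (count acc) n)
      (persistent! acc)
      (let [x'      (+ x (* step (.nextGaussian rng)))
            lp'     (log-p x')
            ;; accept with probability min(1, p(x') / p(x)), done in log space
            accept? (< (Math/log (.nextDouble rng)) (- lp' lp))]
        (if accept?
          (recur x' lp' (conj! acc x'))
          (recur x lp (conj! acc x)))))))

(defn interval-probability
  "Fraction of `samples` that fall strictly between lo and hi."
  [samples lo hi]
  (/ (double (count (filter #(< lo % hi) samples)))
     (count samples)))

;; Estimate Pr(0.4 < θ < 0.6) from samples, discarding the first 1000 as burn-in.
(let [log-p   (partial log-unnormalized-posterior 5 5 1 10)
      samples (drop 1000 (metropolis log-p (Random. 42) 0.5 0.2 101000))]
  (interval-probability samples 0.4 0.6))
```

The estimate should land near the exact 0.1599 obtained from the Beta CDF, up to Monte Carlo error. Note that the sampler only ever evaluates the numerator of Bayes' rule; the normalizing integral cancels in the acceptance ratio.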