Performance and Safety

Philipp Kant, IOHK

philipp.kant@iohk.io

Bobkonf 2017, Berlin

Performance vs. Safety

typical dichotomy:

performance: work close to bare metal, use C
safety: work with abstractions, use Haskell

sometimes you need both

example: medical application

need timely answer
errors can cause real harm

one solution: write C in Haskell

working close to the machine: direct control over allocations, movement of data in memory, can do tricks to get extra performance

However, that is very far away from most of the problems that we want to solve. This mismatch is a common source of errors – errors that can be avoided by hiding the details of how the program is executed on the machine behind abstractions.

But sometimes, you really cannot afford to make a choice or a tradeoff here: some applications must be both performant and correct.

As an example, consider a medical application that performs intensive calculations, the result of which will be used for the treatment of patients. You will not want to keep the doctors waiting, but you definitely don't want to give them incorrect data, which could cause physical harm to the patients.

The goal of this talk is to convince you that the dichotomy between performance and safety is a false dichotomy, and that you don't have to choose between the two. One way to achieve this is to "write C in Haskell".

Writing C in Haskell

Haskell does expose "the metal":

Ptr, plusPtr, moveBytes, …
typically not used in application code

reserved for high-performance libraries, with a high-level API
as long as the libraries are correct, there will be no buffer overflow

Making Sure Libraries are Correct

static types go a long way
write small functions that fit in your head
use QuickCheck

when you need proof: LiquidHaskell

refinement types, extension of Haskell's type system
statically ensure relations between possible values
optional step at compile time – zero overhead for library users

And Haskell is particularly well-suited for this!

You have strong static types that preclude whole classes of errors. For instance, in C, there is no real difference between pointer arithmetic and number arithmetic. But in Haskell, a pointer gets its own type. The function plusPtr takes one pointer p and one number n, and advances p by n bytes – it is impossible to perform nonsensical operations such as adding two pointers.
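As a small illustration of the point about pointer types, here is a minimal sketch using only functions from base. The names byteAt and secondByte are made up for this example and do not come from the talk.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, plusPtr)
import Foreign.Storable (peek, poke)

-- | Read the byte that lies @n@ bytes past @p@.
-- (Hypothetical helper, for illustration only.)
byteAt :: Ptr Word8 -> Int -> IO Word8
byteAt p n = peek (p `plusPtr` n)

-- | Write two bytes into a scratch buffer and read back the second one.
secondByte :: IO Word8
secondByte = allocaBytes 2 $ \p -> do
    poke p (10 :: Word8)
    poke (p `plusPtr` 1) (20 :: Word8)
    byteAt p 1

-- The following is rejected by the type checker, because the second
-- argument of 'plusPtr' must be an 'Int', not a 'Ptr':
--
-- nonsense :: Ptr Word8 -> Ptr Word8 -> Ptr Word8
-- nonsense p q = p `plusPtr` q
```

Loaded into GHCi, secondByte returns 20. Note that the types only rule out the nonsensical operation; they say nothing about whether the offset stays inside the buffer, which is the gap that LiquidHaskell is meant to close.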

Also, the succinctness of Haskell makes it easy to write functions that fit on your screen and in your head, making them easier to reason about.

Tools like QuickCheck go a long way, exposing corner cases that you didn't think about via randomised testing.

But sometimes, you need a little bit more. When other people's lives or money depend on the correctness of your code, what you really want are proofs!

This is where LiquidHaskell comes into play. It's basically an extension of Haskell's type system that allows you to express relations between values at the type level.

While Haskell itself is slowly embracing dependent types as well, LiquidHaskell is non-invasive in the sense that it is an optional step at compile time, and all the modifications to your code are treated by GHC as comments, so your code and API remain unchanged. This is a huge benefit, especially for libraries, since hardening a library with LiquidHaskell imposes no burden on the users of the library.

Distributed Computation

Amdahl's Law: \(t_n = t_{0,\text{seq}} + \tfrac{t_{0,\text{par}}}{n}\)

serialisation: significant overhead in sequential part

To fill all this with a little bit of life, I'm going to show you how we used LiquidHaskell in a real-world project.

The example is the medical application that I mentioned in the introduction. This application does numerical simulations at a large scale, the results of which are to be used by physicians in the treatment of their patients.

To save time, the calculations are parallelised on a large scale. A master node orchestrates the computation, distributing it amongst a large number of slave nodes.

Typically, the run time drops like 1/n at first, so the speedup grows linearly with the number n of CPUs, but at some point the curve flattens and adding more CPUs does not give you any further speedup. The reason for this is known as Amdahl's Law: if you divide your program into a part that is parallelisable and a part that will always be run sequentially, at some point the sequential part will dominate your computation time and you cannot benefit any further from parallelisation.

In order to escape Amdahl's Law, you have to bring down the sequential part as much as possible.
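To make the flattening concrete, here is a short worked form of the formula from the slide. The speedup symbol \(S_n\) and the sample numbers are introduced here purely for illustration.

\begin{align}
S_n &= \frac{t_1}{t_n} = \frac{t_{0,\text{seq}} + t_{0,\text{par}}}{t_{0,\text{seq}} + \tfrac{t_{0,\text{par}}}{n}} \\
\lim_{n \to \infty} S_n &= 1 + \frac{t_{0,\text{par}}}{t_{0,\text{seq}}}
\end{align}

For instance, with \(t_{0,\text{seq}} = 1\) and \(t_{0,\text{par}} = 19\) (in arbitrary units), \(n = 100\) gives \(t_{100} = 1.19\) and \(S_{100} \approx 16.8\), and no number of nodes can push the speedup beyond 20. Halving the sequential part would raise that ceiling to 39.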

Every time the master sends or receives a message, it will have to perform serialisation or deserialisation, i.e., it has to turn the structured data into a sequence of bytes to send over the network, or turn a sequence of bytes back into structured data. The serialisation that happens on the master will always be a part of the sequential code, so we must make it efficient if we want to scale up.
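To show what "turning structured data into a sequence of bytes" means in the simplest case, here is a hand-rolled sketch that encodes a list of 32-bit words with a length prefix and decodes it again. It is not the API of the store library; encode, decode and decodeWord are illustrative names, and the fixed little-endian layout is an assumption made for the example.

```haskell
import Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as BS
import Data.ByteString.Builder (toLazyByteString, word32LE)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word32)

-- | Length prefix followed by the elements, all as little-endian Word32.
encode :: [Word32] -> BS.ByteString
encode xs = BL.toStrict . toLazyByteString $
    word32LE (fromIntegral (length xs)) <> foldMap word32LE xs

-- | Read one little-endian Word32 from the front, if four bytes are there.
decodeWord :: BS.ByteString -> Maybe (Word32, BS.ByteString)
decodeWord bs
    | BS.length bs < 4 = Nothing
    | otherwise =
        let (h, rest) = BS.splitAt 4 bs
            w = BS.foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0 h
        in Just (w, rest)

-- | Inverse of 'encode'; fails on truncated or over-long input.
decode :: BS.ByteString -> Maybe [Word32]
decode bs = do
    (n, rest) <- decodeWord bs
    go (fromIntegral n :: Int) rest
  where
    go 0 rest
        | BS.null rest = Just []
        | otherwise    = Nothing
    go k rest = do
        (w, rest') <- decodeWord rest
        (w :) <$> go (k - 1) rest'
```

Here decode (encode [1, 2, 3]) gives back Just [1,2,3], and the encoded form is 16 bytes long. A sketch like this copies and allocates far more than necessary, which is exactly the kind of overhead on the master that a speed-oriented library tries to avoid.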

Scaling Behaviour

Serialisation

serialisation: represent data as sequence of bytes
- to save it to files
- to send it to another computer/process
possible features
- versioning, backwards compatibility
- architecture independence
- cross-language compatibility
- incremental (de)serialisation
- easy to use
- speed

Fast Serialisation: Store

use case: distributed high performance computing
typical data: vectors of simple data types, fits in memory
design goal: speed
- no versioning, fixed architecture, no partial deserialisation and backtracking

Streaming Data

store: serialisation to/from strict ByteStrings

efficiency: one allocation per serialisation, no partial results
networking: data arrives in chunks, need streaming

add thin streaming layer on top of store

Streaming with ByteBuffer

-- | Copy the contents of a 'ByteString' to a 'ByteBuffer'.
copyByteString :: MonadIO m
    => ByteBuffer
    -> ByteString
    -> m ()

-- | Try to get a pointer to @n@ bytes from the 'ByteBuffer'.
--
-- If there are not enough bytes in the ByteBuffer, indicate
-- how many bytes are needed
consume :: MonadIO m
    => ByteBuffer
    -> Int
    -> m (Either Int ByteString)
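The contract of these two functions can be modelled without any pointers. The following is a pure sketch of that contract, not the library's implementation; Pending, feed and consumePure are hypothetical names chosen for this example.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC

-- | Pure stand-in for the buffer: bytes received but not yet consumed.
newtype Pending = Pending BS.ByteString
    deriving (Eq, Show)

-- | Append a freshly arrived chunk (the role of 'copyByteString').
feed :: Pending -> BS.ByteString -> Pending
feed (Pending buf) chunk = Pending (buf <> chunk)

-- | Take @n@ bytes if they are available, otherwise report how many
-- bytes are still missing (the role of 'consume').
consumePure :: Pending -> Int -> Either Int (BS.ByteString, Pending)
consumePure (Pending buf) n
    | missing > 0 = Left missing
    | otherwise   =
        let (out, rest) = BS.splitAt n buf
        in Right (out, Pending rest)
  where
    missing = n - BS.length buf

-- | A 5-byte message that arrives split over two network chunks.
afterFirst, afterSecond :: Pending
afterFirst  = feed (Pending BS.empty) (BC.pack "abc")
afterSecond = feed afterFirst (BC.pack "defg")
```

With these definitions, consumePure afterFirst 5 is Left 2, while consumePure afterSecond 5 yields the bytes "abcde" and leaves "fg" pending. The real ByteBuffer follows the same protocol, but keeps the bytes in a mutable, manually managed block of memory to avoid the copying that this model does on every feed.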

ByteBuffer

type ByteBuffer = IORef BBRef

data BBRef = BBRef {
      size      :: !Int
    , contained :: !Int
    , consumed  :: !Int
    , ptr       :: !(Ptr Word8)
    }

Reading from a ByteBuffer

Filling a ByteBuffer

What Can Go Wrong?

Buffer overflow

best case: segmentation fault
worst case: data corruption anywhere in the same program

Buffer Underflow

best case: segmentation fault
worst case: Heartbleed

prove correctness via LiquidHaskell

Using LiquidHaskell

Refinement Types

refine data types

type Nat = Int

{-@ type Nat = {v:Int | 0 <= v} @-}

measures

[a]

{-@
measure len :: [a] -> Int
len []     = 0
len (x:xs) = 1 + (len xs)
@-}

refined function types

head :: [a] -> a
tail :: [a] -> [a]

{-@ head :: {xs:[a] | len xs >= 1} -> a @-}
{-@ tail :: xs:[a] -> {xs':[a] | len xs' <= len xs} @-}

As I mentioned, using LiquidHaskell amounts to adding specially formatted comments to your code that refine the Haskell types in your program.

As a simple example, you can define the type Nat, as a synonym for an Int, and tell LiquidHaskell that a Nat is a value of type Int, with the restriction that this value is non-negative. Every time a value of type Nat is constructed, LiquidHaskell will try to prove that this condition is satisfied.

LiquidHaskell has the notion of measures, which allow you to define inductive properties of algebraic datatypes, such as the length of a list.

Things get interesting when you refine the types of functions. Essentially, this lets you define pre- and postconditions that are verified, at compile time, by LiquidHaskell.

For instance, if you refined the type of head – taking the first element of a list – to require the list to have a length of at least 1, LiquidHaskell would try to prove that your code never invokes head on an empty list.
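The same mechanism works for your own functions. Below is a minimal sketch in the style of the LiquidHaskell tutorial; the names Pos, safeDiv and average are chosen for this example, and it assumes that LiquidHaskell's bundled specification relates length to the len measure. For plain GHC the annotations are just comments, so the file compiles unchanged.

```haskell
-- A strictly positive Int (illustrative refinement alias).
{-@ type Pos = {v:Int | 0 < v} @-}

-- Precondition: the divisor must be strictly positive.
{-@ safeDiv :: Int -> Pos -> Int @-}
safeDiv :: Int -> Int -> Int
safeDiv x y = x `div` y

-- Precondition: the list must be non-empty, so that its length
-- is a valid divisor for 'safeDiv'.
{-@ average :: {xs:[Int] | len xs >= 1} -> Int @-}
average :: [Int] -> Int
average xs = safeDiv (sum xs) (length xs)
```

With GHC alone, average [1, 2, 3] evaluates to 2 and average [] crashes at run time with a division by zero. The intent of the annotations is that LiquidHaskell rejects a call site such as average [] at compile time, and also rejects average itself if the non-emptiness precondition is removed.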

Running `LiquidHaskell`

optional step at compile time

λ> stack exec -- liquid src/System/IO/ByteBuffer.hs
LiquidHaskell Copyright 2009-15 Regents of the University of California. All Rights Reserved.


 **** DONE:  A-Normalization ****************************************************


 **** DONE:  Extracted Core using GHC *******************************************


 **** DONE:  Transformed Core ***************************************************


 **** DONE:  Uniqify & Rename ***************************************************

 Working  72% [================================================.................]
 Done solving.

 **** DONE:  annotate ***********************************************************


 **** RESULT: SAFE **************************************************************

Refined ByteBuffer

data BBRef = BBRef {
      size      :: !Int
    , contained :: !Int
    , consumed  :: !Int
    , ptr       :: !(Ptr Word8)
    }

{-@
data BBRef = BBRef
    { size      :: Nat
    , contained :: { v: Nat | v <= size }
    , consumed  :: { v: Nat | v <= contained }
    , ptr       :: { v: Ptr Word8 | (plen v) = size }
    }
@-}
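Read as ordinary Haskell, the refinements on the three counters amount to the chain 0 <= consumed <= contained <= size. The sketch below spells this out as a run-time predicate on a simplified record without the pointer; BBRefModel and validModel are hypothetical names, and the point of LiquidHaskell is precisely that such a check never has to run.

```haskell
-- | Simplified copy of the record from the slides, without the pointer.
data BBRefModel = BBRefModel
    { mSize      :: !Int  -- ^ capacity of the buffer in bytes
    , mContained :: !Int  -- ^ number of bytes written so far
    , mConsumed  :: !Int  -- ^ number of bytes already read
    } deriving (Show)

-- | True exactly when the inequalities of the refined type hold:
-- all three fields are non-negative, consumed <= contained,
-- and contained <= size.
validModel :: BBRefModel -> Bool
validModel b =
       0 <= mConsumed b
    && mConsumed b <= mContained b
    && mContained b <= mSize b
```

For example, validModel (BBRefModel 8 6 3) is True, whereas a record with more bytes consumed than contained, such as BBRefModel 8 2 3, is rejected.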

Allocating ByteBuffers

new :: MonadIO m => Maybe Int -> m ByteBuffer
new maybel = liftIO $ do
    let l = fromMaybe (4*1024*1024) maybel
    newPtr <- Alloc.mallocBytes l
    newIORef BBRef { ptr = newPtr
                   , size = l
                   , contained = 0
                   , consumed = 0
                   }

{-@ mallocBytes :: l:Nat -> IO ({v:Ptr a | plen v == l}) @-}

**** RESULT: UNSAFE ************************************************************

/home/philipp/clones/store/src/System/IO/ByteBuffer.hs:181:15-33: Error: Liquid Type Mismatch

181 |     newPtr <- Alloc.mallocBytes l
		    ^^^^^^^^^^^^^^^^^^^
  Inferred type
    VV : {VV : Int | VV == l}

  not a subtype of Required type
    VV : {VV : Int | VV >= 0}

  In Context
    l : Int

Allocating ByteBuffers

new :: MonadIO m => Maybe Int -> m ByteBuffer
new maybel = liftIO $ do
    let l = max 0 (fromMaybe (4*1024*1024) maybel)
    newPtr <- Alloc.mallocBytes l
    newIORef BBRef { ptr = newPtr
                   , size = l
                   , contained = 0
                   , consumed = 0
                   }

**** RESULT: SAFE **************************************************************

Reading From a ByteBuffer

{-@ unsafeConsume :: MonadIO m
  => ByteBuffer
  -> n:Nat
  -> m (Either Int ({v:Ptr Word8 | plen v >= n})) @-}
unsafeConsume :: MonadIO m
        => ByteBuffer
        -> Int
        -> m (Either Int (Ptr Word8))
unsafeConsume bb n = liftIO $ do
    bbref <- readIORef bb
    let available = contained bbref - consumed bbref
    if available < n
        then return $ Left (n - available)
        else do
             writeIORef bb bbref { consumed = consumed bbref + n }
             return $ Right (ptr bbref `plusPtr` consumed bbref)

{-
  plen ptr == size >= 0
  contained <= size
  consumed <= contained
-}

\begin{align}
&\texttt{available} \geq \texttt{n}\\
\Leftrightarrow\; & \texttt{contained} - \texttt{consumed} \geq \texttt{n}\\
\Rightarrow\; & \texttt{plen p} - \texttt{consumed} \geq \texttt{n}
\end{align}

Writing to a ByteBuffer

copyByteString :: MonadIO m => ByteBuffer -> ByteString -> m ()
copyByteString bb bs =
    bbHandler "copyByteString" bb go
  where
    go bbref = do
        let (bsFptr, bsOffset, bsSize) = BS.toForeignPtr bs
        -- if the byteBuffer is too small, resize it.
        let available = contained bbref - consumed bbref
        bbref' <- if size bbref < bsSize + available
                  then enlargeBBRef bbref (bsSize + available)
                  else return bbref
        -- if it is currently too full, reset it
        bbref'' <- if bsSize + contained bbref' > size bbref'
                   then resetBBRef bbref'
                   else return bbref'
        -- now we can safely copy.
        withForeignPtr bsFptr $ \ bsPtr ->
            copyBytes (ptr bbref'' `plusPtr` contained bbref'')
            (bsPtr `plusPtr` bsOffset)
            bsSize
        writeIORef bb BBRef {
            size = size bbref''
            , contained = contained bbref'' + bsSize
            , consumed = consumed bbref''
            , ptr = ptr bbref''}

Enlarging a ByteBuffer

{-@ enlargeBBRef ::
       b:BBRef
    -> i:Nat
    -> IO {v:BBRef | size v >= i
                  && contained v == contained b
                  && consumed v == consumed b}
@-}
enlargeBBRef :: BBRef -> Int -> IO BBRef
enlargeBBRef bbref minSize = do
    let getNewSize s | s >= minSize = s
        getNewSize s = getNewSize $
          (ceiling . (*(1.5 :: Double)) . fromIntegral) (max 1 s)
        newSize = getNewSize (size bbref)
    ptr' <- Alloc.reallocBytes (ptr bbref) newSize
    return bbref { size = newSize
                 , ptr = ptr'
                 }

{-@ reallocBytes :: Ptr a -> l:Nat -> IO ({v:Ptr a | plen v == l}) @-}
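The growth policy inside enlargeBBRef can be tried out in isolation. The following sketch lifts the local getNewSize into a top-level function; the name growTo and the argument order are choices made for this example.

```haskell
-- | Grow a buffer size by factors of 1.5 (rounded up) until it is at
-- least @minSize@. Sizes below 1 are treated as 1, so that growth
-- always makes progress.
growTo :: Int -> Int -> Int
growTo minSize s
    | s >= minSize = s
    | otherwise    =
        growTo minSize (ceiling (1.5 * fromIntegral (max 1 s) :: Double))
```

Starting from a size of 4 and asking for at least 10 bytes, the candidates are 6, 9 and 14, so growTo 10 4 is 14. A buffer that is already large enough is left alone: growTo 3 8 is 8. The max 1 guard matters for an empty buffer, since 1.5 times zero would never grow.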

Rewinding a ByteBuffer

{-@ resetBBRef ::
       b:BBRef
    -> IO {v:BBRef | consumed v == 0
                  && contained v == contained b - consumed b
                  && size v == size b}
@-}
resetBBRef :: BBRef -> IO BBRef
resetBBRef bbref = do
    let available = contained bbref - consumed bbref
    moveBytes (ptr bbref) (ptr bbref `plusPtr` consumed bbref) available
    return BBRef { size = size bbref
                 , contained = available
                 , consumed = 0
                 , ptr = ptr bbref
                 }

{-@ moveBytes :: p:Ptr a -> q:Ptr a -> {v:Nat | v <= plen p && v <= plen q} -> IO () @-}
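To see the rewind step on real memory, here is a small self-contained sketch that uses only functions from base. The buffer size and the values of consumed and contained are made up for the example, and rewindDemo is a hypothetical name.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Marshal.Utils (moveBytes)
import Foreign.Ptr (Ptr, plusPtr)

-- | Fill an 8-byte buffer with 0..7, pretend that 6 bytes have been
-- written and 3 of them consumed, then move the unread bytes to the
-- front, as 'resetBBRef' does.
rewindDemo :: IO [Word8]
rewindDemo = allocaBytes 8 $ \p -> do
    pokeArray p ([0 .. 7] :: [Word8])
    let consumed  = 3 :: Int
        contained = 6 :: Int
        available = contained - consumed
    -- Source and destination overlap, which 'moveBytes' permits.
    moveBytes p (p `plusPtr` consumed) available
    peekArray available (p :: Ptr Word8)
```

Running rewindDemo returns [3,4,5]: the three unread bytes now sit at the start of the buffer. The refined type of moveBytes is what forces available to stay within both memory regions; passing contained instead of available here would read past the written data.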

Conclusions

safety and performance

you can have both!

write C in Haskell

hide details behind high-level API

LiquidHaskell

prove invariants of your code at compile time

does not change the code at all
no run time overhead
compile time overhead optional
no need to change client code