History. Briefly.

Fri Oct 3, 2014

So I've been thinking about history. Or, more accurately, history-aware persistent storage, and what it takes to implement it. There's a fairly small core that can be torn out of :fact-base and generalized, and I wanted to point that out before I forgot.

(in-package #:cl-history)

(defclass event ()
  ((timestamp :accessor timestamp :initform (local-time:now) :initarg :timestamp)
   (event-type :accessor event-type :initform :in :initarg :event-type)
   (data :accessor data :initform nil :initarg :data)))

(defgeneric tick (whole part)
  (:documentation "Should take a whole and a part. Should return the whole with the part added."))

(defgeneric wind (whole part)
  (:documentation "Should take a whole and a part. Should return the whole with the part removed."))

(defmethod apply-event (whole (ev event))
  (let ((part (data ev)))
    (case (event-type ev)
      (:in (tick whole part))
      (:out (wind whole part)))))

(defmethod load-from (empty (storage stream))
  (let ((res empty))
           do (setf res (tick res (cl-store:restore storage)))
           do (read-byte storage))
      (end-of-file () res))))

(defmethod load-from (empty (fname pathname))
  (with-open-file (s fname :element-type '(unsigned-byte 8))
    (load-from empty s)))

(defmethod new-event! (whole (ev event) (storage stream) &rest more-streams)
  (loop for s in (cons storage more-streams)
     do (cl-store:store ev s)
     do (write-byte (char-code #\newline) s))
  (apply-event whole ev))

And once more, in Haskell

module History where

import System.IO

data Event a = In a
             | Out a deriving (Eq, Read, Show)

data Archive a b = Archive { tick :: (a -> b -> a)
                           , wind :: (a -> b -> a)
                           , empty :: a }

applyEvent :: Archive a b -> a -> Event b -> a
applyEvent arc a (In b) = (tick arc) a b
applyEvent arc a (Out b) = (wind arc) a b

loadFrom :: Read b => Archive a b -> Handle -> IO a
loadFrom arc handle = recur handle (empty arc)
    where recur h acc = do res <- hIsEOF h
                           if res
                           then return acc
                           else do ln <- hGetLine h
                                   recur h . applyEvent arc acc $ read ln

newEvent :: Show b => Archive a b -> a -> Event b -> [Handle] -> IO a
newEvent arc a ev handles = do _ <- mapM_ (\h -> hPutStrLn h $ show ev) handles
                               return $ applyEvent arc a ev

The Haskell version is pretty much all in the IO monad because this is meant to be a persistence layer. And regardless of whatever else your persistence layer does, it's eventually going to want to write some data out to a stream. Not necessarily to disk, mind you, it might be going out over a TCP socket or something|1|.

If you've been paying any attention at all to :fact-base, or seen what I'm getting down to with it, you'll notice that a few things are missing from the above.

Lets get the trivials out of the way first. There is no batch mode, as implemented in :fact-bases':change event. I just haven't bothered yet; that's something we can fix with between four and ten lines of code, so it's not a big deal. Second, there's no indexing system, because that's entirely incidental. In the Haskell version, the a part of Archive a b might have its own indexing scheme, but it's none of our business as history recorders. It'll be hooked into whatever the appropriate implementation of tick and wind happen to be. Finally, there aren't deltas because we're not buffering events here; we're just writing them straight out to whatever Handles or streams we get as input.

The real hole in functionality is the lack of a rewinding mechanism. I'm still thinking it through, although the core would basically have to look something like

opposite :: Event a -> Event a
opposite (In a) = Out a
opposite (Out a) = In a

rewindEvent :: Archive a b -> a -> Event b -> a
rewindEvent arc a = applyEvent arc a . opposite

The thing is, if we're not making the assumption that newEvent is specifically writing to a file, we can't assume we have random access to it, hence we can't really abstract the traversal of history. That has to be handled by the users of this little framework, although we can provide convenience functions to handle the file-storage case specifically.

I'm not entirely sure which direction to go in, because there's exactly one place I use the :fact-base rewind system heavily, and that one place performed poorly enough from disk that I was forced to go back and implement an in-memory option. And this was on a pretty speedy SSD, so it was almost the optimal case for the technique. In other words, it's not clear to me that we want to provide a full rewind-from-disk implementation. This might be the reason to chuck deltas into the mix after all; you could then keep one sessions' worth of data around, which would coincidentally both give you the ability to rewind into the shallow past, and reduce the need for disk IO, and keep a bit more of the Haskell implementation in the realm of pure code.

So yeah. The three potential directions I can see are:

Not quite sure whether I want to pick one, or a hybrid, or something else entirely.

I'll let you know if I figure anything out.


1 - |back| - In fact, if I wanted to really generalize newEvent, it would instead have the signature newEvent :: Show b => Archive a b -> a -> Event b -> (Event b -> IO ()) -> IO a. Which would then let you do some mildly crazy things. Like, for instance, using this with the Haste compiler to set up a client-side database that sends off write requests via Ajax. Left as an exercise for the reader, or perhaps a mildly bored future self.

2 - |back| - Assuming they were ok with the assumption, of course.

Creative Commons License

all articles at langnostic are licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License

Reprint, rehost and distribute freely (even for profit), but attribute the work and allow your readers the same freedoms. Here's a license widget you can use.

The menu background image is Jewel Wash, taken from Dan Zen's flickr stream and released under a CC-BY license