kmarekspartz

Expanding our capacity to respond to unforeseen changes

A look back over some recent posts from this blog


In the last year, I've written 103 posts (an average of one every three and a half days) for a total of 20,183 words.

In the last six months, I've written 24 posts, for a total of 6,223 words. Compared to my first six months of blogging, my post quantity is definitely down, but my word count per post is up (~260 words compared to ~178), which may be a naive estimate of quality.

Russell Brown has a potential fix for the Data Type growth


A potential fix

As I mentioned in my last post, Russell Brown is working on a feature branch of riak_dt which may have a fix for some of this. Today, I ran my benchmark against a locally-compiled Riak, and then I built it with the potential fix.

My environment

I'm on Mac OS X 10.9.5, and I ran the benchmark against a single Riak 2.0.2 node running locally, installed from source. I have OTP R17, and therefore had to make a few changes in the Riak build system.

To include the patched version, I changed the riak_dt branch that riak_kv depends on to bug/rdb/faster-merge-orswot.

Results

Map writes have improved significantly. Not only do they no longer look exponential, they are much faster! Notice the time scales:

Unpatched Map writes (data-type API)

Patched Map writes (data-type API)

The same applies for Map reads:

Unpatched Map reads (data-type API)

Patched Map reads (data-type API)

The stripes in the Patched Map reads chart are likely due to the precision the benchmark uses when printing times; they can be ignored.

However, there was a regression (a tradeoff) with regard to size: the patched version appears to grow at a faster rate, though growth is still linear.

Unpatched Map byte size and number of items

Patched Map byte size and number of items

The blip around 1,000 items may be due to the elements (stringified integers) growing from 3 to 4 bytes.

There are no visually significant changes for Map reads using the key-value API.

Unpatched Map reads (key-value API)

Patched Map reads (key-value API)

20k Elements

Testing the unpatched version with 20,000 elements would be unbearable, but I did test the patched version to give a better idea of how it behaves at that size. It's not quite what I was expecting, but it is still much faster than the unpatched version was with 3,000 elements.

Patched Map writes (20k elements) (data-type API)

Patched Map reads (20k elements) (data-type API)

Raw data

Here you go!

Future work

  • Test the patched Set implementation.
  • Test against an actual cluster rather than a single local node.
  • Make a basho_bench version of the benchmark.

Once again, big thanks to Russell Brown for the quick turn-around on this! We're looking forward to using this change once it is officially released. We are also eagerly anticipating delta-mutation, which should bring the linear growth (for writes) down to amortized constant time, if I'm understanding the change correctly.

Some additional information about yesterday's post


Yesterday's post could have included a bit more information. Here's that information.

My environment

I'm on Mac OS X 10.9.5, and I ran the benchmark against a single Riak 2.0.2 node running locally, installed from homebrew, which does a binary install.

Other graphs

Some graphs I didn't include were a comparison of byte size and number of items, Set read times for both the key-value and data-type APIs, and Map read times for the key-value API. Here they are:

Set byte size and number of items

Map byte size and number of items

Set reads (key-value API)

Set reads (data-type API)

Map reads (key-value API)

Raw data

Here you go!

A potential fix

Russell Brown is working on a feature branch of riak_dt which may have a fix for some of this. I'm running the benchmark against a locally-compiled Riak right now, and then I'll build it with his potential fix and compare the benchmark results.

Update: I ran the benchmark against the potential fix with good results.

A benchmark for large (10k+ elements) Riak Data Types


Riak 2.0 includes Data Types which provide reasonable guarantees with regard to eventual consistency. However, the data types, as implemented, aren't really designed to be performant with a large number of elements.

To that end, here's a benchmark which demonstrates the behavior of large maps and sets using the Riak Ruby Client. First, it reads using the key-value API and gets the size of the object (number of bytes, not number of elements). Then, using the data-type API, it reads, adds an element, and reads twice. This allows us to compare read performance between the APIs, as well as examine performance relative to the number of elements or bytes.

require 'riak'
require 'benchmark'
require 'securerandom'

data_type = "maps"  # or "sets"

bucket = "testing CRDT #{data_type}: #{SecureRandom.hex}"

n_items = 15000
n_containers = 1

client = Riak::Client.new(nodes: [{:host => 'localhost', :pb_port => 8087}])

puts 'items, id, size, kv_read_elapsed, read_prewrite_elapsed, write_elapsed, read_postwrite_elapsed, read_postwrite_elapsed2'
(1..n_items).each do |i|
  (1..n_containers).each do |id|
    key = id.to_s
    size = 0
    kv_read_elapsed = 0
    begin
      kv_read_elapsed = Benchmark.realtime do
        size = client.bucket(bucket).get(key, type: data_type).content.raw_data.size
      end
    rescue Riak::ProtobuffsFailedRequest
      # get will fail the first time through.
    end
    container = nil
    read = nil
    write = nil
    if data_type == "maps"
      read = lambda{ container = Riak::Crdt::Map.new(client.bucket(bucket), key) }
      write = lambda{ container.sets['set'].add(i.to_s) }
    elsif data_type == "sets"
      read = lambda{ container = Riak::Crdt::Set.new(client.bucket(bucket), key) }
      write = lambda{ container.add(i.to_s) }
    end
    read_prewrite_elapsed = Benchmark.realtime(&read)
    write_elapsed = Benchmark.realtime(&write)
    read_postwrite_elapsed = Benchmark.realtime(&read)
    read_postwrite_elapsed2 = Benchmark.realtime(&read)
    puts "#{i}, #{id}, #{size}, #{kv_read_elapsed}, #{read_prewrite_elapsed}, #{write_elapsed}, #{read_postwrite_elapsed}, #{read_postwrite_elapsed2}"
  end
end

Results

I ran the benchmark for 5,000 items in a Map, and 10,000 items in a Set against a single local node.

For both Sets and Maps, size has near-linear growth relative to the number of items[^2].

[^2]: Item size should matter, but they are all short strings in this benchmark.

Set reads for both the key-value and data-type APIs are small and seemingly constant.[^1] These are nearly flat lines, so I didn't include the plots. The same goes for Map reads using the key-value API, but not Map reads using the data-type API:

Map reads (data-type API)

Consecutive reads don't seem to have much of an effect:

Map consecutive reads (data-type API)

Both Set and Map writes seem to grow superlinearly:

Set writes (data-type API)

Map writes (data-type API)

[^1]: This may be a flaw in the benchmark. Perhaps someone with more knowledge of the Ruby Riak Client can tell me more.

Future work

I have only tested against a single local node, not a full Riak cluster, though similar behavior has been seen in both. basho_bench could be used to test without going through the Ruby client. Additionally, I'd like to test the data types directly.

Update: For some missing graphs, and more information about my environment, please see my next post.

Update #2: Russell Brown may have fixed this in a feature branch. I ran the benchmark against the potential fix with good results.

Ben Franklin is a fan of Linus's Law.

These Thoughts, my dear Friend, are many of them crude and hasty, and if I were merely ambitious of acquiring some Reputation in Philosophy, I ought to keep them by me, 'till corrected and improved by Time and farther Experience. But since even short Hints, and imperfect Experiments in any new Branch of Science, being communicated, have oftentimes a good Effect, in exciting the attention of the Ingenious to the Subject, and so becoming the Occasion of more exact disquisitions (as I before observed) and more compleat Discoveries, you are at Liberty to communicate this Paper to whom you please; it being of more Importance that Knowledge should increase, than that your friend should be thought an accurate Philosopher.

Benjamin Franklin, quoted in The Invention of Air by Steven Johnson.

Global sandboxes for Cabal


I recently moved to a new laptop and decided to rethink my GHC and cabal setup. After installing GHC for Mac OS X and running cabal install cabal-install, I decided not to install any other cabal packages globally, in order to avoid cabal hell. However, there are a few programs I like to install globally with cabal, e.g. pandoc. I asked Tyler Holien what he does, and he suggested a separate cabal sandbox for each program and a symlink to the resulting binary somewhere on my $PATH.

I created a script called glabal to do the first part of this work for me:

#!/bin/bash
# ~/bin/glabal
mkdir -p ~/.installs/$1
pushd ~/.installs/$1
cabal sandbox init
cabal install $1 -j8
popd

To use this, I type glabal pandoc and it creates a sandbox in ~/.installs/pandoc and attempts to install pandoc. If this doesn't complete successfully, I either rm -r ~/.installs/pandoc or go into the sandbox manually and attempt to recover. If it does complete, the second part is a symlink from somewhere on my $PATH to the built binary, which ends up in ~/.installs/pandoc/.cabal-sandbox/bin.

I demonstrate parallelizing existing code in Haskell


I've been working through Parallel and Concurrent Programming in Haskell. In my last post, I demonstrated the facilities Haskell provides for lightweight concurrency. In this post, let's take a look at Haskell's facilities for parallelism.

As a brief example, let's parallelize Quicksort[^1].

import Control.Parallel.Strategies

Strategies provide a means to tell the run-time system how to evaluate values. We'll be using rseq, the sequential evaluation strategy, and parList, which takes a strategy for list elements and applies that strategy to each element in parallel.
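
As a minimal illustration (the squares binding is mine, purely for illustration, and isn't needed for the sort below), this sparks the evaluation of every element of a list in parallel, each element evaluated with rseq:

squares :: [Int]
squares = map (^ 2) [1 .. 100] `using` parList rseq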

Here's our non-parallelized Quicksort implementation:

quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (x:xs) =
  let leftPartition = [y | y <- xs, y < x]
      rightPartition = [y | y <- xs, y >= x]
      left = quicksort leftPartition
      right = quicksort rightPartition
  in left ++ [x] ++ right

Quicksort partitions a list around a pivot, sorts each partition, and then combines the partitions and the pivot.

Our parallelized version is almost the same:

parallelsort :: Ord a => [a] -> [a]
parallelsort [] = []
parallelsort (x:xs) =
    let leftPartition = [y | y <- xs, y < x] `using` parList rseq
        rightPartition = [y | y <- xs, y >= x] `using` parList rseq
        left = parallelsort leftPartition
        right = parallelsort rightPartition
    in left ++ [x] ++ right

We simply tell the run-time system what strategy to use for the list comprehensions.

This doesn't improve much in this case, but when used judiciously, extending your existing code with parallelism is straightforward in Haskell.
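
To actually see a speedup, the program needs the threaded run-time and a core count. A hypothetical driver (inputs and main are my names, not from the original post) could be compiled with ghc -O2 -threaded -rtsopts and run with +RTS -N4 -RTS to use four cores:

-- A cheaply shuffled input, so the quicksort recursion stays reasonably balanced.
inputs :: [Int]
inputs = [ (n * 2654435761) `mod` 1000003 | n <- [1 .. 100000] ]

main :: IO ()
main = print (last (parallelsort inputs))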

This post is also available as a literate Haskell file.

[^1]: This isn't the best algorithm to parallelize, nor is this an efficient implementation, but it shows how to add parallelism to your code.

I demonstrate a concurrent implementation of the Daytime Protocol in Haskell


One example from Parallel and Concurrent Programming in Haskell is a concurrent network server. The server given in the book implements an informally specified doubling protocol: each submitted line is parsed as an Integer, and the server responds with its double.

Back in May, Andrew Clarkson gave a talk at the PyMNtos meeting about Asynchronous IO in Python. As an example, he included an asynchronous version of a Daytime server.

Let's take the Haskell doubling server, and make it a (TCP) Daytime server.

... but first, some imports.

import Control.Monad (forever)

forever is useful for making loops out of void functions (IO ()).

import Text.Printf (printf)

printf behaves similarly to how it does in other languages: it takes a format string and items to interpolate into the format string.

import Data.Time (getCurrentTime, formatTime)
import System.Locale (defaultTimeLocale)

These are used to get the current time, and format it with the default locale.

import System.IO (
    Handle

A handle is a place for IO to stream data.

  , stdout

stdout is the default handle used for things like putStrLn.

  , hPutStrLn

This is the handle equivalent of putStrLn. Rather than assuming the standard handle, the handle is passed in explicitly. One can define putStrLn (in point-free style) as putStrLn = hPutStrLn stdout.

  , hClose

Handles need to be closed when you are done with them.

  )
import Network (
    withSocketsDo

This is needed on Windows to initialize the networking subsystem. It is here for portability reasons.

  , listenOn

This opens a socket on a specified port.

  , PortID(PortNumber)

To specify the port, we'll pass in a PortNumber.

  , accept

accept takes a socket and returns a tuple of a Handle, a host, and the port.

  )
import Control.Concurrent (
    forkFinally

forkFinally creates a new (light-weight) thread to run a given action, and when that action completes, it “finally” runs a second one. You'll see when we get there.

  , threadDelay

It turns out our single-threaded implementation is quite efficient, so we'll add a slight delay to make clear how concurrency affects the server.

  )

This program has four different main functions. The fourth one is the concurrent Daytime server, so we'll use that implementation as our main main:

main = main4

First, let's write a non-server version of what a Daytime server does. According to the specification:

“Once a connection is established the current date and time is sent out the connection as a ascii character string (and any data received is thrown away). The service closes the connection after sending the quote.”

A non-server version of this would just be a simple version of date:

main1 :: IO ()
main1 = do
  ct <- getCurrentTime
  let time = formatTime defaultTimeLocale "%F %T" ct
  putStrLn time

This outputs the time, and halts.

In order to work with sockets, however, we'll need the Handle-based equivalent of this program. We'll also add a slight delay to make the benefits of concurrency clear later.

To convert main1 to use handles explicitly, we'll pass an output handle in, and use hPutStrLn instead of putStrLn:

mainWith :: Handle -> IO ()
mainWith outH = do
  threadDelay (10^6) -- to simulate "real" work.
  ct <- getCurrentTime
  let time = formatTime defaultTimeLocale "%F %T" ct
  hPutStrLn outH time

Ignoring the threadDelay, the equivalent program to main1 would be:

main2 :: IO ()
main2 = mainWith stdout

Now we can get on with implementing the server. The specification establishes port 13 for the Daytime protocol.[^1]

port :: Int
port = 13

Let's start out with the server implementation from the book:

main3 :: IO ()
main3 = withSocketsDo $ do
  sock <- listenOn (PortNumber (fromIntegral port))
  printf "Listening on port %d\n" port
  forever $ do
    (handle, host, port) <- accept sock
    printf "Accepted connection from %s: %s\n" host (show port)

Here's where things diverge. We use our mainWith in place of the book's talk. Since mainWith accepts any handle, we can pass it the handle returned by accept:

    mainWith handle

Don't forget to close the handle after the connection has been handled (as specified)!

    hClose handle

This handles each connection in sequence. Since handling the connection takes over a second of work (due to the thread delay), it can only respond to one connection per second.

Let's add some concurrency! This first part is the same:

main4 :: IO ()
main4 = withSocketsDo $ do
  sock <- listenOn (PortNumber (fromIntegral port))
  printf "Listening on port %d\n" port
  forever $ do
    (handle, host, port) <- accept sock
    printf "Accepted connection from %s: %s\n" host (show port)

Here's where forkFinally comes in. Instead of calling mainWith handle, we fork a new thread to call it. When the thread completes, we (finally) close the handle.

    forkFinally (mainWith handle)
                (\_ -> hClose handle)

Since a new thread is created for each connection, we are no longer limited to one connection per second.

To run this file[^2]:

runhaskell daytime.lhs

To test the concurrency[^3]:

yes "nc localhost 13" | parallel -j 32

This streams commands to connect to localhost on port 13, and uses parallel to keep 32 jobs running those commands. With main3, you should see one response per second, whereas with main4, you should see 32 responses per second.

[^1]: You may need root access in order to run this program. If you do not have root access, change the port to 1313 or some other number above 1024. Port numbers below 1024 are generally privileged on modern systems.

[^2]: Again, you may need root access.

[^3]: You'll need GNU parallel installed.

I've been asked a follow up question about reexporting items.


I sent yesterday's post about imports to Danny Gratzer. He had a follow-up question:

“Is there a Python equivalent to the Haskell trick of doing the following which will then export all 3 modules?”

module Foo (module X) where
import Data.Text as X
import Data.ByteString as X
import Control.Lens as X

The following should accomplish the same result, though it isn't as concise, due to Python's lack of an embedded module language. Haskell's module language is minimal[^1], but it has one; in Python, file structure determines modules (and packages).

[^1]: ... relative to languages such as ML. There is ongoing work to improve Haskell's module language.

# foo/x.py:
from data.text import *
from data.byte_string import *
from control.lens import *

# foo/__init__.py:
import foo.x as x

# bar.py:
from foo import x

I wouldn't recommend these sorts of tricks, as the import order affects what is in scope: e.g., if Data.Text and Data.ByteString both defined baz, which one would X.baz be? Ideally, import order should not matter, as it is an implementation detail that should be hidden by a module. A language (or linter) could enforce that property by preventing imports with conflicting names in general.
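
GHC, for what it's worth, only reports such a conflict when the ambiguous name is actually used. A minimal sketch (a hypothetical file; both Prelude and Data.Map export lookup):

import Data.Map (lookup)

-- The conflicting import itself is accepted; this use site fails to
-- compile with GHC's "Ambiguous occurrence 'lookup'" error:
main :: IO ()
main = print (lookup (1 :: Int) [(1, "a")])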

A comparison of imports in Python, Haskell, and Swift


At the last HaskellMN meetup, Danny Gratzer had a question about Python imports. Specifically, he wanted to import and re-export a constant. Here's how it is done in Python:

# a.py:
some_constant = 3.14

# b.py:
from a import some_constant

# c.py:
from b import some_constant

Edit: He had a follow-up question.

This reminded me that I wanted to compare the import styles of Python, Haskell, and Swift. Their imports are similar, though there are a few differences to keep in mind.

Python

Python has a few import styles:

import module1
from module2 import a, b, c
from module3 import *

The first loads module1 and brings it into scope. The second loads objects a, b, and c from module2 and brings them into scope. The third loads every object from module3 and brings them into scope.

The first is useful for short names (os, sys), but with module hierarchies it can be annoying. The second is my preferred style, as it forces you to name what will be used. The third is frowned upon, as you can accidentally overwrite variables if a module you import later provides an object with the same name as one you import earlier. Additionally, providing a list of what is going to be used in the file is useful from a documentation standpoint. “Explicit is better than implicit.”

Python modules are isomorphic to files. A module is either a file with that name or a folder with an __init__.py file in it[^1]. Submodules can be placed in the folder.

[^1]: A folder with an __init__.py is technically a package.

Python imports can be qualified using as:

from module import a as b

This imports a from module, but gives it the name b in scope.

Python imports can also happen anywhere in a file, even inside functions. This is a double-edged sword, useful for both making and breaking circular dependencies.

Haskell

Haskell, too, has a few import styles.

import qualified Module1
import Module2 (a, b, c)
import Module3

The first loads Module1 and brings it into scope. The second loads objects a, b, and c from Module2 and brings them into scope. The third loads every (exported) object from Module3 and brings them into scope.

These correspond loosely to the three Python examples; however, I have ordered them by roughly similar semantics rather than by their similar syntax. Again, I prefer the middle one, for similar reasons.

There is also a hiding option, which is the inverse of the second option above:

import Module hiding (a, b, c)

This imports everything from Module except a, b, and c.

Haskell modules are made explicit within files:

module Module where
    ...

The convention is to put them in the corresponding file structure, though this is not enforced programmatically.
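
For example, a module named Data.Foo would conventionally live at Data/Foo.hs relative to the source root (the module name here is hypothetical):

-- Data/Foo.hs
module Data.Foo where
    ...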

Unlike Python imports, Haskell imports need to be at the top of a file, optionally following a module declaration.

Haskell provides the as option, too, though only for module-level imports:

import Module as M
import qualified Module as M

However, the first example here also imports everything from Module unqualified, in addition to making the qualified name M available.
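
For example (the sorted bindings are only illustrative), after the unqualified form both names resolve:

import Data.List as L

sorted :: [Int]
sorted = sort [3, 1, 2]      -- the unqualified name is in scope

sorted' :: [Int]
sorted' = L.sort [3, 1, 2]   -- and so is the qualified one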


Swift

Swift imports are not as diverse. They only provide equivalents for the second and third examples from above:

import var module2.a
import module3

The first loads the object a from module2 and brings it into scope. The second loads every object from module3 and brings them into scope. There doesn't seem to be an equivalent to the module1 example from Python and Haskell.

Compared to the module2 examples from Python and Haskell, the Swift form imposes two additional constraints. First, the kind of the import needs to be specified: in the example I gave here, the import kind is var, though it could also be typealias, struct, class, enum, protocol, or func. The other constraint is that each item imported from a module needs its own line. A more precise equivalent to the Python and Haskell examples would be:

import var module.a
import var module.b
import var module.c

This is a little excessive. I think a better syntax would be something like one of these:

from module import var a, var b, var c
import module (var a, var b, var c)

Like Python, Swift imports can be placed anywhere within a file.

The morality of import styles

Importing a module name is fine, though repeating the module name at each access into the module can make code verbose and hard to read.

Importing specific items from a module is my preferred import style, except when it isn't idiomatic (e.g. import sys in Python).

Importing everything from a module without naming each item is frowned upon because it hides from readers what is actually in scope. Readers like to be able to see what is in scope in order to get a feel for what kind of code will be read in a file.

Additionally, if multiple modules define items with the same name, it is possible to overwrite imports implicitly, without notifying the reader:

# a.py:
x = 1

# b.py:
x = 2

# c.py:
from a import *
from b import *
print(x)

These suggestions are reflected in Python's PEP 8 and the HaskellWiki page on “Import[ing] modules properly”. These discuss the merits of the various import styles, and they arrive at similar conclusions, though the HaskellWiki page is less axiomatic and provides more rationale for its decisions. There is not yet a similar document for Swift, though the official blog post on Files and Initialization mentions that you “can even import modules at the bottom of the file (although that is not recommended Swift style)”.
