kmarekspartz

On Choosing the Best Tool for the Job

February 4, 2014

Finding balance between tools that do it all and tools that are ideal

Django. Ruby on Rails. jQuery.

There are many such frameworks which promise to be Swiss army knives. They include as many of the tools they think you'll need as possible, and then some. There are people who can program in these frameworks without knowing much of their respective host languages. They'll argue that they are working at the level of the problem domain and don't need anything else. Working at the level of the problem domain is great; however, there are issues with depending on a single tool.

If all you have is a hammer, everything looks like a nail.

– Law of the instrument

Flexibility of a tool could be defined in at least two ways. First, one could argue a Swiss army knife is flexible because it can be used for many different problems. On the other hand, decomposing the problem allows you to choose the best tool for each job.

As a corollary, when using a decomposed system, if your screwdriver stops being maintained or your employer uses a different platform, you can swap it out with minimal effort. When using a Swiss army knife, your screwdriver and your bottle opener might be one and the same. Since the tools are combined, there may be unintentional interdependencies that prevent you from switching completely.

You're locked in.

Interface Package Now on PyPI

February 3, 2014

Releasing a package to PyPI

I've put the interface system that I've been discussing in recent posts into a package now available on PyPI.

Testing Interfaces in Python

January 30, 2014

Generating tests for interfaces in Python

In yesterday's post, I proposed a way to write and test interfaces in Python. Testing these interfaces was quite verbose. I left refactoring that testing as an exercise to the reader. Then I decided to do that exercise. Here's a neat way to generate interfaces and abstract tests given a dictionary mapping interface names to a list of method names:

# interfaces.py
def Interface(interface_name, method_names):
    def interface_helper(*args, **kwargs):
        raise NotImplementedError
    methods = {method_name: interface_helper for method_name in method_names}
    return type(interface_name, (object,), methods)

def AbstractInterfaceTest(test_name, method_names):
    def abstract_interface_test_helper(method_name):
        def test_method(self):
            try:
                getattr(self.obj, method_name)()
            except NotImplementedError:
                self.fail(
                    type(self.obj).__name__ +
                    ' does not implement ' +
                    method_name
                )
        return test_method
    methods = {
        'test_' + method_name: abstract_interface_test_helper(method_name)
        for method_name
        in method_names
    }
    return type(test_name, (object,), methods)

interfaces = {
    'CountFish': ['one', 'two'],
    'ColorFish': ['red', 'blue']
}

for interface_name, methods in interfaces.iteritems():
    interface_name += 'Interface'
    globals()[interface_name] = Interface(interface_name, methods)
    test_name = 'AbstractTest' + interface_name
    globals()[test_name] = AbstractInterfaceTest(test_name, methods)

In order to use this with the other code from yesterday, we'd have to update tests.py as well:

# tests.py
from unittest import TestCase, main
from interfaces import CountFishInterface, ColorFishInterface,\
    AbstractTestColorFishInterface, AbstractTestCountFishInterface
from models import OurFish

class TestOurFish(AbstractTestCountFishInterface,
                  AbstractTestColorFishInterface,
                  TestCase):
    def setUp(self):
        self.obj = OurFish()

if __name__ == '__main__':
    main()
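
To see what the generated tests report, here is a minimal, self-contained sketch that builds one interface and its abstract test and runs them against a class that implements only half of the interface. HalfFish and run_generated_tests are hypothetical names made up for this illustration, and the test case is created inside a function only so that nothing runs on import.

```python
import unittest


def Interface(interface_name, method_names):
    # Every generated method raises until a subclass overrides it.
    def interface_helper(*args, **kwargs):
        raise NotImplementedError
    methods = {name: interface_helper for name in method_names}
    return type(interface_name, (object,), methods)


def AbstractInterfaceTest(test_name, method_names):
    def make_test(method_name):
        def test_method(self):
            try:
                getattr(self.obj, method_name)()
            except NotImplementedError:
                self.fail(type(self.obj).__name__ +
                          ' does not implement ' + method_name)
        return test_method
    methods = {'test_' + name: make_test(name) for name in method_names}
    return type(test_name, (object,), methods)


CountFishInterface = Interface('CountFishInterface', ['one', 'two'])
AbstractTestCountFishInterface = AbstractInterfaceTest(
    'AbstractTestCountFishInterface', ['one', 'two'])


class HalfFish(CountFishInterface):
    # Hypothetical model that implements only one of the two methods.
    def one(self):
        return 1


def run_generated_tests():
    # Build the concrete TestCase here so that nothing runs on import.
    class HalfFishCase(AbstractTestCountFishInterface, unittest.TestCase):
        def setUp(self):
            self.obj = HalfFish()

    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(HalfFishCase).run(result)
    return result
```

Calling run_generated_tests() runs both generated tests; the one for two fails with the message "HalfFish does not implement two".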

Interfaces in Python

January 29, 2014

Implementing interfaces in Python

There's been a long history of proposals and disagreement over interfaces in Python. I'm going to ignore all of that and show one way to utilize interfaces.

An interface can be a class from which implementing classes inherit:

# interfaces.py:
class CountFishInterface(object):
    def one(self, *args, **kwargs):
        raise NotImplementedError
    def two(self, *args, **kwargs):
        raise NotImplementedError

class ColorFishInterface(object):
    def red(self, *args, **kwargs):
        raise NotImplementedError
    def blue(self, *args, **kwargs):
        raise NotImplementedError

# models.py:
from interfaces import CountFishInterface, ColorFishInterface

class OurFish(CountFishInterface, ColorFishInterface):
    pass

Now, OurFish doesn't yet implement the interface. Before we do that, let's add some tests. Note that since multiple classes may implement our interfaces, we make abstract tests for each interface.

# tests.py:
from unittest import TestCase
from interfaces import CountFishInterface, ColorFishInterface
from models import OurFish

class AbstractTestCountFishInterface(object):
    def test_one(self):
        try:
            self.obj.one()
        except NotImplementedError:
            self.fail(
                str(type(self.obj)) + ' does not implement one'
            )
    def test_two(self):
        try:
            self.obj.two()
        except NotImplementedError:
            self.fail(
                str(type(self.obj)) + ' does not implement two'
            )

class AbstractTestColorFishInterface(object):
    def test_red(self):
        try:
            self.obj.red()
        except NotImplementedError:
            self.fail(
                str(type(self.obj)) + ' does not implement red'
            )
    def test_blue(self):
        try:
            self.obj.blue()
        except NotImplementedError:
            self.fail(
                str(type(self.obj)) + ' does not implement blue'
            )

class TestOurFish(AbstractTestCountFishInterface,
                  AbstractTestColorFishInterface,
                  TestCase):
    def setUp(self):
        self.obj = OurFish()

Now our tests should fail! Let's implement the interface in OurFish:

# models.py:
from interfaces import CountFishInterface, ColorFishInterface

class OurFish(CountFishInterface, ColorFishInterface):
    def one(self):
        return 1
    def two(self):
        return 2
    def red(self):
        return '#FF0000'
    def blue(self):
        return '#0000FF'

Now our tests should pass!

Declaring new interfaces and writing the tests is quite verbose. Here's a simpler way of declaring new interfaces:

# interfaces.py
def Interface(interface_name, method_names):
    def interface_helper(*args, **kwargs):
        raise NotImplementedError
    methods = {method_name: interface_helper for method_name in method_names}
    return type(interface_name, (object,), methods)

ColorFishInterface = Interface('ColorFishInterface', [
    'red',
    'blue'
])

That's not that pythonic looking, but here's another way to do it:[^1]

# interfaces.py
interfaces = {
    'CountFishInterface': ['one', 'two'],
    'ColorFishInterface': ['red', 'blue']
}

for interface_name, methods in interfaces.iteritems():
    globals()[interface_name] = Interface(interface_name, methods)

Still messy, but it makes it easy to add more interfaces.

I'll leave refactoring the test cases as an exercise to the reader. Beyond moving the try-block into a helper method, the best solution I can presently come up with is code-generation.
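
The helper-method half of that refactoring might look something like the sketch below. InterfaceTestMixin, assertImplements, and run_interface_tests are hypothetical names chosen for this illustration; only the try-block itself comes from the tests above.

```python
import unittest


class InterfaceTestMixin(object):
    # Hypothetical helper: wraps the try/except repeated in each test above.
    def assertImplements(self, method_name):
        try:
            getattr(self.obj, method_name)()
        except NotImplementedError:
            self.fail(type(self.obj).__name__ +
                      ' does not implement ' + method_name)


class AbstractTestCountFishInterface(InterfaceTestMixin):
    def test_one(self):
        self.assertImplements('one')

    def test_two(self):
        self.assertImplements('two')


def run_interface_tests(abstract_test, obj):
    # Hypothetical runner: builds a concrete TestCase around obj and runs it.
    class Case(abstract_test, unittest.TestCase):
        def setUp(self):
            self.obj = obj

    result = unittest.TestResult()
    unittest.defaultTestLoader.loadTestsFromTestCase(Case).run(result)
    return result
```

Each test shrinks to one line, but every interface still needs a hand-written test per method, which is why code generation remains tempting.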

[^1]: Update: No one noticed, but I was originally missing the call to iteritems!

Databases, Primality, and Category Theory

January 27, 2014

An observation relating primality from number theory to databases

My fiancée is taking a databases course right now, and she pointed out that database normalization is related to prime factorization.

If we think of JOIN as a multiplication operation, and the multiplicative inverse as splitting a table into two tables, the equivalent process to prime factorization would be database normalization!
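
One way to make the analogy concrete is to split a small table into two "factors" and check that joining them gives the original back. This is only a sketch in plain Python rather than SQL; the enrollment table, its column names, and the project and natural_join helpers are all made up for this illustration.

```python
def project(table, columns):
    """Split off a sub-table: keep the given columns, drop duplicate rows."""
    seen = []
    for row in table:
        small = {column: row[column] for column in columns}
        if small not in seen:
            seen.append(small)
    return seen


def natural_join(left, right):
    """'Multiply' two tables: combine rows that agree on shared columns."""
    joined = []
    for left_row in left:
        for right_row in right:
            shared = set(left_row) & set(right_row)
            if all(left_row[c] == right_row[c] for c in shared):
                row = dict(left_row)
                row.update(right_row)
                joined.append(row)
    return joined


# Hypothetical table in which each course has exactly one instructor.
enrollment = [
    {'student': 'Ann', 'course': 'Databases', 'instructor': 'Codd'},
    {'student': 'Bob', 'course': 'Databases', 'instructor': 'Codd'},
    {'student': 'Ann', 'course': 'Algebra', 'instructor': 'Noether'},
]

# "Factor" the table, then "multiply" the factors back together.
takes = project(enrollment, ['student', 'course'])
teaches = project(enrollment, ['course', 'instructor'])
recombined = natural_join(takes, teaches)
```

Here recombined equals enrollment, and the instructor for Databases is stored once in teaches instead of once per student.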

My limited background in abstract algebra and category theory makes me think this is a field, but do all fields have this property? Chemistry could possibly be seen as a field with this property, where the prime factors would be the set of elements in a compound.

On Perl Programming

January 23, 2014

Snoopy swearing in Perl

With Perl you can [...] write programs that look like Snoopy swearing.

– The Pragmatic Programmer

Singular Dispatch and Update

December 18, 2013

Providing more implementations for a singular dispatch method

In my earlier singular dispatch post, I suggested that the update method of the singular_dispatch class is quite useful. One place I'm using it is for an implementation of QuickCheck in Python[^1].

[^1]: I'll be posting more on this as things get implemented.

An important aspect of QuickCheck is an arbitrary function which returns a random value of a specified type. Assuming we have a random_int function, we can create an initial arbitrary function which works for int:

import random
import sys

def random_int():
    return random.randint(-sys.maxint - 1, sys.maxint)

arbitrary = singular_dispatch({
    int: random_int,
})

In Haskell, to extend this behavior to a new type, one would write something like:

instance Arbitrary Bool where
  arbitrary = ...
  ...

Back in Python land, we can add another instance to arbitrary using the update method:

arbitrary.update({
    bool: lambda: arbitrary[int]() > 0
})
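
The snippets in this post are Python 2 and lean on the singular_dispatch class from the earlier post. For anyone who wants to try the idea in isolation, here is a self-contained sketch. It swaps sys.maxint for sys.maxsize so that it also runs on Python 3; that substitution is an assumption of this sketch, not part of the original code.

```python
import random
import sys
from collections import OrderedDict


class singular_dispatch(OrderedDict):
    def __call__(self, *args, **kwargs):
        # Dispatch on the type of the first positional argument.
        return self[type(args[0])](*args, **kwargs)


def random_int():
    # sys.maxsize stands in for Python 2's sys.maxint.
    return random.randint(-sys.maxsize - 1, sys.maxsize)


arbitrary = singular_dispatch({int: random_int})

# update is inherited from the dictionary base class, so new
# implementations can be registered after arbitrary is created.
arbitrary.update({bool: lambda: arbitrary[int]() > 0})
```

After the update, arbitrary[bool]() returns True or False, and the keys of arbitrary are int followed by bool, in the order they were added.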

Calling arbitrary[cls]() isn't very pythonic. The preferred notation would be cls.arbitrary(). This mixin provides that notation for classes which inherit from it[^2]:

[^2]: This mixin does not provide the object-oriented notation for primitive types.

class ArbitraryMixin(object):
    @classmethod
    def arbitrary(cls):
        return arbitrary[cls]()


class Natural(int, ArbitraryMixin):
    ## Does not actually enforce self >= 0!
    pass

class NaturalPlus(int, ArbitraryMixin):
    ## Does not actually enforce self >= 1!
    pass


arbitrary.update({
    # Only arbitrary integers >= 0.
    Natural: lambda: abs(arbitrary[int]()),
    # Only arbitrary positive integers, and sys.maxint + 1.
    NaturalPlus: lambda: Natural.arbitrary() + 1
})

(Note: for simplicity, we're ignoring the sys.maxint + 1 case. That value would no longer be an int, because Python 2 promotes it to a long on overflow.)

This is reminiscent of deriving in Haskell, though nowhere near as powerful.

This even works for complex data structures such as Trees, as long as the structure is defined in arbitrary appropriately. To do so, we create a class combinator which picks one of the given classes at random and returns its arbitrary function:

def or_(*args):
    return arbitrary[random.choice(args)]

class Tree(ArbitraryMixin):
    pass

class Leaf(Tree, ArbitraryMixin):
    def __init__(self, value):
        self.value = value

class Node(Tree, ArbitraryMixin):
    def __init__(self, left, right):
        self.left = left
        self.right = right

arbitrary.update({
    Tree: lambda: or_(Leaf, Node)(),
    Leaf: lambda: Leaf(Natural.arbitrary()),
    Node: lambda: Node(Tree.arbitrary(), Tree.arbitrary())
})

One benefit of this is that Leaf.arbitrary() is guaranteed to return a Leaf and Node.arbitrary() is guaranteed to return a Node, while Tree.arbitrary() may return either. This is a useful result of subtyping in Python.

Pattern Matching and Singular Dispatch in Python

December 12, 2013

Implementing pattern matching and singular dispatch in Python

Pattern matching and singular dispatch are useful tools not readily usable in Python. There is PEP 443, but I'm not a fan of how verbose that is.

We can use dictionaries to do basic pattern matching. Here's an absolute value function:

def abs(x):
    abs_dict = {
        True: lambda x: x,
        False: lambda x: -x
    }
    return abs_dict[x >= 0](x)

We can use any expression to dispatch the abs_dict. This allows for more complex patterns than just True and False. This function takes an int or a string and returns more:

def more(x):
    more_dict = {
        int: lambda x: x + 1,
        str: lambda x: x + 's'
    }
    return more_dict[type(x)](x)

more(1) == 2
more('noun') == 'nouns'

This used singular dispatch to choose which implementation to use. I’ve been using the following:

from collections import OrderedDict

class singular_dispatch(OrderedDict):
    def __call__(self, *args, **kwargs):
        return self[type(args[0])](*args, **kwargs)

Pretend OrderedDict is a normal dict for now. By inheriting from a dictionary class, our singular_dispatch objects can function like any other dictionary and have all of its methods.[^1] We want singular_dispatch objects to also work as functions, so we add a __call__ method. The __call__ method uses the type of its first argument (other than self) to choose which function within self to use. This assumes that each value in self is a function which operates on the type given by its key.

This simplifies the definition of more:

more = singular_dispatch({
    int: lambda x: x + 1,
    str: lambda x: x + 's'
})

more(1) == 2
more('noun') == 'nouns'
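
For comparison, here is roughly what the same function looks like with the mechanism from PEP 443, which shipped as functools.singledispatch in Python 3.4. This is a sketch for weighing the verbosity, not a recommendation; the fallback that raises TypeError is an assumption made for this example.

```python
from functools import singledispatch


@singledispatch
def more(x):
    # Fallback for types with no registered implementation (an assumption
    # of this sketch; the dictionary version has no such fallback).
    raise TypeError('no implementation of more for ' + type(x).__name__)


@more.register(int)
def _(x):
    return x + 1


@more.register(str)
def _(x):
    return x + 's'
```

One difference worth knowing: singledispatch follows the class hierarchy, so a subclass of int falls through to the int implementation, whereas the dictionary lookup above matches only the exact type.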

singular_dispatch objects also let us select an implementation explicitly by indexing with a type, instead of dispatching on the type of the first argument. Selecting an implementation that does not match the argument raises a TypeError:

more[int](1) == 2
# more[str](1) raises TypeError
# more[int]('noun') raises TypeError
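
The failure modes can be checked with a small self-contained sketch. The outcome helper is hypothetical and exists only to capture which exception is raised. Note that, with the class as written, calling more with a type that has no entry surfaces as a KeyError from the dictionary lookup, and that includes subclasses such as bool.

```python
from collections import OrderedDict


class singular_dispatch(OrderedDict):
    def __call__(self, *args, **kwargs):
        # Look up the implementation by the exact type of the first argument.
        return self[type(args[0])](*args, **kwargs)


more = singular_dispatch({
    int: lambda x: x + 1,
    str: lambda x: x + 's'
})


def outcome(thunk):
    # Hypothetical helper: return the result, or the name of the exception.
    try:
        return thunk()
    except (KeyError, TypeError) as error:
        return type(error).__name__


explicit_ok = outcome(lambda: more[int](1))
explicit_mismatch = outcome(lambda: more[str](1))
unregistered_type = outcome(lambda: more(2.5))
subclass_of_int = outcome(lambda: more(True))
```

Here explicit_ok is 2, explicit_mismatch is 'TypeError', and both unregistered_type and subclass_of_int are 'KeyError'.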

Inheriting from OrderedDict isn’t strictly necessary in this case, but it is useful for the singular_dispatch class since it remembers the order in which keys are added.

[^1]: Including update. More on that in a different post.

Never Blame People

December 11, 2013

Computer programs should not blame people for their own failings

Many people don't like using computers because they internalize computer problems when things go wrong. People blame themselves when computers don't behave as expected.

One attempt to fix this would be to provide better error messages. Instead of 'Operation Failed', software should report why the operation failed. Was it a data problem? Network connectivity? A timeout for the server? Invalid response from the server? [^1] By telling people what is wrong, rather than telling them that something went wrong, software can empower people to adjust the necessary conditions to fix the problem.
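
As a rough sketch of what that could look like in code, the function below maps a few exception types to messages that say what went wrong and what to try next. The function name, the wording of the messages, and the choice of exceptions (Python 3 built-ins) are all assumptions made for this illustration.

```python
def describe_failure(error):
    # Hypothetical mapping from exception types to messages for people.
    if isinstance(error, TimeoutError):
        return 'The server took too long to respond. Please try again.'
    if isinstance(error, ConnectionError):
        return 'Could not reach the server. Check your network connection.'
    if isinstance(error, ValueError):
        return 'The server sent a response this program could not read.'
    # Still better than a bare 'Operation Failed': include the detail.
    return 'Operation failed: ' + str(error)
```

None of these messages suggests that the person did something wrong; each one points at the condition that needs to change.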

Another source of self-blame is the current trend toward obfuscated UI. When people cannot find a menu because it is hidden off the edge of the screen, or buttons are labeled with abstract icons, they blame themselves.

Software needs to teach people not to blame themselves for things beyond their control.

[^1]: Note: this also makes things easier to debug!