
Marginal maintenance costs are much more important than fixed initial costs when making business decisions

When making business decisions, people often justify their choices with initial or fixed costs. While these are difficult to estimate, they are still much easier to estimate than maintenance and marginal costs. In the long term, however, maintenance and marginal costs will outweigh initial and fixed costs, so careful attention must be paid to marginal maintenance costs.

Long-term costs increase for costs that are marginal or maintenance, and increase significantly for marginal maintenance costs.

Let's consider each combination. While a given cost isn't categorically in one quadrant, these dimensions can be used to conceptualize relative costs of alternatives under consideration.

Fixed initial costs

These are one-time costs that are always necessary to take a given path, even in an ideal world.

Marginal initial costs

These are inefficiencies in implementing a decision that are included each time you make similar decisions. Bureaucracy, redundant efforts, and technical debt are three examples.

Fixed maintenance costs

These are the base recurring costs that you must incur as a consequence of your decision. Flat membership or licensing fees are good examples.

Marginal maintenance costs

These are recurring costs that grow in proportion to the number of customers affected by the outcome of your decision. Hiring people or paying for more computers are examples. Minimizing these costs has more benefit than reducing the other three types because they are both recurring and growing.
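
One rough way to see why, as a sketch (the symbols are mine): write the total cost of a path as a function of scale n (customers or repeated decisions) and time t:

C(n, t) = F_I + M_I \, n + F_M \, t + M_M \, n \, t

where F_I, M_I, F_M, and M_M are the fixed initial, marginal initial, fixed maintenance, and marginal maintenance costs. The M_M \, n \, t term is the only one that grows with both scale and time, which is why it dominates in the long run.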


See also: The Equation of Software Design

Use themes to clarify your goals

When defining goals, I've found themes to be more effective than particular achievements. Themes give clarity of purpose and direction, even when day-to-day tactical priorities change. Does my plan for the day fit my theme?

On the other hand, measuring progress against a theme is difficult. Instead of measuring directly, we have to use proxies. Choose proxies carefully. Be comfortable changing your proxy measurements even when you maintain the theme.

No need to start from scratch; you can merge your repos and preserve history.


If you'd like to merge two (or more) git repositories together and preserve the commit histories of both, here's the script for you:

cd some-repo

# Add the other repository as a remote and branch from its history
git remote add other-repo git@other-repo.com:other-repo/other-repo.git
git fetch other-repo
git checkout -b merge-other-repo other-repo/master

# Move the other repository's files into a subdirectory so the two
# trees won't collide (note: * skips dotfiles such as .gitignore)
mkdir other-repo
for f in *; do
  if [ "$f" != "other-repo" ]; then
    git mv "$f" other-repo
  fi
done
git commit -m "Move other-repo into a subdirectory"

# If you're making a merge request:

git merge master --allow-unrelated-histories
git push origin merge-other-repo

# Then make a merge request

# If you're pushing directly to master:

git checkout master
git merge merge-other-repo
git push origin master
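
As a sanity check, both histories should now be reachable from the merged branch. For example (a sketch; README.md stands in for any file from the other repository):

git log --oneline --follow other-repo/README.md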

A concurrent implementation of the Daytime protocol in Rust


When learning a language, I rewrite small programs I've previously written to jumpstart my learning. Implementing a concurrent Daytime server has proved particularly useful because it exercises both sockets and threads; if a language has good socket and threading libraries, it is likely a good language.

Previously, I demonstrated a Haskell implementation. Here's an example in Rust:

extern crate time; // the external `time` crate, not std::time

use std::time::Duration;
use std::io::Write;
use std::net::{TcpListener, TcpStream};
use std::thread;

fn handle_client(mut stream: TcpStream) {
    // Format the current UTC time and write it to the client
    let date = time::strftime("%F %T\n", &time::now_utc()).unwrap();
    let _ = stream.write(date.as_bytes());
}

fn main() {
    // Port 13 is the Daytime port; binding to ports below 1024
    // typically requires elevated privileges
    let listener = TcpListener::bind("127.0.0.1:13").unwrap();

    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                // Connection succeeded: handle it on its own thread.
                // The sleep makes the concurrency observable when
                // several clients connect at once.
                thread::spawn(move || {
                    thread::sleep(Duration::new(1, 0));
                    handle_client(stream)
                });
            }
            Err(_) => { /* connection failed */ }
        }
    }

    drop(listener); // unreachable: incoming() iterates forever
}
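
To exercise the server, here's a minimal client sketch (my own addition, assuming the server above is listening on 127.0.0.1:13). It connects, reads until the server closes the connection, and prints the date:

use std::io::Read;
use std::net::TcpStream;

fn main() {
    // Connect to the daytime server started above
    let mut stream = TcpStream::connect("127.0.0.1:13").unwrap();
    let mut date = String::new();
    // The server writes one line and closes the connection,
    // so read_to_string returns once we reach EOF
    stream.read_to_string(&mut date).unwrap();
    print!("{}", date);
}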

Normalizing a SQL database for eventual consistency


Previously, I introduced eventual consistency for SQL. This post illustrates how to normalize an eventually consistent SQL database.

To demonstrate, let's design a database for a Twitter clone, consisting of users, statuses, and follows. A traditional schema looks like:

CREATE TABLE Users (
  username VARCHAR(255) PRIMARY KEY,
  email VARCHAR(255),
  phone VARCHAR(255),
  location VARCHAR(255),
  confirmed BOOLEAN NOT NULL,
  salt VARCHAR(255) NOT NULL,
  hashed_password VARCHAR(255) NOT NULL
);

CREATE TABLE Statuses (
  content VARCHAR(140) NOT NULL,
  created_at DATE DEFAULT ( SYSDATE ) NOT NULL,
  username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  PRIMARY KEY (content, created_at, username)
);

CREATE TABLE Follows (
  follower_username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  followee_username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  PRIMARY KEY (follower_username, followee_username)
);

The first step in normalizing for eventual consistency is to identify state changes in your data. For example, a user becomes confirmed after clicking a link in an email or text message. Under the schema above, the following UPDATE statement would get executed:

UPDATE Users
SET confirmed = true
WHERE username = 'kmarekspartz';

However, since we're avoiding UPDATE, this will not work. Instead, let's normalize this mutation out of our database.[^1]

[^1]: I'm going to assume offline migrations for simplicity, but these migrations can be achieved in a zero-downtime environment, too. You would create both places for the data to reside, deploy a version of the application that reads from both, deploy a version that writes to both, run a backfill migration (like in the example), deploy a version which reads and writes only the new place, and then drop the old place. Fun!

CREATE TABLE Confirmations (
  username VARCHAR(255) REFERENCES Users (username) NOT NULL
);

INSERT INTO Confirmations
SELECT username
FROM Users
WHERE confirmed = true;

ALTER TABLE Users
DROP COLUMN confirmed;

Confirming a user is now an INSERT instead of an UPDATE:

INSERT INTO Confirmations VALUES ('kmarekspartz');

The confirmed flag can then be recovered at query time with an outer join:

SELECT Users.*, (Confirmations.username IS NOT NULL) AS confirmed
FROM Users
LEFT OUTER JOIN Confirmations
ON Users.username = Confirmations.username;

Applying this normalization to the rest of the schema would lead to a new schema:

CREATE TABLE Users (
  username VARCHAR(255) PRIMARY KEY,
  salt VARCHAR(255) NOT NULL,
  hashed_password VARCHAR(255) NOT NULL
);

CREATE TABLE Confirmations (
  username VARCHAR(255) REFERENCES Users (username) NOT NULL
);

CREATE TABLE Emails (
  email VARCHAR(255) PRIMARY KEY
);

CREATE TABLE UserEmails (
  username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  email VARCHAR(255) REFERENCES Emails (email) NOT NULL
) PRIMARY KEY (username, email);

CREATE TABLE Phones (
  phone VARCHAR(255) PRIMARY KEY
);

CREATE TABLE UserPhones (
  username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  phone VARCHAR(255) REFERENCES Phones (phone) NOT NULL
) PRIMARY KEY (username, phone);

CREATE TABLE Locations (
  location VARCHAR(255) PRIMARY KEY
);

CREATE TABLE UserLocations (
  user_location_id INTEGER PRIMARY KEY AUTOINCREMENT,
  username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  location VARCHAR(255) REFERENCES Locations (location) NOT NULL
);

CREATE TABLE UserLocationDeletions (
  user_location_id INTEGER REFERENCES UserLocations (user_location_id) NOT NULL
);

CREATE TABLE Statuses (
  content VARCHAR(140) NOT NULL,
  created_at DATE DEFAULT ( SYSDATE ) NOT NULL,
  username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  PRIMARY KEY (content, created_at, username)
);

CREATE TABLE Follows (
  follow_id INTEGER PRIMARY KEY AUTOINCREMENT,
  follower_username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  followee_username VARCHAR(255) REFERENCES Users (username) NOT NULL
);

CREATE TABLE FollowDeletions (
  follow_id INTEGER REFERENCES Follows (follow_id) NOT NULL
);

I've turned most properties into many-to-many relationships, and added deletion tables for the join tables. This is because many-to-many relationships are easier to merge than one-to-one: with a many-to-many, you can use UNION as your merge strategy, while with one-to-one there's no deterministic way to choose a winner. This will result in temporary inconsistencies, but if you have each user interact with a particular host or shard, you can minimize them. In a mobile environment, the user interacts with their phone, and we can guarantee that their phone is in a consistent state at any given time. In a server environment, we can route each user's interactions to a particular host, failing over to an in-sync replica if that host is down.
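
For example, merging the Follows relationships from two replicas could be sketched like this (shard_a and shard_b are hypothetical schema names; UNION de-duplicates the combined rows):

SELECT follower_username, followee_username
FROM shard_a.Follows
UNION
SELECT follower_username, followee_username
FROM shard_b.Follows;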

One thing that doesn't work well as a many-to-many relationship is passwords. I left them one-to-one, but passwords aren't strictly needed, particularly when an email address or a phone number is available. Instead of asking a user for a password, we can send them a link containing a token to sign in. This is one-factor authentication, but it uses what is commonly the second factor in two-factor authentication. Removing passwords from this schema would make eventual consistency possible.
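
As a sketch of what that could look like in this style (these tables are my own illustration, not part of the schema above), sign-in links become insert-only rows, with uses recorded in a second table rather than as updates:

CREATE TABLE SignInTokens (
  token VARCHAR(255) PRIMARY KEY,
  username VARCHAR(255) REFERENCES Users (username) NOT NULL,
  created_at DATE DEFAULT ( SYSDATE ) NOT NULL
);

CREATE TABLE SignInTokenUses (
  token VARCHAR(255) REFERENCES SignInTokens (token) NOT NULL
);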


I like to call this method of normalization 'log normalization' but that gets confusing. It is too bad we have unique constraints on technical terms. If there's a better name for this, let me know!

Using a log-structured schema, we can merge SQL databases to achieve eventual consistency.


Eventual consistency is typically thought of as a property of certain NoSQL databases. However, SQL databases can achieve eventual consistency with adequate planning.

Eventual consistency is particularly useful for horizontally scaling web services, as each node does not need the full, up-to-date picture. Another use case is synchronizing mobile applications with intermittent network connectivity. In both cases, partitioning or sharding allows each user or client to maintain a consistent view.

A data structure is considered eventually consistent if two instances reach the same state given the same unordered set of state changes. Two databases are eventually consistent if all actions taken in each propagate to the other, so that both converge on the same state.

Strong eventual consistency can be achieved if state changes are commutative, associative, and idempotent.

State changes are commutative if swapping the order of two operations yields the same state. For example, these two operations can be applied in either order to yield a logically equivalent table:

INSERT INTO some_table (some_column) VALUES (1);
INSERT INTO some_table (some_column) VALUES (2);

Similarly, state changes are associative if grouping three operations either way yields the same state. For example, three databases, each with one of the following operations, can be merged as (1 + 2) + 3 or as 1 + (2 + 3) to yield a logically equivalent table:

INSERT INTO some_table (some_column) VALUES (1);
INSERT INTO some_table (some_column) VALUES (2);
INSERT INTO some_table (some_column) VALUES (3);

Not all SQL DML is commutative or associative:

UPDATE some_table SET a = 1;
UPDATE some_table SET a = 2;
UPDATE some_table SET a = 3;

Depending on the order of these operations, the resulting database would be in different states.

DELETEs also cause problems. Given two databases, where one has run a deletion and one has not, merging them would re-introduce the deleted row into the database that had deleted it:

DELETE FROM some_table WHERE some_column = 4;

To avoid needing other synchronization methods, we can limit our DML to SELECTs and INSERTs. Avoiding UPDATEs and DELETEs gives us immutability, which guarantees a row will neither change nor disappear from our databases.

By staying INSERT-only, we treat our database as a persistent data structure, or an append-only commit log.

Immutability and persistence help with eventual consistency, but they are not sufficient. We also need idempotence. State changes are idempotent if repeatedly applying the same state change yields the same state. Our previous example, INSERT INTO some_table (some_column) VALUES (1), is not idempotent unless we de-duplicate rows at query time:

SELECT DISTINCT some_column
FROM some_table;

Another way to achieve idempotence is to use a unique constraint to prevent duplicates from being inserted, ignoring unique constraint violations upon insertion. We can also manually check for duplicates during an insert:

INSERT INTO some_table (some_column)
SELECT 4
WHERE NOT EXISTS (
  SELECT 1
  FROM some_table
  WHERE some_column = 4
);
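
For the unique-constraint approach, some dialects can ignore the violation directly. Here is a sketch assuming PostgreSQL's ON CONFLICT clause and a unique constraint on some_column:

INSERT INTO some_table (some_column)
VALUES (4)
ON CONFLICT (some_column) DO NOTHING;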

How can we delete an item from our database without using SQL DELETE? Tombstones provide our answer. Instead of DELETEing the row, we can create a marker in another table that says an object is deleted.

INSERT INTO some_table_deletions (some_table_id)
VALUES (123456);

Queries for undeleted rows then become:

SELECT some_table.*
FROM some_table
WHERE id NOT IN (
  SELECT some_table_id
  FROM some_table_deletions
);

If we want to support deletion and re-adding of the same value, we cannot use a unique constraint for idempotence since the unique constraint would prevent the addition of the new row. Instead, either de-duplicate at query time using DISTINCT or prevent duplicates from getting inserted without using a unique constraint:

INSERT INTO some_table (some_column)
SELECT 4
WHERE NOT EXISTS (
  SELECT 1
  FROM some_table
  WHERE some_column = 4
  AND id NOT IN (
    SELECT some_table_id
    FROM some_table_deletions
  )
);

At this point it looks like we'll need to normalize our database, or structure our data in a particular way in order to satisfy these properties. In my next blog post, I'll demonstrate how to normalize a database to achieve eventual consistency.

An exercise from SICP

A lisp can be implemented using a small set of primitives from which other lisp features can be derived. The set is not fixed; different lisps use different primitives. cons, car, and cdr are often primitives, but they do not need to be.

Primitive cons, car, and cdr

Typical implementations of cons, car, and cdr use an underlying array representation in their host environment. In a JavaScript host environment, this could look like:

var cons = function (head, tail) {
  return [head, tail];
};

var car = function (pair) {
  return pair[0];
};

var cdr = function (pair) {
  return pair[1];
};

Derived cons, car, and cdr

To implement cons, car, and cdr as derived features, the lisp should only need function definitions and lambdas as features (either derived or primitive).

(define (cons head tail)
  (lambda (f) (f head tail)))

(define (car pair)
  (pair (lambda (head tail) head)))

(define (cdr pair)
  (pair (lambda (head tail) tail)))

This could compile down into:

var cons = function (head, tail) {
  return function (f) {
    return f(head, tail);
  };
};

var car = function (pair) {
  return pair(
    function (head, tail) { return head; }
  );
};

var cdr = function (pair) {
  return pair(
    function (head, tail) { return tail; }
  );
};
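
As a quick sanity check (my own addition), both the array-backed and the closure-backed representations satisfy the same equations:

console.log(car(cons(1, 2))); // 1
console.log(cdr(cons(1, 2))); // 2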

See also: SICP Exercise 2.4

Making routine simpler.


Review, reflection, and refinement should be part of your routine. I'm a loose follower of GTD and a heavy user of Things (though I don't always use it in the way they suggest), but I've struggled to make reviews part of my routine. I made these checklists to lower the barrier to entry. In addition, I've created Beeminder goals to keep me on track.

Unlike most blog posts, this is a living document. I expect it to change. When it does, I plan to note the changes at the bottom.

Daily Review

What do I want to get done today?
Check calendar.
Review and archive all emails in inboxes, creating next actions as needed.
Review next actions in Things, moving to today as needed.
Review due today actions in Outlook.
Choose one important thing to get done first tomorrow.

Weekly Review

What went well last week?
What didn't go well last week?
What do I want to get done this week?
Review someday actions in Things, moving to next as needed.
Review unscheduled actions in Outlook, scheduling as needed.
Do I have appointments to make?

Monthly Review

What went well last month?
What didn't go well last month?
What do I want to get done this month?
What events are this month?
What projects are my focus this month?
Are all recurring items being minded?

Annual Review

What went well last year?
What didn't go well last year?
What do I want to get done this year?
Where do I want to be next year, and how do I get there?
What are my priorities for the next year?

Updates:

  1. Added “What went well” and “didn't” checkpoints. Added backlog reviews.
  2. Added calendar check. Changed email process from keeping actionable to creating actions.

Never give in to impostor syndrome


This week I'll be participating in the Your Turn Challenge. Participants have promised each other to make at least one blog post each day this week. The challenge combats impostor syndrome and gets a routine going.

Impostor Syndrome

Impostor syndrome is the fear of claiming to be something you are not. Most often I see people struggle with it in programming, but I've personally felt it when becoming a business owner and when writing.

“I'm terrible at it.”

I've noticed that people with impostor syndrome often have high standards. They've reached a point where they can tell good work from bad, and their standards cause them to notice issues in their own work that others would gloss over.

These standards are often a reflection of the breadth of their experience. They would not have these standards if they were true impostors.

“What if I fail?”

We hear about others' successes, but usually not about the failures that led to them. Even when we fail, we learn; sometimes the best lessons come from failure. Embrace failure and jump in.

“What if I say the wrong thing?”

With many things in life, you have to claim it before growth can happen. This is the “fake it until you make it” strategy. If you are waiting to be qualified, you'll have missed your opportunity. Qualification comes through experience; your qualifications will catch up.

Habit, routine, ritual

We improve when we iterate and reflect. Each program, business opportunity, and blog post is an opportunity to start again. Get in a habit of reflecting on lessons learned in previous iterations. Build a routine of trying something new. Make starting a ritual.

To appease Google, I inlined CSS in my Hakyll pages.


Google wasn't happy with how many render-blocking resources I was loading from pages on this site. It suggested that I inline render-blocking CSS.

I'm not sold that this is ideal. Previously, if a client requested a resource it already had, it wouldn't need to re-download that resource. Since the CSS is shared across pages, it could be cached, so each request would only fetch the unique content of that page. After the initial fetch of all the resources, the amount of information exchanged is minimal.

With the CSS inline, this content cannot be cached across pages, and is therefore downloaded for each page. While fewer requests are needed to get a single page, redundant information is included in each response.

Anyway, I thought I'd try it out. Hakyll provides a CSS compiler, which compresses CSS by removing whitespace, comments, and redundant punctuation. It also provides a template compiler, which allows variable substitution and nesting of templates. However, it doesn't provide a way to compress CSS inside a template. I took the definition of its template compiler and its CSS compiler and combined them:

cssTemplateCompiler :: Compiler (Item Template)
cssTemplateCompiler = cached "Hakyll.Web.Template.cssTemplateCompiler" $
    fmap (readTemplate . compressCss) <$> getResourceString

I made a few other changes in order to make this work. I needed to combine my CSS files, and then include the combined CSS as a partial template in my default template.
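
For reference, the wiring might look something like this sketch (the file names are mine, and it assumes Hakyll's $partial()$ template syntax): compile the combined stylesheet with the new compiler, then include it from the default template.

-- In the site rules: compile the combined CSS as a compressed template
match "css/all.css" $ compile cssTemplateCompiler

-- In templates/default.html, include it inline:
-- <style>$partial("css/all.css")$</style>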

I did have one CSS file that I couldn't inline like this, since it specifies additional resources to fetch (fonts), but it is hosted by Google, so hopefully they won't hold it against me. Google suggested creating the link to this file after rendering, using JavaScript, so that's in my changes, too.
