Introduction

Probably everyone that has dived into the world of custom mechanical keyboards, learns pretty fast that almost any aspect of a keyboard can be customized. This includes the arrangement of letters on the keyboard, we refer to this as the keyboard layout. There are already plenty of alternative layouts (Dvorak, Colemak, etc.), but often these are based only loosely on letter frequency and hand ergonomics.

The purpose of this post is not to advertise some shiny new layout, but to give people the chance to find an optimal keyboard layout for their own needs.

While I was comparing different layouts I stumbled upon the Colemak Mod-DH layout which uses a mathematical model to compare well-known layouts. As I was reading about the model, I wondered if it is possible to find a provably optimal layout according to this particular model.

Eventhough the sheer number of possible layouts prohibits enumeration, the answer fortunately is still "YES". The model can be formulated as an integer program. These kind of mathematical optimization problems have a strong theoretical foundation and have powerful solvers available.

The Model

In this section I want to first describe the model informally and in the second part I will precisely introduce variables/constraints that fit the description.

The informal description

The model is based on a rather intuitive and simple categorization of the keys. For each key on the keyboard we define a value which is supposed to estimate how hard it is to press the corresponding key. In our model higher values correspond to harder to reach keys. What we try to achieve with this is that common letters like 'T' or 'E' will be assigned to keys that are easy to reach.

Another thing we care about is so-called 'bigrams'. The word bigrams refers to combinations of two letters, examples might be 'th', 'he', 'ou', etc. To achieve better speed and ergonomics we want to minimize finger travel, so any bigram in which we have to use the same finger for both letters will be penalized by an appropriate factor. Similarly we apply a penalty for pinky-ring-finger bigrams and ring-middle-finger bigrams.

The formal model

We will start with a list of some definitions

For each key $k \in K$ and each symbol $s \in S$ we introduce a binary variable $v_{k,s}$ that indicates if the the symbol $s$ is assigned to key $k$:

$$ \forall k \in K, s \in S : v_{k,s} \in {0,1} $$

With these variables we are finally able to formulate the constraints that define a valid layout assignment. The first constraint will be that every symbol is assigned to exactly one key.

$$ \sum_{k \in K} v_{k,s} = 1, , \forall s \in S $$

Also we need to add a constraint that limits the number of symbols to one symbol per key.

$$ \sum_{s \in S} v_{k,s} \leq 1, , \forall k \in K $$

Defining bigram variables

Now we need to consider the bigram penalties. In order to account for these bigram penalties, we introduce variables for each bigram $(s, t) \in S \times S$ and each (relevant) key combination $(k, l) \in K \times K$. In our case a relevant key combination is a combination of two keys, in which both keys correspond to the same finger (same-finger bigram), one key is assigned to a pinky and the other key is assigned to the ring finger of the same hand (pinky-ring bigram) or one key is assigned to a ring finger and the other one to a middle finger of the same hand (ring-middle bigram). For notation purposes we define the left and right pinky finger $p_l, p_r \in F$, the left and right ring finger $r_l, r_r \in F$ and the left and right middle finger $m_l, m_r \in F$.

In order to make the constraint on key combinations easier to write, we introduce the following short-hand notation:

$$ q_{f_1,f_2}(k,l) := (f(k) = f_1 \land f(l) = f_2) \lor (f(k) = f_2 \land f(l) = f_1) $$

This is expression is true if the keys $k, l \in K$ are assigned to the fingers $f_1, f_2 \in F$.

We can use this to define the following sets of key combinations:

$$ B_{\text{same}} = { (k,l) \in K^2 | \exists f \in F : q_{f,f}(k,l) } \\ B_{\text{pinky-ring}} = { (k,l) \in K^2 | q_{p_l,r_l}(k,l) \lor q_{p_r,r_r}(k,l) } \\ B_{\text{ring-middle}} = { (k,l) \in K^2 | q_{r_l,m_l}(k,l) \lor q_{r_r,m_r}(k,l) } $$

Finally we introduce binary variables for the bigrams

$$ \forall (k,l) \in B_{\text{same}}, (s,t) \in S^2 : x_{k,l,s,t} \in {0,1}\\ \forall (k,l) \in B_{\text{pinky-ring}}, (s,t) \in S^2 : y_{k,l,s,t} \in {0,1}\\ \forall (k,l) \in B_{\text{ring-middle}}, (s,t) \in S^2 : z_{k,l,s,t} \in {0,1} $$

In order to force a bigram variable to be $1$ iff the current assignment contains the bigram, we add the following constraints to the program:

$$ v_{k,s} + v_{k,t} + v_{l,s} + v_{l,t} - x_{k,l,s,t} \leq 1, , \forall (k,l) \in B_{\text{same}}, , (s,t) \in S^2 \\ v_{k,s} + v_{k,t} + v_{l,s} + v_{l,t} - y_{k,l,s,t} \leq 1, , \forall (k,l) \in B_{\text{pinky-ring}}, , (s,t) \in S^2 \\ v_{k,s} + v_{k,t} + v_{l,s} + v_{l,t} - z_{k,l,s,t} \leq 1, , \forall (k,l) \in B_{\text{ring-middle}}, , (s,t) \in S^2 \\ $$

To highlight why this works, consider the following arguments for some bigram $s, t$ and keys $k, l$. From our previous constraints we know that each symbol can only be assigned to a single key. Thus the maximum of the sum

$$ v_{k,s} + v_{k,t} + v_{l,s} + v_{l,t} $$

is 2, in fact it is 2 if and only if the symbol assignment contains the bigram $s,t$. So only in this case

$$ x_{k,l,s,t} $$

has to be equal to one. We also know that we assign a non-negative weight to this variable in the objective function. We want to minimze our objective, so it is safe to assume, that in an optimal solution the variable is only one, if the corresponding bigram exists.

The objective function

Now it is time to specify the objective function, therefore we define some parameters:

Finally we can combine the variables and parameters to define the objective

$$\begin{eqnarray} \min_{v,x,y,z} & \sum_{s \in S} \sum_{k \in K} f_s \cdot c_k \cdot v_{k,s}\\ +& \sum_{(s, t) \in S^2}\sum_{(k,l) \in B_{\text{same}}} g_{s,t}\cdot a_{|r(k) - r(l)|}\cdot x_{k,l,s,t} \\ +& \sum_{(s, t) \in S^2}\sum_{(k,l) \in B_{\text{pinky-ring}}} g_{s,t}\cdot b_{|r(k) - r(l)|}\cdot y_{k,l,s,t} \\ +& \sum_{(s, t) \in S^2}\sum_{(k,l) \in B_{\text{ring-middle}}} g_{s,t}\cdot c_{|r(k) - r(l)|}\cdot z_{k,l,s,t} \end{eqnarray}$$

Speed-Ups and Trade-Offs

There are many binary variables in the above program; with commercial solvers, solving a typical instance, usually works really well and is fast, but if you are using an open-source solver, it might make sense to apply some modifications to get good results fast.

The biggest problem are the bigrams, they introduce a variable for each bigram and two related keys. By using only the relevant bigrams you can reduce the number of variables drastically. One way to do this is to take only the most frequent bigrams that make up the major percentage of all bigrams (for example 99%).

Another improvement might be to consider symmetries in the layout. For example matrix layouts usually are symmetric between both hands, meaning that each solution has a corresponding solution that is mirrored. In cases like this it might be a good idea to assign the most frequent key to one side.

Results

I implemented the model in python using the Pyomo optimization library. My implementation can be found here.

An optimal layout for the standard values from the Mod-DH homepage:

col0 col1 col2 col3 col4 col5 col6 col7 col8 col9
x c l m j / f u , q
o s r t v p n e a i
z g w d k b h y . ;