CAKE has a number of basic types that repeatedly show up in its messages. Here is some detail on what those types look like and what they're for. The number of types is kept purposely small to simplify software designed to encode and decode CAKE messages.

These types have a canonical binary representation, but this does not preclude their representation in other formats, though the exact details of such representations are not described in this document. All implementations must understand the binary representation.

The reason I've chosen a binary representation as canonical is that cryptography requires that messages in transit retain their precise forms. Changes in whitespace or character encoding defeat digital signature and message authentication algorithms that are designed to ensure message integrity.

Brief, explanatory descriptions

count

A count is a non-negative integer of limited size. The limit on the size of a count is quite large, and should be large enough that you could conceivably use it to count all of the subatomic particles in the visible universe. The limited size is acceptable because a count is supposed to be a count of something that exists. It is not meant to represent any arbitrary number.

See "Type: count" under "Detailed descriptions" for the octet-by-octet description.

key name

A key name is a fixed length string of octets holding the unique identifier for a public key. It is the secure hash of a serialized representation of the key that includes the type (RSA, ECRSA, etc…) of the key.

See "Type: key name" under "Detailed descriptions" for the octet-by-octet description.

variable length string

This is actually a compound type. Since it's very common, it's listed here. It consists of a count followed by that many bytes of data. This structure is used for almost every variable length data component in CAKE.

See "Type: variable length string" under "Detailed descriptions" for the octet-by-octet description.

Detailed descriptions

Type: count

A count is a non-negative integer of limited size. The limit on the size of a count is quite large, and should be large enough that you could conceivably use it to count all of the subatomic particles in the visible universe. The limited size is acceptable because a count is supposed to be a count of something that exists. It is not meant to represent any arbitrary number.

Counts are encoded rather strangely to minimize their length. There are three main ways of encoding a count, and each is good for a particular range of count values.

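The three encodings are listed below by the value of the first octet, the range of count values covered, and an explanation.

First octet: 0 -> 222. Value range: 0 -> 222.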

A single octet count for very small values. The value of the octet is the same as the value of the count.

First octet: 223 -> 254. Value range: 223 -> 8414.

A double octet count for medium sized values. This is the most complex encoding. Here is some pseudocode showing how the octets (in count_octets) produce the count value (in count_value):

upper_5bits <- count_octets[0] - 223
lower_8bits <- count_octets[1]
count_value <- (upper_5bits * 256) + lower_8bits + 223

From 223 to 254, there are 32 values, which is 5 bits of information. The second octet may have any value, which is 8 bits of information. 223 is added because a count value less than 223 could have been encoded as a single octet count. The largest value this form can hold is therefore 31 * 256 + 255 + 223 = 8414.
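As a rough illustration, the pseudocode above translates to Python as follows. The function names `decode_two_octet` and `encode_two_octet` are made up for this sketch and are not part of CAKE.

```python
def decode_two_octet(count_octets: bytes) -> int:
    """Decode a double octet count (first octet in 223..254)."""
    if len(count_octets) != 2 or not 223 <= count_octets[0] <= 254:
        raise ValueError("not a double octet count")
    upper_5bits = count_octets[0] - 223
    lower_8bits = count_octets[1]
    return upper_5bits * 256 + lower_8bits + 223


def encode_two_octet(count_value: int) -> bytes:
    """Encode a value in 223..8414 as a double octet count."""
    if not 223 <= count_value <= 8414:
        raise ValueError("value does not fit the double octet form")
    # Undo the +223 offset, then split into the 5 bit and 8 bit parts.
    upper_5bits, lower_8bits = divmod(count_value - 223, 256)
    return bytes([upper_5bits + 223, lower_8bits])
```

For instance, `encode_two_octet(479)` yields the octets `e0 00`, and decoding them gives 479 back.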

First octet: 255. Value range: 0 -> 2^4080 - 1.

This count consists of a variable number of octets. The first octet is the marker value, 255. The second is half the number of octets to follow, so at most 2 * 255 = 510 octets (4080 bits) can follow. A value of 0 for the second octet is not allowed, though it may be used in some special cases where a count also needs to hold some non-count flag value. Such uses should be minimal, and will be flagged prominently in documentation when they occur.

The string of octets that follows may have leading 0s. The octets are interpreted in big-endian order.

This representation is rather simple. If you want to write as simple an implementation as you can, you can simply encode all values this way, even though you will often be wasting a lot of space. Your implementation must, of course, be able to read all the representations.
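Here is a sketch of what an encoder might look like in Python. `encode_count_long` is the "as simple as you can" approach that always uses the 255 marker form, and `encode_count` picks the shortest of the three forms. Both names are invented for this example, and preferring the shortest form is my assumption about what "minimize their length" implies, not a stated requirement.

```python
def encode_count_long(value: int) -> bytes:
    """Encode any count using only the 255 marker form."""
    if value < 0:
        raise ValueError("a count cannot be negative")
    n_octets = max(1, (value.bit_length() + 7) // 8)
    if n_octets % 2:
        # The length octet stores half the payload length, so the
        # payload must be an even number of octets; pad with a leading 0.
        n_octets += 1
    if n_octets > 510:
        raise ValueError("value too large for a count")
    return bytes([255, n_octets // 2]) + value.to_bytes(n_octets, "big")


def encode_count(value: int) -> bytes:
    """Encode a count in the shortest of the three forms."""
    if 0 <= value <= 222:
        return bytes([value])
    if 223 <= value <= 8414:
        upper_5bits, lower_8bits = divmod(value - 223, 256)
        return bytes([223 + upper_5bits, lower_8bits])
    return encode_count_long(value)
```

With this sketch, 8414 becomes `fe ff` through `encode_count` and `ff 01 20 de` through `encode_count_long`. Both are valid encodings of the same count.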

Example counts

Here are a few octet strings in hex representing counts, and the corresponding count value in decimal.

Hex string           Count value
00                   0
ff 01 00 00          0
a3                   163
de                   222
df 00                223
e0 00                479 (479 - 223 = 256)
fe ff                8414
ff 01 20 de          8414
ff 01 01 00          256
ff 02 ff ff ff ff    4294967295 (2^32 - 1)
ff 02 00 00 00 01    1
ff 02 00 00 01       Illegal value
ff 00                Illegal value
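The examples above can be checked against a small decoder. This is only a sketch: the name `decode_count`, the `offset` parameter, and the choice to return the number of octets consumed are my own conventions for the example. It rejects a length octet of 0, so any special flag uses of that value would need separate handling.

```python
def decode_count(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (count value, octets consumed) for the count at data[offset]."""
    if offset >= len(data):
        raise ValueError("no octets to decode")
    first = data[offset]
    if first <= 222:
        # Single octet count: the octet is the value.
        return first, 1
    if first <= 254:
        # Double octet count.
        if offset + 2 > len(data):
            raise ValueError("truncated double octet count")
        return (first - 223) * 256 + data[offset + 1] + 223, 2
    # Marker 255: the next octet is half the number of octets that follow.
    if offset + 2 > len(data):
        raise ValueError("missing length octet")
    half_length = data[offset + 1]
    if half_length == 0:
        raise ValueError("a length octet of 0 is not a valid count")
    end = offset + 2 + 2 * half_length
    if end > len(data):
        raise ValueError("truncated count")
    # Leading 0 octets are allowed; the value is big-endian.
    return int.from_bytes(data[offset + 2:end], "big"), end - offset
```

Under this sketch, `ff 02 00 00 01` is rejected because only three of the four promised octets are present, and `ff 00` is rejected because of its 0 length octet.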

Type: key name

A key name is very simple. It is 32 octets of data. The 32 octets are generated by using SHAD-256 (the difference between SHAD-256 and SHA-256 is described elsewhere) on the canonical representation of the public key data for that key. It is also known as a key id.

It's possible for two distinct keys to have the same name. If the SHAD-256 hash function has a fairly even distribution, a collision is so astronomically unlikely as to not even be worth thinking about. I recommend ignoring this possibility.

There is also a canonical ASCII representation of a key name. This is the representation to use when displaying it someplace where someone might actually read it. It's not very comprehensible, but it works. The canonical representation is the key name encoded in Base32 as described in RFC 3548 with trailing '=' signs removed. For a 32 octet key name this always comes to 52 characters.

Here's an example canonical ASCII representation of a key name:
2BS2C2HOG62754DFYSMTNMNVFCZA7YQXRPRXNIOF67LNBZNZAK3A
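Converting between the two representations can be sketched with Python's standard `base64` module, which implements the same Base32 alphabet. The function names are hypothetical. Computing the key name itself is not shown, since SHAD-256 and the canonical key serialization are described elsewhere.

```python
import base64


def key_name_to_ascii(key_name: bytes) -> str:
    """Return the canonical ASCII form: Base32 with trailing '=' removed."""
    if len(key_name) != 32:
        raise ValueError("a key name is exactly 32 octets")
    return base64.b32encode(key_name).decode("ascii").rstrip("=")


def key_name_from_ascii(text: str) -> bytes:
    """Parse the canonical ASCII form back into 32 octets."""
    # b32decode requires padding to a multiple of 8 characters,
    # so restore the '=' signs that the canonical form strips.
    padded = text + "=" * (-len(text) % 8)
    key_name = base64.b32decode(padded)
    if len(key_name) != 32:
        raise ValueError("a key name is exactly 32 octets")
    return key_name
```

Feeding the example string above through `key_name_from_ascii` and then `key_name_to_ascii` should return the same 52 characters.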

Type: variable length string

A variable length string is a count followed by a number of bytes equal to the count's value. These tend to occur when there is a value (like a key, or message body) that is of indeterminate length, and (unlike a count) does not encode its own length.

Note that values ending in a delimiter or tag value are not considered to encode their own length for two main reasons. First, it must be possible to know how big a buffer you will need to hold something after reading the first 1000 bytes or so of the message.

Secondly, to keep parsing simple, fast, and less prone to error, there must be no escape sequences. That is to say, there must be no value which has significance in one context, but not in another. The parser must be able to find the last byte of a message without needing a state machine while going over the message contents. Unless you limit the characters that can be in your string, no delimiter value can meet this requirement for an arbitrary string.

So either fixed length fields or prefix length encoding is used for everything in CAKE. No delimiter encoding is allowed.
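To show how prefix length encoding keeps parsing simple, here is a self-contained Python sketch that writes and reads variable length strings. All function names are invented for the example, the encoder's preference for the shortest count form is an assumption, and the decoder returns the offset of the next field so that fields can be read back to back without scanning the payload.

```python
def encode_count(value: int) -> bytes:
    """Encode a count in the shortest available form."""
    if value < 0:
        raise ValueError("a count cannot be negative")
    if value <= 222:
        return bytes([value])
    if value <= 8414:
        upper, lower = divmod(value - 223, 256)
        return bytes([223 + upper, lower])
    n_octets = (value.bit_length() + 7) // 8
    n_octets += n_octets % 2  # length octet stores half the payload length
    if n_octets > 510:
        raise ValueError("value too large for a count")
    return bytes([255, n_octets // 2]) + value.to_bytes(n_octets, "big")


def decode_count(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (count value, octets consumed)."""
    head = data[offset:offset + 2]
    if not head:
        raise ValueError("no octets to decode")
    if head[0] <= 222:
        return head[0], 1
    if len(head) < 2:
        raise ValueError("truncated count")
    if head[0] <= 254:
        return (head[0] - 223) * 256 + head[1] + 223, 2
    body = data[offset + 2:offset + 2 + 2 * head[1]]
    if head[1] == 0 or len(body) != 2 * head[1]:
        raise ValueError("illegal or truncated count")
    return int.from_bytes(body, "big"), 2 + len(body)


def encode_string(payload: bytes) -> bytes:
    """A variable length string: a count, then that many bytes."""
    return encode_count(len(payload)) + payload


def decode_string(data: bytes, offset: int = 0) -> tuple[bytes, int]:
    """Return (payload, offset of the next field)."""
    length, used = decode_count(data, offset)
    start = offset + used
    end = start + length
    if end > len(data):
        raise ValueError("truncated variable length string")
    # No byte of the payload is inspected: its end is known up front.
    return data[start:end], end
```

Because the length comes first, a reader knows how much buffer it needs before it touches the payload, and the payload may contain any byte values at all.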