Skip to content

Commit

Permalink
schema_syntax: Add comments for our index format functions
Browse files Browse the repository at this point in the history
We had to devise our own scheme for writing integers in a human readable
format which also sorted correctly numerically. This might look a bit
confusing to outsiders, so here's a large comment as a peace offering.

Pair-programmed-with: Tim Beale <[email protected]>

Signed-off-by: Garming Sam <[email protected]>
Reviewed-by: Andrew Bartlett <[email protected]>
  • Loading branch information
GSam authored and abartlet committed Apr 8, 2019
1 parent c9b2a37 commit db584d5
Show file tree
Hide file tree
Showing 2 changed files with 81 additions and 1 deletion.
21 changes: 20 additions & 1 deletion lib/ldb-samba/ldif_handlers.c
Original file line number Diff line number Diff line change
Expand Up @@ -835,7 +835,22 @@ static int ldif_canonicalise_int32(struct ldb_context *ldb, void *mem_ctx,
return 0;
}

/* Lexicographically sorted representation for a 32-bit integer */
/*
* Lexicographically sorted representation for a 32-bit integer
*
* [ INT32_MIN ... -3, -2, -1 | 0 | +1, +2, +3 ... INT32_MAX ]
* n o p
*
* Refer to the comment in lib/ldb/common/attrib_handlers.c for the
* corresponding documentation for 64-bit integers.
*
* The same rules apply but use INT32_MIN and INT32_MAX.
*
* String representation padding is done to 10 characters.
*
* INT32_MAX = 2^31 - 1 = 2147483647 (10 characters long)
*
*/
static int ldif_index_format_int32(struct ldb_context *ldb,
void *mem_ctx,
const struct ldb_val *in,
Expand All @@ -852,6 +867,10 @@ static int ldif_index_format_int32(struct ldb_context *ldb,
}

if (i < 0) {
/*
* i is negative, so this is subtraction rather than
* wrap-around.
*/
prefix = 'n';
i = INT32_MAX + i + 1;
} else if (i > 0) {
Expand Down
61 changes: 61 additions & 0 deletions lib/ldb/common/attrib_handlers.c
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,63 @@ static int ldb_canonicalise_Integer(struct ldb_context *ldb, void *mem_ctx,

/*
* Lexicographically ordered format for a ldap Integer
*
* [ INT64_MIN ... -3, -2, -1 | 0 | +1, +2, +3 ... INT64_MAX ]
* n o p
*
* For human readability sake, we continue to format the key as a string
* (like the canonicalize) rather than store as a fixed binary representation.
*
* In order to sort the integers in the correct string order, there are three
* techniques we use:
*
* 1. Zero padding
* 2. Negative integer inversion
* 3. 1-byte prefixes: 'n' < 'o' < 'p'
*
* 1. To have a fixed-width representation so that 10 sorts after 2 rather than
* after 1, we zero pad, like this 4-byte width example:
*
* 0001, 0002, 0010
*
* INT64_MAX = 2^63 - 1 = 9223372036854775807 (19 characters long)
*
* Meaning we need to pad to 19 characters.
*
* 2. This works for positive integers, but negative integers will still be
* sorted backwards, for example:
*
* -9223372036854775808 ..., -0000000000000000002, -0000000000000000001
* INT64_MIN -2 -1
*
* gets sorted based on string as:
*
* -0000000000000000001, -0000000000000000002, ... -9223372036854775808
*
* In order to fix this, we invert the negative integer range, so that they
* get sorted the same way as positive numbers. INT64_MIN becomes the lowest
* possible non-negative number (zero), and -1 becomes the highest (INT64_MAX).
*
* The actual conversion applied to negative number 'x' is:
* INT64_MAX - abs(x) + 1
* (The +1 is needed because abs(INT64_MIN) is one greater than INT64_MAX)
*
* 3. Finally, we now have two different numbers that map to the same key, e.g.
* INT64_MIN maps to -0000000000000000000 and zero maps to 0000000000000000000.
* In order to avoid confusion, we give every number a prefix representing its
* sign: 'n' for negative numbers, 'o' for zero, and 'p' for positive. (Note
* that '+' and '-' weren't used because they sort the wrong way).
*
* The result is a range of key values that look like this:
*
* n0000000000000000000, ... n9223372036854775807,
* INT64_MIN -1
*
* o0000000000000000000,
* ZERO
*
* p0000000000000000001, ... p9223372036854775807
* +1 INT64_MAX
*/
static int ldb_index_format_Integer(struct ldb_context *ldb,
void *mem_ctx,
Expand All @@ -165,6 +222,10 @@ static int ldb_index_format_Integer(struct ldb_context *ldb,
}

if (i < 0) {
/*
* i is negative, so this is subtraction rather than
* wrap-around.
*/
prefix = 'n';
i = INT64_MAX + i + 1;
} else if (i > 0) {
Expand Down

0 comments on commit db584d5

Please sign in to comment.