diff options
author | Jason A. Donenfeld <Jason@zx2c4.com> | 2013-02-21 11:01:20 +0100 |
---|---|---|
committer | Jason A. Donenfeld <Jason@zx2c4.com> | 2013-02-22 16:29:11 +0100 |
commit | 112fe388d5cb90a29d6a07e98815c11b65bf801b (patch) | |
tree | b416e26632d3a3efa8ceaf7fbebcf5fb43c9a7a7 /README.md | |
download | domain-lookup-tree-master.tar.xz domain-lookup-tree-master.zip |
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 57 |
1 files changed, 57 insertions, 0 deletions
diff --git a/README.md b/README.md new file mode 100644 index 0000000..a606fc4 --- /dev/null +++ b/README.md @@ -0,0 +1,57 @@ +# Domain Lookup Tree +### by Jason A. Donenfeld (<Jason@zx2c4.com>) + +Some applications, like [dnsmasq](http://www.thekelleys.org.uk/dnsmasq/doc.html), need to find the closest-match domain name. One way is by iterating through a list of domains and finding the one that has the longest length match. But a more efficient way is to build a tree of domain name components. + +This project provides a very simple C API for a domain name matching tree. The **[source code is simple, extremely well commented, and easy to read](http://git.zx2c4.com/domain-lookup-tree/tree/domain-lookup.c)**. + +## API + +#### `struct domain_lookup_tree* init_dlt()` + +Constructs a new tree head and returns a pointer to it. + +#### `void* find_dlt(struct domain_lookup_tree *head, const char *domain)` + +Returns the prior inserted data in `head` for a given `domain`. + +#### `int insert_dlt(struct domain_lookup_tree *head, const char *domain, void *data)` + +Inserts `data` into a tree given by `head` for a given `domain`. If `domain` has length of zero or is equal to `#`, it will act as the default match for non-matching lookups. + + +## Example + + struct domain_lookup_tree *domains = init_dlt(); + + insert_dlt(domains, "zx2c4.com", "zx2c4 root node"); + insert_dlt(domains, "data.zx2c4.com", "zx2c4 data node"); + insert_dlt(domains, "yahoo.co.uk", "yahoo.co.uk node"); + insert_dlt(domains, "#", "generic node"); + + char *data = find_dlt(domains, "cheese.somewhere.zx2c4.com"); + printf("zx2c4 root node == %s\n", data); + +## Benchmarks + +With gcc 4.7.2 on an Intel Core i7-3820QM, we show a **~587x speed improvement** over dnsmasq's old algorithm: + + zx2c4@thinkpad ~/Projects/domain-lookup-tree $ make + cc -march=native -pipe -fomit-frame-pointer -flto -O3 benchmark.c domain-lookup.c domain-lookup.h -o benchmark + + zx2c4@thinkpad ~/Projects/domain-lookup-tree $ ./benchmark + [+] Populating in-memory word list from /usr/share/dict/words. + [+] Truncating in-memory word list to 300 random words. + [+] Creating random lists of 100000 domains to insert and 100000 domains to query. + [+] Populating domain lookup tree. + [+] Performing lookup benchmarks: + [*] New method took 0.34 seconds. + [*] Old method took 199.45 seconds. + [*] The new method is 586.617647 times faster than the old method. + [+] Verifying that new and old methods produced identical results: + [*] New and old methods produced the same results. + +## Limitations & Improvements + +* Currently no helper functions for removing and freeing nodes +* Instead of `strcmp`'ing, a hash function could be used |