referenced:https://flapenguin.me/elf-dt-gnu-hash
Let's start with the hashing function. It can be found in bfd_elf_gnu_hash or in dl_new_hash.
#include
uint32_t gnu_hash(const uint8_t* name) {
uint32_t h = 5381;
for (; *name; name++) {
h = (h << 5) + h + *name;
}
return h;
}
gnu_hash("") == 0x00001505
gnu_hash("printf") == 0x156b2bb8
gnu_hash("exit") == 0x7c967e3f
gnu_hash("syscall") == 0xbac212a0
struct gnu_hash_table {
uint32_t nbuckets;
uint32_t symoffset;
uint32_t bloom_size;
uint32_t bloom_shift;
uint64_t bloom[bloom_size]; /* uint32_t for 32-bit binaries */
uint32_t buckets[nbuckets];
uint32_t chain[];
};
Bloom filter is used to stop the lookup for missing symbols early.
Before doing symbol lookup, take bloom[(hash / ELFCLASS_BITS) % bloom_size].
If bits hash % ELFCLASS_BITS and (hash >> bloom_shift) % ELFCLASS_BITS
are set then a symbol may or may not be in the hash table, and you should proceed with a regular lookup through buckets and chains.
But if at least one bit is not set then a symbol is certainly absent from the hash table.
DT_HASH contains an element per symbol table's element.
This leads to a waste of space
because STN_UNDEF and some other symbols are in the hash table but are never looked up.
GNU hash table allows to skip first symoffset symbols
at the beginning of the symbol table.