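DHCP (Dynamic Host Configuration Protocol) is typically used in large LAN environments. Its main job is to manage and assign IP addresses centrally, so that hosts on the network obtain their IP address, gateway address, DNS server address and similar settings dynamically, and to improve address utilization.
On Android, this work is usually handled by the dnsmasq process.
A DHCP exchange normally consists of a four-way handshake: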

The flow is illustrated below:
[Figure: DHCP handshake flow diagram]

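Step by step:
With the STA already associated with the AP, the STA, acting as the client, first broadcasts a DHCPDISCOVER to look for a server on the LAN that can assign IP addresses (that is, a DHCP server).
Every DHCP server on the LAN that receives the DHCPDISCOVER (in principle a single LAN can host several DHCP servers) picks an IP address it considers available and sends it back to the client in a DHCPOFFER message.
After receiving a DHCPOFFER (if several arrive, the client usually answers the earliest one), the client broadcasts a DHCPREQUEST to confirm the IP address with the chosen DHCP server; the same broadcast tells the other DHCP servers to abandon their handshakes.
The chosen DHCP server confirms and replies with a DHCPACK, telling the client that it may use the IP address.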
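To make the first step of the handshake concrete, here is a minimal sketch of what a DHCPDISCOVER looks like on the wire, following the BOOTP/DHCP layout from RFC 2131 and RFC 2132. The helper name build_discover() is hypothetical and not taken from any client implementation; the packet carries only the message-type option, whereas real clients add more options and padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DHCP message types carried in option 53 (RFC 2132). */
enum { DHCPDISCOVER = 1, DHCPOFFER = 2, DHCPREQUEST = 3, DHCPACK = 5 };

#define DHCP_FIXED_LEN 236   /* op .. file: the fixed BOOTP header */
#define DHCP_CHADDR_OFF 28   /* offset of the client hardware address */

/* Hypothetical helper: write a minimal DHCPDISCOVER into buf.
   Returns the number of bytes written, or 0 if buf is too small.
   The caller would send the result as UDP from port 68 to
   255.255.255.255 port 67. */
size_t build_discover(uint8_t *buf, size_t cap, uint32_t xid,
                      const uint8_t mac[6])
{
    static const uint8_t magic[4] = { 99, 130, 83, 99 }; /* DHCP magic cookie */
    const size_t len = DHCP_FIXED_LEN + sizeof magic + 3 + 1;

    assert(buf != NULL && mac != NULL);
    if (cap < len)
        return 0;

    memset(buf, 0, len);
    buf[0] = 1;                          /* op: BOOTREQUEST */
    buf[1] = 1;                          /* htype: Ethernet */
    buf[2] = 6;                          /* hlen: MAC address length */
    buf[4] = (uint8_t)(xid >> 24);       /* xid, network byte order */
    buf[5] = (uint8_t)(xid >> 16);
    buf[6] = (uint8_t)(xid >> 8);
    buf[7] = (uint8_t)xid;
    memcpy(buf + DHCP_CHADDR_OFF, mac, 6);
    memcpy(buf + DHCP_FIXED_LEN, magic, sizeof magic);

    buf[240] = 53;                       /* option 53: DHCP message type */
    buf[241] = 1;                        /* option length */
    buf[242] = DHCPDISCOVER;
    buf[243] = 255;                      /* end option */
    return len;
}
```

A DHCPREQUEST uses the same header with option 53 set to DHCPREQUEST, plus the requested address and the identifier of the chosen server.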
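A few questions answered:
Why is a four-way handshake needed, rather than just two messages?
Because several DHCP servers on the LAN may answer the DHCPDISCOVER, the client has to pick one offer and tell every server which one it picked before the address is committed.
Why are DHCPDISCOVER and DHCPREQUEST sent as broadcasts?
For the same reason. The client does not yet know any server's address when it sends DHCPDISCOVER, and broadcasting reaches every DHCP server on the LAN; broadcasting DHCPREQUEST lets every server that made an offer learn the client's choice.
How does a DHCP server pick "an IP address it considers available"?
Taking dnsmasq as an example, it uses a table lookup plus an ARP-based check (covered in the ARP discussion below).
Since DHCPREQUEST is a broadcast, how does each DHCP server on the LAN know whether it is the chosen server?
By the server identifier option (option 54) in the DHCPREQUEST, which carries the IP address of the server whose offer the client accepted. The transaction ID (xid) only ties the messages of one exchange together; every offer echoes the same xid, so it cannot tell servers apart.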
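As an illustration of how a server can decide whether a broadcast DHCPREQUEST is meant for it, the sketch below scans the options area for the server identifier option (option 54, RFC 2132) and compares it with the server's own address. The names find_option() and request_selects_me() are hypothetical, and the sketch ignores option overloading (option 52) for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPT_PAD        0
#define OPT_SERVER_ID 54
#define OPT_END      255

/* Hypothetical helper: find option `code` in the options area (the bytes
   after the magic cookie). Returns a pointer to the option value and
   stores its length in *out_len, or returns NULL if absent or truncated. */
const uint8_t *find_option(const uint8_t *opts, size_t len, uint8_t code,
                           uint8_t *out_len)
{
    size_t i = 0;

    assert(opts != NULL && out_len != NULL);
    while (i < len) {
        uint8_t c = opts[i];
        if (c == OPT_END)
            break;
        if (c == OPT_PAD) {              /* pad has no length byte */
            i++;
            continue;
        }
        if (i + 1 >= len)                /* length byte missing */
            break;
        uint8_t l = opts[i + 1];
        if (i + 2 + (size_t)l > len)     /* value runs past the buffer */
            break;
        if (c == code) {
            *out_len = l;
            return opts + i + 2;
        }
        i += 2 + (size_t)l;
    }
    return NULL;
}

/* Returns 1 if the request names my_ip as the chosen server, 0 if it
   names another server, and -1 if no usable server identifier is present
   (for example when a client renews an existing lease). */
int request_selects_me(const uint8_t *opts, size_t len,
                       const uint8_t my_ip[4])
{
    uint8_t l = 0;
    const uint8_t *v = find_option(opts, len, OPT_SERVER_ID, &l);

    if (v == NULL || l != 4)
        return -1;
    return memcmp(v, my_ip, 4) == 0;
}
```

A server that gets 0 back can treat the request as a decline of its own offer and release the address it had reserved.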
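As mentioned above, dnsmasq relies on a table lookup plus an ARP-based check to make sure that the IP address it sends in a DHCPOFFER is actually available.
The table lookup is straightforward: dnsmasq necessarily keeps an internal record of the IP addresses it has handed out.
So why does ARP come into it as well, and what is ARP? A brief introduction:
The Address Resolution Protocol (ARP) is the TCP/IP protocol that obtains the physical (MAC) address corresponding to a given IP address.
dnsmasq exploits this in reverse: if an ARP query on the LAN yields no MAC address, then nobody is using that IP address.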
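For readers who have not looked at ARP on the wire, the following sketch builds the 28-byte ARP request payload for IPv4 over Ethernet as defined in RFC 826, and checks whether a received payload is a reply from a given IP address. Both function names are hypothetical; in the dnsmasq case these frames are generated by the kernel, not by dnsmasq itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARP_LEN 28   /* ARP payload size for Ethernet + IPv4 */

/* Hypothetical helper: fill `out` with an ARP request asking
   "who has target_ip? tell sender_ip". Returns the payload length. */
size_t build_arp_request(uint8_t out[ARP_LEN], const uint8_t sender_mac[6],
                         const uint8_t sender_ip[4],
                         const uint8_t target_ip[4])
{
    assert(out != NULL && sender_mac != NULL);
    assert(sender_ip != NULL && target_ip != NULL);

    out[0] = 0x00; out[1] = 0x01;        /* hardware type: Ethernet */
    out[2] = 0x08; out[3] = 0x00;        /* protocol type: IPv4 */
    out[4] = 6;                          /* hardware address length */
    out[5] = 4;                          /* protocol address length */
    out[6] = 0x00; out[7] = 0x01;        /* operation: request */
    memcpy(out + 8, sender_mac, 6);      /* sender hardware address */
    memcpy(out + 14, sender_ip, 4);      /* sender protocol address */
    memset(out + 18, 0, 6);              /* target MAC: unknown, zeroed */
    memcpy(out + 24, target_ip, 4);      /* target protocol address */
    return ARP_LEN;
}

/* Returns 1 if pkt is an ARP reply whose sender IP equals ip, else 0.
   A reply means some host on the LAN already owns that address. */
int is_arp_reply_from(const uint8_t *pkt, size_t len, const uint8_t ip[4])
{
    assert(pkt != NULL && ip != NULL);
    return len >= ARP_LEN && pkt[6] == 0 && pkt[7] == 2 &&
           memcmp(pkt + 14, ip, 4) == 0;
}
```

If no such reply arrives for the probed address, the prober has no evidence that anyone on the LAN holds it.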
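Taking the same association as above and filtering the capture for DHCP and ARP packets gives the following:
[Figure: packet capture filtered for DHCP and ARP]
The dnsmasq code shows that before settling on the IP address to put in the DHCPOFFER, dnsmasq spends 3 s trying to ping that address. To send the ping, the kernel has to look up the MAC address of the target (see the __ipv4_neigh_lookup_noref() function), and that lookup is what puts the ARP request on the wire: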
int address_allocate(struct dhcp_context *context,
                     struct in_addr *addrp, unsigned char *hwaddr, int hw_len,
                     struct dhcp_netid *netids, time_t now)
{
  /* Find a free address: exclude anything in use and anything allocated to
     a particular hwaddr/clientid/hostname in our configuration.
     Try to return from contexts which match netids first. */
  struct in_addr start, addr;
  struct dhcp_context *c, *d;
  int i, pass;
  unsigned int j;

  /* hash hwaddr */
  for (j = 0, i = 0; i < hw_len; i++)
    j += hwaddr[i] + (hwaddr[i] << 8) + (hwaddr[i] << 16);

  for (pass = 0; pass <= 1; pass++)
    for (c = context; c; c = c->current)
      if (c->flags & CONTEXT_STATIC)
        continue;
      else if (!match_netid(c->filter, netids, pass))
        continue;
      else
        {
          /* pick a seed based on hwaddr then iterate until we find a free address. */
          start.s_addr = addr.s_addr =
            htonl(ntohl(c->start.s_addr) +
                  ((j + c->addr_epoch) % (1 + ntohl(c->end.s_addr) - ntohl(c->start.s_addr))));

          do {
            /* eliminate addresses in use by the server. */
            for (d = context; d; d = d->current)
              if (addr.s_addr == d->router.s_addr)
                break;

            /* Addresses which end in .255 and .0 are broken in Windows even when using
               supernetting. ie dhcp-range=192.168.0.1,192.168.1.254,255,255,254.0
               then 192.168.0.255 is a valid IP address, but not for Windows as it's
               in the class C range. See KB281579. We therefore don't allocate these
               addresses to avoid hard-to-diagnose problems. Thanks Bill. */
            if (!d &&
                !lease_find_by_addr(addr) &&
                !config_find_by_address(daemon->dhcp_conf, addr) &&
                (!IN_CLASSC(ntohl(addr.s_addr)) ||
                 ((ntohl(addr.s_addr) & 0xff) != 0xff && ((ntohl(addr.s_addr) & 0xff) != 0x0))))
              {
                struct ping_result *r, *victim = NULL;
                int count, max = (int)(0.6 * (((float)PING_CACHE_TIME)/
                                              ((float)PING_WAIT)));

                *addrp = addr;

                if (daemon->options & OPT_NO_PING)
                  return 1;

                /* check if we failed to ping addr sometime in the last
                   PING_CACHE_TIME seconds. If so, assume the same situation still exists.
                   This avoids problems when a stupid client bangs
                   on us repeatedly. As a final check, if we did more
                   than 60% of the possible ping checks in the last
                   PING_CACHE_TIME, we are in high-load mode, so don't do any more. */
                for (count = 0, r = daemon->ping_results; r; r = r->next)
                  if (difftime(now, r->time) > (float)PING_CACHE_TIME)
                    victim = r; /* old record */
                  else if (++count == max || r->addr.s_addr == addr.s_addr)
                    return 1;

                if (icmp_ping(addr))
                  /* address in use: perturb address selection so that we are
                     less likely to try this address again. */
                  c->addr_epoch++;
                else
                  {
                    /* at this point victim may hold an expired record */
                    if (!victim)
                      {
                        if ((victim = whine_malloc(sizeof(struct ping_result))))
                          {
                            victim->next = daemon->ping_results;
                            daemon->ping_results = victim;
                          }
                      }

                    /* record that this address is OK for 30s
                       without more ping checks */
                    if (victim)
                      {
                        victim->addr = addr;
                        victim->time = now;
                      }
                    return 1;
                  }
              }

            addr.s_addr = htonl(ntohl(addr.s_addr) + 1);

            if (addr.s_addr == htonl(ntohl(c->end.s_addr) + 1))
              addr = c->start;

          } while (addr.s_addr != start.s_addr);
        }
  return 0;
}
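Because nobody answered the ARP request within 3 s, the ping timed out as well. dnsmasq therefore concluded that no host on the LAN was using the IP address and that it could be offered to the client.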