Code that demonstrates the asynchronous infinite-loop failure:
- bool Socket::AcceptLoopbackAsync(
-     const Hosting& hosting,
-     const boost::asio::ip::tcp::acceptor& acceptor,
-     const BOOST_ASIO_MOVE_ARG(AcceptLoopbackCallback) callback) noexcept {
-     if (!acceptor.is_open()) {
-         return false;
-     }
-
-     if (!hosting || !callback) {
-         Closesocket(acceptor);
-         return false;
-     }
-
-     const AsioContext context_ = hosting->GetContext();
-     if (!context_) {
-         Closesocket(acceptor);
-         return false;
-     }
-
-     boost::asio::ip::tcp::acceptor* const acceptor_ = addressof(acceptor);
-     const Hosting hosting_ = hosting;
-     const AcceptLoopbackCallback accept_ = BOOST_ASIO_MOVE_CAST(AcceptLoopbackCallback)(constantof(callback));
-     const AsioTcpSocket socket_ = make_shared_object<boost::asio::ip::tcp::socket>(*context_);
-
-     acceptor_->async_accept(*socket_,
-         [hosting_, context_, acceptor_, accept_, socket_](const boost::system::error_code& ec) noexcept {
-             if (ec == boost::system::errc::operation_canceled) {
-                 Closesocket(*acceptor_);
-                 return;
-             }
-
-             bool success = false;
-             do { /* boost::system::errc::connection_aborted */
-                 if (ec) { /* ECONNABORTED */
-                     break;
-                 }
-
-                 int handle_ = socket_->native_handle();
-                 Socket::AdjustDefaultSocketOptional(handle_, false);
-                 Socket::SetTypeOfService(handle_);
-                 Socket::SetSignalPipeline(handle_, false);
-                 Socket::SetDontFragment(handle_, false);
-                 Socket::ReuseSocketAddress(handle_, true);
-
-                 /* Accept Socket?? */
-                 success = accept_(context_, socket_);
-             } while (0);
-             if (!success) {
-                 Closesocket(socket_);
-             }
-
-             /* Re-arms the accept immediately and unconditionally: this is where the loop closes. */
-             success = AcceptLoopbackAsync(hosting_, *acceptor_, forward0f(accept_));
-             if (!success) {
-                 Closesocket(*acceptor_);
-             }
-         });
-     return true;
- }
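Why the problem occurs:
The process's limit on open file descriptors is too small. Once the number of descriptors (fds) the process holds reaches its "Max open files" limit, accept can no longer obtain a session fd and fails (on Linux, accept(2) reports EMFILE in this case). The failure does not stop the listener, and the connection is not dropped either: the TCP three-way handshake (or the single-exchange TCP Fast Open) has already completed, so the connection stays queued.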
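The situation described above can be reproduced without boost::asio. The following is a minimal sketch, assuming Linux and a usable loopback interface; `AcceptProbe`, `Readable` and `ProbeAcceptWithoutFreeFd` are hypothetical names used only for this illustration. It lowers the soft fd limit, consumes every remaining descriptor slot, and then tries to accept a connection whose handshake has already completed.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <vector>

// Hypothetical result type, for illustration only.
struct AcceptProbe {
    bool setup_ok = false;                // listener, client and limits were set up
    int accept_errno = 0;                 // errno of the last failed accept()
    int failed_accepts = 0;               // accept() calls that failed
    int readable_polls = 0;               // polls that reported the listener readable
    bool accepted_after_release = false;  // accept() worked after fds were freed
};

// Waits up to one second for the descriptor to become readable.
static bool Readable(int fd) {
    pollfd pfd{fd, POLLIN, 0};
    return poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLIN) != 0;
}

AcceptProbe ProbeAcceptWithoutFreeFd(int attempts) {
    assert(attempts > 0);
    AcceptProbe probe;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  // port 0: the kernel picks one
    socklen_t len = sizeof(addr);
    sockaddr* sa = reinterpret_cast<sockaddr*>(&addr);

    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);
    rlimit saved{};
    bool ok = listener >= 0 && client >= 0 &&
              bind(listener, sa, len) == 0 && listen(listener, 8) == 0 &&
              getsockname(listener, sa, &len) == 0 &&
              connect(client, sa, len) == 0 &&  // handshake completes here
              getrlimit(RLIMIT_NOFILE, &saved) == 0;
    rlimit tight = saved;
    if (tight.rlim_cur > 64) tight.rlim_cur = 64;  // small limit keeps the fill loop short
    ok = ok && setrlimit(RLIMIT_NOFILE, &tight) == 0;
    if (ok) {
        probe.setup_ok = true;
        std::vector<int> fillers;  // consume every remaining descriptor slot
        for (int fd = dup(listener); fd >= 0; fd = dup(listener)) fillers.push_back(fd);

        for (int i = 0; i < attempts && Readable(listener); ++i) {
            ++probe.readable_polls;
            int fd = accept(listener, nullptr, nullptr);
            if (fd >= 0) { close(fd); break; }  // not expected while no fd is free
            probe.accept_errno = errno;
            ++probe.failed_accepts;
        }

        for (int fd : fillers) close(fd);  // free the slots and restore the limit
        setrlimit(RLIMIT_NOFILE, &saved);
        if (Readable(listener)) {
            int fd = accept(listener, nullptr, nullptr);
            probe.accepted_after_release = fd >= 0;
            if (fd >= 0) close(fd);
        }
    }
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return probe;
}
```

On a typical Linux kernel one would expect every attempt to fail with EMFILE while the listener keeps polling as readable, and the queued connection to be accepted normally once descriptors are released.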
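boost::asio drives the acceptor with epoll, and the listening socket stays ready in the level-triggered (LT) sense: as long as the queued connection has not been accepted, the accept event is reported again straight away. The async_accept callback therefore fires without limit, each callback issues a new async_accept, and the result is an infinite loop even though no real network I/O is taking place.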
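One common way to break this kind of loop is to stop re-arming the accept immediately after a resource-exhaustion error. The sketch below shows only the decision logic; it is not the fix used by the code above, and `AcceptAction`, `AcceptDecision` and `DecideAfterAcceptError` are hypothetical names. It assumes the caller passes the raw errno value (for example `ec.value()` for a system-category error) and applies the returned delay with a timer before calling async_accept again.

```cpp
#include <algorithm>
#include <cerrno>
#include <chrono>

// Hypothetical helper types, not part of the original Socket class.
enum class AcceptAction { Retry, RetryAfterDelay, Stop };

struct AcceptDecision {
    AcceptAction action;
    std::chrono::milliseconds delay;
};

// Maps an accept error to what the accept loop should do next.
// consecutive_failures counts failures since the last successful accept.
AcceptDecision DecideAfterAcceptError(int error, int consecutive_failures) {
    using std::chrono::milliseconds;
    switch (error) {
    case 0:
    case ECONNABORTED:  // the peer gave up; the next connection may be fine
    case EINTR:
        return {AcceptAction::Retry, milliseconds(0)};
    case EMFILE:   // per-process fd limit reached
    case ENFILE:   // system-wide fd limit reached
    case ENOBUFS:
    case ENOMEM: {
        // Exponential back-off: 10, 20, 40, ... ms, capped at 500 ms.
        int shift = std::min(std::max(consecutive_failures, 1) - 1, 6);
        milliseconds delay(10LL << shift);
        return {AcceptAction::RetryAfterDelay, std::min(delay, milliseconds(500))};
    }
    case ECANCELED:  // the acceptor was cancelled or closed
    case EBADF:
        return {AcceptAction::Stop, milliseconds(0)};
    default:
        return {AcceptAction::RetryAfterDelay, milliseconds(10)};
    }
}
```

The delay gives the process time to close other descriptors, so the callback no longer spins at full speed while the limit is exhausted.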
To check the "Max open files" value of a specific process, use the following command:
cat /proc/<PID>/limits ## prints output like the following
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             1178                 1178                 processes
Max open files            1000000              1000000              files
Max locked memory         67108864             67108864             bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       1178                 1178                 signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
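The same value can be read programmatically. A minimal sketch, assuming the Linux /proc text layout shown above; `OpenFileLimit`, `ParseMaxOpenFiles` and `ReadMaxOpenFiles` are hypothetical names. The values are kept as strings because a limit may be the word "unlimited".

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type, for illustration only.
struct OpenFileLimit {
    bool found = false;
    std::string soft;  // a number, or "unlimited"
    std::string hard;
};

// Scans limits-style text for the "Max open files" row.
OpenFileLimit ParseMaxOpenFiles(std::istream& in) {
    OpenFileLimit result;
    const std::string key = "Max open files";
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, key.size(), key) != 0) continue;
        // The fields after the row name are: soft limit, hard limit, units.
        std::istringstream fields(line.substr(key.size()));
        if (fields >> result.soft >> result.hard) result.found = true;
        break;
    }
    return result;
}

// pid may be a numeric process id or "self" for the calling process.
OpenFileLimit ReadMaxOpenFiles(const std::string& pid) {
    std::ifstream in("/proc/" + pid + "/limits");
    return ParseMaxOpenFiles(in);  // found stays false if the file cannot be read
}
```

For example, `ReadMaxOpenFiles("self")` inspects the calling process, and a numeric PID string inspects another process if its limits file is readable.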
Set the maximum number of file descriptors for the current terminal session with:
ulimit -n 1000000 ## default: 1024
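ulimit only affects the shell and the processes started from it. A server can also raise its own soft limit at startup through setrlimit(2). A minimal sketch, assuming POSIX; `RaiseSoftOpenFileLimit` is a hypothetical helper name. An unprivileged process cannot go above its hard limit, so the requested value is capped there.

```cpp
#include <sys/resource.h>

#include <algorithm>

// Hypothetical helper: raises the soft RLIMIT_NOFILE toward `wanted`,
// capped by the hard limit. Never lowers the current soft limit.
// On success the limits in effect are written to *result (if not null).
bool RaiseSoftOpenFileLimit(rlim_t wanted, rlimit* result) {
    rlimit limit{};
    if (getrlimit(RLIMIT_NOFILE, &limit) != 0) {
        return false;
    }

    const rlim_t target = std::min(wanted, limit.rlim_max);
    if (target > limit.rlim_cur) {
        limit.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &limit) != 0) {
            return false;
        }
    }

    if (result != nullptr) {
        *result = limit;
    }
    return true;
}
```

Calling `RaiseSoftOpenFileLimit(1000000, nullptr)` early in main mirrors the ulimit command above for the current process only.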
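Count the file descriptors a process has open (an upper bound, since lsof also prints a header line and entries such as cwd, txt and mem that are not descriptors):
lsof -Pnl +M -p <PID> | wc -l
Show details of the file descriptors a process has open:
lsof -Pnl +M -p <PID>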
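Without lsof, the same information is available from /proc/<PID>/fd, where every entry is a symbolic link named after the descriptor number. A minimal sketch, assuming Linux; `ListOpenDescriptors` is a hypothetical helper name.

```cpp
#include <dirent.h>
#include <unistd.h>

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Returns (descriptor number, link target) pairs for the given process.
// pid may be numeric or "self"; the result is empty if the directory cannot
// be read. For "self" the listing includes the descriptor that opendir()
// itself uses while scanning.
std::vector<std::pair<int, std::string>> ListOpenDescriptors(const std::string& pid) {
    std::vector<std::pair<int, std::string>> entries;
    const std::string dir_path = "/proc/" + pid + "/fd";
    DIR* dir = opendir(dir_path.c_str());
    if (dir == nullptr) {
        return entries;
    }

    while (dirent* entry = readdir(dir)) {
        const std::string name = entry->d_name;
        if (name == "." || name == "..") {
            continue;
        }

        // Each entry links to the file, socket or pipe behind the descriptor.
        char target[4096];
        const std::string link = dir_path + "/" + name;
        const ssize_t n = readlink(link.c_str(), target, sizeof(target) - 1);
        entries.emplace_back(std::atoi(name.c_str()),
            n >= 0 ? std::string(target, static_cast<std::size_t>(n)) : std::string("?"));
    }
    closedir(dir);
    return entries;
}

// Prints one line per descriptor, then the total count.
void PrintOpenDescriptors(const std::string& pid) {
    const auto entries = ListOpenDescriptors(pid);
    for (const auto& entry : entries) {
        std::printf("%d -> %s\n", entry.first, entry.second.c_str());
    }
    std::printf("total: %zu\n", entries.size());
}
```

Comparing the total with the "Max open files" limit of the same process shows how close it is to the point where accept starts failing.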