• Parsing Apache Access Logs in Java


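    A web server log keeps a history of page requests, with each new entry typically appended to the end of a file. An entry usually records information about the request, including the client IP address, the request date/time, the page requested, the HTTP status code, the number of bytes served, the user agent, and the referrer.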
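    To make the field list concrete, here is a minimal sketch that pulls each of those fields out of one log line using named capturing groups. It assumes the Apache "combined" layout (referrer and user agent in trailing quoted strings) and that no field contains an escaped double quote. The class name `AccessLogFields` and the method `field` are hypothetical, chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original program.
class AccessLogFields {
    // Assumed layout (Apache "combined" format):
    // ip ident user [time] "request" status bytes "referrer" "user-agent"
    private static final Pattern COMBINED = Pattern.compile(
        "^(?<ip>\\S+) \\S+ \\S+ \\[(?<time>[^\\]]+)\\] \"(?<request>[^\"]*)\" "
        + "(?<status>\\d{3}) (?<bytes>\\S+) \"(?<referrer>[^\"]*)\" \"(?<agent>[^\"]*)\"$");

    // Returns the named field of one log line, or null if the line does not match.
    static String field(String line, String name) {
        Matcher m = COMBINED.matcher(line);
        return m.matches() ? m.group(name) : null;
    }

    public static void main(String[] args) {
        String line = "123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] "
            + "\"GET /pics/wpaper.gif HTTP/1.0\" 200 6248 "
            + "\"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"";
        String[] names = {"ip", "time", "request", "status", "bytes", "referrer", "agent"};
        for (String name : names) {
            System.out.println(name + " = " + field(line, name));
        }
    }
}
```

    Named groups keep the code readable when a pattern has many fields, at the cost of a slightly longer expression than numbered groups.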

    Given a set of web server log records, find, for each IP address, the total number of successful HTTP responses (status code 200).

    Example:

    Input: sample access log

    192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] "GET /abc HTTP/1.1" 404 201
    192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] "GET /favicon.ico HTTP/1.1" 200 1406
    192.168.1.2 - - [17/Sep/2013:22:18:27 -0700] "GET /wp/ HTTP/1.1" 200 5325
    192.168.1.2 - - [17/Sep/2013:22:18:27 -0700] "GET /wp/wp-content/themes/twentytwelve/style.css?ver=3.5.1 HTTP/1.1" 200 35292
    192.168.1.3 - - [17/Sep/2013:22:18:27 -0700] "GET /wp/wp-content/themes/twentytwelve/js/navigation.js?ver=1.0 HTTP/1.1" 200 863

    Output:

    192.168.1.3 1
    192.168.1.2 3

    Prerequisite: regular expressions in Java

    // Java program to count, for each IP address, the number of
    // successful HTTP responses (status code 200).
    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    class FindSuccessIpCount {
        public static void findSuccessIpCount(String record)
        {
            // Regular expression for one log record. Capturing groups:
            // 1 = client IP, 2 = identity, 3 = user, 4 = timestamp,
            // 5 = method, 6 = path, 7 = protocol, 8 = status code, 9 = bytes.
            final String regex = "^(\\S+) (\\S+) (\\S+) "
                + "\\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(\\S+)"
                + " (\\S+)\\s*(\\S+)?\\s*\" (\\d{3}) (\\S+)";
            // MULTILINE lets ^ match at the start of every line in the input.
            final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
            final Matcher matcher = pattern.matcher(record);

            // Map from IP address to its number of HTTP 200 responses.
            Map<String, Integer> countIP = new HashMap<>();
            while (matcher.find()) {
                String ip = matcher.group(1);
                int response = Integer.parseInt(matcher.group(8));

                // Count only successful (200) responses: insert 1 for a
                // new IP, otherwise add 1 to the existing count.
                if (response == 200) {
                    countIP.merge(ip, 1, Integer::sum);
                }
            }

            // Print the counts. HashMap iteration order is not guaranteed.
            for (Map.Entry<String, Integer> entry : countIP.entrySet()) {
                System.out.println(entry.getKey() + " " + entry.getValue());
            }
        }

        public static void main(String[] args)
        {
            final String log = "123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] \"GET /pics/wpaper.gif HTTP/1.0\" 200 6248 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
                + "123.123.123.123 - - [26/Apr/2000:00:23:47 -0400] \"GET /asctortf/ HTTP/1.0\" 200 8130 \"http://search.netscape.com/Computers/Data_Formats/Document/Text/RTF\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
                + "123.123.123.124 - - [26/Apr/2000:00:23:48 -0400] \"GET /pics/5star2000.gif HTTP/1.0\" 200 4005 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
                + "123.123.123.123 - - [26/Apr/2000:00:23:50 -0400] \"GET /pics/5star.gif HTTP/1.0\" 404 1031 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
                + "123.123.123.126 - - [26/Apr/2000:00:23:51 -0400] \"GET /pics/a2hlogo.jpg HTTP/1.0\" 200 4282 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
                + "123.123.123.123 - - [26/Apr/2000:00:23:51 -0400] \"GET /cgi-bin/newcount?jafsof3&width=4&font=digital&noshow HTTP/1.0\" 200 36 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n";
            findSuccessIpCount(log);
        }
    }

    Output:

    123.123.123.126 1
    123.123.123.124 1
    123.123.123.123 3
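
    The order of the lines above comes from iterating a HashMap, which makes no ordering guarantee. If a stable order matters, one option is to collect the counts in a TreeMap instead. The sketch below is a hypothetical variant (the class `SortedSuccessIpCount` and method `countSuccesses` are not from the original program) that uses a simplified pattern capturing only the IP address and the status code, and returns the map rather than printing inside the counting method.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant for illustration; not part of the original program.
class SortedSuccessIpCount {
    // Simplified pattern: group 1 = client IP, group 2 = status code.
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[[^\\]]+\\] \"[^\"]*\" (\\d{3}) ", Pattern.MULTILINE);

    // Counts HTTP 200 responses per IP. TreeMap keeps the keys in
    // lexicographic string order, so iteration order is deterministic.
    static Map<String, Integer> countSuccesses(String log) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = LINE.matcher(log);
        while (m.find()) {
            if ("200".equals(m.group(2))) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String log =
            "192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] \"GET /abc HTTP/1.1\" 404 201\n"
            + "192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] \"GET /favicon.ico HTTP/1.1\" 200 1406\n"
            + "192.168.1.2 - - [17/Sep/2013:22:18:27 -0700] \"GET /wp/ HTTP/1.1\" 200 5325\n"
            + "192.168.1.2 - - [17/Sep/2013:22:18:27 -0700] \"GET /wp/wp-content/themes/twentytwelve/style.css?ver=3.5.1 HTTP/1.1\" 200 35292\n"
            + "192.168.1.3 - - [17/Sep/2013:22:18:27 -0700] \"GET /wp/wp-content/themes/twentytwelve/js/navigation.js?ver=1.0 HTTP/1.1\" 200 863\n";
        // Prints "192.168.1.2 3" and then "192.168.1.3 1".
        countSuccesses(log).forEach((ip, n) -> System.out.println(ip + " " + n));
    }
}
```

    Note that the keys are sorted as strings, not as numeric addresses, so "10.0.0.10" would sort before "10.0.0.9".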
     
  • Original article: https://blog.csdn.net/allway2/article/details/126067521