Parsing Apache access log in Java
Last Updated : 31 Aug, 2022
A web server access log maintains a history of page requests, with each new record typically appended to the end of the file. Each record usually includes the client IP address, the request date and time, the page requested, the HTTP status code, the number of bytes served, and often the user agent and referrer. Given the records of a web server log, find, for each IP address that received at least one successful response, the total number of successful HTTP responses (status code 200).
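To make the record layout concrete, here is a minimal sketch that splits a single log line into its fields. The class name `AccessLogLineDemo`, the method `parseLine`, and the simplified pattern are assumptions made for illustration; the pattern only covers the basic layout shown in the examples (host, identity, user, timestamp, request, status, bytes) and ignores any trailing referrer or user-agent fields.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AccessLogLineDemo {
    // Assumed layout: host ident user [timestamp] "request" status bytes
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\S+)");

    // Returns {ip, timestamp, request, status, bytes}, or null if the line does not match.
    static String[] parseLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(4), m.group(5), m.group(6), m.group(7) };
    }

    public static void main(String[] args) {
        String line = "192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] \"GET /abc HTTP/1.1\" 404 201";
        String[] fields = parseLine(line);
        System.out.println("ip        = " + fields[0]);
        System.out.println("timestamp = " + fields[1]);
        System.out.println("request   = " + fields[2]);
        System.out.println("status    = " + fields[3]);
        System.out.println("bytes     = " + fields[4]);
    }
}
```

For the counting task, only the first field (the IP address) and the status code matter; the other fields are shown to illustrate what a record carries.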
Examples:
Input : Sample Access Log
192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] "GET /abc HTTP/1.1" 404 201
192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] "GET /favicon.ico HTTP/1.1" 200 1406
192.168.1.2 - - [17/Sep/2013:22:18:27 -0700] "GET /wp/ HTTP/1.1" 200 5325
192.168.1.2 - - [17/Sep/2013:22:18:27 -0700] "GET /wp/wp-content/themes/twentytwelve/style.css?ver=3.5.1 HTTP/1.1" 200 35292
192.168.1.3 - - [17/Sep/2013:22:18:27 -0700] "GET /wp/wp-content/themes/twentytwelve/js/navigation.js?ver=1.0 HTTP/1.1" 200 863
Output :
192.168.1.3 1
192.168.1.2 3
Prerequisite : Regular Expression in Java
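The implementation below scans the whole log as one string, which only works because it compiles its pattern with `Pattern.MULTILINE`. As a small sketch of that behaviour (the class `MultilineDemo`, the helper `firstTokens`, and the sample text are made up for illustration), the following compares how often `^` matches with and without the flag:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineDemo {
    // Collects group 1 of every match of "^(\\S+)" in text, using the given flags.
    static List<String> firstTokens(String text, int flags) {
        Matcher m = Pattern.compile("^(\\S+)", flags).matcher(text);
        List<String> tokens = new ArrayList<>();
        while (m.find()) {
            tokens.add(m.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "10.0.0.1 first\n10.0.0.2 second\n10.0.0.3 third";
        // Without the flag, ^ matches only at the very start of the input.
        System.out.println(firstTokens(text, 0));                 // [10.0.0.1]
        // With MULTILINE, ^ also matches after each line terminator.
        System.out.println(firstTokens(text, Pattern.MULTILINE)); // [10.0.0.1, 10.0.0.2, 10.0.0.3]
    }
}
```

Each call to `find()` resumes after the previous match, so with the flag set the loop visits one match per log line.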
Implementation:
java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindSuccessIpCount {

    // Prints, for each client IP, the number of requests answered with HTTP 200.
    public static void findSuccessIpCount(String record)
    {
        // Groups: 1 = client IP, 2 = identity, 3 = user, 4 = timestamp,
        // 5 = method, 6 = path, 7 = protocol, 8 = status code, 9 = bytes sent.
        final String regex = "^(\\S+) (\\S+) (\\S+) "
            + "\\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(\\S+)"
            + " (\\S+)\\s*(\\S+)?\\s*\" (\\d{3}) (\\S+)";

        // MULTILINE makes ^ match at the start of every line of the log.
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(record);

        // Maps each IP address to its count of 200 responses.
        Map<String, Integer> countIP = new HashMap<>();

        while (matcher.find()) {
            String ip = matcher.group(1);
            int response = Integer.parseInt(matcher.group(8));
            if (response == 200) {
                // Insert 1 for a new IP, otherwise add 1 to the existing count.
                countIP.merge(ip, 1, Integer::sum);
            }
        }

        // HashMap iteration order is unspecified, so the order of lines may vary.
        for (Map.Entry<String, Integer> entry : countIP.entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }

    public static void main(String[] args)
    {
        final String log = "123.123.123.123 - - [26/Apr/2000:00:23:48 -0400] \"GET /pics/wpaper.gif HTTP/1.0\" 200 6248 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
            + "123.123.123.123 - - [26/Apr/2000:00:23:47 -0400] \"GET /asctortf/ HTTP/1.0\" 200 8130 \"http://search.netscape.com/Computers/Data_Formats/Document/Text/RTF\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
            + "123.123.123.124 - - [26/Apr/2000:00:23:48 -0400] \"GET /pics/5star2000.gif HTTP/1.0\" 200 4005 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
            + "123.123.123.123 - - [26/Apr/2000:00:23:50 -0400] \"GET /pics/5star.gif HTTP/1.0\" 404 1031 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
            + "123.123.123.126 - - [26/Apr/2000:00:23:51 -0400] \"GET /pics/a2hlogo.jpg HTTP/1.0\" 200 4282 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n"
            + "123.123.123.123 - - [26/Apr/2000:00:23:51 -0400] \"GET /cgi-bin/newcount?jafsof3&width=4&font=digital&noshow HTTP/1.0\" 200 36 \"http://www.jafsoft.com/asctortf/\" \"Mozilla/4.05 (Macintosh; I; PPC)\"\n";
        findSuccessIpCount(log);
    }
}
Output
123.123.123.126 1
123.123.123.124 1
123.123.123.123 3
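Real access logs are usually read from a file rather than held in one string. As a hedged sketch of that variant (the class `StreamingSuccessCount`, the method `countSuccessByIp`, and the simplified pattern are assumptions, not part of the implementation above), the following reads the log line by line and uses a `TreeMap` so that the output order is deterministic:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StreamingSuccessCount {
    // Simplified pattern: captures only the client IP (group 1) and status code (group 2).
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[[^\\]]+\\] \"[^\"]*\" (\\d{3}) ");

    // Reads records one line at a time and counts 200 responses per IP.
    static Map<String, Integer> countSuccessByIp(BufferedReader reader) throws IOException {
        Map<String, Integer> counts = new TreeMap<>(); // keys sorted as strings
        String line;
        while ((line = reader.readLine()) != null) {
            Matcher m = LINE.matcher(line);
            if (m.find() && m.group(2).equals("200")) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample taken from the first example; a real log would come from a file.
        String sample =
            "192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] \"GET /abc HTTP/1.1\" 404 201\n"
            + "192.168.1.2 - - [17/Sep/2013:22:18:19 -0700] \"GET /favicon.ico HTTP/1.1\" 200 1406\n"
            + "192.168.1.2 - - [17/Sep/2013:22:18:27 -0700] \"GET /wp/ HTTP/1.1\" 200 5325\n"
            + "192.168.1.3 - - [17/Sep/2013:22:18:27 -0700] \"GET /wp/wp-content/themes/twentytwelve/js/navigation.js?ver=1.0 HTTP/1.1\" 200 863\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            countSuccessByIp(reader).forEach((ip, n) -> System.out.println(ip + " " + n));
        }
    }
}
```

To process an actual file, the `StringReader` could be swapped for `Files.newBufferedReader(Paths.get(path))`, where `path` points at the log. Reading line by line keeps memory use independent of the log size, and no `MULTILINE` flag is needed because each line is matched on its own.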