So, I'm trying to make an awk script that determines the most hits in the top three order. I am doing this based on the Apache weblog, which looks like
192.168.198.92 - - [22/Dec/2002:23:08:37 -0400] "GET / HTTP/1.1" 200 6394 www.yahoo.com "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1...)" "-"
192.168.198.92 - - [22/Dec/2002:23:08:38 -0400] "GET /images/logo.gif HTTP/1.1" 200 807 www.yahoo.com "http://www.some.com/" "Mozilla/4.0 (compatible; MSIE 6...)" "-"
192.168.72.177 - - [22/Dec/2002:23:32:14 -0400] "GET /news/sports.html HTTP/1.1" 200 3500 www.yahoo.com "http://www.some.com/" "Mozilla/4.0 (compatible; MSIE ...)" "-"
192.168.72.177 - - [22/Dec/2002:23:32:14 -0400] "GET /favicon.ico HTTP/1.1" 404 1997 www.yahoo.com "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3)..." "-"
192.168.72.177 - - [22/Dec/2002:23:32:15 -0400] "GET /style.css HTTP/1.1" 200 4138 www.yahoo.com "http://www.yahoo.com/index.html" "Mozilla/5.0 (Windows..." "-"
192.168.72.177 - - [22/Dec/2002:23:32:16 -0400] "GET /js/ads.js HTTP/1.1" 200 10229 www.yahoo.com "http://www.search.com/index.html" "Mozilla/5.0 (Windows..." "-"
192.168.72.177 - - [22/Dec/2002:23:32:19 -0400] "GET /search.php HTTP/1.1" 400 1997 www.yahoo.com "-" "Mozilla/4.0 JJohnJoJJJJJoJJoJJJJJoJJohJJJJJJJJJJJJohnJohJoJoJJJoJJ
To do this, I:
$1 ~ /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/ {
hitCounter[$1]++
notIndexed=1
for(i in ips) {
if (i==$1) { notIndexed=0 }
}
if(notIndexed==1) {
ips[indexx]=$1
indexx++
}
}
This line identifies the IP address and then increases its number in the hitCounter array, which is indexed by IP addresses. After that, I check the ips, "ips" list to see if there is a hit IP address. If the IP address is not added to the ips array, and the number of indices increases by one. Theoretically, by doing this, each index in ips should correlate with the indexes in hitCounter. Finally, I ...
END {
indexxx=0
for (i in hitCounter) {
if (i>hitCounter[firstIP])
firstIP=ips[indexxx]
else if (i>hitCounter[secondIP])
secondIP=ips[indexxx]
else
thirdIP=ips[indexxx]
indexxx++
}
}
IP "hitCounter", , , IP- , , IP.
, , "192.168.72.177 192.168.198.92" , "192.168.198.92 192.168.198.92".
?
EDIT: , , "hitCounter" foreach...
print "The most hits were from "firstIP" "secondIP" "thirdIP