Shell Script Example: Squid Logs

A few days ago I was trying to explain to a colleague how useful a few shell commands can be, and today I came across a good example to illustrate the point. I had 245 log files, each about 70-80MB in size, with roughly 4 million lines per file. Each line in a log file uses the following (squid) format:

1378297522.050      4 111.222.111.222 TCP_MISS/200 2600 GET http://somewebste.com/favicon.ico 12586072 DIRECT/111.222.111.222 text/html
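To make the format concrete, here is a minimal sketch that pulls the client IP address out of the sample line above. It assumes whitespace-separated fields with the client address in the third one, which is what the script further down relies on; the variable names are only for illustration.

```shell
#!/bin/bash
# The sample log line from above, stored in a variable for illustration.
line='1378297522.050      4 111.222.111.222 TCP_MISS/200 2600 GET http://somewebste.com/favicon.ico 12586072 DIRECT/111.222.111.222 text/html'

# awk splits on runs of whitespace by default, so the padding between
# the first fields does not matter; field 3 is the client IP.
client_ip=$(printf '%s\n' "$line" | awk '{print $3}')

echo "$client_ip"
```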

I wanted to graph the number of unique IP addresses seen in each day's log file, to get a rough idea of how many computers had been using the service each day. The reason is that I want to check the effect of new computer deployments.

To get the number of distinct IP addresses per day, a simple shell script is enough, and it gives me CSV values that I can import into a spreadsheet to graph.

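#!/bin/bash
# Print "YYYYMMDD,count" for each daily log: the number of distinct client IPs.
DIR="/var/log-archive/squid/2013"
MONTHS=("01" "02" "03" "04" "05" "06" "07" "08")
for MONTH in "${MONTHS[@]}"
do
    for DAY in $(seq -w 01 31)
    do
        MYDATE="$MONTH$DAY"
        LOGFILE="$DIR/access_2013${MYDATE}_combined.log.gz"
        # Skip impossible dates (e.g. 0231) and days with no archived log
        if [ -f "$LOGFILE" ]
        then
            # Field 3 is the client IP; sort -u keeps one copy of each address
            UNIQUE_IPS_SEEN=$(zcat "$LOGFILE" | awk '{print $3}' | sort -u | wc -l)
            echo "2013$MYDATE,$UNIQUE_IPS_SEEN"
        fi
    done
done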

I reckon it would be hard to find a quicker, friendlier way to solve that problem.
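As a possible variation on the same idea, the sketch below counts the distinct addresses inside awk, which avoids sorting millions of lines, and loops over the files with a glob instead of generating every month and day. The function names `count_unique_ips` and `report_dir` are made up for this example, and it assumes the same `access_YYYYMMDD_combined.log.gz` naming as above.

```shell
#!/bin/bash
# Hypothetical helper: count distinct client IPs (field 3) in one gzipped log.
count_unique_ips() {
    # !seen[$3]++ is true only the first time an address appears
    gzip -dc -- "$1" | awk '!seen[$3]++ { n++ } END { print n + 0 }'
}

# Hypothetical helper: print "YYYYMMDD,count" for every daily log in a directory.
report_dir() {
    local dir=$1 f name day
    for f in "$dir"/access_*_combined.log.gz; do
        [ -f "$f" ] || continue        # an unmatched glob stays literal; skip it
        name=${f##*/}                  # strip the directory part
        day=${name#access_}            # drop the "access_" prefix
        day=${day%_combined.log.gz}    # drop the suffix, leaving e.g. 20130101
        echo "$day,$(count_unique_ips "$f")"
    done
}
```

Calling `report_dir /var/log-archive/squid/2013 > unique_ips.csv` would then write one line per daily log (the output file name is just an example). The trade-off is that awk holds every distinct address in memory, which should be fine for the number of client machines on a typical network.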
