So recently I put some files up on a file server I rent with a couple of mates, mostly ones referenced in my CV so that prospective employers can check papers etc. that I have written. To try to figure out who has accessed these files, I wrote a small script with the help of my friend Paul. It uses the access log, grep and whois to figure out the domains that have accessed the files. To be honest it's a small script, and if I had more experience with bash I probably could have written it myself. In fact, if Paul had wanted to, I know he could have written it with no problems, but it was all good experience. In case someone wants to use it, here it is.
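#!/bin/bash
# Temporary file that holds the matching log lines.
FILE="/tmp/$(basename "$0").$RANDOM.$$.txt"
echo "Searching for access to files with $1"
# Keep the log lines that mention both your username and the search string.
sudo cat /var/log/apache2/access.log | grep -i "*PUT YOUR NAME HERE*" | grep -- "$1" > "$FILE"
if [ -z "$2" ]
then
echo "no exclusions"
else
echo "excluding files containing $2"
# Filter into a second file: redirecting onto the file being read would empty it first.
grep -v -- "$2" "$FILE" > "$FILE.tmp"
mv "$FILE.tmp" "$FILE"
fi
# The first field of each log line is the client address; look up each distinct one.
for i in $(cut -d ' ' -f 1 "$FILE" | sort -u)
do
echo -ne "$i - "; whois -H "$i" | grep -E 'OrgName|descr' | head -n 1 | cut -d ':' -f 2
done
rm -f "$FILE"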
To use it, just copy the script into a file (replacing the part that says *PUT YOUR NAME HERE* with your username), make it executable using chmod +x, then run it. The first argument is the string you are looking for access to, e.g. pdf will show all PDFs, and there is an optional second argument for excluding files with names containing a certain string.
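If you want to try the filtering part without sudo or whois, here is a rough sketch of just that step as a function. The name count_hits and its arguments are made up for this example and are not part of the script above. It assumes an Apache-style log where the client address is the first space-separated field, and it prints each address followed by how many matching requests it made.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# $1 = log file, $2 = string to search for, $3 = optional string to exclude
count_hits() {
    local log="$1" search="$2" exclude="${3:-}"
    # Keep lines containing the search string (fixed string, not a regex).
    grep -F -- "$search" "$log" |
        # Drop lines containing the exclusion string, if one was given.
        if [ -n "$exclude" ]; then grep -v -F -- "$exclude"; else cat; fi |
        # First field is the client address; count the hits per address.
        cut -d ' ' -f 1 | sort | uniq -c |
        # uniq -c prints "count address"; swap to "address count".
        awk '{print $2, $1}'
}
```

Calling count_hits access.log pdf draft would then list every address that requested something with pdf in the line, ignoring lines that mention draft. Because nothing is redirected back onto the file being read, there is no temporary file to clean up.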


