grep – finding patterns in your log files

grep is a very useful utility for searching the contents of files on a Linux system.

Like vim, sed and awk, it is one of many common Linux tools that support regular expressions, allowing you to home in on the lines of a file that match a particular pattern.

However, even searching for simple patterns can prove very useful, so don’t let the term “regular expression” put you off making use of it.

The syntax can be as simple as: grep patterntomatch filename
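As a minimal sketch of that syntax, the following creates a small throwaway file (the file and its three lines are made up for this illustration) and searches it first for a plain string and then with a very simple regular expression.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'GET /index.html 200' 'GET /setup.php 404' 'POST /login 200' > "$sample"

# Plain string search: print every line containing "GET".
grep GET "$sample"

# A simple regular expression: "$" anchors the match to the end of the line,
# so only lines that end in 404 are printed.
not_found=$(grep '404$' "$sample")
echo "$not_found"

rm -f "$sample"
```

The second search shows that a regular expression need not be complicated: a single anchor is enough to turn "contains 404" into "ends with 404".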

Here are some simple examples that I find particularly useful/interesting.

The first looks through the web server access logs for file types I know I don’t use, requests for which might indicate a possible attack.

/var/log/apache2# grep '\.php' access.log
104.224.15.126 - - [17/Dec/2014:02:22:53 +0400] "GET /dddd/ddd/dd.php HTTP/1.1" 404 527 "-" "-"
104.224.15.126 - - [17/Dec/2014:02:22:54 +0400] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
104.224.15.126 - - [17/Dec/2014:02:22:54 +0400] "GET /pma/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
104.224.15.126 - - [17/Dec/2014:02:22:55 +0400] "GET /myadmin/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
195.154.42.218 - - [17/Dec/2014:08:39:44 +0400] "GET /rgrg/rgr/rg.php HTTP/1.1" 404 527 "-" "-"
195.154.42.218 - - [17/Dec/2014:08:39:44 +0400] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
195.154.42.218 - - [17/Dec/2014:08:39:44 +0400] "GET /pma/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
195.154.42.218 - - [17/Dec/2014:08:39:44 +0400] "GET /myadmin/scripts/setup.php HTTP/1.1" 404 527 "-" "-"

Above we can see a few results suggesting that automated attempts are being made to detect misconfigured or vulnerable versions of phpMyAdmin.
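To see at a glance which clients are doing the probing, the matches can be tallied per IP address. The following is only a sketch: it runs against a temporary sample file rather than a real access.log, and the 10.0.0.5 line is invented to show a request that should not be counted. Note that an unescaped dot in a grep pattern matches any character, so the sketch uses -F to search for the literal string .php.

```shell
# Hypothetical sample log in Apache combined format; on a real server
# you would point grep at access.log instead.
log=$(mktemp)
cat > "$log" <<'EOF'
104.224.15.126 - - [17/Dec/2014:02:22:53 +0400] "GET /dddd/ddd/dd.php HTTP/1.1" 404 527 "-" "-"
104.224.15.126 - - [17/Dec/2014:02:22:54 +0400] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
195.154.42.218 - - [17/Dec/2014:08:39:44 +0400] "GET /rgrg/rgr/rg.php HTTP/1.1" 404 527 "-" "-"
10.0.0.5 - - [17/Dec/2014:09:00:00 +0400] "GET /index.html HTTP/1.1" 200 1024 "-" "-"
EOF

# -F treats ".php" as a fixed string; cut keeps the first field (the client IP);
# sort | uniq -c counts requests per IP; sort -rn lists the busiest client first.
per_ip=$(grep -F '.php' "$log" | cut -d' ' -f1 | sort | uniq -c | sort -rn)
echo "$per_ip"

rm -f "$log"
```

Each output line is a count followed by an IP address, which makes a single noisy scanner stand out from ordinary traffic.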

Looking through the server’s logs for requests for content that doesn’t exist can also provide an interesting picture. Normal user activity will rarely produce 404 errors if your content is free of broken links. Even if it isn’t, as a site administrator you should have a decent idea of what constitutes an unusual pattern, and the same search can be used to identify and fix broken links in any case 🙂

/var/log/apache2# grep -v robots.txt * | grep 404

21.41.58.199 - - [15/Dec/2014:02:57:32 +0400] "GET /bvbv/bvb/bv.php HTTP/1.1" 404 527 "-" "-"
121.41.58.199 - - [15/Dec/2014:02:57:33 +0400] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
121.41.58.199 - - [15/Dec/2014:02:57:34 +0400] "GET /pma/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
121.41.58.199 - - [15/Dec/2014:02:57:34 +0400] "GET /myadmin/scripts/setup.php HTTP/1.1" 404 527 "-" "-"
66.249.75.24 - - [15/Dec/2014:03:17:04 +0400] "GET /rvmgthgqv.html HTTP/1.1" 404 546 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
61.160.247.7 - - [15/Dec/2014:09:42:46 +0400] "GET /manager/html HTTP/1.1" 404 508 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET4.0C; .NET4.0E; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

The -v robots.txt filter removes every request that mentions robots.txt. I know these will be in the logs, and since I expect a 404 for each of them I don’t want to see them.
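One thing to be aware of is that a bare grep 404 matches the digits 404 anywhere on the line, for example in the response size or in a URL. A slightly stricter variant, sketched below, anchors the match on the quote that closes the request line so that only the status field is tested. The sample lines are modelled on the log entries above with the user-agent fields shortened, and the 10.0.0.5 line is invented to show a 200 response whose size happens to be 404.

```shell
# Hypothetical sample log; replace "$log" with your real access log.
log=$(mktemp)
cat > "$log" <<'EOF'
61.160.247.7 - - [15/Dec/2014:09:42:46 +0400] "GET /manager/html HTTP/1.1" 404 508 "-" "-"
66.249.75.24 - - [15/Dec/2014:03:17:04 +0400] "GET /robots.txt HTTP/1.1" 404 500 "-" "-"
10.0.0.5 - - [15/Dec/2014:10:00:00 +0400] "GET /index.html HTTP/1.1" 200 404 "-" "-"
EOF

# Drop robots.txt requests, then keep only lines whose status field is 404.
# The pattern '" 404 ' requires the closing quote of the request line, a space,
# the status code and another space, so a size of 404 does not match.
missing=$(grep -v -F 'robots.txt' "$log" | grep '" 404 ')
echo "$missing"

rm -f "$log"
```

With this sample only the /manager/html request is printed; the plain grep 404 version would also have printed the 200 response.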

The last entry above, the request for /manager/html, is an attempt to detect a Tomcat login page. Misconfigured Tomcat servers are a common cause of compromise in many environments, because an attacker who gets in can deploy their own code. It is an extra win for the attacker if Tomcat is configured to run as SYSTEM or root, as can sometimes be the case.

See the related Metasploit modules for more info on these attacks.

Suppose you’re interested in determining where the IP that made the suspicious connection to your server is located:

whois 61.160.247.7 | grep country | sort -u
country: CN

By combining whois to look up the IP’s registration record, grep to keep only the lines containing country, and a unique sort to get rid of the duplicates, we can see it’s our friends from China who are apparently paying us a visit.
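The same filter can be made a little more robust. Some registries capitalise the field name differently, and a plain grep country also matches any other line that happens to contain the word. The sketch below works on an invented, registry-style record held in a variable instead of real whois output (the network name, description and the ZZ country code are all placeholders), so it can be run without network access.

```shell
# Invented record standing in for the output of a real whois lookup.
record=$(printf '%s\n' \
  'netname:   EXAMPLE-NET' \
  'country:   ZZ' \
  'descr:     in-country transit' \
  'Country:   ZZ' \
  'country:   ZZ')

# -i ignores case, and "^country:" only matches lines that start with the
# field name, so the descr line is skipped. awk prints the value in upper
# case, and sort -u collapses the duplicates.
countries=$(printf '%s\n' "$record" | grep -i '^country:' | awk '{print toupper($2)}' | sort -u)
echo "$countries"
```

Against a live lookup you would replace the printf with the whois command itself and leave the rest of the pipeline unchanged.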

Particular flags I find most useful when combined with grep:

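-v – Invert the match, i.e. print the lines that don’t match the pattern

-E – Treat the pattern as an extended regular expression

-i – Ignore case

-l – Print only the names of the files that contain a match, instead of the matching lines

-n – Prefix each matching line with its line number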
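To try each of these flags safely, here is a small sketch that runs them against two throwaway files. The file names and their contents are made up for the demonstration.

```shell
# Hypothetical sample files in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'GET /index.html 200' 'get /Setup.PHP 404' 'POST /login 200' > "$dir/a.log"
printf '%s\n' 'GET /about.html 200' > "$dir/b.log"

inverted=$(grep -v 200 "$dir/a.log")          # -v: lines that do NOT contain 200
either=$(grep -E 'POST|404' "$dir/a.log")     # -E: alternation, POST or 404
nocase=$(grep -i 'setup\.php' "$dir/a.log")   # -i: matches Setup.PHP as well
files=$(cd "$dir" && grep -l POST *.log)      # -l: file names only
numbered=$(grep -n login "$dir/a.log")        # -n: line number prefix

printf '%s\n' "$inverted" "$either" "$nocase" "$files" "$numbered"
rm -rf "$dir"
```

The flags can also be combined, for example grep -in to get case-insensitive matches with their line numbers.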
