30th November 2021

AWStats and Hiawatha

Hiawatha is a secure and reliable web-server; it is used for this blog. AWStats is a collection of Perl scripts to analyze log-files from web-servers. By default, AWStats can read Apache log-files, but it cannot directly read log-files from Hiawatha.

The Hiawatha log-file format consists of the following pipe-separated fields; an illustrative line is shown after the list:

  1. host, this is the IP address
  2. date + time
  3. HTTP status code, e.g., 200
  4. size in bytes
  5. URL including the method (GET/POST/etc.)
  6. referrer
  7. user agent, e.g., Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0
  8. and other fields
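
For illustration, a log line of this shape could look as follows; all values are made up, and the exact date layout may differ slightly, but the parsing code shown below only relies on the weekday, day, month, year, and time at the start of the date field:

192.0.2.1|Tue 30 Nov 2021 10:11:12|200|5123|GET /index.html HTTP/1.1|-|Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0|...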

I already use GoAccess to analyze my log-files, see Using GoAccess with Hiawatha Web-Server. GoAccess is pretty fast and the output looks good, but it cannot really filter the data: for example, it shows huge amounts of traffic generated by bots. So I hoped that AWStats could fill this gap.

1. Modifying the Perl program. I first tried to configure AWStats in

/etc/awstats/awstats.eklausmeier.goip.de.conf

Unfortunately, even after preformatting the Hiawatha log-files, this did not work. So I had to change the source code in awstats.pl.

I created a new LogFormat=5:

elsif ( $LogFormat eq '5' ) {	# Hiawatha web-server log-format
    $PerlParsingFormat = "special Hiawatha web-server log-format";
    $pos_host    = 0;
    $pos_date    = 1;
    $pos_code    = 2;
    $pos_size    = 3;
    $pos_method  = 4;	# together with url
    $pos_url     = 5;	# together with method
    $pos_referer = 6;
    $pos_agent   = 7;
    @fieldlib    = (
        'host', 'date',  'code', 'size',
        'method', 'url', 'referer', 'ua'
    );
}

There are two places in awstats.pl that actually read from the log-file. I changed both places to call a small subroutine, which handles Hiawatha log-files natively and without hassle.
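
To locate these two places in an unmodified awstats.pl, a Perl one-liner like the following can help; it assumes the original calls look like the map-lines quoted further below:

perl -ne 'print "$.: $_" if /map\(.*PerlParsingFormat/' awstats.pl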

# split log line into fields
sub splitLog (@) {
    my ($PerlParsingFormat,$line) = @_;
    if ($PerlParsingFormat eq '(?^:^special Hiawatha web-server log-format)') {
        my @F = split('\|',$line);
        my @R;
        ($R[0],$R[2],$R[3],$R[6],$R[7]) = ($F[0],$F[2],$F[3],$F[5],$F[6]);
        my ($day,$month,$year,$hms) = ($F[1] =~ /\w\w\w\s+(\d+)\s+(\w+)\s+(\d+)\s+(\d+:\d+:\d+)/);
        $R[1] = sprintf("%02d/%s/%04d:%s",$day,$month,$year,$hms);	# DD/Month/YYYY:HH:MM:SS (Apache)
        ($R[4],$R[5]) = ($F[4] =~ /^(\w+)\s+([^\s]+)\s+[^\s]+$/);	# GET /index.html HTTP/x.x
        return @R;
    }
    return map( /$PerlParsingFormat/, $line );
}

This subroutine is then called

@field = splitLog($PerlParsingFormat,$line);

instead of

@field = map( /$PerlParsingFormat/, $line );

and the second place, an if-condition, becomes

if ( !( @field = splitLog($PerlParsingFormat,$line) ) ) {	#map( /$PerlParsingFormat/, $line )

In total this change occurs in just these two places in awstats.pl.
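
To check the field mapping outside of awstats.pl, the subroutine can be exercised standalone with a small sketch like the one below; the log line is made up, and the first argument is the marker string that the eq-check in splitLog() expects:

#!/usr/bin/perl -w
use strict;

# splitLog() as above
sub splitLog (@) {
    my ($PerlParsingFormat,$line) = @_;
    if ($PerlParsingFormat eq '(?^:^special Hiawatha web-server log-format)') {
        my @F = split('\|',$line);
        my @R;
        ($R[0],$R[2],$R[3],$R[6],$R[7]) = ($F[0],$F[2],$F[3],$F[5],$F[6]);
        my ($day,$month,$year,$hms) = ($F[1] =~ /\w\w\w\s+(\d+)\s+(\w+)\s+(\d+)\s+(\d+:\d+:\d+)/);
        $R[1] = sprintf("%02d/%s/%04d:%s",$day,$month,$year,$hms);
        ($R[4],$R[5]) = ($F[4] =~ /^(\w+)\s+([^\s]+)\s+[^\s]+$/);
        return @R;
    }
    return map( /$PerlParsingFormat/, $line );
}

# made-up Hiawatha log line, same shape as the example near the top
my $line = '192.0.2.1|Tue 30 Nov 2021 10:11:12|200|5123|GET /index.html HTTP/1.1|-|Mozilla/5.0|...';
my @field = splitLog('(?^:^special Hiawatha web-server log-format)', $line);
print join("\n",@field), "\n";

Running it should print the eight fields one per line, with the date already converted to the Apache-style DD/Month/YYYY:HH:MM:SS form.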

2. Configuring AWStats. To actually run awstats.pl, you first have to symlink the lib-directory:

ln -s /usr/share/webapps/awstats/cgi-bin/lib lib

assuming that the AWStats package was installed under /usr/share/webapps/awstats.

In /etc/awstats/awstats.eklausmeier.goip.de.conf I set:

LogFile="/tmp/access.log"
LogType=W
LogFormat=5
LogSeparator="\|"
SiteDomain="eklausmeier.goip.de"
DNSLookup=2
DirIcons="/awstatsicon"

To match the DirIcons setting, you have to create a symbolic link in your web-root:

ln -s /usr/share/webapps/awstats/icon awstatsicon

3. Running AWStats. AWStats is then started like this:

./awstats.pl -config=eklausmeier.goip.de -output -staticlinks > /srv/http/awstats.html

Generating all the reports:

/usr/share/awstats/tools/awstats_buildstaticpages.pl -config=eklausmeier.goip.de -dir=/srv/http

Hiawatha rotates the log-files and gzips them. To concatenate them all into a single file, use something like:

L=/tmp/access.log; rm $L; for i in `seq 52 -1 2`; do zcat access.log.$i.gz >> $L; done; cat access.log.1 access.log >> $L
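
If you prefer a single script, a small Perl wrapper along the following lines can do the concatenation and then produce the report; the Hiawatha log directory is an assumption here and has to be adapted, and the script is expected to run from the directory containing awstats.pl:

#!/usr/bin/perl -w
use strict;

my $out = '/tmp/access.log';       # must match LogFile in the .conf
my $dir = '/var/log/hiawatha';     # assumed Hiawatha log directory, adapt as needed

open(my $fh, '>', $out) or die "Cannot write $out: $!";
for my $i (reverse 2..52) {        # oldest gzipped logs first
    my $gz = "$dir/access.log.$i.gz";
    next unless -f $gz;
    open(my $in, '-|', 'zcat', $gz) or die "Cannot zcat $gz: $!";
    print {$fh} $_ while <$in>;
    close($in);
}
for my $plain ("$dir/access.log.1", "$dir/access.log") {    # newest logs last
    next unless -f $plain;
    open(my $in, '<', $plain) or die "Cannot read $plain: $!";
    print {$fh} $_ while <$in>;
    close($in);
}
close($fh);

# generate the static HTML report, same command as in section 3
system('./awstats.pl -config=eklausmeier.goip.de -output -staticlinks > /srv/http/awstats.html') == 0
    or die "awstats.pl failed: $?";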

4. Example output. The AWStats overview of spiders and bots looks similar to this:

The detailed overview of the most requested URLs looks similar to this:

The list of used operating systems looks like this:

Categories: web
Tags: awstats
Author: Elmar Klausmeier