API

Download counter

Searches access.log files for successful downloads that match a specified search string. Matching downloads are tallied in an SQLite database, and the results are written to an HTML file.

Usage

The main functional parameters are set in dlcounter.cfg. Additional options may be set through command line arguments.

Command line switches

Usage: dlcounter.py [-d] [-h] [-i] [-v] [-D] [-V]

Arguments:

-d, --docs

Show this documentation and exit.

-h, --help

Show short help and exit.

-i, --init string

This option is required if you wish to count downloads in old archived ‘.gz’ files.

All access logs in path (including .gz files) are read. Matching download files are counted regardless of when they were downloaded, so this option should only be used on the first run (before the database contains data). This option overrides ACCESSLOGS in dlcounter.cfg.

Example

The path string should be entered in the form:

$ python3 dlcounter.py -i '/var/log/nginx/access.log'

To read all logs:

* /var/log/nginx/access.log
* /var/log/nginx/access.log.1
* /var/log/nginx/access.log.2.gz
* /var/log/nginx/access.log.3.gz
* ...

-v, --verbose

Print commands and database contents to stdout.

-D, --debug

Print additional debug strings to stdout.

-V, --version

Show program version and exit.

Configuration file

The configuration file (‘dlcounter.cfg’) must be in the same directory as ‘dlcounter.py’.

[ACCESSLOGS] One or more access logs.

Log files must be plain text (not .gz archives). When more than one access.log file is specified, the files must be in reverse chronological order (process oldest first).

Default:

log1 = /var/log/nginx/access.log

[FILEPATH] The first part of the download file’s string.

This refers to the string as it appears in the access log. If not supplied, all file names matching the [FILENAMES] option(s) will be counted.

Default:

path = /wp-content/uploads/

The default option will catch files in any of:

  • …/website/downloads/2001/

  • …/website/downloads/2002/

  • …/website/downloads/…/

[FILENAMES] The last part of the download file’s string.

The default options will catch .zip and .exe files that begin with ‘FILEPATH’.

Default:
  • file1 = .zip

  • file2 = .exe

[WEBPAGE] Fully qualified path for html output.

HTML output is disabled if this path is not specified.

Default:

path = /var/www/html/downloads.html

[DATETIME] Datetime formats for reading access logs and writing HTML.

  • datetime_read

    Format for reading access logs. The default matches: 01/Jan/2022:23:35:05 +0000

    Default:

    %d/%b/%Y:%H:%M:%S %z

  • datetime_write

    Format for writing html webpage. The default matches: Mon 01 Jan 18:35

    Default:

    %a %d %b %H:%M
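As a rough illustration of how the two default formats behave, the sketch below parses the sample timestamp with the datetime_read format and renders it again with the datetime_write format, using Python's standard datetime module. The constant and variable names are illustrative and are not taken from dlcounter.py. Day and month abbreviations depend on the active locale.

```python
from datetime import datetime

# Default formats from the [DATETIME] section.
DATETIME_READ = "%d/%b/%Y:%H:%M:%S %z"
DATETIME_WRITE = "%a %d %b %H:%M"

# Sample timestamp in the form found in an access log.
raw = "01/Jan/2022:23:35:05 +0000"

parsed = datetime.strptime(raw, DATETIME_READ)  # timezone-aware datetime
rendered = parsed.strftime(DATETIME_WRITE)      # short form for the web page

print(parsed.isoformat())
print(rendered)
```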

Note:

[FILEPATH] and [FILENAMES] options are just strings to search for in the access log file(s). A regular expression is used to search the log file(s) for: “<path-string> any-characters <file-string>”
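The search described in this note can be sketched as follows. The pattern has the same shape as the one documented for get_record(); the two log lines are invented for illustration, and the variable names are assumptions rather than names from dlcounter.py.

```python
import re

# Values as they might appear in [FILEPATH] and [FILENAMES] (defaults shown).
path_string = "/wp-content/uploads/"
file_string = ".zip"

# Escape both strings so that '.' and '/' are matched literally.
pattern = re.compile(f"GET {re.escape(path_string)}.*{re.escape(file_string)}")

# Hypothetical access log lines, for illustration only.
hit = ('203.0.113.5 - - [01/Jan/2022:23:35:05 +0000] '
       '"GET /wp-content/uploads/2021/12/tool.zip HTTP/1.1" 200 1234')
miss = ('203.0.113.5 - - [01/Jan/2022:23:36:10 +0000] '
        '"GET /index.html HTTP/1.1" 200 512')

print(bool(pattern.search(hit)))
print(bool(pattern.search(miss)))
```

Because the middle of the pattern matches any characters, files in any subdirectory below the path string are counted.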

dlcounter.check_path(name, file)

Check if file exists.

Parameters
  • name (string) – File identifier / file name.

  • file (string) – File path.

Returns

True if the file exists; otherwise the program exits.

Return type

bool
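A minimal sketch of this behaviour, assuming that "exists" means a regular file and that the program exits with an error message. The message wording is an assumption; the actual implementation may differ.

```python
import os
import sys


def check_path(name, file):
    """Return True if `file` exists, otherwise exit with a message."""
    if os.path.isfile(file):
        return True
    # Assumed behaviour: report which configured item is missing, then exit.
    sys.exit(f"Error: {name} not found: {file}")
```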

dlcounter.db_path()

Path to database file.

Parameters

None

Returns

Fully qualified path to SQLite database file.

Return type

string

dlcounter.first_item_in_section(cfg, section)

Return the first item from cfg section.

Parameters
  • cfg (configparser.ConfigParser) – The ConfigParser object.

  • section (string) – Section key.

Returns

First value from section, or empty string.

Return type

string
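One way such a helper might look, shown against a small sample configuration built from the documented defaults. This is a sketch, not the code from dlcounter.py. Disabling interpolation is an assumption made here so that the '%' characters in the datetime formats are read literally.

```python
import configparser

# Sample configuration using the documented defaults.
SAMPLE_CFG = """
[WEBPAGE]
path = /var/www/html/downloads.html

[DATETIME]
datetime_read = %d/%b/%Y:%H:%M:%S %z
"""


def first_item_in_section(cfg, section):
    """Return the first value in `section`, or '' if absent or empty."""
    if not cfg.has_section(section):
        return ""
    for _key, value in cfg.items(section):
        return value
    return ""


# interpolation=None keeps '%' characters in the datetime formats literal;
# the default interpolation would reject them when the value is read.
cfg = configparser.ConfigParser(interpolation=None)
cfg.read_string(SAMPLE_CFG)

print(first_item_in_section(cfg, "WEBPAGE"))
print(first_item_in_section(cfg, "DATETIME"))
```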

dlcounter.format_datetime_output(dt_string)

Format dt_string as required for html output.

Parameters

dt_string (datetime) – datetime object.

Returns

Reformatted datetime string.

Return type

string

dlcounter.get_config()

Return dict of arguments from dlcounter.cfg.

List values are retrieved by list_section(), and single values by first_item_in_section(). When the --verbose command line argument is passed, the parameters from the command line and the config file are also printed.

Parameters

None

Returns

Values from configuration file.

Return type

dict

dlcounter.get_db_time(con)

Return most recent timestamp from database.

Parameters

con (connection) – Connection to the database.

Returns

Most recent timestamp, or datetime.min if no timestamp is found.

Return type

datetime
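The lookup could be sketched as below. The table layout (downloads with filename, total and timestamp columns) and the ISO 8601 text storage are assumptions made for the example; the documentation does not state the real schema.

```python
import sqlite3
from datetime import datetime


def get_db_time(con):
    """Return the newest stored timestamp, or datetime.min if none exists."""
    # Assumed schema: downloads(filename, total, timestamp), with the
    # timestamp stored as an ISO 8601 string so that MAX() sorts correctly.
    row = con.execute("SELECT MAX(timestamp) FROM downloads").fetchone()
    if row is None or row[0] is None:
        return datetime.min
    return datetime.fromisoformat(row[0])


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE downloads (filename TEXT, total INTEGER, timestamp TEXT)")
empty_result = get_db_time(con)  # no rows yet

con.execute("INSERT INTO downloads VALUES ('tool.zip', 3, '2022-01-01T23:35:05')")
con.execute("INSERT INTO downloads VALUES ('app.exe', 1, '2022-01-02T08:00:00')")
latest = get_db_time(con)

print(empty_result)
print(latest)
```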

dlcounter.get_record(record, pattern)

Return (filename, time) from line when start and end of name are found, or None if not found.

Parameters
  • record (string) – One line from access log.

  • pattern (string) – Regex pattern: f'GET {re.escape(start)}.*{re.escape(end)}'

Returns

(Short filename string, datetime object) if successful, or None.

Return type

tuple
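A hedged sketch of how a record might be turned into a (filename, datetime) pair. It assumes the timestamp is the first bracketed field in the line and that the short filename is the final path component. Both are assumptions about the log layout, not statements about dlcounter.py. Unlike get_time(), this sketch returns None rather than exiting when no timestamp is found.

```python
import re
from datetime import datetime

READ_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # default datetime_read


def get_record(record, pattern):
    """Return (short filename, datetime) for a matching line, else None."""
    match = re.search(pattern, record)
    if match is None:
        return None
    # Drop the leading 'GET ' and keep only the final path component.
    filename = match.group(0)[len("GET "):].rsplit("/", 1)[-1]
    # Assumption: the timestamp sits between the first '[' and ']'.
    stamp = re.search(r"\[(.+?)\]", record)
    if stamp is None:
        return None
    return filename, datetime.strptime(stamp.group(1), READ_FORMAT)


pattern = f"GET {re.escape('/wp-content/uploads/')}.*{re.escape('.zip')}"
line = ('203.0.113.5 - - [01/Jan/2022:23:35:05 +0000] '
        '"GET /wp-content/uploads/2021/12/tool.zip HTTP/1.1" 200 1234')
result = get_record(line, pattern)
print(result)
```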

dlcounter.get_time(record)

Return timestamp from record.

Parameters

record (string) – One line from access log.

Returns

Timestamp from the record. The program exits if no timestamp is found.

Return type

datetime

dlcounter.init_db(logpath, opt)

Initialise database.

Similar to main() but reads all logs that start with ‘logpath’ and does NOT check timestamp before counting. If the database already exists, the old table will be deleted and a new table created.

Parameters
  • logpath (string) – Path to access logs.

  • opt (dict) – Parameters from dlcounter.cfg.

Return type

None
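Reading every log that starts with logpath, including .gz archives, might be done along these lines. The helper names open_log and read_all_logs are hypothetical and exist only for this sketch. Counting and database handling are left out.

```python
import glob
import gzip


def open_log(path):
    """Open a plain-text or gzip-compressed log file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def read_all_logs(logpath):
    """Yield every line from every file whose name starts with `logpath`."""
    # Sorting here is lexical only; a real implementation may need to
    # order the files by age instead.
    for path in sorted(glob.glob(glob.escape(logpath) + "*")):
        with open_log(path) as handle:
            yield from handle
```

Called with '/var/log/nginx/access.log', this would visit access.log, access.log.1, access.log.2.gz and so on.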

dlcounter.list_section(cfg, section)

Return list of values from cfg section.

Parameters
  • cfg (configparser.ConfigParser) – The ConfigParser object.

  • section (string) – Section key.

Returns

List of zero or more values.

Return type

list
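A possible shape for this helper, using the default [FILENAMES] entries as sample data. It is a sketch under the assumption that a missing section yields an empty list; the real function may treat that case differently.

```python
import configparser

SAMPLE_CFG = """
[FILENAMES]
file1 = .zip
file2 = .exe
"""


def list_section(cfg, section):
    """Return all values in `section` as a list (empty if absent)."""
    if not cfg.has_section(section):
        return []
    return [value for _key, value in cfg.items(section)]


cfg = configparser.ConfigParser(interpolation=None)
cfg.read_string(SAMPLE_CFG)
print(list_section(cfg, "FILENAMES"))
```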

dlcounter.log_to_sql(con, file, searchstring, timecheck=None)

Copy download data from log file to database.

Read one log file and update database. Updating is handled by update_db().

Parameters
  • con (connection) – Connection to the database.

  • file (_io.TextIOWrapper) – Pointer to access log.

  • searchstring (string) – Regex pattern for filename in log.

  • timecheck (datetime) – Timestamp to check records against. If timecheck=None, the database is being initialised and timestamps are not checked.

Returns

True when database has been modified.

Return type

bool

dlcounter.main(opt)

Count downloads from access logs.

Search for file names in the access logs that match the search criteria, and update the database as necessary. The database is created automatically if it does not exist. Downloads with timestamps older than the last update are ignored.

Parameters

opt (dict) – Contains string values: acclogs, searchstring, and html_out.

Return type

None

dlcounter.print_table()

Print contents of database to stdout.

This function is used only with the --verbose option.

Parameters

None

Return type

None

dlcounter.sql_table(con)

Create ‘downloads’ table if it doesn’t exist.

Parameters

con (connection) – Connection to the database.

Return type

None

dlcounter.time_format(readf='', writef='')

Return the date-time format.

The values come from the config and are used for reading access logs and writing HTML. Access them as either time_format.read or time_format.write.

Parameters
  • readf (string, default '') – Time format for reading access logs.

  • writef (string, default '') – Time format for writing html.

dlcounter.read

Type

string

dlcounter.write

Type

string

Return type

None
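The description suggests that the formats are stored as attributes on the function itself. A minimal sketch of that pattern, assuming empty arguments leave an existing value untouched:

```python
def time_format(readf="", writef=""):
    """Store the configured formats as attributes on the function itself."""
    # Only overwrite a format when a non-empty value is supplied.
    if readf:
        time_format.read = readf
    if writef:
        time_format.write = writef


# Set once after the configuration has been loaded...
time_format(readf="%d/%b/%Y:%H:%M:%S %z", writef="%a %d %b %H:%M")

# ...then read the values anywhere without passing them around.
print(time_format.read)
print(time_format.write)
```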

dlcounter.update_db(con, fname, timestamp)

Update database.

If fname exists in database, update its download total and timestamp, else insert it into the database with a count of 1.

Parameters
  • con (connection) – Connection to the database.

  • fname (string) – Name of the downloaded file.

  • timestamp (datetime) – Timestamp of download.

Return type

None
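The update-or-insert logic can be sketched with the standard sqlite3 module as follows. The column names and the ISO 8601 timestamp storage are assumptions chosen for the example, since the real table layout is not documented here.

```python
import sqlite3
from datetime import datetime


def update_db(con, fname, timestamp):
    """Increment the count for `fname`, or insert it with a count of 1."""
    # Assumed schema: downloads(filename, total, timestamp).
    stamp = timestamp.isoformat()
    cur = con.execute(
        "UPDATE downloads SET total = total + 1, timestamp = ? "
        "WHERE filename = ?",
        (stamp, fname),
    )
    if cur.rowcount == 0:  # no existing row was updated
        con.execute(
            "INSERT INTO downloads (filename, total, timestamp) "
            "VALUES (?, 1, ?)",
            (fname, stamp),
        )
    con.commit()


con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE IF NOT EXISTS downloads "
    "(filename TEXT PRIMARY KEY, total INTEGER, timestamp TEXT)"
)
update_db(con, "tool.zip", datetime(2022, 1, 1, 23, 35, 5))
update_db(con, "tool.zip", datetime(2022, 1, 2, 8, 0, 0))
rows = con.execute("SELECT filename, total, timestamp FROM downloads").fetchall()
print(rows)
```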

dlcounter.write_html(con, htmlfile)

Write sql data to web page.

Parameters
  • con (connection) – Connection to the database.

  • htmlfile (string) – Path to html output file.

Return type

None
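A simple way to render the table as HTML is sketched below. The markup, the sort order and the column names are all assumptions made for illustration; the page produced by dlcounter.py may look quite different.

```python
import html
import sqlite3


def write_html(con, htmlfile):
    """Write one table row per downloaded file to `htmlfile`."""
    # Assumed schema: downloads(filename, total, timestamp).
    rows = con.execute(
        "SELECT filename, total, timestamp FROM downloads ORDER BY total DESC"
    ).fetchall()
    lines = ["<table>", "<tr><th>File</th><th>Downloads</th><th>Last</th></tr>"]
    for filename, total, timestamp in rows:
        # Escape values so that unusual file names cannot break the markup.
        lines.append(
            f"<tr><td>{html.escape(str(filename))}</td>"
            f"<td>{total}</td><td>{html.escape(str(timestamp))}</td></tr>"
        )
    lines.append("</table>")
    with open(htmlfile, "w", encoding="utf-8") as out:
        out.write("\n".join(lines) + "\n")
```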