API
Download counter
Searches access.log files for successful downloads that match a specified search string. Matching downloads are tallied in an SQLite database, and the results are written to an HTML file.
Usage
The main functional parameters are set in dlcounter.cfg. Additional options may be set through command line arguments.
Command line switches
Usage: dlcounter.py [-d] [-h] [-i] [-v] [-D] [-V]
Arguments:
- -d, --docs
Show this documentation and exit.
- -h, --help
Show short help and exit.
- -i, --init string
This option is required if you wish to count downloads in old archived ‘.gz’ files.
All access logs in path (including .gz files) are read. Matching downloads are counted regardless of when they were downloaded, so this option should only be used on the first run (before the database contains data). This option overrides ACCESSLOGS in dlcounter.cfg.
Example
The path string should be entered in the form:
$ python3 dlcounter.py -i '/var/log/nginx/access.log'
This reads all logs whose names start with that path:
* /var/log/nginx/access.log
* /var/log/nginx/access.log.1
* /var/log/nginx/access.log.2.gz
* /var/log/nginx/access.log.3.gz
* ...
- -v, --verbose
Print commands and database contents to stdout.
- -D, --debug
Print additional debug strings to stdout.
- -V, --version
Show program version and exit.
Configuration file
The configuration file (‘dlcounter.cfg’) must be in the same directory as ‘dlcounter.py’.
[ACCESSLOGS] One or more access logs.
Log files must be plain text (not .gz archives). When more than one access log is specified, list the files so that the oldest is processed first.
- Default:
log1 = /var/log/nginx/access.log
[FILEPATH] The first part of the download file’s string.
This refers to the string as it appears in the access log. If not supplied, all file names matching the [FILENAMES] option(s) will be counted.
- Default:
path = /wp-content/uploads/
The default option will catch files in any of:
…/website/downloads/2001/
…/website/downloads/2002/
…/website/downloads/…/
[FILENAMES] The end of the download file’s string.
The default options will catch .zip and .exe files that begin with ‘FILEPATH’.
- Default:
file1 = .zip
file2 = .exe
[WEBPAGE] Fully qualified path for html output.
HTML output is disabled if this path is not specified.
- Default:
path = /var/www/html/downloads.html
[DATETIME] Datetime formats for reading access logs and writing HTML.
datetime_read
Format for reading access logs. The default matches: 01/Jan/2022:23:35:05 +0000
- Default:
%d/%b/%Y:%H:%M:%S %z
datetime_write
Format for writing html webpage. The default matches: Mon 01 Jan 18:35
- Default:
%a %d %b %H:%M
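As a minimal sketch of how the two default formats behave, the snippet below parses the sample log timestamp quoted above and reformats it for output. It uses only the standard `datetime` module; the constant names are illustrative and are not taken from dlcounter.

```python
from datetime import datetime

# Default formats quoted from the [DATETIME] section.
DATETIME_READ = "%d/%b/%Y:%H:%M:%S %z"
DATETIME_WRITE = "%a %d %b %H:%M"

# Parse the timestamp as it appears in an access log line.
parsed = datetime.strptime("01/Jan/2022:23:35:05 +0000", DATETIME_READ)

# Reformat it the way it would be written to the HTML page.
formatted = parsed.strftime(DATETIME_WRITE)
print(formatted)
```

Month and weekday names depend on the active locale. With an English locale this prints `Sat 01 Jan 23:35`.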
Note:
[FILEPATH] and [FILENAMES] options are just strings to search for in the access log file(s). Regex is used to search the log file(s) for: “<path-string> any-characters <file-string>”
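The search described in the note can be sketched as follows. The pattern shape, including the `GET` prefix, is the one given in the documentation of get_record(). Building one pattern per file ending is an assumption, as are the two sample log lines and the helper name `matches`.

```python
import re

# Defaults quoted from the [FILEPATH] and [FILENAMES] sections.
path = "/wp-content/uploads/"
endings = [".zip", ".exe"]

# One pattern per ending: "<path-string> any-characters <file-string>".
# (Assumption: how dlcounter combines several endings is not documented.)
patterns = [f"GET {re.escape(path)}.*{re.escape(end)}" for end in endings]

# Hypothetical access log lines, for illustration only.
hit = ('203.0.113.5 - - [01/Jan/2022:23:35:05 +0000] '
       '"GET /wp-content/uploads/2021/12/tool.zip HTTP/1.1" 200 1024')
miss = ('203.0.113.5 - - [01/Jan/2022:23:36:10 +0000] '
        '"GET /index.html HTTP/1.1" 200 512')

def matches(line):
    """Return True if any pattern is found anywhere in the line."""
    return any(re.search(pattern, line) for pattern in patterns)

print(matches(hit), matches(miss))
```

Because the strings are passed through `re.escape`, characters such as `.` in `.zip` are matched literally rather than as regex wildcards.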
- dlcounter.check_path(name, file)
Check if file exists.
- Parameters
name (string) – File identifier / file name.
file (string) – File path.
- Returns
True or exit.
- Return type
bool
- dlcounter.db_path()
Path to database file.
- Parameters
None –
- Returns
Fully qualified path to SQLite database file.
- Return type
string
- dlcounter.first_item_in_section(cfg, section)
Return the first item from cfg section.
- Parameters
cfg (configparser.ConfigParser) – The ConfigParser object.
section (string) – Section key.
- Returns
First value from section, or empty string.
- Return type
string
- dlcounter.format_datetime_output(dt_string)
Format dt_string as required for html output.
- Parameters
dt_string (datetime) – datetime object.
- Returns
Reformatted datetime string.
- Return type
string
- dlcounter.get_config()
Return dict of arguments from dlcounter.cfg.
List values are retrieved by list_section(). Single values are retrieved by first_item_in_section(). Parameters from the command line and config file are also printed when the --verbose command line argument is passed.
- Parameters
None –
- Returns
Values from configuration file.
- Return type
dict
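A rough sketch of how the two helpers might feed the returned dict, using the standard `configparser` module. The helper bodies are assumptions written to match the documented return values, and the dict keys are illustrative (`acclogs` and `html_out` appear in the documentation of main(); `filenames` is hypothetical).

```python
import configparser

def list_section(cfg, section):
    """Return every value in a section, or [] if the section is missing."""
    if not cfg.has_section(section):
        return []
    return [value for _key, value in cfg.items(section)]

def first_item_in_section(cfg, section):
    """Return the first value in a section, or '' if there is none."""
    values = list_section(cfg, section)
    return values[0] if values else ""

# Config text assembled from the documented defaults.
cfg = configparser.ConfigParser()
cfg.read_string("""
[ACCESSLOGS]
log1 = /var/log/nginx/access.log

[FILENAMES]
file1 = .zip
file2 = .exe

[WEBPAGE]
path = /var/www/html/downloads.html
""")

opt = {
    "acclogs": list_section(cfg, "ACCESSLOGS"),
    "filenames": list_section(cfg, "FILENAMES"),  # hypothetical key
    "html_out": first_item_in_section(cfg, "WEBPAGE"),
}
print(opt)
```

In this sketch a missing section yields an empty list or an empty string, so a caller can treat an absent [WEBPAGE] path as "HTML output disabled".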
- dlcounter.get_db_time(con)
Return most recent timestamp from database.
- Parameters
con (connection) – Connection to the database.
- Returns
datetime.min if timestamp not found.
- Return type
datetime
- dlcounter.get_record(record, pattern)
Return (filename, time) from line when start and end of name are found, or None if not found.
- Parameters
record (string) – One line from access log.
pattern (string) – Regex pattern: f’GET {re.escape(start)}.*{re.escape(end)}’
- Returns
(Short filename string, datetime object) if successful, or None.
- Return type
tuple
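One way such a function could work is sketched below. This is not the dlcounter source: taking the "short filename" to be the text after the last `/`, and reading the timestamp from the bracketed field of the log line, are both assumptions. The sample line is hypothetical.

```python
import re
from datetime import datetime

DATETIME_READ = "%d/%b/%Y:%H:%M:%S %z"  # default datetime_read format

def get_record(record, pattern):
    """Return (short filename, datetime) for a matching line, else None."""
    match = re.search(pattern, record)
    stamp = re.search(r"\[([^\]]+)\]", record)  # text between [ and ]
    if match is None or stamp is None:
        return None
    # Assumption: the short filename is whatever follows the last "/".
    filename = match.group(0).rsplit("/", 1)[-1]
    when = datetime.strptime(stamp.group(1), DATETIME_READ)
    return filename, when

start, end = "/wp-content/uploads/", ".zip"
pattern = f"GET {re.escape(start)}.*{re.escape(end)}"

# Hypothetical access log line, for illustration only.
line = ('203.0.113.5 - - [01/Jan/2022:23:35:05 +0000] '
        '"GET /wp-content/uploads/2021/12/tool.zip HTTP/1.1" 200 1024')

result = get_record(line, pattern)
print(result)
```

The sketch does not check the HTTP status code, so filtering for successful downloads would have to happen elsewhere.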
- dlcounter.get_time(record)
Return timestamp from record.
- Parameters
record (string) – One line from access log.
- Returns
Exit if timestamp not found.
- Return type
datetime
- dlcounter.init_db(logpath, opt)
Initialise database.
Similar to main() but reads all logs that start with ‘logpath’ and does NOT check timestamp before counting. If the database already exists, the old table will be deleted and a new table created.
- Parameters
logpath (string) – Path to access logs.
opt (dict) – Parameters from dlcounter.cfg.
- Return type
None
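Reading every log that starts with 'logpath', including .gz archives, could look roughly like this. The function names `find_logs`, `open_log` and `read_all_lines` are hypothetical; only `glob` and `gzip` from the standard library are used.

```python
import glob
import gzip

def find_logs(logpath):
    """Return all files whose path starts with logpath.

    For '/var/log/nginx/access.log' this would include access.log,
    access.log.1, access.log.2.gz and so on. The result is sorted by
    name, which is not necessarily chronological order.
    """
    return sorted(glob.glob(glob.escape(logpath) + "*"))

def open_log(path):
    """Open a plain or gzip-compressed log in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")

def read_all_lines(logpath):
    """Yield every line from every matching log file."""
    for path in find_logs(logpath):
        with open_log(path) as log:
            yield from log
```

A caller would iterate over `read_all_lines('/var/log/nginx/access.log')` and count each matching line without comparing timestamps, in line with the description above.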
- dlcounter.list_section(cfg, section)
Return list of values from cfg section.
- Parameters
cfg (configparser.ConfigParser) – The ConfigParser object.
section (string) – Section key.
- Returns
List of zero or more values.
- Return type
list
- dlcounter.log_to_sql(con, file, searchstring, timecheck=None)
Copy download data from log file to database.
Read one log file and update database. Updating is handled by update_db().
- Parameters
con (connection) – Connection to the database.
file (_io.TextIOWrapper) – Pointer to access log.
searchstring (string) – Regex pattern for filename in log.
timecheck (datetime) – Downloads older than this are ignored. If None (initialisation), timestamps are not checked.
- Returns
True when database has been modified.
- Return type
bool
- dlcounter.main(opt)
Count downloads from access logs.
Search for file names in the access logs that match the search criteria, and update the database as necessary. The database is created automatically if it does not exist. Downloads with timestamps older than the last update are ignored.
- Parameters
opt (dict) – Contains string values: acclogs, searchstring, and html_out.
- Return type
None
- dlcounter.print_table()
Print contents of database to stdout.
This function is used only with the --verbose option.
- Parameters
None –
- Return type
None
- dlcounter.sql_table(con)
Create ‘downloads’ table if it doesn’t exist.
- Parameters
con (connection) – Connection to the database.
- Return type
None
- dlcounter.time_format(readf='', writef='')
Return the date-time format.
Values from config for reading access logs and writing html. Call either time_format.read or time_format.write.
- Parameters
readf (string, default '') – Time format for reading access logs.
writef (string, default '') – Time format for writing html.
- dlcounter.read
- Type
string
- dlcounter.write
- Type
string
- Return type
None
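The `read` and `write` attributes suggest that the formats are stored on the function object itself. A minimal sketch of that pattern, assuming empty arguments leave the stored value unchanged (an assumption, not confirmed by the source):

```python
def time_format(readf="", writef=""):
    """Store the formats as attributes of the function itself."""
    if readf:
        time_format.read = readf
    if writef:
        time_format.write = writef

# Set both formats once, for example after reading the config file.
time_format(readf="%d/%b/%Y:%H:%M:%S %z", writef="%a %d %b %H:%M")

# Any later code can then look the values up without passing them around.
print(time_format.read)
print(time_format.write)
```

Function attributes act here as a small piece of module-level state, which avoids threading the two format strings through every call.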
- dlcounter.update_db(con, fname, timestamp)
Update database.
If fname exists in database, update its download total and timestamp, else insert it into the database with a count of 1.
- Parameters
con (connection) – Connection to the database.
fname (string) – Name of the downloaded file.
timestamp (datetime) – Timestamp of download.
- Return type
None
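The update-or-insert logic can be sketched with the standard `sqlite3` module. The table name 'downloads' comes from sql_table(); the column names (`filename`, `total`, `timestamp`) and storing the timestamp as ISO text are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timezone

def update_db(con, fname, timestamp):
    """Increment the count for fname, or insert it with a count of 1."""
    stamp = timestamp.isoformat()
    cur = con.execute(
        "UPDATE downloads SET total = total + 1, timestamp = ? "
        "WHERE filename = ?",
        (stamp, fname),
    )
    if cur.rowcount == 0:  # no existing row was updated
        con.execute(
            "INSERT INTO downloads (filename, total, timestamp) "
            "VALUES (?, 1, ?)",
            (fname, stamp),
        )
    con.commit()

# In-memory database with a hypothetical schema.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE IF NOT EXISTS downloads "
    "(filename TEXT PRIMARY KEY, total INTEGER, timestamp TEXT)"
)

when = datetime(2022, 1, 1, 23, 35, 5, tzinfo=timezone.utc)
update_db(con, "tool.zip", when)
update_db(con, "tool.zip", when)

total = con.execute(
    "SELECT total FROM downloads WHERE filename = ?", ("tool.zip",)
).fetchone()[0]
print(total)
```

After the two calls the row for `tool.zip` holds a total of 2: the first call inserts the row and the second increments it.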
- dlcounter.write_html(con, htmlfile)
Write sql data to web page.
- Parameters
con (connection) – Connection to the database.
htmlfile (string) – Path to html output file.
- Return type
None