tse2sql.utils

Utilities module.

Functions

  • is_url(): Determine if the given string is a URL.

  • ensure_dir(): Ensure that a path exists.

  • download(): Download the given file to the system temporary files folder.

  • sha256(): Calculate SHA256 of given filename.

  • unzip(): Unzip given filename.

  • get_file(): Get a file from a directory, matching its name case-insensitively.

  • count_lines(): Count the number of lines in filename.

tse2sql.utils.is_url(url)

Determine if the given string is a URL.

Parameters

url (str) – String to check if it is a URL.

Returns

True if it is a URL, False otherwise.

Return type

bool
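The check described above can be approximated with the standard library. This is a minimal sketch, not the module's actual implementation: the name is_url_sketch is hypothetical, and it assumes that "is a URL" means the string has both a scheme and a network location.

```python
from urllib.parse import urlparse


def is_url_sketch(url):
    """Hypothetical stand-in for is_url(); the real check may differ."""
    parsed = urlparse(url)
    # Assumption: a URL needs both a scheme (e.g. "https") and a host.
    return bool(parsed.scheme and parsed.netloc)


print(is_url_sketch("https://example.com/data.zip"))  # True
print(is_url_sketch("/tmp/data.zip"))                 # False
```

Under this assumption, local paths and bare filenames are rejected because they carry no scheme or host.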

tse2sql.utils.ensure_dir(path)

Ensure that a path exists.

Parameters

path (str) – Directory path to create.
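One way to get this behavior is os.makedirs with exist_ok=True. The sketch below is an assumption about how such a helper could work, and ensure_dir_sketch is a hypothetical name, not the module's code.

```python
import os
import tempfile


def ensure_dir_sketch(path):
    """Hypothetical stand-in for ensure_dir()."""
    # Creates missing parent directories too; does nothing if path exists.
    os.makedirs(path, exist_ok=True)


base = tempfile.mkdtemp()
target = os.path.join(base, "a", "b")
ensure_dir_sketch(target)
ensure_dir_sketch(target)  # calling again on an existing path is harmless
print(os.path.isdir(target))  # True
```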

tse2sql.utils.download(url, subdir=None)

Download the given file to the system temporary files folder.

Parameters
  • url (str) – URL of the file to download.

  • subdir (str) – Subfolder of the system temporary files folder in which to store the downloaded file.

Returns

Local path where the file was stored.

Return type

str
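The following sketch illustrates the documented contract using only the standard library. It is not the module's implementation: download_sketch is a hypothetical name, and deriving the local filename from the last component of the URL path (with a "download" fallback) is an assumption.

```python
import os
import shutil
import tempfile
from urllib.parse import urlparse
from urllib.request import urlopen


def download_sketch(url, subdir=None):
    """Hypothetical stand-in for download(); returns the local path."""
    dest_dir = tempfile.gettempdir()
    if subdir is not None:
        dest_dir = os.path.join(dest_dir, subdir)
    os.makedirs(dest_dir, exist_ok=True)

    # Assumption: the local name is the last component of the URL path.
    name = os.path.basename(urlparse(url).path) or "download"
    dest = os.path.join(dest_dir, name)

    # Stream the response to disk instead of loading it into memory.
    with urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Called with a URL and a subdir, this sketch returns a path inside that subfolder of tempfile.gettempdir().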

tse2sql.utils.sha256(filename)

Calculate SHA256 of given filename.

Parameters

filename (str) – Filename to calculate SHA256 from.

Returns

SHA256 hexadecimal digest.

Return type

str
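A digest like this can be computed with hashlib. The sketch below is an illustration, not the module's code: sha256_sketch and its chunk_size parameter are hypothetical, and reading in fixed-size chunks is an assumption made so that large files do not need to fit in memory.

```python
import hashlib
import os
import tempfile


def sha256_sketch(filename, chunk_size=65536):
    """Hypothetical stand-in for sha256(); returns a hex digest string."""
    digest = hashlib.sha256()
    with open(filename, "rb") as fd:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fd.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


sample = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample, "wb") as fd:
    fd.write(b"abc")
print(sha256_sketch(sample))
```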

tse2sql.utils.unzip(filename)

Unzip given filename.

The extraction folder is derived from the archive filename by removing its extension while keeping its parent folder.

Parameters

filename (str) – Path to the zip to extract.

Returns

The path where the archive was extracted.

Return type

str
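The folder naming rule can be sketched with os.path.splitext and zipfile. This is an assumption about the behavior, not the module's implementation, and unzip_sketch is a hypothetical name. Under this reading, an archive at some/dir/data.zip is extracted to some/dir/data.

```python
import os
import tempfile
import zipfile


def unzip_sketch(filename):
    """Hypothetical stand-in for unzip(); returns the extraction folder."""
    # Drop only the extension, so the parent folder is preserved.
    extract_dir, _ext = os.path.splitext(filename)
    with zipfile.ZipFile(filename) as archive:
        archive.extractall(extract_dir)
    return extract_dir


work = tempfile.mkdtemp()
archive_path = os.path.join(work, "data.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("hello.txt", "hi\n")

extracted = unzip_sketch(archive_path)
print(os.listdir(extracted))  # ['hello.txt']
```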

tse2sql.utils.get_file(search_dir, filename)

Get a file from a directory, matching its name case-insensitively.

Parameters
  • search_dir (str) – Directory to look for filename.

  • filename (str) – Case-insensitive filename to look for.

Returns

The absolute path to the file.

Return type

str

Raises

Exception if file not found.
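A case-insensitive lookup of this kind can be sketched by comparing lowercased directory entries. The code below is illustrative only: get_file_sketch is a hypothetical name, and the exact matching rule and error message are assumptions.

```python
import os
import tempfile


def get_file_sketch(search_dir, filename):
    """Hypothetical stand-in for get_file(); returns an absolute path."""
    wanted = filename.lower()
    for entry in os.listdir(search_dir):
        # Compare lowercased names so "SAMPLE.txt" matches "Sample.TXT".
        if entry.lower() == wanted:
            return os.path.abspath(os.path.join(search_dir, entry))
    raise Exception("File not found: {}".format(filename))


folder = tempfile.mkdtemp()
with open(os.path.join(folder, "Sample.TXT"), "w") as fd:
    fd.write("data\n")

found = get_file_sketch(folder, "sample.txt")
print(os.path.basename(found))  # Sample.TXT
```

The returned path keeps the name as it is stored on disk, not the casing that was passed in.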

tse2sql.utils.count_lines(filename)

Count the number of lines in filename.

Parameters

filename (str) – Path to the file.

Returns

The number of lines in the file.

Return type

int
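Line counting can be done by iterating over the file object, which avoids reading the whole file at once. This is a minimal sketch under that assumption; count_lines_sketch is a hypothetical name and the module may count lines differently (for example, in how it treats a final line without a trailing newline).

```python
import os
import tempfile


def count_lines_sketch(filename):
    """Hypothetical stand-in for count_lines()."""
    count = 0
    # Binary mode sidesteps encoding errors; lines are split on b"\n".
    with open(filename, "rb") as fd:
        for _line in fd:
            count += 1
    return count


sample_lines = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(sample_lines, "w") as fd:
    fd.write("a\nb\nc\n")
print(count_lines_sketch(sample_lines))  # 3
```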