Helpers

These helper functions provide the functionality that FastDownload relies on. Most users should use FastDownload rather than calling these helpers directly.
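
For typical use, the FastDownload class documented below does everything in one call; a minimal sketch:

from fastdownload import FastDownload

# Download the archive (if needed) and extract it, returning the extracted path
path = FastDownload().get('https://s3.amazonaws.com/fast-ai-sample/mnist_tiny.tgz')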

from pathlib import Path
import fastdownload
from fastdownload import *

dest = Path('tmp')
url = 'https://s3.amazonaws.com/fast-ai-sample/mnist_tiny.tgz'

download_url[source]

download_url(url, dest=None, timeout=None, show_progress=True)

Download url to dest and show progress

dest.mkdir(exist_ok=True)
fpath = download_url(url, dest)
fpath
100.54% [344064/342207 00:00<00:00]
Path('tmp/mnist_tiny.tgz')

path_stats[source]

path_stats(fpath)

Size and hash of fpath, returned as a (size, hash) tuple

path_stats(fpath)
(342207, '56143e8f24db90d925d82a5a74141875')

checks_module[source]

checks_module(module)

Location of download_checks.py

The download_checks.py file containing sizes and hashes will be located next to module:

mod = checks_module(fastdownload)
mod
Path('git/fastdownload/fastdownload/download_checks.py')

read_checks[source]

read_checks(fmod)

Evaluated contents of download_checks.py

assert read_checks({}) == {}

check[source]

check(fmod, url, fpath)

True if the size and hash of fpath match the stored data for url, or if no data is stored for url
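
For example (an illustrative call; the result depends on what, if anything, is stored for url in mod), check compares the file downloaded earlier against the checks file found by checks_module:

# True if fpath matches the stored size/hash for url, or if nothing is stored yet
check(mod, url, fpath)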

update_checks[source]

update_checks(fpath, url, fmod)

Store the hash and size of fpath for url in download_checks.py

if mod.exists(): mod.unlink()
update_checks(fpath, url, mod)
read_checks(mod)
{'https://s3.amazonaws.com/fast-ai-sample/mnist_tiny.tgz': (342207,
  '56143e8f24db90d925d82a5a74141875')}

download_and_check[source]

download_and_check(url, fpath, fmod, force)

Download url to fpath, unless fpath already exists and passes check (a download is always performed if force)
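
FastDownload.download (below) wraps this helper; as a minimal sketch of calling it directly, reusing url, fpath and mod from above:

# Skips the download if fpath already exists and matches the checks stored in mod;
# otherwise downloads a fresh copy and verifies it
download_and_check(url, fpath, mod, force=False)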

class FastDownload[source]

FastDownload(cfg=None, base='~/.fastdownload', archive=None, data=None, module=None)

d = FastDownload(module=fastdownload)
d.module
Path('git/fastdownload/fastdownload/download_checks.py')

The config.ini file will be created (if it doesn't exist) in {base}/config.ini:

d.cfg.config_file
Path('.fastdownload/config.ini')
print(d.cfg.config_file.read_text())
[DEFAULT]
data = /home/jhoward/.fastdownload/data
archive = /home/jhoward/.fastdownload/archive
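
The storage locations can be customised when constructing FastDownload; a minimal sketch (the ~/.myproject base and the directory names are just illustrative):

# Keep the config, downloaded archives and extracted data under a custom base directory
d2 = FastDownload(base='~/.myproject', archive='downloads', data='extracted')
d2.cfg.config_file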


FastDownload.download[source]

FastDownload.download(url, force=False)

Download url to the archive path, unless the archive already exists and passes self.check (a download is always performed if force)

If there is no stored hash and size for url, or the size and hash of the existing file match the stored checks, then download will only download the URL if the destination file does not exist. The destination path will be returned.

if d.module.exists(): d.module.unlink()
arch = d.download(url)
arch
100.54% [344064/342207 00:00<00:00]
Path('.fastdownload/archive/mnist_tiny.tgz')

FastDownload.update[source]

FastDownload.update(url)

Store the hash and size in download_checks.py

d.update(url)
eval(d.module.read_text())
{'https://s3.amazonaws.com/fast-ai-sample/mnist_tiny.tgz': (342207,
  '56143e8f24db90d925d82a5a74141875')}

Calling download will now just return the existing file, since the checks match:

d.download(url)
Path('.fastdownload/archive/mnist_tiny.tgz')

If the checks file doesn't match the size or hash of the archive, then a new copy of the file will be downloaded.
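
For instance (an illustrative sketch, not taken from the library's tests), corrupting the local archive makes the next download fetch a fresh copy:

arch.write_bytes(b'not the real archive')   # size/hash no longer match the stored checks
arch = d.download(url)                       # so a new copy is downloaded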

FastDownload.extract[source]

FastDownload.extract(url, extract_key='data', force=False)

Extract archive already downloaded from url, overwriting existing if force

extr = d.extract(url, force=True)
extr
Path('.fastdownload/data/mnist_tiny')
extr.ls()
(#5) [Path('.fastdownload/data/mnist_tiny/models'),Path('.fastdownload/data/mnist_tiny/train'),Path('.fastdownload/data/mnist_tiny/labels.csv'),Path('.fastdownload/data/mnist_tiny/valid'),Path('.fastdownload/data/mnist_tiny/test')]

Pass extract_key to use a key other than data from your config file when choosing where to extract the archive:

d.cfg['model_path'] = 'models'
d.extract(url, extract_key='model_path')
Path('.fastdownload/models/mnist_tiny')

FastDownload.rm[source]

FastDownload.rm(url, rm_arch=True, rm_data=True, extract_key='data')

Delete downloaded archive and extracted data for url

d.rm(url)
extr.exists(),arch.exists()
(False, False)

FastDownload.get[source]

FastDownload.get(url, extract_key='data', force=False)

Download and extract url, overwriting existing if force

res = d.get(url)
res,extr.exists()
100.54% [344064/342207 00:00<00:00]
(Path('.fastdownload/data/mnist_tiny'), True)

If the archive doesn't exist, but the extracted data does, then the archive is not downloaded again.

d.rm(url, rm_data=False)
res = d.get(url)
res,extr.exists()
(Path('.fastdownload/data/mnist_tiny'), True)

extract_key works the same way as in FastDownload.extract:

res = d.get(url, extract_key='model_path')
res,res.exists()
(Path('.fastdownload/models/mnist_tiny'), True)