These helper functions provide the functionality that `FastDownload` relies on. Most users should use `FastDownload` rather than calling these helpers directly.
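For context, the high-level interface built on these helpers reduces to a single call. A minimal sketch, using the same sample archive as below:

```python
from fastdownload import FastDownload

# Download, verify, and extract in one step; returns the path of the extracted data.
path = FastDownload().get('https://s3.amazonaws.com/fast-ai-sample/mnist_tiny.tgz')
```

The rest of this section walks through the pieces that this one call is built from.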
```python
from pathlib import Path
from fastdownload import (FastDownload, checks_module, download_url,
                          path_stats, read_checks, update_checks)

dest = Path('tmp')
url = 'https://s3.amazonaws.com/fast-ai-sample/mnist_tiny.tgz'
dest.mkdir(exist_ok=True)
fpath = download_url(url, dest)
fpath
```
```python
path_stats(fpath)
```
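Assuming `path_stats` returns a `(size, hash)` pair, matching the format of the checks file shown below, the two pieces can be unpacked separately:

```python
# Assumption: path_stats returns a (size, hash) pair for the file.
size, hashed = path_stats(fpath)
size, hashed
```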
The `download_checks.py` file containing sizes and hashes will be located next to `module`:
```python
import fastdownload

mod = checks_module(fastdownload)
mod
```
```python
# With no checks module available, there are no stored checks.
assert read_checks({}) == {}
```
```python
if mod.exists(): mod.unlink()    # start from a clean checks file
update_checks(fpath, url, mod)   # record fpath's current size and hash for url
read_checks(mod)
```
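Since `update_checks` stores the result of `path_stats` keyed by URL (as the checks file contents displayed later in this section suggest), the round trip can be verified directly:

```python
# Assumption: read_checks returns a dict mapping each URL to its (size, hash) pair.
stored = read_checks(mod)
assert stored[url] == path_stats(fpath)
```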
```python
d = FastDownload(module=fastdownload)
d.module
```
The `config.ini` file will be created (if it doesn't exist) in `{base}/config.ini`:
```python
d.cfg.config_file
```

```python
print(d.cfg.config_file.read_text())
```
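The config behaves like a dictionary, so entries can be read the same way one is assigned later in this section. Assuming `data` and `archive` are the default keys created with the file (consistent with the `extract_key` discussion below):

```python
# Assumption: 'data' and 'archive' are the default keys in config.ini.
d.cfg['data'], d.cfg['archive']
```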
If there is no stored hash and size for `url`, or if the size and hash match the stored checks, then `download` will only download the URL if the destination file does not exist. The destination path will be returned.
```python
if d.module.exists(): d.module.unlink()   # remove any stored checks first
arch = d.download(url)
arch
```
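The returned path shows where archives are kept. Under the defaults above, its parent should be the directory named by the `archive` config key (an assumption, stated here rather than verified):

```python
# Assumption: downloads land in the directory named by the 'archive' config key.
arch.parent
```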
```python
d.update(url)                # store the size and hash of the downloaded file
eval(d.module.read_text())   # the checks file is a dict literal keyed by URL
```
Calling `download` will now just return the existing file, since the checks match:
```python
d.download(url)
```
If the checks file doesn't match the size or hash of the archive, then a new copy of the file will be downloaded.
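That path isn't exercised by a cell here, so below is a minimal sketch of it: truncating the local archive makes its size and hash disagree with the checks stored by `d.update` above, and the next `download` fetches a fresh file:

```python
# A sketch: corrupt the local archive so its stats no longer match the stored checks.
arch.write_bytes(arch.read_bytes()[:-1])
d.download(url)   # mismatch detected, so a fresh copy is downloaded
```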
`FastDownload.extract` unpacks the downloaded archive into the directory given by the config's `data` key; `force=True` re-extracts even if the destination already exists:

```python
extr = d.extract(url, force=True)
extr
```

```python
extr.ls()
```
Pass `extract_key` to use a key other than `data` from your config file when selecting an archive extraction location:
```python
d.cfg['model_path'] = 'models'
d.extract(url, extract_key='model_path')
```
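Assuming assignments to `d.cfg` are written back to `config.ini` (if they aren't, a `d.cfg.save()` call would be needed; both behaviors are assumptions about the underlying config object), the new key shows up in the file:

```python
# Assumption: setting d.cfg['model_path'] persists it to config.ini.
print(d.cfg.config_file.read_text())
```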
`rm` removes the downloaded archive and, by default, the extracted data as well:

```python
d.rm(url)
extr.exists(), arch.exists()
```
`get` combines the steps above, downloading and extracting as needed, and returns the path of the extracted data:

```python
res = d.get(url)
res, extr.exists()
```
If the archive doesn't exist, but the extracted data does, then the archive is not downloaded again.
```python
d.rm(url, rm_data=False)
res = d.get(url)
res, extr.exists()
```
`extract_key` works the same way as in `FastDownload.extract`:
```python
res = d.get(url, extract_key='model_path')
res, res.exists()
```