I'm using aiohttp to download large files (~150MB-200MB each).
Currently I'm doing this for each file:
```python
import aiohttp
import aiofiles

async def download_file(session: aiohttp.ClientSession, url: str, dest: str):
    chunk_size = 16384
    async with session.get(url) as response:
        async with aiofiles.open(dest, mode="wb") as f:
            # Stream the response body to disk in fixed-size chunks.
            async for data in response.content.iter_chunked(chunk_size):
                await f.write(data)
```
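For context, this is roughly how I drive that coroutine (a minimal sketch; `FILES` is a placeholder list of `(url, dest)` pairs, not my real input):

```python
import asyncio
import aiohttp

# Placeholder input; in reality the list comes from elsewhere.
FILES = [
    ("https://example.com/a.bin", "/tmp/a.bin"),
    ("https://example.com/b.bin", "/tmp/b.bin"),
]

async def main():
    async with aiohttp.ClientSession() as session:
        # One task per file, all running concurrently.
        await asyncio.gather(
            *(download_file(session, url, dest) for url, dest in FILES)
        )

asyncio.run(main())
```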
I create multiple tasks of this coroutine to achieve concurrency, roughly as in the sketch above. I'm wondering:
- What is the best value for `chunk_size`?
- Is calling `iter_chunked(chunk_size)` better than just doing `data = await response.read()` and writing that to disk? In that case, how can I report the download progress? (The first sketch after this list shows the kind of progress reporting I mean.)
- How many tasks running this coroutine should I create? (See the semaphore sketch below.)
- Is there a way to download multiple parts of the same file in parallel, or is that something aiohttp already does? (The range-request sketch below is what I have in mind.)
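On progress reporting: with `response.read()` the whole body arrives at once, so there is nothing incremental to report; with `iter_chunked` I imagine something like this (a sketch only; the percentage print is a stand-in for real reporting, and `Content-Length` may be missing for some responses):

```python
import aiohttp
import aiofiles

async def download_with_progress(session: aiohttp.ClientSession, url: str, dest: str,
                                 chunk_size: int = 16384):
    async with session.get(url) as response:
        # Content-Length can be absent (e.g. chunked transfer encoding), so total may be 0.
        total = int(response.headers.get("Content-Length", 0))
        received = 0
        async with aiofiles.open(dest, mode="wb") as f:
            async for data in response.content.iter_chunked(chunk_size):
                await f.write(data)
                received += len(data)
                if total:
                    print(f"{dest}: {received / total:.1%}")
```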
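On the number of tasks: right now I gather everything at once. I assume the usual way to cap it is a semaphore around the existing coroutine, something like this (sketch; `MAX_CONCURRENT` is an arbitrary placeholder, not a tuned value):

```python
import asyncio
import aiohttp

MAX_CONCURRENT = 4  # arbitrary placeholder, not a recommendation

async def bounded_download(semaphore: asyncio.Semaphore,
                           session: aiohttp.ClientSession, url: str, dest: str):
    # Only MAX_CONCURRENT downloads run at the same time.
    # download_file is the coroutine shown at the top of the question.
    async with semaphore:
        await download_file(session, url, dest)

async def main(files):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *(bounded_download(semaphore, session, url, dest) for url, dest in files)
        )
```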
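And on downloading parts of the same file in parallel, what I mean is roughly HTTP `Range` requests split across tasks (a sketch under the assumption that the server honours `Range` and reports `Content-Length`; each part is buffered in memory before being written at its offset):

```python
import asyncio
import aiohttp
import aiofiles

async def download_range(session: aiohttp.ClientSession, url: str, dest: str,
                         start: int, end: int):
    headers = {"Range": f"bytes={start}-{end}"}
    async with session.get(url, headers=headers) as response:
        data = await response.read()
    # Write this part at its offset in the pre-sized file.
    async with aiofiles.open(dest, mode="r+b") as f:
        await f.seek(start)
        await f.write(data)

async def download_parts(url: str, dest: str, parts: int = 4):
    async with aiohttp.ClientSession() as session:
        async with session.head(url) as response:
            size = int(response.headers["Content-Length"])
        # Pre-allocate the destination file so each part can seek to its own offset.
        async with aiofiles.open(dest, mode="wb") as f:
            await f.truncate(size)
        step = size // parts
        ranges = [(i * step, (i + 1) * step - 1 if i < parts - 1 else size - 1)
                  for i in range(parts)]
        await asyncio.gather(
            *(download_range(session, url, dest, start, end) for start, end in ranges)
        )
```

I don't know whether this actually helps compared to one streamed GET per file, or whether aiohttp has something built in for it; hence the question.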
from aiohttp: fast parallel downloading of large files