I've got a proxy written in Django which receives requests for certain files. After deciding whether the user is allowed to see the file, the proxy gets the file from a remote service and serves it to the user. There's a bit more to it, but this is the gist.
This setup works great for single files, but there is a new requirement: users want to download multiple files together as a zip. The files are sometimes small, but can also be really large (100MB plus), and a single download can contain anywhere from 2 up to 1000 files. The resulting archive can become really large, and it would be a burden to first fetch all those files, zip them and then serve them in the same request.
I read about the possibility of creating "streaming zips": a way to open a zip and then keep sending the files in that zip until you close it. I found a couple of PHP examples and, in Python, the django-zip-stream extension. They all assume locally stored files, and the Django extension also assumes the use of nginx.
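To make the idea concrete, here is a minimal sketch of a streaming zip using only Python's standard `zipfile` module, which can write to a non-seekable stream. The names `ChunkSink` and `stream_zip` are made up for illustration, and the entries are assumed to arrive as `(name, iterable of byte chunks)` pairs from wherever the files are fetched. It uses `ZIP_STORED`, so nothing is compressed and the CPU cost should be limited to the CRC calculation.

```python
import zipfile


class ChunkSink:
    """Hypothetical write-only, non-seekable sink.

    It has no tell()/seek(), so zipfile treats it as an unseekable stream
    and writes the sizes and CRC after each member's data.
    """

    def __init__(self):
        self._chunks = []

    def write(self, data):
        stored = bytes(data)
        self._chunks.append(stored)
        return len(stored)

    def flush(self):
        pass

    def drain(self):
        """Return everything written since the last drain and forget it."""
        data = b"".join(self._chunks)
        self._chunks.clear()
        return data


def stream_zip(entries):
    """Yield the bytes of a zip archive piece by piece.

    entries: iterable of (name, iterable_of_byte_chunks) pairs (an assumption
    about how the remote files are delivered).
    """
    sink = ChunkSink()
    with zipfile.ZipFile(sink, mode="w", compression=zipfile.ZIP_STORED) as archive:
        for name, chunks in entries:
            info = zipfile.ZipInfo(name)  # ZipInfo defaults to ZIP_STORED
            # Sizes are unknown up front, so allow members larger than 2 GiB.
            with archive.open(info, mode="w", force_zip64=True) as member:
                for chunk in chunks:
                    member.write(chunk)
                    data = sink.drain()
                    if data:
                        yield data
    # Closing the archive writes the central directory; send that last.
    data = sink.drain()
    if data:
        yield data
```

In Django, a generator like this could presumably be handed to `StreamingHttpResponse(stream_zip(entries), content_type="application/zip")`. Only one chunk at a time is buffered here, so memory use depends on the chunk size rather than on the file size. Whether every unzip tool accepts archives written this way is not something this sketch verifies.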
There are a couple things I wonder about in my situation:
- I don't have the files stored locally. I can get them with an async/await structure and fetch the next file while serving the current one. That would mean I always have two files in memory (the one I'm currently serving, and the next one I'm getting from the source server).
- Unfortunately I don't have control over the web servers which will serve this. I can of course put an nginx container in front of it, but I don't think nginx could serve from files I store in Python vars because I get them from the source server.
- Whether I do this in Python or let nginx do the zipping, I presume the CPU cycles needed for this would be substantial.
Does anybody know whether streaming zips are a good idea with my setup of very large remote files? I'm a bit afraid that many requests will easily DoS our servers because of CPU or memory limits.
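The "two files in memory" approach from the first point could look roughly like the sketch below. It assumes a hypothetical coroutine function `fetch(name)` that returns the full contents of one remote file as bytes; `prefetch_one_ahead` is also just an illustrative name.

```python
import asyncio


async def prefetch_one_ahead(names, fetch):
    """Yield (name, data) in order while the next download runs in the background.

    fetch: hypothetical coroutine function, name -> bytes.
    At most two payloads exist at once: the one being served and the one
    being fetched.
    """
    names = iter(names)
    current = next(names, None)
    if current is None:
        return
    pending = asyncio.ensure_future(fetch(current))
    try:
        for upcoming in names:
            data = await pending
            # Start the next download before handing out the current file.
            pending = asyncio.ensure_future(fetch(upcoming))
            yield current, data
            current = upcoming
        yield current, await pending
    finally:
        # If the consumer stops early, do not leave a download running.
        pending.cancel()
```

With 100MB-plus files this still holds whole files in memory per request, so fetching and forwarding in smaller chunks may be the safer variant if the source server supports it.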
I could also build a queue which zips the files and sends an email to the user, but I'd like to keep the application as stateless as possible.
All tips are welcome!
from Streaming zip in Django for large non-local files possible?