1. I think the tests should probably mock the AWS web API (basically boto itself) so we don't depend on hitting the API when running the tests (e.g. we don't need to test boto itself).
2. One thing I noticed about the multipart write streaming is that doing a write() that *itself* causes you to go over the write buffer doesn't actually write the data to S3. So I think that the buffer size check should be after the super.write() call in S3BotoStorageFile.
However, I think the changes are ok for now so I'll go ahead and accept this and clean it up a bit later.
Currently that is taken care of in close(). But that is a reasonable suggestion and probably a better idea. The next step with this would be to spread it over several threads probably using multiprocessing. Also, failed upload attempts should be retried. It is likely that some segments of large files will fail to upload occasionally.
I am interested in continuing to work on this. I would also like to implement reads using a buffer type approach as well so that files could be streamed easily... basically an improvement on my older suggestion that only relies on SpooledTemporaryFile to avoid memory errors. But that is for another discussion.
Since this issue is closed can we discuss further by email?
I know it's taken care of in close() and I know that's necessary but I think it would be *better* taken care of in write() if the buffer goes over in the current write rather than wait for the next write or close. I'll probably be fixing this later today anyway. It's a pretty simple change.
Currently the buffered write approach is a good one and I think that's ok as it is however reading is, like you said, not streamed so you're right that that could be improved. If you have any suggestions or want to create a pull request for that I would be very grateful.
On a related note, I also think I'll update the tests to mock out the boto library and use tox to run the tests against multiple Django versions. I'll be working on that today during the Django sprints at PyCon.
I'd rather not do things using private email. Maybe we should get around to making a mailing list for django-storages?
That all sounds good. The tests were just a starting point to get the ball rolling. I ran into some issues using storage to open a file for writing so I figured tests would prevent that in the future.
I have not used tox before but it seems rather interesting.
I am confused by your comment to remove boto from the tests. I do not understand how you can remove boto from the tests since they use boto indirectly via the storage backend.
Finally, a mailing list for django-storages would be great.
I meant that I would like to mock it out using the mock library (http://pypi.python.org/pypi/mock/) because I don't want to have to hit AWS every time we do a test. We aren't really testing boto itself anyway.
I'll talk to Jannis Leidel about a mailing list and maybe get it set up today.