Google App Engine for reals

Issue #272 new
Matt Chaput
repo owner created an issue

With the new codec layer, it would be easier to implement Whoosh on top of Google App Engine's db.

  • Store posting blocks as blobs under {{{segment_# > term > block_#}}} keys.

  • Only allow bytes columns, and store column values under {{{segment_# fieldname > doc_#}}} keys? Or store each column as a single blob? (The 1 MB limit is a problem for that.)
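A rough sketch of what the two key layouts above could look like, assuming a plain key-value interface. Everything here is hypothetical: `DictStore`, `posting_key`, `column_key`, `write_postings` and `read_postings` are illustration names, not part of Whoosh or the GAE SDK, and a dict stands in for the datastore so the sketch runs anywhere. The size cap is an assumed budget chosen to stay under the 1 MB limit.

```python
# Hypothetical sketch: none of these names come from Whoosh or the GAE SDK.
MAX_BLOB_BYTES = 1000 * 1000  # assumed per-entity budget, kept under the 1 MB limit


def posting_key(segment, term, block):
    """Key for one posting block: segment_# > term > block_#."""
    return "seg_%d > %s > block_%d" % (segment, term, block)


def column_key(segment, fieldname, docnum):
    """Key for one column value: segment_# fieldname > doc_#."""
    return "seg_%d %s > doc_%d" % (segment, fieldname, docnum)


class DictStore(object):
    """Stand-in for a key-value datastore; a dict keeps the sketch runnable."""

    def __init__(self, max_bytes=MAX_BLOB_BYTES):
        self.max_bytes = max_bytes
        self.data = {}

    def put(self, key, blob):
        # Refuse blobs that would not fit in a single entity.
        if len(blob) > self.max_bytes:
            raise ValueError("blob for %r exceeds %d bytes" % (key, self.max_bytes))
        self.data[key] = blob

    def get(self, key):
        return self.data[key]


def write_postings(store, segment, term, payload):
    """Split encoded postings into blocks that each fit in one entity."""
    size = store.max_bytes
    count = 0
    for start in range(0, len(payload), size):
        store.put(posting_key(segment, term, count), payload[start:start + size])
        count += 1
    return count  # the caller has to remember how many blocks were written


def read_postings(store, segment, term, count):
    """Reassemble the postings from their blocks, in block order."""
    return b"".join(store.get(posting_key(segment, term, i)) for i in range(count))
```

The sketch cuts the payload at arbitrary byte offsets, which is only good enough to show the key scheme. A real codec would presumably cut at posting boundaries so each block can be decoded on its own, and would need somewhere to record the block count per term.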

It probably wouldn't be fast. And I don't actually think using Google App Engine is a good idea. So there's that.

Comments (2)

  1. Steven Ourada

    For those of us who think that GAE is a good idea, about how much work do you think this would be? In my case, I want to use it for a fairly small collection (50k) of tiny documents (500 chars). It looks like gae.py is a bit out of sync with the rest of the system in 2.5.4, because it choked when I tried to use it. I was thinking along the lines you outline above, since in the GAE world, small entities indexed for direct retrieval by key are the way to go. I haven't looked deeply enough at the storage architecture to know exactly where to plug in, but your description above might give me the pointers I need.

    Thanks for putting Whoosh out there!

  2. Steven Ourada

    (I take back what I said about gae.py. I was able to get it working on a small sample, after I fixed my code and did a workaround for some other little issue. I'll keep working on that path. But I still like the idea of storing the postings in separate entities.)
