1. Michael Bayer
  2. sqlalchemy
  3. Issues

Issues

Issue #2826 resolved

New recipe -- solution to select through large number of rows for SQLITE

Anonymous created an issue

Sqlite does not support the window/range function described in this recipe

http://www.sqlalchemy.org/trac/wiki/UsageRecipes/WindowedRangeQuery

We have developed a solution along the same lines using LIMIT/OFFSET, and wanted to share this back with the community.

def _yield_limit(qry, pk, maxrq=1000):
    r"""specialized windowed query generator (using LIMIT/OFFSET)

    This recipe is to select through a large number of rows thats too
    large to fetch at once. The technique depends on the primary key
    of the FROM clause being an integer value, and selects items
    surrounded with LIMIT/OFFSET"""

    key = pk.property.key
    nextid = qry.session.query(pk).order_by(key).\
                limit(1).scalar()

    if nextid is None:
        return

    while 1:
        count = 0
        for rec in qry.filter(pk >= nextid).limit(maxrq):
            # Retrieve the key value before yielding as the
            # caller could delete the object.
            nextid = rec.__getattribute__(key) + 1
            count += 1
            yield rec

        # If we received less than the number of items we requested
        # we have reached the end.

        if count != maxrq:
            return

Comments (10)

  1. Michael Bayer repo owner

    this is pretty trivial and here's how to make sure it's ordering correctly:

    def _yield_limit(qry, pk_attr, maxrq=1000):
        r"""specialized windowed query generator (using LIMIT/OFFSET)
    
        This recipe is to select through a large number of rows thats too
        large to fetch at once. The technique depends on the primary key
        of the FROM clause being an integer value, and selects items
        surrounded with LIMIT/OFFSET"""
    
    
        firstid = qry.session.query(func.min(pk_attr)).scalar()
    
        while firstid is not None:
            rec = None
            for rec in qry.filter(pk_attr >= firstid).order_by(pk_attr).limit(maxrq):
                yield rec
            firstid = pk_attr.__get__(rec) if rec else None
    

    maybe we could make some kind of category out of the existing limit recipe, e.g. https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/WindowedRangeQuery

  2. Dan Sully

    Hey Mike - your updated recipe doesn't quite work - it seems like get(rec) requires a second argument. Also, firstid is never set to None, thus the loop doesn't end. At least for my test case.

    I'd love to see this become a category, as I'm unfortunately not able to use a DB that can use the WindowedRangeQuery recipe.

    Thanks

  3. Michael Bayer repo owner
    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    
    Base = declarative_base()
    
    class A(Base):
        __tablename__ = 'a'
    
        id = Column(Integer, primary_key=True)
    
    e = create_engine("sqlite://", echo=True)
    Base.metadata.create_all(e)
    
    sess = Session(e)
    
    sess.add_all([A() for i in range(2000)])
    
    def _yield_limit(qry, pk_attr, maxrq=100):
        r"""specialized windowed query generator (using LIMIT/OFFSET)
    
        This recipe is to select through a large number of rows thats too
        large to fetch at once. The technique depends on the primary key
        of the FROM clause being an integer value, and selects items
        surrounded with LIMIT/OFFSET"""
    
        firstid = None
        while True:
            q = qry
            if firstid is not None:
                q = qry.filter(pk_attr > firstid)
            rec = None
            for rec in q.order_by(pk_attr).limit(maxrq):
                yield rec
            if rec is None:
                break
            firstid = pk_attr.__get__(rec, pk_attr) if rec else None
    
    for rec in _yield_limit(sess.query(A), A.id):
        print(rec.id, rec)
    
  4. Log in to comment