[wait-for-feedback] FlushError: Instance <UserLog at 0x7f2cecab6f10> has a NULL identity key.

Issue #268 closed
Stephan Riedel created an issue

Hi,

I'm currently trying to migrate the kallithea database from from sqlite to postgresql via pgloader.

Kallithea version: 0.3.2

# sqlite3 --version
3.8.7.1 2014-10-29 13:59:56 3b7b72c4685aa5cf5e675c2c47ebec10d9704221
# psql --version
psql (PostgreSQL) 9.4.9
# pgloader --version
pgloader version "3.2.2"
compiled with SBCL 1.3.1.debian

When trying to push or pull a repository, I get an HTTP error 500 and the following stack trace in the syslog:

Feb  7 20:19:39 bsserv24 paster[8288]: 2017-02-07 20:19:39.883 ERROR [kallithea.lib.middleware.simplegit] Traceback (most recent call last):
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/kallithea/lib/middleware/simplegit.py", line 203, in _handle_request
Feb  7 20:19:39 bsserv24 paster[8288]: self._handle_githooks(repo_name, action, baseui, environ)
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/kallithea/lib/middleware/simplegit.py", line 296, in _handle_githooks
Feb  7 20:19:39 bsserv24 paster[8288]: log_pull_action(ui=baseui, repo=_repo._repo)
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/kallithea/lib/hooks.py", line 135, in log_pull_action
Feb  7 20:19:39 bsserv24 paster[8288]: action_logger(user, action, ex.repository, ex.ip, commit=True)
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/kallithea/lib/utils.py", line 206, in action_logger
Feb  7 20:19:39 bsserv24 paster[8288]: sa.commit()
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 710, in commit
Feb  7 20:19:39 bsserv24 paster[8288]: self.transaction.commit()
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 368, in commit
Feb  7 20:19:39 bsserv24 paster[8288]: self._prepare_impl()
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 347, in _prepare_impl
Feb  7 20:19:39 bsserv24 paster[8288]: self.session.flush()
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1734, in flush
Feb  7 20:19:39 bsserv24 paster[8288]: self._flush(objects)
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1809, in _flush
Feb  7 20:19:39 bsserv24 paster[8288]: flush_context.finalize_flush_changes()
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 347, in finalize_flush_changes
Feb  7 20:19:39 bsserv24 paster[8288]: self.session._register_newly_persistent(state)
Feb  7 20:19:39 bsserv24 paster[8288]: File "/srv/kallithea/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1323, in _register_newly_persistent
Feb  7 20:19:39 bsserv24 paster[8288]: "a load() event." % mapperutil.state_str(state)
Feb  7 20:19:39 bsserv24 paster[8288]: FlushError: Instance <UserLog at 0x7f2cecab6f10> has a NULL identity key.  If this is an auto-generated value, check that the database table allows generation of new primary key values, and that the mapped Column object is configured to expect these generated values.  Ensure also that this flush() is not occurring at an inappropriate time, such as within a load() event.

Google tells me, that this might be connecte to the AUTOINCREMENT feature of sqlite but on first sight, all tables/columns using AUTOINCREMENT in sqlite were converted correctly to postgres (I think at least).

Can you give me a hint where Kallithea is crashing or which tables might get corrupted during conversion?

Also, is there any official guide on how to convert the Kallithea sqlite DB to postgres? The grate command (as proposed in the docs) requires some metadata for the migration. Where can I find it?

Comments (11)

  1. Mads Kiilerich

    I guess it is crashing because a problem at the database layer - not in Kallithea itself.

    Our documentation mentions SQLAlchemyGrate - but only because someone mentioned it worked.

    Are you using a database created by Kallithea and loaded with data from the migration, or has the PG database been created by your tool? Then I suggest creating a clean PG database and load your data into it. Perhaps by using pg_dump and schema only vs data only dumps.

  2. Stephan Riedel reporter

    Yes, I think it's a problem on database layer, but I think Kallithea is trying to do a query on key or column which does not exist (maybe was not imported correctly). Can you check what kind of query this is? The stack trace mentions "UserLog".

    I created a clean database by hand and then used pgloader to import the scheme and the tables. pgloader is one of the recommended tools on the postgres page to migrate data from sqlite to PG.

    Correct me if I'm wrong, but pg_dump is just to dump the contents of the PG database. How do I use it to migrate from sqlite to PG?

  3. Mads Kiilerich

    It looks like the problem not is a query but that sqlalchemy get userlog entries without primary key. No doubt sqlite allows that, but it is surprising that PG with a correct schema allows it. Try to check it in your database.

    IIRC, You can get queries listed by setting debug in .ini or by increasing the logging level.

    Yes, pg_dump is only for pg to pg but it can be used to load old data into a new database schema and thus guarantee correct structure. That might also be a part of your "created a clean database by hand".

  4. Stephan Riedel reporter

    Where can I get the Kallithea DB scheme for PG?

    I only did a

    postgres=# CREATE DATABASE kallithea;
    

    The tables and scheme were created by pgloader based on the sqlite DB.

    Can I just alter all the tables by adding a primary key? e.g.

    kallithea=# \d cache_invalidation;
     Table "public.cache_invalidation"
        Column    |  Type   | Modifiers
    --------------+---------+-----------
     cache_id     | bigint  |
     cache_key    | text    |
     cache_args   | text    |
     cache_active | boolean |
    Indexes:
        "key_idx" btree (cache_key)
    
    kallithea=# alter table cache_invalidation add primary key (cache_id);
    ALTER TABLE
    kallithea=# \d cache_invalidation;
     Table "public.cache_invalidation"
        Column    |  Type   | Modifiers
    --------------+---------+-----------
     cache_id     | bigint  | not null
     cache_key    | text    |
     cache_args   | text    |
     cache_active | boolean |
    Indexes:
        "cache_invalidation_pkey" PRIMARY KEY, btree (cache_id)
        "key_idx" btree (cache_key)
    

    If yes, do I have to do this for all columns or just for the one which had AUTOINCREMENT in sqlite?

  5. Mads Kiilerich

    You can get a the Kallithea DB scheme for PG by creating a new kallithea database with paster setup-db, then make a pg_dump with only the scema.

    I don't know pgloader (or other conversion tools) but generally you can only expect them to convert (most of) the data - they do inherently not have enough information to create the right schema.

  6. Stephan Riedel reporter

    Can I work with a copy of my ini file while Kallithea is still running on sqlite (for testing and to reduce downtime), e.g.

    paster setup-db <path to my ini copy>
    

    According to the docs, this will ask me for the root path to the repositories. Will paster create a repository structure under this path when calling the command? If yes, I'd rather use an intermediate path (e.g. /tmp/repos).

  7. Stephan Riedel reporter

    Any hints? Can I work with a copy of my kallithea ini file and an path to an intermediate repo (see my last comment)?

  8. Mads Kiilerich

    You can have multiple .ini files pointing at different databases.

    Separate repo path is not important. It can be shared between instances ... even though changes not necessarily will show up until rescanning.

    The simplest and safest way to play around with this will probably be to use a separate machine for experiments.

  9. Log in to comment