Application startup

Issue #211 new
John L. Jegutanis
created an issue

We have received reports from users that nud takes a long time to start.

I have confirmed that the problem exists, but there is no reliable way to reproduce it.

The symptoms are intense disk I/O while the block index is being read, and RPC commands cannot be issued during that time.

Here is the iotop output:

 TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
 5310 be/4 nubits    311.18 K/s 1036.75 K/s  0.00 % 98.97 % nud

Comments (7)

  1. Michael Witrant

    I think the problem is that the client has to scan the whole block and tx database at load time. It only processes blockindex entries, but it still reads through the whole database to find them.

    I think it's not reproducible because after the first load most of the db files are in the OS cache, so the process is very fast the second time if you have enough RAM.

    It's not directly related, but the discussion on memory optimization led to the idea of using another database for the vote data.

    All this makes me think we should try merging Bitcoin 0.8, or at least the switch to LevelDB. It would probably solve this load time problem and make storing the vote data easier.
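
The scan described in this comment can be illustrated with a minimal sketch. It does not use the client's real database code: the in-memory record list and the record type names "blockindex" and "tx" are assumptions made for illustration. It only shows why the load cost follows the size of the whole database rather than the number of block index entries.

```python
# Hypothetical stand-in for the on-disk database: a list of
# ((record_type, key), value) pairs. The names are illustrative only.
def load_block_index(records):
    """Visit every record, but keep only the block index entries."""
    visited = 0
    block_index = {}
    for (record_type, key), value in records:
        visited += 1  # every record is read, whatever its type
        if record_type == "blockindex":
            block_index[key] = value
    return block_index, visited


records = [
    (("tx", "t1"), "raw tx data"),
    (("blockindex", "b1"), {"height": 1}),
    (("tx", "t2"), "raw tx data"),
    (("blockindex", "b2"), {"height": 2}),
    (("tx", "t3"), "raw tx data"),
]

index, visited = load_block_index(records)
print(len(index), "block index entries kept,", visited, "records read")
```

In this toy data set only two of the five records are needed, yet all five are read. On a real database the unneeded records have to come from disk unless they are already in the OS cache.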

  2. John L. Jegutanis reporter

    Yes, the caching part makes sense. When I restarted nud after a long startup, it started immediately. Likewise, when I extracted a backup of the database, it started fast.

    Why is it writing to disk, though?

  3. Michael Witrant

    It should not be writing, unless the code that extracts and rewrites the vote is triggered. Did the database you loaded have votes with version 0 (you should see repair messages in the log)? If that's the case, then it's normal, and it would explain even better why it's fast the next time.

    But maybe this code is triggered when it should not be, and the client does unnecessary writes.

  4. John L. Jegutanis reporter

    It writes to the __db.xxx files used by BDB. The file size stays the same but the content changes (I checked the hash), and iotop also reports the disk writes. I am not sure why it is writing, though: I am using v2.0.1, and the repairs should only be available in v2.1.0, right?

    Loading block index...
    PPCoin Network: genesis=0x000003cc2da5a0a289ad nBitsLimit=0x1e0fffff nBitsInitial=0x1e00ffff nStakeMinAge=604800 nCoinbaseMaturity=100 nCoinstakeMaturity=5000 nModifierInterval=14400
    
    <-- it spends all the time here
    
    LoadBlockIndex(): hashBestChain=8c08cecdec6858b3644b  height=507230  trust=740874205016
    LoadBlockIndex(): synchronized checkpoint 11b331cb1e5ba525fdcd5915aa47ee9dc0ca79337e634beabc7be21b1e23b5b4
    Verifying last 2500 blocks at level 1
     Upgrade Info: v0.4+ no txdb upgrade detected.
     block index         1086005ms
    

    If we disregard the unexplained disk writes, the slow startup can indeed be attributed to the large block index database.
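
The hash check mentioned in this comment could be repeated with something like the following sketch. The function names and the default `__db.*` pattern are assumptions chosen for illustration; the idea is to take one snapshot before starting nud and another afterwards, then list the files whose size is unchanged but whose content differs.

```python
import hashlib
from pathlib import Path


def snapshot(directory, pattern="__db.*"):
    """Map each matching file name to (size in bytes, SHA-256 hex digest)."""
    result = {}
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            data = path.read_bytes()
            result[path.name] = (len(data), hashlib.sha256(data).hexdigest())
    return result


def changed_in_place(before, after):
    """Names of files whose size is the same but whose content differs."""
    return sorted(
        name
        for name, (size, digest) in before.items()
        if name in after
        and after[name][0] == size
        and after[name][1] != digest
    )
```

A typical use would be `before = snapshot(datadir)`, then a run of nud, then `changed_in_place(before, snapshot(datadir))`, where `datadir` is whichever directory holds the BDB environment files. The sketch reads each file fully into memory, which is acceptable only for small files.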
