Download_optimizations_r2
Bitbucket cannot automatically merge this request.
The commits that make up this pull request have been removed.
Bitbucket cannot automatically merge this request due to conflicts.
Review the conflicts on the Overview tab. You can then either decline the request or merge it manually on your local system using the following commands:
git checkout 2.1.1-RC
git merge --no-ff -m 'Merged in download_optimizations_r2 (pull request #266)' remotes/origin/download_optimizations_r2
Pull request updated. Reload
Also, a quick comparison of 2.0.3 (blue), stock 2.1.1 (red) and this PR (black) -- when downloading from a single local parent) -- X axis is block, Y axis is how many seconds to download 1000 blocks.
A note to self: the final commit that changes things in ProcessBlock may be extraneous, as duplicate orphan blocks are checked for earlier in the function.
This is really working well and provides a great deal of improvement to the client. My daemon has not crashed on my supernode since adding this branch, which it was doing multiple times per day before. Though there's still seems to be the lingering memory issue. While the client will drop the memory usage to around 450MB shortly after startup it will eventually increase back up close to 900MB-1GB. This occurs without using any wallet functionality. It just seems to start consuming more ram again over time.
Thanks for the testing and feedback. Next steps are exactly that -- to get a sense as to what's going on with memory leaks. I did add some memory leak testing logic but am tripping up over corrupt stack traces on my build environment. There is the other bug with transactions being reported wrong in the UI that's also pegged to this build.
But this is promising, and I'm sure delightful news to many.
Cool. If you need help with anything please let me know.
I haven't tested it, but the code looks good.
Indeed it's already checked earlier.
So I've been doing my best to verify this is actually a result of changes here and not some environmental or dependency issues. I'm not sure what changes are causing it but this PR introduces some stability issues. At least with the QT on Ubuntu. The client will often drop error messages into the console while running (if you run it from the command line). Eventually this leads to the client crashing. This is occurring with wallets that are syncing to the main production network which contain no local assets in the wallets. You can see an example of the console messages bellow:
https://gist.github.com/CoinGame/ef6b814fd0512bd069a5
I'm trying to look into it a bit more, but I wanted to bring it up and see if anyone else has experienced this instability.
I haven't seen that one as of yet.
The exception is being generated when the in-memory blockmap needs to load a block off-disk. The reasons could be many -- either the block never made it to disk, it is not indexed/stored correctly in the database, or a block (i.e.: an orphan) is being asked for that the client never knew about and is punching thru all the way down to the DB.
It would be easy enough to add in some more debugging code to show what block it is missing -- this would help determine if it is asking for something recent on the network or from a ways back. There are some other touch-ups I want to do and will update the PR in the next few days.
I think this should be readily reproducible in prod.
Out of curiosity, is the underlying database a fresh-download or a conversion?
@Ben Rossi is pointing out to me this might be related to locales, which I think we've had issues with in the past and causing weird errors. Don't sound the alarms yet!
Edit: Nope, something is getting funky funky here. I'll keep playing around.
COMMITS ON APRIL 10:
Revert mapOrphanBlocks check, since it is done earlier:
This brings the code back to how it was, since there was a previous check.
Note that the Orphan Blocks in effect is enabling the ability for Nu to download quicker since it will store blocks in queue, then resolve them recursively once it has the blocks that link. This looks messy in the logs, and I wonder if there's a potential DOS if a rogue peer floods orphan blocks. Also there is a memory leak by way of pblock2 if an orphan block has a "duplicate proof-of-stake"
Add additional debugging if Nu asserts:
If we are unable to load the block from disk, print out the hash in the assert. This may be fixed with the following commit, but if not this will help us narrow down on the state of the client at the crash
Persist a block's hash in the object:
The block's hash used to be a pointer into the BlockIndexMap, but since blocks could be unloaded, there is the case that an iterator may no longer be valid. This increases the memory footprint per block held in memory, but should be safer.
Shut down the data feed thread:
The data feed thread cycles 1x per minute. My memory leak tracker takes some time to tabulate leaks, and during the tabulation, the data feed thread was seen accessing objects that had already been deleted, causing a SEGV. This issue may have rarely or never been seen in production depending on how quickly the client shuts down. But since there is database activity during shutdown, it is good to be safe.
Add to makefiles:
I added a reference to the chain.cpp file in the src/ makefiles except for makefile.osx-mavericks as I didn't see blockmap.o in there
OUTSTANDING ISSUES
Memory leaks
They look to be at the network / connection level, particularly around inventory that it knows it told the peer. This is exacerbated by a blockchain download because the Nu client informs its peers about new blocks -- this could easily be 100,000's of (small) objects.
More diagnostics needed. I've integrated memory leak checking code into my dev environment, but hunting down what leaks is not easy since most of the dynamic memory allocation is done in the STL classes, not by "new" calls in
GUI updating issues
I've mapped out the code flow.
There is at least one assert when using the Qt Debug libraries, as the "NuBits / Receive" pane does not have a "Dividend Address" column, but AddressBookPage::SetModel references it to set the resize mode
I've pushed some additional commits: