G3gg0 and I took a closer look at why the black level sometimes gets autodetected incorrectly, and what we can do about it. The most obvious side effect is that raw videos are rendered with a strong green cast (sometimes pink).
Sometimes the problem exists in photo mode too (raw overlays), if the raw buffer gets overwritten by something else. For example, on the 500D, if you take a picture in LiveView, single shooting mode, and hold the shutter button a little longer, LiveView will appear before QuickReview, and the photo raw buffer will be overwritten by LiveView data. This results in corrupted raw overlays.
History and rationale:
Proposed approach: raw_update_params (which calls autodetect_black_level) can return 0 if black level doesn't look right, so the user code may decide to either retry (as in mlv_rec), fall back to YUV (as in zebras and histogram) or simply give up.
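A minimal sketch of the two caller-side strategies, for illustration only. The stub raw_update_params_stub, the failure counter and the enum are hypothetical stand-ins, not the actual backend or mlv_rec code; only the "returns 0 on a bad black level" contract comes from the description above.

```c
/* Hypothetical stand-in for the raw backend: returns 1 when the raw
 * parameters (including black level) look sane, 0 otherwise.
 * Here it simply fails a configurable number of times. */
static int fake_failures_left = 0;

static int raw_update_params_stub(void)
{
    if (fake_failures_left > 0)
    {
        fake_failures_left--;
        return 0;
    }
    return 1;
}

enum overlay_source { SOURCE_RAW, SOURCE_YUV };

/* Retry strategy (recorder-style caller): try up to max_retries times,
 * then give up and report failure. */
static int update_raw_with_retry(int max_retries)
{
    for (int i = 0; i < max_retries; i++)
    {
        if (raw_update_params_stub())
            return 1;
        /* real code would wait for roughly one frame here */
    }
    return 0;
}

/* Fallback strategy (overlay-style caller): use YUV data whenever the
 * raw data cannot be trusted. */
static enum overlay_source pick_overlay_source(void)
{
    return raw_update_params_stub() ? SOURCE_RAW : SOURCE_YUV;
}
```

Keeping the retry count bounded matters: a caller that retries too many times while the backend keeps returning 0 will appear to lock up.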
For deciding whether the black level looks alright, the first idea (e8876ef) was to check if the autodetected level is consistent with previous estimations (that is, converges to some value). This works well in movie mode, but it requires multiple frames to analyze, which is not suitable for photo mode.
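To illustrate the temporal idea, here is a rough sketch that accepts the black level only once the last few estimates agree. The history length, the tolerance and the function name are assumptions made for this example; the actual code in e8876ef may work differently.

```c
#define HISTORY_LEN 4   /* assumed number of frames to compare */
#define MAX_SPREAD  8   /* assumed tolerance, in raw units */

static int history[HISTORY_LEN];
static int history_count = 0;

/* Record a new estimate; return 1 once the last HISTORY_LEN estimates
 * agree within MAX_SPREAD, 0 otherwise. */
static int black_level_converged(int estimate)
{
    history[history_count % HISTORY_LEN] = estimate;
    history_count++;

    if (history_count < HISTORY_LEN)
        return 0;   /* not enough frames yet */

    int lo = history[0], hi = history[0];
    for (int i = 1; i < HISTORY_LEN; i++)
    {
        if (history[i] < lo) lo = history[i];
        if (history[i] > hi) hi = history[i];
    }
    return (hi - lo) <= MAX_SPREAD;
}
```

The first few calls always return 0, which shows why this kind of check cannot validate a single still frame.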
The second idea (40d1737) was to check the consistency of the optical black area in a single frame (whether it looks like uniform noise or not). That is, a spatial check rather than a temporal one. For this I've tried a simple heuristic, which divides the OB into 5 areas and checks their local mean/stdev against the global mean/stdev values. If the small OB areas "agree" with the global estimation, the black level is declared good; otherwise, raw_update_params will return failure (0).
Black level checking was still missing some bad frames, especially in low light, and felt like a workaround rather than a fix for the real problem. So I took a closer look at why these bad frames appeared, and noticed that, with CONFIG_EDMAC_RAW_SLURP, after a resolution change (zoom or fps), the resolution used for configuring the EDMAC transfer was wrong until the first call to raw_update_params (so, if that call came after 5 seconds, you had 5 seconds of bad raw data). And, of course, that first call operated on broken data, and reported a wrong black level. Subsequent calls were OK, but when recording raw with memory hack, only one call was made after restoring LiveView, and that call was always broken. The fix for this was 7996d60.
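A rough sketch of the spatial check, operating on a flat array of OB samples. The thresholds (local mean within one global stdev, local stdev at most 2x the global one) are assumptions chosen for this example, and variances are compared instead of standard deviations to avoid a square root; the heuristic in 40d1737 may use different limits and a different area layout.

```c
#include <stddef.h>

#define OB_AREAS 5   /* number of areas, as described above */

/* Mean and variance of n samples. */
static void mean_var(const int *p, size_t n, double *mean, double *var)
{
    double sum = 0, sum2 = 0;
    for (size_t i = 0; i < n; i++)
    {
        sum  += p[i];
        sum2 += (double)p[i] * p[i];
    }
    *mean = sum / n;
    *var  = sum2 / n - (*mean) * (*mean);
}

/* Returns 1 if every area agrees with the global statistics, 0 otherwise.
 * Assumed thresholds:
 *  - local mean within 1 global stdev of the global mean
 *  - local stdev no more than 2x the global stdev */
static int ob_looks_uniform(const int *ob, size_t n)
{
    if (n < OB_AREAS) return 0;

    double gmean, gvar;
    mean_var(ob, n, &gmean, &gvar);
    if (gvar < 1) gvar = 1;   /* don't reject a perfectly flat OB */

    size_t area = n / OB_AREAS;
    for (int a = 0; a < OB_AREAS; a++)
    {
        double lmean, lvar;
        mean_var(ob + a * area, area, &lmean, &lvar);

        double d = lmean - gmean;
        if (d * d > gvar)    return 0;   /* |d| > 1 * global stdev */
        if (lvar > 4 * gvar) return 0;   /* local stdev > 2 * global stdev */
    }
    return 1;
}
```

Note that garbage in one area also inflates the global statistics, so the thresholds have to be fairly tight for the check to catch anything.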
Memory hack was OK now, but black warnings were still present immediately after switching the resolution (zoom or Canon's video mode). Next fix, ccecba6, addresses this: after a resolution change, the first call to raw_update_params will wait for one LiveView frame to make sure it has valid data.
Black warnings were still happening if you press the zoom button like crazy. So, in the next fix (3cc213e), the raw data is marked as invalid for a small duration (~0.5 seconds) starting from a resolution change or zoom toggle. With this, raw_update_params no longer needs to wait for one LV frame (it relies on user code to retry). I can no longer trigger any black warnings :)
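The "invalid for ~0.5 seconds" idea can be sketched as a simple timestamp check. The names raw_mark_changed and raw_data_valid are hypothetical, and the current time is passed in as a parameter only to keep the example self-contained; the code in 3cc213e may be structured differently.

```c
#define RAW_SETTLE_MS 500   /* ~0.5 s, as described above */

/* Time of the last resolution change; initialized so that raw data
 * counts as valid at startup. */
static int last_change_ms = -RAW_SETTLE_MS;

/* Call on every resolution change or zoom toggle. */
static void raw_mark_changed(int now_ms)
{
    last_change_ms = now_ms;
}

/* Returns 1 if raw data may be trusted at time now_ms,
 * 0 while it is still settling after a change. */
static int raw_data_valid(int now_ms)
{
    return (now_ms - last_change_ms) >= RAW_SETTLE_MS;
}
```

Each new change restarts the window, so hammering the zoom button keeps the data marked invalid until the presses stop.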
After the latest fixes, the sanity check code for the OB area remains as more of a double-check (it catches the 500D bug, for example, and also points out where the raw backend requires more attention). I would prefer the bad frames to be solved properly in the raw backend, and not caught by the black checking code.
So, @ early adopters, please test it out before including it in the nightly builds.
open the console; it prints the black level whenever it's refreshed
sane values for black level: around 1024, around 2048, around 1700-1800... others?
if there are black warnings, a frame named bad.dng will be saved on the card; upload it
record raw/mlv in crop mode with memory hack on (this was likely to give bad frames)
press record really quick after going to liveview or after toggling the zoom mode or Canon's video mode
playback test videos in mlv_play (they should not have green or pink cast)
check if dual iso images pass the black tests
implement the black checks for cameras without a left OB (550D/600D/60D/7D in zoom mode); important?
comment out the debug code before merging (the bad.dng part)?
I guess there are 2 ways of looking at (at least part of) it. Frantic button pushers should slow down a bit; however, they are apparently really good bug finders.
Silly question # 1. The OB adjustment in iso_regs. If you increase the OB area, does it provide correct data, or is it affected by the same problems? I assume it is affected also.
Same problems (they are not related to OB area, but to gibberish in the raw buffer, usually when resolution changes or when LiveView is stopped and resumed).
The black autodetection will be essential once the ISO tweaks are merged (this is why I don't like the hardcoding of 2048 or 1024 or whatever other values).
Can you flush the data? This might slow down the ability to record within ms of some change, but should be robust?
I can try to mark the raw data as invalid for like 0.5 seconds since the last resolution change.
edit: yeaaa, this fixed it, I can no longer trigger the warnings :D
bug report: 600D, in non-video LV mode and mlv_rec enabled it locks up for 20 seconds when trying to paint the mlv_rec menu entry.
looks like raw_update_params() is failing. console shows no black level message at all in LV mode.
-> we should lower the loop count to ~50 loops in mlv_rec:462 but this is a symptom only, not the real bug.
Ah, too many retries in mlv_rec_update_raw?
yes, didn't dig deeper, but it looks like the raw code reports it is not ready because this is not video mode.
and mlv_rec retries "a few" times :)
600D seems to work now, just a bad frame every now and then when restarting LV by entering canon menu.
seems to be working pretty well on 60D, I didn't do anything too extensive, I hardly ever use raw video anyway, I did have a BAD.DNG: https://dl.dropboxusercontent.com/u/74060/BAD.DNG, but all my actual recordings turned out just fine, did two in normal mode and two in crop mode, making sure to hit record quickly
This one was because of an undetected resolution change.
If the light is good, the consistency checks for the OB area will detect the bad frame, but in very dim lighting, the bad frame may be missed. Unlikely to happen though.
Yeah, it was after sundown here when I did the tests, not much light to work with
Looks like this BAD.DNG was actually a bug in my diagnostic code, fixed.
Did these conflicts arise because I used a branch name very similar to this branch?
They appeared because you changed the workaround code (which is going to disappear after merging this).