Random grey frames (corrupted) while encoding video sequence

Issue #403 new
Vasya Volkov created an issue

Hi,

I've encountered a strange issue while working with libx265 2.7 and ffmpeg 4.0: within about ten minutes of encoding (sometimes more, sometimes less), grey frames appear in my encoded stream (one or several in a row). They then break up into large grey blocks, the original picture returns, and everything is fine until the problem repeats.

The input stream decodes fine, ffmpeg prints no errors, and the resulting HEVC bitstream is decoded successfully by VLC, ffplay, or any other software (and by hardware; I've tried an Amlogic S905), but on playback the video sometimes contains grey frames and looks corrupted. Here is an example of a corrupted output sequence: https://yadi.sk/d/BNC7n7dJ3VVdxD

The problem appears at random: it does not depend on the input stream or on a particular subsequence within it (I re-encoded the same sequence and everything went fine, without any image degradation).

The description is very close to this ffmpeg bug: https://trac.ffmpeg.org/ticket/6814 but in my case there is definitely no decoder corruption, and FFmpeg definitely sends correctly decoded frames to the libx265 encoder (I checked the raw frames in code). So it looks like a libx265 encoder bug.

A CSV sheet with libx265 metrics for an output stream that contains grey frames: https://yadi.sk/i/xVSfm6ph3VW47j

Also, the grey frames are actually YUV frames with every byte == 128.
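
A minimal sketch of how the "every byte == 128" observation could be checked automatically on decoded output. It assumes the encoded stream has been decoded to raw planar 8-bit YUV 4:2:0 (for example with `-pix_fmt yuv420p -f rawvideo`); for 16-bit little-endian output the sample size and flat value would need to be adjusted, which is why both are parameters. The function names are illustrative and are not part of ffmpeg or x265.

```python
import io


def frame_size_bytes(width, height, bytes_per_sample=1):
    """Size of one planar YUV 4:2:0 frame: full-size luma plus two quarter-size chroma planes."""
    samples = width * height + 2 * (width // 2) * (height // 2)
    return samples * bytes_per_sample


def find_flat_frames(stream, width, height, value=128, bytes_per_sample=1):
    """Yield the index of every frame in which all samples equal `value`."""
    size = frame_size_bytes(width, height, bytes_per_sample)
    # Reference frame made of the flat value only (little-endian samples).
    pattern = value.to_bytes(bytes_per_sample, "little") * (size // bytes_per_sample)
    index = 0
    while True:
        frame = stream.read(size)
        if len(frame) < size:  # end of stream or truncated last frame
            break
        if frame == pattern:
            yield index
        index += 1


# Tiny self-check on synthetic 4x2 frames: one normal frame, then one flat grey frame.
demo = io.BytesIO(bytes([16]) * 12 + bytes([128]) * 12)
print(list(find_flat_frames(demo, 4, 2)))  # prints [1]
```

For a real run, the decoded output can be piped into a script that calls `find_flat_frames(sys.stdin.buffer, 1920, 1080)`; dividing a reported index by the frame rate (50 in this report) gives the approximate position in seconds.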

I've checked libx265 master, release 2.7, and release 2.6; the problem is present in all of them.

These are my encoding commands (maybe there is an error in the codec params?):

ffmpeg -threads auto -i pipe:0 -s 1920x1080 -codec:0 libx265 -preset:0 ultrafast -x265-params:0 hdr=1:colorprim=9:transfer=18:colormatrix=9:fps=50:keyint=50:min-keyint=50:bframes=3:hrd=1:ref=3:vbv-maxrate=20000:vbv-bufsize=40000:bitrate=18000:rc-lookahead=4:no-scenecut=1:repeat-headers=1:no-open-gop=1:aud=1:no-info=1:level-idc=5.1:min-cu-size=16:merange=42:lookahead-threads=10:frame-threads=10:csv=ffmpeg-4.0-2.7.csv:csv-log-level=1 -streamid 0:101 -map '#0xc8' -codec:1 copy -threads:1 2 -streamid 1:110 -map '#0xca' -muxrate 26000000 -max_delay 2500000 -f:0 mpegts pipe:1 < input.ts > output.ts


ffmpeg version 4.0 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.5) 20160609
  configuration: --prefix=/usr/local --pkg-config-flags=--static --enable-static --enable-nonfree --disable-shared --enable-runtime-cpudetect --extra-cflags=-ftree-vectorize --extra-cflags='-march=native' --extra-cflags=-O3 --extra-cflags=-fuse-linker-plugin --enable-pic --enable-lto --ar=gcc-ar --ranlib=true --enable-gpl --enable-libx264 --enable-libx265 --enable-libmp3lame --extra-libs=-lpthread
libx265 version info:
x265 [info]: HEVC encoder version 2.7
x265 [info]: build info [Linux][GCC 5.4.0][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2

My CPU setup:

~$ lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                88
On-line CPU(s) list:   0-87
Thread(s) per core:    2
Core(s) per socket:    22
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
Stepping:              1
CPU MHz:               2195.105
BogoMIPS:              4391.78
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              56320K
NUMA node0 CPU(s):     0-21,44-65
NUMA node1 CPU(s):     22-43,66-87
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch arat epb pln pts dtherm intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc

Here is also an x265 command with the same parameters (a raw decoded stream is needed as input):

x265 --input - --input-res 1920x1080 -p ultrafast --fps 50 --hrd --aud --repeat-headers --bitrate 18000 --vbv-maxrate 20000 --vbv-bufsize 40000 --bframes 3 --no-scenecut --rc-lookahead 4 --min-keyint 50 --keyint 50 --no-open-gop --ref 3 --merange 42 --min-cu-size 16 --frame-threads 10 --level-idc 5.1 --lookahead-slices 8 --lookahead-threads 10 --colorprim bt2020 --transfer arib-std-b67 --hdr --no-info --input-depth 10 --output-depth 10 -o out.h265

MPEG-TS input stream: https://yadi.sk/d/NtTPUVnq3VVuFG

It can be decoded with this command:

ffmpeg3 -re -i uhd_input.ts -s 1920x1080 -f rawvideo pipe:1 > /tmp/pipe.yuv

I've also found that the bug can be reproduced even with simple settings:

ffmpeg -threads auto -i pipe:0 -s 1920x1080 -codec:0 libx265 -preset:0 ultrafast -x265-params:0 fps=50:keyint=50:hrd=1:vbv-maxrate=20000:vbv-bufsize=40000:bitrate=18000 -map 0:v -codec:1 copy -map 0:a -muxrate 26000000 -max_delay 2500000 -f:0 mpegts < input.ts > out.ts

I've also posted this bug to the mailing list, but this seems to be a better place to discuss it.

Comments (20)

  1. Pradeep Ramachandran Account Deactivated

    Thanks. If I understand correctly, you are just encoding to a file and playing back right? Or are you trying to stream the ts actively?

  2. Vasya Volkov reporter

    It's definitely a libx265 bug and the bitstream itself is corrupted, so it doesn't matter whether I stream it or just play it. I've run another batch of tests and can now confirm that this bug is in libx265 only.

    Versions 2.6, 2.7, and master occasionally generate corrupted frames. Maybe it happens only with some types of input, I don't know, but I've checked three completely different sources and the bug persists every time.

    I've uploaded a 7200-second input MPEG-TS stream for you here (it's large, 21 GB): https://yadi.sk/d/M3s9hMb23VbjWM And also a smaller version (around 2400 seconds) of the same input (6.8 GB): https://yadi.sk/d/xhUEW6Er3VbjYz

    If you try to encode it, you will definitely reproduce this bug.

    I transcoded it with this command:

    ffmpeg -loglevel error -i input_7200.ts -s 1920x1080 -pix_fmt yuv420p10le -an -f rawvideo - | x265 --input - --input-res 1920x1080 -p ultrafast --fps 50 --hrd --aud --repeat-headers --bitrate 18000 --vbv-maxrate 20000 --vbv-bufsize 40000 --bframes 3 --no-scenecut --rc-lookahead 4 --min-keyint 50 --keyint 50 --no-open-gop --ref 3 --merange 42 --min-cu-size 16 --frame-threads 10 --level-idc 5.1 --lookahead-threads 10 --colorprim bt2020 --transfer arib-std-b67 --hdr --input-depth 10 --output-depth 10 -o out.h265
    

    And this time I tried the master version of libx265:

    x265 [info]: HEVC encoder version 2.7+338-147b7dcee675
    x265 [info]: build info [Linux][GCC 5.4.0][64 bit] 10bit
    x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    

    The bug is definitely reproduced (but more rarely) with these x265 params (nothing special, as you can see):

    -x265-params:0 keyint=50:hrd=1:vbv-maxrate=20000:vbv-bufsize=19000:bitrate=18000
    

    I compile libx265 with these options on Ubuntu 16.04.3:

    -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_SHARED:bool=off -DENABLE_ASSEMBLY:BOOL=on -DHIGH_BIT_DEPTH:bool=on -DNATIVE_BUILD:BOOL=on -DENABLE_HDR10_PLUS:BOOL=on
    

    The problem is not limited to 10-bit: I've also checked 8-bit and the problem persists. I've also checked libx264, which encodes fine without any errors.

  3. Vasya Volkov reporter

    One interesting thing: I can't reproduce this bug (or it reproduces significantly more rarely) on this CPU:

    ~# lscpu 
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                40
    On-line CPU(s) list:   0-39
    Thread(s) per core:    2
    Core(s) per socket:    10
    Socket(s):             2
    NUMA node(s):          2
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 62
    Model name:            Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
    Stepping:              4
    CPU MHz:               2799.907
    BogoMIPS:              5601.60
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              25600K
    NUMA node0 CPU(s):     0-9,20-29
    NUMA node1 CPU(s):     10-19,30-39
    Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
    

    It has fewer cores; maybe that is somehow connected to the problem.

  4. Vasya Volkov reporter

    I've performed more tests, and it seems that the problem occurs only when libx265 uses this set of CPU capabilities: LZCNT FMA3 BMI2 AVX2

  5. Vasya Volkov reporter

    I've tried rebuilding libx265 and ffmpeg with a newer toolchain:

    ~# gcc --version
    gcc (Ubuntu 7.3.0-16ubuntu3~16.04.1) 7.3.0
    

    The problem still exists.

  6. Vasya Volkov reporter

    Do you have any suggestions on how I can debug the asm primitives? Maybe you have test cases for them that I can run on this CPU to find out which of the primitives fail?

  7. Vasya Volkov reporter

    Hi again,

    I wrote that the bug can be reproduced only on some CPUs or with some CPU capabilities; that seems to be wrong. I've performed more tests on the two CPUs I mentioned before and checked multiple versions from release 2.0 through 2.7+338-147b7dcee675. Here is what I've found. Note that when I say "reproduced", it definitely reproduced, but when I say "not reproduced", it did not reproduce while encoding the 2h dump one or more times (so if it does reproduce, it is significantly rarer).

    The problem reproduces with release 2.0 (I know it's old) with these params (on both CPUs):

    fps=50:keyint=150:min-keyint=150:bframes=3:hrd=1:ref=3:vbv-maxrate=20000:vbv-bufsize=40000:bitrate=18000:no-scenecut=1:no-open-gop=1
    
    or 
    
    fps=50:keyint=50:min-keyint=50:hrd=1:vbv-maxrate=20000:vbv-bufsize=40000:bitrate=18000:no-scenecut=1:no-open-gop=1
    

    The problem does not reproduce with 2.7+338-147b7dcee675 with these params:

    fps=50:keyint=150:min-keyint=150:hrd=1:vbv-maxrate=20000:vbv-bufsize=40000:bitrate=18000:no-scenecut=1:no-open-gop=1

    But it reproduces with 2.7+338-147b7dcee675 with these params:

    fps=50:keyint=50:min-keyint=50:hrd=1:vbv-maxrate=20000:vbv-bufsize=40000:bitrate=18000:no-scenecut=1:no-open-gop=1

    The only difference is keyint=150:min-keyint=150 vs keyint=50:min-keyint=50.

    Interesting part: the problem seems not to be reproducible with these params, which omit no-open-gop=1:no-scenecut=1 (I've checked multiple versions from 2.0 to the latest master, encoding the 2h dump, and everything is fine):

    bitrate=18000:vbv-bufsize=20000:vbv-maxrate=40000:hrd=1:fps=50:keyint=50:min-keyint=50
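
Because the parameter strings in this comment are long and differ in only a few keys, a small helper can make the comparison explicit. The sketch below is plain string handling, not an x265 or ffmpeg API; `parse_x265_params` and `diff_params` are hypothetical names introduced for illustration.

```python
def parse_x265_params(params):
    """Split a 'key=value:key=value' string into a dict (a bare key maps to '')."""
    result = {}
    for item in params.strip().split(":"):
        if not item:
            continue
        key, _, value = item.partition("=")
        result[key] = value
    return result


def diff_params(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs."""
    pa, pb = parse_x265_params(a), parse_x265_params(b)
    return {
        key: (pa.get(key), pb.get(key))
        for key in sorted(set(pa) | set(pb))
        if pa.get(key) != pb.get(key)
    }


not_reproduced = ("fps=50:keyint=150:min-keyint=150:hrd=1:vbv-maxrate=20000:"
                  "vbv-bufsize=40000:bitrate=18000:no-scenecut=1:no-open-gop=1")
reproduced = ("fps=50:keyint=50:min-keyint=50:hrd=1:vbv-maxrate=20000:"
              "vbv-bufsize=40000:bitrate=18000:no-scenecut=1:no-open-gop=1")
for key, (left, right) in diff_params(not_reproduced, reproduced).items():
    print(f"{key}: {left} -> {right}")
```

For the two 2.7+338 strings this reports only keyint and min-keyint. Comparing the keyint=50 string with the last string in this comment reports no-open-gop and no-scenecut and also, as the strings are written here, swapped vbv-bufsize and vbv-maxrate values, which may be worth ruling out as a factor.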

  8. Aruna Matheswaran

    Hi pryg_skok,

    We tried decoding the initial 1000 frames of the 7200 seconds clip you shared here. We see that the first 12 frames in the YUV decoded by ffmpeg version 4.0 are corrupted.

    The command we used for YUV generation is : ffmpeg -loglevel error -i input_7200.ts -s 1920x1080 -pix_fmt yuv420p10le -an -f rawvideo -vframes 1000 input_7200.yuv

    Error message from ffmpeg:

    [hevc @ 0x23ea200] PPS id out of range: 0

    Last message repeated 11 times
    

    [hevc @ 0x241e200] Could not find ref with POC -104

    [hevc @ 0x241e200] Could not find ref with POC -100

    [hevc @ 0x241e200] Could not find ref with POC -96

    Can you please verify that your YUV is not corrupted? If so, can you please also share the corrupted bitstream and its corresponding input sequence? We don't find this corrupted scene in any of the input sequences you shared, so we can't actually say that the issue is in libx265 unless we confirm that the decoded raw video is proper. Thanks.

  9. Vasya Volkov reporter

    Hi, Aruna

    Basically, it's fine that the first frames are corrupted, because this is just a dump of a live UDP stream. You can simply drop them until the first keyframe arrives and not take them into account.

    To do that, you can just remux the stream with ffmpeg: ffmpeg -i input_7200.ts -c copy -map 0 -f mpegts input_7200_fixed_errors.ts

    input_7200_fixed_errors.ts will then contain no corrupted frames.

    As for the corrupted scene I shared: yes, it's from another input sequence, and I shared it just as an example of what the corruption looks like. OK, I will also share a corrupted bitstream from this input sequence; I hadn't done that yet because I had shared the commands that reproduce the bug.

  10. Vasya Volkov reporter

    So, corrupted frames in the HEVC bitstream can be produced from input_7200.ts in the following manner:

    ffmpeg -loglevel error -i /data/input_7200.ts -s 1920x1080 -pix_fmt yuv420p10le -an -f rawvideo - | x265 --input - --input-res 1920x1080 -p ultrafast --fps 50 --hrd --aud --repeat-headers --bitrate 18000 --vbv-maxrate 20000 --vbv-bufsize 40000 --bframes 3 --no-scenecut --rc-lookahead 4 --min-keyint 50 --keyint 50 --no-open-gop --ref 3 --merange 42 --input-depth 10 --output-depth 10 -o out.h265
    
    # simple hack for muxing output hevc bitstream into mpegts
    ffmpeg -y -i out.h265 -c:v copy -f mp4 out.mp4
    ffmpeg -y -i out.mp4 -c copy -f mpegts corrupted_x265.ts
    

    Using this version of x265:

    x265 [info]: HEVC encoder version 2.8+2-cc2c5e46f3c8
    x265 [info]: build info [Linux][GCC 5.4.0][64 bit] 10bit
    x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    

    Corrupted frames can be found, for example, at these positions (each is 5 seconds before the actual position of the corrupted sequence):

    00:04:12.020000
    00:05:01.020000
    00:08:39.020000
    00:12:59.020000
    and so on....
    

    You can extract the corresponding clips to observe the bug like this: ffmpeg -y -ss 00:04:12.020000 -i corrupted_x265.ts -s 1280x720 -c:v libx264 -b:v 2000k -t 10 clip.ts

    Link: corrupted_x265.ts

    md5sum corrupted_x265.ts: 99e98259f44ac64932b0444c822f4939
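
The listed positions and the extraction command from this comment can be combined so that one clip is produced per position. This is only a sketch: the ffmpeg arguments are copied from the command in the comment, while the numbered output names (`clip_00.ts` and so on) and the helper name `clip_command` are assumptions added for illustration. The commands are printed rather than executed.

```python
import shlex

# Positions quoted in the comment (each 5 seconds before the corrupted frames).
TIMESTAMPS = [
    "00:04:12.020000",
    "00:05:01.020000",
    "00:08:39.020000",
    "00:12:59.020000",
]


def clip_command(start, index, source="corrupted_x265.ts"):
    """Build the ffmpeg argument list for one 10-second 720p preview clip."""
    return [
        "ffmpeg", "-y",
        "-ss", start,
        "-i", source,
        "-s", "1280x720",
        "-c:v", "libx264",
        "-b:v", "2000k",
        "-t", "10",
        f"clip_{index:02d}.ts",  # hypothetical output name, one file per position
    ]


commands = [clip_command(ts, i) for i, ts in enumerate(TIMESTAMPS)]
for cmd in commands:
    print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` instead of being printed if the clips should be generated directly.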

  11. Aruna Matheswaran

    Hi pryg_skok, Thanks for the details. We are able to reproduce the issue. We'll post an appropriate solution to handle this asap.

  12. Aruna Matheswaran

    Hi pryg_skok,

    We have identified that the issue comes from noise reduction, which is enabled by default whenever VBV is enabled, irrespective of --nr-intra and --nr-inter. We have a patch on the x265 mailing list which disables noise reduction with VBV. Can you test with noise reduction disabled (apply the shared patch and set --nr-intra and --nr-inter to 0 in your command) and share an update on this issue? Thanks.

  13. Vasya Volkov reporter

    Hi, Aruna

    I've applied your patch and am using the latest libx265: 2.8+11-df5bd3be9b11

    My command now looks like this:

    x265 --input - --input-res 1920x1080 -p ultrafast --fps 50 --hrd --aud --repeat-headers --bitrate 18000 --vbv-maxrate 20000 --vbv-bufsize 40000 --bframes 3 --no-scenecut --rc-lookahead 4 --nr-intra 0 --nr-inter 0 --min-keyint 50 --keyint 50 --no-open-gop --ref 3 --merange 42 --input-depth 10 --output-depth 10 -o /mnt/out.h265 
    

    It seems that the corruption still persists but looks a bit different. I don't see grey frames now, but the corruption is definitely observable. Check this out: https://yadi.sk/d/AhWr28qu3YmyN7

  14. Vasya Volkov reporter

    Hi, Aruna,

    Any progress here? Is there anything I can test? Can you reproduce the bug with the patch you've provided?
