3.1RC library hangs during encoder_close()

Issue #498 new
Michal Szymanski created an issue

x265 libraries in our system have been updated, from rel 2.8 to 3.1rc (3.0+5-113518629fa5).

Now, once in a while, the library hangs during the encoder_close() call.

It doesn’t happen very often (roughly once in 50-100 runs), but it occurs on both Windows and Linux. The encoder configuration doesn’t seem to matter, although I’m only running 10-bit encodes.

The encoder_close() call hangs after a successful encode; this is not an abnormal execution path.

Is there anything in encoder_close() that could cause the library to hang? Is there anything I could do to debug the issue?

Update:

The library hangs at m_frameEncoder[i]->getEncodedPicture(m_nalList);

Comments (24)

  1. Aruna Matheswaran

    @Michal Szymanski - Could you please provide the encoder configuration with which you are facing the issue?

  2. Michal Szymanski reporter

    This is the CLI equivalent of my configuration:

    --input-depth 10 --preset medium --aud --bframes 4 --bitrate 15000 --cbqpoffs 0 --chromaloc 0 --colormatrix 2 --colorprim 2 --crqpoffs 0 --ctu 64 --fps 30 --frame-threads 0 --hash 1 --hrd --no-info --input-csp i420 --input-res 1920x1080 --keyint 65535 --level-idc 0 --log-level 0 --min-cu-size 8 --min-keyint 0 --nr-inter 0 --nr-intra 0 --no-open-gop --psy-rd 2.0 --qg-size 64 --range full --no-rc-grain --rc-lookahead 20 --repeat-headers --sar 1 --scenecut 40 --scenecut-bias 5 --transfer 2 --vbv-bufsize 30000 --vbv-maxrate 15000

    I haven’t seen the same issue in the x265 reference application so far. However, my implementation is basically the same, and it’s not clear to me what the mistake could be when closing the encoder. It’s also not clear to me why the library is waiting for a picture if the encoder was already successfully flushed.

  3. Nik McNulty

    For what it’s worth, when the 3.1 RC library is compiled into Handbrake (MinGW cross-compile for Windows), the program suffers the exact same inexplicable, intermittent hangs during an encode as was reported by Michal Szymanski.

    This does NOT occur if 3.0 STABLE is deployed.

  4. Michal Szymanski reporter

    I have more findings.

    I think the issue occurs when encoding relatively short streams (e.g. 30-60 frames).

    Let’s assume I’m encoding 30 frames.

    While feeding frames to the encoder:

    int numEncoded = api->encoder_encode( encoder, &p_nal, &nal, picInput, pic_recon );

    encoder_encode returns 0 for all 30 calls. That makes sense: apparently no encoded frame is ready yet.

    Then, I proceed with flushing:

    int numEncoded = api->encoder_encode(encoder, &p_nal, &nal, NULL, pic_recon);

    Usually, there are 30 flushing calls, each returning 1 encoded frame.

    But sometimes the very first flushing call returns 0, perhaps because some encoding threads have not finished yet.

    But when flushing, 0 means there is nothing more to do, so my implementation proceeds to close the encoder. Apparently, closing an encoder that is still running is not a good idea, so it hangs.

    Shouldn’t the flushing call block when the encoder knows there is a frame being processed?

    This explains why I couldn’t reproduce this using the sample application. Apparently, the sample implementation is “slow enough” to guarantee that some frame is available when the flush call occurs.
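
The failure mode described in this comment can be modelled without the library. The sketch below is a simulation only: MockEncoder, drainUntilZero and drainByCount are hypothetical names, not part of the x265 API, and the mock merely reproduces the symptom (a flush call that returns 0 while frames are still pending). It contrasts a drain loop that stops at the first 0 with one that counts submitted frames against returned frames. Whether calling the real encoder_encode() again after it has returned 0 is valid is an assumption this thread does not confirm.

```cpp
#include <cassert>

// Hypothetical stand-in for an encoder whose flush call may return 0 while
// frames are still in flight. Not the x265 API; it only models the symptom.
struct MockEncoder {
    int pending = 0;        // frames accepted but not yet returned
    int notReadyCalls = 0;  // upcoming flush calls that report "nothing yet"

    void submit() { ++pending; }

    // Returns 1 if a frame was emitted, 0 otherwise (plays the role of numEncoded).
    int flush() {
        if (notReadyCalls > 0) { --notReadyCalls; return 0; }
        if (pending == 0) return 0;
        --pending;
        return 1;
    }
};

// Naive drain: treat the first 0 as "nothing more to do".
int drainUntilZero(MockEncoder& enc) {
    int received = 0;
    while (enc.flush() > 0) ++received;
    return received;
}

// Count-based drain: keep flushing until every submitted frame has come back.
// maxIdleCalls bounds consecutive empty calls so the loop cannot spin forever.
int drainByCount(MockEncoder& enc, int submitted, int maxIdleCalls) {
    assert(submitted >= 0 && maxIdleCalls >= 0);
    int received = 0;
    int idle = 0;
    while (received < submitted && idle <= maxIdleCalls) {
        if (enc.flush() > 0) { ++received; idle = 0; }
        else ++idle;
    }
    return received;
}
```

With 30 submitted frames and a single "not ready" flush call, the naive loop returns immediately with all 30 frames still pending, while the count-based loop retrieves all 30. An application that tracks how many frames it submitted can at least detect that it is about to close an encoder that still owes it output.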

  5. Aruna Matheswaran

    @Michal Szymanski - As long as you are not forcing flush, api->encoder_encode(encoder, &p_nal, &nal, NULL, pic_recon); will wait until it gets an encoded frame if frames are being processed.

    Have you enabled encoder->m_param->forceFlush or encoder->m_externalFlush in your application?

  6. Michal Szymanski reporter

    I’m not setting forceFlush value at all. Just calling api->param_default(…). Any chance it doesn’t get initialized?

  7. Nik McNulty

    Mention of the forceFlush parameter made me try it out with the HandBrake application, which was freezing on pretty much every third preview generated when using 3.1+7.

    Setting param->forceFlush to 1 made no difference. Neither did 0.

    BTW, HandBrake uses the same api->encoder_encode(encoder, &p_nal, &nal, NULL, pic_recon); flush routine that you said would wait until it gets an encoded frame if frames are being processed. Except, it doesn’t.

    Now, of course, the real question is: why does 3.1 hang like this, whereas 3.0 didn’t?

  8. Michal Szymanski reporter

    Same here. Explicitly setting forceFlush to 0 doesn’t help.

    The flush routine just fails to block once in a while.

  9. Nik McNulty

    I became so pissed off with 3.1 randomly hanging HandBrake that I manually regressed the commits until I found the culprit. Yeah, fun…

    x265 version 3.0_RC+29-b36242b9f354 WORKS!

    https://bitbucket.org/multicoreware/x265/commits/b36242b9f354b8773e38674b876b0ca5dfc35ad2

    https://bitbucket.org/multicoreware/x265/get/b36242b9f354b8773e38674b876b0ca5dfc35ad2.tar.gz

    x265 version 3.0_RC+30-768ab38fd5fd HANGS RANDOMLY!

    https://bitbucket.org/multicoreware/x265/commits/768ab38fd5fd104a8d58f42b646d6117d63b2c0a

    https://bitbucket.org/multicoreware/x265/get/768ab38fd5fd104a8d58f42b646d6117d63b2c0a.tar.gz

    It all makes perfect sense when you see what the defective commit does.

    HandBrake uses an 8-bit pipeline, and my i7-9700K employs: x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2

    I cross-compile HandBrake under Ubuntu 18.04 WSL using GCC 9.1.0 with NASM 2.14.02.

  10. Nik McNulty

    CORRECTION: x265 v3.0_RC+29-b36242b9f354 still hangs, just about 5x less often than x265 v3.0_RC+30-768ab38fd5fd

    Unfortunately, this means that continuing to test all the regressions to find the culprit will take even longer than before. Not looking forward to that.

  11. Nik McNulty

    Having been bitten by RC+29 above, I created a trustworthy stress environment. Using 10 x source videos of varying complexity, I employed HandBrake's preview function to generate 5 x 10-second x265 samples from each source, for a total test regime of 50 x 10-second x265 encodes.

    I methodically compiled each commit since 3.0 STABLE into HandBrake (x86_64 MinGW cross-compiled under Ubuntu 18.04 WSL, GCC 9.1.0), testing each build with the above regime. No setting was changed between each preview generated.

    x265 v3.0_RC+16-dcbec33bfb0f (or earlier) NEVER FAILED to complete the entire test
    https://bitbucket.org/multicoreware/x265/commits/dcbec33bfb0f1cabdb1ff9eaadba5305ba23e6fa
    https://bitbucket.org/multicoreware/x265/get/dcbec33bfb0f1cabdb1ff9eaadba5305ba23e6fa.tar.gz

    x265 v3.0_RC+17-878541319ea1 (or later) ALWAYS FAILED to complete the entire test
    https://bitbucket.org/multicoreware/x265/commits/878541319ea1375be0e981f6ea5fefdb4d509fbd
    https://bitbucket.org/multicoreware/x265/get/878541319ea1375be0e981f6ea5fefdb4d509fbd.tar.gz

    So, what's the story with commit 878541319ea1? Surprise, surprise... it's where you added SVT-HEVC as a parasite on top of x265.

    This is the working encoder:
    x265 [info]: HEVC encoder version 3.0_RC+16-dcbec33bfb0f
    x265 [info]: build info [Windows][GCC 9.1.0][64 bit] 10bit
    x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    x265 [info]: Main 10 profile, Level-3.1 (Main tier)
    x265 [info]: Thread pool created using 8 threads
    x265 [info]: Slices : 1
    x265 [info]: frame threads / pool features : 3 / wpp(18 rows)
    x265 [warning]: Source height < 720p; disabling lookahead-slices
    x265 [info]: Coding QT: max CU size, min CU size : 32 / 8
    x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
    x265 [info]: ME / range / subpel / merge : star / 24 / 4 / 5
    x265 [info]: Keyframe min / max / scenecut / bias: 24 / 240 / 40 / 5.00
    x265 [info]: Cb/Cr QP Offset : -1 / -1
    x265 [info]: Lookahead / bframes / badapt : 15 / 4 / 0
    x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
    x265 [info]: References / ref-limit cu / depth : 3 / on / on
    x265 [info]: Rate Control / qCompress : CRF-21.0 / 0.60
    x265 [info]: tools: rd=2 psy-rd=4.25 rdoq=1 psy-rdoq=12.25 nr-intra=90
    x265 [info]: tools: nr-inter=190 signhide tmvp fast-intra deblock(tC=0:B=-2)

    This is the freezing encoder:
    x265 [info]: HEVC encoder version 3.0_RC+17-878541319ea1
    x265 [info]: build info [Windows][GCC 9.1.0][64 bit] 10bit
    x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
    x265 [info]: Main 10 profile, Level-3.1 (Main tier)
    x265 [info]: Thread pool created using 8 threads
    x265 [info]: Slices : 1
    x265 [info]: frame threads / pool features : 3 / wpp(18 rows)
    x265 [warning]: Source height < 720p; disabling lookahead-slices
    x265 [info]: Coding QT: max CU size, min CU size : 32 / 8
    x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
    x265 [info]: ME / range / subpel / merge : star / 24 / 4 / 5
    x265 [info]: Keyframe min / max / scenecut / bias: 24 / 240 / 40 / 5.00
    x265 [info]: Cb/Cr QP Offset : -1 / -1
    x265 [info]: Lookahead / bframes / badapt : 15 / 4 / 0
    x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
    x265 [info]: References / ref-limit cu / depth : 3 / on / on
    x265 [info]: Rate Control / qCompress : CRF-21.0 / 0.60
    x265 [info]: tools: rd=2 psy-rd=4.25 rdoq=1 psy-rdoq=12.25 nr-intra=90
    x265 [info]: tools: nr-inter=190 signhide tmvp fast-intra
    x265 [info]: tools: refine-analysis-type=avc deblock(tC=0:B=-2)

    Note the additional "tool" refine-analysis-type=avc deployed in the hanging encoder which is not present in the working encoder.

    Looking through the SVT source, I see that you're using #ifdef's again, like this one in source/encoder/encoder.cpp:

    #ifdef SVT_HEVC
    X265_FREE(m_svtAppData);
    #endif

    With this, it doesn't matter whether the value of SVT_HEVC is TRUE or FALSE. It will include X265_FREE(m_svtAppData) merely for SVT_HEVC existing, regardless of value (even NULL). You'll REALLY want to check that this is the behavior that you want.
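
The difference between #ifdef and #if that this comment points at can be checked in isolation. In the sketch below, DEMO_FEATURE is a made-up macro (not SVT_HEVC) that is deliberately defined as 0; the two constants record which preprocessor branch was compiled. How the x265 build actually defines SVT_HEVC is not shown in this thread, so whether the distinction matters there is an open question.

```cpp
#include <cassert>

// DEMO_FEATURE is a made-up macro for this sketch; it is defined, but as 0.
#define DEMO_FEATURE 0

// #ifdef asks only "is the macro defined?", so this branch is compiled in.
#ifdef DEMO_FEATURE
constexpr bool kIfdefBranchTaken = true;
#else
constexpr bool kIfdefBranchTaken = false;
#endif

// #if evaluates the macro's value, so a value of 0 compiles the branch out.
#if DEMO_FEATURE
constexpr bool kIfBranchTaken = true;
#else
constexpr bool kIfBranchTaken = false;
#endif

// Both facts are known at compile time.
static_assert(kIfdefBranchTaken, "#ifdef ignores the macro's value");
static_assert(!kIfBranchTaken, "#if honours the macro's value");
```

If a build system only ever defines the macro when the feature is enabled, and leaves it undefined otherwise, #ifdef behaves as intended; the two forms diverge only when the macro can be defined with a false value.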

  12. Aruna Matheswaran

    Okay, so the issue seems to be in x265_copy_params(), which copies the params given by the application into the library.

    Could you verify this fix?

  13. Nik McNulty

    @Aruna Matheswaran

    I added dst->bAnalysisType = src->bAnalysisType; to param.cpp from 21db162c8622677c41a4fc77a14a59eb7326b46a (v3.1+8-21db162c8622), and ran an extended regime of 10-second previews as above.

    After 70+ encodes, it had not frozen once.

    So, if this also solves Michal Szymanski’s freezes, then we have a winner.
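
The class of bug fixed by that one-line change can be shown with a reduced example. DemoParam, copyParamsBuggy and copyParamsFixed below are hypothetical; the real x265_param and x265_copy_params() are far larger. The sketch assumes the destination struct holds some leftover value before the copy, which is one plausible way a missing assignment could produce a setting the application never requested.

```cpp
#include <cassert>

// Hypothetical, heavily reduced parameter struct for illustration only.
struct DemoParam {
    int bframes = 4;
    int analysisType = 0;  // stands in for bAnalysisType
};

// Field-by-field copy that forgets analysisType: dst keeps whatever it held.
void copyParamsBuggy(DemoParam* dst, const DemoParam* src) {
    assert(dst && src);
    dst->bframes = src->bframes;
}

// Same copy with the missing assignment added.
void copyParamsFixed(DemoParam* dst, const DemoParam* src) {
    assert(dst && src);
    dst->bframes = src->bframes;
    dst->analysisType = src->analysisType;
}
```

If the source has analysisType set to 0 and the destination starts out holding 1, the buggy copy leaves the destination at 1, while the fixed copy brings it back to 0. A hand-written field-by-field copy needs a matching assignment for every field added to the struct, which is easy to overlook.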

  14. Michal Szymanski reporter

    Over 1300 encodes and still no failures, so we can assume that’s the fix.

    Kudos to Nik for narrowing that down.

    I would like to know what’s next. Will there be a patched 3.1 release, or will this fix go only to master?
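
The confidence drawn from 1300 clean encodes can be put into numbers. The sketch below assumes each encode hangs independently with a fixed probability, which is a simplification and not something the thread establishes; probAllPass is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Probability that `runs` independent encodes all complete when each one
// hangs with probability `hangRate`: (1 - hangRate)^runs.
double probAllPass(double hangRate, int runs) {
    assert(hangRate >= 0.0 && hangRate <= 1.0 && runs >= 0);
    return std::pow(1.0 - hangRate, runs);
}
```

Using the hang rate reported at the top of the issue (once in 50-100 runs), probAllPass(1.0 / 100, 1300) is well under one in a hundred thousand, and probAllPass(1.0 / 50, 1300) is smaller still. Under the independence assumption, seeing 1300 clean runs by chance with the bug still present would be very unlikely.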

  15. Aruna Matheswaran

    Nik and Michal - Thanks for verifying the fix. The fix is pushed into Release_3.1 and the graft is available in default. You’ll get the fix in stable once we do the periodic merge of default into stable.
