12-bit encoding broken in ver. 1.7+470
In ver. 1.7+470, 8-bit and 10-bit encoding work, but 12-bit encoding is broken. The output is unwatchable. Adding the --no-asm option makes a big difference, but the output is unwatchable in both cases.
Comments (27)
-
reporter After a closer look, it is quite strange: VS 2015 build with LTO (option -GL) is totally broken @ 12bit. GCC 5.2 build & VS 2015 build without LTO are simply bad @ 12bit (VS 2015 build without LTO has bit-identical output to GCC 5.2 build). In this message http://forum.doom9.org/showthread.php?p=1738104#post1738104 there is an example of 12bit encoding with preset medium with GCC 5.2 & VS 2015 LTO.
-
reporter After additional tests: this problem is not related to ver 1.7+470 (it was earlier); VS 2015 LTO multilib build is totally broken @ 12bit, VS 2015 LTO normal 12bit build is OK (like GCC build); 12bit --no-asm output differs from --asm SSE2 output; --asm SSE2, --asm SSSE3, --asm SSE4.2, --asm AVX outputs are identical.
-
reporter 12bit encoding of sample https://media.xiph.org/video/derf/y4m/720p50_parkrun_ter.y4m with option --preset slower shows bad quality, especially on tree branches.
-
I updated to ver. 1.7+478 and no longer see any broken output.
-
reporter Don't you see a mess in the top third of the screen after 12bit encoding of https://media.xiph.org/video/derf/y4m/720p50_parkrun_ter.y4m with option --preset slower? Maybe something is wrong with my video player; I will double check.
-
Thanks for your report! Some intermediate results overflow. I have sent the patches; please check them.
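For readers unfamiliar with this class of bug, here is a minimal sketch of how a sum of squared errors can overflow at 12 bits while staying safe at 10 bits. It is not x265's sse_pp code; the function names and the choice of a 32-bit accumulator are assumptions made for illustration. For a 32x64 block (2048 samples), the worst case at 10 bits is 2048 * 1023^2 = 2,143,291,392, which still fits in 32 bits, while at 12 bits it is 2048 * 4095^2 = 34,342,963,200, which does not.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; this is not x265's sse_pp code.
// Both compute the sum of squared differences between two equally sized blocks
// and assume samples of at most 12 bits, so a single squared difference fits
// comfortably in an int32_t.

// Accumulates in 32 bits, as a kernel sized for lower bit depths might.
uint32_t sse_block_u32(const std::vector<uint16_t>& a, const std::vector<uint16_t>& b)
{
    uint32_t sum = 0;
    for (std::size_t i = 0; i < a.size(); i++)
    {
        int32_t d = int32_t(a[i]) - int32_t(b[i]);
        sum += uint32_t(d * d); // unsigned addition wraps modulo 2^32
    }
    return sum;
}

// Same computation with a 64-bit accumulator, which cannot wrap for any
// realistic block size at 12 bits.
uint64_t sse_block_u64(const std::vector<uint16_t>& a, const std::vector<uint16_t>& b)
{
    uint64_t sum = 0;
    for (std::size_t i = 0; i < a.size(); i++)
    {
        int64_t d = int64_t(a[i]) - int64_t(b[i]);
        sum += uint64_t(d * d);
    }
    return sum;
}
```

Feeding both functions 2048 samples of 4095 against 2048 samples of 0 makes the 32-bit version return only the low 32 bits of the true sum, so the encoder would compare distortions that are badly wrong for high-error blocks.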
-
reporter I've checked the last 2 patches -- quality glitches are gone. Thanks! Still --no-asm output differs from normal (maybe I should apply another patch).
-
Yes, you need to try all three patches to fix these bugs.
-
reporter I've additionally applied your third patch, "fix PSYVALUE shift overflow, Issue #180 [OUTPUT CHANGE on 12bpp]", and Ramya's patch, "asm: fix sse_pp[32x64] sse2 asm for 12 bit", and the output still differs:
i:\speed\12b>x265 -D12 --no-asm 720p50_parkrun_ter.y4m w3-n.hevc
y4m [info]: 1280x720 fps 50/1 i420p8 sar 1:1 frames 0 - 503 of 504
raw [info]: output file: w3-n.hevc
x265 [info]: HEVC encoder version 1.7+478-365f7ed4d896
x265 [info]: build info [Windows][MSVC 1900][64 bit] 12bit
x265 [info]: using cpu capabilities: none!
x265 [info]: Main 12 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features : 2 / wpp(12 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 2
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / 0 / 0
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=0.30 signhide tmvp strong-intra-smoothing
x265 [info]: tools: deblock sao
x265 [info]: frame I: 3, Avg QP:30.45 kb/s: 13883.07
x265 [info]: frame P: 123, Avg QP:35.22 kb/s: 11264.51
x265 [info]: frame B: 378, Avg QP:38.97 kb/s: 452.37
x265 [info]: Weighted P-Frames: Y:0.8% UV:0.8%
x265 [info]: consecutive B-frames: 2.4% 2.4% 9.5% 64.3% 21.4%
encoded 504 frames in 128.44s (3.92 fps), 3171.00 kb/s, Avg QP:38.00

i:\speed\12b>x265 -D12 720p50_parkrun_ter.y4m w3.hevc
y4m [info]: 1280x720 fps 50/1 i420p8 sar 1:1 frames 0 - 503 of 504
raw [info]: output file: w3.hevc
x265 [info]: HEVC encoder version 1.7+478-365f7ed4d896
x265 [info]: build info [Windows][MSVC 1900][64 bit] 12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 12 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features : 2 / wpp(12 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 2
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / 0 / 0
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=0.30 signhide tmvp strong-intra-smoothing
x265 [info]: tools: deblock sao
x265 [info]: frame I: 3, Avg QP:30.45 kb/s: 13882.93
x265 [info]: frame P: 123, Avg QP:35.22 kb/s: 11265.20
x265 [info]: frame B: 378, Avg QP:38.97 kb/s: 449.16
x265 [info]: Weighted P-Frames: Y:0.8% UV:0.8%
x265 [info]: consecutive B-frames: 2.4% 2.4% 9.5% 64.3% 21.4%
encoded 504 frames in 50.34s (10.01 fps), 3168.75 kb/s, Avg QP:38.00
-
In later frames, more SATD functions need fixing. The workaround is to disable all of the SATD asm code. I am fixing these functions one by one and need more time.
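As a rough illustration of why SATD can break at 12 bits, the sketch below runs a 4x4 Hadamard transform with a configurable intermediate type. It is not x265's SATD code, and the thread does not say which registers overflowed; the 16-bit intermediate type is an assumption used here to show the effect. With 12-bit input, pixel differences can reach 4095, so the DC coefficient of a 4x4 block can reach 16 * 4095 = 65520, beyond the int16_t maximum of 32767. At 10 bits the same coefficient is at most 16 * 1023 = 16368, which fits.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical 4x4 SATD on a block of pixel differences (row-major, 16 values).
// T is the type used to store Hadamard intermediates: int16_t mimics a kernel
// built around 16-bit lanes (an assumption), int32_t is the safe reference.
// The usual final normalization (halving) is omitted for simplicity.
template <typename T>
uint32_t satd4x4(const int16_t* diff)
{
    T m[16];

    // Horizontal pass: 4-point Hadamard on each row.
    for (int r = 0; r < 4; r++)
    {
        const int16_t* d = diff + 4 * r;
        T s01 = T(d[0] + d[1]), d01 = T(d[0] - d[1]);
        T s23 = T(d[2] + d[3]), d23 = T(d[2] - d[3]);
        m[4 * r + 0] = T(s01 + s23);
        m[4 * r + 1] = T(s01 - s23);
        m[4 * r + 2] = T(d01 + d23);
        m[4 * r + 3] = T(d01 - d23);
    }

    // Vertical pass: 4-point Hadamard on each column, then sum of magnitudes.
    uint32_t sum = 0;
    for (int c = 0; c < 4; c++)
    {
        T s01 = T(m[c] + m[4 + c]),      d01 = T(m[c] - m[4 + c]);
        T s23 = T(m[8 + c] + m[12 + c]), d23 = T(m[8 + c] - m[12 + c]);
        // Storing into T here is where a 16-bit type loses the high bits.
        sum += uint32_t(std::abs(int32_t(T(s01 + s23))));
        sum += uint32_t(std::abs(int32_t(T(s01 - s23))));
        sum += uint32_t(std::abs(int32_t(T(d01 + d23))));
        sum += uint32_t(std::abs(int32_t(T(d01 - d23))));
    }
    return sum;
}
```

For a block whose 16 differences are all 4095, satd4x4<int32_t> returns 65520, while satd4x4<int16_t> returns a different, much smaller value because the DC coefficient no longer fits. For differences of 1023 (the 10-bit worst case) both instantiations agree.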
-
I sent the new patch, you need three patches to fix this bug, please try again, thanks!
-
reporter Now it is much better. Thanks! However, the outputs still differ at preset slower (I may have applied the wrong patches; I will confirm later):
i:\speed\12b>x265 -D12 --preset slower 720p50_parkrun_ter.y4m w.hevc
y4m [info]: 1280x720 fps 50/1 i420p8 sar 1:1 frames 0 - 503 of 504
raw [info]: output file: w.hevc
x265 [info]: HEVC encoder version 1.7+478-365f7ed4d896
x265 [info]: build info [Windows][GCC 5.2.0][64 bit] 12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 12 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features : 2 / wpp(12 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 2 inter / 2 intra
x265 [info]: ME / range / subpel / merge : star / 57 / 3 / 3
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt : 30 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 3 / 0 / 0
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rect amp rd=6 psy-rd=0.30 rdoq=2 psy-rdoq=1.00 signhide
x265 [info]: tools: tmvp b-intra strong-intra-smoothing deblock sao
x265 [info]: frame I: 3, Avg QP:31.06 kb/s: 12825.07
x265 [info]: frame P: 103, Avg QP:35.15 kb/s: 13088.21
x265 [info]: frame B: 398, Avg QP:38.28 kb/s: 519.31
x265 [info]: Weighted P-Frames: Y:1.0% UV:1.0%
x265 [info]: Weighted B-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 2.8% 0.0% 3.8% 50.0% 18.9% 12.3% 7.5% 1.9% 2.8%
encoded 504 frames in 245.25s (2.06 fps), 3161.20 kb/s, Avg QP:37.60

i:\speed\12b>x265 -D12 --preset slower --no-asm 720p50_parkrun_ter.y4m wn.hevc
y4m [info]: 1280x720 fps 50/1 i420p8 sar 1:1 frames 0 - 503 of 504
raw [info]: output file: wn.hevc
x265 [info]: HEVC encoder version 1.7+478-365f7ed4d896
x265 [info]: build info [Windows][GCC 5.2.0][64 bit] 12bit
x265 [info]: using cpu capabilities: none!
x265 [info]: Main 12 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features : 2 / wpp(12 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 2 inter / 2 intra
x265 [info]: ME / range / subpel / merge : star / 57 / 3 / 3
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt : 30 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 3 / 0 / 0
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rect amp rd=6 psy-rd=0.30 rdoq=2 psy-rdoq=1.00 signhide
x265 [info]: tools: tmvp b-intra strong-intra-smoothing deblock sao
x265 [info]: frame I: 3, Avg QP:31.06 kb/s: 12825.07
x265 [info]: frame P: 103, Avg QP:35.16 kb/s: 13068.81
x265 [info]: frame B: 398, Avg QP:38.24 kb/s: 516.09
x265 [info]: Weighted P-Frames: Y:1.0% UV:1.0%
x265 [info]: Weighted B-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 2.8% 0.0% 3.8% 50.0% 18.9% 12.3% 7.5% 1.9% 2.8%
encoded 504 frames in 526.53s (0.96 fps), 3154.69 kb/s, Avg QP:37.57
I applied the wrong patches, but the patch "[x265] [PATCH 1 of 2] fix SSE_PP intermedia result overflow in Main12, (fixes #180)" is the same as the older "[x265] [PATCH 2 of 2] fix SSE_PP intermedia result overflow in Main12, issue#180". So there should still be a problem @ preset slower.
-
reporter I found quality glitches at preset veryslow:
i:\speed\12b>x265vs -D12 --preset veryslow --no-asm 720p50_parkrun_ter.y4m vsn.hevc
y4m [info]: 1280x720 fps 50/1 i420p8 sar 1:1 frames 0 - 503 of 504
raw [info]: output file: vsn.hevc
x265 [info]: HEVC encoder version 1.7+478-365f7ed4d896
x265 [info]: build info [Windows][MSVC 1900][64 bit] 12bit
x265 [info]: using cpu capabilities: none!
x265 [info]: Main 12 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: frame threads / pool features : 2 / wpp(12 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 3 inter / 3 intra
x265 [info]: ME / range / subpel / merge : star / 57 / 4 / 4
x265 [info]: Keyframe min / max / scenecut : 25 / 250 / 40
x265 [info]: Lookahead / bframes / badapt : 40 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 5 / 0 / 0
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rect amp rd=6 psy-rd=0.30 rdoq=2 psy-rdoq=1.00 signhide
x265 [info]: tools: tmvp b-intra strong-intra-smoothing deblock sao
x265 [info]: frame I: 3, Avg QP:30.70 kb/s: 12827.73
x265 [info]: frame P: 104, Avg QP:35.15 kb/s: 12856.90
x265 [info]: frame B: 397, Avg QP:38.20 kb/s: 491.08
x265 [info]: Weighted P-Frames: Y:1.0% UV:1.0%
x265 [info]: Weighted B-Frames: Y:0.0% UV:0.0%
x265 [info]: consecutive B-frames: 2.8% 0.0% 3.7% 52.3% 17.8% 11.2% 8.4% 0.9% 2.8%
encoded 504 frames in 1047.48s (0.48 fps), 3116.19 kb/s, Avg QP:37.52
-
Thanks for your report. The asm output mismatch is because there is uncommitted code in my local tree; I will merge it into the patch. The quality issue has the same cause as before, intermediate result overflow. I have fixed these bugs; I am verifying the fixes and will send the patches once confirmed.
-
I sent the patches last night; please check again.
-
reporter Now quality is good, and the output is the same as with the --no-asm option. Thanks! When I apply the oldest patch, "fix PSYVALUE shift overflow, Issue #180 [OUTPUT CHANGE on 12bpp]", to ver. 1.7+478, it displays:
patching file source/common/quant.cpp
Hunk #1 succeeded at 585 with fuzz 2 (offset -3 lines).
It works, but I'm curious why there is a 3-line difference.
-
The offset is there because the patch is based on my C++ template rdoQuant patches.
-
reporter I think it is time to commit the patches; it will be easier for everyone.
-
The version tree is controlled by Deepthi; I guess she is on a business trip.
If the bug is solved, we can close this issue.
-
reporter I confirm, the bug was solved. Thanks!
-
- changed status to resolved
-
fix SSE_PP intermediate result overflow in Main12, (fixes #180) → <<cset 5a0c60944a2b>>
-
fix Main12 satd overflow bug up to SSE4, (fixes #180) → <<cset aa8f74c8e37c>>
-
fix PSYVALUE shift overflow (fixes #180) [OUTPUT CHANGE on 12bpp] → <<cset 66bcd8d1da99>>
-
fix SSE_PP intermediate result overflow in Main12, (fixes #180) → <<cset 07b556605fb8>>
-
fix Main12 satd overflow bug up to SSE4, (fixes #180) → <<cset 83dd03e51d25>>
-
Thanks for your report. I can reproduce it at preset slow and above. I have fixed the bug; the root cause is PSYVALUE().