x265 CPU utilization very low on a multi-numa sockets server
Hi, I am testing x265 on a server with two NUMA nodes, each with 36 cores. The x265 version is the 1.7 release, run with this command line:
./x265 --input-res 1920x1080 --input input.yuv --bitrate 1200 --vbv-maxrate 1380 --fps 20 --early-skip --preset fast -o test1.hevc
When running on the server, however, CPU utilization ranges from 27% to 35% (always below 40%), which means most of the CPU cores are idle. The encoder log follows:
x265 [info]: HEVC encoder version 1.7
x265 [info]: build info [Linux][GCC 4.4.6][64 bit] 8bpp
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2
x265 [warning]: --psnr used with AQ on: results will be invalid!
x265 [warning]: --tune psnr should be used if attempting to benchmark psnr!
x265 [info]: Main profile, Level-4 (Main tier)
x265 [info]: Thread pool 0 using 36 threads on NUMA node 0
x265 [info]: Thread pool 1 using 36 threads on NUMA node 1
x265 [info]: frame threads / pool features : 16 / wpp(34 rows)+pmode
x265 [warning]: VBV maxrate specified, but no bufsize, ignored
x265 [info]: Coding QT: max CU size, min CU size : 32 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 2 inter / 2 intra
x265 [info]: ME / range / subpel / merge : star / 57 / 1 / 2
x265 [info]: Keyframe min / max / scenecut : 20 / 250 / 40
x265 [info]: Lookahead / bframes / badapt : 60 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb / refs: 1 / 1 / 1 / 1
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 0.3 / 32 / 1
x265 [info]: Rate Control / qCompress : ABR-1200 kbps / 0.60
x265 [info]: tools: rect amp rd=4 rdoq=2 early-skip signhide tmvp b-intra
Comments (5)
reporter - changed title to x265 CPU utilization very low on a multi-numa sockets server
-
Some light reading: http://x265.readthedocs.org/en/latest/threading.html
Short summary: this is not a surprise. The large block sizes in HEVC make parallelism more difficult, and I don't expect a 1080p encode to fill that rather large machine. You might find that --lookahead-slices 6 helps a bit.
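To put rough numbers on why a 1080p encode runs out of parallel work, here is a small sketch of the wavefront (WPP) arithmetic for the reported settings (1920x1080, max CU size 32). It assumes the usual WPP constraint that each CTU row stays at least two CTUs behind the row above it. It looks at a single frame only and ignores frame threads and lookahead work, so treat it as an illustration rather than a model of the encoder.

```shell
#!/bin/sh
# Rough WPP arithmetic for the encode in the report (1920x1080, max CU size 32).
width=1920
height=1080
ctu=32

# Ceiling division: a partial CTU at the frame edge still needs its own row/column.
rows=$(( (height + ctu - 1) / ctu ))
cols=$(( (width + ctu - 1) / ctu ))

# Assumption: each row trails the row above by two CTUs, so at most
# ceil(cols / 2) rows of one frame can be in flight at the same time.
wave=$(( (cols + 1) / 2 ))
if [ "$rows" -lt "$wave" ]; then peak=$rows; else peak=$wave; fi

echo "CTU rows: $rows, CTU columns: $cols, peak concurrent rows per frame: $peak"
```

The 34 rows match the "wpp(34 rows)" line in the log above. Under this assumption a single frame keeps at most about 30 rows busy at its peak, and fewer while the wavefront ramps up and drains, which is well short of 72 threads.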
-
Also, at --rd 4 you might find --pmode to be beneficial.
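For reference, the reporter's command line with the two flags suggested in this thread appended might look like the sketch below. Nothing here has been measured: whether either flag raises utilization on this machine is an open question, and the posted log already lists pmode among the pool features, so it may have been active in the original run. The script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Original command from the report plus the flags suggested in the comments.
# Untested sketch: file names and rate settings are copied from the report as-is.
cmd="./x265 --input-res 1920x1080 --input input.yuv \
--bitrate 1200 --vbv-maxrate 1380 --fps 20 --early-skip --preset fast \
--lookahead-slices 6 --pmode \
-o test1.hevc"

# Print for review; run it with: eval "$cmd"
echo "$cmd"
```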
-
- changed status to resolved
No quick fixes available.