Adding A New Encoder Performance Primitive
Use the following steps to add a new optimized encoder primitive function (also called an encoder primitive, or simply a primitive) to x265.
Following these steps reduces the risk of introducing bugs. Each step should roughly correspond to a single local commit. At the appropriate stage, the group of commits can be submitted to the mailing list as a coherent improvement.
Add a funcdef to primitives.h (if necessary) and add function pointer(s) to the EncoderPrimitives structure
Use an existing funcdef if one is available.
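As a rough illustration of this step, the sketch below declares a function pointer type and adds a matching member to a primitives table. The names blockcmp_t and sad_8x8, the 8-bit pixel typedef, and the reduced one-member EncoderPrimitives are assumptions made for the example; the real primitives.h defines its own funcdefs and a much larger structure.

```cpp
#include <cstdint>

// Assumption: an 8-bit build, so a pixel is one byte.
typedef uint8_t pixel;

// Hypothetical funcdef: compares two pixel blocks and returns a cost.
// Each block is described by a pointer to its first pixel and a row stride.
typedef int (*blockcmp_t)(const pixel* fenc, intptr_t fencStride,
                          const pixel* fref, intptr_t frefStride);

// Reduced stand-in for the primitives table. Each member is a function
// pointer that setup code later fills in with the best implementation.
struct EncoderPrimitives
{
    blockcmp_t sad_8x8 = nullptr; // hypothetical new primitive
};
```

Reusing one funcdef for every primitive with the same signature keeps the table compact, which is why an existing funcdef is preferred when one fits.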
Implement the C/C++ reference version
pixel.cpp, dct.cpp, intrapred.cpp, etc. already contain a number of C primitives. Create a new C++ file if your primitive does not fit an existing file.
Add p.YOURFUNC = YOUR_C_REFERENCE; to the Setup_C_FooPrimitives() function of this file.
If you are adding a new file, add the call to Setup_C_FooPrimitives() to Setup_C_Primitives() in primitives.cpp
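The registration described above might look roughly like the following. This is a minimal sketch, not x265 source: sad_8x8_c, blockcmp_t, and the one-member EncoderPrimitives are hypothetical, the setup functions take the table by reference as an assumption, and Setup_C_FooPrimitives keeps the placeholder name used in the text.

```cpp
#include <cstdint>
#include <cstdlib>

typedef uint8_t pixel; // assumption: 8-bit build

// Hypothetical funcdef and reduced primitives table (see primitives.h
// for the real definitions).
typedef int (*blockcmp_t)(const pixel*, intptr_t, const pixel*, intptr_t);
struct EncoderPrimitives { blockcmp_t sad_8x8 = nullptr; };

// C reference: sum of absolute differences over an 8x8 block.
static int sad_8x8_c(const pixel* a, intptr_t strideA,
                     const pixel* b, intptr_t strideB)
{
    int sum = 0;
    for (int y = 0; y < 8; y++)
    {
        for (int x = 0; x < 8; x++)
            sum += std::abs(a[x] - b[x]);
        a += strideA; // advance both blocks to the next row
        b += strideB;
    }
    return sum;
}

// Per-file setup: p.YOURFUNC = YOUR_C_REFERENCE;
void Setup_C_FooPrimitives(EncoderPrimitives& p)
{
    p.sad_8x8 = sad_8x8_c;
}

// Top-level setup: calls every per-file setup function. A new file's
// setup function must be added here or its pointers stay unset.
void Setup_C_Primitives(EncoderPrimitives& p)
{
    Setup_C_FooPrimitives(p);
}
```

After Setup_C_Primitives(p) runs, callers reach the reference through p.sad_8x8 and never name sad_8x8_c directly, which is what lets an optimized version replace it later.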
Add a new unit test for this primitive to the TestBench framework
- Your C/C++ version will be the reference against which the other implementations are verified.
- You are responsible for writing the logic which generates good test data for your function.
- Use an existing harness if one already generates the data you need.
- Your new unit test will not do anything interesting until you add vectorized or assembly versions of your new function(s).
- Consult x264 checkasm for inspiration.
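The idea behind such a unit test can be sketched as a comparison loop: feed the same random input to the C reference and to a candidate, and fail on the first mismatch. Everything here is illustrative and independent of the actual TestBench classes; check_sad_8x8, sad_8x8_c, and sad_8x8_opt are hypothetical names, and sad_8x8_opt is plain C++ standing in for a vectorized or assembly routine.

```cpp
#include <cstdint>
#include <cstdlib>
#include <random>

typedef uint8_t pixel; // assumption: 8-bit build
typedef int (*blockcmp_t)(const pixel*, intptr_t, const pixel*, intptr_t);

// C reference implementation (hypothetical primitive).
static int sad_8x8_c(const pixel* a, intptr_t sa, const pixel* b, intptr_t sb)
{
    int sum = 0;
    for (int y = 0; y < 8; y++, a += sa, b += sb)
        for (int x = 0; x < 8; x++)
            sum += std::abs(a[x] - b[x]);
    return sum;
}

// Stand-in for an optimized version; a real one would be SIMD or assembly.
static int sad_8x8_opt(const pixel* a, intptr_t sa, const pixel* b, intptr_t sb)
{
    int sum = 0;
    for (int y = 0; y < 8; y++, a += sa, b += sb)
        for (int x = 0; x < 8; x++)
        {
            int d = a[x] - b[x];
            sum += d < 0 ? -d : d;
        }
    return sum;
}

// Returns true if opt matches ref on every randomly generated input.
bool check_sad_8x8(blockcmp_t ref, blockcmp_t opt, int iterations = 1000)
{
    std::mt19937 rng(12345);                  // fixed seed: reproducible failures
    std::uniform_int_distribution<int> dist(0, 255);
    const intptr_t stride = 16;               // wider than the block to expose stride bugs
    pixel a[8 * 16], b[8 * 16];

    for (int i = 0; i < iterations; i++)
    {
        for (int j = 0; j < 8 * 16; j++)
        {
            a[j] = static_cast<pixel>(dist(rng));
            b[j] = static_cast<pixel>(dist(rng));
        }
        if (ref(a, stride, b, stride) != opt(a, stride, b, stride))
            return false;
    }
    return true;
}
```

Calling check_sad_8x8(sad_8x8_c, sad_8x8_opt) exercises the candidate against the reference. Uniform random data is only a starting point; extreme values such as all-zero or all-255 blocks are worth adding when the primitive can overflow or saturate.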
Use this new primitive function in the encoder
You should be replacing existing functionality with this C reference version with the expectation that optimized implementations will follow.
Compile the encoder with these changes, generate HEVC bitstreams, and verify that the outputs have not changed in any way (that is, your C reference behavior matches the HM code it is replacing).
Congratulations: these four commits can now be pushed together to the x265 repository, introducing a new performance primitive.
Add assembly versions for the architectures you care about
Add your new assembly routine to an appropriate existing file in common/x86, or make a new file and add it to common/CMakeLists.txt (ASMS variable).
Add logic to asm-primitives.cpp to set up the function pointer to your assembly routine for the appropriate CPU architecture level and X265_DEPTH build option.
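The dispatch logic in this last step can be sketched as follows. It is an assumption-laden illustration rather than the contents of asm-primitives.cpp: the DEMO_CPU_* flags, the Setup_Assembly_Primitives signature, and sad_8x8_sse2_standin are hypothetical, and the stand-in is ordinary C++ where a real build would declare an external assembly symbol. X265_DEPTH defaults to 8 here only so the sketch compiles on its own.

```cpp
#include <cstdint>
#include <cstdlib>

#ifndef X265_DEPTH
#define X265_DEPTH 8 // assumption for this sketch; normally set by the build
#endif

typedef uint8_t pixel;
typedef int (*blockcmp_t)(const pixel*, intptr_t, const pixel*, intptr_t);
struct EncoderPrimitives { blockcmp_t sad_8x8 = nullptr; };

// Hypothetical CPU capability flags (x265 defines its own constants).
enum : uint32_t { DEMO_CPU_SSE2 = 1u << 0, DEMO_CPU_AVX2 = 1u << 1 };

// C reference, installed first as the fallback.
static int sad_8x8_c(const pixel* a, intptr_t sa, const pixel* b, intptr_t sb)
{
    int sum = 0;
    for (int y = 0; y < 8; y++, a += sa, b += sb)
        for (int x = 0; x < 8; x++)
            sum += std::abs(a[x] - b[x]);
    return sum;
}

// Stand-in for an 8-bit SSE2 assembly routine.
static int sad_8x8_sse2_standin(const pixel* a, intptr_t sa,
                                const pixel* b, intptr_t sb)
{
    return sad_8x8_c(a, sa, b, sb);
}

// Overrides pointers only where the CPU and the bit depth both allow it.
void Setup_Assembly_Primitives(EncoderPrimitives& p, uint32_t cpuMask)
{
#if X265_DEPTH == 8
    if (cpuMask & DEMO_CPU_SSE2)
        p.sad_8x8 = sad_8x8_sse2_standin; // routine written for 8-bit pixels
#else
    (void)p;       // no high-bit-depth version exists in this sketch,
    (void)cpuMask; // so the C reference stays in place
#endif
}
```

Because the C setup runs first and the assembly setup only overwrites what it supports, every pointer remains valid on CPUs or bit depths that lack an optimized routine.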