fallback for atomic ops usage

Issue #85 closed
Brad Smith created an issue

Building x265 on a wide range of architectures we (OpenBSD) noticed that x265 uses compiler built in atomic ops. Any possibility of having a fallback code path that does not use the built-in atomic ops for archs where they are not available? We do not care about the performance so much as just being able to build and use x265 at all.

Comments (10)

  1. Steve Borho

    I wasn't aware that atomic operations were not supported on some PC platforms.

    Someone could do a quick-and-dirty hack to make the ATOMIC_* operators all use the same global mutex to protect those memory operations. That would make x265 functional, if a bit sub-optimal.

  2. Brad Smith reporter

    With the back port of commits..

    d3389bb9efd02246170504dca2662c54b4ae 814b687db30ec6a13c03375d7aafb822c5f2ff19aa45

    I can now build x265 1.4 on powerpc but it crashes in what looks like the motion estimation functions from the back trace. But it runs and encodes content just fine on amd64 / i386 and sparc64 (it built on alpha as is but I haven't done any run-time testing). Bulk package builds will kick off soon-ish to see if this helps building on hppa and mips64.

  3. djcj

    Not sure if this is helpful for builds on OpenBSD, but the Debian package uses this patch to add atomics for mips, mipsel and powerpc:

    Description: Use atomic on mips, mipsel and powerpc
    Author: Dejan Latinovi <Dejan.Latinovic@imgtec.com>,
     Sebastian Ramacher <sramacher@debian.org>
    Bug: https://bitbucket.org/multicoreware/x265/issue/93/__sync_val_compare_exchange_8-not
    Bug-Debian: https://bugs.debian.org/771556
    Last-Update: 2014-12-05
    
    --- a/source/CMakeLists.txt
    +++ b/source/CMakeLists.txt
    @@ -50,6 +50,8 @@
         message(STATUS "Detected ARM target processor")
         set(ARM 1)
         add_definitions(-DX265_ARCH_ARM=1 -DHAVE_ARMV6=1)
    +elseif("${SYSPROC}" STREQUAL "ppc" OR "${SYSPROC}" STREQUAL "mips")
    +    set(NEEDS_ATOMIC 1)
     else()
         message(STATUS "CMAKE_SYSTEM_PROCESSOR value `${CMAKE_SYSTEM_PROCESSOR}` is unknown")
         message(STATUS "Please add this value near ${CMAKE_CURRENT_LIST_FILE}:${CMAKE_CURRENT_LIST_LINE}")
    @@ -61,6 +63,9 @@
         if(LIBRT)
             set(PLATFORM_LIBS ${PLATFORM_LIBS} rt)
         endif()
    +    if(NEEDS_ATOMIC)
    +        set(PLATFORM_LIBS ${PLATFORM_LIBS} atomic)
    +    endif()
     endif(UNIX)
    
     # Compiler detection
    --- a/source/common/threading.h
    +++ b/source/common/threading.h
    @@ -51,8 +51,15 @@
    
     #define CLZ32(id, x)                        id = (unsigned long)__builtin_clz(x) ^ 31
     #define CTZ64(id, x)                        id = (unsigned long)__builtin_ctzll(x)
    +
    +#if (defined(__mips__) && !defined(__mips64)) || (defined(__powerpc__) && !defined(__powerpc64__))
    +#define ATOMIC_OR(ptr, mask)                __atomic_or_fetch(ptr, mask, __ATOMIC_SEQ_CST)
    +#define ATOMIC_CAS(ptr, oldval, newval)     __atomic_compare_exchange(ptr, &oldval, &newval, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)
    +#else
     #define ATOMIC_OR(ptr, mask)                __sync_or_and_fetch(ptr, mask)
     #define ATOMIC_CAS(ptr, oldval, newval)     __sync_val_compare_and_swap(ptr, oldval, newval)
    +#endif
    +
     #define ATOMIC_CAS32(ptr, oldval, newval)   __sync_val_compare_and_swap(ptr, oldval, newval)
     #define ATOMIC_INC(ptr)                     __sync_add_and_fetch((volatile int32_t*)ptr, 1)
     #define ATOMIC_DEC(ptr)                     __sync_add_and_fetch((volatile int32_t*)ptr, -1)
    
  4. Brad Smith reporter

    No, the issue has not been resolved.

    The switch to 32-bit atomic ops allowed x265 to build on PowerPC, but it was crashing and started working as of 1.6 I believe it was. But it still does not build on some other archs such as MIPS, HPPA and I'm sure this would affect us on ARM and probably SPARC (32-bit).

    I see x264 does use a mutex as you mention. There is no expectation that performance is great. Just that the code builds and works.

  5. Brad Smith reporter

    To make things more transparent for the user... have CMake check for the existence of __sync_fetch_and_or(), compile + link, if fail fall back to NO_ATOMICS.

  6. Log in to comment