Windows: 2785+: Crash running x64 build on processors that do not support vmovaps

Issue #1999 resolved
Dmitry Azaraev
created an issue

Environment to reproduce:

  • CEF 3.2785.1480.g162e9a9 x64 build. The x86 build works.
  • Windows Server 2012 R2 (not sure whether this actually makes a difference)
  • CPU: Intel Xeon X5560 (not sure whether this actually makes a difference, but the exception points to an invalid instruction)

cefclient starts normally and creates a renderer process, then crashes immediately after the renderer process is created. No error or fatal entries appear in the log. There is no way to catch the error other than a crash dump.

I created a dump file via WER (whoa, it worked this time) and got the following results:

(2344.24c4): Illegal instruction - code c000001d (first/second chance not available)
ntdll!NtWaitForMultipleObjects+0xa:
00007ffc`04a00c6a c3              ret

0:000> .ecxr
*** WARNING: Unable to verify checksum for libcef.dll
rax=0000000000000003 rbx=0000003a1d48503c rcx=0000003a16e3c010
rdx=0000003a1d48503c rsi=0000003a16e3c130 rdi=0000003a16e3c140
rip=00007ffbcc99045c rsp=0000003a16e3bfb8 rbp=0000003a16e3c029
r8=0000003a16e3c130  r9=0000000000000002 r10=0000003a1d48503c
r11=0000003a16e3c130 r12=0000000000000000 r13=0000003a1b4110d0
r14=0000003a1d48503c r15=0000003a16e3c500
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010246
libcef!SkNx<4,float>::SkNx<4,float>:
00007ffb`cc99045c c5f828c1        vmovaps xmm0,xmm1


Stack Trace:

.  0  Id: 2344.24c4 Suspend: 0 Teb: 00007ff6`eddee000 Unfrozen
 # Child-SP          RetAddr           Call Site
00 0000003a`16e3a8d8 00007ffc`01e513ed ntdll!NtWaitForMultipleObjects+0xa
01 0000003a`16e3a8e0 00007ffc`03d27d51 KERNELBASE!WaitForMultipleObjectsEx+0xe1
02 0000003a`16e3abc0 00007ffc`03d27773 kernel32!WerpReportFaultInternal+0x581
03 0000003a`16e3b130 00007ffc`01f31fdf kernel32!WerpReportFault+0x83
04 0000003a`16e3b160 00007ffc`04a0f133 KERNELBASE!UnhandledExceptionFilter+0x23f
05 0000003a`16e3b250 00007ffc`049f1d86 ntdll!RtlUserThreadStart$filt$0+0x3e
06 0000003a`16e3b290 00007ffc`04a033fd ntdll!_C_specific_handler+0x96
07 0000003a`16e3b300 00007ffc`049c4847 ntdll!RtlpExecuteHandlerForException+0xd
08 0000003a`16e3b330 00007ffc`04a0258a ntdll!RtlDispatchException+0x197
09 0000003a`16e3ba00 00007ffb`cc99045c ntdll!KiUserExceptionDispatch+0x3a
0a 0000003a`16e3bfb8 00007ffb`ceb6b495 libcef!SkNx<4,float>::SkNx<4,float>(float a = 3.621263742e-036, float b = 0, float c = 0, float d = -3.621951103) [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\opts\sknx_sse.h @ 73]
0b 0000003a`16e3bfc0 00007ffb`ceb6bb0f libcef!SkMatrix::Scale_pts(class SkMatrix * m = <Value unavailable error>, struct SkPoint * dst = 0x00000000`00000040, struct SkPoint * src = 0x00000000`00001fa0, int count = 0n384024536)+0x71 [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skmatrix.cpp @ 960]
0c (Inline Function) --------`-------- libcef!SkMatrix::mapPoints+0x34 [h:\cef\build\chromium_git\chromium\src\third_party\skia\include\core\skmatrix.h @ 436]
0d 0000003a`16e3c090 00007ffb`ceb52d12 libcef!SkMatrix::mapRect(struct SkRect * dst = 0x0000003a`1d48503c, struct SkRect * src = 0x0000003a`16e3c130)+0x6f [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skmatrix.cpp @ 1105]
0e 0000003a`16e3c100 00007ffb`ceb52bab libcef!SkCanvas::getClipBounds(struct SkRect * bounds = 0x0000003a`1d48503c)+0xee [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skcanvas.cpp @ 1855]
0f (Inline Function) --------`-------- libcef!SkCanvas::getLocalClipBounds+0x1c [h:\cef\build\chromium_git\chromium\src\third_party\skia\include\core\skcanvas.h @ 1521]
10 0000003a`16e3c190 00007ffb`ceb53c42 libcef!SkCanvas::quickReject(struct SkRect * rect = 0x0000003a`16e3c2f0)+0x167 [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skcanvas.cpp @ 1813]
11 0000003a`16e3c200 00007ffb`cead2980 libcef!SkCanvas::onDrawRect(struct SkRect * r = 0x0000003a`16e3c500, class SkPaint * paint = 0x0000003a`1b412940)+0x156 [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skcanvas.cpp @ 2139]
12 (Inline Function) --------`-------- libcef!SkCanvas::drawRect+0x1f [h:\cef\build\chromium_git\chromium\src\third_party\skia\src\core\skcanvas.cpp @ 1919]
13 0000003a`16e3c4e0 00007ffb`cead1f4c libcef!cc::SoftwareRenderer::DrawSolidColorQuad(class cc::SolidColorDrawQuad * quad = 0x0000003a`1d483e38)+0x1f0 [h:\cef\build\chromium_git\chromium\src\cc\output\software_renderer.cc @ 405]
14 0000003a`16e3c5c0 00007ffb`ceb112a4 libcef!cc::SoftwareRenderer::DoDrawQuad(struct cc::DirectRenderer::DrawingFrame * frame = 0x0000003a`16e3cc00, class cc::DrawQuad * quad = 0x0000003a`1d483e38, class gfx::QuadF * draw_region = 0x00000000`00000000)+0x5dc [h:\cef\build\chromium_git\chromium\src\cc\output\software_renderer.cc @ 311]
15 0000003a`16e3c870 00007ffb`ceb10bc9 libcef!cc::DirectRenderer::DrawRenderPass(struct cc::DirectRenderer::DrawingFrame * frame = 0x0000003a`16e3cc00, class cc::RenderPass * render_pass = 0x0000003a`1d466540)+0x664 [h:\cef\build\chromium_git\chromium\src\cc\output\direct_renderer.cc @ 499]
16 0000003a`16e3ca10 00007ffb`ceb102a8 libcef!cc::DirectRenderer::DrawRenderPassAndExecuteCopyRequests(struct cc::DirectRenderer::DrawingFrame * frame = 0x0000003a`16e3cc00, class cc::RenderPass * render_pass = 0x0000003a`1d466540)+0xa9 [h:\cef\build\chromium_git\chromium\src\cc\output\direct_renderer.cc @ 430]
17 0000003a`16e3ca50 00007ffb`cf3038d2 libcef!cc::DirectRenderer::DrawFrame(class std::vector<std::unique_ptr<cc::RenderPass,std::default_delete<cc::RenderPass> >,std::allocator<std::unique_ptr<cc::RenderPass,std::default_delete<cc::RenderPass> > > > * render_passes_in_draw_order = 0x0000003a`1b591cd8 { size=1 }, float device_scale_factor = <Value unavailable error>, class gfx::ColorSpace * device_color_space = 0x0000003a`1b411118, class gfx::Rect * device_viewport_rect = 0x0000003a`16e3ce48, class gfx::Rect * device_clip_rect = 0x0000003a`16e3cee8, bool disable_picture_quad_image_filtering = false)+0x698 [h:\cef\build\chromium_git\chromium\src\cc\output\direct_renderer.cc @ 281]
18 0000003a`16e3cd90 00007ffb`cf305774 libcef!cc::Display::DrawAndSwap(void)+0x44a [h:\cef\build\chromium_git\chromium\src\cc\surfaces\display.cc @ 301]
19 0000003a`16e3d120 00007ffb`cf305fe7 libcef!cc::DisplayScheduler::DrawAndSwap(void)+0x84 [h:\cef\build\chromium_git\chromium\src\cc\surfaces\display_scheduler.cc @ 118]
1a 0000003a`16e3d1e0 00007ffb`cf3060c1 libcef!cc::DisplayScheduler::AttemptDrawAndSwap(void)+0x73 [h:\cef\build\chromium_git\chromium\src\cc\surfaces\display_scheduler.cc @ 275]
1b 0000003a`16e3d210 00007ffb`cc9799f2 libcef!cc::DisplayScheduler::OnBeginFrameDeadline(void)+0x79 [h:\cef\build\chromium_git\chromium\src\cc\surfaces\display_scheduler.cc @ 294]
1c (Inline Function) --------`-------- libcef!base::internal::RunnableAdapter<void +0x11 [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 171]
1d (Inline Function) --------`-------- libcef!base::internal::InvokeHelper<1,void>::MakeItSo+0x2d [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 309]
1e (Inline Function) --------`-------- libcef!base::internal::Invoker<base::internal::BindState<base::internal::RunnableAdapter<void +0x2d [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 363]
1f 0000003a`16e3d280 00007ffb`cc9799f2 libcef!base::internal::Invoker<base::internal::BindState<base::internal::RunnableAdapter<void (class base::internal::BindStateBase * base = 0x00007ffc`0499da0b)+0x3a [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 346]
20 (Inline Function) --------`-------- libcef!base::internal::RunnableAdapter<void +0x11 [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 171]
21 (Inline Function) --------`-------- libcef!base::internal::InvokeHelper<1,void>::MakeItSo+0x2d [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 309]
22 (Inline Function) --------`-------- libcef!base::internal::Invoker<base::internal::BindState<base::internal::RunnableAdapter<void +0x2d [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 363]
23 0000003a`16e3d2b0 00007ffb`ce9c5214 libcef!base::internal::Invoker<base::internal::BindState<base::internal::RunnableAdapter<void (class base::internal::BindStateBase * base = 0x00007ffc`0499da0b)+0x3a [h:\cef\build\chromium_git\chromium\src\base\bind_internal.h @ 346]
24 (Inline Function) --------`-------- libcef!base::Callback<void __cdecl+0x8 [h:\cef\build\chromium_git\chromium\src\base\callback.h @ 389]
25 0000003a`16e3d2e0 00007ffb`ce95e763 libcef!base::debug::TaskAnnotator::RunTask(char * queue_function = 0x00007ffb`cff71798 "MessageLoop::PostTask", struct base::PendingTask * pending_task = 0x0000003a`16e3e700)+0x184 [h:\cef\build\chromium_git\chromium\src\base\debug\task_annotator.cc @ 53]
26 0000003a`16e3d3d0 00007ffb`ce95f6e9 libcef!base::MessageLoop::RunTask(struct base::PendingTask * pending_task = 0x0000003a`16e3e700)+0x453 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 491]
27 (Inline Function) --------`-------- libcef!base::MessageLoop::DeferOrRunPendingTask+0x181 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 499]
28 0000003a`16e3e6e0 00007ffb`ce9b6a91 libcef!base::MessageLoop::DoWork(void)+0x4a9 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 622]
29 0000003a`16e3eab0 00007ffb`ce9b6764 libcef!base::MessagePumpForUI::DoRunLoop(void)+0x71 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_pump_win.cc @ 263]
2a 0000003a`16e3eb20 00007ffb`ce9a4f3d libcef!base::MessagePumpWin::Run(class base::MessagePump::Delegate * delegate = <Value unavailable error>)+0x54 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_pump_win.cc @ 142]
2b (Inline Function) --------`-------- libcef!base::MessageLoop::RunHandler+0x15 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 454]
2c 0000003a`16e3eb70 00007ffb`ce95dc31 libcef!base::RunLoop::Run(void)+0xed [h:\cef\build\chromium_git\chromium\src\base\run_loop.cc @ 36]
2d 0000003a`16e3ebc0 00007ffb`cc938a3d libcef!base::MessageLoop::Run(void)+0x41 [h:\cef\build\chromium_git\chromium\src\base\message_loop\message_loop.cc @ 290]
*** WARNING: Unable to verify checksum for cefclient.exe
2e (Inline Function) --------`-------- libcef!CefBrowserMessageLoop::RunMessageLoop+0x8 [h:\cef\build\chromium_git\chromium\src\cef\libcef\browser\browser_message_loop.cc @ 126]
2f (Inline Function) --------`-------- libcef!CefRunMessageLoop+0x3b [h:\cef\build\chromium_git\chromium\src\cef\libcef\browser\context.cc @ 206]
30 0000003a`16e3ec20 00007ff6`ee482546 libcef!cef_run_message_loop(void)+0x41 [h:\cef\build\chromium_git\chromium\src\cef\libcef_dll\libcef_dll.cc @ 351]
31 (Inline Function) --------`-------- cefclient!CefRunMessageLoop+0x6 [h:\cef\build\chromium_git\chromium\src\cef\libcef_dll\wrapper\libcef_dll_wrapper.cc @ 342]
32 0000003a`16e3ec50 00007ff6`ee4a965e cefclient!client::MainMessageLoopStd::Run(void)+0xa [h:\cef\build\chromium_git\chromium\src\cef\tests\cefclient\browser\main_message_loop_std.cc @ 16]
33 0000003a`16e3ec80 00007ff6`ee4fd5a3 cefclient!client::`anonymous namespace'::RunMain(struct HINSTANCE__ * hInstance = <Value unavailable error>)+0x6d2 [h:\cef\build\chromium_git\chromium\src\cef\tests\cefclient\cefclient_win.cc @ 106]
34 (Inline Function) --------`-------- cefclient!invoke_main+0x21 [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 113]
35 0000003a`16e3f740 00007ffc`03c213d2 cefclient!__scrt_common_main_seh(void)+0x117 [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 253]
36 0000003a`16e3f780 00007ffc`049854e4 kernel32!BaseThreadInitThunk+0x22
37 0000003a`16e3f7b0 00000000`00000000 ntdll!RtlUserThreadStart+0x34

Faulted code:

--- h:\cef\build\chromium_git\chromium\src\third_party\skia\src\opts\sknx_sse.h 
    71:     static SkNx Load(const void* ptr) { return _mm_loadu_ps((const float*)ptr); }
    72: 
    73:     SkNx(float a, float b, float c, float d) : fVec(_mm_setr_ps(a,b,c,d)) {}
00007FFBCC99045C C5 F8 28 C1          vmovaps     xmm0,xmm1  
00007FFBCC990460 C4 E3 79 21 C2 10    vinsertps   xmm0,xmm0,xmm2,10h  
00007FFBCC990466 C4 E3 79 21 C3 20    vinsertps   xmm0,xmm0,xmm3,20h  
00007FFBCC99046C C4 E3 79 21 44 24 28 30 vinsertps   xmm0,xmm0,dword ptr [d],30h  
00007FFBCC990474 C5 F8 11 01          vmovups     xmmword ptr [rcx],xmm0  
00007FFBCC990478 48 8B C1             mov         rax,rcx  
00007FFBCC99047B C3                   ret  
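
The faulting bytes C5 F8 28 C1 begin with a 2-byte VEX prefix (C5), which marks an AVX encoding; the legacy SSE encoding of the same register move, movaps xmm0,xmm1, is 0F 28 C1. The following is a minimal sketch of splitting that prefix into its fields, assuming 64-bit code. The names Vex2 and decode_vex2 are made up for illustration and are not part of CEF, Skia or any debugger API.

```cpp
#include <cstdint>

// Fields of a 2-byte VEX prefix (first byte C5) as found in 64-bit code.
struct Vex2 {
    bool is_vex2 = false;     // true when the first byte is C5
    bool r_inverted = false;  // bit 7 of the second byte (stored inverted)
    unsigned vvvv_raw = 0;    // bits 6..3 (stored inverted; 0xF = no extra source)
    bool l256 = false;        // bit 2: false = 128-bit xmm, true = 256-bit ymm
    unsigned pp = 0;          // bits 1..0: 0 = none, 1 = 66, 2 = F3, 3 = F2
    std::uint8_t opcode = 0;  // opcode byte in the implied 0F map
};

// Decodes the first three bytes of an instruction; the caller must supply
// at least three readable bytes.
Vex2 decode_vex2(const std::uint8_t* bytes) {
    Vex2 out;
    if (bytes[0] != 0xC5) return out;  // not a 2-byte VEX prefix
    const std::uint8_t b = bytes[1];
    out.is_vex2 = true;
    out.r_inverted = (b & 0x80) != 0;
    out.vvvv_raw = (b >> 3) & 0xF;
    out.l256 = (b & 0x04) != 0;
    out.pp = b & 0x03;
    out.opcode = bytes[2];
    return out;
}
```

For C5 F8 28 C1 this reports a 128-bit operation with opcode 28, i.e. the same move as the SSE form but in an encoding that a CPU without AVX rejects as an illegal instruction.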

I'm not sure what is happening: does the CPU not support an SSE2 instruction, or is the instruction really invalid? CPUID says that it supports even SSE4.1... Also, the x86 build works on the same host, and the x64 build works on an i7-4770. So could this really be related to which instructions the CPU supports?

PS: What are the default CEF requirements for the target CPU?

Comments (15)

  1. Dmitry Azaraev reporter

    It looks like the CEF build compiles Skia with AVX2 support. The Intel Xeon X5560 doesn't support AVX or AVX2 (but it does support all the SSE extensions). So this looks like the root of the problem. We need to find a way to tweak the build options.
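
Whether a given host can execute AVX code can be checked at runtime. The sketch below is one way to do it, assuming an x86 target built with MSVC, GCC or Clang: it reads CPUID leaf 1 and, when the OS has enabled XSAVE, XCR0. The function names avx_usable and cpu_has_avx and the HAVE_X86_CPUID macro are hypothetical and not taken from CEF or Chromium.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>
#include <immintrin.h>
#define HAVE_X86_CPUID 1
#elif (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#endif

// Pure decision logic, kept separate so it can be checked on any machine.
// ecx  : ECX returned by CPUID leaf 1
// xcr0 : XCR0 value (pass 0 when OSXSAVE is not set)
bool avx_usable(std::uint32_t ecx, std::uint64_t xcr0) {
    const bool osxsave = ((ecx >> 27) & 1u) != 0;      // OS uses XSAVE/XRSTOR
    const bool avx = ((ecx >> 28) & 1u) != 0;          // CPU implements AVX
    const bool os_saves_ymm = (xcr0 & 0x6u) == 0x6u;   // XMM and YMM state enabled
    return osxsave && avx && os_saves_ymm;
}

// Queries the current CPU; returns false on targets this sketch does not cover.
bool cpu_has_avx() {
#if defined(HAVE_X86_CPUID) && defined(_MSC_VER)
    int regs[4];
    __cpuid(regs, 1);
    const std::uint32_t ecx = static_cast<std::uint32_t>(regs[2]);
    // XGETBV faults unless OSXSAVE is set, so only read XCR0 in that case.
    const std::uint64_t xcr0 = ((ecx >> 27) & 1u) ? _xgetbv(0) : 0;
    return avx_usable(ecx, xcr0);
#elif defined(HAVE_X86_CPUID)
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return false;
    std::uint64_t xcr0 = 0;
    if ((ecx >> 27) & 1u) {
        unsigned lo, hi;
        __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
        xcr0 = (static_cast<std::uint64_t>(hi) << 32) | lo;
    }
    return avx_usable(ecx, xcr0);
#else
    return false;
#endif
}
```

On a host like the X5560 described above, cpu_has_avx() would be expected to return false, while SSE4.1 is still reported by CPUID.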

  2. Dmitry Azaraev reporter

    Google Chrome Version 53.0.2785.116 m (64-bit) works on the target host, so this looks specific to the CEF build. Official builds are now built with 2015U3; I tried building with MSVS 2015 Update 3.1 and still got the same result. Maybe building with 2015 Update 2 will help.

  3. Dmitry Azaraev reporter

    This is a compiler issue (LTCG): 2015U3 produces a result that depends on object file ordering. If the first object file contains the AVX instruction set, then the following objects are also generated with the AVX instruction set, even if they were compiled for a lower instruction set. I.e. cl /ltcg sse2.obj avx.obj produces a correct result, but cl /ltcg avx.obj sse2.obj now produces an incorrect result (the image appears to work fine, but requires AVX).

  4. Dmitry Azaraev reporter

    Because I did not encounter any problems building the 2785 branch with 2015 Update 3 other than this one, it makes sense to track this problem down further. As I said before, it is tied to a possible bug in LTCG.

    vs2015u3-ltcg-bug-1.zip includes a build.cmd script which builds avx-sse.exe and sse-avx.exe. These executables are built from the same obj modules; the only difference is the order in which the object files are passed to the linker.

    *.disasm files are also generated so that the difference is easy to see without touching a debugger.

    Tested with Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24215.1 for x64.

    SSE-AVX: Correct case:

    ?use_sse@@YAXXZ:
      000000014000110C: 48 83 EC 38        sub         rsp,38h
      0000000140001110: 0F 28 2D 19 DC 04  movaps      xmm5,xmmword ptr [__xmm@4080000040400000400000003f800000]
                        00
      0000000140001117: 48 8D 0D DA DB 04  lea         rcx,[??_C@_0BC@POEDNAAP@SSE?3?5?$CFf?5?$CFf?5?$CFf?5?$CFf?6?$AA@]
                        00
      000000014000111E: 0F 28 C5           movaps      xmm0,xmm5
      0000000140001121: 0F 57 E4           xorps       xmm4,xmm4
      0000000140001124: 0F C6 C5 FF        shufps      xmm0,xmm5,0FFh
      0000000140001128: 0F 28 CD           movaps      xmm1,xmm5
      000000014000112B: F3 0F 5A E0        cvtss2sd    xmm4,xmm0
      000000014000112F: 0F C6 CD AA        shufps      xmm1,xmm5,0AAh
      0000000140001133: 0F 57 DB           xorps       xmm3,xmm3
      0000000140001136: F3 0F 5A D9        cvtss2sd    xmm3,xmm1
      000000014000113A: 0F 28 C5           movaps      xmm0,xmm5
      000000014000113D: F2 0F 11 64 24 20  movsd       mmword ptr [rsp+20h],xmm4
      0000000140001143: 0F C6 C5 55        shufps      xmm0,xmm5,55h
      0000000140001147: 0F 57 D2           xorps       xmm2,xmm2
      000000014000114A: 0F 57 C9           xorps       xmm1,xmm1
      000000014000114D: 66 49 0F 7E D9     movd        r9,xmm3
      0000000140001152: F3 0F 5A D0        cvtss2sd    xmm2,xmm0
      0000000140001156: F3 0F 5A CD        cvtss2sd    xmm1,xmm5
      000000014000115A: 66 49 0F 7E D0     movd        r8,xmm2
      000000014000115F: 66 48 0F 7E CA     movd        rdx,xmm1
      0000000140001164: E8 FB FE FF FF     call        printf
      0000000140001169: 48 83 C4 38        add         rsp,38h
      000000014000116D: C3                 ret
    

    AVX-SSE: incorrect case:

    ?use_sse@@YAXXZ:
      000000014000117C: 48 83 EC 48        sub         rsp,48h
      0000000140001180: F3 0F 10 05 A8 DB  movss       xmm0,dword ptr [__real@40800000]
                        04 00
      0000000140001188: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
      000000014000118D: F3 0F 10 1D 97 DB  movss       xmm3,dword ptr [__real@40400000]
                        04 00
      0000000140001195: F3 0F 10 15 8B DB  movss       xmm2,dword ptr [__real@40000000]
                        04 00
      000000014000119D: F3 0F 10 0D 7F DB  movss       xmm1,dword ptr [__real@3f800000]
                        04 00
      00000001400011A5: F3 0F 11 44 24 20  movss       dword ptr [rsp+20h],xmm0
      00000001400011AB: E8 08 FF FF FF     call        ??0Sk4f@@QEAA@MMMM@Z
      00000001400011B0: BA 03 00 00 00     mov         edx,3
      00000001400011B5: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
      00000001400011BA: E8 19 FF FF FF     call        ??ASk4f@@QEBAMH@Z
      00000001400011BF: 0F 57 E4           xorps       xmm4,xmm4
      00000001400011C2: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
      00000001400011C7: BA 02 00 00 00     mov         edx,2
      00000001400011CC: F3 0F 5A E0        cvtss2sd    xmm4,xmm0
      00000001400011D0: E8 03 FF FF FF     call        ??ASk4f@@QEBAMH@Z
      00000001400011D5: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
      00000001400011DA: BA 01 00 00 00     mov         edx,1
      00000001400011DF: 0F 57 DB           xorps       xmm3,xmm3
      00000001400011E2: F3 0F 5A D8        cvtss2sd    xmm3,xmm0
      00000001400011E6: E8 ED FE FF FF     call        ??ASk4f@@QEBAMH@Z
      00000001400011EB: 48 8D 4C 24 30     lea         rcx,[rsp+30h]
      00000001400011F0: 33 D2              xor         edx,edx
      00000001400011F2: 0F 57 D2           xorps       xmm2,xmm2
      00000001400011F5: F3 0F 5A D0        cvtss2sd    xmm2,xmm0
      00000001400011F9: E8 DA FE FF FF     call        ??ASk4f@@QEBAMH@Z
      00000001400011FE: 48 8D 0D 0B DB 04  lea         rcx,[??_C@_0BC@POEDNAAP@SSE?3?5?$CFf?5?$CFf?5?$CFf?5?$CFf?6?$AA@]
                        00
      0000000140001205: 0F 57 C9           xorps       xmm1,xmm1
      0000000140001208: F2 0F 11 64 24 20  movsd       mmword ptr [rsp+20h],xmm4
      000000014000120E: F3 0F 5A C8        cvtss2sd    xmm1,xmm0
      0000000140001212: 66 49 0F 7E D9     movd        r9,xmm3
      0000000140001217: 66 49 0F 7E D0     movd        r8,xmm2
      000000014000121C: 66 48 0F 7E CA     movd        rdx,xmm1
      0000000140001221: E8 3E FE FF FF     call        printf
      0000000140001226: 48 83 C4 48        add         rsp,48h
      000000014000122A: C3                 ret
    
    ??0Sk4f@@QEAA@MMMM@Z:
      00000001400010B8: C5 F8 28 C1        vmovaps     xmm0,xmm1
      00000001400010BC: C4 E3 79 21 C2 10  vinsertps   xmm0,xmm0,xmm2,10h
      00000001400010C2: C4 E3 79 21 C3 20  vinsertps   xmm0,xmm0,xmm3,20h
      00000001400010C8: C4 E3 79 21 44 24  vinsertps   xmm0,xmm0,dword ptr [rsp+28h],30h
                        28 30
      00000001400010D0: C5 F8 11 01        vmovups     xmmword ptr [rcx],xmm0
      00000001400010D4: 48 8B C1           mov         rax,rcx
      00000001400010D7: C3                 ret
    

    So, what's the difference? In the first case the Sk4f constructor is completely inlined and the function contains only SSE instructions. In the second case the function body looks fine, but the Sk4f constructor is not inlined. If we look at the constructor code (listed above), it is built with AVX instructions. So our SSE-only code no longer works on CPUs without the AVX instruction set, and this depends entirely on the order of the object files passed to the linker.

    Update: In the CEF build I got the crash exactly in the Sk4f constructor, which looks very similar.

  5. Dmitry Azaraev reporter

    This script re-sorts the object files in the libcef.ninja file. I built libcef with 2015U3 using this order and it appears to work: cefclient runs on a non-AVX host.

    The script moves the following files to the end of the list:

    obj/third_party/libjpeg_turbo/simd_asm/jfdctflt-sse-64.o
    obj/media/base/media_yasm/convert_yuv_to_rgb_sse.o
    obj/media/base/media_yasm/linear_scale_yuv_to_rgb_sse.o
    obj/media/base/media_yasm/scale_yuv_to_rgb_sse.o
    obj/skia/skia_opts/SkBitmapFilter_opts_SSE2.obj
    obj/skia/skia_opts/SkBitmapProcState_opts_SSE2.obj
    obj/skia/skia_opts/SkBlitRow_opts_SSE2.obj
    obj/third_party/libpng/libpng_sources/filter_sse2_intrinsics.obj
    obj/third_party/libjpeg_turbo/simd_asm/jccolor-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jcgray-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jchuff-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jcsample-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jdcolor-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jdmerge-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jdsample-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jfdctfst-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jfdctint-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jidctflt-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jidctfst-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jidctint-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jidctred-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jquantf-sse2-64.o
    obj/third_party/libjpeg_turbo/simd_asm/jquanti-sse2-64.o
    obj/media/base/base/convert_rgb_to_yuv_sse2.obj
    obj/media/base/base/filter_yuv_sse2.obj
    obj/media/base/media_yasm/scale_yuv_to_rgb_sse2_x64.o
    obj/third_party/libvpx/libvpx_yasm/copy_sse2.o
    obj/third_party/libvpx/libvpx_yasm/idctllm_sse2.o
    obj/third_party/libvpx/libvpx_yasm/iwalsh_sse2.o
    obj/third_party/libvpx/libvpx_yasm/loopfilter_block_sse2_x86_64.o
    obj/third_party/libvpx/libvpx_yasm/loopfilter_sse2.o
    obj/third_party/libvpx/libvpx_yasm/mfqe_sse2.o
    obj/third_party/libvpx/libvpx_yasm/postproc_sse2.o
    obj/third_party/libvpx/libvpx_yasm/recon_sse2.o
    obj/third_party/libvpx/libvpx_yasm/subpixel_sse2.o
    obj/third_party/libvpx/libvpx_yasm/dct_sse2.o
    obj/third_party/libvpx/libvpx_yasm/fwalsh_sse2.o
    obj/third_party/libvpx/libvpx_yasm/vp9_mfqe_sse2.o
    obj/third_party/libvpx/libvpx_yasm/vp9_postproc_sse2.o
    obj/third_party/libvpx/libvpx_yasm/vp9_dct_sse2.o
    obj/third_party/libvpx/libvpx_yasm/vp9_error_sse2.o
    obj/third_party/libvpx/libvpx_yasm/vp9_temporal_filter_apply_sse2.o
    obj/third_party/libvpx/libvpx_yasm/add_noise_sse2.o
    obj/third_party/libvpx/libvpx_yasm/halfpix_variance_impl_sse2.o
    obj/third_party/libvpx/libvpx_yasm/intrapred_sse2.o
    obj/third_party/libvpx/libvpx_yasm/inv_wht_sse2.o
    obj/third_party/libvpx/libvpx_yasm/sad4d_sse2.o
    obj/third_party/libvpx/libvpx_yasm/sad_sse2.o
    obj/third_party/libvpx/libvpx_yasm/subpel_variance_sse2.o
    obj/third_party/libvpx/libvpx_yasm/subtract_sse2.o
    obj/third_party/libvpx/libvpx_yasm/vpx_convolve_copy_sse2.o
    obj/third_party/libvpx/libvpx_yasm/vpx_subpixel_8t_sse2.o
    obj/third_party/libvpx/libvpx_yasm/vpx_subpixel_bilinear_sse2.o
    obj/third_party/libwebp/libwebp_dsp_sse2/alpha_processing_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/argb_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/cost_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/dec_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/enc_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/filters_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/lossless_enc_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/lossless_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/rescaler_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/upsampling_sse2.obj
    obj/third_party/libwebp/libwebp_dsp_sse2/yuv_sse2.obj
    obj/third_party/qcms/qcms/transform-sse2.obj
    obj/third_party/libvpx/libvpx_intrinsics_sse2.lib
    
    obj/skia/skia_opts_sse3/SkBitmapProcState_opts_SSSE3.obj
    obj/skia/skia_opts_sse3/SkOpts_ssse3.obj
    obj/media/base/base/convert_rgb_to_yuv_ssse3.obj
    obj/media/base/media_yasm/convert_rgb_to_yuv_ssse3.o
    obj/third_party/libvpx/libvpx_yasm/copy_sse3.o
    obj/third_party/libvpx/libvpx_yasm/subpixel_ssse3.o
    obj/third_party/libvpx/libvpx_yasm/vp9_quantize_ssse3_x86_64.o
    obj/third_party/libvpx/libvpx_yasm/avg_ssse3_x86_64.o
    obj/third_party/libvpx/libvpx_yasm/fwd_txfm_ssse3_x86_64.o
    obj/third_party/libvpx/libvpx_yasm/intrapred_ssse3.o
    obj/third_party/libvpx/libvpx_yasm/inv_txfm_ssse3_x86_64.o
    obj/third_party/libvpx/libvpx_yasm/quantize_ssse3_x86_64.o
    obj/third_party/libvpx/libvpx_yasm/sad_sse3.o
    obj/third_party/libvpx/libvpx_yasm/sad_ssse3.o
    obj/third_party/libvpx/libvpx_yasm/vpx_subpixel_8t_ssse3.o
    obj/third_party/libvpx/libvpx_yasm/vpx_subpixel_bilinear_ssse3.o
    obj/third_party/libvpx/libvpx_intrinsics_ssse3.lib
    obj/skia/skia_opts_sse41/SkOpts_sse41.obj
    obj/skia/skia_opts_sse42/SkForceCPlusPlusLinking.obj
    obj/third_party/libvpx/libvpx_yasm/sad_sse4.o
    obj/third_party/libwebp/libwebp_dsp_sse41/alpha_processing_sse41.obj
    obj/third_party/libwebp/libwebp_dsp_sse41/dec_sse41.obj
    obj/third_party/libwebp/libwebp_dsp_sse41/enc_sse41.obj
    obj/third_party/libwebp/libwebp_dsp_sse41/lossless_enc_sse41.obj
    obj/third_party/libvpx/libvpx_intrinsics_sse4_1.lib
    obj/skia/skia_opts_avx/SkOpts_avx.obj
    obj/third_party/libvpx/libvpx_yasm/quantize_avx_x86_64.o
    obj/third_party/libvpx/libvpx_intrinsics_avx.lib
    obj/skia/skia_opts_avx2/SkForceCPlusPlusLinking.obj
    obj/third_party/boringssl/boringssl_asm/rsaz-avx2.o
    obj/third_party/libwebp/libwebp_dsp/enc_avx2.obj
    obj/third_party/libvpx/libvpx_intrinsics_avx2.lib 
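
The attached script itself is not reproduced in this thread. As a rough sketch of the same idea, the object list can be ranked by the instruction-set name found in each path and stably sorted so that generic files come first and AVX2 files last. The functions simd_tier and order_for_link and the tier numbering are hypothetical, and matching on path names is only a heuristic that happens to fit the file list above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// True if `token` occurs in `path` and is not preceded by a letter, so
// "opts_sse2" matches "sse2" but "assets" does not match "sse".
static bool has_token(const std::string& path, const std::string& token) {
    for (std::size_t pos = path.find(token); pos != std::string::npos;
         pos = path.find(token, pos + 1)) {
        if (pos == 0 ||
            !std::isalpha(static_cast<unsigned char>(path[pos - 1]))) {
            return true;
        }
    }
    return false;
}

// Hypothetical ranking: 0 = generic, 1 = sse, 2 = sse2, 3 = sse3/ssse3,
// 4 = sse4.x, 5 = avx, 6 = avx2. The whole path is inspected because some
// files carry the tier only in their directory name.
int simd_tier(std::string path) {
    std::transform(path.begin(), path.end(), path.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    static const struct { const char* token; int tier; } kTiers[] = {
        {"avx2", 6}, {"avx", 5},  {"sse4", 4}, {"ssse3", 3},
        {"sse3", 3}, {"sse2", 2}, {"sse", 1},
    };
    for (const auto& t : kTiers) {
        if (has_token(path, t.token)) return t.tier;  // highest tier wins
    }
    return 0;
}

// Stable sort keeps the original relative order within each tier.
void order_for_link(std::vector<std::string>& objs) {
    std::stable_sort(objs.begin(), objs.end(),
                     [](const std::string& a, const std::string& b) {
                         return simd_tier(a) < simd_tier(b);
                     });
}
```

Applied to the object list from libcef.ninja, this would place obj/skia/skia_opts/SkBlitRow_opts_SSE2.obj ahead of obj/skia/skia_opts_avx/SkOpts_avx.obj, which is the ordering that avoided the crash.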
    
  6. Marshall Greenblatt

    To clarify, this bug is not triggered if the SSE object files are included before the AVX object files.

    As Dmitry describes above, he created a build after ordering the list of files in obj/cef/libcef.ninja. The bug was not triggered when the obj files were ordered as: generic, sse, sse2, sse3, sse4, avx, avx2.

    We think this bug is not triggered in Chrome either because chrome uses PGO, or because the chrome ninja files just happen to include sse first. Chrome versions that currently build with Update 3 are canary and master.

  7. amaitland

    I tested with 3.2840.1493 (before this change had been applied) and the computer I was previously having issues with worked perfectly. It looks like the issue was resolved in Chromium, just mentioning as an FYI.

  8. Dmitry Azaraev reporter

    In Chromium they only worked around the same problem by applying some forced inlines in Skia, whereas this workaround produces a stable and still efficient result on the current compiler without changing or inspecting sources across the whole codebase. Once Chromium (mainly its third-party libraries) provides C++-compliant (program-wide ODR-violation-free) implementations, it will be safe to disable it.
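
For illustration, a forced-inline workaround of the kind mentioned above could look like the sketch below. It is not the actual Chromium or Skia change: Vec4f, FORCE_INLINE and sum_lanes are made-up names, and plain floats stand in for the __m128 value that Sk4f builds with _mm_setr_ps. The point is only that a constructor marked this way is expanded inside each caller, so no shared out-of-line copy is left for the linker to pick from a differently compiled object file.

```cpp
// Compiler-specific "always inline" marker (hypothetical macro name).
#if defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#else
#define FORCE_INLINE inline __attribute__((always_inline))
#endif

// Stand-in for a 4-lane float vector such as Sk4f.
struct Vec4f {
    float v[4];

    // Forced inline: the body is emitted into every caller using that
    // caller's own code generation settings.
    FORCE_INLINE Vec4f(float a, float b, float c, float d) : v{a, b, c, d} {}

    FORCE_INLINE float operator[](int i) const { return v[i]; }
};

// Example caller; in the scenario from this issue it would live in a
// translation unit compiled for SSE only.
float sum_lanes() {
    const Vec4f x(1.0f, 2.0f, 3.0f, 4.0f);
    return x[0] + x[1] + x[2] + x[3];
}
```

This only protects the functions that are explicitly marked, which is why the object file ordering is described above as the broader fix.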
