Sage x86 SSE code seems broken

Issue #12 open
Frank Sapone created an issue

Sometimes dynamic DXE builds with Sage fail to initialize and the program segfaults (the screen stays black). This seems to happen about 50% of the time. Will try to get a stack dump. This is the main issue preventing us from deploying Sage to replace Mesa.

Static builds are unaffected, and for some reason my SSE dynamic compile of q2.exe and ref_gl.dxe has yet to exhibit the problem.

Comments (16)

  1. sezero

    UPDATE: This seems to be a problem with the SSE code in sage. It is currently worked around by disabling the SSE feature through sage.ini entries. The actual bug(s) still need fixing.

  2. Frank Sapone reporter

    If SSE is disabled in sage.ini, mode switches work OK (tested changing video modes multiple times in game, then connecting to a xatrix mod server and doing the same).

    But with SSE enabled in sage.ini and no -P gl.dxe, anything that triggers a vid_restart will immediately bomb out.

  3. sezero

    ... no -P gl.dxe ...

    i.e., with a ref_gl.dxe that doesn't depend on gl.dxe at build time and instead explicitly dlopen()s gl.dxe.

  4. Frank Sapone reporter

    Since I know jack shit about ASM and nobody who knows it would waste their time helping us, maybe it would be worth wrapping pragma hints to the compiler optimizer for pentium3 around the functions it replaces, checking the xxx.S output of those functions, and making them separate xxx.asm files so it can still choose a codepath. This would at least allow some SSE-specific code without the issue. With -march=pentium3 the performance is very close to the hand-optimized version. It may even give us a clue to what the real issue is with Daniel's code.

    The SSE code being able to work is a huge deal here. Consider Quake 1 at 640x480x16 with sound on a P3 900 running timedemo demo1: 214 fps with the SSE code, 172 fps with no SSE, and 198 fps from Mesa with its SSE code (I never forced it off, so I'm not sure of the performance loss). When letting the compiler make the decision, in most tests I noticed maybe a 3-5 fps penalty, so 210 fps is a realistic aim, and the extra 10 fps might mean the difference in being able to play a larger map without dropping below 40 fps.
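
The compiler-hint idea above could look roughly like the following sketch. It assumes a GCC new enough to support per-function target attributes (4.4 or later), which may not hold for every DJGPP toolchain. The names scale_generic, scale_sse, pick_scale, and the have_sse flag are hypothetical and are not Sage code. The same plain C loop is compiled twice, once with SSE enabled for that function only, and the caller picks a codepath at runtime.

```c
#include <stddef.h>

/* Per-function target attribute: GCC/Clang on x86 only
   (assumption: GCC >= 4.4). Expands to nothing elsewhere. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define SSE_TARGET __attribute__((target("sse")))
#else
#define SSE_TARGET
#endif

typedef void (*scale_fn)(float *dst, const float *src, size_t n, float k);

/* Baseline path: compiled with the global -march setting. */
static void scale_generic(float *dst, const float *src, size_t n, float k)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* Same C source, but the compiler is allowed to emit SSE here. */
SSE_TARGET
static void scale_sse(float *dst, const float *src, size_t n, float k)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

/* have_sse is assumed to come from a CPUID check or a config entry. */
static scale_fn pick_scale(int have_sse)
{
    return have_sse ? scale_sse : scale_generic;
}
```

Compiling such a file with gcc -S and comparing the two function bodies would show what the compiler generated for each path.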

  5. Frank Sapone reporter

    For reference, the actual offending code is in sage\drivers\glide\x86\sse_vbtmp.asm. If sse_emitvertices and friends are bypassed in sage\drivers\glide\drv_vb.c, then the problem doesn't occur, but then performance is basically the same as if SSE was disabled.

    proc TAG(sse_emitvertices)

  6. sezero

    So that it is not lost, I think this should be reported at the issue tracker of dborca's original sage repository, with references to this page. Can you please do that?

  7. Frank Sapone reporter

    UPDATE: Possibly related to DXE3 not aligning some things properly. A workaround is to convert all movaps to movups in the SSE ASM code. Technically, it's wrong and needs further investigation, but it does work and there is no apparent performance penalty for this.

    Static GL builds are not affected.
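
Since movaps faults on a memory operand that is not 16-byte aligned while movups accepts any address, one way to test the alignment theory would be to check the addresses handed to the SSE routines. A minimal sketch in C follows; sse_misalignment and align_up16 are hypothetical helpers and are not part of Sage or DXE3.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes p is past the previous 16-byte boundary.
   0 means a movaps access at p is safe; anything else would fault. */
static unsigned sse_misalignment(const void *p)
{
    return (unsigned)((uintptr_t)p & 15u);
}

/* Round p up to the next 16-byte boundary. The caller must have
   reserved up to 15 spare bytes in the underlying buffer. */
static void *align_up16(void *p)
{
    uintptr_t a = ((uintptr_t)p + 15u) & ~(uintptr_t)15u;
    assert((a & 15u) == 0);
    return (void *)a;
}
```

Logging sse_misalignment() for the buffers passed into the SSE code in the dynamic build, and comparing against the static build, might show whether the DXE loader is placing data differently.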

  8. sezero

    I have been using MesaLib-3.4.2 as a dxe for uhexen2: it has sse code which does movaps everywhere and I haven't had issues yet on my p3-650 box with it. So I am not sure the sage movaps issue is related to dxe.

  9. Frank Sapone reporter
    • changed status to open

    I thought this problem was gone, but then I recompiled from the latest build and the problems came back. Maybe things on my machine happened to be aligned in such a way that, in that particular test case, it worked without bombing.

  10. Frank Sapone reporter

    Yes, I don't know what to tell you. It worked fine the other day and I was sure to disable the compile flag, etc. Then today I went to try it again and had problems.
