The new OSR implementation has an (artificial) limit to the frame rate

Issue #1368 resolved
Former user created an issue

Original issue 1368 created by fa3rhan on 2014-09-04T21:39:21.000Z:

What steps will reproduce the problem?
1. Use OSR with CefBrowserSettings.windowless_frame_rate = 60 or cefclient with '--off-screen-rendering-enabled --off-screen-frame-rate=60'
2. Load a website with a perpetual CSS animation or WebGL

What is the expected output? What do you see instead?
It should output 60fps; instead it outputs (at most) 30.

What version of the product are you using? On what operating system?
CEF branch 2062 on Linux

Please provide any additional information below.
Chromium does render at 60fps (as shown by running WebGL benchmarks), but CropScaleReadbackAndCleanMailbox in render_widget_host_view_osr.cc doesn't finish in time, so every other frame gets dropped and OSR uses exactly half of the Chromium frames.
When adding another browser that renders at the same time, the frame rate goes down to 15 fps each, with 3 browsers it's 10 each.
Note that this has nothing to do with the CefBrowserSettings.windowless_frame_rate limitation and is not a performance problem, it happens with tiny browser windows all the same.

Here is what I think is the problem:
Chromium runs with vsync by default, which means the GL thread gets blocked when a buffer swap is requested. When CEF queues the download of the frame buffer contents into system memory, the GL thread is already blocked, so the download is only done after the GL thread has finished waiting for vsync. As a result, CEF receives the copy-finished callback too late for the next frame to be used.
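
The hypothesis above can be illustrated with a toy model. This is only a sketch of the reporter's explanation, not of Chromium's actual scheduling: the function name, the `readback_delay_ticks` parameter and the `epsilon` slack are assumptions made up for illustration. Time is measured in vsync ticks, one frame is produced per tick, and a frame is dropped whenever the previous readback has not yet completed.

```python
def delivered_frames(num_ticks, readback_delay_ticks=1.0, epsilon=0.01):
    """Return the ticks whose frames reach the client in this toy model.

    Assumption: a readback queued at tick k only completes slightly after
    tick k + readback_delay_ticks, because the GL thread is blocked on vsync.
    """
    delivered = []
    busy_until = 0.0  # time (in ticks) at which the pending readback finishes
    for tick in range(num_ticks):
        if tick >= busy_until:
            # No copy pending: queue a readback for this frame.
            busy_until = tick + readback_delay_ticks + epsilon
            delivered.append(tick)
        # Otherwise the copy callback has not arrived yet and the frame is dropped.
    return delivered


# One second of frames at 60 ticks per second.
print(len(delivered_frames(60)))  # 30
```

If the readback finished within the same tick (for example `readback_delay_ticks=0.5`), the same model would deliver all 60 frames, which matches the observation that the limit is independent of window size.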

Comments (13)

  1. Former user Account Deleted

    Comment 2. originally posted by fa3rhan on 2014-09-05T11:09:20.000Z:

    Using a GL/D3D texture/surface provided by the client is a completely different approach and has nothing to do with the vsync problem described in this issue.

  2. Marshall Greenblatt

    Comment 3. originally posted by magreenblatt on 2014-09-05T16:53:37.000Z:

    It's possible to set the vsync interval on a per-compositor basis using |compositor_->vsync_manager()->SetAuthoritativeVSyncInterval(...)| in the CefRenderWidgetHostViewOSR constructor. OnSwapCompositorFrame should be called at or near the vsync interval. This currently results in an async call to DelegatedFrameHost::RequestCopyOfOutput which is rate limited based on off-screen-frame-rate.

    It would be nice if we could eliminate the async call to RequestCopyOfOutput.
    We could then just set the vsync interval based on off-screen-frame-rate.
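
As a small worked example of deriving a vsync interval from off-screen-frame-rate, the conversion is just the reciprocal of the frame rate. The helper name below is hypothetical and the snippet only shows the arithmetic, not the Chromium call itself.

```python
def vsync_interval_us(frame_rate):
    """Return the frame interval in microseconds for a given frame rate."""
    if frame_rate <= 0:
        raise ValueError("frame rate must be positive")
    return round(1_000_000 / frame_rate)


for fps in (60, 30, 15):
    print(fps, vsync_interval_us(fps))
# 60 16667
# 30 33333
# 15 66667
```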

  3. Marshall Greenblatt

    Comment 6. originally posted by magreenblatt on 2014-12-16T21:52:01.000Z:

    The attached patch against trunk revision 1959 adds support for the `--enable-begin-frame-scheduling` command-line flag, which clamps the frame rate in all processes to the off-screen-frame-rate value. The statistics below were gathered on a 4 core Ubuntu 14 LTS 64-bit VM by running `ps -C cefclient -o %cpu,%mem,cmd` after about 2 minutes. %CPU is the CPU time used divided by the time the process has been running (cputime/realtime ratio), expressed as a percentage.

    Performance without enable-begin-frame-scheduling (using vsync=30fps in the browser process and vsync=60fps in all subprocesses):

    $ cefclient --off-screen-rendering-enabled --url=http://mrdoob.com/lab/javascript/requestanimationframe/

    %CPU %MEM CMD
    14.9 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --off-screen-rendering-enabled -
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    24.3 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-process --channel=520
    0.0 0.1 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-broker
    24.8 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --enable-deferre

    Performance with enable-begin-frame-scheduling and various frame rate values:

    $ cefclient --off-screen-rendering-enabled --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --enable-begin-frame-scheduling --off-screen-frame-rate=X

    X=60:
    %CPU %MEM CMD
    20.5 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --off-screen-rendering-enabled -
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    29.1 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-process --channel=510
    0.0 0.1 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-broker
    22.0 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --enable-begin-f

    X=30:
    %CPU %MEM CMD
    14.7 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --off-screen-rendering-enabled -
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    20.8 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-process --channel=511
    0.0 0.1 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-broker
    15.6 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --enable-begin-f

    X=15:
    %CPU %MEM CMD
    9.7 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --off-screen-rendering-enabled -
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    13.6 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-process --channel=512
    0.0 0.1 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-broker
    10.0 0.4 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --enable-begin-f

    Inspection of trace output shows that CefRenderWidgetHostViewOSR::SendBeginFrame, CefRenderWidgetHostViewOSR::OnFrameCaptureSuccess and CefRenderWidgetHostViewOSR::OnSwapCompositorFrame are called at the specified frame rate frequency while CopyOutputRequest and Compositor::Draw are called at 2x the frequency (expected, since the output is requested an additional time per frame via RequestCopyOfOutput).

    In this example (using requestAnimationFrame) CPU usage is highly correlated with the frame rate. Renderer process CPU usage with more complex content will show lower correlation with the frame rate.
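
To make the correlation concrete, the non-zero %CPU values from the three `--enable-begin-frame-scheduling` runs above can be summed per frame rate. The numbers are copied from the tables in this comment (browser, GPU and renderer process); the dictionary layout and variable names are only for illustration.

```python
# %CPU per process (browser, gpu-process, renderer) for each frame rate X.
runs = {
    60: [20.5, 29.1, 22.0],
    30: [14.7, 20.8, 15.6],
    15: [9.7, 13.6, 10.0],
}

# Total %CPU across all cefclient processes for each run.
totals = {fps: round(sum(values), 1) for fps, values in runs.items()}

for fps in sorted(totals, reverse=True):
    print(fps, totals[fps])
# 60 71.6
# 30 51.1
# 15 33.3
```

The totals fall as the frame rate falls, but not proportionally, which suggests some CPU cost in this test that does not depend on the frame rate.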

  4. Former user Account Deleted

    Comment 7. originally posted by fa3rhan on 2014-12-18T20:07:51.000Z:

    I've tried the patch and can confirm the CPU/frame rate correlation.

    However, visually --off-screen-frame-rate=60 and --off-screen-frame-rate=30 still look exactly the same (as opposed to loading it in Chrome, where the fps difference is like night and day).

  5. Marshall Greenblatt

    Comment 8. originally posted by magreenblatt on 2014-12-19T12:33:37.000Z:

    @ comment 7.: In my testing with the current design (using CopyOutputRequest) the Compositor::Draw call cannot complete at much above 60fps, so OnPaint is only being called at about 30fps. Specifying a higher frame rate with this design results in higher CPU usage for no gain (because the additional frames are dropped).
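
A simplified way to see why 60 and 30 look the same: if each delivered frame costs two Compositor::Draw calls (as noted in comment 6) and the compositor tops out at roughly 60 draws per second, the OnPaint rate is capped at about half of that. The function below is a back-of-the-envelope sketch; its name and default values are assumptions taken from the figures quoted in this thread, not measured constants.

```python
def effective_paint_rate(requested_fps, max_draws_per_sec=60.0, draws_per_frame=2):
    """Estimate the OnPaint rate when every frame needs several draws."""
    return min(requested_fps, max_draws_per_sec / draws_per_frame)


for fps in (60, 30, 15):
    print(fps, effective_paint_rate(fps))
```

Under these assumptions both a requested 60fps and a requested 30fps end up at about 30 paints per second, while 15fps is unaffected.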

  6. Marshall Greenblatt

    Comment 9. originally posted by magreenblatt on 2014-12-19T14:02:24.000Z:

    Looking at performance now with GPU disabled. This avoids expensive readback from the GPU in exchange for losing some features (3D CSS is supported but not WebGL).

    Windowed rendering performance with current trunk (no changes) with GPU disabled (all processes at 60fps):

    $ cefclient --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --disable-gpu --disable-gpu-compositing
    %CPU %MEM CMD
    10.8 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    17.4 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

    Performance with current trunk (no changes) with GPU disabled (browser process at 30fps, other processes at 60fps):

    $ cefclient --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --off-screen-rendering-enabled --disable-gpu --disable-gpu-compositing
    %CPU %MEM CMD
    26.2 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    15.5 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

    Performance with 1916 branch (old software-only OSR implementation, no 3D CSS or WebGL support):

    $ cefclient --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --off-screen-rendering-enabled
    %CPU %MEM CMD
    10.1 0.8 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --url=http://mrdoob.co
    0.0 0.1 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --url=http://mrdoob.co
    0.0 0.3 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --type=zygote --lang=e
    0.0 0.0 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --type=zygote --lang=e
    6.6 0.4 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --type=renderer --disa

    Attached is a new patch (currently tested on Windows and Linux only) that uses a custom SoftwareOutputDevice implementation which supports direct rendering to bitmap from OnSwapCompositorFrame and consequently avoids the extra Compositor::Draw when GPU is disabled (it's no longer necessary to call RequestCopyOfOutput).

    Performance with new patch with GPU disabled (browser process at 30fps, other processes at 60fps):

    $ cefclient --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --off-screen-rendering-enabled --disable-gpu --disable-gpu-compositing
    %CPU %MEM CMD
    13.0 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    17.0 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

    Performance with new patch with GPU disabled and begin frame scheduling enabled with various frame rates (all processes at Xfps):

    $ cefclient --off-screen-rendering-enabled --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --disable-gpu --disable-gpu-compositing --enable-begin-frame-scheduling --off-screen-frame-rate=X

    X=60:
    %CPU %MEM CMD
    15.1 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    17.4 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

    X=30:
    %CPU %MEM CMD
    9.4 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    11.2 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

    X=15:
    %CPU %MEM CMD
    5.5 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
    0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
    6.2 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

    Conclusions:
    - The current trunk implementation uses ~150% more CPU than the 1916 branch implementation at 30fps.
    - With this new patch CPU usage is reduced by ~51% compared to current trunk and goes from ~150% to ~23% CPU usage increase compared to the 1916 branch.
    - With this new patch lower frame rates use less CPU than 1916 branch.
    - It should be relatively easy to render directly to a client-provided surface instead of using an intermediary bitmap with the new SoftwareOutputDevice-based implementation.
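
The percentages in these conclusions can be reproduced from the tables above by summing the browser and renderer %CPU of each GPU-disabled run. Which runs were compared is an inference: the quoted ~51% and ~23% figures match the new patch at X=30 with begin frame scheduling enabled, so that run is used here. The helper name is hypothetical.

```python
def pct_change(new, old):
    """Percentage change from old to new (positive means an increase)."""
    return (new - old) / old * 100.0


# Browser + renderer %CPU, copied from the tables in this comment.
trunk = 26.2 + 15.5        # current trunk, OSR, GPU disabled
branch_1916 = 10.1 + 6.6   # old software-only OSR
patched_30 = 9.4 + 11.2    # new patch, begin frame scheduling, X=30
patched_15 = 5.5 + 6.2     # new patch, begin frame scheduling, X=15

print(round(pct_change(trunk, branch_1916)))       # 150
print(round(-pct_change(patched_30, trunk)))       # 51
print(round(pct_change(patched_30, branch_1916)))  # 23
print(patched_15 < branch_1916)                    # True
```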

  7. Marshall Greenblatt

    Comment 11. originally posted by magreenblatt on 2014-12-20T20:19:11.000Z:

    Some OSR unit tests are failing with the custom SoftwareOutputDevice implementation (run `cef_unittests --gtest_filter=OSRTest.* --disable-gpu --disable-gpu-compositing`). Need to fix those before merging this patch.

  8. Marshall Greenblatt

    Comment 12. originally posted by magreenblatt on 2014-12-29T17:24:36.000Z:

    @ comment 11.: Need to test all of the following combinations:

    $ cef_unittests --gtest_filter=OSRTest.*
    $ cef_unittests --gtest_filter=OSRTest.* --enable-begin-frame-scheduling
    $ cef_unittests --gtest_filter=OSRTest.* --disable-gpu --disable-gpu-compositing
    $ cef_unittests --gtest_filter=OSRTest.* --enable-begin-frame-scheduling --disable-gpu --disable-gpu-compositing
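
The four command lines above are the cross product of two optional flag groups. A short sketch that generates the same set (in a different order), in case more groups need to be added later; the function and variable names are made up for this example.

```python
from itertools import product

BASE = ["cef_unittests", "--gtest_filter=OSRTest.*"]
OPTIONAL_GROUPS = [
    ["--enable-begin-frame-scheduling"],
    ["--disable-gpu", "--disable-gpu-compositing"],
]


def osr_test_commands():
    """Build one command line per on/off combination of the flag groups."""
    commands = []
    for enabled in product([False, True], repeat=len(OPTIONAL_GROUPS)):
        args = list(BASE)
        for use_group, group in zip(enabled, OPTIONAL_GROUPS):
            if use_group:
                args.extend(group)
        commands.append(" ".join(args))
    return commands


for command in osr_test_commands():
    print(command)
```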

  9. Marshall Greenblatt

    Comment 13. originally posted by magreenblatt on 2015-01-01T16:53:37.000Z:

    Trunk revision 1960 adds support for begin frame scheduling and direct rendering when GPU compositing is disabled.
    - Always set the browser process VSync rate (frame rate) to CefSettings.windowless_frame_rate.
    - When the `enable-begin-frame-scheduling` command-line flag is specified the VSync rate for all processes will be synchronized to CefSettings.windowless_frame_rate. This flag cannot be used in combination with windowed rendering.
    - When the `disable-gpu` and `disable-gpu-compositing` command-line flags are specified the CefRenderHandler::OnPaint method will be called directly from the compositor instead of requiring an additional copy for each frame.
    - CefRenderHandler::OnPopupSize now passes view coordinates instead of (potentially scaled) pixel coordinates.
    - Add OSR unit tests for 2x (HiDPI) pixel scaling.
    - Improve CefRenderHandler documentation.
