Commit 3ad74420 authored by Siarhei Siamashka

CPU: use VFP overlapped blit on VFP-capable hardware by default



This should be useful for Raspberry Pi. When reading uncached source buffers,
the VFP-optimized overlapped two-pass blit is roughly 2-3 times slower than
memcpy in cached memory, which makes it reasonably competitive with
ShadowFB (considering that ShadowFB allocates an extra buffer and does extra
memory copies, which take time and thrash the L2 cache, etc.). It even provides
a slight performance advantage in a more or less realistic use case
(scrolling in xterm), which needs reads from the framebuffer:

==== Before (xf86-video-fbdev with ShadowFB) ====

$ time DISPLAY=:0 xterm +j -maximized -e cat longtext.txt

real    1m50.245s
user    0m1.750s
sys     0m0.800s

==== After (xf86-video-sunxifb without ShadowFB) ====

$ time DISPLAY=:0 xterm +j -maximized -e cat longtext.txt

real    1m27.709s
user    0m1.690s
sys     0m0.920s

We get decent results even when reading from the framebuffer: the xterm
run above takes about 20% less wall-clock time. However, in many typical
workloads (excluding scrolling and dragging windows), the framebuffer is
used primarily as write-only memory. In write-only use cases ShadowFB is
pure overhead, so getting rid of it is a very good idea, as this improves
overall graphics performance.
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
parent 3676a495
@@ -516,8 +516,8 @@ FBDevPreInit(ScrnInfoPtr pScrn, int flags)
     cpuinfo = cpuinfo_init();
     xf86DrvMsg(pScrn->scrnIndex, X_INFO, "processor: %s\n",
                cpuinfo->processor_name);
-    /* don't use shadow by default if we have NEON or HW acceleration */
-    fPtr->shadowFB = !cpuinfo->has_arm_neon &&
+    /* don't use shadow by default if we have VFP/NEON or HW acceleration */
+    fPtr->shadowFB = !cpuinfo->has_arm_vfp &&
                      !xf86GetOptValString(fPtr->Options, OPTION_ACCELMETHOD);
     cpuinfo_close(cpuinfo);
@@ -931,9 +931,9 @@ FBDevScreenInit(SCREEN_INIT_ARGS_DECL)
                    "G2D acceleration is disabled via AccelMethod option\n");
     }
-    if (!fPtr->SunxiG2D_private && cpu_backend->cpuinfo->has_arm_neon) {
+    if (!fPtr->SunxiG2D_private && cpu_backend->cpuinfo->has_arm_vfp) {
         if ((fPtr->SunxiG2D_private = SunxiG2D_Init(pScreen, &cpu_backend->blt2d))) {
-            xf86DrvMsg(pScrn->scrnIndex, X_INFO, "enabled NEON optimizations\n");
+            xf86DrvMsg(pScrn->scrnIndex, X_INFO, "enabled VFP/NEON optimizations\n");
         }
     }
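
The effect of the first hunk on the ShadowFB default can be sketched in isolation. This is a minimal illustration, not the driver's code: `cpu_features_t`, `shadowfb_default_old` and `shadowfb_default_new` are hypothetical names, and the `accel_method` argument merely stands in for the result of looking up the AccelMethod option (NULL meaning the option is not set). It assumes hardware such as the ARM1176 in the original Raspberry Pi, which has VFP but no NEON.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's cpuinfo structure; only the two
 * feature flags that appear in the diff are modelled. */
typedef struct {
    bool has_arm_vfp;
    bool has_arm_neon;
} cpu_features_t;

/* Default before the commit: ShadowFB stays on unless the CPU has NEON
 * or an acceleration method was requested. */
static bool shadowfb_default_old(const cpu_features_t *cpu,
                                 const char *accel_method)
{
    assert(cpu != NULL);
    return !cpu->has_arm_neon && accel_method == NULL;
}

/* Default after the commit: having VFP is already enough to turn
 * ShadowFB off. */
static bool shadowfb_default_new(const cpu_features_t *cpu,
                                 const char *accel_method)
{
    assert(cpu != NULL);
    return !cpu->has_arm_vfp && accel_method == NULL;
}
```

For a VFP-only CPU with no AccelMethod set, the old rule keeps ShadowFB enabled while the new rule disables it. For a CPU with neither VFP nor NEON, both rules still enable ShadowFB, and setting AccelMethod disables it under either rule.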