- 04 Apr, 2016 1 commit
-
-
Siarhei Siamashka authored
On a PINE64 board (ARM Cortex-A53), this provides ~180 MB/s speed for the framebuffer readback. For comparison, the normal memcpy operation in cached buffers runs at around 1200 MB/s. Such readback speed is not very fast and is borderline usable. With a 1920x1080 32bpp screen resolution, this results in something like ~20 FPS scrolling.

Benchmark vs. shadow framebuffer (1920x1080 32bpp):

== Shadow framebuffer in xf86-video-fbdev ==

$ wget http://mirror.its.dal.ca/gutenberg/3/2/0/3/32032/32032.txt
$ time DISPLAY=:0 xterm +j -maximized -e cat 32032.txt

real    0m43.909s
user    0m0.820s
sys     0m0.300s

$ DISPLAY=:0 x11perf -scroll500 -copywinwin500 -copypixwin500 -copywinpix500

  15000 trep @   1.8460 msec (   542.0/sec): Scroll 500x500 pixels
  12000 trep @   2.2629 msec (   442.0/sec): Copy 500x500 from window to window
  12000 trep @   2.2096 msec (   453.0/sec): Copy 500x500 from pixmap to window
  14000 trep @   1.9740 msec (   507.0/sec): Copy 500x500 from window to pixmap

== Direct framebuffer readback in xf86-video-fbturbo ==

$ wget http://mirror.its.dal.ca/gutenberg/3/2/0/3/32032/32032.txt
$ time DISPLAY=:0 xterm +j -maximized -e cat 32032.txt

real    2m5.741s
user    0m0.390s
sys     0m0.190s

$ DISPLAY=:0 x11perf -scroll500 -copywinwin500 -copypixwin500 -copywinpix500

   4500 trep @   5.9201 msec (   169.0/sec): Scroll 500x500 pixels
   6000 trep @   5.9211 msec (   169.0/sec): Copy 500x500 from window to window
  18000 trep @   1.5341 msec (   652.0/sec): Copy 500x500 from pixmap to window
   4000 trep @   6.4657 msec (   155.0/sec): Copy 500x500 from window to pixmap

The direct framebuffer access without the shadow framebuffer layer makes scrolling and moving windows slower, but copying from pixmaps to windows becomes faster. In the real world, copying from offscreen pixmaps to windows is much more important, because it is one of the performance bottlenecks for almost every X11 application, while reading back from the framebuffer is only used for a few very specialized tasks (scrolling/moving windows and making screenshots).

On 32-bit ARM systems, the uncached framebuffer readback used to perform better. Even the Cortex-A53 running in 32-bit mode can do framebuffer readback at more than 300 MB/s: https://github.com/ssvb/tinymembench/wiki/PINE64-(Allwinner-A64)

Scrolling/moving windows still can be accelerated by the kernel (via DMA, a dedicated 2D accelerator or some other method) and hooked into xf86-video-fbturbo.

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
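The ~20 FPS estimate can be checked with simple arithmetic: one full 1920x1080 32bpp frame is 1920 * 1080 * 4 = 8294400 bytes, and scrolling has to read roughly one such frame back per update. The sketch below only illustrates that calculation; the function name `readback_fps` is made up here and is not part of the driver, and MB is assumed to mean 10^6 bytes.

```c
#include <stdint.h>

/* Hypothetical helper, not part of xf86-video-fbturbo: estimates how many
 * full-screen readbacks per second a given readback bandwidth allows. */
static double readback_fps(double bandwidth_mb_per_s,
                           uint32_t width, uint32_t height,
                           uint32_t bits_per_pixel)
{
    /* Bytes in one full frame, e.g. 1920 * 1080 * 4 = 8294400 */
    double frame_bytes = (double)width * height * (bits_per_pixel / 8);

    /* Assumption: 1 MB = 10^6 bytes */
    return bandwidth_mb_per_s * 1e6 / frame_bytes;
}
```

With the numbers quoted in the commit message, `readback_fps(180.0, 1920, 1080, 32)` gives roughly 21.7, which agrees with the "something like ~20 FPS" figure.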
-
- 06 Oct, 2015 1 commit
-
-
Siarhei Siamashka authored
sunxi_x_g2d: drop unused dri2 include
-
- 03 Oct, 2015 1 commit
-
-
Peter Korsgaard authored
The driver doesn't use DRI for anything. Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
-
- 05 Mar, 2015 1 commit
-
-
Jerome Oufella authored
Some boards use an inverted screen layer configuration, making the original code unable to enable disp layers functionality properly. This commit adds a fallback mechanism to the actual disp probing sequence, allowing those cases to be properly handled. Signed-off-by: Jérôme Oufella <jerome.oufella@savoirfairelinux.com>
-
- 20 Sep, 2014 1 commit
-
-
Siarhei Siamashka authored
When probing for the copyarea ioctl, we want to be sure that the kernel does not just return 0 (success) for any unsupported ioctl. The rockchip vendor kernels have been reported to have this issue. When support for the Raspberry Pi specific copyarea ioctl was detected by mistake, moving windows and scrolling were broken. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
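The probing idea can be sketched as follows: issue a request that no kernel should implement, and only trust the copyarea ioctl if that bogus request is rejected. This is an illustration under assumptions, not the driver's code: the request numbers, the `ioctl_fn` callback type and the two stand-in "kernels" are all hypothetical and exist only so the logic can be run without a framebuffer device.

```c
#include <stddef.h>

/* Signature of an ioctl-like call; injected so that the logic can be
 * exercised without a real framebuffer device (hypothetical). */
typedef int (*ioctl_fn)(int fd, unsigned long request, void *arg);

/* Both request numbers are placeholders for illustration only. */
#define REQ_COPYAREA    0x1001UL /* hypothetical: the ioctl we want to use */
#define REQ_UNSUPPORTED 0x1002UL /* hypothetical: a request nobody implements */

/* Returns 1 only if a deliberately bogus request is rejected AND the
 * copyarea request succeeds. */
static int copyarea_ioctl_is_trustworthy(ioctl_fn do_ioctl, int fd, void *arg)
{
    if (do_ioctl(fd, REQ_UNSUPPORTED, arg) == 0)
        return 0; /* kernel reports success for anything: do not trust it */

    return do_ioctl(fd, REQ_COPYAREA, arg) == 0;
}

/* Stand-in for a kernel that really implements the copyarea request. */
static int sane_kernel(int fd, unsigned long request, void *arg)
{
    (void)fd; (void)arg;
    return request == REQ_COPYAREA ? 0 : -1;
}

/* Stand-in for a kernel that returns 0 for every request. */
static int sloppy_kernel(int fd, unsigned long request, void *arg)
{
    (void)fd; (void)request; (void)arg;
    return 0;
}
```

Checking the bogus request first means a kernel that answers 0 to everything is never mistaken for one with working acceleration.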
-
- 31 Mar, 2014 1 commit
-
-
Siarhei Siamashka authored
Try to load the 'sunxi_cedar_mod' kernel module and, if it loads successfully, report the DRI2 VDPAU name as 'sunxi'. This allows libvdpau-sunxi to be used without setting the VDPAU_DRIVER environment variable. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 29 Mar, 2014 1 commit
-
-
Siarhei Siamashka authored
Fixes https://github.com/ssvb/xf86-video-fbturbo/issues/30 Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 12 Jan, 2014 3 commits
-
-
Siarhei Siamashka authored
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
It makes sense to make a formal release. Providing the pre-generated 'configure' script should make it less likely for people to mess with autotools and encounter troubles: https://github.com/ssvb/xf86-video-fbturbo/issues/28 https://github.com/ssvb/xf86-video-fbturbo/issues/25 Also it's likely that this particular xf86-video-fbturbo git master snapshot was used in: http://www.raspberrypi.org/archives/5580 Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 09 Dec, 2013 1 commit
-
-
Siarhei Siamashka authored
After window creation or resize, the mali blob on the client side requests two dri2 buffers (for back and front) from the ddx. The problem is that the 'swap' and 'get_buffer' operations are executed out of order relative to each other and we may have different possible patterns of dri2 communication:

1. swap swap swap swap get_buffer swap get_buffer swap swap ...
2. swap swap swap get_buffer swap swap get_buffer swap swap ...

A major annoyance is that both the mali blob on the client side and the ddx driver in the xserver need to have the same idea about which one of these two buffers goes to front and which goes to back. The older commit https://github.com/ssvb/xf86-video-fbturbo/commit/30b4ca27d1c4 tried to address this problem in a mostly empirical way and managed to solve it at least for the synthetic test gles-rgb-cycle-demo and for most of the real programs (such as Qt5 applications, etc.)

However, it appears that this heuristic is not 100% reliable in all cases. The Extreme Tux Racer game run in glshim manages to trigger the back and front buffers mismatch, which manifests itself as erratic penguin movement.

This patch adds a special check, which now randomly samples certain bytes from the dri2 buffers to see which one of them has been modified by the client application between buffer swaps. If we see that the rendering actually happens to the front buffer instead of the back buffer, then we just change the roles of these buffers.

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
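A minimal sketch of the sampling check described above, under stated assumptions: the sample count, the pseudo-random generator and both function names are invented for illustration and are not taken from the driver. The idea is to fingerprint each buffer at a fixed set of pseudo-random offsets before and after a swap interval, and to conclude that the roles are swapped when only the supposed front buffer changed.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_SAMPLES 16 /* arbitrary choice for this sketch */

/* Combine a few bytes taken at pseudo-random offsets into one value.
 * The same seed always visits the same offsets, so two calls made at
 * different times are directly comparable. */
static uint32_t sample_buffer(const uint8_t *buf, size_t size, uint32_t seed)
{
    uint32_t sum = 0, state = seed;
    int i;

    if (size == 0)
        return 0;

    for (i = 0; i < NUM_SAMPLES; i++) {
        size_t off;
        state = state * 1103515245u + 12345u; /* simple LCG */
        off = (size_t)(state >> 8) % size;
        sum = sum * 31u + buf[off];
    }
    return sum;
}

/* Returns 1 if, between two swaps, the supposed back buffer stayed
 * untouched while the supposed front buffer was modified. */
static int roles_look_swapped(uint32_t back_before, uint32_t back_after,
                              uint32_t front_before, uint32_t front_after)
{
    return back_before == back_after && front_before != front_after;
}
```

Sampling only a handful of bytes keeps the check cheap, at the cost of occasionally missing a modification that does not touch any sampled offset.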
-
- 15 Nov, 2013 1 commit
-
-
Daniel Drake authored
When using exynos_drm, /dev/dri/card0 is now the exynos-drm node, and /dev/dri/card1 is mali. Instead of hardcoding mali at card0, use libdrm to automatically provide the correct device node path. Signed-off-by: Daniel Drake <drake@endlessm.com>
-
- 26 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Fixes linking related fragility, which could result in crashes when doing Thumb2->ARM function calls. Reported-by: Luc Verhaegen <libv@skynet.be> Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 19 Oct, 2013 3 commits
-
-
Luc Verhaegen authored
This avoids a kernel oops due to the badly implemented and badly checked ump interface. Signed-off-by: Luc Verhaegen <libv@skynet.be>
-
Luc Verhaegen authored
And disable building ump when it is not there. Signed-off-by: Luc Verhaegen <libv@skynet.be>
-
Luc Verhaegen authored
The binary driver is unaffected by it, only when mesa-dri is fully installed does it do something. Signed-off-by: Luc Verhaegen <libv@skynet.be>
-
- 17 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 16 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
The Marvell PJ4 core used in CuBox handles VFP uncached reads from the framebuffer very poorly. Using WMMX or ARM LDM reads is much faster, with LDM instructions having a minor advantage. This improves framebuffer read performance from ~50MB/s to ~100MB/s. WMMX runtime detection and PJ4 core identification are also added as part of this fix. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 08 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Benchmarking with x11perf, modified to support a wider range of sizes for the scroll operation. Tests have been run at the stock 700MHz CPU clock frequency and with 1280x720 32bpp desktop.

$ DISPLAY=:0 ./x11perf -scroll5 -scroll10 -scroll15 -scroll20 \
                       -scroll30 -scroll50 -scroll100

== CPU ==

1000000 trep @   0.0289 msec ( 34600.0/sec): Scroll 5x5 pixels
1000000 trep @   0.0387 msec ( 25800.0/sec): Scroll 10x10 pixels
1000000 trep @   0.0459 msec ( 21800.0/sec): Scroll 15x15 pixels
 450000 trep @   0.0576 msec ( 17300.0/sec): Scroll 20x20 pixels
 350000 trep @   0.0817 msec ( 12200.0/sec): Scroll 30x30 pixels
 200000 trep @   0.1564 msec (  6390.0/sec): Scroll 50x50 pixels
 100000 trep @   0.4446 msec (  2250.0/sec): Scroll 100x100 pixels

== fb_copyarea (DMA) acceleration ==

1000000 trep @   0.0307 msec ( 32500.0/sec): Scroll 5x5 pixels
1000000 trep @   0.0353 msec ( 28300.0/sec): Scroll 10x10 pixels
1000000 trep @   0.0397 msec ( 25200.0/sec): Scroll 15x15 pixels
1000000 trep @   0.0464 msec ( 21600.0/sec): Scroll 20x20 pixels
 400000 trep @   0.0645 msec ( 15500.0/sec): Scroll 30x30 pixels
 250000 trep @   0.1177 msec (  8500.0/sec): Scroll 50x50 pixels
 100000 trep @   0.2783 msec (  3590.0/sec): Scroll 100x100 pixels

This shows that the ioctl overhead and the DMA setup cost are not so significant for the Raspberry Pi. DMA already becomes a bit faster than CPU at 10x10 size of the blit operation.

Even though there is no significant difference between CPU and DMA for extremely small sizes of operations (the other overhead is clearly dominating), setting a threshold is not going to harm:

== mixed CPU / fb_copyarea (DMA) with 90 pixels threshold ==

1000000 trep @   0.0291 msec ( 34300.0/sec): Scroll 5x5 pixels
1000000 trep @   0.0345 msec ( 29000.0/sec): Scroll 10x10 pixels
1000000 trep @   0.0395 msec ( 25300.0/sec): Scroll 15x15 pixels
1000000 trep @   0.0466 msec ( 21400.0/sec): Scroll 20x20 pixels
 400000 trep @   0.0650 msec ( 15400.0/sec): Scroll 30x30 pixels
 250000 trep @   0.1181 msec (  8470.0/sec): Scroll 50x50 pixels
 100000 trep @   0.2784 msec (  3590.0/sec): Scroll 100x100 pixels

If some other ARM devices also implement Raspberry Pi compatible accelerated fb_copyarea ioctl, then the threshold selection may be reconsidered.

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
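The mixed CPU/DMA selection can be sketched as a simple area check. This assumes that the "90 pixels threshold" is compared against the blit area (width * height); the commit message does not state this, although the mixed benchmark is consistent with it (the 5x5 result matches the CPU run, the 10x10 result matches the DMA run). The names `blit_path` and `choose_blit_path` are hypothetical.

```c
/* Area threshold in pixels; 90 is the value quoted in the commit message. */
#define COPYAREA_THRESHOLD 90

typedef enum { BLIT_CPU, BLIT_DMA } blit_path; /* hypothetical */

/* Assumption: blits covering fewer pixels than the threshold are done by
 * the CPU, everything else goes to the fb_copyarea (DMA) ioctl. */
static blit_path choose_blit_path(int w, int h)
{
    if (w <= 0 || h <= 0)
        return BLIT_CPU; /* degenerate blit, nothing worth offloading */

    return ((long)w * h < COPYAREA_THRESHOLD) ? BLIT_CPU : BLIT_DMA;
}
```

Under this assumption a 5x5 scroll (25 pixels) stays on the CPU, while a 10x10 scroll (100 pixels) is handed to DMA.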
-
- 07 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Now acceleration is only used if the AccelMethod option is not set (so that it is assumed to be the default choice) or when it is explicitly set to "COPYAREA". Any other value (for example "CPU") disables acceleration. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
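The rule above amounts to a small decision function. The sketch below is an illustration only: `copyarea_accel_enabled` is a made-up name, a NULL pointer stands for "option not set", and an exact string match is assumed (the real driver may well compare option values case-insensitively).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: accel_method is the value of the AccelMethod
 * option, or NULL if the option is absent from the configuration. */
static int copyarea_accel_enabled(const char *accel_method)
{
    if (accel_method == NULL)
        return 1; /* option not set: acceleration is the default */

    /* Assumption: exact match; any other value disables acceleration */
    return strcmp(accel_method, "COPYAREA") == 0;
}
```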
-
- 03 Oct, 2013 2 commits
-
-
Siarhei Siamashka authored
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
This provides basic 2D acceleration support for Raspberry Pi to speed up moving windows and scrolling. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 22 Sep, 2013 1 commit
-
-
Siarhei Siamashka authored
Because a wide range of embedded ARM devices are actually supported (Allwinner A1X/A20, Raspberry Pi, ODROID-X, Rockchip, ...) and are getting some sort of performance improvement and/or hardware acceleration, the DDX driver needs a vendor neutral name. Resolves https://github.com/ssvb/xf86-video-fbturbo/issues/10 Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 09 Sep, 2013 2 commits
-
-
Siarhei Siamashka authored
If the framebuffer reservation size is too small for efficient use of the hardware overlays and zero-copy buffer flipping, log a hint about fixing this problem in /var/log/Xorg.0.log Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
Even though we are primarily using the UMP buffer obtained by the GET_UMP_SECURE_ID_SUNXI_FB ioctl, another UMP buffer obtained by the GET_UMP_SECURE_ID_BUF1 ioctl should also span over the whole framebuffer. Otherwise we may have troubles with the window resize bug recovery and buffer flipping. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 07 Sep, 2013 3 commits
-
-
Siarhei Siamashka authored
The instructions, links, etc. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
The Allwinner A10/A13 display controller hardware is expected to support negative coordinates of the top left corners of the layers. But there is some bug either in the kernel driver or in the hardware, which messes up the picture on screen when the Y coordinate is negative for the YUV layer. Negative X coordinates are not affected. RGB formats are not affected either (no matter whether the RGB layer is scaled or not). We fix this by just recalculating which part of the buffer in memory corresponds to Y=0 on screen and adjusting the input buffer settings. Fixes https://github.com/ssvb/xf86-video-sunxifb/issues/16 Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
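The recalculation can be sketched as a clipping step: if the layer's top edge is above the screen, skip the source rows that would be hidden and move the layer to Y=0. This is a simplified illustration, not the driver's code. The `layer_y` struct and `clip_negative_y` are hypothetical, only a single plane is handled (the chroma planes of a planar YUV format would need the same treatment with their own strides), and rounding of the skipped row count is ignored.

```c
#include <stdint.h>

/* Hypothetical description of the vertical placement of one plane. */
typedef struct {
    uint32_t src_offset; /* byte offset of the first source row */
    int      src_h;      /* number of source rows */
    int      dst_y;      /* on-screen Y of the layer's top edge */
    int      dst_h;      /* on-screen height after scaling */
} layer_y;

/* If dst_y is negative, drop the source rows that map to the hidden part
 * and place the layer at Y=0 instead. 'stride' is bytes per source row. */
static layer_y clip_negative_y(layer_y in, uint32_t stride)
{
    layer_y out = in;
    int hidden, skip;

    if (in.dst_y >= 0 || in.dst_h <= 0)
        return out; /* nothing to fix */

    hidden = -in.dst_y; /* on-screen rows above Y=0 */
    if (hidden >= in.dst_h) {
        /* layer is entirely off-screen */
        out.src_h = 0;
        out.dst_y = 0;
        out.dst_h = 0;
        return out;
    }

    /* Convert hidden screen rows to source rows (layer may be scaled) */
    skip = (int)((long long)hidden * in.src_h / in.dst_h);

    out.src_offset += (uint32_t)skip * stride;
    out.src_h -= skip;
    out.dst_y = 0;
    out.dst_h -= hidden;
    return out;
}
```

For example, a 480-row source shown 960 rows tall at Y=-100 has 100 hidden screen rows, which correspond to 50 source rows, so the input buffer starts 50 rows further into memory.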
-
Siarhei Siamashka authored
Now zero copy and tear free buffer swapping is also supported for 16bpp desktop. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 06 Sep, 2013 1 commit
-
-
Siarhei Siamashka authored
Now the scaler is enabled for the sunxi disp layer only when we want to use it for YUV format with XV. Whenever the layer is configured for RGB format or deactivated, the scaler gets disabled. This should make the driver more friendly to the other potential scaled layer users. The total number of available scalers is only 2 for Allwinner A10 and only 1 for Allwinner A13. The potential drawback is that now we may get an error when trying to enable the scaler (if somebody else has used up all the available scalers) instead of always having it reserved and ready for use. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 13 Aug, 2013 1 commit
-
-
Siarhei Siamashka authored
Recent changes broke the configuration when "DRI2HWOverlay" option is set to "false". This patch adds the missing UMP secure ids initialization and resolves the problem. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 04 Aug, 2013 3 commits
-
-
Siarhei Siamashka authored
Do this to keep the variables naming style consistent across the source file (earlier these variables had different names like 'self', 'drvpriv', 'private'). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
In double buffer mode, explicitly mark the buffers as designated for odd or even frame position when putting them into queue. And when swapping the buffers, use these flags to re-synchronize if it is necessary. This prevents problems after window resize (when gles-rgb-cycle-demo could expose a mismatch between the color name in the window title and the actual window color). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
Whenever something goes wrong in high fps mode, it may be interesting to slow down the demo to check whether the actual background color matches the expected color (shown in the window title). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 03 Aug, 2013 1 commit
-
-
Siarhei Siamashka authored
If DEBUG_WITH_RGB_PATTERN is defined, then we check that the frames colors are changed as "R -> G -> B -> R -> G -> ..." pattern and print debugging messages when this is not the case. Such color change pattern can be generated by the "test/gles-rgb-cycle-demo.c" program. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
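A check of this kind could look roughly like the sketch below. It is an assumption-laden illustration: the XRGB8888 pixel layout, the enum and both function names are invented here and are not taken from the driver or from test/gles-rgb-cycle-demo.c.

```c
#include <stdint.h>

typedef enum { COL_OTHER = -1, COL_R = 0, COL_G = 1, COL_B = 2 } rgb_color;

/* Classify a pixel as pure red, green or blue.
 * Assumption: XRGB8888 layout, the top byte is ignored. */
static rgb_color classify_pixel(uint32_t xrgb)
{
    switch (xrgb & 0x00FFFFFFu) {
    case 0x00FF0000u: return COL_R;
    case 0x0000FF00u: return COL_G;
    case 0x000000FFu: return COL_B;
    default:          return COL_OTHER;
    }
}

/* Returns 1 if 'cur' is the legitimate successor of 'prev' in the
 * R -> G -> B -> R -> ... cycle, 0 otherwise. */
static int follows_rgb_cycle(rgb_color prev, rgb_color cur)
{
    if (prev == COL_OTHER || cur == COL_OTHER)
        return 0;

    return (int)cur == ((int)prev + 1) % 3;
}
```

A debugging message would then be printed whenever `follows_rgb_cycle` returns 0 for two consecutive frames.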
-
- 31 Jul, 2013 3 commits
-
-
Siarhei Siamashka authored
Do this mostly for security reasons. We don't want any application to see whatever was last rendered by the previous GLES application by just peeking into a freshly allocated DRI2 buffer. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
We manage only a single hardware overlay. That's a precious shared resource, which we want to use for zero-copy fullscreen compositing in gnome-shell. The strange 1x1 window does not really need it. Fixes https://github.com/ssvb/xf86-video-sunxifb/issues/2 Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
When enabled, it tries to avoid tearing in OpenGL ES applications. Works on sunxi hardware if the hardware overlay (sunxi disp layer) is used for a DRI2 window. The name of this option and the description in the man page have been borrowed from the intel and radeon drivers. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 30 Jul, 2013 1 commit
-
-
Siarhei Siamashka authored
That's the right thing to do and fixes issues such as https://github.com/ssvb/xf86-video-sunxifb/issues/6 As a result, now the framebuffer size may need to be larger in order to accommodate two DRI2 buffers in the offscreen part of the framebuffer. The users of sunxi hardware are advised to increase the value of the fb0_framebuffer_num variable in the fex file to 3 for 32bpp mode and to 5 for 16bpp mode. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 29 Jul, 2013 2 commits
-
-
Siarhei Siamashka authored
Should fix https://github.com/ssvb/xf86-video-sunxifb/issues/14 and prevent FTBFS on some systems. Reported-by: Fred Chien <cfsghost@gmail.com> Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
When moving further to our own DRI2 buffers bookkeeping, we can't really trust the information from DRI2BufferRec anymore. So just add a copy of all the missing bits of information to UMPBufferInfoRec and use it instead. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-