- 20 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Now even without using the hardware overlay, ordinary ump buffers allocated in normal memory don't trigger excessive dri2 buffer requests from the mali blob. TODO: * check for any possible memory leaks, ensure correct refcounting * test windows resize behavior for both r3p0 and r3p2 blobs * check if the EVEN/ODD frame order enforcing is still necessary for r3p2 (it could be just one more r3p0 quirk that is already gone)
-
- 19 Oct, 2013 4 commits
-
-
Siarhei Siamashka authored
The mali r3p2 blob does not seem to be affected by the window resize bug anymore, so there is no need to apply the r3p0 workaround (and it does not even work correctly). When using the hardware overlay, now we need to allocate secure id 1 and 2 for back/front ump buffers. Otherwise the blob is trying to request a new dri2 buffer for each frame, and this kills performance (thanks to redundant ump buffers mapping/unmapping). This is now fixed. But when not using the hardware overlay, the r3p2 blob is still requesting dri2 buffer per each frame, and this needs an additional tweak.
-
Luc Verhaegen authored
This avoids a kernel oops due to the badly implemented and badly checked ump interface. Signed-off-by: Luc Verhaegen <libv@skynet.be>
-
Luc Verhaegen authored
And disable building ump when it is not there. Signed-off-by: Luc Verhaegen <libv@skynet.be>
-
Luc Verhaegen authored
The binary driver is unaffected by it, only when mesa-dri is fully installed does it do something. Signed-off-by: Luc Verhaegen <libv@skynet.be>
-
- 17 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 16 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Marvell PJ4 core used in CuBox very poorly handles VFP uncached reads from the framebuffer. Using WMMX or ARM LDM reads is much faster, with LDM instructions having a minor advantage. This improves framebuffer read performance from ~50MB/s to ~100MB/s. WMMX runtime detection and PJ4 core identification is also added as part of this fix. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 08 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Benchmarking with x11perf, modified to support wider range of sizes for the scroll operation. Tests have been run at the stock 700MHz CPU clock frequency and with 1280x720 32bpp desktop. $ DISPLAY=:0 ./x11perf -scroll5 -scroll10 -scroll15 -scroll20 \ -scroll30 -scroll50 -scroll100 == CPU == 1000000 trep @ 0.0289 msec ( 34600.0/sec): Scroll 5x5 pixels 1000000 trep @ 0.0387 msec ( 25800.0/sec): Scroll 10x10 pixels 1000000 trep @ 0.0459 msec ( 21800.0/sec): Scroll 15x15 pixels 450000 trep @ 0.0576 msec ( 17300.0/sec): Scroll 20x20 pixels 350000 trep @ 0.0817 msec ( 12200.0/sec): Scroll 30x30 pixels 200000 trep @ 0.1564 msec ( 6390.0/sec): Scroll 50x50 pixels 100000 trep @ 0.4446 msec ( 2250.0/sec): Scroll 100x100 pixels == fb_copyarea (DMA) acceleration == 1000000 trep @ 0.0307 msec ( 32500.0/sec): Scroll 5x5 pixels 1000000 trep @ 0.0353 msec ( 28300.0/sec): Scroll 10x10 pixels 1000000 trep @ 0.0397 msec ( 25200.0/sec): Scroll 15x15 pixels 1000000 trep @ 0.0464 msec ( 21600.0/sec): Scroll 20x20 pixels 400000 trep @ 0.0645 msec ( 15500.0/sec): Scroll 30x30 pixels 250000 trep @ 0.1177 msec ( 8500.0/sec): Scroll 50x50 pixels 100000 trep @ 0.2783 msec ( 3590.0/sec): Scroll 100x100 pixels This shows that the ioctls overhead and the DMA setup cost is not so significant for the Raspberry Pi. DMA already becomes a bit faster than CPU at 10x10 size of the blit operation. Even though there is no significant difference between CPU and DMA for extremely small sizes of operations (the other overhead is clearly dominating), setting a threshold is not going to harm: == mixed CPU / fb_copyarea (DMA) with 90 pixels threshold == 1000000 trep @ 0.0291 msec ( 34300.0/sec): Scroll 5x5 pixels 1000000 trep @ 0.0345 msec ( 29000.0/sec): Scroll 10x10 pixels 1000000 trep @ 0.0395 msec ( 25300.0/sec): Scroll 15x15 pixels 1000000 trep @ 0.0466 msec ( 21400.0/sec): Scroll 20x20 pixels 400000 trep @ 0.0650 msec ( 15400.0/sec): Scroll 30x30 pixels 250000 trep @ 0.1181 msec ( 8470.0/sec): Scroll 50x50 pixels 100000 trep @ 0.2784 msec ( 3590.0/sec): Scroll 100x100 pixels If some other ARM devices also implement Raspberry Pi compatible accelerated fb_copyarea ioctl, then the threshold selection may be reconsidered. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 07 Oct, 2013 1 commit
-
-
Siarhei Siamashka authored
Now acceleration is only used in the case if the AccelMethod option is not set (so that it is assumed to be a default choice) or when it is explicitly set to "COPYAREA". Any other value (for example "CPU") disables acceleration. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 03 Oct, 2013 2 commits
-
-
Siarhei Siamashka authored
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
This provides basic 2D acceleration support for Raspberry Pi to speed up moving windows and scrolling. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 22 Sep, 2013 1 commit
-
-
Siarhei Siamashka authored
Because a wide range of embedded ARM devices are actually supported (Allwinner A1X/A20, Raspberry Pi, ODROID-X, Rockchip, ...) and are getting some sort of performance improvement and/or hardware acceleration, the DDX driver needs a vendor neutral name. Resolves https://github.com/ssvb/xf86-video-fbturbo/issues/10 Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 09 Sep, 2013 2 commits
-
-
Siarhei Siamashka authored
In the case if the framebuffer reservation size is too small for efficient use of the hardware overlays and zero-copy buffers flipping, log a hint about fixing this problem in /var/log/Xorg.0.log Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
Even though we are primarily using the UMP buffer obtained by the GET_UMP_SECURE_ID_SUNXI_FB ioctl, another UMP buffer obtained by the GET_UMP_SECURE_ID_BUF1 ioctl should also span over the whole framebuffer. Otherwise we may have troubles with the window resize bug recovery and buffer flipping. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 07 Sep, 2013 3 commits
-
-
Siarhei Siamashka authored
The instructions, links, etc. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
The Allwinner A10/A13 display controller hardware is expected to support negative coordinates of the top left corners of the layers. But there is some bug either in the kernel driver or in the hardware, which messes up the picture on screen when the Y coordinate is negative for YUV layer. Negative X coordinates are not affected. RGB formats are not affected too (no matter whether the RGB layer is scaled or not). We fix this by just recalculating which part of the buffer in memory corresponds to Y=0 on screen and adjust the input buffer settings. Fixes https://github.com/ssvb/xf86-video-sunxifb/issues/16 Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
Now zero copy and tear free buffer swapping is also supported for 16bpp desktop. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 06 Sep, 2013 1 commit
-
-
Siarhei Siamashka authored
Now the scaler is enabled for the sunxi disp layer only when we want to use it for YUV format with XV. Whenever the layer is configured for RGB format or deactivated, the scaler gets disabled. This should make the driver more friendly to the other potential scaled layer users. The total number of available scalers is only 2 for Allwinner A10 and only 1 for Allwinner A13. The potential drawback is that now we may get an error when trying to enable the scaler (if somebody else has used up all the available scalers) instead of always having it reserved and ready for use. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 13 Aug, 2013 1 commit
-
-
Siarhei Siamashka authored
Recent changes broke the configuration when "DRI2HWOverlay" option is set to "false". This patch adds the missing UMP secure ids initialization and resolves the problem. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 04 Aug, 2013 3 commits
-
-
Siarhei Siamashka authored
Do this to keep the variables naming style consistent across the source file (earlier these variables had different names like 'self', 'drvpriv', 'private'). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
In double buffer mode, explicitly mark the buffers as designated for odd or even frame position when putting them into queue. And when swapping the buffers, use these flags to re-synchronize if it is necessary. This prevents problems after window resize (when gles-rgb-cycle-demo could expose a mismatch between the color name in the window title and the actual window color). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
Whenever something goes wrong in high fps mode, it may be interesting to slow down the demo to check whether the actual background color matches the expected color (shown in the window title). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 03 Aug, 2013 1 commit
-
-
Siarhei Siamashka authored
If DEBUG_WITH_RGB_PATTERN is defined, then we check that the frames colors are changed as "R -> G -> B -> R -> G -> ..." pattern and print debugging messages when this is not the case. Such color change pattern can be generated by the "test/gles-rgb-cycle-demo.c" program. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 31 Jul, 2013 3 commits
-
-
Siarhei Siamashka authored
Do this mostly for security reasons. We don't want any application to see whatever was last rendered by the previous GLES application by just peeking into a freshly allocated DRI2 buffer. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
We manage only a single hardware overlay. That's a precious shared resource, which we want to use for zero-copy fullscreen compositing in gnome-shell. The strange 1x1 window does not really need it. Fixes https://github.com/ssvb/xf86-video-sunxifb/issues/2 Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
When enabled, it tries to avoid tearing in OpenGL ES applications. Works on sunxi hardware in the case if the hardware overlay (sunxi disp layer) is used for a DRI2 window. The name of this option and the description in the man page has been borrowed from intel and radeon drivers. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 30 Jul, 2013 1 commit
-
-
Siarhei Siamashka authored
That's the right thing to do and fixes issues such as https://github.com/ssvb/xf86-video-sunxifb/issues/6 As a result, now the framebuffer size may need to be larger in order to accomodate two DRI2 buffers in the offscreen part of the framebuffer. The users of sunxi hardware are advised to increase the value of fb0_framebuffer_num variable in fex file to 3 for 32bpp mode and to 5 for 16bpp mode. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 29 Jul, 2013 2 commits
-
-
Siarhei Siamashka authored
Should fix https://github.com/ssvb/xf86-video-sunxifb/issues/14 and prevent FTBFS on some systems. Reported-by: Fred Chien <cfsghost@gmail.com> Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
When moving further to our own DRI2 buffers bookkeeping, we can't really trust the information from DRI2BufferRec anymore. So just add a copy of all the missing bits of information to UMPBufferInfoRec and use it instead. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 28 Jul, 2013 1 commit
-
-
Siarhei Siamashka authored
By allowing to set the delay between frames with milliseconds precision in the command line, we can use it to test vsync. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 26 Jul, 2013 2 commits
-
-
Siarhei Siamashka authored
The recent commit 9e0a8731 (its part that suppressed buffers reuse in the Xorg DRI2 framework) introduced a regression. Half of the frames stoppped reaching the screen on the CPU copy fallback path because the Mali blob now ended up rendering them to the "wrong" buffer. It just confirms that we need to completely move from the standard DRI2 framework in the Xorg server to our own buffers bookkeeping logic. This patch fixes the regression by introducing a single UMP buffer per window, which is shared between back and front DRI2 buffers. We can do this because double buffering does not make much sense on the fallback path at the moment (we can't set scanout from this buffer and anyway have to copy this data elsewhere immediately after we get it from Mali). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
Bail out earlier for the uninteresting types of DRI2 buffer requests (by just returning a dummy null UMP buffer). Makes the code a bit more simple on the common path. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 24 Jul, 2013 3 commits
-
-
Siarhei Siamashka authored
The test program cycles through 3 colors (red, green, blue), so it is easier to see if we get the color change pattern wrong. Also the X11 window title is updated to indicate the current color information. If we have any problems with window decorations handling, they are likely to be exposed. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
Using the secure id 1 (framebuffer) to trick the Mali blob into requesting DRI2 buffers again was not a very good idea. The problem is that the blob still writes something there and corrupts the framebuffer. So instead we try to assign secure id 2 to a dummy 4KiB UMP buffer allocated in memory and use it for the same purpose. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
Siarhei Siamashka authored
The Mali blob is doing something like this: 1. Request BackLeft DRI2 buffer (buffer A) and render to it 2. Swap buffers 3. Request BackLeft DRI2 buffer (buffer B) 4. Check window size, and if it has changed - go back to step 1. 5. Render to the current back buffer (either buffer A or B) 6. Swap buffers 7. Go back to step 4 A very serious show stopper problem is that the Mali blob ignores DRI2-InvalidateBuffers events and just uses GetGeometry polling to check whether the window size has changed. Unfortunately this is racy and we may end up with a size mismatch between buffer A and buffer B. This is particularly easy to trigger when the window size changes exactly between steps 1 and 3. See test/gles-yellow-blue-flip.c program which demonstrates this. Qt5 applications also trigger this bug. We workaround the issue by explicitly tracking the requests for BackLeft buffers and checking whether the sizes of these buffers match at step 1 and step 3. However the real challenge here is notifying the client application that these buffers are no good, so that it can request them again. As DRI2-InvalidateBuffers events are ignored, we are in a pretty difficult situation. Fortunately I remembered a weird behaviour observed earlier: https://groups.google.com/forum/#!msg/linux-sunxi/qnxpVaqp1Ys/aVTq09DVih0J Actually if we return UMP secure ID value 1 for the second DRI2 buffer request, the blob responds to this by spitting out the following error message: [EGL-X11] [2274] DETECTED ONLY ONE FRAMEBUFFER - FORCING A RESIZE [EGL-X11] [2274] DRI2 UMP ID 0x3 retrieved [EGL-X11] [2274] DRI2 WINDOW UMP SECURE ID CHANGED (0x3 -> 0x3) And then it proceeds by re-trying to request a pair of DRI2 buffers. But that's exactly the behaviour we want! As a down side, some ugly flashing can be seen on screen at the time when this workaround kicks in, but then everything normalizes. And unfortunately, the race condition is still not totally eliminated because the blob is apparently getting DRI2 buffer sizes from the separate GetGeometry requests instead of using the information provided by DRI2GetBuffers. But now the problem is at least very hard to trigger. Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 20 Jul, 2013 1 commit
-
-
Harm Hanemaaijer authored
Since version 1.0, gstreamer (when using xvimagesink) often allocates a larger XV image for the video with padding on all four sides and then calls XvPutImage() to render a part of this image. With the current XV implementation this results in artifacts on the borders of the image, with a green bar at the bottom. I am observing this when playing a 1280x720 video on a 1920x1080 screen at 32bpp, the size of the video window doesn't matter. This problem seems to be an exaggeration of the one described in https://bugzilla.gnome.org/show_bug.cgi?id=685305 . The solution appears to be to use the source area dimensions as requested in the XvPutImage() call, as opposed to the dimensions of the originally allocated image, and to honour the offsets (src_x, src_y) when setting the source region on the display controller. With this relatively simple change, the problem seems to go away, and gstreamer 1.0 (which is faster than gstreamer 0.10 due to a zero-copy strategy) provides an acceptable solution for video playback. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
-
- 19 Jul, 2013 1 commit
-
-
Siarhei Siamashka authored
In the case if an attempt to reserve a scalable sunxi-disp layer failed, don't initialize XV at all. Otherwise any attempt to use XV overlay is not going to work correctly and just results in the following dmesg spam: [ 728.280000] [DISP] not supported yuv channel format:18 in img_sw_para_to_reg This may happen on Allwinner A13 if scaler mode is enabled in .fex file (A13 only has one DEFE scaler). Allwinner A10 also can have similar troubles in dual-head configuration if scaler mode is enabled for one or both screens (A10 has two DEFE scalers). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-
- 18 Jul, 2013 2 commits
-
-
Harm Hanemaaijer authored
Update the man page and bring it up-to-date, reflecting the fact that the driver also supports non-sunxi platforms. Add description of the "XVHWOverlay" option. Also a small update to the README for similar reasons. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
-
Harm Hanemaaijer authored
Add the "XVHWOverlay" boolean xorg.conf option to make it possible to disable the XV acceleration feature using display layers on sunxi hardware. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
-
- 17 Jul, 2013 1 commit
-
-
Siarhei Siamashka authored
In some systems libump library is built without an explicit pthreads dependency. As the issue has been already confirmed to affect both sunxi and odroid users (and maybe the users of the other mali400 based hardware), it is easier to just workaround the problem locally. Otherwise we would need to hunt down all the libump packagers and beg for the fix. More details are at https://github.com/ssvb/xf86-video-sunxifb/issues/11 Reported-by: Patrick Wood Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
-