Siarhei Siamashka authored
Benchmarking with x11perf, modified to support a wider range of sizes for
the scroll operation. Tests were run at the stock 700MHz CPU clock
frequency with a 1280x720 32bpp desktop.

    $ DISPLAY=:0 ./x11perf -scroll5 -scroll10 -scroll15 -scroll20 \
                           -scroll30 -scroll50 -scroll100

== CPU ==
1000000 trep @   0.0289 msec ( 34600.0/sec): Scroll 5x5 pixels
1000000 trep @   0.0387 msec ( 25800.0/sec): Scroll 10x10 pixels
1000000 trep @   0.0459 msec ( 21800.0/sec): Scroll 15x15 pixels
 450000 trep @   0.0576 msec ( 17300.0/sec): Scroll 20x20 pixels
 350000 trep @   0.0817 msec ( 12200.0/sec): Scroll 30x30 pixels
 200000 trep @   0.1564 msec (  6390.0/sec): Scroll 50x50 pixels
 100000 trep @   0.4446 msec (  2250.0/sec): Scroll 100x100 pixels

== fb_copyarea (DMA) acceleration ==
1000000 trep @   0.0307 msec ( 32500.0/sec): Scroll 5x5 pixels
1000000 trep @   0.0353 msec ( 28300.0/sec): Scroll 10x10 pixels
1000000 trep @   0.0397 msec ( 25200.0/sec): Scroll 15x15 pixels
1000000 trep @   0.0464 msec ( 21600.0/sec): Scroll 20x20 pixels
 400000 trep @   0.0645 msec ( 15500.0/sec): Scroll 30x30 pixels
 250000 trep @   0.1177 msec (  8500.0/sec): Scroll 50x50 pixels
 100000 trep @   0.2783 msec (  3590.0/sec): Scroll 100x100 pixels

This shows that the ioctl overhead and the DMA setup cost are not very
significant on the Raspberry Pi. DMA already becomes a bit faster than
the CPU at a blit size of 10x10.
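
To make the comparison easier to read, the per-operation times above can
be turned into a DMA-over-CPU speedup ratio. The following is only an
illustrative sketch: the timing values are copied from the x11perf
results in this message, while the array and function names
(cpu_msec, dma_msec, dma_speedup, print_speedups) are hypothetical and
not part of the driver.

```c
#include <stdio.h>

/* Per-operation times in msec, copied from the x11perf results above. */
static const int    sizes[]    = { 5, 10, 15, 20, 30, 50, 100 };
static const double cpu_msec[] = { 0.0289, 0.0387, 0.0459, 0.0576,
                                   0.0817, 0.1564, 0.4446 };
static const double dma_msec[] = { 0.0307, 0.0353, 0.0397, 0.0464,
                                   0.0645, 0.1177, 0.2783 };

/* Speedup of DMA over CPU: values above 1.0 mean DMA is faster. */
static double dma_speedup(double cpu, double dma)
{
    return cpu / dma;
}

/* Print one line per scroll size (hypothetical helper, illustration only). */
static void print_speedups(void)
{
    size_t n = sizeof(sizes) / sizeof(sizes[0]);
    for (size_t i = 0; i < n; i++)
        printf("%3dx%-3d  %.2fx\n", sizes[i], sizes[i],
               dma_speedup(cpu_msec[i], dma_msec[i]));
}
```

Calling print_speedups() from a main() function gives a ratio slightly
below 1 for 5x5 (about 0.94), about 1.10 for 10x10 and about 1.60 for
100x100.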

Even though there is no significant difference between CPU and DMA for
extremely small operations (other overhead clearly dominates), setting
a threshold does no harm:

== mixed CPU / fb_copyarea (DMA) with 90 pixels threshold ==
1000000 trep @   0.0291 msec ( 34300.0/sec): Scroll 5x5 pixels
1000000 trep @   0.0345 msec ( 29000.0/sec): Scroll 10x10 pixels
1000000 trep @   0.0395 msec ( 25300.0/sec): Scroll 15x15 pixels
1000000 trep @   0.0466 msec ( 21400.0/sec): Scroll 20x20 pixels
 400000 trep @   0.0650 msec ( 15400.0/sec): Scroll 30x30 pixels
 250000 trep @   0.1181 msec (  8470.0/sec): Scroll 50x50 pixels
 100000 trep @   0.2784 msec (  3590.0/sec): Scroll 100x100 pixels

If other ARM devices also implement a Raspberry Pi compatible
accelerated fb_copyarea ioctl, the threshold selection may be
reconsidered.

Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
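
The threshold decision can be sketched roughly as follows. This is a
minimal illustration, not the driver code: it assumes the 90 pixels
threshold is compared against the blit area (width times height), which
is consistent with the mixed results (5x5 = 25 pixels stays at CPU
speed, 10x10 = 100 pixels matches the DMA numbers). The macro and
function names are hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical constant: minimum blit area, in pixels, for which the
 * accelerated fb_copyarea (DMA) path is preferred. The value 90 is the
 * threshold used in the mixed benchmark above.
 */
#define COPYAREA_DMA_THRESHOLD 90

/*
 * Return true if a w x h blit should be sent to the DMA path, false if
 * it should be handled by the CPU fallback. Assumes the threshold
 * applies to the area of the blit (an assumption, see above).
 */
static bool use_dma_for_blit(int w, int h)
{
    if (w <= 0 || h <= 0)
        return false; /* nothing to copy, keep it on the CPU path */
    return (long)w * (long)h >= COPYAREA_DMA_THRESHOLD;
}
```

With these assumptions, use_dma_for_blit(5, 5) selects the CPU path and
use_dma_for_blit(10, 10) selects DMA. A device with a higher DMA setup
cost would only need a different value for the constant.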
102957f9