- 02 Sep, 2020 1 commit
-
-
Alexei Fedorov authored
Trace analysis of FVP_Base_AEMv8A 0.0/6063 model running in Aarch32 mode with the build options listed below: TRUSTED_BOARD_BOOT=1 GENERATE_COT=1 ARM_ROTPK_LOCATION=devel_ecdsa KEY_ALG=ecdsa ROT_KEY=plat/arm/board/common/rotpk/arm_rotprivk_ecdsa.pem shows that when auth_signature() gets called 71.99% of CPU execution time is spent in memset() function written in C using single byte write operations, see lib\libc\memset.c. This patch introduces new libc_asm.mk makefile which replaces C memset() implementation with assembler version giving the following results: - for Aarch32 in auth_signature() call memset() CPU time reduced to 20.56%. The number of CPU instructions (Inst) executed during TF-A boot stage before start of BL33 in RELEASE builds for different versions is presented in the tables below, where: - C TF-A: existing TF-A C code; - C musl: "lightweight code" C "implementation of the standard library for Linux-based systems" https://git.musl-libc.org/cgit/musl/tree/src/string/memset.c - Asm Opt: assemler version from "Arm Optimized Routines" project https://github.com/ARM-software/optimized-routines/blob/ master/string/arm/memset.S - Asm Linux: assembler version from Linux kernel https://github.com/torvalds/linux/blob/master/arch/arm/lib/memset.S - Asm TF-A: assembler version from this patch Aarch32: +-----------+------+------+--------------+----------+ | Variant | Set | Size | Inst | Ratio | +-----------+------+------+--------------+----------+ | C TF-A | T32 | 16 | 2122110003 | 1.000000 | | C musl | T32 | 156 | 1643917668 | 0.774662 | | Asm Opt | T32 | 84 | 1604810003 | 0.756233 | | Asm Linux | A32 | 168 | 1566255018 | 0.738065 | | Asm TF-A | A32 | 160 | 1525865101 | 0.719032 | +-----------+------+------+--------------+----------+ AArch64: +-----------+------+------------+----------+ | Variant | Size | Inst | Ratio | +-----------+------+------------+----------+ | C TF-A | 28 | 2732497518 | 1.000000 | | C musl | 212 | 1802999999 | 0.659836 | | Asm TF-A | 140 | 1680260003 | 0.614917 | +-----------+------+------------+----------+ This patch modifies 'plat\arm\common\arm_common.mk' by overriding libc.mk makefile with libc_asm.mk and does not effect other platforms. Change-Id: Ie89dd0b74ba1079420733a0d76b7366ad0157c2e Signed-off-by: Alexei Fedorov <Alexei.Fedorov@arm.com>
-
- 21 Aug, 2020 1 commit
-
-
Mark Dykes authored
This reverts commit e7d344de. This reverts the patch https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/5313 due to a timing issue with the merge. The merge occurred at the same time as the additional comments and thusly were were not seen until the merge was done. This reverts the change and additional patches from Alexei will follow to address the concerns expressed in the orignal patch. Change-Id: Iae5f6403c93ac13ceeda29463883fcd4c437f2b7
-
- 19 Aug, 2020 1 commit
-
-
Alexei Fedorov authored
Trace analysis of FVP_Base_AEMv8A model running in Aarch32 mode with the build options listed below: TRUSTED_BOARD_BOOT=1 GENERATE_COT=1 ARM_ROTPK_LOCATION=devel_ecdsa KEY_ALG=ecdsa ROT_KEY=plat/arm/board/common/rotpk/arm_rotprivk_ecdsa.pem shows that when auth_signature() gets called 71.84% of CPU execution time is spent in memset() function written in C using single byte write operations, see lib\libc\memset.c. This patch replaces C memset() implementation with assembler version giving the following results: - for Aarch32 in auth_signature() call memset() CPU time reduced to 24.84%. - Number of CPU instructions executed during TF-A boot stage before start of BL33 in RELEASE builds: ---------------------------------------------- | Arch | C | assembler | % | ---------------------------------------------- | Aarch32 | 2073275460 | 1487400003 | -28.25 | | Aarch64 | 2056807158 | 1244898303 | -39.47 | ---------------------------------------------- The patch also replaces memset.c with aarch64/memset.S in plat\nvidia\tegra\platform.mk. Change-Id: Ifbf085a2f577a25491e2d28446ee95a4ac891597 Signed-off-by: Alexei Fedorov <Alexei.Fedorov@arm.com>
-