1. 02 Sep, 2020 1 commit
    • Alexei Fedorov's avatar
      plat/arm: Introduce and use libc_asm.mk makefile · e3f2b1a9
      Alexei Fedorov authored
      Trace analysis of FVP_Base_AEMv8A 0.0/6063 model
      running in Aarch32 mode with the build options
      listed below:
      TRUSTED_BOARD_BOOT=1 GENERATE_COT=1
      ARM_ROTPK_LOCATION=devel_ecdsa KEY_ALG=ecdsa
      ROT_KEY=plat/arm/board/common/rotpk/arm_rotprivk_ecdsa.pem
      shows that when auth_signature() gets called
      71.99% of CPU execution time is spent in memset() function
      written in C using single byte write operations,
      see lib\libc\memset.c.
      This patch introduces new libc_asm.mk makefile which
      replaces C memset() implementation with assembler
      version giving the following results:
      - for Aarch32 in auth_signature() call memset() CPU time
      reduced to 20.56%.
      The number of CPU instructions (Inst) executed during
      TF-A boot stage before start of BL33 in RELEASE builds
      for different versions is presented in the tables below,
      where:
      - C TF-A: existing TF-A C code;
      - C musl: "lightweight code" C "implementation of the
        standard library for Linux-based systems"
      https://git.musl-libc.org/cgit/musl/tree/src/string/memset.c
      - Asm Opt: assemler version from "Arm Optimized Routines"
        project
      https://github.com/ARM-software/optimized-routines/blob/
      master/string/arm/memset.S
      - Asm Linux: assembler version from Linux kernel
      https://github.com/torvalds/linux/blob/master/arch/arm/lib/memset.S
      
      
      - Asm TF-A: assembler version from this patch
      
      Aarch32:
      +-----------+------+------+--------------+----------+
      | Variant   | Set  | Size |    Inst 	 |  Ratio   |
      +-----------+------+------+--------------+----------+
      | C TF-A    | T32  | 16   | 2122110003   | 1.000000 |
      | C musl    | T32  | 156  | 1643917668   | 0.774662 |
      | Asm Opt   | T32  | 84   | 1604810003   | 0.756233 |
      | Asm Linux | A32  | 168  | 1566255018   | 0.738065 |
      | Asm TF-A  | A32  | 160  | 1525865101   | 0.719032 |
      +-----------+------+------+--------------+----------+
      
      AArch64:
      +-----------+------+------------+----------+
      | Variant   | Size |    Inst    |  Ratio   |
      +-----------+------+------------+----------+
      | C TF-A    | 28   | 2732497518 | 1.000000 |
      | C musl    | 212  | 1802999999 | 0.659836 |
      | Asm TF-A  | 140  | 1680260003 | 0.614917 |
      +-----------+------+------------+----------+
      
      This patch modifies 'plat\arm\common\arm_common.mk'
      by overriding libc.mk makefile with libc_asm.mk and
      does not effect other platforms.
      
      Change-Id: Ie89dd0b74ba1079420733a0d76b7366ad0157c2e
      Signed-off-by: default avatarAlexei Fedorov <Alexei.Fedorov@arm.com>
      e3f2b1a9
  2. 21 Aug, 2020 1 commit
  3. 19 Aug, 2020 1 commit
    • Alexei Fedorov's avatar
      libc/memset: Implement function in assembler · e7d344de
      Alexei Fedorov authored
      
      
      Trace analysis of FVP_Base_AEMv8A model running in
      Aarch32 mode with the build options listed below:
      TRUSTED_BOARD_BOOT=1 GENERATE_COT=1
      ARM_ROTPK_LOCATION=devel_ecdsa KEY_ALG=ecdsa
      ROT_KEY=plat/arm/board/common/rotpk/arm_rotprivk_ecdsa.pem
      shows that when auth_signature() gets called
      71.84% of CPU execution time is spent in memset() function
      written in C using single byte write operations,
      see lib\libc\memset.c.
      This patch replaces C memset() implementation with assembler
      version giving the following results:
      - for Aarch32 in auth_signature() call memset() CPU time
      reduced to 24.84%.
      - Number of CPU instructions executed during TF-A
      boot stage before start of BL33 in RELEASE builds:
      ----------------------------------------------
      |  Arch   |     C      |  assembler |    %   |
      ----------------------------------------------
      | Aarch32 | 2073275460 | 1487400003 | -28.25 |
      | Aarch64 | 2056807158 | 1244898303 | -39.47 |
      ----------------------------------------------
      The patch also replaces memset.c with aarch64/memset.S
      in plat\nvidia\tegra\platform.mk.
      
      Change-Id: Ifbf085a2f577a25491e2d28446ee95a4ac891597
      Signed-off-by: default avatarAlexei Fedorov <Alexei.Fedorov@arm.com>
      e7d344de