Commits · 5b33041c89c89c39e854a3c4dd4ee015992f8c26 · adam.huang / Arm Trusted Firmware

19 Nov, 2015 1 commit

Juno R2: Configure the correct L2 RAM latency values · 1dbe3159

Sandrine Bailleux authored Nov 18, 2015

The default reset values for the L2 Data & Tag RAM latencies on the
Cortex-A72 on Juno R2 are not suitable. This patch modifies
the Juno platform reset handler to configure the right settings
on Juno R2.

Change-Id: I20953de7ba0619324a389e0b7bbf951b64057db8

1dbe3159

13 Nov, 2015 1 commit

Add missing RES1 bit in SCTLR_EL1 · 6cd12daa

Vikram Kanigiri authored Jul 22, 2015

As per Section D7.2.81 in the ARMv8-A Reference Manual (DDI0487A Issue A.h),
bits[29:28], bits[23:22], bit[20] and bit[11] in the SCTLR_EL1 are RES1. This
patch adds the missing bit[20] to the SCTLR_EL1_RES1 macro.

Change-Id: I827982fa2856d04def6b22d8200a79fe6922a28e

6cd12daa

19 Oct, 2015 1 commit

Make CASSERT() macro callable from anywhere · c17a4dc3

Sandrine Bailleux authored Oct 14, 2015

The CASSERT() macro introduces a typedef for the sole purpose of
triggering a compilation error if the condition to check is false.
This typedef is not used afterwards. As a consequence, when the
CASSERT() macro is called from withing a function block, the compiler
complains and outputs the following error message:

error: typedef 'msg' locally defined but not used [-Werror=unused-local-typedefs]

This patch adds the "unused" attribute for the aforementioned
typedef. This silences the compiler warning and thus makes the
CASSERT() macro callable from within function blocks as well.

Change-Id: Ie36b58fcddae01a21584c48bb6ef43ec85590479

c17a4dc3

14 Sep, 2015 1 commit

Make generic code work in presence of system caches · 54dc71e7

Achin Gupta authored Sep 11, 2015

On the ARMv8 architecture, cache maintenance operations by set/way on the last
level of integrated cache do not affect the system cache. This means that such a
flush or clean operation could result in the data being pushed out to the system
cache rather than main memory. Another CPU could access this data before it
enables its data cache or MMU. Such accesses could be serviced from the main
memory instead of the system cache. If the data in the sysem cache has not yet
been flushed or evicted to main memory then there could be a loss of
coherency. The only mechanism to guarantee that the main memory will be updated
is to use cache maintenance operations to the PoC by MVA(See section D3.4.11
(System level caches) of ARMv8-A Reference Manual (Issue A.g/ARM DDI0487A.G).

This patch removes the reliance of Trusted Firmware on the flush by set/way
operation to ensure visibility of data in the main memory. Cache maintenance
operations by MVA are now used instead. The following are the broad category of
changes:

1. The RW areas of BL2/BL31/BL32 are invalidated by MVA before the C runtime is
initialised. This ensures that any stale cache lines at any level of cache
are removed.

2. Updates to global data in runtime firmware (BL31) by the primary CPU are made
visible to secondary CPUs using a cache clean operation by MVA.

3. Cache maintenance by set/way operations are only used prior to power down.

NOTE: NON-UPSTREAM TRUSTED FIRMWARE CODE SHOULD MAKE EQUIVALENT CHANGES IN
ORDER TO FUNCTION CORRECTLY ON PLATFORMS WITH SUPPORT FOR SYSTEM CACHES.

Fixes ARM-software/tf-issues#205

Change-Id: I64f1b398de0432813a0e0881d70f8337681f6e9a

54dc71e7

11 Sep, 2015 1 commit

Re-design bakery lock memory allocation and algorithm · ee7b35c4

Andrew Thoelke authored Sep 10, 2015

This patch unifies the bakery lock api's across coherent and normal
memory implementation of locks by using same data type `bakery_lock_t`
and similar arguments to functions.

A separate section `bakery_lock` has been created and used to allocate
memory for bakery locks using `DEFINE_BAKERY_LOCK`. When locks are
allocated in normal memory, each lock for a core has to spread
across multiple cache lines. By using the total size allocated in a
separate cache line for a single core at compile time, the memory for
other core locks is allocated at link time by multiplying the single
core locks size with (PLATFORM_CORE_COUNT - 1). The normal memory lock
algorithm now uses lock address instead of the `id` in the per_cpu_data.
For locks allocated in coherent memory, it moves locks from
tzfw_coherent_memory to bakery_lock section.

The bakery locks are allocated as part of bss or in coherent memory
depending on usage of coherent memory. Both these regions are
initialised to zero as part of run_time_init before locks are used.
Hence, bakery_lock_init() is made an empty function as the lock memory
is already initialised to zero.

The above design lead to the removal of psci bakery locks from
non_cpu_power_pd_node to psci_locks.

NOTE: THE BAKERY LOCK API WHEN USE_COHERENT_MEM IS NOT SET HAS CHANGED.
THIS IS A BREAKING CHANGE FOR ALL PLATFORM PORTS THAT ALLOCATE BAKERY
LOCKS IN NORMAL MEMORY.

Change-Id: Ic3751c0066b8032dcbf9d88f1d4dc73d15f61d8b

ee7b35c4

24 Aug, 2015 1 commit

Add macros for retention control in Cortex-A53/A57 · e0d913c7

Varun Wadekar authored Aug 21, 2015



This patch adds macros suitable for programming the Advanced
SIMD/Floating-point (only Cortex-A53), CPU and L2 dynamic
retention control policy in the CPUECTLR_EL1 and L2ECTLR
registers.
Signed-off-by: Varun Wadekar <vwadekar@nvidia.com>

e0d913c7

05 Aug, 2015 2 commits

cortex_a53: Add A53 errata #826319, #836870 · 6b0d97b2

Jimmy Huang authored Jul 29, 2015



- Apply a53 errata #826319 to revision <= r0p2
- Apply a53 errata #836870 to revision <= r0p3
- Update docs/cpu-specific-build-macros.md for newly added errata build flags

Change-Id: I44918e36b47dca1fa29695b68700ff9bf888865e
Signed-off-by: Jimmy Huang <jimmy.huang@mediatek.com>

6b0d97b2

Add mmio utility functions · fd904df1

Jimmy Huang authored Jul 31, 2015



- Add mmio 16 bits read/write functions.
- Add clear/set/clear-and-set utility functions.

Change-Id: I00fdbdf24af537424f8666b1cadaa5f77a2a46ed
Signed-off-by: Jimmy Huang <jimmy.huang@mediatek.com>

fd904df1

24 Jul, 2015 1 commit

Add "Project Denver" CPU support · 3a8c55f6

Varun Wadekar authored Jul 14, 2015

Denver is NVIDIA's own custom-designed, 64-bit, dual-core CPU which is
fully ARMv8 architecture compatible. Each of the two Denver cores
implements a 7-way superscalar microarchitecture (up to 7 concurrent
micro-ops can be executed per clock), and includes a 128KB 4-way L1
instruction cache, a 64KB 4-way L1 data cache, and a 2MB 16-way L2
cache, which services both cores.

Denver implements an innovative process called Dynamic Code Optimization,
which optimizes frequently used software routines at runtime into dense,
highly tuned microcode-equivalent routines. These are stored in a
dedicated, 128MB main-memory-based optimization cache. After being read
into the instruction cache, the optimized micro-ops are executed,
re-fetched and executed from the instruction cache as long as needed and
capacity allows.

Effectively, this reduces the need to re-optimize the software routines.
Instead of using hardware to extract the instruction-level parallelism
(ILP) inherent in the code, Denver extracts the ILP once via software
techniques, and then executes those routines repeatedly, thus amortizing
the cost of ILP extraction over the many execution instances.

Denver also features new low latency power-state transitions, in addition
to extensive power-gating and dynamic voltage and clock scaling based on
workloads.
Signed-off-by: Varun Wadekar <vwadekar@nvidia.com>

3a8c55f6

27 Apr, 2015 2 commits

Add header guards to asm macro files · e2bf57f8

Dan Handley authored Apr 01, 2015

Some assembly files containing macros are included like header files
into other assembly files. This will cause assembler errors if they
are included multiple times.

Add header guards to assembly macro files to avoid assembler errors.

Change-Id: Ia632e767ed7df7bf507b294982b8d730a6f8fe69

e2bf57f8

Remove use of PLATFORM_CACHE_LINE_SIZE · ce4c820d

Dan Handley authored Mar 30, 2015

The required platform constant PLATFORM_CACHE_LINE_SIZE is
unnecessary since CACHE_WRITEBACK_GRANULE effectively provides the
same information. CACHE_WRITEBACK_GRANULE is preferred since this
is an architecturally defined term and allows comparison with the
corresponding hardware register value.

Replace all usage of PLATFORM_CACHE_LINE_SIZE with
CACHE_WRITEBACK_GRANULE.

Also, add a runtime assert in BL1 to check that the provided
CACHE_WRITEBACK_GRANULE matches the value provided in CTR_EL0.

Change-Id: If87286be78068424217b9f3689be358356500dcd

ce4c820d

31 Mar, 2015 1 commit

Translate secure/non-secure virtual addresses · 6e159e7a

Varun Wadekar authored Mar 13, 2015



This patch adds functionality to translate virtual addresses from
secure or non-secure worlds. This functionality helps Trusted Apps
to share virtual addresses directly and allows the NS world to
pass virtual addresses to TLK directly.

Change-Id: I77b0892963e0e839c448b5d0532920fb7e54dc8e
Signed-off-by: Varun Wadekar <vwadekar@nvidia.com>

6e159e7a

27 Mar, 2015 2 commits

Remove the `owner` field in bakery_lock_t data structure · 548579f5

Soby Mathew authored Feb 20, 2015

This patch removes the `owner` field in bakery_lock_t structure which
is the data structure used in the bakery lock implementation that uses
coherent memory. The assertions to protect against recursive lock
acquisition were based on the 'owner' field. They are now done based
on the bakery lock ticket number. These assertions are also added
to the bakery lock implementation that uses normal memory as well.

Change-Id: If4850a00dffd3977e218c0f0a8d145808f36b470

548579f5

Optimize the bakery lock structure for coherent memory · 1c9573a1

Soby Mathew authored Feb 19, 2015

This patch optimizes the data structure used with the bakery lock
implementation for coherent memory to save memory and minimize memory
accesses. These optimizations were already part of the bakery lock
implementation for normal memory and this patch now implements
it for the coherent memory implementation as well. Also
included in the patch is a cleanup to use the do-while loop while
waiting for other contenders to finish choosing their tickets.

Change-Id: Iedb305473133dc8f12126726d8329b67888b70f1

1c9573a1

18 Mar, 2015 1 commit

Add support for ARM Cortex-A72 processor · 1ba93aeb

Vikram Kanigiri authored Feb 17, 2015

This patch adds support for ARM Cortex-A72 processor in the CPU
specific framework.

Change-Id: I5986855fc1b875aadf3eba8c36e989d8a05e5175

1ba93aeb

16 Mar, 2015 1 commit

Use ARM CCI driver on FVP and Juno platforms · 4991ecdc

Vikram Kanigiri authored Feb 26, 2015

This patch updates the FVP and Juno platform ports to use the common
driver for ARM Cache Coherent Interconnects.

Change-Id: Ib142f456b9b673600592616a2ec99e9b230d6542

4991ecdc

26 Jan, 2015 1 commit

Call reset handlers upon BL3-1 entry. · 79a97b2e

Yatharth Kochar authored Nov 20, 2014

This patch adds support to call the reset_handler() function in BL3-1 in the
cold and warm boot paths when another Boot ROM reset_handler() has already run.

This means the BL1 and BL3-1 versions of the CPU and platform specific reset
handlers may execute different code to each other. This enables a developer to
perform additional actions or undo actions already performed during the first
call of the reset handlers e.g. apply additional errata workarounds.

Typically, the reset handler will be first called from the BL1 Boot ROM. Any
additional functionality can be added to the reset handler when it is called
from BL3-1 resident in RW memory. The constant FIRST_RESET_HANDLER_CALL is used
to identify whether this is the first version of the reset handler code to be
executed or an overridden version of the code.

The Cortex-A57 errata workarounds are applied only if they have not already been
applied.

Fixes ARM-software/tf-issue#275

Change-Id: Id295f106e4fda23d6736debdade2ac7f2a9a9053

79a97b2e

23 Jan, 2015 1 commit

Return success if an interrupt is seen during PSCI CPU_SUSPEND · 22f08973

Soby Mathew authored Jan 06, 2015

This patch adds support to return SUCCESS if a pending interrupt is
detected during a CPU_SUSPEND call to a power down state. The check
is performed as late as possible without losing the ability to return
to the caller. This reduces the overhead incurred by a CPU in
undergoing a complete power cycle when a wakeup interrupt is already
pending.

Fixes ARM-Software/tf-issues#102

Change-Id: I1aff04a74b704a2f529734428030d1d10750fd4b

22f08973

22 Jan, 2015 2 commits

Move bakery algorithm implementation out of coherent memory · 8c5fe0b5

Soby Mathew authored Jan 08, 2015

This patch moves the bakery locks out of coherent memory to normal memory.
This implies that the lock information needs to be placed on a separate cache
line for each cpu. Hence the bakery_lock_info_t structure is allocated in the
per-cpu data so as to minimize memory wastage. A similar platform per-cpu
data is introduced for the platform locks.

As a result of the above changes, the bakery lock api is completely changed.
Earlier, a reference to the lock structure was passed to the lock implementation.
Now a unique-id (essentially an index into the per-cpu data array) and an offset
into the per-cpu data for bakery_info_t needs to be passed to the lock
implementation.

Change-Id: I1e76216277448713c6c98b4c2de4fb54198b39e0

8c5fe0b5

Add macros for domain specific barriers. · 5b1cd43b

Soby Mathew authored Dec 30, 2014

This patch adds helper macros for barrier operations that specify
the type of barrier (dmb, dsb) and the shareability domain (system,
inner-shareable) it affects.

Change-Id: I4bf95103e79da212c4fbdbc13d91ad8ac385d9f5

5b1cd43b

07 Jan, 2015 1 commit

Prevent optimisation of sysregs accessors calls · 36e2fd01

Sandrine Bailleux authored Jan 07, 2015

Calls to system register read accessors functions may be optimised
out by the compiler if called twice in a row for the same register.
This is because the compiler is not aware that the result from
the instruction may be modified by external agents. Therefore, if
nothing modifies the register between the 2 reads as far as the
compiler knows then it might consider that it is useless to read
it twice and emit only 1 call.

This behaviour is faulty for registers that may not have the same
value if read twice in succession. E.g.: counters, timer
control/countdown registers, GICv3 interrupt status registers and
so on.

The same problem happens for calls to system register write
accessors functions. The compiler might optimise out some calls
if it considers that it will produce the same result. Again, this
behaviour is faulty for cases where intermediate writes to these
registers make a difference in the system.

This patch fixes the problem by making these assembly register
accesses volatile.

Fixes ARM-software/tf-issues#273

Change-Id: I33903bc4cc4eea8a8d87bc2c757909fbb0138925

36e2fd01

04 Dec, 2014 1 commit

Fix the array size of mpidr_aff_map_nodes_t. · 235585b1

Soby Mathew authored Dec 04, 2014

This patch fixes the array size of mpidr_aff_map_nodes_t which
was less by one element.

Fixes ARM-software/tf-issues#264

Change-Id: I48264f6f9e7046a3d0f4cbcd63b9ba49657e8818

235585b1

29 Oct, 2014 1 commit

Apply errata workarounds only when major/minor revisions match. · 7395a725

Soby Mathew authored Sep 22, 2014

Prior to this patch, the errata workarounds were applied for any version
of the CPU in the release build and in the debug build an assert
failure resulted when the revision did not match. This patch applies
errata workarounds in the Cortex-A57 reset handler only if the 'variant'
and 'revision' fields read from the MIDR_EL1 match. In the debug build,
a warning message is printed for each errata workaround which is not
applied.

The patch modifies the register usage in 'reset_handler` so
as to adhere to ARM procedure calling standards.

Fixes ARM-software/tf-issues#242

Change-Id: I51b1f876474599db885afa03346e38a476f84c29

7395a725

25 Sep, 2014 1 commit

Create BL stage specific translation tables · d0ecd979

Soby Mathew authored Sep 03, 2014

This patch uses the IMAGE_BL<x> constants to create translation tables specific
to a boot loader stage. This allows each stage to create mappings only for areas
in the memory map that it needs.

Fixes ARM-software/tf-issues#209

Change-Id: Ie4861407ddf9317f0fb890fc7575eaa88d0de51c

d0ecd979

16 Sep, 2014 1 commit

Initialize SCTLR_EL1 based on MODE_RW bit · ae213cee

Jens Wiklander authored Sep 04, 2014

Initializes SCTLR_EL1 based on MODE_RW bit in SPSR for the entry
point. The RES1 bits for SCTLR_EL1 differs for Aarch64 and Aarch32
mode.

ae213cee

02 Sep, 2014 1 commit

Reset CNTVOFF_EL2 register before exit into EL1 on warm boot · 14c0526b

Soby Mathew authored Aug 29, 2014

This patch resets the value of CNTVOFF_EL2 before exit to EL1 on
warm boot. This needs to be done if only the Trusted Firmware exits
to EL1 instead of EL2, otherwise the hypervisor would be responsible
for this.

Fixes ARM-software/tf-issues#240

Change-Id: I79d54831356cf3215bcf1f251c373bd8f89db0e0

14c0526b

21 Aug, 2014 1 commit

Juno: Implement initial platform port · 01b916bf

Sandrine Bailleux authored Jul 17, 2014

This patch adds the initial port of the ARM Trusted Firmware on the Juno
development platform. This port does not support a BL3-2 image or any PSCI APIs
apart from PSCI_VERSION and PSCI_CPU_ON. It enables workarounds for selected
Cortex-A57 (#806969 & #813420) errata and implements the workaround for a Juno
platform errata (Defect id 831273).

Change-Id: Ib3d92df3af53820cfbb2977582ed0d7abf6ef893

01b916bf

20 Aug, 2014 4 commits

Add support for selected Cortex-A57 errata workarounds · d9bdaf2d

Soby Mathew authored Aug 14, 2014

This patch adds workarounds for selected errata which affect the Cortex-A57 r0p0
part. Each workaround has a build time flag which should be used by the platform
port to enable or disable the corresponding workaround. The workarounds are
disabled by default. An assertion is raised if the platform enables a workaround
which does not match the CPU revision at runtime.

Change-Id: I9ae96b01c6ff733d04dc733bd4e67dbf77b29fb0

d9bdaf2d

Add CPU specific crash reporting handlers · d3f70af6

Soby Mathew authored Aug 14, 2014

This patch adds handlers for dumping Cortex-A57 and Cortex-A53 specific register
state to the CPU specific operations framework. The contents of CPUECTLR_EL1 are
dumped currently.

Change-Id: I63d3dbfc4ac52fef5e25a8cf6b937c6f0975c8ab

d3f70af6

Add CPU specific power management operations · add40351

Soby Mathew authored Aug 14, 2014

This patch adds CPU core and cluster power down sequences to the CPU specific
operations framework introduced in a earlier patch. Cortex-A53, Cortex-A57 and
generic AEM sequences have been added. The latter is suitable for the
Foundation and Base AEM FVPs. A pointer to each CPU's operations structure is
saved in the per-cpu data so that it can be easily accessed during power down
seqeunces.

An optional platform API has been introduced to allow a platform to disable the
Accelerator Coherency Port (ACP) during a cluster power down sequence. The weak
definition of this function (plat_disable_acp()) does not take any action. It
should be overriden with a strong definition if the ACP is present on a
platform.

Change-Id: I8d09bd40d2f528a28d2d3f19b77101178778685d

add40351

Introduce framework for CPU specific operations · 9b476841

Soby Mathew authored Aug 14, 2014

This patch introduces a framework which will allow CPUs to perform
implementation defined actions after a CPU reset, during a CPU or cluster power
down, and when a crash occurs. CPU specific reset handlers have been implemented
in this patch. Other handlers will be implemented in subsequent patches.

Also moved cpu_helpers.S to the new directory lib/cpus/aarch64/.

Change-Id: I1ca1bade4d101d11a898fb30fea2669f9b37b956

9b476841

14 Aug, 2014 1 commit

Move IO storage source to drivers directory · 935db693

Dan Handley authored Aug 12, 2014

Move the remaining IO storage source file (io_storage.c) from the
lib to the drivers directory. This requires that platform ports
explicitly add this file to the list of source files.

Also move the IO header files to a new sub-directory, include/io.

Change-Id: I862b1252a796b3bcac0d93e50b11e7fb2ded93d6

935db693

28 Jul, 2014 1 commit

Simplify management of SCTLR_EL3 and SCTLR_EL1 · ec3c1003

Achin Gupta authored Jul 18, 2014

This patch reworks the manner in which the M,A, C, SA, I, WXN & EE bits of
SCTLR_EL3 & SCTLR_EL1 are managed. The EE bit is cleared immediately after reset
in EL3. The I, A and SA bits are set next in EL3 and immediately upon entry in
S-EL1. These bits are no longer managed in the blX_arch_setup() functions. They
do not have to be saved and restored either. The M, WXN and optionally the C
bit are set in the enable_mmu_elX() function. This is done during both the warm
and cold boot paths.

Fixes ARM-software/tf-issues#226

Change-Id: Ie894d1a07b8697c116960d858cd138c50bc7a069

ec3c1003

19 Jul, 2014 1 commit

Make enablement of the MMU more flexible · afff8cbd

Achin Gupta authored Jun 26, 2014

This patch adds a 'flags' parameter to each exception level specific function
responsible for enabling the MMU. At present only a single flag which indicates
whether the data cache should also be enabled is implemented. Subsequent patches
will use this flag when enabling the MMU in the warm boot paths.

Change-Id: I0eafae1e678c9ecc604e680851093f1680e9cefa

afff8cbd

09 Jul, 2014 1 commit

Calculate TCR bits based on VA and PA · 73ad2572

Lin Ma authored Jun 27, 2014

Currently the TCR bits are hardcoded in xlat_tables.c. In order to
map higher physical address into low virtual address, the TCR bits
need to be configured accordingly.

This patch is to save the max VA and PA and calculate the TCR.PS/IPS
and t0sz bits in init_xlat_tables function.

Change-Id: Ia7a58e5372b20200153057d457f4be5ddbb7dae4

73ad2572

24 Jun, 2014 2 commits

Inline the mmio accessor functions · 5e113753

Andrew Thoelke authored Jun 24, 2014

Making the simple mmio_read_*() and mmio_write_*() functions inline
saves 360 bytes of code in FVP release build.

Fixes ARM-software/tf-issues#210

Change-Id: I65134f9069f3b2d8821d882daaa5fdfe16355e2f

5e113753

Remove all checkpatch errors from codebase · 4f2104ff

Juan Castillo authored Jun 13, 2014

Exclude stdlib files because they do not follow kernel code style.

Fixes ARM-software/tf-issues#73

Change-Id: I4cfafa38ab436f5ab22c277cb38f884346a267ab

4f2104ff

23 Jun, 2014 2 commits

Remove calling CPU mpidr from bakery lock API · 634ec6c2

Andrew Thoelke authored Jun 09, 2014

The bakery lock code currently expects the calling code to pass
the MPIDR_EL1 of the current CPU.

This is not always done correctly. Also the change to provide
inline access to system registers makes it more efficient for the
bakery lock code to obtain the MPIDR_EL1 directly.

This change removes the mpidr parameter from the bakery lock
interface, and results in a code reduction of 160 bytes for the
ARM FVP port.

Fixes ARM-software/tf-issues#213

Change-Id: I7ec7bd117bcc9794a0d948990fcf3336a367d543

634ec6c2

Initialise CPU contexts from entry_point_info · 167a9357

Andrew Thoelke authored Jun 04, 2014

Consolidate all BL3-1 CPU context initialization for cold boot, PSCI
and SPDs into two functions:
*  The first uses entry_point_info to initialize the relevant
   cpu_context for first entry into a lower exception level on a CPU
*  The second populates the EL1 and EL2 system registers as needed
   from the cpu_context to ensure correct entry into the lower EL

This patch alters the way that BL3-1 determines which exception level
is used when first entering EL1 or EL2 during cold boot - this is now
fully determined by the SPSR value in the entry_point_info for BL3-3,
as set up by the platform code in BL2 (or otherwise provided to BL3-1).

In the situation that EL1 (or svc mode) is selected for a processor
that supports EL2, the context management code will now configure all
essential EL2 register state to ensure correct execution of EL1. This
allows the platform code to run non-secure EL1 payloads directly
without requiring a small EL2 stub or OS loader.

Change-Id: If9fbb2417e82d2226e47568203d5a369f39d3b0f

167a9357

10 Jun, 2014 1 commit

Make system register functions inline assembly · 5c3272a7

Andrew Thoelke authored Jun 02, 2014

Replace the current out-of-line assembler implementations of
the system register and system instruction operations with
inline assembler.

This enables better compiler optimisation and code generation
when accessing system registers.

Fixes ARM-software/tf-issues#91

Change-Id: I149af3a94e1e5e5140a3e44b9abfc37ba2324476

5c3272a7