diff --git a/docs/firmware-design.md b/docs/firmware-design.md
index cd8c7aaad8ef55d077dc1fca3ff28abfa7aca99a..54c5068073ce283dd311fcc2dbdfda40ca6c7f67 100644
--- a/docs/firmware-design.md
+++ b/docs/firmware-design.md
@@ -152,11 +152,15 @@ BL1 performs minimal architectural initialization as follows.
     CLCD window of the FVP.
 
     BL1 does not expect to receive any exceptions other than the SMC exception.
-    For the latter, BL1 installs a simple stub. The stub expects to receive
-    only a single type of SMC (determined by its function ID in the general
-    purpose register `X0`). This SMC is raised by BL2 to make BL1 pass control
-    to BL31 (loaded by BL2) at EL3. Any other SMC leads to an assertion
-    failure.
+    For the latter, BL1 installs a simple stub. The stub expects to receive a
+    limited set of SMC types (determined by their function IDs in the general
+    purpose register `X0`):
+    -   `BL1_SMC_RUN_IMAGE`: This SMC is raised by BL2 to make BL1 pass control
+        to BL31 (loaded by BL2) at EL3.
+    -   All SMCs listed in section "BL1 SMC Interface" in the [Firmware Update]
+        Design Guide.
+
+    Any other SMC leads to an assertion failure.
 
 *   CPU initialization
 
@@ -164,12 +168,6 @@ BL1 performs minimal architectural initialization as follows.
     specific reset handler function (see the section: "CPU specific operations
     framework").
 
-*   MMU setup
-
-    BL1 sets up EL3 memory translation by creating page tables to cover the
-    first 4GB of physical address space. This covers all the memories and
-    peripherals needed by BL1.
-
 *   Control register setup
     -   `SCTLR_EL3`. Instruction cache is enabled by setting the `SCTLR_EL3.I`
         bit. Alignment and stack alignment checking is enabled by setting the
@@ -187,12 +185,19 @@ BL1 performs minimal architectural initialization as follows.
         and Advanced SIMD execution are configured to not trap to EL3 by
         clearing the `CPTR_EL3.TFP` bit.
 
+    -   `DAIF`. The SError interrupt is enabled by clearing the SError interrupt
+        mask bit.
+
 #### Platform initialization
 
-BL1 enables issuing of snoop and DVM (Distributed Virtual Memory) requests to
-the CCI slave interface corresponding to the cluster that includes the
-primary CPU. BL1 also initializes a UART (PL011 console), which enables access
-to the `printf` family of functions in BL1.
+On ARM platforms, BL1 performs the following platform initializations:
+
+*   Enable the Trusted Watchdog.
+*   Initialize the console.
+*   Configure the Interconnect to enable hardware coherency.
+*   Enable the MMU and map the memory it needs to access.
+*   Configure any required platform storage to load the next bootloader image
+    (BL2).
 
 #### Firmware Update detection and execution
 
@@ -210,7 +215,12 @@ uses to initialize the execution state of the next image.
 
 In the normal boot flow, BL1 execution continues as follows:
 
-1.  BL1 determines the amount of free trusted SRAM memory available by
+1.  BL1 prints the following string from the primary CPU to indicate successful
+    execution of the BL1 stage:
+
+        "Booting Trusted Firmware"
+
+2.  BL1 determines the amount of free trusted SRAM memory available by
     calculating the extent of its own data section, which also resides in
     trusted SRAM. BL1 loads a BL2 raw binary image from platform storage, at a
     platform-specific base address. If the BL2 image file is not present or if
@@ -225,11 +235,6 @@ In the normal boot flow, BL1 execution continues as follows:
     provided as a base address in the platform header. Further description of
     the memory layout can be found later in this document.
 
-2.  BL1 prints the following string from the primary CPU to indicate successful
-    execution of the BL1 stage:
-
-        "Booting Trusted Firmware"
-
 3.  BL1 passes control to the BL2 image at Secure EL1, starting from its load
     address.
 
@@ -247,23 +252,23 @@ in this document). The functionality implemented by BL2 is as follows.
 #### Architectural initialization
 
 BL2 performs minimal architectural initialization required for subsequent
-stages of the ARM Trusted Firmware and normal world software. It sets up
-Secure EL1 memory translation by creating page tables to address the first 4GB
-of the physical address space in a similar way to BL1. EL1 and EL0 are given
-access to Floating Point & Advanced SIMD registers by clearing the `CPACR.FPEN`
-bits.
+stages of the ARM Trusted Firmware and normal world software. EL1 and EL0 are
+given access to Floating Point & Advanced SIMD registers by clearing the
+`CPACR.FPEN` bits.
 
 #### Platform initialization
 
-BL2 copies the information regarding the trusted SRAM populated by BL1 using a
-platform-specific mechanism. It calculates the limits of DRAM (main memory)
-to determine whether there is enough space to load the BL33 image. A platform
-defined base address is used to specify the load address for the BL31 image.
-It also defines the extents of memory available for use by the BL32 image.
-BL2 also initializes a UART (PL011 console), which enables  access to the
-`printf` family of functions in BL2. Platform security is initialized to allow
-access to controlled components. The storage abstraction layer is initialized
-which is used to load further bootloader images.
+On ARM platforms, BL2 performs the following platform initializations:
+
+*   Initialize the console.
+*   Configure any required platform storage to allow loading further bootloader
+    images.
+*   Enable the MMU and map the memory it needs to access.
+*   Perform platform security setup to allow access to controlled components.
+*   Reserve some memory for passing information to the next bootloader image
+    (BL31) and populate it.
+*   Define the extents of memory available for loading each subsequent
+    bootloader image.
 
 #### SCP_BL2 (System Control Processor Firmware) image load
 
@@ -334,89 +339,75 @@ in this document). The functionality implemented by BL31 is as follows.
 Currently, BL31 performs a similar architectural initialization to BL1 as
 far as system register settings are concerned. Since BL1 code resides in ROM,
 architectural initialization in BL31 allows override of any previous
-initialization done by BL1. BL31 creates page tables to address the first
-4GB of physical address space and initializes the MMU accordingly. It initializes
-a buffer of frequently used pointers, called per-CPU pointer cache, in memory for
-faster access. Currently the per-CPU pointer cache contains only the pointer
-to crash stack. It then replaces the exception vectors populated by BL1 with its
-own. BL31 exception vectors implement more elaborate support for
-handling SMCs since this is the only mechanism to access the runtime services
-implemented by BL31 (PSCI for example). BL31 checks each SMC for validity as
-specified by the [SMC calling convention PDD][SMCCC] before passing control to
-the required SMC handler routine. BL31 programs the `CNTFRQ_EL0` register with
-the clock frequency of the system counter, which is provided by the platform.
+initialization done by BL1.
+
+BL31 initializes the per-CPU data framework, which provides a cache of
+frequently accessed per-CPU data optimised for fast, concurrent manipulation
+on different CPUs. This buffer includes pointers to per-CPU contexts, crash
+buffer, CPU reset and power down operations, PSCI data, platform data and so on.
+
+It then replaces the exception vectors populated by BL1 with its own. BL31
+exception vectors implement more elaborate support for handling SMCs since this
+is the only mechanism to access the runtime services implemented by BL31 (PSCI
+for example). BL31 checks each SMC for validity as specified by the
+[SMC calling convention PDD][SMCCC] before passing control to the required SMC
+handler routine.
+
+BL31 programs the `CNTFRQ_EL0` register with the clock frequency of the system
+counter, which is provided by the platform.
 
 #### Platform initialization
 
 BL31 performs detailed platform initialization, which enables normal world
-software to function correctly. It also retrieves entrypoint information for
-the BL33 image loaded by BL2 from the platform defined memory address populated
-by BL2. It enables issuing of snoop and DVM (Distributed Virtual Memory)
-requests to the CCI slave interface corresponding to the cluster that includes
-the primary CPU. BL31 also initializes a UART (PL011 console), which enables
-access to the `printf` family of functions in BL31.  It enables the system
-level implementation of the generic timer through the memory mapped interface.
-
-* GICv2 initialization:
-
-    -   Enable group0 interrupts in the GIC CPU interface.
-    -   Configure group0 interrupts to be asserted as FIQs.
-    -   Disable the legacy interrupt bypass mechanism.
-    -   Configure the priority mask register to allow interrupts of all
-        priorities to be signaled to the CPU interface.
-    -   Mark SGIs 8-15 and the other secure interrupts on the platform
-        as group0 (secure).
-    -   Target all secure SPIs to CPU0.
-    -   Enable these group0 interrupts in the GIC distributor.
-    -   Configure all other interrupts as group1 (non-secure).
-    -   Enable signaling of group0 interrupts in the GIC distributor.
-
-*   GICv3 initialization:
-
-    If a GICv3 implementation is available in the platform, BL31 initializes
-    the GICv3 in GICv2 emulation mode with settings as described for GICv2
-    above.
-
-*   Power management initialization:
-
-    BL31 implements a state machine to track CPU and cluster state. The state
-    can be one of `OFF`, `ON_PENDING`, `SUSPEND` or `ON`. All secondary CPUs are
-    initially in the `OFF` state. The cluster that the primary CPU belongs to is
-    `ON`; any other cluster is `OFF`. BL31 initializes the data structures that
-    implement the state machine, including the locks that protect them. BL31
-    accesses the state of a CPU or cluster immediately after reset and before
-    the data cache is enabled in the warm boot path. It is not currently
-    possible to use 'exclusive' based spinlocks, therefore BL31 uses locks
-    based on Lamport's Bakery algorithm instead. BL31 allocates these locks in
-    device memory by default.
-
-*   Runtime services initialization:
-
-    The runtime service framework and its initialization is described in the
-    "EL3 runtime services framework" section below.
-
-    Details about the PSCI service are provided in the "Power State Coordination
-    Interface" section below.
-
-*   BL32 (Secure-EL1 Payload) image initialization
-
-    If a BL32 image is present then there must be a matching Secure-EL1 Payload
-    Dispatcher (SPD) service (see later for details). During initialization
-    that service  must register a function to carry out initialization of BL32
-    once the runtime services are fully initialized. BL31 invokes such a
-    registered function to initialize BL32 before running BL33.
-
-    Details on BL32 initialization and the SPD's role are described in the
-    "Secure-EL1 Payloads and Dispatchers" section below.
-
-*   BL33 (Non-trusted Firmware) execution
-
-    BL31 initializes the EL2 or EL1 processor context for normal-world cold
-    boot, ensuring that no secure state information finds its way into the
-    non-secure execution state. BL31 uses the entrypoint information provided
-    by BL2 to jump to the Non-trusted firmware image (BL33) at the highest
-    available Exception Level (EL2 if available, otherwise EL1).
+software to function correctly.
+
+On ARM platforms, this consists of the following:
+
+*   Initialize the console.
+*   Configure the Interconnect to enable hardware coherency.
+*   Enable the MMU and map the memory it needs to access.
+*   Initialize the generic interrupt controller.
+*   Initialize the power controller device.
+*   Detect the system topology.
+
+#### Runtime services initialization
+
+BL31 is responsible for initializing the runtime services. One of them is PSCI.
+
+As part of the PSCI initializations, BL31 detects the system topology. It also
+initializes the data structures that implement the state machine used to track
+the state of power domain nodes. The state can be one of `OFF`, `RUN` or
+`RETENTION`. All secondary CPUs are initially in the `OFF` state. The cluster
+that the primary CPU belongs to is `ON`; any other cluster is `OFF`. It also
+initializes the locks that protect them. BL31 accesses the state of a CPU or
+cluster immediately after reset and before the data cache is enabled in the
+warm boot path. It is not currently possible to use 'exclusive' based spinlocks,
+therefore BL31 uses locks based on Lamport's Bakery algorithm instead.
 
+The runtime service framework and its initialization is described in more
+detail in the "EL3 runtime services framework" section below.
+
+Details about the status of the PSCI implementation are provided in the
+"Power State Coordination Interface" section below.
+
+#### BL32 (Secure-EL1 Payload) image initialization
+
+If a BL32 image is present then there must be a matching Secure-EL1 Payload
+Dispatcher (SPD) service (see later for details). During initialization
+that service must register a function to carry out initialization of BL32
+once the runtime services are fully initialized. BL31 invokes such a
+registered function to initialize BL32 before running BL33.
+
+Details on BL32 initialization and the SPD's role are described in the
+"Secure-EL1 Payloads and Dispatchers" section below.
+
+#### BL33 (Non-trusted Firmware) execution
+
+BL31 initializes the EL2 or EL1 processor context for normal-world cold
+boot, ensuring that no secure state information finds its way into the
+non-secure execution state. BL31 uses the entrypoint information provided
+by BL2 to jump to the Non-trusted firmware image (BL33) at the highest
+available Exception Level (EL2 if available, otherwise EL1).
 
 ### Using alternative Trusted Boot Firmware in place of BL1 and BL2
 
@@ -558,9 +549,6 @@ not all been instantiated in the current implementation.
     Coordination Interface ([PSCI]) is the first set of standard service calls
     defined by ARM (see PSCI section later).
 
-    NOTE: Currently this service is called PSCI since there are no other
-    defined standard service calls.
-
 2.  Secure-EL1 Payload Dispatcher service
 
     If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then
@@ -1833,7 +1821,7 @@ kernel at boot time. These can be found in the `fdts` directory.
 
 - - - - - - - - - - - - - - - - - - - - - - - - - -
 
-_Copyright (c) 2013-2015, ARM Limited and Contributors. All rights reserved._
+_Copyright (c) 2013-2016, ARM Limited and Contributors. All rights reserved._
 
 [ARM ARM]:          http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.e/index.html "ARMv8-A Reference Manual (ARM DDI0487A.E)"
 [PSCI]:             http://infocenter.arm.com/help/topic/com.arm.doc.den0022c/DEN0022C_Power_State_Coordination_Interface.pdf "Power State Coordination Interface PDD (ARM DEN 0022C)"
diff --git a/docs/porting-guide.md b/docs/porting-guide.md
index bd1b4489cbda0fe99848627f382349cdce74e914..26c89bdaa3096f923791a8c65ff53135460f3d2a 100644
--- a/docs/porting-guide.md
+++ b/docs/porting-guide.md
@@ -854,8 +854,16 @@ BL1 to perform the above tasks.
 This function executes with the MMU and data caches disabled. It is only called
 by the primary CPU.
 
-In ARM standard platforms, this function initializes the console and enables
-snoop requests into the primary CPU's cluster.
+On ARM standard platforms, this function:
+
+*   Enables a secure instance of SP805 to act as the Trusted Watchdog.
+
+*   Initializes a UART (PL011 console), which enables access to the `printf`
+    family of functions in BL1.
+
+*   Enables issuing of snoop and DVM (Distributed Virtual Memory) requests to
+    the CCI slave interface corresponding to the cluster that includes the
+    primary CPU.
 
 ### Function : bl1_plat_arch_setup() [mandatory]
 
@@ -1061,15 +1069,19 @@ This function executes with the MMU and data caches disabled. It is only called
 by the primary CPU. The arguments to this function is the address of the
 `meminfo` structure populated by BL1.
 
-The platform must copy the contents of the `meminfo` structure into a private
+The platform may copy the contents of the `meminfo` structure into a private
 variable as the original memory may be subsequently overwritten by BL2. The
 copied structure is made available to all BL2 code through the
 `bl2_plat_sec_mem_layout()` function.
 
-In ARM standard platforms, this function also initializes the storage
-abstraction layer used to load further bootloader images. It is necessary to do
-this early on platforms with a SCP_BL2 image, since the later
-`bl2_platform_setup` must be done after SCP_BL2 is loaded.
+On ARM standard platforms, this function also:
+
+*   Initializes a UART (PL011 console), which enables access to the `printf`
+    family of functions in BL2.
+
+*   Initializes the storage abstraction layer used to load further bootloader
+    images. It is necessary to do this early on platforms with a SCP_BL2 image,
+    since the later `bl2_platform_setup` must be done after SCP_BL2 is loaded.
 
 
 ### Function : bl2_plat_arch_setup() [mandatory]
@@ -1081,9 +1093,9 @@ This function executes with the MMU and data caches disabled. It is only called
 by the primary CPU.
 
 The purpose of this function is to perform any architectural initialization
-that varies across platforms, for example enabling the MMU (since the memory
-map differs across platforms).
+that varies across platforms.
 
+On ARM standard platforms, this function enables the MMU.
 
 ### Function : bl2_platform_setup() [mandatory]
 
@@ -1284,7 +1296,7 @@ This function executes with the MMU and data caches disabled. It is only
 called by the primary CPU. The arguments to this function is the address
 of the `meminfo` structure and platform specific info provided by BL1.
 
-The platform must copy the contents of the `mem_info` and `plat_info` into
+The platform may copy the contents of the `mem_info` and `plat_info` into
 private storage as the original memory may be subsequently overwritten by BL2U.
 
 On ARM CSS platforms `plat_info` is interpreted as an `image_info_t` structure,
@@ -1386,7 +1398,14 @@ to the platform data also needs to be saved.
 
 In ARM standard platforms, BL2 passes a pointer to a `bl31_params` structure
 in BL2 memory. BL31 copies the information in this pointer to internal data
-structures.
+structures. It also performs the following:
+
+*   Initialize a UART (PL011 console), which enables access to the `printf`
+    family of functions in BL31.
+
+*   Enable issuing of snoop and DVM (Distributed Virtual Memory) requests to the
+    CCI slave interface corresponding to the cluster that includes the primary
+    CPU.
 
 
 ### Function : bl31_plat_arch_setup() [mandatory]
@@ -1398,8 +1417,9 @@ This function executes with the MMU and data caches disabled. It is only called
 by the primary CPU.
 
 The purpose of this function is to perform any architectural initialization
-that varies across platforms, for example enabling the MMU (since the memory
-map differs across platforms).
+that varies across platforms.
+
+On ARM standard platforms, this function enables the MMU.
 
 
 ### Function : bl31_platform_setup() [mandatory]
@@ -1414,12 +1434,32 @@ called by the primary CPU.
 The purpose of this function is to complete platform initialization so that both
 BL31 runtime services and normal world software can function correctly.
 
-In ARM standard platforms, this function does the following:
-*   Initializes the generic interrupt controller.
-*   Enables system-level implementation of the generic timer counter.
-*   Grants access to the system counter timer module
-*   Initializes the power controller device
-*   Detects the system topology.
+On ARM standard platforms, this function does the following:
+
+*   Initialize the generic interrupt controller.
+
+    Depending on the GIC driver selected by the platform, the appropriate GICv2
+    or GICv3 initialization will be done, which mainly consists of:
+
+    - Enable secure interrupts in the GIC CPU interface.
+    - Disable the legacy interrupt bypass mechanism.
+    - Configure the priority mask register to allow interrupts of all priorities
+      to be signaled to the CPU interface.
+    - Mark SGIs 8-15 and the other secure interrupts on the platform as secure.
+    - Target all secure SPIs to CPU0.
+    - Enable these secure interrupts in the GIC distributor.
+    - Configure all other interrupts as non-secure.
+    - Enable signaling of secure interrupts in the GIC distributor.
+
+*   Enable system-level implementation of the generic timer counter through the
+    memory mapped interface.
+
+*   Grant access to the system counter timer module
+
+*   Initialize the power controller device.
+
+    In particular, initialise the locks that prevent concurrent accesses to the
+    power controller device.
 
 
 ### Function : bl31_plat_runtime_setup() [optional]
@@ -2020,7 +2060,7 @@ amount of open resources per driver.
 
 - - - - - - - - - - - - - - - - - - - - - - - - - -
 
-_Copyright (c) 2013-2015, ARM Limited and Contributors. All rights reserved._
+_Copyright (c) 2013-2016, ARM Limited and Contributors. All rights reserved._
 
 
 [ARM GIC Architecture Specification 2.0]: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0048b/index.html