Commit b0408e87 authored by Jeenu Viswambharan

PSCI: Optimize call paths if all participants are cache-coherent



The current PSCI implementation can apply certain optimizations upon the
assumption that all PSCI participants are cache-coherent.

  - Skip performing cache maintenance during power-up.

  - Skip performing cache maintenance during power-down:

    At present, on the power-down path, the CPU driver disables caches
    and the MMU, and performs cache maintenance in preparation for
    powering down the CPU. This means that PSCI must perform additional
    cache maintenance on the extant stack for correct functioning (see
    the first sketch below the list).

    If all participating CPUs are cache-coherent, the CPU driver
    neither disables the MMU nor performs cache maintenance. The CPU
    being powered down therefore remains cache-coherent throughout all
    PSCI call paths, which in turn means that PSCI cache maintenance
    operations are not required during power-down.

  - Choose spin locks instead of bakery locks:

    The current PSCI implementation must synchronize both cache-coherent
    and non-cache-coherent participants. Mutual exclusion primitives are
    not guaranteed to function on non-coherent memory, so the
    implementation had to resort to bakery locks.

    If all participants are cache-coherent, the implementation can
    enable the MMU and data caches early, and replace bakery locks with
    spin locks. Spin locks use architectural mutual exclusion
    primitives, and are lighter and faster (see the second sketch below
    the list).
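
To illustrate the power-down point: when caches are turned off on the
way down, any dirty cache lines holding the CPU's stack must first be
cleaned to memory, or later stack reads with caches off would see stale
data. A minimal sketch of that idea follows; flush_dcache_range() is
the existing TF-A helper, while the demo_* names are hypothetical
stand-ins for platform stack accessors:

    #include <stddef.h>
    #include <stdint.h>

    /* TF-A helper: clean and invalidate [addr, addr + size) to memory */
    extern void flush_dcache_range(uintptr_t addr, size_t size);

    /* Hypothetical accessors for the current CPU's stack bounds */
    extern uintptr_t demo_current_sp(void);
    extern uintptr_t demo_stack_top(void);

    static void demo_flush_stack_before_pwrdown(void)
    {
            uintptr_t sp = demo_current_sp();

            /* Stacks grow down, so the live range is [sp, stack_top) */
            flush_dcache_range(sp, demo_stack_top() - sp);
    }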
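
And to illustrate the locking point: a spin lock needs nothing more
than the architecture's atomic exclusive-access primitives, which is
what makes it lighter than a bakery lock. A rough sketch using GCC
atomic builtins (the actual TF-A spin lock is written in assembly; the
demo_* names are hypothetical):

    typedef struct {
            unsigned int lock;
    } demo_spinlock_t;

    static void demo_spin_lock(demo_spinlock_t *s)
    {
            /* Atomically swap in 1; spin while the previous value was 1 */
            while (__atomic_exchange_n(&s->lock, 1U, __ATOMIC_ACQUIRE) != 0U)
                    ;
    }

    static void demo_spin_unlock(demo_spinlock_t *s)
    {
            /* Release ordering publishes prior writes before the unlock */
            __atomic_store_n(&s->lock, 0U, __ATOMIC_RELEASE);
    }

On AArch64 the exchange compiles down to exclusive load/store (or LSE)
instructions, which, as noted above, are not guaranteed to work on
non-coherent memory; that is why bakery locks remain the fallback.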

The optimizations are applied when the HW_ASSISTED_COHERENCY build
option is enabled, as all PSCI participants are expected to be
cache-coherent on such systems.
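
For example (the platform name is a placeholder), a build for such a
system might be invoked as:

    make PLAT=<platform> HW_ASSISTED_COHERENCY=1 all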

Change-Id: Iac51c3ed318ea7e2120f6b6a46fd2db2eae46ede
Signed-off-by: Jeenu Viswambharan <jeenu.viswambharan@arm.com>
parent a10d3632
@@ -176,7 +176,9 @@ interfaces are:
 * The page tables must be setup and the MMU enabled
 * The C runtime environment must be setup and stack initialized
 * The Data cache must be enabled prior to invoking any of the PSCI library
-  interfaces except for `psci_warmboot_entrypoint()`.
+  interfaces except for `psci_warmboot_entrypoint()`. For
+  `psci_warmboot_entrypoint()`, if the build option `HW_ASSISTED_COHERENCY`
+  is enabled however, data caches are expected to be enabled.
 
 Further requirements for each interface can be found in the interface
 description.

@@ -270,11 +272,11 @@ wakes up, it will start execution from the warm reset address.
 Return : void
 
 This function performs the warm boot initialization/restoration as mandated by
-[PSCI spec]. For AArch32, on wakeup from power down the CPU resets to secure
-SVC mode and the EL3 Runtime Software must perform the prerequisite
-initializations mentioned at top of this section. This function must be called
-with Data cache disabled but with MMU initialized and enabled. The major
-actions performed by this function are:
+[PSCI spec]. For AArch32, on wakeup from power down the CPU resets to secure SVC
+mode and the EL3 Runtime Software must perform the prerequisite initializations
+mentioned at top of this section. This function must be called with Data cache
+disabled (unless build option `HW_ASSISTED_COHERENCY` is enabled) but with MMU
+initialized and enabled. The major actions performed by this function are:
 
 * Invalidates the stack and enables the data cache.
 * Initializes architecture and PSCI state coordination.

...
@@ -79,7 +79,8 @@ __section("tzfw_coherent_mem")
 #endif
 ;
 
-DEFINE_BAKERY_LOCK(psci_locks[PSCI_NUM_NON_CPU_PWR_DOMAINS]);
+/* Lock for PSCI state coordination */
+DEFINE_PSCI_LOCK(psci_locks[PSCI_NUM_NON_CPU_PWR_DOMAINS]);
 
 cpu_pd_node_t psci_cpu_pd_nodes[PLATFORM_CORE_COUNT];
@@ -992,3 +993,33 @@ int psci_get_suspend_afflvl(void)
 }
 
 #endif
+
+/*******************************************************************************
+ * Initiate power down sequence, by calling power down operations registered for
+ * this CPU.
+ ******************************************************************************/
+void psci_do_pwrdown_sequence(unsigned int power_level)
+{
+#if HW_ASSISTED_COHERENCY
+        /*
+         * With hardware-assisted coherency, the CPU drivers only initiate the
+         * power down sequence, without performing cache-maintenance operations
+         * in software. Data caches and MMU remain enabled both before and after
+         * this call.
+         */
+        prepare_cpu_pwr_dwn(power_level);
+#else
+        /*
+         * Without hardware-assisted coherency, the CPU drivers disable data
+         * caches and MMU, then perform cache-maintenance operations in
+         * software.
+         *
+         * We ought to call prepare_cpu_pwr_dwn() to initiate power down
+         * sequence. We currently have data caches and MMU enabled, but the
+         * function will return with data caches and MMU disabled. We must
+         * ensure that the stack memory is flushed out to memory before we start
+         * popping from it again.
+         */
+        psci_do_pwrdown_cache_maintenance(power_level);
+#endif
+}

...
@@ -119,10 +119,9 @@ int psci_do_cpu_off(unsigned int end_pwrlvl)
 #endif
 
         /*
-         * Arch. management. Perform the necessary steps to flush all
-         * cpu caches.
+         * Arch. management. Initiate power down sequence.
         */
-        psci_do_pwrdown_cache_maintenance(psci_find_max_off_lvl(&state_info));
+        psci_do_pwrdown_sequence(psci_find_max_off_lvl(&state_info));
 
 #if ENABLE_RUNTIME_INSTRUMENTATION
         PMF_CAPTURE_TIMESTAMP(rt_instr_svc,

...
 /*
- * Copyright (c) 2013-2016, ARM Limited and Contributors. All rights reserved.
+ * Copyright (c) 2013-2017, ARM Limited and Contributors. All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:

@@ -165,10 +165,12 @@ void psci_cpu_on_finish(unsigned int cpu_idx,
         */
        psci_plat_pm_ops->pwr_domain_on_finish(state_info);
 
+#if !HW_ASSISTED_COHERENCY
        /*
         * Arch. management: Enable data cache and manage stack memory
         */
        psci_do_pwrup_cache_maintenance();
+#endif
 
        /*
         * All the platform specific actions for turning this cpu

...
@@ -39,6 +39,7 @@
+#include <spinlock.h>
 
 #if HW_ASSISTED_COHERENCY
 /*
  * On systems with hardware-assisted coherency, make PSCI cache operations NOP,
  * as PSCI participants are cache-coherent, and there's no need for explicit

@@ -49,7 +50,21 @@
 #define psci_inv_cpu_data(member)
 
 #define psci_dsbish()
 
+/*
+ * On systems where participant CPUs are cache-coherent, we can use spinlocks
+ * instead of bakery locks.
+ */
+#define DEFINE_PSCI_LOCK(_name)         spinlock_t _name
+#define DECLARE_PSCI_LOCK(_name)        extern DEFINE_PSCI_LOCK(_name)
+
+#define psci_lock_get(non_cpu_pd_node)                          \
+        spin_lock(&psci_locks[(non_cpu_pd_node)->lock_index])
+#define psci_lock_release(non_cpu_pd_node)                      \
+        spin_unlock(&psci_locks[(non_cpu_pd_node)->lock_index])
+
 #else
 
 /*
  * If not all PSCI participants are cache-coherent, perform cache maintenance
  * and issue barriers wherever required to coordinate state.

@@ -59,19 +74,24 @@
 #define psci_inv_cpu_data(member)       inv_cpu_data(member)
 
 #define psci_dsbish()                   dsbish()
 
-#endif
-
 /*
- * The following helper macros abstract the interface to the Bakery
- * Lock API.
+ * Use bakery locks for state coordination as not all PSCI participants are
+ * cache coherent.
  */
-#define psci_lock_init(non_cpu_pd_node, idx)                    \
-        ((non_cpu_pd_node)[(idx)].lock_index = (idx))
+#define DEFINE_PSCI_LOCK(_name)         DEFINE_BAKERY_LOCK(_name)
+#define DECLARE_PSCI_LOCK(_name)        DECLARE_BAKERY_LOCK(_name)
+
 #define psci_lock_get(non_cpu_pd_node)                          \
         bakery_lock_get(&psci_locks[(non_cpu_pd_node)->lock_index])
 #define psci_lock_release(non_cpu_pd_node)                      \
         bakery_lock_release(&psci_locks[(non_cpu_pd_node)->lock_index])
 
+#endif
+
+#define psci_lock_init(non_cpu_pd_node, idx)                    \
+        ((non_cpu_pd_node)[(idx)].lock_index = (idx))
+
 /*
  * The PSCI capability which are provided by the generic code but does not
  * depend on the platform or spd capabilities.

@@ -189,8 +209,8 @@ extern non_cpu_pd_node_t psci_non_cpu_pd_nodes[PSCI_NUM_NON_CPU_PWR_DOMAINS];
 extern cpu_pd_node_t psci_cpu_pd_nodes[PLATFORM_CORE_COUNT];
 extern unsigned int psci_caps;
 
-/* One bakery lock is required for each non-cpu power domain */
-DECLARE_BAKERY_LOCK(psci_locks[PSCI_NUM_NON_CPU_PWR_DOMAINS]);
+/* One lock is required per non-CPU power domain node */
+DECLARE_PSCI_LOCK(psci_locks[PSCI_NUM_NON_CPU_PWR_DOMAINS]);
 
 /*******************************************************************************
  * SPD's power management hooks registered with PSCI

@@ -227,6 +247,14 @@ void psci_set_pwr_domains_to_run(unsigned int end_pwrlvl);
 void psci_print_power_domain_map(void);
 unsigned int psci_is_last_on_cpu(void);
 int psci_spd_migrate_info(u_register_t *mpidr);
+void psci_do_pwrdown_sequence(unsigned int power_level);
+
+/*
+ * CPU power down is directly called only when HW_ASSISTED_COHERENCY is
+ * available. Otherwise, this needs post-call stack maintenance, which is
+ * handled in assembly.
+ */
+void prepare_cpu_pwr_dwn(unsigned int power_level);
 
 /* Private exported functions from psci_on.c */
 int psci_cpu_on_start(u_register_t target_cpu,

...
@@ -121,13 +121,11 @@ static void psci_suspend_to_pwrdown_start(unsigned int end_pwrlvl,
 #endif
 
        /*
-        * Arch. management. Perform the necessary steps to flush all
-        * cpu caches. Currently we assume that the power level correspond
-        * the cache level.
+        * Arch. management. Initiate power down sequence.
         * TODO : Introduce a mechanism to query the cache level to flush
         * and the cpu-ops power down to perform from the platform.
         */
-       psci_do_pwrdown_cache_maintenance(max_off_lvl);
+       psci_do_pwrdown_sequence(max_off_lvl);
 
 #if ENABLE_RUNTIME_INSTRUMENTATION
        PMF_CAPTURE_TIMESTAMP(rt_instr_svc,

@@ -304,12 +302,10 @@ void psci_cpu_suspend_finish(unsigned int cpu_idx,
         */
        psci_plat_pm_ops->pwr_domain_suspend_finish(state_info);
 
-       /*
-        * Arch. management: Enable the data cache, manage stack memory and
-        * restore the stashed EL3 architectural context from the 'cpu_context'
-        * structure for this cpu.
-        */
+#if !HW_ASSISTED_COHERENCY
+       /* Arch. management: Enable the data cache, stack memory maintenance. */
        psci_do_pwrup_cache_maintenance();
+#endif
 
        /* Re-init the cntfrq_el0 register */
        counter_freq = plat_get_syscnt_freq2();