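In Linux, suspend consists of three main steps: (1) freeze user-space processes and kernel tasks; (2) call the suspend callbacks of the registered devices; (3) suspend the core devices in registration order and put the CPU into a sleep state.
      Freezing processes means the kernel marks every process in the process list as stopped and saves each process's context. When the processes are thawed they do not know they were ever frozen; they simply continue executing. How do you make Linux suspend? A user can control this by reading and writing the sysfs file /sys/power/state. For example,
# echo standby > /sys/power/state
tells the system to suspend, and
# cat /sys/power/state
lists the sleep states the kernel supports.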
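
Before writing a state name it can be useful to check that the kernel actually lists it. The sketch below is a user-space helper, not kernel code: it assumes the text read from /sys/power/state is a whitespace-separated list of state names, and the function name state_supported is made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `wanted` appears as a whole word in `list`, where `list`
 * is whitespace-separated text such as "standby mem disk\n".
 * state_supported() is a hypothetical helper, not a kernel or libc API.
 */
bool state_supported(const char *list, const char *wanted)
{
    assert(list != NULL && wanted != NULL);

    size_t wlen = strlen(wanted);
    const char *p = list;

    if (wlen == 0)
        return false;

    while (*p != '\0') {
        /* Skip separators in front of the next word. */
        p += strspn(p, " \t\n");
        /* Measure the word and compare it as a whole. */
        size_t len = strcspn(p, " \t\n");
        if (len == wlen && strncmp(p, wanted, wlen) == 0)
            return true;
        p += len;
    }
    return false;
}
```

A caller would read the file into a buffer first and pass the buffer as `list`; matching whole words avoids treating "me" as a match for "mem".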

      The Linux suspend flow. Paths of the relevant files:
linux_source/kernel/power/main.c
linux_source/arch/xxx/mach-xxx/pm.c

linux_source/drivers/base/power/main.c
(1) Next, let's look in detail at how Linux suspends and resumes.

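      A write to /sys/power/state ends up in state_store() in main.c. The user may write any of the strings defined in const char * const pm_states[], such as "mem" or "standby". state_store() then calls enter_state(), which first checks some state parameters and then syncs the filesystems.
(2) Prepare and freeze processes.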

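      Once in suspend_prepare(), the kernel allocates a virtual terminal for suspend output, broadcasts a notification that the system is about to suspend, disables the user-mode helper, and then calls suspend_freeze_processes() to freeze all processes, saving each one's current state. Some processes may refuse to freeze; if so, the freeze fails, the function gives up, and every process it had already frozen is thawed again.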
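
The "freeze everything, or undo and fail" behaviour described above can be sketched in user space. This is a simplified model only: struct task, freeze_tasks and thaw_tasks are hypothetical names, and the real freezer works asynchronously on struct task_struct rather than on a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; the kernel uses struct task_struct. */
struct task {
    const char *name;
    bool can_freeze;   /* false models a task that refuses to freeze */
    bool frozen;
};

/* Thaw the first `n` tasks of the array. */
void thaw_tasks(struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        tasks[i].frozen = false;
}

/*
 * Try to freeze every task. On the first refusal, thaw everything frozen
 * so far and return -1, so the caller never sees a half-frozen system.
 */
int freeze_tasks(struct task *tasks, size_t n)
{
    assert(tasks != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!tasks[i].can_freeze) {
            thaw_tasks(tasks, i);
            return -1;
        }
        tasks[i].frozen = true;
    }
    return 0;
}
```

The point of the rollback is that a failed suspend attempt leaves the system exactly as it was before the attempt.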

(3) Suspend the peripherals.

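      At this point all processes (including workqueues and kthreads) are stopped. A kernel task may have been holding a semaphore when it stopped, so trying to take that semaphore in a peripheral's suspend() can deadlock. Be very careful with lock/unlock in a driver's suspend() function; the recommendation is to design suspend() so that it never waits on a lock.
      Finally suspend_devices_and_enter() is called to suspend all peripherals. In this function, if the platform has registered suspend_ops (usually defined and registered in the board-level code), suspend_ops->begin() is called first. Then device_suspend()->dpm_suspend() in drivers/base/power/main.c is called, which invokes each driver's suspend() callback in turn to put all devices to sleep. Once all devices are suspended, suspend_ops->prepare() is called; it usually does preparatory work to get the board ready for sleep. Next, on a multi-core system the non-boot CPUs are taken offline; according to the code comments this avoids race conditions caused by the other CPUs. From here on only one CPU is running.
      suspend_ops holds the board-level power-management operations and is usually registered in arch/xxx/mach-xxx/pm.c. Next, suspend_enter() is called. It disables arch IRQs and calls device_power_down(), which invokes the suspend_late() callbacks; these are the last callbacks before the system really goes to sleep and are usually where final checks are done. If the checks pass, all system devices and buses are suspended and suspend_ops->enter() is called to put the CPU into its low-power state. At this point the system is asleep and code execution stops here.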

(4) Resume.

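      If an interrupt or other event wakes the system during suspend, the code continues from where it stopped. Wakeup runs in the reverse order of suspend: system devices and buses are resumed first, then system interrupts are enabled, the non-boot CPUs that were taken offline are brought back, and suspend_ops->finish() is called. suspend_devices_and_enter() then resumes each device, re-enables the virtual terminal, and finally calls suspend_ops->end(). Control returns to enter_state(): when suspend_devices_and_enter() returns, the peripherals are awake but processes and tasks are still frozen, so suspend_finish() is called to thaw them, send the notification that the system has left suspend, and restore the terminal. At this point the whole suspend/resume cycle is complete and the system continues running.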
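
The "resume in the reverse order of suspend" rule can be illustrated with a small user-space model. All names here (device_model, pm_log, suspend_devices, resume_devices) are hypothetical; real drivers are called through the driver core, not through an array like this.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOG 32

/* Hypothetical device record; ids must be positive so the log is readable. */
struct device_model {
    int id;
    int fail_suspend;   /* nonzero: this device's suspend reports an error */
};

/* Event log: +id means "suspended", -id means "resumed". */
struct pm_log {
    int entries[MAX_LOG];
    size_t count;
};

static void log_event(struct pm_log *log, int value)
{
    if (log->count < MAX_LOG)
        log->entries[log->count++] = value;
}

/* Resume devices [0, n) in reverse order. */
void resume_devices(const struct device_model *devs, size_t n,
                    struct pm_log *log)
{
    while (n-- > 0)
        log_event(log, -devs[n].id);
}

/*
 * Suspend devices in array order. If one fails, resume the ones already
 * suspended (in reverse order) and return -1.
 */
int suspend_devices(const struct device_model *devs, size_t n,
                    struct pm_log *log)
{
    for (size_t i = 0; i < n; i++) {
        assert(devs[i].id > 0);
        if (devs[i].fail_suspend) {
            resume_devices(devs, i, log);
            return -1;
        }
        log_event(log, devs[i].id);
    }
    return 0;
}
```

Reverse order matters because a device suspended late may depend on one suspended earlier (for example a bus), so the dependency has to come back first.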



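When the system is not suspended and the user presses the Power key, an event is generated on the /dev/input/event0 node. The upper-layer WindowManager sees the change on this node, learns that the system should go to sleep, and notifies PowerManagerService, which makes the following call: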

private int setScreenStateLocked(boolean on) {
        int err = Power.setScreenState(on);

setScreenState eventually calls:

enum {
    ACQUIRE_PARTIAL_WAKE_LOCK = 0,
    RELEASE_WAKE_LOCK,
    REQUEST_STATE,
    OUR_FD_COUNT
};

const char * const OLD_PATHS[] = {
    "/sys/android_power/acquire_partial_wake_lock",
    "/sys/android_power/release_wake_lock",
    "/sys/android_power/request_state"
};

const char * const NEW_PATHS[] = {
    "/sys/power/wake_lock",
    "/sys/power/wake_unlock",
    "/sys/power/state"
};

int
set_screen_state(int on)
{
    QEMU_FALLBACK(set_screen_state(on));

    LOGI("*** set_screen_state %d", on);

    initialize_fds();

    //LOGI("go_to_sleep eventTime=%lld now=%lld g_error=%s\n", eventTime,
      //      systemTime(), strerror(g_error));

    if (g_error) return g_error;

    char buf[32];
    int len;
    if(on)
        len = sprintf(buf, on_state);
    else
        len = sprintf(buf, off_state);
    len = write(g_fds[REQUEST_STATE], buf, len);
    if(len < 0) {
        LOGE("Failed setting last user activity: g_error=%d\n", g_error);
    }
    return 0;
}

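Here "on" or "mem" is written to the /sys/power/state node; the same path is taken when a user runs # echo standby > /sys/power/state directly. The kernel then calls state_store in ./kernel/power/main.c (I have not traced exactly how it gets there), which contains the following code: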

#ifdef CONFIG_EARLYSUSPEND
  if (state == PM_SUSPEND_ON || valid_state(state)) {
   error = 0;
   request_suspend_state(state);
  }
#else
  error = enter_state(state);
#endif

As you can see, if CONFIG_EARLYSUSPEND is defined, request_suspend_state in ./kernel/power/earlysuspend.c is called. Its key code is:

void request_suspend_state(suspend_state_t new_state)
{
 unsigned long irqflags;
 int old_sleep;

 spin_lock_irqsave(&state_lock, irqflags);
 old_sleep = state & SUSPEND_REQUESTED;
        ... ...
 if (!old_sleep && new_state != PM_SUSPEND_ON) {
  state |= SUSPEND_REQUESTED;
  queue_work(suspend_work_queue, &early_suspend_work);
 } else if (old_sleep && new_state == PM_SUSPEND_ON) {
  state &= ~SUSPEND_REQUESTED;
  wake_lock(&main_wake_lock);
  queue_work(suspend_work_queue, &late_resume_work);
 }
 requested_suspend_state = new_state;
 spin_unlock_irqrestore(&state_lock, irqflags);
}
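
The decision made in request_suspend_state() above can be modeled in user space as a small state machine. This is a sketch under assumptions: the bit values 0x1 and 0x2 are chosen for illustration because the excerpt does not show them, the spinlock is omitted, and request_state and the queued_work enum are hypothetical names standing in for the queue_work() calls.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values; the excerpt above does not show the real ones. */
enum { SUSPEND_REQUESTED = 0x1, SUSPENDED = 0x2 };

/* What the model would hand to the work queue. */
enum queued_work { WORK_NONE, WORK_EARLY_SUSPEND, WORK_LATE_RESUME };

struct suspend_model {
    int state;          /* bitmask of SUSPEND_REQUESTED | SUSPENDED */
    int main_wake_lock; /* 1 while the main wake lock is held */
};

/*
 * Model of request_suspend_state(): `want_on` nonzero corresponds to
 * new_state == PM_SUSPEND_ON. Returns the work item that would be queued.
 */
enum queued_work request_state(struct suspend_model *m, int want_on)
{
    assert(m != NULL);

    int old_sleep = m->state & SUSPEND_REQUESTED;

    if (!old_sleep && !want_on) {
        m->state |= SUSPEND_REQUESTED;
        return WORK_EARLY_SUSPEND;
    }
    if (old_sleep && want_on) {
        m->state &= ~SUSPEND_REQUESTED;
        m->main_wake_lock = 1;   /* taken before late resume is queued */
        return WORK_LATE_RESUME;
    }
    return WORK_NONE;            /* repeated request: nothing new to queue */
}
```

Only a change of direction queues work: asking for suspend twice in a row, or for "on" while already on, falls through without queuing anything.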

Depending on new_state, a different work item is added to the work queue. Consider the suspend case; early_suspend mainly does the following:

suspend_state_t requested_suspend_state = PM_SUSPEND_MEM;

static void early_suspend(struct work_struct *work)
{
 struct early_suspend *pos;
 unsigned long irqflags;
 int abort = 0;

 mutex_lock(&early_suspend_lock);
 spin_lock_irqsave(&state_lock, irqflags);
 if (state == SUSPEND_REQUESTED)
  state |= SUSPENDED;
 else
  abort = 1;
 spin_unlock_irqrestore(&state_lock, irqflags);

 if (abort) {
  if (debug_mask & DEBUG_SUSPEND)
   pr_info("early_suspend: abort, state %d\n", state);
  mutex_unlock(&early_suspend_lock);
  goto abort;
 }

 if (debug_mask & DEBUG_SUSPEND)
  pr_info("early_suspend: call handlers\n");
 list_for_each_entry(pos, &early_suspend_handlers, link) {
  if (pos->suspend != NULL)
   pos->suspend(pos);
 }

 mutex_unlock(&early_suspend_lock);

 if (debug_mask & DEBUG_SUSPEND)
  pr_info("early_suspend: sync\n");

 sys_sync();

abort:
 spin_lock_irqsave(&state_lock, irqflags);
 if (state == SUSPEND_REQUESTED_AND_SUSPENDED)
  wake_unlock(&main_wake_lock);

 spin_unlock_irqrestore(&state_lock, irqflags);
}

This calls the previously registered early_suspend handlers, syncs the filesystems, and releases main_wake_lock. Releasing main_wake_lock leads to the following:

static void suspend(struct work_struct *work)
{
 int ret;
 int entry_event_num;

 if (has_wake_lock(WAKE_LOCK_SUSPEND)) {
  if (debug_mask & DEBUG_SUSPEND)
   pr_info("suspend: abort suspend\n");
  return;
 }

 entry_event_num = current_event_num;
 sys_sync();
 if (debug_mask & DEBUG_SUSPEND)
  pr_info("suspend: enter suspend\n");
 ret = pm_suspend(requested_suspend_state);
 if (debug_mask & DEBUG_EXIT_SUSPEND) {
  struct timespec ts;
  struct rtc_time tm;
  getnstimeofday(&ts);
  rtc_time_to_tm(ts.tv_sec, &tm);
  pr_info("suspend: exit suspend, ret = %d "
   "(%d-%02d-%02d %02d:%02d:%02d.%09lu UTC)\n", ret,
   tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
   tm.tm_hour, tm.tm_min, tm.tm_sec, ts.tv_nsec);
 }
 if (current_event_num == entry_event_num) {
  if (debug_mask & DEBUG_SUSPEND)
   pr_info("suspend: pm_suspend returned with no event\n");
  wake_lock_timeout(&unknown_wakeup, HZ / 2);
 }
}
static DECLARE_WORK(suspend_work, suspend);

void wake_unlock(struct wake_lock *lock)
{
 int type;
 unsigned long irqflags;
 spin_lock_irqsave(&list_lock, irqflags);
 type = lock->flags & WAKE_LOCK_TYPE_MASK;
#ifdef CONFIG_WAKELOCK_STAT
 wake_unlock_stat_locked(lock, 0);
#endif
 if (debug_mask & DEBUG_WAKE_LOCK)
  pr_info("wake_unlock: %s\n", lock->name);
 lock->flags &= ~(WAKE_LOCK_ACTIVE | WAKE_LOCK_AUTO_EXPIRE);
 list_del(&lock->link);
 list_add(&lock->link, &inactive_locks);
 if (type == WAKE_LOCK_SUSPEND) {
  long has_lock = has_wake_lock_locked(type);
  if (has_lock > 0) {
   if (debug_mask & DEBUG_EXPIRE)
    pr_info("wake_unlock: %s, start expire timer, "
     "%ld\n", lock->name, has_lock);
   mod_timer(&expire_timer, jiffies + has_lock);
  } else {
   if (del_timer(&expire_timer))
    if (debug_mask & DEBUG_EXPIRE)
     pr_info("wake_unlock: %s, stop expire "
      "timer\n", lock->name);
   if (has_lock == 0){
    queue_work(suspend_work_queue, &suspend_work);
   }
  }
  if (lock == &main_wake_lock) {
   if (debug_mask & DEBUG_SUSPEND)
    print_active_locks(WAKE_LOCK_SUSPEND);
#ifdef CONFIG_WAKELOCK_STAT
   update_sleep_wait_stats_locked(0);
#endif
  }
 }
 spin_unlock_irqrestore(&list_lock, irqflags);
}
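
The interaction between wake_unlock() and the suspend() work function above can be reduced to a user-space model. It is deliberately simplified: a counter replaces the kernel's list of active locks, timeouts are ignored, and all names (wakelock_model, model_wake_lock, model_wake_unlock, model_run_suspend_work) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a counter stands in for the list of active locks. */
struct wakelock_model {
    int active_locks;     /* suspend-type wake locks currently held */
    bool suspend_queued;  /* true once the suspend work has been queued */
};

void model_wake_lock(struct wakelock_model *m)
{
    assert(m != NULL);
    m->active_locks++;
}

/* Release one lock; queue the suspend work when the last one goes away. */
void model_wake_unlock(struct wakelock_model *m)
{
    assert(m != NULL && m->active_locks > 0);
    if (--m->active_locks == 0)
        m->suspend_queued = true;
}

/*
 * Run the queued work. Returns true if the model would go on to call
 * pm_suspend(), false if nothing was queued or a lock was taken in the
 * meantime (the "abort suspend" branch).
 */
bool model_run_suspend_work(struct wakelock_model *m)
{
    assert(m != NULL);
    if (!m->suspend_queued)
        return false;
    m->suspend_queued = false;
    return m->active_locks == 0;
}
```

The second check inside the work function is what makes the scheme safe: a lock taken between the release and the moment the work runs still prevents suspend.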

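When main_wake_lock is released, wake_unlock checks whether any lock of this type is still held; if none is, it puts suspend_work on the work queue. The suspend function then calls the normal suspend entry point: ret = pm_suspend(requested_suspend_state);
The functions on the pm_suspend path are: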

/**
 * suspend_enter - enter the desired system sleep state.
 * @state:  state to enter
 *
 * This function should be called after devices have been suspended.
 */
static int suspend_enter(suspend_state_t state)
{
 int error;

 if (suspend_ops->prepare) {
  error = suspend_ops->prepare();
  if (error)
   return error;
 }

 error = dpm_suspend_noirq(PMSG_SUSPEND);
 if (error) {
  printk(KERN_ERR "PM: Some devices failed to power down\n");
  goto Platfrom_finish;
 }

 if (suspend_ops->prepare_late) {
  error = suspend_ops->prepare_late();
  if (error)
   goto Power_up_devices;
 }

 if (suspend_test(TEST_PLATFORM))
  goto Platform_wake;

 error = disable_nonboot_cpus();
 if (error || suspend_test(TEST_CPUS))
  goto Enable_cpus;

 arch_suspend_disable_irqs();
 BUG_ON(!irqs_disabled());

 error = sysdev_suspend(PMSG_SUSPEND);
 if (!error) {
  if (!suspend_test(TEST_CORE))
   error = suspend_ops->enter(state);
  sysdev_resume();
 }

 arch_suspend_enable_irqs();
 BUG_ON(irqs_disabled());

 Enable_cpus:
 enable_nonboot_cpus();

 Platform_wake:
 if (suspend_ops->wake)
  suspend_ops->wake();

 Power_up_devices:
 dpm_resume_noirq(PMSG_RESUME);

 Platfrom_finish:
 if (suspend_ops->finish)
  suspend_ops->finish();

 return error;
}
/**
 * suspend_prepare - Do prep work before entering low-power state.
 *
 * This is common code that is called for each state that we're entering.
 * Run suspend notifiers, allocate a console and stop all processes.
 */
static int suspend_prepare(void)
{
        ... ....
 if (!suspend_ops || !suspend_ops->enter)
  return -EPERM;
        ... ....
 error = pm_notifier_call_chain(PM_SUSPEND_PREPARE);
 if (error)
  goto Finish;

 error = usermodehelper_disable();
 if (error)
  goto Finish;

 error = suspend_freeze_processes();
 if (!error)
  return 0;

 suspend_thaw_processes();
 usermodehelper_enable();
 Finish:
 pm_notifier_call_chain(PM_POST_SUSPEND);
 pm_restore_console();
 return error;
}
/**
 * suspend_devices_and_enter - suspend devices and enter the desired system
 *        sleep state.
 * @state:    state to enter
 */
int suspend_devices_and_enter(suspend_state_t state)
{
 int error;
 if (!suspend_ops)
  return -ENOSYS;

 if (suspend_ops->begin) {
  error = suspend_ops->begin(state);
  if (error)
   goto Close;
 }
 //suspend_console();
 suspend_test_start();
 error = dpm_suspend_start(PMSG_SUSPEND);
 if (error) {
  printk(KERN_ERR "PM: Some devices failed to suspend\n");
  goto Recover_platform;
 }
 suspend_test_finish("suspend devices");
 if (suspend_test(TEST_DEVICES)){
  goto Recover_platform;
 }
 suspend_enter(state);

 Resume_devices:
 suspend_test_start();
 dpm_resume_end(PMSG_RESUME);
 suspend_test_finish("resume devices");
 resume_console();
 Close:
 if (suspend_ops->end){
  suspend_ops->end();
 }
 return error;

 Recover_platform:
 if (suspend_ops->recover){
  suspend_ops->recover();
 }
 goto Resume_devices;
}

/**
 * suspend_finish - Do final work before exiting suspend sequence.
 *
 * Call platform code to clean up, restart processes, and free the
 * console that we've allocated. This is not called for suspend-to-disk.
 */
static void suspend_finish(void)
{
 suspend_thaw_processes();
 usermodehelper_enable();
 pm_notifier_call_chain(PM_POST_SUSPEND);
 pm_restore_console();
}

/**
 * enter_state - Do common work of entering low-power state.
 * @state:  pm_state structure for state we're entering.
 *
 * Make sure we're the only ones trying to enter a sleep state. Fail
 * if someone has beat us to it, since we don't want anything weird to
 * happen when we wake up.
 * Then, do the setup for suspend, enter the state, and cleaup (after
 * we've woken up).
 */
int enter_state(suspend_state_t state)
{
 int error;
 if (!valid_state(state))
  return -ENODEV;

 if (!mutex_trylock(&pm_mutex))
  return -EBUSY;

 printk(KERN_INFO "PM: Syncing filesystems ... 1");
 sys_sync();
 printk("done.\n");

 pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
 error = suspend_prepare();
 if (error)
  goto Unlock;

 if (suspend_test(TEST_FREEZER))
  goto Finish;

 pr_debug("PM: Entering %s sleep\n", pm_states[state]);
 error = suspend_devices_and_enter(state);

 Finish:
 pr_debug("PM: Finishing wakeup.\n");
 suspend_finish();
 Unlock:
 mutex_unlock(&pm_mutex);
 return error;
}

/**
 * pm_suspend - Externally visible function for suspending system.
 * @state:  Enumerated value of state to enter.
 *
 * Determine whether or not value is within range, get state
 * structure, and enter (above).
 */
int pm_suspend(suspend_state_t state)
{
 if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX)
  return enter_state(state);
 return -EINVAL;
}
pm_suspend -> enter_state (from here on identical to the standard Linux flow) -> suspend_prepare/suspend_devices_and_enter/suspend_finish
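
The control flow of enter_state() (try the lock, prepare, enter, finish, unlock) uses the common goto-cleanup pattern. The user-space sketch below models only that shape; struct pm_model and model_enter_state are hypothetical, and errors are injected through fields instead of coming from real prepare or enter steps.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state enter_state() works with. */
struct pm_model {
    bool busy;          /* models pm_mutex already being held */
    int prepare_error;  /* error injected into the prepare step */
    int enter_error;    /* error injected into the enter step */
    int finish_calls;   /* how often the finish step ran */
};

int model_enter_state(struct pm_model *m)
{
    int error;

    assert(m != NULL);

    /* Like mutex_trylock(): refuse if a suspend is already in progress. */
    if (m->busy)
        return -EBUSY;
    m->busy = true;

    error = m->prepare_error;
    if (error)
        goto unlock;          /* prepare undid its own work; skip finish */

    error = m->enter_error;   /* the "suspend devices and enter" step */

    m->finish_calls++;        /* finish runs whenever prepare succeeded */
unlock:
    m->busy = false;
    return error;
}
```

Each label undoes exactly the steps that completed before the failure, which is why a failed prepare jumps past the finish step while a failed enter does not.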

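suspend_prepare notifies the upper layers that the PM_SUSPEND_PREPARE phase has begun so they can do their own handling, and then freezes all user-space applications and service processes.

In suspend_devices_and_enter, the devices (drivers) are suspended and then the CPU goes to sleep. Execution stops inside suspend_ops->enter(), which suspend_enter calls after arch_suspend_disable_irqs() and sysdev_suspend().
On a wakeup event such as an incoming call, the CPU powers up, interrupts are enabled, the non-boot CPUs are brought back, the drivers are resumed, and the processes are thawed.
suspend_finish likewise sends the PM_POST_SUSPEND notification to the upper layers. On receiving it, the upper layer calls NvddkAudioFxSuspend(NV_FALSE); I did not trace what happens after that...
As with entering suspend, a key event is written to /dev/input/event0. Once the upper-layer application is running again it checks this event source, sees the key press, and turns the screen on. As in the suspend path, the lowest user-space layer writes "on" to /sys/power/state, and the kernel calls state_store -> request_suspend_state, which adds a resume work item to the work queue; that work calls the registered late_resume functions. Note that early_suspend and late_resume are both patches added by Android!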
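
To show how registered early_suspend and late_resume handlers relate to each other, here is a user-space model in which handlers are kept sorted by a level number, run in ascending order on suspend and in descending order on resume. That ordering, the level numbers, and every name in the block (registry, register_handler, suspend_order, resume_order) are assumptions made for illustration, not the Android patch's code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 8

/* Hypothetical handler record; only the ordering key is modeled. */
struct handler_model {
    int level;
};

struct registry {
    struct handler_model items[MAX_HANDLERS];
    size_t count;
};

/* Insert a handler, keeping ascending level order. Returns -1 if full. */
int register_handler(struct registry *r, int level)
{
    assert(r != NULL);
    if (r->count == MAX_HANDLERS)
        return -1;

    size_t i = r->count;
    while (i > 0 && r->items[i - 1].level > level) {
        r->items[i] = r->items[i - 1];   /* shift larger levels up */
        i--;
    }
    r->items[i].level = level;
    r->count++;
    return 0;
}

/* Write the levels in suspend call order (ascending) into `out`. */
size_t suspend_order(const struct registry *r, int *out)
{
    for (size_t i = 0; i < r->count; i++)
        out[i] = r->items[i].level;
    return r->count;
}

/* Write the levels in resume call order (descending) into `out`. */
size_t resume_order(const struct registry *r, int *out)
{
    for (size_t i = 0; i < r->count; i++)
        out[i] = r->items[r->count - 1 - i].level;
    return r->count;
}
```

With handlers registered at levels 100, 50 and 150, the model suspends in the order 50, 100, 150 and resumes in the order 150, 100, 50, regardless of registration order.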
10-29 11:52