• Case Study: An MTK system_server Hang That Reboots the Phone



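    This article covers the following topics:

    I. The MTK AEE log analysis tool
    II. The AEE log analysis workflow
    III. Analyzing and fixing a system_server hang

    It focuses on Exception Type: system_server_watchdog, that is, how to analyze a system_server hang and how to fix it.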

    I. The MTK AEE log analysis tool

    How to obtain the MTK AEE log tool:

    Reply "aee" to the 程序员Android official account to get the tool that parses reboot db logs.

    II. The AEE log analysis workflow

    1. Parse the dbg file with the AEE tool.

    [Figure: parsing db.fatal.02.SWT.dbg with the AEE tool]

    [Figure: files produced by parsing the AEE log]

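    2. Analyze the parsed files, starting with exp_main

    The exp_main file records the log output from the moment the reboot occurred.

    Part of the abnormal-reboot log is shown below: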

    $** *** *** *** *** *** *** *** Fatal *** *** *** *** *** *** *** **$
    Build Info: 'alps-mp-o1.mp7:alps-mp-o1.mp7:mt6765:S01,ACE/AS0618/AS0618:8.1.0/O11019/1548123508:user/release-keys'
    Flavor Info: 'None'
    Exception Log Time:[Thu Mar 14 14:00:03 CST 2019] [38684.729626]
    Exception Class: SWT
    Exception Type: system_server_watchdog
    Current Executing Process:
    system_server
    Trigger time:[2019-03-14 14:00:03.711844] pid:1029
    Backtrace:
    Process: system_server
    Subject: Blocked in handler on ActivityManager (ActivityManager)
    Build: ACE/AS0618/AS0618:8.1.0/O11019/1548123508:user/release-keys

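    3. Reading the exp_main file

    The summary at the top of the log shows when the reboot occurred, the exception type, the process that triggered the reboot and its PID, and where the system was blocked.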
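The summary fields can also be pulled out of exp_main mechanically. The following is a minimal Java sketch, not part of the AEE tool: the class name `ExpMainSummary` and the regular expressions are assumptions based only on the log excerpt shown earlier, and real exp_main files may be laid out differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the headline fields from exp_main text.
class ExpMainSummary {
    // "Key: value" lines of interest. The anchor at line start means
    // "Current Executing Process:" (no value on the same line) is skipped.
    private static final Pattern FIELD = Pattern.compile(
            "^\\s*(Exception Class|Exception Type|Process|Subject):\\s*(\\S.*)$");
    // Example: Trigger time:[2019-03-14 14:00:03.711844] pid:1029
    private static final Pattern TRIGGER = Pattern.compile(
            "Trigger time:\\[([^\\]]+)\\]\\s*pid:(\\d+)");

    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            Matcher f = FIELD.matcher(line);
            if (f.find()) {
                // Keep the first occurrence of each field.
                fields.putIfAbsent(f.group(1), f.group(2).trim());
                continue;
            }
            Matcher t = TRIGGER.matcher(line);
            if (t.find()) {
                fields.putIfAbsent("Trigger time", t.group(1));
                fields.putIfAbsent("pid", t.group(2));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "Exception Class: SWT",
                "Exception Type: system_server_watchdog",
                "Current Executing Process:",
                "system_server",
                "Trigger time:[2019-03-14 14:00:03.711844] pid:1029",
                "Process: system_server",
                "Subject: Blocked in handler on ActivityManager (ActivityManager)");
        parse(sample).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The `pid` value is the one to search for in the trace section, and `Subject` names the handler thread to look up there.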

    Next, combine exp_main with the trace to analyze the reboot.
    The analysis proceeds as follows:

    // 1. Reboot trigger time and PID
    Trigger time:[2019-03-14 14:00:03.711844] pid:1029
    // 2. The blocked process
    Backtrace:
    Process: system_server
    Subject: Blocked in handler on ActivityManager (ActivityManager)
    // 3. Look up the trace for that PID
    ----- pid 1029 at 2019-03-14 13:59:58 -----
    Cmd line: system_server
    ... ...
    // 4. Use the backtrace subject to find the blocked thread
    "ActivityManager" prio=5 tid=11 Blocked
    ... ...
    // 5. tid=11 is waiting for tid=106 to release a lock
    - waiting to lock <0x090691f3> (a android.util.ArrayMap) held by thread 106
    ... ...
    // 6. Check what tid=106 is holding
    "backup" prio=5 tid=106 Waiting
    ... ...
    at java.lang.Object.wait(Native method)
    - waiting on <0x06a44c62> (a com.android.server.am.ContentProviderRecord)
    // 7. Where the hang occurs
    at com.android.server.am.ActivityManagerService.getContentProviderImpl(ActivityManagerService.java:12127)
    - locked <0x06a44c62> (a com.android.server.am.ContentProviderRecord)
    ... ...
    // 8. tid=107 is also blocked by tid=106, which makes the problem worse
    "Binder:1029_8" prio=5 tid=107 Blocked
    at com.android.server.notification.RankingHelper.getRecord(RankingHelper.java:258)
    - waiting to lock <0x090691f3> (a android.util.ArrayMap) held by thread 106
    $** *** *** *** *** *** *** *** Fatal *** *** *** *** *** *** *** **$

    [Figure: screenshot of the overall log analysis process]
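The manual steps in the trace analysis (find a Blocked thread, read its "held by thread N" line, jump to thread N) amount to following edges in a wait-for graph. Here is a small Java sketch of that idea, assuming the trace line formats shown in the excerpt; `LockChain` and its methods are hypothetical helpers, not part of any MTK or Android tool.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: builds "blocked tid -> holder tid" edges from a trace.
class LockChain {
    // Example: "ActivityManager" prio=5 tid=11 Blocked
    private static final Pattern THREAD = Pattern.compile(
            "^\\s*\"[^\"]+\".*\\btid=(\\d+)\\b");
    // Example: - waiting to lock <0x090691f3> (...) held by thread 106
    private static final Pattern WAITING = Pattern.compile(
            "waiting to lock <0x[0-9a-fA-F]+>.*held by thread (\\d+)");

    /** Maps each blocked tid to the tid that holds the monitor it wants. */
    static Map<Integer, Integer> blockedBy(String trace) {
        Map<Integer, Integer> edges = new LinkedHashMap<>();
        int current = -1; // tid of the thread section being read
        for (String line : trace.split("\\R")) {
            Matcher t = THREAD.matcher(line);
            if (t.find()) {
                current = Integer.parseInt(t.group(1));
                continue;
            }
            Matcher w = WAITING.matcher(line);
            if (current != -1 && w.find()) {
                edges.put(current, Integer.parseInt(w.group(1)));
            }
        }
        return edges;
    }

    /** Follows edges from tid until a thread that is not blocked, or a cycle. */
    static int rootHolder(Map<Integer, Integer> edges, int tid) {
        Set<Integer> seen = new HashSet<>();
        int cur = tid;
        while (edges.containsKey(cur) && seen.add(cur)) {
            cur = edges.get(cur);
        }
        return cur;
    }

    public static void main(String[] args) {
        String trace = String.join("\n",
                "\"ActivityManager\" prio=5 tid=11 Blocked",
                "- waiting to lock <0x090691f3> (a android.util.ArrayMap) held by thread 106",
                "\"backup\" prio=5 tid=106 Waiting",
                "- waiting on <0x06a44c62> (a com.android.server.am.ContentProviderRecord)",
                "\"Binder:1029_8\" prio=5 tid=107 Blocked",
                "- waiting to lock <0x090691f3> (a android.util.ArrayMap) held by thread 106");
        Map<Integer, Integer> edges = blockedBy(trace);
        edges.forEach((blocked, holder) -> System.out.println(
                "tid " + blocked + " is blocked by tid " + holder
                + ", root holder tid " + rootHolder(edges, blocked)));
    }
}
```

The sketch only tracks "waiting to lock" edges. A thread in `Object.wait()` (the "waiting on" line for tid=106) is deliberately not treated as blocked on a holder, which is why the chain ends at that thread.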

    The complete log is available from the official account.

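    III. Analyzing and fixing a system_server hang

    Once the log has pinpointed the cause of the hang, the fix can be made.
    It requires modifying the ActivityManagerService class.

    1. The file to modify: alps/frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

    2. Approach

    Set a timeout on the wait for the provider. Once the timeout expires the lock must be released, so that a provider can no longer hold the lock long enough to trigger MTK's 60 s SWT reboot mechanism.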
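One generic way to bound a wait in plain Java is `Object.wait(long)` inside a deadline loop. This sketches that general technique and is not the patch applied in this article, which uses a delayed Handler message instead; the `BoundedWait` class and its members are hypothetical names.

```java
// Hypothetical illustration of a wait that gives up after a timeout.
class BoundedWait {
    private final Object lock = new Object();
    private boolean ready; // guarded by lock

    /** Returns true if markReady() was called, false on timeout or interrupt. */
    boolean awaitReady(long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        synchronized (lock) {
            while (!ready) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false; // never call wait(0): that would wait forever
                }
                try {
                    lock.wait(remainingMs); // releases the monitor while waiting
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
            return true;
        }
    }

    void markReady() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }

    public static void main(String[] args) {
        BoundedWait w = new BoundedWait();
        System.out.println("before markReady: " + w.awaitReady(100));
        w.markReady();
        System.out.println("after markReady: " + w.awaitReady(100));
    }
}
```

The loop re-checks the condition after every wake-up, so spurious wake-ups do not end the wait early, and the remaining time is recomputed so the total wait stays close to `timeoutMs`.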

    3. The diff:

    --- a/alps/frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
    +++ b/alps/frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
    @@ -545,7 +545,9 @@ public class ActivityManagerService extends IActivityManager.Stub
         // How long we wait for an attached process to publish its content providers
         // before we decide it must be hung.
         static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000;
    -
    +    // How long we wait for a provider to be notified before we decide it may be hung.
    +    static final int CONTENT_PROVIDER_WAIT_TIMEOUT = 20*1000;
    +
         // How long we wait for a launched process to attach to the activity manager
         // before we decide it's never going to come up for real, when the process was
         // started with a wrapper for instrumentation (such as Valgrind) because it
    @@ -1745,6 +1747,7 @@ public class ActivityManagerService extends IActivityManager.Stub
         static final int PUSH_TEMP_WHITELIST_UI_MSG = 68;
         static final int SERVICE_FOREGROUND_CRASH_MSG = 69;
         static final int DISPATCH_OOM_ADJ_OBSERVER_MSG = 70;
    +    static final int CONTENT_PROVIDER_WAIT_TIMEOUT_MSG = 71;
         static final int START_USER_SWITCH_FG_MSG = 712;
         static final int NOTIFY_VR_KEYGUARD_MSG = 74;
    @@ -2108,6 +2111,12 @@ public class ActivityManagerService extends IActivityManager.Stub
                     synchronized (ActivityManagerService.this) {
                         mActivityStarter.doPendingActivityLaunchesLocked(true);
                     }
    +            } break;
    +            case CONTENT_PROVIDER_WAIT_TIMEOUT_MSG: {
    +                ContentProviderRecord cpr = (ContentProviderRecord)msg.obj;
    +                synchronized (ActivityManagerService.this) {
    +                    processContentProviderWaitTimedOutLocked(cpr);
    +                }
                 } break;
                 case KILL_APPLICATION_MSG: {
                     synchronized (ActivityManagerService.this) {
    @@ -7029,7 +7038,31 @@ public class ActivityManagerService extends IActivityManager.Stub
             cleanupAppInLaunchingProvidersLocked(app, true);
             removeProcessLocked(app, false, true, "timeout publishing content providers");
         }
    +
    +    @GuardedBy("this")
    +    private final void processContentProviderWaitTimedOutLocked(ContentProviderRecord cpr) {
    +        try {
    +            if (mLaunchingProviders.contains(cpr)) {
    +                if (DEBUG_MU) Slog.v(TAG_MU,
    +                        "Remove from mLaunchingProviders, " + cpr
    +                        + " launchingApp=" + cpr.launchingApp);
    +                mLaunchingProviders.remove(cpr);
    +            }
    +            if (DEBUG_MU) Slog.v(TAG_MU,
    +                    "RemoveMessages CONTENT_PROVIDER_WAIT_TIMEOUT_MSG, " + cpr
    +                    + " launchingApp=" + cpr.launchingApp);
    +            mHandler.removeMessages(CONTENT_PROVIDER_WAIT_TIMEOUT_MSG, cpr);
    +            synchronized (cpr) {
    +                cpr.notifyAll();
    +                cpr.launchingApp = null;
    +            }
    +        } catch (Exception e) {
    +            if (DEBUG_MU) Slog.v(TAG_MU,
    +                    "processContentProviderWaitTimedOutLocked exception, " + e);
    +        }
    +    }
    +
         private final void processStartTimedOutLocked(ProcessRecord app) {
             final int pid = app.pid;
             boolean gone = false;
    @@ -12124,11 +12157,33 @@ public class ActivityManagerService extends IActivityManager.Stub
                         if (conn != null) {
                             conn.waiting = true;
                         }
    +                    // add a 20s wait timeout
    +                    if (!mHandler.hasMessages(CONTENT_PROVIDER_WAIT_TIMEOUT_MSG, cpr)) {
    +                        if (DEBUG_MU) Slog.v(TAG_MU,
    +                                "SendMessageDelayed CONTENT_PROVIDER_WAIT_TIMEOUT_MSG, " + cpr
    +                                + " launchingApp=" + cpr.launchingApp);
    +                        Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_WAIT_TIMEOUT_MSG);
    +                        msg.obj = cpr;
    +                        mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_WAIT_TIMEOUT);
    +                    } else {
    +                        if (DEBUG_MU) Slog.v(TAG_MU,
    +                                "There is another waiting to start provider " + cpr
    +                                + " launchingApp=" + cpr.launchingApp
    +                                + ", not send CONTENT_PROVIDER_WAIT_TIMEOUT_MSG again");
    +                    }
    +
                         cpr.wait();
                     } catch (InterruptedException ex) {
                     } finally {
                         if (conn != null) {
                             conn.waiting = false;
    +                    }
    +                    // remove the wait timeout message
    +                    if (mHandler.hasMessages(CONTENT_PROVIDER_WAIT_TIMEOUT_MSG, cpr)) {
    +                        if (DEBUG_MU) Slog.v(TAG_MU,
    +                                "After wait removeMessages CONTENT_PROVIDER_WAIT_TIMEOUT_MSG, "
    +                                + cpr + " launchingApp=" + cpr.launchingApp);
    +                        mHandler.removeMessages(CONTENT_PROVIDER_WAIT_TIMEOUT_MSG, cpr);
                         }
                     }
                 }
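The control flow of the patch can be modeled outside Android. The sketch below is an analogy built on assumptions: `ScheduledExecutorService` stands in for `mHandler.sendMessageDelayed` and `removeMessages`, and `Record` is a hypothetical stand-in for `ContentProviderRecord` whose `launching` flag plays the role of `launchingApp`. It illustrates the pattern only and is not the ActivityManagerService code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical model of "schedule a timeout, wait, cancel the timeout".
class ProviderWaitTimeout {
    /** Stand-in for ContentProviderRecord; all fields guarded by the record. */
    static final class Record {
        Object provider;                   // set once the provider is published
        boolean launching = true;          // cleared on timeout
        ScheduledFuture<?> pendingTimeout; // non-null while a timeout is queued
    }

    // Daemon timer thread so the JVM can exit without an explicit shutdown.
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "provider-wait-timeout");
                t.setDaemon(true);
                return t;
            });

    /** Returns the provider, or null if the wait timed out or was interrupted. */
    Object waitForProvider(Record cpr, long timeoutMs) {
        synchronized (cpr) {
            try {
                while (cpr.provider == null) {
                    if (!cpr.launching) {
                        return null; // the timeout task gave up on this record
                    }
                    if (cpr.pendingTimeout == null) {
                        // Queue at most one timeout per record.
                        cpr.pendingTimeout = timer.schedule(
                                () -> onTimeout(cpr), timeoutMs, TimeUnit.MILLISECONDS);
                    }
                    cpr.wait();
                }
                return cpr.provider;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            } finally {
                // Mirror of the patch's finally block: drop any queued timeout.
                if (cpr.pendingTimeout != null) {
                    cpr.pendingTimeout.cancel(false);
                    cpr.pendingTimeout = null;
                }
            }
        }
    }

    private void onTimeout(Record cpr) {
        synchronized (cpr) {
            cpr.launching = false;
            cpr.notifyAll(); // wake every waiter so none stays blocked
        }
    }

    void publish(Record cpr, Object provider) {
        synchronized (cpr) {
            cpr.provider = provider;
            cpr.notifyAll();
        }
    }

    public static void main(String[] args) {
        ProviderWaitTimeout model = new ProviderWaitTimeout();
        Record never = new Record();
        System.out.println("unpublished: " + model.waitForProvider(never, 200));
        Record done = new Record();
        model.publish(done, "provider");
        System.out.println("published: " + model.waitForProvider(done, 200));
    }
}
```

As in the patch, the waiter itself still calls an untimed `wait()`. What bounds the wait is the separately scheduled task, which clears the flag and notifies, so the waiter must re-check the flag each time it wakes.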



  • Original article: https://blog.csdn.net/wjky2014/article/details/132014005