
How to Analyze Various ANRs, Part 2: Google’s Official Documentation Walks You Through It

Contents

    • Background:
    • Execute service timeout (Service timeout ANR)
      • Common causes
      • How to debug
    • Content provider not responding (ContentProvider ANR)
      • Common causes
      • How to debug
    • Slow job response (JobService-related ANR)
    • Mystery ANRs
      • Message queue idle or nativePollOnce
      • No stack frames
    • Known issues

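Background:

In day-to-day development, ANRs of every kind keep cropping up. I had long wondered whether Google’s own engineers had published any documentation on troubleshooting them, mainly to check whether my usual analysis routine matches theirs, and to see whether Google has any new techniques for analyzing ANRs. Today I finally found Google’s detailed guidance on the different ANR types. I originally planned to use the Chinese version, but the machine translation is not very good, so the English is reproduced here directly; consider this Ma Ge relaying Google’s material.
This article excerpts the detailed analysis guidance and examples for the following ANR types:
1. Execute service timeout
2. Content provider not responding
3. Slow job response
4. Mystery ANRs
The official Google documentation follows; a link to the original is at the bottom.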

Execute service timeout (Service timeout ANR)

An execute service ANR happens when the app’s main thread doesn’t start a service in time. Specifically, a service doesn’t finish executing onCreate() and onStartCommand() or onBind() within the timeout period.

Default timeout period:
20 seconds for foreground service; 200 seconds for background service. The ANR timeout period includes the app cold start, if necessary, and calls to onCreate(), onBind(), or onStartCommand().
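Because cold start and the lifecycle callbacks all draw from the same timeout, a little arithmetic shows how quickly the budget can run out. The following is a minimal sketch with hypothetical measured durations; `remaining_budget_s` is an illustrative helper, not an Android API, and it ignores other main-thread work that can also delay the service.

```python
# Default timeouts quoted above, in seconds.
SERVICE_TIMEOUT_S = {"foreground": 20.0, "background": 200.0}


def remaining_budget_s(kind, cold_start_s, on_create_s, on_start_or_bind_s):
    """Return the seconds left before the ANR timeout (negative means over budget)."""
    spent = cold_start_s + on_create_s + on_start_or_bind_s
    return SERVICE_TIMEOUT_S[kind] - spent


# Hypothetical numbers: a 12 s cold start leaves little room for the callbacks.
print(remaining_budget_s("foreground", 12.0, 5.0, 4.0))  # -1.0
```

The same durations would fit comfortably in the background-service budget, which is why slow app startup tends to surface first in foreground-service clusters.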

To avoid execute service ANRs, follow these general best practices:

Make sure that app startup is fast, since it’s counted in the ANR timeout if the app is started to run the service component.

Make sure that the service’s onCreate(), onStartCommand(), and onBind() methods are fast.

Avoid running any slow or blocking operations on the main thread from other components; these operations can prevent a service from starting quickly.

Common causes

The following table lists common causes of execute service ANRs and suggested fixes.

How to debug

From the cluster signature and ANR report in Google Play Console or Firebase Crashlytics, you can often determine the cause of the ANR based on what the main thread is doing.

Note: Ignore execute service ANR clusters that say “nativePollOnce” or “main thread idle.” These usually correspond to ANRs where the stack dump is taken too late, and are generally not actionable. The actual ANR issues are usually present in other clusters, so real issues aren’t being hidden. See nativePollOnce for more details.
The following flow chart describes how to debug an execute service ANR.

Figure 6. How to debug an execute service ANR.

If you’ve determined that the execute service ANR is actionable, follow these steps to help resolve the issue:

Find the service component class in the ANR signature. In Google Play Console, the service component class is shown in the ANR signature. In the following example ANR details, it’s com.example.app/MyService.

com.google.common.util.concurrent.Uninterruptibles.awaitUninterruptibly
Executing service com.example.app/com.example.app.MyService

Determine whether the slow or blocking operation is part of app startup, the service component, or elsewhere by checking for the following important function call(s) in the main thread stack.

For example, if the onStartCommand() method in the MyService class is slow, the main thread stack will look like this:

at com.example.app.MyService.onStartCommand(FooService.java:25)
at android.app.ActivityThread.handleServiceArgs(ActivityThread.java:4820)
at android.app.ActivityThread.-$$Nest$mhandleServiceArgs(unavailable:0)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2289)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loopOnce(Looper.java:205)
at android.os.Looper.loop(Looper.java:294)
at android.app.ActivityThread.main(ActivityThread.java:8176)
at java.lang.reflect.Method.invoke(Native method:0)

If you can’t see any of the important function calls, there are a couple of other possibilities:

The service is running or shutting down, which means that the stacks are taken too late. In this case, you can ignore the ANR as a false positive.
A different app component is running, such as a broadcast receiver. In this case the main thread is likely blocked in this component, preventing the service from starting.

If you do see a key function call and can determine where the ANR is happening generally, check the rest of the main thread stacks to find the slow operation and optimize it or move it off the critical path.
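The check for key function calls can be sketched as a small offline script that runs over the main thread stack text. This is a rough illustration, not Google's tooling: `parse_frames`, `classify_service_anr`, and the frame-to-phase mapping are hypothetical. `handleServiceArgs` and `handleBindApplication` appear in the traces in this article; `handleCreateService` and `handleBindService` are assumed here to be the matching `ActivityThread` frames for onCreate() and onBind().

```python
import re

# Assumed mapping from ActivityThread frames to the phase the main thread is in.
KEY_FRAMES = {
    "android.app.ActivityThread.handleBindApplication": "app startup",
    "android.app.ActivityThread.handleCreateService": "service onCreate()",
    "android.app.ActivityThread.handleServiceArgs": "service onStartCommand()",
    "android.app.ActivityThread.handleBindService": "service onBind()",
}

# Matches "at pkg.Class.method(" even when frames are fused onto one line.
FRAME_RE = re.compile(r"\bat\s+([\w.$<>-]+)\(")


def parse_frames(stack_text):
    """Return the method name of every Java frame, topmost frame first."""
    return FRAME_RE.findall(stack_text)


def classify_service_anr(main_stack_text):
    """Return the phase of the first key frame found, or a fallback label."""
    for frame in parse_frames(main_stack_text):
        if frame in KEY_FRAMES:
            return KEY_FRAMES[frame]
    return "no key frame found"


sample = (
    "at com.example.app.MyService.onStartCommand(FooService.java:25)\n"
    "at android.app.ActivityThread.handleServiceArgs(ActivityThread.java:4820)\n"
    "at android.os.Looper.loop(Looper.java:294)\n"
)
print(classify_service_anr(sample))  # service onStartCommand()
```

A result of "no key frame found" corresponds to the two possibilities listed earlier: the stack was taken too late, or a different component is occupying the main thread.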

For more information about services, see the following pages:

Services overview
Foreground services
Service

Content provider not responding (ContentProvider ANR)

A content provider ANR happens when a remote content provider takes longer than the timeout period to respond to a query, and is killed.

Default timeout period: specified by content provider using ContentProviderClient.setDetectNotResponding. The ANR timeout period includes the total time for a remote content provider query to run, which includes cold-starting the remote app if it wasn’t already running.

To avoid content provider ANRs, follow these best practices:

Make sure that app startup is fast, since it’s counted in the ANR timeout if the app is started to run the content provider.
Make sure that the content provider queries are fast.
Don’t perform lots of concurrent blocking binder calls that can block all the app’s binder threads.

Common causes

The following table lists common causes of content provider ANRs and suggested fixes.

How to debug

To debug a content provider ANR using the cluster signature and ANR report in Google Play Console or Firebase Crashlytics, look at what the main thread and binder thread(s) are doing.

The following flow chart describes how to debug a content provider ANR:

Figure 7. How to debug a content provider ANR.
The following code snippet shows what the binder thread looks like when it’s blocked due to a slow content provider query. In this case, the content provider query is waiting for a lock when opening a database.

binder:11300_2 (tid=13) Blocked
Waiting for osm (0x01ab5df9) held by
at com.google.common.base.Suppliers$NonSerializableMemoizingSupplier.get(Suppliers:182)
at com.example.app.MyClass.blockingGetOpenDatabase(FooClass:171)
[...]
at com.example.app.MyContentProvider.query(MyContentProvider.java:915)
at android.content.ContentProvider$Transport.query(ContentProvider.java:292)
at android.content.ContentProviderNative.onTransact(ContentProviderNative.java:107)
at android.os.Binder.execTransactInternal(Binder.java:1339)
at android.os.Binder.execTransact(Binder.java:1275)

The following code snippet shows what the main thread looks like when it’s blocked due to slow app startup. In this case, the app startup is slow due to lock contention during Dagger initialization.

main (tid=1) Blocked
[...]
at dagger.internal.DoubleCheck.get(DoubleCheck:51)
- locked 0x0e33cd2c (a qsn)
at dagger.internal.SetFactory.get(SetFactory:126)
at com.myapp.Bar_Factory.get(Bar_Factory:38)
[...]
at com.example.app.MyApplication.onCreate(DocsApplication:203)
at android.app.Instrumentation.callApplicationOnCreate(Instrumentation.java:1316)
at android.app.ActivityThread.handleBindApplication(ActivityThread.java:6991)
at android.app.ActivityThread.-$$Nest$mhandleBindApplication(unavailable:0)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2235)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loopOnce(Looper.java:205)
at android.os.Looper.loop(Looper.java:294)
at android.app.ActivityThread.main(ActivityThread.java:8170)
at java.lang.reflect.Method.invoke(Native method:0)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:552)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:971)
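The two patterns above (a binder thread stuck in a provider call, or the main thread stuck in app startup) can be looked for mechanically once the threads of a dump have been separated. The following is a minimal sketch that assumes the dump has already been split into a dictionary mapping thread name to stack text; that input shape and the `triage_provider_anr` helper are hypothetical, chosen only for illustration.

```python
# Frame prefixes taken from the two example stacks in this section.
PROVIDER_TRANSPORT = "android.content.ContentProvider$Transport."
APP_STARTUP = "android.app.ActivityThread.handleBindApplication"


def triage_provider_anr(threads):
    """Return (thread name, finding) pairs for the two patterns shown above.

    `threads` maps a thread name such as "main" or "binder:11300_2"
    to the text of that thread's stack.
    """
    findings = []
    for name, stack in threads.items():
        if name.startswith("binder:") and PROVIDER_TRANSPORT in stack:
            findings.append((name, "slow content provider call"))
        elif name == "main" and APP_STARTUP in stack:
            findings.append((name, "slow app startup"))
    return findings


dump = {
    "binder:11300_2": (
        "at com.example.app.MyContentProvider.query(MyContentProvider.java:915)\n"
        "at android.content.ContentProvider$Transport.query(ContentProvider.java:292)\n"
    ),
    "main": "at android.os.MessageQueue.nativePollOnce(Native method)\n",
}
print(triage_provider_anr(dump))  # [('binder:11300_2', 'slow content provider call')]
```

An empty result does not rule out a provider ANR; it only means neither of these two specific patterns was present in the threads supplied.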

Slow job response (JobService-related ANR)

A slow job response ANR happens when the app takes too long to respond to JobService.onStartJob() or JobService.onStopJob(), or takes too long to provide a notification using JobService.setNotification(). This suggests that the app’s main thread is blocked doing something else.

If it’s an issue with JobService.onStartJob() or JobService.onStopJob(), check what’s happening on the main thread. If it’s an issue with JobService.setNotification(), make sure to call it as quickly as possible. Don’t do a lot of work before providing the notification.

Mystery ANRs

Sometimes it’s unclear why an ANR is occurring, or there is insufficient information to debug it in the cluster signature and ANR report. In these cases, there are still some steps you can take to determine whether the ANR is actionable.

Message queue idle or nativePollOnce

If you see the frame android.os.MessageQueue.nativePollOnce in the stacks, it often indicates that the suspected unresponsive thread was actually idle and waiting for looper messages. In Google Play Console, the ANR details look like this:

Native method - android.os.MessageQueue.nativePollOnce
Executing service com.example.app/com.example.app.MyService

For example, if the main thread is idle, the stacks look like this:

"main" tid=1 Native Main thread Idle
#00 pc 0x00000000000d8b38 /apex/com.android.runtime/lib64/bionic/libc.so (__epoll_pwait+8)
#01 pc 0x0000000000019d88 /system/lib64/libutils.so (android::Looper::pollInner(int)+184)
#02 pc 0x0000000000019c68 /system/lib64/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+112)
#03 pc 0x000000000011409c /system/lib64/libandroid_runtime.so (android::android_os_MessageQueue_nativePollOnce(_JNIEnv*, _jobject*, long, int)+44)
at android.os.MessageQueue.nativePollOnce(Native method)
at android.os.MessageQueue.next(MessageQueue.java:339)
at android.os.Looper.loop(Looper.java:208)
at android.app.ActivityThread.main(ActivityThread.java:8192)
at java.lang.reflect.Method.invoke(Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:626)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1015)

There are several reasons why the suspected unresponsive thread can be idle:

Late stack dump.
The thread recovered during the short period between the ANR triggering and the stacks being dumped. The latency in Pixels on Android 13 is around 100ms, but can exceed 1s. The latency in Pixels on Android 14 is usually under 10ms.

Thread misattribution.
The thread used to build the ANR signature was not the actual unresponsive thread that caused the ANR. In this case, try to determine if the ANR is one of the following types:
Broadcast receiver timeout
Content provider not responding
No focused window

System-wide issue.
The process wasn’t scheduled due to heavy system load or an issue in the system server.
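Spotting an idle thread in a dump can be automated before deciding which of these reasons applies. The sketch below is one possible heuristic, not an official rule: it treats a thread as idle when its topmost Java frame is `android.os.MessageQueue.nativePollOnce`, skipping any native `#NN pc` frames above it. The helper name `is_idle_main_thread` is hypothetical.

```python
import re

# Matches "at pkg.Class.method(" and captures the method name.
JAVA_FRAME_RE = re.compile(r"\bat\s+([\w.$<>-]+)\(")
IDLE_FRAME = "android.os.MessageQueue.nativePollOnce"


def is_idle_main_thread(stack_text):
    """True if the topmost Java frame is MessageQueue.nativePollOnce."""
    match = JAVA_FRAME_RE.search(stack_text)
    return match is not None and match.group(1) == IDLE_FRAME


idle_stack = (
    '"main" tid=1 Native\n'
    "#00 pc 0x00000000000d8b38 /apex/com.android.runtime/lib64/bionic/libc.so (__epoll_pwait+8)\n"
    "at android.os.MessageQueue.nativePollOnce(Native method)\n"
    "at android.os.MessageQueue.next(MessageQueue.java:339)\n"
)
print(is_idle_main_thread(idle_stack))  # True
```

A positive result only says the thread was idle when the stack was captured; telling a late dump from misattribution or system-wide load still needs the other evidence described above.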

No stack frames

Some ANR reports don’t include the stacks with the ANR, which means that the stack dumping failed when generating the ANR report. There are a couple of possible reasons for missing stack frames:

Taking the stack takes too long and times out.
The process died or was killed before the stacks were taken.

[...]
--- CriticalEventLog ---
capacity: 20
timestamp_ms: 1666030897753
window_ms: 300000
libdebuggerd_client: failed to read status response from tombstoned: timeout reached?
----- Waiting Channels: pid 7068 at 2022-10-18 02:21:37.<US_SOCIAL_SECURITY_NUMBER>+0800 -----
[...]

ANRs without stack frames aren’t actionable from the cluster signature or ANR report. To debug, look at other clusters for the app, since if an issue is large enough it’ll usually have its own cluster where stack frames are present. Another option is to look at Perfetto traces.
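Reports like the one above can be filtered out before any manual analysis. The following is a rough heuristic sketch, assuming the report is available as plain text: it flags a report that contains the tombstoned timeout message seen in the example and no Java `at ...` frames. The helper `stack_dump_failed` is hypothetical, and other failure messages may exist that it does not cover.

```python
# Message taken from the example report above.
DUMP_TIMEOUT_MARKER = "failed to read status response from tombstoned"


def stack_dump_failed(report_text):
    """Heuristic: tombstoned timeout message present and no Java frames at all."""
    has_java_frames = any(
        line.strip().startswith("at ") for line in report_text.splitlines()
    )
    return DUMP_TIMEOUT_MARKER in report_text and not has_java_frames


report = (
    "--- CriticalEventLog ---\n"
    "capacity: 20\n"
    "libdebuggerd_client: failed to read status response from tombstoned: timeout reached?\n"
)
print(stack_dump_failed(report))  # True
```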

Known issues

Keeping a timer in your app’s process for the purposes of finishing broadcast handling before an ANR triggers might not work correctly because of the asynchronous way the system monitors ANRs.

Original article:
https://mp.weixin.qq.com/s/Bl2gV2ERghm2JZFF5LRfIA
How to Analyze Various ANRs? Google’s Official Documentation Walks You Through It
