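It is commonly stated that a binder thread pool holds at most 16 threads by default (usually broken down as 15 threads spawned on demand plus one main binder thread).
Let's verify that.
Verifying the 16-thread limit
Write the AIDL: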
package com.ericcode.bindertest;

interface ICalculator {
    int add(int a, int b);
    int sub(int a, int b);
}
Write the service:
import android.app.Service
import android.content.Intent
import android.os.IBinder

class RemoteService : Service() {
    companion object {
        const val TAG = "RemoteService"
    }

    override fun onBind(intent: Intent): IBinder {
        return object : ICalculator.Stub() {
            override fun add(a: Int, b: Int): Int {
                Logger.i(TAG, "add:$a,$b")
                Thread.sleep(1000) // block the binder thread for one second
                return a + b
            }

            override fun sub(a: Int, b: Int): Int {
                return a - b
            }
        }
    }
}
Write the client:
private fun testBinderThreadPool() {
    val intent = Intent(baseContext, RemoteService::class.java)
    bindService(intent, object : ServiceConnection {
        override fun onServiceConnected(name: ComponentName?, service: IBinder?) {
            Logger.i(TAG, "onServiceConnected")
            ICalculator.Stub.asInterface(service).apply {
                for (i in 0..30) { // 0..30 is inclusive, so this issues 31 cross-process calls
                    thread {
                        val startTime = System.currentTimeMillis()
                        val result = add(i, i)
                        Logger.i(
                            TAG,
                            "result:$result index:$i use:${System.currentTimeMillis() - startTime}ms"
                        )
                    }
                }
            }
        }

        override fun onServiceDisconnected(name: ComponentName?) {
            // TODO() would throw NotImplementedError if the service process died, so just log.
            Logger.i(TAG, "onServiceDisconnected")
        }
    }, BIND_AUTO_CREATE)
}
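With the code above, the client calls the server's add method about 30 times (31, strictly speaking, because 0..30 is inclusive), and every add takes one second. The server cannot keep up, so its binder threads end up fully occupied.
Here is the log: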
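The saturation described above can be reproduced without Android at all. The following is a plain-JVM analogy, not Binder itself: a fixed thread pool stands in for the server's binder threads, and `peakConcurrentCalls`, `poolSize` and `workMillis` are names made up for this sketch.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger

// Plain-JVM analogy: a fixed pool plays the role of the binder thread pool.
// Returns the highest number of calls that were being served at the same time.
fun peakConcurrentCalls(calls: Int, poolSize: Int = 16, workMillis: Long = 100): Int {
    val pool = Executors.newFixedThreadPool(poolSize)
    val running = AtomicInteger(0)
    val peak = AtomicInteger(0)
    // Holds the first workers until all of them have started, so the measured
    // peak does not depend on scheduling luck.
    val allWorkersBusy = CountDownLatch(minOf(calls, poolSize))
    repeat(calls) {
        pool.execute {
            val now = running.incrementAndGet()
            peak.accumulateAndGet(now) { a, b -> maxOf(a, b) }
            allWorkersBusy.countDown()
            allWorkersBusy.await()
            Thread.sleep(workMillis) // stands in for the slow add() body
            running.decrementAndGet()
        }
    }
    pool.shutdown()
    pool.awaitTermination(1, TimeUnit.MINUTES)
    return peak.get()
}
```

Calling `peakConcurrentCalls(calls = 31)` never reports more than 16 calls in flight, however many are submitted; the rest wait for a free worker, which is the behaviour the test above is designed to expose.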
2023-10-26 18:07:10.094 22945-22945 zsm:MainActivity com...ode.bindertest I main:onServiceConnected
2023-10-26 18:07:10.097 22978-22994 zsm:RemoteService com...ode.bindertest I binder:22978_4:add:0,0
2023-10-26 18:07:10.098 22978-22992 zsm:RemoteService com...ode.bindertest I binder:22978_2:add:3,3
2023-10-26 18:07:10.099 22978-22993 zsm:RemoteService com...ode.bindertest I binder:22978_3:add:1,1
2023-10-26 18:07:10.100 22978-22991 zsm:RemoteService com...ode.bindertest I binder:22978_1:add:11,11
2023-10-26 18:07:10.101 22978-23019 zsm:RemoteService com...ode.bindertest I binder:22978_5:add:7,7
2023-10-26 18:07:10.105 22978-23024 zsm:RemoteService com...ode.bindertest I binder:22978_6:add:9,9
2023-10-26 18:07:10.105 22978-23037 zsm:RemoteService com...ode.bindertest I binder:22978_7:add:10,10
2023-10-26 18:07:10.106 22978-23038 zsm:RemoteService com...ode.bindertest I binder:22978_8:add:2,2
2023-10-26 18:07:10.108 22978-23040 zsm:RemoteService com...ode.bindertest I binder:22978_9:add:6,6
2023-10-26 18:07:10.110 22978-23041 zsm:RemoteService com...ode.bindertest I binder:22978_A:add:4,4
2023-10-26 18:07:10.111 22978-23042 zsm:RemoteService com...ode.bindertest I binder:22978_B:add:8,8
2023-10-26 18:07:10.112 22978-23043 zsm:RemoteService com...ode.bindertest I binder:22978_C:add:22,22
2023-10-26 18:07:10.114 22978-23044 zsm:RemoteService com...ode.bindertest I binder:22978_D:add:23,23
2023-10-26 18:07:10.115 22978-23045 zsm:RemoteService com...ode.bindertest I binder:22978_E:add:16,16
2023-10-26 18:07:10.116 22978-23046 zsm:RemoteService com...ode.bindertest I binder:22978_F:add:14,14
2023-10-26 18:07:10.117 22978-23048 zsm:RemoteService com...ode.bindertest I binder:22978_10:add:25,25 // the server accepted many calls but stalled after the 16th, because the thread pool holds 16 threads
2023-10-26 18:07:11.099 22978-22994 zsm:RemoteService com...ode.bindertest I binder:22978_4:add:26,26 // a new call arrives on a thread id seen before, which shows this is a pool whose threads are reused
2023-10-26 18:07:11.100 22945-23005 zsm:MainActivity com...ode.bindertest I Thread-6:result:0 index:0 use:1003ms // the client received the result
2023-10-26 18:07:11.101 22978-22992 zsm:RemoteService com...ode.bindertest I binder:22978_2:add:5,5
2023-10-26 18:07:11.102 22945-23008 zsm:MainActivity com...ode.bindertest I Thread-9:result:6 index:3 use:1005ms
2023-10-26 18:07:11.102 22945-23006 zsm:MainActivity com...ode.bindertest I Thread-7:result:2 index:1 use:1003ms
2023-10-26 18:07:11.103 22945-23016 zsm:MainActivity com...ode.bindertest I Thread-17:result:22 index:11 use:1004ms
2023-10-26 18:07:11.103 22978-22993 zsm:RemoteService com...ode.bindertest I binder:22978_3:add:28,28
2023-10-26 18:07:11.104 22978-23019 zsm:RemoteService com...ode.bindertest I binder:22978_5:add:13,13
2023-10-26 18:07:11.104 22978-22991 zsm:RemoteService com...ode.bindertest I binder:22978_1:add:29,29
2023-10-26 18:07:11.104 22945-23012 zsm:MainActivity com...ode.bindertest I Thread-13:result:14 index:7 use:1004ms
2023-10-26 18:07:11.107 22978-23024 zsm:RemoteService com...ode.bindertest I binder:22978_6:add:20,20
2023-10-26 18:07:11.108 22978-23037 zsm:RemoteService com...ode.bindertest I binder:22978_7:add:17,17
2023-10-26 18:07:11.108 22978-23038 zsm:RemoteService com...ode.bindertest I binder:22978_8:add:21,21
.........
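Counting the distinct binder threads by eye is error-prone, so here is a small sketch that does it from the log lines. It assumes the thread name looks like `binder:<pid>_<seq>` as in the log above, and that `<seq>` is hexadecimal (the log goes from `_F` to `_10`, which suggests so). `binderThreadIds` is a helper name invented for this example.

```kotlin
// Matches the thread-name part of the log lines, e.g. "binder:22978_A".
val binderThreadName = Regex("""binder:(\d+)_([0-9A-Fa-f]+)""")

// Returns the distinct binder thread sequence numbers, parsed as hexadecimal.
// Lines without a binder thread name (e.g. client-side lines) are skipped.
fun binderThreadIds(logLines: List<String>): Set<Int> =
    logLines.mapNotNull { line -> binderThreadName.find(line) }
        .map { match -> match.groupValues[2].toInt(radix = 16) }
        .toSet()
```

Feeding it the server-side lines from the log and taking `.size` gives the number of binder threads that actually served calls; under the hexadecimal assumption, `_10` is thread number 16.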
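ANR caused by a saturated binder thread pool
Threads and their states:
Client:
main thread: blocked in talkWithDriver
Many threads are in the middle of a call to the server (according to the call stacks in the trace file), all stuck in talkWithDriver
Server:
There are many binder:xxx-x threads, 16 in total, which shows that the binder thread pool is full. It cannot handle further IPC calls, so the calls it cannot serve are put into a queue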
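To get a feeling for how long a queued call may wait, here is a deliberately rough model. It assumes every call holds a binder thread for the same amount of time and that calls are served in arrival order; real scheduling is messier. `estimatedWaitMillis` and its parameters are hypothetical names for this sketch.

```kotlin
// Rough estimate of how long a new call waits before a binder thread picks it up.
// callsAhead: calls that arrived earlier and are running or queued on the server.
// perCallMillis: how long each call occupies a binder thread (assumed uniform).
fun estimatedWaitMillis(callsAhead: Int, perCallMillis: Long, poolSize: Int = 16): Long =
    (callsAhead / poolSize) * perCallMillis
```

With 100 calls ahead at one second each, `estimatedWaitMillis(100, 1000)` gives 6000 ms. If the caller is the client's main thread, a wait of that length is already longer than the input-dispatch ANR timeout, which is typically 5 seconds.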
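Unreliable calls caused by oneway
When a method is marked oneway, the client does not wait for the server's result during IPC, which makes the client more efficient.
That is why oneway can only be applied to methods that return void, and why their parameters cannot be marked inout or out.
However, while testing the maximum thread count above, I found that oneway calls can fail to go through.
AIDL: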
interface ICalculator {
    int add(int a, int b);
    oneway void set(int i);
}
Client code:
private fun testBinderThreadPoolOneway() {
    val intent = Intent(baseContext, RemoteService::class.java)
    bindService(intent, object : ServiceConnection {
        override fun onServiceConnected(name: ComponentName?, service: IBinder?) {
            Logger.i(TAG, "onServiceConnected")
            ICalculator.Stub.asInterface(service).apply {
                for (i in 0..5000) {
                    thread(name = "ipc_client_$i") {
                        val startTime = System.currentTimeMillis()
                        set(i) // cross-process oneway call
                        Logger.i(
                            TAG,
                            "index:$i use:${System.currentTimeMillis() - startTime}ms"
                        )
                    }
                }
            }
        }

        override fun onServiceDisconnected(name: ComponentName?) {
            // TODO() would throw NotImplementedError if the service process died, so just log.
            Logger.i(TAG, "onServiceDisconnected")
        }
    }, BIND_AUTO_CREATE)
}
Server implementation:
class RemoteService : Service() {
    companion object {
        const val TAG = "RemoteService"
    }

    override fun onBind(intent: Intent): IBinder {
        return object : ICalculator.Stub() {
            var sum = 0

            override fun add(a: Int, b: Int): Int {
                Logger.i(TAG, "add:$a,$b")
                Thread.sleep(1000)
                return a + b
            }

            @Synchronized
            override fun set(i: Int) {
                Logger.i(TAG, "set:$i")
                sum += i
                Thread.sleep(10)
                Logger.i(TAG, "set end, sum:$sum")
            }
        }
    }
}
The client crashed:
FATAL EXCEPTION: ipc_client_2780
Process: com.ericcode.bindertest, PID: 8638
android.os.DeadObjectException: Transaction failed on small parcel; remote process probably died, but this could also be caused by running out of binder buffer space
at android.os.BinderProxy.transactNative(Native Method)
at android.os.BinderProxy.transact(BinderProxy.java:639)
at com.ericcode.bindertest.ICalculator$Stub$Proxy.set(ICalculator.java:130)
at com.ericcode.bindertest.MainActivity$testBinderThreadPoolOneway$1$onServiceConnected$1$1.invoke(MainActivity.kt:42)
at com.ericcode.bindertest.MainActivity$testBinderThreadPoolOneway$1$onServiceConnected$1$1.invoke(MainActivity.kt:40)
at kotlin.concurrent.ThreadsKt$thread$thread$1.run(Thread.kt:30)
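The remote process had certainly not died, so the problem has to be the binder buffer. A oneway call returns before the server has handled it, so the client can queue transactions far faster than the server (10 ms per set) works through them, and the pending ones keep occupying buffer space.
Wrap the call in try/catch:
try {
    set(i)
} catch (e: Exception) {
    Log.e(TAG, "failed in $i", e)
}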
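The exception still occurs, and the server never receives the calls that threw, but calls made after a failed one can still succeed.
For example, call 3000 may fail while call 3001 goes through.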
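To see which calls were lost rather than only logging them, the try/catch can be turned into a small helper that collects the failed indices. This is a sketch: `sendAll` is a made-up name, and `send` stands in for the real proxy call (for example `{ i -> calculator.set(i) }`, where `calculator` is whatever proxy the client holds).

```kotlin
// Calls send(i) for every index and returns the indices whose call threw.
// A failure does not stop the loop, so later calls still get their chance.
fun sendAll(indices: IntRange, send: (Int) -> Unit): List<Int> {
    val failed = mutableListOf<Int>()
    for (i in indices) {
        try {
            send(i)
        } catch (e: Exception) {
            failed += i // this call never reached the server
        }
    }
    return failed
}
```

The returned list makes the gaps visible, so the caller can decide whether to retry those indices later or to accept the loss.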
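So oneway is unreliable when a large batch of oneway calls is fired in rapid succession, and we should avoid the oneway keyword in that situation as far as possible.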
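The "some fail, later ones succeed" pattern can be illustrated with a toy model. This is not how the binder driver accounts for memory; it only assumes a buffer with a fixed number of slots, a client that sends one call per step, and a server that finishes one pending call every few steps. All names and parameters are invented for the sketch.

```kotlin
// Toy model of a burst of oneway calls hitting a bounded buffer.
// bufferSlots: how many pending calls the buffer can hold (illustrative unit).
// drainEvery: the server completes one pending call every drainEvery sends.
// Returns the indices of the calls that were rejected because the buffer was full.
fun simulateOnewayBurst(calls: Int, bufferSlots: Int, drainEvery: Int): List<Int> {
    val failed = mutableListOf<Int>()
    var pending = 0
    for (i in 0 until calls) {
        // The slow server frees one slot now and then.
        if (i > 0 && i % drainEvery == 0 && pending > 0) pending--
        if (pending < bufferSlots) {
            pending++ // the call is queued for the server
        } else {
            failed += i // buffer full: this call is lost
        }
    }
    return failed
}
```

For `simulateOnewayBurst(calls = 10, bufferSlots = 3, drainEvery = 4)` the rejected indices are 3, 5, 6, 7 and 9, while calls 4 and 8 succeed because the server freed a slot just before them. That is the same shape as the observation above, where one call fails and the next may go through.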
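A case from Stack Overflow
The error can be triggered with the following steps:
- Process1 sends a large payload (e.g. 980 kB) to Process2. Process2 then sleeps for 30 seconds, and the large binder buffer is not released in the meantime.
- Process1 sends a broadcast to Process2 that carries, say, 50 kB of data. This exceeds the 1016 kB buffer capacity, because 980 kB + 50 kB is more than the buffer can hold.
- BroadcastQueue throws a DeadObjectException and then passes scheduleCrash to the ActivityThread on the application side.
In other words, when this error occurs, the call that fails is not necessarily the culprit; some other call that has not returned yet may still be holding a large part of the buffer.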
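The arithmetic behind this case is easy to check. The sketch below assumes the 1016 kB figure from the steps above, which matches 1 MB minus two 4 kB pages; `fitsInBuffer` and the constant name are made up for illustration.

```kotlin
// 1 MB minus two 4 kB pages, which gives the 1016 kB capacity quoted above.
val BINDER_BUFFER_KB = 1024 - 8

// True if a new transaction of incomingKb still fits while heldKb is not yet released.
fun fitsInBuffer(heldKb: Int, incomingKb: Int, capacityKb: Int = BINDER_BUFFER_KB): Boolean =
    heldKb + incomingKb <= capacityKb
```

`fitsInBuffer(980, 50)` is false because 1030 kB is more than 1016 kB, even though the 50 kB transaction on its own is small. This is why the exception message speaks of a "small parcel".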
References:
https://www.jianshu.com/p/4c8d346185cb
https://stackoverflow.com/questions/45432647/android-throw-deadobjectexception-with-log-transaction-failed-on-small-parcel