Android Relearning Series: Binder Service Initialization and Interaction (Part 2)

If you run into problems, please visit: https://www.jianshu.com/p/84b18387992f

Background

To avoid a break in the chain of reasoning, here is a quick reminder of where the previous article left off:

        IBinder* b = e->binder;
        if (b == NULL || !e->refs->attemptIncWeak(this)) {
            if (handle == 0) {
                Parcel data;
                status_t status = IPCThreadState::self()->transact(
                        0, IBinder::PING_TRANSACTION, data, NULL, 0);
                if (status == DEAD_OBJECT)
                    return NULL;
            }

            b = BpBinder::create(handle);
            e->binder = b;
            if (b) e->refs = b->getWeakRefs();
            result = b;

Note: all of the source code in this article comes from Android 9.0.

BpBinder

File: /frameworks/native/libs/binder/BpBinder.cpp

Let's see what this BpBinder is all about.

BpBinder::BpBinder(int32_t handle, int32_t trackedUid)
    : mHandle(handle)
    , mAlive(1)
    , mObitsSent(0)
    , mObituaries(NULL)
    , mTrackedUid(trackedUid)
{
    ALOGV("Creating BpBinder %p handle %d\n", this, mHandle);

    extendObjectLifetime(OBJECT_LIFETIME_WEAK);
    IPCThreadState::self()->incWeakHandle(handle, this);
}

BpBinder stores the handle it is given; in our case the handle is 0. But BpBinder is not as simple as it appears on the surface. Let's see which class it inherits from.

class BpBinder : public IBinder

So the native layer has its own IBinder class, analogous to the IBinder in the Java layer. This native IBinder inherits from RefBase, the base class of Android's smart pointer machinery.

A quick note on smart pointers: the core idea is very simple. Constructors and destructors adjust an object's reference count, and there are both strong and weak pointer flavors, the weak one existing to break circular references. Whether a Binder object should be destructed is governed by these smart pointers, which is why, reading the binder source, you see reference increments and decrements everywhere. This is not the focus here; we may come back to it another time.
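
To make that concrete, here is a minimal sketch of the sp/wp idiom (my own illustration of the RefBase API, not Android source):

#include <utils/RefBase.h>   // android::RefBase, sp<>, wp<>

using namespace android;

class MyObj : public RefBase {
public:
    void hello() {}
};

void example() {
    sp<MyObj> strong = new MyObj();   // sp constructor -> incStrong
    wp<MyObj> weak = strong;          // weak pointer: does not keep the
                                      // object alive on its own
    {
        sp<MyObj> another = strong;   // copy -> incStrong again
    }                                 // sp destructor -> decStrong

    sp<MyObj> promoted = weak.promote();  // try to upgrade weak -> strong;
    if (promoted != nullptr) {            // null if the object already died
        promoted->hello();
    }
}   // last sp goes away -> object destroyed (default lifetime policy)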

Back to the main topic: IPCThreadState calls incWeakHandle, adding a weak reference count to the reference this handle corresponds to inside the driver.

void IPCThreadState::incWeakHandle(int32_t handle, BpBinder *proxy)
{
    LOG_REMOTEREFS("IPCThreadState::incWeakHandle(%d)\n", handle);
    mOut.writeInt32(BC_INCREFS);
    mOut.writeInt32(handle);
    // Create a temp reference until the driver has handled this command.
    proxy->getWeakRefs()->incWeak(mProcess.get());
    mPostWriteWeakDerefs.push(proxy->getWeakRefs());
}

This mOut is a Parcel, the object whose contents get shipped to the binder driver. Notice that the binder command is BC_INCREFS and the data is the handle: the driver will locate the corresponding reference and increment its driver-side weak count. The BpBinder's own weak count is incremented as well. Since I haven't introduced smart pointers properly, I won't descend into those internals here.
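
Note that BC_INCREFS is only buffered in mOut at this point; it reaches the kernel the next time talkWithDriver runs. Roughly, and simplified from the real IPCThreadState::talkWithDriver (error handling and consumed-bytes bookkeeping omitted), the flush looks like this:

binder_write_read bwr;                      // from linux/android/binder.h
bwr.write_size   = mOut.dataSize();         // e.g. [BC_INCREFS][handle] ...
bwr.write_buffer = (uintptr_t)mOut.data();
bwr.read_size    = mIn.dataCapacity();      // room for the driver's BR_* replies
bwr.read_buffer  = (uintptr_t)mIn.data();
ioctl(mProcess->mDriverFD, BINDER_WRITE_READ, &bwr);  // one syscall moves both directions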

Generating the proxy object in the native layer

Once the reference has been incremented, we return to the top of the native call chain:
File: /frameworks/base/core/jni/android_util_Binder.cpp

static jobject android_os_BinderInternal_getContextObject(JNIEnv* env, jobject clazz)
{
    sp<IBinder> b = ProcessState::self()->getContextObject(NULL);
    return javaObjectForIBinder(env, b);
}

The strong IBinder pointer we get back actually points to a BpBinder. The underlying idea is to use the handle to find the binder object on the remote side: because processes are protected from each other's address spaces, a handle stands in for the remote binder, and here BpBinder is the object that represents that remote end.

Next, let's look at the javaObjectForIBinder method.

jobject javaObjectForIBinder(JNIEnv* env, const sp<IBinder>& val)
{
    if (val == NULL) return NULL;

    if (val->checkSubclass(&gBinderOffsets)) {
        // It's a JavaBBinder created by ibinderForJavaObject. Already has Java object.
        jobject object = static_cast<JavaBBinder*>(val.get())->object();
        LOGDEATH("objectForBinder %p: it's our own %p!\n", val.get(), object);
        return object;
    }

    // For the rest of the function we will hold this lock, to serialize
    // looking/creation/destruction of Java proxies for native Binder proxies.
    AutoMutex _l(gProxyLock);

    BinderProxyNativeData* nativeData = gNativeDataCache;
    if (nativeData == nullptr) {
        nativeData = new BinderProxyNativeData();
    }
    // gNativeDataCache is now logically empty.
    jobject object = env->CallStaticObjectMethod(gBinderProxyOffsets.mClass,
            gBinderProxyOffsets.mGetInstance, (jlong) nativeData, (jlong) val.get());
    if (env->ExceptionCheck()) {
        // In the exception case, getInstance still took ownership of nativeData.
        gNativeDataCache = nullptr;
        return NULL;
    }
    BinderProxyNativeData* actualNativeData = getBPNativeData(env, object);
    if (actualNativeData == nativeData) {
        // New BinderProxy; we still have exclusive access.
        nativeData->mOrgue = new DeathRecipientList;
        nativeData->mObject = val;
        gNativeDataCache = nullptr;
        ++gNumProxies;
        if (gNumProxies >= gProxiesWarned + PROXY_WARN_INTERVAL) {
            ALOGW("Unexpectedly many live BinderProxies: %d\n", gNumProxies);
            gProxiesWarned = gNumProxies;
        }
    } else {
        // nativeData wasn't used. Reuse it the next time.
        gNativeDataCache = nativeData;
    }

    return object;
}

A somewhat puzzling field shows up here: gBinderOffsets. Its initialization comes from a place I already mentioned in the first article. Here is the relevant fragment of system startup:

static const RegJNIRec gRegJNI[] = {
...
    REG_JNI(register_android_os_Binder),
...
}

This registration is what fills in those cached method and field IDs.

int register_android_os_Binder(JNIEnv* env)
{
    if (int_register_android_os_Binder(env) < 0)
        return -1;
    if (int_register_android_os_BinderInternal(env) < 0)
        return -1;
    if (int_register_android_os_BinderProxy(env) < 0)
        return -1;

...
    return 0;
}

We care about these three registration calls. Inside them, native code looks up the Java-layer Binder, BinderProxy, and BinderInternal classes through JNI and caches their method and field IDs, which is how the native layer talks to the Java layer. This is also where the native side prepares access to BinderProxy's getInstance factory method, among other things.
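
As an example, int_register_android_os_Binder caches the Java class and the members native code needs, roughly like this (paraphrased from android_util_Binder.cpp; FindClassOrDie and friends are AOSP's JNI helpers):

static int int_register_android_os_Binder(JNIEnv* env)
{
    jclass clazz = FindClassOrDie(env, "android/os/Binder");

    // Cache a global ref to the class plus the IDs native code uses later:
    gBinderOffsets.mClass = MakeGlobalRefOrDie(env, clazz);
    gBinderOffsets.mExecTransact = GetMethodIDOrDie(env, clazz, "execTransact", "(IJJI)Z");
    gBinderOffsets.mObject = GetFieldIDOrDie(env, clazz, "mObject", "J");

    return RegisterMethodsOrDie(env, "android/os/Binder", gBinderMethods, NELEM(gBinderMethods));
}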

Let's follow that logic and walk through what javaObjectForIBinder above actually does:

  • 1. First it checks whether the IBinder is a JavaBBinder, i.e. a local Binder object. Here it is a remote proxy, so this branch does not return early.
  • 2. Next it obtains a BinderProxyNativeData struct, which holds a strong pointer to the IBinder plus a DeathRecipientList, the list of recipients for binder death notifications. The IBinder and the BinderProxyNativeData pointer are then passed into BinderProxy's getInstance factory method.
  • 3. If the BinderProxyNativeData inside the returned BinderProxy is the one just allocated, the BinderProxy is newly created: the cache is cleared and the IBinder and DeathRecipientList are filled in. Otherwise the allocated struct was not used, so it is kept in gNativeDataCache for the next BinderProxy instantiation.

BinderProxy getInstance

Now let's look at the BinderProxy.getInstance factory method:

    private static BinderProxy getInstance(long nativeData, long iBinder) {
        BinderProxy result;
        try {
            result = sProxyMap.get(iBinder);
            if (result != null) {
                return result;
            }
            result = new BinderProxy(nativeData);
        } catch (Throwable e) {
            // We're throwing an exception (probably OOME); don't drop nativeData.
            NativeAllocationRegistry.applyFreeFunction(NoImagePreloadHolder.sNativeFinalizer,
                    nativeData);
            throw e;
        }
        NoImagePreloadHolder.sRegistry.registerNativeAllocation(result, nativeData);
        // The registry now owns nativeData, even if registration threw an exception.
        sProxyMap.set(iBinder, result);
        return result;
    }

    private BinderProxy(long nativeData) {
        mNativeData = nativeData;
    }

As you can see, sProxyMap is first consulted for an existing BinderProxy for this iBinder; if there is none, a new one is created and added to the map.

A small aside: NoImagePreloadHolder

Take note of the static inner class NoImagePreloadHolder. Since Android 8.0 this pattern is commonly used to help free the native data behind a Java object.

    private static class NoImagePreloadHolder {
        public static final long sNativeFinalizer = getNativeFinalizer();
        public static final NativeAllocationRegistry sRegistry = new NativeAllocationRegistry(
                BinderProxy.class.getClassLoader(), sNativeFinalizer, NATIVE_ALLOCATION_SIZE);
    }

This class is in effect a singleton that makes clever use of the class-loading mechanism to produce one instance for the whole process. Look at the NativeAllocationRegistry class: the interesting part is that it is handed the class loader together with sNativeFinalizer.

And sNativeFinalizer is actually a function pointer address fetched from the native layer: the function that destroys the binder object.

JNIEXPORT jlong JNICALL android_os_Binder_getNativeFinalizer(JNIEnv*, jclass) {
    return (jlong) Binder_destroy;
}

Let's see what NativeAllocationRegistry's registerNativeAllocation method (the one called above) does:

    public Runnable registerNativeAllocation(Object referent, long nativePtr) {
        if (referent == null) {
            throw new IllegalArgumentException("referent is null");
        }
        if (nativePtr == 0) {
            throw new IllegalArgumentException("nativePtr is null");
        }

        CleanerThunk thunk;
        CleanerRunner result;
        try {
            thunk = new CleanerThunk();
            Cleaner cleaner = Cleaner.create(referent, thunk);
            result = new CleanerRunner(cleaner);
            registerNativeAllocation(this.size);
        } catch (VirtualMachineError vme /* probably OutOfMemoryError */) {
            applyFreeFunction(freeFunction, nativePtr);
            throw vme;
        } // Other exceptions are impossible.
        // Enable the cleaner only after we can no longer throw anything, including OOME.
        thunk.setNativePtr(nativePtr);
        return result;
    }

So this is the Cleaner mechanism (the same idea as JDK 9's java.lang.ref.Cleaner). The biggest difference between Cleaner and finalize is that a dedicated thread invokes the clean method so the GC can then reclaim the object. Neither reclaims promptly, but Cleaner skips the Object.finalize call path, which makes it more efficient.

So when instantiating a BinderProxy runs into an OOM, the following call is used to reclaim the native data:

 NativeAllocationRegistry.applyFreeFunction(NoImagePreloadHolder.sNativeFinalizer,
                    nativeData);

This free operation likewise lands in a native method.
File: /libcore/luni/src/main/native/libcore_util_NativeAllocationRegistry.cpp

static void NativeAllocationRegistry_applyFreeFunction(JNIEnv*,
                                                       jclass,
                                                       jlong freeFunction,
                                                       jlong ptr) {
    void* nativePtr = reinterpret_cast<void*>(static_cast<uintptr_t>(ptr));
    FreeFunction nativeFreeFunction
        = reinterpret_cast<FreeFunction>(static_cast<uintptr_t>(freeFunction));
    nativeFreeFunction(nativePtr);
}

As you can see, after casting the raw values back to pointers, the native free function is called to release the native data.

The same idea appears in Bitmap; I expect to come back to it in a future article dissecting Bitmap.

ServiceManagerNative

    static public IServiceManager asInterface(IBinder obj)
    {
        if (obj == null) {
            return null;
        }
        IServiceManager in =
            (IServiceManager)obj.queryLocalInterface(descriptor);
        if (in != null) {
            return in;
        }

        return new ServiceManagerProxy(obj);
    }

Let's return to the top-level asInterface method. The IBinder passed down here is a BinderProxy holding a BpBinder, so BinderProxy's queryLocalInterface is guaranteed to return null, and the binder gets wrapped in a ServiceManagerProxy. From then on, any call we make on this BinderProxy goes through that proxy class for cross-process communication.

You could say this is where Binder's transparency comes from. But things do not end so simply; let's keep going and see what addService actually does.

ServiceManagerProxy

    public void addService(String name, IBinder service, boolean allowIsolated, int dumpPriority)
            throws RemoteException {
        Parcel data = Parcel.obtain();
        Parcel reply = Parcel.obtain();
        data.writeInterfaceToken(IServiceManager.descriptor);
        data.writeString(name);
        data.writeStrongBinder(service);
        data.writeInt(allowIsolated ? 1 : 0);
        data.writeInt(dumpPriority);
        mRemote.transact(ADD_SERVICE_TRANSACTION, data, reply, 0);
        reply.recycle();
        data.recycle();
    }

Here the following core pieces of data are written, in order:

  • IServiceManager.descriptor, the descriptor of the current proxy side
  • name, the name under which the binder object is to be registered with service_manager
  • service, the binder object to register with service_manager
    BinderProxy's transact method is then invoked.

To find out what kind of Binder is handed to addService, let's look at one of its callers.

ServiceManager registers WindowManagerService

    wm = WindowManagerService.main(context, inputManager,
            mFactoryTestMode != FactoryTest.FACTORY_TEST_LOW_LEVEL,
            !mFirstBoot, mOnlyCore, new PhoneWindowManager());
    ServiceManager.addService(Context.WINDOW_SERVICE, wm, /* allowIsolated= */ false,
            DUMP_FLAG_PRIORITY_CRITICAL | DUMP_FLAG_PROTO);

Here we can see that the instance is produced by WindowManagerService's static main method.

    public static WindowManagerService main(final Context context, final InputManagerService im,
            final boolean haveInputMethods, final boolean showBootMsgs, final boolean onlyCore,
            WindowManagerPolicy policy) {
        DisplayThread.getHandler().runWithScissors(() ->
                sInstance = new WindowManagerService(context, im, haveInputMethods, showBootMsgs,
                        onlyCore, policy), 0);
        return sInstance;
    }

DisplayThread is just a HandlerThread (I won't dwell on HandlerThread here). The runWithScissors call posts the runnable to that thread and blocks the calling thread until the runnable has finished executing, so the caller can be sure the display-related initialization is done before it continues.

Note that this is where the WindowManagerService object is instantiated.

public class WindowManagerService extends IWindowManager.Stub
        implements Watchdog.Monitor, WindowManagerPolicy.WindowManagerFuncs

WindowManagerService extends IWindowManager.Stub. You won't find that Stub class written out in the source tree; what you will find is the IWindowManager.aidl file. Such files exist so that the Java plumbing for inter-process communication can be generated automatically. To keep the illustration simple, I created a minimal aidl file:

interface IServiceInterface {
      void send(int data);
}

The generated Java file looks like this:

public interface IServiceInterface extends android.os.IInterface {
    /** Local-side IPC implementation stub class. */
    public static abstract class Stub extends android.os.Binder implements com.yjy.bindertest.aidl.IServiceInterface {
        private static final java.lang.String DESCRIPTOR = "com.yjy.bindertest.aidl.IServiceInterface";

        /** Construct the stub at attach it to the interface. */
        public Stub() {
            this.attachInterface(this, DESCRIPTOR);
        }

        /**
         * Cast an IBinder object into an com.yjy.bindertest.aidl.IServiceInterface interface,
         * generating a proxy if needed.
         */
        public static com.yjy.bindertest.aidl.IServiceInterface asInterface(android.os.IBinder obj) {
            if ((obj == null)) {
                return null;
            }
            android.os.IInterface iin = obj.queryLocalInterface(DESCRIPTOR);
            if (((iin != null) && (iin instanceof com.yjy.bindertest.aidl.IServiceInterface))) {
                return ((com.yjy.bindertest.aidl.IServiceInterface) iin);
            }
            return new com.yjy.bindertest.aidl.IServiceInterface.Stub.Proxy(obj);
        }

        @Override public android.os.IBinder asBinder() {
            return this;
        }

        @Override public boolean onTransact(int code, android.os.Parcel data, android.os.Parcel reply, int flags) throws android.os.RemoteException {
            java.lang.String descriptor = DESCRIPTOR;
            switch (code) {
                case INTERFACE_TRANSACTION: {
                    reply.writeString(descriptor);
                    return true;
                }
                case TRANSACTION_send: {
                    data.enforceInterface(descriptor);
                    int _arg0;
                    _arg0 = data.readInt();
                    this.send(_arg0);
                    reply.writeNoException();
                    return true;
                }
                default: {
                    return super.onTransact(code, data, reply, flags);
                }
            }
        }

        private static class Proxy implements com.yjy.bindertest.aidl.IServiceInterface {
            private android.os.IBinder mRemote;

            Proxy(android.os.IBinder remote) {
                mRemote = remote;
            }

            @Override public android.os.IBinder asBinder() {
                return mRemote;
            }

            public java.lang.String getInterfaceDescriptor() {
                return DESCRIPTOR;
            }

            @Override public void send(int data) throws android.os.RemoteException {
                android.os.Parcel _data = android.os.Parcel.obtain();
                android.os.Parcel _reply = android.os.Parcel.obtain();
                try {
                    _data.writeInterfaceToken(DESCRIPTOR);
                    _data.writeInt(data);
                    mRemote.transact(Stub.TRANSACTION_send, _data, _reply, 0);
                    _reply.readException();
                } finally {
                    _reply.recycle();
                    _data.recycle();
                }
            }
        }

        static final int TRANSACTION_send = (android.os.IBinder.FIRST_CALL_TRANSACTION + 0);
    }

    public void send(int data) throws android.os.RemoteException;
}

Look closely: isn't this exactly the pattern we saw in ServiceManagerNative? So when we instantiate WindowManagerService, we are instantiating a Binder object at the same time.

Instantiating the Binder object

File: /frameworks/base/core/java/android/os/Binder.java

    public Binder() {
        mObject = getNativeBBinderHolder();
        NoImagePreloadHolder.sRegistry.registerNativeAllocation(this, mObject);
...
    }

We can see a native method here; on the native side it creates a JavaBBinderHolder (note: not yet an actual BBinder).
File: /frameworks/base/core/jni/android_util_Binder.cpp

static jlong android_os_Binder_getNativeBBinderHolder(JNIEnv* env, jobject clazz)
{
    JavaBBinderHolder* jbh = new JavaBBinderHolder();
    return (jlong) jbh;
}

Here is what JavaBBinderHolder looks like:

class JavaBBinderHolder
{
public:
    sp<JavaBBinder> get(JNIEnv* env, jobject obj)
    {
        AutoMutex _l(mLock);
        sp<JavaBBinder> b = mBinder.promote();
        if (b == NULL) {
            b = new JavaBBinder(env, obj);
            mBinder = b;
            ALOGV("Creating JavaBinder %p (refs %p) for Object %p, weakCount=%" PRId32 "\n",
                 b.get(), b->getWeakRefs(), obj, b->getWeakRefs()->getWeakCount());
        }

        return b;
    }

    sp<JavaBBinder> getExisting()
    {
        AutoMutex _l(mLock);
        return mBinder.promote();
    }

private:
    Mutex           mLock;
    wp<JavaBBinder> mBinder;
};

So what we have at this point is only a holder, a placeholder for a JavaBBinder; the real JavaBBinder object is created lazily, when it is actually needed.

When we write the Binder object into a Parcel via writeStrongBinder, the placeholder is turned into a JavaBBinder.
File: /frameworks/base/core/jni/android_os_Parcel.cpp

static void android_os_Parcel_writeStrongBinder(JNIEnv* env, jclass clazz, jlong nativePtr, jobject object)
{
    Parcel* parcel = reinterpret_cast<Parcel*>(nativePtr);
    if (parcel != NULL) {
        const status_t err = parcel->writeStrongBinder(ibinderForJavaObject(env, object));
        if (err != NO_ERROR) {
            signalExceptionForError(env, clazz, err);
        }
    }
}

Now look at ibinderForJavaObject. This method is the inverse of javaObjectForIBinder: the former turns a Java object into an IBinder, while the latter turns an IBinder into a Java object.

sp<IBinder> ibinderForJavaObject(JNIEnv* env, jobject obj)
{
    if (obj == NULL) return NULL;

    // Instance of Binder?
    if (env->IsInstanceOf(obj, gBinderOffsets.mClass)) {
        JavaBBinderHolder* jbh = (JavaBBinderHolder*)
            env->GetLongField(obj, gBinderOffsets.mObject);
        return jbh->get(env, obj);
    }

    // Instance of BinderProxy?
    if (env->IsInstanceOf(obj, gBinderProxyOffsets.mClass)) {
        return getBPNativeData(env, obj)->mObject;
    }

    return NULL;
}

Here the logic is clear: if the object is a Java-layer Binder, its mObject field is cast to a JavaBBinderHolder and the JavaBBinder inside is fetched (created on demand); if it is a BinderProxy, the BpBinder stored in its native data is returned directly.
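
It is worth knowing what writeStrongBinder ultimately stores in the Parcel: a flat_binder_object. For a local binder it is filled in roughly like this (simplified from Parcel's internal flatten_binder):

flat_binder_object obj;                    // from linux/android/binder.h
obj.hdr.type = BINDER_TYPE_BINDER;         // a real local binder; the driver
                                           // rewrites this to BINDER_TYPE_HANDLE
                                           // on the receiving side
obj.binder = reinterpret_cast<uintptr_t>(local->getWeakRefs());
obj.cookie = reinterpret_cast<uintptr_t>(local);  // the BBinder* itself; we
                                                  // will meet this cookie again
                                                  // when transactions arrive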

We can also see that JavaBBinder inherits from BBinder, and BBinder in turn inherits from IBinder:

class JavaBBinder : public BBinder

With these basics in place, let's look at BinderProxy's transact:

mRemote.transact(ADD_SERVICE_TRANSACTION, data, reply, 0);

Keep in mind that data carries the outgoing payload and reply will carry the response.

BinderProxy adds the service

File: /frameworks/base/core/java/android/os/BinderProxy.java

    public boolean transact(int code, Parcel data, Parcel reply, int flags) throws RemoteException {
        Binder.checkParcel(this, code, data, "Unreasonably large binder buffer");
...
        try {
            return transactNative(code, data, reply, flags);
        } finally {
            if (tracingEnabled) {
                Trace.traceEnd(Trace.TRACE_TAG_ALWAYS);
            }

    }

Now let's look at transactNative:

static jboolean android_os_BinderProxy_transact(JNIEnv* env, jobject obj,
        jint code, jobject dataObj, jobject replyObj, jint flags) // throws RemoteException
{
    if (dataObj == NULL) {
        jniThrowNullPointerException(env, NULL);
        return JNI_FALSE;
    }

    Parcel* data = parcelForJavaObject(env, dataObj);
...
    Parcel* reply = parcelForJavaObject(env, replyObj);
...
    IBinder* target = getBPNativeData(env, obj)->mObject.get();
...

    status_t err = target->transact(code, *data, reply, flags);
..
    if (err == NO_ERROR) {
        return JNI_TRUE;
    } else if (err == UNKNOWN_TRANSACTION) {
        return JNI_FALSE;
    }

...
    return JNI_FALSE;
}

Strip it down to the core logic and it is obvious: convert the top-level Java Parcel objects into their native counterparts, then take the BpBinder and call its transact method.

status_t BpBinder::transact(
    uint32_t code, const Parcel& data, Parcel* reply, uint32_t flags)
{
    // Once a binder has died, it will never come back to life.
    if (mAlive) {
        status_t status = IPCThreadState::self()->transact(
            mHandle, code, data, reply, flags);
        if (status == DEAD_OBJECT) mAlive = 0;
        return status;
    }

    return DEAD_OBJECT;
}
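
Under the hood, IPCThreadState::self()->transact packs the handle and the payload into a binder_transaction_data and queues it behind a BC_TRANSACTION command; a simplified sketch of its writeTransactionData step (error handling omitted):

binder_transaction_data tr;
tr.target.handle = handle;                 // which remote binder to reach;
                                           // 0 means service_manager
tr.code  = code;                           // e.g. ADD_SERVICE_TRANSACTION
tr.flags = binderFlags;
tr.data_size        = data.ipcDataSize();
tr.data.ptr.buffer  = data.ipcData();
tr.offsets_size     = data.ipcObjectsCount() * sizeof(binder_size_t);
tr.data.ptr.offsets = data.ipcObjects();   // where the flat_binder_objects sit

mOut.writeInt32(BC_TRANSACTION);           // driver command
mOut.write(&tr, sizeof(tr));               // then the descriptor; the next
                                           // talkWithDriver ships it all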

At this point the logic is simple: take the handle that stands for the remote end and, using it as the target, send the data over to the destination process. Since the previous article already dissected the binder driver in detail, we can go straight to service_manager and see how the corresponding code is handled.

service_manager handles the ADD_SERVICE_TRANSACTION command

int svcmgr_handler(struct binder_state *bs,
                   struct binder_transaction_data *txn,
                   struct binder_io *msg,
                   struct binder_io *reply)
{
    struct svcinfo *si;
    uint16_t *s;
    size_t len;
    uint32_t handle;
    uint32_t strict_policy;
    int allow_isolated;
    uint32_t dumpsys_priority;
....
    s = bio_get_string16(msg, &len);
    if (s == NULL) {
        return -1;
    }

 ....
    switch(txn->code) {
...

    case SVC_MGR_ADD_SERVICE:
        s = bio_get_string16(msg, &len);
        if (s == NULL) {
            return -1;
        }
        handle = bio_get_ref(msg);
        allow_isolated = bio_get_uint32(msg) ? 1 : 0;
        dumpsys_priority = bio_get_uint32(msg);
        if (do_add_service(bs, s, len, handle, txn->sender_euid, allow_isolated, dumpsys_priority,
                           txn->sender_pid))
            return -1;
        break;

...
    default:
        ALOGE("unknown code %d\n", txn->code);
        return -1;
    }

    bio_put_uint32(reply, 0);
    return 0;
}

The binder_io struct refers to this slice of the data carried by binder_transaction_data:
(figure: binder_io.png)

From this we can read back exactly the data that the ServiceManager client side wrote:

  • IServiceManager.descriptor
  • name
  • service

And indeed, svcmgr_handler first parses the binder descriptor (returning -1 if it is missing), then parses the name, and then the binder object, which arrives as a handle via bio_get_ref.

int do_add_service(struct binder_state *bs, const uint16_t *s, size_t len, uint32_t handle,
                   uid_t uid, int allow_isolated, uint32_t dumpsys_priority, pid_t spid) {
    struct svcinfo *si;

...
    si = find_svc(s, len);
    if (si) {
        if (si->handle) {
            ALOGE("add_service('%s',%x) uid=%d - ALREADY REGISTERED, OVERRIDE\n",
                 str8(s, len), handle, uid);
            svcinfo_death(bs, si);
        }
        si->handle = handle;
    } else {
        si = malloc(sizeof(*si) + (len + 1) * sizeof(uint16_t));
        if (!si) {
            ALOGE("add_service('%s',%x) uid=%d - OUT OF MEMORY\n",
                 str8(s, len), handle, uid);
            return -1;
        }
        si->handle = handle;
        si->len = len;
        memcpy(si->name, s, (len + 1) * sizeof(uint16_t));
        si->name[len] = '\0';
        si->death.func = (void*) svcinfo_death;
        si->death.ptr = si;
        si->allow_isolated = allow_isolated;
        si->dumpsys_priority = dumpsys_priority;
        si->next = svclist;
        svclist = si;
    }

    binder_acquire(bs, handle);
    binder_link_to_death(bs, handle, &si->death);
    return 0;
}

Finally, svclist is searched for the name. If an entry already exists, its binder handle is simply replaced (after sending a death notice for the old one); otherwise a new svcinfo is allocated and linked in at the head of the list. That completes the addService flow. getService differs only in the transaction code, returning the binder object in the reply instead.
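
For context, each list entry is an svcinfo struct, roughly as declared in service_manager.c:

struct svcinfo
{
    struct svcinfo *next;        // singly linked list, head is svclist
    uint32_t handle;             // driver-side reference to the binder
    struct binder_death death;   // death-notification callback info
    int allow_isolated;
    uint32_t dumpsys_priority;
    size_t len;
    uint16_t name[0];            // UTF-16 service name, allocated inline
};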

An interim summary

Notice that at the framework layer everything stands in for the Binder object by reference, for three reasons. First, a Binder object's size is unknown; materializing it on the Java side could lead to OOM. Second, references reduce the amount of data transferred: during addService there is no need to allocate a structure to receive the whole Binder object. Finally, references are what let us cross process address-space isolation at all. This design is very much in the same spirit as Parcel storing a native address.

But is that the end? In the aidl-generated class there is one thing we haven't examined yet: the onTransact method inherited from Binder.

This ties back to the first article of this series; the key lies in nativeZygoteInit in RuntimeInit. If you don't remember it, see my earlier article on the path from system boot to Activity launch.

File: /frameworks/base/core/jni/AndroidRuntime.cpp

static void com_android_internal_os_ZygoteInit_nativeZygoteInit(JNIEnv* env, jobject clazz)
{
    gCurRuntime->onZygoteInit();
}

gCurRuntime refers to the AndroidRuntime object; its onZygoteInit is overridden in AppRuntime.
File: /frameworks/base/cmds/app_process/app_main.cpp

    virtual void onZygoteInit()
    {
        sp<ProcessState> proc = ProcessState::self();
        ALOGV("App process: starting thread pool.\n");
        proc->startThreadPool();
    }

Here we meet the key class ProcessState again. Let's see what startThreadPool does.

void ProcessState::startThreadPool()
{
    AutoMutex _l(mLock);
    if (!mThreadPoolStarted) {
        mThreadPoolStarted = true;
        spawnPooledThread(true);
    }
}

void ProcessState::spawnPooledThread(bool isMain)
{
    if (mThreadPoolStarted) {
        String8 name = makeBinderThreadName();
        ALOGV("Spawning new pooled thread, name=%s\n", name.string());
        sp<Thread> t = new PoolThread(isMain);
        t->run(name.string());
    }
}

It does exactly one thing: start a PoolThread. The thread's name is set to "Binder:<pid>_<n>", where n comes from a sequence number that is atomically incremented; every call to spawnPooledThread bumps the trailing number by one.
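
The name comes from makeBinderThreadName, which looks roughly like this (paraphrased from ProcessState.cpp):

String8 ProcessState::makeBinderThreadName() {
    int32_t s = android_atomic_add(1, &mThreadPoolSeq);  // atomic sequence number
    pid_t pid = getpid();
    String8 name;
    name.appendFormat("Binder:%d_%X", pid, s);           // e.g. "Binder:1234_2"
    return name;
}

And here is PoolThread itself: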

class PoolThread : public Thread
{
public:
    explicit PoolThread(bool isMain)
        : mIsMain(isMain)
    {
    }

protected:
    virtual bool threadLoop()
    {
        IPCThreadState::self()->joinThreadPool(mIsMain);
        return false;
    }

    const bool mIsMain;
};

The second kind of looper in Binder

Reading this, a sharp eye notices that PoolThread inherits from Thread. This is not a plain pthread but Android's own Thread wrapper. Let's go straight to its run method:

status_t Thread::run(const char* name, int32_t priority, size_t stack)
{
    LOG_ALWAYS_FATAL_IF(name == nullptr, "thread name not provided to Thread::run");
...
    if (mCanCallJava) {
        res = createThreadEtc(_threadLoop,
                this, name, priority, stack, &mThread);
    } else {
        res = androidCreateRawThreadEtc(_threadLoop,
                this, name, priority, stack, &mThread);
    }

...
    return NO_ERROR;

    // Exiting scope of mLock is a memory barrier and allows new thread to run
}

Here androidCreateRawThreadEtc (or createThreadEtc when the thread may call into Java) creates the thread; either way the new thread's entry point is _threadLoop, which ends up calling threadLoop. So all we need to look at is:

IPCThreadState::self()->joinThreadPool(mIsMain);

File: /frameworks/native/libs/binder/IPCThreadState.cpp

void IPCThreadState::joinThreadPool(bool isMain)
{
    LOG_THREADPOOL("**** THREAD %p (PID %d) IS JOINING THE THREAD POOL\n", (void*)pthread_self(), getpid());

    mOut.writeInt32(isMain ? BC_ENTER_LOOPER : BC_REGISTER_LOOPER);

    status_t result;
    do {
        processPendingDerefs();
        // now get the next command to be processed, waiting if necessary
        result = getAndExecuteCommand();

        if (result < NO_ERROR && result != TIMED_OUT && result != -ECONNREFUSED && result != -EBADF) {
            ALOGE("getAndExecuteCommand(fd=%d) returned unexpected error %d, aborting",
                  mProcess->mDriverFD, result);
            abort();
        }

        // Let this thread exit the thread pool if it is no longer
        // needed and it is not the main process thread.
        if(result == TIMED_OUT && !isMain) {
            break;
        }
    } while (result != -ECONNREFUSED && result != -EBADF);

    LOG_THREADPOOL("**** THREAD %p (PID %d) IS LEAVING THE THREAD POOL err=%d\n",
        (void*)pthread_self(), getpid(), result);

    mOut.writeInt32(BC_EXIT_LOOPER);
    talkWithDriver(false);
}

Here we see what I earlier called the second kind of looper in Binder: a local read loop set up inside IPCThreadState. It belongs to the local process's binder side, so when it handles commands it casts the target to a BBinder. This loop is dedicated to receiving messages sent from remote ends, for example from service_manager or from other app processes.

The loop breaks when the result is TIMED_OUT and this is not the main binder thread, and it exits entirely when result becomes -ECONNREFUSED or -EBADF. On the way out it writes the BC_EXIT_LOOPER command and sends it with talkWithDriver(false), telling the driver this thread is leaving the loop; the false argument means no reply from the driver is awaited.

getAndExecuteCommand

status_t IPCThreadState::getAndExecuteCommand()
{
    status_t result;
    int32_t cmd;

    result = talkWithDriver();
    if (result >= NO_ERROR) {
        size_t IN = mIn.dataAvail();
        if (IN < sizeof(int32_t)) return result;
        cmd = mIn.readInt32();
        IF_LOG_COMMANDS() {
            alog << "Processing top-level Command: "
                 << getReturnString(cmd) << endl;
        }

        pthread_mutex_lock(&mProcess->mThreadCountLock);
        mProcess->mExecutingThreadsCount++;
        if (mProcess->mExecutingThreadsCount >= mProcess->mMaxThreads &&
                mProcess->mStarvationStartTimeMs == 0) {
            mProcess->mStarvationStartTimeMs = uptimeMillis();
        }
        pthread_mutex_unlock(&mProcess->mThreadCountLock);

        result = executeCommand(cmd);

        pthread_mutex_lock(&mProcess->mThreadCountLock);
        mProcess->mExecutingThreadsCount--;
        if (mProcess->mExecutingThreadsCount < mProcess->mMaxThreads &&
                mProcess->mStarvationStartTimeMs != 0) {
            int64_t starvationTimeMs = uptimeMillis() - mProcess->mStarvationStartTimeMs;
            if (starvationTimeMs > 100) {
                ALOGE("binder thread pool (%zu threads) starved for %" PRId64 " ms",
                      mProcess->mMaxThreads, starvationTimeMs);
            }
            mProcess->mStarvationStartTimeMs = 0;
        }
        pthread_cond_broadcast(&mProcess->mThreadCountDecrement);
        pthread_mutex_unlock(&mProcess->mThreadCountLock);
    }

    return result;
}

What happens here is that the thread reads data from mIn and hands each command to executeCommand. Let's illustrate with the most common command received here, BR_TRANSACTION.

status_t IPCThreadState::executeCommand(int32_t cmd)
{
    BBinder* obj;
    RefBase::weakref_type* refs;
    status_t result = NO_ERROR;

    switch ((uint32_t)cmd) {
    case BR_ERROR:
        result = mIn.readInt32();
        break;

    case BR_OK:
        break;

    case BR_ACQUIRE:
       ....
        break;

    case BR_RELEASE:
        ....
        break;

    case BR_INCREFS:
       ....
        break;

    case BR_DECREFS:
       ...
        break;

    case BR_ATTEMPT_ACQUIRE:
        ...
        break;

    case BR_TRANSACTION:
        {
            binder_transaction_data tr;
            result = mIn.read(&tr, sizeof(tr));
            ALOG_ASSERT(result == NO_ERROR,
                "Not enough command data for brTRANSACTION");
            if (result != NO_ERROR) break;

            Parcel buffer;
            buffer.ipcSetDataReference(
                reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
                tr.data_size,
                reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
                tr.offsets_size/sizeof(binder_size_t), freeBuffer, this);

            const pid_t origPid = mCallingPid;
            const uid_t origUid = mCallingUid;
            const int32_t origStrictModePolicy = mStrictModePolicy;
            const int32_t origTransactionBinderFlags = mLastTransactionBinderFlags;

            mCallingPid = tr.sender_pid;
            mCallingUid = tr.sender_euid;
            mLastTransactionBinderFlags = tr.flags;

            Parcel reply;
            status_t error;
...
            if (tr.target.ptr) {

                if (reinterpret_cast<RefBase::weakref_type*>(
                        tr.target.ptr)->attemptIncStrong(this)) {
                    error = reinterpret_cast<BBinder*>(tr.cookie)->transact(tr.code, buffer,
                            &reply, tr.flags);
                    reinterpret_cast<BBinder*>(tr.cookie)->decStrong(this);
                } else {
                    error = UNKNOWN_TRANSACTION;
                }

            } else {
                error = the_context_object->transact(tr.code, buffer, &reply, tr.flags);
            }
...
            if ((tr.flags & TF_ONE_WAY) == 0) {
...
                if (error < NO_ERROR) reply.setError(error);
                sendReply(reply, 0);
            } else {
...
            }

            mCallingPid = origPid;
            mCallingUid = origUid;
            mStrictModePolicy = origStrictModePolicy;
            mLastTransactionBinderFlags = origTransactionBinderFlags;
...
        }
        break;

    case BR_DEAD_BINDER:
       ....
break;

    case BR_CLEAR_DEATH_NOTIFICATION_DONE:
...
 break;

    case BR_FINISHED:
        result = TIMED_OUT;
        break;

    case BR_NOOP:
        break;

    case BR_SPAWN_LOOPER:
...
        break;

    default:
        result = UNKNOWN_ERROR;
        break;
    }

    if (result != NO_ERROR) {
        mLastError = result;
    }

    return result;
}

Once again we meet the very familiar binder_transaction_data struct. Normally target.ptr is non-null here, and the call goes directly into BBinder::transact; even when it is null, the default the_context_object BBinder is used. Remember the data packaging diagram from earlier? If the binder that was sent is a BBinder, the cookie carries a reference to it, which is how the transact method of the matching Binder in the current process gets invoked. Wait, doesn't every process have address protection? No need to worry: the cookie is a pointer that is only ever dereferenced inside the process that owns the binder (the driver recorded it when the binder was first flattened), and the transaction payload itself was already copied into this process during the driver's binder_thread_read step.

In practice we don't use BBinder directly; for Java services the subclass in play is JavaBBinder.
File: /frameworks/base/core/jni/android_util_Binder.cpp

virtual status_t onTransact(
        uint32_t code, const Parcel& data, Parcel* reply, uint32_t flags = 0)
    {
        JNIEnv* env = javavm_to_jnienv(mVM);

      ...
// Calls back into the Java Binder's execTransact method via the cached JNI method ID
        jboolean res = env->CallBooleanMethod(mObject, gBinderOffsets.mExecTransact,
            code, reinterpret_cast<jlong>(&data), reinterpret_cast<jlong>(reply), flags);

        if (env->ExceptionCheck()) {
        ...
        }

        return res != JNI_FALSE ? NO_ERROR : UNKNOWN_TRANSACTION;
    }

See? At this point we call back up into the Java layer, into Binder's execTransact method.


    // Entry point from android_util_Binder.cpp's onTransact
    private boolean execTransact(int code, long dataObj, long replyObj,
            int flags) {
        BinderCallsStats binderCallsStats = BinderCallsStats.getInstance();
        BinderCallsStats.CallSession callSession = binderCallsStats.callStarted(this, code);
        Parcel data = Parcel.obtain(dataObj);
        Parcel reply = Parcel.obtain(replyObj);
        // theoretically, we should call transact, which will call onTransact,
        // but all that does is rewind it, and we just got these from an IPC,
        // so we'll just call it directly.
        boolean res;
        // Log any exceptions as warnings, don't silently suppress them.
        // If the call was FLAG_ONEWAY then these exceptions disappear into the ether.
        final boolean tracingEnabled = Binder.isTracingEnabled();
        try {
            if (tracingEnabled) {
                Trace.traceBegin(Trace.TRACE_TAG_ALWAYS, getClass().getName() + ":" + code);
            }
// Dispatches to the aidl-generated onTransact
            res = onTransact(code, data, reply, flags);
        } catch (RemoteException|RuntimeException e) {
            if (LOG_RUNTIME_EXCEPTION) {
                Log.w(TAG, "Caught a RuntimeException from the binder stub implementation.", e);
            }
            if ((flags & FLAG_ONEWAY) != 0) {
                if (e instanceof RemoteException) {
                    Log.w(TAG, "Binder call failed.", e);
                } else {
                    Log.w(TAG, "Caught a RuntimeException from the binder stub implementation.", e);
                }
            } else {
                reply.setDataPosition(0);
                reply.writeException(e);
            }
            res = true;
        } finally {
            if (tracingEnabled) {
                Trace.traceEnd(Trace.TRACE_TAG_ALWAYS);
            }
        }
        checkParcel(this, code, reply, "Unreasonably large binder reply buffer");
        reply.recycle();
        data.recycle();

        // Just in case -- we are done with the IPC, so there should be no more strict
        // mode violations that have gathered for this thread.  Either they have been
        // parceled and are now in transport off to the caller, or we are returning back
        // to the main transaction loop to wait for another incoming transaction.  Either
        // way, strict mode begone!
        StrictMode.clearGatheredViolations();
        binderCallsStats.callEnded(callSession);

        return res;
    }

And that lands right back in the aidl-generated onTransact. The generated method is just a thin wrapper that forces us to use Parcel in a disciplined way. One piece is still missing from this analysis: Binder death handling.

Summary

At this point we have merely tied together the previous article's driver-side code and the Java-layer callback mechanism, so there is no need for another sequence diagram. Broadly speaking, every class Binder involves has been covered here, and we can draw a UML diagram of them.
(figure: binder类依赖图.png)

Through these two chapters I found that, for Binder, the whole chain of references runs: the aidl IInterface points to a BpBinder, the BpBinder points to a binder_ref, the binder_ref points to a binder_node, and the binder_node points to the actual binder class (BBinder).
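
Laid out as a chain (my own sketch; read each arrow as "refers to"):

IInterface (e.g. ServiceManagerProxy)
  -> BinderProxy (Java)           // via BinderProxyNativeData.mObject
  -> BpBinder (native, mHandle)   // lives in the client process
  -> binder_ref                   // binder driver, per-process reference table
  -> binder_node                  // binder driver, owned by the server process
  -> BBinder / JavaBBinder        // server process; JavaBBinder calls back into
                                  // the Java-layer Binder (the aidl Stub)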

Now we can start to answer the earlier question of what the whole exchange between the service side and the remote side really looks like. Below I sketch it in the style of the TCP three-way handshake diagrams from school.

(figure: binder握手通信.png)

Unlike TCP, this communication needs no three-way handshake to make delivery certain, because the binder driver sits in the middle and manages every binder object.

