Tuesday 29 November 2022

Thousands of OkHttp-related issues reported daily (connection, unknown host, DNS, etc.)

Setup

I have an app in production used by thousands of users daily.

I'm using Retrofit2 version 2.9.0 (latest).

My build.gradle is below.

def retrofitVersion = '2.9.0'
api "com.squareup.retrofit2:converter-gson:${retrofitVersion}"
api "com.squareup.retrofit2:converter-scalars:${retrofitVersion}"
api "com.squareup.retrofit2:adapter-rxjava2:${retrofitVersion}"
api "com.squareup.retrofit2:retrofit:${retrofitVersion}"

I integrated Firebase Crashlytics so that the app reports any API-related exception caught in a try-catch block.

For example:

viewModelScope.launch {
    try {
        val response = myRepository.getProfile()
        if (response.isSuccessful) {
            // continue with some business logic
        } else {
            Log.e(tag, "error", RuntimeException("some error"))
        }
    } catch (throwable: Throwable) {
        Log.e(tag, "error thrown", throwable)
        crashlytics.recordException(throwable)
    }
}
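To make reports like these easier to break down, one option is to classify each caught throwable before recording it. The following is a minimal sketch, not part of the app above: `FailureKind` and `classify` are hypothetical names introduced for illustration, and the categories simply mirror the exception types listed under Errors. It also treats `java.util.concurrent.CancellationException` separately, on the assumption that a cancelled coroutine should not be reported as an error.

```kotlin
import java.io.IOException
import java.net.ConnectException
import java.net.SocketTimeoutException
import java.net.UnknownHostException
import java.util.concurrent.CancellationException

// Hypothetical categories, named here only for illustration.
enum class FailureKind { CANCELLED, DNS, CONNECT, TIMEOUT, OTHER_IO, UNEXPECTED }

// Walks the cause chain and returns the first recognizable network failure.
fun classify(throwable: Throwable): FailureKind {
    // On the JVM, coroutine cancellation surfaces as this exception type.
    if (throwable is CancellationException) return FailureKind.CANCELLED

    var current: Throwable? = throwable
    var sawIo = false
    while (current != null) {
        when (current) {
            // Most specific types first: all three are subclasses of IOException.
            is UnknownHostException -> return FailureKind.DNS
            is ConnectException -> return FailureKind.CONNECT
            is SocketTimeoutException -> return FailureKind.TIMEOUT
            is IOException -> sawIo = true
        }
        current = current.cause
    }
    return if (sawIo) FailureKind.OTHER_IO else FailureKind.UNEXPECTED
}
```

In the catch block above, the result of `classify(throwable)` could be attached to the report (for instance as a Crashlytics custom key) so that DNS, connect, and timeout failures can be counted separately, and `CANCELLED` could be rethrown instead of recorded.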

Knowns

In Crashlytics, I now get thousands of error reports daily. Before I get to the errors themselves, I want to stress that the affected users are connected to the internet and the app has the proper network permissions. The logs show those users opening other content at the same time, so the errors appear to be random.

Errors

  1. UnknownHostException
Non-fatal Exception: java.net.UnknownHostException: Unable to resolve host "my-host-address.com": No address associated with hostname
       at java.net.Inet6AddressImpl.lookupHostByName(Inet6AddressImpl.java:156)
       at java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java:103)
       at java.net.InetAddress.getAllByName(InetAddress.java:1152)
       at okhttp3.Dns$Companion$DnsSystem.lookup(Dns.java:5)
...
Caused by android.system.GaiException: android_getaddrinfo failed: EAI_NODATA (No address associated with hostname)
       at libcore.io.Linux.android_getaddrinfo(Linux.java)
       at libcore.io.ForwardingOs.android_getaddrinfo(ForwardingOs.java:74)
       at libcore.io.BlockGuardOs.android_getaddrinfo(BlockGuardOs.java:200)
       at libcore.io.ForwardingOs.android_getaddrinfo(ForwardingOs.java:74)
       at java.net.Inet6AddressImpl.lookupHostByName(Inet6AddressImpl.java:135)
       at java.net.Inet6AddressImpl.lookupAllHostAddr(Inet6AddressImpl.java:103)
...
  2. ConnectException
Non-fatal Exception: java.net.ConnectException: Failed to connect to my-host-address.com/123.123.123.123:443
       at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:146)
       at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:191)
       at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.java:257)
       at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.java)
       at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.java:47)
...
Caused by java.net.ConnectException: failed to connect to my-host-address.com/123.123.123.123 (port 443) from /:: (port 0) after 10000ms: connect failed: ENETUNREACH (Network is unreachable)
       at libcore.io.IoBridge.connect(IoBridge.java:142)
       at java.net.PlainSocketImpl.socketConnect(PlainSocketImpl.java:142)
       at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:390)
       at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:230)
       at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:212)
       at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:436)
       at java.net.Socket.connect(Socket.java:621)
...
Caused by android.system.ErrnoException: connect failed: ENETUNREACH (Network is unreachable)
       at libcore.io.Linux.connect(Linux.java)
       at libcore.io.ForwardingOs.connect(ForwardingOs.java:94)
       at libcore.io.BlockGuardOs.connect(BlockGuardOs.java:138)
       at libcore.io.ForwardingOs.connect(ForwardingOs.java:94)
       at libcore.io.IoBridge.connectErrno(IoBridge.java:173)
       at libcore.io.IoBridge.connect(IoBridge.java:134)
...
  3. SocketTimeoutException
Non-fatal Exception: java.net.SocketTimeoutException: timeout
       at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.java:4)
       at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.java:8)
       at okhttp3.internal.http2.Http2Stream.takeHeaders(Http2Stream.java:24)
       at okhttp3.internal.http2.Http2ExchangeCodec.readResponseHeaders(Http2ExchangeCodec.java:5)
       at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.java:2)
       at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:145)
...
  4. Another SocketTimeoutException
Non-fatal Exception: java.net.SocketTimeoutException: SSL handshake timed out
       at com.android.org.conscrypt.NativeCrypto.SSL_do_handshake(NativeCrypto.java)
       at com.android.org.conscrypt.NativeSsl.doHandshake(NativeSsl.java:387)
       at com.android.org.conscrypt.ConscryptFileDescriptorSocket.startHandshake(ConscryptFileDescriptorSocket.java:234)
       at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:72)
       at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:52)
       at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:196)
       at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.java:257)
       at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.java)
       at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.java:47)

Lastly, what makes me think this is not a problem with my server is that I get the same kind of error when requesting banner ads from Google's servers. I get thousands of reports like the following:

{
  "Message": "Error while connecting to ad server: Failed to connect to pubads.g.doubleclick.net/216.58.195.130:443",
  "Cause": "null",
  "Response Info": {
    "Adapter Responses": [],
    "Response ID": "null",
    "Response Extras": {},
    "Mediation Adapter Class Name": ""
  },
  "Domain": "com.google.android.gms.ads",
  "Code": 0
}

from the Google Ads SDK's onAdFailedToLoad listener.

Attempt

I looked for solutions in the Retrofit2 and OkHttp3 GitHub issues and on Stack Overflow. Everyone says it is probably a network permission issue or a problem with the network connection itself. But I know the users are connected to the internet and are not using any kind of proxy. I worked with our customer service team, who walked through the problem with users and did not find any network issues.

Any insight would be helpful. Thank you in advance!


