Skip to content

Conversation

samedson
Copy link
Contributor

@samedson samedson commented Jun 23, 2020

  • In cases where there is bad connection, the current behavior deletes the cached reports because they are caught as an IOException. They appear to be encoding errors, when in reality it's an error opening the OutputStream that is the connection.

  • This changes it so that when the IOException is due to connection problems, we retry the upload instead of deleting it.

This is where we transform status numbers into success, fatal, and transient errors: CctTransportBackend.java

  • 400-level errors (except 404) are treated as fatal errors, which are deleted
  • 500-level errors are treated as transient errors, which are retried

This is where we decide to delete fatal errors, and retry transient errors: Uploader.java

@googlebot googlebot added the cla: yes Override cla label Jun 23, 2020
@samedson samedson requested a review from rlazo June 23, 2020 17:57
@google-oss-bot
Copy link
Contributor

Binary Size Report

Affected SDKs

  • firebase-encoders-json

    Type Base (037282f) Head (94d853c7) Diff
    apk (aggressive) 10.9 kB 11.0 kB +15 B (+0.1%)
    apk (debug) 28.6 kB 28.6 kB +3 B (+0.0%)
  • transport-api

    Type Base (037282f) Head (94d853c7) Diff
    apk (aggressive) 10.9 kB 11.0 kB +15 B (+0.1%)
    apk (debug) 23.0 kB 23.0 kB -13 B (-0.1%)
  • transport-backend-cct

    Type Base (037282f) Head (94d853c7) Diff
    aar 39.0 kB 39.0 kB +67 B (+0.2%)
    apk (aggressive) 47.9 kB 48.0 kB +57 B (+0.1%)
    apk (debug) 102 kB 102 kB +51 B (+0.1%)
    apk (release) 82.4 kB 82.4 kB +41 B (+0.0%)
  • transport-runtime

    Type Base (037282f) Head (94d853c7) Diff
    apk (debug) 80.6 kB 80.6 kB +2 B (+0.0%)

Test Logs

Notes

Head commit (94d853c7) is created by Prow via merging commits: 037282f e2af542.

@google-oss-bot
Copy link
Contributor

Coverage Report

Affected SDKs

  • transport-backend-cct

    SDK overall coverage changed from 78.45% (037282f) to 78.14% (94d853c7) by -0.31%.

    Filename Base (037282f) Head (94d853c7) Diff
    CctTransportBackend.java 96.24% 94.71% -1.53%

Test Logs

Notes

HTML coverage reports can be produced locally with ./gradlew <product>:checkCoverage.
Report files are located at <product-build-dir>/reports/jacoco/.

Head commit (94d853c7) is created by Prow via merging commits: 037282f e2af542.

@samedson
Copy link
Contributor Author

/test smoke-tests

@ashwinraghav
Copy link
Contributor

Links were super helpful.
It seems like we model IOExceptions as transient errors already

This would result in retrying ? What am i missing

@samedson
Copy link
Contributor Author

@ashwinraghav The issue is IOException from this error don't bubble up to that try catch because they're caught right here, and returned as a HttpResponse(400, null, 0);.

So when this error happens that we are testing, we get a valid response here. And then it becomes a fatalError here before, because it was a 400. Not since we return a HttpResponse(500, null, 0);, it becomes a transientError

@ashwinraghav
Copy link
Contributor

@ashwinraghav The issue is IOException from this error don't bubble up to that try catch because they're caught right here, and returned as a HttpResponse(400, null, 0);.

So when this error happens that we are testing, we get a valid response here. And then it becomes a fatalError here before, because it was a 400. Not since we return a HttpResponse(500, null, 0);, it becomes a transientError

Makes sense.

@ashwinraghav
Copy link
Contributor

Can you add a test similar to

@@ -268,6 +270,9 @@ private HttpResponse doSend(HttpRequest request) throws IOException {
// JsonWriter often writes one character at a time.
dataEncoder.encode(
request.requestBody, new BufferedWriter(new OutputStreamWriter(outputStream)));
} catch (ConnectException | UnknownHostException e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking out loud: This seems like a pretty conservative approach, as opposed to treating all IO Exceptions as retry-able.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah exactly - the reason we did this is there are legit IO Exceptions that would not be retryable, eg. encoding errors of some sort