Exceptions.throwIfFatal / throwIfJvmFatal should log fatal exceptions
See original GitHub issueRelated issues: #3036, #2521, #2917, #1118
The combination of unrecoverable Error subclasses and reactive/asynchronous applications can easily lead to applications being left in semi-working state with no visible errors being logged.
Motivation
In an imperative, blocking application, an unrecoverable Error subclass would lead to the main application thread terminating with a stack trace. However, in reactive/asynchronous applications, the Error is often propagated up to a thread in some thread pool, which upon termination is peacefully returned to the pool. Unless some layer of the application catches Throwables and logs the error, it will be invisible to the end users and application maintainers. On the other hand, Reactor sinks/subscriptions will be left in a “hanging” state – neither working, nor terminated – leading to hard to catch and hard to debug situations.
The specific scenario that caused me to look into this was a minor library version incompatibility that resulted in java.lang.NoSuchMethodError being thrown in some stream mapping logic. While unrecoverable from both Java perspective (because it’s an Error) and from Reactor throwIfJvmFatal perspective (because it’s a LinkageErrror), this type of issue tends to be isolated to a particular code path and perhaps does not need to lead to application shutdown. At minimum, it would be good to have consistent logging of unrecoverable errors.
Desired solution
In order of the amount of work involved.
- Logging fatal errors as part of
Exceptions.throwIfJvmFatal. - While Reactor is doing the right thing by not turning thrown
Errorinto stream errors, it would be good if the downstream code could recover if desired. This would require thatFluxSinkis left in a state in which it’s capable of terminating – in thereactorFluxSinkUnrecoverableError()test below, it would meansink.error(e);results in the proper stream termination (although the exception passed in to the sink probably needs to be wrapped in something that won’t re-triggerthrowIfJvmFatalbehavior).
Considered alternatives
Leave everything as is; not handling Error subclasses works as intended. While true, it does lead to developers spending a lot of time troubleshooting – see related issues. And that’s only the people who did figure out what was going wrong! It’s a valid option though.
Additional context
The following two tests demonstrate the difference in behavior between normal exceptions being handled and converted to stream errors, and unrecoverable errors leading to a hanging stream.
// this test passes
@Test
void reactorFluxSinkRecoverableException() {
Flux<String> testFlux = Flux.create(sink -> {
sink.onCancel(() -> System.out.println("*** canceled"));
sink.onDispose(() -> System.out.println("*** disposed"));
try {
sink.next("some value");
sink.complete();
} catch(Exception e) {
// This point is never reached because Reactor correctly converts exception into stream error.
System.out.println("Exception in `next` processing caught, handled: " + e.getMessage());
sink.error(e);
}
}).map(value -> {
System.out.println("received value: " + value);
throw new RuntimeException("normal exception");
});
StepVerifier.create(testFlux)
.verifyErrorMessage("normal exception");
}
// this test hangs
@Test
void reactorFluxSinkUnrecoverableError() {
Flux<String> testFlux = Flux.create(sink -> {
sink.onCancel(() -> System.out.println("*** canceled"));
sink.onDispose(() -> System.out.println("*** disposed"));
try {
sink.next("some value");
sink.complete();
} catch(Throwable e) {
System.out.println("Throwable in `next` processing caught, handled: " + e.getMessage());
// this will have no effect; a terminal state is no longer reachable for the stream
sink.error(e);
}
}).map(value -> {
System.out.println("received value: " + value);
throw new LinkageError("unrecoverable error");
});
// This stream will hang.
StepVerifier.create(testFlux)
.verifyErrorMessage("unrecoverable error");
}
Issue Analytics
- State:
- Created a year ago
- Reactions:1
- Comments:7 (5 by maintainers)
Top Related StackOverflow Question
@schananas if you have pieces of code that are susceptible to OOM but you’re confident you can recover from these, then you are the best suited to implement that. I’d argue only you can take the responsibility to wrap such code in a
try-catchblock, because Reactor certainly can’t identify these specific occurrences. I doubt even you can, but that’s another story (see this informative stackoverflow answer for some context).with the introduction of
isFatalis the associated PR, it could open up a path of hooking aPredicatebut as explained above thatPredicatewould apply too broadly (filter by exception Class, message…). Nothing better than user knowledge to pinpoint “I’m about to decode a potentially huge object, let’s try/catch”. We won’t be pursuing that path for now.I’ll take that one and add logging to
Exceptions.throwIfFatal