MediaType.APPLICATION_JSON_UTF8_VALUE is deprecated, but encoding is incorrect without it
Affects: 5.2.7.RELEASE
We just updated from Spring Boot 2.1.3.RELEASE (which used Spring Framework 5.1.7.RELEASE) to 2.3.1.RELEASE (which uses Spring Framework 5.2.7.RELEASE).
After finishing up and running system tests, we noticed that there are some encoding issues with non-ASCII characters. For example, with the following endpoint (details omitted):
@RequestMapping("/v1")
public interface Api {

    @GetMapping("/inquiry")
    @ApiOperation(value = "...", httpMethod = "GET",
            notes = ".", response = InquiryData.class, tags = {"Inquiry"})
    @ApiResponses(value = {@ApiResponse(code = 200, message = "Data", response = InquiryData.class),
            @ApiResponse(code = 401, message = "Unauthorized", response = Errors.class),
            @ApiResponse(code = 500, message = "Internal Server Error", response = Errors.class)})
    InquiryData getInquiryData();
}
Which ordinarily would return something like the following:
{"name":"...","street":"...","zipCode":"...","city":"Münster","country":"DE"}
^ non-ASCII
After updating to 5.2.7.RELEASE, we get the following response instead:
{"name":"...","street":"...","zipCode":"...","city":"Münster","country":"DE"}
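For context, the usual symptom of this kind of encoding problem is that a body written as UTF-8 gets read back as ISO-8859-1, so that a single ü turns into the two characters Ã¼. The following is a minimal sketch of that effect, assuming (the report has not confirmed it at this point) that this is the mechanism involved. The class and method names are hypothetical and use only the JDK, not any Spring API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the reported application.
class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes the same bytes as ISO-8859-1.
    // Assumption: this mirrors a reader that falls back to ISO-8859-1
    // because the response declares no charset.
    static String decodeUtf8BytesAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "M\u00fcnster" is the city name from the example response,
        // with the u-umlaut written as an escape to keep this file ASCII-only.
        String city = "M\u00fcnster";
        String garbled = decodeUtf8BytesAsLatin1(city);

        // The umlaut occupies two bytes in UTF-8 (0xC3 0xBC), and each byte
        // becomes its own character when read as ISO-8859-1.
        System.out.println(city.length() + " characters before, "
                + garbled.length() + " characters after");
        System.out.println(garbled);
    }
}
```

Since only the non-ASCII character is affected, a payload consisting purely of ASCII would look correct under either charset, which is why such problems tend to show up late.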
We do not have any other specific spring-web settings for encoding and/or decoding. We do have a custom ObjectMapper PostProcessor which configures some de-/serialization properties, but nothing that should impact encoding.
The issue only seems to occur when encoding a response. Decoding a request body seems to work fine even without consumes = MediaType.APPLICATION_JSON_UTF8_VALUE.
Adding @RequestMapping(value = "/v1", produces = MediaType.APPLICATION_JSON_UTF8_VALUE) does seem to fix the issue, but MediaType.APPLICATION_JSON_UTF8_VALUE has been deprecated since 5.2. IMO, if the constant is deprecated, the default should support UTF-8 out of the box.
I haven’t been able to debug the issue yet, but I’ll try to do so and add some more technical details to try and narrow the issue down.
Thanks for the investigation. I thought that it might not be quite as simple - Cunningham’s Law at work. 😃
I guess my expectation was that UTF-8 should be the default in tests as well, but I was mistaken that this should always be the case. It seems that it is the default only when application/json is implied to be the accepted media type.
Good to know, thanks for the background information.
Unfortunately, it is not as simple as changing that default value. For one, there is a lot of code out there that would break if we change the default from ISO-8859-1 to UTF-8. Moreover, MockHttpServletResponse implements a type from the servlet spec, and the latest version 4 of that spec still lists ISO-8859-1 as the default; not UTF-8.
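To illustrate why the declared charset matters when a response body is read back as a string, here is a rough sketch of charset resolution with an ISO-8859-1 fallback. It is not the actual MockHttpServletResponse implementation. The helper names resolveCharset and readBody are hypothetical, and the header parsing is deliberately simplistic.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only imitates the idea of a default charset.
class CharsetFallbackDemo {

    // Returns the charset named in a Content-Type value such as
    // "application/json;charset=UTF-8", or the fallback if none is declared.
    static Charset resolveCharset(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for a "charset=" parameter.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                return Charset.forName(name);
            }
        }
        return fallback;
    }

    // Decodes the body using the declared charset, falling back to
    // ISO-8859-1, the default that the servlet spec lists.
    static String readBody(byte[] body, String contentType) {
        return new String(body, resolveCharset(contentType, StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        // The body bytes are UTF-8 in both cases; only the header differs.
        byte[] body = "{\"city\":\"M\u00fcnster\"}".getBytes(StandardCharsets.UTF_8);

        System.out.println(readBody(body, "application/json"));
        System.out.println(readBody(body, "application/json;charset=UTF-8"));
    }
}
```

With the bare application/json header the umlaut comes back as two characters, while the variant with an explicit charset parameter round-trips cleanly. That matches the observation that produces = MediaType.APPLICATION_JSON_UTF8_VALUE makes the symptom disappear.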
The underlying problem here is that you are testing against JSON contents using a string verifier, which is fragile at best. Instead, use JSONassert by calling the json method instead of string. So in the utf8_demo code you submitted:
The above runs fine for me.
As an extra bonus, changing to json made me find a typo in the original expectation string, which missed the closing curly bracket and therefore would have failed even if the character encoding was correct.