jsoup Java HTML Parser release 1.22.2
jsoup 1.22.2 is out now, with fixes and refinements across the library. It makes editing the DOM during traversal more predictable, refreshes the default HTML tag definitions with newer elements and better text boundaries, and improves reliability in parsing and HTTP transport. The release also fixes a number of edge cases in cleaning, stream parsing, XML doctype handling, and Android packaging.
jsoup is a Java library for working with real-world HTML and XML. It provides a very convenient API for extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors.
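As a minimal sketch of the extraction API described above, the following parses an HTML string and collects absolute link targets with a CSS selector. It assumes jsoup is on the classpath; the class name `LinkExtractDemo`, the method `linkTargets`, and the example.com URLs are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical demo class; not part of jsoup.
final class LinkExtractDemo {

    // Parses the HTML against a base URI and returns the absolute URL
    // of every anchor that has an href attribute.
    static List<String> linkTargets(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            // absUrl resolves a relative href against the document base URI
            urls.add(a.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<p><a href='/news'>News</a> <a>no href</a></p>";
        System.out.println(linkTargets(html, "https://example.com/"));
    }
}
```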
Download jsoup now.
- Expanded and clarified NodeTraversor support for in-place DOM rewrites during NodeVisitor.head(). Current-node edits such as remove, replace, and unwrap now recover more predictably, while traversal stays within the original root subtree. This makes single-pass tree cleanup and normalization visitors easier to write, for example when unwrapping presentational elements or replacing text nodes as you walk the DOM. #2472
- Documentation: clarified that a configured Cleaner may be reused across concurrent threads, and that shared Safelist instances should not be mutated while in use. #2473
- Updated the default HTML TagSet for current HTML elements: added dialog, search, picture, and slot; made ins, del, button, audio, video, and canvas inline by default (Tag#isInline(), aligned to phrasing content in the spec); and added readable Element.text() boundaries for controls and embedded objects via the new Tag.TextBoundary option. This improves pretty-printing and keeps normalized text from running adjacent words together. #2493
- Android (R8/ProGuard): added a rule to ignore the optional re2j dependency when not present. #2459
- Fixed a NodeTraversor regression in 1.21.2 where removing or replacing the current node during head() could revisit the replacement node and loop indefinitely. The traversal docs now also clarify which inserted nodes are visited in the current pass. #2472
- Parsing during charset sniffing no longer fails if an advisory available() call throws IOException, as seen on JDK 8 HttpURLConnection. #2474
- Cleaner no longer makes relative URL attributes in the input document absolute when cleaning or validating a Document. URL normalization now applies only to the cleaned output, and Safelist.isSafeAttribute() is side effect free. #2475
- Cleaner no longer duplicates enforced attributes when the input Document preserves attribute case. A case-variant source attribute is now replaced by the enforced attribute in the cleaned output. #2476
- If a per-request SOCKS proxy is configured, jsoup now avoids using the JDK HttpClient, because the JDK would silently ignore that proxy and attempt to connect directly. Those requests now fall back to the legacy HttpURLConnection transport instead, which does support SOCKS. #2468
- Connection.Response.streamParser() and DataUtil.streamParser(Path, ...) could fail on small inputs without a declared charset, if the initial 5 KB charset sniff fully consumed the input and closed it before the stream parse began. #2483
- In XML mode, doctypes with an internal subset, such as <!DOCTYPE root [<!ENTITY name "value">]>, now round-trip correctly. The subset is preserved as raw text only; entities are not expanded and external DTDs are not loaded. #2486
- Migrated the integration test server from Jetty to Netty, which actively maintains support for our minimum JDK target (8). #2491
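The in-place rewrite support described in the NodeTraversor items above can be illustrated with a small single-pass cleanup visitor. This is a sketch only, assuming jsoup 1.22.2 is on the classpath; the class name `UnwrapFontDemo`, the method `unwrapFonts`, and the choice of `font` as the presentational element are hypothetical and used purely for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.select.NodeTraversor;
import org.jsoup.select.NodeVisitor;

// Hypothetical demo class; not part of jsoup.
final class UnwrapFontDemo {

    // Walks the body once and unwraps every <font> element seen in head(),
    // keeping its children in place. Returns the resulting body HTML.
    static String unwrapFonts(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        NodeVisitor visitor = new NodeVisitor() {
            @Override
            public void head(Node node, int depth) {
                if (node instanceof Element
                        && "font".equals(((Element) node).normalName())) {
                    // Current-node edit during traversal: removes the element
                    // and moves its children up into its parent.
                    node.unwrap();
                }
            }

            @Override
            public void tail(Node node, int depth) {
                // nothing to do on the way back up
            }
        };
        NodeTraversor.traverse(visitor, doc.body());
        return doc.body().html();
    }

    public static void main(String[] args) {
        System.out.println(
            unwrapFonts("<p><font color=\"red\">Hello <b>world</b></font></p>"));
    }
}
```

Editing in head() avoids a separate select-then-modify pass. Which newly inserted or moved nodes are visited in the same pass is defined by the traversal documentation, so a visitor that depends on nested rewrites should be checked against those docs.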
My sincere thanks to everyone who contributed to this release! If you have any suggestions for the next release, I would love to hear them; please get in touch via jsoup discussions, or with me directly.
You can also follow me (@jhy@tilde.zone) on Mastodon / Fediverse to receive occasional notes about jsoup releases.
v7.0.7
- Improve SpringValidatorAdapter and MethodValidationAdapter performance #36621
- Support JSON array decoding to Flux in KotlinSerializationJsonDecoder #36597
- Deprecate methodIdentification() in CacheAspectSupport for removal #36575
- Add MockRestServiceServer#createServer variant for RestClient #36572
- Create RestClientXhrTransport variant replacing RestTemplateXhrTransport #36566
- Improve error handling in multipart codecs #36563
- Make ApplicationListenerMethodAdapter#getTargetMethod() public #36558
- ApiVersionConfigurer.setSupportedVersionPredicate() returns void instead of ApiVersionConfigurer #36551
- LazyConnectionDataSourceProxy does not work well with Hibernate's multi-tenancy by schema strategy #36527
- Add registerManagedResource variant with bean key argument to MBeanExporter #36520
- Handle blank Accept-Language header in AcceptHeaderLocaleResolver #36513
- Make AbstractStreamingClientHttpRequest and AbstractBufferingClientHttpRequest public #36501
- MySQL Error 149 (Galera/WSREP conflict) not translated to ConcurrencyFailureException in Spring JDBC/ORM #36499
- Add PreFlightRequestFilter #36482
- Support configuration of extension context scope for SpringExtension via Spring or JUnit properties #36460
- Lower log level of "Cache miss for REQUEST dispatch" in HandlerMappingIntrospector #36309
- WebDataBinder unnecessarily instantiates collections when using the "!" and "_" prefixes #36625
- Cache pollution from high-cardinality FieldError default messages in MessageSourceSupport #36609
- MergedAnnotation does not use ClassLoader for method or field #36606
- @Sql fails if DataSource is wrapped in a TransactionAwareDataSourceProxy #36611
- AnnotatedTypeMetadata no longer retains source declaration order on Java 24+ #36598
- MergedAnnotation.asMap() fails when an attribute references a non-existent class #36586
- FileSystemResource does not strictly follow the Resource#isReadable() contract #36584
- Converter overrides in HttpMessageConverters only apply when defaults are registered #36579
- Invalid method return type metadata for ClassFile variant on JDK 24+ #36577
- Fix Writer lifecycle for AbstractJsonHttpMessageConverter.writeInternal(Object, Type, Writer) #36565
- Flushing-related regression in SseServerResponse #36537
- LazyConnectionDataSourceProxy does not pass on holdability to target Connection #36528
- AnnotationBeanNameGenerator fails when an annotation references a non-existent class #36524
- Preserve default API version in RestClientAdapter #36514
- Inconsistent codings resolution in resource resolvers #36507
- DefaultJmsListenerContainer may hang in an endless loop in doShutdown #36506
- Query not hidden in DefaultClientResponse checkpoint #36502
- RestClient closes stream for ResponseEntity responses #36492
- IllegalStateException when using websocket handshake headers with Tomcat #36486
- Invalid nullness information for ParameterizedTypeReference #36477
- WebTestClient cannot assert null list elements #36476
- Handle Kotlin nullable value class param correctly in CoroutineUtils #36449
- Remove RFC 2047 encoding from Content-Disposition filename #36328
- Parent traceId is not reused when calling WebClient.awaitExchange function #36182
- Clarify semantics of HttpMethod.valueOf() #36652
- Document whitespace semantics in SpEL expressions #36628
- Document that spring.profiles.active is ignored by @ActiveProfiles #36600
- MergedAnnotation.asAnnotationAttributes() Javadoc incorrectly states that it creates an immutable map #36567
- Fix incorrect Javadoc in HandlerMethodReturnValueHandlerComposite regarding caching #36555
- Fix incorrect method name in TypeDescriptor.array() Javadoc #36549
- Introduce Kotlin examples for Bean Overrides (@MockitoBean, etc.) #36541
- Fix incorrect cross-reference links in AbstractEnvironment Javadoc #36516
- Document RetryTemplate#invoke variants in reference manual #36452
- Link observability section to Micrometer Observation Handler docs #34994
Thank you to all the contributors who worked on this release:
@Mohak-Nagaraju, @Sineaggi, @T45K, @angry-2k, @bebeis, @cookie-meringue, @dmitrysulman, @elgunshukurov, @itsmevichu, @junhyung8795, @msridhar, @nameearly, @tobifasc, and @xxxxxxjun
v6.2.18
- Improve SpringValidatorAdapter and MethodValidationAdapter performance #36624
- Add missing @Deprecated(forRemoval = true) for deleted in 7.0 #36591
- Deprecate methodIdentification() in CacheAspectSupport for removal #36576
- Improve error handling in multipart codecs #36564
- LazyConnectionDataSourceProxy does not work well with Hibernate's multi-tenancy by schema strategy #36529
- MySQL Error 149 (Galera/WSREP conflict) not translated to ConcurrencyFailureException in Spring JDBC/ORM #36510
- Handle Kotlin nullable value class param correctly in CoroutineUtils #36643
- NullPointerException in ServerSentEvent when trying to set id or event properties #36634
- @Sql fails if DataSource is wrapped in a TransactionAwareDataSourceProxy #36630
- WebDataBinder unnecessarily instantiates collections when using the "!" and "_" prefixes #36627
- Cache pollution from high-cardinality FieldError default messages in MessageSourceSupport #36623
- ContentCachingRequestWrapper does not allow unlimited content caching #36620
- MergedAnnotation does not use ClassLoader for method or field #36614
- AnnotationBeanNameGenerator fails when an annotation references a non-existent class #36588
- FileSystemResource does not strictly follow the Resource#isReadable() contract #36585
- Query not hidden in DefaultClientResponse checkpoint #36571
- LazyConnectionDataSourceProxy does not pass on holdability to target Connection #36530
- DefaultJmsListenerContainer may hang in an endless loop in doShutdown #36511
- Inconsistent codings resolution in resource resolvers #36508
- Clarify semantics of HttpMethod.valueOf() #36653
- Document that spring.profiles.active is ignored by @ActiveProfiles #36636
- Document whitespace semantics in SpEL expressions #36629
- MergedAnnotation.asAnnotationAttributes() Javadoc incorrectly states that it creates an immutable map #36568
- Introduce Kotlin examples for Bean Overrides (@MockitoBean, etc.) #36542
- Fix incorrect cross-reference links in AbstractEnvironment Javadoc #36517