JEP 353: Reimplement the Legacy Socket API
Summary
Replace the underlying implementation used by the java.net.Socket and java.net.ServerSocket APIs with a simpler and more modern implementation that is easy to maintain and debug. The new implementation will be easy to adapt to work with user-mode threads, a.k.a. fibers, currently being explored in Project Loom.
Motivation
The java.net.Socket and java.net.ServerSocket APIs, and their underlying implementations, date back to JDK 1.0. The implementation is a mix of legacy Java and C code that is painful to maintain and debug. The implementation uses the thread stack as the I/O buffer, an approach that has required increasing the default thread stack size on several occasions. The implementation uses a native data structure to support asynchronous close, a source of subtle reliability and porting issues over the years. The implementation also has several concurrency issues that require an overhaul to address properly. In the context of a future world of fibers that park instead of blocking threads in native methods, the current implementation is not fit for purpose.
Description
The java.net.Socket and java.net.ServerSocket APIs delegate all socket operations to a java.net.SocketImpl, a Service Provider Interface (SPI) mechanism that has existed since JDK 1.0. The built-in implementation is termed the “plain” implementation, implemented by the non-public PlainSocketImpl with supporting classes SocketInputStream and SocketOutputStream. PlainSocketImpl is extended by two other JDK-internal implementations that support connections through SOCKS and HTTP proxy servers. By default, a Socket and a ServerSocket are each created (sometimes lazily) with a SOCKS-based SocketImpl. In the case of ServerSocket, the use of the SOCKS implementation is an oddity that dates back to experimental (and since removed) support for proxying server connections in JDK 1.4.
The new implementation, NioSocketImpl, is a drop-in replacement for PlainSocketImpl. It is developed to be easy to maintain and debug. It shares the same JDK-internal infrastructure as the New I/O (NIO) implementation so it doesn't need its own native code. It integrates with the existing buffer cache mechanism so that it doesn't need to use the thread stack for I/O. It uses java.util.concurrent locks rather than synchronized methods so that it can play well with fibers in the future. In JDK 11, the NIO SocketChannel and the other SelectableChannel implementations were mostly re-implemented with the same goal in mind.
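The choice of java.util.concurrent locks over synchronized methods can be illustrated with a minimal sketch. The class below is hypothetical and is not the JDK's NioSocketImpl; it reads from an in-memory byte array rather than a socket, and only shows the locking pattern of guarding a read with an explicit ReentrantLock.

```java
import java.util.Objects;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration only; LockedReader is not a JDK class. */
final class LockedReader {
    // An explicit lock is used where a synchronized method might otherwise appear.
    private final ReentrantLock readLock = new ReentrantLock();
    private final byte[] data;
    private int pos;

    LockedReader(byte[] data) {
        this.data = data.clone();
    }

    /** Copies up to len bytes into dst, or returns -1 once the data is exhausted. */
    int read(byte[] dst, int off, int len) {
        Objects.checkFromIndexSize(off, len, dst.length);
        readLock.lock();
        try {
            if (pos >= data.length) {
                return -1;
            }
            int n = Math.min(len, data.length - pos);
            System.arraycopy(data, pos, dst, off, n);
            pos += n;
            return n;
        } finally {
            // Always release in finally so an exception cannot leave the lock held.
            readLock.unlock();
        }
    }

    public static void main(String[] args) {
        LockedReader reader = new LockedReader(new byte[] {1, 2, 3});
        byte[] buf = new byte[2];
        int n;
        while ((n = reader.read(buf, 0, buf.length)) != -1) {
            System.out.println("read " + n + " byte(s)");
        }
    }
}
```

The lock/try/finally shape is the standard idiom for ReentrantLock; how the real implementation structures its locks is not specified in this text.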
The following are a few points about the new implementation:
- SocketImpl is a legacy SPI mechanism and is very under-specified. The new implementation attempts to be compatible with the old implementation by emulating unspecified behavior and exceptions where applicable. The Risks and Assumptions section below details the behavior differences between the old and new implementations.
- Socket operations using timeouts (connect, accept, read) are implemented by changing the socket to non-blocking mode and polling the socket.
- The java.lang.ref.Cleaner mechanism is used to close sockets when the SocketImpl is garbage collected and the socket has not been explicitly closed.
- Connection reset handling is implemented in the same way as the old implementation so that attempts to read after a connection reset will fail consistently.
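The Cleaner point in the list above can be sketched with the public java.lang.ref.Cleaner API. The class names here are hypothetical and the "resource" is just a counter; this is an illustration of the general pattern, not the code used by the new implementation.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; TrackedResource is not a JDK class. */
final class TrackedResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action is a static nested class so that it holds no reference
    // to the TrackedResource; otherwise the object could never become unreachable.
    private static final class ReleaseAction implements Runnable {
        private final AtomicInteger releaseCount;

        ReleaseAction(AtomicInteger releaseCount) {
            this.releaseCount = releaseCount;
        }

        @Override
        public void run() {
            // Stands in for closing an underlying socket or file descriptor.
            releaseCount.incrementAndGet();
        }
    }

    private final Cleaner.Cleanable cleanable;

    TrackedResource(AtomicInteger releaseCount) {
        // The action runs when this object becomes phantom reachable,
        // unless clean() has already run it.
        this.cleanable = CLEANER.register(this, new ReleaseAction(releaseCount));
    }

    @Override
    public void close() {
        // Explicit close runs the action now; Cleanable.clean() runs it at most once.
        cleanable.clean();
    }

    public static void main(String[] args) {
        AtomicInteger released = new AtomicInteger();
        try (TrackedResource resource = new TrackedResource(released)) {
            System.out.println("using resource");
        }
        System.out.println("release count: " + released.get());
    }
}
```

Explicit close remains the deterministic path; the Cleaner only acts as a safety net, and when it runs depends on the garbage collector.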
ServerSocket is modified to use NioSocketImpl (or PlainSocketImpl) by default. It no longer uses the SOCKS implementation.
The SocketImpl implementations to support SOCKS and HTTP proxy servers are modified to delegate so they can work with the old and new implementations.
The instrumentation support for socket I/O in Java Flight Recorder is modified to be independent of the SocketImpl so that socket I/O events can be recorded when running with either the new, old, or custom implementations.
To reduce the risk of switching the implementation after more than twenty years, the old implementation will not be removed. The old implementation will remain in the JDK and a system property will be introduced to configure the JDK to use the old implementation. The JDK-specific system property to switch to the old implementation is jdk.net.usePlainSocketImpl. If it is set without a value, or set to the value true, at startup, then the old implementation will be used. Some future release will remove PlainSocketImpl and the system property.
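As a rough illustration of the rule just described, the sketch below interprets a property value the way the text reads: a property given with no value (which the java launcher exposes as an empty string) or with the value true selects the old implementation. The class and method names are hypothetical, and the JDK's actual parsing may differ in details such as case sensitivity or how other values are treated.

```java
/** Hypothetical illustration only; SocketImplSwitch is not a JDK class. */
final class SocketImplSwitch {
    static final String PROPERTY = "jdk.net.usePlainSocketImpl";

    /**
     * Returns true if the given property value selects the old implementation
     * under the reading described above. A null value means the property is unset.
     */
    static boolean selectsOldImplementation(String value) {
        // Assumption: -Djdk.net.usePlainSocketImpl with no value yields "".
        return value != null && (value.isEmpty() || value.equalsIgnoreCase("true"));
    }

    public static void main(String[] args) {
        String value = System.getProperty(PROPERTY);
        String choice = selectsOldImplementation(value) ? "old" : "new";
        System.out.println(PROPERTY + " selects the " + choice + " implementation");
    }
}
```

Running the class with and without -Djdk.net.usePlainSocketImpl on the command line shows which branch the sketch takes; it does not change which SocketImpl the JDK itself uses.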
This JEP does not propose to provide an alternative implementation of DatagramSocketImpl at this time (DatagramSocketImpl is the underlying implementation that instances of java.net.DatagramSocket delegate to). The built-in default implementation (PlainDatagramSocketImpl) is a maintenance (and porting) burden and may be the subject of another JEP.
Testing
The existing tests in the jdk/jdk repository will be used to test the new implementation. The jdk_net test group has accumulated many tests for networking corner case scenarios over the years. Some of the tests in this test group will be modified to run twice, the second time with -Djdk.net.usePlainSocketImpl to ensure that the old implementation does not bit-rot during the time that the JDK includes both implementations.
A lot of code today makes direct or indirect use of libraries that use APIs defined in java.nio.channels rather than the java.net.Socket and java.net.ServerSocket APIs. Every effort will be made to create awareness of the proposal and encourage developers that have code using Socket and ServerSocket to test their code with the early-access builds that are published on jdk.java.net or elsewhere.
The microbenchmarks in the jdk/jdk repository include benchmarks for socket read/write and streaming. These benchmarks have been improved to make it easy to compare the old and new implementations. As things stand, the new implementation is about the same or 1-3% better than the old implementation on the socket read/write tests.
Risks and Assumptions
The primary risk of this proposal is that there is existing code that depends on unspecified behavior in corner cases where the old and new implementations behave differently. The differences that have been identified so far are listed here; all but the first two can be mitigated by running with -Djdk.net.usePlainSocketImpl.
- The InputStream and OutputStream returned by PlainSocketImpl's getInputStream() and getOutputStream() methods extend java.io.FileInputStream and java.io.FileOutputStream respectively. It is possible, but unlikely, that there is existing code that depends on this.
- A ServerSocket using a custom SocketImpl cannot accept connections that return a Socket with a platform SocketImpl. Similarly, a ServerSocket using the platform SocketImpl cannot accept connections that return a Socket with a custom SocketImpl.
- The InputStream and OutputStream returned by the old implementation test the stream for EOF and return -1 before other checks. The new implementation does null and bounds checks before checking if the stream is at EOF. It is possible, but unlikely, that there is fragile code that will trip up due to the order of the checks.
- Closing a Socket with unread bytes in the receive queue will close the underlying socket gracefully. On platforms other than Microsoft Windows, the same scenario with the old implementation will lead to abortive/hard close.
- Oracle Solaris specific: Oracle Solaris differs from other platforms in the way that it reports “connection reset” to applications. For example, calls to setsockopt or ioctl can fail when there is a network error. The xnet_skip_checks setting can be configured in /etc/system to disable this behavior (echo "xnet_skip_checks/W 1" | mdb -kw on a live system). The old implementation handles the case where ioctl(FIOREAD) fails so that attempts to read after available fails with a “connection reset” will fail consistently. This is fragile and unmaintainable; the new implementation does not attempt to emulate this behavior.
- Oracle Solaris specific: Oracle Solaris does not allow the IPV6_TCLASS socket option to be changed on a TCP socket after it is connected. The old implementation masks this by caching the value specified to the setTrafficClass method.
- The java.net package defines many sub-classes of SocketException. The new implementation will attempt to throw the same specific SocketException as the old implementation but there may be cases where they are not the same. Furthermore, there may be cases where the exception messages differ. On Microsoft Windows for example, the old implementation maps Windows Socket error codes to English-only messages, while the new implementation uses the system messages.
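The difference in check ordering described in the third item of the list above can be made concrete with a small model. The two methods below are hypothetical stand-ins, not JDK code: each takes a flag saying whether the stream is at EOF and differs only in the order of the EOF test and the argument checks.

```java
import java.util.Objects;

/** Hypothetical illustration only; CheckOrder is not a JDK class. */
final class CheckOrder {
    /** Models the old ordering: test for EOF first, then validate arguments. */
    static int readEofFirst(boolean atEof, byte[] b, int off, int len) {
        if (atEof) {
            return -1;
        }
        Objects.checkFromIndexSize(off, len, b.length);
        return len; // pretend len bytes were read
    }

    /** Models the new ordering: null and bounds checks first, then EOF. */
    static int readChecksFirst(boolean atEof, byte[] b, int off, int len) {
        // Dereferencing b throws NullPointerException for a null array;
        // the range check throws IndexOutOfBoundsException for bad bounds.
        Objects.checkFromIndexSize(off, len, b.length);
        if (atEof) {
            return -1;
        }
        return len; // pretend len bytes were read
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        // Out-of-range length on a stream that is already at EOF.
        System.out.println("EOF first: " + readEofFirst(true, buf, 0, 8));
        try {
            readChecksFirst(true, buf, 0, 8);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("checks first: " + e.getClass().getSimpleName());
        }
    }
}
```

Code that passes invalid arguments at EOF sees -1 under the first ordering and an exception under the second, which is the kind of fragile dependence the item warns about.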
Aside from behavioral differences, the performance of the new implementation may differ from the old when running certain workloads. In the old implementation, several threads calling the accept method on a ServerSocket will queue in the kernel. In the new implementation, one thread will block in the accept system call while the others queue waiting to acquire a java.util.concurrent lock. Performance characteristics may differ in other scenarios too.
Finally, there may be instrumentation agents or tools that instrument the non-public java.net.SocketInputStream and java.net.SocketOutputStream classes to get I/O events. These classes are not used by the new implementation.