JEP 430: String Templates (Preview)
Summary
Enhance the Java programming language with string templates. String templates complement Java's existing string literals and text blocks by coupling literal text with embedded expressions and template processors to produce specialized results. This is a preview language feature and API.
Goals
- Simplify the writing of Java programs by making it easy to express strings that include values computed at run time.
- Enhance the readability of expressions that mix text and expressions, whether the text fits on a single source line (as with string literals) or spans several source lines (as with text blocks).
- Improve the security of Java programs that compose strings from user-provided values and pass them to other systems (e.g., building queries for databases) by supporting validation and transformation of both the template and the values of its embedded expressions.
- Retain flexibility by allowing Java libraries to define the formatting syntax used in string templates.
- Simplify the use of APIs that accept strings written in non-Java languages (e.g., SQL, XML, and JSON).
- Enable the creation of non-string values computed from literal text and embedded expressions without having to transit through an intermediate string representation.
Non-Goals
- It is not a goal to introduce syntactic sugar for Java's string concatenation operator (+), since that would circumvent the goal of validation.
- It is not a goal to deprecate or remove the StringBuilder and StringBuffer classes, which have traditionally been used for complex or programmatic string composition.
Motivation
Developers routinely compose strings from a combination of literal text and expressions. Java provides several mechanisms for string composition, though unfortunately all have drawbacks.
- String concatenation with the + operator produces hard-to-read code:
  String s = x + " plus " + y + " equals " + (x + y);
- StringBuilder is verbose:
  String s = new StringBuilder()
      .append(x)
      .append(" plus ")
      .append(y)
      .append(" equals ")
      .append(x + y)
      .toString();
- String::format and String::formatted separate the format string from the parameters, inviting arity and type mismatches:
  String s = String.format("%2$d plus %1$d equals %3$d", x, y, x + y);
  String t = "%2$d plus %1$d equals %3$d".formatted(x, y, x + y);
- java.text.MessageFormat requires too much ceremony and uses an unfamiliar syntax in the format string:
  MessageFormat mf = new MessageFormat("{0} plus {1} equals {2}");
  String s = mf.format(new Object[] { x, y, x + y });
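As a rough, self-contained comparison, the four mechanisms listed above can be placed side by side. The class and method names below are illustrative assumptions, and Locale.ROOT is passed only to keep the output independent of the machine's default locale. Note that the indexed format string used in the list, %2$d plus %1$d, prints the operands in swapped order, which is the kind of mismatch that is easy to overlook.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Illustrative sketch; the class and method names are hypothetical.
class CompositionComparison {

    static String withConcat(int x, int y) {
        return x + " plus " + y + " equals " + (x + y);
    }

    static String withBuilder(int x, int y) {
        return new StringBuilder()
                .append(x)
                .append(" plus ")
                .append(y)
                .append(" equals ")
                .append(x + y)
                .toString();
    }

    // Same format string as in the text: argument indexes 2 and 1 are swapped,
    // so y is printed before x.
    static String withSwappedFormat(int x, int y) {
        return String.format(Locale.ROOT, "%2$d plus %1$d equals %3$d", x, y, x + y);
    }

    static String withMessageFormat(int x, int y) {
        MessageFormat mf = new MessageFormat("{0} plus {1} equals {2}", Locale.ROOT);
        // The instance method format(Object) takes the arguments as one Object[].
        return mf.format(new Object[] { x, y, x + y });
    }

    public static void main(String[] args) {
        System.out.println(withConcat(10, 20));
        System.out.println(withBuilder(10, 20));
        System.out.println(withSwappedFormat(10, 20));
        System.out.println(withMessageFormat(10, 20));
    }
}
```

With x = 10 and y = 20, three of the methods return "10 plus 20 equals 30", while withSwappedFormat returns "20 plus 10 equals 30".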
String interpolation
Many programming languages offer string interpolation as an alternative to string concatenation. Typically, this takes the form of a string literal that contains embedded expressions as well as literal text. Embedding expressions in situ means that readers can easily discern the intended result. At run time, the embedded expressions are replaced with their (stringified) values — the values are said to be interpolated into the string. Here are some examples of interpolation in other languages:
C# $"{x} plus {y} equals {x + y}"
Visual Basic $"{x} plus {y} equals {x + y}"
Python f"{x} plus {y} equals {x + y}"
Scala s"$x plus $y equals ${x + y}"
Groovy "$x plus $y equals ${x + y}"
Kotlin "$x plus $y equals ${x + y}"
JavaScript `${x} plus ${y} equals ${x + y}`
Ruby "#{x} plus #{y} equals #{x + y}"
Swift "\(x) plus \(y) equals \(x + y)"
Some of these languages enable interpolation for all string literals while others require interpolation to be enabled when desired, for example by prefixing the literal's opening delimiter with $ or f. The syntax of embedded expressions also varies but often involves characters such as $ or { }, which means that those characters cannot appear literally unless they are escaped.
Not only is interpolation more convenient than concatenation when writing code, it also offers greater clarity when reading code. The clarity is especially striking with larger strings. For example, in JavaScript:
const title = "My Web Page";
const text = "Hello, world";
var html = `<html>
<head>
<title>${title}</title>
</head>
<body>
<p>${text}</p>
</body>
</html>`;
String interpolation is dangerous
Unfortunately, the convenience of interpolation has a downside: It is easy to construct strings that will be interpreted by other systems but which are dangerously incorrect in those systems.
Strings that hold SQL statements, HTML/XML documents, JSON snippets, shell scripts, and natural-language text all need to be validated and sanitized according to domain-specific rules. Since the Java programming language cannot possibly enforce all such rules, it is up to developers using interpolation to validate and sanitize. Typically, this means remembering to wrap embedded expressions in calls to escape or validate methods, and relying on IDEs or static analysis tools to help to validate the literal text.
Interpolation is especially dangerous for SQL statements because it can lead to injection attacks. For example, consider this hypothetical Java code with the embedded expression ${name}:
String query = "SELECT * FROM Person p WHERE p.last_name = '${name}'";
ResultSet rs = connection.createStatement().executeQuery(query);
If name had the troublesome value
Smith' OR p.last_name <> 'Smith
then the query string would be
SELECT * FROM Person p WHERE p.last_name = 'Smith' OR p.last_name <> 'Smith'
and the code would select all rows, potentially exposing confidential information. Composing a query string with simple-minded interpolation is just as unsafe as composing it with traditional concatenation:
String query = "SELECT * FROM Person p WHERE p.last_name = '" + name + "'";
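The effect is easy to reproduce without a database. The sketch below (the class and method names are assumptions made for illustration) builds the query by concatenation and shows that the troublesome value rewrites the WHERE clause.

```java
// Illustrative sketch; InjectionDemo and naiveQuery are hypothetical names.
class InjectionDemo {

    // Composes the query exactly as the concatenation example does:
    // the value is pasted between the quotes with no escaping.
    static String naiveQuery(String name) {
        return "SELECT * FROM Person p WHERE p.last_name = '" + name + "'";
    }

    public static void main(String[] args) {
        String name = "Smith' OR p.last_name <> 'Smith";
        // The embedded quotes close the literal early and add a second condition.
        System.out.println(naiveQuery(name));
    }
}
```

Because the value supplies its own quote characters, the resulting statement contains two conditions joined by OR instead of a single comparison against one string.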
Can we do better?
For Java, we would like a string composition feature that offers the clarity of interpolation but produces a safer result out of the box, perhaps trading off a small amount of convenience to gain a large amount of safety.
For example, when composing SQL statements any quotes in the values of embedded expressions must be escaped, and the string overall must have balanced quotes. Given the troublesome value of name shown above, the query that should be composed is a safe one:
SELECT * FROM Person p WHERE p.last_name = 'Smith\' OR p.last_name <> \'Smith'
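A minimal sketch of such an escaping step follows, assuming the backslash-escaping convention used in this section. Many SQL dialects double the quote character instead, and production code would normally rely on parameterized statements such as java.sql.PreparedStatement. The names QuoteEscaping, escapeQuotes, and safeQuery are hypothetical.

```java
// Illustrative sketch only; not a complete SQL sanitizer.
class QuoteEscaping {

    // Escapes backslashes first, so that an existing backslash cannot
    // neutralize the escape added in front of a quote, then escapes quotes.
    static String escapeQuotes(String value) {
        return value.replace("\\", "\\\\").replace("'", "\\'");
    }

    // Wraps the escaped value in the single pair of quotes that the
    // literal text of the query provides.
    static String safeQuery(String name) {
        return "SELECT * FROM Person p WHERE p.last_name = '" + escapeQuotes(name) + "'";
    }

    public static void main(String[] args) {
        System.out.println(safeQuery("Smith' OR p.last_name <> 'Smith"));
    }
}
```

With the troublesome value, both embedded quotes are preceded by a backslash, so the whole value stays inside one string literal and the statement keeps a single condition.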
Almost every use of string interpolation involves structuring the string to fit some kind of template: A SQL statement usually follows the template SELECT ... FROM ... WHERE ..., an HTML document follows <html> ... </html>, and even a message in a natural language follows a template that intersperses dynamic values (e.g., a username) amongst literal text. Each kind of template has rules for validation and transformation, such as "escape all quotes" for SQL statements, "allow only legal character entities" for HTML documents, and "localize to the language configured in the OS" for natural-language messages.
Ideally a string's template could be expressed directly in the code, as if annotating the string, and the Java runtime would apply template-specific rules to the string automatically. The result would be SQL statements with escaped quotes, HTML documents with no illegal entities, and boilerplate-free message localization. Composing a string from a template would relieve developers of having to laboriously escape each embedded expression, call validate() on the whole string, or use java.util.ResourceBundle to look up a localized string.
For another example, we might construct a string denoting a JSON document and then feed it to a JSON parser in order to obtain a strongly-typed JSONObject:
String name = "Joan Smith";
String phone = "555-123-4567";
String address = "1 Maple Drive, Anytown";
String json = """
{
"name": "%s",
"phone": "%s",
"address": "%s"
}
""".formatted(name, phone, address);
JSONObject doc = JSON.parse(json);
... doc.entrySet().stream().map(...) ...
Ideally the JSON structure of the string could be expressed directly in the code, and the Java runtime would transform the string into a JSONObject automatically. The manual detour through the parser would not be necessary.
In summary, we could improve the readability and reliability of almost every Java program if we had a first-class, template-based mechanism for composing strings. Such a feature would offer the benefits of interpolation, as seen in other programming languages, but would be less prone to introducing security vulnerabilities. It would also reduce the ceremony of working with libraries that take complex input as strings.
Description
Template expressions are a new kind of expression in the Java programming language. Template expressions can perform string interpolation but are also programmable in a way that helps developers compose strings safely and efficiently. In addition, template expressions are not limited to composing strings — they can turn structured text into any kind of object, according to domain-specific rules.
Syntactically, a template expression resembles a string literal with a prefix. There is a template expression on the second line of this code:
String name = "Joan";
String info = STR."My name is \{name}";
assert info.equals("My name is Joan"); // true
The template expression STR."My name is \{name}" consists of:
- A template processor (STR);
- A dot character (U+002E), as seen in other kinds of expressions; and
- A template ("My name is \{name}") which contains an embedded expression (\{name}).
When a template expression is evaluated at run time, its template processor combines the literal text in the template with the values of the embedded expressions in order to produce a result. The result of the template processor, and thus the result of evaluating the template expression, is often a String, though not always.
The STR template processor
STR is a template processor defined in the Java Platform. It performs string interpolation by replacing each embedded expression in the template with the (stringified) value of that expression. The result of evaluating a template expression which uses STR is a String; e.g., "My name is Joan".
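To make the mechanics concrete, interpolation can be modeled in plain Java as alternating a list of literal fragments with a list of values. This is only a conceptual sketch: the class InterpolationSketch and its interpolate method are hypothetical, and the code is not the platform's implementation of STR.

```java
import java.util.List;

// Conceptual model of interpolation; all names here are hypothetical.
class InterpolationSketch {

    // A template with n embedded expressions has n + 1 literal fragments,
    // some of which may be empty strings.
    static String interpolate(List<String> fragments, List<?> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected one more fragment than values");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            sb.append(fragments.get(i));
            sb.append(values.get(i)); // append(Object) stringifies the value
        }
        sb.append(fragments.get(fragments.size() - 1));
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "Joan";
        // Models the template "My name is \{name}": two fragments, one value.
        System.out.println(interpolate(List.of("My name is ", ""), List.of(name)));
    }
}
```

Keeping fragments and values separate until the last step is what gives a processor the chance to validate or transform either of them before producing a result.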
In everyday conversation developers are likely to use the term "template" when referring either to the whole of a template expression, which includes the template processor, or to just the template part of a template expression, which is the argument to the template processor. This informal usage is reasonable as long as care is taken not to conflate these concepts.
STR is a public static final field that is automatically imported into every Java source file.
Here are more examples of template expressions that use the STR template processor. The symbol | in the left margin means that the line shows the value of the previous statement, similar to jshell.
// Embedded expressions can be strings
String firstName = "Bill";
String lastName = "Duck";
String fullName = STR."\{firstName} \{lastName}";
| "Bill Duck"
String sortName = STR."\{lastName}, \{firstName}";
| "Duck, Bill"
// Embedded expressions can perform arithmetic
int x = 10, y = 20;
String s = STR."\{x} + \{y} = \{x + y}";
| "10 + 20 = 30"
// Embedded expressions can invoke methods and access fields
String s = STR."You have a \{getOfferType()} waiting for you!";
| "You have a gift waiting for you!"
String t = STR."Access at \{req.date} \{req.time} from \{req.ipAddress}";
| "Access at 2022-03-25 15:34 from 8.8.8.8"
To aid refactoring, double-quote characters can be used inside embedded expressions without escaping them as \". This means that an embedded expression can appear in a template expression exactly as it would appear outside the template expression, easing the switch from concatenation (+) to template expressions. For example:
String filePath = "tmp.dat";
File file = new File(filePath);
String old = "The file " + filePath + " " + (file.exists() ? "does" : "does not") + " exist";
String msg = STR."The file \{filePath} \{file.exists() ? "does" : "does not"} exist";
| "The file tmp.dat does exist" or "The file tmp.dat does not exist"
To aid readability, an embedded expression can be spread over multiple lines in the source file without introducing newlines into the result. The value of the embedded expression is interpolated into the result at the position of the \ of the embedded expression; the template is then considered to continue on the same line as the \. For example:
String time = STR."The time is \{
// The java.time.format package is very useful
DateTimeFormatter
.ofPattern("HH:mm:ss")
.format(LocalTime.now())
} right now";
| "The time is 12:34:56 right now"
There is no limit to the number of embedded expressions in a string template expression. The embedded expressions are evaluated from left to right, just like the arguments in a method invocation expression. For example:
// Embedded expressions can be postfix increment expressions
int index = 0;
String data = STR."\{index++}, \{index++}, \{index++}, \{index++}";
| "0, 1, 2, 3"
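The same left-to-right rule can be observed today with an ordinary method invocation. The following sketch (the class and method names are illustrative assumptions) passes four postfix increments as arguments to String.join.

```java
// Illustrative sketch; EvaluationOrder and joinedIncrements are hypothetical names.
class EvaluationOrder {

    static String joinedIncrements() {
        int index = 0;
        // Java evaluates method arguments from left to right, so each
        // index++ yields the next value in sequence.
        return String.join(", ",
                String.valueOf(index++),
                String.valueOf(index++),
                String.valueOf(index++),
                String.valueOf(index++));
    }

    public static void main(String[] args) {
        System.out.println(joinedIncrements()); // prints "0, 1, 2, 3"
    }
}
```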
Any Java expression can be used as an embedded expression — even a template expression. For example:
// Embedded expression is a (nested) template expression
String[] fruit = { "apples", "oranges", "peaches" };
String s = STR."\{fruit[0]}, \{STR."\{fruit[1]}, \{fruit[2]}"}";
| "apples, oranges, peaches"
Here the template expression STR."\{fruit[1]}, \{fruit[2]}" is embedded in the template of another template expression. This code is difficult to read due to the abundance of ", \, and { } characters, so it is better to format it as:
String s = STR."\{fruit[0]}, \{
STR."\{fruit[1]}, \{fruit[2]}"
}";
Alternatively, since the embedded expression has no side effects, it could be refactored into a separate template expression:
String tmp = STR."\{fruit[1]}, \{fruit[2]}";
String s = STR."\{fruit[0]}, \{tmp}";
Multi-line template expressions
The template of a template expression can span multiple lines of source code, using a syntax similar to that of text blocks. (We saw an embedded expression spanning multiple lines above, but the template which contained the embedded expression was logically one line.)
Here are examples of template expressions denoting HTML text, JSON text, and a zone table, all spread over multiple lines:
String title = "My Web Page";
String text = "Hello, world";
String html = STR."""
<html>
<head>
<title>\{title}</title>
</head>
<body>
<p>\{text}</p>
</body>
</html>
""";
| """
| <html>
| <head>
| <title>My Web Page</title>
| </head>
| <body>
| <p>Hello, world</p>
| </body>
| </html>
| """
String name = "Joan Smith";
String phone = "555-123-4567";
String address = "1 Maple Drive, Anytown";
String json = STR."""
{
"name": "\{name}",
"phone": "\{phone}",
"address": "\{address}"
}
""";
| """
| {
| "name": "Joan Smith",
| "phone": "555-123-4567",
| "address": "1 Maple Drive, Anytown"
| }
| """
record Rectangle(String name, double width, double height) {
    double area() {
        return width * height;
    }
}
Rectangle[] zone = new Rectangle[] {
    new Rectangle("Alfa", 17.8, 31.4),
    new Rectangle("Bravo", 9.6, 12.4),
    new Rectangle("Charlie", 7.1, 11.23),
};
String table = STR."""
Description Width Height Area
\{zone[0].name} \{zone[0].width} \{zone[0].height} \{zone[0].area()}
\{zone[1].name} \{zone[1].width} \{zone[1].height} \{zone[1].area()}
\{zone[2].name} \{zone[2].width} \{zone[2].height} \{zone[2].area()}
Total \{zone[0].area() + zone[1].area() + zone[2].area()}
""";
| """
| Description Width Height Area
| Alfa 17.8 31.4 558.92
| Bravo 9.6 12.4 119.03999999999999
| Charlie 7.1 11.23 79.733
| Total 757.693
| """