JEP 138: Autoconf-Based Build System
Summary
Introduce autoconf (./configure-style) build setup, refactor the Makefiles to remove recursion, and leverage JEP 139: Enhance javac to Improve Build Speed.
Goals
The top level goals that we are trying to achieve are:
- Increase build speed radically
- Simplify build-system source code (Makefiles, etc.)
- Simplify work for developers
- Get exact and reproducible build output
- Simplify build-machine configurations (JPRT, etc.)
We will address these goals through four sub-projects, which are more or less tightly intertwined.
- Update the Makefile structure
- Use autoconf (configure script)
- Add parallel Java compilation support
- Make Java builds incremental
We need to properly understand existing developer workflows so that we can minimize the impact of this change on everyone.
This project is part of a larger effort to improve the build infrastructure of the JDK. We expect this project to be closely followed by the future steps. The distinction between these steps is somewhat arbitrary, and is made only to quickly benefit from the first work done on improving the JDK build infrastructure.
Non-Goals
Since we will update the Makefiles with a new structure, several issues that we want to address in the future might turn out to work just by themselves as an effect of the update. However, we are not specifically addressing these issues during this project, and we will not test them nor make any guarantee that they will work properly. (We will, however, try to make sure that we don't break anything that works.) These issues include:
- Make it easy to port to new platforms
- Make it possible to do JDK development without a network connection
- Provide proper support for cross-compilation, including compilation of 32-bit binaries on 64-bit hosts
- Improve the handling of warnings
We will also not address issues that are scheduled for future steps. (However, some of this work will lay the ground for these future improvements.) These issues include:
- Speed up Hotspot compilation
- Upgrade compilers
- Support IDE projects
- Reconsider the source-drop mechanism
Success Metrics
Build simplicity
Given that all prerequisites are available, building should be accomplished by:
- Getting the source code from the Mercurial repositories
- Running ./configure
- Running make
Build speed
Build speed depends on hardware factors, and improvements will vary. Our target is compiling on Linux on an 8-way machine. In this case, the time spent building the JDK after our improvements should be at most 33% of the current time. (Typically this means going from ~15 minutes to ~5 minutes, or less). A stretch goal is that build time should be at most 20% (~3 minutes) for the JDK.
Note that this is just for the JDK. It does not include building Hotspot, nor creating Javadoc.
Makefile cleanup
All small (<3 kB) recursive Makefiles in the JDK (not including Hotspot) should be removed, and the functionality collected into central Makefiles. (A small number of Makefiles is not in itself a goal; however, having the code in one or a few places helps with overview and understanding.)
Motivation
Building the complete JDK is unnecessarily slow. This puts an extra burden on developers and build systems. As a result, developers check out and build just a part of the source code, since the product as a whole takes too long to build.
The current implementation of the build system, with more than 350 minimal, recursive Makefiles scattered all around the product, makes it hard to make changes to the build system. The current solution also sometimes requires updating Makefiles just to add new source files or directories; this should not be needed.
Today the build system is configured by using several environment variables. This is in contrast to the popular method of using ./configure to set up the build system. Apart from familiarity, a configure script has several benefits over environment variables. Arguments to configure are checked: a misspelled argument results in an error, whereas a misspelled environment variable is just ignored. Running ./configure --help shows a list of available arguments, whereas it is almost impossible to get a comprehensive list of all environment variables that affect the current build system.
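The difference in error handling can be illustrated with a minimal shell sketch. The option name `--with-example-dir` and the function `parse_args` are hypothetical and chosen only for illustration; a generated configure script handles arguments differently in detail.

```sh
# Hypothetical sketch: reject unknown arguments instead of ignoring them.
# --with-example-dir is an invented option, not a real OpenJDK configure flag.
parse_args() {
  example_dir=""
  for arg in "$@"; do
    case "$arg" in
      --with-example-dir=*) example_dir=${arg#--with-example-dir=} ;;
      --help) echo "Usage: parse_args [--with-example-dir=DIR]"; return 0 ;;
      *) echo "error: unrecognized option: $arg" >&2; return 1 ;;
    esac
  done
  echo "example_dir=$example_dir"
}
```

With this sketch, a misspelled `--with-exmaple-dir=/tmp/x` makes the function return a non-zero status, whereas a misspelled environment variable would simply never be read.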
Description
These changes will not result in any changes in the built product; they only affect the internal development process.
Update the Makefile structure
Background
Updating the old Makefiles to a new, simplified architecture will be fundamental for all other work described here.
Implementation
The current style of recursive makefiles with one file per directory will be removed. Instead, the makefiles will discover files to be compiled by looking recursively into source-code directories. Files that should not be compiled will instead be listed as explicit exclusions. This will be needed to be able to use the new parallel javac compiler.
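The discovery-plus-exclusion idea can be sketched in shell as follows. This is only an illustration under assumptions: the JEP implements discovery in the Makefiles themselves, and the function name `discover_sources` and the exclusion-file format (one relative path per line) are hypothetical.

```sh
# Hypothetical sketch: list every .java file below a source root,
# minus the paths named in an exclusion list (one relative path per line).
# grep flags: -v invert, -x whole-line match, -F fixed strings, -f pattern file.
discover_sources() {
  src_dir=$1
  exclude_file=$2
  (cd "$src_dir" && find . -type f -name '*.java' | sed 's|^\./||' | sort) |
    grep -v -x -F -f "$exclude_file"
}
```

Adding a new source file then requires no change to the build description; only files that must not be compiled need an entry in the exclusion list.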
Code common to several subsystems will be stored in a new, top-level directory "common/make". The design idea is that these common files will provide a library with helper functions, so that the per-subsystem Makefiles can be written as simply and cleanly as possible. We will accept a greater code complexity in these libraries if it allows for increased simplicity of the per-subsystem Makefiles.
Since good coding practice is not automatically enforced by the Makefile syntax, we will take extra care to make sure we write proper and readable code.
As part of the update, we will produce a document describing the coding guidelines we have found useful and have followed during the rewrite process, so as to guide future changes in the Makefiles. We will also produce a document describing the overall architecture of the Makefiles.
The Makefiles do other things apart from building the resulting binary, such as building unusual variants of the binaries. Some of these targets appear arcane and no longer used. If all stakeholders agree, we will not port such targets to the new system. This is a list of features we are so far considering removing:
- (currently empty)
Mixing old and new
It is probably possible to keep the old Makefile system around, in parallel with the new rewritten Makefile system, so we have two ways of building the product (new and old) for some time. This is not really desirable, since it risks leading to code duplication and general confusion, and will make us miss out on the benefits of removing the old system. However, keeping the old system, or having an easy way of restoring it, would help us manage the risk involved.
Transition
Most developers will not have much interaction with the actual makefiles, so there will not be any large changes in workflow.
Previously, the Makefiles sometimes needed to be updated whenever source files or directories were added or deleted. This will no longer be needed, and this needs to be communicated to all developers.
Developers who want to change the actual Makefiles need to understand the overall design and coding principles used. This will be documented, but the existence of these documents needs to be communicated.
Use autoconf (configure script)
Background
The basic idea behind autoconf is that a single, simple interface will handle the "glue" issues between a user's system's configuration and the requirements of the Makefiles. This interface is the ./configure shell script.
Using autoconf thus has two facets: creating and using the ./configure script. The configure script is generated by the autoconf tools from the source code in configure.ac (and accompanying helper files), which is written using M4 macros. This script, even though it is generated, is checked into the repository. Whenever the configure.ac source code is changed, the configure script needs to be regenerated and updated in the repository. To regenerate configure, the autoconf tools need to be installed on the system.
The typical user, however, will not need to do this. Since configure is checked in, they only need to run ./configure, which does not require the autoconf tools. Running it produces a config.spec file in Makefile syntax, which determines the build details and is included by the Makefile.
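As a rough illustration of that last step, the sketch below writes a small file in Makefile syntax. The function `write_spec` and the variable names `EXAMPLE_PLATFORM` and `EXAMPLE_JOBS` are hypothetical; the actual contents of config.spec are not specified in this text.

```sh
# Hypothetical sketch: emit build settings as Makefile variable assignments.
# The variable names are invented for illustration.
write_spec() {
  spec_file=$1
  platform=$2
  jobs=$3
  {
    echo "# Generated file; do not edit."
    echo "EXAMPLE_PLATFORM := $platform"
    echo "EXAMPLE_JOBS := $jobs"
  } > "$spec_file"
}
```

A Makefile could then pick up these values with make's `include` directive, for example `include config.spec`, so the Makefile itself never needs to probe the system.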
Autoconf implementation
The configure script has three major tasks:
- Determine that all build dependencies are present.
- Analyze known differences between platforms and determine which applies in the current situation.
- Apply the arguments given by the user to specialize the build.
Even though the autoconf framework helps with all of these tasks, they must all be explicitly coded with knowledge about the specifics of OpenJDK. This means that we need to be clear about what build dependencies we actually have, what differences need to be determined, and in what ways the user can influence the build result.
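The first two tasks can be sketched in plain shell. This is a simplified illustration, not what autoconf generates: the function names `require_tool` and `classify_platform` and the platform labels are assumptions made for this example.

```sh
# Task 1 (sketch): verify that a required tool is on PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "checking for $1... yes"
  else
    echo "checking for $1... no" >&2
    return 1
  fi
}

# Task 2 (sketch): map the kernel name reported by `uname -s` to a label.
# The labels are illustrative, not the names OpenJDK uses.
classify_platform() {
  case "$1" in
    Linux)  echo "linux" ;;
    Darwin) echo "macosx" ;;
    SunOS)  echo "solaris" ;;
    *)      echo "unknown" ;;
  esac
}
```

A caller might combine them as `require_tool make && classify_platform "$(uname -s)"`, stopping early when a dependency is missing.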
The build dependencies have previously been described in the README file.
The known differences have previously been encoded in the Makefiles, or have been "common knowledge".
The user influences have historically been expressed through environment variables, and the checks for these have been in the Makefiles.
The configure script can work like a "wrapper" for the old Makefiles, and set up the same variables in config.spec as the Makefiles have been using. In this case, it will be almost transparent to the Makefiles that the variables came from the configure script instead of the user. However, in many cases a better solution is probably to output a "cleaner" variable, and rewrite the corresponding parts of the Makefiles.
Legal status
As part of using autoconf, we need to include three files from autoconf in the JDK 8 source repository. The three files are pkg.m4, config.guess and config.sub. Legal clearance for inclusion of these files in the OpenJDK has been requested. We believe this should not be a problem, since the autoconf license is explicitly written to support this use case (basically allowing us to distribute them any way we like, as long as they are used as part of a configure script).
Transition
The current workflow when building OpenJDK is basically:
1. Retrieve source code from repository
2. Set up a slew of environment variables
3. Run make
4. Repeat 2 and 3 each time a rebuild is needed
Many team members have created personal shell scripts and similar solutions to help with this.
The new workflow using configure scripts will instead lead to:
1. Retrieve source code from repository
2. Run ./configure, possibly with specializing arguments
3. Run make
4. Repeat 3 each time a rebuild is needed
Since step 3 is so easy, no shell scripts will be needed to rebuild. However, if the user had heavily specialized their setup, they might want to create scripts to help them run configure with the correct arguments.
We should provide a translation table from old environment variables to new configure arguments.
Discussion: Maybe we should check for some commonly used old-style environment variables when running configure/make, and alert the user?
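One way such a check could look is sketched below. Everything named here is hypothetical: the table entries `ALT_EXAMPLE_DIR` and `EXAMPLE_DEBUG`, their suggested replacements, and the function `warn_legacy_vars` are invented for illustration and do not describe an agreed design.

```sh
# Hypothetical translation table: "OLD_VARIABLE:new-configure-argument".
# Both columns are invented for illustration.
LEGACY_TABLE="ALT_EXAMPLE_DIR:--with-example-dir EXAMPLE_DEBUG:--enable-example-debug"

# Print a warning for every old-style variable that is currently set.
warn_legacy_vars() {
  for entry in $LEGACY_TABLE; do
    var=${entry%%:*}
    arg=${entry#*:}
    # Indirect lookup: expands to "yes" only if the named variable is set.
    eval "is_set=\${$var+yes}"
    if [ "$is_set" = "yes" ]; then
      echo "warning: $var is set but ignored; use $arg instead" >&2
    fi
  done
}
```

Keeping the table as data means the same list could also be printed as the translation table for developers.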
Speed up javac using server mode supporting parallel compilation
For JEP 139: Enhance javac to Improve Build Speed we will write an extension to javac which will support parallel compilation. To use this, we must add support for it in the makefiles.
Transition
Switching the Java compilation to using the javac server will result in no noticeable impact for the developer (apart from the major speedup, of course). No transition plan is needed.
Make Java builds incremental by enhancing javac with dependency output
Make has the ability to do incremental builds, that is, to recompile just a subset of all files when a change has been made. Ideally, this subset should be the minimal subset needed. For this to work, make needs to have dependency information available in a format that it can use.
For JEP 139: Enhance javac to Improve Build Speed we will write an extension to javac which will allow for incremental builds of Java code. To use this, we must add support for it in the makefiles.
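To make the notion of a minimal subset concrete, the sketch below computes which files need recompiling when one file changes, given a list of direct dependencies. It does not describe how the JEP 139 extension works; the input format (one "dependent dependency" pair per line) and the function name `affected_by` are assumptions made for this example.

```sh
# Hypothetical sketch: given "dependent dependency" pairs, list every file
# that must be recompiled when one file changes (the file itself plus
# everything that depends on it, directly or indirectly).
# File names must not contain whitespace.
affected_by() {
  deps_file=$1
  todo=$2
  seen=""
  while [ -n "$todo" ]; do
    # Pop the first name off the work list.
    set -- $todo
    current=$1
    shift
    todo="$*"
    # Skip names that were already processed (guards against cycles).
    case " $seen " in
      *" $current "*) continue ;;
    esac
    seen="$seen $current"
    # Queue every file that depends directly on the current one.
    while read -r dependent dependency; do
      if [ "$dependency" = "$current" ]; then
        todo="$todo $dependent"
      fi
    done < "$deps_file"
  done
  printf '%s\n' $seen | sort
}
```

For a dependency file containing `B.java A.java` and `C.java B.java`, a change to `A.java` yields all three files, while unrelated files are left alone.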
Transition
The incremental build will be available to developers without any specific action. In theory, the only noticeable difference for the developer should be the increase in speed when doing recompilations. However, if the dependency generation fails or gets confused, the build might be incorrect and a full rebuild will be needed. This is very unlikely to happen; however, it will be useful to inform developers of this potential problem and of how to do a full rebuild.
Also, compilation speed will now be correlated with the complexity of the source code dependencies. Informing the developers about this might add an incentive to write good code with less far-fetched dependencies.
Alternatives
Instead of making javac properly parallel, we could start several single-threaded compilations of different and independent Java packages in parallel. This would not require any changes to javac, but it would be much harder to get the Makefiles correct, and it would not give as much speed improvement.
We could have skipped rewriting the Makefiles, but to introduce these kinds of changes without properly cleaning up the Makefiles first would have been a daunting and time-consuming task.
Testing
Since we will not change the resulting binary, we don't need to add or change any tests of the product itself.
However, we should make sure we deliver on the promise of not changing the resulting binary. As part of this project, we should create a build comparison tool, which can compare the build result from the old system with the build result from the new system, on all relevant aspects. This is a harder problem than it sounds, since two subsequent builds, even with the same build system, will not be bitwise identical, due to transient and irrelevant factors. To be useful, such a tool needs to ignore such irrelevant aspects, and focus on what should not change.
This tool should be run for a variety of platforms and build types, comparing the old and the new system.
This tool can also be used to test that incremental builds are identical to full rebuilds.
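The core idea of such a comparison, normalizing away known transient content before comparing, can be sketched as follows. This is a toy illustration under strong assumptions: it handles only text files, and the `Build-Date:` marker and the function names `normalize` and `compare_builds` are invented for this example. A real tool would also need to handle binaries, archives and other formats.

```sh
# Hypothetical sketch: compare two build output trees, ignoring one kind of
# transient difference (lines starting with the invented marker "Build-Date:").
# Only suitable for text files whose names contain no whitespace.
normalize() {
  sed '/^Build-Date:/d' "$1"
}

compare_builds() {
  old_dir=$1
  new_dir=$2
  status=0
  # First check that both trees contain the same set of files.
  old_list=$(cd "$old_dir" && find . -type f | sort)
  new_list=$(cd "$new_dir" && find . -type f | sort)
  if [ "$old_list" != "$new_list" ]; then
    echo "file lists differ"
    return 1
  fi
  # Then compare each file after removing the transient lines.
  for rel in $old_list; do
    if [ "$(normalize "$old_dir/$rel")" != "$(normalize "$new_dir/$rel")" ]; then
      echo "content differs: $rel"
      status=1
    fi
  done
  return $status
}
```

The function returns zero when the two trees match after normalization, so it could be used to compare old against new build systems or incremental against full builds.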
Discussion: Ideally, the build system should be tested just as thoroughly as the resulting product. Unfortunately, no such framework for testing the build system exists, and creating a proper testing framework is most likely outside the scope of this effort.
Discussion: We should examine the possibilities of adding at least some kind of basic testing of the Makefiles. Testing incremental builds with specially crafted, "evil" dependencies could be one kind of test to add. Is there an existing javac test suite to add such tests to?
Risks and Assumptions
Removing non-build items
- Risk: By mistake remove support for workflow or process needed by some groups
- Mitigation plan: Communicate with all groups, gather requirements
- Contingency plan: Immediately re-implement support for workflow
Problems on rare platforms
- Risk: In some rare circumstances, the new build system will not work
- Mitigation plan: Test many scenarios (different hardware and software, for different groups) before deploying; make sure we can use both new and old system in parallel if needed
- Contingency plan: Keep old system so both systems can be used in parallel
Resulting product is incorrect
- Risk: Build changes cause incorrect bits to be built
- Mitigation plan: Test resulting build properly
- Contingency plan: Keep old system and use it instead until problem is solved
Dependences
As noted, this JEP depends upon JEP 139: Enhance javac to Improve Build Speed.
This JEP will make heavy changes to code which is also modified by the BSD/Mac OS X port. The build changes are likely to arrive in JDK 8 before that project, so we will have to take care of the changes they introduced. However, most changes will be related to Hotspot, which we are not considering in this project.
Future JEPs will build upon this JEP to improve the HotSpot and Javadoc build processes.
Impact
The impact of this change on the actual resulting product is minimal.
- Compatibility: The way the product is built will be different. Existing personal or group build scripts will not work without modification.
- Portability: We must make sure that the new build system works properly on all supported platforms. If possible, it should be written so as to minimize porting efforts when porting to new systems.
- Documentation: Existing documentation (like the build README) needs to be updated.