8.5. Controlling the Contents of an Assembly
In theory, id and format are the only absolute requirements for a
valid assembly descriptor; however, many assembly archivers will fail
if they do not have at least one file to include in the output
archive. The task of defining the files to be included in the assembly
is handled by the five main sections of the assembly descriptor:
files , fileSets , dependencySets , repositories , and
moduleSets . To explore these sections most effectively, we’ll start
by discussing the most elemental section: files . Then, we’ll move on
to the two most commonly used sections, fileSets and
dependencySets . Once you understand the workings of fileSets and
dependencySets , it’s easier to understand repositories and
moduleSets .
The files section is the simplest part of the assembly descriptor;
it is designed for files that have a definite location relative to
your project’s directory. Using this section, you have absolute
control over the exact set of files that are included in your
assembly, exactly what they are named, and where they will reside in
the archive.
Including a JAR file in an Assembly using files .
<assembly>
...
<files>
<file>
<source>target/my-app-1.0.jar</source>
<outputDirectory>lib</outputDirectory>
<destName>my-app.jar</destName>
<fileMode>0644</fileMode>
</file>
</files>
...
</assembly>
Assuming you were building a project called my-app with a version of
1.0 , Including a JAR file in an Assembly using files would include your project’s JAR in the
assembly’s lib/ directory, trimming the version from the file name
in the process so the final file name is simply my-app.jar. It would
then make the JAR readable by everyone and writable by the user that
owns it (this is what the mode 0644 means for files, using Unix
four-digit octal permission notation). For more information about the
format of the value in fileMode , please see Wikipedia’s explanation
of four-digit octal notation.
You could build a very complex assembly using file entries, if you
knew the full list of files to be included. Even if you didn’t know
the full list before the build started, you could probably use a
custom Maven plugin to discover that list and generate the assembly
descriptor using references like the one above. While the files
section gives you fine-grained control over the permission, location,
and name of each file in the assembly archive, listing a file
element for every file in a large archive would be a tedious
exercise. For the most part, you will be operating on groups of files
and dependencies using fileSets . The remaining four file-inclusion
sections are designed to help you include entire sets of files that
match particular criteria.
Similar to the files section, fileSets are intended for files that
have a definite location relative to your project’s directory
structure. However, unlike the files section, fileSets describe
sets of files, defined by file and path patterns they match (or don’t
match), and the general directory structure in which they are
located. The simplest fileSet just specifies the directory where the
files are located:
<assembly>
...
<fileSets>
<fileSet>
<directory>src/main/java</directory>
</fileSet>
</fileSets>
...
</assembly>
This file set simply includes the contents of the src/main/java
directory from our project. It takes advantage of many default
settings in the section, so let’s discuss those briefly.
First, you’ll notice that we haven’t told the file set where within
the assembly matching files should be located. By default, the
destination directory (specified with outputDirectory ) is the same
as the source directory (in our case, src/main/java). Additionally,
we haven’t specified any inclusion or exclusion file patterns. When
these are empty, the file set assumes that all files within the source
directory are included, with some important exceptions. The exceptions
to this rule pertain mainly to source-control metadata files and
directories, and are controlled by the useDefaultExcludes flag,
which is defaulted to true . When active, useDefaultExcludes will
keep directories like .svn/ and CVS/ from being added to the
assembly archive. Section 8.5.3, “Default Exclusion Patterns for
fileSets” provides a detailed list of the default exclusion patterns.
If we want more control over this file set, we can specify it more
explicitly. Including Files with fileSet shows a fileSet element with all
of the default elements specified.
Including Files with fileSet .
<assembly>
...
<fileSets>
<fileSet>
<directory>src/main/java</directory>
<outputDirectory>src/main/java</outputDirectory>
<includes>
<include>**</include>
</includes>
<useDefaultExcludes>true</useDefaultExcludes>
<fileMode>0644</fileMode>
<directoryMode>0755</directoryMode>
</fileSet>
</fileSets>
...
</assembly>
The includes section uses a list of include elements, which
contain path patterns. These patterns may contain wildcards: ‘**’
matches any number of directories (including none), ‘*’ matches part
of a file name, and ‘?’ matches a single character in a file
name. Including Files with fileSet uses a fileMode entry to specify that
files in this set should be readable by all, but only writable by the
owner. Since the fileSet includes directories, we also have the
option of specifying a directoryMode that works in much the same way
as the fileMode . Since a directory’s execute permission is what
allows users to list its contents, we want to make sure directories
are executable in addition to being readable. Like files, only the
owner can write to directories in this set.
The fileSet entry offers some other options as well. First, it
allows for an excludes section with a form identical to the
includes section. These exclusion patterns allow you to exclude
specific file patterns from a fileSet . Exclude patterns take
precedence over include patterns. Additionally, you can set the
filtering flag to true if you want to substitute property values for
expressions within the included files. Expressions can be delimited
either by ${ and } (standard Maven expressions like
${project.groupId} ) or by @ and @ (standard Ant
expressions like @project.groupId@ ). You can adjust the line ending
of your files using the lineEnding element; valid values for
lineEnding are:
- keep : preserve the line endings from the original files (this is the default value)
- unix : Unix-style line endings
- lf : only a line feed character
- dos : MS-DOS-style line endings
- crlf : a carriage return followed by a line feed
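Pulling these options together, a fileSet that excludes temporary files, filters expressions in the included files, and normalizes line endings might look like the following sketch. The directory and exclude pattern are illustrative choices, and the filtering flag appears in the descriptor as the filtered element:
<assembly>
...
<fileSets>
<fileSet>
<directory>src/main/conf</directory>
<outputDirectory>conf</outputDirectory>
<excludes>
<exclude>**/*.tmp</exclude>
</excludes>
<filtered>true</filtered>
<lineEnding>unix</lineEnding>
</fileSet>
</fileSets>
...
</assembly>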
Finally, if you want to ensure that all file-matching patterns are
used, you can use the useStrictFiltering element with a value of
true (the default is false ). This can be especially useful if
unused patterns may signal missing files in an intermediary output
directory. When useStrictFiltering is set to true , the Assembly
plugin will fail if an include pattern is not satisfied. In other
words, if you have an include pattern which includes a file from a
build, and that file is not present, setting useStrictFiltering to
true will cause a failure if Maven cannot find the file to be
included.
8.5.3. Default Exclusion Patterns for fileSets
When you use the default exclusion patterns, the Maven Assembly plugin
ignores more than just SVN and CVS information. By default, the
exclusion patterns are defined by the
DirectoryScanner
class in the plexus-utils
project hosted at Codehaus. The array of exclude patterns is defined
as a static, final String array named DEFAULTEXCLUDES in
DirectoryScanner . The contents of this variable are shown in
Definition of Default Exclusion Patterns from Plexus Utils.
Definition of Default Exclusion Patterns from Plexus Utils.
public static final String[] DEFAULTEXCLUDES = {
// Miscellaneous typical temporary files
"**/*~",
"**/#*#",
"**/.#*",
"**/%*%",
"**/._*",
// CVS
"**/CVS",
"**/CVS/**",
"**/.cvsignore",
// SCCS
"**/SCCS",
"**/SCCS/**",
// Visual SourceSafe
"**/vssver.scc",
// Subversion
"**/.svn",
"**/.svn/**",
// Arch
"**/.arch-ids",
"**/.arch-ids/**",
//Bazaar
"**/.bzr",
"**/.bzr/**",
//SurroundSCM
"**/.MySCMServerInfo",
// Mac
"**/.DS_Store"
};
This default array of patterns excludes temporary files from editors
like GNU Emacs, other common temporary files from Macs, and metadata
from a few common source control systems
(although Visual SourceSafe is more of a curse than a source control
system). If you need to override these default exclusion patterns,
set useDefaultExcludes to false and then define your own set of
exclusion patterns in your assembly descriptor.
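For example, a fileSet that disables the default exclusions and supplies its own might look like the following sketch (the directory and pattern are illustrative):
<fileSet>
<directory>src/main/resources</directory>
<useDefaultExcludes>false</useDefaultExcludes>
<excludes>
<exclude>**/*.bak</exclude>
</excludes>
</fileSet>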
8.5.4. dependencySets Section
One of the most common requirements for assemblies is the inclusion of
a project’s dependencies in an assembly archive. Where files and
fileSets deal with files in your project, dependency files don’t
have a location in your project. The artifacts your project depends on
have to be resolved by Maven during the build. Dependency artifacts
are abstract, they lack a definite location, and are resolved using a
symbolic set of Maven coordinates. Since file and fileSet
specifications require a concrete source path, dependencies are
included or excluded from an assembly using a combination of Maven
coordinates and dependency scopes.
The simplest dependencySet is an empty element:
<assembly>
...
<dependencySets>
<dependencySet/>
</dependencySets>
...
</assembly>
The dependencySet above will match all runtime dependencies of your
project (runtime scope includes the compile scope implicitly), and it
will add these dependencies to the root directory of your assembly
archive. It will also copy the current project’s main artifact into
the root of the assembly archive, if it exists.
Note
Wait, I thought dependencySet was about including my project’s
dependencies, not my project’s main archive? This counterintuitive
side-effect was a widely-used bug in the 2.1 version of the Assembly
plugin, and, because Maven puts an emphasis on backward compatibility,
this counterintuitive and incorrect behavior needed to be preserved
between a 2.1 and 2.2 release. You can control this behavior by
changing the useProjectArtifact flag to false .
While the default dependency set can be quite useful with no
configuration whatsoever, this section of the assembly descriptor also
supports a wide array of configuration options, allowing you to
tailor its behavior to your specific requirements. For example, the
first thing you might do to the dependency set above is exclude the
current project artifact, by setting the useProjectArtifact flag to
false (again, its default value is true for legacy reasons). This
will allow you to manage the current project’s build output separately
from its dependency files. Alternatively, you might choose to unpack
the dependency artifacts by setting the unpack flag to true
(this is false by default). When unpack is set to true, the Assembly
plugin will combine the unpacked contents of all matching dependencies
inside the archive’s root directory.
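A dependency set applying both of these flags might look like the following sketch:
<dependencySet>
<useProjectArtifact>false</useProjectArtifact>
<unpack>true</unpack>
</dependencySet>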
From this point, there are several things you might choose to do with
this dependency set. The next sections discuss how to define the
output location for dependency sets and how to include and exclude
dependencies by scope. Finally, we’ll expand on the unpacking
functionality of the dependency set by exploring some advanced options
for unpacking dependencies.
Customizing Dependency Output Location
There are two configuration options that are used in concert to define
the location for a dependency file within the assembly archive:
outputDirectory and outputFileNameMapping . You may want to
customize the location of dependencies in your assembly using
properties of the dependency artifacts themselves. Let’s say you want
to put all the dependencies in directories that match the dependency
artifact’s groupId . In this case, you would use the
outputDirectory element of the dependencySet , and you would supply
something like:
<assembly>
...
<dependencySets>
<dependencySet>
<outputDirectory>${artifact.groupId}</outputDirectory>
</dependencySet>
</dependencySets>
...
</assembly>
This would have the effect of placing every single dependency in a
subdirectory that matched the name of each dependency artifact’s
groupId .
If you wanted to perform a further customization and remove the
version numbers from all dependencies, you could customize the output
file name for each dependency using the outputFileNameMapping
element as follows:
<assembly>
...
<dependencySets>
<dependencySet>
<outputDirectory>${artifact.groupId}</outputDirectory>
<outputFileNameMapping>
${artifact.artifactId}.${artifact.extension}
</outputFileNameMapping>
</dependencySet>
</dependencySets>
...
</assembly>
In the previous example, a dependency on commons:commons-codec
version 1.3 would end up in the file commons/commons-codec.jar.
Interpolation of Properties in Dependency Output
As mentioned in the Assembly Interpolation section above, neither of
these elements are interpolated with the rest of the assembly
descriptor, because their raw values have to be interpreted using
additional, artifact-specific expression resolvers.
The artifact expressions available for these two elements vary only
slightly. In both cases, all of the ${project.*} , ${pom.*} ,
and ${*} expressions that are available in the POM and the rest of
the assembly descriptor are also available here. For the
outputFileNameMapping element, the following process is applied to
resolve expressions:

1. If the expression matches the pattern ${artifact.*} :
   a. Match against the dependency’s Artifact instance (resolves:
      groupId , artifactId , version , baseVersion , scope ,
      classifier , and file.* )
   b. Match against the dependency’s ArtifactHandler instance
      (resolves: expression )
   c. Match against the project instance associated with the
      dependency’s Artifact (resolves: mainly POM properties)
2. If the expression matches the patterns ${pom.*} or
   ${project.*} : match against the project instance
   ( MavenProject ) of the current build.
3. If the expression matches the pattern ${dashClassifier?} and
   the Artifact instance contains a non-null classifier, resolve to the
   classifier preceded by a dash (-classifier). Otherwise, resolve to
   an empty string.
4. Attempt to resolve the expression against the project instance of
   the current build.
5. Attempt to resolve the expression against the POM properties of the
   current build.
6. Attempt to resolve the expression against the available system
   properties.
7. Attempt to resolve the expression against the available
   operating-system environment variables.
The outputDirectory value is interpolated in much the same way, with
the difference being that there is no available ${artifact.*}
information, only the ${project.*} instance for the particular
artifact. Therefore, the expressions listed above associated with
those classes (1a, 1b, and 3 in the process listing above) are
unavailable.
How do you know when to use outputDirectory and
outputFileNameMapping ? When dependencies are unpacked only the
outputDirectory is used to calculate the output location. When
dependencies are managed as whole files (not unpacked), both
outputDirectory and outputFileNameMapping can be used
together. When used together, the result is the equivalent of:
<archive-root-dir>/<outputDirectory>/<outputFileNameMapping>
When outputDirectory is missing, it is not used. When
outputFileNameMapping is missing, its default value is:
${artifact.artifactId}-${artifact.version}${dashClassifier?}.${artifact.extension}
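Putting the two elements together, the following sketch (the lib directory is a hypothetical choice) places each whole-file dependency under lib/ using the default file name mapping:
<assembly>
...
<dependencySets>
<dependencySet>
<outputDirectory>lib</outputDirectory>
</dependencySet>
</dependencySets>
...
</assembly>
With this configuration, a dependency on commons:commons-codec version 1.3 would land at lib/commons-codec-1.3.jar within the archive root.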
Including and Excluding Dependencies by Scope
In Section 3.4, “Project Dependencies”, it was noted that
all project dependencies have one scope or another. Scope determines
when in the build process that dependency normally would be used. For
instance, test-scoped dependencies are not included in the classpath
during compilation of the main project sources; but they are included
in the classpath when compiling unit test sources. This is because
your project’s main source code should not contain any code specific
to testing, since testing is not a function of the project (it’s a
function of the project’s build process). Similarly, provided-scoped
dependencies are assumed to be present in the environment of any
eventual deployment. However, if a project depends on a particular
provided dependency, it is likely to require that dependency in order
to compile. Therefore, provided-scoped dependencies are present in the
compilation classpath, but not in the dependency set that should be
bundled with the project’s artifact or assembly.
Also from Section 3.4, “Project Dependencies”, recall that
some dependency scopes imply others. For instance, the runtime
dependency scope implies the compile scope, since all compile-time
dependencies (except for those in the provided scope) will be
required for the code to execute. There are a number of complex
relationships between the various dependency scopes which control how
the scope of a direct dependency affects the scope of a transitive
dependency. In a Maven Assembly descriptor, we can use scopes to apply
different settings to different sets of dependencies accordingly.
For instance, if we plan to bundle a web application with
Jetty to create a completely
self-contained application, we’ll need to include all provided-scope
dependencies somewhere in the Jetty directory structure we’re
including. This ensures those provided dependencies actually are
present in the runtime environment. Non-provided, runtime dependencies
will still land in the WEB-INF/lib directory, so these two dependency
sets must be processed separately. These dependency sets might look
similar to the following XML.
Defining Dependency Sets Using Scope.
<assembly>
...
<dependencySets>
<dependencySet>
<scope>provided</scope>
<outputDirectory>lib/${project.artifactId}</outputDirectory>
</dependencySet>
<dependencySet>
<scope>runtime</scope>
<outputDirectory>
webapps/${webContextName}/WEB-INF/lib
</outputDirectory>
</dependencySet>
</dependencySets>
...
</assembly>
Provided-scoped dependencies are added to the lib/ directory in the
assembly root, which is assumed to be a libraries directory that will
be included in the Jetty global runtime classpath. We’re using a
subdirectory named for the project’s artifactId in order to make it
easier to track the origin of a particular library. Runtime
dependencies are included in the WEB-INF/lib path of the web
application, which is located within a subdirectory of the standard
Jetty webapps/ directory that is named using a custom POM property
called webContextName . What we’ve done in the previous example is
separate application-specific dependencies from dependencies which
will be present in a Servlet container’s global classpath.
However, simply separating according to scope may not be enough,
particularly in the case of a web application. It’s conceivable that
one or more runtime dependencies will actually be bundles of
standardized, non-compiled resources for use in the web
application. For example, consider a set of web applications which
reuse a common set of Javascript, CSS, SWF, and image resources. To
make these resources easy to standardize, it’s a common practice to
bundle them up in an archive and deploy them to the Maven
repository. At that point, they can be referenced as standard Maven
dependencies - possibly with a dependency type of zip - that are
normally specified with a runtime scope. Remember, these are
resources, not binary dependencies of the application code itself;
therefore, it’s not appropriate to blindly include them in the
WEB-INF/lib directory. Instead, these resource archives should be
separated from binary runtime dependencies, and unpacked into the web
application document root somewhere. In order to achieve this kind of
separation, we’ll need to use inclusion and exclusion patterns that
apply to the coordinates of a specific dependency.
In other words, say you have three or four web applications which reuse
the same resources and you want to create an assembly that puts
provided dependencies into lib/, runtime dependencies into
webapps/<contextName>/WEB-INF/lib, and then unpacks a specific
runtime dependency into your web application’s document root. You can
do this because the Assembly plugin allows you to define multiple
include and exclude patterns for a given dependencySet element. Read the next
section for more development of this idea.
Fine Tuning: Dependency Includes and Excludes
A resource dependency might be as simple as a set of resources (CSS,
Javascript, and Images) in a project that has an assembly which
creates a ZIP archive. Depending on the particulars of our web
application, we might be able to distinguish resource dependencies
from binary dependencies solely according to type. Most web
applications are going to depend on other dependencies of type jar ,
and it is possible that we can state with certainty that all
dependencies of type zip are resource dependencies. Or, we might
have a situation where resources are stored in jar format, but have
a classifier of something like resources . In either case, we can
specify an inclusion pattern to target these resource dependencies and
apply different logic than that used for binary dependencies. We’ll
specify these tuning patterns using the includes and excludes
sections of the dependencySet .
Both includes and excludes are list sections, meaning they accept the
sub-elements include and exclude respectively. Each include or
exclude element contains a string value, which can contain
wildcards. Each string value can match dependencies in a few different
ways. Generally speaking, three identity pattern formats are
supported:
- groupId:artifactId (version-less key): use this pattern to match a
  dependency by only the groupId and the artifactId.
- groupId:artifactId:type[:classifier] (conflict id): this pattern
  allows you to specify a wider set of coordinates to create a more
  specific include/exclude pattern.
- groupId:artifactId:type[:classifier]:version (full artifact
  identity): if you need to get really specific, you can specify all
  of the coordinates.
All of these pattern formats support the wildcard character ‘*’, which
can match any subsection of the identity and is not limited to
matching single identity parts (sections between ‘:’
characters). Also, note that the classifier section above is optional,
in that patterns matching dependencies that don’t have classifiers do
not need to account for the classifier section in the pattern.
In the example given above, where the key distinction is the artifact
type zip, and none of the dependencies have classifiers, the following
pattern would match resource dependencies assuming that they were of
type zip :
*:zip
The pattern above makes use of the second dependency identity: the
dependency’s conflict id. Now that we have a pattern that
distinguishes resource dependencies from binary dependencies, we can
modify our dependency sets to handle resource archives differently:
Using Dependency Excludes and Includes in dependencySets .
<assembly>
...
<dependencySets>
<dependencySet>
<scope>provided</scope>
<outputDirectory>lib/${project.artifactId}</outputDirectory>
</dependencySet>
<dependencySet>
<scope>runtime</scope>
<outputDirectory>
webapps/${webContextName}/WEB-INF/lib
</outputDirectory>
<excludes>
<exclude>*:zip</exclude>
</excludes>
</dependencySet>
<dependencySet>
<scope>runtime</scope>
<outputDirectory>
webapps/${webContextName}/resources
</outputDirectory>
<includes>
<include>*:zip</include>
</includes>
<unpack>true</unpack>
</dependencySet>
</dependencySets>
...
</assembly>
In Using Dependency Excludes and Includes in dependencySets , the runtime-scoped dependency set
from our last example has been updated to exclude resource
dependencies. Only binary dependencies (non-zip dependencies) should
be added to the WEB-INF/lib directory of the web
application. Resource dependencies now have their own dependency set,
which is configured to include these dependencies in the resources
directory of the web application. The includes section in the last
dependencySet reverses the exclusion from the previous
dependencySet , so that resource dependencies are included using the
same identity pattern (i.e. *:zip ). The last dependencySet refers
to the shared resource dependency and it is configured to unpack the
shared resource dependency in the document root of the web
application.
Using Dependency Excludes and Includes in dependencySets was based upon the assumption that our
shared resources project dependency had a type which differed from all
of the other dependencies. What if the shared resource dependency had
the same type as all of the other dependencies? How could you
differentiate the dependency? If the shared resource
dependency had been bundled as a JAR with the classifier resources ,
you would match that dependency with the following identity pattern:
*:jar:resources
Instead of matching on artifacts with a type of zip and no
classifier, we’re matching on artifacts with a classifier of resources
and a type of jar .
Just like the fileSets section, dependencySets support the
useStrictFiltering flag. When enabled, any specified patterns that
don’t match one or more dependencies will cause the assembly - and
consequently, the build - to fail. This can be particularly useful as
a safety valve, to make sure your project dependencies and assembly
descriptors are synchronized and interacting as you expect them to. By
default, this flag is set to false for the purposes of backward
compatibility.
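For example, to fail the build whenever a pattern in a dependency set goes unmatched, you might write something like the following sketch (the group id shown is hypothetical):
<dependencySet>
<useStrictFiltering>true</useStrictFiltering>
<includes>
<include>com.example:*</include>
</includes>
</dependencySet>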
Transitive Dependencies, Project Attachments, and Project Artifacts
The dependencySet section supports two more general mechanisms for
tuning the subset of matching artifacts: transitive selection options,
and options for working with project artifacts. Both of these features
are a product of the need to support legacy configurations that
applied a somewhat more liberal definition of the word
“dependency”. As a prime example, consider the project’s own main
artifact. Typically, this would not be considered a dependency; yet
older versions of the Assembly plugin included the project artifact in
calculations of dependency sets. To provide backward compatibility
with this “feature”, the 2.2 releases (currently at 2.2-beta-2) of the
Assembly plugin support a flag in the dependencySet called
useProjectArtifact , whose default value is true . By default,
dependency sets will attempt to include the project artifact itself in
calculations about which dependency artifacts match and which
don’t. If you’d rather deal with the project artifact separately, set
this flag to false .
Tip
The authors of this book recommend that you always set
useProjectArtifact to false .
As a natural extension to the inclusion of the project artifact, the
project’s attached artifacts can also be managed within a
dependencySet using the useProjectAttachments flag (whose default
value is false ). Enabling this flag allows patterns that specify
classifiers and types to match on artifacts that are “attached” to the
main project artifact; that is, they share the same basic
groupId /artifactId /version identity, but differ in type and
classifier from the main artifact. This could be useful for
including JavaDoc or source jars in an assembly.
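For instance, a dependency set like the following sketch could pull the project’s attached sources JAR into an assembly (assuming such an attachment exists in your build):
<dependencySet>
<useProjectAttachments>true</useProjectAttachments>
<includes>
<include>*:jar:sources</include>
</includes>
</dependencySet>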
Aside from dealing with the project’s own artifacts, it’s also
possible to fine-tune the dependency set using two
transitive-resolution flags. The first, called
useTransitiveDependencies (and set to true by default) simply
specifies whether the dependency set should consider transitive
dependencies at all when determining the matching artifact set to be
included. As an example of how this could be used, consider what
happens when your POM has a dependency on another assembly. That
assembly (most likely) will have a classifier that separates it from
the main project artifact, making it an attachment. However, one quirk
of the Maven dependency-resolution process is that the
transitive-dependency information for the main artifact is still used
when resolving the assembly artifact. If the assembly bundles its
project dependencies inside itself, using transitive dependency
resolution here would effectively duplicate those dependencies. To
avoid this, we simply set useTransitiveDependencies to false for
the dependency set that handles that assembly dependency.
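In descriptor form, the dependency set handling that assembly dependency might look like the following sketch (the coordinates are hypothetical):
<dependencySet>
<useTransitiveDependencies>false</useTransitiveDependencies>
<includes>
<include>com.example:example-assembly</include>
</includes>
</dependencySet>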
The other transitive-resolution flag is far more subtle. It’s called
useTransitiveFiltering , and has a default value of false . To
understand what this flag does, we first need to understand what
information is available for any given artifact during the resolution
process. When an artifact is a dependency of a dependency (that is,
removed at least one level from your own POM), it has what Maven calls
a "dependency trail", which is maintained as a list of strings that
correspond to the full artifact identities
(groupId:artifactId:type[:classifier]:version ) of all dependencies
between your POM and the artifact that owns that dependency trail. If
you remember the three types of artifact identities available for
pattern matching in a dependency set, you’ll notice that the entries
in the dependency trail - the full artifact identity - correspond to
the third type. When useTransitiveFiltering is set to true , the
entries in an artifact’s dependency trail can cause the artifact to be
included or excluded in the same way its own identity can.
If you’re considering using transitive filtering, be careful! A given
artifact can be included from multiple places in the
transitive-dependency graph, but as of Maven 2.0.9, only the first
inclusion’s trail will be tracked for this type of matching. This can
lead to subtle problems when collecting the dependencies for your
project.
Warning
Most assemblies don’t really need this level of control over
dependency sets; consider carefully whether yours truly does. Hint: It
probably doesn’t.
Advanced Unpacking Options
As we discussed previously, some project dependencies may need to be
unpacked in order to create a working assembly archive. In the
examples above, the decision to unpack or not was simple. It didn’t
take into account what needed to be unpacked, or more importantly,
what should not be unpacked. To gain more control over the dependency
unpacking process, we can configure the unpackOptions element of the
dependencySet . Using this section, we have the ability to choose
which file patterns to include or exclude from the assembly, and
whether included files should be filtered to resolve expressions using
current POM information. In fact, the options available for unpacking
dependency sets are fairly similar to those available for including
files from the project directory structure, using the file sets
descriptor section.
To continue our web-application example, suppose some of the resource
dependencies have been bundled with a file that details their
distribution license. In the case of our web application, we’ll handle
third-party license notices by way of a NOTICES file included in our
own bundle, so we don’t want to include the license file from the
resource dependency. To exclude this file, we simply add it to the
unpack options inside the dependency set that handles resource
artifacts:
Excluding Files from a Dependency Unpack.
<assembly>
...
<dependencySets>
<dependencySet>
<scope>runtime</scope>
<outputDirectory>
webapps/${webContextName}/resources
</outputDirectory>
<includes>
<include>*:zip</include>
</includes>
<unpack>true</unpack>
<unpackOptions>
<excludes>
<exclude>**/LICENSE*</exclude>
</excludes>
</unpackOptions>
</dependencySet>
</dependencySets>
...
</assembly>
Notice that the exclude we’re using looks very similar to those used
in fileSet declarations. Here, we’re blocking any file starting with
the word LICENSE in any directory within our resource artifacts. You
can think of the unpack options section as a lightweight fileSet
applied to each dependency matched within that dependency set. In
other words, it is a fileSet by way of an unpacked dependency. Just
as we specified an exclusion pattern for files within resource
dependencies in order to block certain files, you can also choose
which restricted set of files to include using the includes
section. The same code that processes inclusions and exclusions on
fileSets has been reused for processing unpackOptions .
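For instance, a more restrictive dependency set might unpack only a particular kind of file from each resource artifact. The sketch below assumes you only want properties files; the pattern is purely illustrative:

```xml
<unpackOptions>
  <includes>
    <!-- hypothetical: keep only properties files from each unpacked artifact -->
    <include>**/*.properties</include>
  </includes>
</unpackOptions>
```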
In addition to file inclusion and exclusion, the unpack options on a
dependency set also provide a filtering flag, whose default value
is false . Again, this should be familiar from our discussion of file
sets above. In both cases, expressions using either the Maven or Ant
syntax are supported. Filtering is a particularly nice feature to have for
dependency sets, though, since it effectively allows you to create
standardized, versioned resource templates that are then customized to
each assembly as they are included. Once you start mastering the use
of filtered, unpacked dependencies which store shared resources, you
will be able to start abstracting repeated resources into common
resource projects.
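As a minimal sketch, enabling filtering on unpacked resources is just a matter of adding the filtered flag to the unpack options; the surrounding dependency set is assumed to match resource artifacts as in the earlier example:

```xml
<unpackOptions>
  <!-- resolve ${...} expressions in unpacked files using current POM values -->
  <filtered>true</filtered>
  <excludes>
    <exclude>**/LICENSE*</exclude>
  </excludes>
</unpackOptions>
```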
Summarizing Dependency Sets
Finally, it’s worth mentioning that dependency sets support the same
fileMode and directoryMode configuration options that file sets
do, though you should remember that the directoryMode setting will
only be used when dependencies are unpacked.
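For example, a dependency set that unpacks its artifacts might set both modes, using the same four-digit octal notation discussed earlier; the values below are conventional Unix permissions, not requirements:

```xml
<dependencySet>
  <unpack>true</unpack>
  <!-- files readable by everyone, writable only by the owner -->
  <fileMode>0644</fileMode>
  <!-- directories need the execute bit so they can be traversed -->
  <directoryMode>0755</directoryMode>
</dependencySet>
```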
8.5.5. moduleSets Section
Multi-module builds are generally stitched together using the parent
and modules sections of interrelated POMs. Typically, parent POMs
specify their children in a modules section, which under normal
circumstances causes the child POMs to be included in the build
process of the parent. Exactly how this relationship is constructed
can have important implications for the ways in which the Assembly
plugin can participate in this process, but we’ll discuss that more
later. For now, it’s enough to keep in mind this parent-module
relationship as we discuss the moduleSets section.
Projects are stitched together into multi-module builds because they
are part of a larger system. These projects are designed to be used
together, and a single module in a larger build has little practical
value on its own. In this way, the structure of the project’s build is
related to the way we expect the project (and its modules) to be
used. If we consider the project from the user’s perspective, it makes
sense that the ideal end goal of that build would be a single,
distributable file that the user can consume directly with minimum
installation hassle. Since Maven multi-module builds typically follow
a top-down structure, where dependency information, plugin
configurations, and other information trickles down from parent to
child, it seems natural that the task of rolling all of these modules
into a single distribution file should fall to the topmost
project. This is where the moduleSet comes into the picture.
Module sets allow the inclusion of resources that belong to each
module in the project structure into the final assembly archive. Just
like you can select a group of files to include in an assembly using a
fileSet and a dependencySet , you can include a set of files and
resources using a moduleSet to refer to modules in a multi-module
build. They achieve this by enabling two basic types of
module-specific inclusion: file-based, and artifact-based. Before we
get into the specifics and differences between file-based and
artifact-based inclusion of module resources into an assembly, let’s
talk a little about selecting which modules to process.
By now, you should be familiar with includes /excludes patterns as
they are used throughout the assembly descriptor to filter files and
dependencies. When you are referring to modules in an assembly
descriptor, you will also use the includes /excludes patterns to
define rules which apply to different sets of modules. The difference
in moduleSet includes and excludes is that these rules do not
allow for wildcard patterns. (As of the 2.2-beta-2 release, this
feature has not really seen much demand, so it hasn’t been
implemented.) Instead, each include or exclude value is simply the
groupId and artifactId for the module, separated by a colon, like
this:
groupId:artifactId
In addition to includes and excludes , the moduleSet also
supports an additional selection tool: the includeSubModules flag
(whose default value is true ). The parent-child relationship in any
multi-module build structure is not strictly limited to two tiers of
projects. In fact, you can include any number of tiers, or layers, in
your build. Any project that is a module of a module of the current
project is considered a sub-module. In some cases, you may want to
deal with each individual module in the build separately (including
sub-modules). For example, this is often simplest when dealing with
artifact-based contributions from these modules. To do this, you would
simply leave the includeSubModules flag set to the default of true .
When you’re trying to include files from each module’s directory
structure, you may wish to process that module’s directory structure
only once. If your project directory structure mirrors that of the
parent-module relationships that are included in the POMs, this
approach would allow file patterns like **/src/main/java to apply not
only to that direct module’s project directory, but also to the
directories of its own modules as well. In this case, since you don’t
want to process sub-modules directly (they will be processed as
subdirectories within your own project’s modules instead), you should
set the includeSubModules flag to false .
Once we’ve determined how module selection should proceed for the
module set in question, we’re ready to choose what to include from
each module. As mentioned above, this can include files or artifacts
from the module project.
Suppose you want to include the source of all modules in your
project’s assembly, but you would like to exclude a particular
module. Maybe you have a project named secret-sauce which contains
secret and sensitive code that you don’t want to distribute with your
project. The simplest way to accomplish this is to use a moduleSet
which includes each project’s directory in ${module.basedir.name}
and which excludes the secret-sauce module from the assembly.
Including and Excluding Modules with a moduleSet .
<assembly>
...
<moduleSets>
<moduleSet>
<includeSubModules>false</includeSubModules>
<excludes>
<exclude>
com.mycompany.application:secret-sauce
</exclude>
</excludes>
<sources>
<outputDirectoryMapping>
${module.basedir.name}
</outputDirectoryMapping>
<excludeSubModuleDirectories>
false
</excludeSubModuleDirectories>
<fileSets>
<fileSet>
<directory>/</directory>
<excludes>
<exclude>**/target</exclude>
</excludes>
</fileSet>
</fileSets>
</sources>
</moduleSet>
</moduleSets>
...
</assembly>
In Including and Excluding Modules with a moduleSet , since we’re dealing with each
module’s sources it’s simpler to deal only with direct modules of the
current project, handling sub-modules using file-path wildcard
patterns in the file set. We set the includeSubModules element to
false so we don’t have to worry about submodules showing up in the
root directory of the assembly archive. The exclude element will
take care of excluding the secret-sauce module. We’re not going to
include the project sources for the secret-sauce module; they’re,
well, secret.
Normally, module sources are included in the assembly under a
subdirectory named after the module’s artifactId . However, since
Maven allows modules that are not in directories named after the
module project’s artifactId , it’s often better to use the expression
${module.basedir.name} to preserve the module directory’s
actual name (${module.basedir.name} is the same as calling
MavenProject.getBasedir().getName() ). It is critical to remember
that modules are not required to be subdirectories of the project that
declares them. If your project has a particularly strange directory
structure, you may need to resort to special moduleSet declarations
that include specific projects and account for your own project’s
idiosyncrasies.
Warning
Try to minimize your own project’s idiosyncrasies. While
Maven is flexible, if you find yourself doing too much configuration,
there is likely an easier way.
Continuing through Including and Excluding Modules with a moduleSet , since we’re not
processing sub-modules explicitly in this module set, we need to make
sure sub-module directories are not excluded from the source
directories we consider for each direct module. Setting the
excludeSubModuleDirectories flag to false allows us to apply
the same file pattern to directory structures within a sub-module of
the one we’re processing. Finally in Including and Excluding Modules with a moduleSet ,
we’re not interested in any output of the build process for this
module set. We exclude the target/ directory from all modules.
It’s also worth mentioning that the sources section supports
fileSet -like elements directly within itself, in addition to
supporting nested fileSets . These configuration elements are used to
provide backward compatibility to previous versions of the Assembly
plugin (versions 2.1 and under) that didn’t support multiple distinct
file sets for the same module without creating a separate module set
declaration. They are deprecated, and should not be used.
Interpolation of outputDirectoryMapping in moduleSets
In the section called “Customizing Dependency Output Location”, we used the element
outputDirectoryMapping to change the name of the directory under
which each module’s sources would be included. The expressions
contained in this element are resolved in exactly the same way as the
outputFileNameMapping , used in dependency sets (see the explanation
of this algorithm in Section 8.5.4, “dependencySets Section”).
In Including and Excluding Modules with a moduleSet , we used the expression
${module.basedir.name} . You might notice that the root of that
expression, module , is not listed in the mapping-resolution
algorithm from the dependency sets section; this object root is
specific to configurations within moduleSets . It works in exactly
the same way as the ${artifact.*} references available in the
outputFileNameMapping element, except it is applied to the module’s
MavenProject , Artifact , and ArtifactHandler instances instead of
those from a dependency artifact.
Just as the sources section is primarily concerned with including a
module in its source form, the binaries section is primarily
concerned with including the module’s build output, or its
artifacts. Though this section functions primarily as a way of
specifying dependencySets that apply to each module in the set,
there are a few additional features unique to module artifacts that
are worth exploring: attachmentClassifier and
includeDependencies . In addition, the binaries section contains
options similar to the dependencySet section, that relate to the
handling of the module artifact itself. These are: unpack ,
outputFileNameMapping , outputDirectory , directoryMode , and
fileMode . Finally, module binaries can contain a dependencySets
section, to specify how each module’s dependencies should be included
in the assembly archive. First, let’s take a look at how the options
mentioned here can be used to manage the module’s own artifacts.
Suppose we want to include the javadoc jars for each of our modules
inside our assembly. In this case, we don’t care about including the
module dependencies; we just want the javadoc jar. However, since this
particular jar is always going to be present as an attachment to the
main project artifact, we need to specify which classifier to use to
retrieve it. For simplicity, we won’t cover unpacking the module
javadoc jars, since this configuration is exactly the same as what we
used for dependency sets earlier in this chapter. The resulting module
set might look similar to Including JavaDoc from Modules in an Assembly.
Including JavaDoc from Modules in an Assembly.
<assembly>
...
<moduleSets>
<moduleSet>
<binaries>
<attachmentClassifier>javadoc</attachmentClassifier>
<includeDependencies>false</includeDependencies>
<outputDirectory>apidoc-jars</outputDirectory>
</binaries>
</moduleSet>
</moduleSets>
...
</assembly>
In Including JavaDoc from Modules in an Assembly, we don’t explicitly set the
includeSubModules flag, since it’s true by default. However, we
definitely want to process all modules - even sub-modules - using this
module set, since we’re not using any sort of file pattern that could
match on sub-module directory structures within. The
attachmentClassifier grabs the attached artifact with the javadoc
classifier for each module processed. The includeDependencies
element tells the Assembly plugin that we’re not interested in any of
the module’s dependencies, just the javadoc attachment. Finally, the
outputDirectory element tells the Assembly plugin to put all of the
javadoc jars into a directory named apidoc-jars/ off of the assembly
root directory.
Although we’re not doing anything too complicated in this example,
it’s important to understand that the same changes to the
expression-resolution algorithm discussed for the
outputDirectoryMapping element of the sources section also applies
here. That is, whatever was available as ${artifact.*} inside
a dependencySet ’s outputFileNameMapping configuration is also
available here as ${module.*} . The same applies for
outputFileNameMapping when used directly within a binaries
section.
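As a hypothetical sketch, a binaries section could rename each module artifact using the module-rooted expressions; the mapping below mirrors the ${artifact.*} usage from dependency sets and is an assumption, not a prescribed form:

```xml
<moduleSet>
  <binaries>
    <!-- hypothetical mapping: drop the version from each module artifact name -->
    <outputFileNameMapping>
      ${module.artifactId}.${module.extension}
    </outputFileNameMapping>
  </binaries>
</moduleSet>
```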
Finally, let’s examine an example where we simply want to process the
module’s artifact and its runtime dependencies. In this case, we want
to separate the artifact set for each module into separate directory
structures, according to the module’s artifactId and version . The
resulting module set is surprisingly simple, and it looks like the
listing in Including Module Artifacts and Dependencies in an Assembly:
Including Module Artifacts and Dependencies in an Assembly.
<assembly>
...
<moduleSets>
<moduleSet>
<binaries>
<outputDirectory>
${module.artifactId}-${module.version}
</outputDirectory>
<dependencySets>
<dependencySet/>
</dependencySets>
</binaries>
</moduleSet>
</moduleSets>
...
</assembly>
In Including Module Artifacts and Dependencies in an Assembly, we’re using the empty dependencySet
element here, since that should include all runtime dependencies by
default, with no configuration. With the outputDirectory specified
at the binaries level, all dependencies should be included alongside
the module’s own artifact in the same directory, so we don’t even need
to specify that in our dependency set.
For the most part, module binaries are fairly straightforward. In both
parts - the main part, concerned with handling the module artifact
itself, and the dependency sets, concerned with the module’s
dependencies - the configuration options are very similar to those in
a dependency set. Of course, the binaries section also provides
options for controlling whether dependencies are included, and which
main-project artifact you want to use.
Like the sources section, the binaries section contains a couple of
configuration options that are provided solely for backward
compatibility, and should be considered deprecated. These include the
includes and excludes sub-sections.
Finally, we close the discussion about module handling with a strong
warning. There are subtle interactions between Maven’s internal design
as it relates to parent-module relationships and the execution of a
module-set’s binaries section. When a POM declares a parent, that
parent must be resolved in some way or other before the POM in
question can be built. If the parent is in the Maven repository, there
is no problem. However, as of Maven 2.0.9 this can cause big problems
if that parent is a higher-level POM in the same build, particularly
if that parent POM expects to build an assembly using its modules’
binaries.
Maven 2.0.9 sorts projects in a multi-module build according to their
dependencies, with a given project’s dependencies being built ahead of
itself. The problem is the parent element is considered a dependency,
which means the parent project’s build must complete before the child
project is built. If part of that parent’s build process includes the
creation of an assembly that uses module binaries, those binaries will
not exist yet, and therefore cannot be included, causing the assembly
to fail. This is a complex and subtle issue, which severely limits the
usefulness of the module binaries section of the assembly
descriptor. In fact, it has been filed in the bug tracker for the
Assembly plugin at:
http://jira.codehaus.org/browse/MASSEMBLY-97.
Hopefully, future versions of Maven will find a way to restore this
functionality, since the parent-first requirement may not be
completely necessary.
8.5.6. Repositories Section
The repositories section represents a slightly more exotic feature in
the assembly descriptor, since few applications other than Maven can
take full advantage of a Maven-repository directory structure. For
this reason, and because many of its features closely resemble those
in the dependencySets section, we won’t spend too much time on the
repositories section of the assembly descriptor. In most cases, users
who understand dependency sets should have no trouble constructing
repositories via the Assembly plugin. We’re not going to motivate the
repositories section; we’re not going to go through the business
of setting up a use case and walking you through the process. We’re
just going to bring up a few caveats for those of you who find the
need to use the repositories section.
Having said that, there are two features particular to the
repositories section that deserve some mention. The first is the
includeMetadata flag. When set to true it includes metadata such
as the list of real versions that correspond to -SNAPSHOT virtual
versions, and by default it’s set to false . At present, the only
metadata included when this flag is true is the information
downloaded from Maven’s central repository.
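A minimal repository section enabling this flag might look like the following sketch; the maven-repo output directory name is an arbitrary choice:

```xml
<repositories>
  <repository>
    <!-- directory within the assembly that will hold the repository layout -->
    <outputDirectory>maven-repo</outputDirectory>
    <includeMetadata>true</includeMetadata>
  </repository>
</repositories>
```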
The second feature is called groupVersionAlignments . Again, this
section is a list of individual groupVersionAlignment
configurations, whose purpose is to normalize all included artifacts
for a particular groupId to use a single version . Each alignment
entry consists of two mandatory elements - id and version - along
with an optional section called excludes that supplies a list of
artifactId string values which are to be excluded from this
realignment. Unfortunately, this realignment doesn’t seem to modify
the POMs involved in the repository, neither those related to
realigned artifacts nor those that depend on realigned artifacts, so
it’s difficult to imagine what the practical application for this sort
of realignment would be.
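Should you need it anyway, a groupVersionAlignment entry follows the shape sketched below; the groupId, version, and excluded artifactId here are hypothetical:

```xml
<repository>
  <outputDirectory>maven-repo</outputDirectory>
  <groupVersionAlignments>
    <groupVersionAlignment>
      <!-- hypothetical group whose artifacts should all use one version -->
      <id>com.mycompany.application</id>
      <version>1.0</version>
      <excludes>
        <!-- hypothetical artifactId left at its original version -->
        <exclude>experimental-module</exclude>
      </excludes>
    </groupVersionAlignment>
  </groupVersionAlignments>
</repository>
```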
In general, it’s simplest to apply the same principles you would use
in dependency sets to repositories when adding them to your assembly
descriptor. While the repositories section does support the above
extra options, they are mainly provided for backward compatibility,
and will probably be deprecated in future releases.
8.5.7. Managing the Assembly’s Root Directory
Now that we’ve made it through the main body of the assembly
descriptor, we can close the discussion of content-related descriptor
sections with something lighter: root-directory naming and
site-directory handling.
Some may consider it a stylistic concern, but it’s often important to
have control over the name of the root directory for your assembly, or
whether the root directory is there at all. Fortunately, two
configuration options in the root of the assembly descriptor make
managing the archive root directory simple: includeBaseDirectory and
baseDirectory . In cases like executable jar files, you probably
don’t want a root directory at all. To skip it, simply set the
includeBaseDirectory flag to false (it’s true by default). This
will result in an archive that, when unpacked, may create more than
one directory in the unpack target directory. While this is considered
bad form for archives that are meant to be unpacked before use, it’s
not so bad for archives that are consumable as-is.
In other cases, you may want to guarantee the name of the archive root
directory regardless of the POM’s version or other information. By
default, the baseDirectory element has a value equal to
${project.artifactId}-${project.version} . However, we can
easily set this element to any value that consists of literal strings
and expressions which can be interpolated from the current POM, such
as ${project.groupId}-${project.artifactId} . This could be
very good news for your documentation team! (We all have those,
right?)
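Both options live at the root of the descriptor. A sketch combining them might look like this; the id value is arbitrary, and in practice you would set either a custom baseDirectory or includeBaseDirectory to false, not usually both:

```xml
<assembly>
  <id>bin</id>
  ...
  <!-- fix the root directory name regardless of the POM version -->
  <baseDirectory>${project.groupId}-${project.artifactId}</baseDirectory>
  <!-- or, for executable jars, omit the root directory entirely -->
  <includeBaseDirectory>false</includeBaseDirectory>
  ...
</assembly>
```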
Another configuration available is the includeSiteDirectory flag,
whose default value is false . If your project build has also
constructed a website document root using the site lifecycle or the
Site plugin goals, that output can be included by setting this flag to
true . However, this feature is a bit limited, since it only includes
the outputDirectory from the reporting section of the current POM
(by default, target/site) and doesn’t take into consideration any
site directories that may be available in module projects. Use it if
you want, but a good fileSet specification or moduleSet
specification with sources configured could serve equally well, if not
better. This is yet another example of legacy configuration currently
supported by the Assembly plugin for the purpose of backward
compatibility. Your mileage may vary. If you really want to include a
site that is aggregated from many modules, you’ll want to consider
using a fileSet or moduleSet instead of setting
includeSiteDirectory to true .
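If you do prefer the fileSet alternative suggested above, a sketch that picks up the generated site from the default reporting output directory might look like this; the docs output directory is an assumption:

```xml
<fileSets>
  <fileSet>
    <!-- default site output location for the current project -->
    <directory>target/site</directory>
    <!-- hypothetical directory within the assembly archive -->
    <outputDirectory>docs</outputDirectory>
  </fileSet>
</fileSets>
```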
8.5.8. componentDescriptors and containerDescriptorHandlers
To round out our exploration of the assembly descriptor, we should
touch briefly on two other sections: containerDescriptorHandlers and
componentDescriptors . The containerDescriptorHandlers section
refers to custom components that you use to extend the capabilities of
the Assembly plugin. Specifically, these custom components allow you
to define and handle special files which may need to be merged from
the multiple constituents used to create your assembly. A good example
of this might be a custom container-descriptor handler that merged
web.xml files from constituent war or war-fragment files included in
your assembly, in order to create the single web-application
descriptor required for you to use the resulting assembly archive as a
war file.
The componentDescriptors section allows you to reference external
assembly-descriptor fragments and include them in the current
descriptor. Component references can be any of the following:
-
Relative file paths: src/main/assembly/component.xml
-
Artifact references:
groupId:artifactId:version[:type[:classifier]]
-
Classpath resources: /assemblies/component.xml
-
URLs: http://www.sonatype.com/component.xml
Incidentally, when resolving a component descriptor, the Assembly
plugin tries those different strategies in that exact order. The first
one to succeed is used.
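Using the relative file path form from the list above, a descriptor that pulls in a shared component fragment is a one-element affair:

```xml
<assembly>
  <id>bin</id>
  ...
  <componentDescriptors>
    <!-- resolved first as a relative file path, per the order above -->
    <componentDescriptor>src/main/assembly/component.xml</componentDescriptor>
  </componentDescriptors>
  ...
</assembly>
```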
Component descriptors can contain many of the same content-oriented
sections available in the assembly descriptor itself, with the
exception of moduleSets , which is considered so specific to each
project that it’s not a good candidate for reuse. Also included in a
component descriptor is the containerDescriptorHandlers section,
which we briefly discussed above. Component descriptors cannot contain
formats, assembly id’s, or any configuration related to the base
directory of the assembly archive, all of which are also considered
unique to a particular assembly descriptor. While it may make sense to
allow sharing of the formats section, this has not been implemented as
of the 2.2-beta-2 Assembly-plugin release.