When to use a custom classpath?
ReadTools relies on several libraries that use Service Provider Interfaces (SPI) to support extensions. Common use cases for adding an extension to ReadTools are:
- java.nio.file.spi.FileSystemProvider for IO operations in different file systems.
- org.apache.hadoop.io.compress.CompressionCodec for custom compression for IO in Hadoop.
- Other Hadoop services.
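
To check which file system extensions the JVM can actually see, a minimal sketch along the following lines may help. It uses only the standard JDK call FileSystemProvider.installedProviders(), which loads providers through the system class loader. The class name ListFileSystemProviders and the helper installedSchemes are hypothetical names chosen for illustration and are not part of ReadTools.

```java
import java.nio.file.spi.FileSystemProvider;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of ReadTools): lists the URI schemes
// of the file system providers visible to the running JVM.
public class ListFileSystemProviders {

    /** Returns the URI scheme of every installed FileSystemProvider. */
    static List<String> installedSchemes() {
        List<String> schemes = new ArrayList<>();
        // installedProviders() includes the default provider plus any
        // provider discovered through the SPI mechanism on the classpath.
        for (FileSystemProvider provider : FileSystemProvider.installedProviders()) {
            schemes.add(provider.getScheme());
        }
        return schemes;
    }

    public static void main(String[] args) {
        for (String scheme : installedSchemes()) {
            System.out.println(scheme);
        }
    }
}
```

If this class is run with an extension jar on the classpath, the scheme registered by that jar should appear in the output alongside the default file scheme, assuming the jar declares its provider correctly.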
How to run ReadTools with a custom classpath
A list of jar files separated by a colon (:) should be provided to the -cp option of java, in addition to ReadTools.jar. For example, to include one or two services (packaged in service1.jar and service2.jar):
    # only service 1
    java -cp ReadTools.jar:service1.jar org.magicdgs.readtools.Main
    # service 1 and 2
    java -cp ReadTools.jar:service1.jar:service2.jar org.magicdgs.readtools.Main
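
The commands above use a colon as separator, which is the convention on Unix-like systems; on Windows the classpath separator is a semicolon. When the command is assembled programmatically, a sketch like the one below avoids hard-coding the separator by using File.pathSeparator. The class ClasspathCommand and the method buildCommand are hypothetical names used only for illustration; the main class name is taken from the examples above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of ReadTools): assembles the java command
// line for running ReadTools with extra service jars on the classpath.
public class ClasspathCommand {

    /** Builds the command as a list of arguments, e.g. for ProcessBuilder. */
    static List<String> buildCommand(String mainJar, List<String> extensionJars) {
        List<String> entries = new ArrayList<>();
        entries.add(mainJar);            // ReadTools.jar always goes on the classpath
        entries.addAll(extensionJars);   // followed by the service jars
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows.
        String classpath = String.join(File.pathSeparator, entries);
        return Arrays.asList("java", "-cp", classpath, "org.magicdgs.readtools.Main");
    }

    public static void main(String[] args) {
        List<String> command =
                buildCommand("ReadTools.jar", Arrays.asList("service1.jar", "service2.jar"));
        System.out.println(String.join(" ", command));
    }
}
```

The returned list can be passed to java.lang.ProcessBuilder, after appending the tool name and its arguments, if ReadTools needs to be launched from another Java program.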
The ReadTools jar file already packages several SPI extensions, providing out-of-the-box support for:
Example usage: 4mc compression for Distmap
One common use of a custom classpath is to support a non-default compression format in your Hadoop cluster, which integrates with the Distmap pipeline.
For example, 4mc compression would
make upload/download faster. You can download the packaged jar (e.g.,
File names ending in .4mc will be written as compressed files with this compressor when run as follows:
    java -cp ReadTools.jar:hadoop-4mc-2.0.0.jar org.magicdgs.readtools.Main \
        ReadsToDistmap -I input.bam -O hdfs://output.4mc