Input and output¶
Input files¶
FASTQ files¶
AdapterRemoval v3 expects standard FASTQ files as input specified using the --in-file1 and --in-file2 options. DNA sequences and quality scores are expected to be written on a single line, and DNA sequences are expected to contain only standard bases A, C, G, T, and N (ambiguous bases). AdapterRemoval v3 supports reading of FASTQ files with Phred+33, Phred+64, and Solexa encoded quality scores. Input may optionally be gzip compressed.
When specifying paired reads, the FASTQ record names are expected to be identical excepting for the last two characters. The last two characters typically indicate the mate number, and consist of a matching separator character followed by either 1 or 2. Examples include /1 and /2, and .1 and .2.
Table of adapters¶
The table used with --adapter-table is expected to be a plain-text table containing one or two whitespace-separated columns, with each line containing the same number of columns:
AGATCGGAAGAGCACACGTCTGAACTCCAGTCA AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA AAGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG
Column one corresponds to --adapter1 and column two corresponds to --adapter2. Only one column is required in single-end mode, while both columns are required in paired-end mode.
Table of barcodes¶
The table used with --barcode-table is expected to be a plain-text table containing two or three whitespace-separated columns, with each line containing the same number of columns:
sample_1 ATGCGGA TGAATCT
sample_2 ATGGATT ATAGTGA
sample_7 CAAAACT TCGCTGC
Sample names in the first column may only contain letters, numbers and underscores _. The barcodes in the second column are expected to be found at the 5' of mate 1 reads and the barcode in the third column is expected to be found at the 5' of mate 2 reads.
If --barcode-orientation explicit is used, the table is expected to contain a fourth column specifying the orientation of each barcode pair:
sample_1 ATGCGGA TGAATCT forward
sample_1 TGAATCT ATGCGGA reverse
sample_2 ATGGATT ATAGTGA reverse
sample_7 CAAAACT TCGCTGC forward
The following names for orientations are supported: forward, reverse, fwd, rev, +, and -. This table is used as is, and the "missing" orientations are not automatically generated.
Output files¶
During a normal run, AdapterRemoval will generate a combination of trimmed reads and report files. For a description of how this is controlled, see Detailed overview. In the following examples, the value ${prefix} is used to represent the path set using --out-prefix.
FASTQ output¶
When not performing demultiplexing, AdapterRemoval will generate the following files:
${prefix}.r1.fastq.gz${prefix}.r2.fastq.gz(paired-end mode)${prefix}.singleton.fastq.gz(paired-end mode)${prefix}.merged.fastq.gz(paired-end mode, if read merging is enabled)
If demultiplexing is enabled, the following files will be generated with --out-prefix, with the key ${sample} additionally replaced with each sample name:
${prefix}.unidentified.r1.fastq.gz${prefix}.unidentified.r2.fastq.gz(paired-end mode)${prefix}.${sample}.r1.fastq.gz${prefix}.${sample}.r2.fastq.gz(paired-end mode)${prefix}.${sample}.singleton.fastq.gz(paired-end mode)${prefix}.${sample}.merged.fastq.gz(paired-end mode, if read merging is enabled)
The unidentified files contain any reads for which no sample could be identified using the supplied barcode sequences. Note that no trimming, filtering, or merging is performed if --demultiplex-only is used.
SAM/BAM files¶
When using --out-format with .sam, .sam.gz, or .bam, AdapterRemoval defaults to writing a single file containing all output, including discarded reads. For example, for BAM output:
${prefix}.bam
Or when demultiplexing is enabled, replacing the key ${sample} with each sample name:
${prefix}.json${prefix}.html${prefix}.unidentified.bam${prefix}.${sample}.bam
This can be overridden as described above, using the individual --out options. To exclude discarded reads from the output, use --out-discarded /dev/null.
HTML report¶
When using the --out-html option or when using --out-prefix, AdapterRemoval writes a single-file HTML report using the default filename ${prefix}.html.
An example of the HTML report generated by AdapterRemoval can be found here: Example HTML report
Warning
The HTML report currently requires an active internet connection, due to depending on external JavaScript files for drawing figures.
JSON report¶
When using the --out-json option or when using --out-prefix, AdapterRemoval writes a JSON report using the default filename ${prefix}.json.
An example of the JSON report generated by AdapterRemoval can be found here: Example JSON report. A full JSON schema defining the content of the report can be found here: Report JSON schema