Keep only reads that are well-formed
Description
Tests whether a read is "well-formed", that is, free of major internal inconsistencies and issues that could lead to errors downstream. If a read passes this filter, the rest of the engine should be able to process it without failing.
Well-formed reads definition
- Alignment coordinates: start larger than 0 and end after the start position.
- Alignment agrees with header: contig exists and start is within its range.
- Read Group and Sequence are present.
- Consistent read length: bases match in length with the qualities and the CIGAR string.
- No skipped regions: the CIGAR string does not contain the 'N' operator.
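The checks listed above can be sketched roughly as follows. This is an illustrative Python sketch, not the engine's actual implementation: the `Read` record, its field names, the `contig_lengths` mapping, and the `is_well_formed` function are all hypothetical. It assumes mapped reads with 1-based coordinates, and it reads "end after the start" as "end not before start" so that a single-base alignment still passes.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Read:
    """Hypothetical, simplified read record (field names are illustrative)."""
    contig: str
    start: int                  # 1-based alignment start
    end: int                    # 1-based alignment end
    read_group: Optional[str]
    bases: str
    quals: str
    cigar: str


CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
READ_CONSUMING = set("MIS=X")   # CIGAR operators that consume read bases


def cigar_read_length(cigar: str) -> int:
    """Number of read bases implied by the CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in READ_CONSUMING)


def is_well_formed(read: Read, contig_lengths: Dict[str, int]) -> bool:
    # Alignment coordinates: start > 0 and end not before start (assumption).
    if read.start <= 0 or read.end < read.start:
        return False
    # Alignment agrees with header: contig exists and start is within its range.
    if read.contig not in contig_lengths or read.start > contig_lengths[read.contig]:
        return False
    # Read group and sequence are present.
    if not read.read_group or not read.bases:
        return False
    # Consistent read length: bases, qualities, and CIGAR all agree.
    if not (len(read.bases) == len(read.quals) == cigar_read_length(read.cigar)):
        return False
    # No skipped regions ('N' operator) in the CIGAR string.
    if any(op == "N" for _, op in CIGAR_OP.findall(read.cigar)):
        return False
    return True


header = {"chr1": 1000}  # hypothetical contig name -> length
good = Read("chr1", 100, 109, "rg1", "ACGTACGTAC", "IIIIIIIIII", "10M")
spliced = Read("chr1", 100, 209, "rg1", "ACGTACGTAC", "IIIIIIIIII", "5M100N5M")
print(is_well_formed(good, header), is_well_formed(spliced, header))
```

Each check returns early on the first failure, so a read is kept only when every condition in the list holds.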
See additional information in the following pages: