BAM
Functions
- Read a BAM file and report its statistics (read counts, mapping ratio…);
- Get the depth over a genomic region (with visualization);
- Get the reads that overlap a genomic region;
Design
Most of the functions are built on “samtools”, which must be version 1.3.0 or later:
- samtools depth: gets the coverage depth;
- samtools view chrN:start-end: gets the overlapping reads;
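As a rough illustration of the two samtools calls listed above, the sketch below only builds the argument lists and does not run them. The helper names `depth_command` and `view_command` are hypothetical and are not part of baseq; the `-a` and `-r` options are standard `samtools depth` options.

```python
def depth_command(bam_path, chrom, start, end, include_zero=False):
    """Build the argument list for `samtools depth` over one region (hypothetical helper)."""
    cmd = ["samtools", "depth"]
    if include_zero:
        cmd.append("-a")  # also report positions with zero coverage
    cmd += ["-r", f"{chrom}:{start}-{end}", bam_path]
    return cmd


def view_command(bam_path, chrom, start, end):
    """Build the argument list for `samtools view` restricted to one region (hypothetical helper)."""
    return ["samtools", "view", bam_path, f"{chrom}:{start}-{end}"]
```

Either list could then be passed to `subprocess.run(cmd, capture_output=True, text=True)`, provided samtools is installed and the BAM file is indexed.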
Class
class baseq.bam.BAMTYPE(path, bedfile='')
BAM file handler, based on samtools. On initialization, it reads the file at path using samtools and parses the headers.
- Usage:
- Stats on enrichment quality.
get_columns(rows=10000, colIdx=6)
Read the BAM file using samtools and return the values in column <colIdx> for the first <rows> rows. The columns of a BAM file are:
- header
- flags
- chromosome
- start
- mapping quality
- cigar
colIdx is 1-based, so colIdx=3 selects the chromosome column.
BAMTYPE(path).get_columns(1000, 3) # ['chr1', 'chr1', ...]
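The column lookup can be pictured with the following minimal sketch, which works on text lines such as those printed by `samtools view`. It is not the baseq implementation; `extract_column` is a hypothetical helper, and the records in the test data are invented for illustration.

```python
def extract_column(sam_lines, col_idx, rows=10000):
    """Return the 1-based column `col_idx` from the first `rows` alignment lines."""
    values = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines, if any are present
        if len(values) >= rows:
            break
        fields = line.rstrip("\n").split("\t")
        values.append(fields[col_idx - 1])  # convert 1-based index to 0-based
    return values
```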
get_reads(chr, start, end)
Return the reads that overlap the region chrN:start-end.
- Reads whose CIGAR string contains “N” are skipped.
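The “N” filter can be sketched as below, assuming tab-separated alignment lines with the CIGAR string in the sixth column. `drop_spliced` is a hypothetical name used only for this example.

```python
def drop_spliced(sam_lines):
    """Keep only alignment lines whose CIGAR (6th column) has no 'N' operation."""
    kept = []
    for line in sam_lines:
        fields = line.rstrip("\n").split("\t")
        if "N" in fields[5]:
            continue  # 'N' marks a skipped reference region, e.g. a spliced read
        kept.append(line)
    return kept
```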
region_depth(chr, start, end, all=False)
Get the per-base coverage depth in the region. Chromosome names both with and without the “chr” prefix (“chr1” and “1”) are supported.
Parameters: all – whether bases with zero coverage are returned.
Usage:
BAMTYPE(path).region_depth("chr1", 1000, 2000, all=True) # returns a depth list such as [0,1,1,1,2,2,2,3,0]
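One way to turn `samtools depth` output (tab-separated chromosome, position, depth) into such a list is sketched here. The function name `depth_list` and the `include_zero` parameter are hypothetical, and the treatment of `include_zero=False` (dropping zero-coverage bases) is an assumption based on the description of `all`.

```python
def depth_list(depth_text, start, end, include_zero=False):
    """Convert `samtools depth` text into a list of depths for start..end (inclusive)."""
    by_pos = {}
    for line in depth_text.splitlines():
        if not line.strip():
            continue
        _chrom, pos, depth = line.split("\t")[:3]
        by_pos[int(pos)] = int(depth)
    if include_zero:
        # One entry per base; positions missing from the output count as zero.
        return [by_pos.get(p, 0) for p in range(start, end + 1)]
    # Assumption: without include_zero, only covered bases are reported.
    return [by_pos[p] for p in sorted(by_pos) if start <= p <= end and by_pos[p] > 0]
```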
stats_bam()
Read the bampath.stat file; if it does not exist, run samtools flagstat. The results are stored in:
- self.reads_total
- self.reads_mapped
- self.mapping_ratio
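How these three values might be derived from flagstat text is sketched below. `parse_flagstat` is a hypothetical helper, not the baseq code, and it assumes the usual `N + M label` line layout, counting only QC-passed reads; whether baseq stores the ratio as a fraction or a percentage is not stated in the source.

```python
import re


def parse_flagstat(text):
    """Extract total reads, mapped reads and their ratio from flagstat-style text."""
    total = mapped = None
    for line in text.splitlines():
        match = re.match(r"(\d+) \+ (\d+) (.*)", line)
        if not match:
            continue
        passed, label = int(match.group(1)), match.group(3)
        if label.startswith("in total"):
            total = passed
        elif label.startswith("mapped ("):
            mapped = passed
    # Assumption: the ratio is expressed as a fraction between 0 and 1.
    ratio = mapped / total if total and mapped is not None else 0.0
    return {"reads_total": total, "reads_mapped": mapped, "mapping_ratio": ratio}
```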
stats_duplicates()
Estimate the duplication rate from the first 1M reads. Duplicates must already be marked in the flag column.
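In the SAM format, a duplicate read is marked by bit 0x400 (1024) of the flag. A minimal sketch of computing a duplication rate from a list of flag values follows; `duplication_rate` is a hypothetical helper and the flags in the usage line are invented.

```python
DUPLICATE_FLAG = 0x400  # SAM flag bit for PCR or optical duplicates


def duplication_rate(flags):
    """Return the fraction of reads whose flag has the duplicate bit set."""
    flags = [int(f) for f in flags]
    if not flags:
        return 0.0
    duplicates = sum(1 for f in flags if f & DUPLICATE_FLAG)
    return duplicates / len(flags)
```

For example, `duplication_rate(["0", "16", "1024", "1040"])` counts the last two reads as duplicates, since both 1024 and 1040 have bit 0x400 set.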
stats_region_coverage(numbers=1000)
Check the enrichment quality.
- Requires a bedfile when initializing the class
- Selects <numbers> regions at random
- Uses a multithreaded pool to get the coverage depth of the regions
- Reports the ratio of bases covered at 10X, 30X, 50X and 100X
Usage:
BAMTYPE("sample.bam", "panel.bed").stats_region_coverage(1000)
The results are saved in object properties: self.mean_depth/self.pct_10X/..
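The final summary step can be sketched as follows, given the per-base depths already collected from the sampled regions. `coverage_summary` is a hypothetical helper; the key names mirror the properties mentioned above, and reporting the ratios as percentages is an assumption.

```python
def coverage_summary(depths, thresholds=(10, 30, 50, 100)):
    """Compute mean depth and the share of bases at or above each threshold."""
    depths = list(depths)
    n = len(depths)
    summary = {"mean_depth": sum(depths) / n if n else 0.0}
    for t in thresholds:
        covered = sum(1 for d in depths if d >= t)
        # Assumption: ratios are reported as percentages of all sampled bases.
        summary[f"pct_{t}X"] = 100.0 * covered / n if n else 0.0
    return summary
```

Region sampling and the thread pool are left out of this sketch; they could be layered on top with `random.sample` and `concurrent.futures.ThreadPoolExecutor` from the standard library.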