SNV Analysis

SNV analysis functions.

Quality Control

Enrichment

baseq.snv.qc.enrich.enrich_qc(samplename, bampath, intervals)

Check the coverage depth and enrichment quality.

Usage:

enrich_qc("sample01", "xx.bam", "panel.bed")
Return:
Sample/Total/Mapped/Map_Ratio/Dup_ratio/PCT_10X/PCT_30X/…
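
As a rough illustration of what columns such as PCT_10X and PCT_30X could represent, the sketch below computes the fraction of target bases covered at or above a depth threshold from a list of per-base depths. The helper name coverage_summary, the Mean_Depth key, and the choice to report fractions between 0 and 1 are assumptions made for this example. They are not part of baseq, and enrich_qc may compute its values differently.

```python
from statistics import mean


def coverage_summary(depths, thresholds=(10, 30)):
    """Summarize per-base depths over the target intervals.

    Hypothetical helper for illustration only; not part of baseq.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    summary = {"Mean_Depth": mean(depths)}
    for t in thresholds:
        # Fraction of target bases with depth at or above the threshold.
        covered = sum(1 for d in depths if d >= t)
        summary[f"PCT_{t}X"] = covered / len(depths)
    return summary


# Example: five target bases with varying depth.
print(coverage_summary([5, 12, 30, 45, 8]))
```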

VCF

VCF stats and Filter

baseq.snv.vcf.GATK.vcf_stats(sample, vcfpath, min_depth=50)

Compute statistics on a VCF produced by GATK.

vcf_stats("sample1", "path/to/vcf", min_depth=30)
Return:
A dict/JSON containing Samplename/counts/mean_depth/GT_01/GT_11/MAF, where MAF is the minor allele frequency.
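
To make the shape of the returned dict more concrete, here is a minimal sketch that counts heterozygous (0/1) and homozygous-alternate (1/1) genotypes and averages the depth over single-sample VCF records. It is a hypothetical stand-in, not the baseq implementation: the function name vcf_stats_sketch, the use of the FORMAT/DP field for min_depth filtering, and the interpretation of GT_01 and GT_11 are all assumptions. MAF is left out because the way it is calculated is not specified here.

```python
def vcf_stats_sketch(sample, vcf_lines, min_depth=50):
    """Count genotypes and average depth for single-sample VCF records.

    Hypothetical stand-in for illustration; not baseq's implementation.
    """
    depths = []
    gt_counts = {"GT_01": 0, "GT_11": 0}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        # FORMAT (column 9) names the keys of the sample column (column 10).
        call = dict(zip(fields[8].split(":"), fields[9].split(":")))
        if not call.get("DP", "").isdigit():
            continue  # no usable depth value
        depth = int(call["DP"])
        if depth < min_depth:
            continue  # assumed meaning of min_depth: drop shallow sites
        depths.append(depth)
        gt = call.get("GT", "").replace("|", "/")
        if gt in ("0/1", "1/0"):
            gt_counts["GT_01"] += 1
        elif gt == "1/1":
            gt_counts["GT_11"] += 1
    mean_depth = sum(depths) / len(depths) if depths else 0.0
    return {"Samplename": sample, "counts": len(depths),
            "mean_depth": mean_depth, **gt_counts}


records = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:60",
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t1/1:40",
    "chr1\t300\t.\tG\tA\t50\tPASS\t.\tGT:DP\t1/1:20",
]
print(vcf_stats_sketch("sample1", records, min_depth=30))
```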

GATK

Run GATK

baseq.snv.gatk.alignment(fq1, fq2, sample, genome, outfile, thread=8)

Map the paired FASTQ files (fq1, fq2) to the genome using BWA. Tags are added to the BAM file using the input sample name, and the BAM file is written to outfile.

baseq.snv.gatk.bqsr(markedbam, bqsrbam, genome, disable_dup_filter=False)

Run BQSR (Base Quality Score Recalibration) on the duplicate-marked BAM file.

bqsr()
This will generate a XXXX...

baseq.snv.gatk.run_markdup(bamfile, markedbam)

Run Picard MarkDuplicates and generate the BAI index for the marked BAM file.

run_markdup("in.bam", "in.marked.bam")
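
The parameter names suggest that these steps chain together: alignment writes a BAM file, run_markdup marks duplicates in it, and bqsr takes the marked BAM as input. The sketch below wires them up in that order. The helper preprocess_bam, the output file naming scheme, and the step order are assumptions for illustration. The gatk argument is expected to be an object exposing the three functions with the signatures documented above, for example the baseq.snv.gatk module.

```python
import os


def preprocess_bam(gatk, fq1, fq2, sample, genome, outdir, thread=8):
    """Chain alignment, duplicate marking, and BQSR.

    Hypothetical helper; `gatk` must provide alignment, run_markdup, and
    bqsr. The intermediate file names are an assumption of this sketch.
    """
    bam = os.path.join(outdir, f"{sample}.bam")
    marked = os.path.join(outdir, f"{sample}.marked.bam")
    recal = os.path.join(outdir, f"{sample}.marked.bqsr.bam")

    # Each step consumes the file written by the previous one.
    gatk.alignment(fq1, fq2, sample, genome, bam, thread=thread)
    gatk.run_markdup(bam, marked)
    gatk.bqsr(marked, recal, genome)
    return recal
```

Passing the module as an argument keeps the sketch independent of how baseq is installed and makes it easy to substitute a stub when checking the call order.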

baseq.snv.gatk.selectvar(rawvcf, selectvcf, filtervcf, genome, run=True)

Select variants: it processes the XX from XX and generates XX for XX…