
Variant Calling with VarScan2

罗大黑学生信 2021-07-09

Contents


Introduction

Installation

Calling variants

Filtering results


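Introduction

        VarScan uses a robust heuristic/statistical approach to call variants that meet the desired thresholds for read depth, base quality, variant allele frequency, and statistical significance.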

It can be used to detect different types of variation:

  • Germline variants (SNPs and indels) in individual samples or pools of samples.

  • Multi-sample variants (shared or private) in multi-sample datasets (with mpileup).

  • Somatic mutations, LOH events, and germline variants in tumor-normal pairs.

  • Somatic copy number alterations (CNAs) in tumor-normal exome data.


Official site: http://varscan.sourceforge.net/germline-calling.html

This post tries using VarScan to call somatic mutations (SNPs and indels) in individual samples. Whether this is feasible remains to be confirmed.


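Installation

VarScan is distributed as a single jar file; download it and run it with Java, no further installation is needed. Download: https://sourceforge.net/projects/varscan/files/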


Calling variants

Input

        The input to VarScan2 is an mpileup file generated by samtools mpileup. The other samtools mpileup parameters can be adjusted as needed; the defaults are usually fine.

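Step 1: Generate a pileup file
# Usage: samtools mpileup -B -f [reference sequence] [BAM file] > myData.pileup
# -B disables BAQ (per-base alignment quality) computation; -f names the faidx-indexed reference FASTA.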
samtools mpileup -B -f Homo_sapiens_assembly19.fasta S2000332-L1.NRAS.C-T.gencore.sort.bam > S2000332-L1.NRAS.C-T.gencore.sort.pileup
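
To make the input format concrete, here is a minimal Python sketch that tallies reference and non-reference bases on one line of a pileup file. It assumes the standard six-column single-sample layout (chromosome, 1-based position, reference base, depth, read bases, base qualities). The function name and the example line in the usage note are illustrative and are not taken from the data above.

```python
import re

def count_pileup_bases(line):
    """Tally reference matches, mismatches and indels on one pileup line.

    Assumes six tab-separated columns: chrom, pos, ref, depth, bases, quals.
    """
    chrom, pos, ref, depth, bases, _quals = line.rstrip("\n").split("\t")[:6]
    # Drop read-start markers ('^' plus one mapping-quality character)
    # and read-end markers ('$'); they are not base calls.
    bases = re.sub(r"\^.", "", bases).replace("$", "")
    counts = {"ref": 0, "A": 0, "C": 0, "G": 0, "T": 0, "N": 0, "indel": 0}
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch in "+-":
            # Indel: sign, then a length, then that many inserted/deleted bases.
            m = re.match(r"\d+", bases[i + 1:])
            counts["indel"] += 1
            i += 1 + len(m.group()) + int(m.group())
            continue
        if ch in ".,":
            counts["ref"] += 1          # match on forward / reverse strand
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] += 1     # mismatch, either strand
        # '*' (placeholder for a deleted base) and other symbols are skipped.
        i += 1
    return chrom, int(pos), ref.upper(), int(depth), counts
```

For a made-up line such as `"1\t1000\tC\t8\t.,.T,t^].$,\tIIIIIIII"`, the function reports six reference-supporting bases and two `T` bases. VarScan performs its own parsing and quality filtering, so these raw tallies need not match its reported read counts.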


Methods

VarScan2 provides three pairs of commands for calling variants: mpileup2snp/pileup2snp (SNPs only), mpileup2indel/pileup2indel (indels only), and mpileup2cns/pileup2cns (SNPs and indels together). The help output of mpileup2cns:

java -jar VarScan.v2.4.4.jar mpileup2cns -h
Warning: No p-value threshold provided, so p-values will not be calculated
Min coverage: 8
Min reads2: 2
Min var freq: 0.2
Min avg qual: 15
P-value thresh: 0.01
USAGE: java -jar VarScan.jar mpileup2cns [pileup file] OPTIONS
mpileup file - The SAMtools mpileup file


OPTIONS:
--min-coverage Minimum read depth at a position to make a call [8]
--min-reads2 Minimum supporting reads at a position to call variants [2]
--min-avg-qual Minimum base quality at a position to count a read [15]
--min-var-freq Minimum variant allele frequency threshold [0.01]
--min-freq-for-hom Minimum frequency to call homozygote [0.75]
--p-value Default p-value threshold for calling variants [99e-02]
--strand-filter Ignore variants with >90% support on one strand [1]
--output-vcf If set to 1, outputs in VCF format
--vcf-sample-list For VCF output, a list of sample names in order, one per line
--variants Report only variant (SNP/indel) positions [0]


Step 2: Call consensus (SNPs and indels)
java -jar VarScan.v2.4.4.jar mpileup2cns \
S2000332-L1.NRAS.C-T.gencore.sort.pileup \
--min-var-freq 0.01 \
--p-value 0.01 \
--output-vcf 1 \
--variants > S2000332-L1.NRAS.C-T.snv.vcf
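
If the two steps need to be scripted, they can be wrapped in a few lines of Python. The sketch below only assembles the same command lines used above and runs them with the standard library; the helper names are hypothetical, and it assumes `samtools` and `java` are on the PATH.

```python
import subprocess

def mpileup_cmd(reference, bam):
    """Command line for step 1 (same flags as the samtools call above)."""
    return ["samtools", "mpileup", "-B", "-f", reference, bam]

def mpileup2cns_cmd(jar, pileup, min_var_freq=0.01, p_value=0.01):
    """Command line for step 2 (same options as the VarScan call above)."""
    return ["java", "-jar", jar, "mpileup2cns", pileup,
            "--min-var-freq", str(min_var_freq),
            "--p-value", str(p_value),
            "--output-vcf", "1",
            "--variants"]

def run_to_file(cmd, out_path):
    """Run cmd, write its standard output to out_path, raise on a non-zero exit."""
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

Usage would look like `run_to_file(mpileup_cmd("Homo_sapiens_assembly19.fasta", "sample.bam"), "sample.pileup")` followed by `run_to_file(mpileup2cns_cmd("VarScan.v2.4.4.jar", "sample.pileup"), "sample.vcf")`, where the `sample.*` names are placeholders.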


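Filtering results

As described above and on the official site, calling variants with VarScan2 is straightforward and takes only two steps: (1) convert the BAM file to pileup format; (2) pass the pileup file to VarScan2 for variant detection. Parameters can be adjusted as needed; the defaults are usually fine.

Although calling is simple, the resulting VCF file tends to be large and contains many candidate variants, so filtering becomes the most important part. How do you pick out reliable true-positive variants from all this data? (Before filtering, make sure you understand what each VCF field means: the INFO and FORMAT columns give the genotype depth, site quality score, minor allele frequency, and total genotype call depth.)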
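
As an aid to reading those fields, here is a minimal Python sketch that splits one VCF data line into its site columns and a dictionary of per-sample values. It assumes a single-sample VCF written with `--output-vcf 1`, and it assumes the FORMAT column carries a `FREQ` key holding a percentage string (for example `12.5%`); check the `##FORMAT` header lines of your own file, since the exact keys are not shown in this post. The function name is illustrative.

```python
def parse_varscan_record(line):
    """Split one single-sample VCF data line into site fields and sample values."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, flt, info = cols[:8]
    # Column 9 names the per-sample keys, column 10 holds the matching values.
    sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
    record = {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt,
              "filter": flt, "info": info, "sample": sample}
    # Assumption: FREQ is a percentage string such as "12.5%".
    if "FREQ" in sample:
        freq = sample["FREQ"].rstrip("%").replace(",", ".")
        record["vaf"] = float(freq) / 100.0
    return record
```

The returned `sample` dictionary keeps every value as a string, so depth-like fields need an explicit `int(...)` before comparison.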

1.  Restrict the variants to the regions covered by the panel

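# -a is the VCF so that VCF records are reported; -header keeps the VCF header; -u reports each record once
bedtools intersect -header -u -a S2000332-L1.NRAS.C-T.snv.vcf -b NRAS.C-T.bed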
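
The same restriction can be sketched in plain Python, which also makes the coordinate conventions explicit: BED intervals are 0-based and half-open, while VCF positions are 1-based. This is a simplified illustration with hypothetical function names; it tests only the POS column, so an indel that starts just outside a region but extends into it is handled differently from bedtools, which compares whole intervals.

```python
def load_bed(lines):
    """Read BED lines into {chrom: [(start, end), ...]} (0-based, half-open)."""
    regions = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    return regions

def in_panel(chrom, pos, regions):
    """True if a 1-based VCF position falls inside any BED interval."""
    # 1-based pos p is 0-based p - 1, so start <= p - 1 < end  <=>  start < p <= end.
    return any(start < pos <= end for start, end in regions.get(chrom, []))

def restrict_vcf(vcf_lines, regions):
    """Yield header lines plus the records whose POS lies inside the panel."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if in_panel(chrom, int(pos), regions):
            yield line
```

A linear scan over the intervals is enough for a small panel; for genome-wide BED files an interval index or bedtools itself is the better choice.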

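2. General filtering, based on:

a.  Minimum (average) read depth at the variant position

b.  Minimum number of reads supporting the alt allele

c.  Strand balance (variant seen on a single strand or on both strands)

d.  Average base quality of the alt-supporting reads

e.  VAF (variant allele frequency)

These criteria correspond to the options of the VarScan filter command, whose help output is: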

Min coverage:  10
Min reads2: 2
Min strands2: 1
Min var freq: 0.2
Min avg qual: 15
P-value thresh: 0.1


USAGE: java -jar VarScan.jar filter [variant file] OPTIONS
variant file - A file of SNPs or indels


OPTIONS:
--min-coverage Minimum read depth at a position to make a call [10]
--min-reads2 Minimum supporting reads at a position to call variants [2]
--min-strands2 Minimum # of strands on which variant observed (1 or 2) [1]
--min-avg-qual Minimum average base quality for variant-supporting reads [20]
--min-var-freq Minimum variant allele frequency threshold [0.20]
--p-value Default p-value threshold for calling variants [1e-01]
--indel-file File of indels for filtering nearby SNPs
--output-file File to contain variants passing filters
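
The five criteria listed above can be sketched as a single predicate over the per-sample values of one VCF record. This illustrates the logic only and is not VarScan's own implementation. The FORMAT keys used here (`DP`, `AD`, `ADF`, `ADR`, `ABQ`, `FREQ`) are an assumption about VarScan's VCF output and should be checked against the header of your file; the default thresholds mirror the bracketed option defaults in the help text above, and the p-value filter is left out.

```python
def passes_filters(sample, min_coverage=10, min_reads2=2, min_strands2=1,
                   min_avg_qual=20, min_var_freq=0.20):
    """Return True if a per-sample FORMAT dict (string values) meets all thresholds."""
    depth = int(sample["DP"])                      # a. read depth at the site
    alt_reads = int(sample["AD"])                  # b. alt-supporting reads
    # c. number of strands (forward/reverse) with at least one alt read
    strands = sum(1 for key in ("ADF", "ADR") if int(sample[key]) > 0)
    alt_qual = int(sample["ABQ"])                  # d. avg quality of alt bases
    # e. VAF; FREQ is assumed to be a percentage string such as "25%"
    vaf = float(sample["FREQ"].rstrip("%").replace(",", ".")) / 100.0
    return (depth >= min_coverage
            and alt_reads >= min_reads2
            and strands >= min_strands2
            and alt_qual >= min_avg_qual
            and vaf >= min_var_freq)
```

For low-frequency variants the `min_var_freq` default of 0.20 would discard most calls, so it needs to be lowered to match the frequency used at the calling step (0.01 in this post).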






Reposted from 罗大黑学生信.
