The International 1000 Genomes Project (1000genomes)

罗大黑学生信 2021-07-28

Introduction

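The International 1000 Genomes Project is a collaboration among scientists from China, the United Kingdom, the United States, Germany and other countries. Its goal is to build the most detailed and medically useful map of human genetic polymorphism to date. In November 2012, researchers from this large international project published genome data for 1,092 individuals in the British journal Nature, a result that supports broader analysis of disease-related genetic variation. The NCBI 1000 Genomes browser is at: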

https://www.ncbi.nlm.nih.gov/variation/tools/1000genomes/

The 1000genomes database records information about SNPs, including the gene symbol, the chromosomal region, and the alternative allele frequency in different populations.

Example: rs121434596. URL:

http://grch37.ensembl.org/Homo_sapiens/Gene/Sequence?db=core;g=ENSG00000213281;r=1:115258244-115259244;t=ENST00000369535;v=rs121434596;vdb=variation;vf=642150019



Annotation with software

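Annotation software: ANNOVAR

Database used by the annotation software: 1000g2015aug (ALL.sites.2015_08)

Information obtained: alternative allele frequencies from the 1000 Genomes Project

Command (the EAS.sites.2015_09 protocol refers to the reformatted East Asian file hg19_EAS.sites.2015_09.txt described under "Problem"):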

table_annovar.pl  esp.avinput humandb/ -buildver hg19 -out myanno -remove -protocol refGene,EAS.sites.2015_09 -operation g,f -nastring .


>> cat esp.avinput
1 11012 11012 C G




>> cat myanno.hg19_multianno.txt
Chr Start End Ref Alt Func.refGene Gene.refGene GeneDetail.refGene ExonicFunc.refGene AAChange.refGene EAS.sites.2015_09
1 11012 11012 C G upstream DDX11L1 . . . 0.0367
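
To pull the frequency column back out of the annotated table, something like the following may help. It is a minimal sketch, assuming the multianno file is tab-delimited with one header line and that the frequency is the last column, as in the output above. The function name annotated_freqs is hypothetical and not part of ANNOVAR.

```shell
# annotated_freqs FILE   (hypothetical helper, not an ANNOVAR command)
# Prints Chr, Start, Ref, Alt and the last column (the frequency) for every
# data row whose last column is not the "." placeholder set by -nastring.
# Assumes a tab-delimited file with a single header line.
annotated_freqs() {
  awk -F '\t' '
    BEGIN { OFS = "\t" }
    NR > 1 && $NF != "." { print $1, $2, $4, $5, $NF }
  ' "$1"
}
```

Calling `annotated_freqs myanno.hg19_multianno.txt` lists only the variants that received a frequency. Rows left at "." had no match in the database file.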


Problem

The run printed the following errors:

Argument "A" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 2.
Argument "T" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 3.
Argument "T" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 4.
Argument "T" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 5.
Argument "T" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 6.
Argument "A" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 7.
Argument "C" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 8.
Argument "G" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 9.
Argument "C" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 10.
Argument "C" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 11.
Argument "C" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 12.
Argument "CCGCCGTTGCAAAGGCGCGCCG" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 13.
Argument "CGCCGTTGCAAAGGCGCGCCG" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 14.
Argument "G" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 15.
Argument "C" isn't numeric in numeric eq (==) at /home/pipeline/helitec_pipeline/bin/annovar/annotate_variation.pl line 2461, <DB> line 16.
Done
-----------------------------------------------------------------
NOTICE: Multianno output file is written to myanno.hg19_multianno.txt
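
After reading the code, the cause appears to be the database file: it lacks a start (or end) column. The non-numeric values in the messages are allele strings, which is consistent with an allele sitting where a position is expected. The file was modified as follows:
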
>> head hg19_EAS.sites.2015_08.txt
1 10177 A 0C 0.3363 rs367896724
1 10177 A 1AC 0.3363 rs367896724
1 10352 T 0A 0.4306 rs555500075
1 10352 T 1TA 0.4306 rs555500075
1 10542 C T 0.001 rs572818783




>> awk '{print $1"\t"$2"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6}' hg19_EAS.sites.2015_08.txt > hg19_EAS.sites.2015_09.txt
>> head hg19_EAS.sites.2015_09.txt
1 10177 10177 A 0C 0.3363 rs367896724
1 10177 10177 A 1AC 0.3363 rs367896724
1 10352 10352 T 0A 0.4306 rs555500075
1 10352 10352 T 1TA 0.4306 rs555500075
1 10542 10542 C T 0.001 rs572818783
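
A quick layout check can confirm that the reformatted file has the shape the fix is aiming for. This is a minimal sketch, assuming the expected layout is "chr start end ref alt value ..." with integer start and end columns, which is what the error messages and the fix above suggest. The function name count_bad_rows is hypothetical.

```shell
# count_bad_rows FILE   (hypothetical helper, not an ANNOVAR command)
# Prints the number of rows that do not look like "chr start end ref alt value":
# fewer than 6 fields, or a non-integer value in column 2 or column 3.
count_bad_rows() {
  awk '
    NF < 6 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { bad++ }
    END { print bad + 0 }
  ' "$1"
}
```

Under these assumptions, `count_bad_rows hg19_EAS.sites.2015_08.txt` counts every row, because the third column holds an allele, while `count_bad_rows hg19_EAS.sites.2015_09.txt` should print 0. Note that setting end equal to start describes single-base variants. Whether it is also right for the indel rows (such as the 0C and 1AC entries) is not verified here.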

