※ VirusMap Introduction:

Influenza A virus, a highly virulent pathogen that has caused several pandemics over the course of human history, remains a major threat to human health. In 2013, a novel avian-origin influenza A virus (H7N9) was reported in eastern China. During the outbreak, the virus infected over 650 people in China, causing 271 deaths as well as widespread social panic (NHFPC 2016). To support research on the influenza A virus, several public databases have been made available, such as the National Institute of Allergy and Infectious Diseases (NIAID) (Fauci 2005), the Influenza Virus Resource (IVR) of NCBI (Bao, et al. 2008), the influenza sequence and epitope database (ISED) (Yang, et al. 2009), the OpenFluDB (Liechti, et al. 2010), and the Influenza Research Database (IRD) (Squires, et al. 2012). However, these databases offer only sequence-related tools and lack a convenient user interface for displaying the geographical distribution of influenza viruses. To address this, Chang et al. provided a simple visualization tool for displaying the worldwide geographic distribution of influenza viruses and built the Influenza Virus Database (IVDB) (Chang, et al. 2007), but IVDB has not been updated since 2007. There is therefore still a great need for a user-friendly graphical interface for visualizing and analyzing influenza A virus data in a convenient manner.

In this work, we report a visualization database called VirusMap for investigating the epidemiological and geographical distribution of influenza A viruses. We downloaded 615,866 protein and 482,663 nucleotide sequences of influenza A viruses in FASTA format from IVR (Bao, et al. 2008) and IRD (Squires, et al. 2012). Under the data submission policies of these databases, the subtype, host, sampling location, sampling time and serotype should be included for each virus strain. Thus, the title line of each FASTA sequence contains all of the necessary information. We extracted this information through a semi-automated series of steps. To ensure data quality, only entries with complete host, serotype and sampling information were preserved. In total, 583,052 protein and 448,495 nucleotide records were retained and stored in a MySQL database. As the data were obtained from the IVR, VirusMap contains a comprehensive and frequently updated dataset on the influenza A virus. Furthermore, VirusMap provides a workbench tool for analyzing the phylogenetic relationships of influenza A viruses. When a nucleic acid or protein sequence of interest is entered into the workbench tool, homologous sequences are retrieved by a BLAST search and a phylogenetic tree is constructed through a standard pipeline.
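The metadata extraction and filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline actually used to build VirusMap: the pipe-delimited title-line layout, the field names in FIELDS, the placeholder values in MISSING and the function names are all assumptions made for illustration, and the real IVR and IRD title lines may be laid out differently.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical field order of a title line; real IVR/IRD headers may differ.
FIELDS = ("accession", "strain", "subtype", "host", "location", "date")
# Fields that must be present for an entry to be kept.
REQUIRED = ("subtype", "host", "location", "date")
# Assumed placeholders that count as missing information.
MISSING = {"", "na", "n/a", "unknown", "-"}


def parse_header(title: str) -> Optional[Dict[str, str]]:
    """Split one FASTA title line into named fields; None if malformed."""
    parts = [p.strip() for p in title.lstrip(">").split("|")]
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def is_complete(record: Dict[str, str]) -> bool:
    """True if every required field carries a real value."""
    return all(record[key].lower() not in MISSING for key in REQUIRED)


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (title line, concatenated sequence) pairs from FASTA text."""
    title, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line, []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        yield title, "".join(chunks)


def filter_complete(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Keep only entries whose title line parses and is fully annotated."""
    kept = []
    for title, sequence in read_fasta(lines):
        record = parse_header(title)
        if record is not None and is_complete(record):
            record["sequence"] = sequence
            kept.append(record)
    return kept


if __name__ == "__main__":
    # Made-up toy entries, not real database records.
    example = [
        ">ACC0001|A/example/1/2013|H7N9|Human|China|2013-03-01",
        "MNTQILVFAL",
        "IAIIPTNA",
        ">ACC0002|A/example/2/2013|H7N9|Unknown|China|2013-04-02",
        "MKTIIALSYI",
    ]
    for rec in filter_complete(example):
        print(rec["accession"], rec["subtype"], rec["host"], rec["location"])
```

In this sketch, entries with a malformed title line or a missing required field are simply dropped, which mirrors the rule that only fully annotated entries were preserved. The retained dictionaries could then be written to a database table, one row per record.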

The workflow of VirusMap


For publication of results, please cite the following article:

VirusMap: A visualization database for the influenza A virus.
Yubin Xie, Xiaotong Luo, Zhihao He, Yueyuan Zheng, Zhixiang Zuo, Qi Zhao, Yanyan Miao, and Jian Ren.
Journal of Genetics and Genomics. 20 May 2017;Volume 44, Issue 5 Pages 281–284.
