This user guide will assist you in using PolyDoms, an integrated database of human coding single nucleotide polymorphisms (SNPs) and their annotations. In addition to integrating several coding SNP (cSNP) and protein-related information resources, PolyDoms predicts the implications of non-synonymous SNPs (nsSNPs) using two well-known algorithms, SIFT and PolyPhen. The results are presented in a visualization that maps the cSNPs onto protein domains and highlights those nsSNPs that are potentially damaging/deleterious or have been reported as disease allelic variants (based on OMIM). The query interface also supports searching for a list of proteins associated with any gene ontology term, pathway, disease term or gene family. Results can be downloaded as a spreadsheet. The visualization page also provides links to several other related sources and dynamic links to literature references.
Protein functional domain information is based on NCBI's Conserved Domain Database (CDD). PolyDoms uses a single main source of SNPs: dbSNP from NCBI. The implication of each SNP, as predicted by SIFT and PolyPhen, is also displayed. PolyDoms obtains the 3-D structure of the protein from the Protein Data Bank (PDB) and maps the SNPs onto it. This 3-D image is displayed using MDL's Chime, with the SNPs highlighted in pink/red. The frequency (rate of occurrence) and the submitter of each SNP are also presented to help you judge its significance. PolyDoms integrates data from a large collection of databases for pathways, interactions and allelic variations. Please visit the links page for a complete list of databases used.
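To assist you in using PolyDoms, this document has been divided into four sections that reflect the four main tasks you can perform with the system: searching the database, viewing synonymous or non-synonymous output, analyzing the PolyDoms outputs, and downloading PolyDoms results.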
If you need additional support at any time, contact the PolyDoms administrator.
With PolyDoms, you can use a variety of query terms to search an extensive database for cSNPs and other information. There are two modes of searches: basic and advanced (which enables you to search by disease, gene ontology, pathway or gene family).
Basic search
If you wish to begin your search with specific information, follow these steps:
1. In the top half of the PolyDoms home page, select one of the following options, and enter related search terms in the text box:
- Accession Number(s) - Type approved Reference Sequence (RefSeq) IDs only, such as NM (mRNA) and NP (protein) numbers. Examples of acceptable formats are NM_002583 and NP_002320. NM or NP numbers can be obtained from NCBI's RefSeq database via Entrez Gene. To search for several accession numbers at once, place a comma "," between entries (for example, NM_002583, NP_002320, NM_992883). Searches are not case-sensitive.
- Gene/protein symbol(s) - Type approved gene or protein symbols or their synonyms. For multiple searches, simply place a comma "," between the Gene/Protein Symbols (for example, ATM, Mpg, XrCC1, PcNa). Searches are not case-sensitive.
- Entrez Gene ID - You can enter a list of Entrez Gene IDs. Please note that all gene IDs are integers. Examples of gene IDs are 492, 472 and 796.
- rsSNP ID - You can enter a list of rsSNP IDs. Some examples of rsSNP IDs are rs6500342, rs2203504 and so on.
- Gene/Protein Description - Enter an approved protein descriptor (for example, cyclin-dependent kinase OR Methylpurine DNA glycosylase). Note that this field does not support multiple search entries.
- ProbeSet ID - Search the database by probe set IDs. Currently, the human probe sets belonging to U133PLUS2, GNF1H, U95AV2 and ILLUMINA are queried.
Note:
Multiple query terms can be separated by a space, comma or line break.
2. To filter your query, select one or more items from the list box. These filters enable you to search the database based on the characteristics of polymorphisms or the annotations of genes/proteins (whether they have at least one nsSNP, at least one synonymous SNP, a 3-D structure available, an OMIM disease entry associated with them; whether they are associated with a pathway; whether they are interacting with any other proteins; or whether they have an allelic variant causing a disease).
Note:
To select multiple terms, hold the Control (Ctrl) key as you click if you are a Windows user; hold the Apple key as you click if you are a Mac user.
3. Click Search.
Most of the time, your search criteria will limit the results to fewer than ten. If too many results are returned, repeat your search, entering more criteria in step 2. The more criteria you enter, the fewer search results are returned.
4. Follow the instructions for viewing synonymous or non-synonymous outputs, or for downloading results.
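As an illustration of the input rules in step 1, the following Python sketch splits a pasted query on spaces, commas and line breaks and checks each term against the ID formats described above. The function names and regular expressions are assumptions made for this example and are not part of PolyDoms; the patterns encode only what this guide states (NM_/NP_ prefixes, integer Entrez Gene IDs, rs-prefixed SNP IDs, case-insensitive matching).

```python
import re

# Patterns inferred from the examples in this guide (assumptions, not PolyDoms code).
ID_PATTERNS = {
    "accession": re.compile(r"^(NM|NP)_\d+$", re.IGNORECASE),
    "entrez_gene_id": re.compile(r"^\d+$"),
    "rs_snp_id": re.compile(r"^rs\d+$", re.IGNORECASE),
}


def split_terms(raw):
    """Split a query on spaces, commas or line breaks, dropping empty pieces."""
    return [term for term in re.split(r"[\s,]+", raw.strip()) if term]


def invalid_terms(raw, id_type):
    """Return the terms that do not look like the chosen ID type."""
    pattern = ID_PATTERNS[id_type]
    return [term for term in split_terms(raw) if not pattern.match(term)]


if __name__ == "__main__":
    query = "NM_002583, np_002320\nNM_992883"
    print(split_terms(query))
    print(invalid_terms(query, "accession"))
    print(invalid_terms("492 472 abc", "entrez_gene_id"))
```

A check like this would not suit the Gene/Protein Description field, which takes a single free-text entry that may itself contain spaces.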
Advanced search
In this mode, you build your query with selectors that list the terms currently in the database. Because every term you choose already exists in the database, the results are more accurate than those of a free-text search.
1. Go to the bottom half of the PolyDoms home page, and select the type of query you wish to perform.
2. Define your query in one of the following ways:
- For disease, pathway, gene ontology and mammalian phenotype:
- Click the selector link next to the text box -- for example, Disease selector:
Note:
For mammalian phenotype, mouse phenotype data from The Jackson Laboratory are associated with the human orthologs. Human phenotype data from the Genetic Association Database at NIH are also included.
- For Search, type as many relevant terms as you wish. Search terms are not case-sensitive, and the more you enter, the more focused your search will be.
Example terms:
Pathway searches: cell cycle, dna repair, apoptosis
Ontology searches: ber, ner, heart development
Mammalian phenotype: craniofacial phenotype and cardiovascular system phenotype
- Click Search.

- Under Search Results, select one or more terms to include in your query.
Note:
To select multiple terms, hold the Control (Ctrl) key as you click if you are a Windows user; hold the Apple key as you click if you are a Mac user.
- Click Use these (diseases, pathways, ontologies, phenotypes) for search.

- Repeat the two preceding steps (select terms under Search Results, then click Use these ... for search) to add other terms to the query.
Note:
To start over, click Clear all selections.
- When you are finished adding terms to your query, click Done.
- For gene family, select an item from the dropdown menu.

3. Verify that the correct query type is selected, and that the terms you added in step 2 are populated in the correct text box.
4. To filter your query, select one or more items from the list box. These filters enable you to search the database based on the characteristics of polymorphisms or the annotations of genes/proteins (whether they have at least one nsSNP, at least one synonymous SNP, a 3-D structure available, an OMIM disease entry associated with them; whether they are associated with a pathway; whether they are interacting with any other proteins; or whether they have an allelic variant causing a disease).
Note:
To select multiple terms, hold the Control (Ctrl) key as you click if you are a Windows user; hold the Apple key as you click if you are a Mac user.
5. To process your query, click Search.
Most of the time, your search criteria will limit the results to fewer than ten. If too many results are returned, repeat your search, entering more criteria in step 2. The more criteria you enter, the fewer search results are returned.
6. Follow the instructions for viewing synonymous or non-synonymous outputs, or for downloading results.
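The statement that more criteria yield fewer results follows if the selected filters are combined conjunctively, so that a protein must satisfy every one of them. The sketch below illustrates that behavior on made-up records. The field names, the record layout and the AND semantics are assumptions for illustration only, not a description of how PolyDoms is implemented.

```python
# Hypothetical records: protein names and flag values are invented for this example.
RECORDS = [
    {"symbol": "PROT_A", "has_nssnp": True, "has_3d_structure": False, "has_omim_entry": True},
    {"symbol": "PROT_B", "has_nssnp": True, "has_3d_structure": True, "has_omim_entry": False},
    {"symbol": "PROT_C", "has_nssnp": False, "has_3d_structure": True, "has_omim_entry": False},
]


def apply_filters(records, selected):
    """Keep only records for which every selected flag is true (logical AND)."""
    return [r for r in records if all(r.get(flag, False) for flag in selected)]


if __name__ == "__main__":
    # Each additional filter can only keep or shrink the result set.
    for selected in ([], ["has_nssnp"], ["has_nssnp", "has_3d_structure"]):
        symbols = [r["symbol"] for r in apply_filters(RECORDS, selected)]
        print(selected, symbols)
```

Under these assumptions, adding a filter never increases the number of results, which matches the advice to add criteria when too many results are returned.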
Viewing synonymous or non-synonymous output
Once the system has completed its search of the PolyDoms database, a screen like the following appears:
Locate your protein of interest, and select NonSynonymous or Synonymous.
- The synonymous output
The synonymous output displays the selected protein with all known synonymous SNPs associated with it. Selecting the 3-D view enables you to see the selected protein with the synonymous SNPs mapped directly onto it. The synonymous SNPs selection also displays a SNP table with residue and position data.
- The non-synonymous output
The non-synonymous output displays the selected protein with all known non-synonymous SNPs associated with it. Selecting the 3-D view enables you to see the selected protein with the non-synonymous SNPs mapped directly onto it. In addition to the standard SNP table, the non-synonymous selection displays the implications of the non-synonymous SNPs as predicted by PolyPhen and SIFT.
Analyzing the PolyDoms outputs
PolyDoms displays all outputs on one common page so that all relevant information can be found immediately. Each output is explained in the numbered sections below.
1. The summary view
The summary view displays the gene symbol, mRNA accession number, protein accession number, name of the gene and Entrez Gene ID. Links to the GenBank files based on the accession numbers are provided as well. In addition, the diseases, gene ontology terms, mammalian phenotypes and pathways associated with the given protein are furnished from the resources listed on the links page.
Note that clicking any of the GenBank and OMIM links will display the GenBank and OMIM data for the specific protein in a new window.
2. The Navigation Menu
At the top of the main PolyDoms image are multiple links, described below:
- Home
Clicking this option returns you to the PolyDoms data entry screen.
- Help
Clicking this option opens the online help for PolyDoms.
- Jump to Synonymous/NonSynonymous view
Clicking this option switches PolyDoms' output between synonymous and non-synonymous views.
- 3-D structure
Clicking this option opens a new window in which a 3-D representation of the given protein and its SNPs is displayed.
- Chime is required to view the graphic and can be downloaded free of charge from MDL's web site.
- The SNPs are mapped onto the 3-D structure of the protein. This option is available for both synonymous and non-synonymous SNPs, which are displayed in pink so that the varied residues are easy to distinguish from the unchanged ones.
- Each SNP is also written in the format "RES1 <-> POS RES2," where the bidirectional arrow indicates that the variation can occur in either direction at the given position.
- Please read the Chime manual for further information on the 3-D structure.
- Allelic Variant
This option displays the table containing allelic variants of the gene. If a SNP matches an allelic variant, the corresponding record is highlighted in yellow.
- Interactions
This option displays the table containing protein interactions. The interaction information is obtained from sources such as HPRD, BIND and Reactome.
- PubMed reference
Clicking this option takes you to the PubMed reference output, which is described in the section on PubMed references.
- View alignments
Clicking this option displays a multiple alignment of the queried protein with all available homologs. The SNP residue(s) in the query sequence are highlighted in red, and the corresponding residues in the homologs are shown in bold.
- UCSC Proteome Browser
The UCSC Proteome Browser presents a rich set of protein annotations as well as links to several protein and genomic data sources on the web. It also provides links to a variety of external sites containing supplementary information on the protein, including SwissProt, InterPro and Pfam domains, 3-D structures at PDB, and pathway maps from KEGG, BioCarta (CGAP) and BioCyc.
- iHOP
iHOP stands for "information Hyperlinked Over Proteins." It is a web server that links genes and proteins through scientific literature touching on phenotypes, pathologies and gene function. iHOP provides this network as a natural way of accessing millions of PubMed abstracts.
- MutDB
The goal of MutDB is to annotate human variation data with protein structural information and other functionally relevant information, if available. The mutations are organized by gene.
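The SNP notation used in the 3-D structure view ("RES1 <-> POS RES2") lends itself to simple parsing if you want to collect the labels in your own notes. The sketch below is one way to do it in Python. The exact spacing and the residue alphabet are not specified in this guide, so the regular expression, the `SnpLabel` type and the sample label are assumptions.

```python
import re
from typing import NamedTuple


class SnpLabel(NamedTuple):
    residue_a: str  # RES1
    position: int   # POS
    residue_b: str  # RES2


# Assumed layout: residue, "<->", position, residue, with optional spaces.
LABEL_RE = re.compile(r"^\s*([A-Za-z]+)\s*<->\s*(\d+)\s*([A-Za-z]+)\s*$")


def parse_snp_label(label):
    """Parse a 'RES1 <-> POS RES2' label; raise ValueError if it does not match."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized SNP label: {label!r}")
    res_a, pos, res_b = match.groups()
    return SnpLabel(res_a.upper(), int(pos), res_b.upper())


if __name__ == "__main__":
    # Hypothetical label, not taken from PolyDoms output.
    print(parse_snp_label("R <-> 194 W"))
```

Because the arrow is bidirectional, the two residues are kept as an unordered pair in spirit; the sketch simply preserves the order in which they are written.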
3. The main PolyDoms image
- The blue line at the center of the image represents the polypeptide sequence. The resolution of the line depends on its length and the amount of information represented in the image.
- Domains are drawn as colored bars that span the region of the protein to which the server maps them. Details about the domains (ID, description and position) are furnished below the image. All domains are linked directly to the Conserved Domain Database (CDD) at NCBI, so you can click a domain to view its complete information.
- Variation(s) are drawn with the position and residue change. SNP labels that lie close together are stacked so that the image remains easy to read. The intersection of the vertical bar from the SNP box and the domain marks the position in the conserved domain that is varied.
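The relationship between a SNP and a domain in the image comes down to an interval check: a SNP falls within a domain when its residue position lies between the domain's start and end. The Python sketch below shows that check on made-up coordinates. The `Domain` type, the inclusive 1-based ranges and the sample values are assumptions for illustration and do not reflect PolyDoms internals.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    domain_id: str
    start: int  # first residue covered (1-based, inclusive)
    end: int    # last residue covered (inclusive)


def domains_at(position, domains):
    """Return the IDs of all domains whose range contains the residue position."""
    return [d.domain_id for d in domains if d.start <= position <= d.end]


if __name__ == "__main__":
    # Hypothetical domain ranges and SNP positions.
    domains = [Domain("domain_A", 10, 120), Domain("domain_B", 100, 250)]
    for pos in (5, 110, 200):
        print(pos, domains_at(pos, domains))
```

Returning a list rather than a single ID allows for overlapping domains, in which case one SNP position intersects more than one bar.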
4. The SNPs table(s)
The non-synonymous SNPs table:
The top of this table has a link to the NCBI database (dbSNP). The SNPs table lists all related SNPs in rows. For each non-synonymous SNP, the table gives the potential implications as scored by the prediction algorithms of the PolyPhen and SIFT servers. It also gives data from LS-SNP indicating whether a SNP potentially destabilizes the protein structure, occurs near a ligand, or occurs near a domain-domain interface. Rows are highlighted in gray, yellow or orange as described earlier. In addition to the allele, the table lists the rsSNP ID and the 200 bp 5' and 3' flanking sequences. UCSC links are provided so that you can view the SNP in the context of the genome. Through these links, neighboring non-coding SNPs present in the UCSC database may also be observed.
The synonymous SNPs table:
The top of this table has a link to either the Utah (GeneSNPs) or NCBI (dbSNP) database, depending on the download source. The SNPs table lists all related SNPs in rows. Residues and positions are listed in individual columns. Rows in this table are highlighted if the SNPs match the polymorphism filter selected for the query.
5. The PubMed references
When you click PubMed reference, PolyDoms automatically brings you to the PubMed References section of the output. The top few hits are furnished at the end of the web page for easy reference. This is a live retrieval, so all information is current as of the search date.
Note:
The system may find many PubMed sources. However, to make analysis easier, only the top few sources are listed on the main PolyDoms page. More results can be viewed by clicking More....
Downloading PolyDoms Results
In addition to viewing PolyDoms results, you can download them into an Excel file.
1. At the top of the search results table, click Download the results:
2. Select the check boxes of all fields that you want to include in the downloaded file:
3. Click Download Table, and save the Excel file in a desired location.