Important: As of March 15th, 2013, this site is maintained at http://irefindex.org
This site is for archival purposes and may eventually be deleted.
iRefIndex Related Publications
iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinformatics. 2008 Available here.
iRefWeb: interactive analysis of consolidated protein interaction data and their supporting evidence. Database. 2010 Available here
Literature curation of protein interactions: measuring agreement across major public databases. Database. 2010 Available here
Interaction databases on the same page. Nature Biotechnology. 2011. Available here
PSICQUIC and PSISCORE: accessing and scoring molecular interactions. Nature Methods. 2011. Available here
iRefScape. A Cytoscape plug-in for visualization and data mining of protein interaction data from iRefIndex. BMC Bioinformatics 2011. Available here
iRefR: an R package to manipulate the iRefIndex consolidated protein interaction database. BMC Bioinformatics 2011. Available here
iRefIndex has been published and is available here. If you use iRefIndex, please cite:
Razick S, Magklaras G, Donaldson IM: iRefIndex: A consolidated protein interaction database with provenance. BMC Bioinformatics. 2008. 9(1):405 PMID 18823568.
or one of the other publications listed above if it is more appropriate.
Please also cite the source databases described below. iRefIndex consolidates protein interaction data from...
- BIND,
- BioGRID,
- CORUM,
- DIP,
- HPRD,
- IntAct,
- MINT,
- MPact,
- MPPI and
- OPHID.
iRefIndex uses SEGUID based identifiers to group proteins into redundant groups. The SEGUID algorithm and database are described in .
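The grouping idea can be sketched in a few lines of Python. This is a minimal illustration, not the iRefIndex pipeline: it assumes the commonly used definition of a SEGUID (the base64-encoded SHA-1 digest of the upper-cased sequence, with trailing "=" padding removed), and the accessions, sequences and the helper name group_by_seguid are hypothetical. The identifiers iRefIndex actually assigns may incorporate more than the SEGUID alone.

```python
import base64
import hashlib
from collections import defaultdict


def seguid(sequence: str) -> str:
    """Return a SEGUID-style checksum for a protein sequence.

    Assumption: SEGUID = base64(SHA-1(upper-cased sequence)) without
    the trailing '=' padding.
    """
    normalized = sequence.strip().upper().encode("ascii")
    digest = hashlib.sha1(normalized).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def group_by_seguid(records):
    """Group (accession, sequence) pairs that share an identical sequence.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for accession, sequence in records:
        groups[seguid(sequence)].append(accession)
    return dict(groups)


# Made-up records: the first two differ only in letter case,
# so they fall into the same redundant group.
records = [
    ("db1:P1", "MKTAYIAKQR"),
    ("db2:X9", "mktayiakqr"),
    ("db1:P2", "MKTAYIAKQL"),
]
groups = group_by_seguid(records)
for key, accessions in groups.items():
    print(key, accessions)
```

Because the key is derived from the sequence itself, records from different source databases fall into the same group without relying on cross-references between accession systems.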
References to source interaction databases
Publications citing iRefIndex
This is a partial list - use Google Scholar for a complete and more up to date list of citations.
- Orchard S, Kerrien S, Abbani S, Aranda B, Bhate J, Bidwell S, Bridge A, Briganti L, Brinkman F, Cesareni G, Chatr-Aryamontri A, Chautard E, Chen C, Dumousseau M, Goll J, Hancock R, Hannick LI, Jurisica I, Khadake J, Lynn DJ, Mahadevan U, Perfetto L, Raghunath A, Ricard-Blum S, Roechert B, Salwinski L, Stümpflen V, Tyers M, Uetz P, Xenarios I, Hermjakob H. Protein interaction data curation: the International Molecular Exchange (IMEx) consortium. Nat Methods. 2012 Mar 27;9(4):345-350. doi: 10.1038/nmeth.1931. PMID 22453911.
- Gillis J, Pavlidis P. "Guilt by association" is the exception rather than the rule in gene networks. PLoS Comput Biol. 2012 Mar;8(3):e1002444. Epub 2012 Mar 29. PMID 22479173; PubMed Central PMCID: PMC3315453.
- Reimand, J., Hui, S., Jain, S., Law, B., Bader, GD. Domain-mediated protein interaction prediction: From genome to network. FEBS Letters. 2012. http://dx.doi.org/10.1016/j.febslet.2012.04.027
- Kritikos GD, Moschopoulos C, Vazirgiannis M, Kossida S. Noise reduction in protein-protein interaction graphs by the implementation of a novel weighting scheme. PMID 21676899.
- Saliha Ece Acuner Ozbabacan, Hatice Billur Engin, Attila Gursoy and Ozlem Keskin. Transient protein–protein interactions. PMID 21679454.
- Choi, H. et al. SAINT: probabilistic scoring of affinity purification-mass spectrometry data. Nat Methods 8, 70-73 (2011). PMID:21131968. (Read more)
- Croft, D. et al. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res 39, D691-697 (2011). PMID:21067998. (Read more)
- Gillis, J. & Pavlidis, P. The role of indirect connections in gene networks in predicting function. Bioinformatics (2011). PMID:21551147. (Read more)
- Hao, Y. et al. OrthoNets: simultaneous visual analysis of orthologs and their interaction neighborhoods across different organisms. Bioinformatics 27, 883-884 (2011). PMID:21257609 (Read more)
- Nolin, M.-A., Dumontier, M., Belleau, F. & Corbeil, J. Building an HIV data mashup using Bio2RDF. Briefings in Bioinformatics (2011). iRefIndex was mentioned as a protein-protein interaction data provider and for its contribution to the Bio2RDF project.
- Turinsky, A.L. et al. DAnCER: disease-annotated chromatin epigenetics resource. Nucleic Acids Res 39, D889-894 (2011). PMID:20876685. (Read more)
- Valsesia, A. et al. Network-guided analysis of genes with altered somatic copy number and gene expression reveals pathways commonly perturbed in metastatic melanoma. PLoS One 6, e18369 (2011). PMID 21494657. (Read more)
- Vidal, M., Cusick, M.E. & Barabasi, A.L. Interactome networks and human disease. Cell 144, 986-998 (2011). PMID 21414488. The literature curated data quality assessment project was mentioned.
- Zhang, K.X. & Ouellette, B.F. CAERUS: predicting CAncER oUtcomeS using relationship between protein structural information, protein networks, gene expression data, and mutation data. PLoS Comput Biol 7, e1001114 (2011). PMID 21483478. (Read more)
- Stojmirović A, Yu YK. ppiTrim: constructing non-redundant and up-to-date interactomes. Database (Oxford). 2011 Aug 27;2011:bar036. Print 2011. PubMed PMID 21873645. (Read more)
- Mistry M, Gillis J, Pavlidis P. Genome-wide expression profiling of schizophrenia using a large combined cohort. Mol Psychiatry. 2012 Jan 3. doi:10.1038/mp.2011.172. PMID 22212594.
- Ceol, A. et al. MINT, the molecular interaction database: 2009 update. Nucleic Acids Res 38, D532-539 (2010). PMID 19897547. iRefIndex was mentioned as a consolidated data provider.
- Garcia-Garcia, J., Guney, E., Aragues, R., Planas-Iglesias, J. & Oliva, B. Biana: a software framework for compiling biological interactions and analyzing networks. BMC Bioinformatics 11, 56 (2010). PMID 20105306. iRefIndex was compared to the system discussed in this paper. The iRefIndex procedure was suggested as a unification protocol to be used with the BIANA system.
- Jain, S. & Bader, G.D. An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology. BMC Bioinformatics 11, 562 (2010). PMID 21078182. (See the section Independent research projects using iRefIndex.)
- Lopes, C.T. et al. Cytoscape Web: an interactive web-based network browser. Bioinformatics 26, 2347-2348 (2010). PMID 20656902. iRefWeb was mentioned in this publication as a resource using Cytoscape web.
- Nitsch, D., Goncalves, J.P., Ojeda, F., de Moor, B. & Moreau, Y. Candidate gene prioritization by network analysis of differential expression using machine learning approaches. BMC Bioinformatics 11, 460 (2010). PMID 20840752. iRefIndex was mentioned as a consolidated data provider.
- Pierri, C.L., Parisi, G. & Porcelli, V. Computational approaches for protein function prediction: A combined strategy from multiple sequence alignment to molecular docking-based virtual screening. Biochimica et Biophysica Acta (BBA) - Proteins & Proteomics 1804, 1695-1712 (2010). iRefIndex was mentioned as a protein-protein interaction data provider.
- Blankenburg, H., Ramirez, F., Buch, J. & Albrecht, M. DASMIweb: online integration, analysis and assessment of distributed protein interaction data. Nucleic Acids Res 37, W122-128 (2009). PMID 19502495. This project uses identifier cross-referencing and grouping by Gene ID. These are practices that iRefIndex tries to avoid.
- Orchard, S. et al. Annual spring meeting of the Proteomics Standards Initiative. Proteomics 9, 4429-4432 (2009). PMID 19670378. iRefIndex is mentioned in this publication because of its contribution to the PSICQUIC project. As a result, iRefIndex data is available via the PSI common query interface.
- Terada, A. & Sese, J. Discovering large network motifs from a complex biological network. Journal of Physics: Conference Series 197, 012011 (2009). (See the section Independent research projects using iRefIndex.)
Independent research projects using iRefIndex
Choi, H. et al. SAINT: probabilistic scoring of affinity purification-mass spectrometry data. Nat Methods 8, 70-73 (2011). PMID 21131968.
- This publication describes SAINT, a computational tool that assigns confidence scores to protein-protein interaction data generated using AP-MS. The authors show that SAINT is applicable to data of different scales and protein connectivity and allows transparent analysis of AP-MS data. It could also be used to filter AP-MS datasets containing non-specifically binding proteins. The performance of the SAINT algorithm was evaluated using iRefWeb and BioGRID. The iRefWeb search filters and parameters provide a convenient way to construct custom data sets.
Croft, D. et al. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res 39, D691-697 (2011). PMID 21067998.
- Reactome is a curated bioinformatics database of human pathways and reactions. It uses PSICQUIC web services to overlay curated pathways with molecular interaction data from the Reactome Functional Interaction Network and external interaction databases such as IntAct, BioGRID, ChEMBL, iRefIndex, MINT and STRING. Expression Analysis tools enable ID mapping, pathway assignment and overrepresentation analysis of user-supplied data sets.
Gillis, J. & Pavlidis, P. The role of indirect connections in gene networks in predicting function. Bioinformatics (2011). PMID 21551147.
- Gene interactions can be used to infer functional relationships using a principle known as “guilt by association” (GBA). This research focuses on an extension of these methods that incorporates the broader network structure (indirect connections among genes) into predictions. iRefIndex data was used as a source when constructing the human PPI network.
Hao, Y. et al. OrthoNets: simultaneous visual analysis of orthologs and their interaction neighborhoods across different organisms. Bioinformatics 27, 883-884 (2011). PMID 21257609.
- OrthoNets is a Cytoscape plugin that displays protein-protein interaction (PPI) networks from two organisms simultaneously, highlighting orthology relationships and aggregating several types of biomedical annotations. iRefIndex data was used as the PPI source.
Jain, S. & Bader, G.D. An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology. BMC Bioinformatics 11, 562 (2010). PMID 21078182.
- The more semantically similar the gene function annotations of interacting proteins are, the more likely the interaction is to be physiologically relevant. The method described in this paper uses this principle and will be useful as an evidence source in PPI prediction or in confidence assessment of PPI datasets. Compared to other such methods, the algorithm described here accounts for the unequal depth of biological knowledge representation in different branches of the GO graph. iRefWeb was used to generate the data set.
Terada, A. & Sese, J. Discovering large network motifs from a complex biological network. Journal of Physics: Conference Series 197, 012011 (2009).
- Basic biological processes are highly related to each other. Network motif discovery detects frequently appearing network structures and also determines the role of vertices in a network. In this study, a novel algorithm called ARIANA was developed to find large network motifs even when the network has noise and uncertainty. By applying ARIANA to a real biological network, the authors found network motifs associated with regulation of cell. iRefIndex was used to construct a biological dataset to test this algorithm.
Turinsky, A.L. et al. DAnCER: disease-annotated chromatin epigenetics resource. Nucleic Acids Res 39, D889-894 (2011). PMID 20876685.
- Chromatin modification (CM) is a set of epigenetic processes that govern many aspects of DNA replication, transcription and repair. The DAnCER resource integrates information on genes with CM function from five model organisms, including human. Disease information and functional annotations are mapped onto the protein interaction networks (constructed using iRefIndex), enabling the user to formulate new hypotheses on the function and disease associations of a given gene based on those of its interaction partners.
Valsesia, A. et al. Network-guided analysis of genes with altered somatic copy number and gene expression reveals pathways commonly perturbed in metastatic melanoma. PLoS One 6, e18369 (2011). PMID 21494657.
- Cancer genomes contain somatic copy number alterations (SCNA) that can significantly disturb the expression level of affected genes. This can disrupt pathways controlling normal growth. Using karyotyping, SNP and CGH arrays, and RNA-seq, the authors identified SCNA affecting gene expression in human metastatic melanoma cell lines. They showed that the combination of these techniques is useful for identifying candidate genes potentially involved in tumorigenesis. A protein network-guided approach was used to determine whether any pathways were enriched in SCNA-genes in one or more samples. They investigated whether the proteins encoded by the SCNA-genes were connected in known human protein interaction networks. iRefIndex and Pathway Commons were used in the protein network-guided analysis of SCNA.
Zhang, K.X. & Ouellette, B.F. CAERUS: predicting CAncER oUtcomeS using relationship between protein structural information, protein networks, gene expression data, and mutation data. PLoS Comput Biol 7, e1001114 (2011). PMID 21483478.
- Carcinogenesis is a complex process with multiple genetic and environmental factors contributing to the development of one or more tumors. CAERUS can be used to identify gene signatures that predict cancer outcomes based on the domain interaction network in the human proteome. This work provides a prognostic tool to classify different cancer outcomes. iRefIndex was used to construct the protein network.
Stojmirović A, Yu YK. ppiTrim: constructing non-redundant and up-to-date interactomes. Database (Oxford). 2011 Aug 27;2011:bar036. Print 2011. PubMed PMID 21873645.
- ppiTrim is a script written by researchers at the NCBI that modifies and supplements iRefIndex data for use in gene-centric applications.
A collection of resources related to finding and working with protein interaction data: Protein Interaction Resources.