
This paper presents recent bioinformatics methods that use data mining techniques to analyze protein-protein interaction data gathered from large-scale biological studies, and suggests novel approaches to some of the challenges in this area. Protein-protein interaction data can provide a wealth of information for understanding the biology of a cell. The analysis of these interactions is also important for discovering disease-associated proteins, and the data can be used to identify novel cellular sites that are crucial for the development of new and improved pharmaceutical drugs. Knowledge discovery and data mining (KDD) is the process of extracting implicit information from large amounts of data using mathematical and statistical methods. It grows in synergy with computer technology, creating new analytical tools and applying them to knowledge discovery in large volumes of data. A multidisciplinary field with links to statistics, machine learning, database systems, computer programming, and visualization, KDD has proved to be a promising solution to various problems in molecular biology and gene analysis. This paper gives an overview of various data mining techniques, with specific examples of their application to protein-protein interaction data analysis. Some of the most widely used data mining techniques for exploring protein interaction data sets are clustering (both supervised and unsupervised), classification, and association rule discovery; others are based on methods for mining interaction information from scientific sources such as PubMed and MEDLINE. Areas such as prediction and profiling have not been explored much for mining information from protein-protein interactions. We propose methods that employ these novel techniques to analyze protein-protein interaction data.
