Abstract
Data can take any form: symbolic or non-symbolic, continuous or discrete, spatial or non-spatial. Whenever a data store becomes voluminous, it requires efficient algorithms to mine the required data, as well as methods to answer various queries. Although data analysis techniques are useful in almost all disciplines of study, greater emphasis is placed in bioinformatics on mining microarray gene expression data and gene sequence data. Considerable work is also being done on the preparation of protein arrays and on corresponding visualization techniques.
Copyright information
© 2009 Capital Publishing Company
About this chapter
Cite this chapter
Prasad, T., Ahson, S. (2009). Data Mining for Bioinformatics — Microarray Data. In: Fulekar, M.H. (eds) Bioinformatics: Applications in Life and Environmental Sciences. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-8880-3_8
DOI: https://doi.org/10.1007/978-1-4020-8880-3_8
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-8879-7
Online ISBN: 978-1-4020-8880-3
eBook Packages: Biomedical and Life Sciences (R0)