Documentation

User Documentation

Most of the user documentation for SMLR is built into the program, through a help menu and help buttons that explain the available options. The SMLR README file explains how to get started: it briefly describes how to install and run SMLR, and provides a small example of how to format your input data.

Research Papers

The primary publication describing the SMLR algorithm, along with theoretical error bounds on its generalization performance, is:

  1. Krishnapuram, B., Figueiredo, M., Carin, L., & Hartemink, A. (2005) “Sparse Multinomial Logistic Regression: Fast Algorithms and Generalization Bounds.” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 27, June 2005. pp. 957–968.

We have applied SMLR to problems in systems biology in a few subsequent publications:

  1. Narlikar, L., Gordân, R., Ohler, U., & Hartemink, A. (2006) “Informative Priors Based on Transcription Factor Structural Class Improve de novo Motif Discovery.” Intelligent Systems in Molecular Biology 2006 (ISMB06), Bioinformatics, 22, July 2006. pp. e384–e392. [Supp. Info.] [Code] [Input Data]
  2. Narlikar, L. & Hartemink, A. (2006) “Sequence Features of DNA Binding Sites Reveal Structural Class of Associated Transcription Factor.” Bioinformatics, 22, January 2006. pp. 157–163.
  3. Pratapa, P., Patz, E., & Hartemink, A. (2006) “Finding Diagnostic Biomarkers in Proteomic Spectra.” In Pacific Symposium on Biocomputing 2006 (PSB06), Altman, R., Dunker, A.K., Hunter, L., Murray, T., & Klein, T., eds. World Scientific: New Jersey. pp. 279–290.

Our group has also written a number of other papers on supervised, semi-supervised, and active learning of classifiers. Some of these methods, like SMLR, build classifiers that are sparse, in the sense that they use only a small set of relevant features (and/or kernel basis functions, when used with a kernel). We have applied these methods to machine learning benchmarks for comparison with other methods, to problems in disease diagnosis on the basis of gene expression or proteomic spectra, and to problems in remote sensing (of mines, for instance). Relevant publications are:

  1. Krishnapuram, B., Williams, D., Xue, Y., Carin, L., Figueiredo, M., & Hartemink, A. (2005) “Active Learning of Features and Labels.” Learning with Multiple Views Workshop at ICML05, August 2005.
  2. Lüdi, P., Hartemink, A., & Jirtle, R. (2005) “Genome-wide Prediction of Imprinted Murine Genes.” Genome Research, 15, June 2005. pp. 875–884.
  3. Krishnapuram, B., Williams, D., Xue, Y., Hartemink, A., Carin, L., & Figueiredo, M. (2005) “On Semi-Supervised Classification.” In Advances in Neural Information Processing Systems 17 (NIPS04), Saul, L., Weiss, Y., & Bottou, L., eds. MIT Press: Cambridge, MA. pp. 721–728.
  4. Krishnapuram, B., Hartemink, A., Carin, L., & Figueiredo, M. (2004) “A Bayesian Approach to Joint Feature Selection and Classifier Design.” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26, September 2004. pp. 1105–1111.
  5. Krishnapuram, B., Carin, L., & Hartemink, A. (2004) “Joint Classifier and Feature Optimization for Comprehensive Cancer Diagnosis Using Gene Expression Data.” Journal of Computational Biology, 11, March 2004. pp. 227–242.
  6. Krishnapuram, B., Carin, L., & Hartemink, A. (2004) “Gene Expression Analysis: Joint Feature Selection and Classifier Design.” In Kernel Methods in Computational Biology, Schölkopf, B., Tsuda, K., & Vert, J.-P., eds. MIT Press: Cambridge, MA. pp. 299–318.
  7. Liu, Q., Krishnapuram, B., Pratapa, P., Liao, X., Hartemink, A., & Carin, L. (2003) “Identification of Differentially Expressed Proteins Using MALDI-TOF Mass Spectra.” ASILOMAR Conference: Biological Aspects of Signal Processing, November 2003.
  8. Krishnapuram, B., Carin, L., & Hartemink, A. (2003) “Joint Classifier and Kernel Design.” Kernel Methods in Bioinformatics Workshop at RECOMB03, April 2003.
  9. Krishnapuram, B., Carin, L., & Hartemink, A. (2003) “Joint Classifier and Feature Optimization for Cancer Diagnosis Using Gene Expression Data.” Research in Computational Molecular Biology 2003 (RECOMB03), April 2003.
  10. Krishnapuram, B., Hartemink, A., & Carin, L. (2002) “Applying Logistic Regression and RVM to Achieve Accurate Probabilistic Cancer Diagnosis from Gene Expression Profiles.” GENSIPS: Workshop on Genomic Signal Processing and Statistics, October 2002.