Curriculum Vitae

Contact Information

Prof. Alexander J. Hartemink
Duke University
Department of Computer Science

mailing address: Computer Science, Box 90129, Durham NC 27708-0129
package address: 308 Research Drive, LSRC D239, Durham NC 27708
office location: LSRC Building, Room D239

tel: (919) 660-6514
fax: (919) 660-6519
email:

Education

  • 2001: Ph.D., Electrical Engineering and Computer Science, MIT
  • 1997: S.M., Electrical Engineering and Computer Science, MIT
  • 1996: M.Phil., Economics, Oxford University
  • 1994: B.S., Mathematics, Duke University
  • 1994: B.S., Physics, Duke University
  • 1994: A.B., Economics, Duke University

Scholarships, Fellowships, Awards, and Prizes

Post-Faculty

  • Alfred P. Sloan Fellowship (116 awarded, 2005)
  • NSF Faculty Early Career Development Award (CAREER, 2004)
  • David and Janet Vaughn Brooks Distinguished Teaching Award (4 named awards, 2007)
  • DARPA Computer Science Study Panel (11 panelists, 2008)
  • ORAU Ralph E. Powe Junior Faculty Enhancement Award (24 awarded, 2002)

Pre-Faculty

  • Rhodes Scholarship (32 awarded, 1994)
  • Presidential Scholarship, White House Commission (141 awarded, 1990)
  • National Kalin Award for Mathematics (1 awarded, 1990)
  • NSF Graduate Research Fellowship (~300 awarded, 1994)
  • Hertz Foundation Graduate Research Fellowship Grant (30 awarded, 1998)
  • Barry M. Goldwater Memorial Scholarship (~250 awarded, 1992)
  • Angier B. Duke Memorial Scholarship (~20 awarded, 1990)
  • Karl Menger Award for Mathematics (3 awarded, 1993)
  • Julia Dale Prize for Mathematics (4 awarded, 1994)
  • Merck/MIT Graduate Fellowship in Informatics (~10 awarded, 1999)
  • NIH Genomics Training Grant Fellowship (~20 awarded, 2000)
  • Rothermere Fellowship (~20 awarded, 1993)
  • Dannenberg Mentorship (3 awarded, 1991)
  • National Merit Scholarship (~2000 awarded, 1990)
  • National Honor Society Scholarship (100 awarded, 1990)
  • Tandy Technology Scholarship (100 awarded, 1990)
  • Century III Leaders Scholarship (200 awarded, 1990)

Grants

  • “Bioinformatics and Computational Biology Training Program,” NIH-NIGMS T32, PI: Alexander Hartemink, $982,399, 12/1/2011–11/30/2015, T32-GM071340-07.
  • “Hedgehog Signaling and Adult Liver Regeneration,” NIH-NIDDK R01, PI: Anna Mae Diehl, Role: Collaborator, $1,365,900, 7/1/2012–3/31/2016, R01-DK077794-05.
  • “High Performance Computing System for Bioinformatics,” NIH-SIG Shared Instrumentation Grant S10, PI: Hunt Willard, Role: Co-PI, $461,402, 6/1/2009–5/31/2010, S10-RR025590-01.
  • “CSSG Phase II: New Computational Methods for Elucidating Transcriptional Regulation During the Eukaryotic Cell Cycle,” DARPA BAA08-22, PI: Alexander Hartemink, $500,000, 5/7/2009–5/6/2012, HR0011-09-1-0040.
  • “High Performance Computing System for Bioinformatics,” North Carolina Biotechnology Center Institutional Development Grant, PI: Hunt Willard, Role: Co-PI, ~$117,000.
  • “2008 Computer Science Study Group (CSSG),” DARPA CSSG (RA 07-43), PI: Alexander Hartemink, $100,000, 1/1/2008–12/31/2008, HR0011-08-1-0023.
  • “Duke Center for Systems Biology,” NIH-NIGMS P50, PI: Philip Benfey, Role: Co-PI, $14,498,123, 7/10/2007–6/30/2012, P50-GM081883-01.
  • “Clinico-Molecular Predictors of Presymptomatic Infectious Disease,” DARPA-SPAWAR (BAA 06-19), PI: Geoff Ginsburg, Role: Co-PI, $6,018,678, 7/1/2007–2/28/2009, N66001-07-C-2024.
  • “Identification and Characterization of Epigenetically Labile Genes,” NIH-NIEHS R01, PI: Randy Jirtle, Role: Co-PI, $2,421,408, 9/25/2006–6/30/2010, R01-ES015165-01.
  • “Integration of IBM Management Software with Campus Blade Clusters in Support of Duke Academic Infrastructure,” IBM SUR: Shared University Research Program, PI: Richard Lucic, Role: Co-PI, ~$250,000.
  • “Integrated Systems: Integrative Sciences,” Howard Hughes Undergraduate Biological Sciences Education Program, PI: Dean Robert Thompson, Role: member of steering committee, VIP team leader, ~$1,900,000, 8/1/2006–7/31/2010.
  • “CRCNS: Neural Flow Networks in Songbirds,” NIH/NSF CRCNS: Collaborative Research in Computational Neuroscience (NSF 04-514), PI: Alexander Hartemink, $2,023,005, 8/1/2005–7/31/2012, R01-DC007996-01.
  • “Alfred P. Sloan Research Fellowship,” Alfred P. Sloan Research Fellowship Program, PI: Alexander Hartemink, $45,000, 9/16/2005–9/15/2007, BR-4493.
  • “Discovery of Biomarkers for Lung Cancer Metastasis,” NIH-NCI R01, PI: Ned Patz, Role: Co-Investigator, ~$1,378,000, 4/1/2005–3/31/2009, R01-CA19384-01A1.
  • “Cluster Computing Infrastructure for Life Sciences Computing,” IBM SUR: Shared University Research Program, PI: Richard Lucic, Role: Co-PI, $249,965.
  • “CAREER: Computational Methods for Learning Dynamic Networks of Biological Regulation and Control,” NSF CAREER: Faculty Early Career Development Award (NSF 02-111), PI: Alexander Hartemink, $487,344, 2/1/2004–1/31/2009, NSF-IIS 0347801.
  • “Computational Functional Genomics: Discovering Genetic Regulatory Networks,” ORAU Ralph E. Powe Junior Faculty Enhancement Award, PI: Alexander Hartemink, $10,000, 6/1/2002–5/31/2003.
  • “Making Meaning of Genomic Information,” Howard Hughes Undergraduate Biological Sciences Education Program, PI: Dean Robert Thompson, Role: member of steering committee, ~$1,800,000, 8/1/2002–7/31/2006.

Publications

  1. Mordelet, F., Horton, J., Hartemink, A., Engelhardt, B., & Gordân, R. (2013) “Stability selection for regression-based models of transcription factor-DNA binding specificity.” Intelligent Systems in Molecular Biology 2013 (ISMB13). Bioinformatics, (in press).
  2. Guo, X., Bernard, A., Orlando, D., Haase, S., & Hartemink, A. (2013) “Branching process deconvolution algorithm reveals a detailed cell-cycle transcription program.” PNAS, 110, 5 March 2013. pp. E968–E977. [Deconvolution Website] [Author Summary] [Supp. Info.]
  3. Perez-Pinera, P., Ousterout, D., Brunger, J., Farin, A., Glass, K., Guilak, F., Crawford, G., Hartemink, A., & Gersbach, C. (2013) “Synergistic and tunable gene activation in human cells by combinations of synthetic transcription factors.” Nature Methods, 10, 3 February 2013. pp. 239–242. [Supp. Info.]
  4. Luo, K. & Hartemink, A. (2013) “Using DNase digestion data to accurately identify transcription factor binding sites.” In Pacific Symposium on Biocomputing 2013 (PSB13), Altman, R., Dunker, A.K., Hunter, L., Murray, T., & Klein, T., eds. World Scientific: New Jersey, pp. 80–91. [Supp. Info.] [Code]
  5. Landt, S., Marinov, G., Kundaje, A., Kheradpour, P., Pauli, F., Batzoglou, S., Bernstein, B., Bickel, P., Brown, B., Cayting, P., Chen, Y., DeSalvo, G., Epstein, C., Euskirchen, G., Fisher-Aylor, K., Gerstein, M., Gertz, J., Hartemink, A., Hoffman, M., Iyer, V., Jung, Y., Karmakar, S., Kellis, M., Kharchenko, P., Li, Q., Liu, T., Liu, X., Ma, L., Milosavljevic, A., Myers, R., Park, P., Pazin, M., Perry, M., Raha, D., Reddy, T., Rozowsky, J., Shoresh, N., Sidow, A., Slattery, M., Stamatoyannopoulos, J., Tolstorukov, M., White, K., Xi, S., Farnham, P., Lieb, J., Wold, B., & Snyder, M. (2012) “ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia.” Genome Research, 22, September 2012. pp. 1813–1831.
  6. Mayhew, M., Guo, X., Haase, S., & Hartemink, A. (2012) “Close encounters of the collaborative kind.” IEEE Computer, Special Issue on Computationally Driven Experimental Biology, 45, March 2012. pp. 24–30. [Cover Feature]
  7. Guo, X., Bulyk, M., & Hartemink, A. (2012) “Intrinsic disorder within and flanking the DNA-binding domains of human transcription factors.” In Pacific Symposium on Biocomputing 2012 (PSB12), Altman, R., Dunker, A.K., Hunter, L., Murray, T., & Klein, T., eds. World Scientific: New Jersey, pp. 104–115.
  8. Meyer, P., Alexopoulos, L., Bonk, T., Califano, A., Cho, C., de la Fuente, A., de Graaf, D., Hartemink, A., Hoeng, J., Ivanov, N., Koeppl, H., Linding, R., Marbach, D., Norel, R., Peitsch, M., Rice, J., Royyuru, A., Schacherer, F., Sprengel, J., Stolle, K., Vitkup, D., & Stolovitzky, G. (2011) “Verification of systems biology research in the age of collaborative competition.” Nature Biotechnology, 29, September 2011. pp. 811–815.
  9. Mayhew, M., Robinson, J., Jung, B., Haase, S., & Hartemink, A. (2011) “A generalized model for multi-marker analysis of cell cycle progression in synchrony experiments.” Intelligent Systems in Molecular Biology 2011 (ISMB11). Bioinformatics, 27, July 2011. pp. i295–i303.
  10. Miller, H., Robinson, T., Gordân, R., Hartemink, A., & Garcia-Blanco, M. (2011) “Identification of Tat-SF1 cellular targets by exon array analysis reveals dual roles in transcription and splicing.” RNA, 17, April 2011. pp. 665–674.
  11. Robinson, J. & Hartemink, A. (2010) “Learning non-stationary dynamic Bayesian networks.” Journal of Machine Learning Research, 11, December 2010. pp. 3647–3680.
  12. Gordân, R., Narlikar, L., & Hartemink, A. (2010) “Finding regulatory DNA motifs using alignment-free evolutionary conservation information.” Nucleic Acids Research, 38, April 2010. p. e90. [Supp. Info.]
  13. MacAlpine, H., Gordân, R., Powell, S., Hartemink, A., & MacAlpine, D. (2010) “Drosophila ORC localizes to open chromatin and marks sites of cohesin complex loading.” Genome Research, 20, February 2010. pp. 201–211.
  14. Orlando, D., Iversen, E., Hartemink, A., & Haase, S. (2009) “A branching process model for flow cytometry and budding index measurements in cell synchrony experiments.” Annals of Applied Statistics, 3, December 2009. pp. 1521–1541.
  15. Wasson, T. & Hartemink, A. (2009) “An ensemble model of competitive multi-factor binding of the genome.” Genome Research, 19, November 2009. pp. 2101–2112.
  16. Gordân, R., Hartemink, A., & Bulyk, M. (2009) “Distinguishing direct versus indirect transcription factor-DNA interactions.” Genome Research, 19, November 2009. pp. 2090–2100. [Supp. Info.]
  17. Guo, X. & Hartemink, A. (2009) “Domain-oriented edge-based alignment of protein interaction networks.” Intelligent Systems in Molecular Biology 2009 (ISMB09). Bioinformatics, 25, 15 June 2009. pp. i240–i246.
  18. Robinson, J. & Hartemink, A. (2009) “Non-stationary dynamic Bayesian networks.” In Advances in Neural Information Processing Systems 21 (NIPS08), Koller, D., Schuurmans, D., Bengio, Y., & Bottou, L., eds. MIT Press: Cambridge, MA. pp. 1369–1376. [Appendix]
  19. Orlando, D., Lin, C., Bernard, A., Wang, J., Socolar, J., Iversen, E., Hartemink, A., & Haase, S. (2008) “Global control of cell-cycle transcription by coupled CDK and network oscillators.” Nature, 453, 12 June 2008. pp. 944–947. [Supp. Info.]
  20. Gordân, R., Narlikar, L., & Hartemink, A. (2008) “A fast, alignment-free, conservation-based method for transcription factor binding site discovery.” Research in Computational Molecular Biology 2008 (RECOMB08). Lecture Notes in Bioinformatics, Vingron, M. & Wong, L., eds. 4955, April 2008. pp. 98–111. [Supp. Info.]
  21. Gordân, R. & Hartemink, A. (2008) “Using DNA duplex stability information for transcription factor binding site discovery.” In Pacific Symposium on Biocomputing 2008 (PSB08), Altman, R., Dunker, A.K., Hunter, L., Murray, T., & Klein, T., eds. World Scientific: New Jersey. pp. 453–464. [Supp. Info.]
  22. Lüdi, P., Dietrich, F., Weidman, J., Bosko, J., Jirtle, R., & Hartemink, A. (2007) “Computational and experimental identification of novel human imprinted genes.” Genome Research, 17, December 2007. pp. 1723–1730. [Supp. Info.] [Cover] [Nature Reviews Genetics] [Science] [AP] [Wired]
  23. Narlikar, L., Gordân, R., & Hartemink, A. (2007) “A nucleosome-guided map of transcription factor binding sites in yeast.” PLoS Computational Biology, 3, November 2007. pp. 2199–2208.
  24. Bernard, A., Vaughn, D., & Hartemink, A. (2007) “Reconstructing the topology of protein complexes.” Research in Computational Molecular Biology 2007 (RECOMB07). Lecture Notes in Bioinformatics, Speed, T. & Huang, H., eds. 4453, April 2007. pp. 32–46.
  25. Narlikar, L., Gordân, R., & Hartemink, A. (2007) “Nucleosome occupancy information improves de novo motif discovery.” Research in Computational Molecular Biology 2007 (RECOMB07). Lecture Notes in Bioinformatics, Speed, T. & Huang, H., eds. 4453, April 2007. pp. 107–121. [Supp. Info.]
  26. Orlando, D., Lin, C., Bernard, A., Iversen, E., Hartemink, A., & Haase, S. (2007) “A probabilistic model for cell cycle distributions in synchrony experiments.” RECOMB Satellite Conference on Systems Biology 2006, Cell Cycle, 6, February 2007. pp. 478–488.
  27. Smith, V., Yu, J., Smulders, T., Hartemink, A., & Jarvis, E. (2006) “Computational inference of neural information flow networks.” PLoS Computational Biology, 2, November 2006. pp. 1436–1449. [Supp. Info.] [Code] [Most Viewed Research Article at PLoS Computational Biology]
  28. Bernard, A. & Hartemink, A. (2006) “Evaluating algorithms for learning biological networks.” DREAM Workshop, September 2006.
  29. Narlikar, L., Gordân, R., Ohler, U., & Hartemink, A. (2006) “Informative priors based on transcription factor structural class improve de novo motif discovery.” Intelligent Systems in Molecular Biology 2006 (ISMB06), Bioinformatics, 22, July 2006. pp. e384–e392. [Supp. Info.] [Code] [Input Data]
  30. Hartemink, A. (2006) “Bayesian networks and informative priors: Transcriptional regulatory network models.” In Bayesian Inference for Gene Expression and Proteomics, Do, K.-A., Müller, P., & Vannucci, M., eds. Cambridge University Press: Cambridge, UK. pp. 401–424.
  31. Narlikar, L. & Hartemink, A. (2006) “Sequence features of DNA binding sites reveal structural class of associated transcription factor.” Bioinformatics, 22, January 2006. pp. 157–163.
  32. Pratapa, P., Patz, E., & Hartemink, A. (2006) “Finding diagnostic biomarkers in proteomic spectra.” In Pacific Symposium on Biocomputing 2006 (PSB06), Altman, R., Dunker, A.K., Hunter, L., Murray, T., & Klein, T., eds. World Scientific: New Jersey. pp. 279–290. [Larger Figs.]
  33. Krishnapuram, B., Williams, D., Xue, Y., Carin, L., Figueiredo, M., & Hartemink, A. (2005) “Active Learning of Features and Labels.” Learning with Multiple Views Workshop at ICML05, August 2005.
  34. Lüdi, P., Hartemink, A., & Jirtle, R. (2005) “Genome-wide Prediction of Imprinted Murine Genes.” Genome Research, 15, June 2005. pp. 875–884.
  35. Krishnapuram, B., Figueiredo, M., Carin, L., & Hartemink, A. (2005) “Sparse Multinomial Logistic Regression: Fast Algorithms and Generalization Bounds.” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 27, June 2005. pp. 957–968. [Code]
  36. Hartemink, A. (2005) “Reverse Engineering Gene Regulatory Networks.” Nature Biotechnology, 23, May 2005. pp. 554–555.
  37. Yin, P. & Hartemink, A. (2005) “Theoretical and Practical Advances in Genome Halving.” Bioinformatics, 21, April 2005. pp. 869–879.
  38. Bernard, A. & Hartemink, A. (2005) “Informative Structure Priors: Joint Learning of Dynamic Regulatory Networks from Multiple Types of Data.” In Pacific Symposium on Biocomputing 2005 (PSB05), Altman, R., Dunker, A.K., Hunter, L., Jung, T., & Klein, T., eds. World Scientific: New Jersey. pp. 459–470. [Supp. Info.]
  39. Krishnapuram, B., Williams, D., Xue, Y., Hartemink, A., Carin, L., & Figueiredo, M. (2005) “On Semi-Supervised Classification.” In Advances in Neural Information Processing Systems 17 (NIPS04), Saul, L., Weiss, Y., & Bottou, L., eds. MIT Press: Cambridge, MA. pp. 721–728.
  40. Yu, J., Smith, V., Wang, P., Hartemink, A., & Jarvis, E. (2004) “Advances to Bayesian Network Inference for Generating Causal Networks from Observational Biological Data.” Bioinformatics, 20, December 2004. pp. 3594–3603.
  41. Krishnapuram, B., Hartemink, A., Carin, L., & Figueiredo, M. (2004) “A Bayesian Approach to Joint Feature Selection and Classifier Design.” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26, September 2004. pp. 1105–1111.
  42. Krishnapuram, B., Carin, L., & Hartemink, A. (2004) “Joint Classifier and Feature Optimization for Comprehensive Cancer Diagnosis Using Gene Expression Data.” Journal of Computational Biology, 11, March 2004. pp. 227–242.
  43. Krishnapuram, B., Carin, L., & Hartemink, A. (2004) “Gene Expression Analysis: Joint Feature Selection and Classifier Design.” In Kernel Methods in Computational Biology, Schölkopf, B., Tsuda, K., & Vert, J.-P., eds. MIT Press: Cambridge, MA. pp. 299–318.
  44. Liu, Q., Krishnapuram, B., Pratapa, P., Liao, X., Hartemink, A., & Carin, L. (2003) “Identification of Differentially Expressed Proteins Using MALDI-TOF Mass Spectra.” ASILOMAR Conference: Biological Aspects of Signal Processing, November 2003.
  45. Krishnapuram, B., Carin, L., & Hartemink, A. (2003) “Joint Classifier and Kernel Design.” Kernel Methods in Bioinformatics Workshop at RECOMB03, April 2003.
  46. Krishnapuram, B., Carin, L., & Hartemink, A. (2003) “Joint Classifier and Feature Optimization for Cancer Diagnosis Using Gene Expression Data.” Research in Computational Molecular Biology 2003 (RECOMB03), April 2003.
  47. Smith, V., Jarvis, E., & Hartemink, A. (2003) “Influence of Network Topology and Data Collection on Network Inference.” In Pacific Symposium on Biocomputing 2003 (PSB03), Altman, R., Dunker, A.K., Hunter, L., Jung, T., & Klein, T., eds. World Scientific: New Jersey. pp. 164–175.
  48. Yu, J., Smith, V., Wang, P., Hartemink, A., & Jarvis, E. (2002) “Using Bayesian Network Inference Algorithms to Recover Molecular Genetic Regulatory Networks.” International Conference on Systems Biology 2002 (ICSB02), December 2002.
  49. Jarvis, E., Smith, V., Wada, K., Rivas, M., McElroy, M., Smulders, T., Carninci, P., Hayashizaki, Y., Dietrich, F., Wu, X., McConnell, P., Yu, J., Wang, P., Hartemink, A., & Lin, S. (2002) “A Framework for Integrating the Songbird Brain.” Journal of Comparative Physiology A, 188, December 2002. pp. 961–980.
  50. Krishnapuram, B., Hartemink, A., & Carin, L. (2002) “Applying Logistic Regression and RVM to Achieve Accurate Probabilistic Cancer Diagnosis from Gene Expression Profiles.” GENSIPS: Workshop on Genomic Signal Processing and Statistics, October 2002.
  51. Smith, V., Jarvis, E., & Hartemink, A. (2002) “Evaluating Functional Network Inference Using Simulations of Complex Biological Systems.” Intelligent Systems in Molecular Biology 2002 (ISMB02), Bioinformatics, 18:S1. pp. S216–S224.
  52. Hartemink, A., Gifford, D., Jaakkola, T., & Young, R. (2002) “Bayesian Methods for Elucidating Genetic Regulatory Networks.” IEEE Intelligent Systems, special issue on Intelligent Systems in Biology, 17, March/April 2002. pp. 37–43.
  53. Hartemink, A., Gifford, D., Jaakkola, T., & Young, R. (2002) “Combining Location and Expression Data for Principled Discovery of Genetic Regulatory Networks.” In Pacific Symposium on Biocomputing 2002 (PSB02), Altman, R., Dunker, A.K., Hunter, L., Lauderdale, K., & Klein, T., eds. World Scientific: New Jersey. pp. 437–449.
  54. Hartemink, A. (2001) “Principled Computational Methods for the Validation and Discovery of Genetic Regulatory Networks.” Massachusetts Institute of Technology, Ph.D. dissertation.
  55. Hartemink, A., Gifford, D., Jaakkola, T., & Young, R. (2001) “Maximum Likelihood Estimation of Optimal Scaling Factors for Expression Array Normalization.” SPIE International Symposium on Biomedical Optics 2001 (BiOS01). In Microarrays: Optical Technologies and Informatics, Bittner, M., Chen, Y., Dorsel, A., & Dougherty, E., eds. Proceedings of SPIE, 4266. pp. 132–140.
  56. Hartemink, A., Gifford, D., Jaakkola, T., & Young, R. (2001) “Using Graphical Models and Genomic Expression Data to Statistically Validate Models of Genetic Regulatory Networks.” In Pacific Symposium on Biocomputing 2001 (PSB01), Altman, R., Dunker, A.K., Hunter, L., Lauderdale, K., & Klein, T., eds. World Scientific: New Jersey. pp. 422–433.
  57. Hartemink, A., Mikkelsen, T., & Gifford, D. (2000) “Simulating Biological Reactions: A Modular Approach.” DNA Based Computers V. Winfree, E. & Gifford, D., eds. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 54, American Mathematical Society. pp. 111–121.
  58. Schechter, S., Parnell, T., & Hartemink, A. (1999) “Anonymous Authentication of Membership in Dynamic Groups.” Financial Cryptography '99. Franklin, M., ed. Lecture Notes in Computer Science, 1648, Springer-Verlag. pp. 184–195.
  59. Hartemink, A., Gifford, D., & Khodor, J. (1999) “Automated Constraint-Based Nucleotide Sequence Selection for DNA Computation.” Biosystems, 52, October 1999, Elsevier Press. pp. 227–235.
  60. Hartemink, A. & Gifford, D. (1999) “Thermodynamic Simulation of Deoxyoligonucleotide Hybridization for DNA Computation.” DNA Based Computers III. Rubin, H. & Wood, D., eds. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 48, American Mathematical Society. pp. 25–38.

Software

Banjo

Banjo stands for Bayesian Network Inference with Java Objects. Banjo is a highly efficient, configurable, and extensible package for inferring either static or dynamic Bayesian networks. Banjo is currently limited to discrete variables; however, it can discretize continuous data on request, and its modular design means new components can be written to handle continuous variables. The same modularity lets the user mix and match inference algorithm components to implement different learning procedures, ranging from simulated annealing with random local moves to greedy hillclimbing with all local moves, and to create new procedures as well.
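
As a rough illustration of what "greedy hillclimbing with all local moves" means for a static network over discrete variables, the following Python sketch evaluates every single-edge addition, deletion, and reversal at each step and takes the best one. It is not Banjo's code or API (Banjo is a Java package); the function names, the BIC-style score, and the toy data are assumptions made for this example only.

```python
import math
from collections import Counter
from itertools import permutations

def family_score(data, child, parents):
    """BIC-style score of one node given its parent set (assumed score)."""
    n = len(data)
    parents = sorted(parents)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
    arity = len({row[child] for row in data})
    n_params = (arity - 1) * len(marg)   # free parameters for observed parent configs
    return loglik - 0.5 * math.log(n) * n_params

def has_cycle(parents):
    """Depth-first search for a directed cycle in the parent map."""
    state = {}
    def visit(v):
        if state.get(v) == 1:
            return True                  # reached a node still on the stack
        if state.get(v) == 2:
            return False                 # already fully explored
        state[v] = 1
        if any(visit(p) for p in parents[v]):
            return True
        state[v] = 2
        return False
    return any(visit(v) for v in parents)

def neighbors(parents):
    """Yield every acyclic graph one add, delete, or reverse move away."""
    for a, b in permutations(parents, 2):
        cand = {v: set(ps) for v, ps in parents.items()}
        if a in parents[b]:
            cand[b].discard(a)           # delete a -> b
            yield cand
            rev = {v: set(ps) for v, ps in cand.items()}
            rev[a].add(b)                # reverse it to b -> a
            if not has_cycle(rev):
                yield rev
        else:
            cand[b].add(a)               # add a -> b
            if not has_cycle(cand):
                yield cand

def hill_climb(data, n_vars):
    """Greedy search from the empty graph; stops at a local optimum."""
    parents = {v: set() for v in range(n_vars)}
    score = lambda g: sum(family_score(data, v, g[v]) for v in g)
    best = score(parents)
    while True:
        top, graph = max(((score(g), g) for g in neighbors(parents)),
                         key=lambda t: t[0])
        if top <= best + 1e-9:           # no move strictly improves the score
            return parents, best
        parents, best = graph, top

# Toy data: variable 1 copies variable 0; variable 2 is independent of both.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 10
graph, total = hill_climb(rows, 3)
print(graph, round(total, 2))
```

Replacing the "take the best move" step with "propose one random move and accept it with a temperature-dependent probability" would turn the same skeleton into simulated annealing with random local moves, the other end of the range described above.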

SMLR

SMLR stands for Sparse Multinomial Logistic Regression. SMLR is an efficient implementation of a true multiclass probabilistic classifier based on the well-studied multinomial logistic regression framework. We adopt a Bayesian perspective, which lets us incorporate a Laplacian prior (related to LASSO) that promotes the learning of a sparse weight vector. The result is a classifier that can operate either directly on input features and perform automatic feature selection (embedded, not filter or wrapper), or with a kernel and perform automatic sample selection (much like the SVM). The objective function is convex so it has a unique global optimum. SMLR implements a suite of bound-optimization algorithms that we have developed to find this optimum efficiently, even when the number of samples or features is large (at least tens of thousands).
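
In symbols, the idea can be summarized as a MAP estimate under a Laplacian prior. The notation here is introduced only for illustration: m classes, n training pairs (x_i, y_i), one weight vector w_c per class, and a prior strength λ.

```latex
P(y = c \mid x, w) = \frac{\exp(w_c^{\top} x)}{\sum_{k=1}^{m} \exp(w_k^{\top} x)},
\qquad
\hat{w} = \arg\max_{w} \left[ \sum_{i=1}^{n} \log P(y_i \mid x_i, w) - \lambda \lVert w \rVert_1 \right]
```

The penalty term is the log of a Laplacian prior p(w) ∝ exp(−λ‖w‖₁); it drives many weights exactly to zero, which is what yields feature selection (or sample selection when x is replaced by kernel evaluations).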

PRIORITY

PRIORITY is a tool for de novo motif discovery in the context of transcription factor (TF) binding sites. It implements a new approach to motif discovery in which informative priors over sequence positions are used to guide the search. Although this approach will work for any motif model and any search/optimization strategy, the initial version of PRIORITY adopts a PSSM model and collapsed Gibbs sampling. PRIORITY is packaged with priors designed to measure how likely each sequence position is to be bound by three specific structural classes of TFs: basic leucine zipper, forkhead, and basic helix-loop-helix. In addition to discovering TF binding sites and a motif model for those binding sites, PRIORITY also predicts the structural class of the TF recognizing the binding sites.
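
To make the role of a positional prior concrete, the following Python sketch computes the posterior over motif start positions in a single sequence by multiplying a per-position prior with a PSSM-versus-background likelihood ratio. It is a simplified illustration, not PRIORITY's implementation: it assumes exactly one site per sequence and a fixed PSSM (a collapsed Gibbs sampler would instead integrate the PSSM out), and the function name, toy PSSM, and sequence are hypothetical.

```python
import math

def site_posterior(seq, pssm, background, prior):
    """Posterior over motif start positions in one sequence.

    seq        : DNA string
    pssm       : list of dicts, pssm[j][base] = P(base at motif column j)
    background : dict, background[base] = P(base outside the motif)
    prior      : prior[i] = prior weight that a site starts at position i
    """
    width = len(pssm)
    weights = []
    for i in range(len(seq) - width + 1):
        # Log likelihood ratio of motif vs. background for the window at i.
        log_ratio = sum(math.log(pssm[j][seq[i + j]] / background[seq[i + j]])
                        for j in range(width))
        weights.append(prior[i] * math.exp(log_ratio))
    total = sum(weights)
    return [w / total for w in weights]

# Toy motif with consensus TGC and a uniform background (illustrative values).
pssm = [{b: (0.7 if b == c else 0.1) for b in "ACGT"} for c in "TGC"]
background = {b: 0.25 for b in "ACGT"}

uniform = site_posterior("AATGCA", pssm, background, [0.25] * 4)
masked = site_posterior("AATGCA", pssm, background, [1 / 3, 1 / 3, 0.0, 1 / 3])
print(uniform)
print(masked)
```

With a uniform prior the window matching the consensus dominates; setting the prior at that position to zero removes it from consideration entirely, which is the mechanism by which an informative prior steers the search. A Gibbs sampler would draw the site location from this distribution rather than report it.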

COMPETE

CLOCCS

MILLIPEDE

Teaching

  • Fall 2012: Introduction to Computational Genomics (COMPSCI 260)
  • Spring 2012: Teaching Leave
  • Fall 2011: Introduction to Computational Genomics (CPS 160)
  • Spring 2011: Computational Systems Biology (CPS 262/CBB 262)
  • Fall 2010: Introduction to Computational Genomics (CPS 160)
  • Spring 2010: Sabbatical Leave (teaching in Kenya)
  • Fall 2009: Sabbatical Leave (teaching in Kenya)
  • Spring 2009: Computational Systems Biology (CPS 262/CBB 262)
  • Fall 2008: Introduction to Computational Genomics (CPS 160)
  • Spring 2008: Introduction to Computational Genomics (CPS 160)
  • Fall 2007: Systems Biology and Machine Learning (CPS 296)
  • Spring 2007: Introduction to Computational Genomics (CPS 160)
  • Spring 2007: Algorithms in Computational Biology (CPS 260/CBB 230)
  • Fall 2006: Teaching Leave
  • Spring 2006: Introduction to Computational Genomics (CPS 160)
  • Spring 2006: House Course: Patterns (HOUSE 79)
  • Fall 2005: Junior Research Leave
  • Spring 2005: Introduction to Computational Genomics (CPS 160)
  • Fall 2004: Algorithms in Computational Biology (CPS 260/BGT 204)
  • Spring 2004: Computational Functional Genomics (CPS 262/BGT 211)
  • Fall 2003: Introduction to Computational Genomics (CPS 160)
  • Spring 2003: Computational Functional Genomics (CPS 296/BGT 208)
  • Fall 2002: Algorithms in Computational Biology (CPS 260/BGT 204)
  • Spring 2002: Computational Functional Genomics (CPS 296/BGT 208)
  • Fall 2001: Introduction to Research in Computer Science (CPS 300)

Students

Current Students

Doctoral

Supervising
  • Yezhou Huang (CS)
  • Kaixuan (Kevin) Luo (CBB)
  • Jianling Zhong (CBB)
Committees
  • Jason Belsky (CBB, Dave MacAlpine)
  • Peng Dong (CBB, Bernard Mathey-Prevot and Lingchong You)
  • Pablo Gainza-Cirauqui (CS, Bruce Donald)
  • Monica Gutierrez (Genetics, Dave MacAlpine)
  • Dina Hafez (CS, Uwe Ohler)
  • Nicholas Haynes (Physics, Dan Gauthier)
  • Jonathan (JJ) Jou (CS, Bruce Donald)
  • Peter Tonner (CBB, Amy Schmid)
  • Florian Wagner (CBB, Barbara Engelhardt)
  • Jenny Zhang (Genetics, Sandeep Dave)
  • Shiwen Zhao (CBB, Barbara Engelhardt)
Rotations
  • Yizhe Zhang (CBB)

Undergraduate

Supervising
  • Kyle Donovan (CS/Biology)
  • Mike Gloudemans (CS/Biology)
  • Abigail Lin (CS/Biology)

Former Students

Postdoctoral

Supervising
  • Dr. Fantine Mordelet (CS)
    [started a postdoc with Barbara Engelhardt at Duke]
  • Dr. Victoria Anne Smith (Neurobiology), co-supervised with Erich Jarvis
    [joined the faculty at University of St. Andrews, Scotland]

Doctoral

Supervising
  • Allister Bernard (CS)
    [joined AlphaSimplex Group, a hedge fund]
  • Raluca Gordân (CS)
    [started a postdoc with Martha Bulyk at Harvard Medical School/MIT;
    later joined the faculty at Duke]
  • Xin Guo (CS)
    [joined Gilead Sciences]
  • Balaji Krishnapuram (ECE), co-supervised with Larry Carin
    [joined Siemens Medical Solutions]
  • Philippe Lüdi (CBB), co-supervised with Fred Dietrich and Randy Jirtle
    [joined AlphaSimplex Group, a hedge fund]
  • Michael Mayhew (CBB)
    [joined Lawrence Livermore National Lab]
  • Leelavati Narlikar (CS)
    [started a postdoc with Ivan Ovcharenko at NCBI (National Center for Biotechnology Information);
    later named a Ramanujan Fellow at the National Chemical Laboratory in Pune, India]
  • David Orlando (CBB), co-supervised with Philip Benfey and Steve Haase
    [started a postdoc with Rick Young at MIT/Whitehead Institute;
    later joined Syros Pharmaceuticals]
  • Andreas Pfenning (CBB), co-supervised with Erich Jarvis
    [started a postdoc with Manolis Kellis at MIT/Broad Institute]
  • Josh Robinson (CS)
    [joined Signal Innovations Group, a machine learning startup]
  • Todd Wasson (CBB)
    [joined Lawrence Livermore National Lab]
  • Jing Yu (ECE), co-supervised with Erich Jarvis and Paul Wang
    [joined Novartis Institute for Biomedical Research]
Committees
  • Alan Boyle (CBB, Terry Furey and Greg Crawford)
    [started a postdoc with Mike Snyder at Stanford;
    later joined the faculty at University of Michigan]
  • Yanting Dong (ECE, Larry Carin)
    [joined Guidant (acquired by Boston Scientific)]
  • Matt Eaton (CBB, David MacAlpine)
    [started a postdoc with Manolis Kellis at MIT/Broad Institute]
  • Ashish Gehani (CS, Gershon Kedem)
    [started a postdoc with Surendar Chandra at Notre Dame]
  • Ivelin Georgiev (CS, Bruce Donald)
    [joined the NIH Vaccine Research Center as a research fellow;
    later joined the faculty at Vanderbilt University]
  • Stoyan Georgiev (CBB, Uwe Ohler and Sayan Mukherjee)
    [started a postdoc with Jonathan Pritchard at Chicago, and then Stanford]
  • Jonathan Jesneck (BME, Joseph Lo)
    [started a postdoc with Jill Mesirov and Todd Golub at Broad Institute;
    later joined the Field Intelligence Lab at MIT as a Research Scientist]
  • Shihao Ji (ECE, Larry Carin)
    [joined Yahoo!; later joined Microsoft]
  • Laura Kavanaugh (Genetics, Fred Dietrich)
    [joined Syngenta Biotechnology]
  • Mitch Levesque (Genetics, Philip Benfey)
    [started a postdoc at Max Planck Institute for Developmental Biology;
    later joined the faculty at University of Zurich]
  • Qiuhua Liu (ECE, Larry Carin)
    [started a postdoc at Schlumberger-Doll Research]
  • Jiuliu Lu (ECE, Larry Carin)
    [joined Beckman Coulter, maybe?]
  • Dan Mace (CBB, Uwe Ohler)
    [started a postdoc with Bob Waterston at University of Washington]
  • Jeff Martin (CS, Bruce Donald)
    [joined Scalgo, a computational geometry startup]
  • Nabil Mustafa (CS, Pankaj Agarwal)
    [started a postdoc at Max Planck Institute for Informatics]
  • Johannes Norrell (Physics, Josh Socolar)
    [joined a government agency]
  • Constantin Pistol (CS, Alvy Lebeck and Chris Dwyer)
    [joined Apple]
  • Sudheer Sahu (CS, John Reif)
    [joined Microsoft; later joined AT&T Interactive]
  • Nathan Sheffield (CBB, Greg Crawford and Terry Furey)
    [started a postdoc with Christoph Bock at Center for Molecular Medicine of the Austrian Academy of Sciences]
  • Jason Stajich (Genetics, Fred Dietrich)
    [Miller Fellowship; started a postdoc with John Taylor at UC Berkeley;
    later joined the faculty at UC Riverside]
  • Rui Wang (CBB, Erich Jarvis)
    [joined Beijing Prosperous Biopharm as CEO and President]
  • Jennifer Weidman (Genetics, Randy Jirtle)
    [started a postdoc with Randy Jirtle at Duke]
  • Ya Xue (ECE, Larry Carin)
    [joined Centice, a sensor technology startup; later joined GE Global Research]
  • Gürkan Yardımcı (CBB, Greg Crawford and Uwe Ohler)
    [started a postdoc with Raluca Gordân at Duke]
  • Peng Yin (CS, John Reif)
    [Outstanding Ph.D. Dissertation; started a postdoc with Niles Pierce and Erik Winfree at Caltech;
    later joined the faculty at Harvard]
  • Jianyang (Michael) Zeng (CS, Bruce Donald)
    [joined the faculty at Tsinghua University in China]
  • Zhihong (Joe) Zhang (Genetics, Fred Dietrich)
    [started a postdoc with Stan Fields at University of Washington;
    later joined Illumina]
Research Initiation Project Committees
  • Austin Alexander (CS, Barbara Engelhardt)
  • Alan Davidson (CS, Carlo Tomasi)
  • Mark Fashing (CS, Carlo Tomasi)
  • Abhijit Guria (CS, Herbert Edelsbrunner)
  • Charles (Chip) Killian (CS, Amin Vahdat)
  • Branka Lakic (CS, Carlo Tomasi)
  • Urmi Majumder (CS, John Reif)
  • Mac Mason (CS, Ron Parr)
  • Wenbin Pan (CS, Herbert Edelsbrunner)
  • Tianqi Song (CS, John Reif)
Rotations
  • Diana Fusco (CBB)
  • Karthik Jayasurya (CBB)
  • Samuel Ramirez (CBB)
  • George Tretyakov (CBB)

Masters

Supervising
  • Abrita Chakravarty (CS)
    [joined Wolfram Research]
  • Yasunori Hongo (CS)
    [joined Bank of Japan]
  • Pallavi Pratapa (CS)
    [Outstanding Master's Thesis; joined UBS Warburg; later joined Lenovo]
  • David Vaughn (CS)
    [joined Modality, a software startup; later joined Measurement Incorporated]
Committees
  • Kanishk Asthana (BME, Lingchong You)
  • Avik Bhattacharya (CS, Terry Furey)
  • Anaghe Gupta (CS, Carlo Tomasi)
  • Kuan-ming Lin (CS, Larry Carin)
  • Arvind Sastry (CS, Carlo Tomasi)
  • Paul Shealy (CS, Carlo Tomasi)
  • Rumen Stamatov (CBB, Raluca Gordân)
  • Jie Xu (CS, Uwe Ohler)

Undergraduate

Supervising
  • Alexandra (Tally) Balaban (Math)
    [joined NESCent (National Evolutionary Synthesis Center);
    later entered grad school in Biostatistics at UNC]
  • Kshipra Bhawalkar (Math/CS)
    [entered grad school in Computer Science at Stanford]
  • Jason Bosko (ECE/CS)
    [joined SAS]
  • Scott Brothers (CS/Math)
    [Alex Vasilos Award; joined Microsoft]
  • Brian Bullins (Math/CS)
    [Graduation with Distinction; entered grad school in Computer Science at Princeton]
  • Jer-Yee (John) Chuang (ECE/CS)
    [Graduation with High Distinction; entered grad school in Bioinformatics at UCSF]
  • Matt Edwards (CS/Math)
    [Graduation with High Distinction; Alex Vasilos Award; joined IonTorrent, a biotech startup;
    later entered grad school in Computational and Systems Biology at MIT]
  • Daphne Ezer (CS/Biology)
    [Marshall Scholar; Graduation with Highest Distinction; Duke Faculty Scholar; Alex Vasilos Award;
    entered grad school in Genetics at Cambridge]
  • Eric Fountain (Math)
    [entered grad school in Physics at Princeton]
  • Daniel Greenblatt (CS)
    [entered grad school in Human Computer Interaction at Georgia Tech]
  • Kelvin Gu (post-baccalaureate), co-supervised with David Dunson
    [entered grad school in Statistics at Stanford]
  • Paul Heymann (CS/Philosophy)
    [Graduation with High Distinction; entered grad school in Computer Science at Stanford]
  • Boyoun (Sarah) Jung (CS/Biology)
    [interned for the Office of the President of the Republic of Korea at Cheong Wa Dae ("Blue House");
    later entered medical school at Dartmouth]
  • Charles Lin (Biology)
    [entered grad school in Computational and Systems Biology at MIT]
  • Jonathan Mathew (CS)
    [entered medical school at UNC]
  • Jimmy Mu (CS)
    [Graduation with Distinction; joined Microsoft]
  • Nikhil Saxena (BME/ECE)
    [joined Yelp]
  • Michael Vogelsong (BME)
  • Austin Weiss (Computational Biology)
    [entered medical school at Yale]
  • Aaron Wise (ECE/CS)
    [joined Google;
    later entered grad school in Computational Biology at CMU]
  • Derek Zhou (CS/Biology)
    [Graduation with Distinction; joined Citigroup]
Committees
  • Andrew Declercq (CS, Ron Parr)
    [joined IBM]
  • Patrick Paczkowski (CS, Terry Furey)
    [Graduation with Distinction; entered grad school in Computer Science at Yale]
  • Katherine (Beth) Trushkowsky (CS, Jeff Forbes)
    [Graduation with High Distinction; entered grad school in Computer Science at UC Berkeley]