Jun Yang
D327 Levine Science Research Center
Box 90129
Duke University
Durham, North Carolina 27708-0129
Tel: 919-660-6587
Fax: 919-660-6519
Web: http://www.cs.duke.edu/~junyang/
Email: <cs.duke.edu, junyang>
Research Interests
- Database and data-intensive systems.
Education
Professional Experience
- Associate Professor, Computer Science Department, Duke University, July 2008 - present.
- Assistant Professor, Computer Science Department, Duke University, August 2001 - June 2008.
- Member of Technical Staff, Radik Software, August 2000 - August 2001.
- Software Engineer, ESS Technology, Inc., August 1999 - August 2000.
- Research Assistant, Computer Science Department, Stanford University, September 1995 - August 2000.
- Instructor, Computer Science Department, Stanford University, Spring 1999.
- Teaching Assistant, Computer Science Department, Stanford University, Spring 1998.
- Research Intern, IBM Almaden Research Center, June 1996 - September 1996.
- Programmer, College of Natural Resources, UC Berkeley, June 1994 - August 1995.
- Lab Assistant, UC Berkeley, Computer Science Division, Spring 1994.
- Tutor, San Joaquin Delta College, February 1992 - June 1993.
Publications
Published work:
- Risi Thonangi and Jun Yang. "Permuting data on random-access block storage." Proceedings of the VLDB Endowment, ??(??), 2013.
- Botong Huang, Shivnath Babu, and Jun Yang. "Cumulon: optimizing statistical data analysis in the cloud." In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, New York City, New York, USA, June 2013.
- Yi Zhang, Kristian Lum, and Jun Yang. "Failure-aware cascaded suppression in wireless sensor networks." IEEE Transactions on Knowledge and Data Engineering, 25(5):1042-1055, May 2013. [paper and supplemental]
- Pankaj K. Agarwal, Lars Arge, Sathish Govindarajan, Jun Yang, and Ke Yi. "Efficient external memory structures for range-aggregate queries." Computational Geometry: Theory and Applications, 46(3):358-370, April 2013. [paper]
- Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Subscriber assignment for wide-area content-based publish/subscribe." IEEE Transactions on Knowledge and Data Engineering, 24(10):1833-1847, 2012. Invited as a special selection from ICDE 2011. [paper and supplemental]
- S. N. Lahiri, XuanLong Nguyen, Jun Yang, Zhengyuan Zhu, and P. Banerjee. "Wireless sensor networks: statistical issues and challenges." Journal of the Indian Statistical Association, 50(1–2):151-191, 2012.
- Rada Chirkova and Jun Yang. "Materialized views." Foundations and Trends in Databases, 4(4):295-405, 2012. [paper]
- Risi Thonangi, Shivnath Babu, and Jun Yang. "A practical concurrent index for solid-state drives." In Proceedings of the 2012 International Conference on Information and Knowledge Management, pages 1332-1341, Maui, Hawaii, USA, October 2012. Databases track. [paper and report]
- You Wu, Pankaj K. Agarwal, Chengkai Li, Jun Yang, and Cong Yu. "On “one of the few” objects." In Proceedings of the 2012 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1487-1495, Beijing, China, August 2012. [paper and report]
- Yi Zhang and Jun Yang. "Optimizing I/O for big array analytics." Proceedings of the VLDB Endowment, 5(8):764-775, June 2012. [paper]
- Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Processing a large number of continuous preference top-k queries." In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pages 397-408, Scottsdale, Arizona, USA, May 2012. [paper]
- Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Processing and notifying range top-k subscriptions." In Proceedings of the 2012 International Conference on Data Engineering, pages 810-821, Washington DC, USA, April 2012. [paper and report]
- Yi Zhang, Kamesh Munagala, and Jun Yang. "Storing matrices on disk: theory and practice revisited." Proceedings of the VLDB Endowment, 4(11):1075-1086, August 2011. [paper and report]
- James S. Clark, Pankaj K. Agarwal, David M. Bell, Paul G. Flikkema, Alan Gelfand, Xuanlong Nguyen, Eric Ward, and Jun Yang. "Inferential ecosystem models, from network data to prediction." Ecological Applications, 21(5):1523-1536, July 2011.
- Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Subscriber assignment for wide-area content-based publish/subscribe." In Proceedings of the 2011 International Conference on Data Engineering, pages 267-278, Hannover, Germany, April 2011. Results in this paper are subsumed by those in the TKDE 2012 paper by the same authors. [paper and report]
- Sarah Cohen, Chengkai Li, Jun Yang, and Cong Yu. "Computational journalism: a call to arms to database researchers." In Proceedings of the 2011 Conference on Innovative Data Systems Research, Asilomar, California, USA, January 2011. Outrageous ideas and vision track. Third-place winner of the Best Outrageous Ideas and Vision Track Paper Competition sponsored by the Computing Community Consortium. [paper and slides]
- Lei Chen, Changjie Tang, Jun Yang, and Yunjun Gao, ed. Proceedings of the 2010 International Conference on Web-Age Information Management, Jiuzhaigou, Sichuan, China, July 2010. Lecture Notes in Computer Science 6184. Springer.
- Yi Zhang, Weiping Zhang, and Jun Yang. "I/O-efficient statistical computing with RIOT." In Proceedings of the 2010 International Conference on Data Engineering, pages 1157-1160, Long Beach, California, USA, March 2010. Demonstration track. [paper and poster]
- Jun Yang, Kamesh Munagala, and Adam Silberstein. "Data aggregation in sensor networks." In Encyclopedia of Database Systems. Ling Liu and M. Tamer Özsu, ed. Springer. 2009. Invited contribution.
- Albert Yu, Pankaj K. Agarwal, and Jun Yang. "Generating wide-area content-based publish/subscribe workloads." In Proceedings of the 2009 Workshop on Networking Meets Databases, Big Sky, Montana, USA, October 2009. [paper]
- Pankaj K. Agarwal, Junyi Xie, Jun Yang, and Hai Yu. "Input-sensitive scalable continuous join query processing." ACM Transactions on Database Systems, 34(3):1-41, August 2009. [paper]
- Fei Chen, Byron J. Gao, AnHai Doan, Jun Yang, and Raghu Ramakrishnan. "Optimizing complex extraction programs over evolving text data." In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pages 321-334, Providence, Rhode Island, USA, June 2009. [paper]
- Risi Thonangi, Hao He, AnHai Doan, Haixun Wang, and Jun Yang. "Weighted proximity best-joins for information retrieval." In Proceedings of the 2009 International Conference on Data Engineering, pages 234-245, Shanghai, China, March 2009. [paper]
- Yi Zhang, Herodotos Herodotou, and Jun Yang. "RIOT: I/O-efficient numerical computing without SQL." In Proceedings of the 2009 Conference on Innovative Data Systems Research, Asilomar, California, USA, January 2009. [paper and slides]
- Badrish Chandramouli and Jun Yang. "End-to-end support for joins in large-scale publish/subscribe systems." In Proceedings of the 2008 International Conference on Very Large Data Bases, pages 434-450, Auckland, New Zealand, August 2008. Infrastructure track. [paper]
- Badrish Chandramouli, Jun Yang, Pankaj K. Agarwal, Albert Yu, and Ying Zheng. "ProSem: scalable wide-area publish/subscribe." In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pages 1315-1318, Vancouver, Canada, June 2008. Demonstration track. Acceptance rate: 31.9 percent. [paper]
- Junyi Xie, Jun Yang, Yuguo Chen, Haixun Wang, and Philip S. Yu. "A sampling-based approach to information recovery." In Proceedings of the 2008 International Conference on Data Engineering, pages 476-485, Cancun, Mexico, April 2008. Short presentation track. Acceptance rate: 19.2 percent of 715. Full paper. [paper]
- Fei Chen, AnHai Doan, Jun Yang, and Raghu Ramakrishnan. "Efficient information extraction over evolving text data." In Proceedings of the 2008 International Conference on Data Engineering, pages 943-952, Cancun, Mexico, April 2008. Acceptance rate: 12.1 percent of 715. [paper]
- Magdalena Balazinska, Amol Deshpande, Alexandros Labrinidis, Qiong Luo, Samuel Madden, and Jun Yang. "Report on the fourth international workshop on data management for sensor networks (DMSN 2007)." ACM SIGMOD Record, 36(4):53-55, 2007.
- Adam Silberstein, Alan E. Gelfand, Kamesh Munagala, Gavino Puggioni, and Jun Yang. "Making sense of suppressions and failures in sensor data: a Bayesian approach." In Proceedings of the 2007 International Conference on Very Large Data Bases, pages 842-853, Vienna, Austria, September 2007. Infrastructure track. Acceptance rate: 45 out of 275. [paper]
- Badrish Chandramouli, Jeff M. Phillips, and Jun Yang. "Value-based notification conditions in large-scale publish/subscribe systems." In Proceedings of the 2007 International Conference on Very Large Data Bases, pages 878-889, Vienna, Austria, September 2007. Infrastructure track. Acceptance rate: 45 out of 275. [paper]
- Magdalena Balazinska, Amol Deshpande, Qiong Luo, and Jun Yang, ed. Proceedings of the 2007 International Workshop on Data Management for Sensor Networks, Vienna, Austria, September 2007.
- Hao He, Haixun Wang, Jun Yang, and Philip S. Yu. "BLINKS: ranked keyword searches on graphs." In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 305-316, Beijing, China, June 2007. Acceptance rate: 70 out of 480. [paper and report]
- Badrish Chandramouli, Christopher N. Bond, Shivnath Babu, and Jun Yang. "Query suspend and resume." In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 557-568, Beijing, China, June 2007. Acceptance rate: 70 out of 480. [paper and report]
- Adam Silberstein and Jun Yang. "Many-to-many aggregation for sensor networks." In Proceedings of the 2007 International Conference on Data Engineering, pages 986-995, Istanbul, Turkey, April 2007. Acceptance rate: 122 out of 659. [paper and report]
- Badrish Chandramouli, Christopher Bond, Shivnath Babu, and Jun Yang. "On suspending and resuming dataflows." In Proceedings of the 2007 International Conference on Data Engineering, pages 1289-1291, Istanbul, Turkey, April 2007. Poster track. Acceptance rate: 60(+122) out of 659. Results in this paper are subsumed by those in the SIGMOD 2007 paper by the same authors.
- Adam Silberstein, Gregory Filpus, Kamesh Munagala, and Jun Yang. "Data-driven processing in sensor networks." In Proceedings of the 2007 Conference on Innovative Data Systems Research, pages 10-21, Asilomar, California, USA, January 2007. Acceptance rate: 34 out of 98. [paper]
- Junyi Xie and Jun Yang. "A survey of join processing in data streams." In Data Streams: Models and Algorithms. Charu C. Aggarwal, ed. Springer. November 2006. Invited contribution. [paper]
- Pankaj K. Agarwal, Junyi Xie, Jun Yang, and Hai Yu. "Scalable continuous query processing by tracking hotspots." In Proceedings of the 2006 International Conference on Very Large Data Bases, pages 31-42, Seoul, Korea, September 2006. Core database track. Acceptance rate: 46 out of 334. Results in this paper are subsumed by those in the 2009 TODS paper by the same authors. [paper and report]
- Adam Silberstein, Kamesh Munagala, and Jun Yang. "Energy-efficient monitoring of extreme values in sensor networks." In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 169-180, Chicago, Illinois, USA, June 2006. Acceptance rate: 58 out of 446. [paper]
- Adam Silberstein, Rebecca Braynard, and Jun Yang. "Constraint chaining: on energy-efficient continuous monitoring in sensor networks." In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 157-168, Chicago, Illinois, USA, June 2006. Acceptance rate: 58 out of 446. [paper]
- Badrish Chandramouli, Junyi Xie, and Jun Yang. "On the database/network interface in large-scale publish/subscribe systems." In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 587-598, Chicago, Illinois, USA, June 2006. Acceptance rate: 58 out of 446. [paper and report]
- Paul G. Flikkema, Pankaj K. Agarwal, James S. Clark, Carla Schlatter Ellis, Alan Gelfand, Kamesh Munagala, and Jun Yang. "Model-driven dynamic control of embedded wireless sensor networks." In Proceedings of the 2006 International Conference on Computational Science, pages 409-416, Reading, United Kingdom, May 2006.
- Haixun Wang, Hao He, Jun Yang, Philip S. Yu, and Jeffrey Xu Yu. "Dual labeling: answering graph reachability queries in constant time." In Proceedings of the 2006 International Conference on Data Engineering, Atlanta, Georgia, USA, April 2006. Acceptance rate: 89 out of 456. [paper]
- Adam Silberstein, Rebecca Braynard, and Jun Yang. "Energy-efficient continuous isoline queries in sensor networks." In Proceedings of the 2006 International Conference on Data Engineering, Atlanta, Georgia, USA, April 2006. Poster track. Results in this paper are subsumed by those in the SIGMOD 2006 paper by the same authors. [paper]
- Adam Silberstein, Rebecca Braynard, Carla Ellis, Kamesh Munagala, and Jun Yang. "A sampling-based approach to optimizing top-k queries in sensor networks." In Proceedings of the 2006 International Conference on Data Engineering, Atlanta, Georgia, USA, April 2006. Acceptance rate: 89 out of 456. [paper]
- Badrish Chandramouli, Jun Yang, and Amin Vahdat. "Distributed network querying with bounded approximate caching." In Proceedings of the 2006 International Conference on Database Systems for Advanced Applications, pages 374-388, Singapore, April 2006. Acceptance rate: 24.5 percent. [paper and report]
- Pankaj K. Agarwal, Junyi Xie, Jun Yang, and Hai Yu. "Monitoring continuous band-join queries over dynamic data." In Proceedings of the 2005 International Symposium on Algorithms and Computation, pages 349-359, Sanya, Hainan, China, December 2005. [paper]
- Hao He, Haixun Wang, Jun Yang, and Philip S. Yu. "Compact reachability labeling for graph-structured data." In Proceedings of the 2005 International Conference on Information and Knowledge Management, pages 594-601, Bremen, Germany, November 2005. Acceptance rate: 76 out of 425. [paper and report]
- Kamesh Munagala, Jun Yang, and Hai Yu. "Online view maintenance under a response-time constraint." In Proceedings of the 2005 European Symposium on Algorithms, pages 677-688, Palma de Mallorca, Spain, October 2005. [paper]
- Wenfei Fan, Zhaohui Wu, and Jun Yang, ed. Proceedings of the 2005 International Conference on Web-Age Information Management, Hangzhou, China, October 2005. Lecture Notes in Computer Science 3739. Springer.
- Junyi Xie, Jun Yang, and Yuguo Chen. "On joining and caching stochastic streams." In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, pages 359-370, Baltimore, Maryland, USA, June 2005. Acceptance rate: 65 out of 431. [paper and report]
- Adam Silberstein, Hao He, Ke Yi, and Jun Yang. "BOXes: efficient maintenance of order-based labeling for dynamic XML data." In Proceedings of the 2005 International Conference on Data Engineering, pages 285-296, Tokyo, Japan, April 2005. Acceptance rate: 67 out of 521. [paper and report]
- Hao He, Junyi Xie, Jun Yang, and Hai Yu. "Asymmetric batch incremental view maintenance." In Proceedings of the 2005 International Conference on Data Engineering, pages 106-117, Tokyo, Japan, April 2005. Acceptance rate: 67 out of 521. [paper]
- Junfei Geng and Jun Yang. "AutoBib: automatic extraction of bibliographic information on the Web." In Proceedings of the 2004 International Database Engineering and Applications Symposium, pages 193-204, Coimbra, Portugal, July 2004. [paper]
- Ke Yi, Hao He, Ioana Stanoi, and Jun Yang. "Incremental maintenance of XML structural indexes." In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pages 491-502, Paris, France, June 2004. Acceptance rate: 69 out of 431. [paper]
- Adam Silberstein and Jun Yang. "NeXSort: sorting XML in external memory." In Proceedings of the 2004 International Conference on Data Engineering, pages 695-706, Boston, Massachusetts, USA, April 2004. Acceptance rate: 63 out of 441. [paper and report]
- Hao He and Jun Yang. "Multiresolution indexing of XML for frequent queries." In Proceedings of the 2004 International Conference on Data Engineering, pages 683-694, Boston, Massachusetts, USA, April 2004. Acceptance rate: 63 out of 441. [paper and report]
- Jun Yang and Jennifer Widom. "Incremental computation and maintenance of temporal aggregates." The VLDB Journal, 12(3):262-283, 2003. [paper]
- Zhiyuan Chen, Li Chen, Jian Pei, Yufei Tao, Haixun Wang, Wei Wang, Jiong Yang, Jun Yang, and Donghui Zhang. "Recent progress on selected topics in database research: a report by nine young Chinese researchers working in the United States." Journal of Computer Science and Technology, 18(5):538-552, September 2003.
- Pankaj K. Agarwal, Lars Arge, Jun Yang, and Ke Yi. "I/O-efficient structures for orthogonal range-max and stabbing-max queries." In Proceedings of the 2003 European Symposium on Algorithms, pages 7-18, Budapest, Hungary, September 2003.
- Xiao Huang, Qiang Xue, and Jun Yang. "TupleRank and implicit relationship discovery in relational databases." In Proceedings of the 2003 International Conference on Web-Age Information Management, pages 445-457, Chengdu, China, August 2003. Acceptance rate: 30 out of 258. [paper and report]
- Ke Yi, Hai Yu, Jun Yang, Gangqiang Xia, and Yuguo Chen. "Efficient maintenance of materialized top-k views." In Proceedings of the 2003 International Conference on Data Engineering, pages 189-200, Bangalore, India, March 2003. Acceptance rate: 51 out of 378. [paper and report]
- Jun Yang. "Temporal data warehousing." Ph.D. Dissertation, Stanford University, August 2001.
- Jun Yang and Jennifer Widom. "Incremental computation and maintenance of temporal aggregates." In Proceedings of the 2001 International Conference on Data Engineering, pages 51-60, Heidelberg, Germany, April 2001. Acceptance rate: 14 percent. Results in this paper are subsumed by those in the 2003 VLDB Journal paper by the same authors.
- Wilburt Juan Labio, Jun Yang, Yingwei Cui, Hector Garcia-Molina, and Jennifer Widom. "Performance issues in incremental warehouse maintenance." In Proceedings of the 2000 International Conference on Very Large Data Bases, pages 461-472, Cairo, Egypt, September 2000. Acceptance rate: 53 out of 351.
- Jun Yang, Huacheng C. Ying, and Jennifer Widom. "TIP: a temporal extension to Informix." In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, page 596, Dallas, Texas, USA, May 2000. Demonstration track.
- Jun Yang, Huacheng C. Ying, and Jennifer Widom. "TIP: a temporal extension to Informix." In Proceedings of the 2000 International Conference on Extending Database Technology, Konstanz, Germany, March 2000. Demonstration track. An improved version was shown at SIGMOD 2000.
- Jun Yang and Jennifer Widom. "Temporal view self-maintenance." In Proceedings of the 2000 International Conference on Extending Database Technology, pages 395-412, Konstanz, Germany, March 2000. Acceptance rate: 16.7 percent.
- Hector Garcia-Molina, Wilburt Juan Labio, and Jun Yang. "Expiring data in a warehouse." In Proceedings of the 1998 International Conference on Very Large Data Bases, pages 500-511, New York City, New York, USA, August 1998. Acceptance rate: 16 percent.
- Jun Yang and Jennifer Widom. "Maintaining temporal views over non-temporal information sources for data warehousing." In Proceedings of the 1998 International Conference on Extending Database Technology, pages 389-403, Valencia, Spain, March 1998. Acceptance rate: 32 out of 191.
- Laura M. Haas, Donald Kossmann, Edward L. Wimmers, and Jun Yang. "Optimizing queries across diverse data sources." In Proceedings of the 1997 International Conference on Very Large Data Bases, pages 276-285, Athens, Greece, August 1997. Acceptance rate: 16 percent.
- Laura M. Haas, Donald Kossmann, Edward L. Wimmers, and Jun Yang. "An optimizer for heterogeneous systems with non-standard data and search capabilities." IEEE Data Engineering Bulletin, 19(4):37-44, December 1996.
- Steve G. Steinberg, Jun Yang, and Katherine A. Yelick. "Performance modeling and composition: a case study in cell simulation." In Proceedings of the 1996 International Parallel Processing Symposium, pages 68-74, Honolulu, Hawaii, USA, April 1996. Acceptance rate: 35 percent.
Technical reports:
- Kevin A. Walsh, Amin Vahdat, and Jun Yang. "Enabling wide-area replication of database services with continuous consistency." Technical Report, Duke University, February 2002. [report]
- Jun Yang, Jennifer Widom, and Paul Brown. "Implementing parameterized range types in an extensible DBMS." Technical Report, Stanford University, November 2000.
Funding
Current funding:
- Principal investigator. III-Small: RIOT: Statistical Computing with Efficient, Transparent I/O. NSF IIS Division. September 2009 - August 2013.
- Principal investigator. Supplemental Award for III-Small: RIOT: Statistical Computing with Efficient, Transparent I/O. NSF REU Program. June 2010.
- Principal investigator. Provisioning and Optimization for Statistical Workloads in a Cloud. Amazon Web Services. July 2012 - July 2014. With Shivnath Babu.
- Principal investigator. Computational Journalism: Bringing Together Social and Computer Scientists to Help Journalism. Trinity College Bi-Annual Interdepartmental Collaboration Mini-Grants Program, Duke University. July 2013 - June 2014. With Pankaj K. Agarwal and James T. Hamilton.
Pending proposals:
- Principal investigator. III: Small: Cumulon: Easy and Efficient Statistical Computing in the Cloud. NSF IIS. September 2013 - August 2016. With Shivnath Babu, Sayan Mukherjee, and Michael Ward.
- Principal investigator. III: Medium: Collaborative Research: Database Research for Computational Journalism. NSF IIS. June 2013 - May 2016. With Pankaj K. Agarwal, James T. Hamilton, and Chengkai Li (University of Texas at Arlington).
Past funding:
- Principal investigator. RIOT: Transparent Scalability for Statistical Analysis of Massive Datasets. HP Labs Innovation Research Program. August 2010 - July 2011.
- Co-investigator. Modeling Immunity for Biodefense. BAA-NIAID-DAIT-NIHAI2009074. September 2010 - August 2015. With Thomas B. Kepler and others.
- Principal investigator. III-COR: Scalable Publish/Subscribe: Unifying Data Processing and Dissemination. NSF IIS Division. September 2007 - August 2011. With Pankaj K. Agarwal.
- Principal investigator. Supplemental Award for III-COR: Scalable Publish/Subscribe: Unifying Data Processing and Dissemination. NSF REU Program. June 2009.
- Co-investigator. Doctoral Program in Management and Analysis of Large Data Acquired from Sensors. Department of Education GAANN Program. May 2007. With Pankaj K. Agarwal and others.
- Co-investigator. Integration of IBM Management Software with Campus Blade Clusters in Support of Duke Academic Infrastructure. IBM Shared University Research (SUR) Program. June 2006. With Richard Lucic and others.
- Co-investigator. COLLABORATIVE RESEARCH: DDDAS-TMRP: Dynamic Sensor Networks---Enabling the Measurement, Modeling, and Prediction of Biophysical Change in a Landscape. NSF CNS DDDAS Program. January 2006 - December 2011. With James S. Clark and others.
- Co-investigator. Supplemental Award for COLLABORATIVE RESEARCH: DDDAS-TMRP: Dynamic Sensor Networks---Enabling the Measurement, Modeling, and Prediction of Biophysical Change in a Landscape. NSF REU Program. July 2006. With James S. Clark and others.
- Co-investigator. Multiscale Integrative Immunology for Adjuvant Development. NIH-NIAID-DAIT-BAA-05-10. September 2005 - August 2010. With Thomas B. Kepler and others.
- Principal investigator. CAREER: Techniques and Applications of Derived Data Maintenance. NSF CAREER Program. September 2003 - August 2008.
- Principal investigator. Supplemental Award for CAREER: Techniques and Applications of Derived Data Maintenance. NSF REU Program. June 2006.
Honors and Awards
- David and Janet Vaughan Brooks Teaching Award, Trinity College of Arts and Sciences, Duke University, April 2013.
- Third-place winner of the Best Outrageous Ideas and Vision Track Paper Competition at the 2011 Conference on Innovative Data Systems Research (CIDR 2011), sponsored by the Computing Community Consortium, January 2011.
- IBM Faculty Award, January 2006.
- Recognized for excellence in teaching by Teaching Excellence Committee, Department of Computer Science, Duke University, January 2004.
- NSF CAREER Award, September 2003.
- Highest Achievement Award, Computer Science Division, UC Berkeley, May 1995.
- UC Berkeley Chancellor's Scholarship, 1993 - 1995.
- Dean's Honor List For Top 4% Students, UC Berkeley, February 1994, July 1994, and February 1995.
- Chinese-American Institute of Engineers And Scientists Scholarship, June 1994.
- Chuck Miller Scholarship, February 1994.
- National Individual Champion of Mathematics Competition of American Math Association of Two-Year Colleges, 1991 - 1992 and 1992 - 1993.
- Outstanding Student's Honor, Delta College Academic Senate, April 1993.
- California Math Council of Community Colleges Scholarship, 1993 and 1994.
- Delta College Foundation Scholarship, Memorial Scholarship, Academic Excellence Scholarship, etc., 1993.
- First Prizes, National Math Competition of Chinese High Schools, 1989 and 1990.
- First Prizes, Computer Programming Contest of Chengdu, China, 1988, 1989, and 1990.
External Presentations and Demonstrations
- "Big Data: Not Just about the Size," presentation at the Forum of Future Data, Wuyishan, China, July 2012.
- "Problems in Computational Journalism," presentation at HP Labs, Beijing, China, June 2012.
- "Fun with Arrays and Matrices in RIOT," informal talk at Stanford InfoLab lunch, August 2011.
- "Computational Journalism: A Call to Arms to Database Researchers," presentation at the 2011 Conference on Innovative Data Systems Research (CIDR 2011), January 2011.
- "Scalable Continuous Query Processing and Result Dissemination," seminar at HP Labs, Beijing, China, August 2010.
- "Data-Driven Processing in Sensor Networks," seminar at Stanford University, January 2009.
- "A Sampling-Based Approach to Information Recovery," presentation at the 2008 Annual Meeting of the Institute for Operations Research and the Management Sciences (INFORMS 2008), October 2008.
- "Thoughts on Data Sharing: A Database Researcher's Perspective," presentation at the Primate Life History Working Group Meeting, NESCent (National Evolutionary Synthesis Center), August 2007.
- "Query Suspend and Resume," presentation at the 2007 ACM SIGMOD International Conference on Management of Data (SIGMOD 2007), June 2007.
- "Data-Driven Processing in Sensor Networks," seminars at University of Pennsylvania, University of Waterloo, and New England Database Society, April 2007 - October 2007.
- "Scalable Continuous Query Processing and Result Dissemination," seminars at IBM T. J. Watson Research Center, University of Maryland at College Park, University of Pittsburgh/Carnegie Mellon University Joint Database Seminar, Brown University, University of Illinois at Urbana-Champaign, and University of California at Berkeley, February 2006 - December 2006.
- "Continuous Query Processing over Networked Data," presentation at IBM Research Triangle Park University Day, October 2006.
- Panel discussion at SIGMOD '06 Life after Graduation Symposium, June 2006.
- "Scalable Continuous Query Processing and Result Dissemination," talk at the 2006 Southeast Workshop on Data and Information Management (SEWDIM 2006), March 2006.
- "Querying Networked Data," presentation at IBM Research Triangle Park University Day, October 2005.
- "An Overview of Database Research at Duke," presentation at inDuke Meeting, Duke University, May 2005.
- "Caching for Network Querying," presentation at SIGMOD '05 Program Committee Workshop, Stanford, California, February 2005.
- "Layers and Boxes: Efficient and Maintainable Indexes for XML," seminar at IBM T. J. Watson Research Center, July 2004.
- "AutoBib: Automatic Extraction of Bibliographic Information on the Web," presentation at the 2004 International Database Engineering and Applications Symposium (IDEAS 2004).
- "Post-Web-Age Information Management," panel discussion at the 2003 International Conference on Web-Age Information Management (WAIM 2003).
- "TupleRank and Implicit Relationship Discovery in Databases," presentation at the 2003 International Conference on Web-Age Information Management (WAIM 2003).
- "Problems in Database View Maintenance and Web Data Extraction," seminar at University of North Carolina at Greensboro, April 2003.
- "Efficient Maintenance of Materialized Top-k Views," presentation at the 2003 International Conference on Data Engineering (ICDE 2003).
- "Incremental Computation and Maintenance of Temporal Aggregates," presentation at the 2001 International Conference on Data Engineering (ICDE 2001).
- "Query Processing in Kidar," guest lecture for a course on database system implementation at Stanford University, Stanford, California, November 2000.
- "Performance Issues in Incremental Warehouse Maintenance," presentation at the 2000 International Conference on Very Large Data Bases (VLDB 2000).
- "TIP: A Temporal Extension to Informix," system demonstration at the 2000 ACM SIGMOD International Conference on Management of Data (SIGMOD 2000).
- "Temporal Data Warehousing," colloquia at Brown University, Cornell University, Duke University, Harvard University, Santa Clara University, State University of New York at Stony Brook, University of California at Santa Barbara, University of California at Santa Cruz, University of Southern California, Yale University, and IBM Almaden Research Center, February 2000 - May 2000.
- "TIP: A Temporal Extension to Informix," presentation and system demonstration at Stanford Database Workshop, Stanford, California, March 2000.
- "TIP: A Temporal Extension to Informix," presentation and system demonstration at Informix Corporation, Oakland, California, March 2000.
- "Temporal View Self-Maintenance," presentation at the 2000 International Conference on Extending Database Technology (EDBT 2000).
- "TIP: A Temporal Extension to Informix," system demonstration at the 2000 International Conference on Extending Database Technology (EDBT 2000).
- "Maintaining Temporal Views Over Non-Temporal Information Sources For Data Warehousing," presentation at the 1998 International Conference on Extending Database Technology (EDBT 1998).
- "Performance Modeling and Composition: A Case Study in Cell Simulation," presentation at the 1996 International Parallel Processing Symposium (IPPS 1996).
Teaching
- COMPSCI 316 (formerly CPS 116), Duke University: Introduction to Database Systems. Fall 2002, Fall 2003, Fall 2004, Fall 2005, Fall 2006, Fall 2007, Fall 2008, Fall 2009, Fall 2011, and Fall 2012.
- COMPSCI 516 (formerly CPS 216), Duke University: Advanced Database Systems. Fall 2001, Spring 2003, Spring 2004, and Spring 2005.
- CPS 296.1, Duke University: Project in Computational Journalism. Spring 2012.
- CPS 296.1, Duke University: Database and Programming Languages: Crossing the Chasm. Spring 2010.
- CPS 296.3, Duke University: Information Management and Mining. Spring 2009.
- CPS 399.28, Duke University: Research Seminar and Project in Databases. Spring 2008.
- CPS 296.4, Statistical and Applied Mathematical Sciences Institute, cross-listed at Duke, North Carolina State, and UNC Chapel Hill: Sensor Networks for Environmental Monitoring. Fall 2007.
- CPS 296.1, Duke University: Sensor Data Processing. Spring 2007.
- CPS 296.1, Duke University: Topics in Database Systems. Spring 2002.
- CPS 300, Duke University: Introduction to Graduate Study. Fall 2008, Fall 2009, Fall 2010, and Fall 2011.
- CS 145, Stanford University: Introduction to Databases. Spring 1999.
Student Advising
Current Ph.D. students:
- Botong Huang.
- Ph.D. preliminary exam: Cumulon: Optimizing Statistical Analysis in the Cloud. Spring 2013.
- Ph.D. research initiation project: Data Parallel Statistical Computing in the Cloud. 2012.
- Risi Thonangi (Rishi).
- Ph.D. preliminary exam: Searching, Sorting, Permuting and Beyond on Flash. Spring 2011.
- Ph.D. research initiation project: Investigating Concurrency Control for Flash-Efficient Indexes. 2009.
- You Wu (Will).
- Ph.D. research initiation project: Extended Promotion Analysis and its Applications in Computational Journalism. 2012.
- Albert Yu.
- Ph.D. preliminary exam: Algorithmic Challenges in Content-based Publish-Subscribe Systems. Spring 2010.
- Ph.D. research initiation project: Network Design for Wide-Area Publish/Subscribe. 2008.
Graduated Ph.D. students:
- Yi Zhang. First employment: Google Inc.
- Ph.D. dissertation defense: Transparent and Efficient I/O for Statistical Computing. March 2012.
- Ph.D. preliminary exam: RIOT: A Framework for Efficient Statistical Computing. Fall 2009.
- Ph.D. research initiation project: Failure-Aware Spatial Suppression in Sensor Networks. 2007.
- Badrish Chandramouli. First employment: Microsoft Research.
- Ph.D. dissertation defense: Unifying Databases and Internet-Scale Publish/Subscribe. July 2008.
- Ph.D. preliminary exam: Supporting Better Scalability and Richer Subscription
Models in Wide-Area Publish/Subscribe. Summer 2006.
- Ph.D. research initiation project: Distributed Network Querying: Reducing Costs by Providing
Approximate Answers. 2004. Duke CS Outstanding PhD Research Initiation Project Award.
- Junyi Xie. First employment: Oracle Corp.
- Ph.D. dissertation defense: Handling Resource Constraints and Scalability in Continuous Query Processing. September 2007.
- Ph.D. preliminary exam: Optimizing Continuous Queries Over Data Streams. Fall 2004.
- Ph.D. research initiation project: Building DRAM-Based High Performance Intermediate Memory Systems. 2002. (Served as committee member, not as primary advisor.)
- Hao He. IBM Ph.D. Fellowship, 2006-2007; first employment: Google Inc.
- Ph.D. dissertation defense: Query Processing and Indexing Techniques on Semi-Structured Data. July 2007.
- Ph.D. preliminary exam: Query Processing and Indexing Techniques on Graph-Structured Data. Spring 2006.
- Ph.D. research initiation project: A Workload-Aware Update-Efficient Index for XML. 2003.
- Adam Silberstein. First employment: Yahoo! Research.
- Ph.D. dissertation defense: Query Processing Methods for Wireless Sensor Networks. February 2007.
- Ph.D. preliminary exam: Query Processing and Optimization in Sensor Networks. Spring 2005.
- Ph.D. research initiation project: Sorting XML in External Memory. 2004.
Graduated M.S. students:
- Rohit Paravastu. Detecting Natural-Language Claims Checkable on Relational Databases. Fall 2012.
- Rozemary Scarlat. FirstPass: Crowdsourced Initial Document Analysis. Fall 2012.
- Yunjia Zhou. Exploring One-of-the-Few Claims from Data. Spring 2012.
- Pradeep K. Gunda. Scalable Lineage Tracking in Workflows. Fall 2007.
- Wenbin Pan. On Author Name Disambiguation in Citation Databases. Fall 2004.
- Zhihui Wang. Multiple-View Maintenance with Semantic Caching. Summer 2003.
- Jing Zhang. Implementing a File System on Top of a DBMS. Summer 2003.
- Junfei Geng. Automatic Extraction and Integration of Bibliographic Information on the Web Using
Hidden Markov Models. Spring 2003.
- Xiao F. Huang (Andy). TupleRank and Implicit Relationship Discovery in Databases. Spring 2003.
- Parag G. Palekar. Analysis of an Incremental Algorithm for Mining Frequent Itemsets. Fall 2002.
Undergraduate theses supervised:
- Tyler Brock. Amboseli Baboon Research Ranker. Spring 2007. Graduated with Distinction.
- Christopher N. Bond. Query Suspend and Resume. Spring 2005. Graduated with High Distinction.
Undergraduate research internships:
- Jiaqi Yan. RIOT: Statistical Computing with Efficient, Transparent I/O. Summer 2010 - Spring 2012. Duke CSURF Fellow.
- Weiping Zhang. RIOT: Statistical Computing with Efficient, Transparent I/O. Summer 2009 - Spring 2011.
- Gregory Filpus. Suppression Schemes for Sensor Data Collection. Summer 2006.
- Congyi Wu. Tracking Lineage for Computational Workflows. Summer 2006.
Undergraduate independent studies:
- Andrew Shim. Computational Journalism. Spring 2013. Duke CSURF Fellow.
- Jiaqi Yan. Efficient Out-of-Core Data Analysis. Fall 2010 - Spring 2011.
- Kevin Jang. Efficient Out-of-Core Data Analysis. Fall 2010.
- Perry Zheng. Managing Structure-Rich Data. Fall 2009 - Spring 2010.
- Weiping Zhang. RIOT: Statistical Computing with Efficient, Transparent I/O. Spring 2010.
- Ashley DeMass. Database Support for Wireless Sensor Networks. Fall 2008.
- Congyi Wu. Object-Oriented Schema and Data Editing on a Relational Backend. Spring 2008.
Ph.D. defense committee (not as primary advisor):
- Nedyalko Borisov. Integrated Management of the Persistent-Storage and Data-Processing Layers in Data-intensive
Computing Systems. Summer 2012.
- Sharathkumar Raghvendra. Geometric Approximation Algorithms - A Summary Based Approach. Summer 2012.
- Herodotos Herodotou. Automatic Tuning of Data-Intensive Analytical Workloads. Spring 2012.
- Sam Slee. Developing Scalable Abilities for Self-Reconfigurable Robots. Fall 2010.
- Songyun Duan. Simplifying System Management through Automated Forecasting, Diagnosis, and Configuration
Tuning. Spring 2010.
- Fareed Zaffar. Foresight: Countering Malware Through Cooperative Forensics Sharing. Summer 2008.
- Joseph Volpe. Mechanistic and Genetic Biases in Human Immunoglobulin Heavy Chain Development. Spring 2008.
- Laura Grit. Extensible Resource Management for Networked Virtual Computing. Fall 2007.
- Dazhi Wang. Service Reliability: Models, Algorithms and Applications. Summer 2007.
- Angela Dalton. Data Fidelity Mechanisms for Enhancing Energy Management in Context-Aware Systems. Fall 2006.
- Ke Yi. I/O Efficient Algorithms for Processing Massive Spatial
Data. Summer 2006.
- Hai Yu. Geometric Algorithms for Time-Varying Data. Summer 2006.
- Rebecca Braynard. Wireless MAC Layer Flexibility for Extending Effective System Lifetime. Spring 2006.
- Justin Moore. Automated Cost-Aware Data Center Management. Spring 2006.
- Patrick Reynolds. Using Causal Paths to Improve Performance and Correctness in Distributed Systems. Spring 2006.
- Dejan Kostic. High Bandwidth Data Dissemination for Large-Scale Distributed Systems. Summer 2005.
- Yun Fu. Resource Allocation for Global-Scale Network Services. Fall 2004.
- Sathish Govindarajan. Spatial Data Structures and Algorithms for Large Scale Applications. Fall 2004.
- Lipyeow Lim. Online Methods for Database Optimization. Fall 2004.
- Rajiv Wickremesinghe. Methods and Models for Data-Intensive Computing. Fall 2004.
- Heng Zeng. Explicit Energy Resource Management as a First Class Operating System Resource. Spring 2004.
- Ronald P. Doyle. Model-Based Adaptive Resource Provisioning in a Web Service Utility. Fall 2003.
Ph.D. preliminary exam committee (not as primary advisor):
- Andrew Brown. Cloud Platform Trust Logic. Fall 2012.
- Vamsidhar Thummala. Balancing Energy, Performance, and Stability Tradeoffs Under Uncertainty. Spring 2011.
- Herodotos Herodotou. Optimizing Analytical Workloads in Data-Intensive Computing Systems. Fall 2010.
- Nedyalko Borisov. Integrated Management of the Persistent-Storage and Data-Processing Layers in Data-intensive
Computing Systems. Spring 2010.
- Sharathkumar Raghvendra. Geometric Summaries. Spring 2009.
- Sam Slee. Developing Scalable Abilities for Self-Reconfigurable Robots. Spring 2009.
- Songyun Duan. Automated Forecasting and Diagnosis of System Failures. Spring 2008.
- Anita Lungu. Verification-Aware Processor Design. Spring 2007.
- Aydan Jumerefendi. System Support for Strong Accountability. Fall 2006.
- Joseph Volpe. Investigation of the IgH Locus and Analysis of the Antigen Receptors That It Forms. Fall 2005. Bioinformatics and Genome Technology.
- Dazhi Wang. Service Availability Modeling. Spring 2005.
- Rebecca Braynard. Asynchronous and Asymmetric Communication for Balancing Energy Consumption in Sensor
Networks. Fall 2004.
- Justin Moore. Balancing Site Goals and Service Goals in Datacenter Management. Fall 2004.
- Ke Yi. Index Structures for Large Databases: Theory and Practice. Spring 2004.
- Dejan Kostic. High Bandwidth Data Dissemination for Large-Scale Distributed Systems. Fall 2003.
- Patrick Reynolds. Measurement and Causality in Black-Box Distributed Systems. Fall 2003.
- Lipyeow Lim. Online Methods for Database Optimization. Spring 2003.
- Yun Fu. Resource Allocation for Global-Scale Network Services. Fall 2002.
- Sathish Govindarajan. Handling Large Spatial Data: Approximation and Data Structures. Summer 2002.
- Rajiv Wickremesinghe. Data Intensive Computation in a Compute/Storage Hierarchy. Spring 2002.
- Ronald P. Doyle. Internet Service Delivery Architecture: Implications of the Resource Grid Model. Fall 2001.
Ph.D. research initiation project committee (not as primary advisor):
- Mayuresh Kunjir. Physical Design for Big Data Management Systems. 2013.
- Mahanth Gowda. Cooperative Packet Recovery in Enterprise WLANs. 2012.
- Jie Li. Evaluating Starfish in the Real World. 2012.
- Wuzhou Zhang. Nearest Neighbor Searching Under Uncertainty. 2012.
- Nedyalko Borisov. Diagnosing Query Slowdowns in Database and SAN Environments. 2008.
- Vamsidhar Thummala. iTuned: An Auto-Tuner for Database Configuration Parameters. 2008.
- Songyun Duan. Proactive Performance Problem Identification and Diagnosis. 2006.
- Kuan-Ming Liu. Predicting Protein Functions by Integrating Biological Database from Multiple Knowledge
Domains. 2006.
- Anita Lungu. Integrating Biological Information Across Domains. 2006.
- Sita Badrish. Energy-Efficient Handling of Disk Accesses. 2004.
- Aydan Jumerefendi. Trust But Verify: Accountability for Internet Services. 2004.
- Haoying Li. Global Maximum Stereo Matching. 2004.
- Piyush Shivam. Distributed Data Staging for Performability. 2004.
- Kashi Vishwanath. Scalability Issues in ModelNet. 2003.
- Danxia Xie. Distributed Synthetic Energy Management for Sensor Networks. 2003.
- Ke Yi. External Memory Orthogonal Range and Stabbing Aggregate Queries on Semigroups. 2003.
- Hai Yu. Kinetic Fair-Split Trees and Proximity Problems. 2003.
- Junyi Xie. Building DRAM-Based High Performance Intermediate Memory Systems. 2002.
M.S. committee (not as primary advisor):
- Mahanth Gowda. Cooperative Packet Recovery in Enterprise WLANs. Spring 2013.
- Jie Li. Evaluating Starfish in the Real World. Spring 2013.
- Fan Yang. Prediction-Based Mobility Monitor with Adaptive Sensing for Smartphone. Summer 2012.
- Xixi Wang. Declarative Data Stream Analysis on Storm. Spring 2012.
- Gang Luo. Processing SQL-Like Declarative Queries in a MapReduce Framework. Summer 2011.
- Liang Dong. Optimization Opportunities for MapReduce Workloads. Spring 2011.
- Xuting Zhao. Workload-Aware Data-Placement and Scheduling Policies to Improve MapReduce Performance
under Cluster Hot Spots. Spring 2011.
- Kuan-Ming Liu. Combining Feature Selection Strategies with Bayesian Learning Models to Categorize
Gene Expression Profiles. Summer 2008.
- Yuqing Pan (Gary). Wireless Pulse Oximeter Sensor Project. Spring 2008. Electrical and Computer Engineering.
- Dongdong Zhao. An Evaluation of Techniques for Self-Healing in Application and Database Servers. Spring 2008.
- Jennifer Burge. Trading Information for Energy in Sensor Networks. Fall 2007.
- Sita Badrish. Energy Efficient Handling of Disk Accesses Using Economic Models. Fall 2005.
- Haoying Li. Just-in-Time Constraints for Dynamic-Programming Stereo. Summer 2005.
Undergraduate thesis committee (not as primary advisor):
- Andrea Scripa. A Survey of Text Mining Techniques for Short Texts. Spring 2012.
- Katherine Trushkowsky. CoBib: An Architecture for a Collaborative Database. Spring 2007. Graduated with High Distinction.
- Sanjay Ginde, David Goldberg, and Chris Zeiders. OogP2P Framework. 2004.
Activities
Service to the professional community:
- Subject Area Editor (Database and Knowledge-Based Systems), Journal of Computer Science and Technology (JCST), December 2011 - present.
- Associate Editor, IEEE Transactions on Knowledge and Data Engineering (TKDE), March 2009 - present.
- Review Board, Proceedings of the VLDB Endowment, August 2008 - March 2012 and April 2013 - present.
- Senior Program Committee, the 2013 International Conference on Information and Knowledge Management (CIKM 2013).
- Demonstration Program Committee Co-Chair, the 2013 International Conference on Very Large Data Bases (VLDB 2013).
- Best Paper Selection Committee, the 2013 National Database Conference of China (NDBC 2013).
- Program Committee Area Chair (Streams, Sensor Networks, Complex Event Processing),
the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD 2013).
- Publicity Co-Chair, the 2013 International Conference on Database Systems for Advanced Applications (DASFAA 2013).
- Panel Co-Chair, the 2013 International Conference on Data Engineering (ICDE 2013).
- Senior Program Committee, the 2012 International Conference on Information and Knowledge Management (CIKM 2012).
- Best Paper Selection Committee, the 2012 National Database Conference of China (NDBC 2012).
- Program Committee, the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD 2012).
- Program Committee, the 2012 International Conference on Data Engineering (ICDE 2012).
- Program Committee, the 2011 International Conference on Data Engineering (ICDE 2011).
- Program Committee, the 2011 Conference on Innovative Data Systems Research (CIDR 2011).
- Program Committee, the 2010 International Conference on Very Large Data Bases (VLDB 2010).
- Program Committee, the 2010 International Workshop on Data Management for Sensor Networks (DMSN 2010).
- Program Committee Co-Chair, the 2010 International Conference on Web-Age Information Management (WAIM 2010).
- Program Committee, the 2010 International Conference on Data Engineering (ICDE 2010).
- Program Committee, the 2010 International Workshop on Ranking in Databases (DBRANK 2010).
- Program Committee, the 2009 International Workshop on Cloud Data Management (CLOUDDB 2009).
- Program Committee, the 2009 IFIP/ACM International Conference on Distributed Systems Platforms (MIDDLEWARE 2009).
- Program Committee, the 2009 ACM SIGMOD International Conference on Management of Data (SIGMOD 2009).
- Program Committee, the 2009 ACM Workshop on Data Engineering for Wireless and Mobile Access (MOBIDE 2009).
- Program Committee, the 2009 International Workshop on Scalable Stream Processing Systems (SSPS 2009).
- Regional Chair (America), the 2009 International Conference on Database Systems for Advanced Applications (DASFAA 2009).
- Program Committee, the 2009 International Conference on World Wide Web (WWW 2009).
- Program Committee, the 2009 International Workshop on Ranking in Databases (DBRANK 2009).
- Program Committee, the 2009 International Conference on Data Engineering (ICDE 2009).
- Program Committee, the 2009 Conference on Innovative Data Systems Research (CIDR 2009).
- Steering Committee Member, International Conference on Web-Age Information Management (WAIM), September 2008 - present.
- General Co-Chair and Program Committee Member, the 2008 International Workshop on Data Management for Sensor Networks (DMSN 2008).
- Program Committee, the 2008 International Conference on Information and Knowledge Management (CIKM 2008).
- Program Committee, the 2008 ACM Workshop on Data Engineering for Wireless and Mobile Access (MOBIDE 2008).
- Program Committee, the 2008 International Conference on Web-Age Information Management (WAIM 2008).
- Program Committee, the 2008 IEEE International Conference on Computational Science and Engineering (CSE 2008).
- Program Committee, the 2008 International Workshop on Scalable Stream Processing Systems (SSPS 2008).
- Program Committee, the 2008 International Conference on Very Large Data Bases (VLDB 2008).
- Program Committee, the 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD 2008).
- Program Committee, the 2008 International Conference on Data Engineering (ICDE 2008).
- Program Committee Co-Chair, the 2007 International Workshop on Data Management for Sensor Networks (DMSN 2007).
- Demonstration Program Committee, the 2007 International Conference on Very Large Data Bases (VLDB 2007).
- Program Committee, the 2007 International Conference on Scalable Information Systems (INFOSCALE 2007).
- Program Committee, the 2007 International Symposium on Large Spatio-Temporal Databases (SSTD 2007).
- Program Committee, the 2007 Joint Conference of the Asia-Pacific Web Conference and the International
Conference on Web-Age Information Management (APWEBWAIM 2007).
- Program Committee, the 2007 ACM SIGMOD International Conference on Management of Data (SIGMOD 2007).
- Program Committee, the 2007 ACM SIGMOD International Conference on Management of Data (SIGMOD 2007) Ph.D. Workshop on Innovative Database Research.
- Program Committee, the 2007 Workshop on Networking Meets Databases (NETDB 2007).
- Program Committee, the 2007 International Workshop on Scalable Stream Processing Systems (SSPS 2007).
- Program Committee, the 2007 International Conference on Data Engineering (ICDE 2007).
- Program Committee, the 2006 International Conference on Information and Knowledge Management (CIKM 2006).
- Program Committee, the 2006 International Conference on Geosensor Networks (GSN 2006).
- Program Committee, the 2006 International Workshop on Data Management for Sensor Networks (DMSN 2006).
- Program Committee, the 2006 International XML Database Symposium (XSYM 2006).
- Program Committee, the 2006 International Conference on Very Large Data Bases (VLDB 2006) Ph.D. Workshop.
- Program Committee Co-Chair, the 2006 Southeast Workshop on Data and Information Management (SEWDIM 2006).
- Program Committee, the 2006 International Conference on Web-Age Information Management (WAIM 2006).
- Program Committee, the 2005 International Conference on Data Mining (ICDM 2005).
- Program Committee, the 2005 ACM International Workshop on Web Information and Data Management (WIDM 2005).
- Program Committee, the 2005 ACM SIGMOD International Conference on Management of Data (SIGMOD 2005).
- Program Committee, the 2005 International XML Database Symposium (XSYM 2005).
- Program Committee, the 2005 International Conference on Very Large Data Bases (VLDB 2005) Ph.D. Workshop.
- Program Committee, the 2005 International Conference on Database Systems for Advanced Applications (DASFAA 2005).
- Publications Chair, the 2005 International Conference on Web-Age Information Management (WAIM 2005).
- Program Committee, the 2004 International Conference on Data Mining (ICDM 2004).
- Program Committee, the 2004 International Conference on Very Large Data Bases (VLDB 2004).
- Program Committee, the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD 2004).
- Demonstration Program Committee, the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD 2004).
- Participant in the Summer Workshop on Developing the Field of Computational Journalism,
Center for Advanced Study in Behavioral Sciences, Stanford, California, July 2009.
- Panelist for NSF, IIS Division, 2003, 2004, 2005, 2008,
2009, 2010, 2011.
- Panelist for Department of Homeland Security, 2006.
- Expert Panelist on Cancer Reporting Information Technology, Office of the Assistant
Secretary for Planning and Evaluation, Department of Health and Human Services, 2008 - 2009.
- Reviewer for Research Grants Council of Hong Kong, 2010, 2012.
- Reviewer for Natural Sciences and Engineering Research Council of Canada, 2008.
- Reviewer for Netherlands Organisation for Scientific Research, 2006.
- Associate Information Director, ACM SIGMOD, 2003 - present.
- Started the Carolina Database Research Group (CDB) in 2003 with a group of database
researchers in North Carolina and continue to be one of the main organizers.
- Publicity Chair, the 2004 International Conference on Mobile Data Management (MDM 2004).
- Reviewer for journals:
ACM Transactions on Database Systems (TODS),
The VLDB Journal (VLDBJ),
IEEE Transactions on Knowledge and Data Engineering (TKDE),
ACM Transactions on Programming Languages and Systems (TOPLAS),
ACM SIGMOD Record (SIGMODREC),
The Computer Journal (CJ),
Information and Computation (IC),
Information Processing Letters (IPL),
IEEE Transactions on Mobile Computing (TMC),
Data and Knowledge Engineering (DKE),
IEEE Internet Computing (INTERNET),
Information and Software Technology (IST),
Journal of Systems and Software (JSS),
Knowledge and Information Systems (KAIS),
Ad Hoc and Sensor Wireless Networks (AHSWN),
Journal of Research and Practice in Information Technology (JRPIT),
Journal of Computer Science and Technology (JCST),
Distributed and Parallel Databases (DPDB),
International Journal of Computer Systems Science and Engineering (CSSE),
LNCS Journal on Data Semantics (JODS),
Electronics and Telecommunications Research Institute Journal (ETRI),
Proceedings of the IEEE (PIEEE).
- Reviewer for conferences:
ACM SIGMOD International Conference on Management of Data (SIGMOD),
International Conference on Very Large Data Bases (VLDB),
International Conference on Data Engineering (ICDE),
ACM Symposium on Principles of Database Systems (PODS),
International Conference on World Wide Web (WWW),
International Conference on Information and Knowledge Management (CIKM),
International Workshop on the Web and Databases (WEBDB),
ACM Symposium on Cloud Computing (SOCC),
International Symposium on Theoretical Aspects of Computer Science (STACS),
European Symposium on Algorithms (ESA),
International Conference on Distributed Computing Systems (ICDCS),
International Conference on Mobile Systems, Applications, and Services (MOBISYS),
USENIX Annual Technical Conference (USENIX),
ACM Symposium on Parallel Algorithms and Architectures (SPAA).
- Designer of the ACM SIGMOD logo, IEEE Data Engineering logo, Stanford InfoLab's old logo,
VLDB 2011 logo, and a number of others.
Service to Duke University and the Department of Computer Science:
- Director of Graduate Studies, Department of Computer Science, Duke University, July 2008 - June 2012.
- Chair of Graduate Recruiting/Admissions Committee,
Department of Computer Science, Duke University, 2007 - 2008 and 2008 - 2009.
- Member of Faculty Search Committee, Department of
Computer Science, Duke University, 2001 - 2005 and 2006 - 2007.
- Member of Graduate Program Committee, Department of
Computer Science, Duke University, 2008 - present.
- Member of the inDuke Steering Committee, Department of
Computer Science and School of Engineering, Duke University, 2005 - 2010.
- Member of Communications Committee, Department of
Computer Science, Duke University, 2005 - 2011.
- Member of Computing Infrastructure Committee, Department of Computer
Science, Duke University, 2006 - 2010 and 2012 - present.
- Triangle Computer Science Distinguished Lecture Series
Chair, Department of Computer Science, Duke University, 2002 - 2007.
- Colloquium Chair, Department of Computer Science, Duke University, 2002 - 2003 and 2004 - 2007.
- Member of Ph.D. Admissions Committee, Department of
Computer Science, Duke University, 2001 - 2002, 2004 - 2005, 2005 - 2006, 2009 - 2010, and 2011 - 2012.
Other activities:
- Member of UC Berkeley Putnam Math Competition
Team, 1993 - 1994.
- Member of UC Berkeley Regents' and Chancellor's
Scholars Association, 1993 - 1995.