DARPA YEARLY TECHNICAL REPORT
Technical topic area: Ultrascale Computing
Contractor's type of Business: Educational
Date of Report: July 15, 1999
Period Covered By This Report: Sept. 1, 1998- July 15, 1999
Project Title: Prototyping Biomolecular Computations
Organization: Duke University
AO Number: F359
Contract Number: NSF Grant CCR-9725021
Start Date: 01 SEPT 1997
End Date: 31 AUG 2000
Principal Investigator
Name: John Reif
Address: Department of Computer Science
City, State Zip:
Durham,, NC 27707
Phone: 919-660-6568
Fax: 919-660-6519
Email: reif@cs.duke.edu
Level Of Participation: 38%
Financial POC
Name: Patricia Kimbrough
Address: Office of Research Support, Duke University
City, State Zip: Durham, NC 27707
Phone: 919-684-3030
Fax: 919-684-2418
Email: patricia.kimbrough@duke.edu
Project URLs:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.DARPA.FY99.html
Also:
http://www.cs.duke.edu/~reif/BMCA.html and http://bmc.cs.duke.edu/
DARPA FY99 quadchart URL:
http://www.cs.duke.edu/~reif/BMC/quadcharts/quadchart.FY99.gif
http://www.cs.duke.edu/~reif/BMC/quadcharts/quadchart.FY99.ppt http://www.cs.duke.edu/~reif/BMC/quadcharts/quadchart.FY99.ppt4.0
Objective:
The overall goal is to develop and demonstrate Biomolecular Computation (BMC) employing bio-technology to do massive parallel computing at the molecular scale (also known as "DNA computing"). This project leverages well established bio-technology, in particular recombinant DNA techniques, appropriately modified to insure error resiliency.
Applications: The BMC applications include solution of computationally hard problems that are far beyond the most optimistic extrapolations of current (silicon-based) systems. BMC applications also include DNA cryptography systems that are potentially unbreakable. The applications to biology include fingerprinting and re-coding of natural DNA into "wet" DNA databases that can be further processed by BMC techniques.
Tasks: The key tasks are experimental demonstrations, construction of new 2D and 3D nano-structures, applications to biology, and software tools and algorithmic development (employing realistic models of recombinant DNA techniques to optimize the experimental protocols and minimize errors).
Experimental Demonstrations: The BMC experimental demonstrations include moderate scale solution of computationally hard problems. A number of our BMC experiments encode large amounts of binary data (e.g., possible solutions to search problems) within distinct DNA strands, and then process this data in massively parallel fashion to execute arithmetic and logical operations on the data. The nano-structures constructed in experimental demonstrations consist of DNA crossover molecules that self assemble into large lattices that can execute computations, as well as DNA molecules that reconfigure for possible use as motors.
Approach:
Our approach emphasizes development of the underlying technology for BMC, development of specific approaches to challenge problems and demonstrations to test the basic capabilities and limitations of the bio-technology. For overviews of our Project, see:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.DARPA.FY99.html
Also:
http://www.cs.duke.edu/~reif/BMCA.html and http://bmc.cs.duke.edu/
Consortium on Biomolecular Computing and Applications
Our approach is collaborative, with leveraging on the collaborators distinctive capabilities. The collaborations are also focusing on topics of key concern: word design, error reduction, and scalability. For information on our recent consortium meeting, see:
http://www.cs.duke.edu/~reif/BMC/BMC.consortium/consortium.html
The key tasks are as follows:
(Task 1) EXPERIMENTAL DEMONSTRATIONS:
The most important task of the project is small to moderate scale prototype demonstrations and implementations. This involves word design and register design for storage and retrieval of logical and numeric information in massively parallel memories, where the registers encode large amounts of binary data within distinct DNA strands. We are using word designs and other methods to improve error resiliency. The experimental demonstrations of BMC include: massively parallel execution of basic operations such as logical and arithmetic operations, and the sequential chaining of these operations. To decrease overall risk, we are pursuing two distinct recombinant DNA techniques for this key task: (a) solution-based methods using the ligase chain reaction and (b) surface-based methods.
The BMC experimental demonstrations also include solution of computationally hard NP search problems using distinct DNA strands to encode distinct possible solutions to the search problems and using the massively parallel processing available via BMC to verify which is a correct solution to the search problem.
Also, we are testing various DNA word designs to minimize mismatches and insure error resiliency, and doing experiments to access the accuracy of high fidelity DNA separation methods. Furthermore, we are investigating input of data and developing 2D readout of results and interfacing with silicon- based computers.
(Task 2) NANO_CONSTRUCTION:
We are developing new methods for nano-assembly of computational structures. The nano-structures constructed consist of DNA crossover molecules (tiles) that have sticky ends that match the sticky ends of other DNA tiles. The DNA tiles self assemble into large lattices that can execute computations. We are doing experimental tests of computation by self-assembly of DNA nanostructure tilings. The key advantage of this approach is that the self-assembly sidesteps time consuming laboratory steps required by other methods for BMC. These assembly methods can be executed in massively parallel fashion, with concurrent assemblies, each with a (possibly) distinct input strand. We will test our approach by doing massively parallel arithmetic operations by this self-assembly method. We are currently engaged in various steps required to achieve this goal, including (i) construction of trial inputs by self-assembly of dsDNA (double stranded DNA) segments from ssDNA (single stranded DNA), (ii) design and construction of DNA tiles, and (iii) design and construction of DNA tilings for massively parallel arithmetic computations. Massively parallel DNA assemblies for integer addition are now being tested. This will be followed by massively parallel DNA assemblies for integer multiplication with random inputs, which will give solution of integer factorization problems (used for decryption of the RSA crypto-system). We are also developing DNA molecules that reconfigure for possible use as motors.
(Task 3) APPLICATIONS:
In addition to the solution of computationally hard problems, other BMC applications include DNA cryptography systems that are potentially unbreakable. We are also investigating biological applications such as re-coding of natural DNA using nonstandard bases. The re-coded natural DNA includes digital data to form an encoded library. The re-coded natural DNA can be processed in subsequent BMC steps to do massively parallel data base queries. The principal advantage of this approach is that we avoid competing with electronic machines, because they can not solve this analog type of problem. In particular, we can avoid the very time consuming step of sequencing the natural DNA into conventional digital electronic media. We are also investigating other biological applications such as DNA fingerprinting which can benefit from BMC techniques.
(Task 4) SOFTWARE TOOLS AND MATHEMATICAL MODELS:
Software tools are being developed for simulation and design of BMC experiments. The software tools employ realistic models of recombinant DNA techniques to optimize the experimental protocols and minimize errors. The associated mathematical models are being developed for modeling recombinant DNA operations at multiple levels of detail from high level down to the kinetics of reactions. We are also developing efficient BMC algorithms in the context of these mathematical models, which can lead to more efficient BMC experimental protocols.
Recent Accomplishments:
SUMMARY OF RECENT PROGRESS AND ACCOMPLISHMENTS
Sept. 1, 1998-April 30, 1999
FY89-99 PUBLICATIONS: (64 Papers, 2 Books)
Sept. 1, 1998-July 15, 1999 Publications:
http://www.cs.duke.edu/~reif/BMC/BMCpub/BMCpub99.htmlprevious 1998 Publications:
http://www.cs.duke.edu/~reif/BMC/BMCpub/BMCpub98.html
TASK BREAKDOWN:
(Task 1) EXPERIMENTAL DEMONSTRATIONS:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.tasks/Task1.html
(Task 2) NANO-CONSTRUCTION:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.tasks/Task2.html
(Task 3) APPLICATIONS:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.tasks/Task3.html
(Task 4) SOFTWARE TOOLS AND MATHEMATICAL MODELS:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.tasks/Task4.html
Details Given for each Task:
(Task 1) EXPERIMENTAL DEMONSTRATIONS:
For details and URLs see:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.tasks/Task1.htmlSUMMARY OF RECENT PROGRESS:
We executed laboratory demonstrations of massively parallel memory operations, using two distinct recombinant DNA techniques:
(1.1A)Executed solution based DNA computation operations, completing a laboratory demonstration of fabrication of DNA logic registers, and executing a reversible (Toffoli) universal Boolean logic operation using the ligase chain reaction. This is the first demonstration of a reversible gate via DNA computation, which may be very significant for low energy applications.
We also significantly improved the performance of the `sticker' machine for molecular computation by building the basis for a `solid state' device in which DNA is confined to acrylamide gels throughout the computation.
(1.1B) Executed surface based DNA computation operations, encoding bits by use of DNA hybridization chemistry, up to 4 bits, and solved a small example of the Satisfiability (SAT) problem.
(1.2) Experimentally tested a RNA protocol for solving a SAT problem known as the "Knight Problem" on a 3 x 3 chess board using sequential RNase H digestions, which corresponds to a 9-variable SAT problem. We also carried out the design of a 25-bit RNA library with the specific aim of extending our experimental protocol to address larger, more challenging NP-complete problems.
(1.3)Successfully tested experimental designs to decrease errors, including DNA word designs.
(1.4) Made an DNA implementation of an evolutionary algorithm.
(Task 2) NANO-CONSTRUCTION:
For details and URLs see:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.tasks/Task2.html
SUMMARY OF RECENT PROGRESS:
(2.1) Designed and experimentally tested in the lab a new DNA tile (TAO35) which is a rectangular shaped triple crossover molecule with sticky ends on each side that can match with other such tiles and with a "reporter" ssDNA sequence that runs through the tile from lower left to upper right, facilitating output of the tiling computation.
(2.2) We made significant progress with experimental implementation of DNA-based computers which perform calculations during self-assembly of specific nanoscale tilings. We successfully prototyped a novel read-in method for tiling-based computers: we have demonstrated for the first time that long single-strand DNA molecules, which we refer to as scaffold strands, effectively serve as nucleation regions for the formation of specific, multi-component tile assemblies. These scaffold strands have been used to generate a variety of tile assemblies and are useful directly as a means of reading specific inputs into current DNA tiling assembly computers. In addition, we have produced detailed designs for an improved DNA-based ("string tile") computer which makes optimal use of local parallelism by producing linear assemblies, thereby avoiding several potential experimental limitations inherent in previous 2-dimensional tile-assembly schemes.
(2.3) Experimentally constructed for the first time large 2D arrays of DNA tiles via self-assembly, where the arrays ranged in size up to thousands of tiles. The tilings were verified by a variety of techniques, including spectacular images using an atomic force microscope(AFM) and also "reporter" ssDNA sequences.
(2.4) Developed DNA molecules that reconfigure for possible use as nano-scale motors; we designed and experimentally tested in the lab a 2-state DNA nano-device that changes shape in response to a chemical stimulus.
(Task 3) APPLICATIONS:
For details and URLs see:
http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.tasks/Task3.html
SUMMARY OF RECENT PROGRESS:
(3.1) Recent research has considered DNA as a medium for ultra-scale computation and for ultra-compact information storage. One potential key application is DNA-based, molecular cryptography systems. We developed procedures for DNA-based cryptography based on one-time-pads that are in principle unbreakable. Since this work constitutes a novel approach to the use of DNA in the area of cryptography, it is expected to be of significant interest to the military and thus to DARPA. We also developed methods for DNA Steganography, which hides DNA in large pools of similar DNA.
(3.2) We experimentally demonstrated key stages of the re-coding of natural DNA using non-standard bases to code digital data. This experimental test is significant if we are to avoid the usual very time consuming task of sequencing natural DNA into conventional binary electronic form.
(Task 4) SOFTWARE TOOLS AND MATHEMATICAL MODELS:
For details and URLs see: http://www.cs.duke.edu/~reif/BMC/reports/BMC.FY99.reports/BMC.tasks/Task4.html
SUMMARY OF RECENT PROGRESS:
(4.1)We investigated the use of micro-flow device technology for BMC, and showed it provides multiple benefits over the current techniques for BMC.
(4.2)Demonstrated key portions of a prototype software simulation system for DNA computation.
(4.3)Developed various improved mathematical models for BMC: based on plasmids, liposomes, and also evolutionary models based on the gene scrambling and RNA editing done by biological systems such as Ciliates.
(4.4) Developed improved methods for evaluating large Boolean circuits using DNA.
DETAILS OF RECENT PROGRESS IN TASKS:
(Task 1) EXPERIMENTAL DEMONSTRATIONS:
DETAILS OF RECENT PROGRESS:
(1.1) DEMONSTRATIONS OF MASSIVELY PARALLEL OPERATIONS
We executed laboratory demonstrations of massively parallel memory operations, using two distinct recombinant DNA techniques:
(1.1A) DEMONSTRATIONS OF SOLUTION BASED DNA COMPUTATIONS
DNA logic gates. Fredkin gates are of fundamental importance in the design of computers because (I) they can function with extremely small amounts of energy (in principle, arbitrarily low energy), and (ii) arbitrary Boolean circuits can be assembled using Fredkin gates. We have shown how to encode information in DNA and use DNA amplification to implement Fredkin gates. We have shown successful readout, and we have demonstrated in the laboratory that the output of our Fredkin gate can then be used as legal input to a second Fredkin gate computation. We completed a laboratory demonstration of fabrication of DNA logic registers. Executed a reversible (Toffoli) universal Boolean logic operation using the ligase chain reaction. This is the first demonstration of a reversible gate via DNA computation, which may be very significant for low energy applications. Tested the efficiency of the reaction by end-labeling the 5' end oligo with 32-P, followed by analysis of the LCR products and detection.
Joshua P. Klien, Thomas H. Leete, and Harvey Rubin. A biomolecular implementation of logically reversible computation with minimal energy dissipation. Also in Peter J. Angeline, Zbyszek Michalewicz, Marc Schoenauer, Xin Yao, and Ali Zalzala, editors, 1999 Congress on Evolutionary Computation. IEEE Computer Society Press, New York, to appear, 1999. Also to appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. . L. Kari, (1999). Extended version of DNA4 and CEC99 papers. BioSystems, to appear.
http://www.cis.udel.edu/~wood/BMC/papers/BioSystems.pdf
http://www.cis.udel.edu/~wood/BMC/papers/cec99.pdfRubin, H., J. Klein, T. Leete, A biomolecular implementation of logically reversible computation with minimal energy dissipation, To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. . L. Kari, (1999).
Note: We had previously used a horizontal chain reaction for DNA-Based addition that did not allow repetitions; see: Guarnieri, F., and C. Bancroft. Use of a Horizontal Chain Reaction for DNA-Based Addition. DIMACS Series in Discrete Mathematics and Theoretical Computer Science. 4:105-111 (1999).
The Sticker Machine for DNA Computation. We also significantly improved the performance of the `sticker' machine for molecular computation by building the basis for a `solid state' device. Our previous research revealed the difficulties of having the information carrying DNA in aqueous solution -- in particular spillage and volume expansion. To overcome these problems, we have designed and implemented a `solid state' method in which DNA is confined to acrylamide gels throughout the computation. During the computation, DNA is passaged from location to location via electrophoresis.
Graphic URLs:
(http://www-scf.usc.edu/~pwkr/BMC/modules.gif, 206 KB) Stackable glass modules containing acrylamide gel (colored blue for visibility). The acrylamide has embedded DNA probes which are covalently bound to the acrylamide via acrydite chemistry. Each module is approximately 1cm long.
(http://www-scf.usc.edu/~pwkr/BMC/plungers.gif, 113 KB) Glass `plungers' containing buffer for electrophoresis. Two plungers shown `head to head'. Blue region shows acrylamide plugs. Vertical shafts are for attaching electrodes and releasing gas.
(http://www-scf.usc.edu/~pwkr/BMC/block.gif, 116 KB) `Solid state' apparatus consisting of two aluminum blocks with central bore into which glass modules and plungers are inserted. During electrophoresis, the left block is heated and the right block is cooled. DNA leaves modules in the left block to migrate toward and be captured in modules in the right block.
Sam Roweis, Erik Winfree, Richard Burgoyne, Nickolas V. Chelyapov, Myron F. Goodman, Paul W.K. Rothemund and Leonard M. Adleman, A Sticker-based Model for DNA Computation, Journal of Computational Biology, V. 5 N. 4, pp. 615-629, 1998.
(A new representation for encoding bits in DNA is presented and examined.)
Leonard M. Adleman, Paul W.K. Rothemund, Sam Roweis and Erik Winfree, On Applying Molecular Computation to the Data Encryption Standard, Journal of Computational Biology, V. 6. N. 1, pp. 53-63, 1999.
(An algorithm for breaking DES using the Stickers model is outlined. Size, space, and error rates of a practical machine designed to implement this algorithm are considered.)
Sam Roweis, Erik Winfree, Richard Burgoyne, Nickolas V. Chelyapov, Myron F. Goodman, Paul W.K. Rothemund and Leonard M. Adleman, A Sticker-based Model for DNA Computation, 26 pages, Journal of Computational Biology, V. 5 N. 4, pp. 615-629, 1998. (http://www-scf.usc.edu/~pwkr/stickers.ps.gz 284 KB ), (http://www-scf.usc.edu/~pwkr/stickers.ps, 1.2 MB), (http://www-scf.usc.edu/~pwkr/stickers.pdf, 694 KB) (A new representation for encoding bits in DNA is presented and examined.) The version available at the links above has a section concerning methods for decreasing separation error rates that is not included in the JCB article cited. Instead this material appears in:
Leonard M. Adleman, Paul W.K. Rothemund, Sam Roweis and Erik Winfree, On the Reduction of Errors in DNA Computation, Journal of Computational Biology V. 6 N. 1, pp. 65-75. On Applying Molecular Computation to the Data Encryption Standard, 21 pages. (http://www-scf.usc.edu/~pwkr/des.ps.gz, 90 KB ), (http://www-scf.usc.edu/~pwkr/des.ps, 211 KB), (http://www-scf.usc.edu/~pwkr/des.pdf, 206 KB) Journal of Computational Biology, V. 6. N. 1, pp. 53-63, 1999. (An algorithm for breaking DES using the Stickers model is outlined. Size, space, and error rates of a practical machine designed to implement this algorithm are considered.)
(1.1B) DEMONSTRATIONS OF SURFACE BASED DNA COMPUTATIONS
We executed surface based DNA computation operations, encoding bits by use of DNA hybridization chemistry. We have developed a prototype DNA computer (by solving a small example of the Satisfiability (SAT) problem) in which a set of DNA molecules encoding all candidate solutions to the computational problem of interest is manipulated while attached to a surface. Successive cycles of MARK (hybridization) and Destroy (exonuclease digestion) steps identify and eliminate members of the set which are not solutions to the problem, followed by the step of identification of the solution to the computational problem using PCR amplification of the remaining molecules and hybridization to an addressed array (READOUT). We demonstrated:
These tests verified the feasibility of this method for surface based DNA computation. We also developed methods for constructing sets of short DNA strands satisfying certain combinatorial or free energy constraints, for use in DNA computing or as molecular bar codes in chemical libraries. We used single-word strands and a single-base per bit encoding strategy with a 4-base mismatch design. We automated the step of making DNA arrays for READOUT and have begun investigations to create arrays with gold films patterned on a silicon surface and also to create a patterned surface on silicon itself for a DNA array fabrication, as well as novel silicon modification chemistry for DNA attachment.
A Slide Show presentation of DNA computing on surfaces:
http://corninfo.chem.wisc.edu/writings/dnatalk/dna01.htmlRecent Graphics:
http://corninfo.chem.wisc.edu/pictures http://corninfo.chem.wisc.edu/pictures/sn2.JPGURL for Surface Based DNA Computation papers:
http://corninfo.chem.wisc.edu/writings/DNAcomputing.html
Wang, L., Q. Liu, A. Frutos, S. Gillmor, A. Thiel, T. Strother, A. Condon, R. Corn, M. Lagally, L. Smith, Surface-based DNA computing operations: DESTROY and READOUT, To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. L. Kari, (1999).
Frutos, A.G; Smith, L.M. and Corn, R.M. "Enzymatic Ligation Reactions of DNA "Words" on Surfaces for DNA Computing" 1998, J. Am. Chem. Soc., 120, 10277-10282.
Q. Liu, A. G. Frutos, L. Wang, A. E. Condon, R. M. Corn, L. M. Smith, DNA Computing on Surfaces, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999). To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999), submitted to Science, 1999.
Gillmor, S.D.; Liu, Q.; Thiel, A.J.; Condon, A.E.; Corn, R.M.; Smith, L.M. and Lagally, M.G. "Hydrophobic/Hydrophilic Patterned Surfaces to Create DNA Arrays" to be submitted (to Langmuir), 1999.
Liu, Q.; Wang, L. and Smith, L.M. "Identification of DNA Molecules on Surfaces by Array Hybridization for DNA Computing" (in preparation), 1999.
Strother, T.C.; Votruba, P.G.; Corn, R.M.; Hamers, R.J. and Smith, L.M. "Synthesis and Characterization of DNA-Modified Silicon(111) Surfaces" (in preparation), 1999.
Zhao, X.; Votruba, P.G.; Strother, T.C.; Ellison, M.D.; Smith, L.M. and Hamers, R.J. "Formation of Organic Films on Si(111) Surfaces via Photochemical Reaction" (in preparation), 1999.
Brockman, J.M.; Frutos, A.G. and Corn, R.M. "A Multi-Step Chemical Modification Procedure to Create DNA Arrays on Gold Surfaces for the Study of Protein-DNA Interactions with Surface Plasmon Resonance Imaging", (in preparation), 1999.
Frutos, A.G.; Weibel, S.G. and Corn, R.M. "Near Infrared Surface Plasmon Resonance Measurements of Ultrathin Films: Fourier Transform SPR Spectroscopy" (in preparation), 1999.
(1.2) DNA SOLUTION OF INTRACTABLE PROBLEMS:
RNA solutions to SAT problems
Developed and experimentally tested a RNA protocol for solving a SAT problem known as the "Knight Problem" on a 3 x 3 chess board using sequential RNase H digestions. Each configuration of knights on the chessboard was encoded on an RNA sequence. On each round we perform: selection, single bit operations, and destroying (using thermostable RNase H) RNA molecules which did not satisfy a criterion for this problem. The DNA oligonucleotides were removed by column chromatography, the eluates mixed and subjected to further rounds of selection and destruction. The remaining RNA pool was purified by PAGE after completion of the computation process. Isolated RNA was reverse transcribed, PCR amplified and cloned. The key advantage of this method is that it is scalable to well over 30 variables.
Most significantly, we have successfully refined the RNA mark and destroy algorithm which we introduced last year, to solve a 9-bit instance of a Satisfiability (SAT) problem derived from Chess (a 3 x 3 version of the Knight problem). The method is general for any type of SAT problem. Using the enzyme Ribonuclease (RNase) H to specifically cleave targeted RNA strands, we demonstrated that this effectively allows creation of a universal RNA restriction endonuclease, which means that the method is inherently scalable. We have also carried out the design of a 25-bit RNA library with the specific aim of extending our experimental protocol to address larger, more challenging NP-complete problems.
Powerpoint Slideshow on RNA solutions to general SAT problems:
http://rnaworld.princeton.edu/~fadirk/slides_html http://rnaworld.princeton.edu/~fadirk/Knight.htmlURL for papers:
ftp://rnaworld.princeton.edu/pub/export/Cukras, A. R., Faulhammer, D., Lipton, R. J. and L. F. Landweber, Chess Games: A Model for RNA Based Computation. In DNA Based Computers IV, L. Kari, ed. , (1999).
ftp://rnaworld.princeton.edu/pub/export/Knights.ps( paper describes RNA solutions to the Knight problem (placement of knights on a 3 x 3 board)
Faulhammer, D. , A. R. Cukras, R. J. Lipton, L. F. Landweber, When the Knight Falls: On Constructing an RNA Computer, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999).To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999).
Landweber, L. F. , RNA Based Computing. In DNA Based Computers II, L. F. Landweber and E. B. Baum, eds. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, American Mathematical Society, 181-189, (1999).
Faulhammer, D., Lipton, R. J. and L. F. Landweber (in press) Counting DNA: Estimating the complexity of a test tube of DNA. In DNA Based Computers IV, L. Kari, ed. (199).
(1.3) DECREASING ERRORS IN BMC.
We developed experimental designs to decrease errors, including DNA word designs:
Designed and tested short DNA words that can be used to encode digital information and to enable DNA operations (for example, based on hybridization) to be executed robustly. We also tested a new RNA method for creating libraries of words and the construction of a 10-bit library by an innovative mix and split evolutionary method which optimized for word dissimilarity (e.g., Hamming distance). Our evolutionary method for creating libraries will enable us to design a 25-bit library, allowing scalability of the methods for RNA solutions to large SAT problems as reported above in 1.2B.
Streamlined Modular Construction of 10-, 25-, and N-Bit Combinatorial Libraries. We synthesized a DNA library encoding all strands of the ten-bit RNA library, as only two halves. The first half contained bits one through six, and the second half contained the complements of bits six through ten. To insert bits 11 through 25, we will synthesize three additional sub-libraries containing bits 10 - 15, the complements of bits 15 - 20, and bits 20 through 25, each in a single step. Note that this approach scales linearly to synthesize any n-bit library from n/5 or n/6 pieces by generating six or seven bits at a time (with one-position overlap). In order to construct each modular piece of the combinatorial library without synthesizing each strand individually, we devised a novel mix and split synthesis. Bit j set to 0 was synthesized on one column and bit j set to 1 was synthesized on another column. Both columns were then decrimped and their contents mixed. Then half of this pool was returned to each column and placed back on the DNA synthesizer. The partially synthesized molecules were further extended by synthesis of the spacer and the next bit set to 0 on one column, and the spacer and the next bit set to 1 on the other column, and the process repeated. In this way, linear mixing and splitting steps yield exponential increases in pool complexity. Primer overlap extension of the two halves created the full-length ten-bit library, since the pieces are complementary at bit position six. Direct DNA sequencing confirmed the expected degeneracy at each bit, and in vitro transcription of the DNA library yielded the RNA library. Similar annealing and extension of all five modular pieces will correspondingly generate the 25-bit library from these streamlined components.
Marathe, A., A. E. Condon, R. M. Corn , On Combinatorial DNA Word Design, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999).To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999). submitted (to J. Comput. Biol.), 1999.
http://corninfo.chem.wisc.edu/writings/codepaper.htmlFaulhammer, D., Lipton, R. J. and L. F. Landweber (in preparation for J. Comp. Bio.) Fidelity of enzymatic ligation in DNA computing. (1999).
(1.4) DNA Implementation of Evolutionary Algorithms
Analogs of natural evolution have been used in computation (Genetic Algorithms are examples) and in Molecular Biology (In Vitro Evolution is an example). There have been many calls in the literature to combine these two approaches to gain:
We have carried out preliminary laboratory experiments demonstrating "selection according by fitness". This is the only example we know of physically separating a combinatorially encoded library of DNA strands according to global criteria.
Junghuei Chen, Eugene Antipov, Bertrand Lemieux, Walter Cedeno, and David Harlan Wood. DNA computing implementing genetic algorithms. In Laura Landweber, Erik Winfree, Richard Lipton, and Stephan Freeland, editors, Preliminary Proceedings DIMACS Workshop on Evolution as Computation, pages 39-49, DIMACS, Piscataway NJ, January . (1999).
http://www.cis.udel.edu/~wood/BMC/papers/UDel_DIMACS_99.pdfEugene Antipov. A Max 1s problem in DNA computing via genetic algorithms. In Wolfgang Banzhaf, A. E. Eiben, Max H. Garzon, Vasant Honavar, Mark Jakiela, and Robert E. Smith, editors, Late-Breaking Papers of the Genetic and Evolutionary Computation Conference, July 13-17, 1999, Orlando, Florida USA., San Francisco, 1999. Morgan Kaufman.
http://www.cis.udel.edu/~wood/BMC/papers/REU.pdfChen, J., E. Antipov, B. Lemieux, W. Cedeno, D.H. Wood, In vitro Selection for a Max 1s DNA Genetic Algorithm, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999). To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999). In Wolfgang Banzhaf, A. E. Eiben, Max H. Garzon, Vasant Honavar, Mark Jakiela, and Robert E. Smith, editors, GECCO-99: Proceedings of the Genetic and Evolutionary Computation Conference, July 13-17, 1999, Orlando, Florida USA., San Francisco, 1999. Morgan Kaufman.
http://www.cis.udel.edu/~wood/BMC/papers/gecco-99.pdf http://www.cis.udel.edu/~wood/BMC/papers/dna5.pdf http://www.cis.udel.edu/~wood/BMC/papers/dna5.psgraphics:
http://www.cis.udel.edu/~wood/BMC/graphics/dna5readout.pdf http://www.cis.udel.edu/~wood/BMC/graphics/dna5readout.psL. Kari, L. F. Landweber , Computational Power of Gene Rearrangement, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999).To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999).
(TASK 2) NANO-CONSTRUCTION:
DETAILS OF RECENT PROGRESS:
(2.1): DNA NANOSTRUCTURES. Designed and experimentally tested in the lab a new DNA tile (TAO35) which is a rectangular shaped triple crossover molecule with sticky ends on each side that can match with other such tiles and with a "reporter" ssDNA sequence that runs through the tile from lower left to upper right, facilitating output of the tiling computation. This tile and its unique properties will be key to our subsequent experiments in massively parallel arithmetic using self-assembly of DNA tiles.
LaBean, T. H., H. Yan, J. H. Reif and N. Seeman, Construction and Analysis of a DNA Triple Crossover Molecule, submitted for publication, (1999).
(2.2) DNA COMPUTATIONS using Self Assembly of Tilings
We have made significant progress with experimental implementation of DNA-based computers which perform calculations during self-assembly of specific nanoscale tilings. We have successfully prototyped a novel read-in method for tiling-based computers: we have demonstrated for the first time that long single-strand DNA molecules, which we refer to as scaffold strands, effectively serve as nucleation regions for the formation of specific, multi-component tile assemblies. These scaffold strands have been used to generate a variety of tile assemblies and are useful directly as a means of reading specific inputs into current DNA tiling assembly computers. In addition, we have produced detailed designs for an improved DNA-based ("string tile") computer which makes optimal use of local parallelism by producing linear assemblies, thereby avoiding several potential experimental limitations inherent in previous 2-dimensional tile-assembly schemes.
LaBean, T. H., E. Winfree, J. H. Reif, Experimental Progress in Computation by Self-Assembly of DNA Tilings, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999). To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999).
http://www.cs.duke.edu/~thl/tilings/labean.psGraphics and Slides:
http://www.cs.duke.edu/~reif/paper/DNAtiling/tiling.slides/index.htm http://www.cs.duke.edu/~thl/tilings/TileTalkSlides.hqxLagoudakis, M. G., T. H. LaBean, 2D DNA Self-Assembly for Satisfiability, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999). To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999).
Reif, J.H., Local Parallel Biomolecular Computation, DIMACS Workshop on DNA Based Computers, Series in Discrete Mathematics and Theoretical Computer Science, vol. 44 , American Mathematical Society, ed. H. Rubin, published Sept. 1998. Postscript versions of this paper and its figures are at
http://www.cs.duke.edu/~reif/paper/Assembly.ps http://www.cs.duke.edu/~reif/paper/Assembly.fig.ps(2.3)
Self-assembly of 2D LATTICES of DNA TilesWe experimentally constructed for the first time large 2D arrays of DNA tiles via self-assembly. The arrays ranged in size up to thousands of tiles. These DNA tilings used identical or nearly identical DNA tiles. The tilings were verified by a variety of techniques, including spectacular images using an atomic force microscope(AFM) and also "reporter" ssDNA sequences. This provides the first experimental evidence of the feasibility of the nano-construction of large arrays via self-assembly of DNA tiles. In particular, a system of two double-crossover (DX) units was designed to self-assemble into a periodic 2D lattice. A hairpin sequence was inserted into one of the units to provide a topographic contrast agent, resulting in 25 nm stripes in the lattice. Winfree have confirmed both lattices by atomic force microscopy (AFM), with supporting evidence from ligation studies. In Seeman's lab, a similar system of two larger DX units has produced lattices with 32 nm stripes; the design was elaborated to a four-unit system producing 64 nm stripes, clearly demonstrating the programmable nature of the DX system. These results were reported in fall, 1998 in (Winfree, Liu, Wenzler, Seeman, 1998).
*Spectacular AFM Images of the first Large Scale DNA Tilings are at:
http://seemanlab4.chem.nyu.edu/two.d.html
Also see:
http://seemanlab4.chem.nyu.edu/abstar.gif http://seemanlab4.chem.nyu.edu/fl32nm.gif http://seemanlab4.chem.nyu.edu/abcdstar.gif http://seemanlab4.chem.nyu.edu/fl64nm.gif http://seemanlab4.chem.nyu.edu/12.gifFurther Recent 1999 Progress:
-We have built planar arrays from triple crossover molecules. This demonstrates that previous results can be extended to triple crossover molecules.
-We have built arrays from triple crossover molecules containing roughly orthogonal components as precursor to constructing 3D arrays. This demonstrates that the gaps in triple crossover arrays can be filled with other multi-crossover molecules, and the feasibility of using this approach to get to 3D.
- We have built arrays from parallelograms of single-crossover components. This demonstrates that a new, and possibly simpler, motif is available for use as DNA cellular automata.
- We have constructed single-crossover molecules from complexes containing 5',5' and 3',3' linkages. This demonstrates that head-to-head and tail-to-tail stands can be incorporated in arrays. These are likely to be components of molecules that exhibit sequence-dependent transitions.
Recent Paper abstracts by Seeman:
http://seemanlab4.chem.nyu.edu/darpa.0499.2.htmlE. Winfree, F. Liu, Lisa A. Wenzler, N. C. Seeman, Design and Self-Assembly of Two Dimensional DNA Crystals, Nature 394: 539--544, 1998. (1998).
N.C. Seeman, Nucleic Acid Nanostructures and Topology. Angewandte Chemie. 110, 3408-3428 (1998); Angewandte Chemie International Edition 37, 3220-3238, (1998).
F. Liu, R. Sha and N.C. Seeman, Modifying the Surface Features of Two-Dimensional DNA Crystals, Journal of the American Chemical Society 121, 917-922 (1999). (Demonstrates that patterns generated by self-assembly can be modified using standard DNA biotechnology. Will permit attachment of new chemistries at particular places to mark particular features in an assembly.)
R. Sha, F. Liu, M.F. Bruist and N.C. Seeman, Parallel Helical Domains in DNA Branched Junctions Containing 5', 5' and 3', 3' Linkages, Biochemistry 38, 2832-2841 (1999). (Demonstrates that it is possible to include 5', 5' and 3', 3' linkages in branched DNA motifs. This is important, because it appears that these elements are key to the successful construction of molecules that can undergo sequence dependent mechanical transitions.)
(2.2) DNA NANOMOTORS. We also developed DNA molecules that reconfigure for possible use as nano-scale motors. We designed and experimentally tested in the lab a 2-state DNA nano-device that changes shape in response to a chemical stimulus. In particular, we constructed a molecular device that consists of two rigid DNA double crossover (DX) molecules connected by 4.5 double helical turns; one domain of each DX molecule is attached to the connecting helix. We have used the B-Z transition as the basis for the structural alteration we wish to induce. It changes shape predicated on this B-Z transition. We detected relative strand position changes by fluorescence resonance energy transfer(FRET), because the separation of two dyes attached to the device changes as a consequence of the transition. We obtained FRET evidence that we can cycle this device. Control measurements were performed to confirm this finding. This provides the first experimental methodology for nano-construction of a DNA motor. We are beginning electrophoretic mobility characterization of sequence-independent identity testing (SIIT), complete SIIT characterization by electrophoretic mobility, and characterize SIIT by FRET.
See Figure of DNA Motor at:
http://seemanlab4.chem.nyu.edu/device.jpegFor more details and graphics see:
http://seemanlab4.chem.nyu.edu/device.htmlC. Mao, W. Sun, Z. Shen and N.C. Seeman, A DNA Nanomechanical Device Based on the B-Z Transition, Nature 397, 144-146 (1999).
(Demonstrates that we can control DNA molecular transitions, and that we can detect those same motions. We can now apply the same methodology to sequence-dependent transitions, once we get the chemistry to work. The importance of sequence-dependent transitions (the molecular equivalent of a Dirac delta function for a given sequence) for DNA computing and nano-manufacturing is evident.)
(2.3) Assembly Simulations Using Plastic Tiles. A new system for performing computation via self-assembly was explored. Plastic tiles 1 cm on a side self-assemble to form segments of the famous Penrose tiling. This tiling has been used to model the structure of quasicrystals. These tiles form aperiodic patterns that correspond to Fibonacci sequences--the self-assembly of the tiles 'computes' the sequences.
Graphics URL:
Self-assembling Penrose tiles (http://www-scf.usc.edu/~pwkr/BMC/penrose.gif, 75 KB)
Paul W.K. Rothemund and Leonard M. Adleman, Programmed self-assembly using lateral capillary forces, submitted to Nature, 1999.
(Task 3) APPLICATIONS:
DETAILS OF RECENT PROGRESS:
(3.1) DNA Cryptography and Steganography
Recent research has considered DNA as a medium for ultra-scale computation and for ultra-compact information storage. One potential key application is DNA-based, molecular cryptography systems. Since this work constitutes a novel approach to the use of DNA in the area of cryptography, it is expected to be of significant interest to the military and thus to DARPA.
We developed some procedures for DNA-based cryptography based on one-time-pads that are in principle unbreakable. Practical applications of cryptographic systems based on one-time-pads are limited in conventional electronic media, by the size of the one-time-pad; however DNA provides a much more compact storage media, and an extremely small amount of DNA suffices even for huge one-time-pads. We detailed procedures for two DNA one-time-pad encryption schemes: (i) a substitution method using huge libraries of distinct pads, each of which defines a specific, randomly generated, pair-wise mapping; and (ii) an XOR scheme utilizing molecular computation and indexed, random key strings. These methods can be applied either for the encryption of (appropriately recoded) natural DNA, and also for the encryption of DNA encoding binary data. In the latter case, we also present a novel use of chip-based DNA micro-array technology for 2D data input and output. (Note: The above LaBean, et al paper also critically examined a large class of DNA steganography systems, and showed that in certain cases such schemes may not cryptographically secure, and can be broken with certain assumptions on the entropy of the plaintext messages.)
Gehani, A., T. H. LaBean, J. H. Reif, DNA-based Cryptography, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999). To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999).
www.cs.duke.edu/~reif/paper/DNAcrypt/crypt.pswith Nice Graphic illustrations of DNA Cryptography:
http://www.cs.duke.edu/~reif/paper/DNAcrypt/crypt.talk/crypt.talk.web/crypt.talk.html
http://www.cs.duke.edu/~reif/paper/DNAcrypt/crypt.talk/crypt.talk.pdfDNA Steganography
The work is in a sub-category of cryptography, called steganography, and involves hiding a (possibly encrypted) message in such a way that an observer (the "enemy") cannot detect or read it. We investigated DNA steganography systems, which secretly tag the input DNA and then disguise it (without further modifications) within collections of other DNA. The below paper developed and executed experimentally a novel approach to using a DNA steganographic technique in which a DNA-encoded message is hidden within human genomic DNA.
Advantage is taken of the enormous complexity of the genomes of multi-cellular organisms (we have employed the human genome as an example) to hide a secret DNA-encoded message in the midst of this complexity. A text message is encoded into DNA according to some simple encryption scheme (in this simplest model case, it is not necessary that this encoding be difficult to crack; both because encryption is not the major basis for keeping the enemy from reading the secret text, and because the enemy is very unlikely to ever see the secret message and thus will not even have an opportunity to attempt to break the code employed for encryption). A "Secret Message" (SM) DNA molecule is then synthesized containing three parts: the secret message in the middle, flanked on each side by specific primer sequences known only to the sender ("Alice") and the recipient ("Bob"). A small amount of the SM DNA is then added to an appropriately treated human DNA sample, and sent from Alice to Bob. To read the message, Bob carries out PCR on the DNA sample he has received, employing a primer pair complementary to the two primer sequences described above. (Although this might appear similar to the classical "one-time pad" cryptographic technique, it is in fact reusable, and is thus a more generally useful procedure than the one-time pad technique, because the same primer sequences (and encryption key) could be employed on many separate occasions for communications between Alice and Bob. This is because the enemy does not know that the human DNA sample contains a secret message; more importantly, even if he did know this, he would have no way to identify the SM DNA and thus read the sequences of either the primers or the encoded message). Bob then recovers the PCR-amplified DNA product, reads the sequence of the SM DNA molecule, and then employs the simple encryption key (also agreed upon in advance by Alice and Bob) to decode and read the message that is encoded in the DNA sequence residing between the primers.
Bancroft,C. C.T. Clelland, V. Risca, Genomic Steganography: Amplifiable Microdots, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999). To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999).
(3.1) DNA2DNA Computations
We experimentally demonstrated key stages of the re-coding of natural DNA using non-standard bases to code digital data. This experimental test is significant if we are to avoid the usual very time consuming task of sequencing natural DNA into conventional binary electronic form. The encoded DNA sequences were formed by a hybridization reaction. We made use of a thermostable DNA that allow us to amplify and capture the ligated strands by the ligase chain reaction (LCR). We tested the following features: a) the minimal length of probe strands to be ligated; b) the optimal temperature of ligation in terms of (1) efficiency of ligation, and (2) fidelity of ligation. We also developed a protocol for counting the number of different kinds of strands of DNA in the test tube.
http://www.princeton.edu/~lfl/DNA2DNA/
(Task 4) SOFTWARE TOOLS AND MATHEMATICAL MODELS:
DETAILS OF RECENT PROGRESS
(4.1) MEMS for BMC. MEMS is the technology of miniature actuators, valves, pumps, sensors and other such mechanisms. In particular, when controlling fluids it is known as micro-flow device technology. We investigated the use of micro-flow device technology for BMC, as this provides multiple benefits over the current techniques for BMC. Some of the current limits of BMC stem from the labor intensive nature of the laboratory work. Automation using available micro-flow technology allows a drastic reduction in this lab work. Micro-flow technology also may allow a drastic reduction in the large volume needed for the desired bio-molecular reactions to occur.
John H. Reif and Ashish Gehani, Microflow Bio-Molecular Computation, 4th DIMACS Workshop on DNA Based Computers, University of Pennsylvania , June, 1998. Invited to special issue of BIOSYSTEMS, 1999.
(4.2) Software Simulations of DNA Operations:
Demonstrated key portions of a prototype software simulation system for DNA computation. These software tools are expected to be very useful to optimize the experimental BMC protocols and minimize errors. A simulator was written in Java, which implemented the operations of the recombinant DNA model. The software models recombinant DNA operations at multiple levels of detail from high level down to the kinetics of reactions. For example, molecular complexes are represented as graphs at the granularity of nucleotides and have a level of spatial description such that most infeasible nucleotide complexes are not allowed to contaminate the data set as the computation is simulated. We used thermodynamic and kinetic models to represent interactions between complexes.
(4.3) Mathematical Models for BMC:
Improved Splicing Models. Characterized the generative capacity for formal representations of enzymatic actions on DNA molecules. We developed a dynamical model of time progression of DNA concentrations, using systems of differential equations for dynamic development of the concentrations of each variety of DNA molecules in a test tube as the molecules are processed by specified sets of enzymes. Performed second wet lab test of splicing theory.
We have developed an approach to DNA computation in which we 'write' on circular double stranded DNA molecules (plasmids) using biochemical processes. We have confirmed in the laboratory that we can carry out three of our 'write' operations in sequence - and then 'read' the result. New theoretical results concerning the splicing operations used in our laboratory work have been obtained.
T.Head, Circular suggestions for DNA computing, in: Eds. M.Gromov & A.Carbone, Pattern Formation, World Scientific, invited paper, 1999.
http://www.math.binghamton.edu/dennis/DARPA/circular.htmlT.Head, M.Yamamura & S.Gal, Aqueous computing: writing on molecules, Proceedings of the Congress on Evolutionary Computing (CEC'99), invited paper, July 6-9, 1999.
http://www.math.binghamton.edu/dennis/DARPA/aqueous.htmlLiposome Models.
We also have developed a theoretical liposome-based approach to DNA-based computation. This technique is based upon the compartmentalization and/or combination of the DNA molecules to be employed in a computation, via encapsulation in liposomes formed from various types of biological lipids.Bloom, B., C. Bancroft, Liposomal-Mediated Biomolecular Computation, 5th International Meeting on DNA Based Computers(DNA5), MIT, Cambridge, MA, (June, 1999).To appear in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, ed. E. Winfree, (1999).
Evolutionary Models.
We developed models of BMC based on the gene scrambling and RNA editing done by biological systems such as Ciliates.
Landweber, L. F. and L. Kari (1999, to appear) Universal Molecular Computation in Ciliates. In Landweber, L. F. and Winfree, E., eds. Evolution as Computation. Springer-Verlag. http://www.princeton.edu/~lfl/CiliateComputer/
Landweber, L. F. (1998) The Evolution of DNA Computing: Nature's Solution to a Path Problem. IEEE Proceedings of Symposia on Intelligence and Systems '98, May 21-23, 1998. IEEE Computer Society Press, 133-139.
Landweber, L. F. and L. Kari. (1998) The Evolution of DNA Computing: Nature's Solution to a Computational Problem. 1998 Genetic Programming Conference Proceedings, John Koza et al., eds., Morgan Kaufmann Publishers, Inc., 700-708.
Kari, L. and L. F. Landweber. (in press) Computing with DNA. Methods in Molecular Biology.
Forbes, N. A. and L. F. Landweber. (1999, in press) Computer Science and Meta-Evolution. Trends in Genetics.
Landweber, L. F., Horton, T. L. and L. Kari. (in press) Evolution of Gene Scrambling and RNA Editing: Protists Perform Computations!
Landweber, L. F. (in press) The Evolution of Cellular Computing. The Biological Bulletin.
(4.4) BMC Algorithms for Parallel Circuit Evaluation:
We have developed methods for evaluating large Boolean circuits (logic gate circuits) using DNA. We showed the tight connection between the simplest computational model with DNA and traditional circuit model. In particular, we proved that bounded fan-in circuits can be evaluated by DNA-based computers that use a small number of biochemical operations within computation time O(depth). The running time under the model corresponds to that of circuits and the size to that of circuits. This on-going work is expected to lead to more efficient BMC experimental protocols.
M. Ogihara, Relating the Minimum Model for DNA Computation and Boolean Circuits, to appear in Genetic and Evolutionary Computing conference '99 (GECCO '99)
http://www.cs.rochester.edu/u/ogihara/research/DNA/gecco.ps.gzM. Ogihara and A. Ray, Circuit evaluation: thoughts on a killer application in DNA computing. In Computing with Bio-Molecules. Theory and Experiments (G. Paun, ed.), pages 111--126, Springer-Verlag, Singapore, 1998.
http://www.cs.rochester.edu/u/ogihara/research/DNA/paun.ps.gz
CURRENT PLAN:
SUMMARY OF PLANNED WORK for the rest of FY99 and FY2000
(Task 1) SUMMARY OF PLANNED EXPERIMENTAL DEMONSTRATIONS:
(1.1A)We plan to demonstrate that the Fredkin gates of our DNA implementation can be "wired together" to form elementary circuits such as one of AND, OR, NOT, NOR, or NAND. The design for such wiring was presented in references [3, 4] above. This will demonstrate a simple instance of massively parallel biomolecular computation providing fundamental building blocks of Boolean based computation.
(1.1B)We are significantly scaling up our solution and surface based experiments for DNA computer arithmetic and logic registers as well as for solution of NP search problems (in particular, the SAT problem). For DNA computer arithmetic and logic registers, we working to scale up to 8 to 16 bits. For the SAT problem, we intend to scale up to the range of 25 Boolean variables, demonstrating massively parallel DNA operations with 2^25 processing elements.
(1.2) We will experimentally synthesize the designed 25-bit library and begin experiments to test our approach and this library on larger versions of the current SAT problems (placement of knights or queens on a 4 x 4 or 5 x 5 board, for example) to test the scalability of the RNA approach.
(1.3)We will further test the scalability of our methods to decrease errors.
(1.4)We plan to implement "crossover", a customary operation in Genetic Algorithms.
(TASK 2) SUMMARY OF PLANNED NANO-CONSTRUCTIONS:
(2.1) We will experimentally construct some new DNA tiles that facilitate input and readout.
(2.2) We are now beginning an experimental test of massively parallel DNA assemblies which will perform parallel addition and XOR on pairs of n-bit with randomly constructed inputs. The size of the numbers to be added will be in the range of 12 to 16. The inputs will be constructed by random assembly, and there is expected to be in the range of 10^15 inputs (including repeated inputs). According to our current time-table, our experiments will produce the very first published demonstration of a functioning DNA-tiling computer in which the DNA itself performs the computation during a self-assembly step.
(2.3) We plan to extend the successful assemblies of two-dimensional arrays to three-dimensional arrays. In addition to more complex cellular automata, 3D arrays can lead to very smart materials.
(2.4) We plan to continue to attempt to demonstrate DNA motor, and to construct an array of sequence-dependent transition devices that can be programmed to adopt a large number of states; so each locus can be in one of two states, and the loci can be programmed independently.
(Task 3) SUMMARY OF PLANNED APPLICATIONS:
(3.1) We are now completing the experimental demonstration of all stages of the re-coding of natural DNA using non-standard bases to code digital data. This will be significant if we are to avoid the usual very time consuming task of sequencing natural DNA into conventional binary electronic form.
Fingerprinting DNA: We will use genetic recombination intermediates to test for the presence of a particular sequence in a double helical context. This test, which does not involve proteins, appears to be independent of sequence effects. We plan develop means to incorporate the test into a DNA computing context. This may lead to exquisitely sensitive methods for fingerprinting DNA.
(3.2) We plan to perform experimental prototyping of some methods useful for cryptographic systems involving DNA, especially a novel input/output scheme which uses DNA-chips containing sizable arrays of addressable DNA sequences. We are considering experimental tests of our method for DNA Steganography, using amplifiable microdots.
(Task 4) SUMMARY OF PLANNED SOFTWARE TOOLS AND MATHEMATICAL MODELS:
(4.1) We intend to collaborate with a MEMS group to experimentally demonstrate the utility of MEMS in BMC.
(4.2) We are now completing a demonstration of all features of prototype software simulation system for DNA computation. These software tools are expected to be very useful to optimize the experimental BMC protocols and minimize errors.
(4.3) We plan to solve several hard algorithmic problems in our new models.
(4.4) We are designing experimental tests of our new methods for efficient simulation of NC circuits.
DETAILS OF PLANNED WORK:
DETAILS OF PLANNED WORK in TASK 1 for the rest of FY99 and FY2000
Demonstrations of solution based methods for DNA logic registers:
We plan to demonstrate that the Fredkin gates of our DNA implementation can be "wired together" to form elementary circuits such as one of AND, OR, NOT, NOR, or NAND. The design for such wiring was presented in references [3, 4] above. This will demonstrate a simple instance of massively parallel biomolecular computation providing fundamental building blocks of Boolean based computation.
Surface Based DNA Computation Operations.
We are significantly scaling up our solution and surface based experiments for DNA computer arithmetic and logic registers as well as for solution of NP search problems (in particular, the SAT problem). For DNA computer arithmetic and logic registers, we working to scale up to 8 to 16 bits. For the SAT problem, we intend to scale up to the range of 25 Boolean variables, demonstrating massively parallel DNA operations with 2^25 processing elements.
Scaling to larger computational problems requires an increase in the complexity and information density of DNA strands attached to solid supports and increased durability of the underlying chemistries for attachment. We are currently studying new attachment chemistries for gold thin films and Si(100) surfaces for better stability. New enzyme operations for multi-word and circuit SAT calculations are under investigation in order to enable solution of more complex computational problems. We are also creating arrays on silicon surfaces with feature sizes in the 10-50 micrometer range; these arrays are needed for identification of the solutions to larger computational problems.
DNA Solution of Intractable Problems: RNA solutions to SAT problems
Now that we have successfully demonstrated that we can use RNA based computing to find solutions to a representative 9-bit problem, our immediate goals are 1) to enhance the technology to the point at which we can choose specific solutions. For example, can we modify the protocol to select the board with the most knights. 2) We will try our approach with a slightly more challenging 9-bit problem, the related Queens problem. 3) We will then synthesize the designed 25-bit library and begin experiments to test our approach and this library on larger versions of the current SAT problems (placement of knights or queens on a 4 x 4 or 5 x 5 board, for example) to test the scalability of the RNA approach.
Demonstrations of Genetic Algorithms: We plan to implement "crossover", a customary operation in Genetic Algorithms. Preliminary techniques were presented in references [2, 5] above. Having both our current selection capability and the planned crossover, the essential components of Genetic Algorithms will be in place. They will be demonstrated using the classical Genetic Algorithms test problem called "Max 1s."
DETAILS OF PLANNED WORK in TASK 2 for the rest of FY99 and FY2000
Integer Addition by DNA Self-Assembly. One of our goals is do massively parallel arithmetic by the method of self-assembly of DNA tiles. We developed in 1998 a binary addition assembly design using the TX tile described above. During the next fiscal year we will proceed with final implementation of component tiles and linear, tile assemblies according to the "string tile" design already completed. We are now beginning an experimental test of massively parallel DNA assemblies which will perform parallel addition and XOR on pairs of n-bit with randomly constructed inputs. The size of the numbers to be added will be in the range of 12 to 16. The inputs will be constructed by random assembly, and there is expected to be in the range of 10^15 inputs (including repeated inputs). According to our current time-table, our experiments will produce the very first published demonstration of a functioning DNA-tiling computer in which the DNA itself performs the computation during a self-assembly step. (Extension to parallel integer multiplication is also planned, to give solution of integer factorization problems and to be used for decryption of the RSA crypto-system.)
3D NanoArrays:
We plan to extend the successful assemblies of two-dimensional arrays to three-dimensional arrays. In addition to more complex cellular automata, 3D arrays can
lead to very smart materials.
DNA Robotics:
We plan to continue to attempt to demonstrate the sequence-dependent transition. This will impinge on DNA-based computing, and on nano-manufacturing. The
device already constructed (and published) can change from one state to another as a function of solution conditions. However, an array of sequence-dependent
transition devices can be programmed to adopt a large number of states; each locus can be in one of two states, and the loci can be programmed independently.
Assembly Simulations Using Plastic Tiles. Our plans are to continue with the development of our `solid state' device. In particular, it is our goal in 1999 to solve a small computational problem using this device. Assuming success, we intend to move on to a `moderate' sized computational problem (e.g. 15 to 20 variable SAT). Solving such a problem would be a milestone of considerable importance to the entire community of researchers in this area.)
DETAILS OF PLANNED WORK in TASK 3 for the rest of FY99 and FY2000
(3.1) DNA2DNA Computations
We are now completing the experimental demonstration of all stages of the re-coding of natural DNA using non-standard bases to code digital data. The completion of these experimental tests of re-coding will be significant if we are to avoid the usual very time consuming task of sequencing natural DNA into conventional binary electronic form.
Fingerprinting DNA:
In our recent work on genetic recombination intermediates, we have discovered that it is possible to test for the presence of a particular sequence in a double helical context. This test, which does not involve proteins, appears to be independent of sequence effects. We plan to pursue this type of molecular recognition, and to develop means to incorporate it into a DNA computing context. This may lead to exquisitely sensitive methods for fingerprinting DNA.
(3.2) DNA Cryptography and Steganography
DNA Cryptography. We plan to perform experimental prototyping of some methods useful for cryptographic systems involving DNA, especially a novel input/output scheme which uses DNA-chips containing sizable arrays of addressable DNA sequences.
DNA Steganography. We are considering experimental tests of our method for genomic steganography using amplifiable microdots. The achievement of optimal utility for this technique will require the use of solid substrates other than the filter paper we have employed for the above studies. We thus plan to attempt to adapt this approach for use with various common, apparently innocuous substrates, such as typing paper, paper currency, etc. We plan also to work on scaling down the size of the dots employed to carry the message hidden in DNA, thus decreasing the ability of an adversary to detect the presence of the DNA-encoded secret message.
DETAILS OF PLANNED WORK in TASK 4 for the rest of FY99 and FY2000
MEMS:
We intend to collaborate with a MEMS group to experimentally demonstrate the utility of MEMS in BMC.
Software Simulations of DNA Operations:
We are now completing a demonstration of all features of prototype software simulation system for DNA computation. These software tools are expected to be very useful to optimize the experimental BMC protocols and minimize errors.
Mathematical Models for BMC:
We are now completing a lab demonstration of an example illustrating the new limit language concept. We are now completing our new differential equations model for the dynamics of the DNA contents of a test tube in which the DNA molecules are being acted on by enzymes and to initiate wet lab confirmation of the model.
We plan to solve several hard algorithmic problems by using our tested method of 'writing' (cut & paste). Our next step will be to develop a method of 'writing' on our plasmids that will allow the solution of large scale instances of similar problems. We plan to 'write' by attaching short single stranded PNA molecules (P: protein backbone) to the plasmids and to 'read' by separation based on mass of the resulting complexes. New theoretical results on splicing and test tube computing will be written by two Ph.D. candidates working with us.
BMC Algorithms for Parallel Circuit Evaluation:
In the next year we will further explore the potential of DNA methods as those for evaluating Boolean circuits. In particular we will explore problems that are amenable to massive parallel circuit evaluation with DNA. We are now designing experimental tests of our new methods for efficient simulation of NC circuits.
TECHNOLOGY TRANSITION:
With respect to technology transfer paths we are in a particularly favorable position because many of our groups (e.g., Princeton, Wisconsin, Duke, NYU, USC, U Penn., Binghamton) have ongoing collaborative interactions with the biotechnology industry at their state-of-the-art microbiology facilities.
In this first year of DARPA funding, we initiated these technology transfer paths by contacting the relevant companies. On the final year of DARPA funding, we expect a technology transition phase where the prototype biotechnology is made more robust, to allow for transition to commercial market.
We are employing several paths for technology transfer.
Prototype Biotechnology for Computation:
(1.1)Technology Transfer of Surface Based Methods for BMC:
The results of our experimental tests of DNA surfaces computation improve surface design and surface attachment chemistry, providing new algorithmic techniques for performing robust and flexible computations on the envisioned DNA computer, and new, massively parallel search strategies for NP-hard problems. These surface based methods may be used to augment the power of commercial addressable DNA chip arrays such as those of Affymetrix, Inc,and Nanogen, Inc. For example, this may provide much improved readout or sequencing strategies for determining the actual composition of the adsorbed molecules, as well as allow for the computational processing of the data on the DNA chip.
(1.2) Technology Transfer of BMC Methods for Solving Hard Problems:
Our proposed BMC methods for quickly solving hard computational problems have numerous military and commercial applications. These methods will allow for improved techniques for encoding and breaking the code of encrypted messages.
For example, the USC work is specifically directed toward breaking the DES code. Lucent Technologies has recently formed a group in DNA computing, and is intending to solve image matching problems in large image data bases using BMC techniques we have developed.
Using the ability to solve hard computational problems, our proposed BMC methods may allow possible breakthroughs in the field of artificial intelligence, permitting the development of truly "smart weapons" and intelligent robots.
(2) Technology Transfer of Nanotechnology:
The key expected benefit to the military and to commerce will be the ability to pack information in a very small space. The assembly of 2D and 3D arrays with units of roughly 2 x 12 nm, annotated at will with small molecules of choice will allow for very dense data storage. The use of somewhat larger units may ultimately allow for circuitry on the nanoscale so that RAM with bit volumes of 10^4 nm^3 could be developed.
The Ingeny Corporation was sufficiently interested in DNA technology applications that they paid NYU $10,000 for an option on the patents of our subcontractor Seeman for a self assembling molecular-scale memory device. The other benefit of this technology will be derived from improved commercial addressable DNA chip arrays such as those of Affymetrix, Inc,and Nanogen, The DNA chip technology may be improved by the use of DNA nano-arrays rather than the usual surface chemistry to affix DNA strands in a very compact 2D array. Sequencing and fingerprinting by hybridization on the oligonucleotide scale is likely to be freed from major sources of errors by converting it from a methodology whereby single-strands hybridize to one in which double strands exchange strands; the former system has greater competition for incorrect pairings.
(3) Applications of BMC Methods to Cryptography
We have recently developed BMC methods for cryptography that are theoretically unbreakable. We are pursuing patent applications for these techniques, and are considering commercial development.
(4) Applications of BMC Methods to Biology and Medicine.
We have identified key sectors of the commercial biotechnology market that may benefit from the ability to compute in addition to more standard biotechnology applications.
(4.1) Technology Transfer of BMC Methods for Associative Memory: and the development of "wet" (in solution) biological data bases.
Our BMC methods for re-coding will allow vast increases in computer memory and may allow for fast associative searches in this memory. A small test tube of DNA can contain up to 10^28 bits, far more than conventional storage media.
Our techniques for re-coding natural DNA techniques may be used to label molecules of biological interest with binary data, just as packages are labeled in bar code to allow for easier subsequent identification and dating.
Such techniques may be used to assemble large "wet" databases of biological interest, without the problem inherent in I/O to an electronic medium. Also, our BMC biotechnology may allow for fast associative searches in these "wet" databases. Military applications include the conversion of a very large number of individuals identifying DNA (e.g. the entire US Army) into a "wet" date base of re-coded DNA that can be subsequently processed by BMC techniques. These large biological databases may be stored in very compact space for processing in the battlefield. Other military applications include the storage of very large "wet" databases of biological pathogens (e.g., bacterial DNA), and subsequent fast pattern matching within these databases in the battlefield.
(4.2) Applications of BMC to Sequence Comparison and Fingerprinting. We are developing parallel BMC methods that may be used to speed up DNA sequencing, mutation detection, fingerprinting and other applications. These will have a very large market in the biotechnology industry since comparing DNA molecules is a standard tool for forensics, for determination of paternity, and for tracing the trails of infectious diseases through populations.
(5) Simulation Software.
We expect that software technology for simulations of recombinant DNA operations that we are developing will have a natural commercial market in the biotechnology sector.
In FY99 we expect a technology transition phase where the software interface is made more graphic and flexible, and where we provide additional functionality in large scale commercial biotechnology applications where simulation may be of critical use.
Comments / Questions /
Anything else you need:
(1) Update on Continued Funding of the existing DARPA/NSF grant: Prototyping Biomolecular Computations. DARPA in early 199 sent NSF the FY99 increment of the existing grant. On June 3, 1999 the FY99 increment of the existing grant was approved within NSF, including the amounts NSF contributes. But as of July 15, 1999 the FY99 funding increment has not yet reached Duke University. Reif has arranged so that Duke contracts personnel will, on the day the money arrives at Duke from NSF, immediately notify all subcontractors and make available the FY99 increment.
(2) Request for Optional Support for Large Scale Implementations:
http://www.cs.duke.edu/~reif/BMC/BMC.option/BMC.option.abstract.html http://www.cs.duke.edu/~reif/BMC/BMC.option/BMC.option.htmlWe are applying for an option (written into our existing DARPA grant) providing for additional funds to increase the scale of our massively parallel BMC experiments, and in particular for an integrated large scale demonstration of a biomolecular machine solving a NP search problem (the SAT problem). This FY99 proposal for optional money for large scale experimental prototypes will have major impact on our project.
(3) Planning for a Washonton DC Meeting on Future Continued Funding. Our current grant runs though FY2000. Reif is planning a meeting at NSF, Washington DC, (tentatively fall, 1999) to be titled: Biomolecular Computation: Its Potential and Applications. The meeting will be attended by the consortium members, who may be expected to give brief presentations, as well as some additional invited speakers. The meeting will be attended also by program managers from NSF, DARPA, and other agencies. We need to articulate at the meeting the applications of BMC beyond solution of combinatorial search problems, and the potential for scaling up to large scale experimental prototypes.
(4) Plans for Edited Textbook: "Bio-Molecular Computation" to be edited by Reif, with contributed chapters. This edited text (in latex) will cover: the underlying biotechnology that BMC utilizes, a number of distinct paradigms for doing BMC, and experimental techniques the field of BMC, as well as theoretical results.
The textbook outline is available online at
www.cs.duke.edu/~reif/BMC/BMCbook/BMCoutline.ps www.cs.duke.edu/~reif/BMC/BMCbook/BMCoutline.pdfSchedule (Revised by suggestion of N. Seeman)
-One page Chapter Outlines and Abstracts of sections due July 1, 1999.
-Detailed outlines due August 1, 1999.
-Full length drafts of sections due Oct 15, 1999.
-Referee reports of full length drafts due Nov 15, 1999.
-Edited sections due Jan 15, 2000.
(5) Report on Third Meeting of the
Consortium on Biomolecular Computing and Applications
Meeting Time: 9-10pm Monday, June 14, 1999, at University Park Hotel, Cambridge, MA.
John Reif chaired the meeting.
David Gifford at MIT chaired the local arrangements.
Institutes of Attending Members:
Binghamton University, Duke University, Mt. Sinai, MIT, New York University, Princeton University, University of Delaware, University of Pennsylvania, University of Rochester, University of Southern California, University of Wisconsin.
Agenda Items of Meeting:
-Update on Reporting for the existing DARPA/NSF grant: Prototyping Biomolecular Computations.
Update on Continued Funding of the existing DARPA/NSF grant: Prototyping Biomolecular Computations.
-Planning for Future Continued Funding.
-Plans for Edited Textbook: "Bio-Molecular Computation" to be edited by Reif, with contributed chapters. This edited text (in latex) will cover: the underlying biotechnology that BMC utilizes, a number of distinct paradigms for doing BMC, and experimental techniques the field of BMC, as well as theoretical results.