Tuesday, 31 July 2018

Towards Quantitative DNA Reaction-Diffusion Chemical Networks


By Javier Cabello Garcia

Have you ever wondered how zebras got their stripes? Or why you (probably) were blessed with five toes instead of webbed feet? These and other pattern-formation phenomena in Nature are produced by reaction-diffusion systems. These systems are defined as those in which the concentration of one or more species of chemical compounds change in time and space. The change of concentration with time is not exotic, since every chemical reaction leads the transformation of one species into another during time. However, space dependency due to a diffusion factor is introduced when the system is not well stirred - i.e., concentrations are not the same in all regions of space.

Diffusion is the spontaneous spread of particles as a result of random motion in a solution. This motion produces the transition of particles from regions where they are present in high concentration to others where their concentration is lower, following concentration gradients. Concentration imbalances not only cause diffusion, but lead to different chemical reaction rates in different regions. This heterogeneity prompts the formation of patterns, which can show interesting behaviours like wave fronts or oscillations.
The pattern formation of reaction-diffusion systems is common in many biochemical processes; several lines of research have therefore addressed the design and quantitative study of these systems. DNA has been proposed several times as the ideal substrate for the production of quantitative chemical networks. The suitability of DNA comes from the predictability of the DNA interactions and the thorough characterisation of its reaction rates in well-stirred conditions. However, reaction-diffusion networks strongly depend on the diffusion speeds of each species as well. This means that for DNA networks the diffusion speed of each DNA species has to be known and tunable.

Alas, the diffusion speed of individual DNA strands is hard to control. It is largely determined by size and shape, and varies relatively little for the sort of short DNA systems typically used. Previous attempts to modify the diffusion of strands relied on making them transiently stick to a solid matrix, effectively immobilising the strands temporarily. In a recent paper, Rodjanapanyakui et al. present a matrix-free approach, were they modulate the diffusion speeds of particular DNA species in solution by specific non-covalent binding. The “modulated” strand is complementary to an “anchor” strand, which is attached to a large polymer that is free in the solution. Binding to the anchor reduces the diffusion of the modulated strand a great deal. A “competitor” strand, which can compete with the modulated strand for binding to the anchor (by a toehold exchange reaction), allows temporary release of the modulated strand from the anchor. By varying the concentration of competitors, different average diffusion rates can be obtained.


Fig. 1 Diffusion modulation system. a) Strands of the system. b) In the presence of a large excess of the competitor strand (C) the modulated strand (T) can diffuse freely. When no C is present, T binds to the anchor (A) complexes, which diffuse at a significantly slower speed. c) C release T by a toehold exchange reaction. The newly created toehold can be used by a free-in-solution T to bind again to A. Reproduced with permission from APS Physics, Diffusion modulation of DNA by toehold exchange, Rodjanapanyakul, T., Takabatake, F., Abe, K., Kawamata, I., Nomura, S. and Murata, S. Physical Review E, 97(5).

Fluorescence Recovery After Photobleaching (FRAP) was used to follow the diffusion speed variation of the modulated strand with varying concentrations of competitor and anchor. This technique involves labelling the molecule of interest (in this case the modulated strand) with fluorophores and exciting a region of the sample with a high laser intensity. This results in the bleaching of the fluorescence of the excited area. The diffusion of the labelled molecule of interest gradually restores the fluorescence in the area, and the recovery time is directly related to the diffusion speed.

Experimental data showed that by employing this diffusion modulation method, the effective average diffusion speed of a specific strand can be tuned along a sixfold variation in range. This method has even been proved to tune simultaneously the diffusion of two DNA strands without any crosstalk. The introduction of this diffusion tuning mechanism in solution opens the way for new dynamics and quantitative reaction-diffusion systems with complex behaviours and functions for DNA systems.

Read More:
-Rodjanapanyakul, T., Takabatake, F., Abe, K., Kawamata, I., Nomura, S. and Murata, S. (2018). Diffusion modulation of DNA by toehold exchange. Physical Review E, 97(5). Link:https://journals.aps.org/pre/abstract/10.1103/PhysRevE.97.052617
-Zadorin, A., Rondelez, Y., Galas, J. and Estevez-Torres, A. (2015). Synthesis of Programmable Reaction-Diffusion Fronts Using DNA Catalyzers. Physical Review Letters, 114(6). Link:https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.114.068301
-Kondo, S. and Miura, T. (2010). Reaction-Diffusion Model as a Framework for Understanding Biological Pattern Formation. Science, 329(5999), pp.1616-1620. Link:http://science.sciencemag.org/content/329/5999/1616.long
-Allen, P., Chen, X. and Ellington, A. (2012). Spatial Control of DNA Reaction Networks by DNA Sequence. Molecules, 17(11), pp.13390-13402. Link: http://www.mdpi.com/1420-3049/17/11/13390
-Padirac, A., Fujii, T., Estévez-Torres, A. and Rondelez, Y. (2013). Spatial Waves in Synthetic Biochemical Networks. Journal of the American Chemical Society, 135(39), pp.14586-14592. Link:https://pubs.acs.org/doi/abs/10.1021/ja403584p

Friday, 15 June 2018

Multidimensional biochemical information processing of dynamical patterns


Biological cells can send signals to each other by emitting signalling molecules. Cells can receive the signals by measuring the concentration of the signalling molecules using surface receptors. If the concentration does not change in time then this is just a single number, which does not transmit much information. Compare this with sending electrical signals down wires; if the voltage is maintained at a constant value, then very little information is sent. Instead, the voltage is changed very quickly and the information is encoded in the time varying pattern. i.e. you can transmit a series of bits by turning the voltage on and off. In a similar way, cells can transmit messages in the dynamical concentration of signalling molecules.

A recent paper ‘Multidimensional biochemical information processing of dynamical patterns’ by Yoshihiko Hasegawa from the University of Tokyo examines how a simple model of cell sensing can extract the most information from a time varying concentration. 

The model assumes that the cell sees is a combination of two separate signals encoded in the same molecule, one signal with a fast variation and one with a slow one (see figure). The challenge for the cell is to work out the strength of each individual component from the combined signal that it sees. The better it does, the more information it has. This challenge is complicated by the fact that the cell's readout is not 100% accurate. 

The signal is made by summing a fast signal and a slow signal

The author assumes that the cell has two decoding channels (cell-surface receptors with an associated downstream network) by which to process the signal. They find that when the cell's readout is accurate, the information is maximised when each decoder independently detects one component each. By contrast, when the cell's readout is inaccurate, it is better to have two identical decoders that are simply giving independent estimates of the overall signal strength, without trying to tease out the individual components.


The author then shows how to construct chemical reaction networks that implement the optimal decoders. They find that a possible network for the decoder is a large cascade network with additional feedback and feed-forward loops, which has been found in real biological signalling networks. They also show that the number of molecular species in these cascades can be reduced and the response function is still approximately correct.

 

Monday, 4 June 2018

Probabilistic switching circuits in DNA.

As seen in previous entries in this blog, DNA reaction networks are a substrate that enables the creation of computational systems. These reaction networks have been a research topic for quite a long time and they have their own set of advantages and operational challenges. One of the present challenges is that up to the present day, all biomolecular computing architectures have been conceived as digital deterministic ones, while actual molecular signals are analog in concentration and involve stochastic (or random) reaction events. These two signal characteristics, when controlled in space and time, allow for the emergency of complex phenomena (such as patterning or morphogenesis). Besides this, it is considered that the integration of the analog/stochastic architectures would expand the capabilities of DNA computing.

Qian’s group present the first step towards the integration of stochastic/analog systems in DNA circuits. The implementation is done through irreversible strand displacement reactions in which a given input can bind to 2 different gate molecules with a probability that depends on the gate concentration. The reaction with one gate or the other leads to two different outputs. This architecture, implemented in switches and signal splitters, works stochastically in the single molecule level. But, when observed in bulk, the stochastic elements in the circuit allow the transformation of a digital signal of n bits into an analog signal able to take 2n values.




Figure 1. Example of a circuit with 3 stochastic elements (2 splitters and 1 gate), and the different analog signal values that it can output.

This architecture has been proved to be able to allow the implementation of feedback loops that allow adjustment of the intensity of the output signal. Despite some limitations (in larger circuits, fine tuning of the probability becomes harder thus requiring correction of the initial input signal), the proposed probabilistic circuits are a bridge that allows the interconnection of digital and analog computational circuits.

References:
Wilhelm, D., Bruck, J., & Qian, L. (2018). Probabilistic switching circuits in DNA. Proceedings of the National Academy of Sciences, 201715926.

Thursday, 24 May 2018

Paper Summary: Thermodynamics with continuous information flow

Paper Summary:
Thermodynamics with continuous information flow (2014) J M Horowitz and M Esposito

This paper discusses a way of quantifying the flows of information in the same way as flows of energy. Consider two random processes which hop between discrete states. Two independent systems of this type can be coupled together in a way so that the state of the first system, x, affects the transition propensities of the second system, y, and vice-versa. At any given instant, either or y could change, but crucially not both. This property defines the system as bipartite. The diagram below (taken from the paper) illustrates the creation of a bipartite system from and y.


Examples of systems of this type could include a ligand bound to a receptor which catalyses the creation of an activated complex. It is not possible for a ligand to bind to a receptor, and an activated complex to be created in a single step: they have to happen sequentially. However it is true that whether or not a ligand is bound to the receptor affects the rate with which activated complexes are formed.


If a system is bipartite all properties of the system to be decomposed into an x-subsystem component and the y-subsystem component, separating out the contributions from transitions involving the two subsystems. There are many useful properties of a system such as this, and for a rigorous thermodynamic analysis, a key one is the entropy. The second law of thermodynamics states that the entropy of the universe must increase with time. However this is only for the universe as a whole. On a more microscopic level it is possible for some parts of the system to decrease entropy, as long as other parts make up the shortfall. Traditionally we break the change in entropy down into the change in entropy of the system itself and the change of entropy in the environment due to the system. If we further decompose into separate contributions for our two subsystems an extra term appears for each,  quantifying the information flow around the system.



To the x component a term is added which takes into account how much information x learns about y during a single step. Similarly, to the y component a term is added which takes into account how much information y learns about x during a step. Splitting the entropy in this way gives a sense of how the information moves around the system; it flows from x->y or from y->x. Incorporating the information term allows us to apply the second law individually to each subsystem, rather than only at the level of the whole system. It also allows us to understand how a decrease in entropy in one subsystem can be caused by information flowing from one subsystem to the other, in a manner that is consistent with the second law. 

Tuesday, 30 January 2018

Recent Developments in DNA Origami

Javier Cabello – PhD student

Using both technique and imagination, origami allows the production of many shapes with only a folded sheet of paper. After the birth of DNA nanotechnology, the potential of this folding art inspired Paul Rothermund to develop a technique that allows the creation of complex nanoscale objects out of DNA. A single DNA strand is a flexible material that can be folded by hundreds of smaller strands (“staples”). These staples bind to specific non-consecutive regions of the larger strand (“template”) by Watson-Crick base-pairing, folding the template into a DNA structure with a particular shape.

DNA origami structures have a wide variety of potential uses that greatly differ from the original biological role of the molecule. These range from drug delivery to the production of electric circuits, with the biocompatible nature of DNA suggesting application in living organisms. However, the structures created are still limited by the size of the template, which becomes harder to fold when longer. Another problem with scaling up the technology is the cost of synthesising large numbers of staple strands.

Last November, four publications in Nature focused on overcoming the limitations of DNA origami. Three of the publications proposed new approaches for the assembly of bigger DNA structures. All three involved combining blocks of different sizes and complexity to form bigger structures.

Gigadalton-scale shape-programmable DNA assemblies
The first approach, suggested by Wagenbauer, Sigl and Dietz, involves making origami structures that can then self-polymerize. The resultant structures depend on the geometry of the individual origami building blocks. This approach, heavily inspired by viral capsid assembly, was used to form DNA rings from V-shaped DNA components. Afterwards, these rings were assembled to form DNA tubes. The method was generalised to produce different figures, as dodecahedra, by varying V-block angles and adding blocks with more than two binding sites.

Fractal assembly of micrometre-scale DNA origami arrays with arbitrary patterns
Tikhomirov, Petersen and Qian’s work revolves around the design of a robust algorithm to produce ordered assembly of complex patterns in 2D. In this work, individual origami tiles were first folded in isolation. Tiles were then combined in groups of four to produce larger “super-tiles”. Successively, four of these assembled larger tiles were employed to produce even bigger tiles. To achieve the necessary control of this process, the authors used three basic design rules, which they incorporated into a computational tool that they named the “FracTile Compiler”. This compiler allows the systematic design of complex, puzzle-like patterns. The resultant array can show an arbitrary picture, displayed by unique modifications to the surface of each tile, as they do with “La Gioconda” (ed.: the Mona Lisa for English readers) among others (see fig. 1).

Fig. 1 Complex shapes created using the method of Tikhomorov et al., reproduced with permission from Springer Nature, Fractal assembly of micrometre-scale DNA origami arrays with arbitrary patterns, G Tikhomirov, P Petersen, L Qian, Nature 552 (7683), 67.


Programmable self-assembly of three-dimensional nanostructures from 10,000 unique components
In this case, Ong et al improved a previously published variant related to DNA origami: DNA bricks. Instead of using large DNA strands as in previous approaches, the authors design structures built from smaller building blocks, of only 56 nucleotides each. Each of these bricks can bind to three others, forming a dense “lego”-like structure. By using 10,000 different sequences for these DNA bricks, the authors achieve the production of cubes of more than 1 GDa (3000 tunes the mass of the largest proteins). These cubes were proposed to serve as a canvas to produce different structures by simply removing bricks from the design, like chiselling a rock. This approach even allows the production of negative structures by carving cavities inside the DNA cube. To help in the design task, the authors created the software “NanoBricks” and proved its utility by producing different patterns. such as a helix or a teddy bear inside the DNA cube.

Biotechnological mass production of DNA origami
Despite these significant improvements in the production of larger DNA origami structures, all their possible applications are being hindered by the pecuniary costs involved in synthesising the component strands. However, new methods inspired by the current biotechnological industry are being developed to mass produce DNA origami designs.

Praetorius et al. have demonstrated the possibility of biotechnological mass production of a DNA origami design. Their approach involves encoding all the components necessary for the folding of an origami (ie. template and staples) in a single genetic sequence, and incorporating it into bacteria. Of course, this isn’t all. Between the different components they added DNA sequences with enzymatic activity: DNAzymes. The whole DNA strand is then produced in a culture of bacteria, and the DNAzymes, in the presence of zinc, cut themselves out of the sequence, separating it into the desired components. The result is that the staples and template, now separate, can assemble themselves to form the intended DNA origami structure.

This new approach, if escalated, would drastically reduce the production costs of DNA origami designs. However, the production of the initial genetic material would still require an expensive synthesis of the desired sequences for each new design produced. In order to make this solution even more affordable the produced DNA origami should be a very general or standardized one which can be customized with minor changes. Who knows if the answer lies in any of the previous papers? My humble guess is that DNA bricks are good candidates.

Read more:
- Rothemund, P. (2006). Folding DNA to create nanoscale shapes and patterns. Nature, 440(7082), pp.297-302. Link: https://www.nature.com/articles/nature04586
- Wagenbauer, K., Sigl, C. and Dietz, H. (2017). Gigadalton-scale shape-programmable DNA assemblies. Nature, 552(7683), pp.78-83. Link: https://www.nature.com/articles/nature24651
- Tikhomirov, G., Petersen, P. and Qian, L. (2017). Fractal assembly of micrometre-scale DNA origami arrays with arbitrary patterns. Nature, 552(7683), pp.67-71. Link: https://www.nature.com/articles/nature24655
Ong, L., Hanikel, N., Yaghi, O., Grun, C., Strauss, M., Bron, P., Lai-Kee-Him, J., Schueder, F., Wang, B., Wang, P., Kishi, J., Myhrvold, C., Zhu, A., Jungmann, R., Bellot, G., Ke, Y. and Yin, P. (2017). Programmable self-assembly of three-dimensional nanostructures from 10,000 unique components. Nature, 552(7683), pp.72-77. Link: https://www.nature.com/articles/nature24648
- Praetorius, F., Kick, B., Behler, K., Honemann, M., Weuster-Botz, D. and Dietz, H. (2017). Biotechnological mass production of DNA origami. Nature, 552(7683). pp.84-87. Link: https://www.nature.com/articles/nature24650

Tuesday, 2 January 2018

Why biomolecular computing could pose a cybersecurity threat, why it does not, and why it is still an interesting question.

By Ismael Mullor Ruiz

There is not much needed to construct a computer. We are not referring to the hardware here, but the basic operations that the computer must be able to implement. All possible computer configurations available are to some extent a physical approximation to the configuration envisioned by Alan Turing in his seminal paper in 1936 [1]: the Turing Machine. Turing imagined an infinitely long paper tape containing a series of sites in which two different states could be encoded (the minimum information unit, what would later be called a bit). In addition, he considered a “head” that could go through the tape performing different operations such as reading the information encoded, copying it and writing over the tape (the basic algorithmic operations) following defined rules. Such a device could, in principle, perform all possible computational tasks (it is “Turing complete”).

As said before, this setup is only a theoretical model, but in order to get a physical approximation that allows us to have a computer, we only need a physical system that can emulate this behaviour. The requirements are actually quite loose and allow the implementation of Turing complete machines in a wide array of phenomena, including the well-known electronic devices being used to read this post, but also other non-conventional configurations like biomolecular computing, membrane computing or optical computing.

It must be noted that for any reader with deep knowledge of molecular biology, the resemblance between the theoretical Turing machine and life’s basic molecular machinery is striking. For example, consider the presence of an information coding substrate – nucleic acids in biology - on which different machines – such as enzymes, ribozymes or some oligonucleotides - can operate. These natural devices can read the stored information, copy it into different formats, and overwrite or remove information given some rules.


Similarities between theoretical and biomolecular Turing machines. Adapted from Saphiro and Benenson (2006) [2].

Despite these similarities, it took almost 30 years from the publication of the central dogma of molecular biology, which outlines how the information in DNA is used by the cell, and almost 20 years from the birth of genetic engineering, to actually exploit biomolecules for computation. The birth of biomolecular computing came in 1994 with the foundational paper by Leonard Adelman in which he used DNA to solve a classic computational problem characterized by its difficulty: the Hamiltonian path problem [3]. In this problem, a network of n nodes is defined with the particularity that only certain transitions between the nodes of the network are allowed. If there exists a path in the network that allows you to go through all the nodes of the network staying only once in each node, this path is said to be Hamiltonian. During the years since, the field has grown beyond the architecture designed and developed by Adelman giving birth to a range of underlying biomolecular methods for computation [4].



(Left):Directed graph of the Hamiltonian path problem from Adelman's seminal paper [4]. (Right): DNA species implementation of the nodes and transitions for the graph.  Adapted from Parker (2003) [5].

These methods, often based on DNA, have an extraordinarily high density of information, dwarfing all the current optical/magnetic devices, as well as a very low energy consumption when compared to conventional computers and a capacity to perform astronomical numbers of operations in parallel. This means that that while in conventional computing architectures, all operations are executed sequentially doing only once at a given instant, in a biomolecular computer every single molecule in dilution is executing operations and making attempts to find a solution to the given problem at the same time.

These architectural features make DNA computers potentially efficient for crunching the solution of a set of computational challenges known as “NP problems”. For NP problems, the time for finding a solution grows exponentially with regards to the size of the problem and therefore can become impractical to solve with a conventional computer. The difficulty of solving NP problems is at the core of the security encryption protocols that are used for example in digital transactions or cryptocurrency mining. It is not hard to think that the existence of a tool that might allow those problems to be solved easily would pose a serious security breach. But despite the fact DNA computers have been designed to break cryptography protocols [6,7], they are not currently a serious threat due to a range of practical physical issues:

-The first problem comes from the fact that the information density calculations frequently cited assume essentially an essentially solid density of DNA – but such conditions are impractical for computation with any biomolecular architecture.

-Another issue with the use of massively parallel molecular computation is that only a small number of “correct” solutions may be produced in a test tube with astronomical quantities of unwanted molecules. Recovering and reading this output is an enormous challenge as it requires specifically picking these needles from the haystack using molecular interactions only.

-Related to the previous issue is the question of how to make the output readable and useful. The molecular interactions used to separate the output and read it never are completely specific. This results first in an imperfect separation of pools of molecules as well as important chance of reading as correct an incorrect output. This misreading problem becomes more prevalent when differences between correct and incorrect outputs become smaller.

-Another substantial issue to face is that most DNA-computing architectures are not reprogrammable and once the program has been run it cannot be run again.  Thus one must design new DNA strands and order those DNA strands from a supplier for a new instance of a problem, making the process of iterating an algorithm for a new set of inputs a time- and resource-consuming one. Some recent work in out-of-equilibrium chemical reaction networks aims to introduce reprogammability [8], but still is very far away from the versatility found on any personal computer.

 -The last issue that makes impractical the application of biomolecular computers for breaking cryptography protocols lies in the fact that despite algorithmic operations are easy to implement in biomolecular systems, translating that algorithmic operations into arithmetic operations is hard by itself (as proved by Winfree and Qian with their paper encoding an algorithm for calculating the square root of an integer using a 130 strands reaction network [9]).


Despite all these issues, valuable lessons can be taken. The analogy with the theoretical Turing machine underlies every conceivable computer to some extent or another, but the physical characteristics of the substrate in which the Turing machine is implemented affects deeply the efficiency of any program the machine is given. As we have seen, analogies between life’s molecular machinery and a Turing machine can be drawn and are significant. But the similarities that can be mapped between life molecular machinery and network computing are even more relevant [10,11]. Life does indeed perform certain computational tasks and it is incredibly efficient at doing so when compared to man-made computers. These tasks include inference, learning and signal processing and these remarkable capability are observed at all evolutionary levels. It is not hard to conclude that a deeper understanding of the underlying biophysics of genuine biomolecular computations would yield relevant insights improving the design and capabilities of programmable biomolecular systems that for their very own nature can directly interact with living systems and have applications in fields like synthetic biology or bio-medicine.

References
1. Turing, A. M. (1937). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London mathematical society, 2(1), 230-265.

2. Shapiro, E., & Benenson, Y. (2006). Bringing DNA computers to life. Scientific American, 294(5), 44-51.

3. Adleman, L. M. (1994). Molecular computation of solutions to combinatorial problems. Science. 266 (5187), 1021–1024

4. Benenson, Y. (2012). Biomolecular computing systems: principles, progress and potential. Nature Reviews Genetics13(7), 455-468.

5. Parker, J. (2003). Computing with DNA. EMBO reports, 4(1), 7-10.

6. Boneh, D., Dunworth, C., & Lipton, R. J. (1995). Breaking DES using a molecular computer. DNA based computers27, 37-66.

7. Chang, W. L., Guo, M., & Ho, M. H. (2005). Fast parallel molecular algorithms for DNA-based computation: factoring integers. IEEE Transactions on Nanobioscience4(2), 149-163.

8. Del Rosso, N. V., Hews, S., Spector, L., & Derr, N. D. (2017). A Molecular Circuit Regenerator to Implement Iterative Strand Displacement Operations. Angewandte Chemie International edition129(16), 4514-4517.

9. Qian, L., & Winfree, E. (2011). Scaling up digital circuit computation with DNA strand displacement cascades. Science332(6034), 1196-1201.

10. Bray, D. (1995). Protein molecules as computational elements in living cells. Nature, 376(6538), 307-312.

11. Kim, J., Hopfield, J., & Winfree, E. (2005). Neural network computation by in vitro transcriptional circuits. In Advances in neural information processing systems (pp. 681-688).