Tuesday, 11 April 2017

Two papers on the fundamental principles of biomolecular copying

Single cells, which are essentially bags of chemicals, can achieve remarkable feats of information processing. Humans have designed computers to perform similar tasks in our everyday world. The question of whether it is possible to emulate cells and use molecular systems to perform complex computational tasks in parallel, at extremely small scales and with very low power consumption, has intrigued many scientists.

In collaboration with the ten Wolde group from AMOLF Amsterdam, we have just published two articles in Physical Review X and Physical Review Letters that get to the heart of this question. 

The readout molecules (orange) act as copies of the binding
state of the receptors (purple), through catalytic
phosphorylation/dephosphorylation reactions.

In the first, “Thermodynamics of Computational Copying in Biochemical Systems”, we show that a simple molecular process occurring inside living cells - a phosphorylation/dephosphorylation cycle - can copy the state of one protein (for example, whether a food molecule is bound to it or not) into the chemical modification state of another protein (phosphorylated or not). This copy process can be rigorously related to those performed by conventional computers.
We thus demonstrated that living cells can perform the basic computational operation of copying a single bit of information. Moreover, our analysis revealed that these biochemical computations can occur rapidly and with low power consumption. The article shows precisely how natural systems relate to and differ from traditional computing architectures, and provides a blueprint for building naturally-inspired synthetic copying systems that approach the lower limits of power consumption.
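As a rough illustration of the quantities involved (this is a generic channel picture, not the paper's actual model), a noisy readout can be treated as a binary symmetric channel: a copy with error probability ε retains a calculable number of bits about the receptor's binding state, and Landauer's principle gives kT ln 2 per bit as the minimal work to later erase that record. A minimal sketch, with all numbers assumed for illustration:

```python
import math

def copy_information_bits(p_bound, error):
    """Mutual information (bits) between a template bit and its copy,
    modelled as a binary symmetric channel. Illustrative only."""
    def h(p):
        # binary Shannon entropy in bits
        return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    # probability that the readout reports 'bound'
    p_copy = p_bound * (1 - error) + (1 - p_bound) * error
    return h(p_copy) - h(error)

kT = 4.1e-21  # thermal energy at ~300 K, in joules
bits = copy_information_bits(0.5, 0.01)
landauer_min = kT * math.log(2) * bits  # minimal work to erase the copy
print(f"{bits:.3f} bits, Landauer minimum ~ {landauer_min:.2e} J")
```

Even a 1% error rate preserves over 0.9 bits, and the corresponding Landauer cost is a few zeptojoules - tiny compared with the energy scales of typical electronic devices.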
The production of a persistent copy from a template.
The separation in the final state is essential.
A more complex natural copy operation is the production of polymer copies from polymer templates, as discussed in this previous post. Such processes are necessary for DNA replication, and also for the production of proteins from DNA templates via intermediate RNA molecules. For cells to function, the data in the original DNA sequence of bases must be faithfully reproduced - each copy therefore involves copying many bits of data. 

In the second article, "Fundamental costs in the production and destruction of persistent polymer copies", we consider such processes. We point out that these polymer copies must be persistent to be functional. In other words, the end result is two physically separate polymers: it would be useless to produce proteins that couldn't detach from their nucleic acid templates. As a result, the underlying principles are very different from the superficially similar process of self-assembly, in which molecules aggregate together according to specific interactions to form a well-defined structure. 

In particular, we show that the need to produce persistent copies implies that more accurate copies necessarily have a higher minimal production cost (in terms of resources consumed) than sloppier copies. This result, which is not true if the copies do not need to physically separate from their templates, sets a bound on the function of minimal self-replicating systems.

Additionally, the results suggest that polymer copying processes that occur without external intervention (autonomously) must occur far from equilibrium. Being far from equilibrium means that processes are highly irreversible - taking a forwards step is much more likely than taking a backwards step. This finding draws a sharp distinction with self-assembling systems, which typically assemble most accurately when close to equilibrium. This difference may explain why recent years have seen enormous growth in the successful design of self-assembling molecular systems, while autonomous synthetic systems that produce persistent copies through chemical means have yet to be constructed.
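To get a feel for what "far from equilibrium" means here, consider a toy biased random walk over the length of a growing copy. The step probabilities below are invented for illustration and this is not the model of the paper, but the contrast is representative: near-reversible growth is slow and diffusive, while highly irreversible growth is fast and directed.

```python
import random

def grow(p_forward, steps, seed=0):
    """Length of a polymer after a biased random walk:
    each step adds a monomer with probability p_forward,
    otherwise removes one (if any are present)."""
    rng = random.Random(seed)
    length = 0
    for _ in range(steps):
        if rng.random() < p_forward:
            length += 1
        elif length > 0:
            length -= 1
    return length

near_eq = grow(0.51, 10_000)  # nearly reversible: slow, diffusive growth
far_eq = grow(0.99, 10_000)   # highly irreversible: rapid, directed growth
print(near_eq, far_eq)
```

With 10,000 attempted steps, the near-equilibrium walk typically nets only a few hundred monomers, while the far-from-equilibrium walk nets nearly 10,000.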
Taken together, these papers set a theoretical background on which to base the design of synthetic molecular systems that achieve computational processes such as copying and information transmission. The next challenge is now to develop experimental systems that exploit these ideas.

Monday, 3 April 2017

Working with the City of London School on an exciting iGEM project

Today I met with a group of school students (aged 16-18) from the City of London School, who will be working on a project for iGEM this year. iGEM is an international competition for school, undergrad and postgrad teams to design, model and build complex systems by engineering cells. Last year, Imperial won the overall prize, as discussed in this post by Ismael.

Without giving too much away, the students will be working on a system based on a newly-developed molecular device, the toehold switch. Toehold switches are RNA molecules that contain the information required to produce proteins. This information is hidden via interactions within the RNA, which cause it to fold up into a shape that prevents the sequence from being accessed. If, however, a second strand of RNA with the right sequence is present, the structure can be opened up and protein production is possible.

This idea has been around for a while, but toehold switches are particularly useful because they provide a better decoupling of the input, output and internal operation of the switch than previous designs. This is the principle of modularity that underlies the work of many of my colleagues here at Imperial, and allows for systematic engineering of molecular systems. This modularity is key to the proposed project.

I've been giving the students advice on how to model the operation of a toehold switch, in order that they can explore the design space before getting into the lab.
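For a flavour of the kind of modelling involved, here is a minimal mass-action sketch of switch opening and downstream protein production. The reaction scheme is a deliberate simplification and all rate constants and concentrations are placeholder values, not measurements:

```python
# Minimal mass-action model of toehold-switch activation.
# Scheme (illustrative; rate constants are placeholders):
#   closed switch + trigger -> open switch   (bimolecular, k_open /M/s)
#   open switch -> open switch + protein     (k_tl /s)
#   protein -> degraded                      (k_deg /s)

def simulate(k_open=1e4, k_tl=0.05, k_deg=0.002,
             switch0=1e-8, trigger0=1e-8, dt=0.1, t_end=3600.0):
    """Euler integration; concentrations in molar, time in seconds."""
    closed, trigger, open_, protein = switch0, trigger0, 0.0, 0.0
    t = 0.0
    while t < t_end:
        v_open = k_open * closed * trigger  # switch-opening flux
        closed -= v_open * dt
        trigger -= v_open * dt
        open_ += v_open * dt
        protein += (k_tl * open_ - k_deg * protein) * dt
        t += dt
    return open_, protein

open_conc, prot = simulate()
print(f"fraction of switches opened after 1 h: {open_conc / 1e-8:.2f}")
```

Even this crude model lets you ask useful design questions, such as how the opening rate trades off against trigger concentration before any wet-lab work.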

Wednesday, 11 January 2017

A simple biomolecular machine for exploiting information

Biological systems at many scales exploit information to extract energy from their environment. In chemotaxis, single-celled organisms use the location of food molecules to navigate their way to more food; humans use the fact that food is typically found in the cafeteria. Although the general idea is clear, the fundamental physical connection between information and energy is not yet well-understood. In particular, whilst energy is inherently physical, information appears to be an abstract concept, and relating the two consistently is challenging. To overcome this problem, we have designed two microscopic machines that can be assembled out of naturally-occurring biological molecules and exploit information in the environment to charge a chemical battery. The work has just been published as an Editor's selection in Physical Review Letters: http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.118.028101

The basic idea behind the machines is simple, and makes use of pre-existing biology. We employ an enzyme that can take a small phosphate group (one phosphorus and several oxygen atoms bound together) from one molecule and attach it to another – a process known as phosphorylation. Phosphorylation is the principal signalling mechanism within a cell, as enzymes called kinases use phosphorylation to activate other proteins. In addition to signalling, phosphates are one of the cell’s main stores of energy; chains of phosphate bonds in ATP (the cell’s fuel molecule) act as batteries. By ‘recharging’ ATP through phosphorylation, we store energy in a useful format; this is effectively what mitochondria do via a long series of biochemical reactions.

Fig. 1: The ATP molecule (top) and ADP molecule (bottom). Adenosine (the "A") is the group of atoms on the right of the pictures; the phosphates (the P) are the basic units that form the chains on the left. In ADP (adenosine diphosphate) there are two phosphates in the chain; in ATP (adenosine triphosphate) there are three.

The machines we consider have three main components: the enzyme, the ‘food’ molecule that acts as a source of phosphates to charge ATP, and an activator for the enzyme, all of which are sitting in a solution of ATP and its dephosphorylated form ADP. Food molecules can either be charged (i.e. have a phosphate attached) or uncharged (without phosphate). When the enzyme is bound to an activator, it allows transfer of a phosphate from a charged food molecule to an ADP, resulting in an uncharged food molecule and ATP. The reverse reaction is also possible.

In order to systematically store energy in ATP, we want to activate the enzyme when a charged food molecule is nearby. This is possible if we have an excess of charged food molecules, or if charged food molecules are usually located near activators. In the second case, we're making use of information: the presence of an activator is informative about the possible presence of a charged food molecule. This is a very simple analogue of the way that cells and humans use information as outlined above. Indeed, mathematically, the 'mutual information' between the food and activator molecules is simply how well the presence of an activator indicates the presence of a charged food molecule. This mutual information acts as an additional power supply that we can use to charge our ATP-batteries. We analyse the behaviour of our machines in environments containing information, and find that they can indeed exploit this information, or expend chemical energy in order to generate more information. By using well-known and simple components in our device, we are able to demystify much of the confusion over the connection between abstract information and physical energy.
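The mutual information in question can be computed directly from the joint distribution of activator and food states. A small sketch, with the probabilities invented purely for illustration:

```python
import math

def mutual_information(joint):
    """Mutual information (bits) from a joint distribution given as
    joint[activator_state][food_state]. Probabilities must sum to 1."""
    px = [sum(row) for row in joint]          # marginal over activator
    py = [sum(col) for col in zip(*joint)]    # marginal over food
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi

# Activator present usually coincides with a charged food molecule:
correlated = [[0.4, 0.1],   # activator present: [charged, uncharged]
              [0.1, 0.4]]   # activator absent:  [charged, uncharged]
print(f"{mutual_information(correlated):.3f} bits available to the machine")
```

If activator and food were statistically independent, the mutual information would be exactly zero and there would be nothing for the machine to exploit.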

A nice feature of our designs is that they are completely free-running, or autonomous. Like living systems, they can operate without any external manipulation, happily converting between chemical energy and information on their own. There’s still a lot to do on this subject; we have only analysed the simplest kind of information structure possible and have yet to look at more complex spatial or temporal correlations. In addition, our system doesn’t learn, but relies on ‘hard-coded’ knowledge about the relation between food and activators. It would be very interesting to see how machines that can learn and harness more complex correlation structures would behave.

Authored by Tom McGrath

Wednesday, 16 November 2016

Congratulatory post: Hail to the Imperial 2016 iGEM team!
By Ismael Mullor-Ruiz

With a bit of delay, we as a team would like to join in the congratulations for our colleagues and collaborators from the Imperial 2016 iGEM team, who triumphed at the iGEM 2016 Giant Jamboree at MIT.

For those who aren’t familiar with it, iGEM (short for “International Genetically Engineered Machine”) is the world’s largest synthetic biology contest. It was started 12 years ago at MIT as a summer side-project in which undergrad teams designed synthetic gene circuits never seen before in nature, built them and tested each of the parts. Many of these parts have subsequently pushed forward the field of synthetic biology. Even though it began as an undergrad-level competition with only a handful of teams involved, the competition grew to include not only undergrad teams, but also postgrad teams, high school teams and even enterprises. More than 200 teams from around the globe took part in the most recent edition.

Traditionally, synthetic biology involves tinkering with a single cell type (eg. E. coli) so that it performs some useful function – perhaps outputting an industrially or medically useful molecule. This tinkering involves altering the molecular circuitry of the cell by adding new instructions (in the form of DNA) that result in the cell producing new proteins/RNA that perform the new functions. The focus of this year’s project from the Imperial team was on the engineering of synthetic microbial ecosystems of multiple cell types (known as “cocultures”) rather than a single organism, since more complex capabilities can be derived from multiple cell types working together.

So they began by characterizing the growing conditions of six different “chassis” organisms and creating a database called ALICE. The challenge here is that the different organisms have different growing conditions, so maintaining a steady proportion is very hard; typically one of the populations ends up taking over in any given set of conditions. Thus, in order to allow self-tuning of the growth of the cocultures, they designed a system consisting of three biochemical modules:

1) A module that allows communication between the populations through a “quorum sensing” mechanism. Population densities of each species are communicated via chemical messengers that are produced within the cells, released and diffuse through the coculture.  Each cell type produces a unique messenger, and the overall concentration of this messenger indicates the proportion of those cells in the coculture.

2) A comparison module that enables a cell to compare the concentration of each chemical messenger. The chemical messengers were designed to trigger the production of short RNA strands in each cell; RNA strands triggered by different messengers bind to and neutralize each other. If there is an excess of the cell’s own species in the coculture, some of the RNA triggered by its own chemical messenger will not be neutralized, and can go on to influence cell behaviour.

3) An effector module. The RNA triggered in response to an excess of the cell’s own species is called “STAR”. It can bind to something known as a riboswitch (see figure below); when it is present, the cell produces a protein that suppresses its own growth. Cells therefore respond to an excess of their own population by reducing their own growth rate, allowing others to catch up. The approach of using a riboswitch for cell division control presents several advantages, such as its ease of design, its portability to other cell types, and the reduced burden it places on the cell compared with other mechanisms.
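As a cartoon of how these three modules close the loop, consider two strains that each suppress their own growth in proportion to their excess share of the population. Everything here (rates, the 50:50 target, the linear suppression) is invented for illustration; this is not the team's actual model:

```python
# Cartoon of a self-balancing two-strain coculture: each strain's growth
# rate falls as its share of the population exceeds the 50:50 target.
# Parameters are invented for illustration.

def simulate(n_a=1.0, n_b=0.2, r=0.5, k=2.0, dt=0.01, steps=5000):
    """Euler integration; returns strain A's final population share."""
    for _ in range(steps):
        total = n_a + n_b
        fa, fb = n_a / total, n_b / total
        # growth suppression kicks in only above the target share
        n_a += r * n_a * (1 - k * max(0.0, fa - 0.5)) * dt
        n_b += r * n_b * (1 - k * max(0.0, fb - 0.5)) * dt
    return n_a / (n_a + n_b)

print(f"final share of strain A: {simulate():.2f}")
```

Although strain A starts with five times the population of strain B, the negative feedback drives the coculture to an even split - the qualitative behaviour the quorum-sensing, comparison and effector modules are designed to produce.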

Figure 1: Action of STAR in opening the hairpin of a riboswitch. Without STAR, the riboswitch blocks the expression of certain genes; STAR stops this interference so that the genes are expressed.

As a demonstration of the concept, the students implemented this control system in different coloured strains of bacteria in order to create different pigments (analogous to the Pantone colour standard) through the coculture and combination of the strains. The approach is very generic, however, and as the team mention on their wiki, the possibilities of cocultures go way beyond this!

If you want to know more about the project, you can check out the team’s wiki:

Thursday, 6 October 2016

Replication, Replication, Replication I

This post and the one below it are linked. Here, I discuss a topic that interests us as a group, and below I look at some recent related papers. This post should make reasonable sense in isolation, the second perhaps less so.

Replication is at the heart of biology; whole organisms, cells and molecules all produce copies of themselves. Understanding natural self-replicating systems, and designing our own artificial analogues, is an obvious goal for scientists - many of whom share dreams of explaining the origin of life, or creating new, synthetic living systems.

Molecular-level replication is a natural place to start, since it is (in principle) the simplest, and also a necessary component of larger-scale self-replicating systems. The most obvious example in nature is the copying of DNA; prior to cell division, a single copy of the entire sequence of base pairs in the genome must be produced. But the processes of transcription (in which the information in DNA sequence is copied into an RNA sequence) and translation (in which the information in RNA sequence is copied into protein sequence) are closely related to replication. The information initially present in the DNA sequence is simply written out in a new medium, like printing off a copy of an electronic document. This process is illustrated in the figure above (which I stole from here). This figure nicely emphasises the polymer sequences (shown as letters) that are being copied into a new medium (note: three RNA bases get copied into one amino acid in a protein: AUG into M, for example). An absolutely fundamental feature of both replication and copying processes is that the copy, once produced, is physically separated from the template from which it was produced. This is important: otherwise the copies couldn't fulfil their function, and more copies could not be made from the same template.
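The three-bases-to-one-amino-acid mapping mentioned above can be made concrete with a toy translation routine. Only a handful of codons from the real genetic code are included here, just enough to translate the example sequence:

```python
# A tiny subset of the standard genetic code: each three-base RNA codon
# maps to a one-letter amino-acid code ('*' marks a stop codon).
CODON_TABLE = {
    "AUG": "M",  # methionine (also the start codon)
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "UAA": "*",  # stop
}

def translate(rna):
    """Read an RNA string codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGUUUGGCUAA"))  # -> MFG
```

Twelve RNA bases thus become a three-letter protein sequence: the information survives, but the medium changes completely.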

This single fact - that useful copies must separate from their template yet retain the copied information - makes the whole engineering challenge far harder. It's (reasonably) straightforward to design a complex (bio)chemical system that assembles on top of a template, guided by that template. All you need are sufficiently selective attractive interactions between copy components and the template. But if you then want to separate your copy from the template, these very same attractive interactions work against you, holding the copy in place - and more accurate copies hold on to the template more tightly. My collaborators and I formalise this idea, and explore some of the other consequences of needing to separate copies from templates, in this recent paper.

Largely because of this problem, no-one has yet constructed a purely chemically driven, artificial system that produces copies of long polymers, as nature does. Instead, it has proved necessary to perform external operations such as successively heating and cooling the system. Copies can then grow on the template at low temperature, and then fall off at high temperature, allowing a new copy to be made when the system is cooled down. This is exactly what is done in the PCR, an incredibly important process for amplifying a small amount of DNA in areas ranging from forensics to medicine.
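The power of this thermal cycling is easy to quantify: under ideal conditions each heat/cool cycle lets every template strand be copied once, doubling the total. A short sketch (the single efficiency parameter is a simplification; real per-cycle efficiencies are below 1 and vary between cycles):

```python
# Idealised PCR amplification: each thermal cycle multiplies the number
# of template strands by (1 + efficiency); efficiency = 1.0 means
# perfect doubling, which real reactions only approach.
def pcr_copies(initial, cycles, efficiency=1.0):
    return initial * (1 + efficiency) ** cycles

# 10 starting molecules after 30 perfect cycles:
print(f"{pcr_copies(10, 30):.2e}")  # -> about 1.07e+10
```

Exponential growth is what makes PCR so powerful: thirty cycles turn a forensically tiny sample into billions of strands.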

As a group, we're very interested in how copying/replication can be achieved without this external intervention. Two recent papers, discussed in the blog entry below, highlight the questions at hand.

Replication, Replication, Replication II

Here I discuss two recent experimental papers that are related to the challenge of replication or copying, following on from the discussion in "Replication, Replication, Replication I". My take on these papers is heavily couched in terms of that discussion.

Semenov et al.: Autocatalytic, bistable oscillatory networks of biologically relevant reactions
Nature 537, 656–660 (2016)
A catalyst accelerates a chemical reaction involving a substrate. For example, amylase accelerates the interconversion of starch and sugars, helping us to digest food. As we learnt at school, a key feature of catalysts is that they are not consumed by the reaction - a single amylase can digest many starch molecules. This fact should remind us of the replication/copy process discussed above, in which it is important that a new copy separates from its template so that the template is not consumed by the copy process, and can go on to produce many more copies. Indeed, templates for copying/replication must be catalysts. In the specific case of replication, the process is autocatalytic, meaning that a molecule is a catalyst for the production of identical molecules. Simple autocatalytic systems are thus often seen as a bridging point to the full complexity of life.

Semenov et al. show that a particularly simple set of molecules can exhibit autocatalytic behaviour. Although autocatalysis has been previously demonstrated, the novelty of their approach is the use of such simple organic molecules (which could plausibly have been present on Earth prior to living organisms). Additionally, they are able to show relatively sophisticated behaviour from their system - not just exponential growth of the output molecule (the natural behaviour of autocatalytic systems). When molecules that cause inhibition of autocatalysis and degradation of components are added, for example, the output concentration can be made to oscillate.
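That "natural behaviour" of autocatalysis can be sketched in a few lines: in a closed reactor, the autocatalyst X converts substrate S into more X, so it grows exponentially while substrate is plentiful and then stalls as the substrate runs out. The scheme and rates below are generic illustrations, not the actual chemistry of Semenov et al.:

```python
# Autocatalysis in a closed system: X + S -> 2X with rate k*X*S.
# Growth is exponential at first, then saturates when S is exhausted.
# All values are arbitrary illustrative numbers.
def simulate(x0=1e-3, s0=1.0, k=1.0, dt=1e-3, t_end=30.0):
    """Euler integration; returns final (autocatalyst, substrate)."""
    x, s = x0, s0
    t = 0.0
    while t < t_end:
        v = k * x * s * dt  # autocatalytic conversion in this step
        x, s = x + v, s - v  # total x + s is conserved
        t += dt
    return x, s

x, s = simulate()
print(f"autocatalyst: {x:.3f}, substrate remaining: {s:.4f}")
```

The sigmoidal (logistic-like) curve this produces is the classic signature of autocatalysis; the oscillations of Semenov et al. require the extra inhibition and degradation reactions on top.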

Although fascinating, the work of Semenov et al. does not solve the question raised in the blog post above. There are no long polymers in this system, and so the difficulty of separating strongly-interacting copies and templates does not arise. But as a consequence, this autocatalytic mechanism passes on very little information (arguably, none) to the new molecules produced. Autocatalysis alone is not enough  - we are still a long way from processes such as DNA replication, transcription and translation.

Meng et al.: An autonomous molecular assembler for programmable chemical synthesis 
Nature Chemistry 8, 542–548 (2016)
This paper, co-authored by my collaborators in the Turberfield group, takes a completely different approach. The idea is to specify the sequence of a molecular polymer using a DNA-based programme. As I have talked about before, the exquisite selectivity of base-pairing in DNA allows reactions to be programmed into carefully designed single strands, allowing them to self-assemble into complex patterns when mixed. In this case, the authors mix sets of short DNA strands that are designed to assemble into a long double-stranded structure in a specific order. The selectivity of interactions allows the strands to be programmed to bind one by one to the end of the structure in the desired sequence.

This process (the hybridisation chain reaction) is not new. The advance is using it to template the sequence of a second (chemically quite different) polymer that can't assemble with a specific sequence on its own - for simplicity, let's call this polymer X (its details aren't important). The authors ingeniously attach building blocks of X to the DNA strands - with each distinct DNA sequence paired with a distinct building block. When a new strand is incorporated via the hybridisation chain reaction, it brings with it the associated building block and adds it to X, which grows simultaneously with the double-stranded DNA construct. The details of this process are a bit fiddly, and due to a technicality a new building block is only added for every second strand incorporated, but the process as a whole allows them to assemble a specific polymer X using DNA-based instructions set by the sequences of the original strands. The authors call this programmed chemical synthesis.
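The logic of the scheme - a strand-ordering programme in which every second strand carries a chemical building block - can be caricatured in a few lines. The strand names and block labels below are invented for illustration; the real chemistry is far subtler:

```python
# Caricature of DNA-programmed synthesis: the order in which strands join
# the growing duplex dictates the order of building blocks added to
# polymer X. Strand names and block labels are invented.
PROGRAM = ["s1", "s2", "s3", "s4"]  # hybridisation order, set by sequence design
CARGO = {"s1": "A", "s2": None, "s3": "B", "s4": None}  # every second strand carries a block

def run_program(program, cargo):
    """Strands bind one by one; each carried block is appended to X."""
    polymer_x = []
    for strand in program:
        block = cargo[strand]
        if block is not None:
            polymer_x.append(block)
    return "".join(polymer_x)

print(run_program(PROGRAM, CARGO))  # -> AB
```

Note what the caricature makes obvious: the programme strands end up locked inside the duplex, so the instructions are consumed rather than catalytically reused - the key limitation discussed below.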

The authors are inspired by the ribosome (see fig, stolen from here), the biological machine that translates an RNA sequence (the red polymer) into a polypeptide sequence (green), which eventually folds into a protein. The ribosome uses RNA base pairing to bring a set of peptide building blocks together in the right order, just as the device of Meng et al. uses DNA base-pairing to form polymer X. However, there is a key difference. The information-carrying RNA strand in the figure is not consumed by the process; it acts as a catalyst, as discussed, and the ribosome walks along it until the end and then releases it. The information-carrying components of the system of Meng et al. (the strands that carry the molecular programme) are consumed, being incorporated into a long double-stranded DNA molecule that the authors actually use to analyse the success of the process. Thus although the system allows programmable self-assembly, it doesn't implement catalysis and hence can't perform copying/replication.

Both papers are great pieces of work, but one demonstrates autocatalysis without information transfer, and the other demonstrates the ability to programme polymer assembly without autocatalysis. The challenge to produce chemical systems that copy or replicate is still on.

Wednesday, 15 June 2016

Reading list

Here are some papers we have been reading recently:

Neural Sampling by Irregular Gating Inhibition of Spiking Neurons and Attractor Networks by Lorenz K. Müller and Giacomo Indiveri
This paper shows how a neural network model implements an MCMC sampler.

Trade-Offs in Delayed Information Transmission in Biochemical Networks by F. Mancini, M. Marsili and A. M. Walczak
Here the authors investigate the dissipation required for simple models of sensors to transmit information.

Discrete fluctuations in memory erasure without energy cost by Toshio Croucher, Salil Bedkihal, and Joan A. Vaccaro
This extends Landauer’s principle to an angular momentum cost instead of an energy cost.

Experimental rectification of entropy production by a Maxwell's Demon in a quantum system by P. A. Camati, J. P. S. Peterson, T. B. Batalhão, K. Micadei, A. M. Souza, R. S. Sarthour, I. S. Oliveira and R. M. Serra
This paper describes the theory of a quantum Maxwell’s demon and an experiment where both the demon and the system are spin-1/2 quantum systems.

Minimal positive design for self-assembly of the Archimedean tilings by Stephen Whitelam
This paper shows that a certain amount of specificity in interactions is required for particles to self-assemble into a certain pattern.

Energy-Efficient Algorithms by Erik D. Demaine, Jayson Lynch, Geronimo J. Mirano and Nirvan Tyagi
The authors consider a formalism for identifying the minimal energetic costs of efficient computational algorithms.

Information Flows? A Critique of Transfer Entropies by Ryan G. James, Nix Barnett and James P. Crutchfield
This paper highlights the subtleties in identifying "flows of information" from one system to another.