Analysis of primitive genetic interactions for the design of a genetic signal differentiator
Halter, Wolfgang; Murray, Richard M.; Allgöwer, Frank
2019-01-01 00:00:00
We study the dynamic and static input-output behavior of several primitive genetic interactions and their effect on the performance of a genetic signal differentiator. In a simplified design, several requirements for the linearity and time-scales of processes like transcription, translation and competitive promoter binding were introduced. By experimentally probing simple genetic constructs in a cell-free experimental environment and fitting semi-mechanistic models to these data, we show that some of these requirements can be verified, while others are only met with reservations in certain operational regimes. Analyzing the linearized model of the resulting genetic network, we conclude that it approximates a differentiator with relative degree one. Taking the discovered nonlinearities into account and using a describing function approach, we further determine the particular frequency and amplitude ranges where the genetic differentiator can be expected to behave as such.

Key words: genetic circuit design; combinatorial promoters; signal differentiator

© The Author(s) 2019. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. Downloaded from https://academic.oup.com/synbio/advance-article-abstract/doi/10.1093/synbio/ysz015/5524895 by Ed 'DeepDyve' Gillespie user on 03 July 2019

1 Introduction

The systematic design of functional genetic circuits is one of the key challenges in the field of synthetic biology. Usually, the goal is to add a desired function to a cellular organism. As the complexity of these functions has been increasing steadily [1], it becomes increasingly difficult to design the topology of the genetic network and to decide what kind of genetic interactions to use.
One way to approach this synthesis problem is by adapting methods from the fields of systems and control theory [2], e.g. by starting with a description of the desired part as a linear transfer function, finding the necessary fundamental input/output functions which realize this transfer function and then realizing the resulting network topology with primitive genetic interactions. The key to this approach is to determine how fundamental linear I/O functions like gain, integrator, sum and difference can be realized using only primitive genetic interactions such as transcription, translation, combinatorial promoters, post-transcriptional modification or pairwise interactions of DNA, mRNA or protein molecules. This design workflow follows the ideas of [3], where the authors showed that any arbitrary linear input/output system can be realized exactly using only zeroth and first order biochemical reactions.

We addressed the question of replacing the zeroth and first order biochemical reactions with general genetic interactions in [4]. Therein, several requirements were introduced to conclude that the processes of transcription and translation can be interpreted as gain and integration, respectively, and that combinatorial promoters may be used to realize the difference of two concentrations. In [4], and also in this work, we use these results to design a genetic signal differentiator, i.e. a genetic part whose output indicates the temporal derivative of its input. Such a module would be of particular interest in the context of a genetic PID controller that could be used to regulate production processes within a cell. While for this purpose the genetic realization of the more important integral feedback has been studied extensively [5, 6, 7, 8, 9, 10], differential operators in a biological context have been investigated rather sporadically [11, 12] and have only recently moved into the focus of synthetic biology [13]. In the latter work, the authors introduce a differentiator module based on mechanisms borrowed from the E. coli chemotaxis regulatory network. This mechanism is based on active enzyme-like degradation and the assumption that this degradation operates at saturation of the enzyme.

Figure 1: Ideal approximation of a differentiator, from [4].

In contrast to the results of [13], the topology presented in [4] is not based on a known biological example but is derived from scratch, using an adjusted version of the general design framework of [3].
This leads to a differentiator module of similar complexity but with different assumptions and requirements which need to be guaranteed. In this work, we combine control theoretic concepts, mathematical models and observations from experiments to verify and adapt the requirements introduced in [4]. We find that, in cell-free extract, transcription can be considered as a PT1 element, i.e. a delayed gain, while translation indeed can be seen as an integrator. Further, we show that combinatorial promoters are not very well suited to realize the difference of two signals and that the dynamics depend strongly on the operating conditions. Lastly, we study how not meeting the requirements affects the performance of the genetic signal differentiator and reveal the operating conditions under which the differentiator behaves as expected and where this is not the case.

In the following, we first introduce the desired signal differentiator, one possible topology to realize this part and the necessary requirements for primitive genetic interactions by recapitulating the results established in [4]. Afterwards, we introduce mathematical models of protein synthesis as well as the cell-free experimental environment which is used to generate the experimental data. Subsequently, the requirements on time-scales and linear operation regimes of the processes of transcription and translation are verified by fitting the model to a series of experimental data and analyzing the resulting parameters, leading to transfer function representations of these two processes. Using another series of experiments, we determine the input-output steady-state map of a combinatorial promoter and discuss the limited capability of such promoters to realize the difference of two signals. Finally, the impact of the discovered discrepancies on the performance of the genetic differentiator is studied both in the time and the frequency domain, using a describing function approach for the latter.
2 Background

First, we briefly recapitulate the results from [4] before we analyze, verify and adjust the requirements we introduced therein. In the field of control theory, one can study linear systems in two different domains: first, in the time domain, by looking at the states of a system and their temporal derivatives, which define a system of ordinary differential equations (ODEs); and second, in the frequency domain, by looking at transfer functions, which are complex valued functions and describe how different frequency components of an input signal are modified by a system. These two domains are connected via the Laplace transformation, and the frequency domain in particular is very useful for the design and analysis of linear systems.

An ideal differentiator would be given by the transfer function $G(s) = s$ with Laplace variable $s$. However, as is well known in the control community, an exact realization of such an ideal differentiator is not possible due to the lack of causality. For a system to be causal, its output must not depend on future values of the input signal. This is not the case for the differentiator. In case the system is given in the form of a rational transfer function, i.e. $G(s) = \frac{N(s)}{D(s)}$, one can easily check for this property by examining the degrees of the polynomials $N(s)$ and $D(s)$: causality is given if the degree of $N(s)$ is not bigger than the degree of $D(s)$. The desired function thus can only be approximated, e.g. by adding an additional low-pass filter to the ideal differentiator, leading to the desired transfer function

$$G(s) = \frac{Ks}{s + K} \tag{1}$$

where $K$ is the bandwidth of the filter. One possibility to realize this transfer function is the circuit depicted in Fig. 1, with a (preferably large) gain $K$ in the forward path and a weighted integrator $\beta/s$ in the feedback path. Closing this loop yields $G(s) = \frac{K}{1 + K\beta/s} = \frac{Ks}{s + K\beta}$, so ideally one chooses $\beta = 1$ to recover (1). Thus, in order to approximate the differentiator, three basic functions are needed: a gain, an integrator and the signal difference between input and feedback.
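As a quick numerical illustration of how the low-pass filtered transfer function $G(s) = Ks/(s+K)$ relates to the ideal differentiator $G(s) = s$, the following minimal Python sketch evaluates its frequency response at a few frequencies. The bandwidth value `K = 100.0`, the frequency grid and the function name `approx_differentiator` are illustrative assumptions and are not taken from the paper.

```python
import cmath
import math


def approx_differentiator(omega, K):
    """Frequency response G(j*omega) of G(s) = K*s / (s + K)."""
    s = 1j * omega
    return K * s / (s + K)


if __name__ == "__main__":
    K = 100.0  # assumed filter bandwidth (illustrative value only)
    for omega in (0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0):
        G = approx_differentiator(omega, K)
        magnitude = abs(G)
        phase_deg = math.degrees(cmath.phase(G))
        # The ideal differentiator has magnitude omega and phase +90 degrees.
        print(f"omega={omega:9.1f}  |G|={magnitude:10.4f}  "
              f"ideal |G|={omega:9.1f}  phase={phase_deg:6.2f} deg")
```

For frequencies well below $K$ the magnitude follows that of the ideal differentiator and the phase stays close to 90 degrees; above $K$ the magnitude levels off at $K$, which is why a large gain is preferable.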
Finding genetic realizations of these basic functions is the main challenge in designing the differentiator. In particular, it is expected that this cannot be achieved in an exact way, thus it is necessary to determine how inaccuracies in the basic parts influence the behavior of the assembled circuit. As an initial guess for finding such functions, a semi-mechanistic model of transcription and translation [14] was used in [4] to conclude that the processes of transcription and translation can approximately be seen as a gain and an integrator, respectively, and that a combinatorial promoter may be used to realize the difference of two signals. In the remainder of this section, we briefly recapitulate these deductions.

In the process of protein synthesis, the genetic information is read from DNA (with concentration $D_i$) and transcribed into mRNA ($M_i$); then, mRNA molecules are translated into proteins ($P_i$). In the following, the subscript $i$ stands for the $i$-th gene ($G_i$) in a network with $I$ distinct genes. With

$$P = [P_1 \; \dots \; P_I]^\top$$

representing all proteins present in the genetic network, the dynamics of mRNA and protein concentrations of gene $i$ are described by

$$\dot M_i = f_i(P, \theta_i, \zeta) - p_i(M_i, \theta_i, \zeta) \tag{2a}$$
$$\dot P_i = g_i(M_i, \theta_i, \zeta) - q_i(P_i, \theta_i, \zeta) \tag{2b}$$

where $f_i(P, \theta_i, \zeta)$ and $g_i(M_i, \theta_i, \zeta)$ are the respective production rates and $p_i(M_i, \theta_i, \zeta)$ and $q_i(P_i, \theta_i, \zeta)$ the respective degradation rates. These rates possibly depend on protein and mRNA concentrations, on certain real-valued gene-specific parameters $\theta_i$ like DNA concentrations ($D_i$) or initiation and degradation rates, as well as on several real-valued environmental parameters $\zeta$ which include, among others, the total amount of RNA polymerase (RNAP), ribosomes and endonucleases, the transcription and translation elongation rates, and other host dependent variables. For better readability, the arguments $\theta_i$ and $\zeta$ are omitted in the remainder.

In [4], we introduced the topology depicted in Fig. 2 as one approach to realize the transfer function (1). Therein, the input is considered to be a transcription factor, i.e. $u = P_u$, which activates gene $G_1$ and inhibits another gene $G_2$. Each of these genes produces a transcription factor which suppresses its own production. While $G_1$ has the purpose of capturing positive gradients of the input signal, $G_2$ is designed to capture negative ones. The output of the part is then given as the difference between the mRNA concentrations of the two genes, i.e.
$y = M_1 - M_2$. Further, for the purpose of a minimal signal representation, the transcription factors $P_1$ and $P_2$ undergo an annihilation reaction.

Figure 2: Genetic differentiator: Genes $G_1$ and $G_2$ tracking positive and negative slopes of $u$. Proteins produced by $G_1$ and $G_2$ neutralize each other. The difference of the associated mRNAs indicates the output $y$.

Several simplifications and requirements for the processes of transcription, translation and degradation were introduced to finally arrive at the desired model equations

$$\dot M_1 = \alpha D_1 \varphi(P_u - P_1) - \delta_1 M_1 \tag{3a}$$
$$\dot P_1 = \beta M_1 - \gamma_1 P_1 - \eta_{12} P_1 P_2 \tag{3b}$$
$$\dot M_2 = \alpha D_2 \varphi(k_0 - P_u - P_2) - \delta_2 M_2 \tag{3c}$$
$$\dot P_2 = \beta M_2 - \gamma_2 P_2 - \eta_{12} P_1 P_2 \tag{3d}$$

with the function

$$\varphi(x) = \begin{cases} x & x > 0 \\ 0 & x \le 0 \end{cases} \tag{4}$$

assuring non-negative transcription rates. In the following, we focus on $G_1$, the gene for capturing positive gradients, and recapitulate the requirements for the biological processes necessary to arrive at (3). Subsequently, the connection between (3) and (1) will be discussed. We note that the focus on $G_1$ is without any loss of generality, as the following requirements can be adjusted with minimal effort to arrive at the equations for $G_2$.

Requirement 1. $M_1$ and $P_1$ are subject to first order degradation, i.e.

$$p_1(M_1) = \delta_1 M_1 \tag{5a}$$
$$q_1(P_1) = \gamma_1 P_1 \tag{5b}$$

with degradation rate constants $\delta_1, \gamma_1 \in \mathbb{R}_{+}$.

Although the degradation rates $p_i$ and $q_i$ usually depend on protease and endonuclease levels, we require first order degradation dynamics to assure linearity with respect to mRNA and protein levels.

Requirement 2. The operation regime is such that $f_1$ and $g_1$ are both approximately linear in $D_1$ and $M_1$, respectively.

This requirement is justified by results like the ones presented in [15], where particularly the linearity of $g_i$ in $M_i$ is shown. Alternatively, similar simplifications have been applied by following a linearization approach as pursued in [16]. In general, however, although the transcription rate $f_i$ increases monotonically with DNA concentration $D_i$, it cannot grow arbitrarily large but is subject to saturation effects for large enough DNA or transcription factor concentrations, see e.g. [14, 17].

Requirement 3. There exists a combinatorial promoter which is piecewise linear in two inputs, such that

$$f_1([P_u, P_1]^\top) \propto \varphi(P_u - P_1)$$

with $\varphi(\cdot)$ as in Eq. (4).

With this requirement, we demand that the combined effect of the two transcription factors is proportional to the difference of their concentrations as long as $P_u > P_1$, and zero otherwise. In other words, $f_1$ as a function of $[P_u, P_1]^\top$ needs to fulfill the fundamental additivity property of linear functions in the regime $P_u > P_1$. This further means that, as we are considering a combinatorial promoter, $P_u$ has to act as an activator for $G_1$ while $P_1$ acts as an inhibitor. Consequently, instead of forming the difference between input $P_u$ and integral feedback $P_1$ by using direct interactions between the two species, we move the difference operation to the promoter function. Now if Requirements 2 and 3 hold, we find

$$f_1([P_u, P_1]^\top) \approx \alpha D_1 \varphi(P_u - P_1) \tag{6a}$$
$$g_1(M_1) \approx \beta M_1, \tag{6b}$$

where $\alpha$ and $\beta$ stand for lumped production rate parameters. Thus, with Requirements 1 to 3, we arrive at the first part of Eq. (3). Note that, when considering both genes $G_1$ and $G_2$, this means that the transcription and translation rate constants $\alpha$ and $\beta$ are assumed to be equal for both genes. Also, it is required that $P_u > P_1$ for the part to work properly.
For this reason, the annihilation reaction between $P_1$ and $P_2$ was introduced, see [4] for more details.

Finally, concerning an appropriate choice of parameters, another requirement can be deduced from typical degradation rates given e.g. in [18].

Requirement 4. The degradation of mRNA is much faster than the one of protein, i.e. $\delta_1 \gg \gamma_1$.

With that in mind, one can apply a quasi steady state approximation of the mRNA dynamics and further assume that $\gamma_1 \approx 0$ to arrive at

$$\tilde M_1 \approx \frac{\alpha}{\delta_1} D_1 (P_u - P_1)$$
$$\dot P_1 \approx \beta \tilde M_1$$

where $\tilde M_1$ stands for the steady state mRNA concentration. Thus, we conclude that the process of transcription can be interpreted as a gain while translation approximately realizes an integrator. With the signal entering the transcription process chosen as the residual of input $P_u$ and integral feedback $P_1$, the presented model thus realizes Eq. (1).

In [4], we verified this structure by simulating the system based on the much more detailed model described in [14]. This detailed model mainly aims at taking the finite amounts of RNAP and ribosomes as well as the time delay of transcription and translation into account; however, the chosen parameters only reflected average parameters from literature. Further, saturation effects and nonlinearities of the promoter dynamics were neglected. After recapitulating the results of [4] and realizing the limitations of the used models, we now adjust our modeling approach and focus on analyzing and verifying the requirements by conducting a series of experiments using a cell-free experimental system [19].

3 Materials and methods

In this section, a brief overview of the experimental technique as well as the subsequently used models is provided.

3.1 TX-TL experimental platform

For the purpose of establishing a reliable, efficient and fast prototyping environment for genetic circuits, various cell-free TX-TL systems have been developed and optimized during the past decade [19, 20, 21, 22, 23, 24]. Compared to classical cell-based in vivo systems, cell-free systems have two main advantages: living cells impose certain physical constraints on the gene circuits, and incorporating the desired genes into cells is comparatively time consuming. Cell-free extracts, on the other hand, provide a well reproducible platform for rapid testing of arbitrary gene circuits. Such an extract can for instance be produced from Escherichia coli (E. coli) bacteria by bead-beating cell resuspensions, see [23] for more details on the production of E. coli extract. As DNA formatting and transformation as well as cell growth are thus decoupled from the actual testing of the circuit, testing cycles can be sped up significantly, from several days for testing in original cells to only a few hours for testing in cell-free extract. However, regeneration of resources required for mRNA and protein synthesis is an issue in cell-free environments, which is why the dynamics of mRNA and protein production are overlaid by degradation dynamics of the extract.
Therefore, the experiments are only meaningful for a limited experiment duration and we only consider observations within the first 200 minutes after initiation of the experiment. However, even in this limited time frame, degradation of resources will be visible in the experimental data. Since this mechanism is not considered in the mathematical models, the identified parameters will be biased: production parameters tend to be underestimated while degradation parameters tend to be overestimated.

For every TX-TL experiment, the DNA subject to testing is suspended in water and mixed with cell extract and an energy buffer. This buffer contains amino acids, NTPs, tRNAs and other small molecules necessary for mRNA and protein synthesis. The reaction volume was chosen as 5 µL. Usually, one or more genetic constructs encode a fluorescent reporter protein such as GFP. After initialization of the experiment, the mixture is incubated at 29 °C inside a Biotek plate reader, which assesses the level of fluorescent protein every few minutes. While the concentration of a fluorescent protein like GFP can be assessed directly, measuring the amount of mRNA requires an additional mechanism. We therefore make use of the malachite green dye (20 µM) and a corresponding aptamer sequence (MGapt) which is added to the 3' untranslated region (UTR) of the gene. The dye binds to a binding pocket of this sequence and changes its emission properties upon binding, therefore again enabling us to monitor a fluorescence signal which is proportional to the mRNA concentration [25]. However, measurements of the mRNA signal due to binding of the malachite green dye revealed only a poor signal to noise ratio; therefore an additional data pre-processing step was introduced by fitting a Gaussian process to the experimental data. Details on the pre-processing procedure can be found in Supplementary Data A.

In this work, we distinguish between gene-specific and extract-specific parameters. Gene-specific parameters include variables like the affinity of the particular promoter sequence towards RNAP and other proteins and are by definition considered to be independent of the environment the experiment is conducted in, i.e. they hold in different batches of cell extract as well as inside living cells. In contrast, the remaining parameters, like the concentration of RNAP or the transcription and translation elongation rates, are denoted as extract or environment dependent and thus may vary even between different batches of cell-free extract. The experiments presented in this work have all been conducted using the same batch of TX-TL extract.

All genetic parts were originally given as plasmids. Using polymerase chain reactions and appropriate primer sequences, only the relevant linear double-stranded gene sequence was extracted from these plasmids and used in the TX-TL experiments. By addition of the protein gamS, the degradation of linear DNA is prevented [26]. Information about the used genetic constructs can be found in Supplementary Data B.

3.2 Modeling protein synthesis

Throughout this work, different promoters are discussed and analyzed for various purposes. Therefore, the different mechanisms and the modeling framework used for simulating the temporal evolution of mRNA and proteins are introduced. We therein build upon the dynamics given in Eq.
(2), but avoid simplifications as strict as the ones outlined in Section 2. In the following, complexes of two chemical species $A$ and $B$ are denoted by $A{:}B$ and conserved quantities are indicated by a bar, e.g. $\bar R$, the total amount of RNAP.

It is a well established result [18, 27] that the production rate of mRNA $f_i$ is proportional to the concentration of promoter which is bound to a corresponding RNAP holoenzyme and not blocked by any inhibitors, e.g.

$$f_i(P) = \alpha\, D_i{:}R{:}\sigma_{70}(P, \theta_i, \zeta) \tag{7}$$

where the concentration of the complex $D_i{:}R{:}\sigma_{70}$ may depend on other proteins $P$, gene-specific parameters $\theta_i$ and extract-specific parameters $\zeta$. In this example, sigma factor 70 ($\sigma_{70}$) first has to bind to RNAP to form the holoenzyme before this complex then binds the promoter region. The sigma factor therein has a very high specificity towards certain promoters, enabling the cell to switch between different transcriptional programs depending on which sigma factor is expressed. Note that, compared to (6a), this is a more realistic model for mRNA production but prohibits making the same deductions for the genetic differentiator.

The basic mechanisms of interest for us are binding and unbinding reactions happening at the promoter sequence of DNA. Usually, as in [18, 28], the amount of $D_i{:}R{:}\sigma_{70}$ is approximated by Michaelis-Menten like equations, assuming that either DNA or RNAP holoenzyme is in abundance. In contrast, we do not make this assumption but explicitly take the binding and unbinding reactions into account in order to consider both competition for shared cellular resources and saturation effects at the promoter. For simple setups where only self-competition occurs, we derive a closed form expression for the steady state concentration of the respective biochemical complexes.
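To illustrate why the abundance assumption matters, the following minimal Python sketch compares the exact steady state of a single reversible binding reaction $A + B \rightleftharpoons A{:}B$ with a Michaelis-Menten like approximation that treats $B$ as abundant. The exact expression is the smaller root of the mass-action quadratic $(\bar A - x)(\bar B - x) = Kx$, which is the closed form stated as Proposition 1 in Section 3.2.1. The function names and all numerical values are illustrative assumptions, not values from the paper.

```python
import math


def bound_complex_exact(a_total, b_total, K):
    """Exact steady-state [A:B] for A + B <-> A:B with dissociation constant K."""
    s = K + a_total + b_total
    return 0.5 * (s - math.sqrt(s * s - 4.0 * a_total * b_total))


def bound_complex_mm(a_total, b_total, K):
    """Michaelis-Menten like approximation, assuming free B equals total B."""
    return a_total * b_total / (K + b_total)


if __name__ == "__main__":
    K = 1.0        # assumed dissociation constant (arbitrary units)
    b_total = 5.0  # assumed total amount of B (arbitrary units)
    for a_total in (0.01, 0.1, 1.0, 5.0, 20.0):
        exact = bound_complex_exact(a_total, b_total, K)
        approx = bound_complex_mm(a_total, b_total, K)
        print(f"A_total={a_total:6.2f}  exact={exact:8.4f}  MM-like={approx:8.4f}")
```

As long as the total amount of $A$ is small compared to that of $B$, both expressions agree closely. Once $A$ is no longer scarce, the approximation can even exceed the total amount of $B$, whereas the exact expression saturates, which is the effect the closed form is meant to capture.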
3.2.1 Holoenzyme formation

When RNAP $R$ is bound to a sigma factor $\sigma_x$, this complex is referred to as the RNAP holoenzyme. As discussed briefly in the previous section, such a holoenzyme binds to the promoter sequence of a gene and initiates the transcription process. Sigma factors are therefore a crucial component of this process, and without the right sigma factor, transcription cannot initiate. According to [29], RNAP alone is sufficient for transcription elongation; initiation, however, requires sigma factors. We therefore assume that the formation of holoenzyme is independent of the holoenzyme binding to the promoter sequence, meaning that sigma factor and RNAP can bind and unbind irrespective of whether RNAP is bound to DNA or not. We therefore have to consider the reactions

$$R + \sigma_x \rightleftharpoons R{:}\sigma_x \tag{8a}$$
$$D_i{:}R + \sigma_x \rightleftharpoons D_i{:}R{:}\sigma_x \tag{8b}$$

for each sigma factor and DNA species present in the system in order to account for the competition for RNAP. To simplify (8), we introduce

$$\overline{R{:}\sigma_x} = R{:}\sigma_x + \sum_i D_i{:}R{:}\sigma_x$$
$$X{:}R = R + \sum_i D_i{:}R$$

the total amount of $R$ bound to $\sigma_x$ as well as the total amount of $R$ which is not bound to its respective sigma factor. Then, (8) can be combined to

$$X{:}R + \sigma_x \rightleftharpoons \overline{R{:}\sigma_x}. \tag{9}$$

In most cases, only dissociation constants $K = k^{\mathrm{off}}/k^{\mathrm{on}}$ are identifiable, and it is assumed that binding reactions are fast compared to the transcription elongation steps and thus in quasi steady state. Therefore, for notational simplicity, we will use dissociation constants instead of on and off rates in the remainder of this work. Note that in (9), $X{:}R$ and $\sigma_x$ both denote the unbound chemical species. If only one sigma factor is present in the system, the amount of $\overline{R{:}\sigma_x}$ can be calculated analytically as a function of the dissociation constant $K$ and the total amounts of RNAP and sigma factor, respectively, viz. by application of the following proposition.

Proposition 1.
Given the entities $A$, $B$ and $A{:}B$ and the reaction

$$A + B \overset{K}{\rightleftharpoons} A{:}B.$$

If none of the entities participates in any other chemical reaction, the steady state of $A{:}B$ can be expressed in terms of the total amounts of $A$ and $B$ as

$$A{:}B = \frac{1}{2}\left(K + \bar A + \bar B - \sqrt{(K + \bar A + \bar B)^2 - 4 \bar A \bar B}\right) \tag{10}$$

with $\bar A = A + A{:}B$ and $\bar B = B + A{:}B$.

The proof can be found in Supplementary Data C. It is noted that usually, e.g. for the deduction of Michaelis-Menten kinetics, it is assumed that either $\bar A \gg \bar B$ or $\bar B \gg \bar A$ holds, while Proposition 1 gives exact solutions for any values of $\bar A$ and $\bar B$. In cases where only a single sigma factor is present and its total concentration is constant over the time course of the experiment, we will later on use the amount $\overline{R{:}\sigma_x}$ as a fitting parameter and omit the binding reaction in order to reduce the complexity of the fitting problem. However, in cases where the concentration of sigma factor varies over time, we either use the exact formula from Proposition 1 or, if there is more than one sigma factor, we directly implement the binding reactions as fast reactions and accept the increased computational complexity.

3.2.2 Promoter binding

After formation of $R{:}\sigma_x$, the RNAP holoenzyme binds to the promoter sequence and starts transcribing the information encoded as DNA. A promoter is called constitutive if this binding of RNAP happens spontaneously and is not influenced by any activators or inhibitors, i.e.

$$D_i + R{:}\sigma_x \overset{K_{iH}}{\rightleftharpoons} D_i{:}R{:}\sigma_x. \tag{11}$$

In such cases, given that the promoter does not interact with other holoenzymes, Proposition 1 can be applied again to simplify the modeling formalism. In contrast to a constitutive promoter, binding of RNAP can also be inhibited by other proteins, leading to a combinatorial promoter with a competitive binding mechanism, i.e. by the additional reaction

$$D_i + P_j \overset{K_{ij}}{\rightleftharpoons} D_i{:}P_j \tag{12}$$

which now competes with (11).

3.2.3 Translation and degradation rates

Similarly to the transcription rate (7), the rate of translation is given by

$$g_i(M_i) = \beta\, M_i{:}Q(M_i, \theta_i, \zeta), \tag{13}$$

where $M_i{:}Q$ stands for the concentration of ribosomes ($Q$) bound to the ribosome binding site of mRNA $M_i$. We assume unregulated ribosomal binding and that the ribosome binding site sequences used for the constructs are of equal strength. Thus, the reactions for forming the complex $M_i{:}Q$ are the same as for the formation of holoenzyme and consequently, in case only one mRNA species is present, Proposition 1 can be applied again. Whenever more than one mRNA species is considered, competition for ribosomes occurs and the binding reactions are implemented explicitly.

Degradation of mRNA and protein is mainly influenced by third party molecules such as endonucleases ($E$) and proteases. It is known [30] that the latter species is quasi non-existent in TX-TL extract, thus we keep the first order degradation for proteins as in (5b). Endonucleases, on the other hand, are present in limited quantities, thus loading effects need to be considered. We explicitly assume that the binding of ribosomes and the binding of endonucleases are independent of each other, i.e. they can be seen as two distinct processes where ribosomes and endonucleases do not compete for mRNA. Thus, once more we define

$$p_i(M_i) = \delta\, M_i{:}E(M_i, \theta_i, \zeta) \tag{14}$$

and apply Proposition 1 whenever only self-competition occurs.

Figure 3: Scheme of probing protein synthesis with a step in DNA and expected responses. (Panels over time $t$: σ (TF), D (DNA), M (mRNA) and P (protein), connected by TX and TL.)

4 Results

Given the foundational work summarized in Section 2, it is yet unclear to what extent Requirements 1 to 4 can be verified. In particular, we are interested in answering the question of whether the processes of transcription and translation indeed can be regarded as a gain and an integrator, respectively (Requirements 1, 2 and 4), and further, whether one can find a suitable combinatorial promoter which satisfies all linearity requirements in order to verify Requirement 3.

4.1 I/O behavior of transcription and translation

First, we analyze the time-scales and linearity of transcription and translation. To this end, the input/output (I/O) behavior of these processes is characterized by experimentally probing a simple gene with different input steps as depicted schematically in Fig. 3. By observing the response to different step sizes in the input, the nonlinearity of the promoter dynamics can be identified. The gene we study is equipped with a $\sigma_{70}$-dependent constitutive promoter and expresses GFP. By fitting a suitable model to the experimental data and analyzing the corresponding parameters, Requirements 1 and 4 will be verified.

4.1.1 Experimental setup

There are two possibilities to realize a step-like input of varying height at the transcriptional level using promoters such as those introduced in Section 3.2: either by varying the amount of sigma factor (i.e. the transcription factor) while keeping the DNA concentration constant, or alternatively, by changing the DNA concentration itself. While varying DNA amounts is straightforward, the sigma factor input additionally requires purified protein, which may be biologically unstable and is more difficult to obtain than DNA.
Figure 4: Mean and 95% confidence interval of experimental step responses (blue, dotted mean, shaded confidence interval) and simulated step responses of the fitted nonlinear model (red, solid).

Depending on the choice of input, i.e. sigma factor or DNA, different dynamical effects can be expected when probing the system with steps of different height. As discussed before in Section 3.2, the mRNA production rate is proportional to the complex $D_i{:}R{:}\sigma_{70}$, whose concentration depends on the total amounts of DNA, RNAP and sigma factor. In case the concentration of sigma factor is considered as input, the corresponding model needs to incorporate both the formation of holoenzyme and the binding of holoenzyme to the DNA. Thus both binding rates would need to be considered. In contrast, when varying the DNA concentration, the holoenzyme formation reaction can be neglected and the amount of total holoenzyme $\overline{R{:}\sigma_{70}}$ can be introduced instead. This approach reduces the complexity of the fitting problem by focusing on the identification of promoter binding kinetics only. Thus, for the identification of the I/O behavior of transcription and translation, we first limit ourselves to step inputs in the form of varying DNA concentrations and study the sigma factor dependent holoenzyme formation in a separate experiment, discussed in Section 4.2.

Table 1: Values of the parameters obtained by fitting the nonlinear model to step response data.

parameter                        unit      value      description
$\alpha$                         min^-1    21.54      transcription rate const.
$\beta$                          min^-1    2.35       translation rate const.
$\delta$                         min^-1    0.18       mRNA deg. const.
$\gamma$                         min^-1    1.19e-8    protein deg. const.
$K_{1H}$                         nM        0.82       dissoc. const. for $D_1$ and $R{:}\sigma_{70}$
$K_{MQ}$                         nM        72.26      dissoc. const. for $M_1$ and $Q$
$K_{ME}$                         nM        102.20     dissoc. const. for $M_1$ and $E$
$\overline{R{:}\sigma_{70}}$     nM        4.26       total RNAP holoenzyme
$\bar Q$                         nM        165.94     total ribosomes
$\bar E$                         nM        650.30     total endonuclease
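To make the role of the fitted parameters more tangible, the following Python sketch integrates one possible reading of the fitted nonlinear model for a step in DNA concentration. It assumes that mRNA is produced at the transcription rate constant times the promoter-holoenzyme complex, degraded at the mRNA degradation constant times the mRNA-endonuclease complex, and that protein is produced at the translation rate constant times the mRNA-ribosome complex with first order degradation, where each complex is evaluated with the closed form of Proposition 1. The numerical values are those listed in Table 1; the forward Euler scheme, the step size, the function names and the exact model structure are assumptions of this sketch, and the decay of the extract is not modeled.

```python
import math


def bound_complex(a_total, b_total, K):
    """Steady-state [A:B] for A + B <-> A:B (closed form of Proposition 1)."""
    s = K + a_total + b_total
    return 0.5 * (s - math.sqrt(s * s - 4.0 * a_total * b_total))


# Parameter values as listed in Table 1 (rate constants in 1/min, others in nM).
ALPHA, BETA, DELTA, GAMMA = 21.54, 2.35, 0.18, 1.19e-8
K_1H, K_MQ, K_ME = 0.82, 72.26, 102.20
HOLO_TOTAL, RIBO_TOTAL, ENDO_TOTAL = 4.26, 165.94, 650.30


def simulate_step(dna_nM, t_end=200.0, dt=0.01):
    """Forward Euler integration of the assumed model for a DNA step input.

    Returns the mRNA and protein concentrations (nM) at time t_end (min).
    """
    m, p = 0.0, 0.0
    transcription = ALPHA * bound_complex(dna_nM, HOLO_TOTAL, K_1H)
    for _ in range(int(round(t_end / dt))):
        dm = transcription - DELTA * bound_complex(m, ENDO_TOTAL, K_ME)
        dp = BETA * bound_complex(m, RIBO_TOTAL, K_MQ) - GAMMA * p
        m += dt * dm
        p += dt * dp
    return m, p


if __name__ == "__main__":
    for dna in (1.0, 3.0, 5.0, 10.0):
        m_end, p_end = simulate_step(dna)
        print(f"DNA={dna:5.1f} nM  mRNA(200 min)={m_end:9.2f} nM  "
              f"protein(200 min)={p_end:12.2f} nM")
```

Under these assumptions, a tenfold larger DNA step yields less than a tenfold larger mRNA level, which reflects the saturation of the promoter by the limited amount of holoenzyme.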
We choose four different DNA concentrations for probing the system: 1 nM, 3 nM, 5 nM and 10 nM. Three technical replicates were conducted. The resulting data are depicted in Fig. 4, where blue dashed lines show the mean mRNA (upper panels) and protein (lower panels) concentrations and shaded blue regions show the respective 95% confidence intervals.

4.1.2 Corresponding model

We denote the index of the gene under study with $i = 1$ and accordingly the amount of GFP with $P_1$. According to Section 3.2 and particularly Eqs. (2), (5b), (7), (13) and (14), the corresponding model is determined by the complexes

\[ D_1{:}R{:}\sigma_{70} = D_1{:}R{:}\sigma_{70}\,(D_1,\; R{:}\sigma_{70},\; K_{1H}) \]
\[ M_1{:}Q = M_1{:}Q\,(M_1,\; Q,\; K_{MQ}) \]
\[ M_1{:}E = M_1{:}E\,(M_1,\; E,\; K_{ME}) \]

which are calculated using Proposition 1, depending on the total amounts of DNA, mRNA, RNAP holoenzyme, ribosomes and endonucleases as well as the respective dissociation constants. We again note that the model can capture the dynamics only in a limited time frame, as the degradation of extract is not taken into account.

For fitting the model to the given data, we introduce a maximum likelihood objective function, see e.g. [31], and apply several rounds of both the patternsearch and fmincon optimization algorithms implemented in Matlab. The resulting parameters given in Table 1 give rise to the red trajectories depicted in Fig. 4. For the process of translation, we observe that the protein degradation rate $\delta$ is estimated to be of magnitude $10^{-8}$ and is therefore, compared to $\gamma$, practically zero.

Conclusion 1. As required in Requirement 4, the degradation of mRNA is much faster than that of protein.

In order to check the linearity Requirements 1 and 2, we study the quantities $D_1{:}R{:}\sigma_{70}$, $M_1{:}E$ and $M_1{:}Q$ as functions of the fitted parameters over the relevant range of DNA and mRNA concentrations, as depicted in Fig. 5. This visualizes the nonlinear nature of the production reactions of mRNA and protein as well as of the degradation of mRNA. Although these results clearly indicate that transcription and translation do not in general behave linearly in their inputs, they allow us to define operation regimes such as those required in Requirement 2, i.e. regimes where the linearity requirement holds at least approximately.

Figure 5: Amount of active complexes for transcription (A, $D_1{:}R{:}\sigma_{70}$ over $D_1$), mRNA degradation (B, $M_1{:}E$ over $M_1$) and translation (C, $M_1{:}Q$ over $M_1$) over the relevant range of DNA and mRNA, respectively.

In that sense, we now introduce a relative measure of nonlinearity and define the $\varepsilon$-linear-range of a function $f: \mathbb{R} \to \mathbb{R}$ as the largest interval $[0, \xi]$ for which this nonlinearity measure is just $\varepsilon$. For the nonlinearity measure we follow the methods introduced in [32]. Let $\|f(x)\|_{L_2[0,\xi]}$ be the truncated L2 norm of $f(x)$, defined by

\[ \|f(x)\|_{L_2[0,\xi]} = \left( \int_0^{\xi} f(x)^2 \, dx \right)^{1/2}. \]

To approximate $f$, we use the linear function $mx$. Note that we force the intercept of the linear function to be 0 to ensure strictly positive values of the linear function on the interval $[0, \xi]$. For a given $f$, the best linear approximation on the interval $[0, \xi]$ is then found as the argument $m = m^*$ which minimizes

\[ L(\xi, m) = \|f(x) - mx\|_{L_2[0,\xi]}, \tag{15} \]

the absolute L2 norm of the residual between the function $f(x)$ and the linear function $mx$.
The value of $L(\xi, m^*)$ can now be seen as an absolute measure for the nonlinearity of $f$ on the interval $[0, \xi]$; however, this measure depends on the magnitude of the function $f$. Thus, in order to compare this measure across different functions, we normalize (15) by the L2 norm of $f$, i.e.

\[ L_{\mathrm{rel}}(\xi, m) = \frac{\|f(x) - mx\|_{L_2[0,\xi]}}{\|f(x)\|_{L_2[0,\xi]}} \]

to find our relative measure of nonlinearity. Consequently, $\xi$ is found as the solution of

\[ \max \; \xi \quad \text{s.t.} \quad \min_m L_{\mathrm{rel}}(\xi, m) \le \varepsilon. \tag{16} \]

In the given case, when one allows for a 5% error, i.e. $\varepsilon = 0.05$, one obtains the linear ranges indicated as black points in Fig. 5.

Conclusion 2. Linearity of production and degradation terms, as requested in Requirements 1 and 2, can be verified with 95% accuracy with

\[ D_1{:}R{:}\sigma_{70} \approx A_{tx} D_1 \quad \text{for } D_1 \in [0, 3.805] \]
\[ M_1{:}E \approx A_{deg} M_1 \quad \text{for } M_1 \in [0, 593.1] \]
\[ M_1{:}Q \approx A_{tl} M_1 \quad \text{for } M_1 \in [0, 141.7] \]

and $A_{tx} = 0.726$, $A_{deg} = 0.752$, $A_{tl} = 0.602$.

4.1.3 Linearized model and transfer functions

Given the linear operation regimes indicated in Conclusion 2, one can now derive linear models for transcription and translation which are valid in the respective regimes. In the control community, the standard approach to approximate a nonlinear model with a linear one is to locally linearize the nonlinear function at one specific value. In the case of the nonlinear mRNA degradation rate $p$, for example, a linearization around some fixed value $M_1^0$ would yield

\[ p(M_1) \approx p(M_1^0) + \left. \frac{dp}{dM_1} \right|_{M_1^0} (M_1 - M_1^0). \]

This approach ensures that the linear function evaluated at $M_1^0$ has the same value as the original nonlinear one, and that the difference between the two functions is small in a neighborhood of $M_1^0$. Thus, the quality of the linear model on a certain interval strongly depends on the chosen value $M_1^0$. In our case, particularly the values of $M_1$ may vary across a wide range.
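The $\varepsilon$-linear-range of Eq. (16) can be explored numerically. The following Python sketch is illustrative only and not the authors' Matlab implementation. It assumes that, for a single binding reaction, Proposition 1 (not shown in this section) reduces to the standard quadratic equilibrium expression for a complex formed from two species with fixed totals; the function names `complex_conc`, `rel_nonlinearity` and `eps_linear_range` are hypothetical. It also assumes that the relative nonlinearity grows monotonically with the interval length, so that bisection is applicable.

```python
import numpy as np

def complex_conc(a_tot, b_tot, k_d):
    """Equilibrium complex A:B for A + B <-> A:B (assumed stand-in for Proposition 1)."""
    s = a_tot + b_tot + k_d
    return 0.5 * (s - np.sqrt(s * s - 4.0 * a_tot * b_tot))

def rel_nonlinearity(f, xi, n=2001):
    """Return (min over m of L_rel(xi, m), minimizing slope m) on [0, xi]."""
    x = np.linspace(0.0, xi, n)
    fx = f(x)
    # Least-squares slope of a line through the origin (grid sums approximate the integrals)
    m = np.sum(x * fx) / np.sum(x * x)
    rel = np.sqrt(np.sum((fx - m * x) ** 2) / np.sum(fx ** 2))
    return rel, m

def eps_linear_range(f, eps, xi_max, tol=1e-4):
    """Largest xi in (0, xi_max] with relative nonlinearity <= eps, found by bisection."""
    if rel_nonlinearity(f, xi_max)[0] <= eps:
        return xi_max, rel_nonlinearity(f, xi_max)[1]
    lo, hi = tol, xi_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rel_nonlinearity(f, mid)[0] <= eps:
            lo = mid
        else:
            hi = mid
    return lo, rel_nonlinearity(f, lo)[1]

# Transcription complex with the fitted values of Table 1 (all in nM)
f_tx = lambda d: complex_conc(d, 4.26, 0.82)
xi_tx, a_tx = eps_linear_range(f_tx, eps=0.05, xi_max=10.0)
print(f"linear range [0, {xi_tx:.2f}] nM, slope {a_tx:.3f}")
```

Under these assumptions the result can be compared with the range and slope stated in Conclusion 2; small deviations would be expected from the discretization and from any difference between the assumed binding expression and Proposition 1.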
Further, it should be ensured that when neither DNA nor mRNA nor protein is present, the temporal derivatives of these species are also equal to zero, i.e. that

\[ \dot{M}_1(D_1 = 0, M_1 = 0) = \dot{P}_1(M_1 = 0, P_1 = 0) = 0 \]

holds. This is only achieved if all linear functions pass through the origin. To ensure this, one would consequently have to perform the linearization at $D_1 = M_1 = P_1 = 0$, leading to potentially large deviations between the linear and nonlinear models at larger values of the independent variables. Therefore, instead of using this standard approach, we directly use the approximations of Conclusion 2, where we already made sure that the linear approximation is as good as possible over a given interval of the independent variable. We thus obtain the linear model

\[ \dot{M}_1 \approx \alpha A_{tx} D_1 - \gamma A_{deg} M_1 \]
\[ \dot{P}_1 \approx \beta A_{tl} M_1 - \delta P_1 \]

and, when defining $D_1$ and $M_1$ as input and output of the transcription module and $M_1$ and $P_1$ as input and output of the translation module, the corresponding transfer functions

\[ G_{tx}(s) = \frac{\alpha A_{tx}}{s + \gamma A_{deg}} \tag{17} \]
\[ G_{tl}(s) = \frac{\beta A_{tl}}{s + \delta} \tag{18} \]

are obtained. We conclude that, because $\delta$ is very small, translation can indeed be seen as integration as long as DNA and mRNA concentrations are in the appropriate operation regime. However, the initial assumption that transcription can be seen as a gain needs to be adjusted: mRNA degradation cannot be neglected, leading to a PT1 element (first-order lag) instead of a gain.

So far, we studied and characterized time-scales and linearity of the processes of transcription and translation in the context of an E. coli cell-free extract and mainly focused on possible limitations caused by the promoter and mRNA binding kinetics. We therefore bypassed nonlinear effects of RNAP holoenzyme formation by changing DNA concentrations instead of using $\sigma_{70}$ as input and found that, at least during the first 200 minutes of a TX-TL experiment, resource limitations do have an effect on transcription, translation and mRNA degradation. By studying different step responses, the linear operation regimes were identified. We now turn to inhibitor binding dynamics and in particular to the problem of how to realize a signal difference using combinatorial promoters.

4.2 Signal difference and combinatorial promoters

In order to approximate the derivative of a signal by implementing the scheme depicted in Fig. 1, recall that the input to the gain (i.e. transcription) has to be the residual between the reference and feedback signal. There are various ways to realize a signal difference in biology, a widely used one being sequestration-based mechanisms between the signaling molecules, e.g. binding and degradation of the complex as elaborated in [7, 8]. When dealing with RNA or DNA, such a mechanism can be realized in a straightforward way, e.g. by the use of antisense strands.
When it comes to proteins or metabolites, engineering a sequestration mechanism for an arbitrary protein or metabolite may be possible but is in general more challenging. Thus, one is rather restricted to the use of existing pairs of proteins which undergo binding reactions, e.g. sigma factors and anti-sigma factors. Combinatorial promoters as an alternative mechanism may offer a higher flexibility during the prototyping process, as various inhibitor operator sequences are already known for transcriptional regulation. Therefore, it is in principle possible to compare the concentrations of any two transcription factors by combining these operator sequences with different promoters. It is one of the goals of this work to investigate whether this approach can actually be used for the purpose of subtraction in a biological context.

Figure 6: Level sets of promoter activity $D_i{:}R{:}\sigma_x$ over varying levels of sigma factor (activator conc.) and inhibitor (repressor conc.). A: Desired behavior for non-negative signal difference. B1-B4: Simulated values for varying DNA concentrations (D = 1 nM, 5 nM, 20 nM, 50 nM) under a weak repressor. C1-C4: Simulated values for the same DNA concentrations under a strong repressor.

Following such an approach, the desired behavior of the steady state of promoter dynamics is depicted in Fig. 6A, where the steady state of $D_i{:}R{:}\sigma_x$ is color-coded over varying concentrations of sigma factor ($\sigma_x$) and inhibitor ($P_j$). Due to non-negativity of concentrations, no activity is desired whenever the concentration of inhibitor exceeds that of activator (upper left triangle resembling zero). Otherwise, the aim is that $D_i{:}R{:}\sigma_x$ is proportional to the difference $\sigma_x - P_j$, illustrated by the parallel and equidistant level sets in Fig. 6A.

Applying Proposition 1 and assuming that the total amount of RNAP holoenzyme is fixed, the amount of $D_i{:}R{:}\sigma_x$ depends on the chosen DNA concentration as well as on the dissociation constants $K_{iH}$ and $K_{ij}$ of the RNAP holoenzyme and inhibitor, respectively. If, for instance, we assume that $K_{iH} = K_{ij} = 1$ and look at the relative amount of activated DNA $D_i{:}R{:}\sigma_x / D_i$, varying holoenzyme and inhibitor in the same range results in qualitatively different steady-state maps depending on how much $D_i$ is chosen, as depicted in Fig. 6 B1-B4. For high DNA, sigma factor acts quasi-linearly on the promoter while the inhibitor does not play a role at all. On the other hand, for small amounts of DNA, the inhibitor has a large effect and distorts the steady-state map such that the level sets converge to each other at the origin. Also, suppression due to the repressor does not seem strong enough, as in all cases $D_i{:}R{:}\sigma_x \gg 0$ for $\sigma_x < P_j$.

In contrast, Fig. 6 C1-C4 show the same conditions, except that now $K_{iH} = 10 K_{ij}$, i.e. the inhibitor binds 10 times more strongly to the promoter than the RNAP holoenzyme does. In that case, only minimal transcriptional activity is expected when there is less sigma factor than repressor. Further, although the level sets are curved, for medium amounts of DNA, e.g. 20 nM, they are comparably equidistant and the steady-state map is almost symmetric. This means that, while we have to acknowledge that an exact realization of the difference of two signals is not possible with combinatorial promoters, some crucial properties can be approximated by choosing dissociation constants and DNA amounts carefully.

For that purpose, and also for disentangling the RNAP holoenzyme binding reaction, we study a gene with a pTar initiation sequence combined with a tetO inhibitor operator which expresses GFP. The pTar promoter is sensitive to an RNAP holoenzyme consisting of RNAP bound to $\sigma_{28}$, while the operator sequence tetO enables binding and inhibition through Tet repressor proteins (tetR). We denote the concentration of this gene as $D_2$ and the GFP concentration as $P_2$.

To avoid the use of purified protein, both $\sigma_{28}$ and tetR are produced in the TX-TL system from respective constitutive (i.e. $\sigma_{70}$-dependent) DNAs $D_{s28}$ and $D_{tetR}$. While the amount of $D_{s28}$ is varied to achieve different activation levels, inhibition is influenced by adding different amounts of anhydrotetracycline (aTc), which binds to tetR and thus alleviates its association with the promoter. The concentration of $D_{tetR}$ is kept at a constant level of 1 nM. The combinatorial promoter then produces GFP, dependent on the concentrations of $\sigma_{28}$ and unblocked tetR. The time series of this experiment can be found in Supplementary Data D.

In this experimental setup, several chemical species compete for the same resources; thus Proposition 1 can no longer be applied and the binding reactions themselves had to be implemented as fast reactions. For brevity, the binding reactions are not listed here. We focus on mRNA and protein dynamics, i.e. the ODEs

\[ \dot{M}_{s28} = \alpha\, D_{s28}{:}R{:}\sigma_{70} - \gamma\, M_{s28}{:}E \]
\[ \dot{\sigma}_{28} = \beta\, M_{s28}{:}Q - \delta\, \sigma_{28} \]
\[ \dot{M}_{tetR} = \alpha\, D_{tetR}{:}R{:}\sigma_{70} - \gamma\, M_{tetR}{:}E \]
\[ \dot{\mathrm{tetR}} = \beta\, M_{tetR}{:}Q - \delta\, \mathrm{tetR} \]
\[ \dot{M}_2 = \alpha\, D_2{:}R{:}\sigma_{28} - \gamma\, M_2{:}E \]
\[ \dot{P}_2 = \beta\, M_2{:}Q - \delta\, P_2. \]

Fitting these equations to the data, we obtain the parameters listed in Table 2 and the trajectories depicted in Supplementary Data D. In the fitting process, the optimization is constrained such that the amount of complex $R{:}\sigma_{70}$ is similar to the value fitted in the first experiment, where binding of sigma factor was neglected, see Table 1.

Table 2: Values of the parameters obtained by fitting the nonlinear model to the time-series responses of the combinatorial promoter.

parameter        unit   value     description
$K_{70}$         nM     1.8e-6    dissoc. const. for $R$ and $\sigma_{70}$
$K_{28}$         nM     5.3e-3    dissoc. const. for $R$ and $\sigma_{28}$
$K_{tetR}$       nM     8.1e-3    dissoc. const. for $D_2$ and tetR
$K_{aTc}$        nM     2.74      dissoc. const. for tetR and aTc
$K_{2H}$         nM     1.084e4   dissoc. const. for $D_2$ and $R{:}\sigma_{28}$
$K_{s28H}$       nM     33.86     dissoc. const. for $D_{s28}$ and $R{:}\sigma_{70}$
$K_{tetRH}$      nM     2.97e3    dissoc. const. for $D_{tetR}$ and $R{:}\sigma_{70}$
$R$              nM     283.14    total RNAP
$\sigma_{70}$    nM     3.36      total sigma factor 70

Figure 7: Promoter activity of pTar-tetO (steady-state $D_2{:}R{:}\sigma_{28}$ over activator and repressor concentration), obtained from the fitted model.

Figure 8: Amount of active transcription complex $D_2{:}R{:}\sigma_{28}$ over the relevant range of sigma factor concentration for the pTar promoter.

The values given in Table 2 indicate that the total amount of RNAP is much larger than that of $\sigma_{70}$ and, further, that binding between these two species is very strong. Although $\sigma_{28}$ also binds strongly to RNAP, its affinity is still smaller than that of $\sigma_{70}$. The excess of RNAP and the much higher binding affinity of $\sigma_{70}$ thus lead to a decoupling of the two binding reactions. We also note that the binding of $R{:}\sigma_{28}$ to the pTar promoter apparently has a very low affinity, which leads to low GFP levels compared to the input step experiments. Together with the fact that the repressor tetR binds the pTar promoter very strongly, this leads to the steady-state promoter map depicted in Fig. 7, where the amount of active promoter for 20 nM of DNA and varying activator and inhibitor concentrations is determined based on the reactions from Section 3.2 and the parameters from Table 2. Although there is leakage for medium amounts of inhibitor and activator and the level sets are not completely linear, the determined promoter dynamics are comparable to the desired behavior of Fig. 6A.

Conclusion 3.
Using combinatorial promoters, the difference between two signals can only be realized to a limited extent.

Given these results, the transcription dynamics of Section 4.1 can now be extended with the appropriate promoter dynamics and $\sigma_{28}$ as input. As pointed out before, the strong binding affinities of the sigma factors lead to a quite linear but bi-modal input-output behavior, as depicted in Fig. 8, compared to the one depicted in Fig. 5A. Therein, the active $D_2{:}R{:}\sigma_{28}$ complex linearly follows the amount of $\sigma_{28}$ until the concentration of RNAP is matched. Consequently, the transcriptional gain $A_{tx}$ changes due to the change of input and, using the same linear approximation as defined in (16), one now finds

\[ D_2{:}R{:}\sigma_{28} \approx A_{tx}\, \sigma_{28} \quad \text{with} \quad A_{tx} = 0.0018. \tag{19} \]

4.3 Implications for the closed loop

Initially, with Requirements 1 to 4, we expected the process of transcription to behave like a gain, translation to behave like an integrator and combinatorial promoters to provide the difference of two signals. Several observations were made which differ from this initial view. First, although mRNA degradation is indeed much faster than protein degradation, the simplification to a simple gain is not justified and the temporal dynamics of mRNA production should be taken into account instead, leading to a PT1 behavior instead of a gain. Second, both production and degradation rates are subject to saturation due to finite amounts of resources of the transcriptional and translational machinery in the cell-free extract. For small inputs, however, these rates can be seen as linear functions of their inputs, and the linear operation regimes have been determined explicitly.
Third, when realizing the difference of two signals by using combinatorial promoters, one only obtains an approximation of the difference, and the quality of the estimate depends on the magnitudes of the inputs.

Now that these deviations from our initial requirements have been identified and characterized, their effect on the functionality and performance of the synthetic genetic differentiator postulated in [4] can be studied. For that purpose, two different models are compared with the ideal realizable differentiator from Eq. (1) in both the time and frequency domain.

The first model is given by the closed loop of the models $G_{tx}$ and $G_{tl}$ given in (17) and (18), respectively, adapted with the new transcriptional gain (19). This results in a linear model as depicted in Fig. 9, where no saturation effects are taken into account and a perfect signal difference is assumed. However, the slow mRNA production and resulting PT1 behavior is taken into account, and the parameters of $G_{tx}$ and $G_{tl}$ resemble realistic values as they were obtained from experimental data. With the simplification $\delta = 0$, the transfer function of the closed loop system is thus given by

\[ G_{cl}(s) = \frac{\alpha A_{tx}\, s}{s^2 + \gamma A_{deg}\, s + \alpha \beta A_{tx} A_{tl}}. \tag{20} \]

Figure 9: Topology of the linearized model given as the closed loop of $G_{tx}$ and $G_{tl}$.

The second model is considered as the detailed nonlinear model and is based on the reactions introduced in Section 3.2, thus taking all saturation effects, nonlinearities and time delays into account. It consists of a gene $G_1$ with a combinatorial promoter like the one studied in Section 4.2, i.e. sensitive to a $\sigma_{28}$ holoenzyme and the tetR inhibitor, producing this very same inhibitor and therefore realizing the circuit from Fig. 1. The concentration of $\sigma_{28}$ is considered as the input signal. In order to capture both positive and negative gradients, the same approach as introduced in [4] is used, leading to a network topology as in Fig. 2, where $G_2$ is of similar structure as $G_1$ but with a negative influence of the input on the transcription rate. The following additional mechanisms are necessary to realize this topology:

a) In addition to $\sigma_{28}$, a second sigma factor $\sigma_{xx}$ is introduced and is present at a constant level. While $R{:}\sigma_{28}$ activates transcription of $G_1$ and $R{:}\sigma_{xx}$ activates that of $G_2$, both holoenzymes bind to both genes, leading to a competition and a negative influence of one on the other.

b) Self-inhibition of the two genes is achieved by two different inhibitors, e.g. tetR and tetR*.

c) The two inhibitors tetR and tetR* undergo an annihilation reaction at rate $\eta = 0.1/(\mathrm{nM\,min})$, which was chosen arbitrarily.

As these modifications have already been discussed in [4], we omit the details at this point. The mRNA and protein dynamics of the core species as well as the output of the system are given by

\[ \dot{M}_{tetR} = \alpha\, D_{tetR}{:}R{:}\sigma_{28} - \gamma\, M_{tetR}{:}E \tag{21a} \]
\[ \dot{\mathrm{tetR}} = \beta\, M_{tetR}{:}Q - \delta\, \mathrm{tetR} - \eta\, \mathrm{tetR} \cdot \mathrm{tetR}^* \tag{21b} \]
\[ \dot{M}_{tetR^*} = \alpha\, D_{tetR^*}{:}R{:}\sigma_{xx} - \gamma\, M_{tetR^*}{:}E \tag{21c} \]
\[ \dot{\mathrm{tetR}}^* = \beta\, M_{tetR^*}{:}Q - \delta\, \mathrm{tetR}^* - \eta\, \mathrm{tetR} \cdot \mathrm{tetR}^* \tag{21d} \]
\[ y = M_{tetR} - M_{tetR^*}. \tag{21e} \]

We summarize the core features of these two models and the desired circuit in Table 3. Note that Model 1 can be seen as the linearized version of Model 2.

Table 3: Summary of the models and comparison of the core features.

            Desired circuit       Model 1               Model 2
Topology    Fig. 1                Fig. 9                Fig. 2
Dynamics    Eq. (1)               Eq. (20)              Eq. (21)
Features    linear                linear                nonlinear
            perfect gain          delayed gain          delayed gain
            no saturation         no saturation         saturation
            perfect difference    perfect difference    approximated difference

4.3.1 Frequency domain analysis

In a first step, we compare the two models and the desired behavior in the frequency domain, i.e. in terms of the Bode plot depicted in Fig. 10. This again is a classical tool from the control community and graphically shows how sinusoidal input signals are modified by a certain transfer function. In the upper part, the magnitude amplification $\mu(\omega)$ indicates how the amplitude of the input signal is amplified for different input frequencies. In the lower part, the phase shift $\varphi(\omega)$ for these frequencies is shown. Magnitude and phase of the desired behavior (solid black) and the linearized model (solid blue) are obtained directly using Matlab. For the nonlinear model, however, we use a describing function approach as described in [33] to compare the input-output behavior of the nonlinear Model 2 with the linear ones. To this end, the nonlinear Model 2 is excited with the input

\[ u(t) = A_0 + A \sin(\omega t) \tag{22} \]

and the corresponding output $y(t)$ is analyzed in terms of its Fourier coefficients.

Figure 10: Bode plot (magnitude in dB and phase in degrees over frequency in rad/min) of the desired model (solid black) and linear Model 1 (solid cyan). Dashed lines show magnitude and phase of the output of nonlinear Model 2 subject to $u(t) = A_0 + A \sin(\omega t)$. Different colors indicate different values of $A_0$. Several values for $A$ are plotted (lying on top of each other).

Assume that after time $t^* = k^* \frac{2\pi}{\omega}$ the output oscillates in a steady-state fashion, i.e. no transient dynamics occur anymore, and let

\[ c_n(\omega) := \frac{1}{T} \int_{t^*}^{t^* + T} y(t)\, e^{-i n \omega t} \, dt \quad \text{with period} \quad T := \frac{2\pi}{\omega} \]

be the n-th Fourier coefficient of the signal $y(t)$ which corresponds to frequency $\omega$. Then, the magnitude amplification is given as the ratio of the magnitudes of the first Fourier coefficients of the output and input signals. With the input defined as in (22), the first Fourier coefficient of this signal is simply $\frac{A}{2i}$. Therefore we have

\[ \mu(\omega) = \frac{|c_1(\omega)|}{\left| \frac{A}{2i} \right|} = \frac{2 |c_1(\omega)|}{A}. \]

Further, the phase shift for this frequency and particular input signal is given by

\[ \varphi(\omega) = -\operatorname{atan} \frac{\operatorname{Re} c_1(\omega)}{\operatorname{Im} c_1(\omega)}. \]

The constant part $A_0$ of the input signal (22) is necessary to produce non-negative sinusoidal functions. Due to the nonlinearity of Model 2, the output $y(t)$ depends not only on the frequency $\omega$ but on the shape of the input function in general, i.e. also on the variables $A_0$ and $A$. We therefore probed the system for several frequencies and values of $A_0$ and $A$.

For linear systems, an input signal with a single frequency component, like the one of (22), leads to an output with only one frequency component as well, namely the same as that of the input. In other words, higher harmonics do not exist and $|c_n(\omega)| = 0$ for $n > 1$. This is not the case for general nonlinear systems, where higher harmonics can also appear and in principle more than just the first Fourier coefficient should be analyzed. Thus, the way we use the describing function approach in this work relies on the assumption that higher harmonics of the output signal can be neglected. We therefore analyzed the power spectrum of the output signals for different values of $A_0$, $A$ and $\omega$ and found that for most combinations, the higher harmonics contributed less than 5% to the overall power spectrum. However, when $A$ approaches $A_0$ and $\omega$ is close to the pole of the transfer function, the assumption does not seem to hold, see Supplementary Data E for further details.
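The first-harmonic evaluation described above can be sketched in a few lines of Python. This is an illustration rather than the authors' Matlab code: the function names are hypothetical, the output signal is assumed to be sampled over an integer number of periods after transients have decayed, and the demo signal is the analytic derivative of the input (22) instead of a simulation of Model 2. Magnitude and phase are read off the complex ratio of the first Fourier coefficients of output and input, which for the sine input is equivalent to the magnitude and phase definitions given above.

```python
import numpy as np

def fourier_coefficient(t, y, omega, n=1):
    """n-th Fourier coefficient of y(t) on a window spanning whole periods."""
    g = y * np.exp(-1j * n * omega * t)
    integral = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t))  # trapezoidal rule
    return integral / (t[-1] - t[0])

def describing_function(t, y, omega, amp):
    """Magnitude amplification and phase shift relative to amp*sin(omega*t)."""
    ratio = fourier_coefficient(t, y, omega, 1) / (amp / 2j)
    return abs(ratio), np.angle(ratio)

def higher_harmonic_fraction(t, y, omega, n_max=5):
    """Share of the power in harmonics 2..n_max relative to harmonics 1..n_max."""
    power = np.array([abs(fourier_coefficient(t, y, omega, n)) ** 2
                      for n in range(1, n_max + 1)])
    return power[1:].sum() / power.sum()

# Demo: output of an ideal differentiator driven by u(t) = A0 + amp*sin(omega*t)
omega, amp = 0.01, 5.0                      # rad/min, arbitrary amplitude
period = 2.0 * np.pi / omega
t = np.linspace(0.0, 3.0 * period, 6001)    # three full periods
y = amp * omega * np.cos(omega * t)         # du/dt; the offset A0 drops out
gain, phase = describing_function(t, y, omega, amp)
distortion = higher_harmonic_fraction(t, y, omega)
print(gain, np.degrees(phase), distortion)
```

For an ideal differentiator the gain equals the input frequency and the phase lead is 90 degrees; applying the same functions to a simulated nonlinear output would additionally show how much power leaks into the higher harmonics.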
We will see in Section 4.3.2 and Fig. 11 what this means for the output signal.

In Fig. 10, magnitudes and phases of the respective response signals are plotted as dashed lines, where different colors indicate different values of $A_0$. The values for $A$ are chosen as $A = k A_0$ with k ∈ [0.1; 0.5; ; 0.75], and the respective responses are plotted in the same color. As seen in Fig. 10, the output response does not change with varying $A$; however, the choice of the offset $A_0$ significantly influences the I/O behavior of the nonlinear signal differentiator. Very low values of $A_0$ (dashed red, orange and purple) lead to a very sensitive response, i.e. too high a gain of the resulting closed loop and a smaller range of frequencies for which the output approximates the derivative of the input. A value of $A_0 = 10$ (dashed green) results in the best response of the nonlinear system, matching the gain of an ideal differentiator quite well while providing almost the same frequency range as the one predicted by the linearized system ($\omega_{\max} \approx 0.03$ rad/min). For too large values of $A_0$ (dashed cyan and dark red), Model 2 breaks down, as expected, due to the previously characterized saturation effects and the resulting loss of sensitivity towards the input signal.

Figure 11: Normalized output $\tilde{y}(t)$ of the nonlinear closed-loop model with input $u(t) = A_0 + A \sin(0.01 t)$ for varying values of $A_0$ (panels $A_0 = 0.1$, $A_0 = 10$, $A_0 = 100$) and $A$.

4.3.2 Time domain analysis

From the previous analysis, we summarize that for the detailed Model 2, the phase of the output signal is off for too small values of $A_0$, the gain is very small for values of $A_0 \gg 10$, and only in the case of $A_0 \approx 10$ are both magnitude and phase as desired. We now focus on the shape of the output signal of Model 2 and therefore stick to sinusoidal input signals, fixing $\omega = 0.01$ but varying the offset $A_0$ and the amplitude $A$ of the input signal. The normalized output

\[ \tilde{y}(t) = \frac{1}{\omega A}\, y(t) \tag{23} \]

as response to the input just described is depicted in Fig. 11.
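As a point of comparison for the nonlinear traces, the linear Model 1 of Eq. (20) can be evaluated directly at the probing frequency. The Python sketch below is only illustrative: it assumes that Model 1 uses the rate constants of Table 1, the gains $A_{deg}$ and $A_{tl}$ of Conclusion 2 and the transcriptional gain of Eq. (19), which the text suggests but does not list explicitly, and the variable names are hypothetical.

```python
import numpy as np

# Assumed parameter set for Model 1 (Table 1, Conclusion 2 and Eq. (19))
alpha, beta, gamma = 21.54, 2.35, 0.18      # rate constants in 1/min
A_tx, A_deg, A_tl = 0.0018, 0.752, 0.602    # linearized gains

def g_cl(omega):
    """Closed-loop transfer function of Eq. (20), evaluated at s = i*omega."""
    s = 1j * omega
    return alpha * A_tx * s / (s**2 + gamma * A_deg * s
                               + alpha * beta * A_tx * A_tl)

omega = 0.01                                 # rad/min, as in Fig. 11
g = g_cl(omega)
norm_gain = abs(g) / omega                   # amplitude of y(t) / (omega * A)
phase_deg = np.degrees(np.angle(g))          # phase lead relative to sin(omega t)
print(norm_gain, phase_deg)
```

At low frequencies Eq. (20) reduces to $s / (\beta A_{tl})$, so under these assumptions the normalized amplitude of the linear model approaches $1/(\beta A_{tl})$ and the phase lead approaches 90 degrees.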
For a small value of $A_0$, as expected, the phase is off; however, the output signal still has a sinusoidal shape for all amplitudes $A$. In contrast, for higher values of $A_0$, the phase is correct, but as the amplitude $A$ approaches the offset $A_0$, the output signal becomes more and more distorted. This effect is amplified for higher offset values and is caused by a dilution of the power spectrum, as discussed in the previous section.

5 Discussion

For the synthesis of genetic networks that realize arbitrary linear transfer functions, we follow a similar approach as in [3]. It is therefore crucial to find suitable genetic counterparts to primitive I/O functions such as gain, integration and difference. In a first attempt discussed in [4] and recapitulated in Section 2, several requirements were introduced to associate the processes of transcription and translation and combinatorial promoters with these respective I/O primitives. Here, a series of experiments and analyses was presented to verify and adapt these requirements.

By observing mRNA and protein levels in response to step inputs of varying height, it was verified in Conclusion 1 that protein degradation is almost non-existent while mRNA degradation is comparatively fast. However, degradation dynamics are not as fast as desired, and a quasi-steady-state assumption for the process of transcription would be an oversimplification. Thus, transcription should be considered as a PT1 element rather than a gain.

By fitting an ODE model to the experimental data and analyzing the corresponding parameters, it was also shown that all processes are subject to saturation due to limited amounts of resources.
Using the same model and the fitted parameters, the linear operation regimes of the I/O primitives can be characterized as shown in Conclusion 2, leading to more insight into the capabilities and limitations of the respective genetic circuits.

In a second series of experiments, the dependence of the performance of a combinatorial promoter on the operation regime was emphasized, leading in Conclusion 3 to the finding that the difference of two signals can only be obtained approximately. Based on these insights, DNA concentrations for a simulation study were chosen such that the I/O behavior of the combinatorial promoter is as close as possible to the desired one. In conclusion, the use of combinatorial promoters for comparing the concentrations of two transcription factors is only possible within a limited range of magnitudes, and we suggest using sequestration-based mechanisms in the future.

For the realization of a genetic signal differentiator using the studied parts, the initial goal was to realize a differentiator with a high-pass filter. The corresponding transfer function is given in Eq. (1). It has a zero at the origin and one pole determined by the filter to make it a causal system. However, slow mRNA degradation leads to a behavior which, when linearized, is of relative degree one, i.e. Eq. (20), which has one zero at the origin and two poles in the left half plane. This reveals an additional delay of the transient dynamics. If protein degradation were significantly larger than zero, this would lead to a transfer function of the form

\[ G_{cl,protdeg}(s) = \frac{K_1 s + K_1 \gamma_2}{s^2 + (\gamma_1 + \gamma_2)\, s + \gamma_1 \gamma_2 + K_1 K_2}, \tag{24} \]

thus shifting the zero from the origin to the right half plane and therefore leading to an additional lower frequency bound and a sign change in the output. In comparison, the differentiator introduced in [13] leads to a transfer function very similar to (24), given that all necessary assumptions introduced there hold. The main difference is that in [13], the zero of the transfer function always lies in the left half plane. On the one hand, this means that a sign change is avoided. On the other hand, there inherently exists a lower bound for admissible input frequencies, while for the design presented in this work, this is only the case if protein degradation is large.

In order to conduct studies beyond the linearized model, a describing function approach is used to evaluate the response of the nonlinear model to sinusoidal inputs as in Eq. (22). It can be seen that the performance of the differentiator critically depends on the constant part of the input signal, revealing again the limitations due to resource competition but also, unexpectedly, some supersensitivity at low values of $A_0$. With an appropriate choice of $A_0$, the presented network approximates the temporal derivative of an input signal for frequencies up to $\omega \approx 0.02$ rad/min.

In addition to the dependence on the absolute value of $A_0$, simulations in the time domain revealed a dependence on the relative amplitude, in the sense of a distortion of the output signal. When this relative amplitude approaches the value 1, the output signal loses its similarity to the sinusoidal input, although phase and gain may be correct. In other words, the nonlinearities of the model lead to a dilution of the power spectrum of the output, and higher harmonics are amplified.

Supplementary Data

Supplementary Data available at SYNBIO online.
Acknowledgements

The authors are grateful to Vipul Singhal, Andrey Shur, Anandh Swaminathan and William Poole from the Murray Lab at Caltech for the thought-provoking discussions and support in the laboratory.

References

[1] Purnick, P. E. M. and Weiss, R. "The second wave of synthetic biology: from modules to systems". In: Nature Reviews Molecular Cell Biology 10.6 (2009), pp. 410-422. doi: 10.1038/nrm2698.
[2] Del Vecchio, D., Dy, A. J., and Qian, Y. "Control theory meets synthetic biology". In: Journal of The Royal Society Interface 13.120 (2016), p. 20160380. doi: 10.1098/rsif.2016.0380.
[3] Oishi, K. and Klavins, E. "Biomolecular implementation of linear I/O systems". In: IET Systems Biology 5.4 (2011), pp. 252-260. doi: 10.1049/iet-syb.2010.0056.
[4] Halter, W., Tuza, Z. A., and Allgöwer, F. "Signal differentiation with genetic networks". In: Proceedings of the 20th IFAC World Congress. 2017, pp. 10938-10943. doi: 10.1016/j.ifacol.2017.08.2463.
[5] Ang, J., Bagh, S., Ingalls, B. P., and McMillen, D. R. "Considerations for using integral feedback control to construct a perfectly adapting synthetic gene network". In: Journal of Theoretical Biology 266.4 (2010), pp. 723-738. doi: 10.1016/j.jtbi.2010.07.034.
[6] Yordanov, B., Kim, J., Petersen, R. L., Shudy, A., Kulkarni, V. V., and Phillips, A. "Computational design of nucleic acid feedback control circuits". In: ACS Synthetic Biology 3.8 (2014), pp. 600-616. doi: 10.1021/sb400169s.
[7] Briat, C., Gupta, A., and Khammash, M. "Antithetic Integral Feedback Ensures Robust Perfect Adaptation in Noisy Bimolecular Networks". In: Cell Systems 2.1 (2016), pp. 15-26. doi: 10.1016/j.cels.2016.01.004.
[8] Briat, C., Gupta, A., and Khammash, M. "Antithetic proportional-integral feedback for reduced variance and improved control performance of stochastic reaction networks". In: Journal of the Royal Society Interface 15.143 (2018). doi: 10.1098/rsif.2018.0079.
[9] Briat, C. and Khammash, M. "Perfect Adaptation and Optimal Equilibrium Productivity in a Simple Microbial Biofuel Metabolic Pathway Using Dynamic Integral Control". In: ACS Synthetic Biology 7.2 (2018), pp. 419-431. doi: 10.1021/acssynbio.7b00188.
[10] Qian, Y. and Del Vecchio, D. "Realizing 'integral control' in living cells: how to overcome leaky integration due to dilution?" In: Journal of The Royal Society Interface 15.139 (2018), p. 20170902. doi: 10.1098/rsif.2017.0902.
[11] Harris, A. W. K., Dolan, J. A., Kelly, C. L., Anderson, J., and Papachristodoulou, A. "Designing Genetic Feedback Controllers". In: IEEE Transactions on Biomedical Circuits and Systems 9.4 (2015), pp. 475-484. doi: 10.1109/TBCAS.2015.2458435.
[12] Lang, M. and Sontag, E. "Scale-invariant systems realize nonlinear differential operators". In: Proceedings of the American Control Conference. 1. IEEE, 2016, pp. 6676-6682. doi: 10.1109/ACC.2016.7526722.
[13] Chevalier, M., Gomez-Schiavon, M., Ng, A., and El-Samad, H. "Design and analysis of a Proportional-Integral-Derivative controller with biological molecules". In: bioRxiv April (2018), p. 303545. doi: 10.1101/303545.
[14] Halter, W., Montenbruck, J. M., Tuza, Z. A., and Allgöwer, F. "A resource dependent protein synthesis model for evaluating synthetic circuits". In: Journal of Theoretical Biology 420 (2017), pp. 267-278. doi: 10.1016/j.jtbi.2017.03.004.
[15] Siegal-Gaskins, D., Tuza, Z. A., Kim, J., Noireaux, V., and Murray, R. M. "Gene Circuit Performance Characterization and Resource Usage in a Cell-Free 'Breadboard'". In: ACS Synthetic Biology 3.6 (2014), pp. 416-425. doi: 10.1021/sb400203p.
[16] Dolan, J., Anderson, J., and Papachristodoulou, A. "A loop shaping approach for designing biological circuits". In: Proceedings of the 51st IEEE Conference on Decision and Control. IEEE, 2012, pp. 3614-3619. doi: 10.1109/CDC.2012.6426405.
[17] Gyorgy, A. and Del Vecchio, D. "Limitations and trade-offs in gene expression due to competition for shared cellular resources". In: Proceedings of the 53rd IEEE Conference on Decision and Control. IEEE, 2014, pp. 5431-5436. doi: 10.1109/CDC.2014.7040238.
[18] Del Vecchio, D. and Murray, R. M. Biomolecular Feedback Systems. Princeton, NJ: Princeton University Press, 2015.
[19] Takahashi, M. K. et al. "Characterizing and prototyping genetic networks with cell-free transcription-translation reactions". In: Methods 86 (2015), pp. 60-72. doi: 10.1016/j.ymeth.2015.05.020.
[20] Noireaux, V., Bar-Ziv, R., and Libchaber, A. "Principles of cell-free genetic circuit assembly". In: Proceedings of the National Academy of Sciences 100.22 (2003), pp. 12672-12677. doi: 10.1073/pnas.2135496100.
[21] Shin, J. and Noireaux, V. "Efficient cell-free expression with the endogenous E. Coli RNA polymerase and sigma factor 70". In: Journal of Biological Engineering 4.1 (2010), p. 8. doi: 10.1186/1754-1611-4-8.
[22] Karig, D. K., Iyer, S., Simpson, M. L., and Doktycz, M. J. "Expression optimization and synthetic gene networks in cell-free systems". In: Nucleic Acids Research 40.8 (2012), pp. 3763-3774. doi: 10.1093/nar/gkr1191.
[23] Sun, Z. Z., Hayes, C. A., Shin, J., Caschera, F., Murray, R. M., and Noireaux, V. "Protocols for Implementing an Escherichia coli Based TX-TL Cell-Free Expression System for Synthetic Biology". In: Journal of Visualized Experiments 79 (2013), pp. 1-10. doi: 10.3791/50762.
[24] Chappell, J., Jensen, K., and Freemont, P. S. "Validation of an entirely in vitro approach for rapid prototyping of DNA regulatory elements for synthetic biology". In: Nucleic Acids Research 41.5 (2013), pp. 3471-3481. doi: 10.1093/nar/gkt052.
[25] Grate, D. and Wilson, C. "Laser-Mediated, Site-Specific Inactivation of RNA Transcripts". In: Proceedings of the National Academy of Sciences of the United States of America 96.11 (1999), pp. 6131-6136.
[26] Sun, Z. Z., Yeung, E., Hayes, C. A., Noireaux, V., and Murray, R. M. "Linear DNA for Rapid Prototyping of Synthetic Biological Circuits in an Escherichia coli Based TX-TL Cell-Free System". In: ACS Synthetic Biology 3.6 (2014), pp. 387-397. doi: 10.1021/sb400131a.
[27] Gruber, T. M. and Gross, C. A. "Multiple Sigma Subunits and the Partitioning of Bacterial Transcription Space". In: Annual Review of Microbiology 57.1 (2003), pp. 441-466. doi: 10.1146/annurev.micro.57.030502.090913.
[28] Gyorgy, A. and Murray, R. M. "Quantifying resource competition and its effects in the TX-TL system". In: Proceedings of the 55th IEEE Conference on Decision and Control. Vol. 1. IEEE, 2016, pp. 3363-3368. doi: 10.1109/CDC.2016.7798775.
[29] Courey, A. J. Mechanisms in Transcriptional Regulation. Malden, MA 02148-5020, USA: Blackwell Publishing, 2008.
[30] Shin, J. and Noireaux, V. "Study of messenger RNA inactivation and protein degradation in an Escherichia coli cell-free expression system". In: Journal of Biological Engineering 4 (2010), pp. 1-9. doi: 10.1186/1754-1611-4-9.
[31] Klipp, E., Liebermeister, W., Wierling, C., Kowald, A., Lehrach, H., and Herwig, R. Systems biology: A Textbook. Wiley-VCH Verlag, 2009.
[32] Allgöwer, F. "Definition and Computation of a Nonlinearity Measure". In: IFAC Nonlinear Control Systems Design 28.14 (1995), pp. 257-262. doi: 10.1016/S1474-6670(17)46840-6.
[33] Gelb, A. and Vander Velde, W. E. Multiple-input describing functions and nonlinear system design. New York: McGraw-Hill, 1968.
[34] Rasmussen, C. and Williams, C. Gaussian processes for machine learning. The MIT Press, 2006.

Analysis of primitive genetic interactions for the design of a genetic signal differentiator.

Wolfgang Halter (1,*), Richard M. Murray (2), Frank Allgöwer (1)

(1) Institute for Systems Theory and Automatic Control, University of Stuttgart, 70569 Stuttgart, Germany and (2) California Institute of Technology, Pasadena, CA 91125, USA
*Corresponding author: E-mail: wolfgang.halter@ist.uni-stuttgart.de

Supplementary Data

A Data pre-processing

In this section we denote the data obtained in the experiments discussed in Sections 4.1 and 4.2 with $y \in \mathbb{R}$. There are mainly three issues with these data, depicted by way of example in Fig. 12 A and B as grey crosses.

[Figure 12: Data pre-processing of I/O experiments. Example: processing the malachite green (MG) signal with 5 nM of DNA. Panels A, B and C, horizontal axis: time (min). Legend: MG signal background, MG signal 5 nM, corrected signal.]

First, the measurements are corrupted with noise, i.e.

$$y = f(t) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2)$$

where $f(t)$ is some deterministic process generating the noise-free data and $\epsilon$ the Gaussian noise. This is particularly the case for the malachite green fluorescence measurements. Second, the time points at which the measurements are obtained are not uniformly spaced due to inconsistent preparation times of the experiments. This leads to a heterogeneous distribution of the measurements along the time axis. Last, for malachite green, a substantial part of the measured signal stems from a background signal caused by unbound malachite green, so the signals need to be corrected by subtracting the background part. However, due to the non-uniform temporal spacing of the measurements, a correction of the background requires some kind of model or interpolation scheme for the data. We therefore assume that the measurement noise is i.i.d. and model the time series for each experimental condition as a Gaussian process, i.e.

$$y \sim \mathcal{GP}\left(\mu,\; k(t, t'; \theta) + \sigma^2 \delta_{tt'}\right)$$

where $\mu \in \mathbb{R}$ is a constant mean, $k$ is chosen as a squared exponential kernel parametrized with $\theta$, and $\delta_{tt'}$ is the Kronecker delta.

Now let $y^{(ctrl)}$ and $y^{(e)}$ be the fitted Gaussian processes of a control experiment without any DNA and of some other experimental condition, with predicted means $\mu_*^{(ctrl)}$, $\mu_*^{(e)}$ and predicted standard deviations $\sigma_*^{(ctrl)}$, $\sigma_*^{(e)}$ as derived in [34] and depicted in Fig. 12 as dashed blue lines (mean) and light blue shaded area (standard deviation). The background corrected signal $\tilde{y}^{(e)}$ is then determined by

$$\tilde{\mu}_*^{(e)} = \mu_*^{(e)} - \mu_*^{(ctrl)}$$
$$\left(\tilde{\sigma}_*^{(e)}\right)^2 = \left(\sigma_*^{(e)}\right)^2 + \left(\sigma_*^{(ctrl)}\right)^2,$$

as depicted in Fig. 12 C.

Finally, the fluorescence signals are converted from the arbitrary intensity unit into a concentration unit, using the previously obtained calibration relations

1723 a.u. = 1 µM GFP
775 a.u. = 1 µM mRNA.

B Genetic constructs

Gene      Functional contents    Sequence information
D         pBest-deGFP-MGapt      addgene.org/67734/
D         pTar-tetO-deGFP        see * for sequence
D_s28     pBest-σ28              addgene.org/45779/
D_tetR    pBest-tetR             addgene.org/45778/

* Sequence of pTar-tetO-deGFP:
GGCATGCCAAGCTTCAATAAAGTTTCCCCCCTCCTTGCCGATAATCCCTATC
AGTGATAGAGAGCTAGCAATAATTTTGTTTAACTTTAAGAAGGAGATATACCA
TGGAGCTTTTCACTGGCGTTGTTCCCATCCTGGTCGAGCTGGACGGCGACGTA
AACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGG
CAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGC
CCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCC
GACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGT
CCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCG
AGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATC
GACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAA
CAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGA
ACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCAC
TACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCA
CTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATC
ACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCAGAAGGGAAGAAAGA
GCAAAGAAGGTAGCATAA

C Models

This section extends the results presented in Section 3.2.

Proof of Proposition 1.
By setting $\frac{d}{dt}(A{:}B) = 0$, one arrives at the quadratic equation

$$(A{:}B)^2 - A{:}B\,(K + A + B) + AB = 0$$

which in general can have either none, exactly one or two real solutions, determined by the discriminant

$$\Delta = (K + A + B)^2 - 4AB.$$

Taking into account that only $K > 0$, $A > 0$ and $B > 0$ are biologically meaningful, one finds

$$\Delta = K^2 + 2KA + 2KB + (A - B)^2 \geq 0,$$

thus at least one real solution exists. For the existence of exactly one solution, one would need $K = 0$, which was excluded previously. Otherwise, the quadratic formula yields

$$A{:}B_{1,2} = \frac{1}{2}\left(K + A + B \mp \sqrt{(K + A + B)^2 - 4AB}\right)$$

where we assign $A{:}B_1$ to the solution with the negative and $A{:}B_2$ to the one with the positive sign. Due to mass conservation, we are interested in the solution for which

$$0 \leq A{:}B \leq \min(\{A, B\}) \qquad (25)$$

holds. Now, $0 \leq A{:}B_i$ for both $i \in \{1, 2\}$ follows directly from

$$(K + A + B)^2 \geq (K + A + B)^2 - 4AB,$$

and with

$$\frac{1}{2}(K + A + B) \geq \frac{1}{2}\left(K + 2\min(\{A, B\})\right) > \min(\{A, B\})$$

it can be seen that $A{:}B_2$ violates (25). It remains to realize that

$$K + A + B - 2\min(\{A, B\}) \leq \sqrt{(K + A + B)^2 - 4AB}$$

to conclude that $A{:}B_1$ is the only biologically meaningful solution.

D pTar promoter characterization

The time series data of the pTar characterization experiment is depicted in Fig. 13.
[Figure 13: Concentration of GFP over time (GFP in nM versus time in min). pTar concentration at 5 nM, varying sigma factor DNA D_s28 (0, 0.04 and 0.4 nM, increasing from left to right) and inhibitor aTc concentrations (0, 100 and 1000 nM, increasing from top to bottom).]

E Limitations of the Describing Function approach

The way the Describing Function approach has been used in Section 4.3.1, we assume that higher harmonics can be neglected in the output signal. This, however, may not be the case for every combination of the amplitude parameters and the frequency $\omega$ of the input signal given in (22). Therefore, we analyzed the output signal of the nonlinear Model 2 in terms of its first 10 Fourier coefficients and calculated the proportion of the basis frequency in the power spectrum, i.e.

$$p_{rel} = \frac{|c_1(\omega)|}{\sum_{n=1}^{10} |c_n(\omega)|}. \qquad (26)$$

[Figure 14: Proportion of basis frequency in the power spectrum of the output signal generated by the nonlinear system subject to input (22) over different parameters of the input signal.]

If $p_{rel} \approx 1$, this indicates that higher harmonics can be neglected. As shown in Fig. 14, this is not always the case. For large values of the relative amplitude and input frequencies in the range $\omega \in [10^{-2}, 10^{0}]$, the value of $p_{rel}$ drops below 0.8, suggesting that the output signal will be significantly influenced by frequency components other than the basis frequency $\omega$. This means that the output signal will have a distorted shape.
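The share of the basis frequency defined in (26) can be estimated numerically from one sampled period of an output signal. The following is only a minimal sketch under stated assumptions: the function names, the sampling scheme and the two test signals are hypothetical and not taken from the paper, and since it is an assumption whether magnitudes or squared magnitudes of the coefficients are meant, the exponent is left as a parameter.

```python
import cmath
import math


def fourier_coefficients(samples, n_max=10):
    """Complex Fourier coefficients c_1..c_n_max of one uniformly sampled period."""
    count = len(samples)
    coeffs = []
    for n in range(1, n_max + 1):
        total = sum(
            y * cmath.exp(-2j * math.pi * n * k / count)
            for k, y in enumerate(samples)
        )
        coeffs.append(total / count)
    return coeffs


def base_frequency_share(samples, n_max=10, exponent=1):
    """Share of the first harmonic among the first n_max harmonics.

    exponent=1 uses coefficient magnitudes, exponent=2 uses powers
    (which of the two applies is an assumption of this sketch).
    """
    mags = [abs(c) ** exponent for c in fourier_coefficients(samples, n_max)]
    total = sum(mags)
    return mags[0] / total if total > 0 else float("nan")


if __name__ == "__main__":
    count = 512
    # Undistorted sinusoid with a constant offset: almost no higher harmonics.
    clean = [2.0 + math.sin(2 * math.pi * k / count) for k in range(count)]
    # Hypothetical distorted output: the negative half-wave is clipped to zero.
    clipped = [max(math.sin(2 * math.pi * k / count), 0.0) for k in range(count)]
    print("clean  :", round(base_frequency_share(clean), 3))
    print("clipped:", round(base_frequency_share(clipped), 3))
```

A share close to 1 for the clean signal and a clearly smaller share for the clipped one mirrors the qualitative statement above: the more the nonlinearity reshapes the sinusoid, the less of the output is carried by the basis frequency.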
1 Introduction

The systematic design of functional genetic circuits is one of the key challenges in the field of synthetic biology. Usually, the goal is to add a desired function to a cellular organism. As the complexity of these functions has been increasing steadily [1], it becomes increasingly difficult to design the topology of the genetic network and decide what kind of genetic interactions to use. One way to approach this synthesis problem is by adapting methods from the fields of systems and control theory [2], e.g. by starting with a description of the desired part as a linear transfer function, finding the necessary fundamental input/output functions which realize this transfer function and then realizing the resulting network topology with primitive genetic interactions. The key to this approach is to determine how fundamental linear I/O functions like gain, integrator, sum and difference can be realized using only primitive genetic interactions such as transcription, translation, combinatorial promoters, post-transcriptional modification or pairwise interactions of DNA, mRNA or protein molecules.

This design workflow follows the ideas of [3], where the authors showed that any arbitrary linear input/output system can be realized exactly using only zeroth and first order biochemical reactions. We addressed the question of replacing the zeroth and first order biochemical reactions with general genetic interactions in [4]. Therein, several requirements were introduced to conclude that the processes of transcription and translation can be interpreted as gain and integration, respectively, and that combinatorial promoters may be used to realize the difference of two concentrations. In [4], and also in this work, we use these results to design a genetic signal differentiator, i.e. a genetic part whose output indicates the temporal derivative of its input. Such a module would be of particular interest in the context of a genetic PID controller that could be used to regulate production processes within a cell. While for this purpose the genetic realization of the more important integral feedback has been studied extensively [5, 6, 7, 8, 9, 10], differential operators in a biological context have been investigated rather sporadically [11, 12] and have only recently moved into the focus of synthetic biology [13]. In the latter work, the authors introduce a differentiator module based on mechanisms borrowed from the E. coli chemotaxis regulatory network. This mechanism is based on active enzyme-like degradation and the assumption that this degradation operates at saturation of the enzyme.

Figure 1: Ideal approximation of a differentiator, from [4].
In contrast to the results of [13], the topology presented in [4] is not based on a known biological example but is derived from scratch, using an adjusted version of the general design framework of [3]. This leads to a differentiator module of similar complexity but with different assumptions and requirements which need to be guaranteed. In this work, we combine control theoretic concepts, mathematical models and observations from experiments to verify and adapt the requirements introduced in [4]. We find that, in cell-free extract, transcription can be considered as a PT1 element, i.e. a delayed gain, while translation can indeed be seen as an integrator. Further, we show that combinatorial promoters are not very well suited to realize the difference of two signals and that the dynamics depend strongly on the operating conditions. Lastly, we study how not meeting the requirements affects the performance of the genetic signal differentiator and reveal the operating conditions under which the differentiator behaves as expected and those under which it does not.

In the following, we first introduce the desired signal differentiator, one possible topology to realize this part and the necessary requirements for primitive genetic interactions by recapitulating the results established in [4]. Afterwards, we introduce mathematical models of protein synthesis as well as the cell-free experimental environment which is used to generate the experimental data. Subsequently, the requirements on time-scales and linear operation regimes of the processes of transcription and translation are verified by fitting the model to a series of experimental data and analyzing the resulting parameters, leading to transfer function representations of these two processes.
Using another series of experiments, we determine the input-output steady-state map of a combinatorial promoter and discuss the limited capability of such promoters to realize the difference of two signals. Finally, the impact of the discovered discrepancies on the performance of the genetic differentiator is studied both in time and frequency domain, using a describing function approach for the latter.

2 Background

First, we briefly recapitulate the results from [4] before we analyze, verify and adjust the requirements we introduced therein.

In the field of control theory, one can study linear systems in two different domains. First, in the time domain, by looking at the states of a system and the temporal derivatives thereof, which define a system of ordinary differential equations (ODEs). And second, in the frequency domain, by looking at transfer functions, which are complex valued functions and describe how different frequency components of an input signal are modified by a system. These two domains are connected via the Laplace transformation, and the frequency domain in particular is very useful for the design and analysis of linear systems. An ideal differentiator would be given by the transfer function $G(s) = s$ with Laplace variable $s$. However, as is well known in the control community, an exact realization of such an ideal differentiator is not possible due to the lack of causality. For a system to be causal, its output must not depend on future values of the input signal. The ideal differentiator does not satisfy this condition. In case the system is given in the form of a rational transfer function, i.e. $G(s) = N(s)/D(s)$, one can easily check for this property by examining the degrees of the polynomials $N(s)$ and $D(s)$: causality is given if the degree of $N(s)$ is not larger than the degree of $D(s)$. The desired function thus can only be approximated, e.g. by adding an additional low-pass filter to the ideal differentiator, leading to the desired transfer function

$$G(s) = \frac{K s}{s + K} \qquad (1)$$

where $K$ is the bandwidth of the filter. One possibility to realize this transfer function is the circuit depicted in Fig. 1, with a (preferably large) gain $K$ in the forward path and a weighted integrator in the feedback path. Ideally, the weight of the integrator is set to 1 to recover (1). Thus, in order to approximate the differentiator, three basic functions are needed: a gain, an integrator and the signal difference between input and feedback.
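The feedback structure described above can be checked numerically. The sketch below is only an illustration: the function names, the parameter name `weight` for the integrator weight and the value K = 10 are assumptions made here, not taken from the paper. It closes the loop around a forward gain and an integrator in the feedback path, confirms that this reproduces the closed form of (1) when the weight is 1, and compares the gain with that of the ideal differentiator.

```python
import cmath
import math


def closed_loop(s, gain, weight=1.0):
    """Forward gain with integrator weight/s in the feedback path."""
    return gain / (1.0 + gain * weight / s)


def approx_differentiator(s, gain, weight=1.0):
    """Closed form gain*s / (s + gain*weight); equals Eq. (1) for weight = 1."""
    return gain * s / (s + gain * weight)


def frequency_response(omega, gain, weight=1.0):
    """Return (magnitude, phase in degrees) at angular frequency omega."""
    value = approx_differentiator(1j * omega, gain, weight)
    return abs(value), math.degrees(cmath.phase(value))


if __name__ == "__main__":
    K = 10.0  # hypothetical filter bandwidth
    for omega in (0.01, 0.1, 1.0, 10.0, 100.0):
        magnitude, phase = frequency_response(omega, K)
        # The ideal differentiator G(s) = s has magnitude omega and phase 90 deg.
        print(f"omega={omega:7.2f}  |G|={magnitude:8.4f}  ideal={omega:8.4f}  "
              f"phase={phase:5.1f} deg")
```

Well below the bandwidth K the magnitude follows that of the ideal differentiator and the phase stays near 90 degrees; above K the magnitude levels off at K, which is what makes the approximation causal.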
Finding genetic realizations of these basic functions is the main challenge in designing the differentiator. In particular, it is expected that this cannot be achieved in an exact way; thus it is necessary to determine how inaccuracies in the basic parts influence the behavior of the assembled circuit. As an initial guess for finding such functions, a semi-mechanistic model of transcription and translation [14] was used in [4] to conclude that the processes of transcription and translation can approximately be seen as a gain and an integrator, respectively, and that a combinatorial promoter may be used to realize the difference of two signals. In the remainder of this section, we briefly recapitulate these deductions.

Figure 2: Genetic differentiator: Genes $G_1$ and $G_2$ tracking positive and negative slopes of $u$. Proteins produced by $G_1$ and $G_2$ neutralize each other. The difference of the associated mRNAs indicates the output $y$.

In the process of protein synthesis, the genetic information is read from DNA (with concentration $D_i$) and transcribed into mRNA ($M_i$); then, mRNA molecules are translated into proteins ($P_i$). In the following, the subscript $i$ stands for the $i$-th gene ($G_i$) in a network with $I$ distinct genes. With $P = [P_1 \ldots P_I]^T$ representing all proteins present in the genetic network, the dynamics of mRNA and protein concentrations of gene $i$ are described by

$$\dot{M}_i = f_i(P, \theta_i, \rho) - p_i(M_i, \theta_i, \rho) \qquad (2a)$$
$$\dot{P}_i = g_i(M_i, \theta_i, \rho) - q_i(P_i, \theta_i, \rho) \qquad (2b)$$

where $f_i(P, \theta_i, \rho)$ and $g_i(M_i, \theta_i, \rho)$ are the respective production rates and $p_i(M_i, \theta_i, \rho)$ and $q_i(P_i, \theta_i, \rho)$ the respective degradation rates. These rates possibly depend on protein and mRNA concentrations, certain real-valued gene specific parameters $\theta_i$ like DNA concentrations ($D_i$) or initiation and degradation rates, as well as several real-valued environmental parameters $\rho$ which include, among others, the total amount of RNA polymerase (RNAP), ribosomes and endonucleases, the transcription and translation elongation rates, and other host dependent variables. For better readability the arguments $\theta_i$ and $\rho$ are omitted in the remainder.

In [4], we introduced the topology depicted in Fig. 2 as one approach to realize the transfer function (1). Therein, the input is considered to be a transcription factor, i.e. $u = P_u$, which activates gene $G_1$ and inhibits another gene $G_2$. Each of these genes produces a transcription factor which suppresses its own production. While $G_1$ has the purpose of capturing positive gradients of the input signal, $G_2$ is designed to capture negative ones. The output of the part is then given as the difference between the mRNA concentrations of the two genes, i.e. $y = M_1 - M_2$. Further, for the purpose of a minimal signal representation, the transcription factors $P_1$ and $P_2$ undergo an annihilation reaction.

Several simplifications and requirements for the processes of transcription, translation and degradation were introduced to finally arrive at the desired model equations

$$\dot{M}_1 = \alpha D_1 \varphi(P_u - P_1) - \delta_1 M_1 \qquad (3a)$$
$$\dot{P}_1 = \beta M_1 - \gamma_1 P_1 - \eta_{12} P_1 P_2 \qquad (3b)$$
$$\dot{M}_2 = \alpha D_2 \varphi(k_0 - P_u - P_2) - \delta_2 M_2 \qquad (3c)$$
$$\dot{P}_2 = \beta M_2 - \gamma_2 P_2 - \eta_{12} P_1 P_2 \qquad (3d)$$

with the function

$$\varphi(x) = \begin{cases} x & x > 0 \\ 0 & x \leq 0 \end{cases} \qquad (4)$$

assuring non-negative transcription rates. In the following, we focus on $G_1$, the gene for capturing positive gradients, and recapitulate the requirements for the biological processes necessary to arrive at (3). Subsequently, the connection between (3) and (1) will be discussed. We note that the focus on $G_1$ is without any loss of generality, as the following requirements can be adjusted with minimal effort to arrive at the equations for $G_2$.

Requirement 1. $M_1$ and $P_1$ are subject to first order degradation, i.e.

$$p_1(M_1) = \delta_1 M_1 \qquad (5a)$$
$$q_1(P_1) = \gamma_1 P_1 \qquad (5b)$$

with real-valued degradation rate constants $\delta_1$ and $\gamma_1$.

Although the degradation rates $p_i$ and $q_i$ usually depend on protease and endonuclease levels, we require first order degradation dynamics to assure linearity with respect to mRNA and protein levels.

Requirement 2. The operation regime is such that $f_1$ and $g_1$ are both approximately linear in $D_1$ and $M_1$, respectively.

This requirement is justified by results like the ones presented in [15], where particularly the linearity of $g_i$ in $M_i$ is shown. Alternatively, similar simplifications have been applied by following a linearization approach as pursued in [16].
In general, however, although the transcription rate $f_i$ increases monotonically with DNA concentration $D_i$, it cannot grow arbitrarily large but is subject to saturation effects for large enough DNA or transcription factor concentrations, see e.g. [17, 14].

Requirement 3. There exists a combinatorial promoter which is piecewise linear in two inputs, such that

$$f_1([P_u, P_1]^T) \propto \varphi(P_u - P_1)$$

with $\varphi(\cdot)$ as in Eq. (4).

With this requirement, we demand that the combined effect of the two transcription factors is proportional to the difference of their concentrations as long as $P_u > P_1$, and zero otherwise. In other words, $f_1$, as a function of $[P_u, P_1]^T$, needs to fulfill the fundamental additivity property of linear functions in the regime $P_u > P_1$. This further means that, as we are considering a combinatorial promoter, $P_u$ has to act as an activator for $G_1$ while $P_1$ acts as an inhibitor. Consequently, instead of forming the difference between input $P_u$ and integral feedback $P_1$ by using direct interactions between the two species, we move the difference operation to the promoter function.

Now if Requirements 2 and 3 hold, we find

$$f_1([P_u, P_1]^T) \approx \alpha D_1 \varphi(P_u - P_1) \qquad (6a)$$
$$g_1(M_1) \approx \beta M_1, \qquad (6b)$$

where $\alpha$ and $\beta$ stand for lumped production rate parameters. Thus, with Requirements 1 to 3, we arrive at the first part of Eq. (3). Note that, when considering both genes $G_1$ and $G_2$, this means that the transcription and translation rate constants $\alpha$ and $\beta$ are assumed to be equal for both genes. Also, it is required that $P_u > P_1$ for the part to work properly. For this reason, the annihilation reaction between $P_1$ and $P_2$ was introduced, see [4] for more details.

Finally, concerning an appropriate choice of parameters, another requirement can be deduced from typical degradation rates given e.g. in [18].
Requirement 4. The degradation of mRNA is much faster than that of protein.

With that in mind, one can apply a quasi-steady-state approximation of the mRNA dynamics and further neglect protein degradation to arrive at

$$\tilde M_1 \propto D_1 (P_u - P_1), \qquad \dot P_1 \approx \beta \tilde M_1,$$

where $\tilde M_1$ stands for the steady-state mRNA concentration. Thus, we conclude that the process of transcription can be interpreted as a gain while translation approximately realizes an integrator. With the signal entering the transcription process chosen as the residual of input $P_u$ and integral feedback $P_1$, the presented model thus realizes Eq. (1).

In [Halter2017b], we verified this structure by simulating the system based on the much more detailed model described in [Halter2017a]. This detailed model mainly aims at taking the finite amounts of RNAP and ribosomes as well as the time delay of transcription and translation into account; however, the chosen parameters only reflected average values from the literature. Further, saturation effects and nonlinearities of the promoter dynamics were neglected. After recapitulating the results of [Halter2017b] and realizing the limitations of the models used, we now adjust our modeling approach and focus on analyzing and verifying the requirements by conducting a series of experiments using a cell-free experimental system [Takahashi2015].

3 Materials and methods

In this section, a brief overview of the experimental technique as well as the subsequently used models is provided.

3.1 TX-TL experimental platform

For the purpose of establishing a reliable, efficient and fast prototyping environment for genetic circuits, various cell-free TX-TL systems have been developed and optimized during the past decade [Noireaux2003, Shin2010a, Karig2012, Sun2013, Chappell2013, Takahashi2015]. The main advantages of cell-free over classical cell-based in vivo systems arise because cellular systems impose certain physical constraints on the gene circuits and because the incorporation of the desired genes is comparably time consuming. Cell-free extracts, on the other hand, provide a well reproducible platform for rapid testing of arbitrary gene circuits. Such an extract can for instance be produced from Escherichia coli (E. coli) bacteria by bead-beating cell resuspensions, see [Sun2013] for more details on the production of E. coli extract. As DNA formatting and transformation as well as cell growth are thus decoupled from the actual testing of the circuit, testing cycles can be sped up significantly, from several days for testing in original cells to only a few hours for testing in cell-free extract.
However, regeneration of the resources required for mRNA and protein synthesis is an issue in cell-free environments, which is why the dynamics of mRNA and protein production are overlaid by degradation dynamics of the extract itself. Therefore, the experiments are only meaningful for a limited duration and we only consider observations within the first 200 minutes after initiation of the experiment. However, even in this limited time frame, degradation of resources will be visible in the experimental data. Since this mechanism is not considered in the mathematical models, the identified parameters will be biased: production parameters tend to be underestimated while degradation parameters tend to be overestimated.

For every TX-TL experiment, the DNA subject to testing is suspended in water and mixed with cell extract and an energy buffer. This buffer contains amino acids, NTPs, tRNAs and other small molecules necessary for mRNA and protein synthesis. The reaction volume was chosen to be 5 µL. Usually, one or more genetic constructs encode a fluorescent reporter protein such as GFP. After initialization of the experiment, the mixture is incubated at 29 °C inside a Biotek plate reader, which assesses the level of fluorescent protein every few minutes.

While the concentration of a fluorescent protein like GFP can be assessed directly, measuring the amount of mRNA requires an additional mechanism. We therefore make use of the malachite green dye (20 µM) and a corresponding aptamer sequence (MGapt) which is added to the 3' untranslated region (UTR) of the gene. The dye binds to a binding pocket of this sequence and changes its emission properties upon binding, which again enables us to monitor a fluorescence signal that is proportional to the mRNA concentration [Grate1999]. However, measurements of the mRNA signal obtained via the malachite green dye revealed only a poor signal-to-noise ratio; therefore, an additional data pre-processing step was introduced by fitting a Gaussian process to the experimental data. Details on the pre-processing procedure can be found in ??.

In this work, we distinguish between gene specific and extract specific parameters. Gene specific parameters include variables like the affinity of the particular promoter sequence towards RNAP and other proteins and are by definition considered to be independent of the environment the experiment is conducted in, i.e. they hold in different batches of cell extract as well as inside living cells. In contrast, the remaining parameters, like the concentration of RNAP or transcription and translation elongation rates, are denoted as extract or environment dependent and thus may vary even between different batches of cell-free extract. The experiments presented in this work have all been conducted using the same batch of TX-TL extract.

All genetic parts were originally given as plasmids. Using polymerase chain reactions and appropriate primer sequences, only the relevant linear double-stranded gene sequence was extracted from these plasmids and used in the TX-TL experiments. By addition of the protein gamS, the degradation of linear DNA is prevented [Sun2014]. Information about the genetic constructs used can be found in ??.

3.2 Modeling protein synthesis

Throughout this work, different promoters are discussed and analyzed for various purposes. Therefore, the different mechanisms and the modeling framework used for simulating the temporal evolution of mRNA and proteins are introduced. We build upon the dynamics given in Eq. (2), but avoid simplifications as strict as the ones outlined in Section 2.
In the following, complexes of two chemical species A and B are denoted with A:B and conserved quantities are indicated by a bar, e.g. $\bar R$, the total amount of RNAP.

It is a well established result [Gruber2003, DelVecchio2015] that the production rate of mRNA $f$ is proportional to the concentration of promoter which is bound to a corresponding RNAP holoenzyme and not blocked by any inhibitors, e.g.

$$f_i(P) = \alpha\, D_i{:}R{:}\sigma_{70}(P, \theta_i, \eta) \qquad (7)$$

where the concentration of the complex $D_i{:}R{:}\sigma_{70}$ may depend on other proteins $P$, gene specific parameters $\theta_i$ and extract specific parameters $\eta$. In this example, sigma factor 70 ($\sigma_{70}$) first has to bind to RNAP to form the holoenzyme before this complex then binds the promoter region. The sigma factor has a very high specificity towards certain promoters, enabling the cell to switch between different transcriptional programs depending on which sigma factor is expressed. Note that compared to (6a), this is a more realistic model for mRNA production but prohibits making the same deductions for the genetic differentiator.

The basic mechanisms of interest for us are binding and unbinding reactions happening at the promoter sequence of DNA. Usually, as in [DelVecchio2015, Gyorgy2016], the amount of $D_i{:}R{:}\sigma_{70}$ is approximated by Michaelis-Menten-like equations, assuming that either DNA or RNAP holoenzyme is in abundance. In contrast, we do not make this assumption but explicitly take the binding and unbinding reactions into account in order to consider both competition for shared cellular resources and saturation effects at the promoter. For simple setups where only self-competition occurs, we derive a closed-form expression for the steady-state concentration of the respective biochemical complexes.
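As a rough illustration of such a closed-form expression, the sketch below solves the mass-action steady state of a single reversible binding reaction A + B <-> A:B with dissociation constant K and conserved totals, i.e. K * [A:B] = (A_total - [A:B]) * (B_total - [A:B]), and compares it with a Michaelis-Menten-like approximation that assumes B is abundant. The function names and the numerical values are illustrative assumptions, not fitted quantities from this work.

```python
import math


def complex_steady_state(a_total, b_total, k_d):
    """Steady-state concentration x of A:B for A + B <-> A:B.

    Solves k_d * x = (a_total - x) * (b_total - x) and returns the smaller
    root, which is the only one satisfying x <= min(a_total, b_total).
    """
    s = k_d + a_total + b_total
    return 0.5 * (s - math.sqrt(s * s - 4.0 * a_total * b_total))


def abundant_b_approximation(a_total, b_total, k_d):
    """Michaelis-Menten-like form, reasonable only if B is in large excess."""
    return a_total * b_total / (k_d + b_total)


if __name__ == "__main__":
    # Hypothetical totals: one case with B in excess, one with A and B comparable.
    for a_total, b_total in [(1.0, 100.0), (10.0, 10.0)]:
        exact = complex_steady_state(a_total, b_total, k_d=1.0)
        approx = abundant_b_approximation(a_total, b_total, k_d=1.0)
        print(f"A={a_total}, B={b_total}: exact={exact:.4f}, approx={approx:.4f}")
```

When B is in large excess the two expressions nearly coincide, while for comparable totals the approximation overestimates the complex, which is the situation in which the exact treatment matters.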
3.2.1 Holoenzyme formation

When RNAP $R$ is bound to a sigma factor $\sigma$, this complex is referred to as the RNAP holoenzyme. As discussed briefly in the previous section, such a holoenzyme binds to the promoter sequence of a gene and initiates the transcription process. Sigma factors are therefore a crucial component of this process: without the right sigma factor, transcription cannot initiate. According to [Courey2008], RNAP alone is sufficient for transcription elongation, whereas initiation requires sigma factors. We therefore assume that the formation of holoenzyme is independent of the holoenzyme binding to the promoter sequence, meaning that sigma factor and RNAP can bind and unbind irrespective of whether RNAP is bound to DNA or not. We thus have to consider the reactions

$$R + \sigma_x \rightleftharpoons R{:}\sigma_x \qquad (8a)$$

$$D_i{:}R + \sigma_x \rightleftharpoons D_i{:}R{:}\sigma_x \qquad (8b)$$

for each sigma factor $\sigma_x$ and DNA species $D_i$ present in the system in order to account for the competition for RNAP. To simplify (8), we introduce

$$\overline{R{:}\sigma_x} = R{:}\sigma_x + D_i{:}R{:}\sigma_x, \qquad X{:}R = R + D_i{:}R,$$

the total amount of $R$ bound to $\sigma_x$ as well as the total amount of $R$ which is not bound to its respective sigma factor. Then, (8) can be combined to

$$X{:}R + \sigma_x \rightleftharpoons \overline{R{:}\sigma_x}. \qquad (9)$$

In most cases, only dissociation constants $K$, i.e. the ratios of off and on rates, are identifiable, and it is assumed that binding reactions are fast compared to the transcription elongation steps and thus in quasi steady state. Therefore, for notational simplicity, we will use dissociation constants instead of on and off rates in the remainder of this work. Note that in (9), $X{:}R$ and $\sigma_x$ both denote the unbound chemical species. If only one sigma factor is present in the system, the amount of $\overline{R{:}\sigma}$ can be calculated analytically as a function of the dissociation constant $K$ and the total amounts of RNAP and sigma factor, respectively, viz. by application of the following proposition.
Proposition 1. Given the entities $A$, $B$ and $A{:}B$ and the reaction

$$A + B \overset{K}{\rightleftharpoons} A{:}B,$$

if none of the entities participates in any other chemical reaction, the steady state of $A{:}B$ can be expressed in terms of the total amounts of $A$ and $B$ as

$$A{:}B = \tfrac{1}{2}\left(K + \bar A + \bar B - \sqrt{(K + \bar A + \bar B)^2 - 4 \bar A \bar B}\right) \qquad (10)$$

with $\bar A = A + A{:}B$ and $\bar B = B + A{:}B$.

The proof can be found in ??. It is noted that usually, i.e. for the deduction of Michaelis-Menten kinetics, it is assumed that either $\bar A \gg \bar B$ or $\bar B \gg \bar A$ holds, while Proposition 1 gives exact solutions for any values of $\bar A$ and $\bar B$.

In cases when only a single sigma factor is present and its total concentration is constant over the time course of the experiment, we will later on use the amount $\overline{R{:}\sigma}$ as a fitting parameter and omit the binding reaction in order to reduce the complexity of the fitting problem. However, in cases where the concentration of sigma factor varies over time, we either use the exact formula from Proposition 1 or, if there is more than one sigma factor, directly implement the binding reactions as fast reactions and accept the increased computational complexity.

3.2.2 Promoter binding

After formation of $R{:}\sigma$, the RNAP holoenzyme binds to the promoter sequence and starts transcribing the information encoded as DNA. A promoter is called constitutive if this binding of RNAP happens spontaneously and is not influenced by any activators or inhibitors, i.e.

$$D_i + R{:}\sigma_x \overset{K_{iH}}{\rightleftharpoons} D_i{:}R{:}\sigma_x. \qquad (11)$$

In such cases, given that the promoter does not interact with other holoenzymes, Proposition 1 can be applied again to simplify the modeling formalism.

In contrast to a constitutive promoter, binding of RNAP can also be inhibited by other proteins, leading to a combinatorial promoter with a competitive binding mechanism, i.e. by the additional reaction

$$D_i + P_j \overset{K_{ij}}{\rightleftharpoons} D_i{:}P_j \qquad (12)$$

which now competes with (11).

3.2.3 Translation and degradation rates

Similarly to the transcription rate (7), the rate of translation is given by

$$g_i(M_i) = \beta\, M_i{:}Q(M_i, \theta_i, \eta), \qquad (13)$$

where $M_i{:}Q$ stands for the concentration of ribosomes ($Q$) bound to the ribosome binding site of mRNA $M_i$. We assume unregulated ribosomal binding and that the ribosome binding site sequences used for the constructs are of equal strength. Thus, the reactions for forming the complex $M_i{:}Q$ are the same as for the formation of holoenzyme and consequently, in case only one mRNA species is present, Proposition 1 can be applied again. Whenever more than one mRNA species is considered, competition for ribosomes occurs and the binding reactions are implemented explicitly.

Degradation of mRNA and protein is mainly influenced by third party molecules such as endonucleases ($E$) and proteases. It is known [Shin2010b] that the latter species is quasi non-existent in TX-TL extract, thus we keep the first order degradation for proteins as in (5b). Endonucleases, on the other hand, are present in limited quantities, thus loading effects need to be considered. We explicitly assume that the binding of ribosomes and the binding of endonucleases are independent of each other, i.e. they can be seen as two distinct processes in which ribosomes and endonucleases do not compete for mRNA. Thus, once more we define

$$p_i(M_i) = \gamma\, M_i{:}E(M_i, \theta_i, \eta) \qquad (14)$$

and apply Proposition 1 whenever only self-competition occurs.

Figure 3: Scheme of probing protein synthesis with step in DNA and expected responses.

4 Results

Given the foundational work summarized in Section 2, it is as yet unclear to what extent Requirements 1 to 4 can be verified. In particular, we are interested in answering the question of whether the processes of transcription and translation can indeed be regarded as a gain and an integrator, respectively (Requirements 1, 2 and 4), and further, whether one can find a suitable combinatorial promoter which satisfies all linearity requirements in order to verify Requirement 3.

4.1 I/O behavior of transcription and translation

First, we analyze the time-scales and linearity of transcription and translation. To this end, the input/output (I/O) behavior of these processes is characterized by experimentally probing a simple gene with different input steps as depicted schematically in Fig. 3. By observing the response to different step sizes in the input, the nonlinearity of the promoter dynamics can be identified. The gene we study is equipped with a $\sigma_{70}$-dependent constitutive promoter and expresses GFP. By fitting a suitable model to the experimental data and analyzing the corresponding parameters, Requirements 1 and 4 will be verified.

4.1.1 Experimental setup

There are two possibilities to realize a step-like input of varying height at the transcriptional level using promoters as introduced in Section 3.2: either by varying the amount of sigma factor (i.e. the transcription factor) while keeping the DNA concentration constant, or alternatively, by changing the DNA concentration itself.
While varying DNA amounts is straightforward, the sigma factor input additionally requires purified protein, which may be biologically unstable and is more difficult to obtain than DNA.

Figure 4: Mean and 95% confidence interval of experimental step responses (blue, dotted mean, shaded confidence interval) and simulated step responses of the fitted nonlinear model (red, solid).

Table 1: Values of the parameters obtained by fitting the nonlinear model to step response data.

Depending on the choice of input, i.e. sigma factor or DNA, different dynamical effects can be expected when probing the system with steps of different height. As discussed before in Section 3.2, the mRNA production rate is proportional to the complex $D_i{:}R{:}\sigma_{70}$, the concentration of which depends on the total amounts of DNA, RNAP and sigma factor. In case the concentration of sigma factor is considered as input, the corresponding model needs to incorporate both the formation of holoenzyme and the binding of holoenzyme to the DNA. Thus, both binding reactions would need to be considered. In contrast, when varying the DNA concentration, the binding reaction of the holoenzyme can be neglected and the total amount of holoenzyme $\overline{R{:}\sigma}$ can be introduced instead.

This approach reduces the complexity of the fitting problem by focusing on the identification of promoter binding kinetics only. Thus, for the identification of the I/O behavior of transcription and translation, we first limit ourselves to step inputs in the form of varying DNA concentrations and study the sigma factor dependent holoenzyme formation in a separate experiment, discussed in Section 4.2. We choose four different DNA concentrations for probing the system: 1 nM, 3 nM, 5 nM and 10 nM. Three technical replicates were conducted. The data obtained by this process are depicted in Fig. 4.
Therein, blue dashed lines stand for the mean of mRNA (upper row) and protein (lower row) concentrations and the 95% confidence intervals are illustrated as shaded blue regions, respectively.

4.1.2 Corresponding model

We denote the index of the gene under study with $i = 1$ and accordingly the amount of GFP with $P_1$. According to Section 3.2 and particularly Eqs. (2), (5b), (7), (13) and (14), the corresponding model is determined by the complexes

$$D_1{:}R{:}\sigma_{70} = D_1{:}R{:}\sigma_{70}\left(\bar D_1, \overline{R{:}\sigma_{70}}, K_{1H}\right)$$

$$M_1{:}Q = M_1{:}Q\left(\bar M_1, \bar Q, K_{MQ}\right)$$

$$M_1{:}E = M_1{:}E\left(\bar M_1, \bar E, K_{ME}\right)$$

which are calculated using Proposition 1, depending on the total amounts of DNA, mRNA, RNAP holoenzyme, ribosomes and endonucleases as well as the respective dissociation constants. We again note that the model can capture the dynamics only in a limited time frame, as the degradation of extract is not taken into account.

For fitting the model to the given data, we introduce a maximum likelihood objective function, see e.g. [Klipp2009], and apply several rounds of both the patternsearch and fmincon optimization algorithms implemented in Matlab. The resulting parameters given in Table 1 give rise to the red trajectories depicted in Fig. 4. For the process of translation, we observe that the protein degradation rate $\delta$ is evaluated to be of negligible magnitude and therefore, compared to the mRNA degradation rate $\gamma$, practically zero.

Conclusion 1. As required in Requirement 4, the degradation of mRNA is much faster than that of protein.

In order to check the linearity Requirements 1 and 2, we study the entities $D_1{:}R{:}\sigma_{70}$, $M_1{:}E$ and $M_1{:}Q$ as functions of the fitted parameters over the relevant range of DNA and mRNA concentrations as depicted in Fig. 5. This way, one can visualize the nonlinear nature of the production reactions of mRNA and protein as well as of the degradation of mRNA. Although these results clearly indicate that the processes of transcription and translation do not behave linearly in their inputs in general, they allow us to define operation regimes such as those required in Requirement 2, i.e. where the linearity requirement holds at least approximately.

Figure 5: Amount of active complexes for transcription (A), mRNA degradation (B) and translation (C) over relevant range of DNA and mRNA respectively.

In that sense, we now introduce a relative measure of nonlinearity and define the $\varepsilon$-linear range of a function $f: \mathbb{R} \to \mathbb{R}$ as the largest interval $[0, \lambda]$ for which this nonlinearity measure is at most $\varepsilon$. For the nonlinearity measure we follow the methods introduced in [Allgower1995]. Let $\|f(x)\|_{L_2[0,\lambda]}$ be the truncated L2 norm of $f(x)$, defined by

$$\|f(x)\|_{L_2[0,\lambda]} = \sqrt{\int_0^{\lambda} f(x)^2 \, dx}.$$

To approximate $f$, we use the linear function $mx$. Note that we force the intercept of the linear function to be 0 to assure non-negative values of the linear function on the interval $[0, \lambda]$. For a given $f$, the best linear approximation in the interval $[0, \lambda]$ is then found as the argument $m = m^*$ which minimizes

$$L(\lambda, m) = \|f(x) - mx\|_{L_2[0,\lambda]}, \qquad (15)$$

the absolute L2 norm of the residual between the function $f(x)$ and the linear function $mx$.
The value of $L(\lambda, m^*)$ can now be seen as an absolute measure for the nonlinearity of $f$ on the interval $[0, \lambda]$; however, this measure depends on the magnitude of the function $f$. Thus, in order to compare this measure across different functions, we normalize (15) by the L2 norm of $f$, i.e.

$$L_{rel}(\lambda, m) = \frac{\|f(x) - mx\|_{L_2[0,\lambda]}}{\|f(x)\|_{L_2[0,\lambda]}}$$

to find our relative measure of nonlinearity. Consequently, $\lambda$ is found as the solution of

$$\max_{\lambda} \ \lambda \quad \text{s.t.} \quad \min_{m} L_{rel}(\lambda, m) \le \varepsilon. \qquad (16)$$

In the given case, when one allows for a 5% error, i.e. $\varepsilon = 0.05$, one obtains the linear ranges indicated as black points in Fig. 5.

Conclusion 2. Linearity of production and degradation terms, as requested in Requirements 1 and 2, can be verified with 95% accuracy with

$$D_1{:}R{:}\sigma_{70} \approx A_{tx} D_1 \quad \text{for } D_1 \in [0, 3.805]$$

$$M_1{:}E \approx A_{deg} M_1 \quad \text{for } M_1 \in [0, 593.1]$$

$$M_1{:}Q \approx A_{tl} M_1 \quad \text{for } M_1 \in [0, 141.7]$$

and $A_{tx} = 0.726$, $A_{deg} = 0.752$, $A_{tl} = 0.602$.

4.1.3 Linearized model and transfer functions

Given the linear operation regimes indicated in Conclusion 2, one can now derive linear models for transcription and translation which are valid in the respective regimes. In the control community, the standard approach to approximate a nonlinear model with a linear one is to locally linearize the nonlinear function at one specific value. In case of the nonlinear mRNA degradation rate $p_1$, for example, a linearization around some fixed value $M_1^0$ would yield

$$p_1(M_1) \approx p_1(M_1^0) + \left.\frac{dp_1}{dM_1}\right|_{M_1^0} (M_1 - M_1^0).$$

This approach assures that the linear function evaluated at $M_1^0$ has the same value as the original nonlinear one, and that the difference between the two functions is small in a neighborhood around $M_1^0$. Thus, the quality of the linear model on a certain interval strongly depends on the chosen value $M_1^0$. In our case, particularly the values of $M_1$ may vary across a wide range.
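Returning briefly to the linear ranges of Conclusion 2: the optimization in (15) and (16) can be approximated numerically. The sketch below is not the procedure used for Fig. 5. It assumes a hypothetical saturating function x / (1 + x), replaces the integrals by sums on a uniform grid, uses the least-squares slope of a line through the origin, and scans an ascending list of candidate interval lengths. All names and numbers are illustrative assumptions.

```python
import math


def relative_nonlinearity(f, lam, n=400):
    """Return (relative L2 error, best slope) of the fit f(x) ~ m*x on [0, lam]."""
    xs = [lam * k / n for k in range(n + 1)]
    fx = [f(x) for x in xs]
    # Least-squares slope of a line forced through the origin.
    m_star = sum(v * x for v, x in zip(fx, xs)) / sum(x * x for x in xs)
    # The grid spacing cancels in the ratio, so plain sums stand in for integrals.
    res2 = sum((v - m_star * x) ** 2 for v, x in zip(fx, xs))
    f2 = sum(v * v for v in fx)
    return math.sqrt(res2 / f2), m_star


def epsilon_linear_range(f, eps, lam_grid):
    """Largest lam in the ascending lam_grid whose relative error is at most eps."""
    best = None
    for lam in lam_grid:
        rel_err, m_star = relative_nonlinearity(f, lam)
        if rel_err <= eps:
            best = (lam, m_star)
    return best


def saturating(x):
    """Hypothetical saturating rate, used only for illustration."""
    return x / (1.0 + x)


if __name__ == "__main__":
    grid = [0.05 * k for k in range(1, 201)]  # candidate lam from 0.05 to 10
    lam, slope = epsilon_linear_range(saturating, eps=0.05, lam_grid=grid)
    print(f"5%-linear range: [0, {lam:.2f}] with slope {slope:.3f}")
```

A grid scan is used here only for transparency; a bisection on the interval length would be a natural refinement if the relative error grows monotonically with the interval.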
Further, it should be ensured that in the case when neither DNA nor mRNA or protein is present, the temporal derivatives of these species are also equal to zero, i.e. that

$$\dot M_1(D_1 = 0, M_1 = 0) = \dot P_1(M_1 = 0, P_1 = 0) = 0$$

holds. This is only achieved if all linear functions pass through the origin. To assure this, one would consequently have to perform the linearization at $D_1 = M_1 = P_1 = 0$, leading to potentially large deviations between the linear and nonlinear models at larger values of the independent variables. Therefore, instead of using this standard approach, we directly use the approximations of Conclusion 2, where we already made sure that the linear approximation is as good as possible over a given interval of the independent variable. We thus obtain the linear model

$$\dot M_1 \approx \alpha A_{tx} D_1 - \gamma A_{deg} M_1$$

$$\dot P_1 \approx \beta A_{tl} M_1 - \delta P_1$$

and, when defining $D_1$ and $M_1$ as input and output of the transcription module and $M_1$ and $P_1$ as input and output of the translation module, the corresponding transfer functions

$$G_{tx}(s) = \frac{\alpha A_{tx}}{s + \gamma A_{deg}} \qquad (17)$$

$$G_{tl}(s) = \frac{\beta A_{tl}}{s + \delta} \qquad (18)$$

are obtained. We conclude that, because $\delta$ is very small, translation can indeed be seen as integration as long as DNA and mRNA concentrations are in the appropriate operation regime. However, the initial assumption that transcription can be seen as a gain needs to be adjusted, as mRNA degradation cannot be neglected, leading to a PT1 element instead of a gain.

So far, we studied and characterized time-scales and linearity of the processes of transcription and translation in the context of an E. coli cell-free extract and mainly focused on possible limitations caused by the promoter and mRNA binding kinetics. We therefore bypassed nonlinear effects of RNAP holoenzyme formation by changing DNA concentrations instead of using sigma factor as input and found that, at least during the first 200 minutes of a TX-TL experiment, resource limitations do have an effect on transcription, translation and mRNA degradation. By studying different step responses, the linear operation regimes were identified. We now turn towards inhibitor binding dynamics and in particular towards the problem of how to realize a signal difference using combinatorial promoters.

4.2 Signal difference and combinatorial promoters

Figure 6: Level sets of promoter activity $D_i{:}R{:}\sigma_x$ over varying levels of sigma factor and inhibitor. A: Desired behavior for non-negative signal difference. B1-B4: Simulated values for varying DNA concentrations under a weak repressor. C1-C4: Simulated values for varying DNA concentrations under a strong repressor.

In order to approximate the derivative of a signal by implementing the scheme depicted in Fig. 1, we recall that the input into the gain (i.e. transcription) has to be the residual between the reference and feedback signal. There are various ways to realize a signal difference in biology, a widely used one being sequestration-based mechanisms between the signaling molecules, e.g.
binding and degradation of the complex as elaborated in [Briat2016a, Briat2018]. When dealing with RNA or DNA, such a mechanism can be realized in a straightforward way, e.g. by the use of antisense strands. When it comes to proteins or metabolites, engineering a sequestration mechanism for an arbitrary protein or metabolite may be possible but is in general more challenging. Thus, one is rather restricted to the use of existing pairs of proteins which undergo binding reactions, e.g. sigma factors and anti-sigma factors.

Combinatorial promoters as an alternative mechanism may offer a higher flexibility during the prototyping process, as various inhibitor operator sequences are already known for transcriptional regulation. Therefore, it is in principle possible to compare the concentrations of any two transcription factors by combining these operator sequences with different promoters. It is one of the goals of this work to investigate whether this approach can actually be used for the purpose of subtraction in a biological context.

Following such an approach, the desired behavior of the steady state of promoter dynamics is depicted in Fig. 6A, where the steady state of $D_i{:}R{:}\sigma_x$ is color-coded over varying concentrations of sigma factor ($\sigma_x$) and inhibitor ($P_j$). Due to non-negativity of concentrations, no activity is desired whenever the concentration of inhibitor exceeds that of the activator (upper left triangle resembling zero). Otherwise, it is aspired that $D_i{:}R{:}\sigma_x$ is proportional to the difference $\sigma_x - P_j$, illustrated by the parallel and equidistant level sets in Fig. 6A.

Applying Proposition 1 and assuming that the total amount of RNAP holoenzyme is fixed, the amount of $D_i{:}R{:}\sigma_x$ depends on the chosen DNA concentration as well as on the dissociation constants $K_{iH}$ and $K_{ij}$ of the RNAP holoenzyme and inhibitor, respectively. If for instance we assume that $K_{iH} = K_{ij} = 1$ and look at the relative amount of activated DNA $D_i{:}R{:}\sigma_x / D_i$, varying holoenzyme and inhibitor in the same range results in qualitatively different steady-state maps depending on how much $D_i$ is chosen, as depicted in Fig. 6 B1-B4. For high DNA, the sigma factor acts quasi linearly on the promoter while the inhibitor does not play a role at all. On the other hand, for small amounts of DNA, the inhibitor has a large effect and distorts the steady-state map such that the level sets converge to each other at the origin. Also, suppression due to the repressor does not seem strong enough, as in all cases $D_i{:}R{:}\sigma_x$ remains clearly above zero for $\sigma_x < P_j$.

In contrast to that, Fig.
6 C1-C4 show the same conditions, except that now $K_{iH} = 10 K_{ij}$, i.e. the inhibitor binds 10 times stronger to the promoter than the RNAP holoenzyme does. In that case, only minimal transcriptional activity is expected when there is less sigma factor than repressor. Further, although the level sets are curved, for medium amounts of DNA, e.g. 20 nM, they are comparably equidistant and the steady-state map is almost symmetric. This means that, while we have to acknowledge that an exact realization of the difference of two signals is not possible with combinatorial promoters, some crucial properties can be approximated by choosing dissociation constants and DNA amounts carefully.

Table 2: Values of the parameters obtained by fitting the nonlinear model to the time-series responses of the combinatorial promoter.

For that purpose, and also for disentangling the RNAP holoenzyme binding reaction, we study a gene with a pTar initiation sequence combined with a tetO inhibitor operator which expresses GFP. The pTar promoter is sensitive towards an RNAP holoenzyme consisting of RNAP bound to $\sigma_{28}$, while the operator sequence tetO enables binding and inhibition through Tet repressor proteins (tetR). We denote the concentration of this gene as $D_2$ and the GFP concentration as $P_2$. To avoid usage of purified protein, both $\sigma_{28}$ and tetR are produced in the TX-TL system from respective constitutive (i.e. $\sigma_{70}$-dependent) DNAs $D_{s28}$ and $D_{tetR}$. While the amount of $D_{s28}$ is varied to achieve different activation levels, inhibition is influenced by adding different amounts of anhydrotetracycline (aTc), which binds to tetR and thus alleviates its association with the promoter. The concentration of $D_{tetR}$ is kept at a constant level of 1 nM. The combinatorial promoter then produces GFP, dependent on the concentrations of $\sigma_{28}$ and unblocked tetR. The time series of this experiment can be found in ??.

According to the experimental setup, several chemical species compete for the same resources; thus Proposition 1 cannot be applied anymore and the binding reactions themselves had to be implemented as fast reactions. For brevity, the binding reactions are not listed here. We focus on mRNA and protein dynamics, i.e. the ODEs

$$\dot M_{s28} = \alpha\, D_{s28}{:}R{:}\sigma_{70} - \gamma\, M_{s28}{:}E$$

$$\dot \sigma_{28} = \beta\, M_{s28}{:}Q - \delta\, \sigma_{28}$$

$$\dot M_{tetR} = \alpha\, D_{tetR}{:}R{:}\sigma_{70} - \gamma\, M_{tetR}{:}E$$

$$\dot{\mathrm{tetR}} = \beta\, M_{tetR}{:}Q - \delta\, \mathrm{tetR}$$

$$\dot M_2 = \alpha\, D_2{:}R{:}\sigma_{28} - \gamma\, M_2{:}E$$

$$\dot P_2 = \beta\, M_2{:}Q - \delta\, P_2.$$

Fitting these equations to the data, we obtain the parameters listed in Table 2 and the trajectories depicted in ??. In the fitting process, the optimization is constrained such that the amount of complex $R{:}\sigma_{70}$ is similar to the value fitted in the first experiment, where binding of sigma factor has been neglected, see Table 1.

Figure 7: Promoter activity of pTar-tetO, obtained from fitted model.

The values given in Table 2 indicate that the total amount of RNAP is much bigger than that of $\sigma_{70}$ and further, that binding between these two species is very strong. Although $\sigma_{28}$ also binds strongly to RNAP, its affinity is still smaller than that of $\sigma_{70}$. The excessive amount of RNAP and the much higher binding affinity of $\sigma_{70}$ thus lead to a decoupling of the two binding reactions. We also note that the binding of $R{:}\sigma_{28}$ to the pTar promoter apparently has a very low affinity, which leads to low GFP levels compared to the input step experiments. Together with the fact that the repressor tetR binds the pTar promoter very strongly, this leads to the steady-state promoter map depicted in Fig. 7, where the amount of active promoter for 20 nM of DNA and varying activator and inhibitor concentrations is determined based on the reactions from Section 3.2 and the parameters from Table 2. Although there is leakage for medium amounts of inhibitor and activator and the level sets are not completely linear, the determined promoter dynamics are comparable to the desired behavior of Fig. 6 A.

Figure 8: Amount of active transcription complex over relevant range of sigma factor concentration for pTar promoter.

Conclusion 3. Using combinatorial promoters, the difference between two signals can only be realized to a limited extent.
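The qualitative dependence of such steady-state promoter maps on DNA amount and repressor strength can be explored with a small numerical sketch. It is a simplification under stated assumptions, not the fitted model: only the two competing equilibria (11) and (12) are considered, the totals of DNA, holoenzyme and inhibitor are fixed, and the free DNA concentration is found by bisection on the DNA conservation law. The function name and all numerical values are hypothetical.

```python
def active_promoter(d_total, h_total, p_total, k_h, k_p, iters=200):
    """Steady-state D:H for D + H <-> D:H and D + P <-> D:P (competitive binding).

    k_h and k_p are dissociation constants; *_total are conserved totals.
    """
    def dna_accounted_for(d_free):
        # Free DNA plus DNA bound to holoenzyme plus DNA bound to inhibitor.
        return d_free * (1.0 + h_total / (k_h + d_free) + p_total / (k_p + d_free))

    lo, hi = 0.0, d_total
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # dna_accounted_for is increasing in d_free, so bisection applies.
        if dna_accounted_for(mid) > d_total:
            hi = mid
        else:
            lo = mid
    d_free = 0.5 * (lo + hi)
    return d_free * h_total / (k_h + d_free)


if __name__ == "__main__":
    # Hypothetical map: relative promoter activity for a weak and a strong repressor.
    for k_p in (1.0, 0.1):
        for h_total in (1.0, 5.0, 10.0):
            row = [active_promoter(2.0, h_total, p, 1.0, k_p) / 2.0 for p in (0.0, 5.0, 10.0)]
            print(k_p, h_total, [round(v, 3) for v in row])
```

Without inhibitor the routine reduces to the single-reaction case of Proposition 1, which provides a simple consistency check for the bisection.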
Given these results, the transcription dynamics of Section 4.1 can now be extended with the appropriate promoter dynamics and $\sigma_{28}$ as input. As pointed out before, the strong binding affinities of the sigma factors lead to a quite linear but bi-modal input-output behavior, as depicted in Fig. 8, compared to the one depicted in Fig. 5 A. Therein, the active $D_{2}{:}R{:}\sigma_{28}$ complex linearly follows the amount of $\sigma_{28}$ until the concentration of RNAP is matched. Consequently, the transcriptional gain $A_{tx}$ changes due to the change of input and, using the same linear approximation as defined in (16), one now finds

$$D_{2}{:}R{:}\sigma_{28} \approx A_{tx}\,\sigma_{28} \quad \text{with} \quad A_{tx} = 0.0018. \tag{19}$$

4.3 Implications for the closed loop

Initially, with Requirements 1 to 4, we expected the process of transcription to behave like a gain, translation to behave like an integrator and combinatorial promoters to provide the difference of two signals. Several observations now differ from this initial view. First, although mRNA degradation is indeed much faster than protein degradation, the simplification to a simple gain is not justified and the temporal dynamics of mRNA production should be taken into account instead, leading to a PT1 behavior instead of a gain. Second, both production and degradation rates are subject to saturation due to finite amounts of resources of the transcriptional and translational machinery in the cell-free extract. For small inputs, however, these rates can be seen as linear functions of their inputs, and the linear operation regimes have been determined explicitly. Third, when realizing the difference of two signals by using combinatorial promoters, one only obtains an approximation of the difference, and the quality of the estimate depends on the magnitudes of the inputs.
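The first two observations can be illustrated with a short numerical sketch. The Python snippet below is not the fitted model: it only compares the step response of a pure gain with that of a first-order lag (PT1), and shows how a generic saturating binding curve x / (K_d + x) departs from its small-signal linear approximation x / K_d. The gain, time constant and dissociation constant are hypothetical values chosen for illustration, and the binding curve assumes that x denotes the free concentration of the binding partner.

```python
import math


def pt1_step(t, gain, time_constant):
    """Unit-step response of a first-order lag (PT1): K * (1 - exp(-t / T))."""
    return gain * (1.0 - math.exp(-t / time_constant))


def gain_step(t, gain):
    """Unit-step response of a pure static gain: the output equals K at once."""
    return gain


def bound_fraction(x, k_d):
    """Saturating binding curve x / (K_d + x); x is assumed to be the free concentration."""
    return x / (k_d + x)


# Hypothetical values for illustration only (not the fitted parameters).
K = 1.0      # static gain
T = 5.0      # time constant in minutes
K_D = 100.0  # dissociation constant in nM

for t in (0.0, T, 3 * T, 10 * T):
    print(f"t={t:5.1f} min  gain={gain_step(t, K):.3f}  PT1={pt1_step(t, K, T):.3f}")

# Relative deviation of the saturating curve from its linear approximation x / K_d.
for x in (1.0, 10.0, 100.0):
    linear = x / K_D
    exact = bound_fraction(x, K_D)
    print(f"x={x:6.1f} nM  deviation from linear={(linear - exact) / linear:.1%}")
```

The PT1 output only approaches the static gain after several time constants, and the linear approximation of the binding curve is acceptable only while x stays well below K_d, which is the sense in which a linear operation regime exists.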
Now that these deviations from our initial requirements have been identified and characterized, their effect on the functionality and performance of the synthetic genetic differentiator postulated in [Halter2017b] can be studied. For that purpose, two different models are compared with the ideal realizable differentiator from Eq. (1) in both the time and the frequency domain.

Figure 9: Topology of the linearized model given as the closed loop of $G_{tx}$ and $G_{tl}$.

The first model is given by the closed loop of the models $G_{tx}$ and $G_{tl}$ given in (17) and (18), respectively, adapted with the new transcriptional gain (19). This results in a linear model as depicted in Fig. 9, where no saturation effects are taken into account and a perfect signal difference is assumed. However, the slow mRNA production and the resulting PT1 behavior are taken into account, and the parameters of $G_{tx}$ and $G_{tl}$ resemble realistic values as they were obtained from experimental data. With the simplification $\gamma = 0$, i.e. neglecting protein degradation, the transfer function of the closed-loop system is thus given by

$$G_{cl}(s) = \frac{A_{tx}\, s}{s^{2} + \delta \tilde{A}_{deg}\, s + A_{tx} A_{tl}}. \tag{20}$$

The second model is the detailed nonlinear model. It is based on the reactions introduced in Section 3.2 and thus takes all saturation effects, non-linearities and time-delays into account. It consists of a gene $G_{1}$ with a combinatorial promoter like the one studied in Section 4.2, i.e. sensitive to a holoenzyme and the tetR inhibitor, producing this very same inhibitor and therefore realizing the circuit from Fig. 1. The concentration of $\sigma_{28}$ is considered as the input signal. In order to capture both positive and negative gradients, the same approach as introduced in [Halter2017b] is used, leading to a network topology as in Fig. 2, where $G_{2}$ is of similar structure as $G_{1}$ but with a negative influence of the input on the transcription rate. The following additional mechanisms are necessary to realize this topology:

a) In addition to $\sigma_{28}$, a second sigma factor $\sigma_{xx}$ is introduced, present at a constant level. While $R{:}\sigma_{28}$ activates transcription of $G_{1}$ and $R{:}\sigma_{xx}$ activates that of $G_{2}$, both holoenzymes bind to both genes, leading to a competition and a negative influence of one on the other.
b) Self-inhibition of the two genes is achieved by two different inhibitors, e.g. tetR and tetR*.
c) The two inhibitors tetR and tetR* undergo an annihilation reaction at rate $\eta = 0.1\,(\mathrm{nM\,min})^{-1}$, which was chosen arbitrarily.

As these modifications have been discussed in [Halter2017b] already, we omit the details at this point. The mRNA and protein dynamics of the core species as well as the output of the system are given by

$$\begin{align}
\dot{M}_{tetR} &= \alpha\, D_{tetR}{:}R{:}\sigma_{28} - \delta\, M_{tetR}{:}E \tag{21a} \\
\dot{\mathrm{tetR}} &= \beta\, M_{tetR}{:}Q - \gamma\, \mathrm{tetR} - \eta\, \mathrm{tetR}\,\mathrm{tetR}^{*} \tag{21b} \\
\dot{M}_{tetR^{*}} &= \alpha\, D_{tetR^{*}}{:}R{:}\sigma_{xx} - \delta\, M_{tetR^{*}}{:}E \tag{21c} \\
\dot{\mathrm{tetR}}^{*} &= \beta\, M_{tetR^{*}}{:}Q - \gamma\, \mathrm{tetR}^{*} - \eta\, \mathrm{tetR}\,\mathrm{tetR}^{*} \tag{21d} \\
y &= M_{tetR} - M_{tetR^{*}}. \tag{21e}
\end{align}$$

We summarize the core features of these two models and of the desired circuit in Table 3. Note that Model 1 can be seen as the linearized version of Model 2.

Figure 10: Bode plot of desired model (solid black) and linear Model 1 (solid cyan). In dashed lines magnitude and phase of output of nonlinear Model 2 subject to $u(t) = A_{0} + A \sin(\omega t)$. Different colors indicate different values of $A_{0}$. Several values for $A$ are plotted (lying on top of each other).

4.3.1 Frequency domain analysis

In a first step, we compare the two models and the desired behavior in the frequency domain, i.e. in terms of the Bode plot depicted in Fig. 10. This again is a classical tool from the control community and graphically shows how sinusoidal input signals are modified by a certain transfer function. In the upper part, the magnitude amplification indicates how the amplitude of the input signal is amplified for different input frequencies. In the lower part, the phase shift for these frequencies is shown. Magnitude and phase of the desired behavior (solid black) and of the linearized model (solid blue) are obtained directly using Matlab. For the nonlinear model, however, we use a describing function approach as described in [Gelb1968] to compare the input-output behavior of the nonlinear Model 2 with the linear ones. To this end, the nonlinear Model 2 is excited with the input

$$u(t) = A_{0} + A \sin(\omega t) \tag{22}$$

and the corresponding output $y(t)$ is analyzed in terms of its Fourier coefficients. Assume that after time $t^{*} = k^{*}\,\frac{2\pi}{\omega}$ the output oscillates in a steady-state fashion, i.e. no transient dynamics occur anymore, and let

$$c_{n}(\omega) := \frac{1}{T} \int_{t^{*}}^{t^{*}+T} y(t)\, e^{-i n \omega t}\, dt \quad \text{with period} \quad T = \frac{2\pi}{\omega}$$

be the $n$-th Fourier coefficient of the signal $y(t)$ with respect to the fundamental frequency $\omega$. The magnitude amplification is then given as the ratio of the magnitudes of the first Fourier coefficients of the output and input signals. With the input defined as in (22), the first Fourier coefficient of this signal is simply $\frac{A}{2i}$. The magnitude amplification at frequency $\omega$ is therefore

$$\frac{|c_{1}(\omega)|}{\left|\frac{A}{2i}\right|} = \frac{2\,|c_{1}(\omega)|}{A}.$$

Further, the phase shift for this frequency and this particular input signal is given by

$$\operatorname{atan}\left(-\frac{\operatorname{Re} c_{1}(\omega)}{\operatorname{Im} c_{1}(\omega)}\right).$$

The constant part $A_{0}$ of the input signal (22) is necessary to produce non-negative sinusoidal functions. Due to the non-linearity of Model 2, the output $y(t)$ depends not only on the frequency $\omega$ but on the shape of the input function in general, i.e. also on the variables $A_{0}$ and $A$. We therefore probed the system for several frequencies and several values of $A_{0}$ and $A$. For linear systems, an input signal with a single frequency component, like the one in (22), leads to an output that also has only one frequency component, namely the same as that of the input. In other words, higher harmonics do not exist and $|c_{n}(\omega)| = 0$ for $n > 1$. This is not the case for general non-linear systems, where higher harmonics can also appear and in principle more than just the first Fourier coefficient should be analyzed. Thus, the way we use the describing function approach in this work relies on the assumption that higher harmonics of the output signal can be neglected. We therefore analyzed the power spectrum of the output signals for different values of $A$, $A_{0}$ and $\omega$ and found that for most combinations, the higher harmonics contributed less than 5% to the overall power spectrum. However, when $A$ approaches $A_{0}$ and $\omega$ is close to the pole of the transfer function, the assumption does not seem to hold, see ?? for further details. We will see in Section 4.3.2 and Fig. 11 what this means for the output signal.

Figure 11: Normalized output $\tilde{y}(t)$ of the nonlinear closed model with input $u(t) = A_{0} + A \sin(0.01\,t)$ for varying values of $A_{0}$ and $A$.

In Fig. 10, magnitudes and phases of the respective response signals are plotted as dashed lines, where different colors indicate different values of $A_{0}$.
The values for $A$ are chosen as $A = kA_{0}$ with $k \in [0.1,\ 0.5,\ ,\ 0.75]$ and the respective responses are plotted in the same color. As seen in Fig. 10, the output response does not change with varying $A$; however, the choice of the offset $A_{0}$ significantly influences the I/O behavior of the nonlinear signal differentiator. Very low values of $A_{0}$ (dashed red, orange and purple) lead to a very sensitive response, i.e. too high a gain of the resulting closed loop and a smaller range of frequencies for which the output approximates the derivative of the input. A value of $A_{0} = 10$ (dashed green) results in the best response of the nonlinear system, matching the gain of an ideal differentiator quite well while providing almost the same frequency range as the one predicted by the linearized system ($\omega_{max} \approx 0.03$ rad/min). For too large values of $A_{0}$ (dashed cyan and dark red), Model 2 breaks down as expected due to the previously characterized saturation effects and the resulting loss of sensitivity towards the input signal.

4.3.2 Time domain analysis

From the previous analysis, we summarize that for the detailed Model 2, the phase of the output signal is off for too small values of $A_{0}$, the gain is very small for values of $A_{0} \gg 10$, and only in the case of $A_{0} \approx 10$ are both magnitude and phase as desired. We now focus on the shape of the output signal of Model 2 and therefore stick to sinusoidal input signals, fixing $\omega = 0.01$ but varying the offset $A_{0}$ and the amplitude $A$ of the input signal. The normalized output

$$\tilde{y}(t) = \frac{1}{\omega A}\, y(t) \tag{23}$$

as response to the input just described is depicted in Fig. 11. For a small value of $A_{0}$, as expected, the phase is off; however, the output signal still has a sinusoidal shape for all amplitudes $A$. In contrast, for higher values of $A_{0}$, the phase is correct but, with the amplitude $A$ approaching the offset $A_{0}$, the output signal becomes more and more distorted. This effect is amplified for higher offset values and is caused by a dilution of the power spectrum, as discussed in the previous section.
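The describing-function evaluation and the harmonic check used in Sections 4.3.1 and 4.3.2 can be sketched numerically. The Python snippet below does not implement Model 2. It uses a hypothetical toy loop, m' = k1 * sat(u - p) - d * m and p' = k2 * m, with a tanh saturation and made-up parameters, simulates it for the input u(t) = A0 + A sin(wt), and then computes the first Fourier coefficient of the steady-state output, the resulting gain and phase relative to the input, and the share of power in harmonics 2 to 5. The phase is computed as the argument of the ratio of output to input first harmonics, which avoids the branch ambiguity of an arctangent.

```python
import cmath
import math


def saturate(x, level):
    """Smooth saturation; level=None disables it (linear case)."""
    return x if level is None else level * math.tanh(x / level)


def simulate_period(omega, a0, a, level, k1=0.5, k2=0.2, d=0.3,
                    settle_periods=30, n=1000):
    """RK4 simulation of the toy loop m' = k1*sat(u - p) - d*m, p' = k2*m.

    Returns (times, outputs) for one input period after the transient has decayed.
    """
    dt = 2.0 * math.pi / omega / n

    def rhs(t, m, p):
        u = a0 + a * math.sin(omega * t)
        return k1 * saturate(u - p, level) - d * m, k2 * m

    t, m, p = 0.0, 0.0, a0
    times, outputs = [], []
    for step in range((settle_periods + 1) * n):
        if step >= settle_periods * n:
            times.append(t)
            outputs.append(m)
        a1, b1 = rhs(t, m, p)
        a2, b2 = rhs(t + dt / 2, m + dt / 2 * a1, p + dt / 2 * b1)
        a3, b3 = rhs(t + dt / 2, m + dt / 2 * a2, p + dt / 2 * b2)
        a4, b4 = rhs(t + dt, m + dt * a3, p + dt * b3)
        m += dt / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        p += dt / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        t = (step + 1) * dt
    return times, outputs


def fourier_coefficient(times, outputs, omega, n_harm):
    """c_n = (1/T) * integral over one period of y(t) * exp(-i*n*omega*t) dt."""
    total = sum(y * cmath.exp(-1j * n_harm * omega * t)
                for t, y in zip(times, outputs))
    return total / len(times)


def describing_function(omega, a0, a, level):
    """Return (gain, phase in rad, power share of harmonics 2..5) of the output."""
    times, outputs = simulate_period(omega, a0, a, level)
    c1 = fourier_coefficient(times, outputs, omega, 1)
    ratio = c1 / (a / 2j)  # first harmonic of output over that of A*sin(omega*t)
    higher = sum(abs(fourier_coefficient(times, outputs, omega, k)) ** 2
                 for k in range(2, 6))
    return abs(ratio), cmath.phase(ratio), higher / (abs(c1) ** 2 + higher)


if __name__ == "__main__":
    for level in (None, 1.0):
        gain, phase, share = describing_function(0.1, 10.0, 5.0, level)
        print(f"saturation={level}: gain={gain:.4f}, "
              f"phase={math.degrees(phase):.1f} deg, harmonic share={share:.2e}")
```

Without saturation, the computed gain and phase coincide with the transfer function k1 s / (s^2 + d s + k1 k2) of the toy loop evaluated at s = i w, and the harmonic share is numerically zero. With saturation, power leaks into higher harmonics, which is the situation in which the first-harmonic description becomes unreliable.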
5 Discussion

For the synthesis of genetic networks that realize arbitrary linear transfer functions, we follow a similar approach as in [Oishi2011]. It is therefore crucial to find suitable genetic counterparts to primitive I/O functions such as gain, integration and difference. In a first attempt, discussed in [Halter2017b] and recapitulated in Section 2, several requirements were introduced to associate the processes of transcription and translation and combinatorial promoters with these respective I/O primitives. Here, a series of experiments and analyses was presented to verify and adapt these requirements.

By observing mRNA and protein levels in response to step inputs of varying height, it was verified in Conclusion 1 that protein degradation is almost non-existent while mRNA degradation is comparably fast. However, the degradation dynamics are not as fast as desired, and a quasi-steady-state assumption for the process of transcription would be an oversimplification. Thus, transcription should be considered a PT1 element rather than a gain.

By fitting an ODE model to the experimental data and analyzing the corresponding parameters, it was also shown that all processes are subject to saturation due to limited amounts of resources. Using the same model and the fitted parameters, the linear operation regimes of the I/O primitives can be characterized as shown in Conclusion 2, leading to more insight into the capabilities and limitations of the respective genetic circuits.

In a second series of experiments, the dependence of the performance of a combinatorial promoter on the operation regime was emphasized, leading to Conclusion 3, namely that the difference of two signals can only be obtained approximately. Based on these insights, DNA concentrations for a simulation study were chosen such that the I/O behavior of the combinatorial promoter is as close as possible to the desired one.
In conclusion, the use of combinatorial promoters for comparing the concentrations of two transcription factors is only possible within a limited range of magnitudes, and we suggest using sequestration-based mechanisms in the future.

For the realization of a genetic signal differentiator using the studied parts, the initial goal was to realize a differentiator with a high-pass filter. The corresponding transfer function is given in Eq. (1). It has a zero at the origin and one pole, determined by the filter, to make it a causal system. However, slow mRNA degradation leads to a behavior which, when linearized, is of relative degree one, i.e. Eq. (20), which has one zero at the origin and two poles in the left half plane. This reveals an additional delay of the transient dynamics. If protein degradation were significantly larger than zero, this would lead to a transfer function of the form

$$G_{cl,protdeg} = \frac{K_{1}\, s + K_{1} \delta_{2}}{s^{2} + (\delta_{1} + \delta_{2})\, s + \delta_{1} \delta_{2} + K_{1} K_{2}}, \tag{24}$$

thus shifting the zero from the origin to the right half plane and therefore leading to an additional lower frequency bound and a sign change in the output. In comparison, the differentiator introduced in [Chevalier2018] leads to a transfer function very similar to (24), given that all necessary assumptions introduced there hold. The main difference is that in [Chevalier2018], the zero of the transfer function always lies in the left half plane. On the one hand, this means that a sign change is avoided. On the other hand, there inherently exists a lower bound for admissible input frequencies, while for the design presented in this work, this is only the case if protein degradation is large.

In order to conduct studies beyond the linearized model, a describing function approach is used to evaluate the response of the nonlinear model to sinusoidal inputs as in Eq. (22). It can be seen that the performance of the differentiator critically depends on the constant part of the input signal, revealing again the limitations due to resource competition but also, unexpectedly, some supersensitivity at low values of $A_{0}$. With an appropriate choice of $A_{0}$, the presented network approximates the temporal derivative of an input signal for frequencies up to $\omega \approx 0.02$ rad/min. In addition to the dependence on the absolute value of $A_{0}$, simulations in the time domain revealed a dependence on the relative amplitude in the sense of a distortion of the output signal. When this relative amplitude approaches the value 1, the output signal loses its similarity to the sinusoidal input, although phase and gain may be correct. In other words, the nonlinearities of the model lead to a dilution of the power spectrum of the output, and higher harmonics are amplified.
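The effect of protein degradation on the linearized loop can be explored with a few lines of Python. The sketch below assumes a negative feedback loop with a first-order transcription element K1 / (s + d1) in the forward path and a translation element K2 / (s + d2) in the feedback path, so that d2 = 0 corresponds to the case without protein degradation. The value of the transcriptional gain is taken from Eq. (19); the translational gain and both degradation rates are hypothetical placeholders, since their fitted values are not given in this section.

```python
import cmath
import math


def closed_loop(s, k1, k2, d1, d2=0.0):
    """Negative feedback loop of K1/(s + d1) (forward) and K2/(s + d2) (feedback)."""
    return k1 * (s + d2) / ((s + d1) * (s + d2) + k1 * k2)


def ideal_differentiator(s, k2):
    """Low-frequency limit of the loop for d2 = 0: G(s) -> s / K2."""
    return s / k2


A_TX = 0.0018   # transcriptional gain from Eq. (19)
A_TL = 50.0     # hypothetical translational gain
D_MRNA = 0.1    # hypothetical effective mRNA degradation rate, 1/min
D_PROT = 0.01   # hypothetical protein degradation rate, 1/min

for omega in (0.001, 0.01, 0.1, 1.0):
    s = 1j * omega
    rows = (
        ("ideal", ideal_differentiator(s, A_TL)),
        ("no protein deg.", closed_loop(s, A_TX, A_TL, D_MRNA)),
        ("with protein deg.", closed_loop(s, A_TX, A_TL, D_MRNA, D_PROT)),
    )
    for name, g in rows:
        print(f"omega={omega:6.3f}  {name:18s} |G|={abs(g):.3e}  "
              f"phase={math.degrees(cmath.phase(g)):7.1f} deg")
```

In this sketch, the loop without protein degradation follows the ideal differentiator at low frequencies and deviates near its poles, whereas a nonzero protein degradation rate produces a finite gain at zero frequency, so that slow inputs are no longer differentiated.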
Supplementary Data

Supplementary Data available at SYNBIO online.

Acknowledgements

The authors are grateful to Vipul Singhal, Andrey Shur, Anandh Swaminathan and William Poole from the Murray Lab at Caltech for the thought-provoking discussions and support in the laboratory.

Table 1: Values of the parameters obtained by fitting the nonlinear model to step response data.

parameter        unit      value      description
α                min^-1    21.54      transcription rate const.
β                min^-1    2.35       translation rate const.
δ                min^-1    0.18       mRNA deg. const.
γ                min^-1    1.19e-8    protein deg. const.
K_1H             nM        0.82       dissoc. const. for D_1 and R:σ70
K_MQ             nM        72.26      dissoc. const. for M and Q
K_ME             nM        102.2      dissoc. const. for M and E
R:σ70 (total)    nM        4.26       total RNAP holoenzyme R:σ70
Q (total)        nM        165.94     total ribosomes
E (total)        nM        650.3      total endonuclease E

Table 2: Values of the parameters obtained by fitting the nonlinear model to the time-series responses of the combinatorial promoter.

parameter        unit      value      description
K_σ70            nM        1.8e-6     dissoc. const. for σ70 and R
K_σ28            nM        5.3e-3     dissoc. const. for R and σ28
K_Rtet           nM        8.1e-3     dissoc. const. for D and Rtet
K_aTc            nM        2.74       dissoc. const. for Rtet and aTc
K_2H             nM        1.084e4    dissoc. const. for D_2 and R:σ28
K_s28H           nM        33.86      dissoc. const. for D_s28 and R:σ70
K_RtetH          nM        2.97e3     dissoc. const. for D_Rtet and R:σ70
R (total)        nM        283.14     total RNAP
σ70 (total)      nM        3.36       total sigma factor 70

Table 3: Summary of the models and comparison of the core features.

            Desired circuit       Model 1               Model 2
Topology    Fig. 1                Fig. 9                Fig. 2
Dynamics    Eq. (1)               Eq. (20)              Eq. (21)
Features    linear                linear                nonlinear
            perfect gain          delayed gain          delayed gain
            no saturation         no saturation         saturation
            perfect difference    perfect difference    approximated difference

Figure 1: Ideal approximation of a differentiator, from [4].

Figure 2: Genetic differentiator: Genes G_1 and G_2 tracking positive and negative slopes of u. Proteins produced by G_1 and G_2 neutralize each other. Difference of associated mRNAs indicates output y.

Figure 3: Scheme of probing protein synthesis with step in DNA and expected responses.

Figure 4: Mean and 95% confidence interval of experimental step-responses (blue, dotted mean, shaded confidence interval) and simulated step responses of the fitted nonlinear model (red, solid).

Figure 5: Amount of active complexes for transcription (A), mRNA degradation (B) and translation (C) over relevant range of DNA and mRNA respectively.

Figure 6: Level sets of promoter activity D:R:σ over varying levels of sigma factor and inhibitor. A Desired behavior i x for non-negative signal difference. B1-B4 Simulated values for varying DNA concentrations under a weak repressor. C1-C4 Simulated values for varying DNA concentrations under a strong repressor.
Synthetic Biology
Oxford University Press
http://www.deepdyve.com/lp/oxford-university-press/analysis-of-primitive-genetic-interactions-for-the-design-of-a-genetic-uSfU0VBziR