Optimal Experiment Design in Nonlinear Parameter Estimation with Exact Confidence Regions

A model-based optimal experiment design (OED) of nonlinear systems is studied. OED represents a methodology for optimizing the geometry of the parametric joint-confidence regions (CRs), which are obtained in an a posteriori analysis of the least-squares parameter estimates. The optimal design is achieved by using the available (experimental) degrees of freedom such that more informative measurements are obtained. Unlike the commonly used approaches, which base the OED procedure upon the linearized CRs, we explore a path where we explicitly consider the exact CRs in the OED framework. We use a methodology for a finite parametrization of the exact CRs within the OED problem and we introduce a novel approximation technique of the exact CRs using inner- and outer-approximating ellipsoids as a computationally less demanding alternative. The employed techniques give the OED problem as a finite-dimensional mathematical program of bilevel nature. We use two small-scale illustrative case studies to study various OED criteria and compare the resulting optimal designs with the commonly used linearization-based approach. We also assess the performance of two simple heuristic numerical schemes for bilevel optimization within the studied problems.

Keywords: Optimal experiment design, Parameter estimation, Least-squares estimation

Email addresses: anweshreddy.gottumukkula@tu-dortmund.de (Anwesh Reddy Gottu Mukkula), radoslav.paulen@stuba.sk (Radoslav Paulen). Tel.: +421 (0)2 5932 5730, Fax: +421 (0)2 5932 5340 (R. Paulen).
Preprint submitted to Journal of Process Control, February 5, 2019. arXiv:1902.00931v1 [math.OC] 3 Feb 2019.

1. Introduction

At present, advanced industrial engineering and management strive for resource- and energy-efficient design and operation of systems, plants, and processes. Here the use of model-based techniques is a leading paradigm. The employed models, whether mechanistic or data-based, include a finite number of parameters, whose values are related to the particular natural and system-wide phenomena and are thus commonly only known to belong to some interval or are unknown completely. Therefore, as-precise-as-possible determination of the unknown (uncertain) model parameters is crucial for the deployment of effective model-based solutions.

In a world where sensing technology becomes virtually present everywhere, the simplest way of obtaining the parameter values is to conduct a series of observations (experiments), during which the measurements of some quantities (output variables) are gathered. Subsequently, an estimation procedure is employed to find parameter values such that the model outputs match the observed data. As the measurements from a real plant are corrupted with some (random) noise, the uncertain model parameters cannot be identified precisely. This a posteriori uncertainty can be represented by a joint-confidence region (CR) that encompasses all the likely parameter estimates, given the probability distribution of the measurement noise.

The parametric uncertainty can be reduced by performing an experiment that appropriately sets the plant into an operating region where more informative measurements are obtained. One way to identify such an experiment is to perform an optimal experiment design (OED) [1, 2]. OED identifies the best experiment in terms of initial conditions, control inputs, sampling times and/or locations of measurement devices. The model-based OED problem is usually formulated as a mathematical program, where a certain measure of the CR, such as volume, is minimized.

Some well-established methods exist for the OED for linear systems [3], wherein the CR is an ellipsoid [4], so the task of its optimal shaping is greatly simplified. For nonlinear systems, the most common, yet approximate, approach is to resort to a system linearization and the use of the linear techniques. Beale [5] presented a methodology for the assessment of system nonlinearity in this respect. Other approaches, covered in [6] for convex problems, are of more or less approximate character and mostly rely on the convexity properties of the OED problem. Bates and Watts [7] presented a framework based on local nonlinear curvature properties, which is also an approximate technique. The use of a finite parametrization of the exact CRs [8, 9, 10] is a relatively recent subject of study.

In this contribution, we study the framework for OED of nonlinear systems that is based on explicit consideration of the exact CRs. We also present a computationally simpler variant that is based on simultaneous inner- and outer-approximations of the CR by ellipsoids. The preliminary findings of this work were presented in the conference contribution [11]. We study various common design criteria and compare the performance of the presented techniques with the linearized OED.

We organized the paper as follows. The concepts of linear and nonlinear parameter estimation and construction of CRs are discussed first. Next, the formulation is presented of the experiment design criteria that directly use exact CRs (exact designs). We also present two simple heuristic numerical approaches taken from literature to solve the arising bilevel optimization problems. Finally, results of two illustrative case studies are presented and discussed.

2. Parameter Estimation Problem

2.1. Mathematical Model

In this paper, a mathematical model of a system is represented by

$$\hat{y}(p, \tau) = F(p, u_\tau), \tag{1}$$

with $\hat{y}$ as $n_y$ measured variables, $u_\tau$ as $n_u$ degrees of freedom and $p$ as $n_p$ uncertain parameters. Here $\tau$ represents an ordinal number of the data point taken in one or more experiments. The function $F: \mathbb{R}^{n_p} \times \mathbb{R}^{n_u} \to \mathbb{R}^{n_y}$ is a twice continuously differentiable mapping. Throughout the paper, we resort to the representation (1), which considers the system model as static and explicit. However, the presented findings can straightforwardly be extended to dynamic and implicit models.

We will assume throughout the paper that the model is not over-parametrized and that the parameters are identifiable. We consider that, upon the realization of an experiment or several experiments, $N$ instances are gathered of the $n_y$-dimensional vector of plant measurements $y_m$ and are subsequently used for the estimation of unknown parameters. Throughout the paper, we will assume Gaussian noise to be corrupting the measurements. In the following subsections, existing frameworks are presented for the identification of the unknown parameters and the corresponding exact CRs for both the linear and the nonlinear parameter estimation problems.

2.2. Linear Parameter Estimation

Parameter estimation with a linear model is a well-studied topic in the literature [4]. Assume a mathematical model with a mapping function $F$ of the form

$$\hat{y}(p, \tau) = F_l(p, u_\tau) = Q(u_\tau)\,p, \tag{2}$$

where $Q(u_\tau)$ is a so-called regressors matrix with appropriate dimensions.

Under the assumption of uncorrelated and normally distributed measurement noise with known standard deviation vector $\sigma$, the maximum-likelihood estimate is found via the least-squares estimation as

$$\hat{p} \in \arg\min_{p} J_w(p), \tag{3}$$

with

$$J_w(p) := \sum_{i=1}^{n_y} \sum_{\tau=\tau_1}^{\tau_N} \sigma_i^{-2} \left( y_{i,m}(\tau) - \hat{y}_i(p, \tau) \right)^2. \tag{4}$$

The CR of parameter estimates is then given by an ellipsoid [4]

$$(p - \hat{p})^T\, \mathrm{FIM}(U)\, (p - \hat{p}) \le \chi^2_{\alpha, n_p}, \tag{5}$$

where FIM is the so-called Fisher information matrix,

$$\mathrm{FIM}(U) = Q(u_\tau)^T \operatorname{diag}\!\left(\sigma_1^{-2}, \dots, \sigma_{n_y}^{-2}\right) Q(u_\tau), \tag{6}$$

$U := (u_{\tau_1}^T, u_{\tau_2}^T, \dots, u_{\tau_N}^T)^T$, and $\chi^2_{\alpha, n_p}$ represents the upper $\alpha$ quantile of the chi-squared statistical distribution with $n_p$ degrees of freedom.

If the variance of the measurement noise is unknown, the covariance matrix $\operatorname{diag}(\sigma_1^2, \dots, \sigma_{n_y}^2)$ is normally approximated by $s^2 I$ with

$$s^2 := \frac{J(\hat{p})}{N - n_p}, \tag{7}$$

where the expected (most-likely) value of parameters $\hat{p}$ is identified by solving

$$\hat{p} \in \arg\min_{p} J(p) = \sum_{i=1}^{n_y} \sum_{\tau=\tau_1}^{\tau_N} \left( y_{i,m}(\tau) - \hat{y}_i(p, \tau) \right)^2. \tag{8}$$

The joint-confidence ellipsoid is then given by

$$(p - \hat{p})^T\, Q(u_\tau)^T \left( s^2 I \right)^{-1} Q(u_\tau)\, (p - \hat{p}) \le n_p\, F_{\alpha;\, n_p,\, N - n_p}, \tag{9}$$

where FIM is replaced by its corresponding approximation and $F_{\alpha;\, n_p,\, N - n_p}$ represents the upper $\alpha$ quantile of the Fisher distribution with $n_p$ and $N - n_p$ degrees of freedom in the numerator and in the denominator, respectively.

2.3. Nonlinear Parameter Estimation

Given a static nonlinear mathematical model

$$\hat{y}(p, \tau) = F_{nl}(p, u_\tau), \tag{10}$$

one can identify the (exact) CR depending on the availability of information about the variance of the measurement noise. If the variance is known, the exact CR is given by [4]

$$J_w(p) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}, \tag{11}$$

while, when the variance of the measurement noise is unknown, the exact CR is given by [4]

$$J(p) - J(\hat{p}) \le n_p\, s^2\, F_{\alpha;\, n_p,\, N - n_p}. \tag{12}$$

At this point we can define the sets $P_w := \{p \mid \text{Eq. (11)}\}$ and $P := \{p \mid \text{Eq. (12)}\}$. Unlike in the linear parameter estimation, the CR does not generally have the shape of an ellipsoid due to nonlinearity.

Figure 1: Illustration of the design criteria (A design, D design, E design) in two-dimensional parametric space. The plot shows a generic exact CR (shaded) and the over-approximating orthotope (dashed) of the exact CR identified using the anchor points $\pi$ (squares). The remaining markers indicate the points that give the maximal Euclidean distance of two points in the CR.

3. Optimal Experiment Design

We present a methodology for OED for both linear and nonlinear parameter estimation problems. We will assume that the CR is connected. For disjoint exact CRs, which
result from non-identifiability issues, it is normally suggested to perform a re-parameterization of the model [7]. We will also assume that an estimate $\hat{p}$ is available. The final assumptions, which are inherent to the standard experiment design techniques, are that there exists no structural plant-model mismatch and that the expected realization of the measurement noise is 0. In turn this results in $y_m(\tau) = \hat{y}(\hat{p}, \tau),\ \forall \tau$.

Several design criteria are proposed in the literature [3] such as A, D, E, Modified E, V, Q, M and so on. Each of these designs aims to tune a particular property of the confidence region. We will focus our study on the most used criteria, i.e., A, D, and E, but other design criteria might be considered as well using the ideas presented herein.

3.1. A-optimal design

The idea behind the A design criterion is to minimize the perimeter of the box that encloses the exact CR [3], i.e., to minimize the sum of projections of the CR on the parameter axes. This idea is sketched in Fig. 1, where the shaded set represents a CR. The enclosing box is given by the four anchor points, marked as squares.

For a general CR, one can identify $2n_p$ anchor points

$$\pi := \left\{ \begin{pmatrix} p_1^{L} \\ p_2^{1,L} \\ \vdots \\ p_{n_p}^{1,L} \end{pmatrix}, \begin{pmatrix} p_1^{U} \\ p_2^{1,U} \\ \vdots \\ p_{n_p}^{1,U} \end{pmatrix}, \begin{pmatrix} p_1^{2,L} \\ p_2^{L} \\ \vdots \\ p_{n_p}^{2,L} \end{pmatrix}, \dots, \begin{pmatrix} p_1^{n_p,U} \\ p_2^{n_p,U} \\ \vdots \\ p_{n_p}^{U} \end{pmatrix} \right\}, \tag{13}$$

where each point gives a lower or an upper limit of the value of a particular parameter in the exact CR. The anchor points can be identified by solving the following optimization problem

$$\phi_A(U) := \max_{\pi} \sum_{j=1}^{n_p} \left( p_j^{U} - p_j^{L} \right) \tag{14a}$$
$$\text{s.t. } \forall j \in \{1, \dots, 2n_p\},\ \forall \tau \in \{\tau_1, \dots, \tau_N\}: \tag{14b}$$
$$\hat{y}(\pi_j, \tau) = F_{nl}(\pi_j, u_\tau), \tag{14c}$$
$$y_m(\tau) = \hat{y}(\hat{p}, \tau), \tag{14d}$$
$$J_w(\pi_j) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}, \tag{14e}$$

where $\pi_j$ denotes the $j$-th anchor point. Note that the problem of identifying the orthotopic enclosure of the CR is formulated for the case when the measurement-noise variance is known, using the expression (11) for the CR. This would simply be exchanged with the expression (12) in the case when the variance is unknown.

Note that the problem (14) is separable and highly structured. On the other hand, it is non-convex in general and the number of its optimization variables ($2n_p^2$) grows quadratically with the number of parameters. This means that identification of an orthotope might get challenging for the state-of-the-art solvers and for high-dimensional problems.

The A-optimal experiment design (A design) can be identified by solving

$$\min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \phi_A(U), \tag{15}$$

which is a special case of a bilevel program. The bounds $u^L$ and $u^U$ represent the lower and upper limits of the experimental degrees of freedom.

In the linear case, the A design is identified by

$$\min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \phi_A(\mathrm{FIM}) := \min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \operatorname{trace}\!\left(\mathrm{FIM}^{-1}\right) \tag{16a}$$
$$\text{s.t. } \hat{y}(p, \tau) = F_l(p, u_\tau),\ \forall \tau \in \{\tau_1, \dots, \tau_N\}. \tag{16b}$$

In the case of unknown variance of measurement noise, the approximate FIM from (9) will be used. We denote this linearization-based approach as classical in this study.

3.2. D-optimal design

The D design aims to find the experimental conditions such that the exact CR would have minimum volume. The D-optimal design problem then reads as

$$\min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \phi_D(U) := \min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \int \cdots \int_{P_w(U)} \mathrm{d}p_1 \dots \mathrm{d}p_{n_p}. \tag{17}$$

As there is no finite-dimensional parameterization of the set $P_w(U)$ available in general, it is very hard to evaluate the volume integral. We propose to use a gridding-based approach, where the grid is evaluated inside the aforementioned orthotopic enclosure of the CR. The proposed optimization problem may be written as

$$\min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \phi_D(U) := \min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \sum_{i \in I} \delta_i \tag{18a}$$
$$\text{s.t. } \delta_i = \begin{cases} 1, & \text{if } \theta_i \in P_w \\ 0, & \text{otherwise} \end{cases} \tag{18b}$$
$$\Theta = \{p_1^L, p_1^L + \epsilon, p_1^L + 2\epsilon, \dots, p_1^U\} \times \{p_2^L, p_2^L + \epsilon, p_2^L + 2\epsilon, \dots, p_2^U\} \times \dots \times \{p_{n_p}^L, p_{n_p}^L + \epsilon, p_{n_p}^L + 2\epsilon, \dots, p_{n_p}^U\}, \tag{18c}$$
$$\max_{\pi} \sum_{j=1}^{n_p} \left( p_j^{U} - p_j^{L} \right), \tag{18d}$$
$$\text{s.t. } \forall j \in \{1, \dots, 2n_p\},\ \forall \tau \in \{\tau_1, \dots, \tau_N\}: \tag{18e}$$
$$\hat{y}(\pi_j, \tau) = F_{nl}(\pi_j, u_\tau), \tag{18f}$$
$$y_m(\tau) = \hat{y}(\hat{p}, \tau), \tag{18g}$$
$$J_w(\pi_j) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}, \tag{18h}$$

where $\epsilon > 0$ is the tuning parameter that determines the accuracy of the approximation, $\theta_i$ are the grid points, and $I$ is the index set of $\Theta$. This approach for approximating the volume of the CR is illustrated in Fig. 1 as a grid in the shaded set.

In principle, the identification of the orthotopic enclosure of the CR can be removed and the problem (18) can be modified to a single-level mathematical program. Nonetheless, the optimization problem (18) is non-smooth due to the presence of the indicator function (18b) and thus it can get very challenging and computationally highly expensive, especially in higher dimensions. An alternative approach can be exploited by approximation of the volume using semi-algebraic sets [12]. Approaches to inner approximation of the CR based on an orthotope and on successive SDP approximations are presented by Streif et al. [13]. We propose a simpler approximation here, which uses an idea similar to the Löwner-John ellipsoids [14]. We construct the inner- and outer-approximating ellipsoids, which are the scaled counterparts of the linearized CRs.

The proposed approximate D design is found by

$$\min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \frac{\det\!\left(k_{out}\, \mathrm{FIM}^{-1}\right)}{k_{out}} + \frac{\det\!\left(k_{in}\, \mathrm{FIM}^{-1}\right)}{k_{in}} \tag{19a}$$
$$\text{s.t. } \max_{p_{out},\, p_{in},\, k_{out},\, k_{in}} k_{out} - k_{in} \tag{19b}$$
$$\text{s.t. } \forall \tau \in \{\tau_1, \dots, \tau_N\}: \tag{19c}$$
$$\hat{y}(p_{out}, \tau) = F_{nl}(p_{out}, u_\tau), \tag{19d}$$
$$\hat{y}(p_{in}, \tau) = F_{nl}(p_{in}, u_\tau), \tag{19e}$$
$$y_m(\tau) = \hat{y}(\hat{p}, \tau), \tag{19f}$$
$$J_w(p_{out}) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}, \tag{19g}$$
$$J_w(p_{in}) - J_w(\hat{p}) \ge \chi^2_{\alpha, n_p}, \tag{19h}$$
$$(p_{out} - \hat{p})^T\, \mathrm{FIM}\, (p_{out} - \hat{p}) \ge k_{out}, \tag{19i}$$
$$(p_{in} - \hat{p})^T\, \mathrm{FIM}\, (p_{in} - \hat{p}) \le k_{in}, \tag{19j}$$

where $p_{out}$ and $p_{in}$ are intersection points between the outer-/inner-approximating ellipsoids and the exact CR. The scaling factors $k_{out}$ and $k_{in}$ express the magnitude of deviation of the outer- and, respectively, inner-approximating ellipsoid from the linearized CR. The weighting in the cost function is introduced to penalize the contribution of the most deviating ellipsoid. This prevents the design procedure from concentrating on shaping an ellipsoid that is potentially a very loose approximation of the CR and in practice avoids numerical and irregularity problems. Note that the proposed problem also scales well, i.e., linearly, w.r.t. the number of the parameters, as the lower-level problem optimizes $2n_p + 2$ variables. The proposed approximate D-optimal design is therefore computationally a less expensive problem when compared to the exact D-optimal design using the gridding-based approach. We will denote the proposed approximate D-optimal design approach as the ellipsoidal D design. The idea behind this method, slightly modified, could also be used for an approximate A design, but we do not explore this path in the present study explicitly.

Note also that, if the CR can be expressed exactly as (5), the proposed optimization problem boils down to the classical D design where one solves

$$\min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \phi_D(\mathrm{FIM}) := \min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \det\!\left(\mathrm{FIM}^{-1}\right) \tag{20a}$$
$$\text{s.t. } \hat{y}(p, \tau) = F_l(p, u_\tau),\ \forall \tau \in \{\tau_1, \dots, \tau_N\}. \tag{20b}$$

3.3. E-optimal design

The objective of the E design is to minimize the Euclidean distance ($\|\varphi_1 - \varphi_2\|_2$) between the two points (marked in Fig. 1) that belong to the CR and whose Euclidean distance is maximal. This can be expressed as

$$\min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \phi_E(U) \tag{21a}$$
$$\text{s.t. } \phi_E(U) = \max_{\varphi_1, \varphi_2} \|\varphi_1 - \varphi_2\|_2 \tag{21b}$$
$$\text{s.t. } \forall j \in \{1, 2\},\ \forall \tau \in \{\tau_1, \dots, \tau_N\}: \tag{21c}$$
$$\hat{y}(\varphi_j, \tau) = F_{nl}(\varphi_j, u_\tau), \tag{21d}$$
$$y_m(\tau) = \hat{y}(\hat{p}, \tau), \tag{21e}$$
$$J_w(\varphi_j) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}. \tag{21f}$$

The E design is also known as a decorrelating design as it aims at finding the experimental conditions such that the CR is as spherical as possible. This criterion is illustrated in Fig. 1. It is noteworthy that the lower-level problem of (21) scales linearly w.r.t. the number of parameters as it optimizes $2n_p$ variables.

In the linear case, the (classical) E design is identified by

$$\min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \phi_E(\mathrm{FIM}) := \min_{\substack{u_\tau \in [u^L, u^U] \\ \forall \tau \in \{\tau_1, \dots, \tau_N\}}} \max_i \lambda_i\!\left(\mathrm{FIM}^{-1}\right) \tag{22a}$$
$$\text{s.t. } \hat{y}(p, \tau) = F_l(p, u_\tau),\ \forall \tau \in \{\tau_1, \dots, \tau_N\}, \tag{22b}$$

where $\lambda_i(\cdot)$ is the $i$-th eigenvalue of a matrix.

4. Numerical Implementation

In this section we discuss the possible ways to solve the proposed optimization problems. We will exploit BARON [15] in this work in order to guarantee global optimality of the classical OED problems (16), (20), and (22). Special attention is devoted to the presented bilevel programs, as the classical OED problems are single-level optimization problems and can straightforwardly be solved using a nonlinear program solver.

We present two simple heuristic approaches taken from literature that can be used to solve the presented bilevel problems, which can be generalized in the form

$$\min_{x_1} f(x_1, x_2) \tag{23a}$$
$$\text{s.t. } x_2 \in \arg\max_{x_2} g(x_2) \tag{23b}$$
$$\text{s.t. } 0 = h_E(x_1, x_2), \tag{23c}$$
$$0 \ge h_I(x_1, x_2). \tag{23d}$$

Special care has to be taken w.r.t. the non-convexity of the lower-level problem. Its global optimum has to be identified in order to guarantee feasibility of the upper-level problem [16].

4.1. Nested approach

The following nested approach is inspired by the formulation proposed in [17, 18] for solving dynamic optimization problems and in [19] for solving a coordination control algorithm using a price-driven coordination technique. The nested approach splits the bilevel optimization problem into a lower-level optimization problem

$$x_2^* \in \arg\max_{x_2} g(x_2) \tag{24a}$$
$$\text{s.t. } 0 = h_E(x_1^*, x_2), \tag{24b}$$
$$0 \ge h_I(x_1^*, x_2), \tag{24c}$$

that is solved for a given $x_1^*$ using a global solver (e.g. BARON) and an upper-level optimization problem

$$x_1^* \in \arg\min_{x_1} f(x_1, x_2^*) \tag{25a}$$
$$\text{s.t. } 0 = h_E(x_1, x_2^*), \tag{25b}$$
$$0 \ge h_I(x_1, x_2^*), \tag{25c}$$

that can then be solved for a given $x_2^*$ using a local solver. We use IPOPT [20] as a local solver in this work.

The individual optimization problems are interconnected by the copy variables $x_1^*$ and $x_2^*$ that are exchanged between the problems. The problems are solved repeatedly and the convergence of the nested approach is claimed once the consecutive values of the copy variables satisfy $\|x_{1,k+1} - x_{1,k}\| \approx 0$ and $\|x_{2,k+1} - x_{2,k}\| \approx 0$, where $k$ is an iteration counter.

If a gradient-based solver is used to determine the local optimum of the problem (25), the objective and constraint gradients are to be supplied. An approach from [21] can be used in this respect with $x_k := (x_{1,k}^T, x_{2,k}^T)^T$

$$\begin{bmatrix} \nabla^2_{x_2, x_2} L\big|_{x_k} & \nabla^T_{x_2} h\big|_{x_k} \\ \nabla_{x_2} h\big|_{x_k} & 0 \end{bmatrix} \begin{bmatrix} \dfrac{\mathrm{d}x_2}{\mathrm{d}x_1} \\[2mm] \dfrac{\mathrm{d}\mu}{\mathrm{d}x_1} \end{bmatrix} = - \begin{bmatrix} \nabla^2_{x_2, x_1} L\big|_{x_k} \\ \nabla_{x_1} h\big|_{x_k} \end{bmatrix}, \tag{26}$$

where $L$ represents the Lagrangian of the lower-level problem, $\mu := (\mu_E^T, \mu_{I,i}^T)^T,\ \forall i \in I_A$, is the vector of multipliers corresponding to the equality and active inequality constraints $h(x_1^*, x_2^*) := (h_E^T(x_1^*, x_2^*), h_{I,i}^T(x_1^*, x_2^*))^T,\ \forall i \in I_A$, of the lower-level problem, and $I_A$ is an index set of the active inequality constraints. It is not guaranteed that the nested approach always converges to a local minimum [18]. Instead it may sometimes converge to a local maximum or a saddle point. The obtained solution can be verified by evaluating the necessary and sufficient conditions for optimality.

4.2. KKT-based approach

Another heuristic approach for solving a bilevel optimization problem is to reformulate the lower-level problem
using the KKT conditions [22, 23, 24]. The reformulated problem reads as

$$\min_{x_1,\, x_2,\, \mu_E,\, \mu_I \ge 0} f(x_1, x_2) \tag{27a}$$
$$\text{s.t. } 0 = \nabla_{x_2} L(x_1, x_2, \mu_E, \mu_I), \tag{27b}$$
$$0 = \nabla_{\mu_E} L(x_1, x_2, \mu_E, \mu_I) = h_E(x_1, x_2), \tag{27c}$$
$$0 = \mu_{I,i}\, h_{I,i},\ \forall i \in I_I, \tag{27d}$$

where $I_I$ is the index set of the inequality constraints of (23). As discussed above, the lower-level problem has to be solved to global optimality, which is not guaranteed by satisfaction of the KKT conditions. Reaching the global optimum of the lower-level problem has to be assured upon convergence in order to guarantee feasibility of the lower-level problem and thus a local optimum of the bilevel program. The feasibility test can be performed by solving (24) globally or by gridding or by set inversion [25] techniques with a subsequent comparison of the obtained values for the variables of the lower-level problem.

We note that there are other approaches that can be employed to solve the problem (23). The solution methods proposed by Dutta et al. in [26] and by Dempe et al. in [27] assume a convex inner-level optimization problem. Bard et al., Dempe et al. and Mitsos et al. in [28], [24], and [29] proposed solution methods considering a nonconvex inner-level optimization problem. It is generally well known that there is a close connection between bilevel problems and semi-infinite programming (SIP), as discussed in [30]. A cutting-plane SIP algorithm is proposed by Blankenship and Falk [31], a branch and bound algorithm by Bard and Moore [32], and a double penalty function method by Ishizuka and Aiyoshi [33]. Recently, Walz et al. [34] presented a global SIP algorithm proposed in [35] that could be used in the context of optimal experiment design. References therein give a complete picture about the use of SIP algorithms for the solution of bilevel programs.

Also various stochastic methods, such as genetic algorithms, simulated annealing, etc., could be used in principle, where these might be especially interesting for the D design problem (18) because of its non-smoothness. We only exploit the described nested and KKT-based approaches in this study.

5. Case Studies

We apply the presented methodologies for finding OED for two small-scale illustrative case studies. The employed models are in the form of explicit step responses of linear time-invariant dynamic systems and the optimal experiment design should reveal the best sampling instants. We denote the designs that are based on the exact CRs (problems (15), (18), and (21)) as exact designs. The OED problems are solved for the $2\sigma$-confidence level ($\alpha = 0.9545$) using the following approaches:

1. Classical OED problems are solved globally.
2. The designs based on exact CRs and the ellipsoidal D design are solved using the nested approach, where the lower-level problem is solved globally and the upper-level problem is solved using a local solver.
3. In order to study the numerical efficacy of different algorithms, the A-optimal design is solved by the KKT-based approach globally.

5.1. Case Study 1

The mathematical model for biological oxygen demand (BOD) [7] is considered. The cumulative BOD of microorganisms at incubation time $u_\tau$ is given by

$$\hat{y}(p, \tau) = p_1 \left( 1 - \exp(-p_2 u_\tau) \right), \quad u_\tau \in [0, 20], \tag{28}$$

which can also be interpreted as a step response of the first-order linear time-invariant system with static gain $p_1$ and time constant $1/p_2$. At this point it can be observed that $p_1$ enters the output equation linearly while $p_2$ enters nonlinearly.

The true values for the parameters $p_1$ and $p_2$ are, respectively, 2.5 and 0.5. These are also considered as expected least-squares estimates $\hat{p}$ for all the studied OED problems. The measurements $y_m(\tau)$ are assumed to be corrupted by a zero-mean Gaussian noise with the standard deviation 0.1. We will assume here a case where the variance or the standard deviation of the measurement noise is unknown. The exact CRs are then defined by (12). Additionally, we consider $J(\hat{p}) = 0$. The tolerance $\epsilon$ for the exact D design is set to $5 \times 10^{[\dots]}$.

The optimal sampling times $U^\star$ with $N = \{4, 5\}$ for the classical and exact A-, D- and E-optimal designs and the ellipsoidal D design are reported in Tab. 1. The values of the objective function for exact designs ($\phi(\cdot)$) are evaluated at the identified optimal solution $U^\star$ for each design with $N = \{4, 5\}$. For all the designs, the exact OED has a superior performance when compared to the linearized design. The $U^\star$ as identified by the classical and the exact OED contain multiple common repetitive measurements at $u_\tau = 20$. This agreement between the designs can be attributed to the linear entry of $p_1$ into the model $\hat{y}(\cdot)$. It can also be explained by the fact that the steady-state gain can be inferred close to the steady state irrespective of the value of the time constant. The classical and the exact A OED have the same number of repetitions of $u_\tau = 20$ for $N = \{4, 5\}$; however, this is not the case for the D-optimal design, where the numbers of repetitions match only for $N = 4$. In the case of the E-optimal design, the classical OED identified $U^\star$ in which $u_\tau = 20$ is repeated once more when compared to the solution identified by the exact OED, which again points to the linear decorrelation between $p_1$ and $p_2$ that the classical E design tries to achieve.

The performance of the proposed D-optimal design based on the inner- and the outer-approximation ellipsoids is better when compared to the classical D design for all values of $N$. It achieves a relatively small loss in performance when compared to the exact D design. This suggests that the proposed ellipsoidal D design is an interesting framework to perform approximate D design as compared to the linearization-based alternative.

Table 1: Optimal designs ($U^\star$) as identified for classical, ellipsoidal and the exact OED using $N = \{4, 5\}$ and the values of the objective function of exact designs ($\phi(\cdot)$) evaluated at the identified optimal designs ($U^\star$). In the case of D design, $\phi(U^\star) := \phi_D(U^\star)$.

Approach       Design  N  Solution (U*)                      phi(U*)
Classical OED  A       4  {1.69, 1.69, 20, 20}               1.610
Classical OED  A       5  {1.77, 1.77, 20, 20, 20}           0.940
Classical OED  D       4  {2, 2, 20, 20}                     0.425
Classical OED  D       5  {2, 2, 20, 20, 20}                 0.155
Classical OED  E       4  {1.61, 20, 20, 20}                 1.016
Classical OED  E       5  {1.75, 20, 20, 20, 20}             0.365
Ellipsoidal    D       4  {1.42, 1.42, 20, 20}               0.414
Ellipsoidal    D       5  {1.69, 1.69, 19.99, 19.99, 20}     0.154
Exact OED      A       4  {1.37, 1.37, 20, 20}               1.585
Exact OED      A       5  {1.60, 1.60, 20, 20, 20}           0.938
Exact OED      D       4  {1.62, 1.62, 20, 20}               0.409
Exact OED      D       5  {1.81, 1.82, 1.83, 19.99, 19.99}   0.154
Exact OED      E       4  {1.04, 1.04, 20, 20}               0.974
Exact OED      E       5  {1.22, 1.23, 20, 20, 20}           0.322

Figure 2: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained by classical and exact A designs. The plot shows the over-approximating orthotopes (dashed) of the exact CRs identified using the anchor points $\pi$, represented by markers. Legend: Classical A, Exact A.

Figure 3: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained by classical, ellipsoidal and exact D designs. The plot shows the outer-/inner-approximating ellipsoids of the exact CRs (dotted and dash-dotted lines, respectively). Markers indicate the intersection points for the outer-/inner-approximating ellipsoids and the exact CRs. Legend: Classical D, Ellipsoidal D, Exact D.

The exact CRs for the classical and the exact A OED for four measurements are compared in Fig. 2. The orthotopes enclosing the exact CRs are plotted using the anchor points $\pi$. It is clear that the orthotope that encloses the exact CR identified by the classical A design has a larger perimeter than the orthotope identified by the exact A design. The reason for this can be found when looking at the linearized CRs for both designs (dashed ellipsoids). The linearized ellipsoid clearly does not approximate the exact CRs well, where, as can be expected, the approximation is looser w.r.t. $p_2$, which enters nonlinearly in the output equation (28). It is an interesting observation that the presented linearized CRs are very similar to each other, despite the fact that they give significantly different exact CRs.

Figure 3 shows the exact CRs for the classical, ellipsoidal and the exact D designs using $N = 4$. The linearized CRs (dashed ellipsoids) are again very similar to each other. Among them the ellipsoid from the classical design has the smallest volume, as might be expected. However, the exact CR for the classical design is the largest one (see Tab. 1), which again comes from the classical design disregarding the nonlinearity of the output equation in $p_2$.

We also present the inner- and the outer-approximation ellipsoids for all the three OED approaches by dash-dotted and dotted ellipsoids, respectively. The corresponding intersection points are marked in Fig. 3. Here we can observe the benefits of the weighting introduced in the objective function of the problem (19). While looking at the sizes of the outer-approximating ellipsoids (especially the one constructed for the exact D design), it might appear reasonable to minimize the volume of the over-approximating ellipsoid as a good approximation of the exact D design. This would correspond to setting $1/k_{in} \to 0$ while solving the problem (19). We have explored this path in our earlier study [11], but the obtained design results were unsatisfactory, since the over-approximation by an ellipsoid might become very loose. The proposed ellipsoidal D design therefore balances out the concentration on the size of the ellipsoid and the appropriateness of the over-approximation by an ellipsoid. It is clearly visible that the exact CR for the $U^\star$ identified by the proposed ellipsoidal OED has much smaller volume than the exact CR identified by the classical approach and, at the same time, it is very close to the optimal exact OED.

Figure 4: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained for classical and exact E OED. Markers indicate the points used to calculate the Euclidean distance of the CRs. Legend: Classical E, Exact E.

The exact CRs for the classical and the exact E designs are compared in Fig. 4 using $N = 4$ measurements. The markers in Fig. 4 indicate $\varphi_1$ and $\varphi_2$ (see (21)) obtained for the classical and the exact E-optimal designs. In this case, unlike for the previous designs, we observe a major discrepancy in the orientation between the linearized ellipsoids obtained for the classical and exact designs, even though the largest semi-axes, which correspond to the largest eigenvalues (see (22)), are very similar. Again this is attributed to the nonlinearity in $p_2$. The resulting difference between the two designs in the distances between the most distant points that belong to $P$ is significant, however. Similar behavior can be seen for the case with $N = 5$ (see Tab. 1).

Next, we study the performance of the obtained designs against a number of simulated experiments. The aim here is to evaluate the robustness of the designs against random realizations of noise that would be present in the real experiment. We exclude the dependence on the least-squares estimate here, i.e., we will use the nominal values for $\hat{p}$. Such dependence is the subject of study for robust OED, which is not in the scope here. We simulated 1,000 experiments with each studied design using the obtained optimal incubation (measurement) times $U^\star$ and we corrupted the measurements $y_m(\tau) := \hat{y}(\hat{p}, \tau) + e$ with a Gaussian noise $e$ of standard deviation 0.1.

Figure 5: Mean and variance of $\phi_A$, $\phi_D$ and $\phi_E$ for 1,000 random experiments with $N = 4$ noisy measurements at $U^\star$ of classical, ellipsoidal and the exact designs (panels: A design, D design, E design). The dashed line signifies the performance of the nominal design.

Figure 5 shows the mean and the variance of the objective value of the exact A, D and E designs. In this plot, we also include the nominal values (when noise-free measurements are gathered) of the different designs using dashed lines. With respect to the mean values of the obtained $\phi_A$, $\phi_D$ and $\phi_E$, it is confirmed that the exact OED is the best option on average, even though the mean values do not match the expected nominal values of the design criteria. Regarding the obtained variances, it is noteworthy that the classical design exhibits the strongest sensitivity to noise, as can be concluded from the magnitude of the variances, and thus appears to be the worst option. This again underpins the importance of considering the nonlinearity in the OED and it can be documented by comparing the worst-case value of the exact E design w.r.t. the nominal mean obtained for the linearized design (right plot). The last interesting observation is that the variance of $\phi_D$ obtained with the ellipsoidal design is slightly smaller than the variance of the exact OED. This might point to the approximation error introduced (tolerance $\epsilon$ in (18)) in calculating $\phi_D$, which is however not severe as the mean and worst-case values follow the expected trend.

5.2. Case study 2

Here we consider a problem where the system output can be modeled as

$$\hat{y}(\tau) = b_0\, \frac{p_1}{p_2^2} \left[ \left( \frac{p_1 + p_2}{p_1}\, p_2 u_\tau + 1 \right) \exp(-p_2 u_\tau) - 1 \right], \tag{29}$$

which can also be interpreted as a step response of the second-order linear time-invariant system with a double pole at $p_2$ and a zero at $p_1$. The corresponding transfer function can be given by

$$G(s) = \frac{b_0 (s - p_1)}{(s + p_2)^2}. \tag{30}$$

Clearly the constant $b_0$ is a parameter that influences the steady-state gain of the system and we will assume it, for simplicity, to be known, $b_0 = 4$.

Table 2: Optimal designs ($U^\star$) as identified for classical, ellipsoidal and the exact OED using $N = \{2, 3, 4\}$ and the values of the objective function of exact designs ($\phi(\cdot)$) evaluated at the identified optimal designs ($U^\star$). In the case of D design, $\phi(U^\star) := \phi_D(U^\star)$.

Approach       Design  N  Solution (U*)             phi(U*)
Classical OED  A       2  {1.91, 10}                1.666
Classical OED  A       3  {1.86, 1.86, 10}          1.151
Classical OED  A       4  {1.81, 1.81, 1.81, 10}    0.974
Classical OED  D       2  {2, 10}                   0.386
Classical OED  D       3  {2, 2, 10}                0.231
Classical OED  D       4  {2, 2, 10, 10}            0.148
Classical OED  E       2  {1.90, 10}                1.225
Classical OED  E       3  {1.82, 1.82, 10}          0.520
Classical OED  E       4  {1.74, 1.74, 1.74, 10}    0.341
Ellipsoidal    D       2  {1.70, 10}                0.363
Ellipsoidal    D       3  {1.73, 1.73, 10}          0.219
Ellipsoidal    D       4  {1.82, 1.82, 10, 10}      0.144
Exact OED      A       2  {1.63, 10}                1.584
Exact OED      A       3  {1.67, 1.67, 10}          1.132
Exact OED      A       4  {1.66, 1.66, 1.67, 10}    0.966
Exact OED      D       2  {1.61, 10}                0.344
Exact OED      D       3  {1.65, 1.66, 10}          0.218
Exact OED      D       4  {1.74, 1.77, 10, 10}      0.144
Exact OED      E       2  {1.62, 10}                1.094
Exact OED      E       3  {1.63, 1.63, 10}          0.497
Exact OED      E       4  {1.59, 1.59, 1.59, 10}    0.331

Figure 6: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 2$ as obtained for classical and exact A OED. The plot shows the over-approximating orthotopes (dashed) of the exact CRs identified using the anchor points $\pi$, represented by markers. Legend: Classical A, Exact A.

Figure 7: The exact (solid) and linearized (dashed ellipsoid) CRs (legend: Classical D, Ellipsoidal D, Exact D)

Notice that both $p_1$ and $p_2$ enter the output equation nonlinearly, so this problem can be considered as more challenging and an even greater discrepancy might be expected between linearized and exact designs. The true values of the parameters $p_1$ and $p_2$ are 0.5 and 1.0, respectively, and are equal to $\hat{p}$. The measurements $y_m(\tau)$ are assumed to be corrupted with a zero-mean Gaussian noise with known standard deviation of 0.4.
For a fair compar- using N = 2 as obtained for classical, ellipsoidal and exact D OED. ison of the proposed framework, we assume J (p) = 0. The plot shows the outer-/inner-approximating ellipsoids of the exact The tolerance () for exact D design is 7:5 10 . CRs (dotted and dash-dotted lines, respectively). and are the The classical and the proposed OED frameworks are intersection points for the outer-/inner-approximating ellipsoids and the exact CRs. applied to identify the optimal sampling times U , where u 2 [0; 10]. The previously discussed OED problems are solved for N = f2; 3; 4g and = 0:9545 (2-CR) with the well the orientation of the exact CR but due to nonlinear- same numerical techniques as in case study 1. The results ity the approximation is relatively poor. Similarly to the are presented in Tab. 2. The trends regarding the perfor- rst case study we can observe that, despite very similar mance of the di erent designs are the same as described orientation of the linearized con dence regions, signi cant in case study 1. We observe in this case a more signi - bene ts of the exact design over the linearized one are ob- cant bene t of employing an ellipsoidal D design, which, tained. Unlike in case study 1, we obtain reduction in the for N = f3; 4g almost reaches the performance of exact range on both parametric axes, which is caused by the non- OED and is superior to classical OED. linearity of the output equation w.r.t. both parameters. In Fig. 6, we compare the exact and linearized CRs Figure 7 shows the resulting CRs for D design criterion. obtained for the classical ( ) and the exact ( ) A design This shows the reason for the good performance of the using N = 2. 
We can see that the linearized CR captures Ellipsoid Exact OED Classical OED p p 2 2 A design D design E design 1.3 0.25 0.6 1.6 1.5 1.2 0.5 1.4 0.2 1.1 1.3 0.4 1.2 0.15 1.1 0.9 0.3 0.1 0.8 0.9 0.2 0.8 0.7 Classical E 0.7 0.05 0.1 Exact E 0.6 0.6 0 0.2 0.4 0.6 0.8 1 1.2 0.5 0 0 Figure 8: The exact (solid) and linearized (dashed ellipsoid) CRs Figure 9: Mean and variance of  ,  and  for 1; 000 random A D E using N = 2 as obtained for classical and exact E OED. mark the experiments using N = 4 noisy measurements at U of classical ( ), points used to calculate the Euclidean distance of the CRs. ellipsoidal ( ) and the exact ( ) OED. Dashed line signi es the performance of the nominal design. proposed ellipsoidal technique, which is able to tackle the nonlinearity of the CR far better than the linearization- classical OED prohibit BARON from closing the optimal- based design. ity gap and thus it returns locally optimal and sub-optimal The E-optimal designs for the classical and the exact solutions unless the optimization problem is properly ini- OED with N = 2 are compared in Fig. 8. Despite the fact tialized. However, the classical OED can be solved very that the linearized CRs show great similarity and they eciently, in few minutes on standard desktop workstation capture the orientation of the exact CR, the exact design if initialized properly. The solution times for exact designs tackles the nonlinearity far better and shows clear bene ts followed the expectations that result from the aforemen- w.r.t. the linearization-based counterpart. tioned complexity analysis (see Section 3). On average Robustness of the obtained designs was tested against the solution procedure for exact A, D, and E design us- the random realization of the measurement noise for N = 4 ing nested approach took less than 10 min, 6 h, and 15{ in the same way as in the previous case study. It is clear 30 min, respectively. 
This shows that the optimal exact A again that exact designs outperform the classical OED, and E designs can be obtained with practically the same ef- which reaches the largest variances and the inferior means. fort as in the case of the classical design for the small-scale The performance of the ellipsoidal D design is practically problems. The exact A design procedure scales quadrat- the same the performance of exact D design. In compari- ically in n so it can get much more time-consuming in son with with the case study 1, we observe larger variance higher dimensions. We note that the reduction of CPU of the exact D design, which we attribute to the higher time obtained using ellipsoidal D design w.r.t. to exact nonlinearity. D design was two-fold (3 h), which is, on one hand, a considerable time saving but, on the other hand, it puts 5.3. Discussion the potential user in question, whether the bene ts pre- vail over the costs. We note, for completeness, that the We studied cases where the CR is found for 2 con - KKT-based approach applied to problem of exact A de- dence. For a greater con dence, the CR increases in size sign required the solution time of approximately one hour, and nonlinearity a ects it stronger. That is why even big- which makes this approach clearly inferior. ger di erences can be expected between classical and exact In problems with small number of samples, it might be OED and even greater bene ts can be obtained by using problematic to identify approximate (experimental) vari- exact design (see [11]). ance or to satisfy the asymptotic properties under which Regarding the computational ecacy of the di erent the CRs are de ned. In this case, one might think of studied problems, it must be clearly pointed out that there di erent approaches to experiment design. One such ap- exists a high sensitivity of model outputs w.r.t. 
the model proach might be to relax the assumption of the presence parameters in both case studies due to the presence of of a white Gaussian noise in the measurements. This highly nonlinear exponential terms. The presence of highly might in turn lead to set-membership estimation approach, nonlinear terms and the need for inversion of the Fisher also commonly known as guaranteed parameter estima- information matrix to formulate the objective function in tion. A step in the direction of experimental design in E set-membership context was taken in [36] and in the re- [7] D. M. Bates, D. G. Watts, Nonlinear Regression Analysis and Its Applications, Wiley Online Library, 1988. cent studies [34, 37, 38]. [8] A. B. Kurzhanski, Ellipsoidal Calculus for Estimation and Feed- back Control, Birkhauser Boston, Boston, MA, 1997, pp. 229{ 6. Conclusions [9] W. C. Rooney, L. T. Biegler, Design for model parameter uncer- tainty using nonlinear con dence regions, AIChE Journal 47 (8) In this paper, exact and linearization-based methods (2001) 1794{1804. were presented for the optimal experiment design of a [10] S. Streif, F. Petzke, A. Mesbah, R. Findeisen, R. D. Braatz, nonlinear parameter estimation problem. The ellipsoidal Optimal experimental design for probabilistic model discrimina- tion using polynomial chaos, 19th IFAC World Congress 47 (3) method is proposed as a computationally less demanding (2014) 4103{4109. counterpart to the exact D design, which is a computation- [11] A. R. Gottu Mukkula, R. Paulen, Model-based optimal exper- ally demanding problem since it requires a good approxi- iment design for nonlinear parameter estimation using exact con dence regions, Vol. 50, 2017, pp. 13760{13765, 20th IFAC mation for the volume of a set. Two simple heuristic nu- World Congress. merical methods are used here to solve the corresponding [12] F. Dabbene, D. Henrion, C. M. Lagoa, Simple approximations of optimization problems, which are of bilevel nature. 
The semialgebraic sets and their applications to control, Automatica OED framework is tested upon two illustrative small-scale 78 (2017) 110{118. nonlinear case studies, where the bene ts of the exact de- [13] S. Streif, N. Strobel, R. Findeisen, Inner approximations of consistent parameter sets by constraint inversion and mixed- sign are showcased. The proposed ellipsoidal technique is integer programming, IFAC Proceedings Volumes 46 (31) (2013) shown to perform very well. Despite this study treated 321{326, 12th IFAC Symposium on Computer Applications in the case when the system model describes a static system Biotechnology. [14] K. Ball, Ellipsoids of maximal volume in convex bodies, Geome- in an explicit form, the methodology is straightforwardly triae Dedicata 41 (2) (1992) 241{250. applicable to dynamic systems and implicit model equa- [15] M. Tawarmalani, N. V. Sahinidis, A polyhedral branch-and-cut tions. An interesting direction for the future work lies, approach to global optimization, Mathematical Programming on one hand, in increasing the eciency of the solution of 103 (2005) 225{249. [16] A. Mitsos, B. Chachuat, P. I. Barton, Towards global bilevel the bilevel programs and, on the other hand, in the study dynamic optimization, Journal of Global Optimization 45 (1) of robust OED that relaxes the assumption of known (ex- (2009) 63{93. pected) least-squares estimates p ^, which might be relevant [17] P. Tanartkit, L. Biegler, A nested, simultaneous approach for in practical tasks. dynamic optimization problems{I, Computers & Chemical En- gineering 20 (6) (1996) 735 { 741, fth International Symposium on Process Systems Engineering. Acknowledgments [18] P. Tanartkit, L. Biegler, A nested, simultaneous approach for dynamic optimization problems{II: the outer problem, Comput- ers & Chemical Engineering 21 (12) (1997) 1365 { 1388. The research leading to these results has received fund- [19] M. Ruben, S. Daniel, N. Daniel, D. P. 
Cesar, Coordination of ing from the European Commission under grant agree- distributed model predictive controllers using price-driven co- ment number 291458 (ERC Advanced Investigator Grant ordination and sensitivity analysis, IFAC Proceedings Volumes MOBOCON). RP acknowledges the contribution of Slo- 46 (32) (2013) 215 { 220, 10th IFAC International Symposium on Dynamics and Control of Process Systems. vak Research and Development Agency under the project [20] A. Wachter, T. L. Biegler, On the implementation of an interior- APVV 15-0007, the contribution of the Scienti c Grant point lter line-search algorithm for large-scale nonlinear pro- Agency of the Slovak Republic under the grant 1/0004/17 gramming, Mathematical Programming 106 (1) (2006) 25{57. and the Research & Development Operational Programme [21] A. V. Fiacco, Y. Ishizuka, Sensitivity and stability analysis for nonlinear programming, Annals of Operations Research 27 (1) for the project University Scienti c Park STU in Bratislava, (1990) 215{235. ITMS 26240220084, supported by the Research 7 Devel- [22] S. Dempe, Bilevel Optimization: Reformulation and First Op- opment Operational Programme funded by the ERDF. timality Conditions, Springer Singapore, 2017, pp. 1{20. [23] S. Dempe, V. Kalashnikov, G. P erez-Vald es, N. Kalashnykova, Bilevel Programming Problems: Theory, Algorithms and Appli- References cations to Energy Networks, Energy Systems, Springer Berlin Heidelberg, 2015. [1] H. Hjalmarsson, From experiment design to closed-loop control, [24] S. Dempe, Foundations of Bilevel Programming, Nonconvex Automatica 41 (3) (2005) 393{438. Optimization and Its Applications, Springer, Boston, MA, 2002. [2] L. Pronzato, Survey paper: Optimal experimental design and [25] M. Kie er, E. 
Walter, Guaranteed estimation of the parameters some related control problems, Automatica 44 (2) (2008) 303{ of nonlinear continuous-time models: Contributions of interval analysis, International Journal of Adaptive Control and Signal [3] G. Franceschini, S. Macchietto, Model-based design of exper- Processing 25 (3) (2011) 191{207. iments for parameter precision: State of the art, Chem. Eng. [26] J. Dutta, S. Dempe, Bilevel programming with convex lower Sci. 63 (19) (2008) 4846{4872. level problems, Springer US, Boston, MA, 2006, pp. 51{71. [4] G. A. F. Seber, C. J. Wild, Nonlinear Regression, Wiley- [27] S. Dempe, S. Franke, On the solution of convex bilevel optimiza- Interscience, 2003. tion problems, Computational Optimization and Applications [5] E. Beale, Con dence regions in non-linear estimation, Journal of 63 (3) (2016) 685{703. the Royal Statistical Society. Series B (Methodological) (1960) [28] J. Bard, Practical Bilevel Optimization: Algorithms and Appli- 41{88. cations, Nonconvex Optimization and Its Applications, Springer [6] L. Pronzato, A. P azman, Design of Experiments in Nonlinear US, 1998. Models: Asymptotic Normality, Optimality Criteria and Small- [29] A. Mitsos, P. Lemonidis, P. I. Barton, Global solution of bilevel Sample Properties, Springer, 2013. 11 programs with a nonconvex inner program, Journal of Global Optimization 42 (4) (2008) 475{513. [30] O. Stein, G. Still, On generalized semi-in nite optimization and bilevel optimization, European Journal of Operational Research 142 (3) (2002) 444 { 462. [31] J. W. Blankenship, J. E. Falk, In nitely constrained optimiza- tion problems, Journal of Optimization Theory & Applications 19 (2) (1976) 261{281. [32] J. F. Bard, J. T. Moore, A branch and bound algorithm for the bilevel programming problem, SIAM Journal on Scienti c and Statistical Computing 11 (2) (1990) 281{292. [33] Y. Ishizuka, E. 
Aiyoshi, Double penalty method for bilevel opti- mization problems, Annals of Operations Research 34 (1) (1992) 73{88. [34] O. Walz, H. Djelassi, A. Caspari, A. Mitsos, Bounded-error opti- mal experimental design via global solution of constrained min- max program, Computers & Chemical Engineering 111 (2018) 92{101. [35] A. Mitsos, A. Tsoukalas, Global optimization of generalized semi-in nite programs via restriction of the right hand side, Journal of Global Optimization 61 (1) (2015) 1{17. [36] D. Telen, B. Houska, F. Logist, M. Diehl, J. Van Impe, Guar- anteed robust optimal experiment design for nonlinear dynamic systems, in: Control Conference (ECC), 2013 European, IEEE, 2013, pp. 2939{2944. [37] A. R. Gottu Mukkula, R. Paulen, Model-based design of optimal experiments for nonlinear systems in the context of guaranteed parameter estimation, Computers & Chemical Engineering 99 (2017) 198{213. [38] A. R. Gottu Mukkula, R. Paulen, Optimal design of dynamic ex- periments for guaranteed parameter estimation, in: 2016 Amer- ican Control Conference (ACC), 2016, pp. 1826{1831. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Statistics arXiv (Cornell University)

Optimal Experiment Design in Nonlinear Parameter Estimation with Exact Confidence Regions

Statistics, Volume 2019 (1902) – Feb 3, 2019



ISSN
0959-1524
eISSN
ARCH-3347
DOI
10.1016/j.jprocont.2019.01.004

Abstract

A model-based optimal experiment design (OED) of nonlinear systems is studied. OED represents a methodology for optimizing the geometry of the parametric joint-confidence regions (CRs), which are obtained in an a posteriori analysis of the least-squares parameter estimates. The optimal design is achieved by using the available (experimental) degrees of freedom such that more informative measurements are obtained. Unlike the commonly used approaches, which base the OED procedure upon the linearized CRs, we explore a path where we explicitly consider the exact CRs in the OED framework. We use a methodology for a finite parametrization of the exact CRs within the OED problem and we introduce a novel approximation technique of the exact CRs using inner- and outer-approximating ellipsoids as a computationally less demanding alternative. The employed techniques give the OED problem as a finite-dimensional mathematical program of bilevel nature. We use two small-scale illustrative case studies to study various OED criteria and compare the resulting optimal designs with the commonly used linearization-based approach. We also assess the performance of two simple heuristic numerical schemes for bilevel optimization within the studied problems.

Keywords: Optimal experiment design, Parameter estimation, Least-squares estimation

Tel.: +421 (0)2 5932 5730, Fax: +421 (0)2 5932 5340 (R. Paulen)
Email addresses: anweshreddy.gottumukkula@tu-dortmund.de (Anwesh Reddy Gottu Mukkula), radoslav.paulen@stuba.sk (Radoslav Paulen)
Preprint submitted to Journal of Process Control, February 5, 2019. arXiv:1902.00931v1 [math.OC] 3 Feb 2019

1. Introduction

At present, advanced industrial engineering and management strive for resource- and energy-efficient design and operation of systems, plants, and processes. Here, the use of model-based techniques is a leading paradigm. The employed models, whether mechanistic or data-based, include a finite number of parameters, whose values are related to the particular natural and system-wide phenomena and are thus commonly only known to belong to some interval or are unknown completely. Therefore, as-precise-as-possible determination of the unknown (uncertain) model parameters is crucial for the deployment of effective model-based solutions.

In a world where sensing technology becomes virtually present everywhere, the simplest way of obtaining the parameter values is to conduct a series of observations (experiments), during which the measurements of some quantities (output variables) are gathered. Subsequently, an estimation procedure is employed to find parameter values such that the model outputs match the observed data. As the measurements from a real plant are corrupted with some (random) noise, the uncertain model parameters cannot be identified precisely. This a posteriori uncertainty can be represented by a joint-confidence region (CR) that encompasses all the likely parameter estimates, given the probability distribution of the measurement noise.

The parametric uncertainty can be reduced by performing an experiment that appropriately sets the plant into an operating region where more informative measurements are obtained. One way to identify such an experiment consists in performing an optimal experiment design (OED) [1, 2]. OED identifies the best experiment in terms of initial conditions, control inputs, sampling times and/or locations of measurement devices. The model-based OED problem is usually formulated as a mathematical program, where a certain measure of the CR, such as its volume, is minimized.

Some well-established methods exist for the OED for linear systems [3], wherein the CR is an ellipsoid [4], so the task of its optimal shaping is greatly simplified. For nonlinear systems, the most common, yet approximate, approach is to resort to a system linearization and the use of linear techniques. Beale [5] presented a methodology for the assessment of system nonlinearity in this respect. Other approaches, covered in [6] for convex problems, are of more or less approximate character and mostly rely on the convexity properties of the OED problem. Bates and Watts [7] presented a framework based on local nonlinear curvature properties, which is also an approximate technique. The use of a finite parametrization of the exact CRs [8, 9, 10] is a relatively recent subject of study.

In this contribution, we study the framework for OED of nonlinear systems that is based on explicit consideration of the exact CRs. We also present a computationally simpler variant that is based on simultaneous inner- and outer-approximations of the CR by ellipsoids. The preliminary findings of this work were presented in the conference contribution [11]. We study various common design criteria and compare the performance of the presented techniques with the linearized OED.

We organized the paper as follows. The concepts of linear and nonlinear parameter estimation and construction of CRs are discussed first. Next, the formulation of the experiment design criteria that directly use exact CRs (exact designs) is presented. We also present two simple heuristic numerical approaches taken from the literature to solve the arising bilevel optimization problems. Finally, the results of two illustrative case studies are presented and discussed.

2. Parameter Estimation Problem

2.1. Mathematical Model

In this paper, a mathematical model of a system is represented by

$$\hat{y}(p, \tau) = F(p, u_\tau), \tag{1}$$

with $\hat{y}$ as $n_y$ measured variables, $u_\tau$ as $n_u$ degrees of freedom and $p$ as $n_p$ uncertain parameters. Here $\tau$ represents an ordinal number of the data point taken in one or more experiments. The function $F: \mathbb{R}^{n_p} \times \mathbb{R}^{n_u} \to \mathbb{R}^{n_y}$ is a twice continuously differentiable mapping. Throughout the paper, we resort to the representation (1), which considers the system model as static and explicit. However, the presented findings can straightforwardly be extended to dynamic and implicit models.

We will assume throughout the paper that the model is not over-parametrized and that the parameters are identifiable. We consider that, upon the realization of an experiment or several experiments, $N$ instances of the $n_y$-dimensional vector of plant measurements $y_m$ are gathered and are subsequently used for the estimation of the unknown parameters. Throughout the paper, we will assume Gaussian noise to be corrupting the measurements. In the following subsections, existing frameworks are presented for the identification of the unknown parameters and the corresponding exact CRs for both the linear and the nonlinear parameter estimation problems.

2.2. Linear Parameter Estimation

Parameter estimation with a linear model is a well-studied topic in the literature [4]. Assume a mathematical model with a mapping function $F_l$ of the form

$$\hat{y}(p, \tau) = F_l(p, u_\tau) = Q(u_\tau)\, p, \tag{2}$$

where $Q(u_\tau)$ is a so-called regressors matrix with appropriate dimensions.

Under the assumption of uncorrelated and normally distributed measurement noise with known standard deviation vector $\sigma$, the maximum-likelihood estimate is found via the least-squares estimation as

$$\hat{p} \in \arg\min_{p} J_w(p), \tag{3}$$

with

$$J_w(p) := \sum_{i=1}^{n_y} \sum_{\tau=\tau_1}^{\tau_N} \sigma_i^{-2} \left( y_{i,m}(\tau) - \hat{y}_i(p, \tau) \right)^2. \tag{4}$$

The CR of parameter estimates is then given by an ellipsoid [4]

$$(p - \hat{p})^T\, \mathrm{FIM}(U)\, (p - \hat{p}) \le \chi^2_{\alpha, n_p}, \tag{5}$$

where FIM is the so-called Fisher information matrix,

$$\mathrm{FIM}(U) = Q(u_\tau)^T \operatorname{diag}(\sigma_1^{-2}, \dots, \sigma_{n_y}^{-2})\, Q(u_\tau), \tag{6}$$

$U := (u_{\tau_1}^T, u_{\tau_2}^T, \dots, u_{\tau_N}^T)^T$, and $\chi^2_{\alpha, n_p}$ represents the upper $\alpha$ quantile of the chi-squared statistical distribution with $n_p$ degrees of freedom.

If the variance of the measurement noise is unknown, the covariance matrix $\operatorname{diag}(\sigma_1^2, \dots, \sigma_{n_y}^2)$ is normally approximated by $s^2 I$ with

$$s^2 := \frac{J(\hat{p})}{N - n_p}, \tag{7}$$

where the expected (most-likely) value of parameters $\hat{p}$ is identified by solving

$$\hat{p} \in \arg\min_{p} J(p) = \sum_{i=1}^{n_y} \sum_{\tau=\tau_1}^{\tau_N} \left( y_{i,m}(\tau) - \hat{y}_i(p, \tau) \right)^2. \tag{8}$$

The joint-confidence ellipsoid is then given by

$$(p - \hat{p})^T Q(u_\tau)^T \left( s^2 I \right)^{-1} Q(u_\tau)\, (p - \hat{p}) \le n_p F_{n_p, N - n_p, \alpha}, \tag{9}$$

where FIM is replaced by its corresponding approximation and $F$ represents the upper $\alpha$ quantile of the Fisher distribution with $n_p$ and $N - n_p$ degrees of freedom in the numerator and in the denominator, respectively.

2.3. Nonlinear Parameter Estimation

Given a static nonlinear mathematical model

$$\hat{y}(p, \tau) = F_{nl}(p, u_\tau), \tag{10}$$

one can identify the (exact) CR depending on the availability of information about the variance of the measurement noise. If the variance is known, the exact CR is given by [4]

$$J_w(p) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}, \tag{11}$$

while, when the variance of the measurement noise is unknown, the exact CR is given by [4]

$$J(p) - J(\hat{p}) \le n_p s^2 F_{n_p, N - n_p, \alpha}. \tag{12}$$

At this point we can define the sets $P_w := \{p \mid \text{Eq. (11)}\}$ and $P := \{p \mid \text{Eq. (12)}\}$. Unlike in linear parameter estimation, the CR does not generally have the shape of an ellipsoid due to nonlinearity.

3. Optimal Experiment Design

We present a methodology for OED for both linear and nonlinear parameter estimation problems. We will assume that the CR is connected. For disjoint exact CRs, which result from non-identifiability issues, it is normally suggested to perform a re-parameterization of the model [7]. We will also assume that an estimate $\hat{p}$ is available. The final assumptions, which are inherent to the standard experiment design techniques, are that there exists no structural plant-model mismatch and that the expected realization of the measurement noise is 0. In turn this results in $y_m(\tau) = \hat{y}(\hat{p}, \tau)$, $\forall \tau$.

Several design criteria are proposed in the literature [3] such as A, D, E, Modified E, V, Q, M and so on. Each of these designs aims to tune a particular property of the confidence region. We will focus our study on the most used criteria, i.e., A, D, and E, but other design criteria might be considered as well using the ideas presented herein.

Figure 1: Illustration of the design criteria (A design, D design, E design) in two-dimensional parametric space. The plot shows a generic exact CR (shaded) and the over-approximating orthotope (dashed) of the exact CR identified using the anchor points $\xi$ (squares). Further markers show the points that give the maximal Euclidean distance of two points in the CR.

3.1. A-optimal design

The idea behind the A design criterion is to minimize the perimeter of the box that encloses the exact CR [3], i.e., to minimize the sum of projections of the CR on the parameter axes. This idea is sketched in Fig. 1, where the shaded set represents a CR. The enclosing box is given by the four anchor points, marked as squares.

For a general CR, one can identify $2n_p$ anchor points

$$\Xi := \left\{ \begin{pmatrix} p_1^{L} \\ p_2^{1,L} \\ \vdots \\ p_{n_p}^{1,L} \end{pmatrix}, \begin{pmatrix} p_1^{U} \\ p_2^{1,U} \\ \vdots \\ p_{n_p}^{1,U} \end{pmatrix}, \begin{pmatrix} p_1^{2,L} \\ p_2^{L} \\ \vdots \\ p_{n_p}^{2,L} \end{pmatrix}, \dots, \begin{pmatrix} p_1^{n_p,U} \\ p_2^{n_p,U} \\ \vdots \\ p_{n_p}^{U} \end{pmatrix} \right\}, \tag{13}$$

where each point gives a lower or an upper limit of the value of a particular parameter in the exact CR. The anchor points can be identified by solving the following optimization problem

$$\phi_A(U) := \max_{\xi_1, \dots, \xi_{2n_p}} \sum_{j=1}^{n_p} \left( p_j^U - p_j^L \right) \tag{14a}$$
$$\text{s.t. } \forall j \in \{1, \dots, 2n_p\},\ \forall \tau \in \{\tau_1, \dots, \tau_N\}: \tag{14b}$$
$$\hat{y}(\xi_j, \tau) = F_{nl}(\xi_j, u_\tau), \tag{14c}$$
$$y_m(\tau) = \hat{y}(\hat{p}, \tau), \tag{14d}$$
$$J_w(\xi_j) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}. \tag{14e}$$

Note that the problem of identifying the orthotopic enclosure of the CR is formulated for the case when the measurement-noise variance is known, using the expression (11) for the CR. This would simply be exchanged with the expression (12) in the case when the variance is unknown.

Note that the problem (14) is separable and highly structured. On the other hand, it is non-convex in general and the number of its optimization variables ($2n_p^2$) grows quadratically with the number of parameters. This means that the identification of an orthotope might get challenging for state-of-the-art solvers and for high-dimensional problems.

The A-optimal experiment design (A design) can be identified by solving

$$\min_{u_\tau \in [u^L, u^U],\ \forall \tau \in \{\tau_1, \dots, \tau_N\}} \phi_A(U), \tag{15}$$

which is a special case of a bilevel program. The bounds $u^L$ and $u^U$ represent the lower and upper limits of the experimental degrees of freedom.

In the linear case, the A design is identified by

$$\min_{u_\tau \in [u^L, u^U],\ \forall \tau} \phi_A(\mathrm{FIM}) := \min_{u_\tau \in [u^L, u^U],\ \forall \tau} \operatorname{trace}(\mathrm{FIM}^{-1}) \tag{16a}$$
$$\text{s.t. } \hat{y}(p, \tau) = F_l(p, u_\tau),\ \forall \tau \in \{\tau_1, \dots, \tau_N\}. \tag{16b}$$

In the case of unknown variance of the measurement noise, the approximate FIM from (9) will be used. We denote this linearization-based approach as classical in this study.

3.2. D-optimal design

The D design aims to find the experimental conditions such that the exact CR has minimum volume. The D-optimal design problem then reads as

$$\min_{u_\tau \in [u^L, u^U],\ \forall \tau} \phi_D(U) := \min_{u_\tau \in [u^L, u^U],\ \forall \tau} \int \cdots \int_{P_w(U)} \mathrm{d}p_1 \dots \mathrm{d}p_{n_p}. \tag{17}$$

As there is no finite-dimensional parameterization of the set $P_w(U)$ available in general, it is very hard to evaluate the volume integral. We propose to use a gridding-based approach, where the grid is evaluated inside the aforementioned orthotopic enclosure of the CR. The proposed optimization problem may be written as

$$\min_{u_\tau \in [u^L, u^U],\ \forall \tau} \phi_D(U) := \min_{u_\tau \in [u^L, u^U],\ \forall \tau} \sum_{i \in I} \delta_i \tag{18a}$$
$$\text{s.t. } \delta_i = \begin{cases} 1, & \text{if } \pi_i \in P_w \\ 0, & \text{otherwise} \end{cases} \quad \forall i \in I, \tag{18b}$$
$$\Pi = \{p_1^L, p_1^L + \varepsilon, p_1^L + 2\varepsilon, \dots, p_1^U\} \times \{p_2^L, p_2^L + \varepsilon, p_2^L + 2\varepsilon, \dots, p_2^U\} \times \dots \times \{p_{n_p}^L, p_{n_p}^L + \varepsilon, p_{n_p}^L + 2\varepsilon, \dots, p_{n_p}^U\}, \tag{18c}$$
$$\max_{\xi_1, \dots, \xi_{2n_p}} \sum_{j=1}^{n_p} \left( p_j^U - p_j^L \right), \tag{18d}$$
$$\text{s.t. } \forall j \in \{1, \dots, 2n_p\},\ \forall \tau \in \{\tau_1, \dots, \tau_N\}: \tag{18e}$$
$$\hat{y}(\xi_j, \tau) = F_{nl}(\xi_j, u_\tau), \tag{18f}$$
$$y_m(\tau) = \hat{y}(\hat{p}, \tau), \tag{18g}$$
$$J_w(\xi_j) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}, \tag{18h}$$

where $\varepsilon > 0$ is the tuning parameter that determines the accuracy of the approximation, $\pi_i$ are the grid points and $I$ is the index set of $\Pi$. This approach for approximating the volume of the CR is illustrated in Fig. 1 as a grid in the shaded set.

In principle, the identification of the orthotopic enclosure of the CR can be removed and the problem (18) can be modified to a single-level mathematical program. Nonetheless, the optimization problem (18) is non-smooth due to the presence of the indicator function (18b) and thus it can get very challenging and computationally highly expensive, especially in higher dimensions. An alternative approach can be exploited by approximating the volume using semi-algebraic sets [12]. Approaches to inner approximation of the CR based on an orthotope and on successive SDP approximations are presented by Streif et al. [13]. We propose a simpler approximation here, which uses an idea similar to the Löwner-John ellipsoids [14]. We construct the inner- and outer-approximating ellipsoids, which are scaled counterparts of the linearized CRs. The proposed approximate D design is found by

$$\min_{u_\tau \in [u^L, u^U],\ \forall \tau} \frac{\det(k_{out}\, \mathrm{FIM}^{-1})}{k_{out}} + \frac{\det(k_{in}\, \mathrm{FIM}^{-1})}{k_{in}} \tag{19a}$$
$$\text{s.t. } \max_{p_{out},\, p_{in},\, k_{out},\, k_{in}} k_{out} - k_{in} \tag{19b}$$
$$\text{s.t. } \forall \tau \in \{\tau_1, \dots, \tau_N\}: \tag{19c}$$
$$\hat{y}(p_{out}, \tau) = F_{nl}(p_{out}, u_\tau), \tag{19d}$$
$$\hat{y}(p_{in}, \tau) = F_{nl}(p_{in}, u_\tau), \tag{19e}$$
$$y_m(\tau) = \hat{y}(\hat{p}, \tau), \tag{19f}$$
$$J_w(p_{out}) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}, \tag{19g}$$
$$J_w(p_{in}) - J_w(\hat{p}) \ge \chi^2_{\alpha, n_p}, \tag{19h}$$
$$(p_{out} - \hat{p})^T\, \mathrm{FIM}\, (p_{out} - \hat{p}) \ge k_{out}, \tag{19i}$$
$$(p_{in} - \hat{p})^T\, \mathrm{FIM}\, (p_{in} - \hat{p}) \le k_{in}, \tag{19j}$$

where $p_{out}$ and $p_{in}$ are intersection points between the outer-/inner-approximating ellipsoids and the exact CR. The scaling factors $k_{out}$ and $k_{in}$ express the magnitude of deviation of the outer- and, respectively, inner-approximating ellipsoid from the linearized CR. The weighting in the cost function is introduced to penalize the contribution of the most deviating ellipsoid. This prevents the design procedure from concentrating on shaping an ellipsoid that is potentially a very loose approximation of the CR and in practice avoids numerical and irregularity problems. Note that the proposed problem also scales well, i.e., linearly, w.r.t. the number of parameters, as the lower-level problem optimizes $2n_p + 2$ variables. The proposed approximate D-optimal design is therefore a computationally less expensive problem when compared to the exact D-optimal design using the gridding-based approach. We will denote the proposed approximate D-optimal design approach as the ellipsoidal D design. The idea behind this method, slightly modified, could also be used for an approximate A design but we do not explore this path explicitly in the present study.

Note also that, if the CR can be expressed exactly as (5), the proposed optimization problem boils down to the classical D design where one solves

$$\min_{u_\tau \in [u^L, u^U],\ \forall \tau} \phi_D(\mathrm{FIM}) := \min_{u_\tau \in [u^L, u^U],\ \forall \tau} \det(\mathrm{FIM}^{-1}) \tag{20a}$$
$$\text{s.t. } \hat{y}(p, \tau) = F_l(p, u_\tau),\ \forall \tau \in \{\tau_1, \dots, \tau_N\}. \tag{20b}$$

3.3. E-optimal design

The objective of the E design is to minimize the Euclidean distance ($\|\varphi_1 - \varphi_2\|_2$) between the two points (marked in Fig. 1) that belong to the CR and whose Euclidean distance is maximal. This can be expressed as

$$\min_{u_\tau \in [u^L, u^U],\ \forall \tau} \phi_E(U) \tag{21a}$$
$$\text{s.t. } \phi_E(U) = \max_{\varphi_1, \varphi_2} \|\varphi_1 - \varphi_2\|_2 \tag{21b}$$
$$\text{s.t. } \forall j \in \{1, 2\},\ \forall \tau \in \{\tau_1, \dots, \tau_N\}: \tag{21c}$$
$$\hat{y}(\varphi_j, \tau) = F_{nl}(\varphi_j, u_\tau), \tag{21d}$$
$$y_m(\tau) = \hat{y}(\hat{p}, \tau), \tag{21e}$$
$$J_w(\varphi_j) - J_w(\hat{p}) \le \chi^2_{\alpha, n_p}. \tag{21f}$$

The E design is also known as a decorrelating design as it aims at finding the experimental conditions such that the CR is as spherical as possible. This criterion is illustrated in Fig. 1. It is noteworthy that the lower-level problem of (21) scales linearly w.r.t. the number of parameters as it optimizes $2n_p$ variables.

In the linear case, the (classical) E design is identified by

$$\min_{u_\tau \in [u^L, u^U],\ \forall \tau} \phi_E(\mathrm{FIM}) := \min_{u_\tau \in [u^L, u^U],\ \forall \tau} \max_i \lambda_i(\mathrm{FIM}^{-1}) \tag{22a}$$
$$\text{s.t. } \hat{y}(p, \tau) = F_l(p, u_\tau),\ \forall \tau \in \{\tau_1, \dots, \tau_N\}, \tag{22b}$$

where $\lambda_i(\cdot)$ is the i-th eigenvalue of a matrix.

4.

4.1. Nested approach

The following nested approach is inspired by the formulation proposed in [17, 18] for solving dynamic optimization problems and in [19] for solving a coordination control algorithm using a price-driven coordination technique. The nested approach splits the bilevel optimization problem into a lower-level optimization problem

$$x_2 \in \arg\max_{x_2}\ g(x_2) \tag{24a}$$
$$\text{s.t. } 0 = h_E(x_1, x_2), \tag{24b}$$
$$0 \ge h_I(x_1, x_2), \tag{24c}$$

that is solved for a given $x_1$ using a global solver (e.g. BARON) and an upper-level optimization problem

$$x_1 \in \arg\min_{x_1}\ f(x_1, x_2) \tag{25a}$$
$$\text{s.t. } 0 = h_E(x_1, x_2), \tag{25b}$$
$$0 \ge h_I(x_1, x_2), \tag{25c}$$

that can then be solved for a given $x_2$ using a local solver. We use IPOPT [20] as a local solver in this work.

The individual optimization problems are interconnected by the copy variables $x_1$ and $x_2$ that are exchanged between the problems. The problems are solved repeatedly and the convergence of the nested approach is claimed once the consecutive values of the copy variables satisfy $\|x_{1,k+1} - x_{1,k}\| \approx 0$ and $\|x_{2,k+1} - x_{2,k}\| \approx 0$, where $k$ is an iteration counter.
Numerical Implementation If a gradient-based solver is used to determine the local optimum of the problem (25), the objective and constraint In this section we discuss the possible ways to solve the gradients are to be supplied. An approach from [21] can proposed optimization problems. We will exploit BARON T T T be used in this respect with x := (x ;x ) 1;k 2;k [15] in this work in order to guarantee global optimality " # of the classical OED problems (16), (20), and (22). Spe- dx 2 T 2 r Lj r hj r Lj cial attention is devoted to the presented bilevel programs x x dx x x ;x k x k 1 x ;x k 2 2 2 2 1 = ; (26) r hj 0 r hj as the classical OED problems are single-level optimiza- x x x x 2 k 1 k dx tion problems and can straightforwardly be solved using a where L represents the Lagrangian of the lower-level prob- nonlinear program solver. T T T lem and  := ( ; ) ;8i 2 I is the vector of multipli- We present two simple heuristic approaches taken from E I;i ers corresponding to the equality and active inequality con- literature that can be used to solve the presented bilevel T  T  T straints h(x ;x ) := (h (x ;x );h (x ;x )) ;8i 2 I problems, which can be generalized in the form 2 2 2 A 1 E 1 I;i 1 of the lower-level problem and I is an index set of the min f (x ;x ) (23a) active inequality constraints. It is not guaranteed that the nested approach always converge to a local minimum [18]. s.t. x 2 argmax g(x ) (23b) Instead it may sometimes converge to a local maximum or a saddle point. The obtained solution can be veri ed s.t. 0 = h (x ;x ); (23c) E 1 2 by evaluating the necessary and sucient conditions for 0  h (x ;x ): (23d) I 1 2 optimality. A special care has to be taken w.r.t. the non-convexity of 4.2. KKT-based approach lower-level problem. Its global optimum has to be iden- Another heuristic approach for solving a bilevel opti- ti ed in order to guarantee feasibility of the upper-level mization problem is to reformulate the lower-level problem problem [16]. 
using the KKT conditions [22, 23, 24]. The reformulated problem reads as

$$ \min_{x_1,\, x_2,\, \lambda_E,\, \lambda_I \ge 0} f(x_1, x_2) \tag{27a} $$
$$ \text{s.t.}\quad 0 = \nabla_{x_2} L(x_1, x_2, \lambda_E, \lambda_I), \tag{27b} $$
$$ \qquad\ 0 = \nabla_{\lambda_E} L(x_1, x_2, \lambda_E, \lambda_I) = h_E(x_1, x_2), \tag{27c} $$
$$ \qquad\ 0 = \lambda_{I,i}\, h_{I,i}, \quad \forall i \in I_I, \tag{27d} $$

where $I_I$ is the index set of the inequality constraints of (23). As discussed above, the lower-level problem has to be solved to global optimality, which is not guaranteed by satisfaction of the KKT conditions. Reaching the global optimum of the lower-level problem has to be assured upon convergence in order to guarantee feasibility of the lower-level problem and thus a local optimum of the bilevel program. The feasibility test can be performed by solving (24) globally, by gridding, or by set-inversion [25] techniques, with a subsequent comparison of the obtained values for the variables of the lower-level problem.

We note that there are other approaches that can be employed to solve the problem (23). The solution methods proposed by Dutta et al. in [26] and by Dempe et al. in [27] assume a convex inner-level optimization problem. Bard et al., Dempe et al. and Mitsos et al. in [28], [24], and [29] proposed solution methods considering a nonconvex inner-level optimization problem. It is generally well known that there is a close connection between bilevel programming and semi-infinite programming (SIP), as discussed in [30]. A cutting-plane SIP algorithm is proposed by Blankenship and Falk [31], a branch-and-bound algorithm by Bard and Moore [32], and a double penalty function method by Ishizuka and Aiyoshi [33]. Recently, Walz et al. [34] presented a global SIP algorithm proposed in [35] that could be used in the context of optimal experiment design. The references therein give a complete picture of the use of SIP algorithms for the solution of bilevel programs.

Various stochastic methods, such as genetic algorithms, simulated annealing, etc., could also be used in principle; these might be especially interesting for the D design problem (18) because of its non-smoothness. We only exploit the described nested and KKT-based approaches in this study.

5. Case Studies

We apply the presented methodologies for finding OED to two small-scale illustrative case studies. The employed models are in the form of explicit step responses of linear time-invariant dynamic systems and the optimal experiment design should reveal the best sampling instants. We denote the designs that are based on the exact CRs (problems (15), (18), and (21)) as exact designs. The OED problems are solved for the $2\sigma$-confidence level ($\alpha = 0.9545$) using the following approaches:

1. Classical OED problems are solved globally.
2. The designs based on exact CRs and the ellipsoidal D design are solved using the nested approach, where the lower-level problem is solved globally and the upper-level problem is solved using a local solver.
3. In order to study the numerical efficacy of different algorithms, the A-optimal design is solved by the KKT-based approach globally.

5.1. Case Study 1

The mathematical model for biological oxygen demand (BOD) [7] is considered. The cumulative BOD of microorganisms at incubation time $u_\tau$ is given by

$$ \hat y(p, \tau) = p_1 \left(1 - \exp(-p_2 u_\tau)\right), \quad u_\tau \in [0, 20], \tag{28} $$

which can also be interpreted as a step response of a first-order linear time-invariant system with static gain $p_1$ and time constant $1/p_2$. At this point it can be observed that $p_1$ enters the output equation linearly while $p_2$ enters nonlinearly.

The true values of the parameters $p_1$ and $p_2$ are, respectively, 2.5 and 0.5. These are also considered as the expected least-squares estimates $\hat p$ for all the studied OED problems. The measurements $y(\tau)$ are assumed to be corrupted by zero-mean Gaussian noise with standard deviation 0.1. We assume here that the variance (or the standard deviation) of the measurement noise is unknown. The exact CRs are then defined by (12). Additionally, we consider $J(\hat p) = 0$. The tolerance $\delta$ for the exact D design is set to $5 \times 10^{(\ldots)}$.

The optimal sampling times $U^*$ with $N = \{4, 5\}$ for the classical and exact A-, D- and E-optimal designs and the ellipsoidal D design are reported in Tab. 1. The values of the objective function for exact designs ($\phi(\cdot)$) are evaluated at the identified optimal solution $U^*$ for each design with $N = \{4, 5\}$. For all the designs, the exact OED has a superior performance when compared to the linearized design. The $U^*$ identified by the classical and the exact OED contain multiple common repeated measurements at $u = 20$. This agreement between the designs can be attributed to the linear entry of $p_1$ into the model $\hat y(\cdot)$. It can also be explained by the fact that the steady-state gain can be inferred close to the steady state irrespective of the value of the time constant. The classical and the exact A OED have the same number of repetitions of $u = 20$ for $N = \{4, 5\}$; this is not the case for the D-optimal design, where the numbers of repetitions match only for $N = 4$. In the case of the E-optimal design, the classical OED identified a $U^*$ in which $u = 20$ is repeated once more than in the solution identified by the exact OED, which again points to the linear decorrelation between $p_1$ and $p_2$ that the classical E design tries to achieve.

Table 1: Optimal designs ($U^*$) as identified for the classical, ellipsoidal and exact OED using $N = \{4, 5\}$ and the values of the objective function of exact designs ($\phi(\cdot)$) evaluated at the identified optimal designs ($U^*$). In the case of D design, $\phi(U^*) := \phi_D(U^*)$.

Approach       Design  N  Solution (U*)                      phi(U*)
Classical OED  A       4  {1.69, 1.69, 20, 20}               1.610
Classical OED  A       5  {1.77, 1.77, 20, 20, 20}           0.940
Classical OED  D       4  {2, 2, 20, 20}                     0.425
Classical OED  D       5  {2, 2, 20, 20, 20}                 0.155
Classical OED  E       4  {1.61, 20, 20, 20}                 1.016
Classical OED  E       5  {1.75, 20, 20, 20, 20}             0.365
Ellipsoidal    D       4  {1.42, 1.42, 20, 20}               0.414
Ellipsoidal    D       5  {1.69, 1.69, 19.99, 19.99, 20}     0.154
Exact OED      A       4  {1.37, 1.37, 20, 20}               1.585
Exact OED      A       5  {1.60, 1.60, 20, 20, 20}           0.938
Exact OED      D       4  {1.62, 1.62, 20, 20}               0.409
Exact OED      D       5  {1.81, 1.82, 1.83, 19.99, 19.99}   0.154
Exact OED      E       4  {1.04, 1.04, 20, 20}               0.974
Exact OED      E       5  {1.22, 1.23, 20, 20, 20}           0.322

The performance of the proposed D-optimal design based on the inner- and the outer-approximating ellipsoids is better than that of the classical D design for all values of $N$. It incurs a relatively small loss in performance when compared to the exact D design. This suggests that the proposed ellipsoidal D design is an interesting framework for approximate D design when compared to the linearization-based alternative.

Figure 2: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained by classical and exact A designs. The plot shows the over-approximating orthotopes (dashed) of the exact CRs identified using the anchor points $\pi$. Legend: Classical A, Exact A.

Figure 3: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained by classical, ellipsoidal and exact D designs. The plot shows the outer-/inner-approximating ellipsoids of the exact CRs (dotted and dash-dotted lines, respectively). Markers indicate the intersection points of the outer-/inner-approximating ellipsoids and the exact CRs. Legend: Classical D, Ellipsoidal D, Exact D.

The exact CRs for the classical and the exact A OED for four measurements are compared in Fig. 2. The orthotopes enclosing the exact CRs are plotted using the anchor points $\pi$. It is clear that the orthotope that encloses the exact CR identified by the classical A design has a larger perimeter than the orthotope identified by the exact A design. The reason for this can be found when looking at the linearized CRs for both designs (dashed ellipsoids). The linearized ellipsoid clearly does not approximate the exact CRs well, where, as can be expected, the approximation is looser w.r.t. $p_2$, which enters nonlinearly in the output equation (28). It is an interesting observation that the presented linearized CRs are very similar to each other, despite the fact that they give significantly different exact CRs.

Figure 3 shows the exact CRs for the classical, ellipsoidal and exact D designs using $N = 4$. The linearized CRs (dashed ellipsoids) are again very similar to each other. Among them the ellipsoid from the classical design has the smallest volume, as might be expected. However, the exact CR for the classical design is the largest one (see Tab. 1), which again comes from the classical design disregarding the nonlinearity of the output equation in $p_2$.

We also present the inner- and the outer-approximating ellipsoids for all three OED approaches by dash-dotted and dotted ellipsoids, respectively. The corresponding intersection points are marked in Fig. 3. Here we can observe the benefits of the weighting introduced in the objective function of the problem (19). Looking at the sizes of the outer-approximating ellipsoids (especially the one constructed for the exact D design), it might appear reasonable to minimize the volume of the over-approximating ellipsoid as a good approximation of the exact D design. This would correspond to setting $1/k_{in} \to 0$ while solving the problem (19). We have explored this path in our earlier study [11], but the obtained design results were unsatisfactory, since the over-approximation by an ellipsoid might become very loose. The proposed ellipsoidal D design therefore balances out the concentration
on the size of the ellipsoid and the appropriateness of the over-approximation by an ellipsoid. It is clearly visible that the exact CR for the $U^*$ identified by the proposed ellipsoidal OED has a much smaller volume than the exact CR identified by the classical approach and, at the same time, it is very close to the optimal exact OED.

Figure 4: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 4$ as obtained for classical and exact E OED. Markers indicate the points used to calculate the Euclidean distance of the CRs. Legend: Classical E, Exact E.

The exact CRs for the classical and the exact E designs are compared in Fig. 4 using $N = 4$ measurements. The markers in Fig. 4 indicate $\varphi_1$ and $\varphi_2$ (see (21)) obtained for the classical and the exact E-optimal designs. In this case, unlike for the previous designs, we observe a major discrepancy in the orientation between the linearized ellipsoids obtained for the classical and exact designs, even though the largest semi-axes, which correspond to the largest eigenvalues (see (22)), are very similar. Again this is attributed to the nonlinearity in $p_2$. The resulting difference between the two designs in the distance between the most distant points that belong to $P$ is nevertheless significant. Similar behavior can be seen for the case with $N = 5$ (see Tab. 1).

Next, we study the performance of the obtained designs against a number of simulated experiments. The aim here is to evaluate the robustness of the designs against random realizations of the noise that would be present in a real experiment. We exclude the dependence on the least-squares estimate here, i.e., we use the nominal values for $\hat p$. Such dependence is the subject of study for robust OED, which is not in the scope here. We simulated 1,000 experiments with each studied design using the obtained optimal incubation (measurement) times $U^*$ and we corrupted the measurements $y(\tau) := \hat y(\hat p, \tau) + e$ with Gaussian noise $e$ of standard deviation 0.1.

Figure 5: Mean and variance of $\phi_A$, $\phi_D$ and $\phi_E$ for 1,000 random experiments with $N = 4$ noisy measurements at $U^*$ of the classical, ellipsoidal and exact designs. The dashed line signifies the performance of the nominal design. Panels: A design, D design, E design.

Figure 5 shows the mean and the variance of the objective value of the exact A, D and E designs. In this plot, we also include the nominal values (obtained when noise-free measurements are gathered) of the different designs using dashed lines. With respect to the mean values of the obtained $\phi_A$, $\phi_D$ and $\phi_E$, it is confirmed that exact OED is the best option on average, although the mean values do not match the expected nominal values of the design criteria. Regarding the obtained variances, it is noteworthy that the classical design exhibits the strongest sensitivity to noise, as can be concluded from the magnitude of the variances, and thus appears to be the worst option. This again underpins the importance of considering the nonlinearity in the OED, and it can be documented by comparing the worst-case value of the exact E design with the nominal mean obtained for the linearized design (right plot). The last interesting observation is that the variance of $\phi_D$ obtained with the ellipsoidal design is slightly smaller than the variance of the exact OED. This might point to the approximation error (tolerance $\delta$ in (18)) introduced in calculating $\phi_D$, which is however not severe as the mean and worst-case values follow the expected trend.

5.2. Case study 2

Here we consider a problem where the system output can be modeled as

$$ \hat y(\tau) = b_0 \frac{p_1}{p_2^2} \left[ \left( \frac{p_1 + p_2}{p_1}\, p_2 u_\tau + 1 \right) \exp(-p_2 u_\tau) - 1 \right], \tag{29} $$

which can also be interpreted as a step response of a second-order linear time-invariant system with a double pole at $-p_2$ and a zero at $p_1$. The corresponding transfer function can be given by

$$ G(s) = \frac{b_0 (s - p_1)}{(s + p_2)^2}. \tag{30} $$

Clearly the constant $b_0$ is a parameter that influences the steady-state gain of the system and we will assume it, for simplicity, to be known, $b_0 = 4$.

Notice that both $p_1$ and $p_2$ enter the output equation nonlinearly, so this problem can be considered more challenging and an even greater discrepancy might be expected between linearized and exact designs. The true values of the parameters $p_1$ and $p_2$ are 0.5 and 1.0, respectively, and are equal to $\hat p$. The measurements $y(\tau)$ are assumed to be corrupted with zero-mean Gaussian noise with known standard deviation of 0.4. For a fair comparison of the proposed framework, we assume $J(\hat p) = 0$. The tolerance ($\delta$) for the exact D design is $7.5 \times 10^{(\ldots)}$.

The classical and the proposed OED frameworks are applied to identify the optimal sampling times $U^*$, where $u_\tau \in [0, 10]$. The previously discussed OED problems are solved for $N = \{2, 3, 4\}$ and $\alpha = 0.9545$ ($2\sigma$-CR) with the same numerical techniques as in case study 1. The results are presented in Tab. 2. The trends regarding the performance of the different designs are the same as described in case study 1. We observe in this case a more significant benefit of employing the ellipsoidal D design, which, for $N = \{3, 4\}$, almost reaches the performance of exact OED and is superior to classical OED.

Table 2: Optimal designs ($U^*$) as identified for the classical, ellipsoidal and exact OED using $N = \{2, 3, 4\}$ and the values of the objective function of exact designs ($\phi(\cdot)$) evaluated at the identified optimal designs ($U^*$). In the case of D design, $\phi(U^*) := \phi_D(U^*)$.

Approach       Design  N  Solution (U*)              phi(U*)
Classical OED  A       2  {1.91, 10}                 1.666
Classical OED  A       3  {1.86, 1.86, 10}           1.151
Classical OED  A       4  {1.81, 1.81, 1.81, 10}     0.974
Classical OED  D       2  {2, 10}                    0.386
Classical OED  D       3  {2, 2, 10}                 0.231
Classical OED  D       4  {2, 2, 10, 10}             0.148
Classical OED  E       2  {1.90, 10}                 1.225
Classical OED  E       3  {1.82, 1.82, 10}           0.520
Classical OED  E       4  {1.74, 1.74, 1.74, 10}     0.341
Ellipsoidal    D       2  {1.70, 10}                 0.363
Ellipsoidal    D       3  {1.73, 1.73, 10}           0.219
Ellipsoidal    D       4  {1.82, 1.82, 10, 10}       0.144
Exact OED      A       2  {1.63, 10}                 1.584
Exact OED      A       3  {1.67, 1.67, 10}           1.132
Exact OED      A       4  {1.66, 1.66, 1.67, 10}     0.966
Exact OED      D       2  {1.61, 10}                 0.344
Exact OED      D       3  {1.65, 1.66, 10}           0.218
Exact OED      D       4  {1.74, 1.77, 10, 10}       0.144
Exact OED      E       2  {1.62, 10}                 1.094
Exact OED      E       3  {1.63, 1.63, 10}           0.497
Exact OED      E       4  {1.59, 1.59, 1.59, 10}     0.331

Figure 6: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 2$ as obtained for classical and exact A OED. The plot shows the over-approximating orthotopes (dashed) of the exact CRs identified using the anchor points $\pi$. Legend: Classical A, Exact A.

Figure 7: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 2$ as obtained for classical, ellipsoidal and exact D OED. The plot shows the outer-/inner-approximating ellipsoids of the exact CRs (dotted and dash-dotted lines, respectively). Markers indicate the intersection points of the outer-/inner-approximating ellipsoids and the exact CRs. Legend: Classical D, Ellipsoidal D, Exact D.

In Fig. 6, we compare the exact and linearized CRs obtained for the classical and the exact A design using $N = 2$. We can see that the linearized CR captures well the orientation of the exact CR, but due to the nonlinearity the approximation is relatively poor. Similarly to the first case study we can observe that, despite the very similar orientation of the linearized confidence regions, significant benefits of the exact design over the linearized one are obtained. Unlike in case study 1, we obtain a reduction in the range on both parametric axes, which is caused by the nonlinearity of the output equation w.r.t. both parameters.

Figure 7 shows the resulting CRs for the D design criterion. This shows the reason for the good performance of the proposed ellipsoidal technique, which is able to tackle the nonlinearity of the CR far better than the linearization-based design.

Figure 8: The exact (solid) and linearized (dashed ellipsoid) CRs using $N = 2$ as obtained for classical and exact E OED. Markers indicate the points used to calculate the Euclidean distance of the CRs. Legend: Classical E, Exact E.

The E-optimal designs for the classical and the exact OED with $N = 2$ are compared in Fig. 8. Despite the fact that the linearized CRs show great similarity and capture the orientation of the exact CR, the exact design tackles the nonlinearity far better and shows clear benefits w.r.t. the linearization-based counterpart.

Figure 9: Mean and variance of $\phi_A$, $\phi_D$ and $\phi_E$ for 1,000 random experiments using $N = 4$ noisy measurements at $U^*$ of the classical, ellipsoidal and exact OED. The dashed line signifies the performance of the nominal design. Panels: A design, D design, E design.

Robustness of the obtained designs was tested against random realizations of the measurement noise for $N = 4$ in the same way as in the previous case study. It is clear again that the exact designs outperform the classical OED, which reaches the largest variances and the inferior means. The performance of the ellipsoidal D design is practically the same as the performance of the exact D design. In comparison with case study 1, we observe a larger variance of the exact D design, which we attribute to the higher nonlinearity.

5.3. Discussion

We studied cases where the CR is found for $2\sigma$ confidence. For a greater confidence, the CR increases in size and the nonlinearity affects it more strongly. That is why even bigger differences can be expected between classical and exact OED and even greater benefits can be obtained by using the exact design (see [11]).

Regarding the computational efficacy of the different studied problems, it must be clearly pointed out that there exists a high sensitivity of the model outputs w.r.t. the model parameters in both case studies due to the presence of highly nonlinear exponential terms. The presence of highly nonlinear terms and the need for inversion of the Fisher information matrix to formulate the objective function in classical OED prohibit BARON from closing the optimality gap, and thus it returns locally optimal and sub-optimal solutions unless the optimization problem is properly initialized. However, the classical OED can be solved very efficiently, in a few minutes on a standard desktop workstation, if initialized properly. The solution times for exact designs followed the expectations that result from the aforementioned complexity analysis (see Section 3). On average the solution procedure for the exact A, D, and E designs using the nested approach took less than 10 min, 6 h, and 15-30 min, respectively. This shows that the optimal exact A and E designs can be obtained with practically the same effort as in the case of the classical design for small-scale problems. The exact A design procedure scales quadratically in $n_p$, so it can get much more time-consuming in higher dimensions. We note that the reduction of CPU time obtained using the ellipsoidal D design w.r.t. the exact D design was two-fold (3 h), which is, on one hand, a considerable time saving but, on the other hand, leaves the potential user with the question of whether the benefits prevail over the costs. We note, for completeness, that the KKT-based approach applied to the problem of exact A design required a solution time of approximately one hour, which makes this approach clearly inferior.

In problems with a small number of samples, it might be problematic to identify the approximate (experimental) variance or to satisfy the asymptotic properties under which the CRs are defined. In this case, one might think of different approaches to experiment design. One such approach might be to relax the assumption of the presence of white Gaussian noise in the measurements. This might in turn lead to the set-membership estimation approach, also commonly known as guaranteed parameter estimation. A step in the direction of experimental design in the set-membership context was taken in [36] and in the recent studies [34, 37, 38].

6. Conclusions

In this paper, exact and linearization-based methods were presented for the optimal experiment design of a nonlinear parameter estimation problem. The ellipsoidal method is proposed as a computationally less demanding counterpart to the exact D design, which is a computationally demanding problem since it requires a good approximation of the volume of a set. Two simple heuristic numerical methods are used here to solve the corresponding optimization problems, which are of bilevel nature. The OED framework is tested upon two illustrative small-scale nonlinear case studies, where the benefits of the exact design are showcased. The proposed ellipsoidal technique is shown to perform very well. Although this study treated the case when the system model describes a static system in an explicit form, the methodology is straightforwardly applicable to dynamic systems and implicit model equations. An interesting direction for future work lies, on one hand, in increasing the efficiency of the solution of the bilevel programs and, on the other hand, in the study of robust OED that relaxes the assumption of known (expected) least-squares estimates $\hat p$, which might be relevant in practical tasks.

Acknowledgments

The research leading to these results has received funding from the European Commission under grant agreement number 291458 (ERC Advanced Investigator Grant MOBOCON). RP acknowledges the contribution of the Slovak Research and Development Agency under the project APVV 15-0007, the contribution of the Scientific Grant Agency of the Slovak Republic under the grant 1/0004/17 and the Research & Development Operational Programme for the project University Scientific Park STU in Bratislava, ITMS 26240220084, supported by the Research & Development Operational Programme funded by the ERDF.

References

[1] H. Hjalmarsson, From experiment design to closed-loop control, Automatica 41 (3) (2005) 393-438.
[2] L. Pronzato, Survey paper: Optimal experimental design and some related control problems, Automatica 44 (2) (2008) 303-
[3] G. Franceschini, S. Macchietto, Model-based design of experiments for parameter precision: State of the art, Chem. Eng. Sci. 63 (19) (2008) 4846-4872.
[4] G. A. F. Seber, C. J. Wild, Nonlinear Regression, Wiley-Interscience, 2003.
[5] E. Beale, Confidence regions in non-linear estimation, Journal of the Royal Statistical Society. Series B (Methodological) (1960) 41-88.
[6] L. Pronzato, A. Pázman, Design of Experiments in Nonlinear Models: Asymptotic Normality, Optimality Criteria and Small-Sample Properties, Springer, 2013.
[7] D. M. Bates, D. G. Watts, Nonlinear Regression Analysis and Its Applications, Wiley Online Library, 1988.
[8] A. B. Kurzhanski, Ellipsoidal Calculus for Estimation and Feedback Control, Birkhäuser Boston, Boston, MA, 1997, pp. 229-
[9] W. C. Rooney, L. T. Biegler, Design for model parameter uncertainty using nonlinear confidence regions, AIChE Journal 47 (8) (2001) 1794-1804.
[10] S. Streif, F. Petzke, A. Mesbah, R. Findeisen, R. D. Braatz, Optimal experimental design for probabilistic model discrimination using polynomial chaos, 19th IFAC World Congress 47 (3) (2014) 4103-4109.
[11] A. R. Gottu Mukkula, R. Paulen, Model-based optimal experiment design for nonlinear parameter estimation using exact confidence regions, Vol. 50, 2017, pp. 13760-13765, 20th IFAC World Congress.
[12] F. Dabbene, D. Henrion, C. M. Lagoa, Simple approximations of semialgebraic sets and their applications to control, Automatica 78 (2017) 110-118.
[13] S. Streif, N. Strobel, R. Findeisen, Inner approximations of consistent parameter sets by constraint inversion and mixed-integer programming, IFAC Proceedings Volumes 46 (31) (2013) 321-326, 12th IFAC Symposium on Computer Applications in Biotechnology.
[14] K. Ball, Ellipsoids of maximal volume in convex bodies, Geometriae Dedicata 41 (2) (1992) 241-250.
[15] M. Tawarmalani, N. V. Sahinidis, A polyhedral branch-and-cut approach to global optimization, Mathematical Programming 103 (2005) 225-249.
[16] A. Mitsos, B. Chachuat, P. I. Barton, Towards global bilevel dynamic optimization, Journal of Global Optimization 45 (1) (2009) 63-93.
[17] P. Tanartkit, L. Biegler, A nested, simultaneous approach for dynamic optimization problems-I, Computers & Chemical Engineering 20 (6) (1996) 735-741, Fifth International Symposium on Process Systems Engineering.
[18] P. Tanartkit, L. Biegler, A nested, simultaneous approach for dynamic optimization problems-II: the outer problem, Computers & Chemical Engineering 21 (12) (1997) 1365-1388.
[19] M. Ruben, S. Daniel, N. Daniel, D. P. Cesar, Coordination of distributed model predictive controllers using price-driven coordination and sensitivity analysis, IFAC Proceedings Volumes 46 (32) (2013) 215-220, 10th IFAC International Symposium on Dynamics and Control of Process Systems.
[20] A. Wächter, L. T. Biegler, On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming, Mathematical Programming 106 (1) (2006) 25-57.
[21] A. V. Fiacco, Y. Ishizuka, Sensitivity and stability analysis for nonlinear programming, Annals of Operations Research 27 (1) (1990) 215-235.
[22] S. Dempe, Bilevel Optimization: Reformulation and First Optimality Conditions, Springer Singapore, 2017, pp. 1-20.
[23] S. Dempe, V. Kalashnikov, G. Pérez-Valdés, N. Kalashnykova, Bilevel Programming Problems: Theory, Algorithms and Applications to Energy Networks, Energy Systems, Springer Berlin Heidelberg, 2015.
[24] S. Dempe, Foundations of Bilevel Programming, Nonconvex Optimization and Its Applications, Springer, Boston, MA, 2002.
[25] M. Kieffer, E. Walter, Guaranteed estimation of the parameters of nonlinear continuous-time models: Contributions of interval analysis, International Journal of Adaptive Control and Signal Processing 25 (3) (2011) 191-207.
[26] J. Dutta, S. Dempe, Bilevel programming with convex lower level problems, Springer US, Boston, MA, 2006, pp. 51-71.
[27] S. Dempe, S. Franke, On the solution of convex bilevel optimization problems, Computational Optimization and Applications 63 (3) (2016) 685-703.
[28] J. Bard, Practical Bilevel Optimization: Algorithms and Applications, Nonconvex Optimization and Its Applications, Springer US, 1998.
[29] A. Mitsos, P. Lemonidis, P. I. Barton, Global solution of bilevel programs with a nonconvex inner program, Journal of Global Optimization 42 (4) (2008) 475-513.
[30] O. Stein, G. Still, On generalized semi-infinite optimization and bilevel optimization, European Journal of Operational Research 142 (3) (2002) 444-462.
[31] J. W. Blankenship, J. E. Falk, Infinitely constrained optimization problems, Journal of Optimization Theory & Applications 19 (2) (1976) 261-281.
[32] J. F. Bard, J. T. Moore, A branch and bound algorithm for the bilevel programming problem, SIAM Journal on Scientific and Statistical Computing 11 (2) (1990) 281-292.
[33] Y. Ishizuka, E. Aiyoshi, Double penalty method for bilevel optimization problems, Annals of Operations Research 34 (1) (1992) 73-88.
[34] O. Walz, H. Djelassi, A. Caspari, A. Mitsos, Bounded-error optimal experimental design via global solution of constrained min-max program, Computers & Chemical Engineering 111 (2018) 92-101.
[35] A. Mitsos, A. Tsoukalas, Global optimization of generalized semi-infinite programs via restriction of the right hand side, Journal of Global Optimization 61 (1) (2015) 1-17.
[36] D. Telen, B. Houska, F. Logist, M. Diehl, J. Van Impe, Guaranteed robust optimal experiment design for nonlinear dynamic systems, in: Control Conference (ECC), 2013 European, IEEE, 2013, pp. 2939-2944.
[37] A. R. Gottu Mukkula, R. Paulen, Model-based design of optimal experiments for nonlinear systems in the context of guaranteed parameter estimation, Computers & Chemical Engineering 99 (2017) 198-213.
[38] A. R. Gottu Mukkula, R. Paulen, Optimal design of dynamic experiments for guaranteed parameter estimation, in: 2016 American Control Conference (ACC), 2016, pp. 1826-1831.

Journal

Statistics, arXiv (Cornell University)

Published: Feb 3, 2019

References