Evaluation of ensemble methods for quantifying uncertainties in steady-state CFD applications with small ensemble sizes

Bayesian uncertainty quantification (UQ) is of interest to industry and academia as it provides a framework for quantifying and reducing the uncertainty in computational models by incorporating available data. For systems with very high computational costs, for instance, computational fluid dynamics (CFD) problems, the conventional, exact Bayesian approach such as Markov chain Monte Carlo is intractable. To this end, ensemble-based Bayesian methods have been used for CFD applications. However, their applicability for UQ has not been fully analyzed and understood thus far. Here, we evaluate the performance of three widely used iterative ensemble-based data assimilation methods, namely the ensemble Kalman filter, the ensemble randomized maximum likelihood method, and the ensemble Kalman filter with multiple data assimilation, for UQ problems. We present the derivations of the three ensemble methods from an optimization viewpoint. Further, a scalar case is used to demonstrate the performance of the three different approaches with emphasis on the effects of small ensemble sizes. Finally, we assess the three ensemble methods for quantifying uncertainties in steady-state CFD problems involving turbulent mean flows. Specifically, the Reynolds-averaged Navier–Stokes (RANS) equation is considered the forward model, and the uncertainties in the propagated velocity are quantified and reduced by incorporating observation data. The results show that the ensemble methods cannot accurately capture the true posterior distribution, but they can provide a good estimation of the uncertainties even when very limited ensemble sizes are used. Based on the overall performance and efficiency from the comparison, the ensemble randomized maximum likelihood method is identified as the best choice of approximate Bayesian UQ approach among the three ensemble methods evaluated here.
Keywords: Uncertainty quantification, Ensemble methods, Data assimilation, Computational fluid dynamics, Small ensemble sizes

Corresponding author: Heng Xiao. Email address: hengxiao@vt.edu. URL: https://www.aoe.vt.edu/people/faculty/xiaoheng.html
Preprint submitted to Computers & Fluids, April 14, 2020. arXiv:2004.05541v1 [physics.comp-ph] 12 Apr 2020

1. Introduction

1.1. Bayesian uncertainty quantification for CFD

In computational fluid dynamics (CFD) applications, Reynolds-averaged Navier–Stokes (RANS) methods are still the workhorse tool for informing important decisions during engineering design processes. However, RANS models cannot provide accurate results for many cases in the presence of complex turbulent flows. That necessitates quantifying uncertainties in the numerical simulations so that we can obtain additional confidence and statistical information on the simulated results [1]. The conventional approach to quantifying uncertainties is to propagate the presumed uncertainty in system inputs forward to the quantities of interest (QoIs) through the forward model. The procedure of uncertainty propagation is illustrated in Fig. 1(a). Numerous methods [2–4] and applications [5–7] have been developed and explored for uncertainty propagation in the literature. Another uncertainty quantification (UQ) method is the Bayesian UQ approach. This approach can account for the available data from high-fidelity simulations or experiments to backwardly quantify and reduce the uncertainty of the QoIs as well as the system inputs (e.g., model parameters or underlying terms) [8]. The procedure of Bayesian UQ is illustrated in the schematic in Fig. 1(b).
Figure 1: Schematic of uncertainty quantification, with panels (a) uncertainty propagation and (b) Bayesian UQ approach; the diagram labels are the input $x$, the forward model, the quantities of interest $\hat{y}$, and the data. (a) Uncertainty propagation forwardly propagates the presumed uncertainty (red dashed line) in the input to the quantities of interest through the forward model; (b) the Bayesian UQ approach combines the prior information with high-fidelity simulation or experimental data to backwardly quantify the posterior uncertainty (blue solid line) in the quantities of interest and in the input.
Numerous works have applied the Bayesian UQ approach to diverse applications, including RANS simulations. Based on the pioneering work of Kennedy and O'Hagan [9], Cheung et al. [10] applied a Bayesian calibration framework to the Spalart–Allmaras turbulence model to calibrate the model parameters by incorporating experimental measurements. They evaluated their approach on boundary layer flows to reduce computational costs and pointed out the necessity of developing tractable UQ approaches for computationally expensive cases. Oliver and Moser [11] further extended the work of Cheung et al. [10] by introducing stochastic representations for the uncertainties in eddy viscosity turbulence models. The uncertainty representations based on the multiplicative error in mean velocity and the additive error in Reynolds shear stress were developed and used for plane channel flows. Edeling et al. [12] proposed a Bayesian model-scenario averaging (BMSA) method to estimate the $k$–$\varepsilon$ turbulence model error for a class of boundary layer flows with different pressure gradients. More recently, Edeling et al. [13] leveraged the maximum a posteriori (MAP) estimate to reduce the computational cost and thus make their BMSA approach applicable to complex flows.

The aforementioned works use the Markov chain Monte Carlo (MCMC) technique, which typically requires at least $O(10^5)$ to $O(10^6)$ samples. However, such sampling would be computationally intractable for the complex flow cases of engineering interest, where uncertainty propagation through the forward model is computationally expensive. In order to reduce the computational cost, the conventional approach is to use surrogate models (e.g., polynomial chaos methods [14–16]) to replace the CFD code. Nevertheless, such approaches are challenging for high-dimensional problems due to the curse of dimensionality. The ensemble technique has been proposed and discussed extensively for UQ problems in the data assimilation community.
It can significantly reduce the sample size to $O(10^2)$ and provide reasonable estimates of posterior uncertainty with limited samples. Therefore, the ensemble methods can potentially play a role as an approximate Bayesian UQ approach for computationally expensive flow cases. The ensemble-based data assimilation methods are discussed further below.

1.2. Ensemble-based data assimilation

Ensemble-based data assimilation has recently increased in popularity and has been applied in diverse contexts including fluid mechanics [17], weather forecasting [18], and geoscience [19] due to its non-intrusiveness and robustness. Among ensemble-based data assimilation methods, the most widely used is the ensemble Kalman filter (EnKF) [20]. It has been extensively used for uncertainty quantification in various applications, such as hydrology [21, 22], meteorology [23, 24], and oceanography [25, 26]. In the past few years, EnKF has also been increasingly leveraged for CFD applications to estimate empirical parameters or functional errors in the RANS closure models. Kato and Obayashi [27] explored the applicability of the EnKF method to estimating the uncertainty in the empirical parameters of the Spalart–Allmaras RANS model. However, due to the strong nonlinearity of the RANS problem, it is necessary to assimilate data iteratively even for the stationary scenario in order to improve the data fit. To this end, Iglesias et al. [28] proposed an iterative form of the standard EnKF as a derivative-free optimization method for inverse problems. In their framework, the analysis step of EnKF iterates in artificial time for stationary systems based on state augmentation. They showed the accuracy of the iterative EnKF for inferring the sample mean with three different cases, but its accuracy in the context of uncertainty quantification has not been fully investigated. Xiao et al.
[8] applied this iterative EnKF to quantify and reduce the RANS model-form uncertainty within the Reynolds stress. They demonstrated that the posterior mean with EnKF could have remarkably good agreement with benchmark data. Readers are referred to the review of Xiao and Cinnella [29] for recent progress in model-form uncertainty quantification in RANS simulations.

For highly nonlinear systems, the ill-posedness of the problem is significantly increased. To search for the optimal point, EnKF takes a full gradient descent step where the forward model is linearized to simplify the problem [30]. That possibly changes the original nonlinear problem and leads to wrong solutions. Considering this issue, several iterative ensemble methods have been proposed and discussed for UQ of nonlinear systems in the data assimilation community. For instance, Gu and Oliver [31] proposed the ensemble randomized maximum likelihood (EnRML) method to iterate the analysis step with the Gauss–Newton algorithm. They demonstrated the superiority of the EnRML method over EnKF for both static and dynamic problems with strong nonlinearity. Chen and Oliver [32] used the EnRML method as an iterative ensemble smoother for the history matching problem. Yang et al. [33] proposed an enhanced ensemble variational method and applied it to unsteady flows. Their method is implemented similarly to EnRML, with an iterative minimization procedure based on the Gauss–Newton algorithm, but the error covariance is updated sequentially based on ensemble analysis. On the other hand, Emerick and Reynolds [19] proposed an ensemble Kalman filter with multiple data assimilation (EnKF-MDA) and demonstrated that it could provide a better data match than EnKF at a comparable computational cost. This method performs Bayesian analysis with recursion of the likelihood by inflating the observation error.
It is worth noting that for unsteady cases, EnKF is usually used as a filtering technique to assimilate the data sequentially in time, while the EnRML method [32] and EnKF-MDA [34] can be used as smoothers to account for all the available data simultaneously. Moreover, for the Gaussian linear case, it has been proven that the EnRML method and EnKF-MDA are equivalent to the EnKF [19, 35]. For the nonlinear case, however, the equivalence does not hold. EnKF can be regarded as a single Gauss–Newton update with a full step. In contrast, the EnRML method and EnKF-MDA perform multiple small corrections, which helps to alleviate the inaccuracy caused by the linearization and better preserves the nonlinearity of the original problem.

The ensemble-based data assimilation methods mentioned above can be derived in a similar manner by solving the minimization problem under several mild assumptions (e.g., the Gaussian distribution, linearization, and ensemble gradient representation) [30]. However, these assumptions may result in a departure of the estimated posterior distribution from the truth. Recently, several authors have investigated the cause of inaccurate uncertainty estimates given by the ensemble methods. For instance, Oliver and Chen [36] reviewed the progress of MCMC, EnKF, and EnRML on the history matching problem. They concluded that the EnRML method could provide a probability distribution in better agreement with MCMC at a low computational cost, as compared to the EnKF method. Ernst et al. [37] examined the EnKF method for nonlinear stationary systems. They demonstrated that EnKF can provide the sample statistics as an indication of uncertainties but is not suitable for rigorous Bayesian inference. Evensen [30] derived and analyzed different ensemble methods from the viewpoint of model gradient representations and compared the analytic gradient with the ensemble representation of the gradient.
He concluded that none of these methods could provide the exact posterior probability density function (PDF) for highly nonlinear models, but they can serve as an indication of the uncertainties at least for weakly nonlinear cases. However, a sufficiently large number of samples is used to obtain accurate statistical estimation in his work, and the performance of these methods with small ensemble sizes is not fully evaluated. These iterative ensemble methods are useful for estimating uncertainties in QoIs in industrial CFD applications, and they warrant further investigation.

1.3. Objective of present work

In this work, we present the derivations of three different iterative ensemble methods, namely iterative EnKF [28] (hereinafter referred to as EnKF for brevity), EnRML, and EnKF-MDA, from the optimization perspective, and compare their performance for quantifying uncertainties in steady-state CFD applications with small ensemble sizes. Moreover, the effect of small ensemble sizes on the performance of each method is evaluated in a scalar case by comparison with the Bayesian distribution from MCMC.

The rest of the paper is structured as follows. In Section 2, we give a brief derivation of the three most commonly used ensemble-based data assimilation methods (EnKF, EnRML, and EnKF-MDA). A scalar case is presented in Section 3 to compare the performance of these methods with different ensemble sizes. In Section 4, a steady flow case is tested to identify the suitable approach to quantifying the uncertainty in the RANS model. Section 5 concludes the paper.

2. Ensemble-based data assimilation methods

Here, we summarize the derivation of the three ensemble-based data assimilation methods (EnKF, EnRML, and EnKF-MDA) from the optimization perspective. For clarity and without loss of generality, we assume a multi-variate state-space model with multiple observations.
This is in contrast to Evensen's work [30], where a single-variate state-space model with a single observation is assumed.

2.1. Minimization problem

Consider that the observation model can be expressed as

$$ y = \mathcal{H}[x] + \epsilon, \tag{1} $$

where $x \in \mathbb{R}^N$ is the state vector or input parameter, $y \in \mathbb{R}^D$ is the observation, $\mathcal{H}: \mathbb{R}^N \to \mathbb{R}^D$ is the model function mapping the state to the observation space, and $\epsilon$ is the added observation noise, which is assumed to be an independent and identically distributed (i.i.d.) Gaussian random vector with zero mean and covariance $R$. We give an initial guess of the PDF of the state, $p(x)$, as the prior knowledge based on the Gaussian assumption. Further, the Bayesian UQ approach can be used to find the posterior distribution conditioned on the observation. Bayes' theorem can be formulated as

$$ p(x \mid y) \propto p(x)\, p(y \mid \mathcal{H}[x]), \tag{2} $$

which states that the posterior distribution $p(x \mid y)$ is proportional to the product of the prior distribution $p(x)$ and the likelihood function $p(y \mid \mathcal{H}[x])$ of the data $y$ conditioned on the model $\mathcal{H}[x]$. With the Gaussian assumption for the prior $p(x)$ and the likelihood $p(y \mid \mathcal{H}[x])$, we can rewrite Bayes' formula in Eq. (2) as

$$ p(x \mid y) \propto p(x)\, p(y \mid \mathcal{H}[x]) \propto e^{-J}, \tag{3} $$

where $J$ is the cost function defined as

$$ J[x^a] = \frac{1}{2} \left(x^a - x^f\right)^\top P^{-1} \left(x^a - x^f\right) + \frac{1}{2} \left(\mathcal{H}[x^a] - y\right)^\top R^{-1} \left(\mathcal{H}[x^a] - y\right). \tag{4} $$

In the formula above, $P$ is the model error covariance, $R$ is the observation error covariance, and the superscripts $a$ and $f$ represent the "analysis" and "forecast", respectively. It is challenging to obtain the true error covariance $P$ in problems with high-dimensional state spaces. The ensemble methods apply the Monte Carlo technique to draw a small number of samples. Such samples can then be used to estimate the ensemble representation of the model error covariance $P$ and the observation error covariance $R$ as

$$ P = \frac{1}{M-1} \left(X - \bar{X}\right)\left(X - \bar{X}\right)^\top, \qquad R = \epsilon \epsilon^\top, \tag{5} $$

where $X = \{x_1, \ldots, x_M\}$. Note that the estimated covariance matrices for the observation error and the state are both symmetric.
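The ensemble estimate of $P$ in Eq. (5) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' code: the function name `ensemble_covariance`, the column-wise storage of the ensemble (shape `(N, M)`), and the random test ensembles are all assumptions made here. The second ensemble hints at why small ensembles are delicate: with $M$ members the estimate has rank at most $M - 1$.

```python
import numpy as np


def ensemble_covariance(X):
    """Sample covariance of an ensemble stored column-wise (shape (N, M)), as in Eq. (5)."""
    M = X.shape[1]
    anomalies = X - X.mean(axis=1, keepdims=True)  # X - X_bar, member by member
    return anomalies @ anomalies.T / (M - 1)


rng = np.random.default_rng(0)

# Hypothetical ensemble: N = 3 state components, M = 20 members.
X = rng.normal(size=(3, 20))
P = ensemble_covariance(X)

# Hypothetical small ensemble: N = 5 state components but only M = 3 members,
# so the estimated covariance cannot have full rank.
X_small = rng.normal(size=(5, 3))
P_small = ensemble_covariance(X_small)

print("P is symmetric:", np.allclose(P, P.T))
print("rank of P_small:", np.linalg.matrix_rank(P_small))
```

The rank deficiency is one reason a pseudo-inverse (for example via SVD) is commonly used whenever an ensemble-based matrix has to be inverted.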
Further, maximum a posteriori (MAP) analysis can be applied to estimate the posterior distribution. That is, maximizing the posterior is equivalent to minimizing the cost function $J$. From this optimization perspective, we can derive the three data assimilation methods, namely EnKF, EnRML, and EnKF-MDA, by minimizing the cost function with different gradient descent techniques.

2.2. EnKF

For steady cases, the traditional EnKF performs the Kalman update only once. It may be difficult to achieve a satisfactory data fit in some scenarios, for instance, where the prior mean is far from the observation data and the system model is strongly nonlinear [28]. To this end, an iterative technique is usually leveraged to adequately assimilate the data and thus improve the data match. We use the iterative form of EnKF proposed by Iglesias et al. [28] to enhance the optimization performance. This method considers the EnKF as a regularized least-squares technique and performs multiple standard Kalman updates sequentially, even for stationary cases. The cost function for each ensemble realization can be written as

$$ J[x^a_{n,j}] = \frac{1}{2} \left(x^a_{n,j} - x^f_{n,j}\right)^\top P_n^{-1} \left(x^a_{n,j} - x^f_{n,j}\right) + \frac{1}{2} \left(\mathcal{H}[x^a_{n,j}] - y\right)^\top R^{-1} \left(\mathcal{H}[x^a_{n,j}] - y\right), \tag{6} $$

where $n$ indicates the iteration number and $j$ denotes the sample index. Based on the cost function (6), the gradient with respect to the state is

$$ \frac{\partial J}{\partial x^a_{n,j}} = P_n^{-1} \left(x^a_{n,j} - x^f_{n,j}\right) + \mathcal{H}'[x^a_{n,j}]^\top R^{-1} \left(\mathcal{H}[x^a_{n,j}] - y\right), \tag{7} $$

which should vanish to minimize the cost function $J$. Therefore, the formulation of EnKF can be derived by setting the gradient (7) of the cost function to zero, which amounts to

$$ P_n^{-1} \left(x^a_{n,j} - x^f_{n,j}\right) = -\mathcal{H}'[x^a_{n,j}]^\top R^{-1} \left(\mathcal{H}[x^a_{n,j}] - y\right), \tag{8} $$

where only the terms $\mathcal{H}'[x^a_j]$ and $\mathcal{H}[x^a_j]$ are unknown. The linearization assumption is introduced to estimate these two unknown terms.
The two terms are linearized as

$$ \mathcal{H}[x^a_j] \approx \mathcal{H}[x^f_j] + \mathcal{H}'[x^f_j] \left(x^a_j - x^f_j\right), \tag{9a} $$

$$ \mathcal{H}'[x^a_j] \approx \mathcal{H}'[x^f_j] + \mathcal{H}''[x^f_j] \left(x^a_j - x^f_j\right), \tag{9b} $$

where the second derivative in Eq. (9b) is neglected for simplicity, assuming the model is moderately nonlinear. With ensemble techniques, the model in observation space is randomized around the mean value $\mathcal{H}[x^f]$. After expanding $\mathcal{H}[x]$ around the ensemble mean $\mathcal{H}[\bar{X}]$, we can represent $\mathcal{H}[x^f_j]$ with the model function gradient as [30]

$$ \mathcal{H}[x^f_j] \approx \mathcal{H}[\bar{X}^f] + \mathcal{H}'[x^f_j] \left(x^f_j - \bar{X}^f\right). \tag{10} $$

We introduce the tangent linear model $\mathcal{H}[x] = \mathsf{H}x$, so that the gradient representation $\mathcal{H}'[x^f]$ can be expressed as the tangent linear operator $\mathsf{H}$ by assuming a linear relationship between the measurement and the state. Accordingly, the update step of EnKF can be derived and formulated as

$$ x^a_{n,j} = x^f_{n,j} + P_n \mathsf{H}^\top \left(R + \mathsf{H} P_n \mathsf{H}^\top\right)^{-1} \left(y_j - \mathsf{H} x^f_{n,j}\right). \tag{11} $$

For practical reasons, one does not usually compute the model operator $\mathsf{H}$ explicitly. Rather, the two terms $P\mathsf{H}^\top$ and $\mathsf{H}P\mathsf{H}^\top$ can be reformulated as

$$ P\mathsf{H}^\top = \frac{1}{M-1} \left(X - \bar{X}\right) \left(\mathcal{H}[X] - \overline{\mathcal{H}[X]}\right)^\top, \tag{12a} $$

$$ \mathsf{H}P\mathsf{H}^\top = \frac{1}{M-1} \left(\mathcal{H}[X] - \overline{\mathcal{H}[X]}\right) \left(\mathcal{H}[X] - \overline{\mathcal{H}[X]}\right)^\top. \tag{12b} $$

In addition, the ensemble observation is adopted based on [38]. That is, we use randomly perturbed observation data for each realization. Further details of the derivation are presented in Appendix A.

We emphasize that the iterative ensemble Kalman method is a specific method for solving inverse problems that is distinct from the conventional EnKF. It regards the ensemble Kalman method as a regularized least-squares technique. For stationary problems, the update step is iterated in pseudo-time to reduce the data misfit. The iterative ensemble Kalman method for uncertainty quantification is discussed further in subsection 2.5.

2.3. EnRML

The ensemble randomized maximum likelihood method [31] updates the initial guess of the state vector iteratively with the Gauss–Newton algorithm.
The cost function can be written as

$$ J[x_{l,j}] = \frac{1}{2} \left(x_{l,j} - x_{0,j}\right)^\top P_0^{-1} \left(x_{l,j} - x_{0,j}\right) + \frac{1}{2} \left(\mathcal{H}[x_{l,j}] - y_j\right)^\top R^{-1} \left(\mathcal{H}[x_{l,j}] - y_j\right), \tag{13} $$

where $x_0$ is the initial guess, $P_0$ is the initially estimated model error covariance before the data assimilation process, and the iteration index $l$ indicates the sub-iteration of the EnRML method. The gradient and Hessian of the cost function (13) can be derived similarly as in EnKF:

$$ \frac{\partial J}{\partial x_{l,j}} = P_0^{-1} \left(x_{l,j} - x_{0,j}\right) + \mathcal{H}'[x_{l,j}]^\top R^{-1} \left(\mathcal{H}[x_{l,j}] - y_j\right), \tag{14a} $$

$$ \frac{\partial^2 J}{\partial x_{l,j}^2} = P_0^{-1} + \mathcal{H}'[x_{l,j}]^\top R^{-1} \mathcal{H}'[x_{l,j}]. \tag{14b} $$

Instead of reaching a zero-gradient minimum directly as in EnKF, the prior $x$ is iteratively updated based on the Gauss–Newton method as

$$ x^a_{l,j} = x^f_{l,j} - \gamma \left(\frac{\partial^2 J}{\partial x_{l,j}^2}\right)^{-1} \frac{\partial J}{\partial x_{l,j}}, \tag{15} $$

where $\gamma$ is the step length parameter. The Gauss–Newton approach can reduce the step length and ease the influence of the linearization assumption during the analysis step. With the gradient (14a) and the Hessian (14b) of the cost function, we can obtain the analysis scheme for the EnRML method as follows:

$$ x^a_{l,j} = \gamma x^f_{0,j} + (1 - \gamma)\, x^f_{l,j} - \gamma P_0 \mathcal{H}'[x^f_{l,j}]^\top \left(R + \mathcal{H}'[x^f_{l,j}]\, P_0\, \mathcal{H}'[x^f_{l,j}]^\top\right)^{-1} \left(\mathcal{H}[x^f_{l,j}] - y_j - \mathcal{H}'[x^f_{l,j}] \left(x^f_{l,j} - x^f_{0,j}\right)\right). \tag{16} $$

In the EnRML method, the model error covariance $P$ remains the initial one, $P_0$, and does not change with the iteration number. Moreover, the sensitivity matrix $\mathcal{H}'[X]$ has to be evaluated at each iteration through

$$ \mathcal{H}'[X_l] \approx \left(\mathcal{H}[X_l] - \overline{\mathcal{H}[X_l]}\right) \left(X_l - \bar{X}_l\right)^{-1}. \tag{17} $$

The singular value decomposition (SVD) is used to estimate the inverse of the non-full-rank matrix. The details of the derivation can be found in Appendix B.

2.4. EnKF-MDA

From the derivation above, each update of EnKF can be regarded as a Gauss–Newton update that uses a full step in the search direction. However, a single global update may not result in a satisfactory data fit. Hence, assimilating the data multiple times is highly desirable to improve the data fit [35]. Moreover,
Moreover, 7 in some cases where the prior mean/ rst guess is far from the truth and the model is highly nonlinear, performing the full Gauss{Newton step may result in the overcorrection and lead to inaccurate solutions. This de ciency can be alleviated to damp the changes in the early iterations [39, 40]. To this end, Emerick and Reynolds [19] proposed EnKF-MDA to assimilate the same data multiple times with an in ated observation error covariance. They have proven that for linear Gaussian cases, the EnKF-MDA is equivalent to the EnKF. For nonlinear cases, the traditional EnKF uses a full Gauss{Newton step with an average sensitivity estimated from the prior ensemble and probably leads to a large Gauss{Newton correction [35]. EnKF-MDA can be regarded as performing multiple small corrections to damp the changes of the model and thus alleviate the e ects of nonlinearity [34]. From the Bayesian perspective, the likelihood function of EnKF-MDA is in a recursive form as mda p(x j y) / p(x) p(y j H[x ]) ; (18) l1 l=1 mda 1 where = 1, N is the total data assimilation iteration number, and can be chosen simply as mda l=1 N . The cost function J can be expressed as: mda 1 1 p p > > a a f 1 a f a a J [x ] = x x P x x + d +  H[x ] ( R) d +  H[x ] ; (19) l l;j l l l;j l;j l;j l;j l l;j l;j l;j l;j 2 2 where d is the measurement without perturbations. The gradient of the cost function is then @J [x ] l;j 1 1 a f 0 a > a = P x x +H [x ] ( R) d +  H[x ] : (20) l l l;j l l;j l;j l;j l;j @x l;j Similar to the derivation of EnKF method, we set the gradient of cost function to zero. Further, with the linearization assumption (9) and ensemble gradient representation (10), we have the update scheme as a f 0 f > f f > f x = x + P H [x ] H[x ]P H[x ] + R d +  H[x ] : (21) l l l l l;j l;j l;j l;j l;j l;j l;j By introducing the tangent linear operator H, we can obtain the analysis step of EnKF-MDA as a f > > f x = x + P H HP H + R d +  Hx . 
(22) l l l l l;j l;j l;j l;j Given the prior distribution of system state and ensemble observations with error covariance matrix R, the implementation for the three data assimilation methods are summarized in Table 1. 2.5. Remarks From the derivations above, we apply the iterative form, linearization assumption, and ensemble gradient representation to obtain the derivative-free analysis scheme. Here, we provide some discussion on the e ects of each issue. 1. Iterative form is necessary to obtain satisfactory inference results for the inverse problem of nonlinear systems. However, the iterative EnKF performs several Gauss{Newton iterations with the full step where data is equally used for stationary systems. While the other two methods conduct partial iterations, and the several sub-iterations are only equivalent to the rst iteration of the iterative EnKF illustrated in Section 2.2. This iterative EnKF may cause the samples to collapse in early iterations and leads to underestimation of uncertainty, since the data is repeatedly used. Moreover, as the model error 8 Table 1: Schematic comparison of EnKF, EnRML and EnKF-MDA EnKF EnKF-MDA EnRML a. sampling step: a. sampling step: generate initial ensemble state vectors fx g 1. generate initial ensemble state 0;j j=1 vectors fx g ; 0;j j=1 2. estimate the mean X and model error covariance P of the ensemble. b. prediction step: b. prediction step: i) Propagate from current state l 1 to next iter- i) Propagate from current state l 1 ation level l based on forward model (l > 0). to next iteration level l based on for- ward model (l > 0). f a f a x = F [x ] x = F [x ] l;j l1;j l;j l1;j ii) Estimate the ensemble mean X and model error ii) Estimate the ensemble model gra- covariance P of the current iteration. dient by (17). c. analysis step c. analysis step c. 
analysis step update the state vector by update the state vector by update the state vector by (16) and re- (11) and return to step b (22) and return to step b turn to step b until the convergence cri- until the convergence cri- until the convergence cri- teria are reached. teria are reached. teria are reached. covariance for the next iteration becomes very small, the rst term in the cost function (6) prescribing the prior distribution will be dominant. That means the data assimilation analysis does not take e ect, and the update only depends on the prior afterward. In contrast, the EnRML method and EnKF-MDA iterate the update step through the Gauss{Newton algorithm and likelihood recursion, respectively, which can avoid the data overuse and sample collapse. 2. The linearization assumption is introduced in our derivation for simpli cation. However, for strongly nonlinear systems, the linear assumption may signi cantly a ect the optimal solution and lead to inaccurate inference results. EnKF takes a full update step to the optimal point, while the EnRML method and EnKF-MDA split one EnKF step by several small steps through Gauss{Newton method and likelihood recursion, respectively. From this perspective, the EnRML method and EnKF-MDA can alleviate the in uence of linearization and partly preserve the nonlinearity. Therefore, the EnRML method and EnKF-MDA are more suitable for the uncertainty quanti cation of stationary nonlinear systems than the iterative EnKF. 3. Another assumption, ensemble gradient representation, is leveraged in the ensemble-based DA methods as presented in our derivations. That is, the model gradient is approximated by ensemble realizations and is not derived analytically. This may cause the propagated posterior distribution to depart from the exact Bayesian distribution [30]. 
While the impact of linearization can be alleviated through the Gauss–Newton algorithm or reduced likelihood recursion, the effects of the ensemble gradient representation are inevitable for ensemble methods unless the adjoint method is used to compute the analytic gradient.

Moreover, the EnRML method and EnKF-MDA introduce a step length parameter and an inflation parameter $N_{mda}$, respectively, which control the length of the update step. They can be constant or adapted based on the convergence behavior. Specifically, if the discrepancy in observation space is larger than that in the last iteration, we can reduce the step length by decreasing the step length parameter or increasing the inflation parameter $N_{mda}$. Conversely, if the discrepancy is reduced, we can increase the step length parameter in EnRML or reduce $N_{mda}$ in EnKF-MDA to speed up the convergence [34].

3. Scalar case

We first test the three ensemble-based Bayesian UQ approaches derived in Section 2 on a simple case used by Evensen [30]. In his work, the effects of the model gradient representation are investigated with a sufficiently large sample size. Here, we focus on the effects of limited ensemble sizes and evaluate the performance of the iterative ensemble methods with small sample sizes. In this case, the computing time for the forward model is negligible. Hence, we can obtain the Bayesian posterior from MCMC and from ensemble methods with a large sample size for comparison.

3.1. Problem statement

The forward model is defined as:

$$\hat{y} = 1 + \sin(\pi x) + q, \tag{23}$$

where $x$ is the state variable, $\hat{y}$ is the model output in observation space, and $q$ is the added model error with $q \sim \mathcal{N}(0, 0.03^2)$. The goal is to quantify and reduce the uncertainty of $x$ and $\hat{y}$ with Bayesian approaches. The Bayesian UQ approach needs statistical information on the prior state and the observation data. We assume that the state variable $x$ and data $y$ both follow Gaussian distributions, $x \sim \mathcal{N}(0, 0.1^2)$ and $y \sim \mathcal{N}(1, 0.1^2)$.
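To make the problem setup concrete, the following is a minimal sketch of the forward model (23) and of the prior and observation ensembles. It is not the authors' code from the repository [42]: the function names are hypothetical, the values 0.1 and 0.03 are read as standard deviations, and the model error $q$ is assumed to be drawn independently at every forward evaluation.

```python
import numpy as np

# Assumption: 0.1 and 0.03 are interpreted as standard deviations.
X_PRIOR_MEAN, X_PRIOR_STD = 0.0, 0.1
Y_OBS_MEAN, Y_OBS_STD = 1.0, 0.1
Q_STD = 0.03


def forward_model(x, rng):
    """Model output in observation space, y = 1 + sin(pi * x) + q."""
    x = np.asarray(x, dtype=float)
    # Assumption: a fresh model-error sample q is drawn for every evaluation.
    q = rng.normal(0.0, Q_STD, size=x.shape)
    return 1.0 + np.sin(np.pi * x) + q


def sample_prior(n_samples, rng):
    """Draw the prior ensemble of the state variable x."""
    return rng.normal(X_PRIOR_MEAN, X_PRIOR_STD, size=n_samples)


def sample_observations(n_samples, rng):
    """Draw one perturbed observation per ensemble member."""
    return rng.normal(Y_OBS_MEAN, Y_OBS_STD, size=n_samples)


rng = np.random.default_rng(0)
x_prior = sample_prior(100, rng)        # small ensemble of 10^2 members
y_prior = forward_model(x_prior, rng)   # prior ensemble in observation space
y_obs = sample_observations(100, rng)   # ensemble of observations
```

Passing the random generator explicitly keeps the sketch reproducible and makes it easy to rerun the same setup with a different ensemble size.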
We set the step length parameter in the EnRML method to 0.5 and the inflation parameter $N_{mda}$ in EnKF-MDA to 30 to obtain converged results. The performance of the ensemble methods is assessed with two different ensemble sizes, $10^6$ and $10^2$, and the effects of small ensemble sizes on the propagated uncertainties are investigated. We run Markov chain Monte Carlo (MCMC) with $10^6$ samples using the DREAM algorithm [41] and consider the results as the gold standard. The probability density in this case is estimated from the samples through kernel density estimation (KDE) with a Gaussian kernel.

From the derivation in Section 2, it has been noted that two assumptions (linearization and ensemble gradient representation) are introduced to obtain the derivative-free analysis step. The model gradient can be represented by the analytic gradient or estimated from the ensemble samples. Although the analytic model gradient can give more accurate results than the ensemble gradient representation [30], it is not practical for complex models and is beyond the scope of this work. Here, we focus on the ensemble gradient and also investigate the effects of ensemble size on the ensemble gradient. The Python code for this test case is provided in a publicly available GitHub repository [42].

3.2. Results

We first evaluate the performance of each ensemble method with a large ensemble size $M = 10^6$. The joint and marginal PDFs obtained with the different ensemble methods are shown in Fig. 2 and Fig. 3, respectively. The results show that all three ensemble methods can capture the posterior mean. However, the iterative EnKF method is clearly overconfident in the mean value and significantly underestimates the posterior variance compared to the exact Bayesian distribution from MCMC. In contrast, both the EnRML method and EnKF-MDA provide an estimate of the posterior distribution in good agreement with the benchmark data.
This is not surprising since the iterative EnKF repeatedly uses the same data, while the EnRML method and EnKF-MDA avoid data overuse by introducing the Gauss–Newton method or the observation error inflation, as we remarked in Section 2. To summarize, with a large ensemble size, the EnRML method and EnKF-MDA perform comparably to MCMC, while EnKF significantly underestimates the posterior uncertainty because the data are used repeatedly.

Figure 2: Joint PDFs with $10^6$ samples, compared among Bayes (MCMC), EnKF, EnRML, and EnKF-MDA for the scalar case. Panels: (a) MCMC, (b) EnKF, (c) EnRML, (d) EnKF-MDA.

Figure 3: Marginal PDFs for x with $10^6$ samples, compared among MCMC, EnKF, EnRML, and EnKF-MDA for the scalar case.

Further, we explore the effects of a small ensemble size on this case and evaluate which method outperforms the others with limited samples. For many realistic cases, propagation with a large ensemble size is computationally prohibitive, and ensemble methods can typically use fewer than $10^2$ samples to describe the statistical information. Therefore, we set the ensemble size to $10^2$, and the other settings are consistent with the previous case. The joint PDF results with the different ensemble methods are shown in Fig. 4. With the limited ensemble size, the iterative EnKF method performs similarly as with the large ensemble size. Specifically, all samples converge to the observations and the posterior distribution has a low variance. By contrast, the EnRML method and EnKF-MDA not only capture the posterior mean value but also provide statistical information that indicates the uncertainty through the ensemble realizations. For better visualization, the marginal PDFs of the three ensemble methods with $10^2$ samples are shown in Fig. 5. We can see that the EnRML method and EnKF-MDA give satisfactory estimates of the uncertainty, while the mode value with EnKF is approximately three times higher than that with MCMC.
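The contrast between reusing the data at full weight and inflating the observation error can be illustrated with a short sketch. The update forms below are assumptions: a textbook perturbed-observation Kalman update repeated with the same data stands in for Eq. (11), and a multiple-data-assimilation loop with constant inflation $N_{mda}$ stands in for Eq. (22). The function names, the iteration count of 10, and the ensemble size of 2000 are hypothetical, and the EnRML update is omitted.

```python
import numpy as np


def forward(x, rng, q_std=0.03):
    """Scalar forward model y = 1 + sin(pi * x) + q (q redrawn per call)."""
    return 1.0 + np.sin(np.pi * x) + rng.normal(0.0, q_std, size=x.shape)


def kalman_update(x, y, d_pert, r_var):
    """Scalar ensemble Kalman update built from sample covariances."""
    c_xy = np.cov(x, y)[0, 1]          # state/output cross-covariance
    c_yy = np.var(y, ddof=1)           # output variance
    gain = c_xy / (c_yy + r_var)
    return x + gain * (d_pert - y)


def iterative_enkf(x0, d, r_std, n_iter, rng):
    """Assumed form: repeat the full update with the same data each time."""
    x = x0.copy()
    for _ in range(n_iter):
        y = forward(x, rng)
        d_pert = d + rng.normal(0.0, r_std, size=x.shape)
        x = kalman_update(x, y, d_pert, r_std**2)
    return x


def enkf_mda(x0, d, r_std, n_mda, rng):
    """Assumed form: n_mda updates, observation error inflated by n_mda."""
    x = x0.copy()
    alpha = float(n_mda)
    for _ in range(n_mda):
        y = forward(x, rng)
        d_pert = d + np.sqrt(alpha) * rng.normal(0.0, r_std, size=x.shape)
        x = kalman_update(x, y, d_pert, alpha * r_std**2)
    return x


rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 0.1, size=2000)   # prior ensemble
x_enkf = iterative_enkf(x0, d=1.0, r_std=0.1, n_iter=10, rng=rng)
x_mda = enkf_mda(x0, d=1.0, r_std=0.1, n_mda=30, rng=rng)
print("posterior std, iterative EnKF:", np.std(x_enkf))
print("posterior std, EnKF-MDA:      ", np.std(x_mda))
```

Under these assumptions the repeated full updates shrink the ensemble spread well below that of the inflated updates, which is the qualitative behavior described above; the numbers themselves depend on the assumed update forms and should not be read as a reproduction of the figures.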
Generally, with a limited ensemble size, EnKF performs similarly as with a large ensemble size and underestimates the posterior variance. The performance of EnRML and EnKF-MDA is still satisfactory but inferior to that with larger ensemble sizes. Not surprisingly, the estimated uncertainty with a limited ensemble size deviates slightly from the distribution obtained with MCMC. It is likely that the limited number of samples is insufficient for describing the necessary statistics. This may also increase the error in estimating the model gradient, especially for nonlinear models. For illustration, we present the prior joint PDF with the large and the small ensemble size in Fig. 6. The small ensemble size is clearly not sufficient to describe the prior distribution.

Figure 4: Joint PDFs with $10^2$ samples, compared among MCMC, EnKF, EnRML, and EnKF-MDA for the scalar case. Panels: (a) MCMC, (b) EnKF, (c) EnRML, (d) EnKF-MDA.

Figure 5: Marginal PDFs for x with $10^2$ samples, compared among MCMC, EnKF, EnRML, and EnKF-MDA for the scalar case.

Additionally, we compare the model gradient estimated from the ensemble samples with the analytic gradient. The analytic gradient of this model is $\pi\cos(\pi x)$, and the ensemble gradient can be represented by $\frac{\sin(\pi x) - \sin(\pi \bar{x})}{x - \bar{x}}$. The sine function can be approximated as a linear model in the range close to zero, and thus we assume that

$$\sin(\pi x) - \sin(\pi \bar{x}) \approx \sin(\pi(x - \bar{x})), \quad x \to 0, \tag{24}$$

and further

$$\lim_{x - \bar{x} \to 0} \frac{\sin(\pi(x - \bar{x}))}{x - \bar{x}} = \pi\cos(\pi(x - \bar{x})). \tag{25}$$

Based on these formulas, if the samples are close to $\bar{x}$ and the sample mean $\bar{x}$ is estimated as zero, the ensemble gradient approximates the analytic one as

$$\frac{\sin(\pi x) - \sin(\pi \bar{x})}{x - \bar{x}} \overset{(24)}{\approx} \frac{\sin(\pi(x - \bar{x}))}{x - \bar{x}} \overset{(25)}{\approx} \pi\cos(\pi(x - \bar{x})) \overset{\bar{x} \approx 0}{\approx} \pi\cos(\pi x).$$

Given that the model gradient does not follow a Gaussian distribution, we use the cosine kernel to estimate the probability density, as shown in Fig. 7.
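The two gradients compared above can be evaluated sample by sample with a few lines of NumPy. This is a sketch under stated assumptions: the function names are hypothetical, the prior $x \sim \mathcal{N}(0, 0.1^2)$ is used to draw the samples, and the printed median gap is only an illustrative summary, not a quantity reported in the paper.

```python
import numpy as np


def analytic_gradient(x):
    """Analytic gradient of 1 + sin(pi * x): pi * cos(pi * x)."""
    return np.pi * np.cos(np.pi * x)


def ensemble_gradient(x):
    """Per-sample secant slope (sin(pi x) - sin(pi x_mean)) / (x - x_mean)."""
    x = np.asarray(x, dtype=float)
    x_mean = x.mean()
    return (np.sin(np.pi * x) - np.sin(np.pi * x_mean)) / (x - x_mean)


rng = np.random.default_rng(0)
for n_samples in (10**2, 10**6):
    x = rng.normal(0.0, 0.1, size=n_samples)   # prior samples of the state
    gap = np.abs(ensemble_gradient(x) - analytic_gradient(x))
    print(n_samples, "samples, median |ensemble - analytic| =", np.median(gap))
```

The arrays returned by the two functions are the per-sample gradient values; they could be passed to a density estimator with a cosine kernel to produce a comparison of the kind shown in Fig. 7.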
The difference between the analytic gradient and the ensemble gradient is noticeably reduced with the large ensemble size. The discontinuity in the case with $10^2$ samples is mainly due to the limited number of ensemble realizations, which is insufficient to represent the underlying continuous distribution.

A small ensemble size can significantly reduce the computational cost but may lead to additional errors in the statistical description and the model gradient estimation. To ensure that the error remains within an acceptable range, the ensemble size needs to be chosen through numerical tests. However, for highly nonlinear systems the error in the model gradient estimation will not be reduced by a large ensemble size unless the analytic gradient is adopted. Also, localization techniques [43] can be introduced to reduce the sampling error and need future investigation.

Figure 6: Results of prior joint PDF with large ($10^6$) and small ($10^2$) ensemble size for the scalar case. Panels: (a) with $10^6$ samples, (b) with $10^2$ samples.

4. RANS equation

CFD is of significant importance for many engineering applications to inform the process of design, analysis, and optimization. Considering the computational cost, the RANS model is still the primary tool to characterize turbulence behavior in CFD simulations. However, the unknown Reynolds stress term in the RANS equations is commonly solved with different closure models under the Boussinesq assumption. This assumption introduces model uncertainty and reduces the confidence in the predictive performance. In this section, we apply the three ensemble-based data assimilation methods (EnKF, EnRML, and EnKF-MDA) to the RANS closure problem and evaluate their performance in quantifying and reducing the uncertainty of the predicted velocity by incorporating high-fidelity data.
Figure 7: Comparison of analytic gradient and ensemble gradient (probability density versus model gradient). The light/pink shaded region represents the analytic gradient and the dark/blue shaded region represents the ensemble gradient. (a): $10^6$ samples; (b): $10^2$ samples.

4.1. Problem statement

The RANS equations can be expressed as:

$$\frac{\partial U_i}{\partial x_i} = 0 \tag{26a}$$

$$\frac{\partial U_i}{\partial t} + \frac{\partial (U_i U_j)}{\partial x_j} = -\frac{\partial P}{\partial x_i} + \frac{1}{Re}\frac{\partial^2 U_i}{\partial x_j \partial x_j} - \frac{\partial \overline{u_i' u_j'}}{\partial x_j}, \tag{26b}$$

where $U$ and $P$ are the dimensionless velocity and pressure, respectively, and $Re$ is the Reynolds number. In the momentum equation (26b), $\tau = \overline{u_i' u_j'}$ is the Reynolds stress, which is the main source of uncertainty in RANS simulations. We regard the Reynolds stress from the RANS simulation coupled with the linear eddy-viscosity model as the baseline. Further, we introduce the discrepancy term $\delta_\tau$ representing the uncertainty into the baseline as

$$\tau = \tau^{RANS} + \delta_\tau. \tag{27}$$

Thus, we can quantify the uncertainty in the predicted velocity with the three ensemble-based DA methods by incorporating available observation data.

4.2. Methodology

The data assimilation framework to quantify and reduce the RANS model-form uncertainty associated with the Reynolds stress was proposed by Xiao et al. [8]. Here, we give a brief introduction to this methodology, and the reader is referred to [8] for further details. To quantify the uncertainty within the Reynolds stress, we first transform the Reynolds stress tensor into several scalar fields.
Specifically, the Reynolds stress tensor can be expressed as

$$\tau = 2k\left(\frac{1}{3}I + a\right) = 2k\left(\frac{1}{3}I + V \Lambda V^{T}\right),$$

where $k$ is the turbulent kinetic energy, indicating the magnitude of the Reynolds stress, $I$ is the second-order identity tensor, and $a$ is the anisotropy tensor; $V = [v_1, v_2, v_3]$ and $\Lambda = \mathrm{diag}[\lambda_1, \lambda_2, \lambda_3]$ with $\lambda_1 + \lambda_2 + \lambda_3 = 0$ are the eigenvectors and eigenvalues of $a$, which represent the orientation and shape of $\tau$, respectively.

Figure 8: Mapping between the barycentric coordinate and the natural coordinate. Panels: (a) Barycentric coordinate, a triangle with vertices 1C (1-component), 2C (2-component axisymmetric), and 3C (3-component isotropic), with the plane strain state marked; (b) Natural coordinate.

Afterwards, the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ are projected to a barycentric coordinate as

$$C_1 = \lambda_1 - \lambda_2,$$
$$C_2 = 2(\lambda_2 - \lambda_3),$$
$$C_3 = 3\lambda_3 + 1,$$

with $C_1 + C_2 + C_3 = 1$ [44]. The barycentric coordinate is shown in Fig. 8a. To facilitate the parameterization, the barycentric coordinate is transformed to the natural coordinate $\boldsymbol{\xi} = (\xi, \eta)$ by placing the triangle in a Cartesian coordinate system as shown in Fig. 8b. The location of any point in the triangle can be expressed as a combination of those of the three vertices. That is,

$$\boldsymbol{\xi} = \boldsymbol{\xi}_{1c} C_1 + \boldsymbol{\xi}_{2c} C_2 + \boldsymbol{\xi}_{3c} C_3, \tag{28}$$

where $\boldsymbol{\xi}_{1c}$, $\boldsymbol{\xi}_{2c}$, and $\boldsymbol{\xi}_{3c}$ are the coordinates of the three vertices of the triangle.

In conclusion, we represent the Reynolds stress baseline $\tau^{RANS}$ with three scalar variables $k^{RANS}$, $\xi^{RANS}$, and $\eta^{RANS}$ through eigendecomposition and coordinate conversion. Further, the additive uncertainties $\delta^{k}$, $\delta^{\xi}$, and $\delta^{\eta}$ can be injected into these projected variables as

$$\log k(x) = \log k^{RANS}(x) + \delta^{k}(x), \tag{29a}$$
$$\xi(x) = \xi^{RANS}(x) + \delta^{\xi}(x), \tag{29b}$$
$$\eta(x) = \eta^{RANS}(x) + \delta^{\eta}(x), \tag{29c}$$

where the logarithm of $k$ ensures non-negativity.

The dimension of the variables $\log k(x)$, $\xi(x)$, and $\eta(x)$ is consistent with the mesh grid. Inferring the entire field from very sparse observations significantly increases the ill-posedness of the problem. Hence, it is necessary to reduce the dimension of the state space. In this case, we leverage the Karhunen–Loève (KL) expansion with truncated orthogonal modes to represent the field of each quantity to be inferred. Concretely, the discrepancy variables $\delta^{k}$, $\delta^{\xi}$, and $\delta^{\eta}$ are constructed as random fields subject to the zero-mean Gaussian process $\mathcal{GP}(0, K)$. The kernel function $K$ indicates the covariance at two locations $x$ and $x'$ as

$$K(x, x') = \sigma(x)\sigma(x') \exp\left(-\frac{|x - x'|^2}{l^2}\right). \tag{30}$$

In the formula above, $\sigma(x)$ is the variance field, which reflects the region where a large discrepancy is expected. $l$
The KL modes take the form as:  (x) =  (x), where  and  are the i i eigenvalues and eigenvectors of the kernel K, respectively, computed based on the Fredholm integral as 0 0 0 ^ ^ ^ K(x; x )(x )dx = (x). (31) This choice of KL modes for the discrepancy elds  ,  ,  leads to a KL expansion. That is, the discrepancy variables can be constructed from these deterministic KL modes (x) and zero-mean, uni-variance random variable ! as k k (x) = !  (x); i=1 (x) = !  (x); (32) i=1 (x) = !  (x): i=1 With ! and KL modes (x), we can reconstruct the eld of each discrepancy quantity and recover the random eld of Reynolds stress tensor. The Reynolds stress representation and dimension reduction presented above makes it practical to quan- tify and reduce the uncertainty in the RANS model by incorporating observation data, i.e., direct numerical simulation (DNS) results. From a Bayesian perspective, the random noise   (0;  ) is added in time- obs averaged DNS data y to allow overlap between the likelihood and the prior distribution. Herein the obs is the standard deviation of observation noise, indicating the noise level. We take the velocity as the state augmented with the KL coecients. As a result, we can adopt the iterative ensemble methods (EnKF, EnRML, and EnKF-MDA) to quantify and reduce the uncertainty in velocity with prior samples of the KL coecient and the observation. In summary, the procedure of the RANS model-form uncertainty quanti cation framework is presented below: 1. Preprocessing step: RANS (1) Perform RANS simulation to obtain  as the baseline. RANS RANS RANS RANS (2) Project  onto the eld of k ,  , and  . (3) Conduct KL expansion to generate the KL basis sets or modes f (x)g , where m is the number i=1 of truncated modes. (4) Generate the initial value of ! with a zero-mean uni-variance normal distribution. 2. Data assimilation step: (a) Recover the discrepancy elds of  ,  , and  with coecient ! and basis sets (x) based on Eq. (32). 
(b) Reconstruct the ensemble realizations of $\tau$ through the mapping $(k, \xi, \eta) \to \tau$ and solve the RANS equation to obtain the velocity field given each realization of $\tau$.
(c) Perform the Bayesian analysis with the data assimilation technique to reduce the uncertainty of velocity by incorporating time-averaged DNS data.
(d) Return to step (a) until the convergence criteria or the maximum iteration number is reached.

Figure 9: The structured mesh used for the simulation of flow over the periodic hills.

4.3. Case setup

The test case is the turbulent flow over periodic hills initially proposed by Fröhlich et al. [45]. The Reynolds number based on the bulk velocity and the height of the crest is 2800. We use the DNS data from [46] as the benchmark. The Launder–Sharma RANS model [47] is one of the classical low-Reynolds-number $k$–$\varepsilon$ models and is extensively used in industrial applications. Hence, we use the RANS simulation with this model as the baseline. The periodic boundary condition is imposed on the inlet, and the no-slip boundary condition is applied on the wall. A structured mesh is constructed with 50 cells in the stream-wise direction and 30 cells in the wall-normal direction, as shown in Fig. 9. Despite the coarse mesh, the dimensionless distance $y^+$ between the first cell and the walls is around 1, which meets the requirement of the Launder–Sharma turbulence model.

As for the data assimilation setup, the number of KL modes for $k$, $\xi$, and $\eta$ is set to 8. The ensemble size is 50. The length scale is set to a constant value of 1 for simplicity. The standard deviation of the observation noise $\sigma_{obs}$ is set to 10% of the truth. We take 18 observations to quantify and reduce the uncertainty in velocity. The locations are marked in Fig. 10. The step parameter in the EnRML method is chosen as 0.5, and the inflation parameter $N_{mda}$ in EnKF-MDA is set to 50 to obtain converged results based on our calibration study.
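The KL-based construction of a discrepancy field described in Section 4.2 (Eqs. (30) to (32)), with the 8 modes and unit length scale of this setup, might be sketched as follows. This is a simplified illustration rather than the solver used in the paper: it assumes a uniform one-dimensional grid in place of the CFD mesh, a constant variance field, uniform quadrature weights for the Fredholm problem, and hypothetical function names.

```python
import numpy as np


def kl_modes(x, sigma, length_scale, n_modes):
    """Leading KL modes sqrt(lambda_i) * phi_i on a uniform 1D grid."""
    x = np.asarray(x, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), x.shape)
    dx = x[1] - x[0]                                   # uniform spacing
    diff = x[:, None] - x[None, :]
    # Kernel of Eq. (30): sigma(x) sigma(x') exp(-|x - x'|^2 / l^2)
    kernel = sigma[:, None] * sigma[None, :] * np.exp(-diff**2 / length_scale**2)
    # Discretised Fredholm problem (31) with quadrature weight dx
    eigvals, eigvecs = np.linalg.eigh(kernel * dx)
    order = np.argsort(eigvals)[::-1][:n_modes]        # largest eigenvalues first
    eigvals = np.clip(eigvals[order], 0.0, None)       # remove round-off negatives
    eigfuncs = eigvecs[:, order] / np.sqrt(dx)         # so that sum(phi^2) * dx = 1
    return eigfuncs * np.sqrt(eigvals), kernel


def sample_field(modes, rng):
    """One realization of Eq. (32): sum_i omega_i * phi_i(x)."""
    omega = rng.standard_normal(modes.shape[1])        # zero mean, unit variance
    return modes @ omega


# Hypothetical 1D coordinate standing in for the mesh cell centres
x = np.linspace(0.0, 9.0, 91)
modes, kernel = kl_modes(x, sigma=1.0, length_scale=1.0, n_modes=8)

rng = np.random.default_rng(0)
ensemble = np.stack([sample_field(modes, rng) for _ in range(50)], axis=1)
print(modes.shape, ensemble.shape)
```

Keeping the full set of modes reproduces the kernel exactly on the grid, so truncating to a handful of modes is what provides the dimension reduction; in the sketch each ensemble member is then described by 8 coefficients instead of one value per grid point.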
For this case, MCMC sampling is impractical for verifying the estimated posterior uncertainty, due to the high dimensionality of the state space and the high cost of the numerical simulation.

The built-in solver simpleFoam in OpenFOAM is used to run the RANS simulation and obtain the baseline/prior mean. The forward solver tauFoam is developed based on simpleFoam to propagate the Reynolds stress to velocity. That is, the forward solver computes the velocity with the given Reynolds stress field rather than using turbulence models.

4.4. Results

By solving the RANS equations given the randomized Reynolds stresses, we can obtain the prior uncertainty in the propagated velocity. The prior stream-wise velocity is plotted in Fig. 10. It can be seen that the space spanned by the ensemble realizations indicates the statistical information. Also, the sample mean fits the RANS results well. That is reasonable since the random field is constructed by perturbing the baseline from the RANS simulation.

Figure 10: Prior ensemble realization of stream-wise velocity profiles at 18 locations, in comparison to DNS and baseline. The location of observation is indicated with crosses (×). Axes: $x/H;\ 2U_x/U_b + x/H$ (horizontal), $y/H$ (vertical). Legend: DNS, baseline, samples, sample mean, observations.

Further, we perform the data assimilation analysis with EnKF, the EnRML method, and EnKF-MDA to quantify uncertainties in the velocity field by incorporating the observations at the specified locations. The results with the different data assimilation schemes are presented in Fig. 11. With EnKF, the posterior mean fits the DNS results well. However, all samples converge to the mean value, and the variance of the posterior becomes very low. By contrast, the EnRML method gives an estimate of the uncertainty, and its mean value also fits the DNS data well.
EnKF-MDA can also preserve the sample variance and improve the data fit, but its sample mean is inferior to that of the other two methods. Based on our derivation and evaluation in the previous sections, the behavior of EnKF is likely due to its repeated use of the same DNS data with full Gauss–Newton steps, while the EnRML method and EnKF-MDA can be considered to perform one EnKF step via several small analysis steps.

Here we compare the 95% credible intervals of the prior and of the posterior obtained with the three data assimilation methods. The results are shown in Fig. 12. The posterior uncertainty with EnKF is clearly underestimated, and too much confidence is placed in the mean value. With the EnRML method and EnKF-MDA, we obtain an estimate of the uncertainty indicated by the samples. Besides, the uncertainty in the upper channel estimated by the EnRML method and EnKF-MDA is similar to the prior. That is reasonable since the variance field $\sigma$ in this region is low [8] and no observations are available there. Hence, the posterior should not change much from the prior distribution. Based on the overall performance, the iterative EnKF loses the statistical information due to data overuse, while the other two methods provide reasonable uncertainty information.

We also compare the convergence speed of the three data assimilation methods. The convergence criteria for the three methods are different. Concretely, EnKF and the EnRML method are considered converged when the iteration residual in data misfit between two adjacent iterations is less than a prescribed tolerance (of the form $1 \times 10^{-n}$), while EnKF-MDA has to reach the predefined maximum iteration number $N_{mda}$. In our numerical tests, EnKF does not converge and stops at the maximum iteration number of 100. Fig.
13 presents the evolution of y/H y/H DNS baseline samples sample mean observations 3.0 2.5 2.0 1.5 1.0 0.5 0.10 0 2 4 6 8 10 x/H; 2U /U + x/H x b 0 2 4 6 8 10 x/H; 2U /U + x/H x b (a) EnKF 0 2 4 6 8 10 x/H; 2U /U + x/H x b (b) EnRML 0 2 4 6 8 10 x/H; 2U /U + x/H x b (c) EnKF-MDA Figure 11: Data assimilation results of stream-wise velocity with EnKF, EnRML, and EnKF-MDA in comparison to baseline and DNS for the turbulent ow in a periodic hill. y/H y/H y/H y/H Prior Posterior DNS 3.0 2.5 2.0 1.5 1.0 0.5 0.0 0 2 4 6 8 10 0 2 4 6 8 10 x/H; 2U /U + x/H x b x/H;2U /U + x/H x b (a) EnKF 0 2 4 6 8 10 x/H; 2U /U + x/H x b (b) EnRML 0 2 4 6 8 10 x/H; 2U /U + x/H x b (c) EnKF-MDA Figure 12: The 95% credible intervals of the prior (light/pink shaded region) and posterior (dark/blue shaded region) samples of stream-wisevelocity pro les for the turbulent ow in a periodic hill y/H y/H y/H y/H (a) EnRML (b) EnKF-MDA Figure 13: Convergence plot of EnRML and EnKF-MDA with 50 samples iteration residual for the EnRML method and the convergence plot of the maximum iteration number N mda for EnKF-MDA with 50 samples. It can be seen that the EnRML method converges in 8 iterations, while EnKF-MDA need at least 50 iterations to converge in the maximum iteration number N , which suggests mda that EnRML outperforms the EnKF-MDA in convergence speed. Further, we conduct the data assimilation with 200, 800, and 3200 samples to investigate the e ects of DNS sample size. We use relative data mis t between posterior mean HX and truth U normed by that of prior, to evaluate the posterior results. It can be formulated as DNS k H[X ] U k i L2 . (33) DNS k H[X ] U k 0 L2 Also, the relative standard deviation of the posterior samples is computed in a similar manner to evaluate the reduction of uncertainty after assimilating observation data. The results with di erent samples are summarized in Table 2. 
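The two diagnostics described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the array layout (observation points along rows, ensemble members along columns), the function names, and the synthetic data are hypothetical, and the text only says the relative standard deviation is computed "in a similar manner", so the definition used here (ratio of L2 norms of pointwise sample standard deviations) is a guess at one reasonable reading.

```python
import numpy as np


def relative_misfit(hx_post, hx_prior, u_ref):
    """Ratio of L2 misfits of the ensemble means, in the spirit of Eq. (33).

    hx_post, hx_prior: ensembles mapped to observation space, shape (n_obs, n_samples).
    u_ref: reference data (e.g. DNS values at the same points), shape (n_obs,).
    """
    post_err = np.linalg.norm(hx_post.mean(axis=1) - u_ref)
    prior_err = np.linalg.norm(hx_prior.mean(axis=1) - u_ref)
    return post_err / prior_err


def relative_std(hx_post, hx_prior):
    """Assumed analogue for the spread: ratio of the L2 norms of the
    pointwise sample standard deviations (posterior over prior)."""
    post_std = hx_post.std(axis=1, ddof=1)
    prior_std = hx_prior.std(axis=1, ddof=1)
    return np.linalg.norm(post_std) / np.linalg.norm(prior_std)


# Synthetic check: the "posterior" halves both the mean error and the spread.
rng = np.random.default_rng(0)
dev = rng.standard_normal((4, 50))
dev -= dev.mean(axis=1, keepdims=True)  # zero-mean deviations at each point
u_ref = np.zeros(4)
hx_prior = 2.0 + dev
hx_post = 1.0 + 0.5 * dev

misfit_ratio = relative_misfit(hx_post, hx_prior, u_ref)
std_ratio = relative_std(hx_post, hx_prior)
print(f"relative misfit: {misfit_ratio:.3f}, relative std: {std_ratio:.3f}")
```

Both ratios are dimensionless, so values below one indicate an improvement in data fit or a reduction in spread relative to the prior.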
Table 2: Summary of the relative data misfit and the relative standard deviation of posterior samples with different ensemble sizes.

sample size   quantity                   EnKF    EnRML   EnKF-MDA
N = 50        relative data misfit       15.2%   23.2%   30.0%
N = 50        relative std of ensemble   16.1%   36.9%   30.5%
N = 200       relative data misfit       21.7%   29.2%   36.6%
N = 200       relative std of ensemble   17.5%   40.9%   31.8%
N = 800       relative data misfit       33.7%   35.9%   46.1%
N = 800       relative std of ensemble   22.5%   44.4%   36.2%
N = 3200      relative data misfit       37.9%   38.2%   49.3%
N = 3200      relative std of ensemble   23.2%   49.2%   38.8%

EnKF achieves the best data fit among the three methods but underestimates the variance of the posterior samples. EnRML and EnKF-MDA not only reduce the data misfit but also provide statistical information. Between these two, EnRML provides a better data match and preserves a larger posterior variance. With a larger sample size, the posterior variance increases for all three methods, since more samples cover more statistical information. However, the data misfit becomes worse than with the small sample size. That is likely due to the capping error. When we perform the Bayesian update, some samples may lead to updated (ξ, η) outside the square [−1, 1] × [−1, 1] shown in Fig. 8b. To ensure physical realizability, we bound any sample outside the square by fixing it at the edges. With more samples, more of them may leave the physical range and need to be capped, which likely causes larger errors between the posterior mean and the data. Better methods to ensure physical realizability need to be investigated but are outside the scope of this work.

5. Conclusion

This paper evaluates the performance of three widely used iterative ensemble methods (EnKF, EnRML, and EnKF-MDA) for UQ problems in steady cases. We summarize the derivations of these ensemble methods from an optimization viewpoint. The iterative EnKF method performs several full Gauss–Newton steps, during which the same data is used repeatedly for the stationary scenario. The EnRML method and EnKF-MDA approach the optimal point iteratively with the Gauss–Newton method or with likelihood recursion, respectively, which avoids data overuse and at the same time alleviates the effects of the linearization approximation. In the numerical investigation of a scalar case, we examine the effects of small ensemble sizes. The results show that the EnRML method and EnKF-MDA provide a satisfactory estimate of the posterior uncertainty with a small ensemble size, although the estimate remains inferior to that obtained with a large ensemble size. This is because a small ensemble is not sufficient to describe the statistical information and increases the error in the estimate of the model gradient. This deficiency may be alleviated by using the localization technique and will be further investigated in future work. The comparison results for both the scalar case and the CFD case show that the posterior mean of all three methods agrees well with the benchmark data. However, the iterative form of EnKF discussed here, which uses the same data repeatedly for steady problems, improves the data fit but underestimates the posterior uncertainty. The other two methods, EnRML and EnKF-MDA, are capable of estimating the posterior uncertainty. Based on our comparison study, the EnRML method is recommended since it converges fast and provides statistical information even in complicated CFD cases.
The applicability of these ensemble methods for unsteady CFD applications will be investigated in future studies.

Acknowledgements

The authors would like to thank the reviewers for their constructive and valuable comments, which helped us to improve the quality and clarity of this manuscript.

References

[1] F. D. Witherden, A. Jameson, Future Directions in Computational Fluid Dynamics, in: 23rd AIAA Computational Fluid Dynamics Conference, 2017, pp. 1–16.
[2] L. Mathelin, M. Y. Hussaini, T. A. Zang, F. Bataille, Uncertainty propagation for a turbulent, compressible nozzle flow using stochastic methods, AIAA Journal 42 (8) (2004) 1669–1676.
[3] O. Knio, O. Le Maitre, Uncertainty propagation in CFD using polynomial chaos decomposition, Fluid Dynamics Research 38 (9) (2006) 616–640.
[4] S. Hosder, R. Walters, R. Perez, A non-intrusive polynomial chaos method for uncertainty propagation in CFD simulations, in: 44th AIAA aerospace sciences meeting and exhibit, 2006.
[5] H. N. Najm, Uncertainty quantification and polynomial chaos techniques in computational fluid dynamics, Annual Review of Fluid Mechanics 41 (2009) 35–52.
[6] E. Dow, Q. Wang, Quantification of structural uncertainties in the k–ω turbulence model, in: 52nd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference 19th AIAA/ASME/AHS Adaptive Structures Conference, 2011, pp. 1–12.
[7] A. Gel, R. Garg, C. Tong, M. Shahnam, C. Guenther, Applying uncertainty quantification to multiphase flow computational fluid dynamics, Powder Technology 242 (2013) 27–39.
[8] H. Xiao, J.-L. Wu, J.-X. Wang, R. Sun, C. Roy, Quantifying and reducing model-form uncertainties in Reynolds-averaged Navier–Stokes simulations: A data-driven, physics-informed Bayesian approach, Journal of Computational Physics 324 (2016) 115–136.
[9] M. C. Kennedy, A. O'Hagan, Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63 (3) (2001) 425–464.
[10] S. H. Cheung, T. A. Oliver, E. E. Prudencio, S. Prudhomme, R. D. Moser, Bayesian uncertainty analysis with applications to turbulence modeling, Reliability Engineering & System Safety 96 (9) (2011) 1137–
[11] T. A. Oliver, R. D. Moser, Bayesian uncertainty quantification applied to RANS turbulence models, in: Journal of Physics: Conference Series, Vol. 318, IOP Publishing, 2011, pp. 1–10.
[12] W. Edeling, P. Cinnella, R. P. Dwight, H. Bijl, Bayesian estimates of parameter variability in the k–ε turbulence model, Journal of Computational Physics 258 (2014) 73–94.
[13] W. N. Edeling, M. Schmelzer, R. P. Dwight, P. Cinnella, Bayesian predictions of Reynolds-averaged Navier–Stokes uncertainties using maximum a posteriori estimates, AIAA Journal 56 (5) (2018) 2018–
[14] X. Ma, N. Zabaras, An adaptive hierarchical sparse grid collocation algorithm for the solution of stochastic differential equations, Journal of Computational Physics 228 (8) (2009) 3084–3113.
[15] W. Edeling, R. P. Dwight, P. Cinnella, Simplex-stochastic collocation method with improved scalability, Journal of Computational Physics 310 (2016) 301–328.
[16] J. Zhang, S. Fu, An efficient Bayesian uncertainty quantification approach with application to k-ω-γ transition modeling, Computers & Fluids 161 (2018) 211–224.
[17] X. Zhang, T. Gomez, O. Coutier-Delgosha, Bayesian optimisation of RANS simulation with ensemble-based variational method in convergent-divergent channel, Journal of Turbulence 20 (3) (2019) 1–26.
[18] C. Liu, Q. Xiao, B. Wang, An ensemble-based four-dimensional variational data assimilation scheme. Part I: Technical formulation and preliminary test, Monthly Weather Review 136 (9) (2008) 3363–3373.
[19] A. A. Emerick, A. C. Reynolds, History matching time-lapse seismic data using the ensemble Kalman filter with multiple data assimilations, Computational Geosciences 16 (3) (2012) 639–659.
[20] G. Evensen, Data assimilation: the ensemble Kalman filter, Springer Science & Business Media, 2009.
[21] G. Gao, M. Zafari, A. C. Reynolds, et al., Quantifying uncertainty for the PUNQ-S3 problem in a Bayesian setting with RML and EnKF, in: SPE reservoir simulation symposium, Society of Petroleum Engineers, 2005, pp. 506–515.
[22] J. A. Vrugt, B. A. Robinson, Treatment of uncertainty using ensemble methods: Comparison of sequential data assimilation and Bayesian model averaging, Water Resources Research 43 (1).
[23] A. Lorenc, The potential of the ensemble Kalman filter for NWP - a comparison with 4D-Var, Quarterly Journal of The Royal Meteorological Society 129 (595) (2003) 3183–3203. doi:10.1256/qj.02.132.
[24] P. L. Houtekamer, F. Zhang, Review of the ensemble Kalman filter for atmospheric data assimilation, Monthly Weather Review 144 (12) (2016) 4489–4532. doi:10.1175/MWR-D-15-0440.1.
[25] L. Natvik, G. Evensen, Assimilation of ocean colour data into a biochemical model of the North Atlantic - Part 1. Data assimilation experiments, Journal of Marine Systems 40 (2003) 127–153. doi:10.1016/S0924-7963(03)00016-2.
[26] V. E. J. Haugen, G. Evensen, Assimilation of SLA and SST data into an OGCM for the Indian ocean, Ocean Dynamics 52 (3) (2002) 133–151. doi:10.1007/s10236-002-0014-7.
[27] H. Kato, S. Obayashi, Approach for uncertainty of turbulence modeling based on data assimilation technique, Computers & Fluids 85 (2013) 2–7.
[28] M. A. Iglesias, K. J. Law, A. M. Stuart, Ensemble Kalman methods for inverse problems, Inverse Problems 29 (4) (2013) 045001.
[29] H. Xiao, P. Cinnella, Quantification of model uncertainty in RANS simulations: a review, Progress in Aerospace Sciences.
[30] G. Evensen, Analysis of iterative ensemble smoothers for solving inverse problems, Computational Geosciences 22 (3) (2018) 885–908.
[31] Y. Gu, D. S. Oliver, et al., An iterative ensemble Kalman filter for multiphase fluid flow data assimilation, SPE Journal 12 (04) (2007) 438–446.
[32] Y. Chen, D. S. Oliver, Ensemble randomized maximum likelihood method as an iterative ensemble smoother, Mathematical Geosciences 44 (1) (2012) 1–26.
[33] Y. Yang, C. Robinson, D. Heitz, E. Mémin, Enhanced ensemble-based 4DVar scheme for data assimilation, Computers & Fluids 115 (2015) 201–210.
[34] A. A. Emerick, A. C. Reynolds, Ensemble smoother with multiple data assimilation, Computers & Geosciences 55 (2013) 3–15.
[35] A. C. Reynolds, M. Zafari, G. Li, Iterative forms of the ensemble Kalman filter, in: ECMOR X-10th European Conference on the Mathematics of Oil Recovery, 2006.
[36] D. S. Oliver, Y. Chen, Recent progress on reservoir history matching: a review, Computational Geosciences 15 (1) (2011) 185–221.
[37] O. G. Ernst, B. Sprungk, H.-J. Starkloff, Analysis of the ensemble and polynomial chaos Kalman filters in Bayesian inverse problems, SIAM/ASA Journal on Uncertainty Quantification 3 (1) (2015) 823–851.
[38] G. Burgers, P. Jan van Leeuwen, G. Evensen, Analysis scheme in the ensemble Kalman filter, Monthly Weather Review 126 (6) (1998) 1719–1724.
[39] Z. Wu, A. C. Reynolds, D. S. Oliver, et al., Conditioning geostatistical models to two-phase production data, in: SPE Annual Technical Conference and Exhibition, Society of Petroleum Engineers, 1998, pp. 142–155.
[40] G. Gao, A. C. Reynolds, et al., An improved implementation of the LBFGS algorithm for automatic history matching, in: SPE Annual Technical Conference and Exhibition, Society of Petroleum Engineers, 2004, pp. 5–17.
[41] J. A. Vrugt, Markov chain Monte Carlo simulation using the DREAM software package: Theory, concepts, and MATLAB implementation, Environmental Modelling & Software 75 (2016) 273–316.
[42] X. Zhang, H. Xiao, T. Gomez, O. Coutier-Delgosha, Code of scalar case for uncertainty quantification, https://github.com/XinleiZhang/scalar-case-for-UQ.
[43] J. L.
Anderson, Localization and sampling error correction in ensemble Kalman filter data assimilation, Monthly Weather Review 140 (7) (2012) 2359–2371.
[44] S. Banerjee, R. Krahl, F. Durst, C. Zenger, Presentation of anisotropy properties of turbulence, invariants versus eigenvalue approaches, Journal of Turbulence 8 (32) (2007) 1–27.
[45] J. Fröhlich, C. P. Mellen, W. Rodi, L. Temmerman, M. A. Leschziner, Highly resolved large-eddy simulation of separated flow in a channel with streamwise periodic constrictions, Journal of Fluid Mechanics 526 (2005) 19–66.
[46] M. Breuer, N. Peller, C. Rapp, M. Manhart, Flow over periodic hills – numerical and experimental study in a wide range of Reynolds numbers, Computers & Fluids 38 (2) (2009) 433–457.
[47] B. E. Launder, B. Sharma, Application of the energy-dissipation model of turbulence to the calculation of flow near a spinning disc, Letters in Heat and Mass Transfer 1 (2) (1974) 131–137.

Appendix A. Derivation of EnKF

The cost function and its gradient for the iterative EnKF are formulated as

$$
J = \frac{1}{2} \left( x^a_{n,j} - x^f_{n,j} \right)^\top P_n^{-1} \left( x^a_{n,j} - x^f_{n,j} \right) + \frac{1}{2} \left( H[x^a_{n,j}] - y_j \right)^\top R^{-1} \left( H[x^a_{n,j}] - y_j \right), \tag{A.1a}
$$

$$
\frac{\partial J}{\partial x^a_{n,j}} = P_n^{-1} \left( x^a_{n,j} - x^f_{n,j} \right) + H'[x^a_{n,j}]^\top R^{-1} \left( H[x^a_{n,j}] - y_j \right). \tag{A.1b}
$$

We approximate the unknown terms $H[x^a_j]$ and $H'[x^a_j]$ in Eq. (A.1b) with the linear assumption (the index n is omitted here for brevity) as

$$
H[x^a_j] \approx H[x^f_j] + H'[x^f_j] \left( x^a_j - x^f_j \right), \tag{A.2a}
$$

$$
H'[x^a_j] \approx H'[x^f_j] + H''[x^f_j] \left( x^a_j - x^f_j \right), \tag{A.2b}
$$

where the second derivative can be neglected. Further, we set the gradient of the cost function to zero and substitute Eq. (A.2) to obtain

$$
P_n^{-1} \left( x^a_{n,j} - x^f_{n,j} \right) = - H'[x^f_{n,j}]^\top R^{-1} \left( H[x^f_{n,j}] + H'[x^f_{n,j}] \left( x^a_{n,j} - x^f_{n,j} \right) - y_j \right). \tag{A.3}
$$

We expand $H[x]$ around the ensemble mean as

$$
H[x^f_j] \approx H[\bar{X}^f] + H'[x^f_j] \left( x^f_j - \bar{X}^f \right). \tag{A.4a}
$$

Afterwards, we assume that $H[x] = \mathsf{H}x$, where $\mathsf{H}$ is the tangent linear operator. The model function gradient $H'[x^f_j]$ can be estimated directly with the linear operator $\mathsf{H}$ based on Eq. (A.4a). Hence, Eq.
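(A.3) can be formulated and rearranged as

$$
P_n^{-1} \left( x^a_{n,j} - x^f_{n,j} \right) = - \mathsf{H}^\top R^{-1} \left( \mathsf{H} x^f_{n,j} + \mathsf{H} \left( x^a_{n,j} - x^f_{n,j} \right) - y_j \right), \tag{A.5a}
$$

$$
x^a_{n,j} = x^f_{n,j} + P_n \left( I + \mathsf{H}^\top R^{-1} \mathsf{H} P_n \right)^{-1} \mathsf{H}^\top R^{-1} \left( y_j - \mathsf{H} x^f_{n,j} \right). \tag{A.5b}
$$

Set $Q = R^{-1} \mathsf{H} P_n$ and we have:

$$
\mathsf{H}^\top \left( I + Q \mathsf{H}^\top \right) = \left( I + \mathsf{H}^\top Q \right) \mathsf{H}^\top, \tag{A.6a}
$$

$$
\left( I + \mathsf{H}^\top Q \right)^{-1} \mathsf{H}^\top = \mathsf{H}^\top \left( I + Q \mathsf{H}^\top \right)^{-1}. \tag{A.6b}
$$

Now back to Eq. (A.5b): substituting $(I + \mathsf{H}^\top R^{-1} \mathsf{H} P_n)^{-1} \mathsf{H}^\top$ with $\mathsf{H}^\top (I + R^{-1} \mathsf{H} P_n \mathsf{H}^\top)^{-1}$ based on Eq. (A.6b), we can derive:

$$
x^a_{n,j} = x^f_{n,j} + P_n \mathsf{H}^\top \left( I + R^{-1} \mathsf{H} P_n \mathsf{H}^\top \right)^{-1} R^{-1} \left( y_j - \mathsf{H} x^f_{n,j} \right), \tag{A.7a}
$$

$$
x^a_{n,j} = x^f_{n,j} + P_n \mathsf{H}^\top \left( R + \mathsf{H} P_n \mathsf{H}^\top \right)^{-1} \left( y_j - \mathsf{H} x^f_{n,j} \right). \tag{A.7b}
$$

Eq. (A.7b) is the iterative formulation for the analysis step of the EnKF method.

Appendix B. Derivation of EnRML

To derive the analysis scheme of the ensemble randomized maximum likelihood method, we start from the gradient and Hessian of the cost function:

$$
\frac{\partial J}{\partial x_{l,j}} = P_0^{-1} \left( x_{l,j} - x_{0,j} \right) + H'[x_{l,j}]^\top R^{-1} \left( H[x_{l,j}] - y_j \right), \tag{B.1a}
$$

$$
\frac{\partial^2 J}{\partial x_{l,j}^2} = P_0^{-1} + H'[x_{l,j}]^\top R^{-1} H'[x_{l,j}]. \tag{B.1b}
$$

In the EnRML method, the state vector x is updated with the Gauss–Newton method with step length $\gamma$ as

$$
x^a_{l,j} = x^f_{l,j} - \gamma \left( \frac{\partial^2 J}{\partial (x^f_{l,j})^2} \right)^{-1} \frac{\partial J}{\partial x^f_{l,j}}. \tag{B.2}
$$

Through directly introducing the gradient and Hessian formulation into Eq.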
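(B.2), we can have

$$
\begin{aligned}
x^a_{l,j} &= x^f_{l,j} - \gamma \left( P_0^{-1} + H'[x^f_{l,j}]^\top R^{-1} H'[x^f_{l,j}] \right)^{-1} \left( P_0^{-1} \left( x^f_{l,j} - x^f_{0,j} \right) + H'[x^f_{l,j}]^\top R^{-1} \left( H[x^f_{l,j}] - y_j \right) \right) \\
&= x^f_{l,j} - \gamma \left( I + P_0 H'[x^f_{l,j}]^\top R^{-1} H'[x^f_{l,j}] \right)^{-1} \left( x^f_{l,j} - x^f_{0,j} + P_0 H'[x^f_{l,j}]^\top R^{-1} \left( H[x^f_{l,j}] - y_j \right) \right).
\end{aligned} \tag{B.3}
$$

By expanding the last term, we obtain

$$
\begin{aligned}
x^a_{l,j} = x^f_{l,j} &- \gamma \left( I + P_0 H'[x^f_{l,j}]^\top R^{-1} H'[x^f_{l,j}] \right)^{-1} \left( x^f_{l,j} - x^f_{0,j} \right) \\
&- \gamma \left( I + P_0 H'[x^f_{l,j}]^\top R^{-1} H'[x^f_{l,j}] \right)^{-1} P_0 H'[x^f_{l,j}]^\top R^{-1} \left( H[x^f_{l,j}] - y_j \right).
\end{aligned} \tag{B.4}
$$

We can further derive from Eq. (B.4) via the Woodbury formula as follows:

$$
\begin{aligned}
x^a_{l,j} = x^f_{l,j} &- \gamma \left( I - P_0 H'[x^f_{l,j}]^\top \left( R + H'[x^f_{l,j}] P_0 H'[x^f_{l,j}]^\top \right)^{-1} H'[x^f_{l,j}] \right) \left( x^f_{l,j} - x^f_{0,j} \right) \\
&- \gamma \left( I + P_0 H'[x^f_{l,j}]^\top R^{-1} H'[x^f_{l,j}] \right)^{-1} P_0 H'[x^f_{l,j}]^\top R^{-1} \left( H[x^f_{l,j}] - y_j \right).
\end{aligned} \tag{B.5}
$$

After expanding the second term on the right-hand side and rearranging, we have

$$
\begin{aligned}
x^a_{l,j} = \gamma x^f_{0,j} + (1 - \gamma) x^f_{l,j} &+ \gamma P_0 H'[x^f_{l,j}]^\top \left( R + H'[x^f_{l,j}] P_0 H'[x^f_{l,j}]^\top \right)^{-1} H'[x^f_{l,j}] \left( x^f_{l,j} - x^f_{0,j} \right) \\
&- \gamma \left( I + P_0 H'[x^f_{l,j}]^\top R^{-1} H'[x^f_{l,j}] \right)^{-1} P_0 H'[x^f_{l,j}]^\top R^{-1} \left( H[x^f_{l,j}] - y_j \right).
\end{aligned} \tag{B.6}
$$

Set $Q = P_0 H'[x]^\top$, and we deduce

$$
Q R^{-1} \left( R + H'[x] Q \right) = \left( I + Q R^{-1} H'[x] \right) Q, \tag{B.7a}
$$

$$
\left( I + Q R^{-1} H'[x] \right)^{-1} Q R^{-1} = Q \left( R + H'[x] Q \right)^{-1}. \tag{B.7b}
$$

Finally, by substituting Eq. (B.7b) into Eq. (B.6) we obtain the analysis step for the EnRML method as

$$
x^a_{l,j} = \gamma x^f_{0,j} + (1 - \gamma) x^f_{l,j} - \gamma P_0 H'[x^f_{l,j}]^\top \left( R + H'[x^f_{l,j}] P_0 H'[x^f_{l,j}]^\top \right)^{-1} \left( H[x^f_{l,j}] - y_j - H'[x^f_{l,j}] \left( x^f_{l,j} - x^f_{0,j} \right) \right). \tag{B.8}
$$

Physics arXiv (Cornell University)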
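The analysis steps in Eqs. (A.7b) and (B.8) can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the observation operator is taken to be a fixed matrix `H` (so the sensitivity H'[x] is simply `H`), the covariance is the sample covariance of the ensemble, the step length is called `gamma`, and all function names and the random test problem are hypothetical. In that linear setting a single EnRML step with `gamma = 1` started from the prior reproduces the EnKF analysis, and repeated steps with a smaller `gamma` approach the same result.

```python
import numpy as np


def sample_cov(X):
    """Sample covariance of an ensemble stored column-wise, shape (n, N)."""
    A = X - X.mean(axis=1, keepdims=True)
    return A @ A.T / (X.shape[1] - 1)


def enkf_analysis(Xf, Y, H, R):
    """Eq. (A.7b) for all members: Xa = Xf + P H^T (R + H P H^T)^{-1} (Y - H Xf)."""
    P = sample_cov(Xf)
    K = P @ H.T @ np.linalg.inv(R + H @ P @ H.T)
    return Xf + K @ (Y - H @ Xf)


def enrml_step(Xl, X0, Y, H, R, gamma):
    """Eq. (B.8) with the sensitivity H'[x] replaced by the fixed matrix H."""
    P0 = sample_cov(X0)  # prior covariance from the initial ensemble
    K = P0 @ H.T @ np.linalg.inv(R + H @ P0 @ H.T)
    residual = H @ Xl - Y - H @ (Xl - X0)
    return gamma * X0 + (1.0 - gamma) * Xl - gamma * (K @ residual)


# Hypothetical linear test problem: 3 state variables, 2 observations, 40 members.
rng = np.random.default_rng(1)
n, m, N = 3, 2, 40
H = rng.standard_normal((m, n))
R = 0.1 * np.eye(m)
X0 = rng.standard_normal((n, N))
y = np.array([1.0, -0.5])
Y = y[:, None] + np.sqrt(0.1) * rng.standard_normal((m, N))  # perturbed observations y_j

Xa_enkf = enkf_analysis(X0, Y, H, R)
Xa_full = enrml_step(X0, X0, Y, H, R, gamma=1.0)  # one full Gauss-Newton step

Xl = X0.copy()
for _ in range(60):  # many damped steps
    Xl = enrml_step(Xl, X0, Y, H, R, gamma=0.5)

print(np.allclose(Xa_full, Xa_enkf), np.allclose(Xl, Xa_enkf))
```

For a nonlinear forward model the sensitivity would have to be re-estimated from the ensemble at every iteration, which is where the damped steps are expected to matter.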

Evaluation of ensemble methods for quantifying uncertainties in steady-state CFD applications with small ensemble sizes

Physics, Volume 2020 (2004) – Apr 12, 2020

ISSN
0045-7930
eISSN
ARCH-3341
DOI
10.1016/j.compfluid.2020.104530

Abstract

Bayesian uncertainty quantification (UQ) is of interest to industry and academia as it provides a framework for quantifying and reducing the uncertainty in computational models by incorporating available data. For systems with very high computational costs, for instance computational fluid dynamics (CFD) problems, conventional exact Bayesian approaches such as Markov chain Monte Carlo are intractable. To this end, ensemble-based Bayesian methods have been used for CFD applications. However, their applicability for UQ has not been fully analyzed and understood thus far. Here, we evaluate the performance of three widely used iterative ensemble-based data assimilation methods for UQ problems, namely the ensemble Kalman filter, the ensemble randomized maximum likelihood method, and the ensemble Kalman filter with multiple data assimilation. We present the derivations of the three ensemble methods from an optimization viewpoint. Further, a scalar case is used to demonstrate the performance of the three approaches, with emphasis on the effects of small ensemble sizes. Finally, we assess the three ensemble methods for quantifying uncertainties in steady-state CFD problems involving turbulent mean flows. Specifically, the Reynolds-averaged Navier–Stokes (RANS) equation is considered as the forward model, and the uncertainties in the propagated velocity are quantified and reduced by incorporating observation data. The results show that the ensemble methods cannot accurately capture the true posterior distribution, but they can provide a good estimate of the uncertainties even when very limited ensemble sizes are used. Based on the overall performance and efficiency in the comparison, the ensemble randomized maximum likelihood method is identified as the best choice of approximate Bayesian UQ approach among the three ensemble methods evaluated here.

Keywords: Uncertainty quantification, Ensemble methods, Data assimilation, Computational fluid dynamics, Small ensemble sizes

Corresponding author: Heng Xiao. Email address: hengxiao@vt.edu. URL: https://www.aoe.vt.edu/people/faculty/xiaoheng.html. Preprint submitted to Computers & Fluids, April 14, 2020. arXiv:2004.05541v1 [physics.comp-ph] 12 Apr 2020.

1. Introduction

1.1. Bayesian uncertainty quantification for CFD

In computational fluid dynamics (CFD) applications, Reynolds-averaged Navier–Stokes (RANS) methods are still the workhorse tool for informing important decisions during engineering design processes. However, RANS models cannot provide accurate results for many cases involving complex turbulent flows. That necessitates quantifying uncertainties in the numerical simulations so that we can obtain additional confidence and statistical information on the simulated results [1]. The conventional approach to quantifying uncertainties is to propagate the presumed uncertainty in the system inputs forward to the quantities of interest (QoIs) through the forward model. The procedure of uncertainty propagation is illustrated in Fig. 1(a). Numerous methods [2–4] and applications [5–7] have been developed and explored for uncertainty propagation in the literature. Another uncertainty quantification (UQ) method is the Bayesian UQ approach. This approach can account for the available data from high-fidelity simulations or experiments to backwardly quantify and reduce the uncertainty of the QoIs as well as of the system inputs (e.g., model parameters or underlying terms) [8]. The procedure of Bayesian UQ is illustrated in the schematic in Fig. 1(b).
Figure 1: Schematic of uncertainty quantification. (a) Uncertainty propagation forwardly propagates the presumed uncertainty (red dashed line) in the input to the quantity of interest through the forward model; (b) the Bayesian UQ approach combines the prior information with high-fidelity simulation or experimental data to backwardly quantify the posterior uncertainty (blue solid line) in the quantities of interest and in the input. Diagram labels: Input, Forward model, Quantities of interest, Data.
Numerous works have applied the Bayesian UQ approach to diverse applications, including RANS simulations. Based on the pioneering work of Kennedy and O'Hagan [9], Cheung et al. [10] applied a Bayesian calibration framework to the Spalart–Allmaras turbulence model to calibrate the model parameters by incorporating experimental measurements. They evaluated their approach on boundary layer flows to reduce computational costs and pointed out the necessity of developing tractable UQ approaches for computationally expensive cases. Oliver and Moser [11] further extended the work of Cheung et al. [10] by introducing stochastic representations for the uncertainties in eddy viscosity turbulence models. Uncertainty representations based on the multiplicative error in mean velocity and the additive error in Reynolds shear stress were developed and used for plane channel flows. Edeling et al. [12] proposed a Bayesian model-scenario averaging (BMSA) method to estimate the k–ε turbulence model error for a class of boundary layer flows with different pressure gradients. More recently, Edeling et al. [13] leveraged the maximum a posteriori (MAP) estimate to reduce the computational cost and thus make their BMSA approach applicable to complex flows.

The aforementioned works use the Markov chain Monte Carlo (MCMC) technique, which typically requires at least O(10^5) to O(10^6) samples. However, MCMC would be computationally intractable for the complex flow cases of engineering interest, where uncertainty propagation through the forward model is computationally expensive. To reduce the computational cost, the conventional approach is to use surrogate models (e.g., the polynomial chaos methods [14–16]) to replace the CFD code. Nevertheless, such approaches are challenging for high-dimensional problems due to the curse of dimensionality. The ensemble technique has been proposed and discussed extensively for UQ problems in the data assimilation community.
It can signi cantly reduce sample size to O(10 ) and provide reasonable estimates of posterior uncertainty with limited samples. Therefore, the ensemble methods can potentially play a role as an approximate Bayesian UQ approach for computationally expensive ow cases. The ensemble-based data assimilation methods will be further discussed below. 1.2. Ensemble-based data assimilation Ensemble-based data assimilation has recently increased in popularity and has been applied to diverse con- texts including uid mechanics [17], weather forecasting [18] and geoscience [19] due to its non-intrusiveness and robustness. Among ensemble-based data assimilation methods, the most widely used is the ensemble Kalman lter (EnKF) [20]. It has been extensively used for uncertainty quanti cation in various applica- tions, such as hydrology [21, 22], meteorology [23, 24], oceangraphy [25, 26]. In the past few years, EnKF has also been increasingly leveraged for CFD applications to estimate empirical parameters or functional errors in the RANS closure models. Kato and Obayashi [27] explored the applicability of the EnKF method to estimate the uncertainty in the empirical parameters of the Spalart{Allmaras RANS model. However, due to the strong nonlinearity of the RANS problem, it is necessary to iteratively assimilate data even for the stationary scenario, thus enhancing the performance of data tting. To this end, Iglesias et al. [28] proposed an iterative form of the standard EnKF as a derivative-free optimization method for inverse problems. In their framework, the analysis step of EnKF iterates with the arti cial time for stationary systems based on the state augmentation. They showed the accuracy of the iterative EnKF for inferring the sample mean with three di erent cases, but its accuracy in the context of uncertainty quanti cation has not been fully investigated. Xiao et al. 
[8] applied this iterative EnKF to quantify and reduce the RANS model-form uncertainty within the Reynolds stress. They demonstrated that the posterior mean obtained with EnKF could agree remarkably well with benchmark data. Readers are referred to the review of Xiao and Cinnella [29] for recent progress in model-form uncertainty quantification in RANS simulations.

For highly nonlinear systems, the ill-posedness of the problem is significantly increased. To search for the optimal point, EnKF takes a full gradient descent step in which the forward model is linearized to simplify the problem [30]. This possibly changes the original nonlinear problem and leads to wrong solutions. To address this issue, several iterative ensemble methods have been proposed and discussed for UQ of nonlinear systems in the data assimilation community. For instance, Gu and Oliver [31] proposed the ensemble randomized maximum likelihood (EnRML) method, which iterates the analysis step with the Gauss–Newton algorithm. They demonstrated the superiority of the EnRML method over EnKF for both static and dynamic problems with strong nonlinearity. Chen and Oliver [32] used the EnRML method as an iterative ensemble smoother for the history matching problem. Yang et al. [33] proposed an enhanced ensemble variational method and applied it to unsteady flows. Their method is implemented similarly to EnRML, with an iterative minimization procedure based on the Gauss–Newton algorithm, but the error covariance is updated sequentially based on ensemble analysis. On the other hand, Emerick and Reynolds [19] proposed an ensemble Kalman filter with multiple data assimilation (EnKF-MDA) and demonstrated that it could provide a better data match than EnKF at a comparable computational cost. This method performs the Bayesian analysis with a recursion of the likelihood by inflating the observation error.
It is worth noting that for unsteady cases, EnKF is usually used as a filtering technique to assimilate the data sequentially in time, while the EnRML method [32] and EnKF-MDA [34] can be used as smoothers to account for all the available data simultaneously. Moreover, for the linear Gaussian case, it has been proven that the EnRML method and EnKF-MDA are equivalent to the EnKF [19, 35]. For the nonlinear case, however, the equivalence does not hold. EnKF can be regarded as a single Gauss–Newton update with a full step. In contrast, the EnRML method and EnKF-MDA perform multiple small corrections, which helps to alleviate the inaccuracy caused by the linearization and better preserves the nonlinearity of the original problem.

The ensemble-based data assimilation methods mentioned above can be derived in a similar manner by solving a minimization problem under several mild assumptions (e.g., Gaussian distributions, linearization, and ensemble gradient representation) [30]. However, these assumptions may cause the estimated posterior distribution to depart from the truth. Recently, several authors have investigated the causes of inaccurate uncertainty estimates given by the ensemble methods. For instance, Oliver and Chen [36] reviewed the progress of MCMC, EnKF, and EnRML on the history matching problem. They concluded that, compared to the EnKF method, the EnRML method could provide a probability distribution in better agreement with MCMC at a low computational cost. Ernst et al. [37] examined the EnKF method for nonlinear stationary systems. They demonstrated that EnKF can provide sample statistics as an indication of uncertainties but is not suitable for rigorous Bayesian inference. Evensen [30] derived and analyzed different ensemble methods from the viewpoint of model gradient representations and compared the analytic gradient with the ensemble representation of the gradient.
He concluded that none of these methods could provide the exact posterior probability density function (PDF) for highly nonlinear models, but that they can serve as an indication of the uncertainties at least for weakly nonlinear cases. However, a sufficiently large number of samples was used in his work to obtain accurate statistical estimates, and the performance of these methods with small ensemble sizes was not fully evaluated. These iterative ensemble methods are useful for estimating uncertainties in QoIs in industrial CFD applications, and they warrant further investigation.

1.3. Objective of present work

In this work, we present the derivations of three different iterative ensemble methods, namely iterative EnKF [28] (hereinafter referred to as EnKF for brevity), EnRML, and EnKF-MDA, from the optimization perspective, and compare their performance for quantifying uncertainties in steady-state CFD applications with small ensemble sizes. Moreover, the effect of small ensemble sizes on the performance of each method is evaluated in a scalar case by comparison with the Bayesian distribution from MCMC.

The rest of the paper is structured as follows. In Section 2, we give a brief derivation of the three most commonly used ensemble-based data assimilation methods (EnKF, EnRML, and EnKF-MDA). A scalar case is presented in Section 3 to compare the performance of these methods with different ensemble sizes. In Section 4, a steady flow case is tested to identify the suitable approach for quantifying the uncertainty in the RANS model. Section 5 concludes the paper.

2. Ensemble-based data assimilation methods

Here, we summarize the derivation of the three different ensemble-based data assimilation methods (EnKF, EnRML, and EnKF-MDA) from the optimization perspective. For clarity and without loss of generality, we assume a multi-variate state-space model with multiple observations.
This is in contrast to Evensen's work [30], where a single-variate state-space model with a single observation is assumed.

2.1. Minimization problem

Consider that the observation model can be expressed as

$$y = \mathcal{H}[x] + \epsilon, \tag{1}$$

where $x \in \mathbb{R}^N$ is the state vector or input parameter, $y \in \mathbb{R}^D$ is the observation, $\mathcal{H}: \mathbb{R}^N \to \mathbb{R}^D$ is the model function mapping the state to observation space, and $\epsilon$ is the added observation noise, which is assumed to be an independent and identically distributed (i.i.d.) Gaussian random vector with zero mean and covariance $R$. We give an initial guess of the PDF of the state, $p(x)$, as the prior knowledge based on the Gaussian assumption. The Bayesian UQ approach can then be used to find the posterior distribution conditioned on the observation. Bayes' theorem can be formulated as

$$p(x \mid y) \propto p(x)\, p(y \mid \mathcal{H}[x]), \tag{2}$$

which states that the posterior distribution $p(x \mid y)$ is proportional to the product of the prior distribution $p(x)$ and the likelihood function $p(y \mid \mathcal{H}[x])$ of the data $y$ conditioned on the model $\mathcal{H}[x]$. With the Gaussian assumption for the prior $p(x)$ and the likelihood $p(y \mid \mathcal{H}[x])$, we can rewrite Bayes' formula in Eq. (2) as

$$p(x \mid y) \propto p(x)\, p(y \mid \mathcal{H}[x]) \propto e^{-J}, \tag{3}$$

where $J$ is the cost function defined as

$$J[x^a] = \frac{1}{2}\left(x^a - x^f\right)^\top P^{-1} \left(x^a - x^f\right) + \frac{1}{2}\left(\mathcal{H}[x^a] - y\right)^\top R^{-1} \left(\mathcal{H}[x^a] - y\right). \tag{4}$$

In the formula above, $P$ is the model error covariance, $R$ is the observation error covariance, and the superscripts $a$ and $f$ represent the "analysis" and "forecast", respectively.

It is challenging to obtain the true error covariance $P$ in problems with high-dimensional state spaces. The ensemble methods apply the Monte Carlo technique to draw a small number of samples. Such samples can then be used to estimate the ensemble representation of the model error covariance $P$ and the observation error covariance $R$ as

$$P = \frac{1}{M-1}\left(X - \bar{X}\right)\left(X - \bar{X}\right)^\top, \qquad R = \epsilon \epsilon^\top, \tag{5}$$

where $X = \{x_1, \dots, x_M\}$ is the ensemble of $M$ samples and $\bar{X}$ is its mean. Note that the estimated covariance matrices for the observation error and the state are both symmetric.
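As a minimal sketch of the ensemble covariance estimate in Eq. (5), the NumPy snippet below computes the sample covariance of an ensemble stored column-wise. The array layout (N state components by M members), the function name `ensemble_covariance`, and the random test ensemble are illustrative assumptions and are not taken from the original work.

```python
import numpy as np

def ensemble_covariance(X):
    """Sample covariance P = (X - Xbar)(X - Xbar)^T / (M - 1).

    X has shape (N, M): N state components, M ensemble members
    (this column-wise layout is an assumption of the sketch).
    """
    M = X.shape[1]
    anomalies = X - X.mean(axis=1, keepdims=True)  # deviations from the ensemble mean
    return anomalies @ anomalies.T / (M - 1)

# Hypothetical ensemble: N = 3 state components, M = 50 members.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))
P = ensemble_covariance(X)
print(P.shape)
```

The estimate is symmetric by construction and has rank at most M - 1, so a small ensemble yields a rank-deficient covariance whenever the state dimension exceeds the ensemble size.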
Further, the maximum a posteriori (MAP) analysis can be applied to estimate the posterior distribution: maximizing the posterior is equivalent to minimizing the cost function $J$. From this optimization perspective, we can derive the three different data assimilation methods, namely EnKF, EnRML, and EnKF-MDA, by minimizing the cost function with different gradient descent techniques.

2.2. EnKF

For steady cases, the traditional EnKF performs the Kalman update only once. It may be difficult to achieve a satisfactory data fit in some scenarios, for instance, where the prior mean is far from the observation data and the system model is strongly nonlinear [28]. To this end, an iterative technique is usually leveraged to assimilate the data adequately and thus improve the data match. We use the iterative form of EnKF proposed by Iglesias et al. [28] to enhance the optimization performance. This method considers the EnKF as a regularized least-squares technique and performs multiple standard Kalman updates sequentially, even for stationary cases. The cost function for each ensemble realization can be written as

$$J[x^a_{n,j}] = \frac{1}{2}\left(x^a_{n,j} - x^f_{n,j}\right)^\top P_n^{-1} \left(x^a_{n,j} - x^f_{n,j}\right) + \frac{1}{2}\left(\mathcal{H}[x^a_{n,j}] - y_j\right)^\top R^{-1} \left(\mathcal{H}[x^a_{n,j}] - y_j\right), \tag{6}$$

where $n$ indicates the iteration number and $j$ denotes the sample index. Based on the cost function (6), the gradient with respect to the state is

$$\frac{\partial J}{\partial x_{n,j}} = P_n^{-1}\left(x^a_{n,j} - x^f_{n,j}\right) + \mathcal{H}'[x^a_{n,j}]^\top R^{-1}\left(\mathcal{H}[x^a_{n,j}] - y_j\right), \tag{7}$$

which should vanish at the minimum of the cost function $J$. Therefore, the formulation of EnKF can be derived by setting the gradient (7) to zero, which amounts to

$$P_n^{-1}\left(x^a_{n,j} - x^f_{n,j}\right) = -\mathcal{H}'[x^a_{n,j}]^\top R^{-1}\left(\mathcal{H}[x^a_{n,j}] - y_j\right), \tag{8}$$

where only the terms $\mathcal{H}'[x^a_j]$ and $\mathcal{H}[x^a_j]$ are unknown. The linearization assumption is introduced to estimate these two unknown terms.
The two terms are linearized as

$$\mathcal{H}[x^a_j] \approx \mathcal{H}[x^f_j] + \mathcal{H}'[x^f_j]\left(x^a_j - x^f_j\right), \tag{9a}$$

$$\mathcal{H}'[x^a_j] \approx \mathcal{H}'[x^f_j] + \mathcal{H}''[x^f_j]\left(x^a_j - x^f_j\right), \tag{9b}$$

where the second derivative in Eq. (9b) is neglected for simplicity, assuming the model is moderately nonlinear. With ensemble techniques, the model in observation space is randomized around the mean value $\mathcal{H}[\bar{X}^f]$. After expanding $\mathcal{H}[x]$ around the ensemble mean $\mathcal{H}[\bar{X}]$, we can represent $\mathcal{H}[x^f]$ with the model function gradient as [30]

$$\mathcal{H}[x^f_j] \approx \mathcal{H}[\bar{X}^f] + \mathcal{H}'[x^f_j]\left(x^f_j - \bar{X}^f\right). \tag{10a}$$

We introduce the tangent linear model $\mathcal{H}[x] = Hx$, so that the gradient $\mathcal{H}'[x^f]$ can be expressed as the tangent linear operator $H$ by assuming a linear relationship between the measurement and the state. Accordingly, the update step of EnKF can be derived and formulated as

$$x^a_{n,j} = x^f_{n,j} + P_n H^\top \left(R + H P_n H^\top\right)^{-1}\left(y_j - H x^f_{n,j}\right). \tag{11}$$

For practical reasons, one does not usually compute the model operator $H$ explicitly. Rather, the two terms $PH^\top$ and $HPH^\top$ can be reformulated as

$$P H^\top = \frac{1}{M-1}\left(X - \bar{X}\right)\left(\mathcal{H}[X] - \overline{\mathcal{H}[X]}\right)^\top, \tag{12a}$$

$$H P H^\top = \frac{1}{M-1}\left(\mathcal{H}[X] - \overline{\mathcal{H}[X]}\right)\left(\mathcal{H}[X] - \overline{\mathcal{H}[X]}\right)^\top. \tag{12b}$$

In addition, ensemble observations are adopted following [38]; that is, we use randomly perturbed observation data $y_j$ for each realization. Further details of the derivation are presented in Appendix A.

We emphasize that the iterative ensemble Kalman method is a specific method for solving inverse problems that is distinct from the conventional EnKF. It regards the ensemble Kalman method as a regularized least-squares technique. For stationary problems, the update step is iterated in pseudo-time to reduce the data misfit. The iterative ensemble Kalman method for uncertainty quantification is discussed further in Section 2.5.

2.3. EnRML

The ensemble randomized maximum likelihood method [31] updates the initial guess of the state vector iteratively with the Gauss–Newton algorithm.
The cost function can be written as

$$J[x_{l,j}] = \frac{1}{2}\left(x_{l,j} - x_{0,j}\right)^\top P_0^{-1} \left(x_{l,j} - x_{0,j}\right) + \frac{1}{2}\left(\mathcal{H}[x_{l,j}] - y_j\right)^\top R^{-1} \left(\mathcal{H}[x_{l,j}] - y_j\right), \tag{13}$$

where $x_0$ is the initial guess, $P_0$ is the model error covariance estimated initially, before the data assimilation process, and the index $l$ indicates the sub-iteration of the EnRML method. The gradient and Hessian of the cost function (13) can be derived similarly as in EnKF:

$$\frac{\partial J}{\partial x_{l,j}} = P_0^{-1}\left(x_{l,j} - x_{0,j}\right) + \mathcal{H}'[x_{l,j}]^\top R^{-1}\left(\mathcal{H}[x_{l,j}] - y_j\right), \tag{14a}$$

$$\frac{\partial^2 J}{\partial x_{l,j}^2} = P_0^{-1} + \mathcal{H}'[x_{l,j}]^\top R^{-1}\, \mathcal{H}'[x_{l,j}]. \tag{14b}$$

Instead of reaching a zero-gradient minimum directly as in EnKF, the prior $x$ is iteratively updated based on the Gauss–Newton method as

$$x^a_{l,j} = x^f_{l,j} - \gamma \left(\frac{\partial^2 J}{\partial x_{l,j}^2}\right)^{-1} \frac{\partial J}{\partial x_{l,j}}, \tag{15}$$

where $\gamma$ is the step length parameter. The Gauss–Newton approach can reduce the step length and ease the influence of the linearization assumption during the analysis step. With the gradient (14a) and the Hessian (14b) of the cost function, we obtain the analysis scheme for the EnRML method as follows:

$$\begin{aligned} x^a_{l,j} = {} & \gamma\, x^f_{0,j} + (1 - \gamma)\, x^f_{l,j} - \gamma\, P_0\, \mathcal{H}'[x^f_{l,j}]^\top \left(R + \mathcal{H}'[x^f_{l,j}]\, P_0\, \mathcal{H}'[x^f_{l,j}]^\top\right)^{-1} \\ & \times \left(\mathcal{H}[x^f_{l,j}] - y_j - \mathcal{H}'[x^f_{l,j}]\left(x^f_{l,j} - x^f_{0,j}\right)\right). \end{aligned} \tag{16}$$

In the EnRML method, the model error covariance $P$ remains the initial one, $P_0$, and does not change with the iteration number. Moreover, the sensitivity matrix $\mathcal{H}'[X]$ has to be evaluated at each iteration through

$$\mathcal{H}'[X_l] \approx \left(\mathcal{H}[X_l] - \overline{\mathcal{H}[X_l]}\right)\left(X_l - \bar{X}_l\right)^{-1}. \tag{17}$$

Singular value decomposition (SVD) is used to estimate the inverse of the non-full-rank matrix. The details of the derivation can be found in Appendix B.

2.4. EnKF-MDA

From the derivation above, each update of EnKF can be regarded as a Gauss–Newton update that uses a full step in the search direction. However, a single global update may not result in a satisfactory data fit. Hence, assimilating the data multiple times is highly desirable to improve the data fit [35].
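The remark that an EnKF update is a Gauss–Newton update with a full step can be checked numerically. The sketch below assumes a linear operator, so that the sensitivity in Eq. (16) is simply the matrix `H`, and compares the EnRML step at the first sub-iteration (where the current state equals the initial guess) against the Kalman update of Eq. (11). The function names, dimensions, and random test matrices are hypothetical choices made only for illustration.

```python
import numpy as np

def enrml_step(x_l, x_0, y, H, P0, R, gamma):
    """One EnRML update, Eq. (16), for a linear operator H (assumption)."""
    gain = P0 @ H.T @ np.linalg.inv(R + H @ P0 @ H.T)
    residual = H @ x_l - y - H @ (x_l - x_0)
    return gamma * x_0 + (1.0 - gamma) * x_l - gamma * gain @ residual

def kalman_update(x_f, y, H, P, R):
    """Standard Kalman analysis step, Eq. (11)."""
    gain = P @ H.T @ np.linalg.inv(R + H @ P @ H.T)
    return x_f + gain @ (y - H @ x_f)

# Hypothetical test problem: 3 state components, 2 observations.
rng = np.random.default_rng(1)
H = rng.normal(size=(2, 3))
A = rng.normal(size=(3, 3))
P0 = A @ A.T + np.eye(3)        # symmetric positive definite prior covariance
R = 0.1 * np.eye(2)
x_0 = rng.normal(size=3)
y = rng.normal(size=2)

x_full = enrml_step(x_0, x_0, y, H, P0, R, gamma=1.0)   # full Gauss-Newton step
x_half = enrml_step(x_0, x_0, y, H, P0, R, gamma=0.5)   # damped step
x_kalman = kalman_update(x_0, y, H, P0, R)
print(np.allclose(x_full, x_kalman))
```

With a step length of one the first EnRML sub-iteration coincides with the Kalman update, while a step length of 0.5 moves only halfway from the initial guess toward it, which is the damping that the iterative methods rely on.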
Moreover, in some cases where the prior mean (first guess) is far from the truth and the model is highly nonlinear, performing the full Gauss–Newton step may result in overcorrection and lead to inaccurate solutions. This deficiency can be alleviated by damping the changes in the early iterations [39, 40]. To this end, Emerick and Reynolds [19] proposed EnKF-MDA, which assimilates the same data multiple times with an inflated observation error covariance. They proved that for linear Gaussian cases, EnKF-MDA is equivalent to EnKF. For nonlinear cases, the traditional EnKF uses a full Gauss–Newton step with an average sensitivity estimated from the prior ensemble, which probably leads to a large Gauss–Newton correction [35]. EnKF-MDA can be regarded as performing multiple small corrections to damp the changes of the model and thus alleviate the effects of nonlinearity [34]. From the Bayesian perspective, the likelihood function of EnKF-MDA takes a recursive form:

$$p(x \mid y) \propto p(x) \prod_{l=1}^{N_{\mathrm{mda}}} p\left(y \mid \mathcal{H}[x_{l-1}]\right)^{1/\alpha_l}, \tag{18}$$

where $\sum_{l=1}^{N_{\mathrm{mda}}} 1/\alpha_l = 1$, $N_{\mathrm{mda}}$ is the total number of data assimilation iterations, and $\alpha_l$ can be chosen simply as $N_{\mathrm{mda}}$. The cost function $J$ can be expressed as

$$J[x^a_{l,j}] = \frac{1}{2}\left(x^a_{l,j} - x^f_{l,j}\right)^\top P_l^{-1} \left(x^a_{l,j} - x^f_{l,j}\right) + \frac{1}{2}\left(d + \sqrt{\alpha_l}\,\epsilon_{l,j} - \mathcal{H}[x^a_{l,j}]\right)^\top (\alpha_l R)^{-1} \left(d + \sqrt{\alpha_l}\,\epsilon_{l,j} - \mathcal{H}[x^a_{l,j}]\right), \tag{19}$$

where $d$ is the measurement without perturbations. The gradient of the cost function is then

$$\frac{\partial J[x^a_{l,j}]}{\partial x_{l,j}} = P_l^{-1}\left(x^a_{l,j} - x^f_{l,j}\right) - \mathcal{H}'[x^a_{l,j}]^\top (\alpha_l R)^{-1}\left(d + \sqrt{\alpha_l}\,\epsilon_{l,j} - \mathcal{H}[x^a_{l,j}]\right). \tag{20}$$

Similar to the derivation of the EnKF method, we set the gradient of the cost function to zero. Further, with the linearization assumption (9) and the ensemble gradient representation (10a), we have the update scheme

$$x^a_{l,j} = x^f_{l,j} + P_l\, \mathcal{H}'[x^f_{l,j}]^\top \left(\mathcal{H}'[x^f_{l,j}]\, P_l\, \mathcal{H}'[x^f_{l,j}]^\top + \alpha_l R\right)^{-1}\left(d + \sqrt{\alpha_l}\,\epsilon_{l,j} - \mathcal{H}[x^f_{l,j}]\right). \tag{21}$$

By introducing the tangent linear operator $H$, we obtain the analysis step of EnKF-MDA as

$$x^a_{l,j} = x^f_{l,j} + P_l H^\top \left(H P_l H^\top + \alpha_l R\right)^{-1}\left(d + \sqrt{\alpha_l}\,\epsilon_{l,j} - H x^f_{l,j}\right). \tag{22}$$

Given the prior distribution of the system state and ensemble observations with error covariance matrix $R$, the implementation of the three data assimilation methods is summarized in Table 1.

Table 1: Schematic comparison of EnKF, EnRML and EnKF-MDA.

a. Sampling step
   EnKF and EnKF-MDA: generate the initial ensemble of state vectors $\{x_{0,j}\}_{j=1}^{M}$.
   EnRML: (1) generate the initial ensemble of state vectors $\{x_{0,j}\}_{j=1}^{M}$; (2) estimate the mean $\bar{X}_0$ and the model error covariance $P_0$ of the ensemble.
b. Prediction step
   All methods: (i) propagate from the current state $l-1$ to the next iteration level $l$ based on the forward model ($l > 0$), $x^f_{l,j} = F[x^a_{l-1,j}]$.
   EnKF and EnKF-MDA: (ii) estimate the ensemble mean $\bar{X}$ and the model error covariance $P$ of the current iteration.
   EnRML: (ii) estimate the ensemble model gradient by (17).
c. Analysis step
   EnKF: update the state vector by (11) and return to step b until the convergence criteria are reached.
   EnKF-MDA: update the state vector by (22) and return to step b until the convergence criteria are reached.
   EnRML: update the state vector by (16) and return to step b until the convergence criteria are reached.

2.5. Remarks

From the derivations above, we apply the iterative form, the linearization assumption, and the ensemble gradient representation to obtain the derivative-free analysis scheme. Here, we discuss the effects of each.

1. An iterative form is necessary to obtain satisfactory inference results for inverse problems of nonlinear systems. However, the iterative EnKF performs several Gauss–Newton iterations with the full step, in which the same data are used at every iteration for stationary systems. In contrast, the other two methods conduct partial iterations, and their several sub-iterations are together only equivalent to the first iteration of the iterative EnKF illustrated in Section 2.2. The iterative EnKF may cause the samples to collapse in early iterations and lead to underestimation of the uncertainty, since the data are used repeatedly. Moreover, as the model error covariance for the next iteration becomes very small, the first term in the cost function (6), which prescribes the prior distribution, will be dominant. That means the data assimilation analysis no longer takes effect, and the update depends only on the prior afterward. In contrast, the EnRML method and EnKF-MDA iterate the update step through the Gauss–Newton algorithm and likelihood recursion, respectively, which can avoid data overuse and sample collapse.

2. The linearization assumption is introduced in our derivation for simplification. However, for strongly nonlinear systems, the linear assumption may significantly affect the optimal solution and lead to inaccurate inference results. EnKF takes a full update step to the optimal point, while the EnRML method and EnKF-MDA split one EnKF step into several small steps through the Gauss–Newton method and likelihood recursion, respectively. From this perspective, the EnRML method and EnKF-MDA can alleviate the influence of linearization and partly preserve the nonlinearity. Therefore, the EnRML method and EnKF-MDA are more suitable for the uncertainty quantification of stationary nonlinear systems than the iterative EnKF.

3. Another assumption, the ensemble gradient representation, is leveraged in the ensemble-based DA methods as presented in our derivations. That is, the model gradient is approximated by ensemble realizations and is not derived analytically. This may cause the propagated posterior distribution to depart from the exact Bayesian distribution [30].
While the impact of linearization can be alleviated through the Gauss–Newton algorithm or the likelihood recursion, the effects of the ensemble gradient representation are inevitable for ensemble methods unless the adjoint method is used to compute the analytic gradient.

Moreover, the parameters $\gamma$ and $N_{\mathrm{mda}}$, which control the length of the update step, are introduced in the EnRML method and EnKF-MDA, respectively. They can be constant or adapted based on a convergence criterion. Specifically, if the discrepancy in observation space is larger than that in the last iteration, we can reduce the step length by decreasing the step length parameter $\gamma$ or increasing the inflation parameter $N_{\mathrm{mda}}$. Conversely, if the discrepancy is reduced, we can increase $\gamma$ in EnRML or reduce $N_{\mathrm{mda}}$ in EnKF-MDA to speed up the convergence [34].

3. Scalar case

We first test the three ensemble-based Bayesian UQ approaches derived in Section 2 on a simple case used by Evensen [30]. In his work, the effects of the model gradient representation are investigated with a sufficiently large sample size. Here, we focus on the effects of limited ensemble sizes and evaluate the performance of the iterative ensemble methods with small sample sizes. In this case, the computing time for the forward model is negligible. Hence, we can obtain the Bayesian posterior from MCMC and from ensemble methods with a large sample size for comparison.

3.1. Problem statement

The forward model is defined as

$$\hat{y} = 1 + \sin(\pi x) + q, \tag{23}$$

where $x$ is the state variable, $\hat{y}$ is the model output in observation space, and $q$ is the added model error with $q \sim \mathcal{N}(0, 0.03^2)$. The goal is to quantify and reduce the uncertainty of $x$ and $\hat{y}$ with Bayesian approaches. The Bayesian UQ approach needs statistical information on the prior state and the observation data. We assume that the state variable $x$ and the data $y$ both follow Gaussian distributions, $x \sim \mathcal{N}(0, 0.1^2)$ and $y \sim \mathcal{N}(1, 0.1^2)$.
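To make this set-up concrete, the following sketch applies the EnKF-MDA analysis step of Eq. (22) to the scalar model of Eq. (23), with the covariances estimated from the ensemble as in Eqs. (12a) and (12b). It is only an illustration under stated assumptions: the second argument of each normal distribution is read as a squared standard deviation, the model error `q` is redrawn at every pass rather than estimated as part of the state, and the ensemble size, the number of assimilation passes `n_mda`, and the random seed are hypothetical. It is not the authors' released code.

```python
import numpy as np

def forward(x, q):
    """Scalar forward model of Eq. (23): y_hat = 1 + sin(pi x) + q."""
    return 1.0 + np.sin(np.pi * x) + q

def enkf_mda(x, d, obs_var, n_mda, rng, q_std=0.03):
    """EnKF-MDA for a scalar state, with alpha_l = n_mda for every pass."""
    alpha = float(n_mda)
    for _ in range(n_mda):
        q = rng.normal(0.0, q_std, size=x.size)    # assumption: fresh model error each pass
        y_hat = forward(x, q)
        dx = x - x.mean()
        dy = y_hat - y_hat.mean()
        c_xy = dx @ dy / (x.size - 1)              # scalar analogue of P H^T, Eq. (12a)
        c_yy = dy @ dy / (x.size - 1)              # scalar analogue of H P H^T, Eq. (12b)
        eps = rng.normal(0.0, np.sqrt(obs_var), size=x.size)
        innovation = d + np.sqrt(alpha) * eps - y_hat
        x = x + c_xy / (c_yy + alpha * obs_var) * innovation   # Eq. (22)
    return x

rng = np.random.default_rng(42)
x_prior = rng.normal(0.0, 0.1, size=100)           # prior ensemble, x ~ N(0, 0.1^2)
x_post = enkf_mda(x_prior, d=1.0, obs_var=0.1**2, n_mda=10, rng=rng)
print(x_prior.std(ddof=1), x_post.std(ddof=1))
```

Each pass inflates the observation error variance by `alpha`, so every individual correction is small; the sum of the inverse inflation factors equals one, as required by the condition stated with Eq. (18).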
We set the step length parameter $\gamma$ in the EnRML method to 0.5 and the inflation parameter $N_{\mathrm{mda}}$ in EnKF-MDA to 30 to obtain converged results. The performance of the ensemble methods is assessed with two different ensemble sizes, $10^6$ and $10^2$, and the effects of small ensemble sizes on the propagated uncertainties are investigated. We conduct Markov chain Monte Carlo (MCMC) sampling with $10^6$ samples using the DREAM algorithm [41] and consider the results as the gold standard. The probability density in this case is estimated from the samples through kernel density estimation (KDE) with a Gaussian kernel.

From the derivation in Section 2, it has been noted that two assumptions (linearization and ensemble gradient representation) are introduced to obtain the derivative-free analysis step. The model gradient can be represented by the analytic gradient or estimated from the ensemble samples. Although the analytic model gradient can give more accurate results than the ensemble gradient representation [30], it is not practical for complex models and is beyond the scope of this work. Here, we focus on the ensemble gradient and also investigate the effects of ensemble sizes on the ensemble gradient. The Python code for this test case is provided in a publicly available GitHub repository [42].

3.2. Results

We first evaluate the performance of each ensemble method with a large ensemble size $M = 10^6$. The joint and marginal PDFs obtained with the different ensemble methods are compared in Fig. 2 and Fig. 3, respectively. The results show that all three ensemble methods can capture the posterior mean. However, the iterative EnKF method is clearly overconfident in the mean value and significantly underestimates the posterior variance compared to the exact Bayesian distribution from MCMC. In contrast, both the EnRML method and EnKF-MDA provide estimates of the posterior distribution in good agreement with the benchmark data.
This is not surprising, since the iterative EnKF repeatedly uses the same data, while the EnRML method and EnKF-MDA avoid data overuse by introducing the Gauss–Newton method or the observation error inflation, as remarked in Section 2. To summarize, with a large ensemble size, the EnRML method and EnKF-MDA perform comparably to MCMC, while EnKF significantly underestimates the posterior uncertainty because the data are used repeatedly.

Figure 2: Joint PDFs with $10^6$ samples, compared among Bayes (MCMC), EnKF, EnRML, and EnKF-MDA for the scalar case. Panels: (a) MCMC, (b) EnKF, (c) EnRML, (d) EnKF-MDA.

Further, we explore the effects of a small ensemble size on this case and evaluate which method outperforms the others with limited samples. For many realistic cases, propagation with a large ensemble is computationally prohibitive, and ensemble methods can typically use fewer than $10^2$ samples to describe the statistical information. Therefore, we set the ensemble size to $10^2$, and the other set-ups are consistent with the previous case. The joint PDFs obtained with the different ensemble methods are shown in Fig. 4. It can be seen that with the limited ensemble size, the iterative EnKF method performs similarly to the case with the large ensemble size. Specifically, all samples converge to the observations and the posterior distribution has a low variance. By contrast, the EnRML method and EnKF-MDA not only capture the posterior mean value but also provide statistical information that indicates the uncertainty through the ensemble realizations.

Figure 3: Marginal PDFs for x with $10^6$ samples, compared among MCMC, EnKF, EnRML, and EnKF-MDA for the scalar case.

For better visualization, the marginal PDFs from the three ensemble methods with $10^2$ samples are compared in Fig. 5. We can see that the EnRML method and EnKF-MDA give satisfactory estimates of the uncertainty, while the mode value with EnKF is approximately three times higher than that with MCMC.
Generally, with a limited ensemble size, EnKF performs similarly to the large-ensemble case and underestimates the posterior variance. The performance of EnRML and EnKF-MDA is still satisfactory but inferior to that with larger ensemble sizes.

Not surprisingly, the uncertainty estimated with a limited ensemble size deviates slightly from the distribution obtained with MCMC. It is likely that the limited number of samples is insufficient for describing the necessary statistics. This may also increase the error in estimating the model gradient, especially for nonlinear models. For illustration, we present the prior joint PDFs with the large and small ensemble sizes in Fig. 6. The small ensemble is clearly not sufficient to describe the prior distribution. Additionally, we compare the model gradient estimated from the ensemble samples with the analytic gradient. The analytic gradient of this model is $\pi\cos(\pi x)$, and the ensemble gradient can be represented by $\left(\sin(\pi x) - \sin(\pi\bar{x})\right)/(x - \bar{x})$. The sine function can be approximated as a linear model in the range close to zero, and thus we assume that

$$\sin(\pi x) - \sin(\pi\bar{x}) \approx \sin\left(\pi(x - \bar{x})\right), \quad x \to 0, \tag{24}$$

and further

$$\lim_{x - \bar{x} \to 0} \frac{\sin\left(\pi(x - \bar{x})\right)}{x - \bar{x}} = \pi\cos\left(\pi(x - \bar{x})\right). \tag{25}$$

Based on these formulas, we can see that if the samples are close to $\bar{x}$ and the sample mean $\bar{x}$ is estimated as zero, the ensemble gradient can be approximated by the analytic one as

$$\frac{\sin(\pi x) - \sin(\pi\bar{x})}{x - \bar{x}} \overset{(24)}{\approx} \frac{\sin\left(\pi(x - \bar{x})\right)}{x - \bar{x}} \overset{(25)}{\approx} \pi\cos\left(\pi(x - \bar{x})\right) \overset{\bar{x} \approx 0}{\approx} \pi\cos(\pi x).$$

Figure 4: Joint PDFs with $10^2$ samples, compared among MCMC, EnKF, EnRML, and EnKF-MDA for the scalar case. Panels: (a) MCMC, (b) EnKF, (c) EnRML, (d) EnKF-MDA.

Figure 5: Marginal PDFs for x with $10^2$ samples, compared among MCMC, EnKF, EnRML, and EnKF-MDA for the scalar case.

Given that the model gradient does not follow a Gaussian distribution, we use the cosine kernel to estimate the probability density, as shown in Fig. 7.
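The comparison between the two gradients can be reproduced in a few lines. The sketch below draws prior samples, evaluates the per-sample ensemble gradient as the secant slope between each sample and the ensemble mean, and evaluates the analytic gradient at the same samples. The prior standard deviation of 0.1, the two ensemble sizes, and the function name `gradients` are assumptions chosen for illustration; the kernel density estimation step is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def gradients(x):
    """Per-sample ensemble (secant) gradient and analytic gradient of sin(pi x)."""
    x_bar = x.mean()
    ensemble = (np.sin(np.pi * x) - np.sin(np.pi * x_bar)) / (x - x_bar)
    analytic = np.pi * np.cos(np.pi * x)
    return ensemble, analytic

for M in (100, 10_000):                 # hypothetical small and larger ensemble sizes
    x = rng.normal(0.0, 0.1, size=M)    # prior samples, assuming standard deviation 0.1
    ens, ana = gradients(x)
    print(M, ens.mean(), ana.mean())
```

Because each ensemble value is a secant slope of the sine function, it cannot exceed π, whereas the analytic gradient is the tangent slope at the sample itself; the two agree closely only for samples near the ensemble mean.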
The difference between the analytic gradient and the ensemble gradient is noticeably reduced with the large ensemble size. The discontinuity in the case with $10^2$ samples is mainly due to the limited ensemble realizations, which are insufficient to represent the infinite-sample distribution. A small ensemble size can significantly reduce the computational cost but may lead to additional errors in the statistical description and the model gradient estimation. To ensure the error remains within an acceptable range, the choice of the ensemble size needs numerical tests. However, for highly nonlinear systems, the error in the model gradient estimation will not be reduced by a large ensemble size unless the analytic gradient is adopted. Also, localization techniques [43] can be introduced to reduce the sampling error and need future investigation.

Figure 6: Prior joint PDFs with large ($10^6$) and small ($10^2$) ensemble sizes for the scalar case. Panels: (a) $10^6$ samples, (b) $10^2$ samples.

4. RANS equation

CFD is of significant importance for many engineering applications to inform the process of design, analysis, and optimization. Considering the computational cost, the RANS model is still the primary tool to characterize turbulence behavior in CFD simulations. However, the unknown Reynolds stress term in the RANS equations is commonly modeled with different closure models under the Boussinesq assumption. This assumption introduces model uncertainty and reduces the confidence in the predictive performance. In this section, we apply the three ensemble-based data assimilation methods (EnKF, EnRML, and EnKF-MDA) to the RANS closure problem and evaluate their performance in quantifying and reducing the uncertainty of the predicted velocity by incorporating high-fidelity data.
Figure 7: Comparison of the analytic gradient and the ensemble gradient, shown as probability density versus model gradient. The light/pink shaded region represents the analytic gradient and the dark/blue shaded region represents the ensemble gradient. (a) $10^6$ samples; (b) $10^2$ samples.

4.1. Problem statement

The RANS equations can be expressed as

$$\frac{\partial U_i}{\partial x_i} = 0, \tag{26a}$$

$$\frac{\partial U_i}{\partial t} + \frac{\partial (U_i U_j)}{\partial x_j} = -\frac{\partial P}{\partial x_i} + \frac{1}{Re}\frac{\partial^2 U_i}{\partial x_j \partial x_j} - \frac{\partial \overline{u_i' u_j'}}{\partial x_j}, \tag{26b}$$

where $U$ and $P$ are the dimensionless velocity and pressure, respectively, and $Re$ is the Reynolds number. In the momentum equation (26b), $\tau = \overline{u_i' u_j'}$ is the Reynolds stress, which is the main source of uncertainty in RANS simulations. We regard the Reynolds stress from a RANS simulation coupled with the linear eddy-viscosity model as the baseline. Further, we introduce a discrepancy term $\delta$ representing the uncertainty into the baseline as

$$\tau = \tau^{\mathrm{RANS}} + \delta. \tag{27}$$

Thus, we can quantify the uncertainty in the predicted velocity with the three ensemble-based DA methods by incorporating available observation data.

4.2. Methodology

The data assimilation framework to quantify and reduce the RANS model-form uncertainty associated with the Reynolds stress was proposed by Xiao et al. [8]. Here, we give a brief introduction to this methodology, and the reader is referred to [8] for further details. To quantify the uncertainty within the Reynolds stress, we first transform the Reynolds stress tensor into several scalar fields.
Specifically, the Reynolds stress tensor can be expressed as

$$\tau = 2k\left(\frac{1}{3}I + a\right) = 2k\left(\frac{1}{3}I + V \Lambda V^\top\right),$$
sha1_base64="/J7wdTLRRdTvqpShvWZ6V9mt6r4=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbRU0lE0GPRi8cq9gPaUDbbSbt0swm7G7GE/gMvHhTx6j/y5r9x2+agrQ8GHu/NMDMvSATXxnW/ncLK6tr6RnGztLW9s7tX3j9o6jhVDBssFrFqB1Sj4BIbhhuB7UQhjQKBrWB0M/Vbj6g0j+WDGSfoR3QgecgZNVa6fzrtlStu1Z2BLBMvJxXIUe+Vv7r9mKURSsME1brjuYnxM6oMZwInpW6qMaFsRAfYsVTSCLWfzS6dkBOr9EkYK1vSkJn6eyKjkdbjKLCdETVDvehNxf+8TmrCKz/jMkkNSjZfFKaCmJhM3yZ9rpAZMbaEMsXtrYQNqaLM2HBKNgRv8eVl0jyvem7Vu7uo1K7zOIpwBMdwBh5cQg1uoQ4NYBDCM7zCmzNyXpx352PeWnDymUP4A+fzB0bAjS0=</latexit><latexit sha1_base64="/J7wdTLRRdTvqpShvWZ6V9mt6r4=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbRU0lE0GPRi8cq9gPaUDbbSbt0swm7G7GE/gMvHhTx6j/y5r9x2+agrQ8GHu/NMDMvSATXxnW/ncLK6tr6RnGztLW9s7tX3j9o6jhVDBssFrFqB1Sj4BIbhhuB7UQhjQKBrWB0M/Vbj6g0j+WDGSfoR3QgecgZNVa6fzrtlStu1Z2BLBMvJxXIUe+Vv7r9mKURSsME1brjuYnxM6oMZwInpW6qMaFsRAfYsVTSCLWfzS6dkBOr9EkYK1vSkJn6eyKjkdbjKLCdETVDvehNxf+8TmrCKz/jMkkNSjZfFKaCmJhM3yZ9rpAZMbaEMsXtrYQNqaLM2HBKNgRv8eVl0jyvem7Vu7uo1K7zOIpwBMdwBh5cQg1uoQ4NYBDCM7zCmzNyXpx352PeWnDymUP4A+fzB0bAjS0=</latexit><latexit sha1_base64="/J7wdTLRRdTvqpShvWZ6V9mt6r4=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbRU0lE0GPRi8cq9gPaUDbbSbt0swm7G7GE/gMvHhTx6j/y5r9x2+agrQ8GHu/NMDMvSATXxnW/ncLK6tr6RnGztLW9s7tX3j9o6jhVDBssFrFqB1Sj4BIbhhuB7UQhjQKBrWB0M/Vbj6g0j+WDGSfoR3QgecgZNVa6fzrtlStu1Z2BLBMvJxXIUe+Vv7r9mKURSsME1brjuYnxM6oMZwInpW6qMaFsRAfYsVTSCLWfzS6dkBOr9EkYK1vSkJn6eyKjkdbjKLCdETVDvehNxf+8TmrCKz/jMkkNSjZfFKaCmJhM3yZ9rpAZMbaEMsXtrYQNqaLM2HBKNgRv8eVl0jyvem7Vu7uo1K7zOIpwBMdwBh5cQg1uoQ4NYBDCM7zCmzNyXpx352PeWnDymUP4A+fzB0bAjS0=</latexit><latexit 
sha1_base64="/J7wdTLRRdTvqpShvWZ6V9mt6r4=">AAAB6XicbVBNS8NAEJ3Ur1q/qh69LBbRU0lE0GPRi8cq9gPaUDbbSbt0swm7G7GE/gMvHhTx6j/y5r9x2+agrQ8GHu/NMDMvSATXxnW/ncLK6tr6RnGztLW9s7tX3j9o6jhVDBssFrFqB1Sj4BIbhhuB7UQhjQKBrWB0M/Vbj6g0j+WDGSfoR3QgecgZNVa6fzrtlStu1Z2BLBMvJxXIUe+Vv7r9mKURSsME1brjuYnxM6oMZwInpW6qMaFsRAfYsVTSCLWfzS6dkBOr9EkYK1vSkJn6eyKjkdbjKLCdETVDvehNxf+8TmrCKz/jMkkNSjZfFKaCmJhM3yZ9rpAZMbaEMsXtrYQNqaLM2HBKNgRv8eVl0jyvem7Vu7uo1K7zOIpwBMdwBh5cQg1uoQ4NYBDCM7zCmzNyXpx352PeWnDymUP4A+fzB0bAjS0=</latexit> 2C (2-component x<latexit sha1_base64="f2yzimwbR/Dgjzp6tZ360fHRqNI=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi8cW7Ae0oWy2k3btZhN2N2IJ/QVePCji1Z/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4dua3H1FpHst7M0nQj+hQ8pAzaqzUeOqXK27VnYOsEi8nFchR75e/eoOYpRFKwwTVuuu5ifEzqgxnAqelXqoxoWxMh9i1VNIItZ/ND52SM6sMSBgrW9KQufp7IqOR1pMosJ0RNSO97M3E/7xuasJrP+MySQ1KtlgUpoKYmMy+JgOukBkxsYQyxe2thI2ooszYbEo2BG/55VXSuqh6btVrXFZqN3kcRTiBUzgHD66gBndQhyYwQHiGV3hzHpwX5935WLQWnHzmGP7A+fwB5jmM/A==</latexit><latexit sha1_base64="f2yzimwbR/Dgjzp6tZ360fHRqNI=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi8cW7Ae0oWy2k3btZhN2N2IJ/QVePCji1Z/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4dua3H1FpHst7M0nQj+hQ8pAzaqzUeOqXK27VnYOsEi8nFchR75e/eoOYpRFKwwTVuuu5ifEzqgxnAqelXqoxoWxMh9i1VNIItZ/ND52SM6sMSBgrW9KQufp7IqOR1pMosJ0RNSO97M3E/7xuasJrP+MySQ1KtlgUpoKYmMy+JgOukBkxsYQyxe2thI2ooszYbEo2BG/55VXSuqh6btVrXFZqN3kcRTiBUzgHD66gBndQhyYwQHiGV3hzHpwX5935WLQWnHzmGP7A+fwB5jmM/A==</latexit><latexit 
sha1_base64="f2yzimwbR/Dgjzp6tZ360fHRqNI=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi8cW7Ae0oWy2k3btZhN2N2IJ/QVePCji1Z/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4dua3H1FpHst7M0nQj+hQ8pAzaqzUeOqXK27VnYOsEi8nFchR75e/eoOYpRFKwwTVuuu5ifEzqgxnAqelXqoxoWxMh9i1VNIItZ/ND52SM6sMSBgrW9KQufp7IqOR1pMosJ0RNSO97M3E/7xuasJrP+MySQ1KtlgUpoKYmMy+JgOukBkxsYQyxe2thI2ooszYbEo2BG/55VXSuqh6btVrXFZqN3kcRTiBUzgHD66gBndQhyYwQHiGV3hzHpwX5935WLQWnHzmGP7A+fwB5jmM/A==</latexit><latexit sha1_base64="f2yzimwbR/Dgjzp6tZ360fHRqNI=">AAAB6HicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi8cW7Ae0oWy2k3btZhN2N2IJ/QVePCji1Z/kzX/jts1BWx8MPN6bYWZekAiujet+O4W19Y3NreJ2aWd3b/+gfHjU0nGqGDZZLGLVCahGwSU2DTcCO4lCGgUC28H4dua3H1FpHst7M0nQj+hQ8pAzaqzUeOqXK27VnYOsEi8nFchR75e/eoOYpRFKwwTVuuu5ifEzqgxnAqelXqoxoWxMh9i1VNIItZ/ND52SM6sMSBgrW9KQufp7IqOR1pMosJ0RNSO97M3E/7xuasJrP+MySQ1KtlgUpoKYmMy+JgOukBkxsYQyxe2thI2ooszYbEo2BG/55VXSuqh6btVrXFZqN3kcRTiBUzgHD66gBndQhyYwQHiGV3hzHpwX5935WLQWnHzmGP7A+fwB5jmM/A==</latexit> axisymmetric) x(⇠ , ⌘ ) <latexit sha1_base64="rBtAwAhUVBk4xkhDOKA7JqV8k4U=">AAAB8nicbVBNSwMxEM3Wr1q/qh69BItQQcquCHosevFYwX5AdynZNNuGZpMlmZWWpT/DiwdFvPprvPlvTNs9aOuDgcd7M8zMCxPBDbjut1NYW9/Y3Cpul3Z29/YPyodHLaNSTVmTKqF0JySGCS5ZEzgI1kk0I3EoWDsc3c389hPThiv5CJOEBTEZSB5xSsBK3XHVH/MLnwE575Urbs2dA68SLycVlKPRK3/5fUXTmEmgghjT9dwEgoxo4FSwaclPDUsIHZEB61oqScxMkM1PnuIzq/RxpLQtCXiu/p7ISGzMJA5tZ0xgaJa9mfif100hugkyLpMUmKSLRVEqMCg8+x/3uWYUxMQSQjW3t2I6JJpQsCmVbAje8surpHVZ89ya93BVqd/mcRTRCTpFVeSha1RH96iBmogihZ7RK3pzwHlx3p2PRWvByWeO0R84nz9wnpCw</latexit><latexit 
sha1_base64="rBtAwAhUVBk4xkhDOKA7JqV8k4U=">AAAB8nicbVBNSwMxEM3Wr1q/qh69BItQQcquCHosevFYwX5AdynZNNuGZpMlmZWWpT/DiwdFvPprvPlvTNs9aOuDgcd7M8zMCxPBDbjut1NYW9/Y3Cpul3Z29/YPyodHLaNSTVmTKqF0JySGCS5ZEzgI1kk0I3EoWDsc3c389hPThiv5CJOEBTEZSB5xSsBK3XHVH/MLnwE575Urbs2dA68SLycVlKPRK3/5fUXTmEmgghjT9dwEgoxo4FSwaclPDUsIHZEB61oqScxMkM1PnuIzq/RxpLQtCXiu/p7ISGzMJA5tZ0xgaJa9mfif100hugkyLpMUmKSLRVEqMCg8+x/3uWYUxMQSQjW3t2I6JJpQsCmVbAje8surpHVZ89ya93BVqd/mcRTRCTpFVeSha1RH96iBmogihZ7RK3pzwHlx3p2PRWvByWeO0R84nz9wnpCw</latexit><latexit sha1_base64="rBtAwAhUVBk4xkhDOKA7JqV8k4U=">AAAB8nicbVBNSwMxEM3Wr1q/qh69BItQQcquCHosevFYwX5AdynZNNuGZpMlmZWWpT/DiwdFvPprvPlvTNs9aOuDgcd7M8zMCxPBDbjut1NYW9/Y3Cpul3Z29/YPyodHLaNSTVmTKqF0JySGCS5ZEzgI1kk0I3EoWDsc3c389hPThiv5CJOEBTEZSB5xSsBK3XHVH/MLnwE575Urbs2dA68SLycVlKPRK3/5fUXTmEmgghjT9dwEgoxo4FSwaclPDUsIHZEB61oqScxMkM1PnuIzq/RxpLQtCXiu/p7ISGzMJA5tZ0xgaJa9mfif100hugkyLpMUmKSLRVEqMCg8+x/3uWYUxMQSQjW3t2I6JJpQsCmVbAje8surpHVZ89ya93BVqd/mcRTRCTpFVeSha1RH96iBmogihZ7RK3pzwHlx3p2PRWvByWeO0R84nz9wnpCw</latexit><latexit sha1_base64="rBtAwAhUVBk4xkhDOKA7JqV8k4U=">AAAB8nicbVBNSwMxEM3Wr1q/qh69BItQQcquCHosevFYwX5AdynZNNuGZpMlmZWWpT/DiwdFvPprvPlvTNs9aOuDgcd7M8zMCxPBDbjut1NYW9/Y3Cpul3Z29/YPyodHLaNSTVmTKqF0JySGCS5ZEzgI1kk0I3EoWDsc3c389hPThiv5CJOEBTEZSB5xSsBK3XHVH/MLnwE575Urbs2dA68SLycVlKPRK3/5fUXTmEmgghjT9dwEgoxo4FSwaclPDUsIHZEB61oqScxMkM1PnuIzq/RxpLQtCXiu/p7ISGzMJA5tZ0xgaJa9mfif100hugkyLpMUmKSLRVEqMCg8+x/3uWYUxMQSQjW3t2I6JJpQsCmVbAje8surpHVZ89ya93BVqd/mcRTRCTpFVeSha1RH96iBmogihZ7RK3pzwHlx3p2PRWvByWeO0R84nz9wnpCw</latexit> 1C 2 1 (1-component) (a) Barycentric coordinate (b) Natural coordinate Figure 8: Mapping between the barycentric coordinate to the natural coordinate where k is the turbulent kinetic energy, indicating the magnitude of the Reynolds stress, I is the second order identity tensor, a is the anisotropy tensor; V = [v ; v ; v ], and  = diag[ ;  ;  ] with  + + = 0 are 1 2 3 1 2 3 1 2 3 the eigenvector and eigenvalue of a, respectively, which represents the shape and 
orientation of $\tau$. Afterwards, the eigenvalues $\lambda_1$, $\lambda_2$, $\lambda_3$ are projected to a barycentric coordinate as
\[
C_1 = \lambda_1 - \lambda_2, \qquad C_2 = 2(\lambda_2 - \lambda_3), \qquad C_3 = 3\lambda_3 + 1,
\]
with $C_1 + C_2 + C_3 = 1$ [44]. The barycentric coordinate is shown in Fig. 8a. To facilitate the parameterization, the barycentric coordinate is transformed to the natural coordinate $\boldsymbol{\xi} = (\xi, \eta)$ by placing the triangle in a Cartesian coordinate as shown in Fig. 8b. The location of any point in the triangle can be expressed as a combination of those of the three vertices. That is,
\[
\boldsymbol{\xi} = \boldsymbol{\xi}_{1c} C_1 + \boldsymbol{\xi}_{2c} C_2 + \boldsymbol{\xi}_{3c} C_3, \tag{28}
\]
where $\boldsymbol{\xi}_{1c}$, $\boldsymbol{\xi}_{2c}$, and $\boldsymbol{\xi}_{3c}$ are the coordinates of the three vertices of the triangle.

In summary, we represent the baseline Reynolds stress $\tau^{RANS}$ with three variables $k^{RANS}$, $\xi^{RANS}$, and $\eta^{RANS}$ through eigendecomposition and coordinate conversion. Further, the additive uncertainties $\delta^{k}$, $\delta^{\xi}$, and $\delta^{\eta}$ can be injected into these projected variables as
\[
\log k(x) = \log k^{RANS}(x) + \delta^{k}(x), \tag{29a}
\]
\[
\xi(x) = \xi^{RANS}(x) + \delta^{\xi}(x), \tag{29b}
\]
\[
\eta(x) = \eta^{RANS}(x) + \delta^{\eta}(x), \tag{29c}
\]
where the logarithm on $k$ ensures non-negativity. The dimension of the variables $\log k(x)$, $\xi(x)$, and $\eta(x)$ is consistent with the mesh grid. Inferring the entire field from very sparse observations significantly increases the ill-posedness of the problem. Hence, it is necessary to reduce the dimension of the state space. In this case, we leverage the Karhunen–Loève (KL) expansion with truncated orthogonal modes to represent the field for each quantity to be inferred. Concretely, the discrepancy variables $\delta^{k}$, $\delta^{\xi}$, and $\delta^{\eta}$ are constructed as random fields subject to the zero-mean Gaussian process $\mathcal{GP}(0, K)$. The kernel function $K$ indicates the covariance at two locations $x$ and $x'$ as
\[
K(x, x') = \sigma(x)\,\sigma(x') \exp\left(-\frac{|x - x'|^{2}}{l^{2}}\right). \tag{30}
\]
In the formula above, $\sigma(x)$ is the variance field, which reflects the region where large discrepancy is expected, and $l$ is the characteristic length.
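The eigendecomposition and barycentric projection described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the descending ordering of the eigenvalues, and the Cartesian positions of the triangle vertices are assumptions made here for the example.

```python
import numpy as np

# Hypothetical vertex positions of the barycentric triangle in a Cartesian
# frame (rows: 1C, 2C, 3C); the paper does not list these values.
VERTICES = np.array([[1.0, 0.0],
                     [0.0, 0.0],
                     [0.5, np.sqrt(3.0) / 2.0]])


def barycentric_from_stress(tau):
    """Split a symmetric 3x3 Reynolds stress into k, (C1, C2, C3), and V."""
    k = 0.5 * np.trace(tau)                  # turbulent kinetic energy
    a = tau / (2.0 * k) - np.eye(3) / 3.0    # anisotropy from tau = 2k(I/3 + a)
    lam, vecs = np.linalg.eigh(a)            # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]            # assumed: lambda1 >= lambda2 >= lambda3
    lam, vecs = lam[order], vecs[:, order]
    c = np.array([lam[0] - lam[1],
                  2.0 * (lam[1] - lam[2]),
                  3.0 * lam[2] + 1.0])
    return k, c, vecs


def triangle_position(c, vertices=VERTICES):
    """Eq. (28): weighted combination of the three vertex positions."""
    return c @ vertices


# One-component limit: all turbulent energy in a single direction.
k, c, _ = barycentric_from_stress(np.diag([2.0, 0.0, 0.0]))
point = triangle_position(c)
```

In this one-component limit the weights reduce to $(C_1, C_2, C_3) = (1, 0, 0)$, so the point coincides with the 1C vertex, while an isotropic stress maps to the 3C vertex.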
The KL modes take the form $\phi_i(x) = \sqrt{\hat{\lambda}_i}\,\hat{\phi}_i(x)$, where $\hat{\lambda}_i$ and $\hat{\phi}_i$ are the eigenvalues and eigenvectors of the kernel $K$, respectively, computed based on the Fredholm integral equation
\[
\int K(x, x')\,\hat{\phi}(x')\,dx' = \hat{\lambda}\,\hat{\phi}(x). \tag{31}
\]
This choice of KL modes for the discrepancy fields $\delta^{k}$, $\delta^{\xi}$, $\delta^{\eta}$ leads to a KL expansion. That is, the discrepancy variables can be constructed from these deterministic KL modes $\phi_i(x)$ and zero-mean, unit-variance random variables $\omega_i$ as
\[
\delta^{k}(x) = \sum_{i=1}^{\infty} \omega_i^{k}\,\phi_i(x), \qquad
\delta^{\xi}(x) = \sum_{i=1}^{\infty} \omega_i^{\xi}\,\phi_i(x), \qquad
\delta^{\eta}(x) = \sum_{i=1}^{\infty} \omega_i^{\eta}\,\phi_i(x). \tag{32}
\]
With $\omega$ and the KL modes $\phi(x)$, we can reconstruct the field of each discrepancy quantity and recover the random field of the Reynolds stress tensor.

The Reynolds stress representation and dimension reduction presented above make it practical to quantify and reduce the uncertainty in the RANS model by incorporating observation data, i.e., direct numerical simulation (DNS) results. From a Bayesian perspective, random noise $\epsilon \sim \mathcal{N}(0, \sigma_{obs}^{2})$ is added to the time-averaged DNS data $y$ to allow overlap between the likelihood and the prior distribution. Here $\sigma_{obs}$ is the standard deviation of the observation noise, indicating the noise level. We take the velocity as the state, augmented with the KL coefficients. As a result, we can adopt the iterative ensemble methods (EnKF, EnRML, and EnKF-MDA) to quantify and reduce the uncertainty in velocity with prior samples of the KL coefficients and the observations.

In summary, the procedure of the RANS model-form uncertainty quantification framework is presented below:

1. Preprocessing step:
(1) Perform the RANS simulation to obtain $\tau^{RANS}$ as the baseline.
(2) Project $\tau^{RANS}$ onto the fields of $k^{RANS}$, $\xi^{RANS}$, and $\eta^{RANS}$.
(3) Conduct the KL expansion to generate the KL basis sets or modes $\{\phi_i(x)\}_{i=1}^{m}$, where $m$ is the number of truncated modes.
(4) Generate the initial value of $\omega$ from a zero-mean, unit-variance normal distribution.

2. Data assimilation step:
(a) Recover the discrepancy fields $\delta^{k}$, $\delta^{\xi}$, and $\delta^{\eta}$ with the coefficients $\omega$ and basis sets $\phi(x)$ based on Eq. (32).
(b) Reconstruct the ensemble realizations of $\tau$ through the mapping $(k, \xi, \eta) \rightarrow \tau$ and solve the RANS equation to obtain the velocity field given each realization of $\tau$.
(c) Perform the Bayesian analysis with the data assimilation technique to reduce the uncertainty of velocity by incorporating time-averaged DNS data.
(d) Return to step (a) until the convergence criterion or the maximum iteration number is reached.

Figure 9: The structured mesh used for the simulation of flow over the periodic hills.

4.3. Case setup

The test case is turbulent flow over periodic hills initially proposed by Fröhlich et al. [45]. The Reynolds number based on the bulk velocity and height of crest is 2800. We use the DNS data from [46] as the benchmark. The Launder–Sharma RANS model [47] is one of the classical low-Reynolds-number $k$–$\varepsilon$ models and is extensively used in industrial applications. Hence, we use the RANS simulation with this model as the baseline. The periodic boundary condition is imposed on the inlet, and the no-slip boundary condition is applied on the wall. A structured mesh is constructed with 50 cells in the stream-wise direction and 30 cells in the wall-normal direction, as shown in Fig. 9. Despite the coarse mesh, the dimensionless distance $y^{+}$ between the first cell and the walls is around 1, which meets the requirement of the Launder–Sharma turbulence model.

As for the data assimilation setup, the number of KL modes for each of $k$, $\xi$, and $\eta$ is set to 8. The ensemble size is 50. The length scale is set to a constant value of 1 for simplicity. The standard deviation of the observation noise $\sigma_{obs}$ is set as 10% of the truth. We take 18 observations to quantify and reduce the uncertainty in velocity. The locations are marked in Fig. 10. The step parameter in the EnRML method is chosen as 0.5, and the inflation parameter $N_{mda}$ in EnKF-MDA is set as 50 to obtain converged results based on our calibration study.
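With the settings above (8 modes per field, 50 samples, unit length scale), prior samples of one discrepancy field in Eq. (32) could be generated along the following lines. This is a sketch under stated assumptions rather than the code used in the study: the rectangular stand-in for the periodic-hill mesh, the uniform variance field `sigma`, and the names `kl_modes` and `sample_discrepancy` are hypothetical, and the Fredholm eigenproblem of Eq. (31) is approximated by an eigendecomposition of the kernel matrix at the cell centres, assuming uniform cell weights.

```python
import numpy as np


def kl_modes(points, sigma, length_scale, n_modes):
    """Leading discrete KL modes sqrt(lambda_i) * phi_i of the kernel in Eq. (30)."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff**2, axis=-1)
    kernel = np.outer(sigma, sigma) * np.exp(-dist2 / length_scale**2)
    eigvals, eigvecs = np.linalg.eigh(kernel)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_modes]     # keep the largest ones
    eigvals = np.clip(eigvals[order], 0.0, None)    # guard against round-off
    return np.sqrt(eigvals) * eigvecs[:, order]     # shape (n_points, n_modes)


def sample_discrepancy(modes, n_samples, rng):
    """Draw omega ~ N(0, 1) and assemble the truncated sum of Eq. (32)."""
    omega = rng.standard_normal((modes.shape[1], n_samples))
    return omega, modes @ omega


# Hypothetical 50 x 30 grid of cell centres on a rectangle (not the real mesh).
rng = np.random.default_rng(0)
xc = np.linspace(0.0, 9.0, 50)
yc = np.linspace(0.0, 3.0, 30)
points = np.array([(x, y) for x in xc for y in yc])
sigma = np.ones(len(points))                        # assumed uniform variance field

modes = kl_modes(points, sigma, length_scale=1.0, n_modes=8)
omega, delta = sample_discrepancy(modes, n_samples=50, rng=rng)
```

Each column of `delta` is one realization of a discrepancy field on the grid, and the 8 rows of `omega` are the coefficients that would be appended to the state vector and updated by the ensemble method.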
For this case, MCMC sampling is impractical for verifying the estimated posterior uncertainty, due to the high dimensionality of the state space and the high cost of the numerical simulation.

The built-in solver simpleFoam in OpenFOAM is used to run the RANS simulation and obtain the baseline/prior mean. The forward solver tauFoam is developed based on simpleFoam to propagate the Reynolds stress to velocity. That is, the forward solver computes velocity with the given Reynolds stress field rather than using turbulence models.

4.4. Results

By solving the RANS equations given the randomized Reynolds stresses, we can obtain the prior uncertainty in the propagated velocity. The plots of the prior stream-wise velocity are shown in Fig. 10. It can be seen that the space spanned by the ensemble realizations can indicate the statistical information. Also, the sample mean fits the RANS results well. That is reasonable since the random field is constructed by perturbing the baseline from the RANS simulation.

Figure 10: Prior ensemble realization of stream-wise velocity profiles at 18 locations, in comparison to DNS and baseline. The location of observation is indicated with crosses (×). Legend: DNS, baseline, samples, sample mean, observations. Axes: $x/H;\ 2U_x/U_b + x/H$ (horizontal), $y/H$ (vertical).

Further, we perform data assimilation analysis with EnKF, the EnRML method, and EnKF-MDA to quantify uncertainties in the velocity field by incorporating the observations at the specific locations. The results with different data assimilation schemes are presented in Fig. 11. It is noticeable that with EnKF the posterior mean fits the DNS results well. However, all samples converge to the mean value, and the variance of the posterior becomes very low. By contrast, the EnRML method can give an estimation of the uncertainty, and the mean value also has a good fit with DNS data.
EnKF-MDA can also preserve the sample variance and improve the data fit, but the sample mean is relatively inferior compared to the other two methods. Based on our derivation and evaluation in the former sections, the collapse of the EnKF ensemble is likely due to EnKF repeatedly using the same DNS data with full Gauss–Newton steps, while the EnRML method and EnKF-MDA can be considered to perform one EnKF step via several small analysis steps.

Figure 11: Data assimilation results of stream-wise velocity with (a) EnKF, (b) EnRML, and (c) EnKF-MDA in comparison to baseline and DNS for the turbulent flow in a periodic hill. Legend: DNS, baseline, samples, sample mean, observations. Axes: $x/H;\ 2U_x/U_b + x/H$ (horizontal), $y/H$ (vertical).

Here we present the comparison of the 95% credible interval between the prior and posterior with the three data assimilation methods. The results are shown in Fig. 12. It is noticeable that the posterior uncertainty with EnKF is underestimated and too much confidence is placed in the mean value. With the EnRML method and EnKF-MDA, we can have an estimation of the uncertainty indicated by samples. Besides, the uncertainty in the upper channel estimated by the EnRML method and EnKF-MDA is similar to the prior. That is reasonable since the variance $\sigma$ in this region is low [8], and no observation is available there either. Hence, the posterior should not change much from the prior distribution. Based on the overall performance, the iterative EnKF loses the statistical information due to data overuse, while the other two methods can provide reasonable uncertainty information.

Figure 12: The 95% credible intervals of the prior (light/pink shaded region) and posterior (dark/blue shaded region) samples of stream-wise velocity profiles for the turbulent flow in a periodic hill. Panels: (a) EnKF, (b) EnRML, (c) EnKF-MDA. Legend: Prior, Posterior, DNS. Axes: $x/H;\ 2U_x/U_b + x/H$ (horizontal), $y/H$ (vertical).

We also compare the three data assimilation methods in convergence speed. The convergence criteria for the three methods are different. Concretely, EnKF and the EnRML method are considered to be converged when the iteration residual in data misfit between two adjacent iterations is less than $1 \times 10^{\ldots}$, while EnKF-MDA has to reach the predefined maximum iteration number $N_{mda}$. In our numerical tests, EnKF does not converge and stops at the maximum iteration number of 100. Fig. 13 presents the evolution of the iteration residual for the EnRML method and the convergence plot of the maximum iteration number $N_{mda}$ for EnKF-MDA with 50 samples. It can be seen that the EnRML method converges in 8 iterations, while EnKF-MDA needs a maximum iteration number $N_{mda}$ of at least 50 to converge, which suggests that EnRML outperforms EnKF-MDA in convergence speed.

Figure 13: Convergence plot of (a) EnRML and (b) EnKF-MDA with 50 samples.

Further, we conduct the data assimilation with 200, 800, and 3200 samples to investigate the effects of sample size. We use the relative data misfit between the posterior mean $\mathcal{H}[X]$ and the truth $U^{DNS}$, normalized by that of the prior, to evaluate the posterior results. It can be formulated as
\[
\frac{\left\| \mathcal{H}[X_i] - U^{DNS} \right\|_{L2}}{\left\| \mathcal{H}[X_0] - U^{DNS} \right\|_{L2}}. \tag{33}
\]
Also, the relative standard deviation of the posterior samples is computed in a similar manner to evaluate the reduction of uncertainty after assimilating observation data. The results with different sample sizes are summarized in Table 2.
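The two diagnostics reported in Table 2 can be computed as sketched below. The misfit follows Eq. (33) directly; the relative standard deviation is only described as being computed "in a similar manner", so the ratio of L2 norms of the sample standard deviation used here is an assumption, as are the function names and the synthetic stand-in data.

```python
import numpy as np


def relative_misfit(hx_post, hx_prior, u_dns):
    """Eq. (33): misfit of the posterior mean, normalized by that of the prior."""
    post = np.linalg.norm(hx_post.mean(axis=1) - u_dns)
    prior = np.linalg.norm(hx_prior.mean(axis=1) - u_dns)
    return post / prior


def relative_std(hx_post, hx_prior):
    """Assumed analogue: L2 norm of the sample std, posterior over prior."""
    post = np.linalg.norm(hx_post.std(axis=1, ddof=1))
    prior = np.linalg.norm(hx_prior.std(axis=1, ddof=1))
    return post / prior


# Synthetic stand-in data: 18 observed values, 50 samples per ensemble.
rng = np.random.default_rng(1)
u_dns = np.zeros(18)
hx_prior = 2.0 + rng.standard_normal((18, 50))
hx_post = 1.0 + 0.5 * (hx_prior - 2.0)      # halve both the offset and the spread

misfit = relative_misfit(hx_post, hx_prior, u_dns)
spread = relative_std(hx_post, hx_prior)
```

Both arrays are laid out as (observation, sample), so the ensemble mean and spread are taken along the second axis. Values below 1 indicate that the update moved the mean towards the data and reduced the ensemble spread relative to the prior.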
It can be seen that EnKF achieves the best data fit among the three methods but underestimates the variance of the posterior samples. EnRML and EnKF-MDA not only improve the data misfit but also provide the statistical information. Comparing EnRML and EnKF-MDA, EnRML provides a better data match and preserves a larger variance of the posterior. With a large sample size, the posterior variance increases for all three methods since more samples can cover more statistical information. However, the data misfit is inferior to that with the small sample size. That is likely due to the capping error. When we perform the Bayesian update, some samples may lead to an updated $(\xi, \eta)$ outside the square $[-1, 1] \times [-1, 1]$ shown in Fig. 8b. To ensure physical realizability, we bound any sample outside the square by fixing it at the edges. With large ensembles, more samples may jump out of the physical range and need to be capped, which likely causes large errors between the posterior mean and the data. Better methods to ensure the physical realizability need to be investigated but are outside the scope of this work.

Table 2: Summary of the relative data misfit and the relative standard deviation of posterior samples with different ensemble sizes.

                                EnKF      EnRML     EnKF-MDA
Sample size N = 50
  relative data misfit          15.2%     23.2%     30.0%
  relative std of ensemble      16.1%     36.9%     30.5%
Sample size N = 200
  relative data misfit          21.7%     29.2%     36.6%
  relative std of ensemble      17.5%     40.9%     31.8%
Sample size N = 800
  relative data misfit          33.7%     35.9%     46.1%
  relative std of ensemble      22.5%     44.4%     36.2%
Sample size N = 3200
  relative data misfit          37.9%     38.2%     49.3%
  relative std of ensemble      23.2%     49.2%     38.8%

5. Conclusion

This paper evaluates the performance of three widely used iterative ensemble methods (EnKF, EnRML, and EnKF-MDA) for UQ problems in steady cases. We summarize the derivations of these ensemble methods from an optimization viewpoint. The iterative EnKF method performs several full Gauss–Newton steps during which the same data is repeatedly used for the stationary scenario. The EnRML method and EnKF-MDA iteratively approach the optimal point with the Gauss–Newton method or likelihood recursion, avoiding the data overuse and alleviating the effects of the linearization approximation simultaneously. Using a scalar case, we investigate the effects of small ensemble sizes. The results show that the EnRML method and EnKF-MDA can provide a satisfactory estimation of the posterior uncertainty with a small ensemble size, although the estimate remains inferior to that with a large ensemble size. This is because a small ensemble is not sufficient to describe the statistical information and increases the error in the estimation of the model gradient. This deficiency may be alleviated by using the localization technique, which will be further investigated in future work.

The comparison results for both the scalar case and the CFD case show that the posterior mean with all three methods agrees well with the benchmark data. However, the iterative form of EnKF discussed here, which uses the same data repeatedly for steady problems, improves the data fit but underestimates the posterior uncertainty. The other two methods, EnRML and EnKF-MDA, are capable of giving an estimation of the posterior uncertainty. Based on our comparison study, the EnRML method is recommended since it converges fast and provides the statistical information even in complicated CFD cases.
The applicability of these ensemble methods for unsteady CFD applications will be investigated in future studies. Acknowledgements The authors would like to thank the reviewers for their constructive and valuable comments, which helped us to improve the quality and clarity of this manuscript. References [1] F. D. Witherden, A. Jameson, Future Directions in Computational Fluid Dynamics, in: 23rd AIAA Computational Fluid Dynamics Conference, 2017, pp. 1{16. 23 [2] L. Mathelin, M. Y. Hussaini, T. A. Zang, F. Bataille, Uncertainty propagation for a turbulent, com- pressible nozzle ow using stochastic methods, AIAA Journal 42 (8) (2004) 1669{1676. [3] O. Knio, O. Le Maitre, Uncertainty propagation in CFD using polynomial chaos decomposition, Fluid Dynamics Research 38 (9) (2006) 616{640. [4] S. Hosder, R. Walters, R. Perez, A non-intrusive polynomial chaos method for uncertainty propagation in CFD simulations, in: 44th AIAA aerospace sciences meeting and exhibit, 2006. [5] H. N. Najm, Uncertainty quanti cation and polynomial chaos techniques in computational uid dy- namics, Annual Review of Fluid Mechanics 41 (2009) 35{52. [6] E. Dow, Q. Wang, Quanti cation of structural uncertainties in the k{! turbulence model, in: 52nd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference 19th AIAA/ASME/AHS Adaptive Structures Conference, 2011, pp. 1{12. [7] A. Gel, R. Garg, C. Tong, M. Shahnam, C. Guenther, Applying uncertainty quanti cation to multiphase ow computational uid dynamics, Powder Technology 242 (2013) 27{39. [8] H. Xiao, J.-L. Wu, J.-X. Wang, R. Sun, C. Roy, Quantifying and reducing model-form uncertainties in Reynolds-averaged Navier{Stokes simulations: A data-driven, physics-informed Bayesian approach, Journal of Computational Physics 324 (2016) 115{136. [9] M. C. Kennedy, A. O'Hagan, Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63 (3) (2001) 425{464. [10] S. H. 
Cheung, T. A. Oliver, E. E. Prudencio, S. Prudhomme, R. D. Moser, Bayesian uncertainty analysis with applications to turbulence modeling, Reliability Engineering & System Safety 96 (9) (2011) 1137{ [11] T. A. Oliver, R. D. Moser, Bayesian uncertainty quanti cation applied to RANS turbulence models, in: Journal of Physics: Conference Series, Vol. 318, IOP Publishing, 2011, pp. 1{10. [12] W. Edeling, P. Cinnella, R. P. Dwight, H. Bijl, Bayesian estimates of parameter variability in the k{" turbulence model, Journal of Computational Physics 258 (2014) 73{94. [13] W. N. Edeling, M. Schmelzer, R. P. Dwight, P. Cinnella, Bayesian predictions of Reynolds-averaged Navier{Stokes uncertainties using maximum a posteriori estimates, AIAA Journal 56 (5) (2018) 2018{ [14] X. Ma, N. Zabaras, An adaptive hierarchical sparse grid collocation algorithm for the solution of stochas- tic di erential equations, Journal of Computational Physics 228 (8) (2009) 3084{3113. [15] W. Edeling, R. P. Dwight, P. Cinnella, Simplex-stochastic collocation method with improved scalability, Journal of Computational Physics 310 (2016) 301{328. [16] J. Zhang, S. Fu, An ecient Bayesian uncertainty quanti cation approach with application to k-!- transition modeling, Computers & Fluids 161 (2018) 211{224. 24 [17] X. Zhang, T. Gomez, O. Coutier-Delgosha, Bayesian optimisation of RANS simulation with ensemble- based variational method in convergent-divergent channel, Journal of Turbulence 20 (3) (2019) 1{26. [18] C. Liu, Q. Xiao, B. Wang, An ensemble-based four-dimensional variational data assimilation scheme. Part I: Technical formulation and preliminary test, Monthly Weather Review 136 (9) (2008) 3363{3373. [19] A. A. Emerick, A. C. Reynolds, History matching time-lapse seismic data using the ensemble Kalman lter with multiple data assimilations, Computational Geosciences 16 (3) (2012) 639{659. [20] G. Evensen, Data assimilation: the ensemble Kalman lter, Springer Science & Business Media, 2009. [21] G. 
Gao, M. Zafari, A. C. Reynolds, et al., Quantifying uncertainty for the PUNQ-S3 problem in a Bayesian setting with RML and EnKF, in: SPE reservoir simulation symposium, Society of Petroleum Engineers, 2005, pp. 506{515. [22] J. A. Vrugt, B. A. Robinson, Treatment of uncertainty using ensemble methods: Comparison of sequen- tial data assimilation and Bayesian model averaging, Water Resources Research 43 (1). [23] A. Lorenc, The potential of the ensemble Kalman lter for NWP - a comparison with 4D-Var, Quarterly Journal of The Royal Meteorological Society 129 (595) (2003) 3183{3203. doi:10.1256/qj.02.132. [24] P. L. Houtekamer, F. Zhang, Review of the ensemble Kalman lter for atmospheric data assimilation, Monthly Weather Review 144 (12) (2016) 4489{4532. doi:10.1175/MWR-D-15-0440.1. [25] L. Natvik, G. Evensen, Assimilation of ocean colour data into a biochemical model of the North Atlantic - Part 1. Data assimilation experiments, Journal of Marine Systems 40 (2003) 127{153. doi:10.1016/ S0924-7963(03)00016-2. [26] V. E. J. Haugen, G. Evensen, Assimilation of SLA and SST data into an OGCM for the Indian ocean, Ocean Dynamics 52 (3) (2002) 133{151. doi:10.1007/s10236-002-0014-7. [27] H. Kato, S. Obayashi, Approach for uncertainty of turbulence modeling based on data assimilation technique, Computers & Fluids 85 (2013) 2{7. [28] M. A. Iglesias, K. J. Law, A. M. Stuart, Ensemble Kalman methods for inverse problems, Inverse Problems 29 (4) (2013) 045001. [29] H. Xiao, P. Cinnella, Quanti cation of model uncertainty in RANS simulations: a review, Progress in Aerospace Sciences. [30] G. Evensen, Analysis of iterative ensemble smoothers for solving inverse problems, Computational Geosciences 22 (3) (2018) 885{908. [31] Y. Gu, D. S. Oliver, et al., An iterative ensemble Kalman lter for multiphase uid ow data assimilation, SPE Journal 12 (04) (2007) 438{446. [32] Y. Chen, D. S. 
Oliver, Ensemble randomized maximum likelihood method as an iterative ensemble smoother, Mathematical Geosciences 44 (1) (2012) 1{26. 25 [33] Y. Yang, C. Robinson, D. Heitz, E. M emin, Enhanced ensemble-based 4DVar scheme for data assimila- tion, Computers & Fluids 115 (2015) 201{210. [34] A. A. Emerick, A. C. Reynolds, Ensemble smoother with multiple data assimilation, Computers & Geosciences 55 (2013) 3{15. [35] A. C. Reynolds, M. Zafari, G. Li, Iterative forms of the ensemble Kalman lter, in: ECMOR X-10th European Conference on the Mathematics of Oil Recovery, 2006. [36] D. S. Oliver, Y. Chen, Recent progress on reservoir history matching: a review, Computational Geo- sciences 15 (1) (2011) 185{221. [37] O. G. Ernst, B. Sprungk, H.-J. Starklo , Analysis of the ensemble and polynomial chaos Kalman lters in Bayesian inverse problems, SIAM/ASA Journal on Uncertainty Quanti cation 3 (1) (2015) 823{851. [38] G. Burgers, P. Jan van Leeuwen, G. Evensen, Analysis scheme in the ensemble Kalman lter, Monthly Weather Review 126 (6) (1998) 1719{1724. [39] Z. Wu, A. C. Reynolds, D. S. Oliver, et al., Conditioning geostatistical models to two-phase production data, in: SPE Annual Technical Conference and Exhibition, Society of Petroleum Engineers, 1998, pp. 142{155. [40] G. Gao, A. C. Reynolds, et al., An improved implementation of the LBFGS algorithm for automatic history matching, in: SPE Annual Technical Conference and Exhibition, Society of Petroleum Engineers, 2004, pp. 5{17. [41] J. A. Vrugt, Markov chain Monte Carlo simulation using the DREAM software package: Theory, concepts, and MATLAB implementation, Environmental Modelling & Software 75 (2016) 273{316. [42] X. Zhang, H. Xiao, T. Gomez, O. Coutier-Delgosha, Code of scalar case for uncertainty quanti cation, howpublished = "https://github.com/XinleiZhang/scalar-case-for-UQ". [43] J. L. 
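Anderson, Localization and sampling error correction in ensemble Kalman filter data assimilation, Monthly Weather Review 140 (7) (2012) 2359–2371.
[44] S. Banerjee, R. Krahl, F. Durst, C. Zenger, Presentation of anisotropy properties of turbulence, invariants versus eigenvalue approaches, Journal of Turbulence 8 (32) (2007) 1–27.
[45] J. Fröhlich, C. P. Mellen, W. Rodi, L. Temmerman, M. A. Leschziner, Highly resolved large-eddy simulation of separated flow in a channel with streamwise periodic constrictions, Journal of Fluid Mechanics 526 (2005) 19–66.
[46] M. Breuer, N. Peller, C. Rapp, M. Manhart, Flow over periodic hills–numerical and experimental study in a wide range of Reynolds numbers, Computers & Fluids 38 (2) (2009) 433–457.
[47] B. E. Launder, B. Sharma, Application of the energy-dissipation model of turbulence to the calculation of flow near a spinning disc, Letters in Heat and Mass Transfer 1 (2) (1974) 131–137.

Appendix A. Derivation of EnKF

The cost function and its gradient for the iterative EnKF are formulated as

\[
J = \frac{1}{2}\left(x^a_{n,j} - x^f_{n,j}\right)^\top P_n^{-1}\left(x^a_{n,j} - x^f_{n,j}\right) + \frac{1}{2}\left(\mathcal{H}[x^a_{n,j}] - y_j\right)^\top R^{-1}\left(\mathcal{H}[x^a_{n,j}] - y_j\right), \tag{A.1a}
\]

\[
\frac{\partial J}{\partial x^a_{n,j}} = P_n^{-1}\left(x^a_{n,j} - x^f_{n,j}\right) + \mathcal{H}'[x^a_{n,j}]^\top R^{-1}\left(\mathcal{H}[x^a_{n,j}] - y_j\right). \tag{A.1b}
\]

We approximate the unknown terms $\mathcal{H}[x^a_{n,j}]$ and $\mathcal{H}'[x^a_{n,j}]$ in Eq. (A.1b) with the linear assumption as

\[
\mathcal{H}[x^a_j] \approx \mathcal{H}[x^f_j] + \mathcal{H}'[x^f_j]\left(x^a_j - x^f_j\right), \tag{A.2a}
\]

\[
\mathcal{H}'[x^a_j] \approx \mathcal{H}'[x^f_j] + \mathcal{H}''[x^f_j]\left(x^a_j - x^f_j\right), \tag{A.2b}
\]

where the iteration index $n$ is omitted for brevity and the second-derivative term $\mathcal{H}''$ can be neglected. Further, we set the gradient of the cost function to zero and substitute Eq. (A.2) to obtain

\[
P_n^{-1}\left(x^a_{n,j} - x^f_{n,j}\right) = -\mathcal{H}'[x^f_{n,j}]^\top R^{-1}\left(\mathcal{H}[x^f_{n,j}] + \mathcal{H}'[x^f_{n,j}]\left(x^a_{n,j} - x^f_{n,j}\right) - y_j\right). \tag{A.3}
\]

We expand $\mathcal{H}[x]$ around the ensemble mean $X^f$ as

\[
\mathcal{H}[x^f_j] \approx \mathcal{H}[X^f] + \mathcal{H}'[x^f_j]\left(x^f_j - X^f\right). \tag{A.4a}
\]

Afterwards, we assume that $\mathcal{H}[x] = Hx$, where $H$ is the tangent linear operator. The model function gradient $\mathcal{H}'[x^f]$ can be estimated directly with the linear operator $H$ based on Eq. (A.4a). Hence, Eq. (A.3) can be formulated and rearranged as

\[
P_n^{-1}\left(x^a_{n,j} - x^f_{n,j}\right) = -H^\top R^{-1}\left(H x^f_{n,j} + H\left(x^a_{n,j} - x^f_{n,j}\right) - y_j\right), \tag{A.5a}
\]

\[
x^a_{n,j} = x^f_{n,j} + P_n\left(I + H^\top R^{-1} H P_n\right)^{-1} H^\top R^{-1}\left(y_j - H x^f_{n,j}\right). \tag{A.5b}
\]

Set $Q = R^{-1} H P_n$ and we have:

\[
H^\top\left(I + Q H^\top\right) = \left(I + H^\top Q\right) H^\top, \tag{A.6a}
\]

\[
\left(I + H^\top Q\right)^{-1} H^\top = H^\top\left(I + Q H^\top\right)^{-1}. \tag{A.6b}
\]

Returning to Eq. (A.5b) and substituting $\left(I + H^\top R^{-1} H P_n\right)^{-1} H^\top$ with $H^\top\left(I + R^{-1} H P_n H^\top\right)^{-1}$ based on Eq. (A.6b), we can derive:

\[
x^a_{n,j} = x^f_{n,j} + P_n H^\top\left(I + R^{-1} H P_n H^\top\right)^{-1} R^{-1}\left(y_j - H x^f_{n,j}\right), \tag{A.7a}
\]

\[
x^a_{n,j} = x^f_{n,j} + P_n H^\top\left(R + H P_n H^\top\right)^{-1}\left(y_j - H x^f_{n,j}\right). \tag{A.7b}
\]

Eq. (A.7b) is the iterative formulation for the analysis step of the EnKF method.

Appendix B. Derivation of EnRML

To derive the analysis scheme of the ensemble randomized maximum likelihood method, we start from the gradient and Hessian of the cost function as

\[
\frac{\partial J}{\partial x^f_{l,j}} = P_0^{-1}\left(x^f_{l,j} - x^f_{0,j}\right) + \mathcal{H}'[x^f_{l,j}]^\top R^{-1}\left(\mathcal{H}[x^f_{l,j}] - y_j\right), \tag{B.1a}
\]

\[
\frac{\partial^2 J}{\partial (x^f_{l,j})^2} = P_0^{-1} + \mathcal{H}'[x^f_{l,j}]^\top R^{-1} \mathcal{H}'[x^f_{l,j}]. \tag{B.1b}
\]

In the EnRML method, the state vector $x$ is updated with the Gauss–Newton method as

\[
x^a_{l,j} = x^f_{l,j} - \gamma\left(\frac{\partial^2 J}{\partial (x^f_{l,j})^2}\right)^{-1}\frac{\partial J}{\partial x^f_{l,j}}, \tag{B.2}
\]

where $\gamma$ is the step length. Introducing the gradient and Hessian directly into Eq. (B.2), we have

\[
\begin{aligned}
x^a_{l,j} &= x^f_{l,j} - \gamma\left(P_0^{-1} + \mathcal{H}'[x^f_{l,j}]^\top R^{-1} \mathcal{H}'[x^f_{l,j}]\right)^{-1}\left(P_0^{-1}\left(x^f_{l,j} - x^f_{0,j}\right) + \mathcal{H}'[x^f_{l,j}]^\top R^{-1}\left(\mathcal{H}[x^f_{l,j}] - y_j\right)\right) \\
&= x^f_{l,j} - \gamma\left(I + P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1} \mathcal{H}'[x^f_{l,j}]\right)^{-1}\left(x^f_{l,j} - x^f_{0,j} + P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1}\left(\mathcal{H}[x^f_{l,j}] - y_j\right)\right).
\end{aligned} \tag{B.3}
\]

By expanding the last term, we obtain

\[
\begin{aligned}
x^a_{l,j} = x^f_{l,j} &- \gamma\left(I + P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1} \mathcal{H}'[x^f_{l,j}]\right)^{-1}\left(x^f_{l,j} - x^f_{0,j}\right) \\
&- \gamma\left(I + P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1} \mathcal{H}'[x^f_{l,j}]\right)^{-1} P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1}\left(\mathcal{H}[x^f_{l,j}] - y_j\right).
\end{aligned} \tag{B.4}
\]

We can further derive from Eq. (B.4) via the Woodbury formula as follows:

\[
\begin{aligned}
x^a_{l,j} = x^f_{l,j} &- \gamma\left(I - P_0 \mathcal{H}'[x^f_{l,j}]^\top\left(R + \mathcal{H}'[x^f_{l,j}] P_0 \mathcal{H}'[x^f_{l,j}]^\top\right)^{-1} \mathcal{H}'[x^f_{l,j}]\right)\left(x^f_{l,j} - x^f_{0,j}\right) \\
&- \gamma\left(I + P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1} \mathcal{H}'[x^f_{l,j}]\right)^{-1} P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1}\left(\mathcal{H}[x^f_{l,j}] - y_j\right).
\end{aligned} \tag{B.5}
\]

After expanding the second term on the right-hand side and rearranging, we have

\[
\begin{aligned}
x^a_{l,j} = \gamma x^f_{0,j} + (1 - \gamma)\, x^f_{l,j} &+ \gamma P_0 \mathcal{H}'[x^f_{l,j}]^\top\left(R + \mathcal{H}'[x^f_{l,j}] P_0 \mathcal{H}'[x^f_{l,j}]^\top\right)^{-1} \mathcal{H}'[x^f_{l,j}]\left(x^f_{l,j} - x^f_{0,j}\right) \\
&- \gamma\left(I + P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1} \mathcal{H}'[x^f_{l,j}]\right)^{-1} P_0 \mathcal{H}'[x^f_{l,j}]^\top R^{-1}\left(\mathcal{H}[x^f_{l,j}] - y_j\right).
\end{aligned} \tag{B.6}
\]

Set $Q = P_0 \mathcal{H}'[x]^\top$, and we deduce

\[
Q R^{-1}\left(R + \mathcal{H}'[x] Q\right) = \left(I + Q R^{-1} \mathcal{H}'[x]\right) Q, \tag{B.7a}
\]

\[
\left(I + Q R^{-1} \mathcal{H}'[x]\right)^{-1} Q R^{-1} = Q\left(R + \mathcal{H}'[x] Q\right)^{-1}. \tag{B.7b}
\]

Finally, by substituting Eq. (B.7b) into Eq. (B.6) we obtain the analysis step for the EnRML method as

\[
x^a_{l,j} = \gamma x^f_{0,j} + (1 - \gamma)\, x^f_{l,j} - \gamma P_0 \mathcal{H}'[x^f_{l,j}]^\top\left(R + \mathcal{H}'[x^f_{l,j}] P_0 \mathcal{H}'[x^f_{l,j}]^\top\right)^{-1}\left(\mathcal{H}[x^f_{l,j}] - y_j - \mathcal{H}'[x^f_{l,j}]\left(x^f_{l,j} - x^f_{0,j}\right)\right). \tag{B.8}
\]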
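The two appendices can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a linear observation operator (so that the model gradient equals the matrix `H`), synthetic random data, and illustrative sizes. All function and variable names (`enkf_state_space`, `enkf_obs_space`, `enrml_step`, `gamma` for the Gauss–Newton step length) are hypothetical. It compares the state-space form of Eq. (A.5b) with the observation-space form of Eq. (A.7b), and evaluates the EnRML analysis step of Eq. (B.8).

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_obs, n_ens = 4, 3, 50          # illustrative sizes (assumed)

H = rng.normal(size=(n_obs, n_state))     # assumed linear observation operator
R = 0.1 * np.eye(n_obs)                   # observation error covariance
X_f = rng.normal(size=(n_state, n_ens))   # forecast ensemble, one sample per column
y = rng.normal(size=n_obs)                # synthetic observation
# Perturbed observations y_j, one column per ensemble member
Y = y[:, None] + np.sqrt(0.1) * rng.normal(size=(n_obs, n_ens))


def ensemble_cov(X):
    """Sample covariance of an ensemble stored column-wise."""
    A = X - X.mean(axis=1, keepdims=True)
    return A @ A.T / (X.shape[1] - 1)


def enkf_state_space(X, Y, H, R):
    """Analysis step in the form of Eq. (A.5b): inverse taken in state space."""
    P = ensemble_cov(X)
    R_inv = np.linalg.inv(R)
    M = np.eye(X.shape[0]) + H.T @ R_inv @ H @ P
    return X + P @ np.linalg.solve(M, H.T @ R_inv @ (Y - H @ X))


def enkf_obs_space(X, Y, H, R):
    """Analysis step in the form of Eq. (A.7b): inverse taken in observation space."""
    P = ensemble_cov(X)
    K = P @ H.T @ np.linalg.inv(R + H @ P @ H.T)
    return X + K @ (Y - H @ X)


def enrml_step(X_l, X_0, Y, H, R, P_0, gamma):
    """One EnRML update in the form of Eq. (B.8), with the gradient replaced by H."""
    K = P_0 @ H.T @ np.linalg.inv(R + H @ P_0 @ H.T)
    misfit = H @ X_l - Y - H @ (X_l - X_0)
    return gamma * X_0 + (1.0 - gamma) * X_l - gamma * K @ misfit


Xa_state = enkf_state_space(X_f, Y, H, R)
Xa_obs = enkf_obs_space(X_f, Y, H, R)
Xa_enrml = enrml_step(X_f, X_f, Y, H, R, ensemble_cov(X_f), gamma=1.0)

print("max |A.5b - A.7b| :", np.abs(Xa_state - Xa_obs).max())
print("max |B.8  - A.7b| :", np.abs(Xa_enrml - Xa_obs).max())
```

Under the linear assumption, both differences should be at the level of floating-point round-off: the two EnKF forms are algebraically identical by Eq. (A.6b), and a full Gauss–Newton step (`gamma=1.0`) of EnRML started from the prior ensemble reduces to the same update. The observation-space form only inverts a matrix of the size of the observation vector, which is why it is the one normally used when the state dimension is large.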

Journal

Physics, arXiv (Cornell University)

Published: Apr 12, 2020

References