Ban Zheng, F. Roueff, F. Abergel (2013)
Ergodicity and Scaling Limit of a Constrained Multivariate Hawkes Process. Capital Markets: Market Microstructure eJournal
E. Bacry, S. Delattre, M. Hoffmann, J. Muzy (2012)
Scaling limits for Hawkes processes and application to financial statistics. arXiv: Probability
A. Swishchuk, N. Vadori (2016)
A Semi-Markovian Modeling of Limit Order Markets. Capital Markets: Market Microstructure eJournal
P. Embrechts, T. Liniger, Luan Lin (2011)
Multivariate Hawkes processes: an application to financial data. Journal of Applied Probability, 48
B. Rao (2009)
Conditional independence, conditional mixing and conditional association. Annals of the Institute of Statistical Mathematics, 61
M. Rambaldi, E. Bacry, F. Lillo (2016)
The role of volume in order book dynamics: a multivariate Hawkes process analysis. Quantitative Finance, 17
E. Valkeila (2008)
An Introduction to the Theory of Point Processes, Volume II: General Theory and Structure, 2nd Edition by Daryl J. Daley, David Vere‐Jones. International Statistical Review, 76
I. Sheffer (1948)
Some limit theorems. Bulletin of the American Mathematical Society, 54
M. Eichler, R. Dahlhaus, J. Dueck (2016)
Graphical Modeling for Multivariate Hawkes Processes with Nonparametric Link Functions. Journal of Time Series Analysis, 38
E. Bacry, S. Delattre, M. Hoffmann, J. Muzy (2013)
Some limit theorems for Hawkes processes and application to financial statistics. Stochastic Processes and their Applications, 123
A. Skorokhod (1966)
Studies in the Theory of Random Processes
Á. Cartea, S. Jaimungal, Jose Penalva (2015)
Algorithmic and High-Frequency Trading
P. Brémaud, L. Massoulié (1996)
Stability of Nonlinear Hawkes Processes. Annals of Probability, 24
C. Bergmeir, Rob Hyndman, B. Koo (2015)
A Note on the Validity of Cross-Validation for Evaluating Time Series Prediction
Rémi Lemonnier, Kevin Scaman, Argyris Kalogeratos (2016)
Multivariate Hawkes Processes for Large-scale Inference APPENDIX
Ban Zheng, F. Roueff, F. Abergel (2014)
Modelling Bid and Ask Prices Using Constrained Hawkes Processes: Ergodicity and Scaling Limit. SIAM J. Financial Math., 5
N. Vadori, A. Swishchuk (2015)
Strong Law of Large Numbers and Central Limit Theorems for Functionals of Inhomogeneous Semi-Markov Processes. Stochastic Analysis and Applications, 33
R. Cont, A. Larrard (2011)
Price Dynamics in a Markovian Limit Order Market. Econometrics: Applied Econometrics & Modeling eJournal
Lingjiong Zhu (2012)
Central Limit Theorem for Nonlinear Hawkes Processes. Journal of Applied Probability, 50
Yingxiang Yang, Jalal Etesami, Niao He, N. Kiyavash (2017)
Online Learning for Multivariate Hawkes Processes
C. Bergmeir, Rob Hyndman, B. Koo (2018)
A note on the validity of cross-validation for evaluating autoregressive time series prediction. Comput. Stat. Data Anal., 120
(2014)
A point process model for the dynamics of LOB
Jonathan Chávez-Casillas, R. Elliott, B. Rémillard, A. Swishchuk (2017)
A Level-1 Limit Order Book with Time Dependent Arrival Rates. Methodology and Computing in Applied Probability, 21
A. Swishchuk (2018)
Risk Model Based on Compound Hawkes Process. Wilmott, 2018
A. Swishchuk, Aiden Huffman (2017)
General Compound Hawkes Processes in Limit Order Books. Econometrics: Econometric & Statistical Methods - Special Topics eJournal
A. Swishchuk, Katharina Cera, Julia Schmidt, Tyler Hofmeister (2016)
General Semi-Markov Model for Limit Order Books: Theory, Implementation and Numerics. OPER: Discrete (Topic)
Qi Guo, A. Swishchuk (2020)
Multivariate General Compound Hawkes Processes and their Applications in Limit Order Books. Wilmott, 2020
N. Hautsch, L. Bauwens (2006)
Modelling Financial High Frequency Data Using Point Processes. Capital Markets: Market Microstructure
T. Björk (2011)
An Introduction to Point Processes from a Martingale Point of View
T. Liniger (2009)
Multivariate Hawkes processes
Clive Bowsher (2003)
Modelling Security Market Events in Continuous Time: Intensity Based, Multivariate Point Process Models. Capital Markets: Market Microstructure
Shizhe Chen, A. Shojaie, E. Shea-Brown, D. Witten (2017)
The Multivariate Hawkes Process in High Dimensions: Beyond Mutual Excitation. arXiv: Methodology
Rémi Lemonnier, Kevin Scaman, Argyris Kalogeratos (2016)
Multivariate Hawkes Processes for Large-Scale Inference
(2017)
Semi-Markov processes. Stochastic Analysis and Applications 33: 213–43
D. Vere-Jones (1972)
Markov Chains. Nature, 236
Risks

Article

Multivariate General Compound Point Processes in Limit Order Books

Qi Guo *, Bruno Remillard and Anatoliy Swishchuk

Department of Mathematics and Statistics, University of Calgary, 2500 University Drive NW, Calgary, AB T2N 1N4, Canada; aswish@ucalgary.ca
Department of Decision Sciences, HEC Montréal, 3000 Chemin de la Cote-Sainte-Catherine, Montréal, QC H3T 2A7, Canada; bruno.remillard@hec.ca
* Correspondence: qi.guo1@ucalgary.ca

Received: 29 July 2020; Accepted: 8 September 2020; Published: 11 September 2020

Abstract: In this paper, we focus on a new generalization of the multivariate general compound Hawkes process (MGCHP), which we refer to as the multivariate general compound point process (MGCPP). Namely, we apply a multivariate point process, instead of the Hawkes process, to model the order flow. The law of large numbers (LLN) and two functional central limit theorems (FCLTs) for the MGCPP are proved in this work. Applications of the MGCPP in the limit order market are also considered. We provide numerical simulations and comparisons for the MGCPP and MGCHP using Google, Apple, Microsoft, Amazon, and Intel trading data.

Keywords: point process (PP); multivariate point processes (MPP); multivariate general compound point processes (MGCPP); limit order books (LOB); functional central limit theorems (FCLT); law of large numbers (LLN)

1. Introduction

In this paper, we introduce a new class of stochastic models, which can be considered a generalization of the multivariate general compound Hawkes process (MGCHP) in Guo and Swishchuk (2020). We call this model the multivariate general compound point process (MGCPP). A law of large numbers (LLN) and two functional central limit theorems (FCLTs) for the MGCPP are proved. The FCLTs of the MGCPP can be viewed as a link between price volatility and the order flow. Thus, we apply this asymptotic method to study mid-price modeling in the limit order book (LOB).
Risks 2020, 8, 98; doi:10.3390/risks8030098; www.mdpi.com/journal/risks

The Hawkes process was first applied to financial modelling by Bowsher (2007). Bacry et al. (2013) proved an LLN and an FCLT for the multivariate Hawkes process and applied them to study several economic phenomena. Bauwens and Hautsch (2009) used a 5-dimensional Hawkes process to estimate volatilities between five stocks. Other types of Hawkes processes have been widely studied as well. The nonlinear Hawkes process was considered by Brémaud and Massoulié (1996), and the corresponding FCLT was proved in Zhu (2013). Some applications of the multivariate Hawkes process to financial data are given in Embrechts et al. (2011). Vinkovskaya (2014) considered a regime-switching Hawkes process to describe how the dynamics depend on the bid–ask spread in the limit order book. In Swishchuk and Vadori (2017), a semi-Markov process based on a renewal process was applied to mid-price modeling in the LOB. Swishchuk et al. (2017) also considered the general case of semi-Markovian models. A good textbook on algorithmic and high-frequency trading methods was written by Cartea et al. (2015). Zheng et al. (2014) introduced a multivariate point process describing the dynamics of the bid and ask prices of a financial asset. The point process is similar to a Hawkes process, with additional constraints on its intensity corresponding to the natural ordering of the best bid and ask prices. Chen et al. (2019) developed a new approach for investigating the properties of the Hawkes process without the restriction to mutual excitation or linear link functions. They employed a thinning process representation and a coupling construction to bound the dependence coefficient of the Hawkes process. Using recent developments on weakly dependent sequences, a concentration inequality for second-order statistics of the Hawkes process was established.
This concentration inequality was applied to cross-covariance analysis in the high-dimensional regime, and the theoretical claims were verified with simulation studies. A framework for fitting the multivariate Hawkes process to large-scale problems (long history and a wide variety of events) was proposed by Lemonnier et al. (2017). Liniger's thesis (Liniger 2009) addresses theoretical and practical questions arising in connection with multivariate, marked, linear Hawkes processes. Yang et al. (2017) developed a nonparametric online learning algorithm that estimates the triggering functions of a multivariate Hawkes process. An introduction to point processes from a martingale point of view may be found in Bjork's lecture notes (Bjork 2011). Guo and Swishchuk (2020) constructed a multivariate general compound Hawkes process (MGCHP), which extends the models of Cont and De Larrard (2013) and Swishchuk (2017). In Guo and Swishchuk (2020), the multivariate Hawkes process was applied to model the order flow of several stocks in the limit order market, and limit theorems for the MGCHP were proved.

In this paper, we propose a new mid-price model which is a generalization of the MGCHP, and we call it the multivariate general compound point process (MGCPP). For the MGCPP, we use a multi-dimensional simple point process, instead of the Hawkes process, to represent the order flow in the LOB. We also prove the corresponding LLN and FCLTs for the MGCPP. One reason for considering the generalized model is that the parameters of a simple point process are much easier to estimate than those of a Hawkes process. We therefore provide numerical comparisons of the MGCPP and MGCHP on real high-frequency trading data, and we find that the results of the new generalized model are as good as those of the MGCHP.

This paper is organized as follows. Definitions and assumptions of the multivariate general compound point process (MGCPP) can be found in Section 2.
Functional central limit theorem (FCLT) I and the law of large numbers are proved in Section 3. We also provide numerical examples based on real data for FCLT I in Section 3. In Section 4, we consider an FCLT II for the MGCPP and apply it to mid-price prediction. Section 5 concludes the paper.

2. Definition of Multivariate General Compound Point Process (MGCPP)

In this Section, we propose a multivariate stochastic model for the mid-price in the limit order book. This is a generalization of the models in Cont and De Larrard (2013), Guo and Swishchuk (2020), and Swishchuk (2017). Here, we assume the order flow is described by a multivariate simple point process with some good asymptotic properties.

Definition 1 (Counting Process; see, e.g., Bjork (2011)). We call a stochastic process $\{N(t), t \ge 0\}$ a counting process if it satisfies: the trajectories of $N$ are right continuous and piecewise constant with probability one, $N(0) = 0$, and $\Delta N_t = N_t - N_{t-} = 0$ or $1$ with probability one.

A counting process is the simplest type of point process. In the remainder of the paper, we adopt Definition 1 as the definition of a point process. The point process can be determined by the conditional intensity function $\lambda(t)$ in the form of

$$\lambda(t) = \lim_{h \to 0} \frac{E[N(t+h) - N(t) \mid \mathcal{F}(t)]}{h}, \tag{1}$$

where $\lambda(t) \ge 0$ and $\mathcal{F}(t)$ is the corresponding natural filtration.

2.1. Assumptions for Multivariate Point Processes

Let $\vec{N}_t = (N_{1,t}, N_{2,t}, \dots, N_{d,t})$ be a $d$-dimensional point process satisfying the following assumptions.

Assumption 1. We assume that $\vec{N}_t$ satisfies a law of large numbers (LLN) of the form

$$\frac{\vec{N}(nt)}{n} \to \bar{\lambda} t \tag{2}$$

as $n \to +\infty$ almost surely, where $\bar{\lambda} = (\bar{\lambda}_1, \bar{\lambda}_2, \bar{\lambda}_3, \dots, \bar{\lambda}_d)$.

Assumption 2. We also assume that $\vec{N}_t$ satisfies a functional central limit theorem (FCLT) of the form

$$\frac{1}{\sqrt{n}} \left( \vec{N}_{nt} - E(\vec{N}_{nt}) \right) \xrightarrow{n \to \infty} \Sigma^{1/2} \vec{W}_t, \quad t \in [0, 1], \tag{3}$$

in law for the Skorokhod topology, where $\vec{W}_t$ is a standard $d$-dimensional Brownian motion and $\Sigma$ is a $d$-by-$d$ covariance matrix.
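For intuition, the following is a minimal sketch of what Assumptions 1 and 2 look like numerically for one process that satisfies both, namely independent homogeneous Poisson coordinates. The intensities, sample sizes, and helper names (`poisson_count`, `lln_ratio`, `clt_sample_variance`) are hypothetical choices for illustration and are not taken from the paper; only the Python standard library is used.

```python
import math
import random


def poisson_count(rate, horizon, rng):
    """Number of events of a homogeneous Poisson process on [0, horizon]."""
    count, clock = 0, rng.expovariate(rate)
    while clock <= horizon:
        count += 1
        clock += rng.expovariate(rate)
    return count


def lln_ratio(rates, n, t, rng):
    """N(nt)/n per coordinate; Assumption 1 says this approaches rate * t."""
    return [poisson_count(rate, n * t, rng) / n for rate in rates]


def clt_sample_variance(rate, n, t, reps, rng):
    """Sample variance of (N(nt) - E N(nt)) / sqrt(n) over `reps` replications."""
    draws = [(poisson_count(rate, n * t, rng) - rate * n * t) / math.sqrt(n)
             for _ in range(reps)]
    mean = sum(draws) / reps
    return sum((x - mean) ** 2 for x in draws) / (reps - 1)


rng = random.Random(0)        # fixed seed so the run is reproducible
rates = [0.5, 2.0]            # hypothetical intensities, not estimates from the paper
ratios = lln_ratio(rates, n=20000, t=1.0, rng=rng)
variance = clt_sample_variance(rate=2.0, n=200, t=0.5, reps=2000, rng=rng)
print(ratios)    # each entry should be close to the matching rate
print(variance)  # should be close to rate * t = 1.0
```

For a Poisson coordinate the limiting covariance in Assumption 2 is diagonal with entries equal to the intensities, which is why the sample variance is compared with `rate * t`.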
Here, $\vec{N}_t$ denotes the order flow in the limit order market for $d$ stocks. The liquidity of high-frequency trading data guarantees that there are enough price changes in one day, or even within a small window of size $nt$. So, it is reasonable to adopt these two limit assumptions in limit order book modeling.

Remark 1. As a simple example, if the point process is a multivariate homogeneous Poisson process with independent coordinates, then the two assumptions above are the LLN and FCLT for the multi-dimensional Poisson process. Let $\vec{P}_t$ be a $d$-dimensional Poisson process with intensity $\vec{\lambda}$. Here, we use the notation $\vec{P}_t$ to distinguish the Poisson example from the general case. Then, we have the LLN in the form of

$$\sup_{t \in [0,1]} \left\| n^{-1} \vec{P}_{nt} - t \vec{\lambda} \right\| \to 0 \tag{4}$$

as $n \to \infty$ almost surely. Further, the processes

$$\sqrt{n} \left( n^{-1} \vec{P}_{nt} - t \vec{\lambda} \right)$$

converge in law for the Skorokhod topology to $\vec{W}_t \odot \vec{\lambda}^{1/2}$ as $n \to \infty$, where $\odot$ is the element-wise product and the square root is taken element-wise.

Remark 2. Another interesting example is given by the limit theorems for the multivariate Hawkes process (MHP) in Bacry et al. (2013). Let $\vec{H}_t = (H_{1,t}, H_{2,t}, \dots, H_{d,t})$ be a $d$-dimensional Hawkes process. The intensity function of each $H_{i,t}$ has the form

$$\lambda_i(t) = \lambda_i + \sum_{j=1}^{d} \int_{(0,t)} \mu_{ij}(t - s) \, dH_{j,s}. \tag{5}$$

Let $\mu = (\mu_{ij})_{1 \le i,j \le d}$, $\vec{\lambda} = (\lambda_1, \lambda_2, \dots, \lambda_d)$, and $K = \int_0^{\infty} \mu(t) \, dt$. Then the LLN for the MHP has the form

$$\sup_{t \in [0,1]} \left\| n^{-1} \vec{H}_{nt} - t (I - K)^{-1} \vec{\lambda} \right\| \to 0 \tag{6}$$

as $n \to \infty$ almost surely, where $I$ is the $d$-by-$d$ identity matrix. We also have the FCLT for the MHP:

$$\frac{1}{\sqrt{n}} \left( \vec{H}_{nt} - E(\vec{H}_{nt}) \right) \xrightarrow{n \to \infty} (I - K)^{-1} D^{1/2} \vec{W}_t, \quad t \in [0, 1],$$

in law for the Skorokhod topology, where $\vec{W}_t$ is a standard $d$-dimensional Brownian motion and $D$ is the diagonal matrix determined by $D_{ii} = ((I - K)^{-1} \vec{\lambda})_i$. Details about the LLN and FCLT of the MHP can be found in Bacry et al. (2013).

2.2. Definition for MGCPP

Next, we consider a price process $\vec{S}_t = (S_{1,t}, S_{2,t}, \dots, S_{d,t})$ defined by

$$S_{i,t} = S_{i,0} + \sum_{k=1}^{N_{i,t}} a_i(X_{i,k}), \tag{7}$$

where the $X_{i,k}$ are independent ergodic continuous-time Markov chains, independent of $\vec{N}_t$. The state space of $X_{i,k}$ is denoted by $X^{\{i\}} = \{1, 2, \dots, N\}$, and the $a_i(\cdot)$ are bounded continuous functions. We refer to $\vec{S}_t$ as the multivariate general compound point process (MGCPP).

Remark 3. In the one-dimensional case, if $N_t$ is a Poisson process, $a(x) = (-\delta) \vee (x \wedge \delta)$, and $X_k$ is a sequence of independent random variables such that $P(X_1 = \delta) = P(X_1 = -\delta) = 1/2$, then $S_t$ is the stochastic model for the dynamics of a limit order book discussed in Cont and De Larrard (2013).

Remark 4. When $\vec{N}_t$ is the multivariate Hawkes process of Remark 2, $\vec{S}_t$ is the multivariate general compound Hawkes process (MGCHP) proposed in Guo and Swishchuk (2020).

3. LLNs and Diffusion Limits for MGCPP

In this Section, we consider the diffusion limit theorems for the MGCPP. They provide a link between the order flow $\vec{N}_t$ and the price process $\vec{S}_t$. The functional central limit theorem (FCLT) and law of large numbers (LLN) for the MGCPP are generalizations of the diffusion limit theorems for the MGCHP in Guo and Swishchuk (2020).

3.1. LLN for MGCPP

Theorem 1 (LLN for MGCPP). Let $\vec{S}_{nt} = (S_{1,nt}, S_{2,nt}, S_{3,nt}, \dots, S_{d,nt})$ be the $d$-dimensional MGCPP defined above. Then

$$\frac{\vec{S}_{nt}}{n} \to \tilde{a}^* \bar{\lambda} t$$

as $n \to \infty$ almost surely, where $\tilde{a}^* = \mathrm{diag}(a_1^*, \dots, a_d^*)$ and $a_i^*$ is the stationary mean of $a_i(X_{i,k})$ defined in the proof below.

Proof of Theorem 1. From the definition of the MGCPP in Equation (7), we have

$$\frac{S_{i,nt}}{n} = \frac{S_{i,0}}{n} + \frac{1}{n} \sum_{k=1}^{N_{i,nt}} a_i(X_{i,k}).$$

Since $S_{i,0}$ is a constant, we have

$$\lim_{n \to \infty} \frac{S_{i,nt}}{n} = \lim_{n \to \infty} \frac{S_{i,0}}{n} + \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{N_{i,nt}} a_i(X_{i,k}) = 0 + \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{N_{i,nt}} a_i(X_{i,k}). \tag{8}$$

Recalling the strong LLN for Markov chains (see, e.g., Norris (1998)), we have

$$\frac{1}{n} \sum_{k=1}^{n} a_i(X_{i,k}) \xrightarrow{n \to +\infty} a_i^*, \quad \text{a.s.},$$

where $a_i^*$ is defined by $a_i^* = \sum_{k \in X^{\{i\}}} \pi_{i,k}^* a_i(k)$ and $\pi_{i,k}^*$ are the ergodic probabilities of $X_{i,k}$. By the LLN in Assumption 1, $N_{i,nt}/n \to \bar{\lambda}_i t$ as $n \to \infty$ almost surely, so we obtain

$$\frac{1}{n} \sum_{k=1}^{N_{i,nt}} a_i(X_{i,k}) = \frac{N_{i,nt}}{n} \cdot \frac{1}{N_{i,nt}} \sum_{k=1}^{N_{i,nt}} a_i(X_{i,k}) \xrightarrow{n \to +\infty} a_i^* \bar{\lambda}_i t, \quad \text{a.s.} \tag{9}$$

Rewriting (9) in the multivariate case, we derive the LLN for the MGCPP.

3.2. Diffusion Limits for MGCPP: Stochastic Centralization

Theorem 2 (FCLT I: Stochastic Centralization). Let $X_{i,k}$, $i = 1, 2, \dots, d$, be the independent ergodic Markov chains defined above, with state space $X^{\{i\}} = \{1, 2, \dots, N\}$ and ergodic probabilities $(\pi_{i,1}^*, \pi_{i,2}^*, \dots, \pi_{i,N}^*)$. We assume $X_{i,k}$ is independent of $\vec{N}_t$. Let $\vec{S}_{nt}$ be the $d$-dimensional MGCPP. Then

$$\frac{\vec{S}_{nt} - \tilde{a}^* \vec{N}_{nt}}{\sqrt{n}} \xrightarrow{n \to \infty} \tilde{\sigma}^* \Lambda^{1/2} \vec{W}(t), \quad \text{for all } t > 0, \tag{10}$$

where $\vec{W}(t)$ is a standard $d$-dimensional Brownian motion and $\Lambda$ is the $d$-by-$d$ diagonal matrix $\Lambda = \mathrm{diag}(\bar{\lambda}_1, \bar{\lambda}_2, \bar{\lambda}_3, \dots, \bar{\lambda}_d)$. The quantities $\vec{N}_{nt}$, $\tilde{a}^*$ and $\tilde{\sigma}^*$ are given by

$$\tilde{a}^* = \mathrm{diag}(a_1^*, \dots, a_d^*), \quad \vec{N}_{nt} = (N_{1,nt}, \dots, N_{d,nt})^{\top}, \quad \tilde{\sigma}^* = \mathrm{diag}(\sigma_1^*, \dots, \sigma_d^*).$$

Here, $a_i^* = \sum_{k \in X^{\{i\}}} \pi_{i,k}^* a_i(k)$ and $(\sigma_i^*)^2 := \sum_{k \in X^{\{i\}}} \pi_{i,k}^* v_i(k)$ with

$$v_i(k) = b_i(k)^2 + \sum_{j \in X^{\{i\}}} (g_i(j) - g_i(k))^2 P_i(k, j) - 2 b_i(k) \sum_{j \in X^{\{i\}}} (g_i(j) - g_i(k)) P_i(k, j),$$
$$b_i = (b_i(1), b_i(2), \dots, b_i(N))^{\top},$$
$$b_i(k) := a_i(k) - a_i^*,$$
$$g_i := (P_i + \Pi_i^* - I)^{-1} b_i,$$

where $P_i$ is the transition probability matrix of $X_{i,k}$, $\Pi_i^*$ is the matrix of stationary distributions of $P_i$, and $g_i(j)$ is the $j$th entry of $g_i$.

Proof of Theorem 2. From the definition of the MGCPP, we have

$$S_{i,nt} = S_{i,0} + \sum_{k=1}^{N_{i,nt}} a_i(X_{i,k}), \tag{11}$$

and

$$S_{i,nt} = S_{i,0} + \sum_{k=1}^{N_{i,nt}} (a_i(X_{i,k}) - a_i^*) + a_i^* N_{i,nt}, \tag{12}$$

where $a_i^*$ is defined by $a_i^* = \sum_{k \in X^{\{i\}}} \pi_{i,k}^* a_i(k)$. Then, for each $n$, we have

$$\frac{S_{i,nt} - a_i^* N_{i,nt}}{\sqrt{n}} = \frac{S_{i,0} + \sum_{k=1}^{N_{i,nt}} (a_i(X_{i,k}) - a_i^*)}{\sqrt{n}}. \tag{13}$$

Since $S_{i,0}$ is a constant, when $n \to \infty$ we have

$$\lim_{n \to \infty} \frac{S_{i,nt} - a_i^* N_{i,nt}}{\sqrt{n}} = \lim_{n \to \infty} \frac{S_{i,0}}{\sqrt{n}} + \lim_{n \to \infty} \frac{\sum_{k=1}^{N_{i,nt}} (a_i(X_{i,k}) - a_i^*)}{\sqrt{n}} = 0 + \lim_{n \to \infty} \frac{\sum_{k=1}^{N_{i,nt}} (a_i(X_{i,k}) - a_i^*)}{\sqrt{n}}. \tag{14}$$

Consider the following sums:

$$R_{i,n} := \sum_{k=1}^{n} (a_i(X_{i,k}) - a_i^*),$$

and

$$U_{i,n}(t) := n^{-1/2} \left[ (1 - (nt - \lfloor nt \rfloor)) R_{i,\lfloor nt \rfloor} + (nt - \lfloor nt \rfloor) R_{i,\lfloor nt \rfloor + 1} \right],$$

where $\lfloor \cdot \rfloor$ is the floor function. By applying the martingale method in Swishchuk and Vadori (2017) and Vadori and Swishchuk (2015), we have

$$U_{i,n}(t) \xrightarrow{n \to +\infty} \sigma_i^* W_i(t) \tag{15}$$

weakly in the Skorokhod topology. From Assumption 1, we have the LLN for the MPP in the form of

$$\frac{N_i(nt)}{n} \xrightarrow{n \to \infty} \bar{\lambda}_i t.$$

Using the change of time $t \to N_i(nt)/n$ in (15), and noting that $W_i(\bar{\lambda}_i t)$ has the same law as $\sqrt{\bar{\lambda}_i} \, W_i(t)$, we have

$$U_{i,n}(N_i(nt)/n) \xrightarrow{n \to +\infty} \sigma_i^* \sqrt{\bar{\lambda}_i} \, W_i(t). \tag{16}$$

Rewriting (16) in multivariate form, we derive the weak convergence for the MGCPP:

$$\frac{\vec{S}_{nt} - \tilde{a}^* \vec{N}_{nt}}{\sqrt{n}} \xrightarrow{n \to \infty} \tilde{\sigma}^* \Lambda^{1/2} \vec{W}(t), \quad \text{for all } t > 0. \tag{17}$$

Next, we consider a simple special case of the MGCPP. Let $X_{i,k}$ be a Markov chain with two dependent states $(+\delta, -\delta)$ and ergodic probabilities $(\pi_i^*, 1 - \pi_i^*)$. In the limit order market, $\delta$ is the fixed tick size and the $d$-dimensional point process $\vec{N}_{nt}$ represents the order flow for $d$ stocks. Here, we set $a_i(x) = (-\delta) \vee (x \wedge \delta)$ in Equation (7). Then, we can derive the corresponding limit theorems for this special case.

Corollary 1 (FCLT I for two-state MGCPP: Stochastic Centralization).

$$\frac{\vec{S}_{nt} - \tilde{a}^* \vec{N}_{nt}}{\sqrt{n}} \xrightarrow{n \to \infty} \tilde{\sigma}^* \Lambda^{1/2} \vec{W}(t), \quad \text{for all } t > 0. \tag{18}$$

Here $\tilde{a}^* = \mathrm{diag}(a_1^*, \dots, a_d^*)$, $\vec{N}_{nt} = (N_{1,nt}, \dots, N_{d,nt})^{\top}$ and $\tilde{\sigma}^* = \mathrm{diag}(\sigma_1^*, \dots, \sigma_d^*)$, where $a_i^* = \delta (2 \pi_i^* - 1)$, and

$$(\sigma_i^*)^2 := 4 \delta^2 \left( \frac{1 - p_i' + \pi_i^* (p_i' - p_i)}{(p_i + p_i' - 2)^2} - \pi_i^* (1 - \pi_i^*) \right). \tag{19}$$

$(p_i, p_i')$ are the transition probabilities of the Markov chain $X_{i,k}$.

Corollary 2 (LLN for two-state MGCPP). Let $\vec{S}_{nt}$ be the $d$-dimensional general compound point process with two-state Markov chain $X_{i,k}$. Then

$$\frac{\vec{S}_{nt}}{n} \to \tilde{a}^* \bar{\lambda} t, \quad \text{a.s.}$$

Here, $\tilde{a}^*$ is the matrix defined in Corollary 1 and $\bar{\lambda}$ is the vector in Assumption 1.

Proof of Corollaries 1 and 2. Taking the Markov chain $X_{i,k}$ with two states $(+\delta, -\delta)$ and $a_i(x) = (-\delta) \vee (x \wedge \delta)$ in Theorem 2 and Theorem 1, we obtain Corollaries 1 and 2 directly.

Remark 5.
From the FCLT I of the MGCPP, we can derive an approximation for the mid-price $\vec{S}_{nt}$:

$$\vec{S}_{nt} \approx \sqrt{n} \, \tilde{\sigma}^* \Lambda^{1/2} \vec{W}(t) + \tilde{a}^* \vec{N}_{nt}, \tag{20}$$

for all $t > 0$ and some large enough $n$. Since $\vec{S}_{nt}$ is the price process in high-frequency trading, time is always measured in very short units (e.g., milliseconds). So, even if the window size is $nt = 10$ s with $t = 0.001$, $n$ equals 10,000, which is a very large number. In this way, it is reasonable to use this kind of approximation in the LOB.

Remark 6. When $\vec{N}_t$ is the multivariate Hawkes process of Remark 2, $\vec{S}_{nt}$ is an MGCHP model; the corresponding FCLTs and LLNs were considered in Guo and Swishchuk (2020). To distinguish it from the general case, we use $\vec{H}_t$ to denote the multivariate Hawkes process and $\vec{S}_{Hawkes}(nt)$ to denote the price process given by the MGCHP. Then we have the FCLT for the MGCHP in the form of

$$\frac{\vec{S}_{Hawkes}(nt) - \tilde{a}^* \vec{H}_{nt}}{\sqrt{n}} \xrightarrow{n \to \infty} \tilde{\sigma}^* D^{1/2} \vec{W}(t), \quad \text{for all } t > 0,$$

where $\vec{W}(t)$ is a multivariate standard Brownian motion and $D$ is defined in Remark 2. Clearly, the limit theorem for the MGCPP is a generalization of the Hawkes case. Also, in the one-dimensional case, if $N_t$ is a renewal process, the corresponding limit theorems for the semi-Markovian model $S_t$ were discussed in Swishchuk and Vadori (2017) and Swishchuk et al. (2017).

3.3. Numerical Examples for FCLT: Stochastic Centralization

In this Section, we test the FCLT I of the MGCPP model with the LOBSTER data and compare our results with the simulation results for the MGCHP in Guo and Swishchuk (2020). That paper used two stocks in the LOBSTER data set, namely the mid-prices of Microsoft and Intel. As for the Markov chain part, it used the two-state Markov chain $(+\delta, -\delta)$. In order to make our results comparable with the MGCHP, we first applied the same data set (Microsoft and Intel) and the same two-state Markov chain $(+\delta, -\delta)$ to the MGCPP model.
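For the two-state chain used in these comparisons, the constants of Corollary 1 can be evaluated directly. The sketch below is illustrative only: the function name `two_state_constants` is hypothetical, it assumes that `p` is the probability of staying in the $+\delta$ state and `p_prime` the probability of staying in the $-\delta$ state, and it uses the standard stationary distribution of a two-state chain, which is not written out in the text. The inputs are the MSFT frequencies reported later in Table 4.

```python
import math


def two_state_constants(p, p_prime, delta):
    """Stationary probability, mean a* and volatility sigma* for a chain on {+delta, -delta}.

    p       : probability of staying in the +delta state (assumed labelling)
    p_prime : probability of staying in the -delta state (assumed labelling)
    """
    # Stationary probability of the +delta state of a two-state chain.
    pi_star = (p_prime - 1.0) / (p + p_prime - 2.0)
    # a* = delta * (2 pi* - 1), as in Corollary 1.
    a_star = delta * (2.0 * pi_star - 1.0)
    # Equation (19).
    variance = 4.0 * delta ** 2 * (
        (1.0 - p_prime + pi_star * (p_prime - p)) / (p + p_prime - 2.0) ** 2
        - pi_star * (1.0 - pi_star)
    )
    return pi_star, a_star, math.sqrt(variance)


# MSFT inputs from Table 4 (p_uu, p_dd) with delta = 0.005.
pi_star, a_star, sigma_star = two_state_constants(0.5711, 0.6044, 0.005)
print(pi_star, a_star, sigma_star)  # sigma_star is approximately 0.0060
```

With `p = p_prime = 0.5` the ticks are independent and symmetric, and the function returns `a_star = 0` and `sigma_star = delta`, which is a quick sanity check on Equation (19).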
Next, we explore more simulation examples (using Apple, Amazon, and Google data) which were mentioned in Guo and Swishchuk (2020). For those three stocks, we applied the MGCPP model with both a two-state Markov chain and an $N$-state Markov chain.

3.3.1. Data Description and Parameter Estimations

The level-one LOBSTER data was considered in this paper. The LOBSTER data set contains the stock prices and order flows of Apple, Amazon, Google, Microsoft, and Intel on 21 June 2012. The tick size is one cent ($\delta = 0.005$) and time is measured in milliseconds (0.001 s). The basic data description and the liquidity of each stock are given in Table 1. The symbol # is the number sign.

Table 1. Data description and stock liquidity of Apple, Amazon, Google, Microsoft, and Intel.

Ticker   # of Orders in 1 Day   Avg # of Orders/s   # of Price Changes in 1 Day   Avg # of Price Changes/s
INTC     404,986                17.3071             3218                          0.1375
MSFT     411,409                5.0640              4016                          0.1716
AAPL     118,497                5.0640              64,351                        2.7500
AMZN     57,515                 2.4579              27,558                        1.1777
GOOG     49,482                 2.1146              24,085                        1.0293

Next, we estimate $\bar{\lambda} = (\bar{\lambda}_1, \bar{\lambda}_2, \bar{\lambda}_3, \dots, \bar{\lambda}_d)$ via the LLN assumption on $\vec{N}_t$. From Assumption 1, when $n$ is large enough, we can derive the approximation

$$\frac{\vec{N}(nt)}{nt} \approx \bar{\lambda}, \quad t \in [0, 1]. \tag{21}$$

Taking the expectation in (21), we have

$$\frac{E(\vec{N}(nt))}{nt} \approx \bar{\lambda}, \quad t \in [0, 1]. \tag{22}$$

In this way, we derived the estimated parameters $\bar{\lambda}$ for the 5 stocks in Table 2.

Table 2. Estimated parameters of 5 stocks via the law of large numbers (LLN) and functional central limit theorem (FCLT) assumptions.

Ticker   $\bar{\lambda}$
INTC     0.1366
MSFT     0.1729
AAPL     2.2938
AMZN     1.0374
GOOG     0.8178

In the definition of the MGCPP, we assumed that the Markov chains $X_{i,k}$ are independent. So, we checked the correlations of the price increments between the 5 stocks in Table 3. As can be seen in Table 3, the correlations are relatively weak (around 0.3). So, it is reasonable to treat the Markov chains $X_{i,k}$ as independent here. In future work, we will discuss the dependent case for different data sets.

Table 3.
Correlations of price increments between 5 stocks. We set the time step as 10 s.

Ticker   INTC     MSFT     AAPL     AMZN     GOOG
INTC     1.0000   0.3870   0.2948   0.2932   0.2389
MSFT     0.3870   1.0000   0.4373   0.3984   0.3474
AAPL     0.2948   0.4373   1.0000   0.3697   0.3322
AMZN     0.2932   0.3984   0.3697   1.0000   0.3251
GOOG     0.2389   0.3474   0.3322   0.3251   1.0000

Next, we estimated the parameters of the Markov chain by applying the two-state MGCPP model in Corollary 1. The transition matrix $P$ of the two-dependent-state Markov chain $X_{i,k}$ is denoted as

$$P = \begin{bmatrix} p_{uu} & 1 - p_{uu} \\ 1 - p_{dd} & p_{dd} \end{bmatrix}.$$

We used frequencies in our data to estimate $p_{uu}$ and $p_{dd}$ in $P$ by

$$p_{uu} = \frac{q_{uu}}{q_{uu} + q_{ud}}, \qquad p_{dd} = \frac{q_{dd}}{q_{dd} + q_{du}},$$

where $q_{uu}$, $q_{dd}$, $q_{ud}$, and $q_{du}$ are the numbers of times the price goes up twice, goes down twice, goes up and then down, and goes down and then up, respectively. The result is in Table 4.

Table 4. Transition matrix and constant parameters for two-state MGCPP. $a^*$ and $\sigma^*$ were calculated by Equation (19).

Ticker   $p_{uu}$   $p_{dd}$   $\sigma^*$   $a^*$
INTC     0.5373     0.5814     0.0057       2.5023 × 10^(…)
MSFT     0.5711     0.6044     0.0060       2.0145 × 10^(…)
AAPL     0.4954     0.4955     0.0050       2.1529 × 10^(…)
AMZN     0.4511     0.4590     0.0046       3.6077 × 10^(…)
GOOG     0.4536     0.4886     0.0047       1.6584 × 10^(…)

3.3.2. Comparison with MGCHP with Two Dependent Orders

In this Section, we compare the simulation results of the MGCPP with those of the multivariate general compound Hawkes process (MGCHP) model, to show that the simple generalized model can reach an accuracy as good as the MGCHP, which has a sophisticated intensity function (see Equation (5)). Guo and Swishchuk (2020) simulated the MGCHP with two dependent states for Microsoft's and Intel's data. So here we also conduct simulations for Microsoft's and Intel's data with the two-state MGCPP. We tested the MGCPP model by comparing the standard deviations of the left-hand side and right-hand side in the FCLT:

$$\frac{\vec{S}_{nt} - \tilde{a}^* \vec{N}_{nt}}{\sqrt{n}} \xrightarrow{n \to \infty} \tilde{\sigma}^* \Lambda^{1/2} \vec{W}(t).$$

We separated the data set into disjoint windows $[int, (i+1)nt]$. Since time was measured in milliseconds, we set $t = 0.001$.
Then we can calculate

$$\vec{S}_i^* = \vec{S}_{(i+1)nt} - \vec{S}_{int} - \tilde{a}^* \left( \vec{N}((i+1)nt) - \vec{N}(int) \right),$$

and the standard deviation is in the form of

$$\mathrm{std}\{\vec{S}^*\} \approx \sqrt{n} \, \tilde{\sigma}^* \Lambda^{1/2} \sqrt{t}. \tag{23}$$

Figure 1 gives a standard deviation comparison of the MGCPP, the MGCHP, and the raw data for 2 stocks, for window sizes from 0.1 s to 12 s in steps of 0.1 s. First, the MGCPP parameters make the standard deviation of the LHS very similar to the RHS for each stock when $n$ is large. So, generally speaking, the MGCPP model fits the data well. Second, the MGCPP curve is very close to the MGCHP curve; that is, the simulation results for the Intel and Microsoft data are nearly the same. This shows that even without an intensity function as sophisticated as that of the Hawkes process, we can still reach relatively good results with a simple point process model. This helps with the computing efficiency problem that arises when using the MGCHP model. We give a more quantitative error analysis later.

Remark 7. Since the number of windows decreases as the window size $nt$ increases, the spread of the data in Figure 1 increases with the window size. For example, when $nt = 0.1$ s, the number of windows is 234,000. However, a 12-s window size yields only 1950 windows, which leads to a larger spread in the estimated standard deviation.

[Figure 1: two panels, "INTC thm1" and "MSFT thm1", each with an inset; x-axis: Window Size (Sec); y-axis: Standard Deviation; curves: Compound Hawkes Std, Empirical Std, Compound point std.]

Figure 1. Standard deviation comparisons for 2 stocks by FCLT I for multivariate general compound point process (MGCPP) and multivariate general compound Hawkes process (MGCHP).
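The comparison behind Equation (23) can be illustrated on synthetic data. The sketch below is not the paper's procedure on LOBSTER data: it assumes Poisson arrivals and independent symmetric $\pm\delta$ ticks (so that $a^* = 0$ and $\sigma^* = \delta$), and the names `centred_increment` and `theoretical_std`, the intensity, and the window sizes are hypothetical. It compares the empirical standard deviation of the centred window increments with $\sigma^* \sqrt{\bar{\lambda} \, nt}$.

```python
import math
import random
import statistics


def centred_increment(lam, window, delta, rng):
    """Price change over one window: Poisson(lam) arrivals, i.i.d. +/-delta ticks.

    With symmetric i.i.d. ticks a* = 0, so no a* N correction term is needed.
    """
    change, clock = 0.0, rng.expovariate(lam)
    while clock <= window:
        change += delta if rng.random() < 0.5 else -delta
        clock += rng.expovariate(lam)
    return change


def theoretical_std(sigma_star, lam_bar, window):
    """Right-hand side of Equation (23) for one stock, with window = n * t."""
    return sigma_star * math.sqrt(lam_bar * window)


rng = random.Random(1)            # fixed seed for reproducibility
lam, delta = 2.0, 0.005           # hypothetical intensity (events/s) and tick parameter
results = []
for window in (1.0, 5.0, 10.0):   # window sizes n*t in seconds
    sample = [centred_increment(lam, window, delta, rng) for _ in range(4000)]
    empirical = statistics.pstdev(sample)
    theory = theoretical_std(delta, lam, window)  # sigma* = delta in this special case
    results.append((window, empirical, theory))
    print(window, round(empirical, 5), round(theory, 5))
```

On real data the increments would instead be built from observed mid-prices and event counts per window, with $a^*$ and $\sigma^*$ taken from the fitted Markov chain.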
Intuitively, Figure 1 shows that the standard deviations of the MGCHP and MGCPP are very close and that both fit the real standard deviation very well. Next, we analyze the MGCHP and MGCPP models quantitatively. We computed the mean square error (MSE) between the real standard deviation and the theoretical standard deviations in Table 5. As can be seen from Table 5, the MGCHP model performs better than the MGCPP model with both Intel and Microsoft data. For the Intel stock data, the MSE of the MGCHP is 17% better than that of the MGCPP, and it is nearly 10% better than the MGCPP model with the Microsoft stock data. However, when we compare the order of magnitude of the MSE ($10^{-8}$) with that of the real standard deviation ($10^{-2}$ and $10^{-3}$), we can still conclude that the MGCPP is good enough for the mid-price modeling task.

Table 5. The mean square error (MSE) of the real standard deviation and theoretical standard deviations from MGCHP and MGCPP.

Ticker   MGCHP MSE         MGCPP MSE
INTC     3.4039 × 10^-8    3.9858 × 10^-8
MSFT     9.6454 × 10^-8    8.6189 × 10^-8

Recalling Equation (23), the standard deviation is linear in the square root of the time step. So, we can fit the real standard deviation data with a square-root curve by least-squares regression. Then, we can set the regression curve as a benchmark and compare the benchmark coefficients with those of the two stochastic models. From Table 6, the percentage errors of both stochastic models are smaller than 5%, and there is no significant difference between the MGCPP coefficient and the MGCHP coefficient.

Table 6. Coefficients calculated by MGCHP and MGCPP models. Benchmark coefficients are coefficients of the least-square regression curves. Percentage errors are differences between two stochastic models with the benchmark.
Ticker   MGCHP Coefficient   MGCPP Coefficient   Benchmark Coefficient   MGCHP % Error   MGCPP % Error
INTC     0.002086            0.002089            0.002162                3.515%          3.377%
MSFT     0.002494            0.002487            0.002609                4.408%          4.676%

Based on the previous analysis, we can conclude that the empirical results of the MGCHP and MGCPP are very close and that both perform very well in mid-price modeling. However, the MGCHP requires the estimation of many parameters. As Guo and Swishchuk (2020) mentioned, for a two-dimensional MGCHP (two stocks) we have to estimate 5 parameters for the Hawkes process part, and the number of parameters increases dramatically to 55 for a 5-dimensional case (5 stocks). The parameter estimation procedure is also quite time consuming for the MGCHP because of the complicated likelihood function of the multivariate Hawkes process. For example, it takes a dozen hours to estimate the parameters of a 3-dimensional Hawkes process (21 parameters) with the LOBSTER data set using maximum likelihood estimation (MLE) and the particle swarm optimization (PSO) method in Guo and Swishchuk (2020). By contrast, the number of parameters for the MGCPP is much smaller than for the MGCHP. In the two-dimensional case, we have 2 parameters to estimate in the simple point process part, and this increases to 5 parameters in the 5-dimensional case, which is much smaller than 55. The parameter estimation procedure is also quite simple and fast (several seconds with the same data set) because we do not have to deal with the likelihood function. In this way, from the numerical perspective, the generalized MGCPP model is better than the MGCHP because of its fast and simple estimation procedure.

Remark 8. Note that the numbers of parameters mentioned above all refer to parameters of the order flow $\vec{N}_t$. The parameters of the Markov chains are the same for the MGCHP and the MGCPP.
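To make the "simple and fast" estimation concrete, the counting estimators for the MGCPP can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the intensity estimate is the plain ratio suggested by Equation (21), and the trading session is taken to be 23,400 s, which is what the 234,000 windows of 0.1 s in Remark 7 imply. The values in Table 2 differ slightly from this plain ratio.

```python
def estimate_intensity(n_events, horizon_seconds):
    """LLN-based estimate of lambda-bar: observed events per unit time (Equation (21))."""
    return n_events / horizon_seconds


def estimate_two_state_transitions(moves):
    """Frequency estimates of p_uu and p_dd from a sequence of +1 (up) / -1 (down) moves."""
    counts = {(1, 1): 0, (1, -1): 0, (-1, 1): 0, (-1, -1): 0}
    for prev, curr in zip(moves, moves[1:]):
        counts[(prev, curr)] += 1
    # p_uu = q_uu / (q_uu + q_ud), p_dd = q_dd / (q_dd + q_du)
    p_uu = counts[(1, 1)] / (counts[(1, 1)] + counts[(1, -1)])
    p_dd = counts[(-1, -1)] / (counts[(-1, -1)] + counts[(-1, 1)])
    return p_uu, p_dd


# INTC: 3218 price changes in one day (Table 1), 23,400 s session (assumption, see above).
lam_hat = estimate_intensity(3218, 23400.0)
print(round(lam_hat, 4))  # 0.1375

# Hypothetical short sequence of price moves, for illustration only.
moves = [1, 1, -1, -1, -1, 1, -1, 1, 1, 1]
p_uu, p_dd = estimate_two_state_transitions(moves)
print(p_uu, p_dd)  # 0.6 0.5
```

Both estimators are single passes over the data, which is consistent with the running time of a few seconds reported above.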
In general, we showed that the results of the new generalized model are as good as those of the MGCHP and that this kind of generalization has better numerical properties. In the following parts, we explore the MGCPP model further.

3.3.3. MGCPP with N-State Dependent Orders

In this section, we give more simulation examples, using the Google, Apple, and Amazon data with the MGCPP model with $N$-state dependent orders. From Swishchuk and Huffman (2020), we know that the accuracy of the general compound Hawkes process model increases when the number of states increases. For Google, Apple, and Amazon in the LOBSTER data set, the best number of states is 4 to 7. In the previous section, we also showed that the simulation results of the MGCPP are nearly the same as those of the MGCHP. So, it is reasonable to consider an MGCPP model with a 7-state Markov chain here.

We applied the method in Swishchuk and Huffman (2020) to calculate the state values $a(X_{i,k})$ for each stock. First, we compute the changes of the mid-price and separate the data into two sets, by positive or negative increments. Next, we calculate the quantiles of both data sets and split each data set according to the quantiles. If there are identical quantiles, we merge them into one. Then, we set the state values $a(X_{i,k})$ as the average of the mid-price changes located in each quantile (or merged quantile).

Figures 2–4 give standard deviation comparisons for the MGCPP with a 2-state Markov chain and a 7-state Markov chain, simulated with different tickers' data. Since the 2-state simulation results here are not as good as the results simulated with Intel's and Microsoft's data, we take bigger time steps and window sizes (from 10 s to 20 min with a 10 s time step) to capture more dynamics. From the figures, the 7-state model is a significant improvement over the 2-state model.
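The quantile-based construction of the state values described above can be sketched as follows. This is one possible reading of the procedure, not the implementation of Swishchuk and Huffman (2020): the helper names, the `bins_per_side` parameter, and the use of the "inclusive" quantile method are assumptions, and how the bins map to exactly seven states is not specified here.

```python
import math
import statistics


def side_states(values, n_bins):
    """Mean increment inside each quantile bin of one side; identical cut points are merged."""
    cuts = statistics.quantiles(values, n=n_bins, method="inclusive") if n_bins > 1 else []
    edges = sorted(set(cuts)) + [math.inf]   # set() merges identical quantiles
    states, lower = [], -math.inf
    for upper in edges:
        bucket = [v for v in values if lower < v <= upper]
        if bucket:                           # skip empty bins
            states.append(statistics.fmean(bucket))
        lower = upper
    return states


def state_values(increments, bins_per_side):
    """State values a(X) built separately from negative and positive mid-price changes."""
    negatives = [x for x in increments if x < 0]
    positives = [x for x in increments if x > 0]
    return side_states(negatives, bins_per_side) + side_states(positives, bins_per_side)


# Hypothetical mid-price changes in integer tick units (zero changes are ignored).
increments = [1, 1, 1, 1, 1, 2, -1, -1, -2, -4, 0, 3]
print(state_values(increments, bins_per_side=2))  # [-3.0, -1.0, 1.0, 2.5]
```

When many increments share the same value, several cut points coincide and the corresponding bins collapse, so the number of states can be smaller than `2 * bins_per_side`.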
Seven-state curves for AAPL and GOOG are very close to the real standard deviation, although the theoretical curve for AMZN underestimates it even with the 7-state model.

[Figure 2 panels: "AAPL thm1 MGCPP with 2-state Markov chain (+δ, −δ)" and "AAPL thm1 MGCPP with 7-state Markov chain". Axes: Window Size (Sec) vs. Standard Deviation. Curves: Empirical Std, Compound point std.]

Figure 2. Standard deviation comparisons for MGCPP with 2-state Markov chain and 7-state Markov chain simulated by Apple's stock data.

[Figure 3 panels: "GOOG thm1 MGCPP with 2-state Markov chain (+δ, −δ)" and "GOOG thm1 MGCPP with 7-state Markov chain". Axes: Window Size (Sec) vs. Standard Deviation. Curves: Empirical Std, Compound point std.]

Figure 3. Standard deviation comparisons for MGCPP with 2-state Markov chain and 7-state Markov chain simulated by Google's stock data.

[Figure 4 panels: "AMZN thm1 MGCPP with 2-state Markov chain (+δ, −δ)" and "AMZN thm1 MGCPP with 7-state Markov chain". Axes: Window Size (Sec) vs. Standard Deviation. Curves: Empirical Std, Compound point std.]

Figure 4. Standard deviation comparisons for MGCPP with 2-state Markov chain and 7-state Markov chain simulated by Amazon's stock data.

Table 7 lists the MSE and coefficients of the 2-state and 7-state models for different tickers. The table shows the improvement of the 7-state model quantitatively. The results for AAPL and GOOG are good enough for mid-price modeling.
As for AMZN, although we obtain a remarkable improvement from the 2-state model (74.60% error) to the 7-state model (28.29% error), we cannot make the error smaller than 5% or 10%. That is, the MGCPP model may not capture the full dynamics of the AMZN data, but it remains a strong candidate for modeling the mid-price, which is consistent with the conclusion for the compound Hawkes model in Swishchuk and Huffman (2020).

Table 7. The MSE and coefficients computed by MGCPP with 2-state and 7-state Markov chain for different tickers. The regression coefficients were derived by fitting the real standard deviations with a square root curve. The MGCPP coefficients were computed by Equation (23).

| Ticker | MSE | Regression Coefficient | MGCPP Coefficient | Percentage Error |
|---|---|---|---|---|
| AAPL 2-state | 0.2467 | 0.0278 | 0.0076 | 72.66% |
| AAPL 7-state | 0.0064 | 0.0311 | 0.0288 | 7.40% |
| GOOG 2-state | 0.4161 | 0.0307 | 0.0044 | 85.67% |
| GOOG 7-state | 0.0081 | 0.0307 | 0.0287 | 6.51% |
| AMZN 2-state | 0.1233 | 0.0189 | 0.0048 | 74.60% |
| AMZN 7-state | 0.0225 | 0.0205 | 0.0147 | 28.29% |

Remark 9. The MGCPP is not only a generalization of the MGCHP, but also a generalization of all multivariate compound models whose point processes $\vec{N}_t$ satisfy Assumptions 1 and 2. We use the Hawkes process for comparison because we want to take advantage of the numerical examples in the references.

4. Diffusion Limit for the MGCPP: Deterministic Centralization

We proved an LLN and an FCLT for the MGCPP in the previous section. Limit theorems provide an approximation for mid-price modeling in the LOB. Recalling the approximation in Remark 5, we have

$$\vec{S}_{nt} \approx \sqrt{n}\,\tilde{\sigma}\,\Lambda^{1/2}\,\vec{W}(t) + \tilde{a}\,\vec{N}_{nt}, \tag{24}$$

where $\vec{S}_{nt}$ is the price process and $\vec{N}_{nt}$ is the order flow. However, in real-world problems, Equation (24) cannot help with the forecasting task directly because we cannot observe the future order flow $\vec{N}_{nt}$ in advance. This motivates us to consider an FCLT II for the MGCPP model.

4.1. FCLT for MGCPP: Deterministic Centralization

Theorem 3. (FCLT II: Deterministic Centralization). Let $X_{i,k}$, $i = 1, 2, \dots, d$, be the independent ergodic Markov chains defined before, with state space $X = \{1, 2, \dots, N\}$ and ergodic probabilities $(\pi_{i,1}, \pi_{i,2}, \dots, \pi_{i,N})$. Assume $X_{i,k}$ is independent of $\vec{N}_t$. Let $\vec{S}_{nt}$ be the d-dimensional MGCPP. Then

$$\frac{\vec{S}_{nt} - \tilde{a}\,E(\vec{N}_{nt})}{\sqrt{n}} \xrightarrow{\;n \to \infty\;} \tilde{\sigma}\,\Lambda^{1/2}\,\vec{W}_1(t) + \tilde{a}\,\Sigma^{1/2}\,\vec{W}_2(t), \quad \text{for all } t > 0, \tag{25}$$

where $\vec{W}_1(t)$ and $\vec{W}_2(t)$ are independent d-dimensional Brownian motions. The parameters $\tilde{\sigma}$, $\tilde{a}$, $\Lambda$, and $\Sigma$ are defined in Theorem 2.

Proof of Theorem 3. Recalling the FCLT for the MPP (Assumption 2), we have

$$\frac{1}{\sqrt{n}}\vec{N}_{nt} - \frac{1}{\sqrt{n}}E(\vec{N}_{nt}) \xrightarrow{\;n \to \infty\;} \Sigma^{1/2}\,\vec{W}_t \tag{26}$$

in law for the Skorokhod topology. From Theorem 2, we have the FCLT for the MGCPP

$$\frac{\vec{S}_{nt} - \tilde{a}\,\vec{N}_{nt}}{\sqrt{n}} \xrightarrow{\;n \to \infty\;} \tilde{\sigma}\,\Lambda^{1/2}\,\vec{W}_t, \quad \text{for all } t > 0, \tag{27}$$

weakly in the Skorokhod topology. Here, we assume that the two multivariate Brownian motions in (26) and (27) are mutually independent, and we denote them by $\vec{W}_2(t)$ and $\vec{W}_1(t)$, respectively. Let $G_t$ be the $\sigma$-algebra generated by $N_i(s)$, $s \le t$, $1 \le i \le d$. Since $\vec{N}_t$ and the Markov chain $a(X_{i,k})$ are independent, and $\vec{S}_t$ is determined only by $\vec{N}_t$ and $a(X_{i,k})$, the processes

$$\frac{1}{\sqrt{n}}\vec{N}_{nt} - \frac{1}{\sqrt{n}}E(\vec{N}_{nt}) \tag{28}$$

and

$$\frac{\vec{S}_{nt} - \tilde{a}\,\vec{N}_{nt}}{\sqrt{n}} \tag{29}$$

are $G_t$-conditionally independent. Similar to the central limit theorem in Prakasa-Rao (2009), we consider the convergence of conditional expectations for processes (28) and (29) on $G_t$. Then, with the characteristic functions of both limiting processes, we have the joint convergence, conditional on $G_t$,

$$\left( \frac{1}{\sqrt{n}}\vec{N}_{nt} - \frac{1}{\sqrt{n}}E(\vec{N}_{nt}),\; \frac{\vec{S}_{nt} - \tilde{a}\,\vec{N}_{nt}}{\sqrt{n}} \right) \longrightarrow \left( \Sigma^{1/2}\,\vec{W}_2(t),\; \tilde{\sigma}\,\Lambda^{1/2}\,\vec{W}_1(t) \right) \tag{30}$$

as $n \to \infty$. Next, consider

$$\frac{\vec{S}_{nt} - \tilde{a}\,E(\vec{N}_{nt})}{\sqrt{n}} = \frac{\vec{S}_{nt} - \tilde{a}\,\vec{N}_{nt}}{\sqrt{n}} + \tilde{a}\left( \frac{1}{\sqrt{n}}\vec{N}_{nt} - \frac{1}{\sqrt{n}}E(\vec{N}_{nt}) \right). \tag{31}$$

By (30) we can derive

$$\frac{\vec{S}_{nt} - \tilde{a}\,\vec{N}_{nt}}{\sqrt{n}} + \tilde{a}\left( \frac{1}{\sqrt{n}}\vec{N}_{nt} - \frac{1}{\sqrt{n}}E(\vec{N}_{nt}) \right) \longrightarrow \tilde{\sigma}\,\Lambda^{1/2}\,\vec{W}_1(t) + \tilde{a}\,\Sigma^{1/2}\,\vec{W}_2(t) \tag{32}$$

as $n \to \infty$, which gives (25).

Remark 10. We can also consider a special case, as for the FCLT I. Let $X_{i,k}$ be a Markov chain with two dependent states $(+\delta, -\delta)$ and ergodic probabilities $(\pi_i, 1 - \pi_i)$. Set $a_i(x) = (-\delta) \vee (x \wedge \delta)$ in Definition 7. Then, we can derive a similar result for FCLT II. The parameters $\tilde{a}$ and $\tilde{\sigma}$ can be computed by Equation (19).

Remark 11. For the FCLT II, we can also consider an approximation similar to that for the FCLT I. For sufficiently large n, we have

$$\vec{S}_{nt} \approx \sqrt{n}\,\tilde{\sigma}\,\Lambda^{1/2}\,\vec{W}_1(t) + \sqrt{n}\,\tilde{a}\,\Sigma^{1/2}\,\vec{W}_2(t) + \tilde{a}\,E(\vec{N}_{nt}), \quad \text{for all } t > 0. \tag{33}$$

To deal with the $E(\vec{N}_{nt})$ term, we consider the approximation derived from Assumption 1 in Equation (22):

$$E(\vec{N}(nt)) \approx nt\,\bar{\lambda}. \tag{34}$$

Rewriting Equation (33), we have the new approximation

$$\vec{S}_{nt} \approx \sqrt{n}\,\tilde{\sigma}\,\Lambda^{1/2}\,\vec{W}_1(t) + \sqrt{n}\,\tilde{a}\,\Sigma^{1/2}\,\vec{W}_2(t) + \tilde{a}\,nt\,\bar{\lambda}. \tag{35}$$

4.2. Numerical Examples for FCLT: Deterministic Centralization

In this section, we apply the LOBSTER data to test the FCLT II. As in the numerical examples for FCLT I, we consider the standard deviation of the approximation in Remark 11, namely

$$\operatorname{std}\left\{ \vec{S}_{(i+1)nt} - \vec{S}_{int} \right\} \approx \sqrt{(\tilde{\sigma})^2\,\Lambda\,nt + (\tilde{a})^2\,\Sigma\,nt}. \tag{36}$$

First, we estimated the covariance matrix $\Sigma$ by applying Assumption 2. When n is large enough, we have the approximation

$$\frac{1}{\sqrt{n}}\left( \vec{N}_{nt} - E(\vec{N}_{nt}) \right) \approx \Sigma^{1/2}\,\vec{W}_t, \quad t \in [0, 1]. \tag{37}$$

Taking the covariance of both sides of (37), we have

$$\frac{\operatorname{Cov}\left( N_i(nt), N_j(nt) \right)}{nt} \approx \Sigma_{i,j}, \quad t \in [0, 1], \quad i, j = 1, 2, \dots, d. \tag{38}$$

Then, we can derive the estimated $\Sigma$ for the 5 stocks in Table 8.

Table 8. Estimated covariance matrix $\Sigma$ of 5 stocks via the FCLT assumption.

| Ticker | INTC | MSFT | AAPL | AMZN | GOOG |
|---|---|---|---|---|---|
| INTC | 0.4844 | 0.1719 | 1.2393 | 0.5317 | 0.5312 |
| MSFT | 0.1719 | 0.5634 | 1.8361 | 0.7834 | 0.7162 |
| AAPL | 1.2393 | 1.8361 | 62.3800 | 6.7811 | 6.4331 |
| AMZN | 0.5317 | 0.7834 | 6.7811 | 19.2883 | 1.9617 |
| GOOG | 0.5312 | 0.7162 | 6.4331 | 1.9617 | 22.7980 |

Comparisons of the real and theoretical standard deviations can be found in Figure 5.
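One way to turn Equation (38) into an estimator is sketched below. The paper does not state how the covariance of the counts was computed, so the windowing scheme here is an assumption: the observation period is cut into non-overlapping windows of equal length, events are counted per stock in each window, and the sample covariance of those counts is divided by the window length. The names `estimate_sigma`, `event_times`, `horizon`, and `window` are hypothetical.

```python
import numpy as np

def estimate_sigma(event_times, horizon, window):
    """Estimate Sigma from per-stock event timestamps (in seconds).

    event_times: list of 1-D arrays, one array of timestamps per stock.
    horizon:     total observation length in seconds.
    window:      window length in seconds (assumed to play the role of nt).
    """
    n_windows = int(horizon // window)
    edges = np.arange(n_windows + 1) * window
    # counts[i, k] = number of events of stock i in window k.
    counts = np.array(
        [np.histogram(np.asarray(t, dtype=float), bins=edges)[0]
         for t in event_times]
    )
    # Sample covariance across windows, scaled by the window length,
    # mirroring Cov(N_i(nt), N_j(nt)) / (nt) in Equation (38).
    return np.atleast_2d(np.cov(counts)) / window
```

The result is a symmetric d-by-d matrix that can be compared with Table 8 in structure only; the numerical values depend on the window length chosen.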
Since the results for INTC and MSFT are good enough with the 2-state Markov chain (+δ, −δ) in FCLT I, we also applied the 2-state Markov chain for INTC and MSFT here. For AAPL, GOOG, and AMZN, we used the MGCPP model with a 7-state Markov chain. Window sizes here start from 1 s and increase to 20 min in time steps of 10 s. As can be seen in Figure 5, the results for FCLT II are as good as the FCLT I results in Figures 1–4. We also computed the MSE and coefficients in Table 9. Benchmark coefficients are from the least-squares regression curves, as for the benchmarks in Table 6.

Table 9. The MSE and coefficients computed by MGCPP FCLT II.

| Ticker | MSE | Benchmark Coefficient | MGCPP Coefficient | Percentage Error |
|---|---|---|---|---|
| INTC 2-state | 1.4452 × 10^-5 | 2.2361 × 10^-3 | 2.0958 × 10^-3 | 6.27% |
| MSFT 2-state | 6.6227 × 10^-5 | 2.5157 × 10^-3 | 2.4919 × 10^-3 | 0.94% |
| AAPL 7-state | 6.1382 × 10^-3 | 2.7799 × 10^-2 | 2.8788 × 10^-2 | 3.56% |
| GOOG 7-state | 8.0981 × 10^-3 | 3.0736 × 10^-2 | 2.8686 × 10^-2 | 6.67% |
| AMZN 7-state | 1.1156 × 10^-2 | 1.8940 × 10^-2 | 1.4747 × 10^-2 | 22.14% |
| Overall Percentage Error | | | | 7.92% |

We see that the percentage errors of MSFT and AAPL are very small (less than 5%) and the results of INTC and GOOG are also good (less than 10%). The percentage error of AMZN is large, but it is still smaller than the error derived from FCLT I in Table 7. In general, the simulation results of FCLT II are as good as those of FCLT I, and we can apply FCLT II to model the mid-price.
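The benchmark coefficient and the percentage error reported in Table 9 can be illustrated with the short sketch below. It assumes the square-root curve has no intercept, so the benchmark is the least-squares coefficient c in std ≈ c·sqrt(window), and it takes the percentage error relative to the benchmark coefficient, which reproduces the tabulated values. The function names are hypothetical.

```python
import numpy as np

def sqrt_fit_coefficient(window_sizes, empirical_std):
    """Least-squares coefficient c for empirical_std ~ c * sqrt(window)."""
    x = np.sqrt(np.asarray(window_sizes, dtype=float))
    y = np.asarray(empirical_std, dtype=float)
    # Closed-form solution of min_c sum (y - c*x)^2 (no intercept assumed).
    return float(np.dot(x, y) / np.dot(x, x))

def percentage_error(benchmark, model):
    """Absolute error of the model coefficient relative to the benchmark, in %."""
    return abs(benchmark - model) / abs(benchmark) * 100.0

# INTC row of Table 9: benchmark 2.2361e-3, MGCPP 2.0958e-3.
intc_error = percentage_error(2.2361e-3, 2.0958e-3)  # about 6.27
```

The same function applied to the other rows of Table 9 gives 0.94%, 3.56%, 6.67%, and 22.14%, whose average is the overall 7.92%.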
[Figure 5 panels: "AAPL thm2 MGCPP with 7-state Markov chain", "GOOG thm2 MGCPP with 7-state Markov chain", "INTC thm2 MGCPP with 2-state Markov chain (+δ, −δ)", "MSFT thm2 MGCPP with 2-state Markov chain (+δ, −δ)", and "AMZN thm2 MGCPP with 7-state Markov chain". Axes: Window Size (Sec) vs. Standard Deviation. Curves: Empirical Std, Compound point std.]

Figure 5. Standard deviation comparisons for 5 stocks by FCLT II for the MGCPP. INTC and MSFT are simulated with a 2-state Markov chain, while AAPL, AMZN, and GOOG use a 7-state Markov chain.

4.3. Rolling Cross-Validation

In this section, we test the forecast ability of the MGCPP model. Since we did not assume that the multivariate point process $\vec{N}_t$ is stationary, we cannot apply K-fold cross-validation directly. Here, we used the rolling K-fold cross-validation method proposed in Bergmeir et al. (2018). We divided the last 50 min of data into 5 disjoint 10-min windows for each stock. In fold 1, we take the first 280 min of data as the training set to estimate the parameters. Then, we use the data in the next 10-min window to calculate the percentage error. Next, we merge the test set into the training set of fold 1 to form the new training set in fold 2 and take the next 10-min window as the new test set. Repeating this procedure 5 times, we obtain 5 percentage errors. The mean of the 5 percentage errors is the test error E for this stock. The overall test error for our multivariate model is the average of all test errors.
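The fold layout described above can be written down explicitly. The sketch below only generates the training and test intervals (in minutes from the start of the data) and averages a list of fold errors; the parameter estimation and error computation inside each fold are not shown. The function names `rolling_folds` and `mean_error` are hypothetical.

```python
def rolling_folds(train_minutes=280, test_minutes=10, n_folds=5):
    """Return a list of ((train_start, train_end), (test_start, test_end)).

    The training set always starts at minute 0 and grows by one test
    window per fold; each test window directly follows its training set.
    """
    folds = []
    for k in range(n_folds):
        train_end = train_minutes + k * test_minutes
        folds.append(((0, train_end), (train_end, train_end + test_minutes)))
    return folds

def mean_error(errors):
    """Average of the per-fold percentage errors (the test error E)."""
    return sum(errors) / len(errors)

folds = rolling_folds()
# Fold 1 trains on minutes 0-280 and tests on 280-290;
# fold 5 trains on minutes 0-320 and tests on 320-330.
```

For example, averaging the five INTC fold errors reported in Table 10 (6.75, 0.39, 3.16, 14.32, 16.60) with `mean_error` gives approximately 8.24.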
Figure 6 gives an example diagram for the rolling cross-validation.

Figure 6. Diagram for the rolling cross-validation.

Table 10 lists the test errors for different tickers and the overall test error for the MGCPP model. As can be seen from the table, the test error for each stock is relatively large, and the overall test error (15.46%) is nearly double the overall percentage error (7.92%) in Table 9. That is because the results in Table 9 are fitting errors, while the test errors in Table 10 are forecast errors. We did not use any future information when conducting the forecast task. So, even though the 15.46% overall test error is not as good as the fitting error, it is still a good prediction in the LOB and can provide insight for the forecast task.

Table 10. Test errors for different tickers by applying 5-fold cross-validation. The errors are percentage errors between benchmark coefficients and the MGCPP coefficients.

| Ticker | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Mean Error |
|---|---|---|---|---|---|---|
| INTC | 6.75% | 0.39% | 3.16% | 14.32% | 16.60% | 8.24% |
| MSFT | 20.33% | 31.35% | 16.96% | 8.33% | 22.61% | 19.92% |
| AAPL | 8.22% | 0.51% | 22.53% | 21.34% | 23.33% | 15.01% |
| GOOG | 19.60% | 20.41% | 16.41% | 6.13% | 12.51% | 15.19% |
| AMZN | 20.78% | 4.87% | 7.98% | 18.81% | 42.15% | 18.92% |

Overall Test Error $E_{test}$ = 15.46%

5. Conclusions and Future Work

In this paper, we proposed an MGCPP model for mid-price modeling in the limit order book. This kind of process is a generalization of several stochastic models in the limit order market. We applied LOBSTER data to conduct simulations and found that the multivariate generalized model is as good as the general compound Hawkes process model. We also tested the prediction ability of this kind of process. In general, the MGCPP performs very well in LOB modeling, and it can be a meaningful reference in mid-price prediction.
In the future, we will explore more applications of the MGCPP and consider related option pricing problems under this kind of framework.

Author Contributions: Q.G.: methodology, software, validation, data curation, visualization, writing (original draft preparation). B.R.: conceptualization, methodology. A.S.: project administration, supervision, writing (review and editing), conceptualization, methodology. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by NSERC grant number: RT732266.

Acknowledgments: All authors wish to thank NSERC for continuing support.

Conflicts of Interest: The authors declare no conflict of interest.

References

Bacry, Emmanuel, Sylvain Delattre, Marc Hoffmann, and Jean-François Muzy. 2013. Some limit theorems for Hawkes processes and application to financial statistics. Stochastic Processes and their Applications 123: 2475–99.
Bowsher, Clive G. 2007. Modelling security market events in continuous time: Intensity based, multivariate point process models. Journal of Econometrics 141: 876–912.
Brémaud, Pierre, and Laurent Massoulié. 1996. Stability of nonlinear Hawkes processes. The Annals of Probability 24: 1563–88.
Bjork, Tomas. 2011. Introduction to Point Processes from a Martingale Point of View. Stockholm: KTH.
Bauwens, Luc, and Nikolaus Hautsch. 2009. Modelling Financial High Frequency Data Using Point Processes. Berlin/Heidelberg: Springer.
Bergmeir, Christoph, Rob J. Hyndman, and Bonsoo Koo. 2018. A note on the validity of cross-validation for evaluating autoregressive time series prediction. Computational Statistics and Data Analysis 120: 70–83.
Cartea, Álvaro, Sebastian Jaimungal, and José Penalva. 2015. Algorithmic and High-Frequency Trading. Cambridge: Cambridge University Press.
Chen, Shizhe, Ali Shojaie, Eric Shea-Brown, and Daniela Witten. 2019. The Multivariate Hawkes Process in High Dimensions: Beyond Mutual Excitation. arXiv:1707.04928v2.
Cont, Rama, and Adrien De Larrard. 2013. Price dynamics in a Markovian limit order market. SIAM Journal on Financial Mathematics 4: 1–25.
Embrechts, Paul, Thomas Liniger, and Lu Lin. 2011. Multivariate Hawkes processes: An application to financial data. Journal of Applied Probability 48: 367–78.
Guo, Qi, and Anatoliy Swishchuk. 2020. Multivariate general compound Hawkes processes and their applications in limit order books. Wilmott 107: 42–51.
Lemonnier, Rémi, Kevin Scaman, and Argyris Kalogeratos. 2017. Multivariate Hawkes Processes for Large-Scale Inference. Paper presented at Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, February 4–9.
Liniger, Thomas Josef. 2009. Multivariate Hawkes Processes. Ph.D. dissertation, Swiss Federal Institute of Technology in Zurich, Zurich, Switzerland.
Norris, James R. 1998. Markov Chains. Cambridge: Cambridge University Press.
Rao, B. L. S. Prakasa. 2009. Conditional independence, conditional mixing and conditional association. Annals of the Institute of Statistical Mathematics 61: 441–60.
Swishchuk, Anatoliy. 2017. Risk model based on compound Hawkes process. Wilmott 94: 50–57.
Swishchuk, Anatoliy, and Nelson Vadori. 2017. A semi-Markovian modelling of limit order markets. SIAM Journal on Financial Mathematics 8: 240–73.
Swishchuk, Anatoliy, Tyler Hofmeister, Katharina Cera, and Julia Schmidt. 2017. General semi-Markov model for limit order books. International Journal of Theoretical and Applied Finance 20: 1750019.
Swishchuk, Anatoliy, and Aiden Huffman. 2020. General compound Hawkes processes in limit order books. Risks 8: 28.
Vinkovskaya, Ekaterina. 2014. A Point Process Model for the Dynamics of LOB. Ph.D. dissertation, Columbia University, New York, NY, USA.
Vadori, Nelson, and Anatoliy Swishchuk. 2015. Strong law of large numbers and central limit theorems for functionals of inhomogeneous Semi-Markov processes. Stochastic Analysis and Applications 33: 213–43.
Yang, Yingxiang, Jalal Etesami, Niao He, and Negar Kiyavash. 2017. Online Learning for Multivariate Hawkes Processes. Paper presented at 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA, December 4–9.
Zheng, Ban, François Roueff, and Frédéric Abergel. 2014. Ergodicity and scaling limit of a constrained multivariate Hawkes process. SIAM Journal on Financial Mathematics 5: 99–136.
Zhu, Lingjiong. 2013. Central limit theorem for nonlinear Hawkes processes. Journal of Applied Probability 50: 760–71.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Risks – Multidisciplinary Digital Publishing Institute
Published: Sep 11, 2020