Ensemble Forecasting for Intraday Electricity Prices: Simulating Trajectories

Recent studies concerning point electricity price forecasting have shown evidence that the hourly German Intraday Continuous Market is weak-form efficient. Therefore, we take a novel, advanced approach to the problem. A probabilistic forecasting of the hourly intraday electricity prices is performed by simulating trajectories in every trading window to obtain a realistic ensemble that allows for more efficient intraday trading and redispatch. A generalized additive model is fitted to the price differences with the assumption that they follow a zero-inflated distribution, precisely a mixture of the Dirac and the Student's t-distributions. Moreover, the mixing term is estimated using a high-dimensional logistic regression with lasso penalty. We model the expected value and volatility of the series using, among others, autoregressive and no-trade effects or load, wind and solar generation forecasts, and accounting for the non-linearities in e.g. time to maturity. Both the in-sample characteristics and forecasting performance are analysed using a rolling window forecasting study. Multiple versions of the model are compared to several benchmark models and evaluated using probabilistic forecasting measures and significance tests. The study aims to forecast the price distribution in the German Intraday Continuous Market in the last 3 hours of trading, but the approach allows for application to other continuous markets, especially in Europe. The results show the superiority of the mixture model over the benchmarks, with the largest gain coming from the modelling of the volatility. They also indicate that the introduction of XBID reduced the market volatility.
Keywords: electricity price forecasting, power markets, intraday market, continuous-trade markets, XBID, ensemble forecasting, probabilistic forecasting, short-term forecasting, trajectories, generalized additive models, lasso, logistic regression, zero-inflated distribution, scenario simulation

Accepted for publication in Applied Energy. arXiv:2005.01365v3 [q-fin.ST] 29 Aug 2020

1 Introduction

Intraday continuous electricity markets gain in importance every day [1]. Their primary purpose is to handle the uncertainty in electricity generation and load that has arisen since the day-ahead market [2]. A number of events can cause the uncertainty, e.g. an unexpected power plant outage or changing weather conditions. The latter is the result of the global trend of investing in weather-dependent renewable power sources and is a subject of modelling and forecasting [3]. The need for intraday continuous trading is fulfilled by the power exchanges and transmission system operators (TSO) [4]. They allow the market participants to trade the energy continuously up to 5 minutes before the delivery, e.g. in France or Germany, and to trade it cross-border, e.g. using the cross-border intraday (XBID) market [5].

Even though there is clear evidence of the importance of this kind of market, researchers do not investigate it in terms of forecasting as willingly as the day-ahead market. The day-ahead market is the main electricity spot market with a long history of research on electricity price forecasting [6]. Recent studies on electricity price forecasting (EPF) in day-ahead markets consider, among others, probabilistic forecasting and forecast combination. Nowotarski and Weron [7] present a review of probabilistic EPF and Muniain and Ziel [8] use it to simulate peak and off-peak prices. Marcjasz et al. [9] combine point forecasts obtained using different calibration windows, while Uniejewski et al. [10] and Serafin et al. [11] do it for probabilistic forecasts.
Hybrid models [12-14] and neural networks [15-17] also make up a large part of the recent EPF literature. Market integration plays an important role in price formation in both day-ahead and intraday markets as well, which is elaborated by Lago et al. [18] and Kath [5].

The role of the intraday markets in the balancing of electricity systems was emphasized and explained by Ocker and Ehrhart [19] and Koch and Hirth [20] on the basis of the German electricity market. They observed that the introduction of the intraday continuous market in Germany partially led to a substantial decrease in the demand for balancing energy while the wind and solar energy generation increased. Karanfil and Li [21] clarify the reason for the spread between day-ahead and intraday prices in Denmark, while Maciejowska et al. [22] forecast the price spread between the day-ahead and intraday markets based on the Polish and German data. The continuity of the intraday market has encouraged researchers to investigate the transaction arrival process [23], bidding behaviour [24-26] and optimal trading strategies [27-29]. The impact of fundamental regressors on the price formation in the intraday market was examined by Pape et al. [30], Gürtler and Paulsen [31], and Kremer et al. [32].

The literature on EPF in the intraday markets is not as broad as that on the day-ahead markets or that on other aspects of the intraday markets. Monteiro et al. [33] and Andrade et al. [34] conducted EPF for the Iberian intraday market; however, it is not a continuous market, and thus their studies are more similar to those on day-ahead markets. Uniejewski et al. [35], Narajewski and Ziel [36] and Marcjasz et al. [37] performed EPF in the German Intraday Continuous Market, while Oksuz and Ugurlu [38] did so in the Turkish intraday market. An outcome of the second of these studies, Narajewski and Ziel [36], was an indication of the weak-form efficiency of the investigated market.
This was partially confirmed by Janke and Steinke [39], who forecasted the distribution of prices during the last three hours of trading and concluded that forecasting of the central quantiles yields only a marginal improvement over the naive benchmark. However, Marcjasz et al. [37] managed to outperform the most recent price by using an ensemble of it and a lasso-estimated model.

The only four papers on EPF in the German intraday market considered the $\mathrm{ID}_3$-Price (a volume-weighted average price of transactions in the last three hours before delivery) as the most important price index in the German intraday market and conducted forecasting of it. This paper focuses on the $\mathrm{ID}_3$ index as well, but not directly. Instead of forecasting its price, we simulate the paths of the 5-minute volume-weighted average price during the time frame of the index. This way we obtain a distribution forecast of the prices in every 5-minute window during the last three hours before the delivery. An example of this approach can be seen in Figure 1. We motivate our research with the results on the weak-form efficiency of the market concluded by Narajewski and Ziel [36] and a possible application of the methodology to electricity trading and optimal redispatch management.

For the modelling and forecasting of the trajectories, we utilize the generalized additive models for location, scale and shape (GAMLSS) [40], which extend the generalized additive models (GAM) [41]. This methodology has found applications in electricity load [42, 43] and day-ahead price [44-46] forecasting, but never in the intraday electricity markets.

Figure 1: Price trajectory for the hourly product with delivery on 15.07.2016 at 12:00, plotted against time to delivery (minutes). The black part is the realization and the colourful part consists of 100 simulations from the Gaussian random walk. Time of forecasting is indicated by the green dashed line.
The model for the price difference $\Delta P$ is fitted to the Student's t-distribution and mixed with the Dirac distribution, i.e. $\Delta P \sim (1 - \alpha)\delta_0 + \alpha t$. Here $\alpha$ is assumed to be a Bernoulli variable with probability $\pi$ and is modelled using logistic regression. We estimate it with the lasso method [47]. A broader description of the modelling exercise can be found in Section 4. The forecasting part utilizes a rolling window study. This is a very common study type in EPF and is widely utilized by researchers [35, 36]. We analyse both the in-sample characteristics and the out-of-sample forecasting performance.

The major contributions of this paper are as follows:

(1) It is the first work on the price trajectories in intraday continuous markets, which are a new and developing part of the electricity markets.
(2) A rigorous presentation and discussion of all characteristics of the market, like trading frequency and volatility.
(3) We propose a model that utilizes a mixture of GAMLSS and logit-lasso estimation methods and generates realistic ensembles, which allows for efficient decision-making, especially for trading and redispatch.
(4) The components of the proposed model are interpreted with respect to the market behaviour, highlighting the impact of the XBID introduction and relevant features, like wind and solar generation, load, calendar effects, trading activity and historic prices.
(5) The high-quality predictive performance of the proposed model is compared with simple benchmarks and sophisticated models with respect to point and probabilistic forecasting.

The remainder of this paper has the following structure. In the next section, we describe the market. The third section consists of the data description and descriptive statistics. Then, a broader explanation of the estimation methods is presented, followed by the description of the considered models and benchmarks.
In the fifth section, the forecasting study and evaluation measures are introduced and discussed in detail. In the sixth section, we present the results, which consist of the in-sample analysis with relevant model interpretations and the out-of-sample evaluation. The final section concludes this paper.

The methodology used in the paper is very innovative, especially with regard to the intraday electricity markets. We present it with an application to the German Intraday Continuous Market, but it can be easily used with any other continuous intraday electricity market.

2 Market description

The German Intraday Continuous Market allows trading of hourly, half-hourly and quarter-hourly products. We conduct the study using the most liquid part of the market, the hourly one. This is in line with other EPF studies in intraday markets. Trading of hourly products in the German Intraday Continuous Market begins every day at 15:00 for the 24 products of the following day. It is possible to trade the electricity until 30 minutes (in the whole market) and up to 5 minutes (within the respective control zones) before the delivery. In the meantime, between 22:00 and 60 minutes before the delivery, cross-border trading within the XBID system is possible (Kath [5]). This system went live on 18th June 2018. A visualization of the trading timeline can be seen in Figure 2. For more details on the German electricity market, we recommend the paper of Viehmann [4].

Figure 2: The daily routine of the German spot electricity market. d corresponds to the day of the delivery and s corresponds to the hour of the delivery. (Timeline marks: day d-1 at 12:00, 15:00, 16:00 and 22:00; day d at s - 60 min, s - 30 min, s - 5 min and s.)
The most important price measure in the German intraday market is the volume-weighted average price of transactions in the last three hours of trading, called $\mathrm{ID}_3$. The index takes into account only those transactions that happen until the gate closure 30 minutes before the delivery, so in fact it measures the last two and a half hours of trading before the gate closure. The relevance of $\mathrm{ID}_3$ is an outcome of the behaviour of traders in the intraday market: most of the transactions take place in this time period, making it very liquid. This results in a high interest of practitioners and researchers in the $\mathrm{ID}_3$-Price. For more details on the index, visit the webpage of EPEX SPOT or see e.g. Narajewski and Ziel [36].

To measure the prices during the trading period, we use the ${}_x\mathrm{ID}_y$ defined by Narajewski and Ziel [36]. Let us recall the definition of ${}_x\mathrm{ID}_y^{d,s}$. Let $b(d,s)$ be the start of the delivery of a product $s$ on day $d$. By $\mathbb{T}_{x,y}^{d,s} = [b(d,s) - x - y,\; b(d,s) - x)$, $x \ge 0$ and $y > 0$, we denote the time interval between $x + y$ and $x$ minutes before the delivery, and by $\mathcal{T}^{d,s}$ we denote the set of timestamps of transactions on the product. The ${}_x\mathrm{ID}_y^{d,s}$ is defined by

$$
{}_x\mathrm{ID}_y^{d,s} := \frac{1}{\sum_{k \in \mathbb{T}_{x,y}^{d,s} \cap \mathcal{T}^{d,s}} V_k^{d,s}} \sum_{k \in \mathbb{T}_{x,y}^{d,s} \cap \mathcal{T}^{d,s}} V_k^{d,s} P_k^{d,s}, \tag{1}
$$

where $V_k^{d,s}$ and $P_k^{d,s}$ are the volume and the price of the $k$-th trade within the transaction set $\mathbb{T}_{x,y}^{d,s} \cap \mathcal{T}^{d,s}$, respectively. Let us note that the ${}_x\mathrm{ID}_y^{d,s}$ is simply a volume-weighted average price of transactions in the time interval of length $y$ ending $x$ before the delivery (both measured in minutes here).

In the case of $\mathbb{T}_{x,y}^{d,s} \cap \mathcal{T}^{d,s} = \emptyset$ we use the value of ${}_{x+y}\mathrm{ID}_y^{d,s}$, that is to say the previously observed volume-weighted average price measured on a time period of the same length (footnote 1). In the case of no trades appearing since the start of trading, the price is set to the price of the corresponding Day-Ahead Auction.
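The definition in equation (1), together with the two fallback rules, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function name `id_price` and the representation of a trade as a `(minutes before delivery, volume, price)` tuple are assumptions made for the example.

```python
from typing import List, Tuple

# (minutes before delivery, volume in MWh, price in EUR/MWh); this layout is an assumption
Trade = Tuple[float, float, float]


def id_price(trades: List[Trade], x: float, y: float, day_ahead_price: float) -> float:
    """Volume-weighted average price of trades between x + y and x minutes before delivery.

    An empty window falls back to the previous window of the same length;
    if nothing was traded before the window, the day-ahead price is returned.
    """
    if x < 0 or y <= 0:
        raise ValueError("need x >= 0 and y > 0")
    earliest = max((m for m, _, _ in trades), default=0.0)
    while x < earliest:
        # [b - x - y, b - x) in clock time is (x, x + y] in minutes before delivery
        window = [(v, p) for m, v, p in trades if x < m <= x + y]
        volume = sum(v for v, _ in window)
        if volume > 0:
            return sum(v * p for v, p in window) / volume
        x += y  # empty window: step back to the previous one
    return day_ahead_price


trades = [(182.0, 10.0, 30.0), (181.0, 30.0, 34.0), (170.0, 5.0, 40.0)]
print(id_price(trades, 180, 5, day_ahead_price=25.0))  # 33.0
print(id_price(trades, 175, 5, day_ahead_price=25.0))  # 33.0 (empty window, falls back)
```

The second call shows the fallback: no trade occurs between 180 and 175 minutes before delivery, so the value of the preceding 5-minute window is carried forward.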
3 Data and descriptive statistics

The data used for this study consists of all transactions on hourly products in the German Intraday Continuous Market between 16th July 2015 and 1st October 2019. More general descriptive statistics were presented by Narajewski and Ziel [36]. As mentioned in the previous section, the XBID system started to function on 18th June 2018. This means that XBID trades were possible only on around 30% of the days in the data. In the forecasting study, we use D = 365 days of the data as in-sample, and therefore the analysis in this section is based only on the initial in-sample, i.e. the data between 16th July 2015 and 14th July 2016. The start of the data is set to the first day of the lead time change in Germany from 45 min to 30 min in order to avoid this structural break.

In this paper, we aggregate the transactions using the ${}_x\mathrm{ID}_y^{d,s}$ with $y = 5$ min, and this way we obtain dense time series data. As said before, we are particularly interested in the evolution of prices during the last 2.5 hours of trading before the gate closure, so we use $x \in \mathcal{J} = \{180, 175, \dots, 35, 30\}$, where $x$ is given in minutes. This way we observe $T = 31$ price points a day, which results in $TD = 31 \cdot 365 = 11315$ in-sample observations and $T$-dimensional simulated trajectories. The setting used below is very specific, but the approach can be applied to any other continuous intraday market with other input variables.

As the market shows strong indications of weak-form efficiency, we focus on modelling the price differences $\Delta P_t^{d,s} = {}_{(T-t)y+30}\mathrm{ID}_y^{d,s} - {}_{(T-(t-1))y+30}\mathrm{ID}_y^{d,s}$ instead of the pure prices $P_t^{d,s} = {}_{(T-t)y+30}\mathrm{ID}_y^{d,s}$. We also introduce the $P_t^{d,s}$ notation for simplicity. Due to the usage of price differences and to the fact that the data is aggregated on a 5-minute grid, we observe a high frequency of observations with no trade, and thus price differences equal to 0. This is depicted in Figure 3. One can see that a lack of transactions happens more often in the night and morning hours.

Footnote 1 (on the no-trade rule of Section 2): In Narajewski and Ziel [36] this value is set to the price of the last transaction. The adjustment is caused by the fact that in this paper we work with 5-minute time intervals, leading to a significant number of no-trade events in a time interval. Using the last transaction price would often result in an artificial change of the price compared to the previously observed ${}_x\mathrm{ID}_y$.

In Figure 4, we zoom in on the tails of the histograms from Figure 3. We also plot there the densities of 4 distributions fitted to the data: the normal distribution $N(0, \hat{\sigma})$ and the t-distribution $t(0, \hat{\sigma}, \nu)$ with fixed $\nu \in \{2.5, 3, 4\}$ and $\hat{\sigma}$ estimated by maximum likelihood, ignoring the no-trade observations. Based on Figure 4 it is clear that the price differences $\Delta P_t^{d,s}$ are heavy-tailed. One can see that even the t-distribution with $\nu = 4$ seems to be not heavy-tailed enough for the data. This indicates that the tail index of the price differences may be lower than 4, which would mean that the fourth moment of $\Delta P_t^{d,s}$ might not exist, a strong indication of heavy tails.

Figure 5 shows the frequency of the no-trade event over time to delivery. We see that the overall behaviour is very similar across all products: the closer to the delivery, the fewer observations without transactions. What differs among the products is the level of the frequency. The frequency decreases as the product time increases, and the reason for it may be the time distance from the Day-Ahead and Intraday Auctions. It is intuitive that since these auctions the uncertainty could be smaller for the first products and higher for the last ones, but the smallest values of the frequency are achieved not for the evening, but for the day-peak hours. This can be explained by higher activity in the market due to higher expected demand.

Figure 6 shows the in-sample standard deviation of the price differences $\Delta P_t^{d,s}$ over time to delivery. The dashed lines depict the standard deviation of the whole samples, independent of time. If the price processes were similar to a random walk, the sample standard deviation over time should oscillate around these dashed lines. The behaviour in Figure 6 is clearly different, with a spike in the last 30 minutes before gate closure. This suggests that the variance should be a subject of modelling. Figure 7 presents the partial autocorrelation function of the absolute price differences $|\Delta P_t^{d,s}|$ to explore potential conditional heteroscedasticity in the heavy-tailed data. Figure 7 shows that the first three lags are the most significant. Also, lags up to 6 may contain some information. Surprisingly, lags around 31 seem to be significant too, but this is most likely some daily dependence.

Figure 3: Histograms of the initial in-sample price differences $\Delta P_t^{d,s}$ for selected hours (00:00, 06:00, 12:00, 18:00). Blue colour corresponds to the no-trade cases.

Figure 4: Histograms of the tails of the initial in-sample price differences $\Delta P_t^{d,s}$ for selected hours (00:00, 06:00, 12:00, 18:00). Solid lines depict the densities of the distributions according to the legend: $N(0, \sigma)$, $t(0, \sigma, \nu = 2.5)$, $t(0, \sigma, \nu = 3)$ and $t(0, \sigma, \nu = 4)$.

Figure 5: Frequency of the no-trade event in the initial in-sample price differences $\Delta P_t^{d,s}$ over time to delivery (minutes) for all 24 products.
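The indexing of the 5-minute grid and the resulting price differences can be checked with a short sketch. The function names below are hypothetical, and the example assumes that the prices $P_0, \dots, P_T$ have already been computed on the grid.

```python
from typing import List


def grid_minutes(T: int = 31, y: int = 5, gate: int = 30) -> List[int]:
    """Window ends x_t = (T - t) * y + gate for t = 0, ..., T, in minutes before delivery."""
    return [(T - t) * y + gate for t in range(T + 1)]


def price_differences(prices: List[float]) -> List[float]:
    """Delta P_t = P_t - P_{t-1} for t = 1, ..., T, given P_0, ..., P_T."""
    return [b - a for a, b in zip(prices, prices[1:])]


def no_trade_frequency(volumes: List[float]) -> float:
    """Share of 5-minute windows without any traded volume."""
    return sum(v == 0 for v in volumes) / len(volumes)


x = grid_minutes()
print(x[0], x[1], x[-1], len(x) - 1)                # 185 180 30 31
print(31 * 365)                                     # 11315 in-sample observations
print(price_differences([30.0, 30.0, 31.5, 31.0]))  # [0.0, 1.5, -0.5]
print(no_trade_frequency([0.0, 12.5, 3.0, 0.0]))    # 0.5
```

The first grid point, 185 minutes before delivery, corresponds to $t = 0$ and only serves as the starting price; the $T = 31$ modelled differences end at 180, 175, ..., 30 minutes before delivery.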
Figure 6: Standard deviation of the initial in-sample price differences $\Delta P_t^{d,s}$ for selected hours (00:00, 06:00, 12:00, 18:00) over time to delivery (minutes). Dashed lines indicate the standard deviation independent of time.

Figure 7: Partial autocorrelation function of the initial in-sample absolute price differences $|\Delta P_t^{d,s}|$ for selected hours (00:00, 06:00, 12:00, 18:00), lags 0 to 30. Blue, dashed lines indicate the confidence intervals.

4 Modelling and estimation

We assume the price differences $\Delta P_t^{d,s}$ to follow a 4-parametric distribution: a mixture of the Dirac $\delta_0$ distribution and the 3-parametric t-distribution, sometimes referred to as the zero-inflated t-distribution:

$$
G_t^{d,s} = (1 - \alpha_t^{d,s})\,\delta_0 + \alpha_t^{d,s} F_t^{d,s}, \tag{2}
$$

where $\alpha_t^{d,s} = \mathbb{1}(V_t^{d,s} \ne 0)$ is a Bernoulli variable of the event that there is a non-zero volume of energy traded on product $s$ on day $d$ at time $t$, with probability $\pi_t^{d,s}$, and $F_t^{d,s}$ is the 3-parametric t-distribution $t(\mu_t^{d,s}, \sigma_t^{d,s}, \nu_t^{d,s})$, where $\mu_t^{d,s} \in \mathbb{R}$ is the mean, $\sigma_t^{d,s} > 0$ the standard deviation and $\nu_t^{d,s} > 2$ the degrees of freedom.

The t-distribution is estimated within the GAMLSS framework [40]. The GAMLSS is an extension of the GAM [41] and it allows modelling not only the expected value of a response variable, but potentially also the higher moments, represented by scale and shape parameters. Namely, let $Y$ be a random variable with a density function $f(y \mid \Theta)$, where $\Theta$ is a set of up to four distribution parameters. Then each $\theta_i \in \Theta$ may be modelled by

$$
g_i(\theta_i) = \sum_{j=1}^{J_i} h_{ji}(x_{ji}), \tag{3}
$$

where $g_i$ is some link function, $J_i$ is the number of explanatory variables and $h_{ji}$ is a smooth function of the explanatory variable $x_{ji}$. Note that the function $h_{ji}$ does not have to be a parametric function.
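To make the mixture in equation (2) concrete, the following sketch draws from a zero-inflated t-distribution using only the Python standard library. It is an illustration under stated assumptions rather than the estimation code of the paper: the function name is hypothetical, and the t draw is rescaled so that $\sigma$ is the standard deviation, as in the parameterization described above.

```python
import math
import random


def sample_zero_inflated_t(pi, mu, sigma, nu, rng):
    """One draw from (1 - alpha) * delta_0 + alpha * t(mu, sigma, nu), alpha ~ Bernoulli(pi)."""
    if not 0.0 <= pi <= 1.0 or sigma <= 0.0 or nu <= 2.0:
        raise ValueError("need 0 <= pi <= 1, sigma > 0 and nu > 2")
    if rng.random() >= pi:
        return 0.0  # alpha = 0: no trade, so the price difference is exactly zero
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)      # chi-square with nu degrees of freedom
    t = z / math.sqrt(chi2 / nu)                # standard Student's t draw
    scale = sigma * math.sqrt((nu - 2.0) / nu)  # makes the standard deviation equal sigma
    return mu + scale * t


rng = random.Random(1)
draws = [sample_zero_inflated_t(0.6, 0.0, 2.0, 10.0, rng) for _ in range(20000)]
share_zero = sum(d == 0.0 for d in draws) / len(draws)
print(round(share_zero, 2))  # close to 1 - pi = 0.4
```

The share of exact zeros approximates $1 - \pi$, which is the no-trade frequency the Dirac component is meant to capture.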
In our exercise, we use the following link functions:

$$
\begin{aligned}
g_1(\mu) &= \mu, \\
g_2(\sigma) &= \log(\sigma)\,\mathbb{1}(\sigma \le 1) + (\sigma - 1)\,\mathbb{1}(\sigma > 1), \\
g_3(\nu) &= \log(\nu - 2).
\end{aligned}
\tag{4}
$$

$g_1$ is a standard link function for the expected value. $g_2$ is a link function that we call "logident", and we introduce it in order to avoid an exponential inverse function for high values of the estimates. The third link function is simply a natural logarithm shifted by 2 to preserve the condition that $\nu > 2$. The three link functions are plotted in Figure 8. The models for $F_t^{d,s}$ are estimated using the gamlss package in R [48].

Figure 8: An illustration of the three link functions $g_1$, $g_2$ and $g_3$.

Due to the novelty of the exercise, we cannot use any literature benchmarks, nor any standard approaches to the modelling of volatility, e.g. GARCH. Even though the data looks like a time series, the biggest problem lies in the gap between days. We model each product separately, and for each product we have 31 observations every day. In the corresponding time series, the observations on day $d$ appear at 5-minute intervals, while the time difference between the last observation on day $d$ and the first on day $d + 1$ is around 21 hours. Furthermore, there is no direct link between the prices on day $d$ and day $d + 1$, as they are for different delivery periods with potentially different fundamental market situations. Thus, the usage of GARCH-type components to address conditional heteroscedasticity is not straightforward.

Instead, as simple benchmarks we use models that assume the distribution of $\Delta P^{d,s} = (\Delta P_1^{d,s}, \Delta P_2^{d,s}, \dots, \Delta P_T^{d,s})$ to be multivariate, random walk models, and a model that uses in-sample price differences to create an ensemble forecast. Also, linear quantile regression models with copulas are considered as advanced benchmarks. In the following subsection, the more complicated models are considered.
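The three link functions of equation (4) and their inverses are simple enough to write down directly. The sketch below uses hypothetical function names and is independent of the gamlss implementation; it mainly shows why the "logident" link is convenient: above zero its inverse grows linearly instead of exponentially.

```python
import math


def g1(mu):
    """Identity link for the mean."""
    return mu


def g2(sigma):
    """'logident' link: log below 1, shifted identity above 1 (continuous at sigma = 1)."""
    return math.log(sigma) if sigma <= 1.0 else sigma - 1.0


def g2_inv(eta):
    """Inverse of g2: exponential for eta <= 0, linear for eta > 0."""
    return math.exp(eta) if eta <= 0.0 else eta + 1.0


def g3(nu):
    """Shifted log link that keeps nu > 2."""
    return math.log(nu - 2.0)


def g3_inv(eta):
    """Inverse of g3."""
    return 2.0 + math.exp(eta)


print(g2(1.0), g2(3.0))  # 0.0 2.0
print(g2_inv(50.0))      # 51.0, whereas a pure log link would give exp(50)
print(g3_inv(0.0))       # 3.0
```

A large linear predictor therefore translates into a moderately large standard deviation rather than an explosive one.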
We model explicitly the probability of a non-zero number of transactions, the mean, and the variance of the fitted distribution. We present the models from the least to the most complex and show the results in the same order. This allows us to observe the gain caused by every new part of the model.

4.1 Mixture models

We introduce a dependency structure between the first three parameters of the $G_t^{d,s}$ distribution, i.e. $\pi_t^{d,s}$, $\mu_t^{d,s}$ and $\sigma_t^{d,s}$, and the data. For the fourth parameter, the degrees of freedom $\nu_t^{d,s}$, we assume constancy. The $G_t^{d,s}$ distribution is estimated in a 2-step approach. First, the $\pi_t^{d,s}$ parameter is estimated, and then the $F_t^{d,s}$ distribution is fitted to the in-sample price differences $\Delta P_t^{d,s}$ for which the value of $\alpha_t^{d,s}$ is 1.

In the first step, we build a logistic model for $\pi_t^{d,s}$:

$$
\begin{aligned}
\log\left(\frac{\pi_t^{d,s}}{1 - \pi_t^{d,s}}\right) = {} & \beta_0 + \underbrace{\sum_{j=1}^{3} \beta_j \Delta P_{t-j}^{d,s} + \sum_{j=1}^{6} \beta_{3+j} \left|\Delta P_{t-j}^{d,s}\right| + \beta_{10} \sum_{j=7}^{12} \left|\Delta P_{t-j}^{d,s}\right|}_{\text{price differences}} \\
& + \underbrace{\beta_{11}\mathrm{Mon}(d) + \beta_{12}\mathrm{Sat}(d) + \beta_{13}\mathrm{Sun}(d) + \sum_{j=1}^{31} \beta_{13+j}\mathrm{TtM}_j(t)}_{\text{time dummies}} \\
& + \underbrace{\beta_{45}\mathrm{DA}_{\mathrm{Load}}^{d,s} + \beta_{46}\mathrm{DA}_{\mathrm{Sol}}^{d,s} + \beta_{47}\mathrm{DA}_{\mathrm{WiOn}}^{d,s} + \beta_{48}\mathrm{DA}_{\mathrm{WiOff}}^{d,s}}_{\text{fundamental regressors}} + \underbrace{\sum_{j=1}^{K} \beta_{48+j}\,\bar{\pi}_{t-j}^{d,s}}_{\text{regression on } \alpha^{d,s}},
\end{aligned}
\tag{5}
$$

with $K$ denoting the number of averaging windows of $\alpha^{d,s}$ that enter the model.

The model explains the logit function with 4 main components: price difference impact, time dummies, fundamental regressors and regression on $\alpha^{d,s}$. The price difference impact consists of the 3 most recent price differences, the 6 most recent absolute price differences and a sum of the absolute price differences lagged by 7 to 12. This component addresses the overall impact of price volatility on $\pi_t^{d,s}$. We expect to observe more trades when the prices are more volatile. The time dummies consist of three weekday dummies and the time to maturity dummies. The weekday dummies for Monday, Saturday and Sunday are chosen based on the literature. A number of studies [49-51] have shown that the usage of these dummies in EPF substantially improves the forecasting performance. These three dummies indicate the end of the week, with Monday being a transition day.
The use of the time to maturity dummies becomes clear when we look again at Figure 5. It is expected that $\pi_t^{d,s}$ rises as we approach the gate closure. The fundamental regressors consist of the day-ahead forecasts of total load, solar generation, wind onshore generation and wind offshore generation. It is expected that a higher load and share of renewables should raise the uncertainty in the market and encourage market participants to trade more. Last but not least is the regression on $\alpha^{d,s}$. We do not use the regression directly, but instead we use the average of the last $j$ observed values of $\alpha^{d,s}$, which we denote by $\bar{\pi}_{t-j}^{d,s}$. We expect these values to have a significant impact on the prediction of $\pi_t^{d,s}$. Intuitively, the higher these averages, the higher the value of $\pi_t^{d,s}$.

Model (5) consists of 61 regressors in total. To avoid overfitting problems, we estimate the model using the least absolute shrinkage and selection operator (lasso) of Tibshirani [47]. Let us recall that if we have a logistic model $\log\left(\frac{\pi}{1-\pi}\right) = X\beta$ for the Bernoulli variable $\alpha$ with $P(\alpha = 1) = \pi$, then the lasso estimator is given by

$$
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \left\{ -l(\beta; \tilde{X}) + \lambda \left( \|\beta\|_1 - |\beta_0| \right) \right\}, \tag{6}
$$

where $l$ is the corresponding log-likelihood

$$
l(\beta; \tilde{X}) = \sum_{i=1}^{n} \left[ \alpha_i \tilde{X}_i \beta - \log\left(1 + e^{\tilde{X}_i \beta}\right) \right], \tag{7}
$$

$\tilde{X}$ is a standardization of $X$ and $\lambda$ is a tunable shrinkage parameter. This method has already found many successful applications in EPF and intraday markets [35, 36, 52]. In this exercise, we utilize the glmnet package in R by Friedman et al. [53]. The estimation is conducted using a BIC-tuned $\lambda$ value chosen from an exponential grid of 100 values.

Let us now take a look at the $F_t^{d,s}$ distribution in equation (2). We consider four versions of it. In the first one, we assume that $F_t^{d,s}$ follows $t(0, \sigma_t^{d,s}, \nu_t^{d,s})$ with constant $\sigma_t^{d,s}$ and $\nu_t^{d,s}$. We denote it simply by Mix.RW.t. The $F_t^{d,s}$ distribution is fitted to the in-sample price differences with a non-zero transaction number using the GAMLSS.
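The logit-lasso step of equations (6) and (7), including the BIC choice of $\lambda$ on an exponential grid, can be sketched without any external library. The paper uses glmnet in R; the code below is a substitute written for illustration only. It uses plain proximal gradient descent instead of glmnet's coordinate descent, scales the log-likelihood by $1/n$ as glmnet does, and counts the non-zero coefficients plus the intercept as the BIC degrees of freedom, which is an assumption. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(X):
    """Centre and scale every column (the X-tilde of equation (6))."""
    n, p = len(X), len(X[0])
    mean = [sum(r[j] for r in X) / n for j in range(p)]
    sd = [math.sqrt(sum((r[j] - mean[j]) ** 2 for r in X) / n) or 1.0 for j in range(p)]
    return [[(r[j] - mean[j]) / sd[j] for j in range(p)] for r in X]


def lambda_max(Xs, y):
    """Smallest penalty (up to rounding) for which all slopes are zero."""
    n, ybar = len(y), sum(y) / len(y)
    return max(abs(sum((yi - ybar) * r[j] for r, yi in zip(Xs, y))) / n
               for j in range(len(Xs[0])))


def fit_logit_lasso(Xs, y, lam, steps=300):
    """Proximal gradient descent; the intercept b0 is not penalized."""
    n, p = len(Xs), len(Xs[0])
    b0, b = 0.0, [0.0] * p
    lr = 4.0 / (p + 1)  # 1 / (upper bound on the Lipschitz constant of the gradient)
    for _ in range(steps):
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, r))) - yi
                 for r, yi in zip(Xs, y)]
        grad = [sum(e * r[j] for e, r in zip(resid, Xs)) / n for j in range(p)]
        b0 -= lr * sum(resid) / n
        # soft-thresholding is the proximal operator of the l1 penalty
        b = [math.copysign(max(abs(bj - lr * g) - lr * lam, 0.0), bj - lr * g)
             for bj, g in zip(b, grad)]
    return b0, b


def bic(Xs, y, b0, b):
    ll = 0.0
    for r, yi in zip(Xs, y):
        eta = b0 + sum(bj * xj for bj, xj in zip(b, r))
        ll += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    k = 1 + sum(bj != 0.0 for bj in b)
    return -2.0 * ll + k * math.log(len(y))


def select_by_bic(X, y, n_lambda=100, ratio=1e-3, steps=300):
    """Fit on an exponential lambda grid and return (lambda, b0, b) with the lowest BIC."""
    Xs = standardize(X)
    top = lambda_max(Xs, y)
    grid = [top * ratio ** (k / (n_lambda - 1)) for k in range(n_lambda)]
    fits = [(lam,) + fit_logit_lasso(Xs, y, lam, steps) for lam in grid]
    return min(fits, key=lambda f: bic(Xs, y, f[1], f[2]))
```

For a design matrix `X` (a list of rows) and a 0/1 trade indicator `y`, `select_by_bic(X, y)` returns the chosen penalty together with the intercept and the slopes on the standardized scale. The pure-Python loops are slow for 61 regressors and thousands of observations, so this is meant for understanding the procedure rather than for production use.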
With this model we can observe the gain of using a complex model for the $\pi_t^{d,s}$ parameter. Figure 9 shows the densities fitted to the histograms presented in Figure 3. They were obtained with model Mix.RW.t.

The second model utilizes $F_t^{d,s}$ with modelled $\mu_t^{d,s}$ and constant $\sigma_t^{d,s}$ and $\nu_t^{d,s}$, and we denote it by Mix.t.mu. This model helps us understand the outcome of modelling the expected value of $\Delta P_t^{d,s}$. However, a preliminary analysis has shown that most of the regressors used in model (5) were not significant for the modelling of $\mu_t^{d,s}$. Only the three most recent price differences were significant. Therefore, we model the expected value with

$$
g_1(\mu_t^{d,s}) = \beta_1 \Delta P_{t-1}^{d,s} + \beta_2 \Delta P_{t-2}^{d,s} + \beta_3 \Delta P_{t-3}^{d,s}. \tag{8}
$$

The next model uses $F_t^{d,s}$ with $\mu_t^{d,s} \equiv 0$, modelled $\sigma_t^{d,s}$ and constant $\nu_t^{d,s}$. We denote it by Mix.t.sigma. The formula for the standard deviation is as follows:

$$
\begin{aligned}
g_2(\sigma_t^{d,s}) = {} & \beta_0 + \underbrace{\sum_{j=1}^{6} \beta_j \left|\Delta P_{t-j}^{d,s}\right| + \beta_7 \sum_{j=7}^{12} \left|\Delta P_{t-j}^{d,s}\right|}_{\text{absolute price differences}} + \underbrace{\beta_8\mathrm{Mon}(d) + \beta_9\mathrm{Sat}(d) + \beta_{10}\mathrm{Sun}(d)}_{\text{weekday dummies}} \\
& + \underbrace{\beta_{11}\mathrm{DA}_{\mathrm{Load}}^{d,s} + \beta_{12}\mathrm{DA}_{\mathrm{Sol}}^{d,s} + \beta_{13}\mathrm{DA}_{\mathrm{WiOn}}^{d,s} + \beta_{14}\mathrm{DA}_{\mathrm{WiOff}}^{d,s}}_{\text{fundamental regressors}} + \underbrace{\beta_{15}\alpha_{t-1}^{d,s} + \beta_{16}\alpha_{t-2}^{d,s}}_{\text{lagged } \alpha^{d,s}} \\
& + \underbrace{h_1(P_{t-1}^{d,s}) + h_2(t)}_{\text{non-linear effects}},
\end{aligned}
\tag{9}
$$

where $h_1$ and $h_2$ are smooth non-linear P-spline functions. The P-splines simply combine equally-spaced B-splines and discrete penalties. More information on P-splines can be found in Eilers et al. [54]. Let us note that the model described by equation (9) uses many more regressors than the one in equation (8). The explanation of the choice of the variables is very similar to the one for the model described by equation (5).
We explain the standard deviation of the price differences with: lagged absolute price differences, weekday dummies, fundamental regressors, lagged values of $\alpha^{d,s}$, and non-linearities in the most recent price and the time to maturity variables. We expect that the absolute price changes are a suitable explanatory variable for the standard deviation, as motivated by Figure 7. The fundamental regressors are supposed to have a positive linear correlation with $\sigma_t^{d,s}$. For the Saturday and Sunday dummies we might expect a negative impact due to lower trading activity on weekends, but also a positive impact due to the fact that higher bid-ask spreads are plausible. The lagged values of $\alpha^{d,s}$ indicate whether the market participants traded lately, and thus we believe that they could identify a higher variance of the price differences. The last two regressors are expected to have a non-linear impact on the formation of $\sigma_t^{d,s}$, and therefore they are estimated using P-splines. Figure 6 already provides evidence that the standard deviation varies over time to maturity. Moreover, we suspect that extreme values of the most recent price $P_{t-1}^{d,s}$ result in a higher variance due to a relatively inelastic supply curve in extreme price areas.

Figure 9: Histograms of the initial in-sample price differences $\Delta P_t^{d,s}$ with fitted densities of model Mix.RW.t for selected hours (00:00, 06:00, 12:00, 18:00). Blue colour corresponds to the no-trade cases.

The last and at the same time the most complicated model uses $F_t^{d,s}$ with $\mu_t^{d,s}$ and $\sigma_t^{d,s}$ modelled and constant $\nu_t^{d,s}$. We denote it by Mix.t.mu.sigma. The $\mu_t^{d,s}$ is modelled using the formula from equation (8) and the $\sigma_t^{d,s}$ using the formula from equation (9).

Let us mention that we could make the mixture model even more complex by modelling the degrees of freedom parameter $\nu_t^{d,s}$.
However, a preliminary analysis has shown that it does not yield any significant improvement while heavily increasing the computational cost. Thus, in the forecasting study we analyse the performance of 8 models described in this section.

4.2 Simple benchmark models

The first benchmark model uses one of the D = 365 historical trajectories to model the price difference vector $\Delta P^{d,s} = (\Delta P_1^{d,s}, \Delta P_2^{d,s}, \dots, \Delta P_T^{d,s})$. We denote it by Naive and its formula is given by

$$
\Delta P^{d,s} = \Delta P^{d^*,s}, \tag{10}
$$

where $d^* \sim U(\{d-1, \dots, d-D\})$ is a uniform random variable indicating the day used to model the price difference. Let us note that a fixed $d^*$ index is used to model the whole price trajectory, i.e. for every $t \in \{1, 2, \dots, T\}$. This model assumes that the future trajectories can be forecasted using simply the past ones.

The second and the third benchmark models assume that the price difference vector $\Delta P^{d,s}$ follows a multivariate normal and a multivariate t-distribution, respectively. They are denoted by MV.N and MV.t and are given by

$$
\Delta P^{d,s} = \varepsilon^{d,s}, \tag{11}
$$

where $\varepsilon^{d,s} \sim N_T(0, \Sigma^{d,s})$ in the case of MV.N and $\varepsilon^{d,s} \sim t_T(0, \Sigma^{d,s}, \nu^{d,s})$ in the case of MV.t. Let us note that the covariance matrix $\Sigma^{d,s}$ and the degrees of freedom $\nu^{d,s}$ are estimated by fitting the respective distributions to the in-sample observations. Moreover, the degrees of freedom $\nu^{d,s}$ is assumed to be constant for all $t \in \{1, 2, \dots, T\}$.

The next benchmark model is the random walk version of the mixture model described by equation (2), and we denote it by RW.t.mix.D. The formula is as follows:

$$
\Delta P_t^{d,s} = P_t^{d,s} - P_{t-1}^{d,s} = \varepsilon_t^{d,s}, \tag{12}
$$

where $\varepsilon_t^{d,s} \sim G_t^{d,s}$ with $\hat{\pi}_t^{d,s} = \frac{1}{DT} \sum_{i=d-D}^{d-1} \sum_{t=1}^{T} \mathbb{1}(V_t^{i,s} \ne 0)$, $\mu_t^{d,s} \equiv 0$ and constant $\sigma_t^{d,s}$ and $\nu_t^{d,s}$. These values are estimated based on the in-sample data.

The fifth benchmark model is a modification of the RW.t.mix.D. We denote it by RW.t, and we simply set $\pi_t^{d,s} \equiv 1$, which means that we do not incorporate the mixing part and assume that the price differences follow the t-distribution.
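The Naive benchmark of equation (10) and the random walk of equation (12) can be turned into trajectory simulators with a few lines of standard-library Python. This is a sketch with hypothetical function names; the values of $\sigma$ and $\nu$ are passed in as given numbers, whereas the paper estimates them from the in-sample data.

```python
import math
import random


def estimate_pi(volumes_by_day):
    """Pooled share of 5-minute windows with non-zero traded volume (pi-hat in equation (12))."""
    flags = [v != 0 for day in volumes_by_day for v in day]
    return sum(flags) / len(flags)


def naive_trajectory(history, rng):
    """Equation (10): reuse the whole price-difference vector of one random past day."""
    return list(rng.choice(history))


def rw_trajectory(p0, T, pi, sigma, nu, rng):
    """Equation (12): price path with zero-inflated t increments; pi = 1 gives RW.t."""
    scale = sigma * math.sqrt((nu - 2.0) / nu)  # sigma is treated as the standard deviation
    path = [p0]
    for _ in range(T):
        step = 0.0
        if rng.random() < pi:  # a trade occurs in this window
            chi2 = rng.gammavariate(nu / 2.0, 2.0)
            step = scale * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)
        path.append(path[-1] + step)
    return path


rng = random.Random(42)
pi_hat = estimate_pi([[0.0, 5.0, 0.0, 2.0], [1.0, 0.0, 0.0, 3.0]])
ensemble = [rw_trajectory(30.0, 31, pi_hat, 1.0, 3.0, rng) for _ in range(100)]
print(pi_hat, len(ensemble), len(ensemble[0]))  # 0.5 100 32
```

Each element of `ensemble` is one simulated path $P_0, \dots, P_T$, so an ensemble of 100 paths gives 100 scenarios for every one of the 31 time points.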
The last and simplest of the random walk models assumes that the price differences follow a Gaussian distribution $N(0, (\sigma_t^{d,s})^2)$ and is denoted by RW.N. In terms of the $G_t^{d,s}$ distribution, we simply modify the RW.t model by taking $\nu_t^{d,s} \to \infty$.

Later, we consider the random walk models from the simplest RW.N to the most complex RW.t.mix.D. This allows us to observe the gain from introducing a more complex structure of the distribution. Let us note that model RW.N assumes exponentially decaying tails of the price differences $\Delta P_t^{d,s}$. Comparing it to model RW.t, we measure the gain from assuming heavier, polynomially decaying tails. Based on the number of outlier observations in the German intraday market and on Figure 4, we expect it to perform better than the Gaussian random walk. Then, considering RW.t.mix.D helps us to understand the gain from introducing the mixture.

4.3 Advanced benchmark models

As mentioned, we are unable to use any literature-based benchmarks as this is the first paper on ensemble forecasting in intraday electricity markets. However, it is possible to implement scenario generating methods that are utilized in other research areas. Thus, as advanced benchmarks we utilize a smoothed linear quantile regression model with two copulas, Gaussian and independence, and we denote them by LQR.Gauss and LQR.ind, respectively. A very similar approach was applied recently for the purpose of generating density forecasts of significant sea wave height and peak wave period [55].

First, we build the linear quantile regression (LQR) model using the same set of regressors as for the mixture models.
The formula is as follows

$$
\begin{aligned}
Q^{\tau}\!\left(\Delta P_t^{d,s}\right) = \beta_0 &+ \underbrace{\sum_{j=1}^{3} \beta_j \Delta P_{1-j}^{d,s} + \sum_{j=1}^{6} \beta_{3+j} \left|\Delta P_{1-j}^{d,s}\right| + \beta_{10} \sum_{j=7}^{12} \left|\Delta P_{1-j}^{d,s}\right| + \beta_{11} P_0^{d,s}}_{\text{price components}} \\
&+ \underbrace{\beta_{12}\,\text{Mon}(d) + \beta_{13}\,\text{Sat}(d) + \beta_{14}\,\text{Sun}(d)}_{\text{weekday dummies}} + \underbrace{\beta_{15}\,\alpha_{0}^{d,s} + \beta_{16}\,\alpha_{-1}^{d,s}}_{\text{lagged } \alpha^{d,s}} \\
&+ \underbrace{\beta_{17}\,\text{DA}_{\text{Load}}^{d,s} + \beta_{18}\,\text{DA}_{\text{Sol}}^{d,s} + \beta_{19}\,\text{DA}_{\text{WiOn}}^{d,s} + \beta_{20}\,\text{DA}_{\text{WiOff}}^{d,s}}_{\text{fundamental regressors}}
\end{aligned} \tag{13}
$$

for $\tau \in \{0.01, 0.02, \dots, 0.99\}$ and $t = 1, 2, \dots, T$. That is to say, we build separate models for each quantile $\tau$ and each time point $t$. Let us note that due to the design of the model, we can use only the regressor values available at the time of forecasting $t = 0$ (i.e. 3 h 5 min before the delivery). As a result, here we model all $T$ time points using the same data, in contrast to the mixture models, where we can use autoregressive variables due to the recursive character of the models. We estimate the LQR models using the quantreg package in R [56].

In the next step, a spline interpolation is applied over all fitted $Q^{\tau}(\Delta P_t^{d,s})$ for $\tau \in \{0.01, 0.02, \dots, 0.99\}$ and for every $t = 1, 2, \dots, T$. In order to preserve the monotonicity of the estimated cumulative distribution function (CDF), we compute a monotonic cubic spline using Hyman filtering [57]. This way we obtain a smooth and monotonic $T$-dimensional CDF function

$$\Psi^{d,s}(x) = \left(\Psi_1^{d,s}(x_1), \Psi_2^{d,s}(x_2), \dots, \Psi_T^{d,s}(x_T)\right) \tag{14}$$

where $\Psi_t^{d,s}\!\left(\min_{\{d-1, \dots, d-D\}} \Delta P_t^{d,s}\right) = 0$ and $\Psi_t^{d,s}\!\left(\max_{\{d-1, \dots, d-D\}} \Delta P_t^{d,s}\right) = 1$. To assess the dependency structure of the price differences $\Delta P_t^{d,s}$ over $t$ we use two copulas: Gaussian and independence. The Gaussian copula for a given correlation matrix $R$ can be written as

$$C_R(u) = \Phi_R\!\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \dots, \Phi^{-1}(u_T)\right) \tag{15}$$

where $\Phi^{-1}$ is the inverse CDF of a standard normal distribution and $\Phi_R$ is the joint CDF of a multivariate normal distribution with mean vector zero and covariance matrix equal to the correlation matrix $R$. We estimate the correlation matrix $R$ simply by calculating the in-sample correlation matrix.
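The scenario generation step behind equations (14) and (15) can be sketched as follows. This is only an illustration under stated assumptions: the fitted quantiles are replaced by toy values, the Hyman-filtered monotone cubic spline is swapped for plain linear interpolation with `np.interp` (which is also monotone but not smooth), draws beyond the outermost quantile levels are clamped instead of being extended to the in-sample minimum and maximum, and all function and variable names are hypothetical.

```python
import math
import numpy as np


def std_normal_cdf(z):
    """Standard normal CDF applied elementwise (math.erf, no SciPy needed)."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))


def sample_with_gaussian_copula(quantiles, taus, corr, n_sim, rng):
    """Draw trajectories whose margins follow the given quantile curves.

    quantiles: (T, K) fitted quantiles per time point at levels taus (K,)
    corr:      (T, T) correlation matrix R of the Gaussian copula
    """
    n_steps = quantiles.shape[0]
    chol = np.linalg.cholesky(corr)
    z = rng.standard_normal((n_sim, n_steps)) @ chol.T  # rows ~ N(0, R)
    u = std_normal_cdf(z)                               # uniforms with Gaussian dependence
    out = np.empty((n_sim, n_steps))
    for t in range(n_steps):
        # Running maximum guards against crossing quantile estimates.
        q = np.maximum.accumulate(quantiles[t])
        # Inverse CDF by linear interpolation (stand-in for the monotone spline).
        out[:, t] = np.interp(u[:, t], taus, q)
    return out


rng = np.random.default_rng(1)
taus = np.arange(1, 100) / 100.0
n_steps = 5
# Toy "fitted" quantiles: empirical quantiles of t-distributed samples.
quantiles = np.vstack(
    [np.quantile(rng.standard_t(4, size=5000), taus) for _ in range(n_steps)]
)
lags = np.abs(np.subtract.outer(np.arange(n_steps), np.arange(n_steps)))
corr = 0.9 ** lags                                      # toy correlation matrix

sims_gauss = sample_with_gaussian_copula(quantiles, taus, corr, 2000, rng)
sims_ind = sample_with_gaussian_copula(quantiles, taus, np.eye(n_steps), 2000, rng)
```

Passing the identity matrix as `corr` yields the independence copula, so the two calls differ only in the dependence structure across $t$.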
5 Forecasting study and evaluation

We use a rolling window forecasting study approach with an in-sample size of $D = 365$ days and an out-of-sample size of $N = 1173$ days. The in-sample data consists of $DT$ data points, where $T = 31$ in this study. We model each of the $S = 24$ hourly products separately, and our forecasting time is 185 minutes before the delivery of product $s$ on day $D + 1$. That is to say, we can utilize all the information from the in-sample data and from day $D + 1$ until 185 minutes before the delivery. At this time we forecast $M = 1000$ times the first price difference $\Delta P_1^{d,s} = {}_{180}\mathrm{ID}_5^{d,s} - {}_{185}\mathrm{ID}_5^{d,s}$. Based on these forecasts and explanatory data we simulate $M$ second price differences $\Delta P_2^{d,s}$, and we continue this recursive process until we reach the gate closure. Figure 10 provides an outline of the exercise. This gives us $M$ simulated trajectories, each consisting of $T = 31$ points. After that, we move the window forward by one day and repeat the exercise until the end of the out-of-sample data. However, in the case of the benchmark models we do not use the recursive algorithm as there is no recursion in their formulas.

Before we discuss the evaluation design in detail, we recall that we are mainly interested in the evaluation of the forecasted $T$-dimensional distribution of the price vector $P^{d,s} = (P_1^{d,s}, \dots, P_T^{d,s})$, which is represented by the predicted ensemble. Indeed, the multivariate cumulative distribution function of the ensemble coincides with the underlying cumulative distribution function if the ensemble sample size $M$ goes to infinity. Thus, for sufficiently large $M$ the evaluation of the scenario set can be regarded as the evaluation of probabilistic distributions. From the theoretical point of view, strictly proper multivariate scoring rules are the first choice for evaluation, as they are able to identify the optimal forecast, i.e. the true distribution, see Gneiting and Raftery [58], Pinson and Girard [59].
However, we note that even if forecast A performs significantly better than forecast B with respect to a strictly proper scoring rule, there is no guarantee that A also performs better than B in stochastic optimization problems (e.g. trading or storage optimization) where the forecasts are used as input. The optimal forecast would always yield optimal solutions in the stochastic optimization application. Thus, if A is close to the optimal forecast with respect to a strictly proper multivariate scoring rule, the aforementioned risk that B outperforms A in the application is very limited if the stochastic optimization problem is continuous in the stochastic argument. Unfortunately, this only holds for strictly proper multivariate scoring rules. For proper scoring rules which identify only some characteristics of the full predictive distribution, this is certainly not true. The range of available strictly proper scoring rules is very limited, and reduces basically to the energy score for our ensemble forecasting problem, see e.g. Lerch et al. [60]. Therefore, we also consider proper scoring rules which might allow further insights as they focus on specific characteristics of the full distribution. To draw statistically significant conclusions on the outperformance of the forecasts of the considered models, we also utilize the Diebold and Mariano [61] test.

[Figure 10: An outline of the ensemble forecasting exercise. The timeline runs from the start of trading (15:00 on day $d-1$) through the time of forecasting (3 h 5 min before delivery) and the consecutive 5-minute windows ${}_{185}\mathrm{ID}_5^{d,s}, {}_{180}\mathrm{ID}_5^{d,s}, \dots, {}_{30}\mathrm{ID}_5^{d,s}$ to the delivery.]

As mentioned, the only available multivariate strictly proper scoring rule is the energy score (ES) [58].
We compute the ES loss function in the following way

$$\mathrm{ES}^{d,s} = \mathrm{ED}^{d,s} - \mathrm{EI}^{d,s} \tag{16}$$

where

$$\mathrm{ED}^{d,s} = \frac{1}{M} \sum_{j=1}^{M} \left\| P^{d,s} - \widehat{P}_j^{d,s} \right\|_2 \tag{17}$$

and

$$\mathrm{EI}^{d,s} = \frac{1}{M(M-1)} \sum_{j=1}^{M} \sum_{i=j+1}^{M} \left\| \widehat{P}_j^{d,s} - \widehat{P}_i^{d,s} \right\|_2 \tag{18}$$

with $P^{d,s} = (P_1^{d,s}, P_2^{d,s}, \dots, P_T^{d,s})$ and $\widehat{P}_j^{d,s} = (\widehat{P}_{1,j}^{d,s}, \widehat{P}_{2,j}^{d,s}, \dots, \widehat{P}_{T,j}^{d,s})$. The $\mathrm{ED}^{d,s}$ component measures the distance between the simulated trajectories and the observed prices. On the other hand, $\mathrm{EI}^{d,s}$ measures the spread between the simulations. To calculate the overall energy score we use an average

$$\mathrm{ES} = \frac{1}{SN} \sum_{s=1}^{S} \sum_{d=1}^{N} \mathrm{ES}^{d,s}. \tag{19}$$

We mentioned that the ES evaluates the full predictive distribution, which includes the path dependency in the generated scenarios. To illustrate that the energy score is appropriate for evaluating an ensemble forecasting study, we perform a short experiment in the results section. We take the best performing model and modify it with 3 different copulas, which we refer to as maximum dependency, minimum dependency and independence. For the maximum dependency copula we consider the $T$-dimensional co-monotonicity copula defined by $M_{\max}(u) = \min(u_1, u_2, \dots, u_T)$. The minimum dependency copula $M_{\min}$ is constructed using pairwise bivariate counter-monotonicity copulas defined by $W(u_1, u_2) = \max(u_1 + u_2 - 1, 0)$. So $M_{\min}$ is the copula that is associated with the $T$-dimensional uniform random variable $U = (U_1, \dots, U_T)$ that satisfies $(U_t, U_{t+1}) \sim W$ for all $t = 1, \dots, T-1$. The $T$-dimensional independence copula is defined by $M_{\text{ind}}(u) = \prod_{t=1}^{T} u_t$. The 3 new models are evaluated using all considered measures and compared to the original one.

[Footnote: The multivariate log-score is also a known strictly proper scoring rule for multivariate distributions. However, it requires that the underlying multivariate distribution is continuous and has a density. Due to the non-trade events this is not satisfied for our forecasting problem.]
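The energy score of equations (16) to (18) for a single day and product can be sketched as below. The function name and the toy data are hypothetical; the pairwise term is accumulated in a loop to avoid building an $M \times M \times T$ array. The example also reshuffles every time point independently, which mimics the independence copula: the marginal distributions stay identical while the path dependence is destroyed.

```python
import numpy as np


def energy_score(obs, ens):
    """Energy score ES = ED - EI for one observed path and one ensemble.

    obs: (T,) observed price trajectory
    ens: (M, T) simulated trajectories
    """
    n_members = ens.shape[0]
    # ED: mean Euclidean distance between the observation and each member.
    ed = np.linalg.norm(ens - obs, axis=1).mean()
    # EI: sum of distances over all pairs j < i, scaled by 1 / (M (M - 1)).
    ei = 0.0
    for j in range(n_members - 1):
        ei += np.linalg.norm(ens[j + 1:] - ens[j], axis=1).sum()
    ei /= n_members * (n_members - 1)
    return ed - ei


rng = np.random.default_rng(2)
paths = np.cumsum(rng.normal(size=(200, 31)), axis=1)  # toy random-walk trajectories
obs, ens = paths[0], paths[1:]

# Shuffle each time point separately: same margins, no path dependence.
shuffled = np.column_stack(
    [rng.permutation(ens[:, t]) for t in range(ens.shape[1])]
)

es_original = energy_score(obs, ens)
es_shuffled = energy_score(obs, shuffled)
```

Comparing `es_original` with `es_shuffled` shows how the score reacts to the dependence structure even though any purely marginal measure would be the same for both ensembles.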
As pointed out by Pinson and Girard [59], the ES does not evaluate the ability of the trajectories to mimic specific characteristics of the stochastic process. Therefore, we also focus our evaluation on specific characteristics of the underlying multivariate distribution. For this purpose, we additionally consider the following proper scoring rules. We utilize the mean absolute error (MAE), the root mean squared error (RMSE) and the pinball score (PB) to evaluate the median, mean and selected quantile trajectories, respectively, see e.g. [62]. For evaluation of the marginal density fit of our scenarios, we consider the continuous ranked probability score (CRPS) and additionally the empirical coverage of specific prediction intervals [58]. Moreover, we consider the variogram score (VS) and the Dawid-Sebastiani score (DSS), which are regularly used to evaluate multivariate distributions [60]. Note that both measures are only proper scoring rules and correct model identification fails in general, see e.g. [63] for empirical examples. Other scoring rules evaluating e.g. marginal distribution characteristics or specific events might also be added if relevant for the desired application.

The RMSE is the optimal least squares measure, i.e. it is the strictly proper scoring rule for mean evaluation, while the MAE is strictly proper for median evaluation. They are widely used both by researchers and practitioners. Their formulas are given by

$$\mathrm{RMSE} = \sqrt{\frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \left( P_t^{d,s} - \frac{1}{M} \sum_{j=1}^{M} \widehat{P}_{t,j}^{d,s} \right)^{2}} \tag{20}$$

and

$$\mathrm{MAE} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \left| P_t^{d,s} - \operatorname{med}_{j=1,\dots,M}\!\left(\widehat{P}_{t,j}^{d,s}\right) \right| \tag{21}$$

where $\widehat{P}_{t,j}^{d,s}$ is the $j$-th simulation of $P_t^{d,s}$ and $\operatorname{med}_{j=1,\dots,M}(\widehat{P}_{t,j}^{d,s})$ is the median of the $M$ simulated prices $\widehat{P}_{t,j}^{d,s}$.

We approximate the CRPS using the pinball loss

$$\mathrm{CRPS}_t^{d,s} = \frac{1}{R} \sum_{\tau \in r} \mathrm{PB}_{t,\tau}^{d,s} \tag{22}$$

for a dense equidistant grid of probabilities $r$ between 0 and 1 of size $R$, see e.g. [7]. In this study, we consider $r = \{0.01, 0.02, \dots, 0.99\}$ of size $R = 99$.
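$\mathrm{PB}_{t,\tau}^{d,s}$ is the pinball loss with respect to probability $\tau$. Its formula is given by

$$\mathrm{PB}_{t,\tau}^{d,s} = \left( \tau - \mathbb{1}_{\left\{ P_t^{d,s} < Q^{\tau}_{j=1,\dots,M}\left(\widehat{P}_{t,j}^{d,s}\right) \right\}} \right) \left( P_t^{d,s} - Q^{\tau}_{j=1,\dots,M}\!\left(\widehat{P}_{t,j}^{d,s}\right) \right) \tag{23}$$

where $Q^{\tau}_{j=1,\dots,M}(\widehat{P}_{t,j}^{d,s})$ is the $\tau$-th quantile of the $M$ simulated prices $\widehat{P}_{t,j}^{d,s}$. To calculate the overall CRPS value we use a simple average

$$\mathrm{CRPS} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \mathrm{CRPS}_t^{d,s}. \tag{24}$$

We can also use the pinball loss to compare the models' performance in particular quantiles. For this purpose the following formula is used

$$\mathrm{PB}_{\tau} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \mathrm{PB}_{t,\tau}^{d,s}. \tag{25}$$

As mentioned, we also use the empirical coverage of prediction intervals, precisely the central prediction intervals bounded by the $(1-\kappa)/2$ and $(1+\kappa)/2$ quantiles. The $\kappa$%-coverage is calculated using the following formula

$$\kappa\%\text{-cov} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \mathbb{1}_{\left\{ Q^{(1-\kappa)/2}_{j=1,\dots,M}\left(\widehat{P}_{t,j}^{d,s}\right) < P_t^{d,s} < Q^{(1+\kappa)/2}_{j=1,\dots,M}\left(\widehat{P}_{t,j}^{d,s}\right) \right\}} \tag{26}$$

where $\kappa \in \{0.5, 0.9, 0.99\}$.

The variogram score was introduced by Scheuerer and Hamill [64] in a probabilistic forecasting exercise for meteorological data. We compute it by

$$\mathrm{VS} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{i=1}^{T} \sum_{j=1}^{T} \left( \left| P_i^{d,s} - P_j^{d,s} \right|^{p} - \frac{1}{M} \sum_{k=1}^{M} \left| \widehat{P}_{i,k}^{d,s} - \widehat{P}_{j,k}^{d,s} \right|^{p} \right)^{2} \tag{27}$$

where $p$ denotes the order of the variogram score [64].

The Dawid-Sebastiani score evaluates the first and second moments [62]. It corresponds to the log-score of the multivariate normal distribution. We calculate it by

$$\mathrm{DSS} = \frac{1}{SN} \sum_{s=1}^{S} \sum_{d=1}^{N} \left[ \log \det \widehat{\Sigma}^{d,s} + \left( P^{d,s} - \widehat{\mu}^{d,s} \right)' \left( \widehat{\Sigma}^{d,s} \right)^{-1} \left( P^{d,s} - \widehat{\mu}^{d,s} \right) \right] \tag{28}$$

where $\widehat{\mu}^{d,s}$ and $\widehat{\Sigma}^{d,s}$ are the sample mean vector and the sample covariance matrix of the predicted price ensemble.

However, the aforementioned measures do not allow us to draw conclusions regarding statistical significance. To do so we utilize the Diebold and Mariano [61] test, which tests the forecasts of model A against those of model B. The DM test is mostly used to evaluate point forecasts, but with a correctly defined loss differential series it can be successfully applied in the evaluation of probabilistic forecasts. We derive the series using ES and CRPS, an approach already applied by e.g. Muniain and Ziel [8] and Lerch et al. [60].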
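The marginal and multivariate measures above can be sketched for a single day and product as follows. The function names are hypothetical, the empirical quantiles rely on the default linear interpolation of `np.quantile` (the quantile estimator of the study is not specified here), the variogram order `p = 0.5` is an assumed default, and the averaging over $s$, $d$ and $t$ is left to the caller.

```python
import numpy as np

TAUS = np.arange(1, 100) / 100.0  # grid r = {0.01, ..., 0.99}, R = 99


def pinball(obs, ens, taus=TAUS):
    """Pinball loss per quantile level and time point, shape (R, T)."""
    q = np.quantile(ens, taus, axis=0)             # empirical ensemble quantiles
    below = (obs[None, :] < q).astype(float)       # indicator 1{P < Q}
    return (taus[:, None] - below) * (obs[None, :] - q)


def crps(obs, ens):
    """CRPS approximated by the average pinball loss over the grid, shape (T,)."""
    return pinball(obs, ens).mean(axis=0)


def coverage(obs, ens, level):
    """Share of time points falling inside the central prediction interval."""
    lo = np.quantile(ens, (1.0 - level) / 2.0, axis=0)
    hi = np.quantile(ens, (1.0 + level) / 2.0, axis=0)
    return np.mean((lo < obs) & (obs < hi))


def variogram_score(obs, ens, p=0.5):
    """Unnormalised variogram score of order p (double sum over i, j)."""
    d_obs = np.abs(obs[:, None] - obs[None, :]) ** p
    d_ens = (np.abs(ens[:, :, None] - ens[:, None, :]) ** p).mean(axis=0)
    return ((d_obs - d_ens) ** 2).sum()


def dawid_sebastiani(obs, ens):
    """log det(Sigma) + (P - mu)' Sigma^{-1} (P - mu) with ensemble moments."""
    mu = ens.mean(axis=0)
    cov = np.cov(ens, rowvar=False)
    _, logdet = np.linalg.slogdet(cov)
    diff = obs - mu
    return logdet + diff @ np.linalg.solve(cov, diff)


rng = np.random.default_rng(3)
paths = np.cumsum(rng.normal(size=(501, 6)), axis=1)  # toy trajectories, T = 6
obs, ens = paths[0], paths[1:]

scores = {
    "CRPS": crps(obs, ens).mean(),
    "50%-cov": coverage(obs, ens, 0.5),
    "VS": variogram_score(obs, ens),
    "DSS": dawid_sebastiani(obs, ens),
}
```

The Dawid-Sebastiani sketch requires a non-singular sample covariance matrix, which in practice may call for $M$ well above $T$.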
We utilize the multivariate version of the DM test as in Ziel and Weron [51]. The multivariate DM test results in one statistic for each model, which is computed based on the $S$-dimensional vector of losses per day. Therefore, denote by $L_A^{d} = (L_A^{d,1}, L_A^{d,2}, \dots, L_A^{d,S})'$ and $L_B^{d} = (L_B^{d,1}, L_B^{d,2}, \dots, L_B^{d,S})'$ the vectors of out-of-sample losses for day $d$ of models A and B, respectively. By $L_Z^{d,s}$ we mean the ES and CRPS losses of model Z; formally, we choose

$$L_Z^{d,s} = \mathrm{ES}_Z^{d,s} \quad \text{and} \quad L_Z^{d,s} = \mathrm{CRPS}_Z^{d,s} = \frac{1}{T} \sum_{t=1}^{T} \mathrm{CRPS}_{t,Z}^{d,s}. \tag{29}$$

The multivariate loss differential series

$$\Delta_{A,B}^{d} = \left\| L_A^{d} \right\|_1 - \left\| L_B^{d} \right\|_1 \tag{30}$$

defines the difference of losses in the $\| \cdot \|_1$ norm. For each model pair, we compute the p-values of two one-sided DM tests. The first one has the null hypothesis $H_0: \mathbb{E}(\Delta_{A,B}^{d}) \leq 0$, that is to say the outperformance of the forecasts of model B by those of model A. The second test has the reverse null hypothesis $H_0: \mathbb{E}(\Delta_{A,B}^{d}) \geq 0$. Let us note that these tests are complementary, and we assume that the loss differential series is covariance stationary.

6 Results

We divide this section into two subsections: in the first one, we inspect the in-sample characteristics, and in the second one, we present the out-of-sample simulation results.
| $g_1(\mu_t^{d,s})$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\Delta P_{t-1}^{d,s}$ | -0.09 | -0.09 | -0.11 | -0.09 | -0.11 | -0.08 | -0.06 | -0.02 | 0.02 | 0.07 | 0.06 | 0.07 | 0.04 | 0.03 | -0.01 | 0 | -0.01 | 0.02 | 0.01 | 0.03 | -0.02 | -0.03 | -0.03 | -0.07 |
| $\Delta P_{t-2}^{d,s}$ | -0.05 | -0.06 | -0.04 | -0.07 | -0.05 | -0.05 | -0.04 | -0.04 | -0.02 | 0 | 0.01 | 0.02 | 0.01 | 0 | 0 | 0 | -0.03 | -0.01 | 0 | -0.03 | -0.01 | -0.01 | -0.02 | -0.05 |
| $\Delta P_{t-3}^{d,s}$ | -0.02 | -0.03 | -0.03 | -0.03 | -0.03 | -0.04 | -0.01 | -0.01 | 0 | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0 | 0 | 0 | 0 | -0.01 | -0.01 | 0 | 0.01 | -0.02 |

| $g_2(\sigma_t^{d,s})$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Intercept | 0.49 | 0.52 | 0.5 | 0.42 | 0.47 | 0.56 | 0.83 | 0.66 | 0.55 | 0.29 | 0.45 | 0.48 | 0.36 | 0.31 | -1.78 | -1.75 | -1.65 | 0.39 | 0.62 | 0.36 | 0.36 | 0.59 | 0.61 | 0.51 |
| $\lvert\Delta P_{t-1}^{d,s}\rvert$ | 0.28 | 0.27 | 0.28 | 0.26 | 0.26 | 0.23 | 0.27 | 0.22 | 0.21 | 0.18 | 0.2 | 0.18 | 0.21 | 0.27 | 1.8 | 2.39 | 2.74 | 0.24 | 0.24 | 0.23 | 0.31 | 0.27 | 0.29 | 0.27 |
| $\lvert\Delta P_{t-2}^{d,s}\rvert$ | 0.09 | 0.13 | 0.12 | 0.15 | 0.16 | 0.16 | 0.14 | 0.08 | 0.1 | 0.07 | 0.1 | 0.1 | 0.12 | 0.14 | -0.06 | 0 | 1.27 | 0.11 | 0.14 | 0.13 | 0.14 | 0.12 | 0.14 | 0.13 |
| $\lvert\Delta P_{t-3}^{d,s}\rvert$ | 0.07 | 0.08 | 0.05 | 0.08 | 0.06 | 0.08 | 0.06 | 0.08 | 0.08 | 0.05 | 0.06 | 0.05 | 0.05 | 0.09 | 1.1 | -0.15 | 0.2 | 0.08 | 0.05 | 0.04 | 0.06 | 0.08 | 0.03 | 0.07 |
| $\lvert\Delta P_{t-4}^{d,s}\rvert$ | 0.03 | 0.02 | 0.09 | 0.03 | 0.07 | 0.06 | 0.03 | 0.04 | 0.03 | 0.04 | 0.05 | 0.04 | 0.05 | 0.07 | 0.02 | 0 | -0.41 | 0.05 | 0.05 | 0.03 | 0.03 | 0.06 | 0.05 | 0.04 |
| $\lvert\Delta P_{t-5}^{d,s}\rvert$ | 0.04 | 0.02 | 0.02 | 0.07 | 0.03 | 0.05 | 0.02 | 0.04 | 0.04 | 0.02 | 0.03 | 0.04 | 0.03 | 0.05 | -0.14 | 0 | 0.28 | 0.04 | 0.04 | 0.03 | 0.05 | 0.05 | 0.03 | 0.06 |
| $\lvert\Delta P_{t-6}^{d,s}\rvert$ | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.05 | 0.03 | 0.03 | 0.03 | 0.06 | 0.06 | 0.03 | 0.05 | 0.02 | 0.01 | 0.02 | 0.04 | 0.02 | 0.04 | 0.04 | 0.05 | 0.03 | 0.05 |
| $\sum_{j=7}^{12}\lvert\Delta P_{t-j}^{d,s}\rvert$ | 0.06 | 0.08 | 0.07 | 0.09 | 0.14 | 0.09 | 0.09 | 0.1 | 0.05 | 0.05 | 0.06 | 0.05 | 0.1 | 0.08 | -0.01 | 0 | -0.01 | 0.08 | 0.08 | 0.1 | 0.11 | 0.15 | 0.1 | 0.13 |
| $\text{DA}_{\text{Load}}^{d,s}$ | -0.02 | -0.03 | -0.02 | -0.03 | 0 | 0.02 | 0.05 | -0.01 | -0.02 | -0.02 | -0.01 | 0.02 | 0.03 | -0.01 | -0.03 | -0.04 | -0.02 | -0.03 | -0.04 | 0 | -0.01 | 0 | 0.01 | 0.03 |
| $\text{DA}_{\text{Sol}}^{d,s}$ | 0 | 0 | 0 | 0 | -0.02 | -0.04 | -0.02 | 0.01 | 0.02 | 0.01 | 0.02 | 0.04 | 0.03 | 0.02 | 0.03 | 0.03 | 0.02 | -0.01 | -0.04 | -0.03 | -0.04 | -0.03 | -0.03 | 0 |
| $\text{DA}_{\text{WiOn}}^{d,s}$ | 0.13 | 0.15 | 0.11 | 0.11 | 0.1 | 0.08 | 0.1 | 0.09 | 0.09 | 0.1 | 0.08 | 0.11 | 0.08 | 0.09 | 0.11 | 0.14 | 0.1 | 0.09 | 0.1 | 0.12 | 0.11 | 0.1 | 0.11 | 0.11 |
| $\text{DA}_{\text{WiOff}}^{d,s}$ | 0.02 | 0.03 | 0.07 | 0.04 | 0.02 | 0.02 | -0.01 | 0 | 0.01 | 0.03 | 0.03 | 0.03 | 0.03 | 0.01 | 0.02 | 0.02 | 0.03 | -0.01 | 0 | -0.01 | 0 | 0 | 0.01 | 0 |
| Mon(d) | 0.01 | 0.05 | 0.05 | 0.1 | 0.12 | 0.21 | 0.15 | 0.1 | 0.02 | 0.06 | 0.02 | 0.03 | 0.01 | -0.01 | 0 | -0.03 | -0.08 | -0.1 | -0.05 | -0.03 | -0.06 | -0.05 | 0.01 | 0 |
| Sat(d) | -0.06 | 0.06 | 0.05 | 0.02 | 0.06 | 0.06 | 0.01 | 0.03 | 0.04 | 0.06 | 0.15 | 0.24 | 0.26 | 0.13 | 0.16 | 0.23 | 0.19 | 0.15 | 0.12 | 0.16 | 0.12 | 0.11 | 0.1 | 0.04 |
| Sun(d) | 0.11 | 0.07 | 0.18 | 0.09 | 0.08 | 0.06 | -0.01 | 0.01 | 0.02 | 0.15 | 0.16 | 0.24 | 0.3 | 0.2 | 0.2 | 0.27 | 0.34 | 0.21 | 0.21 | 0.24 | 0.2 | 0.12 | 0.13 | 0.04 |
| $\alpha_{t-1}^{d,s}$ | -0.45 | -0.44 | -0.45 | -0.39 | -0.36 | -0.39 | -0.5 | -0.4 | -0.42 | -0.38 | -0.44 | -0.35 | -0.34 | -0.37 | -0.38 | -0.34 | -0.38 | -0.4 | -0.36 | -0.32 | -0.4 | -0.4 | -0.5 | -0.39 |
| $\alpha_{t-2}^{d,s}$ | -0.18 | -0.19 | -0.18 | -0.16 | -0.17 | -0.19 | -0.22 | -0.15 | -0.15 | -0.18 | -0.15 | -0.13 | -0.15 | -0.08 | -0.04 | -0.05 | -0.17 | -0.12 | -0.2 | -0.12 | -0.13 | -0.18 | -0.16 | -0.18 |

| $g_3(\nu_t^{d,s})$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Intercept | 0.53 | 0.74 | 0.69 | 0.86 | 0.84 | 0.52 | 0.56 | 0.7 | 1.11 | 1.25 | 1.52 | 1.37 | 1.41 | 1.37 | 1.07 | 0.87 | 0.93 | 1.06 | 1.24 | 1.18 | 1.02 | 0.74 | 0.76 | 0.57 |

Table 1: Initial in-sample coefficient values of model Mix.t.mu.sigma reported for every hourly product. The price and generation variables were scaled for better clarity. The p-value of the test for significance of the values is indicated by the colour (significance level scale from 0% to 10%). The legend is explained by the table in the bottom.

6.1 In-sample characteristics

We start our study with an analysis of the initial in-sample characteristics. Table 1 shows the estimated coefficient values of model Mix.t.mu.sigma based on the initial in-sample data. The table reports the values for every hourly product, and it is split into 3 sub-tables, each regarding a different parameter of the t-distribution. The first sub-table presents coefficients of the model described by equation (8). Variable $\Delta P_{t-1}^{d,s}$ appears to be statistically significant for most of the hours. However, raising the lag decreases the significance. This behaviour goes in the direction of weak-form efficiency concluded by Narajewski and Ziel [36].
The second sub-table shows coefficients of the model presented in equation (9). Here, we see that all the variables using lagged absolute price differences are mostly significantly different from 0. Moreover, the coefficients of $|\Delta P_{t-1}^{d,s}|$ and $|\Delta P_{t-2}^{d,s}|$ are relatively high. Surprisingly, the day-ahead forecast of total load is mostly irrelevant. The day-ahead forecast of solar generation is significant mainly during the day-peak and in the evening. The day-ahead forecast of wind onshore generation appears to have a big positive impact on the volatility of price differences, in contrast to the wind offshore forecast. The behaviour of the weekday dummies sheds some light on our mixed expectations: they indicate a different behaviour of traders on Monday at night and on Saturday and Sunday during the day. On weekends the volatility is higher, likely due to higher bid-ask spreads. The lagged values of $\alpha_t^{d,s}$ have a significant, negative impact on the volatility of price differences. It means that if there was trading at times $t-1$ and $t-2$, then the standard deviation would be lower. Let us note that the values of the coefficients are very similar among all hours except hours 14 to 16. For these hours the estimates of the intercept and absolute price differences deviate heavily from the estimates of the remaining hours. A possible reason for this may be a few extreme outliers which were observed for these hours but not for the others. The table presents no values for the P-splines, because they are non-parametric functions.

The last sub-table shows the estimated values for $g_3(\nu_t^{d,s})$, which we assumed to be constant. We show it anyway to gain an insight into the magnitude of the degrees of freedom. Let us recall that $g_3^{-1}(x) = \exp(x) + 2$. Applying this to the estimates results in values of $\nu_t^{d,s}$ between around 3.7 and 6.6. Thus, the innovations are not extremely heavy tailed, and it is reasonable to apply asymptotic statistics for validation and interpretation.
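The range of the degrees of freedom quoted above can be checked with a short calculation that applies $\exp(x) + 2$ to the intercepts of the last sub-table of Table 1. The function name is hypothetical; the intercept values are copied from the table.

```python
import math


def nu_from_estimate(x):
    """Map an estimate on the link scale back to the degrees of freedom."""
    return math.exp(x) + 2.0


# Intercepts of the third sub-table of Table 1, hours 0 to 23.
intercepts = [
    0.53, 0.74, 0.69, 0.86, 0.84, 0.52, 0.56, 0.70, 1.11, 1.25, 1.52, 1.37,
    1.41, 1.37, 1.07, 0.87, 0.93, 1.06, 1.24, 1.18, 1.02, 0.74, 0.76, 0.57,
]
nus = [nu_from_estimate(x) for x in intercepts]

print(round(min(nus), 2), round(max(nus), 2))  # prints 3.68 6.57
```

The smallest value belongs to hour 5 and the largest to hour 10, which matches the range of roughly 3.7 to 6.6.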
Figure 11 shows the initial in-sample P-splines $h_1(P_{t-1}^{d,s})$ and $h_2(t)$. We see that for both variables the smoothing functions are non-linear. Extreme values of the most recent price $P_{t-1}^{d,s}$ result in most cases in a sharp rise of volatility. On the other hand, values between 0 and 50 EUR/MWh have a rather marginal impact on the variance of the price differences. An interesting effect can be seen in Figure 11b. We see that until 60 minutes before the delivery the impact on the volatility is at a similar, negative level among all products. Then, in the last 30 minutes of trading the volatility rises substantially above zero. This behaviour can be misinterpreted as a result of the closure of XBID as in Figure 2. However, this plot is based on the initial in-sample data, i.e. the data between 16th July 2015 and 14th July 2016. Therefore, the effect of XBID could not be in the data as it was introduced on 18th June 2018. Figure 11c is analogous to Figure 11b, but based on the first year of XBID, i.e. the data between 18th June 2018 and 17th June 2019. Comparing the two figures, we conclude that the introduction of XBID has an impact on the volatility of the price differences, decreasing it even further before the XBID closure and raising it even higher just after it. Interestingly, this is contrary to the paper of Kath [5], who concluded that there is no evidence for the influence of XBID on the price volatility.

Figure 12 presents a price trajectory and a decomposition of the fitted $g_2(\sigma_t^{d,s})$ of the product with delivery at 12:00 for the last 7 days of in-sample data. For the sake of readability we grouped the components of the model for the standard deviation similarly as in equation (9). Let us note that the absolute price differences and fundamental regressors have a big, positive impact on the volatility of price differences. We also observe overall higher volatility on the weekend, i.e. the second and third day on the plot, than on weekdays.
Note that in this specific example the impact of the non-linear price effect $h_1$ looks rather negligible. However, the price level in these seven trading sessions is always between 0 and 40 EUR/MWh, where we expect minor impacts.

[Figure 11: Initial in-sample smoothing effects of variables (a) $P_{t-1}^{d,s}$ and (b) time to maturity in the model described in equation (9). (c) is analogous to (b), but as in-sample considering the first year of XBID. Note that the support of $P_{t-1}^{d,s}$ differs among products. Panel (a) plots $h_1(P_{t-1}^{d,s})$ against $P_{t-1}^{d,s}$; panels (b) and (c) plot $h_2(t)$ against time to delivery (minutes); one curve per hourly product from 00:00 to 23:00.]

6.2 Out-of-sample simulation

We now turn to the analysis of the simulated trajectories. Figure 13 shows the first out-of-sample simulation exercise of prices of the product with delivery at 12:00 on 15.07.2016. The trajectories are simulated from the Mix.t.mu.sigma model and can be easily compared to the simulations from the Gaussian random walk presented in Figure 1. It is clear that in this example the trajectories of the mixture model are less volatile than those of the random walk.

Table 2 shows the values of the utilized error measures. The Naive model performs very well overall. Moreover, it gives the best results in terms of 50% coverage. The MV.t model, which assumes the trajectories to follow a multivariate t-distribution, gives results very similar to those of the Naive model. Looking at the performance of MV.N, we see that the t-distribution is indeed better at modelling the trajectories. According to most of the considered measures, the LQR-based models are worse than the Naive or MV.t.
It is worth emphasizing their very poor ability to model the mean and median trajectories and, on the other hand, their quite good 99% coverage and values of VS and DSS. Let us note the very bad performance of the Gaussian random walk. Model RW.N is clearly the worst. Looking at its coverage values, we conclude that its simulations are too volatile. The introduction of the t-distribution to the random walk already yields a big improvement.

[Figure 12: Price trajectory (top) and decomposition of fitted $g_2(\sigma_t^{d,s})$ (bottom) of hourly product with delivery at 12:00 for 7 consecutive in-sample days (08/07 to 14/07/2016). The end of trading and the time of forecasting are indicated by the dashed black and grey lines, respectively. Legend of the decomposition: $h_2(t)$, $|\Delta P_{t-i}^{d,s}|$, No-trade indicator, DA.Load, DA.Solar, DA.Wind, Intercept, $h_1(P_{t-1}^{d,s})$, Weekday dummies.]

| Model | ES | CRPS | VS | DSS | MAE | RMSE | 50%-cov | 90%-cov | 99%-cov |
|---|---|---|---|---|---|---|---|---|---|
| Naive | 16.428 | 1.179 | 10.521 | 77.33 | 3.075 | 5.805 | 0.490 | 0.8919 | 0.9835 |
| MV.N | 17.104 | 1.226 | 12.905 | 76.50 | 3.087 | 5.810 | 0.6712 | 0.9247 | 0.9723 |
| MV.t | 16.454 | 1.181 | 10.524 | 71.66 | 3.081 | 5.809 | 0.5115 | 0.8948 | 0.9853 |
| RW.N | 18.898 | 1.400 | 19.965 | 75.94 | 3.100 | 5.815 | 0.8025 | 0.9668 | 0.9887 |
| RW.t | 16.999 | 1.234 | 11.218 | 76.77 | 3.086 | 5.815 | 0.6992 | 0.9578 | 0.9952 |
| RW.t.mix.D | 17.168 | 1.248 | 11.385 | 77.01 | 3.086 | 5.814 | 0.7198 | 0.9615 | 0.9949 |
| LQR.Gauss | 16.536 | 1.186 | 9.923 | 61.36 | 3.162 | 5.966 | 0.5588 | 0.9204 | 0.9875 |
| LQR.ind | 16.595 | 1.191 | 10.308 | 58.41 | 3.168 | 5.970 | 0.5789 | 0.9283 | 0.9884 |
| Mix.RW.t | 17.092 | 1.239 | 11.377 | 73.97 | 3.085 | 5.815 | 0.6924 | 0.9569 | 0.9942 |
| Mix.t.mu | 17.284 | 1.255 | 11.642 | 74.52 | 3.086 | 5.815 | 0.7117 | 0.9620 | 0.9950 |
| Mix.t.sigma | 15.965 | 1.144 | 9.444 | 54.42 | 3.075 | 5.814 | 0.5331 | 0.9167 | 0.9907 |
| Mix.t.mu.sigma | 15.956 | 1.144 | 9.405 | 53.15 | 3.073 | 5.804 | 0.5482 | 0.9277 | 0.9928 |

Table 2: Error measures of the considered models. Colour indicates the performance columnwise (the greener, the better). With bold, we depicted the best values in each column.
[Figure 13: Price trajectory for the hourly product with delivery on 15.07.2016 at 12:00. The black part is the realization and the colourful part consists of 100 simulations from the Mix.t.mu.sigma. Time of forecasting is indicated by the green dashed line. Vertical axis: ID price of the hourly 12:00 product; horizontal axis: time to delivery (minutes).]

Another step in our modelling, the usage of a simple mixture of the Dirac distribution and the random walk with innovations from the t-distribution, does not improve the results. However, the next step, i.e. modelling of the probability $\pi_t^{d,s}$ with model (5), improves the results, but they are still not better than those of model RW.t. Moreover, modelling of the expected value as in equation (8) also worsens the performance substantially. All these models are clearly worse than the Naive considering almost every measure. The last change to the mixture model, i.e. modelling of the standard deviation according to the formula in equation (9), lowers the errors significantly. Modelling of the expected value in addition to the standard deviation brings a small improvement. Model Mix.t.mu.sigma is marginally better than Mix.t.sigma and it turns out to be the best model in terms of ES, CRPS, VS, DSS, MAE and RMSE. Somewhat concerning are the values of the 50%- and 90%-coverage, which are too high for the mixture models. This means that it is very likely that the results can still be improved. On the other hand, these models capture the behaviour in the tails better than the Naive model.

The values of the error measures in Table 2 may suggest that both ES and CRPS evaluate the same thing, namely the marginal distribution. To emphasize that ES also evaluates the quality of the generated scenarios, we perform a small experiment on the model Mix.t.mu.sigma. In Table 3, we compare the performance of Mix.t.mu.sigma with its copies modified using 3 copulas: maximum dependency, minimum dependency and independent.
This results in the same marginal distributions, mean and median trajectories, and coverage values, but in completely different ensembles. This is reflected in the values of the measures: the CRPS remains unchanged while the ES, VS and DSS change drastically. Let us note the enormous deterioration of the DSS, which is particularly sensitive to changes of the dependency structure.

| Copula | ES | CRPS | VS | DSS | MAE | RMSE | 50%-cov | 90%-cov | 99%-cov |
|---|---|---|---|---|---|---|---|---|---|
| original | 15.956 | 1.144 | 9.405 | 53.15 | 3.073 | 5.804 | 0.5482 | 0.9277 | 0.9928 |
| maximum dependency | 31.625 | 1.144 | 10.274 | 9299.60 | 3.073 | 5.804 | 0.5482 | 0.9277 | 0.9928 |
| minimum dependency | 33.263 | 1.144 | 16.918 | 27205.49 | 3.073 | 5.804 | 0.5482 | 0.9277 | 0.9928 |
| independent | 16.925 | 1.144 | 14.350 | 110.02 | 3.073 | 5.804 | 0.5482 | 0.9277 | 0.9928 |

Table 3: Error measures of 4 Mix.t.mu.sigma models with different copulas.

Figure 14 shows the models' performance over all products in terms of energy score. The case of model RW.N is particularly interesting. Usually it is not that much worse than the other random walks, but for hours 14-16 the error explodes. A look into the data explains the situation clearly: there were a few in-sample observations of extreme price differences. The normal distribution assumes exponentially decaying tails, and thus the model overestimated the variance. This indicates clearly that the t-distribution is better suited for this purpose as it was unaffected by these events. Furthermore, we observe that models with modelled $\sigma_t^{d,s}$ are uniformly better than the others.

[Figure 14: Energy score (left) and its ratio to the Naive (right) over 24 hourly products. The right graph is shown without RW.N for better clarity.]

Figure 15 presents the models' performance over time to delivery. The values rise as time goes on, which is not surprising. An interesting behaviour can be observed from 150 to 100 minutes before the delivery. In this time range the errors of the random walk models and the mixture models that assume constant standard deviation rise significantly in comparison to the other models. It is also the time of decreasing volatility in Figure 11b.

[Figure 15: Continuous ranked probability score (left) and its ratio to the Naive (right) over time to delivery. The right graph is shown without RW.N for better clarity.]

[Figure 16: Pinball score (left) and its ratio to the Naive (right) over quantiles $\tau \in r$. The right graph is shown without RW.N for better clarity.]

Pinball score values over quantiles $\tau$ are depicted in Figure 16. Let us note that the gain from the forecasting of central quantiles is marginal, which is in line with other studies regarding the ID-Price in the German intraday market [36, 39]. On the other hand, models Mix.t.sigma and Mix.t.mu.sigma gain a lot from the forecasting of quantiles outside the centre, performing especially well in the tails. In relation to the naive benchmark, the error is around 30% lower in the lower tail and around 25% lower in the upper tail. Let us also note that the LQR-based models give quite good results in the tails, but lose a lot in the centre, compared to the naive or to the models with non-constant $\sigma_t^{d,s}$.

Figure 17 shows the results of the DM test using two types of losses: the $\mathrm{ES}^{d,s}$ and the $\mathrm{CRPS}^{d,s}$.
Before applying the test, we conducted three tests on the multivariate loss differential series $\Delta_{A,B}$ that evaluate the null hypothesis that a unit root is present in the series against the alternative that the data is stationary or trend-stationary. We used the Dickey-Fuller test [65], the Augmented Dickey-Fuller test [66], and the Phillips-Perron test [67]. In 99% of cases the obtained p-values were smaller than 0.01, so we reject the null hypothesis. This indicates in our case that the loss differential series is covariance stationary. Only for the differences with RW.N did the Dickey-Fuller test report no significance for rejecting the null hypothesis. This may be caused by the poor capturing of the marginal distribution by RW.N.

[Figure 17: Results of the Diebold-Mariano test. (a) presents the p-values for the $\mathrm{ES}^{d,s}$ loss, (b) the values for the $\mathrm{CRPS}^{d,s}$ loss. The figures use a heat map to indicate the range of the p-values (scale from 0% to 10%). The closer they are to zero (→ dark green), the more significant the difference is between forecasts of the X-axis model (better) and forecasts of the Y-axis model (worse).]

The results of the DM test show that the difference between the forecasts of models Mix.t.mu.sigma and Mix.t.sigma is insignificant. Moreover, these models give better forecasts than all the other considered models. It is worth emphasizing the very good performance of the Naive model, which is not surprising given Table 2.

7 Conclusion

We conducted an ensemble forecasting study in the German Intraday Continuous Market which is novel in two ways.
First, this study is the first one that raises the issue of price trajectory simulation and ensemble forecasting in continuous intraday electricity markets. Second, the study uses a mixture of distributions that is fitted to the data. The results are very satisfying and show that it is possible to successfully model the volatility in the German Intraday Continuous Market. The study was carried out using the data from the German market, but the generality of this method and the organization of the European electricity markets allow for an application to other markets, especially the markets that participate in XBID, which covers exchanges like EPEX, Nordpool and OMIE.

Obviously, the proposed method can be developed further. One possible direction is using other external processes like the traded volume or the price of nearby hours as regressors. However, this is a non-trivial task and could easily lead to the accumulation of errors. Another possibility is the use of other probability distributions. The imperfect coverage of the best performing model indicates that there is still some room for improvement. This issue could be addressed with some post-processing method as well.

Acknowledgments

This research article was partially supported by the German Research Foundation (DFG, Germany) and the National Science Center (NCN, Poland) through BEETHOVEN grant no. 2016/23/G/HS4/01005.

References

[1] S. Goodarzi, H. N. Perera, and D. Bunn. The impact of renewable energy forecast errors on imbalance volumes and electricity spot prices. Energy Policy, 134:110827,
[2] C. Kath and F. Ziel. The value of forecasts: Quantifying the economic gains of accurate quarter-hourly electricity price forecasts. Energy Economics, 76:411–423, 2018.
[3] K.
Maciejowska. Assessing the impact of renewable energy sources on the electricity price level and variability – A quantile regression approach. Energy Economics, 85:104532, 2020.
[4] J. Viehmann. State of the German Short-Term Power Market. Zeitschrift für Energiewirtschaft, 41(2):87–103, Jun 2017. ISSN 1866-2765. doi: 10.1007/s12398-017-0196-9. URL https://doi.org/10.1007/s12398-017-0196-9.
[5] C. Kath. Modeling intraday markets under the new advances of the cross-border intraday project (XBID): Evidence from the German intraday market. Energies, 12(22):4339, 2019.
[6] R. Weron. Electricity price forecasting: A review of the state-of-the-art with a look into the future. International Journal of Forecasting, 30(4):1030–1081, 2014.
[7] J. Nowotarski and R. Weron. Recent advances in electricity price forecasting: A review of probabilistic forecasting. Renewable and Sustainable Energy Reviews, 81:1548–1568,
[8] P. Muniain and F. Ziel. Probabilistic forecasting in day-ahead electricity markets: Simulating peak and off-peak prices. International Journal of Forecasting, 2020.
[9] G. Marcjasz, T. Serafin, and R. Weron. Selection of calibration windows for day-ahead electricity price forecasting. Energies, 11(9):2364, 2018.
[10] B. Uniejewski, G. Marcjasz, and R. Weron. On the importance of the long-term seasonal component in day-ahead electricity price forecasting: Part II – Probabilistic forecasting. Energy Economics, 79:171–182, 2019.
[11] T. Serafin, B. Uniejewski, and R. Weron. Averaging predictive distributions across calibration windows for day-ahead electricity price forecasting. Energies, 12(13):2561,
[12] Z. Yang, L. Ce, and L. Lian. Electricity price forecasting by a hybrid model, combining wavelet transform, ARMA and kernel-based extreme learning machine methods. Applied Energy, 190:291–305, 2017.
[13] D. Wang, H. Luo, O. Grunder, Y. Lin, and H. Guo.
Multi-step ahead electricity price forecasting using a hybrid model based on two-layer decomposition technique and BP neural network optimized by firefly algorithm. Applied Energy, 190:390–407, 2017.
[14] J. Zhang, Z. Tan, and Y. Wei. An adaptive hybrid model for short term electricity price forecasting. Applied Energy, 258:114087, 2020.
[15] L. Xiao, W. Shao, M. Yu, J. Ma, and C. Jin. Research and application of a hybrid wavelet neural network model with the improved cuckoo search algorithm for electrical power system forecasting. Applied Energy, 198:203–222, 2017.
[16] P. Bento, J. Pombo, M. Calado, and S. Mariano. A bat optimized neural network and wavelet transform approach for short-term price forecasting. Applied Energy, 210:88–97, 2018.
[17] D. Keles, J. Scelle, F. Paraschiv, and W. Fichtner. Extended forecast methods for day-ahead electricity spot prices applying artificial neural networks. Applied Energy, 162:218–230, 2016.
[18] J. Lago, F. De Ridder, P. Vrancx, and B. De Schutter. Forecasting day-ahead electricity prices in Europe: the importance of considering market integration. Applied Energy, 211:890–903, 2018.
[19] F. Ocker and K.-M. Ehrhart. The "German Paradox" in the balancing power markets. Renewable and Sustainable Energy Reviews, 67:892–898, 2017.
[20] C. Koch and L. Hirth. Short-term electricity trading for system balancing: An empirical analysis of the role of intraday trading in balancing Germany's electricity system. Renewable and Sustainable Energy Reviews, 113:109275, 2019.
[21] F. Karanfil and Y. Li. The role of continuous intraday electricity markets: The integration of large-share wind power generation in Denmark. The Energy Journal, 38(2),
[22] K. Maciejowska, W. Nitka, and T. Weron. Day-ahead vs. Intraday – Forecasting the price spread to maximize economic benefits. Energies, 12(4):631, 2019.
[23] M. Narajewski and F. Ziel. Estimation and Simulation of the Transaction Arrival Process in Intraday Electricity Markets.
Energies, 12(23):4518, 2019.
[24] R. Kiesel and F. Paraschiv. Econometric analysis of 15-minute intraday electricity prices. Energy Economics, 64:77–90, 2017.
[25] N. Graf von Luckner and R. Kiesel. Modeling market order arrivals on the intraday market for electricity deliveries in Germany with the Hawkes process. Available at SSRN, 2020.
[26] T. Rintamäki, A. S. Siddiqui, and A. Salo. Strategic offering of a flexible producer in day-ahead and intraday power markets. European Journal of Operational Research,
[27] R. Aïd, P. Gruet, and H. Pham. An optimal trading problem in intraday electricity markets. Mathematics and Financial Economics, 10(1):49–85, 2016.
[28] X. Ayón, M. A. Moreno, and J. Usaola. Aggregators' optimal bidding strategy in sequential day-ahead and intraday electricity spot markets. Energies, 10(4):450, 2017.
[29] S. Glas, R. Kiesel, S. Kolkmann, M. Kremer, N. G. von Luckner, L. Ostmeier, et al. Intraday renewable electricity trading: advanced modeling and numerical optimal control. Journal of Mathematics in Industry, 10(1):3, 2020.
[30] C. Pape, S. Hagemann, and C. Weber. Are fundamentals enough? Explaining price variations in the German day-ahead and intraday power market. Energy Economics, 54:376–387, 2016.
[31] M. Gürtler and T. Paulsen. The effect of wind and solar power forecasts on day-ahead and intraday electricity prices in Germany. Energy Economics, 75:150–162, 2018.
[32] M. Kremer, R. Kiesel, and F. Paraschiv. An Econometric Model for Intraday Electricity Trading. 2020.
[33] C. Monteiro, I. J. Ramirez-Rosado, L. A. Fernandez-Jimenez, and P. Conde. Short-Term Price Forecasting Models Based on Artificial Neural Networks for Intraday Sessions in the Iberian Electricity Market. Energies, 9(9):721, 2016.
[34] J. R. Andrade, J. Filipe, M. Reis, and R. J. Bessa. Probabilistic Price Forecasting for Day-Ahead and Intraday Markets: Beyond the Statistical Model. Sustainability, 9(11):1990, 2017.
[35] B. Uniejewski, G. Marcjasz, and R.
Weron. Understanding intraday electricity markets: Variable selection and very short-term price forecasting using LASSO. International Journal of Forecasting, 35(4):1533–1547, 2019.
[36] M. Narajewski and F. Ziel. Econometric modelling and forecasting of intraday electricity prices. Journal of Commodity Markets, page 100107, 2019.
[37] G. Marcjasz, B. Uniejewski, and R. Weron. Beating the Naïve – Combining LASSO with Naïve Intraday Electricity Price Forecasts. Energies, 13(7):1667, 2020.
[38] I. Oksuz and U. Ugurlu. Neural network based model comparison for intraday electricity price forecasting. Energies, 12(23):4557, 2019.
[39] T. Janke and F. Steinke. Forecasting the price distribution of continuous intraday electricity trading. Energies, 12(22):4262, 2019.
[40] R. A. Rigby and D. M. Stasinopoulos. Generalized additive models for location, scale and shape. Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(3):507–554, 2005.
[41] T. Hastie and R. Tibshirani. Generalized Additive Models, volume 43. CRC Press,
[42] A. Pierrot and Y. Goude. Short-term electricity load forecasting with generalized additive models. Proceedings of ISAP power, 2011, 2011.
[43] P. Gaillard, Y. Goude, and R. Nedellec. Additive models and robust aggregation for GEFCom2014 probabilistic electric load and electricity price forecasting. International Journal of Forecasting, 32(3):1038–1050, 2016.
[44] F. Serinaldi. Distributional modeling and short-term forecasting of electricity prices by generalized additive models for location, scale and shape. Energy Economics, 33(6):1216–1226, 2011.
[45] A. Gianfreda and D. Bunn. A stochastic latent moment model for electricity price formation. Operations Research, 66(5):1189–1203, 2018.
[46] E. Abramova and D. Bunn. Forecasting the Intra-Day Spread Densities of Electricity Prices. Energies, 13(3):687, 2020.
[47] R. Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society.
Series B (Methodological), 58(1):267–288, 1996. ISSN 00359246. URL http://www.jstor.org/stable/2346178.
[48] D. M. Stasinopoulos, R. A. Rigby, et al. Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, 23(7):1–46, 2007.
[49] A. Misiorek, S. Trueck, and R. Weron. Point and interval forecasting of spot electricity prices: Linear vs. non-linear time series models. Studies in Nonlinear Dynamics & Econometrics, 10(3), 2006.
[50] B. Uniejewski, J. Nowotarski, and R. Weron. Automated variable selection and shrinkage for day-ahead electricity price forecasting. Energies, 9(8):621, 2016.
[51] F. Ziel and R. Weron. Day-ahead electricity price forecasting with high-dimensional structures: Univariate vs. multivariate modeling frameworks. Energy Economics, 70:396–420, 2018.
[52] F. Ziel. Forecasting electricity spot prices using lasso: On capturing the autoregressive intraday structure. IEEE Transactions on Power Systems, 31(6):4977–4987, 2016.
[53] J. Friedman, T. Hastie, and R. Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1, 2010.
[54] P. H. Eilers, B. D. Marx, and M. Durbán. Twenty years of P-splines. SORT: Statistics and Operations Research Transactions, 39(2):149–186, 2015.
[55] C. Gilbert, J. Browell, and D. McMillan. Probabilistic access forecasting for improved offshore operations. International Journal of Forecasting, 2020.
[56] R. Koenker. quantreg: Quantile Regression, 2019. URL https://CRAN.R-project.org/package=quantreg. R package version 5.54.
[57] J. M. Hyman. Accurate monotonicity preserving cubic interpolation. SIAM Journal on Scientific and Statistical Computing, 4(4):645–654, 1983.
[58] T. Gneiting and A. E. Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007.
[59] P. Pinson and R. Girard.
Evaluating the quality of scenarios of short-term wind power generation. Applied Energy, 96:12–20, 2012.
[60] S. Lerch, S. Baran, A. Möller, J. Groß, R. Schefzik, S. Hemri, and M. Graeter. Simulation-based comparison of multivariate ensemble post-processing methods. Nonlinear Processes in Geophysics, 27(2):349–371, 2020.
[61] F. Diebold and R. Mariano. Comparing Predictive Accuracy. Journal of Business & Economic Statistics, 13(3):253–63, 1995.
[62] T. Gneiting. Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494):746–762, 2011.
[63] F. Ziel and K. Berk. Multivariate forecasting evaluation: On sensitive and strictly proper scoring rules. arXiv preprint arXiv:1910.07325, 2019.
[64] M. Scheuerer and T. M. Hamill. Variogram-based proper scoring rules for probabilistic forecasts of multivariate quantities. Monthly Weather Review, 143(4):1321–1334, 2015.
[65] D. A. Dickey and W. A. Fuller. Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74(366a):427–431, 1979.
[66] S. E. Said and D. A. Dickey. Testing for unit roots in autoregressive-moving average models of unknown order. Biometrika, 71(3):599–607, 1984.
[67] P. C. Phillips and P. Perron. Testing for a unit root in time series regression. Biometrika, 75(2):335–346, 1988.

Computing Research Repository, arXiv (Cornell University)

Ensemble Forecasting for Intraday Electricity Prices: Simulating Trajectories

Computing Research Repository, Volume 2021 (2005) – May 4, 2020

ISSN
0306-2619
eISSN
ARCH-3344
DOI
10.1016/j.apenergy.2020.115801

Abstract

Recent studies concerning point electricity price forecasting have shown evidence that the hourly German Intraday Continuous Market is weak-form efficient. Therefore, we take a novel, advanced approach to the problem. A probabilistic forecasting of the hourly intraday electricity prices is performed by simulating trajectories in every trading window to obtain a realistic ensemble that allows for more efficient intraday trading and redispatch. A generalized additive model is fitted to the price differences with the assumption that they follow a zero-inflated distribution, precisely a mixture of the Dirac and the Student's t-distributions. Moreover, the mixing term is estimated using a high-dimensional logistic regression with lasso penalty. We model the expected value and volatility of the series using i.a. autoregressive and no-trade effects or load, wind and solar generation forecasts, and accounting for the non-linearities in e.g. time to maturity. Both the in-sample characteristics and forecasting performance are analysed using a rolling window forecasting study. Multiple versions of the model are compared to several benchmark models and evaluated using probabilistic forecasting measures and significance tests. The study aims to forecast the price distribution in the German Intraday Continuous Market in the last 3 hours of trading, but the approach allows for application to other continuous markets, especially in Europe. The results show the superiority of the mixture model over the benchmarks, with the largest gain coming from the modelling of the volatility. They also indicate that the introduction of XBID reduced the market volatility.

Keywords: electricity price forecasting, power markets, intraday market, continuous-trade markets, XBID, ensemble forecasting, probabilistic forecasting, short-term forecasting, trajectories, generalized additive models, lasso, logistic regression, zero-inflated distribution, scenario simulation

Accepted for publication in Applied Energy.
arXiv:2005.01365v3 [q-fin.ST] 29 Aug 2020

1 Introduction

Intraday continuous electricity markets gain importance every day [1]. Their primary purpose is to handle the uncertainty in electricity generation and load that has arisen since the day-ahead markets [2]. A number of events can cause the uncertainty, e.g. an unexpected power plant outage or changing weather conditions. The latter is the result of the global trend of investing in weather-dependent renewable power sources and is a subject of modelling and forecasting [3]. The need for intraday continuous trading is fulfilled by the power exchanges and transmission system operators (TSO) [4]. They allow the market participants to trade the energy continuously up to 5 minutes before the delivery, e.g. in France or Germany, and to trade it cross-border, e.g. using the cross-border intraday (XBID) market [5]. Even though there is clear evidence of the importance of this kind of market, researchers do not investigate them in terms of forecasting as willingly as the day-ahead market.

The day-ahead market is the main electricity spot market with a long history of research on electricity price forecasting [6]. Recent studies on electricity price forecasting (EPF) in day-ahead markets consider i.a. probabilistic forecasting and forecast combination. Nowotarski and Weron [7] present a review of probabilistic EPF, and Muniain and Ziel [8] use it to simulate peak and off-peak prices. Marcjasz et al. [9] combine point forecasts obtained using different calibration windows, while Uniejewski et al. [10] and Serafin et al. [11] do it for probabilistic forecasts. A large part of the recent EPF literature is also devoted to hybrid models [12–14] and neural networks [15–17]. Market integration also plays an important role in price formation in both day-ahead and intraday markets, which is elaborated by Lago et al. [18] and Kath [5].
The role of the intraday markets in the balancing of electricity systems was emphasized and explained by Ocker and Ehrhart [19] and Koch and Hirth [20] on the basis of the German electricity market. They observed that the introduction of the intraday continuous market in Germany partially led to a substantial decrease in the demand for balancing energy while the wind and solar energy generation increased. Karanfil and Li [21] clarify the reason for the spread between day-ahead and intraday prices in Denmark, while Maciejowska et al. [22] forecast the price spread between the day-ahead and intraday markets based on the Polish and German data. The continuity of the intraday market has encouraged researchers to investigate the transaction arrival process [23], bidding behaviour [24–26] and optimal trading strategies [27–29]. The impact of fundamental regressors on the price formation in the intraday market was examined by Pape et al. [30], Gürtler and Paulsen [31], and Kremer et al. [32].

The literature on EPF in the intraday markets is not as broad as that on the day-ahead markets or that regarding other aspects of the intraday markets. Monteiro et al. [33] and Andrade et al. [34] conducted EPF for the Iberian intraday market; however, it is not a continuous market, and thus their studies are more similar to those on day-ahead markets. Uniejewski et al. [35], Narajewski and Ziel [36] and Marcjasz et al. [37] performed EPF in the German Intraday Continuous Market, while Oksuz and Ugurlu [38] did so in the Turkish intraday market. An outcome of the second of these studies [36] was an indication of the weak-form efficiency of the investigated market. This was partially confirmed by Janke and Steinke [39], who forecasted the distribution of prices during the last three hours of trading and concluded that forecasting of the central quantiles yields only marginal improvement over the naive benchmark. However, Marcjasz et al.
[37] managed to outperform the most recent price by using an ensemble of it and a lasso-estimated model.

The only four papers on EPF in the German intraday market considered the $\mathrm{ID}_3$-Price (a volume-weighted average price of transactions in the last three hours before delivery) as the most important price index in the German intraday market and conducted forecasting of it. This paper focuses on the $\mathrm{ID}_3$ index as well, but not directly. Instead of forecasting its price, we simulate the paths of the 5-minute volume-weighted average price during the time frame of the index. This way we obtain a distribution forecast of the prices in every 5-minute window during the last three hours before the delivery. An example of this approach can be seen in Figure 1. We motivate our research with the results on the weak-form efficiency of the market concluded by Narajewski and Ziel [36] and a possible application of the methodology to electricity trading and optimal redispatch management.

Figure 1: Price trajectory for the hourly product with delivery on 15.07.2016 at 12:00, plotted over time to delivery (minutes). The black part is the realization and the colourful part consists of 100 simulations from the Gaussian random walk. Time of forecasting is indicated by the green dashed line.

For the purpose of modelling and forecasting the trajectories, we utilize the generalized additive models for location, scale and shape (GAMLSS) [40], which extend the generalized additive models (GAM) [41]. This methodology has found applications in electricity load [42, 43] and day-ahead price [44–46] forecasting, but never in the intraday electricity markets. The model for the price difference $\Delta P$ is fitted to the Student's t-distribution and mixed with the Dirac distribution, i.e. $\Delta P \sim (1 - \alpha)\,\delta_0 + \alpha\, t$. Here $\alpha$ is assumed to be a Bernoulli variable with probability $\pi$ and is modelled using logistic regression. We estimate it with the lasso method [47].
A broader description of the modelling exercise can be found in Section 4. The forecasting part utilizes a rolling window study. This is a very common study type in EPF and is widely utilized by researchers [35, 36]. We analyse both in-sample characteristics and evaluate the out-of-sample forecasting performance.

The major contributions of this paper are as follows:

(1) It is the first work on the price trajectories in intraday continuous markets, which are a new and developing part of the electricity markets.
(2) A rigorous presentation and discussion of all characteristics of the market, like trading frequency and volatility.
(3) We propose a model that utilizes a mixture of GAMLSS and logit-lasso estimation methods and generates realistic ensembles, which allows for efficient decision-making, especially for trading and redispatch.
(4) The components of the proposed model are interpreted with respect to the market behaviour, highlighting the impact of the XBID introduction and relevant features, like wind and solar generation, load, calendar effects, trading activity and historic prices.
(5) The high-quality predictive performance of the proposed model is compared with simple benchmarks and sophisticated models with respect to point and probabilistic forecasting.

The remainder of this paper has the following structure. In the next section, we describe the market. The third section consists of the data description and descriptive statistics. Then, a broader explanation of the estimation methods is presented, followed by the description of the considered models and benchmarks. In the fifth section, the forecasting study and evaluation measures are introduced and discussed in detail. In the sixth section, we present the results, which consist of the in-sample analysis with relevant model interpretations and the out-of-sample evaluation. The final section concludes this paper.
The methodology used in the paper is innovative, especially with regard to the intraday electricity markets. We present it with an application to the German Intraday Continuous Market, but it can be easily used with any other continuous intraday electricity market.

2 Market description

The German Intraday Continuous Market allows trading of hourly, half-hourly and quarter-hourly products. We conduct the study using the most liquid part of the market, the hourly one. This is in line with other EPF studies in intraday markets. Trading of hourly products in the German Intraday Continuous Market begins every day at 15:00 for the 24 products of the following day. It is possible to trade the electricity until 30 minutes (in the whole market) and up to 5 minutes (within respective control zones) before the delivery. In the meantime, between 22:00 and 60 minutes before the delivery, cross-border trading within the XBID system is possible [5]. This system went live on 18th June 2018. A visualization of the trading timeline can be seen in Figure 2. For more details on the German electricity market, we recommend the paper of Viehmann [4].

Figure 2: The daily routine of the German spot electricity market. d corresponds to the day of the delivery and s corresponds to the hour of the delivery.

The most important price measure in the German intraday market is the volume-weighted average price of transactions in the last three hours of trading, called $\mathrm{ID}_3$. The index takes into account only those transactions that happen until the gate closure 30 minutes before the delivery, so in fact it measures the last two and a half hours of trading before the gate closure.
The relevance of $\mathrm{ID}_3$ is an outcome of the behaviour of traders in the intraday market: most of the transactions are held in this time period, making it very liquid. This results in a high interest of practitioners and researchers in the $\mathrm{ID}_3$-Price. For more details on the index, visit the webpage of EPEX SPOT or see e.g. Narajewski and Ziel [36].

To measure the prices during the trading period, we use the ${}_x\mathrm{ID}_y$ defined by Narajewski and Ziel [36]. Let us recall the definition of ${}_x\mathrm{ID}_y$. Let $b(d,s)$ be the start of the delivery of a product $s$ on day $d$. By $\mathbb{T}^{d,s}_{x,y} = [\,b(d,s) - x - y,\; b(d,s) - x\,)$, $x \geq 0$ and $y > 0$, we denote the time interval between $x + y$ and $x$ minutes before the delivery, and by $\mathcal{T}^{d,s}$ we denote a set of timestamps of transactions on the product. The ${}_x\mathrm{ID}_y^{d,s}$ is defined by

$$ {}_x\mathrm{ID}_y^{d,s} := \frac{1}{\sum_{k \in \mathbb{T}^{d,s}_{x,y} \cap \mathcal{T}^{d,s}} V_k^{d,s}} \sum_{k \in \mathbb{T}^{d,s}_{x,y} \cap \mathcal{T}^{d,s}} V_k^{d,s} P_k^{d,s}, \tag{1} $$

where $V_k^{d,s}$ and $P_k^{d,s}$ are the volume and the price of the $k$-th trade within the transaction set $\mathbb{T}^{d,s}_{x,y} \cap \mathcal{T}^{d,s}$, respectively. Let us note that the ${}_x\mathrm{ID}_y$ is simply a volume-weighted average price of transactions in the time interval of length $y$ that ends $x$ before the delivery.

In the case of $\mathbb{T}^{d,s}_{x,y} \cap \mathcal{T}^{d,s} = \emptyset$ we use the value of ${}_{x+y}\mathrm{ID}_y^{d,s}$, that is to say the previously observed volume-weighted average price measured on a time period of the same length. In the case of no trades appearing since the start of trading, the price is set to the price of the corresponding Day-Ahead Auction.

3 Data and descriptive statistics

The data used for the purpose of this study consists of all transactions on hourly products in the German Intraday Continuous Market between 16th July 2015 and 1st October 2019. More general descriptive statistics were presented by Narajewski and Ziel [36]. As mentioned in the previous section, the XBID system started to function on 18th June 2018. This means that XBID trades were possible only on around 30% of the days in the data.
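The index from Eq. (1), together with its fallback rules, can be sketched as follows. This is a minimal illustration under stated assumptions: each trade is represented as a hypothetical tuple (minutes before delivery, volume, price), and the names `vwap_window`, `id_index` and `trading_start` are introduced here for illustration only; they do not come from the paper or from any exchange API.

```python
from typing import List, Optional, Tuple

# Hypothetical trade record: (minutes before delivery, volume, price).
Trade = Tuple[float, float, float]


def vwap_window(trades: List[Trade], x: float, y: float) -> Optional[float]:
    """Volume-weighted average price of trades in [b - x - y, b - x).

    In "minutes before delivery" m this interval reads x < m <= x + y.
    Returns None when the window contains no trade.
    """
    window = [(vol, price) for m, vol, price in trades if x < m <= x + y]
    total_volume = sum(vol for vol, _ in window)
    if total_volume == 0:
        return None
    return sum(vol * price for vol, price in window) / total_volume


def id_index(trades: List[Trade], x: float, y: float,
             day_ahead_price: float, trading_start: float) -> float:
    """xIDy with the fallbacks described in the text.

    If the window is empty, step back to the previous window of the same
    length (x + y, then x + 2y, ...). If no trade is found back to the
    start of trading (trading_start minutes before delivery, an assumed
    input), return the day-ahead auction price.
    """
    while x + y <= trading_start:
        value = vwap_window(trades, x, y)
        if value is not None:
            return value
        x += y
    return day_ahead_price
```

For example, with trades at 34 and 31 minutes before delivery carrying volumes 10 and 30 at prices 40 and 44, `id_index(trades, 30, 5, ...)` averages both trades by volume, while an empty window further out falls back to the nearest earlier window that contains a trade.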
In the forecasting study, we use $D = 365$ days of the data as in-sample, and therefore the analysis in this section is based only on the initial in-sample, i.e. the data between 16th July 2015 and 14th July 2016. The start of the data is set to the first day of the lead time change in Germany from 45 min to 30 min in order to avoid this structural break. In this paper, we aggregate the transactions using the ${}_x\mathrm{ID}_y^{d,s}$ with $y = 5$ min, and this way we obtain dense time series data. As said before, we are particularly interested in the evolution of prices during the last 2.5 hours of trading before the gate closure, so we use $x \in \mathcal{J} = \{180, 175, \ldots, 35, 30\}$, where $x$ is denoted in minutes. This way we observe $T = 31$ price points a day, which results in $TD = 31 \cdot 365 = 11315$ in-sample observations and $T$-dimensional simulated trajectories. We thus use a very specific setting, but it can be applied to any other continuous intraday market with other input variables.

As the market shows strong indications of weak-form efficiency, we focus on modelling the price differences

$$ \Delta P_t^{d,s} = {}_{(T-t)y+30}\mathrm{ID}_y^{d,s} - {}_{(T-(t-1))y+30}\mathrm{ID}_y^{d,s} $$

instead of the pure prices $P_t^{d,s} = {}_{(T-t)y+30}\mathrm{ID}_y^{d,s}$. We also introduce the $P_t^{d,s}$ notation for simplicity. Due to the usage of price differences and to the fact that the data is aggregated using a 5-minute grid, we observe a high frequency of observations with no trade, and thus price differences equal to 0. Let us recall that in the case of no trade in a time interval we use the value of the previous interval; in Narajewski and Ziel [36] this value is instead set to the price of the last transaction. The adjustment is made because in this paper we work with 5-minute time intervals, leading to a significant number of intervals with no trade. Using the last transaction price would often result in an artificial change of the price compared to the previously observed ${}_x\mathrm{ID}_y$.

The high frequency of no-trade observations is depicted in Figure 3. One can see that a lack of transactions occurs more often for the night and morning hours. In Figure 4, we zoom in on the tails of the histograms from Figure 3.
We also plot there the densities of 4 distributions fitted to the data: the normal distribution $N(0, \hat{\sigma})$ and the t-distribution $t(0, \hat{\sigma}, \nu)$ with fixed $\nu \in \{2.5, 3, 4\}$ and $\hat{\sigma}$ estimated using maximum likelihood estimation ignoring the no-trade observations. Based on Figure 4 it is clear that the price differences $\Delta P_t^{d,s}$ are heavy-tailed. One can see that even the t-distribution with $\nu = 4$ seems to be not heavy-tailed enough for the data. This indicates that the tail-index of the price differences may be lower than 4, which would mean that the fourth moment of the $\Delta P_t^{d,s}$ might not exist, which is a strong indication of heavy tails.

Figure 5 shows the frequency of the no-trade event over time to delivery. We see that the overall behaviour is very similar across all products: the closer to the delivery, the fewer observations without transactions. What differs among the products is the level of the frequency. It is clear that the frequency decreases as the product time increases, and the reason for it may be the time distance from the Day-Ahead and Intraday Auctions. It is intuitive that since these auctions the uncertainty could be smaller for the first products and higher for the last ones, but the smallest values of frequency are achieved not for the evening, but for the day-peak hours. This can be explained by higher activity in the market due to higher expected demand.

Figure 3: Histograms of the initial in-sample price differences $\Delta P_t^{d,s}$ for selected hours (00:00, 06:00, 12:00, 18:00). Blue colour corresponds to the no-trade cases.

Figure 4: Histograms of the tails of the initial in-sample price differences $\Delta P_t^{d,s}$ for selected hours (00:00, 06:00, 12:00, 18:00). Solid lines depict densities of the distributions according to the legend: $N(0, \sigma)$, $t(0, \sigma, \nu = 2.5)$, $t(0, \sigma, \nu = 3)$, $t(0, \sigma, \nu = 4)$.

Figure 5: Frequency of the no-trade event in the initial in-sample price differences $\Delta P_t^{d,s}$ over time to delivery (minutes) for all 24 products.

Figure 6 shows the in-sample standard deviation of the price differences $\Delta P_t^{d,s}$ over time to delivery. The dashed lines depict the standard deviation of the whole samples, independent of time. If the price processes were similar to a random walk, the sample standard deviation over time should oscillate around these dashed lines. The behaviour in Figure 6 is clearly different, with a spike in the last 30 minutes before gate closure. This suggests that the variance should be a subject of modelling. Figure 7 presents the partial autocorrelation function of the absolute price differences $|\Delta P_t^{d,s}|$ to explore potential conditional heteroscedasticity in the heavy-tailed data. Figure 7 shows that the first three lags are the most significant. Also, lags up to 6 may contain some information. Surprisingly, lags around 31 seem to be significant too, but this is most likely some daily dependence.

Figure 6: Standard deviation of the initial in-sample price differences $\Delta P_t^{d,s}$ for selected hours (00:00, 06:00, 12:00, 18:00) over time to delivery (minutes). Dashed lines indicate the standard deviation independent of time.

Figure 7: Partial autocorrelation function of the initial in-sample absolute price differences $|\Delta P_t^{d,s}|$ for selected hours (00:00, 06:00, 12:00, 18:00). Blue, dashed lines indicate the confidence intervals.
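The descriptive quantities behind Figures 5 and 6 can be computed along the following lines. This is a minimal sketch, assuming the price differences of one product are stored as a day-by-step table `dp[d][t]` and the trade indicator as `traded[d][t]`; both layouts and all function names are hypothetical and chosen only for illustration.

```python
from statistics import stdev
from typing import List


def std_by_step(dp: List[List[float]]) -> List[float]:
    """Sample standard deviation of price differences at each grid step t.

    dp[d][t] is the price difference on day d at step t (assumed layout).
    """
    return [stdev(column) for column in zip(*dp)]


def pooled_std(dp: List[List[float]]) -> float:
    """Standard deviation of the whole sample, independent of time."""
    return stdev([value for row in dp for value in row])


def no_trade_frequency(traded: List[List[bool]]) -> List[float]:
    """Share of days without any trade at each grid step t."""
    return [sum(1 for flag in column if not flag) / len(column)
            for column in zip(*traded)]
```

Comparing `std_by_step` with `pooled_std` mirrors the comparison between the solid and dashed lines in Figure 6: under a random-walk-like process the per-step values would scatter around the pooled one.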
4 Modelling and estimation

We assume the price differences $\Delta P_t^{d,s}$ to follow a 4-parametric distribution: a mixture of the Dirac $\delta_0$ distribution and the 3-parametric t-distribution, sometimes referred to as the zero-inflated t-distribution:

$$G_t^{d,s} = (1 - \alpha_t^{d,s})\,\delta_0 + \alpha_t^{d,s} F_t^{d,s} \tag{2}$$

where $\alpha_t^{d,s} = \mathbb{1}(V_t^{d,s} \neq 0)$ is a Bernoulli variable of the event that there is a non-zero volume of energy traded on product $s$ on day $d$ at time $t$, occurring with probability $\pi_t^{d,s}$, and $F_t^{d,s}$ is the 3-parametric t-distribution $t(\mu_t^{d,s}, \sigma_t^{d,s}, \nu_t^{d,s})$ where $\mu_t^{d,s} \in \mathbb{R}$ is the mean, $\sigma_t^{d,s} > 0$ the standard deviation and $\nu_t^{d,s} > 2$ the degrees of freedom.

The t-distribution is estimated with the GAMLSS framework [40]. GAMLSS is an extension of the GAM [41] that allows modelling not only the expected value of a response variable, but potentially also the higher moments, represented by scale and shape parameters. Namely, let $Y$ be a random variable with a density function $f(y \mid \theta)$, where $\theta$ is a set of up to four distribution parameters. Then each $\theta_i \in \theta$ may be modelled by

$$g_i(\theta_i) = \sum_{j=1}^{J_i} h_{ji}(x_{ji}) \tag{3}$$

where $g_i$ is some link function, $J_i$ is the number of explanatory variables and $h_{ji}$ is a smooth function of the explanatory variable $x_{ji}$. Note that $h_{ji}$ does not have to be a parametric function. In our exercise, we use the following link functions

$$g_1(\mu) = \mu, \qquad g_2(\sigma) = \log(\sigma)\,\mathbb{1}(\sigma \le 1) + (\sigma - 1)\,\mathbb{1}(\sigma > 1), \qquad g_3(\nu) = \log(\nu - 2). \tag{4}$$

$g_1$ is a standard link function for the expected value. $g_2$ is a link function that we call "logident"; we introduce it in order to avoid an exponential inverse function for high values of the estimates. The third link function is simply a natural logarithm shifted by 2 to preserve the condition $\nu > 2$. The three link functions are plotted in Figure 8. The models for $F_t^{d,s}$ are estimated using the gamlss package in R [48].

[Figure 8: An illustration of the three link functions $g_1$, $g_2$ and $g_3$.]
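The link functions in equation (4) and their inverses can be written down directly. The following is a minimal sketch in Python (the study itself uses R); the function names are hypothetical and chosen only for illustration.

```python
import math


def g1(mu):
    """Identity link for the mean."""
    return mu


def g2(sigma):
    """'logident' link: log(sigma) for sigma <= 1, sigma - 1 above 1."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return math.log(sigma) if sigma <= 1.0 else sigma - 1.0


def g2_inv(eta):
    """Inverse of g2: exponential for eta <= 0, linear (eta + 1) above 0."""
    return math.exp(eta) if eta <= 0.0 else eta + 1.0


def g3(nu):
    """Log link shifted by 2, which keeps nu > 2."""
    if nu <= 2.0:
        raise ValueError("nu must exceed 2")
    return math.log(nu - 2.0)


def g3_inv(eta):
    """Inverse of g3: exp(eta) + 2."""
    return math.exp(eta) + 2.0
```

Both branches of $g_2$ meet at $\sigma = 1$ with value 0, so the link is continuous, and for large linear predictors its inverse grows linearly rather than exponentially.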
Due to the novelty of the exercise, we cannot use any literature benchmarks, nor any standard approaches to the modelling of volatility, e.g. GARCH. Even though the data look like a time series, the biggest problem lies in the gap between days. We model each product separately, and for each product we have 31 observations every day. In the corresponding time series, the observations on day $d$ appear at 5-minute intervals, while the time difference between the last observation on day $d$ and the first on day $d + 1$ is around 21 hours. Furthermore, there is no direct link between the prices on day $d$ and day $d + 1$, as they concern different delivery periods with potentially different fundamental market situations. Thus, the use of GARCH-type components to address conditional heteroscedasticity is not straightforward. Instead, as simple benchmarks we use models that assume the distribution of $\Delta P^{d,s} = (\Delta P_1^{d,s}, \Delta P_2^{d,s}, \ldots, \Delta P_T^{d,s})$ to be multivariate, random walk models, and a model that uses in-sample price differences to create an ensemble forecast. As advanced benchmarks, linear quantile regression models with copulas are considered.

In the following subsection, the more complicated models are considered. We explicitly model the probability of a non-zero number of transactions, the mean, and the variance of the fitted distribution. We present the models from the least to the most complex and show the results in the same order. This allows us to observe the gain from every new part of the model.

4.1 Mixture models

We introduce a dependency structure between the first three parameters of the $G_t^{d,s}$ distribution, i.e. $\pi_t^{d,s}$, $\mu_t^{d,s}$ and $\sigma_t^{d,s}$, and the data. The fourth parameter, the degrees of freedom $\nu_t^{d,s}$, is assumed to be constant. The $G_t^{d,s}$ distribution is estimated in a 2-step approach. First, the $\pi_t^{d,s}$ parameter is estimated, and then the $F_t^{d,s}$ distribution is fitted to the in-sample price differences $\Delta P_t^{d,s}$ for which the value of $\alpha_t^{d,s}$ is 1.
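To make the structure of the mixture in equation (2) concrete, a single price difference can be drawn in two stages: first the trade indicator, then the t-distributed innovation. The sketch below is an illustration in Python, not the authors' implementation. It assumes that $\sigma$ is the standard deviation of the t-distribution (so the scale is $\sigma\sqrt{(\nu-2)/\nu}$), and the function name `sample_price_difference` is hypothetical.

```python
import math
import random


def sample_price_difference(pi, mu, sigma, nu, rng):
    """Draw one price difference from the zero-inflated t-mixture.

    pi    -- probability of a non-zero traded volume
    mu    -- mean of the t-component
    sigma -- standard deviation of the t-component (assumed parametrisation)
    nu    -- degrees of freedom, nu > 2
    rng   -- a random.Random instance
    """
    # Stage 1: no trade means the price does not move (Dirac mass at 0).
    if rng.random() >= pi:
        return 0.0
    # Stage 2: standard Student's t draw as Z / sqrt(chi2_nu / nu).
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.o.f.
    t_draw = z / math.sqrt(chi2 / nu)
    # Rescale so that the t-component has standard deviation sigma.
    scale = sigma * math.sqrt((nu - 2.0) / nu)
    return mu + scale * t_draw
```

Repeating this draw recursively over the time points, with the parameters updated from the already simulated values, would yield one simulated trajectory.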
In the first step, we build a logistic model for $\pi_t^{d,s}$

$$
\begin{aligned}
\log\frac{\pi_t^{d,s}}{1 - \pi_t^{d,s}} ={}& \beta_0 + \underbrace{\sum_{j=1}^{3} \beta_j \Delta P_{t-j}^{d,s} + \sum_{j=1}^{6} \beta_{3+j} |\Delta P_{t-j}^{d,s}| + \beta_{10} \sum_{j=7}^{12} |\Delta P_{t-j}^{d,s}|}_{\text{price differences}} \\
&+ \underbrace{\beta_{11}\,\mathrm{Mon}(d) + \beta_{12}\,\mathrm{Sat}(d) + \beta_{13}\,\mathrm{Sun}(d) + \sum_{j=1}^{31} \beta_{13+j}\,\mathrm{TtM}_j(t)}_{\text{time dummies}} \\
&+ \underbrace{\beta_{45}\,\mathrm{DA}_{\mathrm{Load}}^{d,s} + \beta_{46}\,\mathrm{DA}_{\mathrm{Sol}}^{d,s} + \beta_{47}\,\mathrm{DA}_{\mathrm{WiOn}}^{d,s} + \beta_{48}\,\mathrm{DA}_{\mathrm{WiOff}}^{d,s}}_{\text{fundamental regressors}} + \underbrace{\sum_{j=1}^{12} \beta_{48+j}\,\bar{\alpha}_{t-j}^{d,s}}_{\text{regression on } \alpha^{d,s}}.
\end{aligned} \tag{5}
$$

The model explains the logit function with 4 main components: price difference impact, time dummies, fundamental regressors and regression on $\alpha^{d,s}$. The price difference impact consists of the 3 most recent price differences, the 6 most recent absolute price differences and a sum of the absolute price differences lagged by 7 to 12. This component addresses the overall impact of price volatility on $\pi_t^{d,s}$. We expect to observe more trades when the prices are more volatile. The time dummies consist of three weekday dummies and time to maturity dummies. The weekday dummies for Monday, Saturday and Sunday are chosen based on the literature. A number of studies [49-51] have shown that using these dummies in EPF substantially improves the forecasting performance. These three dummies indicate the end of the week, with Monday being a transition day. The use of time to maturity dummies is clear when we look again at Figure 5. It is expected that $\pi_t^{d,s}$ rises as we approach the gate closure. The fundamental regressors consist of day-ahead forecasts of total load, solar generation, wind onshore generation and wind offshore generation. It is expected that a higher load and share of renewables should raise the uncertainty in the market and encourage market participants to trade more. Last but not least is the regression on $\alpha^{d,s}$. We do not use the lagged values directly; instead we use the average of the last $j$ observed values of $\alpha^{d,s}$, which we denote by $\bar{\alpha}_{t-j}^{d,s}$. We expect these values to have a significant impact on the prediction of $\pi_t^{d,s}$. Intuitively, the higher these averages, the higher the value of $\pi_t^{d,s}$.
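The regressor layout of model (5) can be sketched as follows. This is an illustrative Python helper, not code from the study: the function name `build_regressors`, the argument layout (lags ordered from most recent to oldest) and the weekday convention (Monday = 0, as in Python's `date.weekday()`) are assumptions made for the example.

```python
def build_regressors(dp_lags, weekday, ttm_index, da_forecasts, alpha_lags, n_ttm=31):
    """Assemble the regressor vector of the logistic model, intercept first.

    dp_lags      -- 12 most recent price differences, most recent first
    weekday      -- 0 = Monday, ..., 6 = Sunday (assumed convention)
    ttm_index    -- time-to-maturity index in 1..n_ttm
    da_forecasts -- day-ahead forecasts: load, solar, wind onshore, wind offshore
    alpha_lags   -- 12 most recent trade indicators (0/1), most recent first
    """
    if len(dp_lags) < 12 or len(alpha_lags) < 12 or len(da_forecasts) != 4:
        raise ValueError("unexpected input lengths")
    x = [1.0]                                          # intercept
    x += list(dp_lags[:3])                             # 3 signed lags
    x += [abs(v) for v in dp_lags[:6]]                 # 6 absolute lags
    x.append(sum(abs(v) for v in dp_lags[6:12]))       # sum of absolute lags 7..12
    x += [float(weekday == 0), float(weekday == 5), float(weekday == 6)]
    x += [float(k == ttm_index) for k in range(1, n_ttm + 1)]  # TtM dummies
    x += list(da_forecasts)                            # fundamental regressors
    # Running averages of the last j trade indicators, j = 1..12.
    x += [sum(alpha_lags[:j]) / j for j in range(1, 13)]
    return x
```

With 31 time-to-maturity dummies the vector has 61 entries, which matches the coefficient indices $\beta_0, \ldots, \beta_{60}$ in equation (5).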
Model (5) consists of 61 regressors in total. To avoid overfitting problems, we estimate the model using the least absolute shrinkage and selection operator (lasso) of Tibshirani [47]. Let us recall that if we have a logistic model $\log\frac{\pi}{1-\pi} = X\beta$ for the Bernoulli variable $\alpha$ with $P(\alpha = 1) = \pi$, then the lasso estimator is given by

$$\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \left\{ -l(\beta; \tilde{X}) + \lambda \left( \|\beta\|_1 - |\beta_0| \right) \right\}, \tag{6}$$

where $l$ is the corresponding log-likelihood

$$l(\beta; \tilde{X}) = \sum_{i=1}^{n} \left[ \alpha_i \tilde{X}_i' \beta - \log\left(1 + e^{\tilde{X}_i' \beta}\right) \right], \tag{7}$$

$\tilde{X}$ is a standardization of $X$ and $\lambda$ is a tunable shrinkage parameter. This method has already found many successful applications in EPF and intraday markets [35, 36, 52]. In this exercise, we utilize the glmnet package in R by Friedman et al. [53]. The estimation is conducted using a BIC-tuned $\lambda$ value chosen from an exponential grid of 100 values.

Let us now take a look at the $F_t^{d,s}$ distribution in equation (2). We consider four versions of it. In the first one, we assume that $F_t^{d,s}$ follows $t(0, \sigma_t^{d,s}, \nu_t^{d,s})$ with constant $\sigma_t^{d,s}$ and $\nu_t^{d,s}$. We denote it simply by Mix.RW.t. The $F_t^{d,s}$ distribution is fitted to the in-sample price differences with a non-zero number of transactions using GAMLSS. With this model we can observe the gain from using a complex model for the $\pi_t^{d,s}$ parameter. Figure 9 shows the densities fitted to the histograms presented in Figure 3. They were obtained with model Mix.RW.t.

The second model utilizes $F_t^{d,s}$ with modelled $\mu_t^{d,s}$ and constant $\sigma_t^{d,s}$ and $\nu_t^{d,s}$, and we denote it by Mix.t.mu. This model helps us understand the outcome of modelling the expected value of $\Delta P_t^{d,s}$. However, a preliminary analysis has shown that most of the regressors used in model (5) were not significant for modelling $\mu_t^{d,s}$. The only significant ones were the three most recent price differences. Therefore, we model the expected value with

$$g_1(\mu_t^{d,s}) = \beta_1 \Delta P_{t-1}^{d,s} + \beta_2 \Delta P_{t-2}^{d,s} + \beta_3 \Delta P_{t-3}^{d,s}. \tag{8}$$

The next model uses $F_t^{d,s}$ with $\mu_t^{d,s} \equiv 0$, modelled $\sigma_t^{d,s}$ and constant $\nu_t^{d,s}$. We denote it by Mix.t.sigma.
The formula for the standard deviation is as follows

$$
\begin{aligned}
g_2(\sigma_t^{d,s}) ={}& \beta_0 + \underbrace{\sum_{j=1}^{6} \beta_j |\Delta P_{t-j}^{d,s}| + \beta_7 \sum_{j=7}^{12} |\Delta P_{t-j}^{d,s}|}_{\text{absolute price differences}} + \underbrace{\beta_8\,\mathrm{Mon}(d) + \beta_9\,\mathrm{Sat}(d) + \beta_{10}\,\mathrm{Sun}(d)}_{\text{weekday dummies}} \\
&+ \underbrace{\beta_{11}\,\mathrm{DA}_{\mathrm{Load}}^{d,s} + \beta_{12}\,\mathrm{DA}_{\mathrm{Sol}}^{d,s} + \beta_{13}\,\mathrm{DA}_{\mathrm{WiOn}}^{d,s} + \beta_{14}\,\mathrm{DA}_{\mathrm{WiOff}}^{d,s}}_{\text{fundamental regressors}} + \underbrace{\beta_{15}\,\alpha_{t-1}^{d,s} + \beta_{16}\,\alpha_{t-2}^{d,s}}_{\text{lagged } \alpha^{d,s}} \\
&+ \underbrace{h_1(P_{t-1}^{d,s}) + h_2(t)}_{\text{non-linear effects}}
\end{aligned} \tag{9}
$$

where $h_1$ and $h_2$ are smooth non-linear P-spline functions. P-splines simply combine equally-spaced B-splines and discrete penalties. More information on P-splines can be found in Eilers et al. [54]. Let us note that the model described by equation (9) uses many more regressors than the one in equation (8). The explanation of the choice of the variables is very similar to that of the model described by equation (5). We explain the standard deviation of price differences with: lagged absolute price differences, weekday dummies, fundamental regressors, lagged values of $\alpha^{d,s}$ and non-linearities in the most recent price and time to maturity variables. We expect the absolute price changes to be a suitable explanatory variable for the standard deviation, as motivated by Figure 7. The fundamental regressors are supposed to have a positive linear correlation with $\sigma_t^{d,s}$. For the Saturday and Sunday dummies we might expect a negative impact due to lower trading activity on weekends, but also a positive impact because higher bid-ask spreads are plausible. The lagged values of $\alpha^{d,s}$ indicate whether the market participants traded lately, and thus we believe that they could identify a higher variance of the price differences.

[Figure 9: Histograms of the initial in-sample price differences $\Delta P_t^{d,s}$ with fitted densities of model Mix.RW.t for selected hours (panels: 00:00, 06:00, 12:00, 18:00). Blue colour corresponds to the no-trade cases.]
The last two regressors are expected to have a non-linear impact on the formation of $\sigma_t^{d,s}$, and therefore they are estimated using P-splines. Figure 6 already provides evidence that the standard deviation varies over time to maturity. Moreover, we suspect that extreme values of the most recent price $P_{t-1}^{d,s}$ result in a higher variance due to a relatively inelastic supply curve in extreme price areas.

The last and at the same time the most complicated model uses $F_t^{d,s}$ with modelled $\mu_t^{d,s}$ and $\sigma_t^{d,s}$ and constant $\nu_t^{d,s}$. We denote it by Mix.t.mu.sigma. The $\mu_t^{d,s}$ is modelled using the formula from equation (8) and the $\sigma_t^{d,s}$ using the formula from equation (9).

Let us mention that we could make the mixture model even more complex by modelling the degrees of freedom parameter $\nu_t^{d,s}$. However, a preliminary analysis has shown that it does not yield any significant improvement while heavily increasing the computational cost. Thus, in the forecasting study we analyse the performance of 8 models described in this section.

4.2 Simple benchmark models

The first benchmark model uses one of $D = 365$ historical trajectories to model the price difference vector $\Delta P^{d,s} = (\Delta P_1^{d,s}, \Delta P_2^{d,s}, \ldots, \Delta P_T^{d,s})$. We denote it by Naive and its formula is given by

$$\Delta P^{d,s} = \Delta P^{d^*,s} \tag{10}$$

where $d^* \sim U(\{d-1, \ldots, d-D\})$ is a uniform random variable indicating the day used to model the price difference. Let us note that a fixed $d^*$ index is used to model the whole price trajectory, i.e. for every $t \in \{1, 2, \ldots, T\}$. This model assumes that the future trajectories can be forecasted using simply the past ones.

The second and the third benchmark models assume that the price difference vector $\Delta P^{d,s}$ follows a multivariate normal and a multivariate t-distribution, respectively. They are denoted by MV.N and MV.t and are given by

$$\Delta P^{d,s} = \varepsilon^{d,s} \tag{11}$$

where $\varepsilon^{d,s} \sim N(0, \Sigma^{d,s})$ in the case of MV.N and $\varepsilon^{d,s} \sim t(0, \Sigma^{d,s}, \nu^{d,s})$ in the case of MV.t.
Let us note that the covariance matrix $\Sigma^{d,s}$ and degrees of freedom $\nu^{d,s}$ are estimated by fitting the respective distributions to the in-sample observations. Moreover, the degrees of freedom $\nu^{d,s}$ is assumed to be constant for all $t \in \{1, 2, \ldots, T\}$.

The next benchmark model is the random walk version of the mixture model described by equation (2), and we denote it by RW.t.mix.D. The formula is as follows

$$\Delta P_t^{d,s} = P_t^{d,s} - P_{t-1}^{d,s} = \varepsilon_t^{d,s} \tag{12}$$

where $\varepsilon_t^{d,s} \sim G_t^{d,s}$ with $\hat{\pi}_t^{d,s} = \frac{1}{DT} \sum_{i=d-D}^{d-1} \sum_{t=1}^{T} \mathbb{1}(V_t^{i,s} \neq 0)$, $\mu_t^{d,s} \equiv 0$ and constant $\sigma_t^{d,s}$ and $\nu_t^{d,s}$. These values are estimated based on the in-sample data.

The fifth benchmark model is a modification of RW.t.mix.D. We denote it by RW.t, and we simply set $\pi_t^{d,s} \equiv 1$, which means that we do not incorporate the mixing part and assume that the price differences follow the t-distribution. The last and the simplest of the random walk models assumes the price differences to follow a Gaussian distribution $N(0, (\sigma_t^{d,s})^2)$ and is denoted by RW.N. In terms of the $G_t^{d,s}$ distribution, we simply modify the RW.t model by taking $\nu_t^{d,s} \to \infty$.

Later, we consider the random walk models from the simplest, RW.N, to the most complex, RW.t.mix.D. This allows us to observe the gain from introducing a more complex structure of the distribution. Let us note that model RW.N assumes exponentially decaying tails of the price differences $\Delta P_t^{d,s}$. Comparing it to model RW.t, we measure the gain from assuming heavier, polynomially decaying tails. Based on the number of outlier observations in the German intraday market and on Figure 4, we expect RW.t to perform better than the Gaussian random walk. Then, considering RW.t.mix.D helps us to understand the gain from the introduction of the mixture.

4.3 Advanced benchmark models

As mentioned, we are unable to use any literature-based benchmarks as this is the first paper on ensemble forecasting in intraday electricity markets.
However, it is possible to implement scenario generating methods that are utilized in other research areas. Thus, as advanced benchmarks we utilize a smoothed linear quantile regression model with two copulas, Gaussian and independence, and we denote them by LQR.Gauss and LQR.ind, respectively. A very similar approach was applied recently for generating density forecasts of significant sea wave height and peak wave period [55].

First, we build the linear quantile regression (LQR) model using the same set of regressors as for the mixture models. The formula is as follows

$$
\begin{aligned}
Q_\tau\left(\Delta P_t^{d,s}\right) ={}& \beta_0 + \underbrace{\sum_{j=1}^{3} \beta_j \Delta P_{1-j}^{d,s} + \sum_{j=1}^{6} \beta_{3+j} |\Delta P_{1-j}^{d,s}| + \beta_{10} \sum_{j=7}^{12} |\Delta P_{1-j}^{d,s}| + \beta_{11} P_0^{d,s}}_{\text{price components}} \\
&+ \beta_{12}\,\mathrm{Mon}(d) + \beta_{13}\,\mathrm{Sat}(d) + \beta_{14}\,\mathrm{Sun}(d) + \underbrace{\beta_{15}\,\alpha_{1-1}^{d,s} + \beta_{16}\,\alpha_{1-2}^{d,s}}_{\text{lagged } \alpha^{d,s}} \\
&+ \underbrace{\beta_{17}\,\mathrm{DA}_{\mathrm{Load}}^{d,s} + \beta_{18}\,\mathrm{DA}_{\mathrm{Sol}}^{d,s} + \beta_{19}\,\mathrm{DA}_{\mathrm{WiOn}}^{d,s} + \beta_{20}\,\mathrm{DA}_{\mathrm{WiOff}}^{d,s}}_{\text{fundamental regressors}}
\end{aligned} \tag{13}
$$

for $\tau \in \{0.01, 0.02, \ldots, 0.99\}$ and $t = 1, 2, \ldots, T$. That is to say, we build separate models for each quantile $\tau$ and each time point $t$. Let us note that due to the design of the model, we can use only the regressor values available at the time of forecasting $t = 0$ (i.e. 3 h 5 min before the delivery). As a result, here we model all $T$ time points using the same data, in contrast to the mixture models, where we can use autoregressive variables due to the recursive character of the models. We estimate the LQR models using the quantreg package in R [56].

In the next step, a spline interpolation is applied over all fitted $Q_\tau(\Delta P_t^{d,s})$ for $\tau \in \{0.01, 0.02, \ldots, 0.99\}$ and for every $t = 1, 2, \ldots, T$. In order to preserve the monotonicity of the estimated cumulative distribution function (CDF) we compute a monotonic cubic spline using Hyman filtering [57]. This way we obtain a smooth and monotonic $T$-dimensional CDF function

$$\hat{F}^{d,s}(x) = \left( \hat{F}_1^{d,s}(x_1), \hat{F}_2^{d,s}(x_2), \ldots, \hat{F}_T^{d,s}(x_T) \right) \tag{14}$$

where $\hat{F}_t^{d,s}\left( \min_{i \in \{d-1, \ldots, d-D\}} \Delta P_t^{i,s} \right) = 0$ and $\hat{F}_t^{d,s}\left( \max_{i \in \{d-1, \ldots, d-D\}} \Delta P_t^{i,s} \right) = 1$.
To assess the dependency structure of the price differences $\Delta P_t^{d,s}$ over $t$ we use two copulas: Gaussian and independence. The Gaussian copula for a given correlation matrix $R$ can be written as

$$C_R(u) = \Phi_R\left( \Phi^{-1}(u_1), \Phi^{-1}(u_2), \ldots, \Phi^{-1}(u_T) \right) \tag{15}$$

where $\Phi^{-1}$ is the inverse CDF of a standard normal distribution and $\Phi_R$ is the joint CDF of a multivariate normal distribution with mean vector zero and covariance matrix equal to the correlation matrix $R$. We estimate the correlation matrix $R$ simply by calculating the in-sample correlation matrix.

5 Forecasting study and evaluation

We use a rolling window forecasting study approach with an in-sample size of $D = 365$ days and an out-of-sample size of $N = 1173$ days. The in-sample data consist of $DT$ data points, where $T = 31$ in this study. We model each of the $S = 24$ hourly products separately, and our forecasting time is 185 minutes before the delivery of product $s$ on day $D + 1$. That is to say, we can utilize all the information from the in-sample data and from day $D + 1$ until 185 minutes before the delivery. At this time we forecast $M = 1000$ times the first price difference $\Delta P_1^{d,s} = {}_{180}\mathrm{ID}_5^{d,s} - {}_{185}\mathrm{ID}_5^{d,s}$. Based on these forecasts and the explanatory data we simulate $M$ second price differences $\Delta P_2^{d,s}$, and we continue this recursive process until we reach the gate closure. Figure 10 provides an outline of the exercise. This gives us $M$ simulated trajectories, each consisting of $T = 31$ points. After that, we move the window forward by one day and repeat the exercise until the end of the out-of-sample data. However, in the case of the benchmark models we do not use the recursive algorithm as there is no recursion in their formulas.

Before we discuss the evaluation design in detail, we recall that we are mainly interested in the evaluation of the forecasted $T$-dimensional distribution of the price vector $P^{d,s} = (P_1^{d,s}, \ldots, P_T^{d,s})$, which is represented by the predicted ensemble.
Indeed, the multivariate cumulative distribution function of the ensemble coincides with the underlying cumulative distribution function if the ensemble sample size $M$ goes to infinity. Thus, for sufficiently large $M$ the evaluation of the scenario set can be regarded as the evaluation of probabilistic distributions.

From the theoretical point of view, strictly proper multivariate scoring rules are the first choice for evaluation, as they are able to identify the optimal forecast, i.e. the true distribution, see Gneiting and Raftery [58], Pinson and Girard [59]. However, we want to recall that even if forecast A performs significantly better than forecast B with respect to a strictly proper scoring rule, there is no guarantee that A also performs better than B in stochastic optimization problems (e.g. trading or storage optimization) where the forecasts are used as input. The optimal forecast would always yield optimal solutions in the stochastic optimization application. Thus, if A is close to the optimal forecast with respect to a strictly proper multivariate scoring rule, the aforementioned risk that B outperforms A in the application is very limited, provided the stochastic optimization problem is continuous in the stochastic argument. Unfortunately, this only holds for strictly proper multivariate scoring rules. For proper scoring rules which identify only some characteristics of the full predictive distribution, this is certainly not true. The range of available strictly proper scoring rules is very limited, and reduces basically to the energy score for our ensemble forecasting problem, see e.g. Lerch et al. [60].

[Figure 10: An outline of the ensemble forecasting exercise. Timeline from the start of trading (15:00 on day $d - 1$) through the time of forecasting (3h5m before delivery) and the successive 5-minute price indices ${}_{185}\mathrm{ID}_5^{d,s}, {}_{180}\mathrm{ID}_5^{d,s}, {}_{175}\mathrm{ID}_5^{d,s}, {}_{170}\mathrm{ID}_5^{d,s}, \ldots, {}_{30}\mathrm{ID}_5^{d,s}$ to delivery.]
Therefore, we also consider proper scoring rules which might allow further insights as they focus on specific characteristics of the full distribution. To draw statistically significant conclusions on the outperformance of the forecasts of the considered models we also utilize the Diebold and Mariano [61] test.

As mentioned, the only available multivariate strictly proper scoring rule is the energy score (ES) [58]. (Footnote: The multivariate log-score is also a known strictly proper scoring rule for multivariate distributions. However, it requires that the underlying multivariate distribution is continuous and has a density. Due to the no-trade events this is not satisfied for our forecasting problem.) We compute the ES loss function in the following way

$$\mathrm{ES}^{d,s} = \mathrm{ED}^{d,s} - \mathrm{EI}^{d,s} \tag{16}$$

where

$$\mathrm{ED}^{d,s} = \frac{1}{M} \sum_{j=1}^{M} \left\| P^{d,s} - \hat{P}_j^{d,s} \right\|_2 \tag{17}$$

and

$$\mathrm{EI}^{d,s} = \frac{1}{M(M-1)} \sum_{j=1}^{M} \sum_{i=j+1}^{M} \left\| \hat{P}_j^{d,s} - \hat{P}_i^{d,s} \right\|_2 \tag{18}$$

with $P^{d,s} = (P_1^{d,s}, P_2^{d,s}, \ldots, P_T^{d,s})$ and $\hat{P}_j^{d,s} = (\hat{P}_{1,j}^{d,s}, \hat{P}_{2,j}^{d,s}, \ldots, \hat{P}_{T,j}^{d,s})$. The ED component measures the distance between the simulated trajectories and the observed prices. On the other hand, $\mathrm{EI}^{d,s}$ measures the spread between the simulations. To calculate the overall energy score we use an average

$$\mathrm{ES} = \frac{1}{SN} \sum_{s=1}^{S} \sum_{d=1}^{N} \mathrm{ES}^{d,s}. \tag{19}$$

We mentioned that the ES evaluates the full predictive distribution, which includes the path dependency in the generated scenarios. To illustrate the appropriateness of the energy score for correctly evaluating an ensemble forecasting study, we perform a short experiment in the results section. We take the best performing model and modify it with 3 different copulas which we refer to as maximum dependency, minimum dependency and independence. For the maximum dependency copula we consider the $T$-dimensional co-monotonicity copula defined by $M_{\max}(u) = \min(u_1, u_2, \ldots, u_T)$. The minimum dependency copula $M_{\min}$ is constructed using pairwise bivariate counter-monotonicity copulas defined by $W(u_1, u_2) = \max(u_1 + u_2 - 1, 0)$. So $M_{\min}$ is the copula that is associated with the $T$-dimensional uniform random variable $U = (U_1, \ldots, U_T)$ that satisfies $(U_t, U_{t+1}) \sim W$ for all $t = 1, \ldots, T - 1$. The $T$-dimensional independence copula is defined by $M_{\mathrm{ind}}(u) = \prod_{t=1}^{T} u_t$. The 3 new models are evaluated using all considered measures and compared to the original one.

As pointed out by Pinson and Girard [59], the ES does not evaluate the ability of the trajectories to mimic specific characteristics of the stochastic process. Therefore, we also focus our evaluation on specific characteristics of the underlying multivariate distribution. For this purpose, we additionally consider the following proper scoring rules. We utilize the mean absolute error (MAE), the root mean squared error (RMSE) and the pinball score (PB) to evaluate the median, mean and selected quantile trajectories, respectively, see e.g. [62]. For the evaluation of the marginal density fit of our scenarios, we consider the continuous ranked probability score (CRPS) and additionally the empirical coverage of specific prediction intervals [58]. Moreover, we consider the variogram score (VS) and the Dawid-Sebastiani score (DSS), which are regularly used to evaluate multivariate distributions [60]. Note that both measures are only proper scoring rules and correct model identification fails in general, see e.g. [63] for empirical examples. Other scoring rules evaluating e.g. marginal distribution characteristics or specific events might also be added if relevant for the desired application.

The RMSE is the optimal least squares measure, i.e. it is the strictly proper scoring rule for mean evaluation, while the MAE is strictly proper for median evaluation. They are widely used both by researchers and practitioners.
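Their formulas are given by

$$\mathrm{RMSE} = \sqrt{ \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \left( P_t^{d,s} - \frac{1}{M} \sum_{j=1}^{M} \hat{P}_{t,j}^{d,s} \right)^2 } \tag{20}$$

and

$$\mathrm{MAE} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \left| P_t^{d,s} - \operatorname{med}_{j=1,\ldots,M}\left( \hat{P}_{t,j}^{d,s} \right) \right| \tag{21}$$

where $\hat{P}_{t,j}^{d,s}$ is the $j$-th simulation of $P_t^{d,s}$ and $\operatorname{med}_{j=1,\ldots,M}(\hat{P}_{t,j}^{d,s})$ is the median of the $M$ simulated $\hat{P}_{t,j}^{d,s}$ prices.

We approximate the CRPS using the pinball loss

$$\mathrm{CRPS}_t^{d,s} = \frac{1}{R} \sum_{\tau \in r} \mathrm{PB}_{t,\tau}^{d,s} \tag{22}$$

for a dense equidistant grid of probabilities $r$ between 0 and 1 of size $R$, see e.g. [7]. In this study, we consider $r = \{0.01, 0.02, \ldots, 0.99\}$ of size $R = 99$. $\mathrm{PB}_{t,\tau}^{d,s}$ is the pinball loss with respect to probability $\tau$. Its formula is given by

$$\mathrm{PB}_{t,\tau}^{d,s} = \left( \tau - \mathbb{1}_{\left\{ P_t^{d,s} < Q^{\tau}_{j=1,\ldots,M}(\hat{P}_{t,j}^{d,s}) \right\}} \right) \left( P_t^{d,s} - Q^{\tau}_{j=1,\ldots,M}\left( \hat{P}_{t,j}^{d,s} \right) \right) \tag{23}$$

where $Q^{\tau}_{j=1,\ldots,M}(\hat{P}_{t,j}^{d,s})$ is the $\tau$-th quantile of the $M$ simulated $\hat{P}_{t,j}^{d,s}$ prices. To calculate the overall CRPS value we use a simple average

$$\mathrm{CRPS} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \mathrm{CRPS}_t^{d,s}. \tag{24}$$

We can also use the pinball loss to compare the models' performance at particular quantiles. For this purpose the following formula is used

$$\mathrm{PB}_\tau = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \mathrm{PB}_{t,\tau}^{d,s}. \tag{25}$$

As mentioned, we also use the empirical coverage of prediction intervals, precisely the central intervals bounded by the $(1-\tau)/2$ and $(1+\tau)/2$ quantiles. The $\tau$%-coverage is calculated using the following formula

$$\tau\%\text{-cov} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{t=1}^{T} \mathbb{1}_{\left\{ Q^{(1-\tau)/2}_{j=1,\ldots,M}(\hat{P}_{t,j}^{d,s}) < P_t^{d,s} < Q^{(1+\tau)/2}_{j=1,\ldots,M}(\hat{P}_{t,j}^{d,s}) \right\}} \tag{26}$$

where $\tau \in \{0.5, 0.9, 0.99\}$.

The variogram score was introduced by Scheuerer and Hamill [64] in a probabilistic forecasting exercise for meteorological data. We compute it by

$$\mathrm{VS} = \frac{1}{SNT} \sum_{s=1}^{S} \sum_{d=1}^{N} \sum_{i=1}^{T} \sum_{j=1}^{T} \left( \left| P_i^{d,s} - P_j^{d,s} \right|^{p} - \frac{1}{M} \sum_{k=1}^{M} \left| \hat{P}_{i,k}^{d,s} - \hat{P}_{j,k}^{d,s} \right|^{p} \right)^2 \tag{27}$$

where $p$ is the order of the variogram score.

The Dawid-Sebastiani score evaluates the first and second moments [62]. It corresponds to the log-score of the multivariate normal distribution.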
We calculate it by

$$\mathrm{DSS} = \frac{1}{SN} \sum_{s=1}^{S} \sum_{d=1}^{N} \left[ \log\det \hat{\Sigma}^{d,s} + \left( P^{d,s} - \hat{\mu}^{d,s} \right)' \left( \hat{\Sigma}^{d,s} \right)^{-1} \left( P^{d,s} - \hat{\mu}^{d,s} \right) \right] \tag{28}$$

where $\hat{\mu}^{d,s}$ and $\hat{\Sigma}^{d,s}$ are the sample mean vector and the sample covariance matrix of the predicted price ensemble.

However, the aforementioned measures do not allow us to draw conclusions regarding statistical significance. To do so we utilize the Diebold and Mariano [61] test, which tests the forecasts of model A against those of model B. The DM test is mostly used to evaluate point forecasts, but with a correctly defined loss differential series it can be successfully applied in the evaluation of probabilistic forecasts. We derive the series using the ES and CRPS, as was already done by e.g. Muniain and Ziel [8] and Lerch et al. [60].

We utilize the multivariate version of the DM test as in Ziel and Weron [51]. The multivariate DM test results in one statistic for each model, which is computed based on the $S$-dimensional vector of losses per day. Therefore, denote by $L_A^d = (L_A^{d,1}, L_A^{d,2}, \ldots, L_A^{d,S})'$ and $L_B^d = (L_B^{d,1}, L_B^{d,2}, \ldots, L_B^{d,S})'$ the vectors of out-of-sample losses for day $d$ of models A and B, respectively. By $L_Z^{d,s}$ we mean the ES and CRPS losses of model $Z$; formally we choose

$$L_Z^{d,s} = \mathrm{ES}_Z^{d,s} \quad \text{and} \quad L_Z^{d,s} = \mathrm{CRPS}_Z^{d,s} = \frac{1}{T} \sum_{t=1}^{T} \mathrm{CRPS}_{Z,t}^{d,s}. \tag{29}$$

The multivariate loss differential series

$$\Delta_{A,B}^d = \|L_A^d\|_1 - \|L_B^d\|_1 \tag{30}$$

defines the difference of losses in the $\|\cdot\|_1$ norm. For each model pair, we compute the p-value of two one-sided DM tests. The first one is with the null hypothesis $H_0: E(\Delta_{A,B}^d) \le 0$, that is to say the outperformance of the forecasts of model B by the ones of model A. The second test is with the reverse null hypothesis $H_0: E(\Delta_{A,B}^d) \ge 0$. Let us note that these tests are complementary, and we assume that the loss differential series is covariance stationary.
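The one-sided test on the loss differential series in equation (30) can be sketched as follows. This is a simplified Python illustration rather than the procedure used in the study: the function name `dm_test` is hypothetical, the variance of the mean is estimated from the plain sample variance without any autocorrelation correction, and the p-value uses the standard normal approximation.

```python
import math


def dm_test(losses_a, losses_b):
    """One-sided DM-type test of H0: E(Delta_{A,B}) <= 0.

    losses_a, losses_b -- lists over days; each entry is the vector of
                          per-product losses of model A or B on that day.
    Returns the test statistic and the p-value. A small p-value speaks
    against H0, i.e. it indicates that model B has significantly lower
    losses than model A.
    """
    # Loss differential per day: difference of the 1-norms of the loss vectors.
    deltas = [
        sum(abs(v) for v in la) - sum(abs(v) for v in lb)
        for la, lb in zip(losses_a, losses_b)
    ]
    n = len(deltas)
    mean = sum(deltas) / n
    # Assumption: lag-0 sample variance only (no HAC correction).
    var = sum((d - mean) ** 2 for d in deltas) / (n - 1)
    stat = mean / math.sqrt(var / n)
    # Upper-tail probability of the standard normal distribution.
    p_value = 1.0 - 0.5 * (1.0 + math.erf(stat / math.sqrt(2.0)))
    return stat, p_value
```

Calling the function with the two models swapped gives the second, reverse test, so each model pair yields two p-values.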
6 Results

We divided this section into two subsections: in the first one, we inspect the in-sample characteristics and in the second one, we present the out-of-sample simulation results.

Table 1: Initial in-sample coefficient values of model Mix.t.mu.sigma reported for every hourly product. The price and generation variables were scaled for better clarity. The p-value of the test for significance of the values is indicated by the colour. The legend is explained by the table in the bottom.

$g_1(\mu_t^{d,s})$  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
$\Delta P_{t-1}^{d,s}$  -0.09 -0.09 -0.11 -0.09 -0.11 -0.08 -0.06 -0.02 0.02 0.07 0.06 0.07 0.04 0.03 -0.01 0 -0.01 0.02 0.01 0.03 -0.02 -0.03 -0.03 -0.07
$\Delta P_{t-2}^{d,s}$  -0.05 -0.06 -0.04 -0.07 -0.05 -0.05 -0.04 -0.04 -0.02 0 0.01 0.02 0.01 0 0 0 -0.03 -0.01 0 -0.03 -0.01 -0.01 -0.02 -0.05
$\Delta P_{t-3}^{d,s}$  -0.02 -0.03 -0.03 -0.03 -0.03 -0.04 -0.01 -0.01 0 0.01 0.02 0.01 0.01 0.01 0.01 0 0 0 0 -0.01 -0.01 0 0.01 -0.02

$g_2(\sigma_t^{d,s})$  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Intercept  0.49 0.52 0.5 0.42 0.47 0.56 0.83 0.66 0.55 0.29 0.45 0.48 0.36 0.31 -1.78 -1.75 -1.65 0.39 0.62 0.36 0.36 0.59 0.61 0.51
$|\Delta P_{t-1}^{d,s}|$  0.28 0.27 0.28 0.26 0.26 0.23 0.27 0.22 0.21 0.18 0.2 0.18 0.21 0.27 1.8 2.39 2.74 0.24 0.24 0.23 0.31 0.27 0.29 0.27
$|\Delta P_{t-2}^{d,s}|$  0.09 0.13 0.12 0.15 0.16 0.16 0.14 0.08 0.1 0.07 0.1 0.1 0.12 0.14 -0.06 0 1.27 0.11 0.14 0.13 0.14 0.12 0.14 0.13
$|\Delta P_{t-3}^{d,s}|$  0.07 0.08 0.05 0.08 0.06 0.08 0.06 0.08 0.08 0.05 0.06 0.05 0.05 0.09 1.1 -0.15 0.2 0.08 0.05 0.04 0.06 0.08 0.03 0.07
$|\Delta P_{t-4}^{d,s}|$  0.03 0.02 0.09 0.03 0.07 0.06 0.03 0.04 0.03 0.04 0.05 0.04 0.05 0.07 0.02 0 -0.41 0.05 0.05 0.03 0.03 0.06 0.05 0.04
$|\Delta P_{t-5}^{d,s}|$  0.04 0.02 0.02 0.07 0.03 0.05 0.02 0.04 0.04 0.02 0.03 0.04 0.03 0.05 -0.14 0 0.28 0.04 0.04 0.03 0.05 0.05 0.03 0.06
$|\Delta P_{t-6}^{d,s}|$  0.04 0.04 0.04 0.04 0.04 0.04 0.05 0.03 0.03 0.03 0.06 0.06 0.03 0.05 0.02 0.01 0.02 0.04 0.02 0.04 0.04 0.05 0.03 0.05
$\sum_{j=7}^{12}|\Delta P_{t-j}^{d,s}|$  0.06 0.08 0.07 0.09 0.14 0.09 0.09 0.1 0.05 0.05 0.06 0.05 0.1 0.08 -0.01 0 -0.01 0.08 0.08 0.1 0.11 0.15 0.1 0.13
$\mathrm{DA}_{\mathrm{Load}}^{d,s}$  -0.02 -0.03 -0.02 -0.03 0 0.02 0.05 -0.01 -0.02 -0.02 -0.01 0.02 0.03 -0.01 -0.03 -0.04 -0.02 -0.03 -0.04 0 -0.01 0 0.01 0.03
$\mathrm{DA}_{\mathrm{Sol}}^{d,s}$  0 0 0 0 -0.02 -0.04 -0.02 0.01 0.02 0.01 0.02 0.04 0.03 0.02 0.03 0.03 0.02 -0.01 -0.04 -0.03 -0.04 -0.03 -0.03 0
$\mathrm{DA}_{\mathrm{WiOn}}^{d,s}$  0.13 0.15 0.11 0.11 0.1 0.08 0.1 0.09 0.09 0.1 0.08 0.11 0.08 0.09 0.11 0.14 0.1 0.09 0.1 0.12 0.11 0.1 0.11 0.11
$\mathrm{DA}_{\mathrm{WiOff}}^{d,s}$  0.02 0.03 0.07 0.04 0.02 0.02 -0.01 0 0.01 0.03 0.03 0.03 0.03 0.01 0.02 0.02 0.03 -0.01 0 -0.01 0 0 0.01 0
Mon(d)  0.01 0.05 0.05 0.1 0.12 0.21 0.15 0.1 0.02 0.06 0.02 0.03 0.01 -0.01 0 -0.03 -0.08 -0.1 -0.05 -0.03 -0.06 -0.05 0.01 0
Sat(d)  -0.06 0.06 0.05 0.02 0.06 0.06 0.01 0.03 0.04 0.06 0.15 0.24 0.26 0.13 0.16 0.23 0.19 0.15 0.12 0.16 0.12 0.11 0.1 0.04
Sun(d)  0.11 0.07 0.18 0.09 0.08 0.06 -0.01 0.01 0.02 0.15 0.16 0.24 0.3 0.2 0.2 0.27 0.34 0.21 0.21 0.24 0.2 0.12 0.13 0.04
$\alpha_{t-1}^{d,s}$  -0.45 -0.44 -0.45 -0.39 -0.36 -0.39 -0.5 -0.4 -0.42 -0.38 -0.44 -0.35 -0.34 -0.37 -0.38 -0.34 -0.38 -0.4 -0.36 -0.32 -0.4 -0.4 -0.5 -0.39
$\alpha_{t-2}^{d,s}$  -0.18 -0.19 -0.18 -0.16 -0.17 -0.19 -0.22 -0.15 -0.15 -0.18 -0.15 -0.13 -0.15 -0.08 -0.04 -0.05 -0.17 -0.12 -0.2 -0.12 -0.13 -0.18 -0.16 -0.18

$g_3(\nu_t^{d,s})$  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Intercept  0.53 0.74 0.69 0.86 0.84 0.52 0.56 0.7 1.11 1.25 1.52 1.37 1.41 1.37 1.07 0.87 0.93 1.06 1.24 1.18 1.02 0.74 0.76 0.57

[Colour legend: significance level from 0% to 10% in steps of 1%.]

6.1 In-sample characteristics

We start our study with an analysis of the initial in-sample characteristics. Table 1 shows the estimated coefficient values of model Mix.t.mu.sigma based on the initial in-sample data. The table reports the values for every hourly product, and it is split into 3 sub-tables, each regarding a different parameter of the t-distribution. The first sub-table presents the coefficients of the model described by equation (8).
Variable $\Delta P_{t-1}^{d,s}$ appears to be statistically significant for most of the hours. However, raising the lag decreases the significance. This behaviour points in the direction of the weak-form efficiency concluded by Narajewski and Ziel [36].

The second sub-table shows the coefficients of the model presented in equation (9). Here, we see that the variables using lagged absolute price differences are mostly significantly different from 0. Moreover, the coefficients of $|\Delta P_{t-1}^{d,s}|$ and $|\Delta P_{t-2}^{d,s}|$ are relatively high. Surprisingly, the day-ahead forecast of total load is mostly irrelevant. The day-ahead forecast of solar generation is significant mainly during the day-peak and in the evening. The day-ahead forecast of wind onshore generation appears to have a big positive impact on the volatility of price differences, in contrast to the wind offshore forecast. The behaviour of the weekday dummies sheds some light on our mixed expectations: they indicate a different behaviour of traders on Monday at night and on Saturday and Sunday during the day. On weekends the volatility is higher, likely due to higher bid-ask spreads. The lagged values of $\alpha^{d,s}$ have a significant, negative impact on the volatility of price differences. It means that if there was trading at times $t - 1$ and $t - 2$, then the standard deviation would be lower. Let us note that the values of the coefficients are very similar among all hours except hours 14 to 16. For these hours the estimates of the intercept and the absolute price differences deviate heavily from the estimates of the remaining hours. A possible reason may be a few extreme outliers which were observed for these hours but not for the others. The table presents no values for the P-splines, because they are non-parametric functions.

The last sub-table shows the estimated values for $g_3(\nu_t^{d,s})$, which we assumed to be constant. We show it anyway to gain an insight into the magnitude of the degrees of freedom. Let us recall that $g_3^{-1}(x) = \exp(x) + 2$.
Applying this to the estimates results in values of $\nu^{d,s}$ between around 3.7 and 6.6. Thus, the innovations are not extremely heavy-tailed, and it is reasonable to apply asymptotic statistics for validation and interpretation.

Figure 11 shows the initial in-sample P-splines $h_1(P^{d,s}_{t-1})$ and $h_2(t)$. We see that for both variables the smoothing functions are non-linear. Extreme values of the most recent price $P^{d,s}_{t-1}$ result in most cases in a sharp rise of volatility. On the other hand, values between 0 and 50 EUR/MWh have a rather marginal impact on the variance of the price differences. An interesting effect can be seen in Figure 11b. Until 60 minutes before the delivery, the impact on the volatility is at a similar, negative level among all products. Then, in the last 30 minutes of trading, the volatility rises substantially above zero. This behaviour can be misinterpreted as a result of the closure of XBID as in Figure 2. However, this plot is based on the initial in-sample data, i.e. the data between 16th July 2015 and 14th July 2016. Therefore, the effect of XBID could not be in the data, as it was introduced on 18th June 2018. Figure 11c is analogous to Figure 11b, but based on the first year of XBID, i.e. the data between 18th June 2018 and 17th June 2019. Comparing the two figures, we conclude that the introduction of XBID has an impact on the volatility of the price differences, decreasing it even further before the XBID closure and raising it even higher just after it. Interestingly, this is contrary to the paper of Kath [5], who concluded that there is no evidence for the influence of XBID on the price volatility.

Figure 12 presents a price trajectory and a decomposition of the fitted $g_2(\sigma^{d,s}_t)$ of the product with delivery at 12:00 for the last 7 days of in-sample data. For the sake of readability, we grouped the components of the model for the standard deviation similarly as in equation (9).
Let us note that the absolute price differences and the fundamental regressors have a big, positive impact on the volatility of price differences. We also observe overall higher volatility on the weekend, i.e. the second and third day on the plot, than during the week. Note that in this specific example the impact of the non-linear price effect $h_1$ looks rather negligible. However, the price level in these seven trading sessions is always between 0 and 40 EUR/MWh, where we expect minor impacts.

Figure 11: Initial in-sample smoothing effects of variables (a) $P^{d,s}_{t-1}$ and (b) time to maturity in the model described in equation (9). (c) is analogous to (b), but as in-sample considering the first year of XBID. Note that the support of $P^{d,s}_{t-1}$ differs among products. [Axes: (a) $h_1(P^{d,s}_{t-1})$ against $P^{d,s}_{t-1}$; (b) and (c) $h_2(t)$ against time to delivery (minutes). One curve per hourly product, 00:00 to 23:00.]

6.2 Out-of-sample simulation

Now we turn to the analysis of the simulated trajectories. Figure 13 shows the first out-of-sample simulation exercise for the prices of the product with delivery at 12:00 on 15.07.2016. The trajectories are simulated from the Mix.t.mu.sigma model and can be easily compared to the simulations from the Gaussian random walk presented in Figure 1. It is clear that in this example the trajectories of the mixture model are less volatile than those of the random walk.

Table 2 shows the values of the utilized error measures. The Naive model performs very well overall. Moreover, it gives the best results in terms of 50% coverage. MV.t, which assumes the trajectories to follow a multivariate t-distribution, gives results very similar to the Naive model. Looking at the performance of MV.N, we see that the t-distribution is indeed better at modelling the trajectories.
The LQR-based models are, according to most of the considered measures, worse than the Naive or MV.t. It is worth emphasizing their very poor ability to model the mean and median trajectories and, on the other hand, their quite good 99% coverage and values of VS and DSS. Let us note the very bad performance of the Gaussian random walk. Model RW.N is clearly the worst. Looking at its coverage values, we conclude that its simulations are too volatile.

Figure 12: Price trajectory (top) and decomposition of the fitted $g_2(\sigma^{d,s}_t)$ (bottom) of the hourly product with delivery at 12:00 for 7 consecutive in-sample days. The end of trading and the time of forecasting are indicated by the dashed black and grey lines, respectively. [Legend: Intercept, $|\Delta P^{d,s}_{t-i}|$, No-trade indicator, DA.Load, DA.Solar, DA.Wind, Weekday dummies, $h_1(P^{d,s}_{t-1})$, $h_2(t)$. x-axis: time (year 2016), 08/07 08:55 to 14/07 08:55.]

Model                  ES     CRPS       VS      DSS      MAE     RMSE  50%-cov  90%-cov  99%-cov
Naive              16.428    1.179   10.521    77.33    3.075    5.805    0.490   0.8919   0.9835
MV.N               17.104    1.226   12.905    76.50    3.087    5.810   0.6712   0.9247   0.9723
MV.t               16.454    1.181   10.524    71.66    3.081    5.809   0.5115   0.8948   0.9853
RW.N               18.898    1.400   19.965    75.94    3.100    5.815   0.8025   0.9668   0.9887
RW.t               16.999    1.234   11.218    76.77    3.086    5.815   0.6992   0.9578   0.9952
RW.t.mix.D         17.168    1.248   11.385    77.01    3.086    5.814   0.7198   0.9615   0.9949
LQR.Gauss          16.536    1.186    9.923    61.36    3.162    5.966   0.5588   0.9204   0.9875
LQR.ind            16.595    1.191   10.308    58.41    3.168    5.970   0.5789   0.9283   0.9884
Mix.RW.t           17.092    1.239   11.377    73.97    3.085    5.815   0.6924   0.9569   0.9942
Mix.t.mu           17.284    1.255   11.642    74.52    3.086    5.815   0.7117   0.9620   0.9950
Mix.t.sigma        15.965    1.144    9.444    54.42    3.075    5.814   0.5331   0.9167   0.9907
Mix.t.mu.sigma     15.956    1.144    9.405    53.15    3.073    5.804   0.5482   0.9277   0.9928

Table 2: Error measures of the considered models. Colour indicates the performance columnwise (the greener, the better). With bold, we depicted the best values in each column.
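To make the coverage columns of Table 2 concrete, the following sketch computes an empirical coverage on synthetic data. It is only an illustration under stated assumptions: the central interval is taken from empirical ensemble quantiles, the data are Gaussian toy data, and the helper name `empirical_coverage` is hypothetical; the exact coverage definition used in the study may differ.

```python
import numpy as np


def empirical_coverage(ensembles, observations, level):
    """Share of observations inside the central `level` interval of their ensemble.

    ensembles: array (N, M) with M simulated values for each of N forecast cases.
    observations: array (N,) with the realized values.
    """
    lower_q = (1.0 - level) / 2.0
    lower = np.quantile(ensembles, lower_q, axis=1)
    upper = np.quantile(ensembles, 1.0 - lower_q, axis=1)
    inside = (observations >= lower) & (observations <= upper)
    return float(inside.mean())


rng = np.random.default_rng(1)
n_cases, n_members = 2000, 500
observations = rng.normal(0.0, 1.0, size=n_cases)

# A well-calibrated ensemble and one whose spread is twice too large.
calibrated = rng.normal(0.0, 1.0, size=(n_cases, n_members))
too_volatile = rng.normal(0.0, 2.0, size=(n_cases, n_members))

cov_calibrated = empirical_coverage(calibrated, observations, 0.5)
cov_too_volatile = empirical_coverage(too_volatile, observations, 0.5)
print(f"50%-coverage, calibrated:   {cov_calibrated:.3f}")
print(f"50%-coverage, too volatile: {cov_too_volatile:.3f}")
```

An ensemble that is too dispersed produces intervals that are too wide, so its 50%-coverage lies well above 0.5. This is the pattern behind reading a value such as 0.8025 for RW.N as a sign of overly volatile simulations.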
Figure 13: Price trajectory for the hourly product with delivery on 15.07.2016 at 12:00. The black part is the realization and the colourful part consists of 100 simulations from the Mix.t.mu.sigma. Time of forecasting is indicated by the green dashed line. [x-axis: time to delivery (minutes); y-axis: ID price of the hourly product with delivery at 12:00.]

The introduction of the t-distribution to the random walk already yields a big improvement. Another step in our modelling, the usage of a simple mixture of the Dirac distribution and the random walk with innovations from the t-distribution, does not improve the results. However, the next step, i.e. modelling the probability $\pi^{d,s}_t$ with model (5), improves the results, but they are still not better than the ones of model RW.t. Moreover, modelling the expected value as in equation (8) also worsens the performance substantially. All these models are clearly worse than the Naive considering almost every measure. The last change to the mixture model, i.e. modelling the standard deviation according to the formula in equation (9), lowers the errors significantly. Modelling the expected value in addition to the standard deviation brings a small improvement. Model Mix.t.mu.sigma is marginally better than Mix.t.sigma and turns out to be the best model in terms of ES, CRPS, VS, DSS, MAE and RMSE. Somewhat disturbing are the values of the 50%- and 90%-coverage, which are too high for the mixture models. This means that it is very likely that the results can still be improved. On the other hand, they capture the behaviour in the tails better than the Naive model.

The values of the error measures in Table 2 may suggest that both ES and CRPS evaluate the same thing, namely the marginal distribution. To emphasize that ES also evaluates the quality of the generated scenarios, we perform a small experiment on the model Mix.t.mu.sigma.
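The idea behind such an experiment can be sketched on synthetic data. The block below reorders the ensemble members separately at every time step, which leaves each marginal sample untouched but changes the dependence between time steps. It is a rough illustration under stated assumptions: the scores use the common sample-based estimators, which may differ from the exact definitions in this study; the trajectories are a toy random walk; all names are hypothetical; and only a comonotone ("maximum dependency") and an independent reordering are shown.

```python
import numpy as np


def energy_score(ensemble, observation):
    """Sample-based energy score for one multivariate observation.

    ensemble: array (M, H) with M simulated trajectories of length H.
    observation: array (H,), the realized trajectory.
    """
    to_obs = np.linalg.norm(ensemble - observation, axis=1).mean()
    pairwise = np.linalg.norm(
        ensemble[:, None, :] - ensemble[None, :, :], axis=2
    ).mean()
    return float(to_obs - 0.5 * pairwise)


def mean_crps(ensemble, observation):
    """Sample-based CRPS, computed per time step and averaged over the H steps."""
    to_obs = np.abs(ensemble - observation).mean(axis=0)
    pairwise = np.abs(ensemble[:, None, :] - ensemble[None, :, :]).mean(axis=(0, 1))
    return float((to_obs - 0.5 * pairwise).mean())


rng = np.random.default_rng(0)
n_members, horizon = 200, 12

# Synthetic random-walk trajectories with t-distributed steps (illustrative only).
original = rng.standard_t(df=5, size=(n_members, horizon)).cumsum(axis=1)
observation = rng.standard_t(df=5, size=horizon).cumsum()

# Reorder members separately at every time step; the marginal samples stay the same.
max_dependency = np.sort(original, axis=0)  # comonotone: ranks aligned across steps
independent = np.column_stack(
    [rng.permutation(original[:, h]) for h in range(horizon)]
)

scores = {
    name: (energy_score(ens, observation), mean_crps(ens, observation))
    for name, ens in [
        ("original", original),
        ("maximum dependency", max_dependency),
        ("independent", independent),
    ]
}
for name, (es, crps) in scores.items():
    print(f"{name:>20s}  ES={es:.3f}  CRPS={crps:.3f}")
```

Since the CRPS only looks at one time step at a time, it is identical for all three ensembles up to floating-point error, whereas the energy score depends on how the members are joined into trajectories.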
In Table 3, we compare the performance of Mix.t.mu.sigma with copies of it modified using 3 copulas: maximum dependency, minimum dependency and independent. This results in the same marginal distributions, mean and median trajectories, and coverage values, but in completely different ensembles. This is reflected in the values of the measures: the CRPS remains unchanged, while the ES, VS and DSS change drastically. Let us note the enormous deterioration of the DSS, which is particularly sensitive to changes of the dependency structure.

Copula                     ES     CRPS       VS      DSS      MAE     RMSE  50%-cov  90%-cov  99%-cov
original               15.956    1.144    9.405    53.15    3.073    5.804   0.5482   0.9277   0.9928
maximum dependency     31.625    1.144   10.274  9299.60    3.073    5.804   0.5482   0.9277   0.9928
minimum dependency     33.263    1.144   16.918 27205.49    3.073    5.804   0.5482   0.9277   0.9928
independent            16.925    1.144   14.350   110.02    3.073    5.804   0.5482   0.9277   0.9928

Table 3: Error measures of 4 Mix.t.mu.sigma models with different copulas.

Figure 14 shows the models' performance over all products in terms of the energy score. A very interesting case is that of model RW.N. Usually it is not much worse than the other random walks, but for hours 14-16 the error explodes. A look into the data explains the situation clearly: there were a few in-sample observations of extreme price differences. The normal distribution assumes exponentially decaying tails, and thus the model overestimated the variance. This clearly indicates that the t-distribution is better suited for this purpose, as it was unaffected by these events. Furthermore, we observe that models with modelled $\sigma^{d,s}_t$ are uniformly better than the others.

Figure 14: Energy score (left) and its ratio to the Naive (right) over 24 hourly products. The right graph is shown without RW.N for better clarity.

Figure 15 presents the models' performance over time to delivery. The values rise as time goes on, which is not surprising. An interesting behaviour can be observed from 150 to 100 minutes before the delivery. In this time range the errors of the random walk models and of the mixture models that assume constant standard deviation rise significantly in comparison to the other models. It is also the time of decreasing volatility in Figure 11b.

Figure 15: Continuous ranked probability score (left) and its ratio to the Naive (right) over time to delivery. The right graph is shown without RW.N for better clarity.

Pinball score values over quantiles $\tau$ are depicted in Figure 16. Let us note that the gain from the forecasting of central quantiles is marginal, which is in line with other studies regarding the ID$_3$-Price in the German intraday market [36, 39]. On the other hand, models Mix.t.sigma and Mix.t.mu.sigma gain a lot from the forecasting of quantiles outside the centre, performing especially well in the tails. In relation to the naive benchmark, the error is around 30% lower in the lower tail and around 25% lower in the upper tail. Let us also note that the LQR-based models give quite good results in the tails, but lose a lot in the centre, compared to the naive or to the models with non-constant $\sigma^{d,s}_t$.

Figure 16: Pinball score (left) and its ratio to the Naive (right) over quantiles $\tau \in r$. The right graph is shown without RW.N for better clarity.

Figure 17 shows the results of the DM test using two types of losses: the $\mathrm{ES}^{d,s}$ and the $\mathrm{CRPS}^{d,s}$.
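For orientation, a Diebold-Mariano-type comparison of two loss series can be sketched as follows. This is a simplified illustration and not the procedure of the study: it assumes a one-sided test with a normal approximation, applies no correction for autocorrelation in the loss differential, and runs on synthetic loss series; the function name `dm_test` is hypothetical.

```python
import math

import numpy as np


def dm_test(loss_a, loss_b):
    """One-sided Diebold-Mariano-type test that model A has lower expected loss than B.

    Returns (statistic, p_value); small p-values favour model A.
    Simplification: no correction for autocorrelation in the loss differential.
    """
    d = np.asarray(loss_a, dtype=float) - np.asarray(loss_b, dtype=float)
    n = d.size
    stat = float(d.mean() / math.sqrt(d.var(ddof=1) / n))
    # Standard normal CDF evaluated at the statistic.
    p_value = 0.5 * math.erfc(-stat / math.sqrt(2.0))
    return stat, p_value


rng = np.random.default_rng(2)
n_days = 500

# Synthetic daily losses: model B is worse than model A by about 0.5 on average.
loss_a = rng.gamma(shape=2.0, scale=1.0, size=n_days)
loss_b = loss_a + 0.5 + rng.normal(0.0, 0.2, size=n_days)

stat_ab, p_ab = dm_test(loss_a, loss_b)
stat_ba, p_ba = dm_test(loss_b, loss_a)
print(f"A better than B? stat={stat_ab:.2f}, p={p_ab:.4f}")
print(f"B better than A? stat={stat_ba:.2f}, p={p_ba:.4f}")
```

In a heat map like the one in Figure 17, each cell would hold one such p-value for a pair of models. The normal approximation relies on the loss differential being covariance stationary.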
Before applying the test, we conducted three tests on the multivariate loss differential series of each model pair (A, B). They evaluate the null hypothesis that a unit root is present in the series against the alternative that the data is stationary or trend-stationary. We used the Dickey-Fuller test [65], the Augmented Dickey-Fuller test [66], and the Phillips-Perron test [67]. In 99% of cases the obtained p-values were smaller than 0.01, so we reject the null hypothesis. This indicates in our case that the loss differential series is covariance stationary. Only for the differences with RW.N did the Dickey-Fuller test report no significance for rejecting the null hypothesis. This may be caused by the poor capturing of the marginal distribution by RW.N.

Figure 17: Results of the Diebold-Mariano test. (a) presents the p-values for the $\mathrm{ES}^{d,s}$ loss, (b) the values for the $\mathrm{CRPS}^{d,s}$ loss. The figures use a heat map to indicate the range of the p-values. The closer they are to zero (dark green), the more significant the difference is between forecasts of the X-axis model (better) and forecasts of the Y-axis model (worse). [Both axes list the twelve considered models; the colour scale covers p-values from 0% to 10%.]

The results of the DM test show that the difference between the forecasts of models Mix.t.mu.sigma and Mix.t.sigma is insignificant. Moreover, these models give better forecasts than all the other considered models. It is worth emphasizing the very good performance of the Naive model, which is not surprising after taking a look at Table 2.

7 Conclusion

We conducted an ensemble forecasting study in the German Intraday Continuous Market which is novel in two ways.
First, this study is the first to raise the issue of price trajectory simulation and ensemble forecasting in continuous intraday electricity markets. Second, the study uses a clever mixture of distributions that is fitted to the data. The results are very satisfying and show that it is possible to successfully model the volatility in the German Intraday Continuous Market. The study was carried out using data from the German market, but the generality of the method and the organization of the European electricity markets allow for an application to other markets, especially those participating in XBID, which covers exchanges like EPEX, Nordpool and OMIE.

Obviously, the proposed method can be developed further. One possible direction is using other external processes, like the traded volume or the price of nearby hours, as regressors. However, this is a non-trivial task and could easily lead to the accumulation of errors. Another possibility is the use of another probability distribution. The imperfect coverage of the best performing model indicates that there is still some room for improvement. This issue could be addressed with some post-processing method as well.

Acknowledgments

This research article was partially supported by the German Research Foundation (DFG, Germany) and the National Science Center (NCN, Poland) through BEETHOVEN grant no. 2016/23/G/HS4/01005.

References

[1] S. Goodarzi, H. N. Perera, and D. Bunn. The impact of renewable energy forecast errors on imbalance volumes and electricity spot prices. Energy Policy, 134:110827,
[2] C. Kath and F. Ziel. The value of forecasts: Quantifying the economic gains of accurate quarter-hourly electricity price forecasts. Energy Economics, 76:411–423, 2018.
[3] K. Maciejowska. Assessing the impact of renewable energy sources on the electricity price level and variability – A quantile regression approach. Energy Economics, 85:104532, 2020.
[4] J. Viehmann. State of the German Short-Term Power Market. Zeitschrift für Energiewirtschaft, 41(2):87–103, Jun 2017. ISSN 1866-2765. doi: 10.1007/s12398-017-0196-9. URL https://doi.org/10.1007/s12398-017-0196-9.
[5] C. Kath. Modeling intraday markets under the new advances of the cross-border intraday project (XBID): Evidence from the German intraday market. Energies, 12(22):4339, 2019.
[6] R. Weron. Electricity price forecasting: A review of the state-of-the-art with a look into the future. International Journal of Forecasting, 30(4):1030–1081, 2014.
[7] J. Nowotarski and R. Weron. Recent advances in electricity price forecasting: A review of probabilistic forecasting. Renewable and Sustainable Energy Reviews, 81:1548–1568,
[8] P. Muniain and F. Ziel. Probabilistic forecasting in day-ahead electricity markets: Simulating peak and off-peak prices. International Journal of Forecasting, 2020.
[9] G. Marcjasz, T. Serafin, and R. Weron. Selection of calibration windows for day-ahead electricity price forecasting. Energies, 11(9):2364, 2018.
[10] B. Uniejewski, G. Marcjasz, and R. Weron. On the importance of the long-term seasonal component in day-ahead electricity price forecasting: Part II – Probabilistic forecasting. Energy Economics, 79:171–182, 2019.
[11] T. Serafin, B. Uniejewski, and R. Weron. Averaging predictive distributions across calibration windows for day-ahead electricity price forecasting. Energies, 12(13):2561,
[12] Z. Yang, L. Ce, and L. Lian. Electricity price forecasting by a hybrid model, combining wavelet transform, ARMA and kernel-based extreme learning machine methods. Applied Energy, 190:291–305, 2017.
[13] D. Wang, H. Luo, O. Grunder, Y. Lin, and H. Guo. Multi-step ahead electricity price forecasting using a hybrid model based on two-layer decomposition technique and BP neural network optimized by firefly algorithm. Applied Energy, 190:390–407, 2017.
[14] J. Zhang, Z. Tan, and Y. Wei. An adaptive hybrid model for short term electricity price forecasting. Applied Energy, 258:114087, 2020.
[15] L. Xiao, W. Shao, M. Yu, J. Ma, and C. Jin. Research and application of a hybrid wavelet neural network model with the improved cuckoo search algorithm for electrical power system forecasting. Applied Energy, 198:203–222, 2017.
[16] P. Bento, J. Pombo, M. Calado, and S. Mariano. A bat optimized neural network and wavelet transform approach for short-term price forecasting. Applied Energy, 210:88–97, 2018.
[17] D. Keles, J. Scelle, F. Paraschiv, and W. Fichtner. Extended forecast methods for day-ahead electricity spot prices applying artificial neural networks. Applied Energy, 162:218–230, 2016.
[18] J. Lago, F. De Ridder, P. Vrancx, and B. De Schutter. Forecasting day-ahead electricity prices in Europe: the importance of considering market integration. Applied Energy, 211:890–903, 2018.
[19] F. Ocker and K.-M. Ehrhart. The "German Paradox" in the balancing power markets. Renewable and Sustainable Energy Reviews, 67:892–898, 2017.
[20] C. Koch and L. Hirth. Short-term electricity trading for system balancing: An empirical analysis of the role of intraday trading in balancing Germany's electricity system. Renewable and Sustainable Energy Reviews, 113:109275, 2019.
[21] F. Karanfil and Y. Li. The role of continuous intraday electricity markets: The integration of large-share wind power generation in Denmark. The Energy Journal, 38(2),
[22] K. Maciejowska, W. Nitka, and T. Weron. Day-ahead vs. Intraday – Forecasting the price spread to maximize economic benefits. Energies, 12(4):631, 2019.
[23] M. Narajewski and F. Ziel. Estimation and Simulation of the Transaction Arrival Process in Intraday Electricity Markets. Energies, 12(23):4518, 2019.
[24] R. Kiesel and F. Paraschiv. Econometric analysis of 15-minute intraday electricity prices. Energy Economics, 64:77–90, 2017.
[25] N. Graf von Luckner and R. Kiesel. Modeling market order arrivals on the intraday market for electricity deliveries in Germany with the Hawkes process. Available at SSRN, 2020.
[26] T. Rintamäki, A. S. Siddiqui, and A. Salo. Strategic offering of a flexible producer in day-ahead and intraday power markets. European Journal of Operational Research,
[27] R. Aïd, P. Gruet, and H. Pham. An optimal trading problem in intraday electricity markets. Mathematics and Financial Economics, 10(1):49–85, 2016.
[28] X. Ayón, M. A. Moreno, and J. Usaola. Aggregators' optimal bidding strategy in sequential day-ahead and intraday electricity spot markets. Energies, 10(4):450, 2017.
[29] S. Glas, R. Kiesel, S. Kolkmann, M. Kremer, N. G. von Luckner, L. Ostmeier, et al. Intraday renewable electricity trading: advanced modeling and numerical optimal control. Journal of Mathematics in Industry, 10(1):3, 2020.
[30] C. Pape, S. Hagemann, and C. Weber. Are fundamentals enough? Explaining price variations in the German day-ahead and intraday power market. Energy Economics, 54:376–387, 2016.
[31] M. Gürtler and T. Paulsen. The effect of wind and solar power forecasts on day-ahead and intraday electricity prices in Germany. Energy Economics, 75:150–162, 2018.
[32] M. Kremer, R. Kiesel, and F. Paraschiv. An Econometric Model for Intraday Electricity Trading. 2020.
[33] C. Monteiro, I. J. Ramirez-Rosado, L. A. Fernandez-Jimenez, and P. Conde. Short-Term Price Forecasting Models Based on Artificial Neural Networks for Intraday Sessions in the Iberian Electricity Market. Energies, 9(9):721, 2016.
[34] J. R. Andrade, J. Filipe, M. Reis, and R. J. Bessa. Probabilistic Price Forecasting for Day-Ahead and Intraday Markets: Beyond the Statistical Model. Sustainability, 9(11):1990, 2017.
[35] B. Uniejewski, G. Marcjasz, and R. Weron. Understanding intraday electricity markets: Variable selection and very short-term price forecasting using LASSO. International Journal of Forecasting, 35(4):1533–1547, 2019.
[36] M. Narajewski and F. Ziel. Econometric modelling and forecasting of intraday electricity prices. Journal of Commodity Markets, page 100107, 2019.
[37] G. Marcjasz, B. Uniejewski, and R. Weron. Beating the Naïve – Combining LASSO with Naïve Intraday Electricity Price Forecasts. Energies, 13(7):1667, 2020.
[38] I. Oksuz and U. Ugurlu. Neural network based model comparison for intraday electricity price forecasting. Energies, 12(23):4557, 2019.
[39] T. Janke and F. Steinke. Forecasting the price distribution of continuous intraday electricity trading. Energies, 12(22):4262, 2019.
[40] R. A. Rigby and D. M. Stasinopoulos. Generalized additive models for location, scale and shape. Journal of the Royal Statistical Society: Series C (Applied Statistics), 54(3):507–554, 2005.
[41] T. Hastie and R. Tibshirani. Generalized Additive Models, volume 43. CRC Press,
[42] A. Pierrot and Y. Goude. Short-term electricity load forecasting with generalized additive models. Proceedings of ISAP power, 2011, 2011.
[43] P. Gaillard, Y. Goude, and R. Nedellec. Additive models and robust aggregation for GEFCom2014 probabilistic electric load and electricity price forecasting. International Journal of Forecasting, 32(3):1038–1050, 2016.
[44] F. Serinaldi. Distributional modeling and short-term forecasting of electricity prices by generalized additive models for location, scale and shape. Energy Economics, 33(6):1216–1226, 2011.
[45] A. Gianfreda and D. Bunn. A stochastic latent moment model for electricity price formation. Operations Research, 66(5):1189–1203, 2018.
[46] E. Abramova and D. Bunn. Forecasting the Intra-Day Spread Densities of Electricity Prices. Energies, 13(3):687, 2020.
[47] R. Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1):267–288, 1996. ISSN 00359246. URL http://www.jstor.org/stable/2346178.
[48] D. M. Stasinopoulos, R. A. Rigby, et al. Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, 23(7):1–46, 2007.
[49] A. Misiorek, S. Trueck, and R. Weron. Point and interval forecasting of spot electricity prices: Linear vs. non-linear time series models. Studies in Nonlinear Dynamics & Econometrics, 10(3), 2006.
[50] B. Uniejewski, J. Nowotarski, and R. Weron. Automated variable selection and shrinkage for day-ahead electricity price forecasting. Energies, 9(8):621, 2016.
[51] F. Ziel and R. Weron. Day-ahead electricity price forecasting with high-dimensional structures: Univariate vs. multivariate modeling frameworks. Energy Economics, 70:396–420, 2018.
[52] F. Ziel. Forecasting electricity spot prices using lasso: On capturing the autoregressive intraday structure. IEEE Transactions on Power Systems, 31(6):4977–4987, 2016.
[53] J. Friedman, T. Hastie, and R. Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1, 2010.
[54] P. H. Eilers, B. D. Marx, and M. Durbán. Twenty years of P-splines. SORT: Statistics and Operations Research Transactions, 39(2):0149–186, 2015.
[55] C. Gilbert, J. Browell, and D. McMillan. Probabilistic access forecasting for improved offshore operations. International Journal of Forecasting, 2020.
[56] R. Koenker. quantreg: Quantile Regression, 2019. URL https://CRAN.R-project.org/package=quantreg. R package version 5.54.
[57] J. M. Hyman. Accurate monotonicity preserving cubic interpolation. SIAM Journal on Scientific and Statistical Computing, 4(4):645–654, 1983.
[58] T. Gneiting and A. E. Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007.
[59] P. Pinson and R. Girard. Evaluating the quality of scenarios of short-term wind power generation. Applied Energy, 96:12–20, 2012.
[60] S. Lerch, S. Baran, A. Möller, J. Groß, R. Schefzik, S. Hemri, and M. Graeter. Simulation-based comparison of multivariate ensemble post-processing methods. Nonlinear Processes in Geophysics, 27(2):349–371, 2020.
[61] F. Diebold and R. Mariano. Comparing Predictive Accuracy. Journal of Business & Economic Statistics, 13(3):253–63, 1995.
[62] T. Gneiting. Making and evaluating point forecasts. Journal of the American Statistical Association, 106(494):746–762, 2011.
[63] F. Ziel and K. Berk. Multivariate forecasting evaluation: On sensitive and strictly proper scoring rules. arXiv preprint arXiv:1910.07325, 2019.
[64] M. Scheuerer and T. M. Hamill. Variogram-based proper scoring rules for probabilistic forecasts of multivariate quantities. Monthly Weather Review, 143(4):1321–1334, 2015.
[65] D. A. Dickey and W. A. Fuller. Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74(366a):427–431, 1979.
[66] S. E. Said and D. A. Dickey. Testing for unit roots in autoregressive-moving average models of unknown order. Biometrika, 71(3):599–607, 1984.
[67] P. C. Phillips and P. Perron. Testing for a unit root in time series regression. Biometrika, 75(2):335–346, 1988.

Journal

Computing Research Repository – arXiv (Cornell University)

Published: May 4, 2020

References