
Temperature Field Data Reconstruction Using the Sparse Low-Rank Matrix Completion Method

Shan Wang (1), Jianhui Hu (1), Huiling Shan (1), Chun-Xiang Shi (2), and Weimin Huang (3)

(1) School of Information Engineering, East China Jiaotong University, Nanchang, China
(2) National Meteorological Information Center, Beijing, China
(3) Department of Electrical and Computer Engineering, Memorial University, St. John's, Canada

Research Article. Hindawi, Advances in Meteorology, Volume 2019, Article ID 3676182, 10 pages. https://doi.org/10.1155/2019/3676182

Correspondence should be addressed to Shan Wang; patrick_shan@163.com

Received 16 April 2019; Revised 15 September 2019; Accepted 9 October 2019; Published 3 November 2019

Academic Editor: Helena A. Flocas

Copyright © 2019 Shan Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Due to the limited number of weather stations and interruptions of data collection, temperature field data may be incomplete. In the past, spatial interpolation was usually used to fill the data gaps. However, interpolation does not work well when data are lost on a large scale. Matrix completion has emerged very recently and provides a global optimization for temperature field data reconstruction. A recovery method is proposed for improving the accuracy of temperature field data by using sparse low-rank matrix completion (SLR-MC). The method is tested using continuous gridded data provided by ERA Interim and the station temperature data provided by Jiangxi Meteorological Bureau. Experimental results show that the average signal-to-noise ratio can be increased by 12.5%, and the average reconstruction error is reduced by 29.3% compared with the matrix completion (MC) method.
1. Introduction

Temperature field data are measured at a height of at least 1.5 m above the ground. They are an important parameter for describing the environmental conditions of the land [1] and are widely used in weather forecasting. Early research on the urban thermal environment mainly relied on temperature data from meteorological stations [2]. The number of meteorological stations is often limited. In addition, the data are continuous in time but not in space. Given these characteristics, it is challenging to investigate temperature-related problems over a large area. Sparse temperature field data are highly correlated and have low spatial variability, so interpolation is usually used to obtain the temperature data missing in a region. So far, no research on the sparse property of continuous temperature data has been conducted.

With the rapid development of sparse representation, the matrix completion (MC) method [3–5], which extends the idea of compressed sensing to matrices, has been proposed recently. Matrix completion aims to recover a corrupted matrix from a small part of its entries. It is impossible to recover a corrupted matrix without any assumptions about the matrix. Candès and Recht [3] found that if the given matrix is low rank or approximately low rank, the missing entries of the corrupted matrix can be recovered by minimizing the matrix rank. Mathematically, for a corrupted matrix $M \in \mathbb{R}^{n_1 \times n_2}$, the low-rank matrix completion problem can usually be formulated as

$$\begin{aligned} &\text{minimize} && \operatorname{rank}(X), \\ &\text{subject to} && P_\Omega(X) = P_\Omega(M), \end{aligned} \tag{1}$$

where $\operatorname{rank}(X)$ denotes the rank of matrix $X$, $\Omega$ is the set of locations of the sampled (known) entries, and $P_\Omega$ is the sampling operator, which keeps only the entries indexed by $\Omega$.

Unfortunately, the matrix completion problem is NP-hard because the rank function is nonconvex and discontinuous. To address this, Candès and Tao proposed replacing the rank with the convex nuclear norm [6]. When the known entries are sampled randomly and uniformly from the unknown matrix, the missing entries can be recovered accurately if the matrix satisfies the low-rank structure and the incoherence condition [7]. The nuclear norm is the sum of the singular values and can be seen as a special case of the $\ell_1$ norm. It is widely adopted as a convex surrogate for the rank [8], and the resulting problem can be solved by convex optimization. Semidefinite programming (SDP) was first proposed to solve this convex problem. Since SDP has a high computational cost, several algorithms that are more computationally efficient than the SDP-based methods have been proposed, such as singular value thresholding (SVT) [9], singular value projection (SVP) [10], and the inexact augmented Lagrangian method (IALM) [11].

Unlike the interpolation methods presented in [12–14], matrix completion requires the corrupted matrix to be low rank, and it works well when a large portion of the data is lost. Taking advantage of the low rank and spatiotemporal correlation of a matrix, MC can achieve good interpolation performance. Compared with traditional spatial interpolation, MC makes good use of the correlation between the data and can reconstruct the global temperature field from only a few temperature field data. The quality of the reconstructed data is comparable to that of spatial interpolation. According to matrix completion theory, the temperature matrix needs to be low rank to obtain a good reconstruction. However, the temperature field data matrix does not have a stable rank, and its rank varies with time. We therefore arrange the gridded temperature field data into a new matrix whose rank is more stable. Although matrix completion can recover the incomplete temperature field data well, some information is still lost in the process. In [15], the data matrix is assumed to be the sum of a low-rank part and a sparse part, each of which can be recovered by solving a convenient convex program under suitable assumptions. To recover the sparse and low-rank components of a matrix efficiently, the alternating direction method (ADM) was proposed in [16]. The sparse part of gridded temperature field data, however, is well suited to compressed sensing (CS), because extensive spatiotemporal correlations result in sparser representations. The combination of compressed sensing and low-rank matrix completion is therefore an attractive proposition for further improving reconstruction.

In this paper, a method based on matrix completion and compressed sensing [17, 18] is presented and referred to as sparse low-rank matrix completion (SLR-MC). Different from the method proposed in [16], the low-rank part and the sparse part of the corrupted matrix are recovered by matrix completion and compressed sensing, respectively. First, the temperature field data matrix is decomposed into a low-rank (or approximately low-rank) matrix and a sparse matrix. Then, the low-rank matrix is reconstructed using the matrix completion method, and the sparse part is recovered using compressed sensing. The method is tested using the gridded incomplete temperature field data provided by ERA Interim and the station temperature data provided by Jiangxi Meteorological Bureau.

The rest of this paper is organized as follows. The temperature field matrix model and the proposed SLR-MC method are described in Section 2. The experimental data are presented in Section 3 and the results in Section 4, in which the performance comparison between the MC and SLR-MC reconstructions is also given. Section 5 summarizes the work.

2. Method

In this section, we first describe the temperature field matrix model, which is decomposed into a low-rank part and a sparse part. We then introduce the fundamentals of matrix completion and finally present the SLR-MC method.

2.1. The Temperature Field Matrix Model

In order to overcome the influence of rank, the gridded temperature field data at each time can be regarded as a new low-rank matrix. According to [19], the gridded temperature field data $T_1, T_2, \ldots, T_L$ collected over a period can be arranged in rows to form a large matrix $T$, as shown in Figure 1.

Figure 1: Temperature field matrix model.

Assume that the size of each matrix is $m \times n$, the rank of $T_1$ is $r_1$, the rank of $T_2$ is $r_2$, ..., and the rank of $T_L$ is $r_L$. A single temperature field observation matrix may not satisfy the low-rank property; therefore, the MC method cannot be directly used to reconstruct missing or lost data. However, due to the structural similarity and strong correlation among the matrices $T_1, T_2, \ldots, T_L$, the rank $R$ of matrix $T$ is smaller than $\max(r_1, r_2, \ldots, r_L)$, and the matrix can be decomposed into two parts: a low-rank matrix $T_M$ (few nonzero singular values) and a sparse matrix $T_S$ (few nonzero entries):

$$T = T_M + T_S, \tag{2}$$

where $\operatorname{rank}(T_M) \ll \min(m, n)$ and $\operatorname{sparsity}(T_S) \ll mn$. Figure 2 illustrates an example of the decomposition result.

Figure 2: (a) Temperature field matrix T, (b) low-rank matrix, and (c) sparse matrix.

2.2. Fundamentals of Matrix Completion

Matrix completion is the technique of filling in the missing values of a matrix from a subset of entries selected randomly and uniformly from a low-rank or approximately low-rank matrix [3, 15]. The incomplete matrix $M$ can be recovered by solving the following rank minimization problem [3]:

$$\begin{aligned} &\text{minimize} && \operatorname{rank}(X), \\ &\text{subject to} && P_\Omega(X) = P_\Omega(M), \end{aligned} \tag{3}$$

where $\operatorname{rank}(X)$ denotes the rank of a matrix $X$, and the sampling operator $P_\Omega : \mathbb{R}^{m \times n} \to \mathbb{R}^{m \times n}$ is defined as follows:

$$[P_\Omega(X)]_{ij} = \begin{cases} X_{ij}, & (i, j) \in \Omega, \\ 0, & (i, j) \notin \Omega. \end{cases} \tag{4}$$

We use $|\Omega|$ to denote the cardinality of $\Omega$, which is the number of known entries. For example, suppose the matrix $X$ is

$$X = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}. \tag{5}$$

If three elements are known, with $\Omega = \{(1, 2), (2, 2), (2, 3)\}$, we have

$$P_\Omega(X) = \begin{bmatrix} 0 & 2 & 0 \\ 0 & 5 & 6 \end{bmatrix}. \tag{6}$$

However, the problem in equation (3) is NP-hard and cannot be solved in practice. Candès and Recht proposed the following nuclear norm minimization model as a replacement for the rank minimization model:

$$\begin{aligned} &\text{minimize} && \|X\|_*, \\ &\text{subject to} && P_\Omega(X) = P_\Omega(M), \end{aligned} \tag{7}$$

where the nuclear norm $\|X\|_*$ is the sum of the singular values of $X$.

Unfortunately, no low-rank matrix (even one of rank 1) can be recovered if the sampled entries in any row or column are completely missing. Suppose a matrix is of rank 1 and we have no samples from the second column. The matrix cannot be recovered, because no method can determine the exact entries of the second column. To recover an unknown matrix, at least one observation in each row and each column must be available. Candès and Recht [3] proved that if $\Omega$ is sampled uniformly at random among all subsets of cardinality $m$, problem (7) recovers the matrix with high probability, provided the number of samples obeys $m \geq C n^{6/5} r \log n$.

To recover the incomplete matrix exactly, there is a restriction on the range of the rank $r$. The choice of rank has a great influence on the recovery of the low-rank matrix. We try a small range of rank values and choose the value that gives the best performance (in Section 3, the rank is selected as 7).

2.3. Proposed Method

Considering model (2), the low-rank part $T_M$ and the sparse part $T_S$ of the corrupted matrix $T$ are to be recovered. According to [20], a low-rank or approximately low-rank matrix can be reconstructed using the MC method. As shown in [21], a sparse matrix can be recovered with compressed sensing. Therefore, $T_M$ and $T_S$ in (2) can be obtained through

$$\begin{aligned} &\min && \operatorname{rank}(T_M) + \lambda \|T_S\|_0, \\ &\text{s.t.} && T = T_M + T_S. \end{aligned} \tag{8}$$

Problem (8) is a nonconvex optimization problem, where $\|\cdot\|_0$ denotes the number of nonzero values, and $\lambda > 0$ is a tuning weight that balances the contribution of the $\ell_0$-norm term relative to the rank minimization term. Problem (8) is NP-hard and extremely difficult to solve, so it is converted to the following convex optimization problem, in which the rank is replaced by the nuclear norm and the $\ell_0$ norm by the $\ell_1$ norm, as in [15]:

$$\begin{aligned} &\min && \|T_M\|_* + \lambda \|T_S\|_1, \\ &\text{s.t.} && T = T_M + T_S, \end{aligned} \tag{9}$$

where $\|T_M\|_* = \sum_i \sigma_i(T_M)$ is the nuclear norm of $T_M$ and $\sigma_i(T_M)$ is the $i$th singular value of $T_M$ (sorted in decreasing order). Problem (9) is also known as principal component pursuit (PCP), which can be solved by the augmented Lagrange multiplier (ALM) algorithm based on the following function:

$$L(A, E, Y) = \|A\|_* + \lambda_1 \|E\|_1 + \langle Y, D - A - E \rangle + \frac{\mu}{2} \|D - A - E\|_F^2, \tag{10}$$

where $\mu$ is a positive scalar, $\lambda_1$ is a positive weighting parameter, the Lagrange multiplier $Y$ is introduced to remove the equality constraint $A + E = D$, $\|\cdot\|_F$ denotes the Frobenius norm $\|X\|_F = \sqrt{\sum_i^m \sum_j^n X_{ij}^2}$, and $\langle \cdot, \cdot \rangle$ represents the inner product operator. Here $D$, $A$, and $E$ play the roles of $T$, $T_M$, and $T_S$ in (9). For a given $Y$, $A$ and $E$ are determined as the values that minimize $L(A, E, Y)$. It is therefore assumed that $T_M$ can be recovered by solving problem (10).

Figure 3: The data of gridded temperature field (blue indicates low temperature and red indicates high temperature).
Different from the methods proposed in [15, 16] for sparse and low-rank matrix decomposition, the sparse part $T_S$ is obtained here by the compressed sensing method. The proposed method thus combines the augmented Lagrange multiplier algorithm, used for matrix completion, with compressed sensing, used for sparse reconstruction.

3. Experimental Results

3.1. Gridded Temperature Field Data

We implemented our algorithms in MATLAB 2016. The experimental temperature field data used for testing the method are provided by ERA Interim of ECMWF (European Centre for Medium-Range Weather Forecasts) and can be obtained from the following website: https://apps.ecmwf.int/datasets. The data were collected from Asia at 00:00, 06:00, 12:00, and 18:00 on January 1, 2014, at a height of 2 m above the ground, and the grid resolution was 0.75 degrees. The study region is 20°E∼160°W and 60°S∼60°N, and the size of the region is 200 × 200.

Figure 3 shows the gridded temperature field data at 06:00 on January 1, 2014, represented by a matrix of size 200 × 200. The temperature ranges from 210 K to 320 K, and the grid resolution is 0.75 degrees. Figure 4 shows the sampled data of the global temperature at 06:00 on January 1, 2014 (the sampling number is 15680), to which the reconstruction methods are applied.

Figure 4: Sample data of gridded temperature field.

4. Results

Both the MC method and the proposed algorithm are tested in this section. It can be seen from Figure 5(b) that the global temperature field at 06:00 on January 1, 2014, recovered using the MC method agrees well with the original one (the rank of Figure 5(a) can be estimated by LMaFit [22] and is selected as 7 in this section).

Figure 5: Reconstruction using the MC method. (a) Low-rank temperature field data (r = 7); (b) recovered field.

The results at high latitudes near the North Pole and at low latitudes near the equator are both satisfactory overall. Although the global temperature field recovered from the low-rank matrix using the MC method is good, the recovery results in some areas are not very satisfactory, since the local properties of the temperature data are not considered. For example, the red rectangle in Figure 3 marks a circumpolar latitude area whose temperature varies from 220 K to 250 K. The corresponding recovery results range from 230 K to 250 K. In other words, the temperature data lower than 230 K have not been recovered successfully. Thus, the data in the red rectangle are selected for further analysis.

Table 1 ((a) and (b)) shows the original and recovered temperature fields in the red rectangle at 06:00 on January 1, 2014, respectively. Comparing the values at the same positions in Table 1 ((a) and (b)) shows that the difference is about 5 K to 9 K. The analysis of the low-latitude area in the black rectangle in Figure 3 shows similar performance. The true temperature data range from 300 K to 310 K. However, the recovery results using the MC method are less than 307 K, i.e., the temperature data higher than 307 K have not been recovered. From Table 2 ((a) and (b)), it can be seen that the recovered temperature field data are all lower than the corresponding original data, with an average temperature difference of 4 K. The above test shows that the performance of the MC method using the low-rank matrix alone is not ideal.

Table 1: Comparison of original data and reconstruction data in the red rectangle in Figure 3 (temperatures in K; rows are latitude, columns are longitude).

(a) Original data

| Latitude (°N) | 95.25°E | 96.00°E | 96.75°E | 97.50°E | 98.25°E | 99.00°E |
|---|---|---|---|---|---|---|
| 65.25 | 223.1 | 222.0 | 222.0 | 222.0 | 223.4 | 224.8 |
| 64.50 | 226.8 | 225.3 | 224.8 | 224.2 | 225.3 | 226.3 |
| 63.75 | 232.1 | 231.0 | 230.2 | 229.4 | 229.1 | 228.8 |
| 63.00 | 236.7 | 235.5 | 234.3 | 233.0 | 231.6 | 230.2 |
| 62.25 | 240.5 | 238.9 | 237.4 | 236.1 | 234.9 | 233.3 |
| 61.50 | 245.0 | 243.1 | 241.6 | 240.2 | 239.1 | 237.8 |
| 60.75 | 250.6 | 248.3 | 246.3 | 244.4 | 242.9 | 241.4 |
| 60.00 | 255.0 | 252.2 | 250.0 | 248.0 | 246.3 | 244.6 |

(b) Reconstruction results using MC

| Latitude (°N) | 95.25°E | 96.00°E | 96.75°E | 97.50°E | 98.25°E | 99.00°E |
|---|---|---|---|---|---|---|
| 65.25 | 230.1 | 230.1 | 232.5 | 227.8 | 230.9 | 219.4 |
| 64.50 | 231.8 | 232.4 | 230.7 | 229.4 | 230.3 | 192.6 |
| 63.75 | 235.8 | 235.3 | 245.0 | 229.0 | 240.1 | 238.4 |
| 63.00 | 239.3 | 239.9 | 237.1 | 235.9 | 235.4 | 191.7 |
| 62.25 | 244.3 | 244.1 | 246.1 | 238.6 | 241.5 | 224.0 |
| 61.50 | 247.2 | 248.9 | 231.8 | 242.8 | 233.9 | 142.4 |
| 60.75 | 252.9 | 253.6 | 249.0 | 247.0 | 247.0 | 190.3 |
| 60.00 | 257.5 | 258.1 | 251.7 | 252.6 | 249.2 | 195.2 |

(c) Reconstruction results using SLR-MC

| Latitude (°N) | 95.25°E | 96.00°E | 96.75°E | 97.50°E | 98.25°E | 99.00°E |
|---|---|---|---|---|---|---|
| 65.25 | 224.9 | 224.3 | 224.3 | 224.8 | 225.1 | 229.3 |
| 64.50 | 227.2 | 227.4 | 225.9 | 226.1 | 226.2 | 228.4 |
| 63.75 | 230.6 | 231.9 | 231.7 | 232.0 | 227.3 | 232.0 |
| 63.00 | 237.6 | 237.4 | 235.8 | 235.6 | 233.4 | 233.2 |
| 62.25 | 242.2 | 242.1 | 238.4 | 235.9 | 234.6 | 234.4 |
| 61.50 | 248.0 | 246.1 | 245.5 | 242.6 | 238.5 | 239.0 |
| 60.75 | 253.2 | 250.7 | 248.5 | 246.8 | 242.6 | 238.9 |
| 60.00 | 257.5 | 255.1 | 251.9 | 249.9 | 245.4 | 245.6 |

As mentioned earlier, the SLR-MC method can improve the reconstruction performance for the global temperature field. The temperature field data collected at different times were used to test the proposed method. The original gridded temperature data (see Figure 6) at four moments (00:00, 06:00, 12:00, and 18:00) on January 1, 2014, were studied. The sampling number is 15680, and the matrix rank is 7. The same data shown in Figure 3 (i.e., 06:00 in Figure 6) were studied first. For the high-latitude region in the red rectangle, the temperature recovered using the proposed method varies from 225 K to 250 K.

The point-to-point comparison is shown in Table 1 ((a) and (c)). The temperature difference is reduced from 7 K to 3 K, which is smaller than that in Table 1 (b). Using the SLR-MC method, the reconstruction error can be reduced significantly, which means the recovered temperature field is closer to the original one. Similarly, the SLR-MC method can recover temperature field data higher than 307 K (in the black rectangle).

As illustrated in Table 2 ((a) and (c)), the recovered and original temperature field data at 06:00 on January 1, 2014, were very close to each other. The average error was 1 K, less than that of MC. It can be concluded that the reconstruction results using SLR-MC are more accurate. For the regions with large temperature variation, the recovery performance is more satisfactory.

In this work, both the reconstruction error (RE) and the signal-to-noise ratio (SNR) are used to evaluate the recovery performance of the two methods. The RE is defined as follows:

$$\mathrm{RE} = \frac{\operatorname{norm}(T - \hat{T})}{\operatorname{norm}(T)}, \tag{11}$$

where $T$ is the original temperature field data, $\hat{T}$ is the reconstructed data, and norm represents the 2-norm. The SNR is defined in equation (12).

Table 2: Comparison of original data and reconstruction data in the black rectangle in Figure 3.
Temperatures in K; rows are latitude, columns are longitude.

(a) Original data

| Latitude (°S) | 134.25°E | 135.00°E | 135.75°E | 136.50°E |
|---|---|---|---|---|
| 17.25 | 306.9 | 306.5 | 306.5 | 306.2 |
| 18.00 | 307.8 | 307.6 | 307.6 | 307.4 |
| 18.75 | 309.1 | 309.1 | 308.9 | 308.8 |
| 19.50 | 309.9 | 310.2 | 310.1 | 310.1 |
| 20.25 | 310.3 | 310.7 | 311.1 | 311.5 |
| 21.00 | 310.4 | 311.2 | 312.0 | 312.5 |

(b) Reconstruction results using MC

| Latitude (°S) | 134.25°E | 135.00°E | 135.75°E | 136.50°E |
|---|---|---|---|---|
| 17.25 | 305.2 | 305.8 | 304.7 | 305.6 |
| 18.00 | 294.3 | 305.6 | 293.5 | 308.8 |
| 18.75 | 310.2 | 306.6 | 325.6 | 305.8 |
| 19.50 | 315.4 | 309.7 | 312.0 | 308.4 |
| 20.25 | 311.9 | 309.9 | 260.5 | 307.7 |
| 21.00 | 307.2 | 309.1 | 309.2 | 308.6 |

(c) Reconstruction results using SLR-MC

| Latitude (°S) | 134.25°E | 135.00°E | 135.75°E | 136.50°E |
|---|---|---|---|---|
| 17.25 | 305.8 | 304.3 | 304.7 | 303.9 |
| 18.00 | 305.6 | 305.3 | 305.4 | 305.1 |
| 18.75 | 306.2 | 309.4 | 308.2 | 308.3 |
| 19.50 | 309.4 | 310.1 | 308.7 | 308.3 |
| 20.25 | 309.7 | 309.7 | 309.7 | 309.6 |
| 21.00 | 308.6 | 308.1 | 309.0 | 309.2 |

Figure 6: The reconstruction results of the SLR-MC method. Each row of panels shows, for one of the four times, the original matrix = low-rank matrix + sparse matrix, followed by the reconstruction results.
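The reconstruction error in equation (11) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB code. It assumes that "2-norm" means the matrix 2-norm (the largest singular value), which is what MATLAB's `norm` returns for a matrix by default; the function name and the small test matrix are made up for the example.

```python
import numpy as np


def reconstruction_error(T, T_hat):
    """RE from equation (11): ||T - T_hat|| / ||T|| with the matrix 2-norm."""
    T = np.asarray(T, dtype=float)
    T_hat = np.asarray(T_hat, dtype=float)
    # ord=2 on a 2-D array is the spectral norm (largest singular value).
    return np.linalg.norm(T - T_hat, 2) / np.linalg.norm(T, 2)


T = np.array([[300.0, 305.0], [310.0, 295.0]])   # made-up values in K
re_exact = reconstruction_error(T, T)            # perfect reconstruction
re_scaled = reconstruction_error(T, 0.99 * T)    # uniform 1% underestimate
print(re_exact, re_scaled)
```

A perfect reconstruction gives RE = 0, and a uniform 1% underestimate gives RE = 0.01 up to floating-point rounding, since the norm is homogeneous.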
The SNR is computed as

$$\mathrm{SNR} = 10 \log\left(\frac{p_1}{p_2}\right), \tag{12}$$

where $p_1 = 1/[\operatorname{length}(T) * \operatorname{norm}(T)]$ and $p_2 = 1/[\operatorname{length}(T) * \operatorname{norm}(T - \hat{T})]$. From Figures 7 and 8, it can be seen that the RE of the SLR-MC method is lower and its SNR slightly higher than those of the MC method; the details are provided in Table 3. The average SNR over the four moments is increased by 12.5% using the proposed method, while the average error is 29.3% lower.

Figure 7: Comparison of SNR (SNR versus time in hours, for MC and SLR-MC).

4.1. Station Temperature Data

In this section, we evaluate the performance of the SLR-MC method on the reconstruction of station temperature data. In this experiment, we collected the temperature data at 92 national weather stations in Jiangxi, China. Figure 9 shows the longitude and latitude of the stations, where the blue points represent the locations of the national weather stations in Jiangxi. Each station reports its temperature data once a day to the monitoring center, and we downloaded the data from January 2017 to March 2017. We put the data of each station into a vector and arrange the vectors into a large matrix. The data matrix has been set as M = 87 (only the data from 87 stations are used) and T = 90 (the number of days from January 2017 to March 2017). As shown in Figure 10, the rows of the temperature matrix correspond to the locations of the 87 stations and the columns to the time from January 2017 to March 2017. Figure 11 shows the sampled temperature data matrix, which is selected randomly and uniformly (the sampling number is 4698); the blue dots represent the corrupted temperature data and the red dots represent the sampled temperature data. The size of the matrix is 87 × 90, and the value of the temperature is from 0 K to 300 K. The reconstructed temperature data are shown in Figure 12.
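The sampling step described above can be sketched as follows. This is an illustrative Python sketch, not the authors' code: the helper names are hypothetical, and the station matrix is filled with placeholder random values because the real measurements are not reproduced here. It draws 4698 known entries uniformly at random from an 87 × 90 matrix and applies the sampling operator $P_\Omega$ from Section 2.2.

```python
import numpy as np


def uniform_sample_mask(shape, n_samples, rng):
    """Boolean mask with exactly n_samples known entries, chosen uniformly at random."""
    mask = np.zeros(shape[0] * shape[1], dtype=bool)
    known = rng.choice(mask.size, size=n_samples, replace=False)
    mask[known] = True
    return mask.reshape(shape)


def sampling_operator(X, mask):
    """P_Omega(X): keep the sampled entries and set the rest to zero."""
    return np.where(mask, X, 0.0)


rng = np.random.default_rng(0)
n_stations, n_days = 87, 90
# Placeholder data standing in for the station records (not the real measurements).
station_matrix = 280.0 + 5.0 * rng.standard_normal((n_stations, n_days))
mask = uniform_sample_mask(station_matrix.shape, 4698, rng)
observed = sampling_operator(station_matrix, mask)
print(mask.sum(), mask.mean())   # 4698 known entries, a fraction of 0.6
```

With 87 × 90 = 7830 entries, a sampling number of 4698 corresponds to 60% of the matrix being observed.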
As shown in Figure 12, both the MC and SLR-MC methods capture the main features of the original temperature data matrix. The SLR-MC recovery captures the local features of the original matrix and more of the key variation details, while the MC method often loses this information. Over the whole matrix, the SLR-MC method may not show a significant improvement over the MC method, because the changed temperature values occupy only a small portion of all temperature values in the matrix. Thus, the data in the white rectangle in Figure 12 are selected for further analysis.

The white rectangle in Figure 12 represents an area with significant temperature variation, from 270 K to 290 K. Table 4 shows the original and reconstructed temperature data in the white rectangle. The white rectangle has size 8 × 6, corresponding to the data from 8 stations (see Table 4 (a)) for time slots 30 to 35. Comparing the values at the same positions in Table 4 ((a), (b), and (c)) shows that the difference is about 1 K to 7 K, which means both the MC and SLR-MC methods capture most of the information in the original matrix. Compared to MC, the data matrix in Table 4 (c) recovered by the SLR-MC method is closer to the original data matrix in Table 4 (a).

Figure 8: Comparison of RE (RE in units of 10^-2 versus time in hours, for MC and SLR-MC).

Figure 9: The longitude and latitude of 92 weather stations in Jiangxi province.

Figure 10: The temperature data collected by 87 weather stations in Jiangxi from January 2017 to March 2017 (location index versus time in days).

Figure 11: Sampled temperature data matrix (location index versus time in days).

Table 3: Comparison of the reconstruction performance of MC and SLR-MC (sampling = 15680, rank = 7).

| Method | Time (h) | RE (10^-2) | SNR (dB) |
|---|---|---|---|
| MC | 0 | 1.55 | 87.5 |
| MC | 6 | 1.71 | 88.6 |
| MC | 12 | 1.47 | 85.6 |
| MC | 18 | 1.28 | 97.7 |
| SLR-MC | 0 | 0.91 | 106 |
| SLR-MC | 6 | 1.40 | 88.8 |
| SLR-MC | 12 | 0.67 | 111 |
| SLR-MC | 18 | 1.27 | 98.4 |
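The averaged improvements reported for the gridded data (a 12.5% higher SNR and a 29.3% lower error) can be checked directly from the per-time values in Table 3. The short Python sketch below only redoes that arithmetic; the variable and helper names are chosen for the example.

```python
# RE (in units of 1e-2) and SNR (dB) at 0, 6, 12, and 18 h, copied from Table 3.
re_mc, re_slr = [1.55, 1.71, 1.47, 1.28], [0.91, 1.40, 0.67, 1.27]
snr_mc, snr_slr = [87.5, 88.6, 85.6, 97.7], [106.0, 88.8, 111.0, 98.4]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


re_change = percent_change(mean(re_mc), mean(re_slr))
snr_change = percent_change(mean(snr_mc), mean(snr_slr))
print(f"RE change: {re_change:.1f}%, SNR change: {snr_change:.1f}%")
# prints "RE change: -29.3%, SNR change: 12.5%"
```

The mean RE falls from 1.5025 to 1.0625 (in units of 10^-2) and the mean SNR rises from 89.85 dB to 101.05 dB, which reproduces the two percentages.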
Figure 12: Comparison of corrupted temperature data matrix reconstruction (location index versus time in days). (a) Recovered by the MC method; (b) recovered by the SLR-MC method.

Table 4: Comparison of original data and reconstruction data in the white rectangle in Figure 12 (temperatures in K; rows are station locations, columns are time slots).

(a) Original temperature data

| Location | 30 | 31 | 32 | 33 | 34 | 35 |
|---|---|---|---|---|---|---|
| 58503 | 288.05 | 274.65 | 279.95 | 273.55 | 278.95 | 280.15 |
| 58506 | 285.45 | 279.95 | 279.55 | 279.05 | 279.95 | 279.15 |
| 58510 | 288.75 | 279.65 | 279.35 | 279.05 | 280.05 | 278.55 |
| 58512 | 289.55 | 281.15 | 280.05 | 279.55 | 280.05 | 279.75 |
| 58514 | 290.55 | 280.05 | 280.35 | 279.25 | 280.45 | 280.05 |
| 58517 | 288.95 | 281.95 | 281.05 | 280.35 | 280.65 | 279.75 |
| 58508 | 290.85 | 281.15 | 280.65 | 280.85 | 280.95 | 280.35 |
| 58509 | 287.45 | 282.35 | 282.05 | 279.85 | 280.25 | 279.75 |

(b) Reconstruction results using MC

| Location | 30 | 31 | 32 | 33 | 34 | 35 |
|---|---|---|---|---|---|---|
| 58503 | 282.18 | 271.40 | 273.40 | 272.61 | 276.22 | 278.05 |
| 58506 | 282.01 | 282.91 | 276.68 | 279.33 | 280.90 | 276.27 |
| 58510 | 282.84 | 283.01 | 277.00 | 279.36 | 280.32 | 276.36 |
| 58512 | 282.70 | 282.84 | 277.23 | 279.63 | 281.69 | 277.05 |
| 58514 | 284.74 | 281.44 | 277.94 | 278.83 | 281.29 | 276.85 |
| 58517 | 284.16 | 283.17 | 278.15 | 279.92 | 281.99 | 277.01 |
| 58508 | 287.34 | 283.05 | 279.62 | 280.15 | 281.09 | 278.19 |
| 58509 | 283.07 | 283.93 | 277.42 | 279.73 | 280.84 | 275.36 |

(c) Reconstruction results using SLR-MC

| Location | 30 | 31 | 32 | 33 | 34 | 35 |
|---|---|---|---|---|---|---|
| 58503 | 283.68 | 272.23 | 276.00 | 272.41 | 278.98 | 279.46 |
| 58506 | 281.83 | 282.89 | 276.76 | 279.51 | 281.12 | 276.77 |
| 58510 | 286.69 | 282.98 | 276.61 | 279.63 | 280.59 | 276.57 |
| 58512 | 283.26 | 282.97 | 277.91 | 279.78 | 281.27 | 277.77 |
| 58514 | 284.90 | 281.13 | 277.45 | 278.57 | 281.73 | 277.22 |
| 58517 | 284.04 | 283.12 | 278.88 | 280.17 | 281.58 | 277.04 |
| 58508 | 287.59 | 282.61 | 279.02 | 280.31 | 281.22 | 278.38 |
| 58509 | 281.24 | 283.85 | 278.87 | 279.51 | 280.48 | 275.51 |

For example, the reconstruction error (RE) between Table 4 (a) and Table 4 (b) is 1.08 × 10^-2, while that between Table 4 (a) and Table 4 (c) is 9.43 × 10^-3. The above test shows that the SLR-MC method performs better than the MC method.

5. Conclusion

In this paper, the MC and SLR-MC methods were examined to determine which technique is appropriate for retrieving missing temperature data. Instead of using the alternating direction method (ADM) proposed in [16] to recover the original corrupted matrix data, the SLR-MC method separates the clean low-rank matrix from the corrupted data and applies matrix completion to exploit the low-rank features of temperature field data. The sparse matrix is reconstructed using compressed sensing to capture the sparse features of temperature field data. We have demonstrated the better performance of the SLR-MC method on gridded temperature field data and point temperature data from corrupted observations. Experimental results from gridded temperature field data confirm that the average SNR is increased by 12.5% and the average error is reduced by 29.3% using the SLR-MC method. The SLR-MC method can also be applied to many other kinds of meteorological data with appropriate modification.

Data Availability

The supplementary materials were provided by ERA Interim of ECMWF (European Centre for Medium-Range Weather Forecasts) and Jiangxi Meteorological Bureau. The data provided by ERA Interim were collected from Asia at 00:00, 06:00, 12:00, and 18:00 on January 1, 2014, with a spatial resolution of 0.75 degrees, and the data provided by Jiangxi Meteorological Bureau were collected from 92 national weather stations in Jiangxi from January 2017 to March 2017.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

The authors appreciate ERA Interim of ECMWF for providing observation data. This study was supported in part by the Major Program of National Natural Science Foundation of China (no. 91437220), Jiangxi Province Science Foundation for Youths (no. 20171ACB21038), JiangXi Municipal Science and Technology Project (no. 20171ACG70017), and China Scholarship Council (no. 201808360089).

Supplementary Materials

Array size: 200 × 200; variable: 2 metre temperature; unit: "K." The original temperature data have been recorded in Tabel original.xlsx. The original temperature data tested by the MC method have been recorded in Tabel MC.xlsx. The original temperature data tested by the SLR-MC method have been recorded in Tabel LRC-MC.xlsx. (Supplementary Materials)

References

[1] S. Zhang, "Remote sensing estimation of surface temperature and improvement of spatial interpolation data," Master's thesis, Capital Normal University, Beijing, China, 2014.
[2] K. Xie, X. Ning, X. Wang et al., "Recover corrupted data in sensor networks: a matrix completion solution," IEEE Transactions on Mobile Computing, vol. 16, no. 5, pp. 1434–1448, 2017.
[3] E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, vol. 9, no. 6, pp. 717–772, 2008.
[4] R. H. Keshavan, A. Montanari, and S. Oh, "Matrix completion from noisy entries," Journal of Machine Learning Research, vol. 11, no. 3, pp. 2057–2078, 2012.
[5] S. Becker, V. Cevher, and A. Kyrillidis, Randomized Singular Value Projection, Eprint Arxiv, 2013.
[6] E. J. Candès and T. Tao, "The power of convex relaxation: near-optimal matrix completion," IEEE Transactions on Information Theory, vol. 56, no. 5, pp. 2053–2080, 2010.
[7] E. J. Candès and Y. Plan, "Matrix completion with noise," Proceedings of the IEEE, vol. 98, no. 6, pp. 925–936, 2010.
[8] M. Fazel, Matrix rank minimization with applications, Ph.D. thesis, Stanford University, Stanford, CA, USA, 2002.
[9] J. F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2008.
[10] R. Meka, P. Jain, and I. S. Dhillon, "Guaranteed rank minimization via singular value projection," in Proceedings of the Neural Information Processing Systems, pp. 937–945, Lake Tahoe, NV, USA, December 2012.
[11] Z. Lin, M. Chen, and M. Yi, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," vol. 9, 2010, https://arxiv.org/abs/1009.5055.
[12] T. Cover and P. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27, 1953.
[13] L. Kong, D. Jiang, and M.-Y. Wu, "Optimizing the spatio-temporal distribution of cyber-physical systems for environment abstraction," in Proceedings of the 2010 IEEE 30th International Conference on Distributed Computing Systems (ICDCS), June 2010.
[14] H. Zhu, Y. Zhu, M. Li, and L. M. Ni, "SEER: metropolitan-scale traffic perception based on lossy sensory data," in Proceedings of the 28th Conference on Computer Communications INFOCOM, Rio de Janeiro, Brazil, April 2009.
[15] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?," Journal of the ACM, vol. 58, no. 3, pp. 1–37, 2011.
[16] X. M. Yuan and J. F. Yang, "Sparse and low-rank matrix decomposition via alternating direction method," Pacific Journal of Optimization, vol. 9, no. 1, pp. 167–180, 2013.
[17] C. Y. Li, L. Zhu, W. Z. Bao, Y. L. Jiang, C. A. Yuan, and D. S. Huang, "Convex local sensitive low rank matrix approximation," in Proceedings of the International Joint Conference on Neural Networks, Anchorage, AK, USA, May 2017.
[18] R. Tripathi, B. Mohan, and K. Rajawat, "Adaptive low rank matrix completion," IEEE Transactions on Signal Processing, vol. 65, no. 14, pp. 3603–3616, 2017.
[19] F. Ong and M. Lustig, "Beyond low rank + sparse: multi-scale low rank matrix decomposition," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Shanghai, China, March 2016.
[20] J. Hou, L.-P. Chau, N. Magnenat-Thalmann, and Y. He, "Sparse low-rank matrix approximation for data compression," IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 5, pp. 1043–1054, 2017.
[21] J.-F. Cai and S. Osher, "Fast singular value thresholding without singular value decomposition," Methods and Applications of Analysis, vol. 20, no. 4, pp. 335–352, 2013.
[22] Z. Wen, W. Yin, and Y. Zhang, "Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm," Mathematical Programming Computation, vol. 4, no. 4, pp. 333–361, 2012.
Hindawi Publishing Corporation

Temperature Field Data Reconstruction Using the Sparse Low-Rank Matrix Completion Method


Publisher
Hindawi Publishing Corporation
Copyright
Copyright © 2019 Shan Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
ISSN
1687-9309
eISSN
1687-9317
DOI
10.1155/2019/3676182

Abstract

Hindawi
Advances in Meteorology
Volume 2019, Article ID 3676182, 10 pages
https://doi.org/10.1155/2019/3676182

Research Article

Temperature Field Data Reconstruction Using the Sparse Low-Rank Matrix Completion Method

Shan Wang (1), Jianhui Hu (1), Huiling Shan (1), Chun-Xiang Shi (2), and Weimin Huang (3)

(1) School of Information Engineering, East China Jiaotong University, Nanchang, China
(2) National Meteorological Information Center, Beijing, China
(3) Department of Electrical and Computer Engineering, Memorial University, St. John’s, Canada

Correspondence should be addressed to Shan Wang; patrick_shan@163.com

Received 16 April 2019; Revised 15 September 2019; Accepted 9 October 2019; Published 3 November 2019

Academic Editor: Helena A. Flocas

Copyright © 2019 Shan Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Because the number of weather stations is limited and data collection can be interrupted, temperature field data may be incomplete. In the past, spatial interpolation has usually been used to fill the data gaps. However, interpolation does not work well when a large share of the data is lost. Matrix completion has emerged very recently and provides a global optimization for temperature field data reconstruction. A recovery method based on sparse low-rank matrix completion (SLR-MC) is proposed to improve the accuracy of temperature field data. The method is tested using continuous gridded data provided by ERA Interim and station temperature data provided by the Jiangxi Meteorological Bureau. Experimental results show that, compared with the matrix completion (MC) method, the average signal-to-noise ratio is increased by 12.5% and the average reconstruction error is reduced by 29.3%.

1. Introduction

Temperature field data are measured at a height of at least 1.5 m above the ground. They are an important parameter for describing the environmental conditions of the land [1] and are widely used in weather forecasting. Early research on the urban thermal environment mainly employed temperature data from meteorological stations [2]. The number of meteorological stations is often limited. In addition, the data are continuous in time but not in space. Given these characteristics, it is challenging to investigate temperature-related problems over a large area. Sparse temperature field data are highly correlated with low spatial variability, so interpolation is usually used to obtain the temperature data missing in a region. So far, no research on the sparse property of continuous temperature data has been conducted.

With the rapid development of sparse representation, the matrix completion (MC) method [3–5], which extends the idea of compressed sensing to matrices, has been proposed recently. Matrix completion aims to recover a corrupted matrix from a small part of its entries. It is impossible to recover a corrupted matrix without any assumptions about the matrix. Candès and Recht [3] found that if the given matrix is low rank or approximately low rank, the missing entries of the corrupted matrix can be recovered by minimizing the matrix rank. Mathematically, for a corrupted matrix $M \in \mathbb{R}^{n_1 \times n_2}$, the low-rank matrix completion problem can usually be formulated as

$$\begin{aligned} &\text{minimize} && \operatorname{rank}(X), \\ &\text{subject to} && P_{\Omega}(X) = P_{\Omega}(M), \end{aligned} \tag{1}$$

where $\operatorname{rank}(X)$ denotes the rank of matrix $X$, $\Omega$ is the set of locations of the sampled (known) entries, and $P_{\Omega}$ is the sampling operator, which keeps only the entries indexed by $\Omega$.

Unfortunately, the matrix completion problem is NP-hard because the rank function is nonconvex and discontinuous. To address this, Candès and Tao proposed using the convex nuclear norm to solve the rank minimization problem [6].
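As an illustrative aside, the sampling operator $P_{\Omega}$ in (1) and the nuclear norm that stands in for the rank can be written in a few lines of NumPy. The sketch below is not the authors' MATLAB code: the helper names `sample_entries` and `nuclear_norm`, the 2 × 3 test matrix, and the zero-based index set are assumptions made only for this example. The nuclear norm is computed as the sum of the singular values.

```python
import numpy as np


def sample_entries(X, omega):
    """Sampling operator P_Omega: keep the entries indexed by omega, zero the rest."""
    rows, cols = map(list, zip(*omega))
    out = np.zeros_like(X, dtype=float)
    out[rows, cols] = X[rows, cols]
    return out


def nuclear_norm(X):
    """Nuclear norm: the sum of the singular values of X."""
    return np.linalg.svd(X, compute_uv=False).sum()


# Hypothetical rank-1 matrix, built as an outer product (illustration only).
M = np.outer([1.0, 2.0], [1.0, 2.0, 3.0])

# Hypothetical set of known positions, zero-based (row, column) pairs.
omega = [(0, 0), (0, 2), (1, 1)]

observed = sample_entries(M, omega)   # unknown positions are set to zero
rank_M = np.linalg.matrix_rank(M)     # rank of the full matrix
nuc_M = nuclear_norm(M)               # convex surrogate for the rank

print(observed)
print(rank_M, nuc_M)
```

For a rank-1 matrix the nuclear norm equals its single nonzero singular value, which here is $\sqrt{70}$.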
When the known entries are sampled randomly and uniformly from the unknown matrix, the missing entries can be recovered accurately if the matrix satisfies the low-rank structure and the incoherence condition [7]. The nuclear norm is the sum of the singular values and can be seen as a special case of the $l_1$ norm. The nuclear norm is widely adopted as a convex surrogate for the rank [8], and the resulting problem can be solved via convex optimization. Semidefinite programming (SDP) has been proposed to solve the convex problem. Since SDP has a high computational cost, several algorithms that are more computationally efficient than the SDP-based methods have been proposed, such as singular value thresholding (SVT) [9], singular value projection (SVP) [10], and the inexact augmented Lagrangian method (IALM) [11].

Unlike the interpolation methods presented in [12–14], matrix completion requires the corrupted matrix to be low rank, and it works well when a large portion of the data is lost. Taking advantage of the low rank and spatiotemporal correlation of a matrix, MC can achieve good interpolation performance. Compared with the traditional spatial interpolation method, MC makes good use of the correlation between the data, and it needs only a few temperature field data to reconstruct the global temperature field. The reconstructed data quality is comparable to that of spatial interpolation. According to matrix completion theory, the temperature matrix needs to be low rank in order to obtain good reconstruction resolution. However, the temperature field data matrix does not have a stable rank, and the rank of the matrix varies with time. We therefore regard the gridded temperature field data as a new matrix, whose rank is more stable. Although matrix completion can recover the incomplete temperature field data well, some information will still be lost in the process. In [15], the data matrix was assumed to be decomposed into a low-rank part and a sparse part, which can be recovered individually by solving a very convenient convex program under some suitable assumptions. To recover the sparse and low-rank components of a matrix efficiently, the alternating direction method (ADM) has been proposed in [16], but the sparse part of gridded temperature field data is well suited to compressed sensing (CS) because extensive spatiotemporal correlations result in sparser representations. The combination of compressed sensing and low-rank matrix completion is an attractive proposition for further improving reconstruction.

In this paper, a method based on matrix completion and compressed sensing [17, 18] is presented and referred to as sparse low-rank matrix completion (SLR-MC). Different from the method proposed in [16], the low-rank part and the sparse part of the corrupted matrix are recovered by matrix completion and compressed sensing individually. First, the temperature field data matrix is decomposed into a low-rank or approximately low-rank matrix and a sparse matrix. Then, the low-rank matrix is reconstructed using the matrix completion method, and the sparse part is recovered using compressed sensing. The method is tested using the gridded incomplete temperature field data provided by ERA Interim and the station temperature data provided by the Jiangxi Meteorological Bureau.

The rest of this paper is organized as follows. The temperature field matrix model and the proposed SLR-MC method are described in Section 2. The experiment data and results are presented in Sections 3 and 4, in which the performance comparison between the MC and SLR-MC reconstruction is also given. In Section 5, a summary of the work is provided.

2. Method

In this section, we first describe the temperature field matrix model, which is decomposed into a low-rank part and a sparse part. The fundamentals of matrix completion are then introduced, and finally the SLR-MC method is presented.

2.1. The Temperature Field Matrix Model

In order to overcome the influence of rank, the gridded temperature field data at each time can be regarded as a new low-rank matrix. According to [19], the gridded temperature field data $T_1, T_2, \ldots, T_L$ collected over a period can be arranged in rows into a large matrix $T$, as shown in Figure 1.

Figure 1: Temperature field matrix model.

Assume that the size of each matrix is $m \times n$, the rank of $T_1$ is $r_1$, the rank of $T_2$ is $r_2$, ..., and the rank of $T_L$ is $r_L$. For each single temperature field, the observation matrix may not satisfy the low-rank property; therefore, the MC method cannot be directly used to reconstruct missing or lost data. However, due to structure similarity and strong correlation among the matrices $T_1, T_2, \ldots, T_L$, the rank $R$ of matrix $T$ is smaller than $\max(r_1, r_2, \ldots, r_L)$, and the matrix can be decomposed into two parts, a low-rank matrix $T_M$ (few nonzero singular values) and a sparse matrix $T_S$ (few nonzero entries):

$$T = T_M + T_S, \tag{2}$$

where $\operatorname{rank}(T_M) \ll \min(m, n)$ and $\operatorname{sparsity}(T_S) \ll mn$. Figure 2 illustrates an example of the decomposition result.

Figure 2: (a) Temperature field matrix T, (b) low-rank matrix, and (c) sparse matrix.

2.2. Fundamentals of Matrix Completion

Matrix completion is the technique of completing the missing values of a matrix from a subset of entries selected randomly and uniformly from a low-rank or approximately low-rank matrix [3, 15]. The incomplete matrix $M$ can be recovered by solving the following rank minimization problem [3]:

$$\begin{aligned} &\text{minimize} && \operatorname{rank}(X), \\ &\text{subject to} && P_{\Omega}(X) = P_{\Omega}(M), \end{aligned} \tag{3}$$

where $\operatorname{rank}(X)$ denotes the rank of a matrix $X$, and the sampling operator $P_{\Omega}: \mathbb{R}^{m \times n} \to \mathbb{R}^{m \times n}$ is defined as follows:

$$[P_{\Omega}(X)]_{ij} = \begin{cases} X_{ij}, & (i, j) \in \Omega, \\ 0, & (i, j) \notin \Omega. \end{cases} \tag{4}$$

We use $|\Omega|$ to represent the cardinality of $\Omega$, which is the number of known entries. For example, suppose the matrix $X$ is

$$X = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}. \tag{5}$$

If three elements are known, with $\Omega = \{(1, 2), (2, 2), (2, 3)\}$, we have

$$P_{\Omega}(X) = \begin{bmatrix} 0 & 2 & 0 \\ 0 & 5 & 6 \end{bmatrix}. \tag{6}$$

However, the problem in equation (3) is NP-hard and impossible to solve in practice. Candès and Recht proposed the following nuclear norm minimization model in place of the rank minimization model:

$$\begin{aligned} &\text{minimize} && \|X\|_{*}, \\ &\text{subject to} && P_{\Omega}(X) = P_{\Omega}(M), \end{aligned} \tag{7}$$

where the nuclear norm $\|X\|_{*}$ is the sum of the singular values of $X$.

Unfortunately, we cannot recover any low-rank matrix (even if its rank is 1) if the sampled entries in any row or column are completely missing. Suppose a matrix is of rank 1 and we do not have samples from the second column. The matrix cannot be recovered, because no method can obtain the exact entries of the second column. In order to recover an unknown matrix, at least one observation in each row and column should be available. Candès and Recht [3] proved that if $\Omega$ is sampled uniformly at random among all subsets of cardinality $m$, problem (7) can be solved with high confidence provided that the number of samples $m$ obeys $m \geq C n^{6/5} r \log n$.

In order to recover the incomplete matrix exactly, there is a restriction on the range of the rank $r$. The selection of the rank has a great influence on recovering the low-rank matrix. We try a small range of rank values and choose the value that results in the best performance (in Section 3, the rank is selected as 7).

2.3. Proposed Method

Considering model (2), the low-rank part $T_M$ and the sparse part $T_S$ are to be recovered from the corrupted matrix $T$. According to [20], a low-rank or approximately low-rank matrix can be reconstructed using the MC method. As shown in [21], a sparse matrix can be recovered with compressed sensing. Therefore, $T_M$ and $T_S$ in (2) can be obtained through

$$\begin{aligned} &\min && \operatorname{rank}(T_M) + \lambda \|T_S\|_{0}, \\ &\text{s.t.} && T = T_M + T_S. \end{aligned} \tag{8}$$

Problem (8) is a nonconvex optimization problem, where $\|\cdot\|_{0}$ denotes the number of nonzero values, and $\lambda$ is a tuning weight that balances the contribution of the $l_0$-norm term relative to the rank minimization term and should be greater than 0. Problem (8) is extremely difficult to compute and NP-hard, so it can be converted to the following convex optimization problem:

$$\begin{aligned} &\min && \|T_M\|_{*} + \lambda \|T_S\|_{0}, \\ &\text{s.t.} && T = T_M + T_S, \end{aligned} \tag{9}$$

where $\|T_M\|_{*} = \sum_i \sigma_i(T_M)$ is the nuclear norm of $T_M$ and $\sigma_i(T_M)$ represents the $i$th singular value of $T_M$ (sorted in decreasing order). Problem (9) is also known as principal component pursuit (PCP), which can be solved by the augmented Lagrange multiplier (ALM) algorithm based on the following function:

$$L(A, E, Y) = \|A\|_{*} + \lambda_1 \|E\|_{0} + \langle Y, D - A - E \rangle + \frac{\mu}{2} \|D - A - E\|_{F}^{2}, \tag{10}$$

where $\mu$ is a positive scalar, $\lambda_1$ is a positive weighting parameter, the Lagrange multiplier $Y$ is introduced to remove the equality constraint $A + E = D$ (with $D$, $A$, and $E$ corresponding to $T$, $T_M$, and $T_S$ in (9)), $\|\cdot\|_{F}$ denotes the Frobenius norm $\|X\|_{F} = \sqrt{\sum_{i}^{m} \sum_{j}^{n} X_{ij}^{2}}$, and $\langle \cdot, \cdot \rangle$ represents the inner product operator. For a given $Y$, $A$ and $E$ are determined as the values that minimize $L(A, E, Y)$. It is therefore supposed that $T_M$ can be recovered by problem (10). Different from the methods proposed in [15, 16] for solving the sparse and low-rank matrix decomposition, the sparse part $T_S$ is obtained by the compressed sensing method. The proposed method thus combines the augmented Lagrange multiplier, used for matrix completion, with compressed sensing, used for sparse reconstruction.

3. Experimental Results

3.1. Gridded Temperature Field Data

Figure 3: The data of gridded temperature field (blue color indicates the low temperature and red color indicates the high temperature).
We implemented our algorithms in MATLAB 2016. The experimental temperature field data used for testing the method are provided by ERA Interim of ECMWF (European Centre for Medium-Range Weather Forecasts) and can be obtained from the following website: https://apps.ecmwf.int/datasets. The data were collected from Asia at 00:00, 06:00, 12:00, and 18:00 on January 1, 2014, at a height of 2 m above the ground, and the grid resolution was 0.75 degrees. The region of the study is at 20°E∼160°W and 60°S∼60°N, and the size of the region is 200 × 200.

Figure 3 shows the gridded temperature field data selected at 06:00 on January 1, 2014, which is represented by a matrix of size 200 × 200. The temperature ranges from 210 K to 320 K and the grid resolution is 0.75 degrees. Figure 4 shows the sampled data of the global temperature at 06:00 on January 1, 2014 (the sampling number is 15680), to which the reconstruction methods are applied.

Figure 4: Sample data of gridded temperature field.

4. Results

Both the MC method and the proposed algorithm are tested in this section. It can be seen from Figure 5(b) that the recovered global temperature field at 06:00 on January 1, 2014, using the MC method agrees well with the original one (the rank of Figure 5(a) can be estimated by LMaFit [22] and is selected as 7 in this section).

Figure 5: Reconstruction using the MC method. (a) Low-rank temperature field data (r = 7); (b) recovered field.

The results at high latitudes near the North Pole and at low latitudes near the equator are both satisfactory. Although the global temperature field recovered from the low-rank matrix using the MC method is good, the recovery results in some areas are not very satisfactory since the local properties of the temperature data are not considered. For example, the red rectangle in Figure 3 marks a circumpolar latitude area whose temperature varies from 220 K to 250 K. The corresponding recovery results range from 230 K to 250 K. In other words, the temperature data lower than 230 K have not been recovered successfully. Thus, the data in the red rectangle are selected for further analysis.

Table 1 ((a) and (b)) shows the original and recovered temperature field in the red rectangle on January 1, 2014, at 06:00, respectively. Comparison of the values at the same position in Table 1 ((a) and (b)) shows that the difference is about 5 K to 9 K.

Table 1: Comparison of original data and reconstruction data in red rectangle in Figure 3 (temperature in K; rows are latitude (N), columns are longitude (E)).

(a) Original data

| Latitude (N) | 95.25° | 96.00° | 96.75° | 97.50° | 98.25° | 99.00° |
| --- | --- | --- | --- | --- | --- | --- |
| 65.25° | 223.1 | 222.0 | 222.0 | 222.0 | 223.4 | 224.8 |
| 64.50° | 226.8 | 225.3 | 224.8 | 224.2 | 225.3 | 226.3 |
| 63.75° | 232.1 | 231.0 | 230.2 | 229.4 | 229.1 | 228.8 |
| 63.00° | 236.7 | 235.5 | 234.3 | 233.0 | 231.6 | 230.2 |
| 62.25° | 240.5 | 238.9 | 237.4 | 236.1 | 234.9 | 233.3 |
| 61.50° | 245.0 | 243.1 | 241.6 | 240.2 | 239.1 | 237.8 |
| 60.75° | 250.6 | 248.3 | 246.3 | 244.4 | 242.9 | 241.4 |
| 60.00° | 255.0 | 252.2 | 250.0 | 248.0 | 246.3 | 244.6 |

(b) Reconstruction results using MC

| Latitude (N) | 95.25° | 96.00° | 96.75° | 97.50° | 98.25° | 99.00° |
| --- | --- | --- | --- | --- | --- | --- |
| 65.25° | 230.1 | 230.1 | 232.5 | 227.8 | 230.9 | 219.4 |
| 64.50° | 231.8 | 232.4 | 230.7 | 229.4 | 230.3 | 192.6 |
| 63.75° | 235.8 | 235.3 | 245.0 | 229.0 | 240.1 | 238.4 |
| 63.00° | 239.3 | 239.9 | 237.1 | 235.9 | 235.4 | 191.7 |
| 62.25° | 244.3 | 244.1 | 246.1 | 238.6 | 241.5 | 224.0 |
| 61.50° | 247.2 | 248.9 | 231.8 | 242.8 | 233.9 | 142.4 |
| 60.75° | 252.9 | 253.6 | 249.0 | 247.0 | 247.0 | 190.3 |
| 60.00° | 257.5 | 258.1 | 251.7 | 252.6 | 249.2 | 195.2 |

(c) Reconstruction results using SLR-MC

| Latitude (N) | 95.25° | 96.00° | 96.75° | 97.50° | 98.25° | 99.00° |
| --- | --- | --- | --- | --- | --- | --- |
| 65.25° | 224.9 | 224.3 | 224.3 | 224.8 | 225.1 | 229.3 |
| 64.50° | 227.2 | 227.4 | 225.9 | 226.1 | 226.2 | 228.4 |
| 63.75° | 230.6 | 231.9 | 231.7 | 232.0 | 227.3 | 232.0 |
| 63.00° | 237.6 | 237.4 | 235.8 | 235.6 | 233.4 | 233.2 |
| 62.25° | 242.2 | 242.1 | 238.4 | 235.9 | 234.6 | 234.4 |
| 61.50° | 248.0 | 246.1 | 245.5 | 242.6 | 238.5 | 239.0 |
| 60.75° | 253.2 | 250.7 | 248.5 | 246.8 | 242.6 | 238.9 |
| 60.00° | 257.5 | 255.1 | 251.9 | 249.9 | 245.4 | 245.6 |

The analysis of the low latitude area in the black rectangle in Figure 3 shows similar performance. The true temperature data range from 300 K to 310 K. However, the recovery results using the MC method are less than 307 K, i.e., the temperature data higher than 307 K have not been recovered. From Table 2 ((a) and (b)), it can be seen that the recovered temperature field data are all lower than the corresponding original data, with an average temperature difference of 4 K. This test shows that the performance of the MC method using the low-rank matrix alone is not ideal.

Table 2: Comparison of original data and reconstruction data in black rectangle in Figure 3 (temperature in K; rows are latitude (S), columns are longitude (E)).

(a) Original data

| Latitude (S) | 134.25° | 135.00° | 135.75° | 136.50° |
| --- | --- | --- | --- | --- |
| 17.25° | 306.9 | 306.5 | 306.5 | 306.2 |
| 18.00° | 307.8 | 307.6 | 307.6 | 307.4 |
| 18.75° | 309.1 | 309.1 | 308.9 | 308.8 |
| 19.50° | 309.9 | 310.2 | 310.1 | 310.1 |
| 20.25° | 310.3 | 310.7 | 311.1 | 311.5 |
| 21.00° | 310.4 | 311.2 | 312.0 | 312.5 |

(b) Reconstruction results using MC

| Latitude (S) | 134.25° | 135.00° | 135.75° | 136.50° |
| --- | --- | --- | --- | --- |
| 17.25° | 305.2 | 305.8 | 304.7 | 305.6 |
| 18.00° | 294.3 | 305.6 | 293.5 | 308.8 |
| 18.75° | 310.2 | 306.6 | 325.6 | 305.8 |
| 19.50° | 315.4 | 309.7 | 312.0 | 308.4 |
| 20.25° | 311.9 | 309.9 | 260.5 | 307.7 |
| 21.00° | 307.2 | 309.1 | 309.2 | 308.6 |

(c) Reconstruction results using SLR-MC

| Latitude (S) | 134.25° | 135.00° | 135.75° | 136.50° |
| --- | --- | --- | --- | --- |
| 17.25° | 305.8 | 304.3 | 304.7 | 303.9 |
| 18.00° | 305.6 | 305.3 | 305.4 | 305.1 |
| 18.75° | 306.2 | 309.4 | 308.2 | 308.3 |
| 19.50° | 309.4 | 310.1 | 308.7 | 308.3 |
| 20.25° | 309.7 | 309.7 | 309.7 | 309.6 |
| 21.00° | 308.6 | 308.1 | 309.0 | 309.2 |

As mentioned earlier, the SLR-MC method can improve the reconstruction performance of the global temperature field. The temperature field data collected at different times were used to test the proposed method. The original gridded temperature data (see Figure 6) at four moments (00:00, 06:00, 12:00, and 18:00) on January 1, 2014, were studied. The sampling number is 15680, and the matrix rank is 7. The same data shown in Figure 3 (i.e., 06:00 in Figure 6) were studied first. For the high latitude region in the red rectangle, the recovered temperature using the proposed method varies from 225 K to 250 K.

Figure 6: The reconstruction results of the SLR-MC method. The rows correspond to 00:00, 06:00, 12:00, and 18:00; the columns show the original matrix, the low-rank matrix, the sparse matrix, and the reconstruction results.

The point-to-point comparison is shown in Table 1 ((a) and (c)). It can be seen that the temperature difference is reduced from 7 K to 3 K, which is smaller than that in Table 1 (b). Using the SLR-MC method, the reconstruction error can be reduced significantly, which means the recovered temperature field is closer to the original one. Similarly, it is also found that the SLR-MC method can recover temperature field data higher than 307 K (in the black rectangle).

As illustrated in Table 2 ((a) and (c)), the recovered and original temperature field data at 06:00 on January 1, 2014, were very close to each other. The average error was 1 K, less than that of MC. It can be concluded that the reconstruction results using SLR-MC are more accurate. For the regions with large temperature variation, the recovery performance is more satisfactory.

In this work, both reconstruction error (RE) and signal-to-noise ratio (SNR) are used to evaluate the recovery performance of the two methods. The RE is defined as follows:

$$\mathrm{RE} = \frac{\operatorname{norm}(T - \hat{T})}{\operatorname{norm}(T)}, \tag{11}$$

where $T$ is the original temperature field data, $\hat{T}$ is the reconstructed data, and norm represents the 2-norm. The SNR is defined as

$$\mathrm{SNR} = 10 \log\left(\frac{p_1}{p_2}\right), \tag{12}$$

where $p_1 = 1/[\operatorname{length}(T) * \operatorname{norm}(T)]$ and $p_2 = 1/[\operatorname{length}(T) * \operatorname{norm}(T - \hat{T})]$.

From Figures 7 and 8, it can be found that the RE and SNR of the SLR-MC method are lower and slightly higher, respectively, than those of the MC method; the details are provided in Table 3. The average SNR for the four moments is increased by 12.5% using the proposed method, while the average error is 29.3% lower.

Figure 7: Comparison of SNR (SNR in dB versus time in hours, for MC and SLR-MC).

4.1. Station Temperature Data

In this section, we evaluate the performance of the SLR-MC method on the reconstruction of station temperature data. In this experiment, we have collected the temperature data at 92 national weather stations in Jiangxi, China. Figure 9 shows the longitude and latitude of the stations, where the blue points represent the locations of the national weather stations in Jiangxi. Each station reports its temperature data once a day to the monitoring center, and we have downloaded the data from January 2017 to March 2017. We put the data of each station into a vector and arrange the vectors into a large matrix. The data matrix has been set with M = 87 (only the data from 87 stations are used) and T = 90 (the number of days from January 2017 to March 2017).
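The arrangement just described can be sketched as follows. This is only an illustration under stated assumptions: the values are synthetic random numbers (not the Jiangxi records), stations are placed on rows so that the matrix is 87 × 90, and 4698 is the sampling number reported for this experiment. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stations, n_days = 87, 90

# Synthetic stand-in for the daily station records in kelvin (NOT the real data).
station_series = [
    280.0 + 5.0 * rng.standard_normal(n_days) for _ in range(n_stations)
]

# Stack the per-station vectors: one row per station, one column per day.
data = np.vstack(station_series)

# Draw the known entries uniformly at random without replacement.
n_known = 4698
flat_idx = rng.choice(data.size, size=n_known, replace=False)
mask = np.zeros(data.size, dtype=bool)
mask[flat_idx] = True
mask = mask.reshape(data.shape)

# Observed matrix: known entries kept, missing entries set to zero.
observed = np.where(mask, data, 0.0)

print(data.shape, int(mask.sum()), mask.mean())
```

With these numbers, 4698 of the 87 × 90 = 7830 entries are kept, that is, 60% of the matrix is treated as known and the remaining 40% as missing.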
As shown in Figure 10, each row of the temperature matrix corresponds to one of the 87 stations and each column to a day between January 2017 and March 2017. Figure 11 shows the sampled temperature data matrix, whose entries are selected randomly and uniformly (the sampling number is 4698); the blue dots represent the corrupted temperature data and the red dots represent the sampled temperature data. The size of the matrix is 87 × 90, and the value of the temperature is from 0 K to 300 K. The reconstructed temperature data are shown in Figure 12. As shown in Figure 12, both the MC and SLR-MC methods can capture the main features of the original temperature data matrix.

Figure 8: Comparison of RE (RE in units of 10^-2 versus time in hours, for MC and SLR-MC).

Table 3: Comparison of the reconfiguration effect of MC and SLR-MC.

| Prerequisite | Method | Time (hours) | RE (10^-2) | SNR (dB) |
| --- | --- | --- | --- | --- |
| Sampling = 15680, rank = 7 | MC | 0 | 1.55 | 87.5 |
| | MC | 6 | 1.71 | 88.6 |
| | MC | 12 | 1.47 | 85.6 |
| | MC | 18 | 1.28 | 97.7 |
| | SLR-MC | 0 | 0.91 | 106 |
| | SLR-MC | 6 | 1.40 | 88.8 |
| | SLR-MC | 12 | 0.67 | 111 |
| | SLR-MC | 18 | 1.27 | 98.4 |

Figure 9: The longitude (°E) and latitude (°N) of 92 weather stations in Jiangxi province.

Figure 10: The temperature data collected by 87 weather stations in Jiangxi from January 2017 to March 2017 (location index number versus time in days).

Figure 11: Sampled temperature data matrix (location index number versus time in days).

Figure 12: Comparison of corrupted temperature data matrix reconstruction. (a) Recovered by MC method; (b) recovered by SLR-MC method.

The recovery results of the SLR-MC method capture the local features of the original matrix and more of the key variation details, while the MC method often loses this information. The SLR-MC method may not show a significant improvement over the MC method because the changed temperature values occupy only a small portion of all temperature values in the matrix. Thus, the data in the white rectangle in Figure 12 are selected for further analysis.

The white rectangle in Figure 12 represents an area with significant temperature variation from 270 K to 290 K. Table 4 shows the original and reconstructed temperature data in the white rectangle. The white rectangle size is 8 × 6, which corresponds to the data matrix obtained by 8 stations (see Table 4 (a)) from time slots 30 to 35. Comparison of the values at the same position in Table 4 ((a), (b), and (c)) shows that the difference is about 1 K to 7 K, which means both the MC and SLR-MC methods can capture most information of the original matrix. Compared to MC, the data matrix in Table 4 (c) recovered by the SLR-MC method is closer to the original data matrix in Table 4 (a). For example, the reconstruction error (RE) between Table 4 (a) and Table 4 (b) is $1.08 \times 10^{-2}$, while that between Table 4 (a) and Table 4 (c) is $9.43 \times 10^{-3}$. This test shows that the performance of the SLR-MC method is better than that of the MC method.

Table 4: Comparison of original data and reconstruction data in white rectangle in Figure 12 (temperature in K; rows are station locations, columns are time slots).

(a) Original temperature data

| Location | 30 | 31 | 32 | 33 | 34 | 35 |
| --- | --- | --- | --- | --- | --- | --- |
| 58503 | 288.05 | 274.65 | 279.95 | 273.55 | 278.95 | 280.15 |
| 58506 | 285.45 | 279.95 | 279.55 | 279.05 | 279.95 | 279.15 |
| 58510 | 288.75 | 279.65 | 279.35 | 279.05 | 280.05 | 278.55 |
| 58512 | 289.55 | 281.15 | 280.05 | 279.55 | 280.05 | 279.75 |
| 58514 | 290.55 | 280.05 | 280.35 | 279.25 | 280.45 | 280.05 |
| 58517 | 288.95 | 281.95 | 281.05 | 280.35 | 280.65 | 279.75 |
| 58508 | 290.85 | 281.15 | 280.65 | 280.85 | 280.95 | 280.35 |
| 58509 | 287.45 | 282.35 | 282.05 | 279.85 | 280.25 | 279.75 |

(b) Reconstruction results using MC

| Location | 30 | 31 | 32 | 33 | 34 | 35 |
| --- | --- | --- | --- | --- | --- | --- |
| 58503 | 282.18 | 271.40 | 273.40 | 272.61 | 276.22 | 278.05 |
| 58506 | 282.01 | 282.91 | 276.68 | 279.33 | 280.90 | 276.27 |
| 58510 | 282.84 | 283.01 | 277.00 | 279.36 | 280.32 | 276.36 |
| 58512 | 282.70 | 282.84 | 277.23 | 279.63 | 281.69 | 277.05 |
| 58514 | 284.74 | 281.44 | 277.94 | 278.83 | 281.29 | 276.85 |
| 58517 | 284.16 | 283.17 | 278.15 | 279.92 | 281.99 | 277.01 |
| 58508 | 287.34 | 283.05 | 279.62 | 280.15 | 281.09 | 278.19 |
| 58509 | 283.07 | 283.93 | 277.42 | 279.73 | 280.84 | 275.36 |

(c) Reconstruction results using SLR-MC

| Location | 30 | 31 | 32 | 33 | 34 | 35 |
| --- | --- | --- | --- | --- | --- | --- |
| 58503 | 283.68 | 272.23 | 276.00 | 272.41 | 278.98 | 279.46 |
| 58506 | 281.83 | 282.89 | 276.76 | 279.51 | 281.12 | 276.77 |
| 58510 | 286.69 | 282.98 | 276.61 | 279.63 | 280.59 | 276.57 |
| 58512 | 283.26 | 282.97 | 277.91 | 279.78 | 281.27 | 277.77 |
| 58514 | 284.90 | 281.13 | 277.45 | 278.57 | 281.73 | 277.22 |
| 58517 | 284.04 | 283.12 | 278.88 | 280.17 | 281.58 | 277.04 |
| 58508 | 287.59 | 282.61 | 279.02 | 280.31 | 281.22 | 278.38 |
| 58509 | 281.24 | 283.85 | 278.87 | 279.51 | 280.48 | 275.51 |

5. Conclusion

In this paper, the MC and SLR-MC methods were examined to determine which technique is appropriate for retrieving missing temperature data. Instead of using the alternating direction method (ADM) proposed in [16] to recover the original corrupted matrix data, the SLR-MC method separates the clean low-rank matrix from the corrupted data effectively and applies matrix completion to fully exploit the low-rank features of temperature field data. The sparse matrix is reconstructed using compressed sensing to fully capture the sparse features of temperature field data. We have demonstrated the better performance of the SLR-MC method on gridded temperature field data and point temperature data from corrupted observations. Experimental results from gridded temperature field data confirm that the average SNR is increased by 12.5% and the average error is reduced by 29.3% using the SLR-MC method. The SLR-MC method can also be applied to many other meteorological data with appropriate modification.

Data Availability

The supplementary materials were provided by ERA Interim of ECMWF (European Centre for Medium-Range Weather Forecasts) and the Jiangxi Meteorological Bureau. The data provided by ERA Interim were collected from Asia at 00:00, 06:00, 12:00, and 18:00 on January 1, 2014, with a spatial resolution of 0.75 degrees, and the data provided by the Jiangxi Meteorological Bureau were collected from 92 national weather stations in Jiangxi from January 2017 to March 2017.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

The authors appreciate ERA Interim of ECMWF for providing observation data. This study was supported in part by the Major Program of National Natural Science Foundation of China (no. 91437220), Jiangxi Province Science Foundation for Youths (no. 20171ACB21038), JiangXi Municipal Science and Technology Project (no. 20171ACG70017), and China Scholarship Council (no. 201808360089).

Supplementary Materials

Array size: 200 × 200; variable: 2 metre temperature; unit = “K.” The original temperature data have been recorded in Tabel original.xlsx. The original temperature data tested by the MC method have been recorded in Tabel MC.xlsx. The original temperature data tested by the SLR-MC method have been recorded in Tabel LRC-MC.xlsx. (Supplementary Materials)

References

[1] S. Zhang, “Remote sensing estimation of surface temperature and improvement of spatial interpolation data,” Master’s thesis, Capital Normal University, Beijing, China, 2014.
[2] K. Xie, X. Ning, X. Wang et al., “Recover corrupted data in sensor networks: a matrix completion solution,” IEEE Transactions on Mobile Computing, vol. 16, no. 5, pp. 1434–1448, 2017.
[3] E. J. Candès and B. Recht, “Exact matrix completion via convex optimization,” Foundations of Computational Mathematics, vol. 9, no. 6, pp. 717–772, 2008.
[4] R. H. Keshavan, A. Montanari, and S. Oh, “Matrix completion from noisy entries,” Journal of Machine Learning Research, vol. 11, no. 3, pp. 2057–2078, 2012.
[5] S. Becker, V. Cevher, and A. Kyrillidis, Randomized Singular Value Projection, Eprint Arxiv, 2013.
[6] E. J. Candès and T. Tao, “The power of convex relaxation: near-optimal matrix completion,” IEEE Transactions on Information Theory, vol. 56, no. 5, pp. 2053–2080, 2010.
[7] E. J. Candès and Y. Plan, “Matrix completion with noise,” Proceedings of the IEEE, vol. 98, no. 6, pp. 925–936, 2010.
[8] M. Fazel, Matrix rank minimization with applications, Ph.D. thesis, Stanford University, Stanford, CA, USA, 2002.
[9] J. F. Cai, E. J. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2008.
[10] R. Meka, P. Jain, and I. S. Dhillon, “Guaranteed rank minimization via singular value projection,” in Proceedings of the Neural Information Processing Systems, pp. 937–945, Lake Tahoe, NV, USA, December 2012.
[11] Z. Lin, M. Chen, and M. Yi, “The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices,” vol. 9, 2010, https://arxiv.org/abs/1009.5055.
[12] T. Cover and P. Hart, “Nearest neighbor pattern classification,” IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27, 1953.
[13] L. Kong, D. Jiang, and M.-Y. Wu, “Optimizing the spatio-temporal distribution of cyber-physical systems for environment abstraction,” in Proceedings of the 2010 IEEE 30th International Conference on Distributed Computing Systems (ICDCS), June 2010.
[14] H. Zhu, Y. Zhu, M. Li, and L. M. Ni, “SEER: metropolitan-scale traffic perception based on lossy sensory data,” in Proceedings of the 28th Conference on Computer Communications INFOCOM, Rio de Janeiro, Brazil, April 2009.
[15] E. J. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?,” Journal of the ACM, vol. 58, no. 3, pp. 1–37, 2011.
[16] X. M. Yuan and J. F. Yang, “Sparse and low-rank matrix decomposition via alternating direction method,” Pacific Journal of Optimization, vol. 9, no. 1, pp. 167–180, 2013.
[17] C. Y. Li, L. Zhu, W. Z. Bao, Y. L. Jiang, C. A. Yuan, and D. S. Huang, “Convex local sensitive low rank matrix approximation,” in Proceedings of the International Joint Conference on Neural Networks, Anchorage, AK, USA, May 2017.
[18] R. Tripathi, B. Mohan, and K. Rajawat, “Adaptive low rank matrix completion,” IEEE Transactions on Signal Processing, vol. 65, no. 14, pp. 3603–3616, 2017.
[19] F. Ong and M. Lustig, “Beyond low rank + sparse: multi-scale low rank matrix decomposition,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Shanghai, China, March 2016.
[20] J. Hou, L.-P. Chau, N. Magnenat-Thalmann, and Y. He, “Sparse low-rank matrix approximation for data compression,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 5, pp. 1043–1054, 2017.
[21] J.-F. Cai and S. Osher, “Fast singular value thresholding without singular value decomposition,” Methods and Applications of Analysis, vol. 20, no. 4, pp. 335–352, 2013.
[22] Z. Wen, W. Yin, and Y. Zhang, “Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm,” Mathematical Programming Computation, vol. 4, no. 4, pp. 333–361, 2012.

Journal

Advances in Meteorology, Hindawi Publishing Corporation

Published: Nov 3, 2019

References