Access the full text.
Sign up today, get DeepDyve free for 14 days.
Topol. Algebra Appl. 2022; 10:47–60 Research Article Open Access T. Diphofu, P. Kaelo*, and A.R. Tufa A convergent hybrid three-term conjugate gradient method with sucient descent property for unconstrained optimization https://doi.org/10.1515/taa-2022-0112 Received 16 November, 2021; accepted 22 May, 2022 Abstract: Conjugate gradient methods are very popular for solving large scale unconstrained optimization problems because of their simplicity to implement and low memory requirements. In this paper, we present a hybrid three-term conjugate gradient method with a direction that always satises the sucient descent condition. We establish global convergence of the new method under the weak Wolfe line search conditions. We also report some numerical results of the proposed method compared to relevant methods in the literature. Keywords: Conjugate gradient, Global convergence, Sucient descent, Weak Wolfe line search MSC: 90C06, 90C30, 65K05 1 Introduction Conjugate gradient (CG) methods have found wide applications in many practical areas such as health sci- ences [16], management sciences [6], engineering [30], portfolio selection and robotic motion [4, 8, 28], due to their simplicity to implement and low storage requirements. They are used to solve the optimization problem min f (x), x 2 R , (1) where f : R ! R is a continuous function and its gradient g(x) = rf (x) is available. Starting from an arbitrarily chosen x 2 R , CG methods produce an iterate format for solving (1) by generating a sequence fx g, given by x = x + α d , k = 0, 1, 2, . . . , (2) k+1 k k k where α is a positive step length computed by some line search rule. In classical CG methods, d is a search k k direction obtained by −g , k = 0, d = (3) −g + β d , k ≥ 1, k k k−1 where g = rf (x ) and β 2 R is the conjugate gradient parameter. k k k There are many CG parameters that have been proposed in the literature and each one proposed con- stitutes a new CG method. Traditional CG methods include the Hestenes and Stiefel (HS) [18], Fletcher and Reeves (FR) [14], Polak, Ribiére and Polyak (PRP) [26, 27], Dai and Yuan (DY) [7], Liu and Storey (LS) [22] and Conjugate Descent (CD) [15]. When the objective function f is a strictly convex quadratic function and α is exact, these formulas for β are equivalent [15]. However, for general nonlinear functions, researchers have T. Diphofu, A.R. Tufa: Department of Mathematics, University of Botswana, Private Bag UB00704, Gaborone, Botswana *Corresponding Author: P. Kaelo: Department of Mathematics, University of Botswana, Private Bag UB00704, Gaborone, Botswana, E-mail: kaelop@ub.ac.bw Open Access. © 2022 T. Diphofu et al., published by De Gruyter. This work is licensed under the Creative Commons Attribution 4.0 License. 48 Ë T. Diphofu, P. Kaelo, and A.R. Tufa over the years determined that the FR, DY and CD methods have good convergence properties whereas the PRP, HS and LS methods have great numerical properties [2]. One important factor to consider when constructing a new CG method is that it satises the descent con- dition d g < 0, 8k ≥ 0, k k or the sucient descent condition T 2 d g ≤ −ckg k , c > 0, (4) k k k T n where d g represents the inner product of the vectors d , g 2 R and kg k = g g is the Euclidean norm. k k k k k k k For the rest of this work, k · k represents the Euclidean norm. The sucient descent condition (4) is critical to the convergence of conjugate gradient methods. Thus, it is important to construct CG methods that satisfy (4) and have both good numerical and global convergence properties. As an attempt to construct a method with both good convergence properties and good numerical perfor- mance, Kou [20] proposed two new conjugate gradient methods. The rst method, called MCGOPT1, takes the form (2) and (3) with the parameter ( ) d g n1+ n1 k−1 k β = max β , η , (5) k k kd k k−1 where T * T * 2 g z d g kz k n1 k k−1 k−1 k k−1 β = − T * T * T * d z d z d z k−1 k−1 k−1 k−1 k−1 k−1 and z is dened as k−1 s z < k−1 k−1 z , if − 1 < ξ , ξ 2 (0, 1), k−1 T s y k−1 k−1 z = k−1 y , otherwise. k−1 The method is based on the modied secant condition B s = z , k k−1 k−1 where B is a symmetric matrix approximation of the Hessian of f (x). In the above method, s = x − k k−1 k x , y = g − g and z = y + λ s , where λ is a nonnegative scalar and θ = 6(f − f ) + k−1 k−1 k k−1 k−1 k−1 2 k−1 k k−1 k ks k k−1 3(g + g ) s , with f = f (x ). The second method, called MCGOPT2, is based on the modied secant k−1 k k−1 k k condition B s = w , w = y + ωkg k s , k k−1 k−1 k−1 k−1 k−1 k−1 n1 where ω and q are positive constants. It replaces β in (5) with T T 2 g w g d kw k k−1 k−1 n2 k k k−1 β = − . T T T d w d w d w k−1 k−1 k−1 k−1 k−1 k−1 Global convergence of these methods was established under the Wolfe line search rules f (x + α d ) ≤ f (x ) + δα g d (6) k k k k k k and T T g(x + α d ) d ≥ σg d , (7) k k k k k k A convergent hybrid three-term conjugate gradient method Ë 49 where 0 < δ < σ < 1. Aminifard and Babaie-Kafaki [1], on the other hand, suggested a direction MPRP T − λ g + β d , g d > 0, k > 0, M k k k−1 k−1 k k d = PRP T − g + β d , g d ≤ 0, k > 0, k k−1 k−1 k k where T T g d g y k−1 PRP PRP k−1 k k λ = 1 + β , β = k k 2 2 jjg jj kg k k k−1 and T 2 T g s g s jjy jj k−1 k−1 MPRP k PRP k−1 k β = 1 − β − t , k k 2 2 2 jjg jj jjg jj jjg jj k−1 k−1 k−1 with t ≥ 0 chosen as t = 0.1 and t = 0 in the numerical experiments. They then adaptively switched d with −g whenever g y ≤ 0, that is, they suggested k k−1 − g , g y ≤ 0, k > 0, k k−1 d = −g , d = 0 0 k M T d , g y > 0, k > 0. k−1 k k Global convergence of the above method was proved under the Wolfe line search conditions (6) and (7), and as well as under the backtracking Armijo-type line search rule in which the step length α is obtained by α = maxfρ , j = 0, 1, . . .g, satisfying 2 2 f (x + α d ) ≤ f (x ) − δα jjd jj , k k k k k k with the constants ρ, δ 2 (0, 1). Dong [12] further modied the direction of Aminifard and Babaie-Kafaki [1] by suggesting the modication −g , if g y ≤ 0 (β = 0), k k−1 k DPRP DPRP −λ g + β d , if k 2 K , (β = β ), k k k−1 k k k d = −g + η d , if k 2 K , (β = η ), > 2 k k k−1 k k g y : k−1 PRP −g + d , if k 2 K , (β = β ), 2 3 k k−1 k k kg k k−1 where T T g d g y −1 k−1 k−1 k k λ = 1 + , η = , k k T 2 jjd jj minfη, jjg jjg jg d j jjg jj k−1 k−1 k−1 k k−1 T 2 g d jjy jj DPRP PRP k−1 k−1 β = β − t , k k 2 4 minfjg d j , jjg jj g k−1 k−1 k−1 and t ≥ 1/4. The authors suggested the index sets K , K and K as 1 2 3 T T DPRP K = fk 2 Njg y > 0, g d > 0, β > η g, k k−1 k k−1 k k T T DPRP K = fk 2 Njg y > 0, g d > 0, β ≤ η g k k−1 k k−1 k k and T T K = fk 2 Njg y > 0, g d ≤ 0g. 3 k k−1 k k−1 The direction was shown to satisfy the sucient descent condition, and its global convergence was estab- lished under the strong Wolfe line search conditions (6) and T T jg(x + α d ) d j ≤ σ jg d j, (8) k k k k k k 50 Ë T. Diphofu, P. Kaelo, and A.R. Tufa where 0 < δ < σ < 1. Liu et al. [23], on the other hand, proposed a three-term conjugate gradient method where the direction is calculated by d = −g + β d + θ g k k k k−1 k k−1 and 2 T T kg k g s g d k−1 k−1 k−1 LS k k β = β − , θ = . k k k T 2 T (d g ) −d g k−1 k−1 k−1 k−1 This direction structure satises the sucient descent property under the strong Wolfe line search conditions (6) and (8) and has the global convergence property on general objective functions under some standard conditions. More information about other CG methods can be found in [2, 5, 9, 10, 13, 17, 19, 21, 24, 31, 32]. Motivated by both numerical performances and global convergence properties of three-term conjugate gradient methods, in this paper, we propose another three-term hybrid conjugate gradient method. As a three- term hybrid conjugate gradient method, it inherits the advantages of these other existing methods and there- fore performs better numerically. The method is suggested in Section 2 and the rest of the paper is organized as follows. We establish global convergence of the new method under the weak Wolfe line search in Section 3. In Section 4 we provide numerical comparisons of the new method against existing competing methods. The conclusion is presented in Section 5. 2 The Method Zhang et al. [34] proposed a three-term PRP conjugate gradient (TTPRP) method with a direction given as PRP (1) d = −g + β d − θ y , k k k k−1 k−1 (1) T 2 where θ = g d /kg k . Based on the idea of the TTPRP method, the same authors, in [35], proposed k k−1 k−1 two other three-term HS conjugate gradient methods. The rst is a modied three-term HS (MTTHS) method MHS M d = −g + β d − θ z , k k k−1 k−1 k k M T T MHS T T r where θ = g d /d z , β = g z /d z , z = y + tkg k s , with r ≥ 0 and t > 0 being k−1 k−1 k−1 k−1 k k k k k k k−1 k k−1 constants. The second suggested algorithm, called cautious TTHS (CTTHS) method takes the direction T r 2 −g , if s y < ϵ kg k ks k , k k−1 1 k−1 k−1 k−1 d = HS −g + β d − θ y , otherwise, k k−1 k k−1 T T HS T T where θ = g d /d y and β = g y /d y . These three-term conjugate gradient methods k k−1 k−1 k−1 k−1 k k−1 k k−1 mimic the three-term form of the L-BFGS method [25] in which ! ! T 2 T T g y ky k g d g d k−1 k−1 k−1 k−1 k k k d = −g + − 1 + d + y k k k−1 k−1 T T T T d y d y d y d y k−1 k−1 k−1 k−1 k−1 k−1 k−1 k−1 and were both shown to satisfy the descent condition T 2 d g = −kg k . (9) k k k In Yuan et al. [33], a three-term PRP conjugate gradient method that incorporates function values in the computation of the β parameter is suggested. Their suggested direction is given as g d BPRP k k−1 BPRP d = −g + β g + β d , k k k k k k−1 kg k k A convergent hybrid three-term conjugate gradient method Ë 51 where n o kg k T m 2 T min jg y j, u kg k − jg g j k k−1 k k−1 kg k k BPRP k−1 β = , u , u > 0, (10) 1 2 u kd kky k + kg k 2 k−1 k−1 k−1 with maxfρ , 0g m k−1 y = y + s k−1 k−1 k−1 ks k k−1 and ρ = 2[f (x ) − f (x )] + (g(x ) + g(x )) s . k−1 k−1 k k k−1 k−1 BPRP Note from (10) that by denition, β ≥ 0. The above direction was also shown to satisfy the descent con- dition (9). From Zhang et al. [34, 35], it is clear that constructing d based on the three-term form of the L-BFGS method has added some computational advantages to performances of three-term conjugate gradient meth- ods. Therefore, in this paper, motivated by the works of Zhang et al. [34, 35] and Yuan et al. [33], we construct a new three-term conjugate gradient method with a direction d given by −g + β d − λ y , β ≥ 0, k k k−1 k k−1 k d = (11) g d BPRP k−1 BPRP : k −g + β g + β d , otherwise, k 2 k k−1 k k kg k where T 2 T g y ky k g d k−1 k−1 k−1 k k β = − μ (12) m m and g d k k−1 λ = μ , μ 2 (1, 3). (13) We compute m as 2 T T m = maxfϕkd kky k, kg k ,−d g , d y g, (14) k k−1 k−1 k−1 k−1 k−1 k−1 k−1 where ϕ > 0 is a constant, to obtain a hybrid three-term conjugate gradient method which we denote as ZHYB method. We present our algorithm below. Algorithm 1 A new three-term conjugate gradient method 1: Let k = 0. Set 0 < δ < σ < 1, ϕ > 0, μ 2 (1, 3), u , u > 0 and ϵ > 0. Select a starting point x 2 IR and 1 2 0 set d = −g . 0 0 2: If kg k ≤ ϵ, then stop. 3: for k = 0, 1, . . . do 4: Compute the steplength α by (6) and (7). 5: Set x = x + α d . k+1 k k k 6: Set k = k + 1. 7: If kg k ≤ ϵ, then stop. BPRP 8: Find β by (12) and λ by (13) or β by (10), and compute d by (11). k k k 9: end for 52 Ë T. Diphofu, P. Kaelo, and A.R. Tufa 3 Global Convergence In this section, we study the global convergence of Algorithm 1 under the weak Wolfe line search (6) and (7). Firstly, we state the following mild assumptions, which are essential in proving the global convergence of conjugate gradient methods. Assumption 3.1. The level set Ω = fx 2 IR : f (x) ≤ f (x )g, where x is the starting point, is bounded. Assumption 3.2. In some neighbourhood N of Ω, the objective function f (x) is continuously dierentiable and its gradient is Lipschitz continuous, i.e. there exists a constant L > 0 such that kg(x) − g(y)k ≤ Lkx − yk, 8x, y 2 N. The above assumptions imply that there exist positive constants M and κ such that kxk ≤ M and kg(x)k ≤ κ, 8x 2 Ω. (15) Lemma 3.1. Suppose that the sequences fd g and fg g are generated by Algorithm 1. Then k k T 2 g d ≤ −ckg k , c > 0, (16) k k k for all k ≥ 0. Proof. Firstly, for k = 0 or β ≤ 0, d satises (9) and hence satises (16) for c = 1. For β > 0, from (11) we have k k k T T 2 T 2 T T g y g d μky k (g d ) μg d g y T 2 k−1 k−1 k−1 k−1 k−1 k−1 k k k k k g d = −kg k + − − k k k m m m k k T T 2 T 2 g d g y μky k (g d ) 2 k k−1 k k−1 k−1 k k−1 = −kg k + 1 − μ − ( ) ! ! T T 2 T 2 (1 − μ)g y g d μky k (g d ) k−1 k−1 k−1 k−1 2 k k k = −kg k + 2 − 2 m 2 2 2 T 2 2 T 2 (1 − μ) kg k ky k (g d ) μky k (g d ) 2 k k−1 k k−1 k−1 k k−1 ≤ −kg k + + − 2 2 m m k k 2 T 2 ky k (g d ) (1 − μ) 2 2 k−1 k−1 = −kg k + kg k − (μ − 1) k k (1 − μ) ≤ − 1 − kg k , since μ 2 (1, 3). We get the result with the rst inequality above coming from the fact that T 2 2 u v ≤ kuk + kvk , where y g d (1 − μ)g k−1 k−1 k k u = and v = . 2 m Letting c = (1 − (1 − μ) /4) gives the result (16). 2 A convergent hybrid three-term conjugate gradient method Ë 53 Lemma 3.2. Suppose that Assumptions 3.1 and 3.2 hold. If the step length α satises (6) and (7), then the inequality (1 − σ)jg d j k k α ≥ , (17) Lkd k holds. Proof. By subtracting g d from both sides of (7), we get T T 2 (σ − 1)g d ≤ (g − g ) d ≤ α Lkd k . k k k+1 k k k k Since the search direction d is a descent direction and σ < 1, (17) follows immediately. 2 The Zoutendijk condition [36] is very useful for the proof of the global convergence of nonlinear conjugate gradient algorithms. The following lemma gives this condition. Lemma 3.3 (Zoutendijk Condition). Suppose that Assumptions 3.1 and 3.2 hold. Let the sequence fx g be generated by (2), where d is a descent direction. If α is obtained by the Wolfe line search conditions (6) and k k (7), then T 2 (g d ) < +∞. (18) kd k k=0 Proof. For kg k ≠ 0, we have from (6) and Lemma 3.2 that T 2 (1 − σ)(g d ) T k f − f ≥ −δα g d ≥ δ . (19) k k+1 k k k Lkd k By summing up both sides of (19) and using the bounded below assumption on f (x), we have ∞ ∞ T 2 X X (g d ) δ(1 − σ) ≤ (f − f ) = f − lim f < +∞. k k+1 0 k L kd k k!∞ k=0 k=0 Thus, the Zoutendijk condition (18) holds immediately. 2 Theorem 3.1. Suppose that Assumptions 3.1 and 3.2 hold and Algorithm 1 generates an innite sequence fx g. Then lim inf kg k = 0. (20) k!∞ Proof. Suppose that the desired result does not hold, that is, lim inf kg k ≠ 0. k!∞ Then, there exists a constant c ¯ > 0 such that kg k ≥ c, (21) holds for all k ≥ 0. Now, because the parameter m is calculated as (14), we always have that m ≥ ϕkd kky k > 0. k k−1 k−1 54 Ë T. Diphofu, P. Kaelo, and A.R. Tufa By (12) and Cauchy-Schwarz, we have that T 2 T g y ky k g d k−1 k−1 k−1 k k jβ j = − μ kg kky k ky k kg kkd k k k−1 k−1 k k−1 ≤ + μ 2 2 2 ϕkd kky k ϕ kd k ky k k−1 k−1 k−1 k−1 kg k = V , (22) kd k k−1 ϕ+μ where V = . From the descent condition (16), it follows that kd k ≠ 0 holds for k ≥ 1. By (11), (13), 2 k−1 (15), (22) and Cauchy-Schwarz, we get kd k ≤ kg k + jβ jkd k + jλ jky k k k k k−1 k k−1 g d kg k k−1 k k ≤ kg k + V kd k + μ ky k k k−1 k−1 kd k m k−1 k kg k μkg kkd k k k k−1 ≤ kg k + V kd k + ky k k k−1 k−1 kd k ϕkd kky k k−1 k−1 k−1 μkg k = kg k + Vkg k + k k ≤ 1 + V + κ. Similarly, from (10), Cauchy-Schwarz and triangle inequality, we obtain that kg k 2 k T u kg k − jg g j k k−1 kg k k BPRP k−1 jβ j ≤ u kd kky k + kg k 2 k−1 k−1 k−1 kg k u g g − g 1 k k−1 kg k k−1 u kd kky k + kg k 2 k−1 k−1 k−1 kg k u kg k g − g + g − g 1 k k k−1 k−1 k−1 kg k k−1 u kd kky k + kg k 2 k−1 k−1 k−1 2u kg kkg − g k 1 k k k−1 u kd kky k + kg k 2 k−1 k−1 k−1 2u kg kky k 1 k k−1 u kd kky k 2 k−1 k−1 2u kg k 1 k = , u kd k 2 k−1 and by equation (11) and (15), we have kg kkd k BPRP k k−1 BPRP kd k ≤ kg k + jβ j kg k + jβ jkd k k k k k k k−1 kg k 2u 2u 1 1 ≤ kg k + kg k + kg k k k k u u 2 2 4u ≤ (1 + )κ. n o 4u Letting ω ¯ = max 1 + V + , 1 + κ, gives that kd k ≤ ω ¯ and this inequality implies that ϕ 2 = ∞. (23) kd k k=0 Now, by (18), (21) and Lemma 3.1, we obtain that ∞ ∞ ∞ | X X X (g d ) 1 kg k k 2 4 2 k k c (c ¯) ≤ c ≤ < +∞, 2 2 2 kd k kd k kd k k k k k=0 k=0 k=0 which contradicts (23), therefore (20) holds. 2 A convergent hybrid three-term conjugate gradient method Ë 55 4 Numerical Experiments In this section, we perform experiments using our proposed ZHYB method as presented in Algorithm 1. We make comparisons with other three-term conjugate gradient methods, being ACG method by Dong et al. [13], DPRP method by Dong [12], BZAU method by Baluch et al. [5] and GSTDLCG method by Yao et al. [32]. A total of 72 problems are used in the numerical experiments and their dimensions range from 2 to 12 000. The problems were taken from [3], except for Problems 1, 22, 23 and 24 which were taken from [29]. These problems, along with the starting points and dimensions (DIM), are listed in Table 1. The algorithms are written in MATLAB 2019a on a LENOVO laptop, with processor Intel (R) Celeron (R) N4000 CPU @ 1.10GHz and 4GB RAM. All methods use the Wolfe line search rules (6) and (7) with δ = 0.0001 and σ = 0.8. The parameters for our proposed method are set to μ = 1.005, u = 0.001, u = 1 and ϕ = 0.05. For the other methods, the 1 2 −5 parameters were set as in the respective papers. We stopped the iterations when the inequality kg k ≤ 10 is satised or when the maximum number of iterations exceeds 5000. Table 1: Table of test functions, dimensions and starting points. FN Objective Function Dim Starting Point FN Objective Function Dim Starting Point 1 Beale 2 [0;0] 37 DQDRTIC 100 [3;. . .;3] 2 Almost Perturbed Quadratic 100 [0.5;. . .;0.5] 38 DQDRTIC 1000 [3;. . .;3] 3 Almost Perturbed Quadratic 1000 [0.5;. . .;0.5] 39 DQDRTIC 5000 [3;. . .;3] 4 Almost Perturbed Quadratic 5000 [0.5;. . .;0.5] 40 DQDRTIC 10000 [3;. . .;3] 5 Almost Perturbed Quadratic 10000 [0.5;. . .;0.5] 41 Ext Himmelblau 100 [2;. . .;2] 6 Gen Rosenbrock 100 [1.2;1;. . .;1.2;1] 42 Ext Himmelblau 1000 [2;. . .;2] 7 Gen Rosenbrock 1000 [1.2;1;. . .;1.2;1] 43 Ext Himmelblau 5000 [2;. . . ;2] 8 Gen Rosenbrock 5000 [1.2;1;. . .;1.2;1] 44 Ext Himmelblau 10000 [2;. . .;2] 9 Gen Rosenbrock 10000 [1.2;1;. . .;1.2;1] 45 Ext White & Holst 100 [-1.2;1;. . .;-1.2;1] 10 Ext Wood 120 [-3;-1;. . .;-3;-1] 46 Ext White & Holst 1000 [-1.2;1;. . .;-1.2;1] 11 Ext Wood 1200 [-3;-1;. . .;-3;-1] 47 Ext White & Holst 5000 [-1.2;1;. . .;-1.2;1] 12 Ext Wood 6000 [-3;-1;. . .;-3;-1] 48 Ext White & Holst 10000 [-1.2;1;. . .;-1.2;1] 13 Ext Wood 12000 [-3;-1;. . .;-3;-1] 49 Gen PSC1 100 [3;0.1;. . .3;0.1] 14 Gen Tridiagonal 1 100 [2;. . .;2] 50 Gen PSC1 1000 [3;0.1;. . .3;0.1] 15 Gen White & Holst 10 [-1.2;1;. . .;-1.2;1] 51 Gen PSC1 5000 [3;0.1;. . .;3;0.1] 16 Gen White & Holst 50 [-1.2;1;. . .;-1.2;1] 52 Gen PSC1 10000 [3;0.1;. . .;3;0.1] 17 Gen White & Holst 100 [-1.2;1;. . .;-1.2;1] 53 SINCOS 100 [3;0.1;. . .;3;0.1] 18 Ext Freudenstein & Roth 100 [4;. . .;4] 54 SINCOS 1000 [3;0.1;. . .;3;0.1] 19 Ext Freudenstein & Roth 1000 [4;. . .;4] 55 SINCOS 5000 [3;0.1;. . .;3;0.1] 20 Ext Freudenstein & Roth 5000 [4;. . .;4] 56 SINCOS 10000 [3;0.1;. . .;3;0.1] 21 Ext Freudenstein & Roth 10000 [4;. . .;4] 57 QUARTC 100 [2;. . .;2] 22 Styblinski and Tang 100 [-1;. . .;-1] 58 QUARTC 1000 [2;. . . ;2] 23 Styblinski and Tang 1000 [-1;. . .;-1] 59 QUARTC 5000 [2;. . . ;2] 24 Styblinski and Tang 5000 [-1;. . .;-1] 60 QUARTC 10000 [2;. . .;2] 25 Ext Maratos 200 [1.1;0.1;. . .;1.1;0.1] 61 Diagonal 4 100 [1;. . .;1] 26 ENGVAL1 2 [1.1;0.1] 62 Diagonal 4 1000 [1;. . . ;1] 27 ENGVAL1 100 [1.1;0.1;. . .;1.1;0.1] 63 Diagonal 4 5000 [1;. . . ;1] 28 ENGVAL1 1000 [1.1;0.1;. . .;1.1;0.1] 64 Diagonal 4 10000 [1;. . . ;1] 29 DIXMAANA 3 [0.5;0.5;0.5] 65 Ext Rosenbrock 100 [1.2;1;. . .;1.2;1] 30 DIXMAANA 30 [0.5;. . .;0.5] 66 Ext Rosenbrock 1000 [1.2;1;. . . ;1.2;1] 31 DIXMAANA 60 [0.5;. . .;0.5] 67 Ext Rosenbrock 5000 [1.2;1;. . . ;1.2;1] 32 DIXMAANA 90 [0.5;. . .;0.5] 68 Ext Rosenbrock 10000 [1.2;1;. . . ;1.2;1] 33 DIXMAANC 3 [0.5;0.5;0.5] 69 Ext Beale 100 [1;0.8;. . . ;1;0.8] 34 DIXMAANC 30 [0.5;. . .;0.5] 70 Ext Beale 1000 [1;0.8;. . . ;1;0.8] 35 DIXMAANC 60 [0.5;. . .;0.5] 71 Ext Beale 5000 [1;0.8;. . . ;1;0.8] 36 DIXMAANC 90 [0.5;. . .;0.5] 72 Ext Beale 10000 [1;0.8;. . . ;1;0.8] Numerical results of the methods are presented in Table 2, where we give the number of iterations (NI), number of function evaluations (FE) and CPU time in seconds (CPU). In the event that a method fails to nd the solution within 5000 iterations, an entry of ‘–’ is made in the table. 56 Ë T. Diphofu, P. Kaelo, and A.R. Tufa Table 2: Table of number of iterations, function evaluations and CPU time. FN ZHYB ACG GSTDLCG BZAU DPRP NI FE CPU NI FE CPU NI FE CPU NI FE CPU NI FE CPU 1 24 792 0.04198763 17 578 0.02939466 15 477 0.02494536 10 331 0.01703801 33 993 0.0501483 2 112 1435 0.05352352 94 1206 0.0339095 192 2489 0.07427658 113 1453 0.04112756 97 1249 0.03494377 3 428 5938 0.37227068 312 4312 0.24952317 462 6436 0.40016242 516 7144 0.39830146 377 5257 0.30090539 4 1198 17407 3.99301304 730 10596 2.13607866 927 13489 3.09739674 863 12529 2.48967995 831 12070 2.54899667 5 1410 21007 8.65417342 1127 16791 6.05113358 1413 21138 8.7544848 1544 23003 8.24196246 - - - 6 115 2956 0.08208318 70 1839 0.0441635 88 2201 0.05698695 86 2163 0.06233047 187 4638 0.11136869 7 195 4685 0.31134834 86 2196 0.14776951 120 2982 0.20263759 72 1825 0.1246712 98 2479 0.16478766 8 94 2333 0.63079207 60 1605 0.43067379 118 3128 0.85449834 97 2452 0.67834825 135 3220 0.87590453 9 142 3569 1.80035392 105 2724 1.36852144 84 2134 1.09789236 109 2663 1.38822327 140 3307 1.69148622 10 95 2642 0.08160246 137 3995 0.10030091 121 3478 0.08889677 155 4164 0.12127704 172 4793 0.11710188 11 85 2376 0.16474109 148 4221 0.29822833 137 3957 0.27905456 128 3592 0.25354538 168 4441 0.31090081 12 115 3038 0.85419759 108 3188 0.90926876 139 3751 1.06978425 151 4126 1.15667803 120 3394 0.97435806 13 114 3013 1.58339604 114 3277 1.74631094 171 4486 2.393389 163 4257 2.25713123 127 3365 1.81881627 14 21 487 0.02918041 26 584 0.02173727 21 483 0.018408 21 483 0.02907186 26 614 0.0221574 15 426 16678 0.46358747 253 10447 0.27975412 189 7860 0.21075087 349 13550 0.37968853 306 12757 0.34291525 16 1232 53039 1.6172075 909 39323 1.22365742 829 35023 1.04740834 1000 42598 1.3316293 1328 57888 1.74317203 17 2444 106196 3.69872752 1560 68311 2.45851439 1533 67138 2.33840272 1856 81546 2.84811128 3762 163855 5.66069974 18 8 279 0.02113204 9 331 0.01029836 9 334 0.01077974 13 478 0.02766745 7 252 0.00775737 19 8 279 0.02600145 9 331 0.03289277 9 334 0.03204986 13 478 0.04425719 7 252 0.023094 20 8 279 0.11296663 9 331 0.13372473 9 334 0.13825182 13 478 0.18837894 7 252 0.10041977 21 8 279 0.20829205 9 331 0.25694782 9 334 0.25154172 13 478 0.35278752 7 252 0.18725329 22 6 162 0.01659237 6 164 0.00815499 10 270 0.01375509 6 162 0.01625105 6 172 0.00851321 23 6 162 0.04533378 6 164 0.0456914 6 170 0.04749111 6 162 0.04527557 6 168 0.0467164 24 6 162 0.21114655 - - - 6 170 0.22226692 6 162 0.21129416 7 191 0.25243628 25 31 840 0.03081591 58 1595 0.06245032 38 1056 0.02759163 32 968 0.03317551 18 490 0.0125341 26 9 212 0.01543759 10 237 0.01829543 9 208 0.00747927 8 198 0.01528267 10 248 0.00844932 27 18 419 0.01785971 19 449 0.01851593 18 426 0.01525154 17 397 0.01712409 24 576 0.0206381 28 18 423 0.04135098 19 445 0.04412741 19 444 0.04386156 19 427 0.04084151 20 473 0.04526796 29 15 337 0.06211817 14 361 0.14116162 13 378 0.03144102 16 354 0.09606715 15 382 0.03096692 30 15 461 0.28016924 21 562 0.32892642 14 398 0.23835872 26 623 0.36993313 21 577 0.33897482 31 18 517 0.65519643 24 569 0.70744461 24 753 0.95212087 25 437 0.54210439 27 723 0.88096134 32 18 423 0.86031482 4 83 0.27982762 32 894 1.91800975 25 575 1.15473628 18 555 1.09607443 33 11 337 0.05902661 23 348 0.1021824 14 405 0.03929573 16 277 0.08895967 14 290 0.02286457 34 19 505 0.29984555 20 468 0.28943503 20 658 0.38382165 21 557 0.33203784 23 721 0.4197444 35 24 600 0.75212888 21 507 0.63004204 22 588 0.73945011 27 475 0.59865697 20 505 0.63350738 36 21 644 1.28972499 23 532 1.07978531 18 517 1.04968315 - - - 17 490 0.99706371 37 54 718 0.03184687 72 950 0.12157037 72 958 0.03125065 122 1596 0.05647908 68 910 0.0273036 38 41 538 0.04485604 38 501 0.05176858 52 682 0.05983484 31 406 0.03330794 37 492 0.04106598 39 38 496 0.17498018 27 355 0.12403952 41 537 0.18943142 45 589 0.20180991 27 356 0.12524595 40 18 238 0.15369916 23 303 0.19763342 35 460 0.29916229 38 497 0.31673598 22 292 0.18897752 41 7 185 0.01421891 10 267 0.03673124 15 360 0.00901148 8 207 0.01513832 8 203 0.00465787 42 7 185 0.01055195 11 282 0.01634773 17 389 0.02361578 9 222 0.01241359 8 203 0.01135812 43 7 185 0.03827432 11 282 0.05738622 17 389 0.08208362 9 222 0.04465315 9 216 0.04478889 44 7 185 0.07027216 11 282 0.10568085 17 389 0.14969106 9 222 0.08270206 10 232 0.08966916 45 20 635 0.0296363 23 791 0.03651968 34 1091 0.04023498 15 577 0.02856499 27 883 0.02871364 46 20 635 0.09307843 39 1269 0.19848627 32 1048 0.15656149 14 542 0.08186411 21 691 0.1032319 47 20 635 0.43045306 38 1082 0.7420775 33 1042 0.71654283 15 575 0.39499172 37 1164 0.79820598 48 20 635 0.83446963 34 1044 1.39068711 34 1030 1.37962 17 612 0.81670933 30 682 0.93987845 49 17 291 0.03855043 16 257 0.01568126 55 387 0.02689767 17 311 0.01843346 16 285 0.01714294 50 21 293 0.08344521 21 347 0.09771146 808 1011 0.48173058 19 296 0.08565512 20 318 0.09408721 51 25 367 0.45925671 24 383 0.48789516 272 572 1.01945317 30 449 0.55426662 33 644 0.80336502 52 25 366 0.89303407 34 535 1.23337047 236 755 2.18914744 22 306 0.79281804 40 628 1.46525806 53 14 346 0.02942276 10 227 0.1200726 14 309 0.19888124 10 262 0.04456476 18 421 0.01873046 54 14 346 0.06049062 14 296 0.04800858 15 324 0.07834017 11 278 0.04803374 17 447 0.07907118 55 14 346 0.26942925 14 296 0.20955272 15 324 0.24090633 11 278 0.20666278 19 467 0.3446024 56 14 346 0.51829193 14 296 0.40203233 15 324 0.45949105 11 278 0.39733738 - - - 57 3 45 0.0093485 3 44 0.00949584 14 42 0.01116412 3 44 0.01015302 7 204 0.00963754 58 3 45 0.01222188 3 44 0.01223315 4 65 0.01804746 3 44 0.01200074 8 131 0.03565426 59 3 45 0.05660614 3 44 0.05702933 16 44 0.07243747 4 56 0.07258645 6 106 0.13533739 60 4 56 0.14085221 3 44 0.11170671 4 58 0.14656838 4 56 0.1425561 10 152 0.3806887 61 58 738 0.03152077 31 331 0.01918833 27 293 0.02147903 25 264 0.01699696 32 389 0.01196796 62 26 317 0.01988654 20 218 0.01393685 22 258 0.01690658 25 296 0.01855003 27 324 0.01952272 63 20 210 0.04476963 5 55 0.01203285 22 224 0.04914581 12 142 0.02893148 6 69 0.01411762 64 18 213 0.07828908 19 206 0.07940854 18 194 0.07441067 18 205 0.07442556 25 269 0.09853695 65 19 492 0.02005009 46 1047 0.02425703 - - - 30 850 0.01897311 33 931 0.02062053 66 19 492 0.02453381 30 804 0.0399171 - - - 31 868 0.04224355 38 998 0.04809128 67 19 492 0.08393718 31 805 0.13806889 - - - 30 846 0.14101411 36 891 0.14876253 68 20 510 0.15806468 31 805 0.25085342 - - - 40 972 0.29960321 30 862 0.2596626 69 24 733 0.03927515 25 795 0.04066703 17 543 0.03218858 12 511 0.03031934 23 851 0.03013394 70 50 1246 0.22126513 26 802 0.14564866 17 543 0.10382585 13 528 0.09112489 52 1554 0.27283237 71 40 1020 0.85714513 23 690 0.58027811 17 543 0.4576781 13 528 0.42098957 50 1487 1.23398922 72 17 605 0.959732269 33 883 1.46015817 17 543 0.87022166 13 528 0.82553474 51 1427 2.33414678 A convergent hybrid three-term conjugate gradient method Ë 57 Figure 1: Number of iterations performance proles Figure 2: Number of function evaluations performance proles Table 2 shows that the proposed ZHYB method required the least number of iterations for 29% of the problems, followed by the BZAU method with 27%, ACG with 22%, and lastly, DPRP and GSTDLCG each with 12%. The BZAU method has the least function evaluations for 33% of the test problems, followed by ZHYB with 27%, ACG with 21%, GSTDLCG with 12% and lastly, DPRP with 8%. In terms of CPU time, the new ZHYB method required the least time to solve 26% of the problems, followed by BZAU with 22%, ACG with 21%, DPRP with 18% and lastly, GSTDLCG with 13%. We further present the results in Table 2, together with those of number of gradient evaluations, in graphs using the performance proles of Dolan and More ´ [11]. These are given in Figures 1–4, where Figure 1 shows the performance proles based on number of iterations. Figures 2 and 3 present the performance proles of the methods based on number of function evaluations and gradient evaluations, respectively. The per- formance proles of the methods based on CPU time are given in Figure 4. When preparing these graphs, a higher value is assigned for a method with an entry of ‘–’ in Table 2. We observe from these gures that the graph of ZHYB is above the graphs of these other methods, mean- ing it has the best performance. Furthermore, one should notice that, generally, ZHYB has either the best or second best performance more frequently than the other methods from Table 2, hence having higher perfor- mance proles than the other methods in Figures 1-4. 58 Ë T. Diphofu, P. Kaelo, and A.R. Tufa Figure 3: Number of gradient evaluations performance proles Figure 4: CPU time performance proles 5 Conclusion In this paper, we proposed a new three-term conjugate gradient method which satises the sucient de- scent condition. We also established its global convergence under the weak Wolfe line search. Numerical experiments, compared with some existing three-term conjugate gradient methods, show that our proposed method is ecient and competitive. Conict of interest: The authors declare no conict of interest. References [1] Z. Aminifard and S. Babaei-Kafaki, A modied descent Polak-Ribiere ` -Polyak conjugate gradient method with global conver- gence property for nonconvex functions, Calcolo, 56:16 (2019). A convergent hybrid three-term conjugate gradient method Ë 59 [2] S. Babaie-Kafaki, A quadractic hybridization of Polak-Ribiere ` -Polyak and Fletcher-Reeves conjugate gradient methods, J. Optim. Theory Appl., 154(3), (2012) 916–932. [3] N. Andrei, An Unconstrained Optimization Test Functions Collection, Adv. Model. Optim., 10 (2008), 147–161. [4] A.M. Awwal, I.M. Sulaiman, M. Malik, M. Mamat, P. Kumam and K. Sitthithakerngkiet, A spectral RMIL+ conjugate gradient method for unconstrained optimization with applications in portolio selection and motion control, IEEE Access, 9, 2021, DOI:10.1109/ACCESS.2021.3081570 [5] B. Baluch, Z. Salleh, A. Alhawarat and U.A.M. Roslan, A new modied three-term conjugate gradient method with sucient descent property and its global convergence, J. Math., 2017:2715854 (2017). [6] D. Dabhi and K. Pandya, Enhanced Velocity Dierential Evolutionary Particle Swarm Optimization for Optimal Scheduling of a Distributed Energy Resources With Uncertain Scenarios, IEEE Access, 8 (2020), 27001–27017. [7] Y.H. Dai and Y. Yuan, A nonlinear conjugate gradient method with a strong global convergence property, SIAM J. Optim., 10 (1999), 177–182. [8] J. Deepho, A.B. Abubakar, M. Malik and I.K. Argyros, Solving unconstrained optimization problems via hybrid CD-HY conju- gate gradient methods with applications, J. Comput. Appl. Math., 405:113823 (2022). [9] S. Delladji , M. Bellou and B. Sellami, New hybrid conjugate gradient method as a convex combination of FR and BA methods, J. Inf. Optim. Sci., 42(3) (2021), 591–602. [10] T. Diphofu and P. Kaelo, Another three term conjugate gradient method close to the memoryless BFGS for large scale uncon- strained optimization, Mediterr. J. Math., 18(5): 211 (2021). [11] E.D. Dolan and J. More, ´ Benchmarking optimization software with performance proles, Math. Program., 91 (2002), 201–213. [12] X. Dong, A modied nonlinear Polak-Ribiere ` -Polyak conjugate gradient method with sucient descent property, Calcolo, 57:30 (2020). [13] X. Dong, Z. Liu, H. Liu and X. Li, An ecient three-term extension of the Hestenes-Stiefel conjugate gradient method, Optim. Methods Softw., 34 3 (2019), 546–559. [14] R. Fletcher and C.M Reeves, Function minimization by conjugate gradient, Comput. J., 7 (1964), 149–154. [15] R. Fetcher, Practical methods of Optimization vol.1 : Unconstrained Optimization, John Wiley & Sons, New York, (1987). [16] P. Gao, K. Cheng, E. Schuler, M. Jia, W. Zhao and L. Xing, Restarted primal-dual Newton conjugate gradient method for en- hanced spatial resolution of reconstructed cone-beam x-ray luminescence computed tomography images, Phys. Med. Biol., 65(13):135008 (2020). [17] A. Hamdi, B. Sellami and M. Bellou, New hybrid conjugate gradient method as a convex combination of HZ and CD methods, Asian Eur. J Math., 14(10):2150187 (2021). [18] M. Hestenes and E. Steifel, Method of conjugate gradients for solving linear systems, J. Res. Nat. Bur. Stan. Sect. B., 49 (1952), 409–436. [19] P. Kaelo, P. Mtagulwa and M.V. Thuto, A globally convergent hybrid conjugate gradient method with strong Wolfe conditions for unconstrained optimization, Math. Sci. (Springer), 14 (2020), 1–9. [20] C. X. Kou, An improved nonlinear conjugate gradient method with an optimal property, Sci. China Math., 57 (2014), 635–648. [21] M. Li, A three-term Polak-Ribiere ` -Polyak conjugate gradient method close to the memoryless BFGS quasi-Newton method, J. Ind. Manag. Optim., 16 (2020), 245–260. [22] Y. Liu and C. Storey, Ecient generalized conjugate gradient algorithms, Part 1: Theory, J. Optim, Theory Appl., 69 (1991), 129–137. [23] J.K. Liu, Y.X. Zhao and X.L. Wu, Some three-term conjugate gradient methods with the new direction structure, Appl. Numer. Math., 150 (2019), 433–443. [24] P. Mtagulwa and P. Kaelo, An ecient mixed conjugate gradient method for solving unconstrained optimisation problems, East Asian J. Appl. Math., 11(2) (2021), 421–434. [25] J. Nocedal, Updating quasi-Newton matrices with limited storage, Math. Comput., 35, (1980), 773–782. [26] E. Polak and G. Ribiere, ` Note sur la covergence de directions conjugtextees, Rev. Fren. Inf. Rech. Oper., 3e Année, 16 (1969), 35–43. [27] T. Polyak, The conjugate gradient method in extreme problems, USSR Comp. Math. Math. Phys., 9 (1969), 94–112. [28] I.M. Sulaiman, M. Malik, A.M. Awwal, P. Kumam, M. Mamat and S. Al-Ahmad, On three-term conjugate gradient method for optimization problems with applications on COVID-19 model and robotic motion control, Adv. Cont. Discr. Mod., 2022:1 (2022). [29] S. Surjanovic and D. Bingham, Virtual Library of Simulation Experiments: Test Functions and Databases. Retrieved February 25, 2021, from http:/www.sfu.ca/~ssurjano. [30] Z. Wang, G. He, W. Du, J. Zhou, X. Han, J. Wang, H. He, X. Guo, J. Wang and Y. Kou, Application of Parameter Optimized Variational Mode Decomposition Method in Fault Diagnosis of Gearbox, IEEE Access, 7 (2019), 44871–44882. [31] T. G. Woldu, H. Zhang, X. Zhang and Y. H. Fissuh, A Modied Nonlinear Conjugate Gradient Algorithm for Large-Scale Nons- mooth Convex Optimization, J. Optim. Theory Appl., 185 (2020), 223–238. [32] S. Yao, Q. Feng, L. Li and J. Xu, A class of globally convergent three-term Dai-Liao conjugate gradient methods, Appl. Numer. Math., 151 (2020), 354–366. [33] G. Yuan, X. Duan, W. Liu, X. Wang, Z. Cui and Z. Sheng, Two new PRP conjugate gradient algorithms for minimization opti- mization models, PLoS ONE, 10(10):e0140071 (2015). 60 Ë T. Diphofu, P. Kaelo, and A.R. Tufa [34] L. Zhang, W. Zhou and D. H. Li, A descent modied Polak-Ribiére-Polyak conjugate gradient method and its global conver- gence, IMA J. Numer. Anal., 26 (2006), 629–640. [35] L. Zhang, W. Zhou and D. H. Li, Some descent three-term conjugate gradient methods and their global convergence, Optim. Methods Softw., 22 (2007), 697–711. [36] G. Zoutendijk, Nonlinear programming, computational methods. In: J. Abadie (Ed.), Integer and Nonlinear Programming, North-Holland, Amsterdam (1970) 37–86.
Topological Algebra and its Applications – de Gruyter
Published: Jan 1, 2022
Keywords: Conjugate gradient; Global convergence; Sufficient descent; Weak Wolfe line search; 90C06; 90C30; 65K05
You can share this free article with as many people as you like with the url below! We hope you enjoy this feature!
Read and print from thousands of top scholarly journals.
Already have an account? Log in
Bookmark this article. You can see your Bookmarks on your DeepDyve Library.
To save an article, log in first, or sign up for a DeepDyve account if you don’t already have one.
Copy and paste the desired citation format or use the link below to download a file formatted for EndNote
Access the full text.
Sign up today, get DeepDyve free for 14 days.
All DeepDyve websites use cookies to improve your online experience. They were placed on your computer when you launched this website. You can change your cookie settings through your browser.