1 Introduction

Rapid innovation in wireless communication has created many challenges for transferring data securely. These advances have brought a significant increase in the transmission of images and videos over public networks, and social media has made mass sharing of videos on the Internet routine. This growth in video sharing creates a high demand for data security, because unencrypted data are vulnerable to attacks. Data therefore need to be transmitted securely over any network, and encryption ensures this [1,2,3]. Various algorithms are used to encrypt and decrypt data, but most of them focus on image and text data. Because of the large input size and tight time constraints, less progress has been made in video encryption [4,5,6,7,8,9,10,11,12,13]. The algorithms that produced good results in the past are explained in the following paragraphs; however, in this era of social media, better encryption techniques are required.

In 1998, Shi and Bhargava [6] proposed an efficient MPEG video encryption algorithm that uses a secret key to randomly change the sign bits of the encoded differential values of the motion vectors of B and P pictures. The approach has three major steps: it encrypts the motion vectors of B and P frames, then the DC coefficient of the I frame, and then all coefficients of the motion vectors and the I frame. Experimental results show that this adds overhead to the MPEG codec and can be used only for securing video email applications.

Chiaraluce et al. [14] proposed a chaotic algorithm for video encryption that uses three different chaotic functions. Two keys act as inputs for two of the chaotic functions, and the XOR of their outputs acts as the input for the third function.
The output of this third chaotic function is XORed with the video data to produce the encrypted video. The security could be increased by using a larger key length.

Li et al. [15] proposed a video encryption algorithm for the H.264 video encoding standard. In this method, the user key is responsible for four major parts: interprediction and intraprediction mode scrambling, encryption of motion vectors, and encryption of the transform. The old prediction mode is XORed with a three-digit random sequence to get the new prediction mode. Because of its two-dimensional chaos system, this algorithm performs well in terms of security but has little effect on entropy.

Raju et al. [16] proposed a fast, computationally efficient real-time video encryption algorithm. It compresses the video data using the Discrete Cosine Transform (DCT). Each video is divided into frames, and the DCT is applied to blocks of size 8 × 8. The DCT coefficients are encrypted with the RC5 (Rivest Cipher) algorithm. This approach provides high security and entropy but adds overhead to the codec.

Dumbere and Janwe [17] proposed using the Advanced Encryption Standard (AES) algorithm for video encryption, implemented in MATLAB. It encrypts 128-bit data blocks in "N" rounds of encryption based on the key. It produces better results and takes less encryption and decryption time than the Data Encryption Standard (DES).

The works in refs. [18,19,20,21,22,23] are the main motivation for the approach proposed in this article. The techniques in refs. [18,19,20] use a 1D chaos map, i.e., the logistic map (LM). Although the logistic map is the simplest to implement and exhibits good chaotic behavior, it suffers from blank and stable windows. The work proposed in this article is an extension of these works. Ye and Huang [21] proposed an approach for image encryption using the intertwining logistic map (ILM).
The results obtained from ILM were found to be better than those from LM. In 2018, Hua et al. [22] proposed a method that couples the logistic map with a sine function, which exhibits more complexity and a larger chaotic range. In 2019, the work in ref. [23] used a logistic map combined with a sine function as the input to a cosine function to increase the nonlinearity of the chaotic sequence.

This article proposes a fast and secure method of video encryption using a 3D ILM with cosine transformation to generate a complex chaotic sequence. The proposed algorithm performs better in terms of security and efficiency and is highly resistant to cyber-attacks due to its nonlinearity. The novelty of this approach is that the chaotic sequence generated by the 3D ILM is provided as an input to the cosine function to produce a new chaotic pattern that is more nonlinear. This article also checks the quality of the decrypted video using the mean squared error (MSE) and peak signal-to-noise ratio (PSNR).

The rest of this article is organized as follows: Section 2 describes the preliminaries, i.e., the fundamentals of chaos, LM, and ILM. Section 3 elaborates the proposed architecture. Implementation and experimental setup details are presented in Section 4. Section 5 compares the results with other existing approaches to video encryption. Section 6 presents the conclusion and future work related to this technique.

2 Preliminaries

This section describes the fundamentals of chaos theory, logistic maps (LM), the intertwining logistic map (ILM), and ILM-cosine, which were used to generate keys for the proposed video encryption algorithm.

2.1 Chaos theory

Chaos theory is a mathematical theory [24,25] that is still an active area of research. It belongs to dynamics, a field of physics that concerns the motion of objects when a force is applied. It is also the study of how simple patterns can be generated using the complicated behavior of numbers.
The basic structure of chaos-based encryption is divided into two parts, i.e., permutation and diffusion. Permutation is the scrambling of quantities so that the values are placed randomly. Diffusion, on the other hand, is a kind of substitution that is used to remove redundancy.

In the 1900s, Lorenz used the term chaos for the first time [26]. He studied chaos theory in the context of weather systems. His studies suggest that chaos theory can support decision-making in complex environments. Because it handles the unpredictability of complex systems, chaos is used for encrypting sensitive information. In 1997, Fridrich [27] proposed an image encryption algorithm based on chaotic maps. Since then, chaos has been used for encrypting digital data and transmitting it securely. The unpredictability and intractability of chaos make it useful for the proposed video encryption algorithm.

2.2 Logistic map

The logistic map (LM) is a chaotic function defined by a polynomial mapping of degree two. It is the most popular and simplest of all the chaotic functions. It is used to generate chaotic behavior [28] from nonlinear dynamic equations. The function is defined as follows:

$$U_{s+1} = \eta \, U_s (1 - U_s), \tag{1}$$

where $U_s$, which lies in (0, 1], is the value at the sth position in the sequence, and $\eta$ is a control parameter; the sequence is chaotic for $\eta$ in the range [3.57, 4].

Although LM is ergodic and sensitive to initial conditions, its behavior is governed by a single control parameter.
Because of this dependence on a single parameter, the one-dimensional logistic map was extended to two dimensions [29], as represented by the following equations:

$$U_{s+1} = \eta_1 \, U_s (1 - U_s) + \varepsilon_1 \, (V_s)^2, \tag{2}$$

$$V_{s+1} = \eta_2 \, V_s (1 - V_s) + \varepsilon_2 \left( (U_s)^2 + U_s V_s \right). \tag{3}$$

These equations generate chaotic sequences in the range (0, 1], represented by U and V. Here, $U_s$ and $V_s$ represent the values at the sth position in the two dimensions. This map remains in a chaotic state when 2.75 < η1 ≤ 3.4, 2.75 < η2 ≤ 3.45, 0.15 < ε1 ≤ 0.21, and 0.13 < ε2 ≤ 0.15. It was later extended to three-dimensional chaotic sequences [30,31]. The equations for the 3D chaotic map are as follows:

$$U_{s+1} = \eta \, U_s (1 - U_s) + (V_s)^2 U_s + (W_s)^3, \tag{4}$$

$$V_{s+1} = \eta \, V_s (1 - V_s) + (W_s)^2 V_s + (U_s)^3, \tag{5}$$

$$W_{s+1} = \eta \, W_s (1 - W_s) + (U_s)^2 W_s + (V_s)^3. \tag{6}$$

Here, $U_s$, $V_s$, and $W_s$ represent the values at the sth position in the three dimensions. These equations exhibit nonlinear behavior when U0, V0, and W0 are in [0, 1] and 0.53 < U0 < 3.81, 0 < V0 < 0.022, and 0 < W0 < 0.015. The generated sequences have good cross-correlation and auto-correlation properties, but they also have some drawbacks. LM suffers from stable windows, blank windows [32], and nonuniform sequence distribution. Figure 1 shows the problem of blank windows in LMs.
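The stable-window problem can also be checked numerically. The following minimal Python sketch iterates equation (1); the function names, the starting value 0.3, and the burn-in length are illustrative choices that do not come from the article. For η = 3.83, which lies inside a well-known periodic window, the orbit settles onto only three values, whereas for η = 4 the values keep spreading over the interval.

```python
def logistic_orbit(eta, u0=0.3, burn_in=10000, keep=60):
    """Iterate U_{s+1} = eta * U_s * (1 - U_s) and return the last `keep` values."""
    u = u0
    for _ in range(burn_in):          # discard the transient
        u = eta * u * (1.0 - u)
    orbit = []
    for _ in range(keep):
        u = eta * u * (1.0 - u)
        orbit.append(u)
    return orbit


def distinct_values(orbit, digits=6):
    """Count distinct orbit values after rounding to `digits` decimals."""
    return len({round(u, digits) for u in orbit})


if __name__ == "__main__":
    print("eta = 3.83:", distinct_values(logistic_orbit(3.83)))
    print("eta = 4.00:", distinct_values(logistic_orbit(4.0)))
```

A key generator that happened to use a control parameter inside such a window would repeat the same few values, which is why a map without these windows is preferable for encryption.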
To overcome these limitations of LMs, ILMs were introduced.

Figure 1: Sequence distribution in LM showing blank window with η on X-axis and U on Y-axis [33].

2.3 ILM

In 2014, Wang and Xu [33] proposed an intertwining relation between different LM sequences, which are nonlinear. The Lyapunov exponent of ILM is always positive, unlike that of LM, which indicates that ILM has more dynamic behavior than LM [34]. The equations for the ILM sequence are as follows:

$$U_{s+1} = \left[ \eta \, \sigma \, V_s (1 - U_s) + W_s \right] \bmod 1, \tag{7}$$

$$V_{s+1} = \left[ \eta \, \vartheta \, V_s + W_s \cdot \frac{1}{1 + U_{s+1}^2} \right] \bmod 1, \tag{8}$$

$$W_{s+1} = \left[ \eta \, (V_{s+1} + U_{s+1} + \kappa) \sin(W_s) \right] \bmod 1, \tag{9}$$

where η is in the range [0, 4), σ > 33.5, ϑ > 37.9, and κ > 35.7 for chaotic behavior.

As shown in Figure 2, chaotic sequences generated by ILM are uniformly distributed. This uniform distribution removes LM's demerits, i.e., uneven key distribution, blank windows, and stable windows.

Figure 2: ILM: (a) single-sequence distribution (U, V, or W) and (b) three-sequences distribution (U, V, and W) [33].

Figure 3 shows the Lyapunov exponent for the LM and ILM methods. It can be observed that the Lyapunov exponent of ILM is never negative and is more uniformly distributed.

Figure 3: Lyapunov exponents of (a) LM and (b) ILM [33].

2.4 ILM with cosine function (ILM-cosine)

ILM exhibits dynamic behavior in its output. To increase the efficiency of encryption, however, ILM is combined with a cosine function in the proposed approach.
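As a rough illustration, the ILM updates can be iterated directly. In the sketch below, the second update is read as W_s / (1 + U_{s+1}^2), following the form used in Function 4. The function names, the seed values, and the parameter choices (η = 3.9, σ = 35, ϑ = 38, κ = 36, picked only to satisfy the stated ranges) are assumptions for demonstration and are not the settings used in the article. The cosine wrapper follows the COS(PI * ILM) form of Function 4 and is likewise only a sketch.

```python
import math


def ilm_step(u, v, w, eta=3.9, sigma=35.0, theta=38.0, kappa=36.0):
    """One ILM update; the parameter values are illustrative, not the article's."""
    u_next = (eta * sigma * v * (1.0 - u) + w) % 1.0
    v_next = (eta * theta * v + w / (1.0 + u_next ** 2)) % 1.0
    w_next = (eta * (v_next + u_next + kappa) * math.sin(w)) % 1.0
    return u_next, v_next, w_next


def ilm_sequence(seed, length):
    """Return `length` (U, V, W) triples generated from a seed triple."""
    state = seed
    out = []
    for _ in range(length):
        state = ilm_step(*state)
        out.append(state)
    return out


def cosine_transform(triples):
    """Apply cos(pi * x) to every value (assumed form, after Function 4)."""
    return [tuple(math.cos(math.pi * x) for x in t) for t in triples]


if __name__ == "__main__":
    a = ilm_sequence((0.3, 0.4, 0.5), 5)
    b = ilm_sequence((0.3 + 1e-10, 0.4, 0.5), 5)   # tiny change in the seed
    for ta, tb in zip(a, b):
        print(ta, tb)
```

Running the two sequences side by side shows how quickly a change of 1e-10 in one seed component grows, which is the key-sensitivity property that the encryption scheme relies on.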
The cosine function is used to increase the nonlinearity of the output produced by ILM [35]. The equations for ILM-cosine are as follows:

$$U_{s+1} = \cos\left( \left[ \eta \, \sigma \, V_s (1 - U_s) + W_s \right] \bmod 1 + \vartheta \right), \tag{10}$$

$$V_{s+1} = \cos\left( \left[ \eta \, \vartheta \, V_s + W_s \cdot \frac{1}{1 + U_{s+1}^2} \right] \bmod 1 + \vartheta \right), \tag{11}$$

$$W_{s+1} = \cos\left( \left[ \eta \, (V_{s+1} + U_{s+1} + \kappa) \sin(W_s) \right] \bmod 1 + \vartheta \right). \tag{12}$$

3 Proposed architecture

This section proposes a new symmetric key encryption and decryption method for video data. Chaos-based cryptographic algorithms are used to make it difficult for unauthorized users to break the encryption. The encryption process starts by dividing the input video into multiple frames. Each of these frames goes through a series of steps, i.e., permutation, rotation, and diffusion, to produce an encrypted frame. The encrypted frames are jumbled using a frame selection (FS) key before they are joined to form the final encrypted video. This encrypted video can be transmitted securely, as an unauthorized user cannot access it without the keys. The decryption process involves splitting the encrypted video, rearranging the frames according to the FS key, and performing antisubstitution, rotation, and descrambling on each of them to generate decrypted frames.
These decrypted frames are merged to generate the decrypted video.

3.1 Encryption

As soon as a video (in mp4 format) is chosen for encryption, it is split into a number of frames. The duration and FPS value of the video determine the number of frames. The steps followed for encrypting the video are shown in Figure 4. The encryption process is performed on each frame iteratively until all the frames are encrypted. Each frame is split into three 2D matrices that represent its red, green, and blue components. An example of this process is shown in Figure 5.

Figure 4: Encryption process of the proposed approach.

Figure 5: Splitting an image into its R, G, and B components.

3.1.1 Key generation

The secret keys required for permutation are generated using a keyless hash function called Secure Hash Algorithm-256 (SHA-256) [36,37]. It converts messages of any size into a hash of fixed length (256 bits), with the additional advantage that finding a collision is computationally infeasible. It is a cryptographically secure one-way hash function in the SHA-2 family, which is an improved version of SHA-0 and SHA-1.

The SHA-256 function is used to generate seeds of length equal to the key length. Whatever the length of the input, the algorithm processes the data in blocks of 64 bytes (512 bits) and generates a 256-bit hash seed after cryptographic mixing. Three such seeds are generated for the R, G, and B components of each frame. SHA-256 is resistant to all kinds of attacks identified to date and is therefore used to secure highly sensitive data.

3.1.2 Permutation

The process of jumbling the positions of pixels in a frame is called permutation or scrambling. This interchanging of pixels helps to reduce the correlation among them.
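The effect of scrambling on correlation can be illustrated with a toy example. The sketch below is not the article's block permutation: it shuffles a single row of smoothly varying pixel values with a permutation obtained by sorting a logistic-map sequence, and compares the correlation between neighbouring pixels before and after. All names and parameter values (`chaotic_permutation`, η = 3.99, the seed 0.37) are hypothetical.

```python
def chaotic_permutation(n, seed=0.37, eta=3.99):
    """Index vector obtained by sorting n logistic-map values (illustrative)."""
    values, u = [], seed
    for _ in range(n):
        u = eta * u * (1.0 - u)
        values.append(u)
    return sorted(range(n), key=lambda i: values[i])


def adjacent_correlation(pixels):
    """Pearson correlation between each pixel and its right-hand neighbour."""
    xs, ys = pixels[:-1], pixels[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    row = list(range(256))                        # smooth gradient
    perm = chaotic_permutation(len(row))
    scrambled = [row[i] for i in perm]
    print("before:", adjacent_correlation(row))
    print("after: ", adjacent_correlation(scrambled))
```

Neighbouring values in the gradient are almost perfectly correlated, while after the shuffle the neighbours are essentially unrelated; the pixel values themselves are unchanged, which is why a separate diffusion step is still needed.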
Let A and B be the dimensions of the input image, and let I be the block size, which is calculated as follows:

$$I = \min\{ \lfloor \sqrt{A} \rfloor, \lfloor \sqrt{B} \rfloor \}. \tag{13}$$

Permutation can then be done on a matrix of size $I^2 \times I^2$. Each frame is divided into $I^2$ blocks. By using the chaotic sequence generated by the key generator, the pixels of each row of a block are shuffled into other blocks. In the next step, another chaotic sequence is used to jumble the positions of pixels column-wise. This scrambling of pixels reduces the association between them, which is considered a feature of a good encryption technique. The following steps are followed for permutation:

1. I is calculated using equation (13), and an image of size $I^2 \times I^2$ pixels is extracted from the frame.
2. Four chaotic sequences, say P, Q, R, and S, of length $I^2$ are generated by the key generator and are sorted.
3. After sorting these sequences, four index vectors are generated, one from each of them. Let them be labeled In_P, In_Q, In_R, and In_S.
4. Two blank matrices L and M are generated with dimensions $I^2 \times I^2$. The matrices L and M are initially filled with the values present in the columns of In_P and In_R, respectively, and the elements present in L and M are shifted according to the elements of In_Q and In_S, respectively.
5. The value of the row r is initialized to 1. Let the value present in the rth row and cth column of the L matrix (i.e., $L_{r,c}$) be "x." The (r, c)th pixel of the frame under consideration is replaced by the value in the xth row and cth column of the M matrix (i.e., $M_{x,c}$).
6. Step 5 is done iteratively for all values of "r" in the range [2, $I^2$].

3.1.3 Rotation

Even after permutation, the positions of many pixels in the matrix remain unchanged. This is because only those pixels that are within the $I^2 \times I^2$ region are permuted. To induce more randomness, the entire matrix is rotated by 90° in the anticlockwise direction.
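A 90° anticlockwise rotation and its inverse can be sketched on a small matrix. The helper name `rot90_ccw` is an illustrative choice; the pure-Python version below is meant to mirror what a library routine such as NumPy's `rot90` does, without depending on it.

```python
def rot90_ccw(matrix):
    """Rotate a 2D list by 90 degrees anticlockwise."""
    # Transpose the matrix, then reverse the order of the rows.
    return [list(row) for row in zip(*matrix)][::-1]


if __name__ == "__main__":
    frame = [[1, 2, 3],
             [4, 5, 6]]
    rotated = rot90_ccw(frame)
    print(rotated)
    # Three further quarter turns (270 degrees) bring the frame back.
    restored = rot90_ccw(rot90_ccw(rot90_ccw(rotated)))
    print(restored == frame)
```

Because four quarter turns are the identity, a 90° rotation during encryption is undone by a 270° rotation in the same direction during decryption.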
The rotation repositions every pixel in the frame, not only the frame as a whole. It is preferable to rotate the matrix in multiples of 90°, as rotating by any intermediate angle results in a tilted frame, which is unfit for computation.

3.1.4 Diffusion

Diffusion or substitution is the process of spreading variation throughout the matrix by replacing the existing values with new ones. The changes are usually made in the matrix row-wise and column-wise according to general rules of diffusion. Following these preestablished rules makes the system more susceptible to attacks. To overcome this drawback, a random order substitution technique is used, implemented by the following equation:

$$
E_{\mathrm{In}_{r,c},c} =
\begin{cases}
\left( L_{\mathrm{In}_{r,c},c} + L_{\mathrm{In}_{N,N},N} + \lfloor 2^{32} \, P_{\mathrm{In}_{r,c},c} \rfloor \right) \bmod M, & \text{for } r = 1,\ c = 1,\\
\left( L_{\mathrm{In}_{r,c},c} + E_{\mathrm{In}_{r-1,N},N} + \lfloor 2^{32} \, P_{\mathrm{In}_{r,c},c} \rfloor \right) \bmod M, & \text{for } r = 2 \sim N,\ c = 1,\\
\left( L_{\mathrm{In}_{r,c},c} + E_{\mathrm{In}_{r,c-1},c-1} + \lfloor 2^{32} \, P_{\mathrm{In}_{r,c},c} \rfloor \right) \bmod M, & \text{for } r = 1 \sim N,\ c = 2 \sim N.
\end{cases}
\tag{14}
$$

An index matrix is generated from the chaotic sequence used for diffusion. Here, L is the final matrix obtained after permutation and P is the chaotic matrix.
In is the index matrix obtained by sorting the elements of P, and M = 256 for the 8-bit representation of pixels.

3.1.5 Frame selection

After diffusion, the frames are completely encrypted and are ready to be joined together. To strengthen the encryption further, the frames are not joined sequentially. The encrypted frames are jumbled using a frame selection key and are then merged to form an encrypted video. This encrypted video is meaningless to any unauthorized user without the keys and hence can be transmitted securely.

3.2 Decryption

The steps followed for decrypting an encrypted video are shown in Figure 6. Decryption is possible only if the encrypted video and all the keys used for encryption are available. Even a slight change in any of the keys prevents recovery of the original video.

Figure 6: Decryption process of the proposed approach.

3.2.1 Frame selection

While generating the encrypted video, a frame selection key is used to jumble the frames to ensure more randomness. The same key is used to place the encrypted frames in their correct positions before starting the decryption process. Once all the frames are shuffled back to their original places, decryption is done iteratively on each of them.

3.2.2 Antisubstitution

Each encrypted frame is split into three 2D matrices that represent its R, G, and B components, as shown in Figure 5. The next step is to revert the changes made by the random order substitution during the diffusion process.
This is done using the following equation:

$$
L_{\mathrm{In}_{r,c},c} =
\begin{cases}
\left( E_{\mathrm{In}_{r,c},c} - E_{\mathrm{In}_{r,c-1},c-1} - \lfloor 2^{32} \, P_{\mathrm{In}_{r,c},c} \rfloor \right) \bmod M, & \text{for } r = 1 \sim N,\ c = 2 \sim N,\\
\left( E_{\mathrm{In}_{r,c},c} - E_{\mathrm{In}_{r-1,N},N} - \lfloor 2^{32} \, P_{\mathrm{In}_{r,c},c} \rfloor \right) \bmod M, & \text{for } r = 2 \sim N,\ c = 1,\\
\left( E_{\mathrm{In}_{r,c},c} - L_{\mathrm{In}_{N,N},N} - \lfloor 2^{32} \, P_{\mathrm{In}_{r,c},c} \rfloor \right) \bmod M, & \text{for } r = 1,\ c = 1.
\end{cases}
\tag{15}
$$

3.2.3 Rotation

The original image is rotated by 90° in the anticlockwise direction during encryption. This rotation step reduces the association among adjacent pixels. To retrieve the actual image, the encrypted image is rotated by 270° in the counterclockwise direction.

3.2.4 De-permutation

By using the same chaos-based encryption keys used during scrambling, descrambling is done to restore all the pixels to their original positions. This is the exact inverse of scrambling.

4 Implementation

This section focuses on the experimental setup, including the software and hardware requirements for implementing the approach proposed in this article. The properties of a sample video taken as input for testing are also explained below, along with all the implementation details.

4.1 Experimental setup

The software used to implement and test the proposed approach is Python 2.7 (version 2.7.17) with the Atom 1.45.0 editor. Libraries such as Pillow, OpenCV, NumPy, Matplotlib, and scikit-image are added externally.
The system configuration is Windows 10 (64-bit operating system), an Intel(R) Core(TM) i5-6200U CPU clocked at 2.30 GHz/2.40 GHz, and 8 GB RAM.

The videos for the experiment are taken from ref. [38]. There are four videos, with their properties presented in Table 1.

Table 1: Properties of various videos used for encryption and decryption

| Video name | Frames per second (FPS) | Length of video (s) | No. of frames | Dimension of frames (pixels) |
| Flamingo.mp4 | 25.0 | 13 | 328 | 352 × 192 |
| Train.mp4 | 25.0 | 10 | 262 | 352 × 192 |
| Rhino.mp4 | 15.0 | 7 | 114 | 320 × 240 |
| Viptrain.mp4 | 30.0 | 20 | 626 | 360 × 240 |

4.2 Encryption

The encryption procedure followed in this article is chaos based. The implementation details explained for encryption and decryption are based on the video titled Flamingo.mp4 from source [38].

4.2.1 Frame generation

The input video is partitioned into N frames, where N is defined by equation (16). N is 328 frames for the given input video. The frames are stored in the jpg format, because the encryption procedure works faster with this image format. Function 1 shows the pseudocode to split the video into frames.

$$N = \text{FPS} \times (\text{length of video}). \tag{16}$$

Function 1: Split mp4 video into 2D frames
Input: Original video (n, m) in mp4 format, image format (jpg or png)
Output: N (n × m) 2D frames
Pseudocode:
    currentFrame ← 0
    WHILE a frame can be captured DO:
        frame ← Capture the current frame
        Write frame in memory
        INCR(currentFrame)
    END WHILE

4.2.2 Color frame input

As the input video is colored, each 2D color frame, which is in jpg format, is converted to a 2D RGB matrix. The RGB matrix is of dimension 3 × (n·m), where n is the height and m is the width of the image, and the three rows correspond to the red (R), green (G), and blue (B) colors, respectively. This step is performed to obtain the pixel image for further processing.
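The 3 × (n·m) layout can be sketched with plain Python lists. Here a frame is assumed to be a list of n rows, each holding m (r, g, b) tuples; the function names are hypothetical, and the article's own implementation relies on imaging libraries rather than on this representation.

```python
def frame_to_rgb_matrix(frame):
    """Flatten an n x m frame of (r, g, b) tuples into three rows of n*m values."""
    red, green, blue = [], [], []
    for row in frame:                  # row-major order: index = x * m + y
        for r, g, b in row:
            red.append(r)
            green.append(g)
            blue.append(b)
    return [red, green, blue]


def rgb_matrix_to_frame(matrix, n, m):
    """Inverse of frame_to_rgb_matrix for a frame with n rows and m columns."""
    red, green, blue = matrix
    return [[(red[x * m + y], green[x * m + y], blue[x * m + y])
             for y in range(m)]
            for x in range(n)]


if __name__ == "__main__":
    frame = [[(1, 2, 3), (4, 5, 6)],
             [(7, 8, 9), (10, 11, 12)],
             [(13, 14, 15), (16, 17, 18)]]
    matrix = frame_to_rgb_matrix(frame)
    print(matrix[0])                               # red row
    print(rgb_matrix_to_frame(matrix, 3, 2) == frame)
```

Keeping one row per color makes it straightforward to hand each component to the later steps separately.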
Function 2 shows the pseudocode to convert the frame into a 2D RGB matrix.

Function 2: Convert a 2D color image to 2D RGB matrix
Input: Original frame (n, m)
Output: 2D IMG_MATRIX (3 × (n * m))
Pseudocode:
    FOR x ← (0, n) DO:
        FOR y ← (0, m) DO:
            r, g, b ← frame[y, x]    // pixel in column y of row x
            IMG_MATRIX[:, x * m + y] ← [r, g, b]
        END FOR
    END FOR

4.2.3 Key generation

An ILM chaotic sequence is generated for three-dimensional encryption of the R, G, and B components of each frame. Here, the length of the given key is 360. Function 3 shows the pseudocode that uses Secure Hash Algorithm-256 (SHA-256) for seed generation.

Function 3: SHA-256 for 3D seed generation
Input: Length of key (L)
Output: 3D seeds of length L
Pseudocode:
    SECRET_KEY ← UNIFORM_RANDOM(L)    // within range [0, 1]
    KEY_MATRIX ← hash(SECRET_KEY, [SHA-256, binary mode, double])
    A ← 0
    SUM ← 0
    FOR l ← (0, L/2) DO:
        A ← A xor KEY_MATRIX(l)
    END FOR
    FOR gap ← (L/2, L) DO:
        SUM ← SUM + KEY_MATRIX(gap)
    END FOR
    3D_SEED ← A + SUM
    3D_SEED ← 3D_SEED / 2^12

4.2.4 ILM-cosine sequence generation

The secret key or seeds generated in the previous step are used to generate 3 × (4·b·b) ILM-cosine transformation-based chaotic sequences, where b is the block size given in equation (13). Function 4 gives the pseudocode to generate the ILM-cosine sequence, and Table 2 presents a sample ILM-cosine sequence for an 8 × 8 frame.

Function 4: Generate ILM-cosine sequence from the secret key
Input: No. of pixels (4*b*b), secret key (S)
Output: A 3 × (4*b*b) ILM-cosine chaotic sequence
Pseudocode:
    FOR c ← (0, 4*b*b) DO:
        ILM[0, c] ← ((a1 * S2 * (1 − S1)) + S3) mod 1
        ILM[1, c] ← ((a1 * S2) + (S3 * 1/(1 + ILM[0, c]^2))) mod 1
        ILM[2, c] ← (b1 * (ILM[0, c] + ILM[1, c] + b2) * SIN(S3)) mod 1
        S ← (ILM[0, c], ILM[1, c], ILM[2, c])    // the newest values become S1, S2, S3
    END FOR
    RETURN COS(PI * ILM)

Table 2: ILM-cosine based chaotic sequence for 8 × 8 frame

0.679  0.939  0.877  0.120  0.234  0.077  0.017  0.618
0.336  0.584  0.756  0.825  0.883  0.017  0.753  0.004
0.278  0.576  0.248  0.892  0.369  0.673  0.794  0.921
0.386  0.821  0.797  0.576  0.276  0.079  0.704  0.418
0.894  0.132  0.012  0.127  0.895  0.006  0.135  0.807
0.948  0.580  0.547  0.312  0.665  0.256  0.538  0.109

4.2.5 Extracting gray components

The 2D RGB image matrix is divided into its corresponding 2D gray components. The gray components of a frame are encrypted individually, with a separate ILM sequence for each component. This is done to encrypt the frame in all three dimensions separately. Function 5 shows the pseudocode for extracting the 2D gray components. Figure 7a shows the gray component (red) of dimensions 352 × 192 from the sample video.

Function 5: Get the 2D gray component from the 2D RGB image
Input: Row, Col, entire row from RGB matrix that corresponds to the gray component (Data)    // 1 × (n * m)
Output: 2D GRAY_IMAGE (n × m)
Pseudocode:
    GRAY_IMAGE ← create a new image of dimensions (col, row)
    GRAY_IMAGE ← PUT(Data)
    RETURN GRAY_IMAGE

Figure 7: Red component of a frame during each step of the encryption process. (a) Original image, (b) scrambled image, (c) rotated image, and (d) encrypted image after substitution.

4.2.6 Permutation

Permutation is the first step in chaos encryption and is performed using the generated ILM-cosine transformation-based chaotic sequence. Only a fraction of the image is de-correlated by shuffling its pixels.
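A quick calculation shows how large this fraction is for the sample video. Using equation (13) with the 352 × 192 frame size from Table 1, the sketch below computes the block size and the share of pixels inside the scrambled region. The function name is illustrative, and `math.isqrt` requires Python 3.8 or later, unlike the Python 2.7 setup described in Section 4.1.

```python
import math


def block_size(a, b):
    """Equation (13): I = min(floor(sqrt(A)), floor(sqrt(B)))."""
    return min(math.isqrt(a), math.isqrt(b))


if __name__ == "__main__":
    width, height = 352, 192
    i = block_size(width, height)
    side = i * i                        # the scrambled region is side x side pixels
    fraction = side * side / (width * height)
    print(i, side, round(fraction, 3))
```

For this frame size the block size is 13, so a 169 × 169 region (about 42% of the pixels) is scrambled, and each of the four chaotic sequences P, Q, R, and S has 169 entries.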
Function 6 shows the pseudocode for scrambling the image, and Figure 7b shows the scrambled gray component (red) of dimensions 352 × 192 from the sample video.

Function 6: Scramble or permute the pixels to reduce the correlation
Input: GRAY_COMPONENT (n × m), ILM_CSEQ, BLOCK_SIZE (b)
Output: SCRAMBLED_IMG (n × m)
Pseudocode:
    B ← b*b    // size of the matrix to be scrambled
    P ← ILM_CSEQ(0:B)
    Q ← ILM_CSEQ(B:2B)
    R ← ILM_CSEQ(2B:3B)
    S ← ILM_CSEQ(3B:4B)
    In_P ← GET_INDEX_SEQ(P)    // sort enumerated P matrix and return P[0]
    In_Q ← GET_INDEX_SEQ(Q)
    In_R ← GET_INDEX_SEQ(R)
    In_S ← GET_INDEX_SEQ(S)
    col, row ← SHAPE of GRAY_COMPONENT
    SCRAMBLED_IMG ← COPY GRAY_COMPONENT
    FOR y ← (1, B + 1) DO:
        FOR x ← (1, B + 1) DO:
            c ← (x + In_Q[y − 1] − 1) mod (B + 1)
            d ← (x + In_S[y − 1] − 1) mod (B + 1)
            L[x, y] ← In_P[c − 1]
            M[x, y] ← In_R[d − 1]
        END FOR
    END FOR
    FOR x ← (1, B + 1) DO:
        FOR y ← (1, B + 1) DO:
            i ← L[x, y]
            j ← M[i, y]
            c1 ← (i − 1)/b    // integer division
            d1 ← (i − 1) mod b
            c2 ← (j − 1)/b + 1    // integer division
            d2 ← (j − 1) mod b + 1
            r ← c1*b + c2
            c ← d1*b + d2
            SCRAMBLED_IMG[r, c] ← GRAY_COMPONENT[x, y]
        END FOR
    END FOR
    RETURN SCRAMBLED_IMG

4.2.7 Image rotation

Figure 7b shows a scrambled frame. To ensure complete shuffling of pixels from their original positions, the frame is rotated by 90° in the anticlockwise direction. The rot90 function is used for this purpose. Figure 7c displays the rotated gray component (R) of dimensions 352 × 192 from the sample video.

4.2.8 Diffusion

Random order substitution is performed using a substitution sequence, which is uniform and random. Function 7 shows the pseudocode for random order substitution, and Figure 7d shows an encrypted 2D gray component (R) of dimensions 352 × 192 after substitution. After substitution, all three encrypted gray components are stacked together to obtain the encrypted 2D RGB image matrix, which is also the final encrypted frame.
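The chaining used in substitution, and the reason it can be undone, may be easier to see in a one-dimensional simplification. The sketch below is not the article's Function 7: it works on a flat list of pixel values, visits them in the order given by sorting a key sequence, and uses Python's `random.Random` as a stand-in for the chaotic sequence. The function names and the seed are hypothetical.

```python
import random

M = 256  # modulus for 8-bit pixel values


def make_key(n, seed):
    """Stand-in key material: a visiting order and floor(2^32 * P) values."""
    rng = random.Random(seed)
    p = [rng.random() for _ in range(n)]
    order = sorted(range(n), key=lambda i: p[i])
    stream = [int(x * 2 ** 32) for x in p]
    return order, stream


def substitute(pixels, order, stream):
    """Each output depends on the pixel, the previous output, and the key."""
    enc = [0] * len(pixels)
    prev = pixels[order[-1]]           # the first step uses the last plain pixel
    for k in order:
        enc[k] = (pixels[k] + prev + stream[k]) % M
        prev = enc[k]
    return enc


def antisubstitute(enc, order, stream):
    """Undo substitute(); requires at least two pixels."""
    dec = [0] * len(enc)
    for pos in range(len(order) - 1, 0, -1):    # every step except the first
        k, k_prev = order[pos], order[pos - 1]
        dec[k] = (enc[k] - enc[k_prev] - stream[k]) % M
    first = order[0]
    dec[first] = (enc[first] - dec[order[-1]] - stream[first]) % M
    return dec


if __name__ == "__main__":
    pixels = [10, 10, 10, 200, 37, 0, 255, 10]
    order, stream = make_key(len(pixels), seed=2024)
    enc = substitute(pixels, order, stream)
    print(enc)
    print(antisubstitute(enc, order, stream) == pixels)
```

Every step except the first can be inverted from cipher values alone, and the first step is inverted last, once the final plain pixel has been recovered. Because each output feeds into the next, changing the first visited pixel changes every cipher value.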
The encryption process is repeated for each of the N frames.

Function 7: Random order substitution
Input: GRAY_COMPONENT (m × n)    // rotated image, SUB_CSEQ, SIZE
Output: ENCRYPTED_IMG (m × n)
Pseudocode:
    GENERATE matrix A and B ← SIZE
    A ← ROTATE(Clockwise, 90 degrees, SUB_CSEQ)
    In_A ← GET_INDEX_SEQ(A)    // sort enumerated A matrix and return A[0]
    B ← ROTATE(Clockwise, 90 degrees, In_A)
    M ← 256
    FOR r ← (0, row) DO:
        FOR c ← (0, col) DO:
            IF r == 0 and c == 0 THEN:
                ENCRYPTED_IMG[B[r, c], c] ← (GRAY_COMPONENT[B[r, c], c] + GRAY_COMPONENT[B[row−1, col−1], col−1] + FLOOR(2^32 * A[B[r, c], c])) mod M
            ELSE IF c == 0 THEN:
                ENCRYPTED_IMG[B[r, c], c] ← (GRAY_COMPONENT[B[r, c], c] + ENCRYPTED_IMG[B[r−1, col−1], col−1] + FLOOR(2^32 * A[B[r, c], c])) mod M
            ELSE:
                ENCRYPTED_IMG[B[r, c], c] ← (GRAY_COMPONENT[B[r, c], c] + ENCRYPTED_IMG[B[r, c−1], c−1] + FLOOR(2^32 * A[B[r, c], c])) mod M
            END IF
        END FOR
    END FOR
    RETURN ENCRYPTED_IMG

4.2.9 Frame selection

A random order sequence over the N frames is generated and is called the frame selection (FS) sequence. The encrypted frames are joined according to this FS sequence to form an encrypted video in mp4 format. Functions 8 and 9 show the pseudocode for generating the FS sequence and for joining the frames to form a video, respectively.

Function 8: Generate frame selection sequence
Input: No. of frames (N)
Output: 1D FS sequence (1 × N)
Pseudocode:
    FS ← empty sequence
    FOR f ← (0, N) DO:
        r ← RANDOM_INT(1, N)
        WHILE r in FS DO:    // move to the next unused frame number
            INCR(r)
            IF r more than N THEN:
                r ← 1
            END IF
        END WHILE
        FS[] ← Append r
    END FOR
    RETURN FS

Function 9: Join frames to get the video
Input: No. of frames (N), FPS
Output: VIDEO_MP4_FPS
Pseudocode:
    FS ← FS_SEQUENCE(N)
    frame ← Read first frame
    h, w ← SHAPE(frame)
    OUTPUT_FORMAT ← SET the output video format
    size ← (w, h)
    SET VIDEO_WRITER using FPS, OUTPUT_FORMAT, size, OUTPUT_PATH
    FOR f ← (0, N) DO:
        IMAGE_PATH ← SET path of frame FS[f]
        Img ← Read(IMAGE_PATH)
        Write Img using VIDEO_WRITER
    END FOR

4.3 Decryption

The input for the decryption process is the encrypted video file in mp4 format, and the output is the decrypted video file in mp4 format, which is similar to the original input video. To test the quality of decryption, MSE and PSNR tests are performed, the results of which are analyzed in the next section.

4.3.1 Frame generation

The encrypted video is in mp4 format. To decrypt it, the video is partitioned into N frames, where N is derived from equation (16). The procedure to partition the video is the same as in the encryption process and is shown in Function 1. The frames are saved in memory in .png format.

4.3.2 Rearrange frames

The next step is to rearrange the frames in their actual order using the same FS sequence. The frames are simply renamed in order of their occurrence according to the FS sequence. After this process, the frames are in their actual order and are ready to be decrypted.

4.3.3 Extracting gray components

The .png format 2D image file is first converted to its corresponding 2D RGB image matrix. This step is performed to get the 2D pixel image for further processing. The pseudocode for this is shown in Function 2. The frames are encrypted in three dimensions (R, G, and B) individually using a separate ILM sequence for every gray component; hence, to decrypt the frame, all three gray components are extracted from the 2D RGB image matrix according to Function 5.

4.3.4 Reversal of diffusion

Reversal of diffusion, or antisubstitution, replaces the pixels with their original values using the same substitution sequence used during encryption.
Function 10 shows the pseudocode for antisubstitution using random order substitution.

Function 10: Substitute pixels back to their original pixel value
Input: ENCRYPTED_IMG (m × n), SUB_CSEQ, SIZE
Output: ASUB_IMG (m × n)
Pseudocode:
GENERATE matrix A and B ← SIZE
A ← ROTATE (Clockwise, 90°, SUB_CSEQ)
InA ← GET_INDEX_SEQ(A) // sort enumerated A matrix and return A[0]
B ← ROTATE (Clockwise, 90°, InA)
M ← 256
FOR r ← (0, row) DO:
    FOR c ← (0, col) DO:
        ASUB_IMG[B[r, c], c] ← (M + ENCRYPTED_IMG[B[r, c], c] − ENCRYPTED_IMG[B[r, c−1], c−1] − 2^32 * A[B[r, c], c]) mod M
    END FOR
END FOR
c ← 0
FOR r ← (1, row) DO:
    ASUB_IMG[B[r, c], c] ← (M + ENCRYPTED_IMG[B[r, c], c] − ENCRYPTED_IMG[B[r−1, col−1], col−1] − 2^32 * A[B[r, c], c]) mod M
END FOR
r ← 0
ASUB_IMG[B[r, c], c] ← (M + ENCRYPTED_IMG[B[r, c], c] − ASUB_IMG[B[row−1, col−1], col−1] − 2^32 * A[B[r, c], c]) mod M
RETURN ASUB_IMG

4.3.5 Image rotation
During decryption, the frame is rotated 270° counterclockwise. The rot90 function of the numpy library in Python is applied three times for this purpose.

4.3.6 Descrambling
The inverse of permutation, called descrambling, is performed as the final step of decryption to recover the original frame. Function 11 shows the pseudocode for descrambling the image. After descrambling, all three decrypted gray components are stacked together to obtain the decrypted 2D RGB image matrix, which is also the final decrypted frame.
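The rotation of Section 4.3.5 and the stacking of the three gray components can be sketched with NumPy as follows. This is a minimal illustration, assuming each gray component is a 2D uint8 array; the helper names rotate_270_ccw and stack_components are hypothetical and are not taken from the original implementation.

```python
import numpy as np


def rotate_270_ccw(component):
    """Rotate a 2D array by 270 degrees counterclockwise.

    np.rot90 rotates counterclockwise by 90 degrees per step,
    so k=3 is equivalent to calling it three times.
    """
    return np.rot90(component, k=3)


def stack_components(red, green, blue):
    """Stack three 2D gray components into one (height, width, 3) matrix."""
    return np.dstack((red, green, blue))


# Illustrative data only: three random 4 x 6 gray components.
rng = np.random.default_rng(0)
red, green, blue = (
    rng.integers(0, 256, size=(4, 6), dtype=np.uint8) for _ in range(3)
)

rotated = [rotate_270_ccw(c) for c in (red, green, blue)]
frame = stack_components(*rotated)
print(frame.shape)
```

A further 90° counterclockwise rotation (k=1) returns the original orientation, which is a convenient check that the rotation step is invertible.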
This decryption process is repeated N times, once for each frame, and all the decrypted frames are joined together to generate the decrypted mp4 video as shown in Function 9.

Function 11: Correlate the pixels back to their original position
Input: SCRAMBLED_IMG (n × m), ILM_CSEQ, BLOCK_SIZE (b)
Output: ORIG_GRAY_IMG (n × m)
Pseudocode:
CALCULATE Size of the matrix to be descrambled: B ← b*b
P ← ILM_CSEQ (0:B)
Q ← ILM_CSEQ (B:2B)
R ← ILM_CSEQ (2B:3B)
S ← ILM_CSEQ (3B:4B)
InP ← GET_INDEX_SEQ(P) // sort enumerated P matrix and return P[0]
InQ ← GET_INDEX_SEQ(Q)
InR ← GET_INDEX_SEQ(R)
InS ← GET_INDEX_SEQ(S)
col, row ← SHAPE of GRAY_COMPONENT
ORIG_GRAY_IMG ← COPY SCRAMBLED_IMG
FOR y ← (1, B + 1) DO:
    FOR x ← (1, B + 1) DO:
        c ← (x + InQ[y − 1] − 1) mod (B + 1)
        d ← (x + InS[y − 1] − 1) mod (B + 1)
        L[x, y] ← InP[c − 1]
        M[x, y] ← InR[d − 1]
    END FOR
END FOR
FOR x ← (1, B + 1) DO:
    FOR y ← (1, B + 1) DO:
        i ← L[x, y]
        j ← M[i, y]
        c1 ← (i − 1)/b
        d1 ← (i − 1) mod b
        c2 ← (j − 1)/b + 1
        d2 ← (j − 1) mod b + 1
        r ← c1*b + c2
        c ← d1*b + d2
        ORIG_GRAY_IMG[x, y] ← SCRAMBLED_IMG[r, c]
    END FOR
END FOR
RETURN ORIG_GRAY_IMG

5 Results and analysis
Several parameters were used to compare the encryption efficiency of the proposed method with that of existing methods. They are described as follows.

5.1 Differential attacks
Differential attacks [39] are the most common kind of attack performed on block ciphers. Two parameters were used to evaluate the performance of encryption algorithms against these attacks: NPCR, the number of pixel change rate, and UACI, the unified averaged change in intensity.
UACI is calculated using equation (17).

$$\text{UACI} = \frac{1}{A \times B} \sum_{x=1}^{A} \sum_{y=1}^{B} \frac{|E_1(x, y) - E_2(x, y)|}{255} \times 100\%, \tag{17}$$

where E(x, y) is the pixel value in the xth row and the yth column of E, and E1 and E2 denote the two encrypted images. According to various experiments, the ideal value of UACI is approximately 33%.

NPCR is the rate of change in the number of pixels in an encrypted image when one pixel is modified in the original image. It focuses on the absolute number of pixels that change after an attack. It is calculated using the following formula:

$$\text{NPCR} = \frac{1}{A \times B} \sum_{x=1}^{A} \sum_{y=1}^{B} U(x, y) \times 100\%, \tag{18}$$

where

$$U(x, y) = \begin{cases} 1, & E_1(x, y) \ne E_2(x, y) \\ 0, & \text{in other cases}. \end{cases} \tag{19}$$

The ideal value of NPCR is approximately 99%.
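Equations (17) to (19) can be sketched in Python as follows. This is a minimal illustration, assuming two encrypted 8-bit images of equal shape held as NumPy arrays; the function names npcr and uaci are hypothetical. U(x, y) is counted as 1 where the two images differ, which is the usual NPCR convention.

```python
import numpy as np


def npcr(e1, e2):
    """Percentage of pixel positions at which the two images differ."""
    e1 = np.asarray(e1)
    e2 = np.asarray(e2)
    return float(np.mean(e1 != e2) * 100.0)


def uaci(e1, e2):
    """Mean absolute intensity difference, normalized by 255, in percent."""
    # Cast to a signed type first so that the subtraction cannot wrap around.
    diff = np.abs(np.asarray(e1, dtype=np.int64) - np.asarray(e2, dtype=np.int64))
    return float(np.mean(diff / 255.0) * 100.0)


# Illustrative data only: two independent, uniformly random 8-bit images.
rng = np.random.default_rng(1)
img_a = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)
img_b = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)

print(f"NPCR = {npcr(img_a, img_b):.4f} %")
print(f"UACI = {uaci(img_a, img_b):.4f} %")
```

For two independent, uniformly random 8-bit images, the expected values are 255/256 ≈ 99.61% for NPCR and about 33.46% for UACI, which is why results near these figures are read as good diffusion.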
The values obtained for UACI and NPCR by the proposed work are presented in Table 3.

Table 3: UACI, NPCR, CC, and entropy analysis

| Video (mp4) | Component | UACI | NPCR | CCh | CCv | CCd | Entropy |
|---|---|---|---|---|---|---|---|
| Flamingo | Red | 33.76992 | 99.60848 | −0.001483 | +0.004846 | +0.000177 | 7.99719 |
| Flamingo | Green | 35.34373 | 99.62372 | −0.012479 | +0.025968 | −0.005657 | 7.99731 |
| Flamingo | Blue | 32.72501 | 99.61854 | +0.006315 | −0.003929 | −0.001170 | 7.99732 |
| Rhino | Red | 33.23251 | 99.61549 | +0.009020 | −0.014905 | +0.000101 | 7.99757 |
| Rhino | Green | 33.69045 | 99.60130 | +0.013223 | +0.004301 | −0.005752 | 7.99762 |
| Rhino | Blue | 32.77237 | 99.61276 | +0.006237 | −0.007078 | −0.001706 | 7.99746 |
| Train | Red | 33.79639 | 99.61573 | −0.014568 | +0.003141 | +0.000397 | 7.99726 |
| Train | Green | 32.90723 | 99.62328 | +0.016226 | +0.020616 | −0.000320 | 7.99727 |
| Train | Blue | 37.66040 | 99.60256 | −0.001335 | −0.006978 | −0.000918 | 7.99723 |
| Viptrain | Red | 32.23780 | 99.60798 | −0.003143 | +0.005592 | −0.006909 | 7.99791 |
| Viptrain | Green | 31.98262 | 99.61377 | −0.001178 | −0.003753 | +0.002703 | 7.99774 |
| Viptrain | Blue | 32.38443 | 99.61990 | −0.007492 | +0.015163 | +0.010275 | 7.99788 |

5.2 Correlation coefficient (CC) analysis
CC denotes the association between two adjacent pixels in an image. Its value lies within the range [−1, 1], where 0 is considered the ideal value for an encrypted image. A value of 0 means that there is no correlation, and a value of 1 indicates a high correlation among the pixels. CC is calculated by choosing 1,000 pairs of adjacent pixels from the image. It is estimated for pixel pairs horizontally (CCh), vertically (CCv), and diagonally (CCd) and can be calculated using the following formulae:

$$c_{i,j} = \frac{\mathrm{cov}(i, j)}{\sqrt{A(i)} \sqrt{A(j)}}, \tag{20}$$

$$\text{where } \mathrm{cov}(i, j) = \frac{1}{M} \sum_{k=1}^{M} (x_k - B(x))(y_k - B(y)), \tag{21}$$

$$A(x) = \frac{1}{M} \sum_{k=1}^{M} (x_k - B(x))^2, \tag{22}$$

$$B(x) = \frac{1}{M} \sum_{k=1}^{M} x_k. \tag{23}$$

Here, c_{i,j} denotes the CC, M denotes the number of randomly selected pixel pairs, and i and j denote two adjacent pixels, whose values in the kth pair are x_k and y_k.
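Equations (20) to (23) can be sketched in Python as follows. This is a minimal illustration, assuming a single gray component held as a 2D NumPy array; the function name adjacent_cc, its parameters, and the way pairs are sampled are assumptions made for this example and are not taken from the original implementation.

```python
import numpy as np


def adjacent_cc(image, direction="horizontal", pairs=1000, seed=0):
    """Correlation coefficient of randomly sampled adjacent pixel pairs."""
    offsets = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}
    dr, dc = offsets[direction]
    img = np.asarray(image, dtype=np.float64)
    rows, cols = img.shape

    # Sample the first pixel of each pair so that its neighbor stays in bounds.
    rng = np.random.default_rng(seed)
    r = rng.integers(0, rows - dr, size=pairs)
    c = rng.integers(0, cols - dc, size=pairs)
    x = img[r, c]
    y = img[r + dr, c + dc]

    bx, by = x.mean(), y.mean()          # expectation, equation (23)
    cov = np.mean((x - bx) * (y - by))   # covariance, equation (21)
    ax = np.mean((x - bx) ** 2)          # variance, equation (22)
    ay = np.mean((y - by) ** 2)
    return float(cov / (np.sqrt(ax) * np.sqrt(ay)))  # equation (20)


# Illustrative data only: a smooth gradient and a uniformly random image.
gradient = np.tile(np.arange(64), (64, 1))
noise = np.random.default_rng(2).integers(0, 256, size=(128, 128))

for name in ("horizontal", "vertical", "diagonal"):
    print(name, adjacent_cc(gradient, name), adjacent_cc(noise, name))
```

The gradient image stands in for a plain frame with strongly correlated neighbors, and the random image stands in for a well-encrypted frame whose coefficients should lie close to 0.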
In equations (20) to (23), A(x) and B(x) denote the variance and expectation of x, respectively. Table 3 presents the results of CC for the proposed approach.

5.3 Entropy analysis
Entropy is the degree of randomness of an image. An encrypted image with an entropy value of 8 is said to have the maximum degree of randomness and hence is resistant to differential attacks. Entropy is calculated using equation (24).

$$E(i) = \sum_{j=0}^{G-1} p(i_j) \log_2 \frac{1}{p(i_j)}, \tag{24}$$

where E denotes entropy, p(i_j) is the probability of state i_j, and G = 2^k, with k = 8 for a gray scale image. G denotes the number of states of the frame analyzed. Table 3 presents the entropy analysis for the proposed approach.

5.4 Histogram analysis
This analysis depicts the level of encryption and can be used to determine the strength of the algorithm against attacks. A uniform pattern in the histogram implies a good encryption scheme, which is difficult to crack. Table 4 presents the histogram analysis of unencrypted and encrypted frames.

Table 4: Histogram analysis (columns: Video (mp4), Original frame, Encrypted frame, Original frame histogram, Encrypted frame histogram; rows: Flamingo, Rhino, Train, Viptrain)

5.5 MSE and PSNR analysis
Mean square error (MSE) is the average of the squared differences in intensity between the encrypted and plain images. It can be calculated using equation (25) [41,42].

Peak signal-to-noise ratio (PSNR) is the measure of change of quality between actual and encrypted images. In the case of image data, the original image acts as the signal, and the error is the noise generated because of encryption. A high value of PSNR means that the decrypted image is of good quality.
PSNR can be calculated using equation (26) from the MSE value [43,44].

$$\text{MSE} = \frac{1}{m \times n} \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} [O(x, y) - E(x, y)]^2, \tag{25}$$

$$\text{PSNR} = 10 \log_{10} \frac{(\text{MAX}_I)^2}{\text{MSE}}, \tag{26}$$

where E is the image with noise, O is the original image with dimensions m × n, and MAX_I is the maximum pixel value, which is 255 for an 8-bit representation of pixels. Table 5 presents the MSE and PSNR test values for the proposed approach.

Table 5: PSNR and MSE

| Video (mp4) | Component | PSNR | MSE |
|---|---|---|---|
| Flamingo | Full frame | 36.6755606209867 | 15.148983651620369933 |
| Flamingo | Red | 35.94202076570667 | 17.697918639520202067 |
| Flamingo | Green | 37.237718687633 | 13.016120186237373633 |
| Flamingo | Blue | 36.902444685967 | 14.732912129103535067 |
| Rhino | Full frame | 31.5629403311267 | 49.8130211226851868 |
| Rhino | Red | 31.4234534274933 | 51.312109375 |
| Rhino | Green | 31.39083511636 | 50.851566840278 |
| Rhino | Blue | 31.7684474798 | 47.27538715279 |
| Train | Full frame | 34.6309128541933 | 23.8425902909301348 |
| Train | Red | 34.6470446036933 | 24.447270557133839067 |
| Train | Green | 33.7075822148267 | 32.875726010101009533 |
| Train | Blue | 37.2621035020533 | 14.204774305555555067 |
| Viptrain | Full frame | 32.0135713627867 | 40.992180298353909067 |
| Viptrain | Red | 31.1408029721133 | 50.619496913580246133 |
| Viptrain | Green | 32.328658491467 | 38.004512345679013 |
| Viptrain | Blue | 32.806820655233 | 34.352531635802468933 |

5.6 Time analysis
An encryption algorithm is considered good if it processes data quickly without compromising the level of security. The major factor that determines the time complexity of the proposed approach is the size of the frames. Images with larger dimensions take more time to encrypt and decrypt than smaller images.
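A per-frame timing measurement of this kind can be sketched in Python as follows. This is a minimal illustration under stated assumptions: placeholder_transform is a stand-in workload (a byte-wise XOR) and is not the proposed cipher, and the names seconds_per_frame, frame_sizes, and timings are hypothetical. Absolute timings depend on the machine, so no expected values are given.

```python
import time

import numpy as np


def placeholder_transform(frame, key_stream):
    """Stand-in workload only: a byte-wise XOR, not the proposed cipher."""
    return np.bitwise_xor(frame, key_stream)


def seconds_per_frame(height, width, repeats=5, seed=0):
    """Average wall-clock time to process one RGB frame of the given size."""
    rng = np.random.default_rng(seed)
    frame = rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)
    key_stream = rng.integers(0, 256, size=frame.shape, dtype=np.uint8)

    start = time.perf_counter()
    for _ in range(repeats):
        placeholder_transform(frame, key_stream)
    return (time.perf_counter() - start) / repeats


# Frame sizes given as (height, width).
frame_sizes = [(8, 8), (128, 128), (192, 352)]
timings = {size: seconds_per_frame(*size) for size in frame_sizes}

for (h, w), t in timings.items():
    print(f"{w} x {h}: {t:.6f} s per frame")
```

Replacing placeholder_transform with an actual encryption or decryption routine yields per-frame figures of the kind reported for different frame sizes.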
The analysis for different frame sizes is presented in Table 6.

Table 6: Encryption speed (time per frame, in seconds)

| Video (mp4) | Frame size (pixels) | Encryption (s) | Decryption (s) |
|---|---|---|---|
| Flamingo | 8 × 8 | 0.00450 | 0.00001 |
| Flamingo | 128 × 128 | 1.19019 | 1.00000 |
| Flamingo | 352 × 192 | 3.76710 | 6.00000 |
| Rhino | 8 × 8 | 0.00580 | 0.00001 |
| Rhino | 128 × 128 | 1.24229 | 1.06667 |
| Rhino | 320 × 240 | 4.02550 | 5.30000 |
| Train | 8 × 8 | 0.00420 | 0.00001 |
| Train | 128 × 128 | 1.16879 | 1.00000 |
| Train | 352 × 192 | 4.49179 | 5.00000 |
| Viptrain | 8 × 8 | 0.01139 | 0.00001 |
| Viptrain | 128 × 128 | 1.20070 | 1.46667 |
| Viptrain | 360 × 240 | 5.49750 | 5.20000 |

Table 7 shows the comparison between actual and decrypted frames. Table 8 shows a detailed comparison between the proposed approach and other existing video encryption models. It can be observed that the UACI and NPCR values obtained by the proposed method are very close to the ideal values mentioned in Section 5.1. The ideal value of CC is 0. From Table 8, it is clear that the CC value of the proposed approach is closer to 0 than that of the other methods under inspection. Similarly, the maximum value of entropy is 8, and the entropy of the proposed method is closer to 8 than that of the other methods.

Table 7: Original and decrypted frames (columns: Video, Original frame, Decrypted frame; rows: Flamingo, Rhino, Train, Viptrain)

Table 8: Comparison with existing approaches

| Criteria | Valli and Ganesan [38] | Deshmukh and Kolhe [40] | Ranjith kumar et al. [20] | Proposed approach (Red) | Proposed approach (Green) | Proposed approach (Blue) |
|---|---|---|---|---|---|---|
| Video | Rhino.mp4 | Foreman.mpeg [40] | Rhino.mp4 | Rhino.mp4 | | |
| Frame size | - | - | - | 128 × 128 | | |
| NPCR | 99.4518 | - | 99.51 | 99.61549 | 99.60130 | 99.61276 |
| UACI | 33.63 | - | 33.54 | 33.23251 | 33.69045 | 32.77237 |
| CC h | 0.0181 | −0.0112 | 0.0324 | 0.009020 | 0.013223 | 0.006237 |
| CC v | 0.0140 | −0.0813 | 0.0261 | −0.01490 | 0.004301 | −0.00707 |
| CC d | 0.0107 | 0.0009 | 0.0263 | 0.000101 | −0.00575 | −0.00170 |
| Entropy analysis | - | 7.941 | - | 7.99757 | 7.99762 | 7.99746 |
| Histogram | Uniform | - | Uniform | Uniform | Uniform | Uniform |
| MSE | - | - | - | 51.312109 | 50.851566 | 47.275387 |
| PSNR | - | - | - | 31.423453 | 31.3908351 | 31.768447 |
| Key space | Size × 2^128 | 128, 192, 256 | 2^212 | SHA-2 | | |
| Encryption time + key generation (s) | 1.2147 (without key generation) | 1.122 (without key generation) | 2.85878 | 1.24229 | | |
| Decryption time (s) | - | 2.540 | 1.85082 | 1.06667 | | |

MSE and PSNR values play an important role in determining the quality of image encryption and decryption, respectively. Hence, the values obtained from the MSE and PSNR analysis are also included in the table. Finally, the time taken for encryption and decryption is compared to identify the faster algorithm among those under comparison.

6 Conclusion and future works
The data presented in Section 5 indicate that the proposed approach for video encryption and decryption is more secure and faster than the existing methods compared in Table 8. The keys produced by combining SHA-2 with cosine-based ILM are more uniform and nonlinear. As a result, the obtained values of the various testing parameters are very close to the ideal values and indicate that the proposed approach is well suited to secure video encryption.

In the real world, this method of encryption and decryption can be used to send and receive sensitive medical, military, or any other video data of high importance. Furthermore, this approach can also be integrated with social media to make data sharing more reliable. In the future, efforts could be made to incorporate audio encryption into the process and to achieve better results in less time.
Open Computer Science – de Gruyter
Published: Jan 1, 2022
Keywords: ILM; chaotic maps; UACI; Entropy; PSNR