1 Introduction

The entire world is experiencing a continuing pandemic of coronavirus disease (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Abrams et al. 2020). It arose in Wuhan, the capital of Hubei Province in China, in December 2019 (WH Organization et al. 2020). The virus was identified on 7th January, and it was found to spread by human-to-human transmission through direct contact or droplets (Wang et al. 2020; Cucinotta and Vanelli 2020). COVID-19 was estimated to have an average incubation period of 6.4 days and a basic reproduction number of 2.24–3.58. It spread over the entire world, and so the World Health Organization (WHO) declared COVID-19 a worldwide outbreak on 11th March 2020 (Huang et al. 2020).

COVID-19 shares several taxonomic features with other members of the coronavirus family. All such viruses hold several essential proteins fastened in the viral membrane. It is worth noting that the viral envelope displays a large diameter, nearly double that of a standard biological membrane (Bárcena et al. 2009). The genome of SARS-CoV-2 includes six notable open reading frames (ORFs), as commonly found in other CoVs. Some of the genes share less than 80% nucleotide sequence identity with SARS-CoV (Zhou et al. 2020). COVID-19 is sensitive to ultraviolet rays and heat. There is a common misconception that at 27 °C this virus might disappear. Additionally, COVID-19 may be inactivated by chloroform, peroxyacetic acid, chlorine-containing disinfectants, and ether (75 percent), but not by chlorhexidine (Cascella et al. 2020).

A large-scale study of 1995 patients showed that the primary clinical symptoms are dyspnea (21.9% of cases), expectoration (28.2% of cases), fatigue or myalgia (35.8% of cases), cough (68.6% of cases), and fever (88.5% of cases). In contrast, the minor ones include nausea and vomiting (3.9% of cases), diarrhea (4.8% of cases), and headache or dizziness (12.1% of cases) (Lq et al. 2020). Transmission of the novel coronavirus, like many pathogens, is thought to occur through respiratory droplets. Thus, the vast majority of spreading events are restricted to adjacent spaces (Cascella et al. 2020).

SARS-CoV-2 is a pathogenic human coronavirus belonging to the betacoronavirus genus. In the last two decades, the two pathogenic species MERS-CoV and SARS-CoV caused outbreaks in 2012 and 2002 in the Middle East and China, respectively (Lu et al. 2020; Cui et al. 2019). A Chinese laboratory deposited the whole genomic sequence (Wuhan-Hu-1) of this large RNA virus (SARS-CoV-2) in the NCBI GenBank on 10th January (Yang 2020). SARS-CoV-2 is a positive-stranded RNA virus (Lu et al. 2020).

According to the WHO, no anti-inflammatory medicines or vaccines are yet available for this pandemic (Basu and Chakraborty 2020), and medical industries are working hard to develop a vaccine. A vaccine may take at least 18–24 months to become available, even with fast-tracking of the normal vaccine development interval of 5–10 years, and may take additional time to become accessible to the large populations of the world (Grenfell and Drew 2020). Additionally, we do not know how long a vaccine could remain effective, since the virus mutates. Every effort has been made to slow down the spread of the coronavirus and to prepare medical systems to protect front-line medical staff with sufficient supplies of protective equipment such as personal protective equipment (PPE), masks, and other essentials. Consequently, if we knew the number of new coronavirus cases for the next ten days in advance, we could plan the necessary actions. Compared to Asian countries, the USA has been greatly affected by COVID-19. A summary of USA COVID-19 cases from Feb 2020 to Sep 2020 is illustrated in Fig. 1.

Fig. 1 USA COVID-19 cases summary from Feb 2020 to Sep 2020

Artificial intelligence is key to the success of healthcare technologies (Panch et al. 2019). Structured data from smart devices increases the efficiency of machine learning in healthcare (Knight et al. 2016). Several COVID-19 forecasting approaches based on machine learning, deep learning, and statistical learning have been proposed in the past few weeks. However, the primary issue is that the machine learning approaches lack temporal components and nonlinearity, while the deep learning approaches are limited to comparative analysis and uni-model forecasting (Benvenuto et al. 2020; Wieczorek et al. 2020a). Furthermore, some studies considered epidemiological models that require hypothesis-based parameter initialization. Such models tend to lower the overall precision because they under-fit the data (Wieczorek et al. 2020a; Gao et al. 2019).

Several optimization algorithms have been used in previous studies to solve time series problems for the weight optimization of neural networks, such as the arithmetic optimization algorithm (Abualigah et al. 2021), group search optimizer (Abualigah 2020), dragonfly algorithm (Alshinwan et al. 2021), genetic algorithm (Momani et al. 2016), reproducing kernel algorithm (Arqub et al. 2017; Arqub 2017) and fuzzy conformable fractional approaches (Arqub and Al-Smadi 2020).

To predict the distribution of COVID-19 in various regions, the authors of (Prasanth et al. 2021) used Google Trends term frequencies and ECDC data. To pick the effective COVID-related search terms, they used the Spearman correlation. For the optimization of the hyperparameters of the LSTM network, they proposed a new technique based on the meta-heuristic grey wolf optimizer (GWO) algorithm.

Three approaches are suggested (Abbasimehr and Paki 2021) that combine Bayesian optimization and deep learning. The optimized values for hyperparameters are effectively chosen by Bayesian optimization in their process. The system architecture is considered to be a process of multiple-output forecasting. Their proposed methods performed better than the reference model on data from the COVID-19 time series.

In order to forecast the COVID-19 outbreak in Saudi Arabia, a study of various deep learning models is proposed (Elsheikh et al. 2021). Officially recorded data was used to evaluate the models. The optimal values of the model parameters that maximize the forecasting accuracy were determined, and seven statistical evaluation measures were used to assess the accuracy of the models.

However, most of the previous studies on COVID-19 did not consider the hyperparameter optimization of neural networks, which can help boost the performance of the models.

To overcome the issues mentioned above, we propose a deep learning model that predicts real-time transmission using an optimized LSTM. For the optimization of the LSTM, we employ the bat algorithm (BA). To further deal with the premature convergence (Perwaiz et al. 2020; Rauf et al. 2020b) and local minima problems (Rauf et al. 2020a) of BA, we propose an enhanced variant of BA. The proposed variant consists of two significant enhancements. Firstly, we introduce a Gaussian adaptive inertia weight to control the individual velocity in the entire swarm. Secondly, we substitute the random walk with a Gaussian walk to improve the local search mechanism.

Table 1 Recent related works with their dataset details and results

2 Methodology

2.1 Proposed BA

Real-world challenges are becoming more complicated every day. Swarm intelligence (SI) is a subset of meta-heuristic algorithms employed to tackle complex optimization problems of a continuous nature. We use the self-learning nature of this meta-heuristic to optimize the neural network training parameters. Such features imply that local interaction between the components of a swarm-based system is essential to preserve its survival.

In this research, we employ an enhanced version of BA to optimize LSTM training weights. The optimized LSTM dynamically adopts optimal training parameters and decides the execution cycle timeline based on the global convergence behaviour of the enhanced BA. We introduce two modifications to the classical BA. Firstly, we propose a Gaussian adaptive inertia weight to improve the velocity updating mechanism. Secondly, we update each individual's local searching strategy to retain local solutions based on the weighted mean of their personal best and the current global solution of the entire swarm.

Properties of standard BA are as follows:

  • Every micro-bat estimates the distance to surrounding objects and prey by utilizing its echolocation ability.

  • A frequency from a fixed range, together with varying loudness and distinct wavelengths, is used to update the micro-bat's velocity and position while it searches for prey.

  • The pulse emission rate increases, adjusting the pulse frequency, as the distance between the micro-bat and its prey decreases.

  • Loudness decreases from a large positive value to a smaller value.

BA follows three fundamental rules to converge toward an optimal solution.

  • Each bat is represented by \({\overline{x}}^t_i\) for \(i=\{1,2,3\dots {\overline{N}}_p\}\), with the whole population \({\overline{N}}_p\) in an entire search space S, and uses sonar echolocation to sense the prey and estimate its distance.

  • During the convergence process, each bat \({\overline{x}}^t_i\) moves with velocity \({\overline{v}}^t_i\) and a frequency \(f^t_{min}\). The current position of each individual can be represented by \({\overline{x}}^t_{ip}\), where p represents the partial coordinate of the current search space. The frequency \(f^t_{min}\) is combined with the bat wavelength \(\omega \) and the loudness variation \(A_o\).

  • The variation of loudness \(A_o\) depends on the current location \({\overline{x}}^t_{ip}\) and the weighted distance \(D^t_{ip}\).

A population of fixed size \(S_p\), in our case \(S_p=40\), is initialized with random initial values following the uniform distribution \({\overline{x}}^t_i\in [{\overline{x}}_l,{\overline{x}}_u]\), where l and u are the lower and upper limits of the uniformly distributed sequence. After population initialization, mutation operators are used to encourage the bats' movement in the multidimensional search space. The ultimate objective of this phase is to obtain a new local solution, while the frequency factor \(f^t_{min}\) controls the step size of the solution. For each individual \({\overline{x}}^t_i\), the current frequency \(f^t_i\), current velocity \({\overline{v}}^t_i\), and current bat position \({\overline{x}}^t_{ip}\) can be updated using the following equations.

$$\begin{aligned}&f^t_i=f^t_{min}+\left( f^t_{max}-f^t_{min}\right) .R \end{aligned}$$
(1)
$$\begin{aligned}&{\overline{v}}^{t+1}_i={\overline{v}}^t_i+\left( {\overline{x}}^t_{ip} -{\overline{x}}^t_{ig}\right) .f^t_i \end{aligned}$$
(2)
$$\begin{aligned}&{\overline{x}}^{t+1}_{ip}={\overline{x}}^t_{ip}+{\overline{v}}^{t+1}_i. \end{aligned}$$
(3)

Referring to equation 1, \(f^t_{max}-f^t_{min}\) is the difference between the upper and lower frequency bounds, and R indicates a random number over the interval [0, 1]. The velocity of each individual \({\overline{x}}^t_i\) can be updated using equation 2, where \({\overline{x}}^t_{ip}-{\overline{x}}^t_{ig}\) is the difference between the local solution \({\overline{x}}^t_{ip}\) of the swarm and the global solution \({\overline{x}}^t_{ig}\) of all swarms. Likewise, the new solution vector \({\overline{x}}^{t+1}_{ip}\) can be determined using equation 3.
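
For illustration, a minimal NumPy sketch of this update step (equations 1–3) is given below; the frequency bounds, random seed, and example search-space limits are assumptions for the sketch rather than settings taken from the experiments.

import numpy as np

rng = np.random.default_rng(0)

def ba_move(x, v, x_best, f_min=0.0, f_max=2.0):
    # equation 1: draw a frequency for every bat using a random number R in [0, 1]
    R = rng.random((x.shape[0], 1))
    f = f_min + (f_max - f_min) * R
    # equation 2: update velocities using the difference to the global best position
    v_new = v + (x - x_best) * f
    # equation 3: move every bat with its new velocity
    x_new = x + v_new
    return x_new, v_new

# 40 bats initialised uniformly in [x_l, x_u] over a 3-dimensional search space
x_l, x_u = -5.0, 5.0
x = rng.uniform(x_l, x_u, size=(40, 3))
v = np.zeros_like(x)
x_best = x[0].copy()          # placeholder global best for the sketch
x, v = ba_move(x, v, x_best)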

In the proposed BA, we introduce a Gaussian adaptive inertia weight to update the velocity in such a manner that overly long jumps (excessive exploration) and overly short jumps (excessive exploitation) are avoided. The proposed Gaussian adaptive inertia weight helps the velocity updating mechanism achieve optimal convergence steps for each individual. The Gaussian function can be defined as:

$$\begin{aligned} f\left( x\right) =xe^{-\frac{{(a-y)}^2}{{2z}^2}} \end{aligned}$$
(4)

where x, y, and z are real constants that can be varied according to the nature of the problem. The bell-shaped curve of the Gaussian distribution indicates the height of the curve and can help the population control the exploration process through the following probability density function.

$$\begin{aligned} g\left( x\right) =\frac{1}{\partial \sqrt{2\pi }}e^{-\frac{{\left( a-\grave{a} \right) }^2}{2{\partial }^2}}. \end{aligned}$$
(5)

In equation 5, \(\grave{a} =y\) can be interpreted as the expected value, with variance \({\partial }^2=z^2\).

In order to generate optimal location vectors \({\overline{g}}^{t+1}_i\) through Gaussian distribution over t iterations and D dimensions, the mathematical definition following the adaptive process can be:

$$\begin{aligned} {\overline{g}}^{t+1}_i={\overline{g}}_{min}+\left( {\overline{g}}_{max} -{\overline{g}}_{min}\right) *{\overline{g}}^t_i \end{aligned}$$
(6)

where \({\overline{g}}_{max}\) and \({\overline{g}}_{min}\) are the upper and lower bounds of the interval [0, 1] of the Gaussian distribution. The proposed BA utilizes the following equation to update the velocity of each bat \({\overline{v}}^{t+1}_{gi}\).

$$\begin{aligned} {\overline{v}}^{t+1}_{gi}={\overline{g}}^{t+1}_i*{\overline{v}}^t_i +\left( {\overline{x}}^t_{ip}-{\overline{x}}^t_{ig}\right) .f^t_i. \end{aligned}$$
(7)

In equation 7, \({\overline{g}}^{t+1}_i\) is the proposed Gaussian adaptive inertia weight factor, which controls exploration and exploitation during the entire convergence process. The Gaussian bell curve in the adaptive inertia weight dynamically adjusts each bat's speed and helps the bat holding the local best vector to escape local minima. Apart from the velocity \({\overline{v}}^{t+1}_{gi}\), the updated local solutions \({\overline{x}}^{new}_{ip}\) play an essential role in the exploitation phase. If the speed is regulated but the newly generated local solutions \({\overline{x}}^{new}_{ip}\) are not robust enough to stay within the neighbourhood of the entire swarm's global best \({\overline{x}}^t_{ig}\), premature convergence can occur. Standard BA uses the following equation to select the best solution among all existing vectors in the swarm:

$$\begin{aligned} {\overline{x}}^{new}_{ip}={\overline{x}}^t_{ig}+\varepsilon A^t_i. \end{aligned}$$
(8)

\(\varepsilon \) is a random walk generator over \([0,\ 1]\) and \(A^t_i\) represents the average loudness factor. The random walk can produce the best solution in the current iteration t and the worst one in the next iteration \(t+1\). The individual holding the local best will then follow the best solution \({\overline{x}}^t_{ig}\), which becomes the worst in the next iteration \(t+1\), leading to local minima and premature convergence. To avoid this random selection, which leads to a poor local best solution and harms exploitation, we replace the random walk with a Gaussian walk and propose a local search mechanism. Our proposed variant of BA uses the following equation to attain the local best solution \({\overline{x}}^{new}_{iG}\).

$$\begin{aligned} {\overline{x}}^{new}_{iG}={\overline{x}}^t_{ig}+{\overline{g}}^{t+1}_i ({\overline{x}}^t_{ig}-{\overline{P}}^t_{ig})+\varepsilon A^t_i. \end{aligned}$$
(9)

In the proposed equation 9, \({\overline{g}}^{t+1}_i\) is the previously computed Gaussian distribution and \({\overline{x}}^t_{ig}-{\overline{P}}^t_{ig}\) is the difference between the local best of the swarm \({\overline{x}}^t_{ig}\) and the personal best \({\overline{P}}^t_{ig}\) of each bat. The proposed method iteratively evaluates the local best solution \({\overline{x}}^t_{ig}\) and the personal best \({\overline{P}}^t_{ig}\) for each bat in the population and checks the following condition to decide which solution to use.

$$\begin{aligned} {\overline{x}}^{new}_{iG}=\left\{ \begin{array}{ll} {\overline{x}}^t_{ig} &{} \qquad if({\overline{x}}^t_{ig}>{\overline{P}}^t_{ig}) \\ {\overline{x}}^t_{ig}-{\overline{P}}^t_{ig}&{} \qquad { Otherwise} \end{array}\right. . \end{aligned}$$
(10)

Referring to equation 10, the swarm local best \({\overline{x}}^t_{ig}\) is kept as the new local best if it exceeds the bat's personal best; otherwise, the difference of the local best \({\overline{x}}^t_{ig}\) and the personal best \({\overline{P}}^t_{ig}\) is chosen as the new local best.
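
A hedged sketch of the two proposed modifications, the Gaussian adaptive inertia weight of equations 6–7 and the Gaussian-walk local search of equations 9–10, is shown below; the helper names and the fitness-based reading of the condition in equation 10 are our own illustrative assumptions rather than the authors' exact implementation.

import numpy as np

rng = np.random.default_rng(1)

def gaussian_inertia(g_prev, g_min=0.0, g_max=1.0):
    # equation 6: adapt the Gaussian inertia weight from its previous value
    return g_min + (g_max - g_min) * g_prev

def velocity_update(v, x, x_best, f, g):
    # equation 7: the inertia weight g scales the previous velocity so that
    # jumps are neither too long (over-exploration) nor too short (over-exploitation)
    return g * v + (x - x_best) * f

def gaussian_walk(x_local_best, p_best, g, loudness):
    # equation 9: Gaussian walk around the swarm's local best instead of a pure random walk
    eps = rng.random()
    return x_local_best + g * (x_local_best - p_best) + eps * loudness

def select_local_best(x_local_best, p_best, fitness):
    # equation 10: keep the swarm local best when it beats the personal best,
    # otherwise use the difference of the two solutions (our reading of the condition)
    if fitness(x_local_best) > fitness(p_best):
        return x_local_best
    return x_local_best - p_best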

The N new local bests \({\overline{x}}^{new}_{iG}\) are controlled by the convergence rate, which is governed by two critical factors, the loudness \({\overline{A}}^t_i\) and the pulse emission rate \({\overline{r}}^t_i\), updated through the following two equations.

$$\begin{aligned}&{\overline{A}}^{t+1}_i=\alpha {\overline{A}}^t_i \end{aligned}$$
(11)
$$\begin{aligned}&{\overline{r}}^t_i={\overline{r}}^0_i[1-\mathrm {exp}\mathrm {}(-{\gamma }^t)]. \end{aligned}$$
(12)
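
The two convergence controls of equations 11–12 translate directly into code; the values of \(\alpha \) and \(\gamma \) below are commonly used BA defaults and are only assumptions here.

import numpy as np

def update_loudness(A, alpha=0.9):
    # equation 11: loudness decays from a large positive value towards a smaller one
    return alpha * A

def update_pulse_rate(r0, t, gamma=0.9):
    # equation 12: pulse emission rate grows back towards its initial value r0
    return r0 * (1.0 - np.exp(-gamma * t))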

2.2 Optimized Long Short-Term Memory (LSTM)

The recurrent neural network (RNN) has turned out to be a highly reliable algorithm for prediction, as essential features are extracted automatically from the training samples (Jiang and Schotten 2020). RNNs perform well at data processing and ensure encouraging outcomes for time series prediction while keeping extensive information in the internal state (Connor et al. 1994). Nevertheless, they might take much training time due to the exploding and vanishing gradient problems (Tomar and Gupta 2020). Hence, in 1997 the long short-term memory (LSTM) RNN structure was designed by Hochreiter and Schmidhuber (Hochreiter and Schmidhuber 1997) to overcome this flaw by administering long-term dependencies through multiplicative gates that handle the memory cells and the flow of information in the recurrent hidden layer. The LSTM architecture comprises four gates, i.e., input gate, output gate, control gate, and forget gate (Tomar and Gupta 2020).

The input gate can be defined as:

$$\begin{aligned} i_t=\sigma (W_i*\left[ h_{t-1},\ x_t\right] +b_i). \end{aligned}$$
(13)

The information extracted from the above equation is transferred to the cell. The forget gate decides which data from the previous layer's input will be ignored, using the following equation:

$$\begin{aligned} f_t=\sigma (W_f*\left[ h_{t-1},\ x_t\right] +b_f). \end{aligned}$$
(14)

The content of the memory cell is controlled by the control gate through the following equations:

$$\begin{aligned}&{\tilde{C}}_t=\mathrm {tanh}\mathrm {}(W_c*\left[ h_{t-1},\ x_t\right] +b_c) \end{aligned}$$
(15)
$$\begin{aligned}&C_t=f_{t\ }*\ C_{t-1}+i_t*\ {\tilde{C}}_t \end{aligned}$$
(16)

The output gate and the hidden state \(h_t\) are updated as follows:

$$\begin{aligned}&O_t=\sigma (W_o*\left[ h_{t-1},\ x_t\right] +b_o) \end{aligned}$$
(17)
$$\begin{aligned}&h_t=O_t*\mathrm {tanh}\mathrm {}(C_t). \end{aligned}$$
(18)

The tanh function normalizes values to the interval [-1, 1], W denotes the weight matrices, and \(\sigma \) is the sigmoid activation function.
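
The gate equations 13–18 can be traced with a compact NumPy step; the weight layout (one matrix and bias per gate, acting on the concatenation of the previous hidden state and the current input) is an assumption made for the sketch.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # W and b hold one weight matrix / bias vector per gate: 'i', 'f', 'c', 'o'
    z = np.concatenate([h_prev, x_t])                 # [h_{t-1}, x_t]
    i_t = sigmoid(W['i'] @ z + b['i'])                # input gate,   equation 13
    f_t = sigmoid(W['f'] @ z + b['f'])                # forget gate,  equation 14
    c_tilde = np.tanh(W['c'] @ z + b['c'])            # control gate, equation 15
    c_t = f_t * c_prev + i_t * c_tilde                # cell state,   equation 16
    o_t = sigmoid(W['o'] @ z + b['o'])                # output gate,  equation 17
    h_t = o_t * np.tanh(c_t)                          # hidden state, equation 18
    return h_t, c_t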

We feed the learning rate, momentum rate, and dropout rate of each LSTM dropout layer to the BA for automatic optimization of the hyperparameters. Each parameter is examined before the classification layer of the LSTM to determine BA's optimal global solution. If the fitness function produces identical values, the proposed algorithm checks the next generation to avoid premature convergence.

Hyperparameters of each hidden layer \(h_{t-1}\) for \(t=\{1,2,3\dots N\}\) are optimized by providing the global solution \({\overline{x}}^{new}_{iG}\) obtained using equation 9. The output layer of the optimized LSTM can be interpreted as:

$$\begin{aligned} O_t=\sigma \left( W_o*\left[ h_{t-1}\left( \left\{ \begin{array}{ll} {\overline{x}}^t_{ig} &{} \qquad if({\overline{x}}^t_{ig}>{\overline{P}}^t_{ig}) \\ {\overline{x}}^t_{ig}-{\overline{P}}^t_{ig} &{} \qquad { Otherwise} \end{array}\right. \right) ,\ x_t\right] +b_o\right) \end{aligned}$$
(19)

where each hidden layer chooses either the global best of the entire population \({\overline{x}}^t_{ig}\) or the difference of the local best and the personal best \({\overline{x}}^t_{ig}-{\overline{P}}^t_{ig}\). The pseudocode of the proposed algorithm is presented in Algorithm 1.

We also checked the impact of single-parameter optimization on the proposed technique and observed that optimizing only the learning rate has a negligible impact on the performance of the proposed LSTM. However, the collective optimization of the learning rate, momentum rate, and dropout rate increases the overall performance of the proposed LSTM.
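
A possible Keras-style fitness function linking a BA solution vector to the LSTM is sketched below; the hidden size, optimizer, epoch count, and data variables are illustrative assumptions, not the exact configuration of the proposed model.

from tensorflow import keras

def fitness(solution, X_train, y_train, X_val, y_val):
    # solution = [learning_rate, momentum_rate, dropout_rate], proposed by the BA
    lr, momentum, dropout = solution
    model = keras.Sequential([
        keras.layers.LSTM(64, input_shape=X_train.shape[1:]),
        keras.layers.Dropout(dropout),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=lr, momentum=momentum),
                  loss='mean_absolute_percentage_error')
    model.fit(X_train, y_train, epochs=20, batch_size=16, verbose=0)
    # validation MAPE is the value the BA tries to minimize
    return model.evaluate(X_val, y_val, verbose=0)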

Algorithm 1 Pseudocode of the proposed optimized LSTM

Fig. 2 Proposed architecture of optimized LSTM

3 Experiments

The WHO has reported outbreaks of COVID-19 in states and regions around the world. Several areas of North and South America, in particular, have witnessed the adverse effects of a massive COVID-19 surge. Heavy air traffic between the states of the USA has allowed COVID-19 to propagate from its source to newly infected states, and individual-to-individual spread has been reported among travelers worldwide. The primary goal of this research is the prediction and forecasting of the epidemic spread of COVID-19. The study uses the counts of confirmed and recovered cases obtained regularly from the WHO website. We consider the USA for the experiments and employ a live dataset updated daily. The utilized dataset is available at (WHO 2020).

The experiments are conducted in Python using the Keras, TensorFlow, NumPy, and iplot packages. To compare the performance of the proposed optimized LSTM, we tested other standard forecasting algorithms, i.e., simple LSTM, GRU, and RNN.
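
For reference, the three baseline networks can be assembled in Keras roughly as follows; the layer width, window length, and loss are assumptions for illustration only.

from tensorflow import keras

def make_model(cell='LSTM', units=64, n_steps=10, n_features=1):
    # map the baseline name to the corresponding Keras recurrent layer
    layer = {'LSTM': keras.layers.LSTM,
             'GRU': keras.layers.GRU,
             'RNN': keras.layers.SimpleRNN}[cell]
    model = keras.Sequential([
        layer(units, input_shape=(n_steps, n_features)),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

baselines = {name: make_model(name) for name in ('GRU', 'RNN', 'LSTM')}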

3.1 Results

This study provides an optimized deep-learning model for COVID-19’s time series analysis of the USA. The proposed framework dynamically selects optimal training parameters and determines the execution cycle based on enhanced BA’s global convergence manner.

The forecasting of COVID-19 was carried out in two preliminary stages: data training and evaluation. To compare the proposed variant with existing algorithms, we used five evaluation metrics, namely root mean square error (RMSE), mean absolute percentage error (MAPE), standard deviation (Stdev), prediction interval, and accuracy. RMSE, MAPE, and Stdev are defined by the following equations:

$$\begin{aligned} RMSE={\left[ \sum ^N_{i=1}{\frac{{(a_i-a_o)}^2}{N}}\right] }^{\frac{1}{2}} \end{aligned}$$
(20)

where \((a_i-a_o)^2\) represents the squared difference between the forecasted and actual values and N is the number of samples.

$$\begin{aligned} MAPE=\frac{1}{n}\sum {\frac{\left| e\right| }{d}} \end{aligned}$$
(21)

where \(\left| e\right| \) indicates the absolute error and d the actual value (demand) for each period.

$$\begin{aligned} Stdev=\sqrt{\frac{1}{N-1}\sum ^N_{i=1}{{(x_i-\overline{x})}^2}}. \end{aligned}$$
(22)

In the above equation, \(\overline{x}\) is the mean of the samples \(x_i\) and N indicates the total number of instances.
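
Equations 20–22 correspond to the following NumPy computations (a minimal sketch; MAPE is expressed in percent, matching the values reported later).

import numpy as np

def rmse(actual, forecast):
    # equation 20: root mean square error
    return np.sqrt(np.mean((forecast - actual) ** 2))

def mape(actual, forecast):
    # equation 21: mean absolute percentage error, in percent
    return np.mean(np.abs(forecast - actual) / actual) * 100.0

def stdev(values):
    # equation 22: sample standard deviation
    return np.std(values, ddof=1)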

The raw data is pre-processed and standardized in the initial stages and subsequently used to develop the optimized predictive model based on LSTM. The model’s boundary parameters are selected so that the MAPE can be minimized. From a particular stage on, the optimized LSTM with the optimal learning parameters is used in the testing process to predict the extent of COVID-19 cases in the USA.
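
One common way to standardize the series and build supervised windows consistent with this description is sketched below; the window length and min–max scaling are assumptions, not necessarily the exact pre-processing pipeline used.

import numpy as np

def make_windows(series, n_steps=10):
    # scale the daily case counts to [0, 1] and build (window, next value) pairs
    series = np.asarray(series, dtype=float)
    lo, hi = series.min(), series.max()
    scaled = (series - lo) / (hi - lo)
    X, y = [], []
    for i in range(len(scaled) - n_steps):
        X.append(scaled[i:i + n_steps])
        y.append(scaled[i + n_steps])
    X = np.array(X)[..., None]        # shape (samples, n_steps, 1) for the recurrent models
    return X, np.array(y), (lo, hi)   # keep (lo, hi) to invert the scaling of predictions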

Table 2 presents the empirical results for confirmed and predicted cases obtained through GRU, RNN, LSTM, and the optimized LSTM. RMSE shows the root mean square error of each network during training. MAPE is the percentage loss (the complement of the accuracy), and Stdev shows the deviation between confirmed and predicted COVID-19 cases. The prediction interval represents the day-to-day difference between forecasted and confirmed cases.

We applied the Kruskal–Wallis statistical test to the experimental results, comparing them with other published methods. The average rank, median value, and Z-score obtained through the Kruskal–Wallis test for each employed algorithm are presented in Table 5.

Table 2 Comparison of proposed optimized LSTM with other standard deep learning forecasting models

Likewise, training and validation loss minimization curves using GRU, RNN, LSTM, and optimized LSTM are illustrated in Figs. 3, 4, 5, and 6. The convergence curves of real and forecasted COVID-19 cases through optimized LSTM in the USA are presented in Fig. 7.

Fig. 3 Training and validation loss minimization curves using GRU

Fig. 4 Training and validation loss minimization curves using RNN

Fig. 5 Training and validation loss minimization curves using LSTM

A comparison of the proposed optimized LSTM with other standard deep learning forecasting models is tabulated in Table 4. We take the forecasting dates from 1/9/20 to 10/9/20, and to validate the predicted values, we retain the previous ten days of cases, 22/8/20 to 31/8/20. Referring to Table 4, actual confirmed cases were not yet available for the USA beyond 31/8/20; the predicted columns show the cases forecasted through the existing GRU, RNN, LSTM, and the proposed optimized LSTM, respectively.

To validate the performance of the proposed optimized LSTM, Fig. 8 presents the forecasting curves of the different networks compared to the actual number of cases.

Fig. 6 Training and validation loss minimization curves using optimized LSTM

Fig. 7 Convergence of real and forecasted COVID-19 cases through optimized LSTM in the USA

Comparison of proposed optimized LSTM with other variants of LSTM and other deep learning models is given in Table 3.

Fig. 8 Predicted cases comparison of optimized LSTM with GRU, RNN, and LSTM

3.2 Analysis

Table 2 shows that GRU obtained the worst accuracy, with an RMSE of 1786.613 and a Stdev of 3261.895, indicating a significant difference between actual and predicted COVID-19 cases. After GRU, the standard LSTM performed better, with a prediction interval of 2688.245 and a MAPE of 12.12. The performance of RNN is relatively good compared to GRU and LSTM, with 91% accuracy and a Stdev of 1371.55. Lastly, the proposed optimized LSTM outperformed all other deep learning models, with an RMSE of 32.99 (better than GRU), a MAPE of 0.4838 (better than LSTM), and a deviation of only 60.23 between confirmed and predicted cases.

Furthermore, the validation losses of GRU and RNN are not stable throughout the learning process and exceed 0.5 and 0.7, respectively (refer to Figs. 3 and 4). From Fig. 5, the validation loss of LSTM is stable compared to GRU and RNN throughout the learning process but remains above 0.40. In contrast to GRU, LSTM, and RNN, the proposed model reduces the validation loss to 0.04 and shows a better capability for loss minimization (refer to Fig. 6).

The performance of the proposed optimized LSTM can be confirmed through Fig. 7, where the USA's actual cases on 31/8/20 were 6,030,587, and the predictions were 3,734,918, 5,328,279, 7,653,031, and 6,097,641 using GRU, RNN, LSTM, and optimized LSTM, respectively.

Table 3 Comparison of proposed optimized LSTM with other variants of LSTM and other deep learning models
Table 4 Comparison of proposed optimized LSTM with other standard deep learning forecasting models
Table 5 Kruskal–Wallis test: proposed LSTM vs recent state-of-the-art algorithms

From Table 5, it can be observed that the proposed LSTM obtained the best mean rank of 17.0 through the Kruskal–Wallis test compared to the others; advanced algorithms such as NAdam obtained a mean rank of 41, and the two LSTM variants obtained mean ranks of 16 and 13, respectively. Similarly, the proposed LSTM outperformed other published results by obtaining the best positive Z-score of 163.

We can conclude that using the proposed optimized framework can help the USA and other governments predict the actual cases with 99 % accuracy and take precautionary measures in advance.

4 Conclusion

This research offers an optimized LSTM to forecast COVID-19 cases in the USA. Many machine learning and deep learning approaches are available to forecast confirmed cases, but they lack both the optimized temporal aspect and nonlinearity. To overcome this issue, we applied the BA to the optimization of the LSTM. In addition, we implemented an enhanced BA variant to tackle BA's premature convergence and local minima problems. The proposed version of BA uses a Gaussian adaptive inertia weight to control the individual velocity in the swarm, and replaces the random walk with a Gaussian walk to improve the local search. The robust local search mechanism assists LSTM hyperparameter optimization during the training process. The proposed optimized LSTM is compared with GRU, RNN, and LSTM. Empirical results reveal that the optimized LSTM reduces the MAPE to 0.48, which is far better than the existing algorithms.

In future work, we intend to adopt other evolutionary models, such as the genetic algorithm and differential evolution, in the regression-based deep learning model for multivariate forecasting of a pandemic.