Simulation and Empirical Studies of Long Short-Term Memory Performance to Deal with Limited Data

Authors

  • Khusnia Nurul Khikmah Statistics Department, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia https://orcid.org/0000-0002-9142-6968
  • Kusman Sadik Statistics Department, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia
  • Khairil Anwar Notodiputro Statistics Department, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia

DOI:

https://doi.org/10.15575/join.v10i1.1356

Keywords:

Functional autoregressive, Long short-term memory, Multiple-Long Short-Term Memory, Poverty, Simulation study

Abstract

This research evaluates the performance of time series machine learning in the presence of noise, with the aim of forecasting time series data. The chosen approach is long short-term memory (LSTM), a development of the recurrent neural network (RNN). A further problem is data availability, which concerns not only high-dimensional data but also limited data. This study therefore tests the performance of LSTM on simulated data, generated both from a functional autoregressive (FAR) model and from a first-order functional autoregressive model, FAR(1), with additive noise. Simulation results show that the LSTM method applied to noisy time series data outperforms by 1-5% its application to noise-free data and to data with limited observations. The best-performing method is determined by an analysis of variance on the mean absolute percentage error (MAPE). The empirical data used in this study are the percentages of poverty, unemployment, and economic growth in Java; the method that performs best on each poverty dataset is then used to forecast that data. For the empirical data, the multiple-LSTM (M-LSTM) method outperforms LSTM in analyzing the poverty percentage data, with the best performance determined by an average MAPE of 1-10%.
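The simulation design described above, generating FAR(1) curves with additive noise and scoring forecasts by MAPE, can be sketched as follows. This is a minimal illustration only: the Gaussian kernel, grid size, and noise level are assumptions for demonstration, not the paper's actual settings.

```python
import numpy as np

def simulate_far1(n_curves=100, n_points=50, noise_sd=0.1, seed=42):
    """Simulate a discretized FAR(1) process X_{t+1} = Psi(X_t) + eps_t,
    where Psi is an integral operator with a Gaussian kernel (illustrative)."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, 1.0, n_points)
    kernel = np.exp(-((s[:, None] - s[None, :]) ** 2) / 0.1)
    # Rescale so the operator's spectral radius is below 1 (stationarity).
    kernel *= 0.5 / np.abs(np.linalg.eigvals(kernel)).max()
    curves = np.empty((n_curves, n_points))
    curves[0] = rng.standard_normal(n_points)
    for t in range(1, n_curves):
        curves[t] = kernel @ curves[t - 1] + rng.normal(0.0, noise_sd, n_points)
    return curves

def mape(actual, forecast):
    """Mean absolute percentage error, the accuracy metric used in the study."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

data = simulate_far1()
print(data.shape)                      # (100, 50)
print(mape([100, 200], [110, 180]))    # approx. 10.0
```

In practice the generated curves would be fed to an LSTM (e.g., via Keras) and the MAPE of its forecasts compared across noise settings, as the study does with ANOVA.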

Author Biography

Khusnia Nurul Khikmah, Statistics Department, Faculty of Mathematics and Natural Sciences, IPB University, Bogor, Indonesia

Department of Statistics, Graduate Student

References

[1] F. Petropoulos et al., “Forecasting: theory and practice,” Int J Forecast, 2022.

[2] S. Jiao, A. Aue, and H. Ombao, “Functional Time Series Prediction Under Partial Observation of the Future Curve,” J Am Stat Assoc, pp. 1–29, May 2021, doi: 10.1080/01621459.2021.1929248.

[3] I. Shah, H. Bibi, S. Ali, L. Wang, and Z. Yue, “Forecasting one-day-ahead electricity prices for Italian electricity market using parametric and nonparametric approaches,” IEEE Access, vol. 8, pp. 123104–123113, 2020.

[4] Y. Chen, T. Koch, X. Xu, K. Lim, and N. Zakiyeva, “A review study of functional autoregressive models with application to energy forecasting,” WIREs Computational Statistics, vol. 13, Jul. 2020, doi: 10.1002/wics.1525.

[5] V. K. R. Chimmula and L. Zhang, “Time series forecasting of COVID-19 transmission in Canada using LSTM networks,” Chaos Solitons Fractals, vol. 135, p. 109864, 2020.

[6] H. Chung and K. Shin, “Genetic algorithm-optimized long short-term memory network for stock market prediction,” Sustainability, vol. 10, no. 10, p. 3765, 2018.

[7] S. Bouktif, A. Fiaz, A. Ouni, and M. A. Serhani, “Optimal deep learning LSTM model for electric load forecasting using feature selection and genetic algorithm: Comparison with machine learning approaches,” Energies (Basel), vol. 11, no. 7, p. 1636, 2018.

[8] A. Sagheer and M. Kotb, “Time series forecasting of petroleum production using deep LSTM recurrent networks,” Neurocomputing, vol. 323, pp. 203–213, 2019.

[9] M. Elsaraiti and A. Merabet, “A comparative analysis of the ARIMA and LSTM predictive models and their effectiveness for predicting wind speed,” Energies (Basel), vol. 14, no. 20, p. 6782, 2021.

[10] Z. Liu, X. Hu, L. Xu, W. Wang, and F. M. Ghannouchi, “Low computational complexity digital predistortion based on convolutional neural network for wideband power amplifiers,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 69, no. 3, pp. 1702–1706, 2021.

[11] S. I. Alzahrani, I. A. Aljamaan, and E. A. Al-Fakih, “Forecasting the spread of the COVID-19 pandemic in Saudi Arabia using ARIMA prediction model under current public health interventions,” J Infect Public Health, vol. 13, no. 7, pp. 914–919, 2020, doi: 10.1016/j.jiph.2020.06.001.

[12] W. Żuławiński and A. Wyłomańska, “New estimation method for periodic autoregressive time series of order 1 with additive noise,” Int J Adv Eng Sci Appl Math, vol. 13, no. 2, pp. 163–176, 2021, doi: 10.1007/s12572-021-00302-z.

[13] F. Ding, Y. Wang, J. Dai, Q. Li, and Q. Chen, “A recursive least squares parameter estimation algorithm for output nonlinear autoregressive systems using the input–output data filtering,” J Franklin Inst, vol. 354, no. 15, pp. 6938–6955, 2017.

[14] F. Ding, D. Meng, J. Dai, Q. Li, A. Alsaedi, and T. Hayat, “Least Squares based Iterative Parameter Estimation Algorithm for Stochastic Dynamical Systems with ARMA Noise Using the Model Equivalence,” Int J Control Autom Syst, vol. 16, no. 2, pp. 630–639, 2018, doi: 10.1007/s12555-017-0001-x.

[15] J. Li and J. Zhang, “Maximum likelihood identification of dual-rate Hammerstein output-error moving average system,” IET Control Theory & Applications, vol. 14, no. 8, pp. 1089–1101, 2020.

[16] J. E. Cavanaugh and A. A. Neath, “The Akaike information criterion: Background, derivation, properties, application, interpretation, and refinements,” Wiley Interdiscip Rev Comput Stat, vol. 11, no. 3, p. e1460, 2019.

[17] H. de-G. Acquah, “Comparison of Akaike information criterion (AIC) and Bayesian information criterion (BIC) in selection of an asymmetric price relationship,” J Dev Agric Econ, vol. 2, no. 1, pp. 1–6, 2010.

[18] D. R. Anderson, K. P. Burnham, and G. C. White, “Comparison of Akaike information criterion and consistent Akaike information criterion for model selection and statistical inference from capture-recapture studies,” J Appl Stat, vol. 25, no. 2, pp. 263–282, 1998.

[19] A. G. Salman and B. Kanigoro, “Visibility forecasting using autoregressive integrated moving average (ARIMA) models,” Procedia Comput Sci, vol. 179, pp. 252–259, 2021.

[20] Z. Zhao, Y. Liu, and C. Peng, “Variable selection in generalized random coefficient autoregressive models,” J Inequal Appl, vol. 2018, no. 1, pp. 1–14, 2018.

[21] A. R. Ghumman, A.-R. Ateeq-ur-Rauf, H. Haider, and M. Shafiquzamman, “Functional data analysis of models for predicting temperature and precipitation under climate change scenarios,” Journal of Water and Climate Change, vol. 11, no. 4, pp. 1748–1765, 2020.

[22] M. Dass and C. Shropshire, “Introducing Functional Data Analysis to Managerial Science,” Organ Res Methods, vol. 15, pp. 693–721, Oct. 2012, doi: 10.1177/1094428112457830.

[23] A. Leroy, A. Marc, O. Dupas, J. L. Rey, and S. Gey, “Functional data analysis in sport science: Example of swimmers’ progression curves clustering,” Applied Sciences, vol. 8, no. 10, p. 1766, 2018.

[24] J.-L. Wang, J.-M. Chiou, and H.-G. Müller, “Functional data analysis,” Annu Rev Stat Appl, vol. 3, pp. 257–295, 2016.

[25] P. Kokoszka and M. Reimherr, Introduction to functional data analysis. Chapman and Hall/CRC, 2017.

[26] M. Giacofci, S. Lambert‐Lacroix, G. Marot, and F. Picard, “Wavelet‐based clustering for mixed‐effects functional models in high dimension,” Biometrics, vol. 69, no. 1, pp. 31–40, 2013.

[27] H. Sun, “Mercer theorem for RKHS on noncompact sets,” J Complex, vol. 21, no. 3, pp. 337–349, 2005, doi: 10.1016/j.jco.2004.09.002.

[28] C. Olah, “Understanding LSTM Networks.” [Online]. Available: https://colah.github.io/posts/2015-08-Understanding-LSTMs/.

[29] B. Karlik and A. V. Olgac, “Performance analysis of various activation functions in generalized MLP architectures of neural networks,” International Journal of Artificial Intelligence and Expert Systems, vol. 1, no. 4, pp. 111–122, 2011.

[30] S. M. Miran, S. J. Nelson, D. Redd, and Q. Zeng-Treitler, “Using multivariate long short-term memory neural network to detect aberrant signals in health data for quality assurance,” Int J Med Inform, vol. 147, p. 104368, 2021.

[31] X. Xu, X. Rui, Y. Fan, T. Yu, and Y. Ju, “A multivariate long short-term memory neural network for coalbed methane production forecasting,” Symmetry (Basel), vol. 12, no. 12, p. 2045, 2020.

[32] S. Liu, X. Liu, Q. Lyu, and F. Li, “Comprehensive system based on a DNN and LSTM for predicting sinter composition,” Appl Soft Comput, vol. 95, p. 106574, 2020.

[33] A. S. Acharya, A. Prakash, P. Saxena, and A. Nigam, “Sampling: Why and how of it,” Indian Journal of Medical Specialties, vol. 4, no. 2, pp. 330–333, 2013.

[34] J. Ranstam and J. A. Cook, “LASSO regression,” British Journal of Surgery, vol. 105, no. 10, p. 1348, Sep. 2018, doi: 10.1002/bjs.10895.

[35] Z. Zhao, W. Chen, X. Wu, P. C. Y. Chen, and J. Liu, “LSTM network: a deep learning approach for short‐term traffic forecast,” IET Intelligent Transport Systems, vol. 11, no. 2, pp. 68–75, 2017.

Published

2025-05-30

Issue

Section

Article
