MATHEMATICAL FORMULATIONS, OPTIMIZATION, AND STATISTICAL VALIDATION OF HYBRID DEEP LEARNING MODELS FOR SHORT-TERM WIND SPEED FORECASTING: A COMPARATIVE ANALYSIS OF ANN, LSTM, SVM, ARIMA-ANN, CNN-BILSTM, AND CNN-BILSTM-ATTENTION ARCHITECTURES

Er. Rishabh Aryan; Manimozhi I

doi:10.65327/cse.v12i1.2479

Authors

Er. Rishabh Aryan, Indian Institute of Information Technology, Bhagalpur (Bihar)
Manimozhi I, Amet University, Kanathur, Chennai (Tamil Nadu)

DOI:

https://doi.org/10.65327/cse.v12i1.2479

Keywords:

Mathematical Formulation, Optimization, Adam Optimizer, Backpropagation Through Time, Statistical Validation, Residual Diagnostics, CNN–BiLSTM–Attention

Abstract

Hybrid deep learning architectures for short-term wind speed forecasting have proliferated in recent years, yet the mathematical foundations behind their comparative performance are rarely presented in a unified way. This paper gives a rigorous mathematical formulation of six representative architectures (ANN, LSTM, SVM, hybrid ARIMA–ANN, CNN–BiLSTM, and CNN–BiLSTM–Attention) and derives their loss functions, optimization dynamics, and statistical-validation criteria within a single framework. It details the matrix-form operations of each layer, the gating equations of LSTM and BiLSTM, the soft-attention context-vector formulation, the gradient flow under backpropagation through time, and the Adam optimizer update equations. It also defines the RMSE, MAE, MAPE, and R² metrics, derives their statistical expectations under Gaussian residuals, and develops the residual-diagnostic apparatus: the Ljung–Box autocorrelation test, the Shapiro–Wilk normality test, the Breusch–Pagan heteroskedasticity test, and the Diebold–Mariano comparative-accuracy test. Empirical validation on 8,760 hourly SCADA observations from an Indian onshore wind turbine confirms the mathematical predictions: the attention-augmented hybrid achieves the lowest training and validation loss (MSE = 1.31), the fastest convergence (stable by epoch 55 with Adam at a learning rate of 0.001), and residuals that pass all four statistical tests. The paper thus positions the CNN–BiLSTM–Attention architecture not merely as an empirically superior model but as a mathematically principled and statistically rigorous choice for wind-speed forecasting in India.
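
As an illustration of two ingredients named in the abstract, the following is a minimal NumPy sketch of the four accuracy metrics (RMSE, MAE, MAPE, R²) and of a single Adam update in its standard textbook form. It is not taken from the paper. The function names `forecast_metrics` and `adam_step`, the toy wind-speed values, and the hyperparameters `beta1`, `beta2`, and `eps` are assumptions chosen for illustration; only the learning rate 0.001 comes from the abstract.

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent), and R^2 for two equal-length series."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    # MAPE is undefined when an observation is zero (calm wind);
    # this sketch assumes strictly nonzero observations.
    mape = 100.0 * np.mean(np.abs(err / y_true))
    ss_res = np.sum(err ** 2)                       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count.

    beta1, beta2, and eps are commonly used defaults, assumed here.
    """
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)              # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Hypothetical wind speeds in m/s, for illustration only.
y_true = np.array([4.0, 5.0, 6.0, 5.0])
y_pred = np.array([4.5, 5.0, 5.5, 5.0])
metrics = forecast_metrics(y_true, y_pred)

# One Adam step on the toy loss L(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([1.0])
theta_new, m1, v1 = adam_step(theta, 2.0 * theta, np.zeros(1), np.zeros(1), t=1)
```

With bias correction, the first Adam step moves each parameter by roughly the learning rate regardless of the gradient's scale, which is one reason a fixed rate such as 0.001 tends to behave predictably early in training.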


