MATHEMATICAL FORMULATIONS, OPTIMIZATION, AND STATISTICAL VALIDATION OF HYBRID DEEP LEARNING MODELS FOR SHORT-TERM WIND SPEED FORECASTING: A COMPARATIVE ANALYSIS OF ANN, LSTM, SVM, ARIMA-ANN, CNN-BILSTM, AND CNN-BILSTM-ATTENTION ARCHITECTURES
DOI:
https://doi.org/10.65327/cse.v12i1.2479
Keywords:
Mathematical Formulation, Optimization, Adam Optimizer, Backpropagation Through Time, Statistical Validation, Residual Diagnostics, CNN–BiLSTM–Attention
Abstract
Hybrid deep learning architectures for short-term wind speed forecasting have proliferated in recent years, yet the mathematical foundations underpinning their comparative performance are rarely presented in a unified way. This paper provides a rigorous mathematical formulation of six representative architectures, ANN, LSTM, SVM, hybrid ARIMA–ANN, CNN–BiLSTM, and CNN–BiLSTM–Attention, and derives their associated loss functions, optimization dynamics, and statistical-validation criteria within a single coherent framework. The paper details the matrix-form operations of each layer, the gating equations of LSTM/BiLSTM, the soft-attention context-vector formulation, the backpropagation-through-time gradient flow, and the Adam optimizer update equations. It also presents the RMSE, MAE, MAPE, and R² metrics; derives their statistical expectations under Gaussian residuals; and develops the full residual-diagnostic apparatus including Ljung–Box autocorrelation, Shapiro–Wilk normality, Breusch–Pagan heteroskedasticity, and Diebold–Mariano comparative accuracy tests. Empirical validation on 8,760 hourly SCADA observations from an Indian onshore wind turbine confirms the mathematical predictions: the attention-augmented hybrid achieves the lowest training and validation loss (MSE = 1.31), the fastest convergence (stable by epoch 55 with Adam at learning rate 0.001), and residual distributions satisfying all four statistical tests. The paper thus positions the CNN–BiLSTM–Attention architecture not merely as an empirically superior model but as a mathematically principled and statistically rigorous choice for Indian wind-speed forecasting.
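As an illustration of the four accuracy metrics named in the abstract, the following is a minimal sketch using their standard textbook definitions. It is not taken from the paper: the function name `forecast_metrics`, the sample wind speeds, and the choice to skip zero-valued observations when computing MAPE are all assumptions made for this example.

```python
import math


def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE (in percent), and R^2 for paired observations.

    Hypothetical helper for illustration; standard definitions are assumed.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Sum of squared residuals, shared by RMSE and R^2.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined where the observed value is zero (e.g. calm hours),
    # so those points are skipped here; this handling is an assumption.
    nonzero = [(a, e) for a, e in zip(actual, errors) if a != 0]
    if nonzero:
        mape = 100.0 * sum(abs(e / a) for a, e in nonzero) / len(nonzero)
    else:
        mape = float("nan")

    # R^2 compares the residual sum of squares with the variation
    # of the observations around their own mean.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Made-up hourly wind speeds in m/s, for demonstration only.
observed = [4.0, 5.0, 6.0, 8.0]
forecast = [4.5, 5.0, 5.0, 8.5]
metrics = forecast_metrics(observed, forecast)
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, while MAPE becomes unstable when observed speeds approach zero, which is one reason the four metrics are usually reported together rather than singly.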
Copyright (c) 2026 International Journal For Research In Advanced Computer Science And Engineering

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
