UI Postgraduate College

OVER-FITTING AMELIORATION IN BAYESIAN NEURAL NETWORK MODEL ESTIMATION USING HETEROGENEOUS ACTIVATION FUNCTIONS


dc.contributor.author OGUNDUNMADE, Tayo Peter
dc.date.accessioned 2024-04-26T12:43:52Z
dc.date.available 2024-04-26T12:43:52Z
dc.date.issued 2023-08-16
dc.identifier.uri http://hdl.handle.net/123456789/2151
dc.description.abstract A Neural Network (NN) allows complex nonlinear relationships between the response variable and its predictors. Deep NNs have made notable contributions across computer vision, reinforcement learning, speech recognition and natural language processing. Previous studies have obtained the parameters of NNs through the classical approach using Homogeneous Activation Functions (HOMAFs). However, a major setback of NNs estimated by the classical approach is their tendency to over-fit. Therefore, this study was aimed at developing a Bayesian NN (BNN) model to ameliorate over-fitting using Heterogeneous Activation Functions (HETAFs). A BNN model was developed with a Gaussian error distribution for the likelihood function, and inverse gamma and inverse Wishart priors for the parameters, to obtain the BNN estimators. The HOMAFs (Rectified Linear Unit (ReLU), Sigmoid and Hyperbolic Tangent Sigmoid (TANSIG)) and HETAFs (Symmetric Saturated Linear Hyperbolic Tangent (SSLHT) and Symmetric Saturated Linear Hyperbolic Tangent Sigmoid (SSLHTS)) were used to activate the model parameters. The Bayesian approach was used to ameliorate the problem of over-fitting, while the Posterior Mean (PM), Posterior Standard Deviation (PSD) and Numerical Standard Error (NSE) were used to determine the estimators’ sensitivity. The performance of the Bayesian estimators from each of the activation functions was evaluated in a Monte Carlo experiment using the Mean Square Error (MSE), Mean Absolute Error (MAE) and training error as metrics. The proximity of the MSE and training error values was used to generalise on the problem of over-fitting. The derived Bayesian estimators were β ∼ N(Kβ, Hβ) and γ ∼ exp(−½{Fγ + Mγ}), where Kβ is the derived mean of β, Hβ is the derived standard deviation of β, and Fγ and Mγ are the derived posteriors of γ. For ReLU, the PM, PSD and NSE values for β and γ were 0.4755, 0.0646, 0.0020; and 0.2370, 0.0642, 0.0020, respectively; for Sigmoid: 0.4476, 0.2734, 0.0087; and 1.0269, 0.2732, 0.0086, respectively; for TANSIG: 0.4718, 0.0826, 0.0026; and 1.0239, 0.0822, 0.0026, respectively. For SSLHT, the PM, PSD and NSE values for β and γ were 0.8344, 0.0567, 0.0018; and 1.0242, 0.0566, 0.0016, respectively; and for SSLHTS: 0.89825, 0.01278, 0.0004; and 1.0236, 0.0127, 0.0003, respectively. The MSE, MAE and training error values for the activation functions were ReLU: 0.1631, 0.2465, 0.1522; Sigmoid: 0.1834, 0.2074, 0.1862; TANSIG: 0.1943, 0.269, 0.1813; SSLHT: 0.0714, 0.0131, 0.0667; and SSLHTS: 0.0322, 0.0339, 0.0328, respectively. The HETAFs showed closer proximity between the MSE and training error, implying amelioration of over-fitting, and smaller error values than the HOMAFs. The derived BNN estimators ameliorated the problem of over-fitting, with close values of MSE and training error, making them more appropriate for handling NN models; they could be used in solving problems in machine learning. en_US
dc.language.iso en en_US
dc.subject Bayesian estimator, Estimators’ performance evaluation, Bayesian neural network, Monte Carlo experiment, Homogeneous activation function en_US
dc.title OVER-FITTING AMELIORATION IN BAYESIAN NEURAL NETWORK MODEL ESTIMATION USING HETEROGENEOUS ACTIVATION FUNCTIONS en_US
dc.type Thesis en_US
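
The abstract names the heterogeneous activation functions (SSLHT, SSLHTS) but does not state their functional forms. A minimal NumPy sketch of the general idea, assuming "heterogeneous" means mixing different activation functions across the layers of one network; the layer sizes, weight initialisation and layer-to-activation assignments below are illustrative assumptions, not the thesis's specification:

import numpy as np

def satlins(x):
    # Symmetric saturated linear: identity on [-1, 1], clipped outside.
    return np.clip(x, -1.0, 1.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, activations):
    # One forward pass, applying a (possibly different) activation per layer.
    h = x
    for (W, b), act in zip(weights, activations):
        h = act(h @ W + b)
    return h

rng = np.random.default_rng(0)

# Toy two-hidden-layer network: 3 inputs -> 5 -> 5 -> 1 output.
sizes = [(3, 5), (5, 5), (5, 1)]
weights = [(rng.normal(size=s), rng.normal(size=s[1])) for s in sizes]
x = rng.normal(size=(4, 3))  # four example inputs

# Homogeneous: the same activation (here tanh, i.e. TANSIG) in every layer.
y_hom = forward(x, weights, [np.tanh, np.tanh, np.tanh])

# Heterogeneous (assumed reading of the acronyms): satlins then tanh
# for SSLHT, with an additional sigmoid output layer for SSLHTS.
y_sslht = forward(x, weights, [satlins, np.tanh, lambda z: z])
y_sslhts = forward(x, weights, [satlins, np.tanh, sigmoid])

print("homogeneous:", y_hom.ravel().round(3))
print("SSLHT-style:", y_sslht.ravel().round(3))
print("SSLHTS-style:", y_sslhts.ravel().round(3))

Under this reading, the homogeneous baseline applies one function everywhere, while the heterogeneous variants combine the symmetric saturated linear, hyperbolic tangent and sigmoid functions named in the SSLHT and SSLHTS acronyms.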

