dc.description.abstract |
A Neural Network (NN) allows complex nonlinear relationships between the response
variable and its predictors. Deep NNs have made notable contributions across
computer vision, reinforcement learning, speech recognition and natural language
processing. Previous studies have obtained the parameters of NNs through the classical approach using Homogeneous Activation Functions (HOMAFs). However, a
major setback of NNs under the classical approach is their tendency to over-fit. This study was therefore aimed at developing a Bayesian NN (BNN) model that ameliorates
over-fitting using Heterogeneous Activation Functions (HETAFs).
A BNN model was developed with Gaussian error distribution for the likelihood
function; inverse gamma and inverse Wishart priors for the parameters, to obtain
the BNN estimators. The HOMAFs (Rectified Linear Unit (ReLU), Sigmoid and
Hyperbolic Tangent Sigmoid (TANSIG)) and HETAFs (Symmetric Saturated Linear Hyperbolic Tangent (SSLHT) and Symmetric Saturated Linear Hyperbolic Tangent Sigmoid (SSLHTS)) were used to activate the model parameters. The Bayesian
approach was used to ameliorate the problem of over-fitting, while the Posterior
Mean (PM), Posterior Standard Deviation (PSD) and Numerical Standard Error
(NSE) were used to assess the estimators’ sensitivity. The performance of the
Bayesian estimators under each of the activation functions was evaluated in a
Monte Carlo experiment using the Mean Square Error (MSE), Mean Absolute Error (MAE) and training error as metrics. The proximity of the MSE and training error
values was used to judge the extent of over-fitting.
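A minimal Python sketch of the quantities named above follows. The three HOMAFs (relu, sigmoid, tansig) and the two metrics (mse, mae) follow their standard definitions; the Gaussian error likelihood is a generic i.i.d. form assumed from the abstract. The exact forms of the HETAFs (SSLHT and SSLHTS) are defined in the thesis and are not reproduced here; all names in the sketch are illustrative.

    import numpy as np

    # Homogeneous activation functions (HOMAFs) named in the abstract.
    def relu(x):
        return np.maximum(0.0, x)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def tansig(x):
        # Hyperbolic tangent sigmoid (MATLAB-style tansig);
        # algebraically equivalent to tanh(x).
        return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

    # Gaussian error likelihood for the BNN (generic i.i.d. form).
    def gaussian_loglik(y, mu, sigma2):
        n = y.size
        return (-0.5 * n * np.log(2.0 * np.pi * sigma2)
                - 0.5 * np.sum((y - mu) ** 2) / sigma2)

    # Metrics used in the Monte Carlo evaluation.
    def mse(y_true, y_pred):
        return np.mean((y_true - y_pred) ** 2)

    def mae(y_true, y_pred):
        return np.mean(np.abs(y_true - y_pred))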
The derived Bayesian estimators were β ∼ N(K_β, H_β) and γ ∼ exp(−(1/2){F_γ + M_γ});
where K_β is the derived mean of β, H_β is the derived standard deviation of β, and F_γ and
M_γ are the derived posteriors of γ. For ReLU, the PM, PSD and NSE values for
β and γ were 0.4755, 0.0646, 0.0020; and 0.2370, 0.0642, 0.0020, respectively; for
Sigmoid: 0.4476, 0.2734, 0.0087; and 1.0269, 0.2732, 0.0086, respectively; for TANSIG: 0.4718, 0.0826, 0.0026; and 1.0239, 0.0822, 0.0026, respectively. For SSLHT,
the PM, PSD and NSE values for β and γ were 0.8344, 0.0567, 0.0018; and 1.0242,
0.0566, 0.0016, respectively; and for SSLHTS: 0.89825, 0.01278, 0.0004; and 1.0236,
0.0127, 0.0003, respectively. The MSE, MAE and training error values for the performance of the activation functions were ReLU: 0.1631, 0.2465, 0.1522; Sigmoid:
0.1834, 0.2074, 0.1862; TANSIG: 0.1943, 0.269, 0.1813; SSLHT: 0.0714, 0.0131,
0.0667; and SSLHTS: 0.0322, 0.0339, 0.0328, respectively. The HETAFs showed
closer proximity between the MSE and training error values, implying amelioration of over-fitting, and smaller error values than the HOMAFs.
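As one way to reproduce summaries of the kind reported above, the sketch below computes PM, PSD and NSE from a vector of posterior draws. The NSE estimator shown (PSD divided by the square root of the number of draws) assumes roughly independent draws; this is an assumption, as the abstract does not state the thesis’s exact estimator.

    import numpy as np

    def posterior_summaries(draws):
        # Summarise a 1-D array of posterior draws for one parameter.
        draws = np.asarray(draws, dtype=float)
        pm = draws.mean()                # Posterior Mean (PM)
        psd = draws.std(ddof=1)          # Posterior Standard Deviation (PSD)
        nse = psd / np.sqrt(draws.size)  # Numerical Standard Error (NSE)
        return pm, psd, nse

    # Illustrative use with simulated draws for a parameter beta.
    rng = np.random.default_rng(0)
    beta_draws = rng.normal(loc=0.5, scale=0.06, size=1000)
    print(posterior_summaries(beta_draws))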
The derived Bayesian neural network estimators ameliorated the problem of over-fitting, with close values of Mean Square Error and training error, making
them more appropriate for handling Neural Network models. They could be used
in solving problems in machine learning. |
en_US |