To see the other types of publications on this topic, follow the link: ReLU.

Journal articles on the topic 'ReLU'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic 'ReLU.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Korch, Shaleen B., Heidi Contreras, and Josephine E. Clark-Curtiss. "Three Mycobacterium tuberculosis Rel Toxin-Antitoxin Modules Inhibit Mycobacterial Growth and Are Expressed in Infected Human Macrophages." Journal of Bacteriology 191, no. 5 (2008): 1618–30. http://dx.doi.org/10.1128/jb.01318-08.

Full text
Abstract:
ABSTRACT Mycobacterium tuberculosis protein pairs Rv1246c-Rv1247c, Rv2865-Rv2866, and Rv3357-Rv3358, here named RelBE, RelFG, and RelJK, respectively, were identified based on homology to the Escherichia coli RelBE toxin:antitoxin (TA) module. In this study, we have characterized each Rel protein pair and have established that they are functional TA modules. Overexpression of individual M. tuberculosis rel toxin genes relE, relG, and relK induced growth arrest in Mycobacterium smegmatis; a phenotype that was completely reversible by expression of their cognate antitoxin genes, relB, relF, and relJ, respectively. We also provide evidence that RelB and RelE interact directly, both in vitro and in vivo. Analysis of the genetic organization and regulation established that relBE, relFG, and relJK form bicistronic operons that are cotranscribed and autoregulated, in a manner unlike typical TA modules. RelB and RelF act as transcriptional activators, inducing expression of their respective promoters. However, RelBE, RelFG, and RelJK (together) repress expression to basal levels of activity, while RelJ represses promoter activity altogether. Finally, we have determined that all six rel genes are expressed in broth-grown M. tuberculosis, whereas relE, relF, and relK are expressed during infection of human macrophages. This is the first demonstration of M. tuberculosis expressing TA modules in broth culture and during infection of human macrophages.
APA, Harvard, Vancouver, ISO, and other styles
2

Trudel, Eric. "Saussure relu." Semiotica 2017, no. 217 (2017): 263–69. http://dx.doi.org/10.1515/sem-2016-0059.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Ma, Zhongkui, Jiaying Li, and Guangdong Bai. "ReLU Hull Approximation." Proceedings of the ACM on Programming Languages 8, POPL (2024): 2260–87. http://dx.doi.org/10.1145/3632917.

Full text
Abstract:
Convex hulls are commonly used to tackle the non-linearity of activation functions in the verification of neural networks. Computing the exact convex hull is a costly task though. In this work, we propose a fast and precise approach to over-approximating the convex hull of the ReLU function (referred to as the ReLU hull), one of the most used activation functions. Our key insight is to formulate a convex polytope that "wraps" the ReLU hull, by reusing the linear pieces of the ReLU function as the lower faces and constructing upper faces that are adjacent to the lower faces. The upper faces can be efficiently constructed based on the edges and vertices of the lower faces, given that an n-dimensional (or simply nd hereafter) hyperplane can be determined by an (n−1)d hyperplane and a point outside of it. We implement our approach as WraLU, and evaluate its performance in terms of precision, efficiency, constraint complexity, and scalability. WraLU outperforms existing advanced methods by generating fewer constraints to achieve tighter approximation in less time. It exhibits versatility by effectively addressing arbitrary input polytopes and higher-dimensional cases, which are beyond the capabilities of existing methods. We integrate WraLU into PRIMA, a state-of-the-art neural network verifier, and apply it to verify large-scale ReLU-based neural networks. Our experimental results demonstrate that WraLU achieves a high efficiency without compromising precision. It reduces the number of constraints that need to be solved by the linear programming solver by up to half, while delivering comparable or even superior results compared to the state-of-the-art verifiers.
APA, Harvard, Vancouver, ISO, and other styles
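The entry above describes a multi-neuron over-approximation of the ReLU hull. As a point of reference only (this is not the WraLU construction), the sketch below checks the classic single-neuron "triangle" relaxation that such methods refine, assuming pre-activation bounds l < 0 < u; the bound values are arbitrary.

```python
# Classic single-neuron "triangle" relaxation of ReLU for bounds l < 0 < u:
#   y >= 0,   y >= x,   y <= u * (x - l) / (u - l)
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def triangle_relaxation_holds(x, l, u):
    """Check that (x, ReLU(x)) lies inside the triangle relaxation for bounds [l, u]."""
    y = relu(x)
    lower1 = y >= 0.0
    lower2 = y >= x
    upper = y <= u * (x - l) / (u - l) + 1e-9  # small tolerance for float error
    return lower1 & lower2 & upper

l, u = -2.0, 3.0
xs = np.linspace(l, u, 101)
print(bool(np.all(triangle_relaxation_holds(xs, l, u))))  # True: the relaxation contains the ReLU graph
```
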
4

Liang, XingLong, and Jun Xu. "Biased ReLU neural networks." Neurocomputing 423 (January 2021): 71–79. http://dx.doi.org/10.1016/j.neucom.2020.09.050.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Sajadi-Rosen, N. "Michon lu et relu." French Studies 66, no. 3 (2012): 427–28. http://dx.doi.org/10.1093/fs/kns132.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Bai, Yuhan. "RELU-Function and Derived Function Review." SHS Web of Conferences 144 (2022): 02006. http://dx.doi.org/10.1051/shsconf/202214402006.

Full text
Abstract:
The activation function plays an important role in training and improving the performance of deep neural networks (DNNs). The rectified linear unit (ReLU) function provides the necessary non-linear properties in a DNN. However, few papers sort out and compare the various ReLU activation functions. Most papers focus on the efficiency and accuracy of the particular activation function used by a model, but pay little attention to the nature of, and differences between, these activation functions. Therefore, this paper organizes the ReLU function and its derived functions, and compares the accuracy of the different ReLU functions (and their derived variants) on the MNIST data set. From the experimental point of view, the ReLU function performs the best, while the SELU and ELU functions perform poorly.
APA, Harvard, Vancouver, ISO, and other styles
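For readers comparing the functions surveyed in the review above, here is a minimal NumPy sketch of ReLU and three derived functions; the ELU/SELU forms and the SELU constants are the commonly published ones, and the test values are arbitrary.

```python
# Reference definitions of ReLU and some derived activation functions.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for name, f in [("relu", relu), ("leaky_relu", leaky_relu), ("elu", elu), ("selu", selu)]:
    print(name, np.round(f(x), 4))
```
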
7

Layton, Oliver W., Siyuan Peng, and Scott T. Steinmetz. "ReLU, Sparseness, and the Encoding of Optic Flow in Neural Networks." Sensors 24, no. 23 (2024): 7453. http://dx.doi.org/10.3390/s24237453.

Full text
Abstract:
Accurate self-motion estimation is critical for various navigational tasks in mobile robotics. Optic flow provides a means to estimate self-motion using a camera sensor and is particularly valuable in GPS- and radio-denied environments. The present study investigates the influence of different activation functions—ReLU, leaky ReLU, GELU, and Mish—on the accuracy, robustness, and encoding properties of convolutional neural networks (CNNs) and multi-layer perceptrons (MLPs) trained to estimate self-motion from optic flow. Our results demonstrate that networks with ReLU and leaky ReLU activation functions not only achieved superior accuracy in self-motion estimation from novel optic flow patterns but also exhibited greater robustness under challenging conditions. The advantages offered by ReLU and leaky ReLU may stem from their ability to induce sparser representations than GELU and Mish do. Our work characterizes the encoding of optic flow in neural networks and highlights how the sparseness induced by ReLU may enhance robust and accurate self-motion estimation from optic flow.
APA, Harvard, Vancouver, ISO, and other styles
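The sparseness argument in the entry above can be illustrated with a toy measurement; the sketch below is an assumption-level illustration (random Gaussian pre-activations, a tanh-approximate GELU, and an arbitrary near-zero cutoff), not the paper's experimental protocol.

```python
# Crude sparseness proxy: fraction of activations with magnitude below a cutoff.
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)  # toy pre-activations

def relu(x):
    return np.maximum(x, 0.0)

def leaky_relu(x, a=0.01):
    return np.where(x > 0, x, a * x)

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def mish(x):
    return x * np.tanh(np.log1p(np.exp(x)))  # x * tanh(softplus(x))

for name, f in [("ReLU", relu), ("LeakyReLU", leaky_relu), ("GELU", gelu), ("Mish", mish)]:
    a = f(z)
    near_zero = np.mean(np.abs(a) < 0.05)  # ReLU zeroes half the inputs; GELU/Mish rarely do
    print(f"{name:10s} fraction of near-zero activations: {near_zero:.3f}")
```
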
8

Hanin, Boris. "Universal Function Approximation by Deep Neural Nets with Bounded Width and ReLU Activations." Mathematics 7, no. 10 (2019): 992. http://dx.doi.org/10.3390/math7100992.

Full text
Abstract:
This article concerns the expressive power of depth in neural nets with ReLU activations and a bounded width. We are particularly interested in the following questions: What is the minimal width $w_{\min}(d)$ so that ReLU nets of width $w_{\min}(d)$ (and arbitrary depth) can approximate any continuous function on the unit cube $[0,1]^d$ arbitrarily well? For ReLU nets near this minimal width, what can one say about the depth necessary to approximate a given function? We obtain an essentially complete answer to these questions for convex functions. Our approach is based on the observation that, due to the convexity of the ReLU activation, ReLU nets are particularly well suited to represent convex functions. In particular, we prove that ReLU nets with width $d+1$ can approximate any continuous convex function of $d$ variables arbitrarily well. These results then give quantitative depth estimates for the rate of approximation of any continuous scalar function on the $d$-dimensional cube $[0,1]^d$ by ReLU nets with width $d+3$.
APA, Harvard, Vancouver, ISO, and other styles
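The convexity observation in the abstract above can be made concrete in one dimension: a convex function is the maximum of its tangent lines, and each pairwise maximum costs exactly one ReLU via max(a, b) = a + ReLU(b − a). The sketch below is a hand-built illustration with f(x) = x², not the paper's construction.

```python
# Approximate a convex function by the max of tangent lines, built from ReLUs.
import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

def max_via_relu(a, b):
    return a + relu(b - a)  # exact identity: max(a, b) = a + ReLU(b - a)

# Tangent lines of f(x) = x^2 at anchor points c: y = 2c*x - c^2
anchors = np.linspace(-1.0, 1.0, 5)

def approx(x):
    out = 2 * anchors[0] * x - anchors[0] ** 2
    for c in anchors[1:]:
        out = max_via_relu(out, 2 * c * x - c ** 2)  # nested maxes = a deeper ReLU net
    return out

xs = np.linspace(-1.0, 1.0, 201)
err = np.max(np.abs(xs ** 2 - approx(xs)))
print(f"max error with 5 tangent lines: {err:.4f}")  # shrinks as more anchors are added
```

The nesting of pairwise maxima is what trades width for depth: each additional tangent line adds one ReLU in sequence rather than widening the layer.
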
9

Naufal, Budiman, Adi Kusworo, and Wibowo Adi. "Impact of Activation Function on the Performance of Convolutional Neural Network in Identifying Oil Palm Fruit Ripeness." International Journal of Mathematics and Computer Research 13, no. 04 (2025): 5107–13. https://doi.org/10.5281/zenodo.15261476.

Full text
Abstract:
Activation functions play a crucial role in Convolutional Neural Networks (CNN), particularly in enabling the model to recognize and represent complex patterns in digital images. In image classification tasks, the choice of activation function can significantly impact the accuracy and overall performance of the model. The Rectified Linear Unit (ReLU) is the most commonly used activation function due to its simplicity; however, it has a limitation in discarding information from negative input values. To address this issue, alternative functions such as Leaky ReLU and Gaussian Error Linear Unit (GELU) have been developed, designed to retain information from negative inputs. This study presents a comparative analysis of three activation functions (ReLU, Leaky ReLU, and GELU) on a CNN model for classifying oil palm fruit ripeness levels. The results show that although all three activation functions achieved high training accuracy (ReLU at 100%, Leaky ReLU at 99.93%, and GELU at 99.49%), the performance on testing data varied significantly. Leaky ReLU outperformed the others, achieving the highest test accuracy of 95.35%, an F1-score of 95.39%, and a Matthews Correlation Coefficient (MCC) of 93.28%. It also exhibited the smallest gap between training and testing accuracy (4.58%), indicating better generalization capability and a lower risk of overfitting compared to ReLU and GELU. Moreover, the model using Leaky ReLU was able to classify all three classes more evenly, particularly excelling in identifying the 'ripe' class, which tends to be more challenging. These findings highlight that Leaky ReLU is a more optimal activation function for oil palm fruit image classification, as it maintains high accuracy while effectively reducing overfitting. This study contributes to the selection of appropriate activation functions for CNN-based classification systems and opens opportunities for exploring more adaptive activation functions in future research.
APA, Harvard, Vancouver, ISO, and other styles
10

Harvey, David R. "RELU Special Issue: Editorial Reflections." Journal of Agricultural Economics 57, no. 2 (2006): 329–36. http://dx.doi.org/10.1111/j.1477-9552.2006.00055.x.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Dittmer, Soren, Emily J. King, and Peter Maass. "Singular Values for ReLU Layers." IEEE Transactions on Neural Networks and Learning Systems 31, no. 9 (2020): 3594–605. http://dx.doi.org/10.1109/tnnls.2019.2945113.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Kulathunga, Nalinda, Nishath Rajiv Ranasinghe, Daniel Vrinceanu, Zackary Kinsman, Lei Huang, and Yunjiao Wang. "Effects of Nonlinearity and Network Architecture on the Performance of Supervised Neural Networks." Algorithms 14, no. 2 (2021): 51. http://dx.doi.org/10.3390/a14020051.

Full text
Abstract:
The nonlinearity of activation functions used in deep learning models is crucial for the success of predictive models. Several simple nonlinear functions, including Rectified Linear Unit (ReLU) and Leaky-ReLU (L-ReLU), are commonly used in neural networks to impose the nonlinearity. In practice, these functions remarkably enhance the model accuracy. However, there is limited insight into the effects of nonlinearity in neural networks on their performance. Here, we investigate the performance of neural network models as a function of nonlinearity using ReLU and L-ReLU activation functions in the context of different model architectures and data domains. We use entropy as a measurement of randomness to quantify the effects of nonlinearity in different architecture shapes on the performance of neural networks. We show that the ReLU nonlinearity is a better choice of activation function mostly when the network has a sufficient number of parameters. However, we found that the image classification models with transfer learning seem to perform well with L-ReLU in fully connected layers. We show that the entropy of hidden layer outputs in neural networks can fairly represent the fluctuations in information loss as a function of nonlinearity. Furthermore, we investigate the entropy profile of shallow neural networks as a way of representing their hidden layer dynamics.
APA, Harvard, Vancouver, ISO, and other styles
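The entropy measurement idea in the entry above can be sketched on a toy network; the architecture, data, and histogram estimator below are placeholder choices, not the paper's setup.

```python
# Histogram entropy of hidden-layer outputs under ReLU vs. Leaky ReLU.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 20))            # toy inputs
W = rng.standard_normal((20, 64)) / np.sqrt(20)  # random single hidden layer

def relu(z):
    return np.maximum(z, 0.0)

def leaky_relu(z, a=0.1):
    return np.where(z > 0, z, a * z)

def histogram_entropy(values, bins=50):
    counts, _ = np.histogram(values, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))             # entropy in bits

for name, act in [("ReLU", relu), ("LeakyReLU", leaky_relu)]:
    h = act(X @ W).ravel()
    print(f"{name:10s} hidden-output entropy ≈ {histogram_entropy(h):.2f} bits")
```
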
13

Huang, Changcun. "ReLU Networks Are Universal Approximators via Piecewise Linear or Constant Functions." Neural Computation 32, no. 11 (2020): 2249–78. http://dx.doi.org/10.1162/neco_a_01316.

Full text
Abstract:
This letter proves that a ReLU network can approximate any continuous function with arbitrary precision by means of piecewise linear or constant approximations. For a univariate function $f$, we use the composite of ReLUs to produce a line segment; all of the subnetworks of line segments comprise a ReLU network, which is a piecewise linear approximation to $f$. For a multivariate function $f$, ReLU networks are constructed to approximate a piecewise linear function derived from triangulation methods approximating $f$. A neural unit called TRLU is designed by a ReLU network; the piecewise constant approximation, such as Haar wavelets, is implemented by rectifying the linear output of a ReLU network via TRLUs. New interpretations of deep layers, as well as some other results, are also presented.
APA, Harvard, Vancouver, ISO, and other styles
14

Dung, D., V. K. Nguyen, and M. X. Thao. "ON COMPUTATION COMPLEXITY OF HIGH-DIMENSIONAL APPROXIMATION BY DEEP ReLU NEURAL NETWORKS." BULLETIN of L.N. Gumilyov Eurasian National University. MATHEMATICS. COMPUTER SCIENCE. MECHANICS Series 133, no. 4 (2020): 8–18. http://dx.doi.org/10.32523/2616-7182/2020-133-4-8-18.

Full text
Abstract:
We investigate computation complexity of deep ReLU neural networks for approximating functions in Hölder-Nikol'skii spaces of mixed smoothness $H_\infty^\alpha(\mathbb{I}^d)$ on the unit cube $\mathbb{I}^d:=[0,1]^d$. For any function $f\in H_\infty^\alpha(\mathbb{I}^d)$, we explicitly construct nonadaptive and adaptive deep ReLU neural networks having an output that approximates $f$ with a prescribed accuracy $\varepsilon$, and prove dimension-dependent bounds for the computation complexity of this approximation, characterized by the size and depth of this deep ReLU neural network, explicitly in $d$ and $\varepsilon$. Our results show the advantage of the adaptive method of approximation by deep ReLU neural networks over nonadaptive one.
APA, Harvard, Vancouver, ISO, and other styles
15

Katende, Ronald, Henry Kasumba, Godwin Kakuba, and John M. Mango. "A proof of convergence and equivalence for 1D finite element methods and ReLU neural networks." Annals of Mathematics and Computer Science 25 (November 16, 2024): 97–111. http://dx.doi.org/10.56947/amcs.v25.392.

Full text
Abstract:
This paper investigates the convergence and equivalence properties of the Finite Element Method (FEM) and Rectified Linear Unit Neural Networks (ReLU NNs) in solving differential equations. We provide a detailed comparison of the two approaches, highlighting their mutual capabilities in function space approximation. Our analysis proves the subset and superset inclusions that establish the equivalence between FEM and ReLU NNs for approximating piecewise linear functions. Furthermore, a comprehensive numerical evaluation is presented, demonstrating the error convergence behavior of ReLU NNs as the number of neurons per basis function varies. Our results show that while increasing the number of neurons improves approximation accuracy, this benefit diminishes beyond a certain threshold. The maximum observed error between FEM and ReLU NNs is $10^{-4}$, reflecting excellent accuracy in solving partial differential equations (PDEs). These findings lay the groundwork for integrating FEM and ReLU NNs, with important implications for computational mathematics and engineering applications.
APA, Harvard, Vancouver, ISO, and other styles
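The FEM/ReLU equivalence discussed above rests on the fact that a 1D piecewise-linear ("hat") basis function on a uniform mesh is exactly a three-term ReLU combination; the sketch below verifies that identity numerically (the mesh values are arbitrary).

```python
# hat_i(x) = ( ReLU(x - x_{i-1}) - 2*ReLU(x - x_i) + ReLU(x - x_{i+1}) ) / h
import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

def hat_exact(x, xi, h):
    # Standard 1D FEM hat function: peak 1 at xi, support [xi - h, xi + h]
    return np.maximum(0.0, 1.0 - np.abs(x - xi) / h)

def hat_relu(x, xi, h):
    return (relu(x - (xi - h)) - 2.0 * relu(x - xi) + relu(x - (xi + h))) / h

x = np.linspace(-1.0, 2.0, 1001)
xi, h = 0.5, 0.25
print(np.max(np.abs(hat_exact(x, xi, h) - hat_relu(x, xi, h))))  # ~0, up to float error
```
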
16

Chieng, Hock Hung, Noorhaniza Wahid, Ong Pauline, and Sai Raj Kishore Perla. "Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning." International Journal of Advances in Intelligent Informatics 4, no. 2 (2018): 76. http://dx.doi.org/10.26555/ijain.v4i2.249.

Full text
Abstract:
Activation functions are essential for deep learning methods to learn and perform complex tasks such as image classification. The Rectified Linear Unit (ReLU) has been widely used and has become the default activation function across the deep learning community since 2012. Although ReLU has been popular, its hard zero property heavily hinders negative values from propagating through the network. Consequently, deep neural networks have not benefited from negative representations. In this work, an activation function called Flatten-T Swish (FTS) that leverages the benefit of negative values is proposed. To verify its performance, this study evaluates FTS against ReLU and several recent activation functions. Each activation function is trained using the MNIST dataset on five different deep fully connected neural networks (DFNNs) with depth varying from five to eight layers. For a fair evaluation, all DFNNs use the same configuration settings. Based on the experimental results, FTS with a threshold value of T=-0.20 has the best overall performance. Compared with ReLU, FTS (T=-0.20) improves MNIST classification accuracy by 0.13%, 0.70%, 0.67%, 1.07% and 1.15% on the wider 5-layer, slimmer 5-layer, 6-layer, 7-layer and 8-layer DFNNs, respectively. Apart from this, the study also noticed that FTS converges twice as fast as ReLU. Although other existing activation functions are also evaluated, this study elects ReLU as the baseline activation function.
APA, Harvard, Vancouver, ISO, and other styles
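The abstract above does not state the FTS formula; the sketch below assumes the thresholded Swish-like form usually associated with the name, with the reported best threshold T = -0.20, and should be checked against the paper before reuse.

```python
# Assumed form: FTS(x) = x * sigmoid(x) + T for x >= 0, and FTS(x) = T for x < 0.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def flatten_t_swish(x, T=-0.20):
    # The exact definition is an assumption taken from the function's name and the
    # threshold reported in the abstract, not quoted from the paper.
    return np.where(x >= 0, x * sigmoid(x) + T, T)

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(np.round(flatten_t_swish(x), 4))
```
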
17

Butt, F. M., L. Hussain, S. H. M. Jafri, et al. "Optimizing Parameters of Artificial Intelligence Deep Convolutional Neural Networks (CNN) to improve Prediction Performance of Load Forecasting System." IOP Conference Series: Earth and Environmental Science 1026, no. 1 (2022): 012028. http://dx.doi.org/10.1088/1755-1315/1026/1/012028.

Full text
Abstract:
Load forecasting is an approach that is implemented to foresee the future load demand projected on physical parameters such as loading on lines, temperature, losses, pressure, and weather conditions. This study is specifically aimed at optimizing the parameters of deep convolutional neural networks (CNN) to improve short-term load forecasting (STLF) and medium-term load forecasting (MTLF), i.e. one day, one week, one month and three months ahead. The models were tested on a real-world case by conducting detailed experiments to validate their stability and practicality. The performance was measured in terms of squared error, Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Mean Absolute Error (MAE). We optimized the parameters using three different cases. In the first case, we used a single layer with the Rectified Linear Unit (ReLU) activation function. In the second case, we used a double layer with the ReLU–ReLU activation functions. In the third case, we used a double layer with the ReLU–Sigmoid activation functions. The number of neurons in each case was 2, 4, 6, 8, 10 and 12. For one-day-ahead load forecasting, the lowest prediction error was yielded by the double layer with the ReLU–Sigmoid activation functions. For one-week-ahead load forecasting, the lowest error was obtained with the single-layer ReLU activation function. Likewise, the one-month-ahead and three-months-ahead forecasts produced the lowest prediction errors with the double layer using the ReLU–Sigmoid activation functions. The results reveal that optimizing the parameters further improved the ahead-prediction performance. The results also show that predicting the nonstationary and nonlinear dynamics of ahead forecasting requires a more complex activation function and a larger number of neurons. The results can be very useful in real-time implementation of this model to meet load demands and for further planning.
APA, Harvard, Vancouver, ISO, and other styles
18

Purnawansyah, Purnawansyah, Haviluddin Haviluddin, Herdianti Darwis, Huzain Azis, and Yulita Salim. "Backpropagation Neural Network with Combination of Activation Functions for Inbound Traffic Prediction." Knowledge Engineering and Data Science 4, no. 1 (2021): 14. http://dx.doi.org/10.17977/um018v4i12021p14-28.

Full text
Abstract:
Predicting network traffic is crucial for preventing congestion and gaining superior quality of network services. This research aims to use backpropagation to predict the inbound level to understand and determine internet usage. The architecture consists of one input layer, two hidden layers, and one output layer. The study compares three activation functions: sigmoid, rectified linear unit (ReLU), and hyperbolic tangent (tanh). Three learning rates, 0.1, 0.5, and 0.9, represent low, moderate, and high rates, respectively. Based on the results, in terms of a single activation function, although sigmoid provides the lowest RMSE and MSE values, the ReLU function is superior in learning the high-traffic pattern with a learning rate of 0.9. In addition, ReLU is more powerful when used first in a combination. Hence, combining a high learning rate with pure ReLU, ReLU-sigmoid, or ReLU-tanh is more suitable and recommended for predicting upper traffic utilization.
APA, Harvard, Vancouver, ISO, and other styles
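To make the "combination of activation functions" idea above concrete, here is an illustrative PyTorch sketch of a backpropagation network with ReLU in the first hidden layer and sigmoid in the second; the layer sizes, the 0.9 learning-rate case, and the toy data are placeholders rather than the paper's configuration.

```python
# One input layer, two hidden layers with different activations, one output unit.
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(8, 16),   # input layer -> first hidden layer
    nn.ReLU(),          # ReLU used first, as recommended above
    nn.Linear(16, 8),   # second hidden layer
    nn.Sigmoid(),       # sigmoid second (could also be tanh or ReLU again)
    nn.Linear(8, 1),    # output: predicted inbound traffic level
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.9)  # the "high learning rate" case
loss_fn = nn.MSELoss()

x = torch.randn(32, 8)  # toy batch of traffic features
y = torch.rand(32, 1)   # toy normalized inbound utilization
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```
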
19

Razali, Noor Fadzilah, Iza Sazanita Isa, Siti Noraini Sulaiman, Muhammad Khusairi Osman, Noor Khairiah A. Karim, and Dayang Suhaida Awang Damit. "Genetic algorithm-adapted activation function optimization of deep learning framework for breast mass cancer classification in mammogram images." International Journal of Electrical and Computer Engineering (IJECE) 15, no. 3 (2025): 2820. https://doi.org/10.11591/ijece.v15i3.pp2820-2833.

Full text
Abstract:
The convolutional neural network (CNN) has been explored for mammogram cancer classification to aid radiologists. CNNs require multiple convolution and non-linearity repetitions to learn data sparsity, but deeper networks often face the vanishing gradient effect, which hinders effective learning. The rectified linear unit (ReLU) activation function activates neurons only when the output exceeds zero, limiting activation and potentially lowering performance. This study proposes an adaptive ReLU based on a genetic algorithm (GA) to determine the optimal threshold for neuron activation, thus improving the restrictive nature of the original ReLU. We compared performances on the INbreast and IPPT-mammo mammogram datasets using ReLU and leakyReLU activation functions. Results show accuracy improvements from 95.0% to 97.01% for INbreast and 84.9% to 87.4% for IPPT-mammo with ReLU and from 93.03% to 99.0% for INbreast and 84.03% to 91.06% for IPPT-mammo with leakyReLU. Significant accuracy improvements were observed for breast cancer classification in mammograms, demonstrating its potential to aid radiologists with more robust and reliable diagnostic tools.
APA, Harvard, Vancouver, ISO, and other styles
20

Manns, F. "Zacharie 12,10 relu en Jean 19,37." Liber Annuus 56 (January 2006): 301–10. http://dx.doi.org/10.1484/j.la.2.303646.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Dũng, Dinh, Van Kien Nguyen, and Mai Xuan Thao. "COMPUTATION COMPLEXITY OF DEEP RELU NEURAL NETWORKS IN HIGH-DIMENSIONAL APPROXIMATION." Journal of Computer Science and Cybernetics 37, no. 3 (2021): 291–320. http://dx.doi.org/10.15625/1813-9663/37/3/15902.

Full text
Abstract:
The purpose of the present paper is to study the computation complexity of deep ReLU neural networks to approximate functions in H\"older-Nikol'skii spaces of mixed smoothness $H_\infty^\alpha(\mathbb{I}^d)$ on the unit cube $\mathbb{I}^d:=[0,1]^d$. In this context, for any function $f\in H_\infty^\alpha(\mathbb{I}^d)$, we explicitly construct nonadaptive and adaptive deep ReLU neural networks having an output that approximates $f$ with a prescribed accuracy $\varepsilon$, and prove dimension-dependent bounds for the computation complexity of this approximation, characterized by the size and the depth of this deep ReLU neural network, explicitly in $d$ and $\varepsilon$. Our results show the advantage of the adaptive method of approximation by deep ReLU neural networks over nonadaptive one.
APA, Harvard, Vancouver, ISO, and other styles
22

Gühring, Ingo, Gitta Kutyniok, and Philipp Petersen. "Error bounds for approximations with deep ReLU neural networks in $W^{s,p}$ norms." Analysis and Applications 18, no. 05 (2019): 803–59. http://dx.doi.org/10.1142/s0219530519410021.

Full text
Abstract:
We analyze to what extent deep Rectified Linear Unit (ReLU) neural networks can efficiently approximate Sobolev regular functions if the approximation error is measured with respect to weaker Sobolev norms. In this context, we first establish upper approximation bounds by ReLU neural networks for Sobolev regular functions by explicitly constructing the approximate ReLU neural networks. Then, we establish lower approximation bounds for the same type of function classes. A trade-off between the regularity used in the approximation norm and the complexity of the neural network can be observed in upper and lower bounds. Our results extend recent advances in the approximation theory of ReLU networks to the regime that is most relevant for applications in the numerical analysis of partial differential equations.
APA, Harvard, Vancouver, ISO, and other styles
23

Gao, Hongyang, Lei Cai, and Shuiwang Ji. "Adaptive Convolutional ReLUs." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (2020): 3914–21. http://dx.doi.org/10.1609/aaai.v34i04.5805.

Full text
Abstract:
Rectified linear units (ReLUs) are currently the most popular activation function used in neural networks. Although ReLUs can solve the gradient vanishing problem and accelerate training convergence, they suffer from the dying ReLU problem in which some neurons are never activated if the weights are not updated properly. In this work, we propose a novel activation function, known as the adaptive convolutional ReLU (ConvReLU), that can better mimic brain neuron activation behaviors and overcome the dying ReLU problem. With our novel parameter sharing scheme, ConvReLUs can be applied to convolution layers that allow each input neuron to be activated by different trainable thresholds without involving a large number of extra parameters. We employ the zero initialization scheme in ConvReLU to encourage trainable thresholds to be close to zero. Finally, we develop a partial replacement strategy that only replaces the ReLUs in the early layers of the network. This resolves the dying ReLU problem and retains sparse representations for linear classifiers. Experimental results demonstrate that our proposed ConvReLU has consistently better performance compared to ReLU, LeakyReLU, and PReLU. In addition, the partial replacement strategy is shown to be effective not only for our ConvReLU but also for LeakyReLU and PReLU.
APA, Harvard, Vancouver, ISO, and other styles
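A hedged PyTorch sketch of the trainable, zero-initialized threshold idea described above; this is not the paper's ConvReLU (whose exact functional form and parameter-sharing scheme may differ), just a shifted ReLU whose per-channel threshold receives gradients.

```python
# A ReLU-like unit with a trainable, zero-initialized per-channel threshold.
import torch
from torch import nn

class ThresholdReLU(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # zero initialization: the unit starts out identical to a plain ReLU
        self.threshold = nn.Parameter(torch.zeros(channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        t = self.threshold.view(1, -1, 1, 1)
        return torch.relu(x - t)  # shifted ReLU: gradients flow to the threshold

layer = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), ThresholdReLU(8))
out = layer(torch.randn(2, 3, 16, 16))
print(out.shape)  # torch.Size([2, 8, 16, 16])
```

The shifted form is chosen here only because it keeps the threshold differentiable; the paper's scheme for producing thresholds per input neuron is not reproduced.
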
24

Kondra, Pranitha, and Naresh Vurukonda. "Feature Extraction and Classification of Gray-Scale Images of Brain Tumor using Deep Learning." Scalable Computing: Practice and Experience 25, no. 2 (2024): 1005–17. http://dx.doi.org/10.12694/scpe.v25i2.2456.

Full text
Abstract:
Deep learning using CNNs plays a paramount role in classification methods applied to medical image data. With a crucial role in accurate diagnosis, treatment planning and patient management for medical and healthcare systems, CNNs have won accolades in deep learning research. As simple as the learning model is, so precise are the results for decision making. The proposed sequential CNN model, built with a parametric ReLU whose values are aligned to the geometric mean, attains the specific goal of tumor classification. The additional support of ground truth aids in deciding the shape and severity of the tumor in the grayscale MRI of the brain. The simple sequential model, although a minimal version, has achieved significant classification goals using GMP-ReLU. Comparative results with variants of ReLU have been charted in this article, standing as proof of a consistent classification model with parametric ReLU. The proposed design is conducted on images from Kaggle and a model is trained (a classifier is built), which can be considered an ideal filter for all the benchmark images. The accuracy of the proposed design is considerably improved compared to normal ReLU, up to 89.214%.
APA, Harvard, Vancouver, ISO, and other styles
25

Klusowski, Jason M., and Andrew R. Barron. "Approximation by Combinations of ReLU and Squared ReLU Ridge Functions With $\ell^1$ and $\ell^0$ Controls." IEEE Transactions on Information Theory 64, no. 12 (2018): 7649–56. http://dx.doi.org/10.1109/tit.2018.2874447.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Hanoon, Faten Salim, and Abbas Hanon Hassin Alasadi. "A modified residual network for detection and classification of Alzheimer’s disease." International Journal of Electrical and Computer Engineering (IJECE) 12, no. 4 (2022): 4400. http://dx.doi.org/10.11591/ijece.v12i4.pp4400-4407.

Full text
Abstract:
Alzheimer's disease (AD) is a brain disease that significantly declines a person's ability to remember and behave normally. By applying several approaches to distinguish between various stages of AD, neuroimaging data has been used to extract different patterns associated with various phases of AD. However, because the brain patterns of older adults and those in different phases are similar, researchers have had difficulty classifying them. In this paper, the 50-layer ResNet is modified by adding extra convolution layers to make the extracted features more diverse. Besides, the activation function (ReLU) was replaced with (Leaky ReLU) because ReLU takes the negative parts of its input, drops them to zero, and retains the positive parts. These negative inputs may contain useful feature information that could aid in the development of high-level discriminative features. Thus, Leaky ReLU was used instead of ReLU to prevent any potential loss of input information. In order to train the network from scratch without encountering the issue of overfitting, we added a dropout layer before the fully connected layer. The proposed method successfully classified the four stages of AD with an accuracy of 97.49% and 98% for precision, recall, and f1-score.
APA, Harvard, Vancouver, ISO, and other styles
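The activation swap described above can be sketched generically in PyTorch: walk a model and replace every nn.ReLU with nn.LeakyReLU, with dropout placed before the fully connected layer. The toy model below is a placeholder, not the modified 50-layer ResNet.

```python
# Replace ReLU modules with LeakyReLU so negative inputs keep a small signal.
import torch
from torch import nn

def replace_relu_with_leaky(module: nn.Module, negative_slope: float = 0.01) -> None:
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.LeakyReLU(negative_slope, inplace=False))
        else:
            replace_relu_with_leaky(child, negative_slope)

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.Dropout(p=0.5),                 # dropout before the fully connected layer
    nn.Linear(8 * 28 * 28, 4),         # e.g. four output stages
)
replace_relu_with_leaky(model)
print(model)                           # both ReLU modules are now LeakyReLU
out = model(torch.randn(2, 1, 28, 28))
print(out.shape)                       # torch.Size([2, 4])
```
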
27

Yahya, Ali Abdullah, Kui Liu, Ammar Hawbani, Yibin Wang, and Ali Naser Hadi. "A Novel Image Classification Method Based on Residual Network, Inception, and Proposed Activation Function." Sensors 23, no. 6 (2023): 2976. http://dx.doi.org/10.3390/s23062976.

Full text
Abstract:
In deeper layers, ResNet heavily depends on skip connections and Relu. Although skip connections have demonstrated their usefulness in networks, a major issue arises when the dimensions between layers are not consistent. In such cases, it is necessary to use techniques such as zero-padding or projection to match the dimensions between layers. These adjustments increase the complexity of the network architecture, resulting in an increase in parameter number and a rise in computational costs. Another problem is the vanishing gradient caused by utilizing Relu. In our model, after making appropriate adjustments to the inception blocks, we replace the deeper layers of ResNet with modified inception blocks and Relu with our non-monotonic activation function (NMAF). To reduce parameter number, we use symmetric factorization and 1×1 convolutions. Utilizing these two techniques contributed to reducing the parameter number by around 6 M parameters, which has helped reduce the run time by 30 s/epoch. Unlike Relu, NMAF addresses the deactivation problem of the non-positive number by activating the negative values and outputting small negative numbers instead of zero in Relu, which helped in enhancing the convergence speed and increasing the accuracy by 5%, 15%, and 5% for the non-noisy datasets, and 5%, 6%, and 21% for the noisy datasets.
APA, Harvard, Vancouver, ISO, and other styles
28

Faten, Salim Hanoon, and Hanon Hassin Alasadi Abbas. "A modified residual network for detection and classification of Alzheimer's disease." International Journal of Electrical and Computer Engineering (IJECE) 12, no. 4 (2022): 4400–4407. https://doi.org/10.11591/ijece.v12i4.pp4400-4407.

Full text
Abstract:
Alzheimer's disease (AD) is a brain disease that significantly declines a person's ability to remember and behave normally. By applying several approaches to distinguish between various stages of AD, neuroimaging data has been used to extract different patterns associated with various phases of AD. However, because the brain patterns of older adults and those in different phases are similar, researchers have had difficulty classifying them. In this paper, the 50-layer residual neural network (ResNet) is modified by adding extra convolution layers to make the extracted features more diverse. Besides, the activation function (ReLU) was replaced with (Leaky ReLU) because ReLU takes the negative parts of its input, drops them to zero, and retains the positive parts. These negative inputs may contain useful feature information that could aid in the development of high-level discriminative features. Thus, Leaky ReLU was used instead of ReLU to prevent any potential loss of input information. In order to train the network from scratch without encountering the issue of overfitting, we added a dropout layer before the fully connected layer. The proposed method successfully classified the four stages of AD with an accuracy of 97.49 % and 98 % for precision, recall, and f1-score.
APA, Harvard, Vancouver, ISO, and other styles
29

Noprisson, Handrie, Vina Ayumi, Mariana Purba, and Nur Ani. "MOBILENET PERFORMANCE IMPROVEMENTS FOR DEEPFAKE IMAGE IDENTIFICATION USING ACTIVATION FUNCTION AND REGULARIZATION." JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer) 10, no. 2 (2024): 441–48. http://dx.doi.org/10.33480/jitk.v10i2.5798.

Full text
Abstract:
Deepfake images are often used to spread false information, manipulate public opinion, and harm individuals by creating fake content, making the development of deepfake detection technology essential to mitigate these potential dangers. This study utilized the MobileNet architecture by applying regularization and activation function methods to improve detection accuracy. ReLU (Rectified Linear Unit) enhances the model's efficiency and ability to capture non-linear features, while Dropout and L2 regularization help reduce overfitting by penalizing large weights, thereby improving generalization. Based on experimental results, the MobileNet model optimized with ReLU and Dropout achieved an accuracy of 99.17% in the training phase, 85.34% in validation, and 70.60% in testing, whereas the MobileNet model optimized with ReLU and L2 showed lower accuracy in the training and validation phases compared to Dropout but achieved higher accuracy in testing at 72.18%. This study recommends MobileNet with ReLU and L2 due to its better generalization ability on testing data (resulting from reduced overfitting).
APA, Harvard, Vancouver, ISO, and other styles
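A hedged PyTorch sketch of the two regularization setups compared above, Dropout versus L2 (weight decay); the backbone, head sizes, and hyperparameters are placeholders rather than the study's MobileNet configuration.

```python
# Setup 1: ReLU + Dropout in the head.  Setup 2: ReLU + L2 via optimizer weight decay.
import torch
from torch import nn

def make_backbone() -> nn.Sequential:
    # Stand-in for a MobileNet feature extractor
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

# Setup 1: ReLU + Dropout
model_dropout = nn.Sequential(
    make_backbone(),
    nn.Linear(16, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 2),
)
opt_dropout = torch.optim.Adam(model_dropout.parameters(), lr=1e-3)

# Setup 2: ReLU + L2 regularization (weight decay), no dropout
model_l2 = nn.Sequential(
    make_backbone(),
    nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2),
)
opt_l2 = torch.optim.Adam(model_l2.parameters(), lr=1e-3, weight_decay=1e-4)  # L2 penalty

x = torch.randn(4, 3, 224, 224)
print(model_dropout(x).shape, model_l2(x).shape)  # torch.Size([4, 2]) twice
```
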
30

Salam, Abdulwahed, Abdelaaziz El Hibaoui, and Abdulgabbar Saif. "A comparison of activation functions in multilayer neural network for predicting the production and consumption of electricity power." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 1 (2021): 163. http://dx.doi.org/10.11591/ijece.v11i1.pp163-170.

Full text
Abstract:
Predicting electricity power is an important task, which helps power utilities in improving their systems’ performance in terms of effectiveness, productivity, management and control. Several research works have addressed this task using three main models: engineering, statistical and artificial intelligence. Based on the experiments that used artificial intelligence models, the multilayer neural network model has proven its success in predicting many evaluation datasets. However, the performance of this model depends mainly on the type of activation function. Therefore, this paper introduces an experimental study for investigating the performance of the multilayer neural network model with respect to different activation functions and different depths of hidden layers. The experiments in this paper cover the comparison among eleven activation functions using four benchmark electricity datasets. The activation functions under examination are sigmoid, hyperbolic tangent, SoftSign, SoftPlus, ReLU, Leak ReLU, Gaussian, ELU, SELU, Swish and Adjust-Swish. Experimental results show that ReLU and Leak ReLU activation functions outperform their counterparts in all datasets.
APA, Harvard, Vancouver, ISO, and other styles
31

Pattanaik, Abhipsa, and Leena Das. "DeepSkinNet: A Deep Learning Induced Skin Lesion Extraction System from Dermoscopic Images." International Journal of Online and Biomedical Engineering (iJOE) 21, no. 07 (2025): 15–28. https://doi.org/10.3991/ijoe.v21i07.54621.

Full text
Abstract:
In this work, a DeepSkinNet model was developed based on an encoder-decoder type framework. The designed encoder incorporates three blocks, where each block sandwiches convolution, rectified linear unit (ReLU), and maxpooling layers to retain the prominent details. The developed DIL (dilated convolution + instance normalization + Leaky ReLU) module comprises three branches, where each branch consists of an atrous convolution layer with a sampling rate of two, followed by instance normalization and a Leaky ReLU activation function to retain the subtle details accurately. Further, the proposed decoder network with a feature fusion mechanism stacks convolution, transposed convolution, and ReLU activation functions to precisely extract the lesion regions from the dermoscopic images. The efficacy of the designed DeepSkinNet is validated through subjective as well as objective analysis and found to be suitable for medical diagnosis against various SOTA methods. The Dice coefficients (DC) obtained on the ISIC 2016, ISIC 2017, ISIC 2018, and PH2 datasets are 93.33%, 89.00%, 92.05%, and 91.24%, respectively.
APA, Harvard, Vancouver, ISO, and other styles
32

Abdulwahed, Salam, El Hibaoui Abdelaaziz, and Saif Abdulgabbar. "A comparison of activation functions in multilayer neural network for predicting the production and consumption of electricity power." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 1 (2021): 163–70. https://doi.org/10.11591/ijece.v11i1.pp163-170.

Full text
Abstract:
Predicting electricity power is an important task, which helps power utilities in improving their systems’ performance in terms of effectiveness, productivity, management and control. Several research works have addressed this task using three main models: engineering, statistical and artificial intelligence. Based on the experiments that used artificial intelligence models, the multilayer neural network model has proven its success in predicting many evaluation datasets. However, the performance of this model depends mainly on the type of activation function. Therefore, this paper introduces an experimental study for investigating the performance of the multilayer neural network model with respect to different activation functions and different depths of hidden layers. The experiments in this paper cover the comparison among eleven activation functions using four benchmark electricity datasets. The activation functions under examination are sigmoid, hyperbolic tangent, SoftSign, SoftPlus, ReLU, Leak ReLU, Gaussian, ELU, SELU, Swish and Adjust-Swish. Experimental results show that ReLU and Leak ReLU activation functions outperform their counterparts in all datasets.
APA, Harvard, Vancouver, ISO, and other styles
33

Akter, Shahrin, and Mohammad Rafiqul Haider. "mTanh: A Low-Cost Inkjet-Printed Vanishing Gradient Tolerant Activation Function." Journal of Low Power Electronics and Applications 15, no. 2 (2025): 27. https://doi.org/10.3390/jlpea15020027.

Full text
Abstract:
Inkjet-printed circuits on flexible substrates are rapidly emerging as a key technology in flexible electronics, driven by their minimal fabrication process, cost-effectiveness, and environmental sustainability. Recent advancements in inkjet-printed devices and circuits have broadened their applications in both sensing and computing. Building on this progress, this work has developed a nonlinear computational element coined as mTanh to serve as an activation function in neural networks. Activation functions are essential in neural networks as they introduce nonlinearity, enabling machine learning models to capture complex patterns. However, widely used functions such as Tanh and sigmoid often suffer from the vanishing gradient problem, limiting the depth of neural networks. To address this, alternative functions like ReLU and Leaky ReLU have been explored, yet these also introduce challenges such as the dying ReLU issue, bias shifting, and noise sensitivity. The proposed mTanh activation function effectively mitigates the vanishing gradient problem, allowing for the development of deeper neural network architectures without compromising training efficiency. This study demonstrates the feasibility of mTanh as an activation function by integrating it into an Echo State Network to predict the Mackey–Glass time series signal. The results show that mTanh performs comparably to Tanh, ReLU, and Leaky ReLU in this task. Additionally, the vanishing gradient resistance of the mTanh function was evaluated by implementing it in a deep multi-layer perceptron model for Fashion MNIST image classification. The study indicates that mTanh enables the addition of 3–5 extra layers compared to Tanh and sigmoid, while exhibiting vanishing gradient resistance similar to ReLU. These results highlight the potential of mTanh as a promising activation function for deep learning models, particularly in flexible electronics applications.
APA, Harvard, Vancouver, ISO, and other styles
34

Opschoor, Joost A. A., Philipp C. Petersen, and Christoph Schwab. "Deep ReLU networks and high-order finite element methods." Analysis and Applications 18, no. 05 (2020): 715–70. http://dx.doi.org/10.1142/s0219530519410136.

Full text
Abstract:
Approximation rate bounds for emulations of real-valued functions on intervals by deep neural networks (DNNs) are established. The approximation results are given for DNNs based on ReLU activation functions. The approximation error is measured with respect to Sobolev norms. It is shown that ReLU DNNs allow for essentially the same approximation rates as nonlinear, variable-order, free-knot (or so-called “$hp$-adaptive”) spline approximations and spectral approximations, for a wide range of Sobolev and Besov spaces. In particular, exponential convergence rates in terms of the DNN size for univariate, piecewise Gevrey functions with point singularities are established. Combined with recent results on ReLU DNN approximation of rational, oscillatory, and high-dimensional functions, this corroborates that continuous, piecewise affine ReLU DNNs afford algebraic and exponential convergence rate bounds which are comparable to “best in class” schemes for several important function classes of high and infinite smoothness. Using composition of DNNs, we also prove that radial-like functions obtained as compositions of the above with the Euclidean norm and, possibly, anisotropic affine changes of co-ordinates can be emulated at exponential rate in terms of the DNN size and depth without the curse of dimensionality.
APA, Harvard, Vancouver, ISO, and other styles
35

Liu, Bo, and Yi Liang. "Optimal function approximation with ReLU neural networks." Neurocomputing 435 (May 2021): 216–27. http://dx.doi.org/10.1016/j.neucom.2021.01.007.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Wang, Pichao, Xue Wang, Hao Luo, et al. "Scaled ReLU Matters for Training Vision Transformers." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (2022): 2495–503. http://dx.doi.org/10.1609/aaai.v36i3.20150.

Full text
Abstract:
Vision transformers (ViTs) have been an alternative design paradigm to convolutional neural networks (CNNs). However, the training of ViTs is much harder than CNNs, as it is sensitive to the training parameters, such as learning rate, optimizer and warmup epoch. The reasons for training difficulty are empirically analysed in the paper Early Convolutions Help Transformers See Better, and the authors conjecture that the issue lies with the patchify-stem of ViT models. In this paper, we further investigate this problem and extend the above conclusion: only early convolutions do not help for stable training, but the scaled ReLU operation in the convolutional stem (conv-stem) matters. We verify, both theoretically and empirically, that scaled ReLU in conv-stem not only improves training stabilization, but also increases the diversity of patch tokens, thus boosting peak performance with a large margin via adding few parameters and flops. In addition, extensive experiments are conducted to demonstrate that previous ViTs are far from being well trained, further showing that ViTs have great potential to be a better substitute of CNNs.
APA, Harvard, Vancouver, ISO, and other styles
37

Dereich, Steffen, and Sebastian Kassing. "On minimal representations of shallow ReLU networks." Neural Networks 148 (April 2022): 121–28. http://dx.doi.org/10.1016/j.neunet.2022.01.006.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Chen, Zhi, and Pin-Han Ho. "Global-connected network with generalized ReLU activation." Pattern Recognition 96 (December 2019): 106961. http://dx.doi.org/10.1016/j.patcog.2019.07.006.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Barbu, Adrian. "Training a Two-Layer ReLU Network Analytically." Sensors 23, no. 8 (2023): 4072. http://dx.doi.org/10.3390/s23084072.

Full text
Abstract:
Neural networks are usually trained with different variants of gradient descent-based optimization algorithms such as the stochastic gradient descent or the Adam optimizer. Recent theoretical work states that the critical points (where the gradient of the loss is zero) of two-layer ReLU networks with the square loss are not all local minima. However, in this work, we will explore an algorithm for training two-layer neural networks with ReLU-like activation and the square loss that alternatively finds the critical points of the loss function analytically for one layer while keeping the other layer and the neuron activation pattern fixed. Experiments indicate that this simple algorithm can find deeper optima than stochastic gradient descent or the Adam optimizer, obtaining significantly smaller training loss values on four out of the five real datasets evaluated. Moreover, the method is faster than the gradient descent methods and has virtually no tuning parameters.
APA, Harvard, Vancouver, ISO, and other styles
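The analytic step described above can be sketched in a simplified form: with the first layer and its ReLU activation pattern held fixed, the output layer under the square loss is a linear least-squares problem with a closed-form solution. The sketch below shows only that step, on toy data, not the paper's full alternating algorithm.

```python
# One analytic output-layer solve for a two-layer ReLU network under the square loss.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))                     # toy inputs
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)   # toy targets

W1 = rng.standard_normal((10, 32)) / np.sqrt(10)       # fixed first layer
b1 = np.zeros(32)
H = np.maximum(X @ W1 + b1, 0.0)                       # hidden ReLU features (pattern fixed)

# Closed-form output layer: minimize ||A w - y||^2 (+ tiny ridge for numerical stability)
A = np.hstack([H, np.ones((H.shape[0], 1))])           # append a bias column
w = np.linalg.solve(A.T @ A + 1e-6 * np.eye(A.shape[1]), A.T @ y)

mse = np.mean((A @ w - y) ** 2)
print(f"training MSE after one analytic output-layer solve: {mse:.4f}")
```
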
40

M Mesran, Sitti Rachmawati Yahya, Fifto Nugroho, and Agus Perdana Windarto. "Investigating the Impact of ReLU and Sigmoid Activation Functions on Animal Classification Using CNN Models." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 8, no. 1 (2024): 111–18. http://dx.doi.org/10.29207/resti.v8i1.5367.

Full text
Abstract:
VGG16 is a convolutional neural network model used for image recognition. It is unique in that it has only 16 weighted layers, rather than relying on a large number of hyperparameters. It is considered one of the best vision model architectures. This study compares the performance of the ReLU (rectified linear unit) and sigmoid activation functions in CNN models for animal classification. To choose which model to use, we tested two state-of-the-art CNN architectures: the default VGG16 and the VGG16 with the proposed method. A data set consisting of 2,000 images of five different animals was used. The results show that ReLU achieves higher classification accuracy than sigmoid. The model with ReLU on the convolutional and fully connected layers achieved the highest accuracy of 97.56% on the test dataset. However, further experiments and considerations are needed to improve the results. The research aims to find better activation functions and identify factors that influence model performance. The data set consists of animal images collected from Kaggle, including cats, cows, elephants, horses, and sheep. It is divided into training and test sets (ratio 80:20). The CNN model has two convolution layers and two fully connected layers. ReLU and sigmoid activation functions with different learning rates are used. Evaluation metrics include accuracy, precision, recall, F1 score, and test cost. ReLU outperforms sigmoid in accuracy, precision, recall, and F1 score. However, other factors such as the size, complexity and parameters of the data set must be taken into account. This study emphasizes the importance of choosing the right activation function for better classification accuracy. ReLU is identified as effective in solving the vanishing gradient problem. These findings can guide future research to improve CNN models in animal classification.
APA, Harvard, Vancouver, ISO, and other styles
41

Windarto, Agus Perdana, Indra Riyana Rahadjeng, Muhammad Noor Hasan Siregar, and Muhammad Habib Yuhandri. "Optimization of the Activation Function for Predicting Inflation Levels to Increase Accuracy Values." JURNAL MEDIA INFORMATIKA BUDIDARMA 8, no. 3 (2024): 1627. http://dx.doi.org/10.30865/mib.v8i3.7776.

Full text
Abstract:
This study aims to optimize the backpropagation algorithm by evaluating various activation functions to improve the accuracy of inflation rate predictions. Utilizing historical inflation data, neural network models were constructed and trained with Sigmoid, ReLU, and TanH activation functions. Evaluation using the Mean Squared Error (MSE) metric revealed that the ReLU function provided the most significant performance improvement. The findings indicate that the choice of activation function and neural network architecture significantly influences the model's ability to predict inflation rates. In the 5-7-1 architecture, the Logsig and ReLU activation functions demonstrated the best performance, with Logsig achieving the lowest MSE (0.00923089) and the highest accuracy (75%) on the test data. These results underscore the importance of selecting appropriate activation functions to enhance prediction accuracy, with ReLU outperforming the other functions in the context of the dataset used. This research concludes that optimizing activation functions in backpropagation is a crucial step in developing more accurate inflation prediction models, contributing significantly to neural network literature and practical economic applications.
APA, Harvard, Vancouver, ISO, and other styles
42

Madhu, Golla, Sandeep Kautish, Khalid Abdulaziz Alnowibet, Hossam M. Zawbaa, and Ali Wagdy Mohamed. "NIPUNA: A Novel Optimizer Activation Function for Deep Neural Networks." Axioms 12, no. 3 (2023): 246. http://dx.doi.org/10.3390/axioms12030246.

Full text
Abstract:
In recent years, various deep neural networks with different learning paradigms have been widely employed in various applications, including medical diagnosis, image analysis, self-driving vehicles and others. The activation functions employed in deep neural networks have a huge impact on the training model and the reliability of the model. The Rectified Linear Unit (ReLU) has recently emerged as the most popular and extensively utilized activation function. ReLU has some flaws, such as the fact that it is only active when the units are positive during back-propagation and zero otherwise. This causes neurons to die (dying ReLU) and a shift in bias. However, unlike ReLU activation functions, Swish activation functions do not remain stable or move in a single direction. This research proposes a new activation function named NIPUNA for deep neural networks. We test this activation by training on customized convolutional neural networks (CCNN). On benchmark datasets (Fashion MNIST images of clothes, MNIST dataset of handwritten digits), the contributions are examined and compared to various activation functions. The proposed activation function can outperform traditional activation functions.
APA, Harvard, Vancouver, ISO, and other styles
43

Sanjaya, Andi, Endang Setyati, and Herman Budianto. "Model Architecture of CNN for Recognition the Pandava Mask." Inform : Jurnal Ilmiah Bidang Teknologi Informasi dan Komunikasi 5, no. 2 (2020): 99–104. http://dx.doi.org/10.25139/inform.v5i2.2740.

Full text
Abstract:
This research was conducted in an effort to look for a CNN model architecture that is suitable for use on the Pandava mask object. The data tested amounted to 200 items per class, so there are 1,000 trial data in this study. In experiments with LeNet using input layers of 32x32, 64x64, 128x128, 224x224, and 256x256, the 32x32 input layer succeeded in showing a faster time than the other input layers, and accuracy and validation accuracy were neither underfit nor overfit. However, when the second dense layer's activation is switched from ReLU to sigmoid, the result of sigmoid is better than ReLU in terms of time, and the possibility of overfitting is smaller than when using ReLU.
APA, Harvard, Vancouver, ISO, and other styles
44

Xu, Xintao, Yi Liu, Gang Chen, Junbin Ye, Zhigang Li, and Huaxiang Lu. "A Cooperative Lightweight Translation Algorithm Combined with Sparse-ReLU." Computational Intelligence and Neuroscience 2022 (May 28, 2022): 1–12. http://dx.doi.org/10.1155/2022/4398839.

Full text
Abstract:
In the field of natural language processing (NLP), machine translation algorithm based on Transformer is challenging to deploy on hardware due to a large number of parameters and low parametric sparsity of the network weights. Meanwhile, the accuracy of lightweight machine translation networks also needs to be improved. To solve this problem, we first design a new activation function, Sparse-ReLU, to improve the parametric sparsity of weights and feature maps, which facilitates hardware deployment. Secondly, we design a novel cooperative processing scheme with CNN and Transformer and use Sparse-ReLU to improve the accuracy of the translation algorithm. Experimental results show that our method, which combines Transformer and CNN with the Sparse-ReLU, achieves a 2.32% BLEU improvement in prediction accuracy and reduces the number of parameters of the model by 23%, and the sparsity of the inference model increases by more than 50%.
APA, Harvard, Vancouver, ISO, and other styles
45

Pardede, Doughlas, Ichsan Firmansyah, Meli Handayani, Meisarah Riandini, and Rika Rosnelly. "COMPARISON OF MULTILAYER PERCEPTRON’S ACTIVATION AND OPTIMIZATION FUNCTIONS IN CLASSIFICATION OF COVID-19 PATIENTS." JURTEKSI (Jurnal Teknologi dan Sistem Informasi) 8, no. 3 (2022): 271–78. http://dx.doi.org/10.33330/jurteksi.v8i3.1482.

Full text
Abstract:
Patient’s symptoms could be used as features in Covid-19 classification. Using a multilayer perceptron, the classification uses a data set that contains diagnoses of patients with Covid-19 symptoms and processes the data set to see whether the patient is Covid-19 positive or not. This paper compares four activation functions, namely identity, logistic, ReLU and tanh, combined with the optimizers L-BFGS-B, SGD and Adam. Using 5-fold and 10-fold cross-validation to obtain accuracy, F1, precision and recall values, we find that the logistic function with the L-BFGS-B optimizer and the ReLU function with the L-BFGS-B optimizer are the best combinations. The logistic function with the SGD optimizer, the ReLU function with the Adam optimizer and the tanh function with the Adam optimizer are the worst combinations according to their accuracy values. The logistic function with the SGD optimizer is the worst combination according to its F1 value. The logistic function with the SGD optimizer and the tanh function with the L-BFGS-B optimizer are the worst combinations according to their precision values. The logistic function with the SGD optimizer, the ReLU function with the Adam optimizer and the tanh function with the Adam optimizer are the worst combinations according to their recall values. Keywords: activation function; Covid-19; multilayer perceptron; optimizer algorithm
APA, Harvard, Vancouver, ISO, and other styles
46

Lee, Hyeonjeong, Jaewon Lee, and Miyoung Shin. "Using Wearable ECG/PPG Sensors for Driver Drowsiness Detection Based on Distinguishable Pattern of Recurrence Plots." Electronics 8, no. 2 (2019): 192. http://dx.doi.org/10.3390/electronics8020192.

Full text
Abstract:
This paper aims to investigate the robust and distinguishable pattern of heart rate variability (HRV) signals, acquired from wearable electrocardiogram (ECG) or photoplethysmogram (PPG) sensors, for driver drowsiness detection. As wearable sensors are so vulnerable to slight movement, they often produce more noise in signals. Thus, from noisy HRV signals, we need to find good traits that differentiate well between drowsy and awake states. To this end, we explored three types of recurrence plots (RPs) generated from the R–R intervals (RRIs) of heartbeats: Bin-RP, Cont-RP, and ReLU-RP. Here Bin-RP is a binary recurrence plot, Cont-RP is a continuous recurrence plot, and ReLU-RP is a thresholded recurrence plot obtained by filtering Cont-RP with a modified rectified linear unit (ReLU) function. By utilizing each of these RPs as input features to a convolutional neural network (CNN), we examined their usefulness for drowsy/awake classification. For experiments, we collected RRIs at drowsy and awake conditions with an ECG sensor of the Polar H7 strap and a PPG sensor of the Microsoft (MS) band 2 in a virtual driving environment. The results showed that ReLU-RP is the most distinct and reliable pattern for drowsiness detection, regardless of sensor types (i.e., ECG or PPG). In particular, the ReLU-RP based CNN models showed their superiority to other conventional models, providing approximately 6–17% better accuracy for ECG and 4–14% for PPG in drowsy/awake classification.
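A minimal sketch of the three recurrence-plot variants the abstract names, built directly from an R–R interval sequence; the embedding choice, the threshold value, and the exact form of the "modified ReLU" filter are assumptions.

```python
# Minimal sketch: Bin-RP, Cont-RP, and a thresholded ReLU-RP from R-R intervals.
import numpy as np

rri = np.random.normal(0.8, 0.05, size=200)        # placeholder R-R intervals (seconds)
dist = np.abs(rri[:, None] - rri[None, :])          # pairwise distance matrix

eps = 0.05                                          # assumed recurrence threshold
bin_rp = (dist <= eps).astype(float)                # Bin-RP: binary recurrence plot
cont_rp = dist                                      # Cont-RP: unthresholded distances
relu_rp = np.where(cont_rp > eps, cont_rp, 0.0)     # ReLU-RP: Cont-RP filtered by a thresholded ReLU

# Each 2-D plot can then be fed to a CNN as a single-channel image.
print(bin_rp.shape, cont_rp.shape, relu_rp.shape)
```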
APA, Harvard, Vancouver, ISO, and other styles
47

Zheng, Shuxin, Qi Meng, Huishuai Zhang, Wei Chen, Nenghai Yu, and Tie-Yan Liu. "Capacity Control of ReLU Neural Networks by Basis-Path Norm." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5925–32. http://dx.doi.org/10.1609/aaai.v33i01.33015925.

Full text
Abstract:
Recently, path norm was proposed as a new capacity measure for neural networks with Rectified Linear Unit (ReLU) activation function, which takes the rescaling-invariant property of ReLU into account. It has been shown that the generalization error bound in terms of the path norm explains the empirical generalization behaviors of the ReLU neural networks better than that of other capacity measures. Moreover, optimization algorithms which take path norm as the regularization term to the loss function, like Path-SGD, have been shown to achieve better generalization performance. However, the path norm counts the values of all paths, and hence the capacity measure based on path norm could be improperly influenced by the dependency among different paths. It is also known that each path of a ReLU network can be represented by a small group of linearly independent basis paths with multiplication and division operation, which indicates that the generalization behavior of the network only depends on only a few basis paths. Motivated by this, we propose a new norm Basis-path Norm based on a group of linearly independent paths to measure the capacity of neural networks more accurately. We establish a generalization error bound based on this basis path norm, and show it explains the generalization behaviors of ReLU networks more accurately than previous capacity measures via extensive experiments. In addition, we develop optimization algorithms which minimize the empirical risk regularized by the basis-path norm. Our experiments on benchmark datasets demonstrate that the proposed regularization method achieves clearly better performance on the test set than the previous regularization approaches.
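For reference, the sketch below computes the ordinary l2 path norm of a plain feed-forward ReLU network by pushing a ones vector through the element-wise squared weight matrices; the paper's basis-path norm restricts the calculation to a set of linearly independent basis paths and is not reproduced here.

```python
# Sketch of the l2 path norm: sqrt of the sum over all input-output paths of the
# products of squared weights along each path.
import numpy as np

def l2_path_norm(weights):
    """weights: list of (out_dim, in_dim) matrices ordered from first to last layer."""
    v = np.ones(weights[0].shape[1])      # one unit of "path mass" per input neuron
    for W in weights:
        v = (W ** 2) @ v                  # accumulate squared weight products along paths
    return float(np.sqrt(v.sum()))

rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 8)), rng.normal(size=(16, 16)), rng.normal(size=(1, 16))]
print(l2_path_norm(weights))
```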
APA, Harvard, Vancouver, ISO, and other styles
48

Azhary, Muhammad Zulhazmi Rafiqi, and Amelia Ritahani Ismail. "Comparative Performance of Different Convolutional Neural Network Activation Functions on Image Classification." International Journal on Perceptive and Cognitive Computing 10, no. 2 (2024): 118–22. http://dx.doi.org/10.31436/ijpcc.v10i2.490.

Full text
Abstract:
Activation functions are crucial in optimising Convolutional Neural Networks (CNNs) for image classification. While CNNs excel at capturing spatial hierarchies in images, the activation functions substantially impact their effectiveness. Traditional functions, such as ReLU and Sigmoid, have drawbacks, including the "dying ReLU" problem and vanishing gradients, which can inhibit learning and efficacy. The study seeks to comprehensively analyse various activation functions across different CNN architectures to determine their impact on performance. The findings suggest that Swish and Leaky ReLU outperform other functions, with Swish particularly promising in complicated networks such as ResNet. This emphasises the relevance of activation function selection in improving CNN performance and implies that investigating alternative functions can lead to more accurate and efficient models for image classification tasks.
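For reference, these are the standard definitions of the two activations the study found strongest; the negative slope and β values used in the paper are not stated, so common defaults are assumed.

```python
# Reference definitions of Swish and Leaky ReLU with commonly used default parameters.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x, beta=1.0):
    # Swish: x * sigmoid(beta * x); beta = 1 gives the common "SiLU" form.
    return x * sigmoid(beta * x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: positives pass unchanged, negatives are scaled by a small slope alpha.
    return np.where(x > 0, x, alpha * x)

x = np.linspace(-3, 3, 7)
print(swish(x))
print(leaky_relu(x))
```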
APA, Harvard, Vancouver, ISO, and other styles
49

Margolang, Khairul Fadhli, Sugeng Riyadi, Rika Rosnelly, and Wanayumini -. "Pengenalan Masker Wajah Menggunakan VGG-16 dan Multilayer Perceptron." Jurnal Telematika 17, no. 2 (2023): 80–87. http://dx.doi.org/10.61769/telematika.v17i2.519.

Full text
Abstract:
The use of a face mask during the Covid-19 pandemic can be identified from images of a person's face, which are classified based on extracted features. VGG-16 is a pre-trained CNN model that can extract 4,096 features from an image; these features are transferred to a multilayer perceptron that classifies whether a person is wearing a face mask. The results of this study indicate that, of the combinations of ReLU activation with adaptive moment estimation (Adam) and stochastic gradient descent (SGD) optimization, the combination of ReLU and Adam produces the best classification performance, with accuracy, precision, and recall values of 98.1%.
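A hypothetical pipeline in the spirit of this abstract (not the authors' code): VGG-16's 4,096-dimensional fc1 features feed a multilayer perceptron trained with ReLU activation and the Adam optimizer; the images and labels below are placeholders.

```python
# Illustrative sketch: VGG-16 fc1 features (4,096-d) as input to an MLP classifier.
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from sklearn.neural_network import MLPClassifier

base = VGG16(weights="imagenet", include_top=True)
extractor = Model(inputs=base.input, outputs=base.get_layer("fc1").output)  # 4,096-d features

images = np.random.rand(8, 224, 224, 3) * 255.0   # placeholder face images
labels = np.array([0, 1, 0, 1, 0, 1, 0, 1])       # placeholder labels (1 = wearing a mask)

features = extractor.predict(preprocess_input(images))
clf = MLPClassifier(activation="relu", solver="adam", max_iter=500, random_state=0)
clf.fit(features, labels)
print(clf.score(features, labels))
```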
APA, Harvard, Vancouver, ISO, and other styles
50

Daniel, Irwan, Agus Fahmi Limas Ptr, and Aulia Ichsan. "Klasifikasi Risiko Penyakit Jantung Dengan Multilayer Perceptron." Data Sciences Indonesia (DSI) 4, no. 1 (2024): 78–82. https://doi.org/10.47709/dsi.v4i1.4667.

Full text
Abstract:
Heart disease is one of the leading causes of death worldwide, and early detection is often challenging because its initial symptoms are non-specific. This study aims to evaluate the effectiveness of a Multilayer Perceptron (MLP) model for heart disease risk classification by comparing two activation functions, ReLU and Tanh. The dataset consists of 1,190 entries with 11 health features, split 80:20 for training and testing. The MLP model was built with three hidden layers, and each model was trained with the ReLU and Tanh activation functions to evaluate their performance in classifying heart disease risk. The models were evaluated using accuracy, precision, and recall. The results show that the MLP model with the ReLU activation function achieved an accuracy of 81.51%, a precision of 81.77%, and a recall of 81.51%, while the model with the Tanh activation function achieved an accuracy of 80.25%, a precision of 80.32%, and a recall of 80.25%. This difference indicates that ReLU is superior in accuracy and the other evaluation metrics, making it the more effective choice for early detection of heart disease. These findings provide valuable insight into how the choice of activation function affects model performance in disease risk classification and underscore the importance of selecting the right technique to improve detection accuracy in medical applications.
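A sketch of the comparison this abstract describes, assuming scikit-learn's MLPClassifier with three hidden layers of illustrative sizes, an 80:20 split, and a synthetic 11-feature stand-in for the heart-disease dataset.

```python
# Illustrative sketch: three-hidden-layer MLP, 80:20 split, ReLU vs. Tanh activations.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1190, n_features=11, random_state=0)  # placeholder data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for activation in ["relu", "tanh"]:
    clf = MLPClassifier(hidden_layer_sizes=(64, 32, 16), activation=activation,
                        max_iter=1000, random_state=0)
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(activation,
          round(accuracy_score(y_te, pred), 4),
          round(precision_score(y_te, pred, average="weighted"), 4),
          round(recall_score(y_te, pred, average="weighted"), 4))
```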
APA, Harvard, Vancouver, ISO, and other styles