Journal articles on the topic 'Two-layers neural networks'

Consult the top 50 journal articles for your research on the topic 'Two-layers neural networks.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Wei, Chih-Chiang. "Comparison of River Basin Water Level Forecasting Methods: Sequential Neural Networks and Multiple-Input Functional Neural Networks." Remote Sensing 12, no. 24 (December 20, 2020): 4172. http://dx.doi.org/10.3390/rs12244172.

Full text
Abstract:
To precisely forecast downstream water levels in catchment areas during typhoons, deep learning artificial neural networks were employed to establish two water level forecasting models using sequential neural networks (SNNs) and multiple-input functional neural networks (MIFNNs). SNNs, which have a typical neural network structure, are network models constructed using sequential methods. To develop a network model capable of flexibly consolidating data, MIFNNs are employed for processing data from multiple sources or with multiple dimensions. Specifically, when images (e.g., radar reflectivity images) are used as input attributes, feature extraction is required to provide effective feature maps for model training; therefore, convolutional layers and pooling layers were adopted to extract features. Long short-term memory (LSTM) layers adopted during model training enabled memory cell units to automatically determine the memory length, providing more useful information. The Hsintien River basin in northern Taiwan was selected as the research area, and relevant data from 2011 to 2019 were collected. The input attributes comprised one-dimensional data (e.g., water levels at river stations, rain rates at rain gauges, and reservoir release) and two-dimensional data (i.e., radar reflectivity mosaics). Typhoons Saola, Soudelor, Dujuan, and Megi were selected, and the water levels 1 to 6 h after the typhoons struck were forecasted. The results indicated that, compared with the linear regression (REG), SNN using dense layers (SNN-Dense), and SNN using LSTM layers (SNN-LSTM) models, the MIFNN model achieved superior forecasting results and was thus identified as the optimal model for water level forecasting.
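The MIFNN design described above lends itself to a functional-API sketch. The following is a minimal illustration, not the authors' code: all layer sizes, the 10-step window, the 8 gauge channels, and the 64x64 radar input are assumptions, but it shows how convolution/pooling layers for 2-D radar mosaics and an LSTM branch for 1-D gauge series can be merged into one multi-input forecaster.

```python
# A minimal sketch (not the authors' code) of a multiple-input functional
# model in the spirit of the MIFNN: a CNN branch extracts features from
# radar-reflectivity images while an LSTM branch handles 1-D time series
# (water levels, rain rates, reservoir release); shapes are illustrative.
import tensorflow as tf
from tensorflow.keras import layers, Model

series_in = layers.Input(shape=(10, 8), name="gauge_series")   # assumed shape
image_in = layers.Input(shape=(64, 64, 1), name="radar_mosaic")  # assumed shape

# Convolution + pooling layers extract feature maps from the imagery.
x = layers.Conv2D(16, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)

# An LSTM layer lets memory cells decide how much history to retain.
y = layers.LSTM(32)(series_in)

merged = layers.concatenate([x, y])
out = layers.Dense(6, name="levels_1_to_6h")(merged)  # 1-6 h ahead forecasts

model = Model(inputs=[series_in, image_in], outputs=out)
model.compile(optimizer="adam", loss="mse")
model.summary()
```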
APA, Harvard, Vancouver, ISO, and other styles
2

Yin, Chun Hua, Jia Wei Chen, and Lei Chen. "Weight to Vision Neural Network Information Processing Influence Research." Advanced Materials Research 605-607 (December 2012): 2131–36. http://dx.doi.org/10.4028/www.scientific.net/amr.605-607.2131.

Full text
Abstract:
Many factors influence the information processing of a vision neural network, for example: initial signal values, weights, time, and the number of learning iterations. This paper discusses the importance of weights in the information processing of vision neural networks. Different weight values lead to different results in neural network learning. We first construct a three-layer vision neural network model based on synapse dynamics. We then change the weights of this model so that the three-layer network learns Chinese characters. Finally, we change the initial weight distribution to simulate the network's process of learning Chinese words. Two results are produced: first, weights play a very important role in vision neural network learning; second, different initial weight distributions yield different learning results.
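The paper's central claim, that the initial weight distribution changes what the network learns, is easy to reproduce in miniature. The sketch below is not the authors' synapse-dynamics model: it trains two identical toy networks on synthetic data, differing only in the standard deviation of their initial weights, and compares final losses.

```python
# A small illustrative experiment (not the paper's model): two identical
# networks that differ only in their initial weight distribution can end
# training at very different losses.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, initializers

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 16)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")  # toy binary target

def make_net(stddev):
    init = initializers.RandomNormal(stddev=stddev)
    return tf.keras.Sequential([
        layers.Input(shape=(16,)),
        layers.Dense(32, activation="tanh", kernel_initializer=init),
        layers.Dense(1, activation="sigmoid", kernel_initializer=init),
    ])

for stddev in (0.05, 2.0):  # narrow vs. wide initial weight distribution
    net = make_net(stddev)
    net.compile(optimizer="sgd", loss="binary_crossentropy")
    hist = net.fit(X, y, epochs=30, verbose=0)
    print(f"stddev={stddev}: final loss {hist.history['loss'][-1]:.4f}")
```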
APA, Harvard, Vancouver, ISO, and other styles
3

Carpenter, William C., and Margery E. Hoffman. "Guidelines for the selection of network architecture." Artificial Intelligence for Engineering Design, Analysis and Manufacturing 11, no. 5 (November 1997): 395–408. http://dx.doi.org/10.1017/s0890060400003322.

Full text
Abstract:
This paper is concerned with presenting guidelines to aid in the selection of the appropriate network architecture for back-propagation neural networks used as approximators. In particular, its goal is to indicate under what circumstances neural networks should have two hidden layers and under what circumstances they should have one hidden layer. Networks with one and with two hidden layers were used to approximate numerous test functions. Guidelines were developed from the results of these investigations.
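The kind of experiment behind such guidelines can be sketched as follows; the test function, layer widths, and training budget are arbitrary stand-ins, and the original study of course predates modern frameworks. One- and two-hidden-layer back-propagation networks of comparable size approximate the same function and their errors are compared.

```python
# A hedged sketch of comparing one- vs. two-hidden-layer approximators
# on a test function; all choices here are illustrative.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

X = np.linspace(-2, 2, 400).reshape(-1, 1).astype("float32")
y = np.sin(3 * X) * np.exp(-X**2)  # an arbitrary smooth test function

def mlp(hidden_sizes):
    net = tf.keras.Sequential([layers.Input(shape=(1,))])
    for h in hidden_sizes:
        net.add(layers.Dense(h, activation="tanh"))
    net.add(layers.Dense(1))
    net.compile(optimizer="adam", loss="mse")
    return net

for sizes in [(40,), (20, 20)]:  # one vs. two hidden layers
    net = mlp(sizes)
    net.fit(X, y, epochs=200, verbose=0)
    print(sizes, "MSE:", float(net.evaluate(X, y, verbose=0)))
```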
APA, Harvard, Vancouver, ISO, and other styles
4

Baptista, Marcia, Helmut Prendinger, and Elsa Henriques. "Prognostics in Aeronautics with Deep Recurrent Neural Networks." PHM Society European Conference 5, no. 1 (July 22, 2020): 11. http://dx.doi.org/10.36001/phme.2020.v5i1.1230.

Full text
Abstract:
Recurrent neural networks (RNNs) such as LSTM and GRU are not new to the field of prognostics. However, the performance of neural networks strongly depends on their architectural structure. In this work, we investigate a hybrid network architecture that is a combination of recurrent and feed-forward (conditional) layers. Two networks, one recurrent and another feed-forward, are chained together, with inference and weight gradients being learned using the standard back-propagation learning procedure. To better tune the network, instead of using raw sensor data, we do some preprocessing on the data, using mostly simple but effective statistics (researched in previous work). This helps the feature extraction phase and eases the problem of finding a suitable network configuration among the immense set of possible ones. This is not the first proposal of a hybrid network in prognostics, but our work is novel in the sense that it performs a more comprehensive comparison of this type of architecture for different RNN layers and numbers of layers. We also compare our work with other classical machine learning methods. Evaluation is performed on two real-world case studies from the aero-engine industry: one involving a critical valve subsystem of the jet engine and the other the whole reliability of the jet engine. Our goal here is to compare two cases contrasting micro (valve) and macro (whole engine) prognostics. Our results indicate that the performance of the LSTM and GRU deep networks is significantly better than that of other models.
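A minimal sketch of the chained recurrent-plus-feed-forward idea follows; it is not the authors' implementation, and the window length, feature count (assumed to be preprocessed statistics), and layer sizes are illustrative.

```python
# A hedged sketch: a recurrent block chained with a feed-forward
# (conditional) block, trained end to end by back-propagation.
# Input: windows of preprocessed statistics; output: a health indicator.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(30, 12)),           # 30 steps x 12 features (assumed)
    layers.GRU(64, return_sequences=True),  # recurrent part (LSTM also works)
    layers.GRU(32),
    layers.Dense(16, activation="relu"),    # feed-forward part
    layers.Dense(1),                        # e.g., remaining useful life
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```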
APA, Harvard, Vancouver, ISO, and other styles
5

PAUGAM-MOISY, HÉLÈNE. "HOW TO MAKE GOOD USE OF MULTILAYER NEURAL NETWORKS." Journal of Biological Systems 03, no. 04 (December 1995): 1177–91. http://dx.doi.org/10.1142/s0218339095001064.

Full text
Abstract:
This article is a survey of recent advances on multilayer neural networks. The first section is a short summary on multilayer neural networks: their history, their architecture, and their learning rule, the well-known back-propagation. In the following section, several theorems are cited which present one-hidden-layer neural networks as universal approximators. The next section points out that two hidden layers are often required for exactly realizing d-dimensional dichotomies. Defining the frontier between one-hidden-layer and two-hidden-layer networks is still an open problem. Several bounds on the size of a multilayer network which learns from examples are presented, and we emphasize the fact that, even if everything can be done with only one hidden layer, things can often be done better with two or more hidden layers. Finally, this assertion is supported by the behaviour of multilayer neural networks in two applications: prediction of pollution and odor recognition modelling.
APA, Harvard, Vancouver, ISO, and other styles
6

Vetrov, Igor A., and Vladislav V. Podtopelny. "Features of building neural networks taking into account the specifics of their training to solve the tasks of searching for network attacks." Proceedings of Tomsk State University of Control Systems and Radioelectronics 26, no. 2 (2023): 42–50. http://dx.doi.org/10.21293/1818-0442-2023-26-2-42-50.

Full text
Abstract:
The problem of building neural networks to detect network intrusions, taking into account modern publicly available technologies, is considered. Several configurations of neural networks are analyzed: a simple perceptron, a combined network consisting of two interconnected networks, simplified networks based on a simple perceptron, and LSTM networks using hidden layers with a data compression function. The weaknesses and strengths of these neural network architectures are considered, taking into account the specifics of their training on abnormal-traffic datasets in intrusion detection tasks.
APA, Harvard, Vancouver, ISO, and other styles
7

Petzka, Henning, Martin Trimmel, and Cristian Sminchisescu. "Notes on the Symmetries of 2-Layer ReLU-Networks." Proceedings of the Northern Lights Deep Learning Workshop 1 (February 6, 2020): 6. http://dx.doi.org/10.7557/18.5150.

Full text
Abstract:
Symmetries in neural networks allow different weight configurations leading to the same network function. For odd activation functions, the set of transformations mapping between such configurations has been studied extensively, but less is known for neural networks with ReLU activation functions. We give a complete characterization for fully-connected networks with two layers. Apart from two well-known transformations, only degenerate situations allow additional transformations that leave the network function unchanged. Reduction steps can remove only part of the degenerate cases. Finally, we present a non-degenerate situation for deep neural networks leading to new transformations leaving the network function intact.
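One of the two well-known transformations, positive rescaling, can be verified numerically: because relu(cz) = c·relu(z) for c > 0, scaling a hidden unit's incoming weights and bias by c and its outgoing weight by 1/c leaves a two-layer ReLU network's function unchanged. A small NumPy check with illustrative dimensions:

```python
# Numerical check of the positive-scaling symmetry of two-layer ReLU nets.
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)   # layer 1
w2 = rng.normal(size=5)                                 # layer 2 (scalar out)

def f(x, W1, b1, w2):
    return w2 @ np.maximum(W1 @ x + b1, 0.0)

c = 3.7                      # any positive scaling factor
W1s, b1s, w2s = W1.copy(), b1.copy(), w2.copy()
W1s[0] *= c; b1s[0] *= c     # scale unit 0's incoming weights and bias
w2s[0] /= c                  # compensate on the outgoing weight

x = rng.normal(size=3)
print(f(x, W1, b1, w2), f(x, W1s, b1s, w2s))  # identical outputs
```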
APA, Harvard, Vancouver, ISO, and other styles
8

Lamy, Lucas, and Paulo Henrique Siqueira. "The Null Layer: increasing convolutional neural network efficiency." Caderno Pedagógico 22, no. 6 (April 4, 2025): e15344. https://doi.org/10.54033/cadpedv22n6-050.

Full text
Abstract:
Convolutional neural networks are currently used in many applications; however, their construction involves a sequence of choices that can drastically affect the final network accuracy. In addition to the choice of architecture and hyperparameters, weight initialization of the layers is an essential step. We analyzed two different initialization methods in the layers to study their impact on network accuracy. The proposed method, the Null Layer, has a weight initialization of the first convolutional layer equal to zero, whereas the other layers have another type of initialization. The second method, the traditional method, uses the same weight initialization for all layers. Three different networks, four datasets, five activation functions, and three weight initializations were used for the tests. The results showed that the Null Layer method is an efficient approach for increasing network accuracy. It presented better accuracy in 53% of the tests than the traditional method without additional computational cost.
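The Null Layer construction reduces to a one-line initializer choice. The sketch below is illustrative rather than the paper's exact models; the activation of the zero-initialized layer is chosen here as tanh, an assumption on our part, since an all-zero ReLU layer would receive zero gradients and never update.

```python
# A minimal sketch of the Null Layer idea as described: the first
# convolutional layer starts with all-zero weights, the remaining layers
# keep a conventional initialization; architecture is illustrative.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    # The "Null Layer": first conv starts at exactly zero. tanh is used
    # here (our assumption) so gradients still flow through zero
    # pre-activations; relu's gradient at 0 would freeze an all-zero layer.
    layers.Conv2D(32, 3, activation="tanh",
                  kernel_initializer="zeros", bias_initializer="zeros"),
    layers.Conv2D(64, 3, activation="relu",
                  kernel_initializer="he_normal"),  # ordinary init elsewhere
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```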
APA, Harvard, Vancouver, ISO, and other styles
9

Shpinareva, Irina M., Anastasia A. Yakushina, Lyudmila A. Voloshchuk, and Nikolay D. Rudnichenko. "Detection and classification of network attacks using the deep neural network cascade." Herald of Advanced Information Technology 4, no. 3 (October 15, 2021): 244–54. http://dx.doi.org/10.15276/hait.03.2021.4.

Full text
Abstract:
This article shows the relevance of developing a cascade of deep neural networks for detecting and classifying network attacks based on an analysis of the practical use of network intrusion detection systems to protect local computer networks. A cascade of deep neural networks consists of two elements. The first network is a hybrid deep neural network that contains convolutional neural network layers and long short-term memory layers to detect attacks. The second network is a CNN convolutional neural network for classifying the most popular classes of network attacks such as Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms. At the stage of tuning and training the cascade of deep neural networks, the selection of hyperparameters was carried out, which made it possible to improve the quality of the model. Among the available public datasets, one of the current UNSW-NB15 datasets was selected, taking into account modern traffic. For the data set under consideration, a data preprocessing technology has been developed. The cascade of deep neural networks was trained, tested, and validated on the UNSW-NB15 dataset. The cascade of deep neural networks was tested on real network traffic, which showed its ability to detect and classify attacks in a computer network. The use of a cascade of deep neural networks, consisting of a hybrid neural network CNN + LSTM and a neural network CNN, has improved the accuracy of detecting and classifying attacks in computer networks and reduced the frequency of false alarms in detecting network attacks.
APA, Harvard, Vancouver, ISO, and other styles
10

Chen, Jingfeng. "Spam mail classification using back propagation neural networks." Applied and Computational Engineering 5, no. 1 (June 14, 2023): 438–49. http://dx.doi.org/10.54254/2755-2721/5/20230617.

Full text
Abstract:
Mail classification methods based on machine learning have been introduced to combat spam. However, little research has focused on the most powerful machine learning model, namely neural networks. In this paper, the author trains BP neural networks to detect spam. The inputs of the neural networks are only information about words, punctuation, signs, numbers, and illegal words. Five neural networks which differ in the number of neurons and number of layers are experimented on. All networks apply Rectified Linear Unit (ReLU) functions and momentum learning. The results show that the network with four hidden layers achieves the best classification accuracy of 97.0%. In networks with two hidden layers, when the number of neurons in each layer is above 300, the accuracy is between 95.5% and 96.0%, while 100 neurons in each layer result in an accuracy of 93.8%. Although the training only captures information about words, punctuation, and signs, the networks have achieved high accuracy, and the author suggests that making the computer understand sentences, along with other kinds of improvements, could lead to even higher performance.
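A hedged reconstruction of the general setup follows: a back-propagation MLP with ReLU hidden layers and momentum-based SGD over hand-crafted text features. The feature count and layer width are assumptions, not values from the paper.

```python
# A sketch of a BP spam classifier with ReLU activations and momentum
# learning; dimensions are assumed, not taken from the paper.
import tensorflow as tf
from tensorflow.keras import layers

n_features = 57   # e.g., word/punctuation/sign frequency features (assumed)

model = tf.keras.Sequential([layers.Input(shape=(n_features,))])
for _ in range(4):                       # four hidden layers
    model.add(layers.Dense(300, activation="relu"))
model.add(layers.Dense(1, activation="sigmoid"))  # spam vs. ham

model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)
```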
APA, Harvard, Vancouver, ISO, and other styles
11

Huang, Hong-Hua, Jian-Fei Luo, Feng Gan, and Philip K. Hopke. "Two Revised Deep Neural Networks and Their Applications in Quantitative Analysis Based on Near-Infrared Spectroscopy." Applied Sciences 13, no. 14 (July 23, 2023): 8494. http://dx.doi.org/10.3390/app13148494.

Full text
Abstract:
Small data sets make developing calibration models using deep neural networks difficult because it is easy to overfit the system. We developed two deep neural network architectures by revising two existing network architectures: the U-Net and the attention mechanism. The major change was to use 1D convolutional layers to replace the fully connected layers. We also designed and combined average pooling and maximum pooling in our revised networks. We applied these revised network architectures to three publicly available data sets, and the resulting calibration models generate acceptable results for general quantitative analysis. They also generated rather good results for data sets that concern calibration transfer. This demonstrates that constructing network architectures by properly revising existing successful network architectures may provide additional choices in the exploration of the application of deep neural networks in analytical chemistry.
APA, Harvard, Vancouver, ISO, and other styles
12

Khodnevych, Yaroslav V., and Dmytro V. Stefanyshyn. "Do we need a more sophisticated multilayer artificial neural network to compute roughness coefficient?" Environmental safety and natural resources 48, no. 4 (December 26, 2023): 170–82. http://dx.doi.org/10.32347/2411-4049.2023.4.170-182.

Full text
Abstract:
Artificial neural networks (ANNs) are one of the most rapidly growing fields of soft computing. Along with deep learning, they are currently the most widely used machine learning techniques. Artificial neural networks are especially suitable for problem-solving where a researcher deals with incomplete data sets and has no algorithms or specific sets of rules to follow. This article deals with a comparison of several modifications of neural networks that may be applied to compute Chézy's roughness coefficient. Neural network modelling often starts with one hidden layer; even with one hidden layer, a neural network is a powerful computing system that can give good results. If necessary, the number of hidden layers may be increased; usually, two or three hidden layers of neurons are used. Diverse activation functions may also be applied. The article aims to explore whether it is necessary to develop sophisticated multilayer artificial neural networks to compute Chézy's roughness coefficient. Under the study, the following modifications of the neural network computing Chézy's roughness coefficient were considered and analysed: (1) application of two hidden layers of neurons; (2) application of three hidden layers of neurons; (3) use of a dropout algorithm for training neural networks by randomly dropping units during training to prevent their co-adaptation; (4) apart from the sigmoid (logistic) activation function, the use of other artificial neuron transfer functions, namely the hyperbolic tangent (tanh) and the rectifying activation function (ReLU). The training and testing of the considered neural network options were carried out using actual hydro-morphological and hydrological data related to the channel section on the Dnieper River (downstream of Kyiv), the Desna River section near Chernihiv, and the Pripyat River section near the town of Turiv. The Python object-oriented programming environment was applied to build and train the neural networks. The test results confirm the acceptability and sufficiency of computing the Chézy roughness coefficient using a feed-forward ANN with one hidden layer and a sigmoid (logistic) activation function. Forming a high-quality set of training data, arranging the data, and choosing a relevant computing model based on empirical knowledge are concluded to be more pressing issues than creating ever more sophisticated neural networks.
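The configuration the study ends up recommending, a feed-forward network with one hidden layer and sigmoid activation, is tiny; the sketch below shows it together with the dropout variant the authors also tested. The input dimension and hidden width are assumptions.

```python
# A minimal sketch of a one-hidden-layer sigmoid network for the Chezy
# roughness coefficient; sizes are assumed, not the study's exact values.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(5,)),            # hydro-morphological inputs (assumed)
    layers.Dense(10, activation="sigmoid"),
    # layers.Dropout(0.2),               # optional: the dropout modification
    layers.Dense(1),                     # Chezy roughness coefficient
])
model.compile(optimizer="adam", loss="mse")
```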
APA, Harvard, Vancouver, ISO, and other styles
13

Mezher, Liqaa Saadi. "Design and implementation hamming neural network with VHDL." Indonesian Journal of Electrical Engineering and Computer Science 19, no. 3 (September 1, 2020): 1469. http://dx.doi.org/10.11591/ijeecs.v19.i3.pp1469-1479.

Full text
Abstract:
A Hamming neural network is a type of artificial neural network consisting of two kinds of layers: feed-forward layers and a recurrent layer. In this paper, two binary input patterns were used. The first layer used two neurons and a pure linear (purelin) transfer function; the second layer used three neurons and a positive linear (poslin) transfer function. The Hamming neural network algorithm was also applied in three simulation methods: a logic gate method, a software program coding method, and an instant block diagram method. In this work, the VHDL software environment and FPGA hardware were used.
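For readers unfamiliar with the architecture, a compact NumPy sketch of a classical Hamming network follows; it mirrors the two-layer structure described (a feed-forward matching layer, then a recurrent winner-take-all layer) but is not the paper's VHDL/FPGA implementation, and the patterns are illustrative.

```python
# A classical Hamming network: the feed-forward layer scores the input
# against stored prototypes, and a recurrent MAXNET layer suppresses all
# but the best match.
import numpy as np

# Two stored bipolar prototype patterns of length 4 (illustrative).
P = np.array([[ 1, -1,  1, -1],
              [ 1,  1, -1, -1]], dtype=float)
n = P.shape[1]

def hamming_net(x, eps=0.1, steps=100):
    # Feed-forward layer: a_i = number of bits where x matches prototype i.
    a = (P @ x + n) / 2.0
    # Recurrent MAXNET layer: lateral inhibition until one winner remains.
    for _ in range(steps):
        a = np.maximum(0.0, a - eps * (a.sum() - a))
        if (a > 0).sum() <= 1:
            break
    return int(np.argmax(a))

x = np.array([1, -1, 1, 1], dtype=float)   # noisy version of prototype 0
print("closest prototype:", hamming_net(x))
```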
APA, Harvard, Vancouver, ISO, and other styles
14

Hayati, Mohsen, and Kaveh Darabi. "Modeling and Simulation of Turbogenerator Using Computational Intelligence." Applied Mechanics and Materials 110-116 (October 2011): 5211–15. http://dx.doi.org/10.4028/www.scientific.net/amm.110-116.5211.

Full text
Abstract:
In this paper, the modeling and simulation of turbogenerators using artificial neural networks is presented. The training and testing of the neural network were done with MATLAB 6.5.1 software in order to find the optimum values of weights and biases. To find the optimal neural structure, several structures with two layers and with three layers, with different numbers of neurons in each layer, were trained. Moreover, the neural network qualified with the least error is presented along with its related charts.
APA, Harvard, Vancouver, ISO, and other styles
15

Yang, Linrang. "Predicting consumer acceptance of automobiles based on deep learning and traditional machine learning algorithms." Applied and Computational Engineering 27, no. 1 (December 11, 2023): 30–37. http://dx.doi.org/10.54254/2755-2721/27/20230119.

Full text
Abstract:
Researchers have made significant progress in machine learning in recent years. Machine learning can learn from and predict large and complex data sets. Machine learning algorithms are commonly divided into two categories: deep learning and traditional machine learning, and every problem can be approached in both ways. This paper uses the "Car Data" dataset to investigate deep learning and traditional machine learning. In order to find a machine learning algorithm that is more conducive to analyzing and predicting consumers' acceptance of different cars, this paper mainly explores the differences in prediction accuracy among three methods: neural networks, Random Forest, and Support Vector Machines (SVM). We construct neural networks with three and with four hidden layers. Testing shows that the Random Forest predictions are the worst, and the prediction accuracy of the 3-hidden-layer neural network is similar to that of the SVM. When we added an extra hidden layer to the 3-hidden-layer network, the prediction accuracy exceeded that of the SVM. Adding a hidden layer can improve the prediction accuracy, and both SVM and neural networks can be used to analyze the Car Data, but not all methods have similar predictive accuracy.
APA, Harvard, Vancouver, ISO, and other styles
16

Firsov, Nikita, Evgeny Myasnikov, Valeriy Lobanov, Roman Khabibullin, Nikolay Kazanskiy, Svetlana Khonina, Muhammad A. Butt, and Artem Nikonorov. "HyperKAN: Kolmogorov–Arnold Networks Make Hyperspectral Image Classifiers Smarter." Sensors 24, no. 23 (November 30, 2024): 7683. https://doi.org/10.3390/s24237683.

Full text
Abstract:
In traditional neural network designs, a multilayer perceptron (MLP) is typically employed as a classification block following the feature extraction stage. However, the Kolmogorov–Arnold Network (KAN) presents a promising alternative to MLP, offering the potential to enhance prediction accuracy. In this paper, we studied KAN-based networks for pixel-wise classification of hyperspectral images. Initially, we compared baseline MLP and KAN networks with varying numbers of neurons in their hidden layers. Subsequently, we replaced the linear, convolutional, and attention layers of traditional neural networks with their KAN-based counterparts. Specifically, six cutting-edge neural networks were modified, including 1D (1DCNN), 2D (2DCNN), and 3D convolutional networks (two different 3DCNNs, NM3DCNN), as well as transformer (SSFTT). Experiments conducted using seven publicly available hyperspectral datasets demonstrated a substantial improvement in classification accuracy across all the networks. The best classification quality was achieved using a KAN-based transformer architecture.
APA, Harvard, Vancouver, ISO, and other styles
17

OH, SUNG-KWUN, DONG-WON KIM, and WITOLD PEDRYCZ. "HYBRID FUZZY POLYNOMIAL NEURAL NETWORKS." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10, no. 03 (June 2002): 257–80. http://dx.doi.org/10.1142/s0218488502001478.

Full text
Abstract:
We propose a hybrid architecture based on a combination of fuzzy systems and polynomial neural networks. The resulting Hybrid Fuzzy Polynomial Neural Networks (HFPNN) dwell on the ideas of fuzzy rule-based computing and polynomial neural networks. The structure of the network comprises fuzzy polynomial neurons (FPNs) forming the nodes of the first (input) layer of the HFPNN and polynomial neurons (PNs) located in the consecutive layers of the network. In the FPN (which forms a fuzzy inference system), the generic rules assume the form "if A then y = P(x)", where A is a fuzzy relation in the condition space while P(x) is a polynomial standing in the conclusion part of the rule. The conclusion part of the rules, namely the regression polynomial, uses several types of polynomials such as constant, linear, quadratic, and modified quadratic. As the premise part of the rules, both triangular and Gaussian-like membership functions are considered. Each PN of the network realizes a polynomial type of partial description (PD) of the mapping between input and output variables. HFPNN is a flexible neural architecture whose structure is based on the Group Method of Data Handling (GMDH) and developed through learning. In particular, the number of layers of the PNN is not fixed in advance but is generated in a dynamic way. The experimental part of the study involves two representative numerical examples: chaotic time series and Box-Jenkins gas furnace data.
APA, Harvard, Vancouver, ISO, and other styles
18

Yildirim, Sahin, Asli Durmusoglu, Caglar Sevim, Mehmet Safa Bingol, and Menderes Kalkat. "Design of neural predictors for predicting and analysing COVID-19 cases in different regions." Neural Network World 32, no. 5 (2022): 233–51. http://dx.doi.org/10.14311/nnw.2022.32.014.

Full text
Abstract:
Nowadays, unexpected viruses are causing people much trouble. The COVID-19 virus spread through the world very rapidly, yet predicting cases and deaths is not easy. Artificial neural networks are employed in many areas for predicting system parameters in simulation or real-time approaches. This paper presents the design of neural predictors for analysing COVID-19 cases in three countries, selected from different regions so that future effects could be predicted for each. Three types of neural network predictors were employed to analyse the COVID-19 cases. The NAR-NN is one of the proposed neural networks; it has three layers, consisting of input layer neurons, hidden layer neurons, and an output layer with fifteen neurons, each neuron using the tan-sigmoid activation function. The other proposed network, ANFIS, consists of five layers with two inputs and one output, while ARIMA uses four iterative steps to predict. The proposed network types were selected from among many other types; they are feed-forward rather than recurrent neural networks, and their learning time is shorter and faster than that of other network types. Finally, the three types of neural predictors were used to predict the cases. The R2 and MSE results confirmed that the three types of neural networks perform well in predicting and analysing the cases in the three countries.
APA, Harvard, Vancouver, ISO, and other styles
19

Morozov, A. Yu, D. L. Reviznikov, and K. K. Abgaryan. "Issues of implementing neural network algorithms on memristor crossbars." Izvestiya Vysshikh Uchebnykh Zavedenii. Materialy Elektronnoi Tekhniki = Materials of Electronics Engineering 22, no. 4 (February 4, 2020): 272–78. http://dx.doi.org/10.17073/1609-3577-2019-4-272-278.

Full text
Abstract:
The property of natural parallelization of matrix-vector operations inherent in memristor crossbars creates opportunities for their effective use in neural network computing. Analog calculations are orders of magnitude faster in comparison to calculations on the central processor and on graphics accelerators. Moreover, the energy costs of mathematical operations are significantly lower. The essential feature of analog computing is its low accuracy. In this regard, studying the dependence of neural network quality on the accuracy of setting its weights is relevant. The paper considers two convolutional neural networks trained on the MNIST (handwritten digits) and CIFAR_10 (airplanes, boats, cars, etc.) data sets. The first convolutional neural network consists of two convolutional layers, one subsampling layer, and two fully connected layers. The second one consists of four convolutional layers, two subsampling layers, and two fully connected layers. Calculations in convolutional and fully connected layers are performed through matrix-vector operations that are implemented on memristor crossbars. Subsampling layers imply the operation of finding the maximum of several values; this operation can be implemented at the analog level. The process of training a neural network runs separately from data analysis. As a rule, gradient optimization methods are used at the training stage, and it is advisable to perform these calculations on a CPU. When setting the weights, 3-4 precision bits are required to obtain an acceptable recognition quality in the case where the network is trained on MNIST; 6-10 precision bits are required if the network is trained on CIFAR_10.
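The precision question studied here can be imitated in software by uniformly quantizing a weight matrix to k bits before the matrix-vector product, roughly as a crossbar write would. The sketch below uses a random stand-in layer rather than the paper's networks:

```python
# A hedged sketch: quantize a trained weight matrix to k bits and
# compare the layer's output against the full-precision reference.
import numpy as np

def quantize(w, bits):
    """Uniform quantization of w to 2**bits levels over its own range."""
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    step = (hi - lo) / levels
    return lo + np.round((w - lo) / step) * step

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(128, 256))   # a trained layer, stand-in
x = rng.normal(size=256)

ref = np.tanh(W @ x)
for bits in (2, 3, 4, 6, 8):
    err = np.abs(np.tanh(quantize(W, bits) @ x) - ref).mean()
    print(f"{bits} bits: mean output error {err:.5f}")
```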
APA, Harvard, Vancouver, ISO, and other styles
20

Hao, Yaobin, and Fangying Song. "Fourier Neural Operator Networks for Solving Reaction–Diffusion Equations." Fluids 9, no. 11 (November 6, 2024): 258. http://dx.doi.org/10.3390/fluids9110258.

Full text
Abstract:
In this paper, we used Fourier Neural Operator (FNO) networks to solve reaction–diffusion equations. The FNO is a novel framework designed to solve partial differential equations by learning mappings between infinite-dimensional functional spaces. We applied the FNO to the Surface Quasi-Geostrophic (SQG) equation, and we tested the model with two significantly different initial conditions: Vortex Initial Conditions and Sinusoidal Initial Conditions. Furthermore, we explored the generalization ability of the model by evaluating its performance when trained on Vortex Initial Conditions and applied to Sinusoidal Initial Conditions. Additionally, we investigated the modes (frequency parameters) used during training, analyzing their impact on the experimental results, and we determined the most suitable modes for this study. Next, we conducted experiments on the number of convolutional layers. The results showed that the performance of the models did not differ significantly when using two, three, or four layers, with the performance of two or three layers even slightly surpassing that of four layers. However, as the number of layers increased to five, the performance improved significantly. Beyond 10 layers, overfitting became evident. Based on these observations, we selected the optimal number of layers to ensure the best model performance. Given the autoregressive nature of the FNO, we also applied it to solve the Gray–Scott (GS) model, analyzing the impact of different input time steps on the performance of the model during recursive solving. The results indicated that the FNO requires sufficient information to capture the long-term evolution of the equations. However, compared to traditional methods, the FNO offers a significant advantage by requiring almost no additional computation time when predicting with new initial conditions.
APA, Harvard, Vancouver, ISO, and other styles
21

Moon, Jihoon, Sungwoo Park, Seungmin Rho, and Eenjun Hwang. "A comparative analysis of artificial neural network architectures for building energy consumption forecasting." International Journal of Distributed Sensor Networks 15, no. 9 (September 2019): 155014771987761. http://dx.doi.org/10.1177/1550147719877616.

Full text
Abstract:
Smart grids have recently attracted increasing attention because of their reliability, flexibility, sustainability, and efficiency. A typical smart grid consists of diverse components such as smart meters, energy management systems, energy storage systems, and renewable energy resources. In particular, to make an effective energy management strategy for the energy management system, accurate load forecasting is necessary. Recently, artificial neural network–based load forecasting models with good performance have been proposed. For accurate load forecasting, it is critical to determine effective hyperparameters of neural networks, which is a complex and time-consuming task. Among these parameters, the type of activation function and the number of hidden layers are critical to the performance of neural networks. In this study, we construct diverse artificial neural network–based building electric energy consumption forecasting models using different combinations of the two hyperparameters and compare their performance. Experimental results indicate that neural networks with scaled exponential linear units and five hidden layers exhibit, on average, better performance than the other forecasting models.
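The comparison protocol amounts to a grid over the two hyperparameters. A minimal sketch, with the feature dimension, layer width, and training calls left as assumptions:

```python
# A hedged sketch of the comparison: build forecasting MLPs for every
# combination of activation function and hidden-layer count, then score
# each on held-out data (fitting is only indicated, not run here).
import tensorflow as tf
from tensorflow.keras import layers

activations = ["relu", "selu", "tanh", "sigmoid", "elu"]
depths = [1, 2, 3, 4, 5]

def build(act, depth, n_features=20, width=64):
    net = tf.keras.Sequential([layers.Input(shape=(n_features,))])
    for _ in range(depth):
        net.add(layers.Dense(width, activation=act))
    net.add(layers.Dense(1))               # next-period energy consumption
    net.compile(optimizer="adam", loss="mae")
    return net

models = {(a, d): build(a, d) for a in activations for d in depths}
# Each model would then be fit on training data and compared on a
# validation split, e.g. models[("selu", 5)].fit(X_train, y_train, ...)
```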
APA, Harvard, Vancouver, ISO, and other styles
22

Jayaprakash, T., V. Jyoshita, E. Mallesh, Malleswari Neelam, T. Manikanta, and Sankaran Ramesh Kumar. "Face Mask Detection Using Convolutional Neural Networks." International Journal for Research in Applied Science and Engineering Technology 12, no. 5 (May 31, 2024): 3541–46. http://dx.doi.org/10.22214/ijraset.2024.61608.

Full text
Abstract:
Face mask detection has become a critical task in the current scenario to ensure public safety and prevent the spread of infectious diseases. In this project, we propose a deep learning approach using convolutional neural networks (CNNs) to detect whether a person is wearing a face mask or not. The dataset used for training and evaluation consists of images of individuals with and without face masks. The images were collected from various sources and carefully labelled to ensure accuracy, and each image is preprocessed to enhance the features and normalize the pixel values. The proposed CNN architecture consists of multiple convolutional layers, followed by pooling and fully connected layers. The convolutional layers extract relevant features from the input images, while the pooling layers reduce the spatial dimensions of the feature maps. The fully connected layers classify the images into two categories: with face mask and without face mask. To train the CNN model, we use a combination of supervised learning techniques and data augmentation, and we employ various optimization algorithms to minimize the classification loss and improve the accuracy of the model. In conclusion, this project presents a deep learning-based approach for face mask detection using convolutional neural networks. The proposed model demonstrates promising results in accurately identifying individuals wearing face masks, and the project contributes to the development of intelligent systems for public health and safety.
APA, Harvard, Vancouver, ISO, and other styles
23

Litavrin, Andrey V., and Tatyana V. Moiseenkova. "About One Groupoid Associated with the Composition of Multilayer Feedforward Neural Networks." Zhurnal Srednevolzhskogo Matematicheskogo Obshchestva 26, no. 2 (June 30, 2024): 111–22. http://dx.doi.org/10.15507/2079-6900.26.202402.111-122.

Full text
Abstract:
The authors construct a groupoid whose elements are associated with multilayer feedforward neural networks. This groupoid is called the complete groupoid of the composition of neural networks. Multilayer feedforward neural networks (hereinafter referred to as neural networks) are modelled by defining a special type of tuple. Its components define layers of neurons and structural mappings that specify the weights of synaptic connections, the activation functions, and the threshold values. Using the artificial neuron model of McCulloch and Pitts, for each such tuple it is possible to define a mapping that models the operation of a neural network as a computational circuit. This approach differs from defining a neural network using abstract automata and related constructions. Modelling neural networks in the proposed way makes it possible to describe the architecture of the network (that is, the network graph, the synaptic weights, etc.). The operation in the complete neural network composition groupoid models the composition of two neural networks: a network obtained as the product of a pair of neural networks operates on input signals by sequentially applying the original networks and contains information about their structure. It is proved that the constructed groupoid is free.
APA, Harvard, Vancouver, ISO, and other styles
24

Strijhak, Sergei, Daniil Ryazanov, Konstantin Koshelev, and Aleksandr Ivanov. "Neural Network Prediction for Ice Shapes on Airfoils Using iceFoam Simulations." Aerospace 9, no. 2 (February 12, 2022): 96. http://dx.doi.org/10.3390/aerospace9020096.

Full text
Abstract:
In this article, the procedure and method for predicting ice accretion on different airfoils using artificial neural networks (ANNs) are discussed. The dataset for the neural network is based on numerical experiment results, obtained with the iceFoam solver, for four airfoils (NACA0012, General Aviation, Business Jet, and Commercial Transport). Input data for the neural networks include airfoil and ice geometries, transformed into a set of parameters using a parabolic coordinate system and a Fourier series expansion. In addition, the input features include the physical parameters of the flow (velocity, temperature, droplet diameter, liquid water content, time of ice accretion) and the angle of attack. The novelty of this work is that the neural network dataset includes various airfoils and that the data augmentation technique combines all time slices. Several artificial neural networks (ANNs), fully connected networks (FCNNs), and convolutional networks (CNNs) were trained to predict airfoil ice shapes, and two different loss functions were considered. In order to improve the performance of the models, batch normalization and dropout layers were used. The most accurate ice shape predictions were obtained using the CNN and FCNN that applied batch normalization and dropout layers to the output neurons of each layer.
APA, Harvard, Vancouver, ISO, and other styles
25

Matondo-Mvula, Nadine, and Khaled Elleithy. "Breast Cancer Detection with Quanvolutional Neural Networks." Entropy 26, no. 8 (July 26, 2024): 630. http://dx.doi.org/10.3390/e26080630.

Full text
Abstract:
Quantum machine learning holds the potential to revolutionize cancer treatment and diagnostic imaging by uncovering complex patterns beyond the reach of classical methods. This study explores the effectiveness of quantum convolutional layers in classifying ultrasound breast images for cancer detection. By encoding classical data into quantum states through angle embedding and employing a robustly entangled 9-qubit circuit design with an SU(4) gate, we developed a Quantum Convolutional Neural Network (QCNN) and compared it to a classical CNN of similar architecture. Our QCNN model, leveraging two quantum circuits as convolutional layers, achieved an impressive peak training accuracy of 76.66% and a validation accuracy of 87.17% at a learning rate of 1 × 10−2. In contrast, the classical CNN model attained a training accuracy of 77.52% and a validation accuracy of 83.33%. These compelling results highlight the potential of quantum circuits to serve as effective convolutional layers for feature extraction in image classification, especially with small datasets.
APA, Harvard, Vancouver, ISO, and other styles
26

Belorutsky, R. Yu, and S. V. Zhitnik. "SPEECH RECOGNITION BASED ON CONVOLUTION NEURAL NETWORKS." Issues of radio electronics, no. 4 (May 10, 2019): 47–52. http://dx.doi.org/10.21778/2218-5453-2019-4-47-52.

Full text
Abstract:
The problem of recognizing human speech, in the form of the digits from one to ten recorded with a dictaphone, is considered. The method used is recognition of the sound signal's spectrogram by means of convolutional neural networks. Algorithms for preliminary processing of the input data, network training, and word recognition are implemented. The recognition accuracy for different numbers of convolution layers is estimated; their number is determined, and a structure for the neural network is proposed. A comparison of recognition accuracy is carried out for input data consisting of either a spectrogram or the first two formants. The recognition algorithm is tested with male and female voices and different durations of pronunciation.
APA, Harvard, Vancouver, ISO, and other styles
27

Tanabe, Kazutoshi, Tadao Tamura, and Hiroyuki Uesaka. "Neural Network System for the Identification of Infrared Spectra." Applied Spectroscopy 46, no. 5 (May 1992): 807–10. http://dx.doi.org/10.1366/0003702924124619.

Full text
Abstract:
A neural network system has been developed on a personal computer to identify 1129 infrared spectra. The system is composed of two steps of networks. The first step classifies 1129 spectra into 40 categories, and each unit of the output layer is connected to one of the 40 networks in the second step, which identify each spectrum. Each network is composed of three layers. The input, intermediate, and output layers are composed of 250, 40, and 40 units, respectively. Intensity data at 250 wavenumber points between 1800 and 550 cm−1 of the infrared spectra are entered into the input layer of each network. The training of the networks was carried out with the spectral data of 1129 compounds stored in the SDBS system, and thus the networks were successfully constructed. On the basis of the results, the system has been developed by preparing pre- and post-processing programs. The system can identify each unknown spectrum within 0.1 s, and is quite efficient for identifying infrared spectra on a personal computer.
APA, Harvard, Vancouver, ISO, and other styles
28

Geva, Shlomo, and Joaquin Sitte. "An Exponential Response Neural Net." Neural Computation 3, no. 4 (December 1991): 623–32. http://dx.doi.org/10.1162/neco.1991.3.4.623.

Full text
Abstract:
By using artificial neurons with exponential transfer functions one can design perfect autoassociative and heteroassociative memory networks, with virtually unlimited storage capacity, for real or binary valued input and output. The autoassociative network has two layers: input and memory, with feedback between the two. The exponential response neurons are in the memory layer. By adding an encoding layer of conventional neurons the network becomes a heteroassociator and classifier. Because for real valued input vectors the dot-product with the weight vector is no longer a measure for similarity, we also consider a euclidean distance based neuron excitation and present Lyapunov functions for both cases. The network has energy minima corresponding only to stored prototype vectors. The exponential neurons make it simpler to build fast adaptive learning directly into classification networks that map real valued input to any class structure at its output.
APA, Harvard, Vancouver, ISO, and other styles
29

Trejo-Alonso, Josué, Carlos Fuentes, Carlos Chávez, Antonio Quevedo, Alfonso Gutierrez-Lopez, and Brandon González-Correa. "Saturated Hydraulic Conductivity Estimation Using Artificial Neural Networks." Water 13, no. 5 (March 5, 2021): 705. http://dx.doi.org/10.3390/w13050705.

Full text
Abstract:
In the present work, we construct several artificial neural networks (varying the input data) to calculate the saturated hydraulic conductivity (KS) using a database with 900 measured samples obtained from Irrigation District 023, in San Juan del Rio, Queretaro, Mexico. All of them were constructed using two hidden layers, a back-propagation algorithm for the learning process, and a logistic function as the nonlinear transfer function. In order to explore different arrangements of neurons in the hidden layers, we performed the bootstrap technique for each neural network and selected the one with the lowest Root Mean Square Error (RMSE). We also compared these results with pedotransfer functions and other neural networks from the literature. Our artificial neural networks obtained RMSE values from 0.0459 to 0.0413 and R2 values from 0.9725 to 0.9780, which are in good agreement with other works. We also found that reducing the amount of input data gave us better results.
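The selection procedure described can be sketched with scikit-learn; the candidate architectures and the synthetic stand-in data below are assumptions, not the study's soil database:

```python
# A hedged sketch: for each candidate arrangement of neurons in the two
# hidden layers, bootstrap the training data, fit a logistic-activation
# back-propagation network, and keep the architecture with lowest RMSE.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.utils import resample

rng = np.random.default_rng(0)
X = rng.normal(size=(900, 6))                    # soil attributes, stand-in
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=900)  # stand-in for KS

best = (None, np.inf)
for hidden in [(5, 5), (10, 5), (10, 10), (20, 10)]:
    rmses = []
    for _ in range(10):                          # bootstrap replicates
        Xb, yb = resample(X, y)
        net = MLPRegressor(hidden_layer_sizes=hidden, activation="logistic",
                           solver="sgd", max_iter=500).fit(Xb, yb)
        rmses.append(np.sqrt(np.mean((net.predict(X) - y) ** 2)))
    if np.mean(rmses) < best[1]:
        best = (hidden, np.mean(rmses))
print("selected architecture:", best)
```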
APA, Harvard, Vancouver, ISO, and other styles
30

Xu, Zhengzheng, and Junhua Gu. "Research on traffic flow prediction method based on adaptive multi-channel graph convolutional neural networks." Advances in Engineering Innovation 7, no. 1 (April 25, 2024): 41–47. http://dx.doi.org/10.54254/2977-3903/7/2024066.

Full text
Abstract:
Predefined adjacency matrices inadequately represent the information in road networks and insufficiently capture the spatial dependencies of traffic networks; moreover, as the number of layers in a graph convolutional neural network increases, excessive smoothing or neglect of initial node information can degrade traffic prediction performance. To address these issues, this paper proposes a prediction model based on Adaptive Multi-channel Graph Convolutional Neural Networks (AMGCN). The model utilizes an adaptive adjacency matrix to automatically learn implicit graph structures from data and introduces a mixed skip-propagation graph convolutional network that retains the original node states and selectively acquires the outputs of the convolutional layers, thus avoiding the loss of the nodes' initial states and comprehensively capturing the spatial correlations of traffic flow. Finally, the output is fed into Long Short-Term Memory networks to capture temporal correlations. Comparative experiments on two real datasets validate the effectiveness of the proposed model.
APA, Harvard, Vancouver, ISO, and other styles
31

Jiao, Libin, Rongfang Bie, Hao Wu, Yu Wei, Jixin Ma, Anton Umek, and Anton Kos. "Golf swing classification with multiple deep convolutional neural networks." International Journal of Distributed Sensor Networks 14, no. 10 (October 2018): 155014771880218. http://dx.doi.org/10.1177/1550147718802186.

Full text
Abstract:
The use of smart sports equipment and body sensory systems supervising daily sports training is gradually emerging in professional and amateur sports; however, the problem of processing large amounts of data from sensors used in sport and discovering constructive knowledge is a novel topic and the focus of our research. In this article, we investigate golf swing data classification methods based on varieties of representative convolutional neural networks (deep convolutional neural networks) which are fed with swing data from embedded multi-sensors, to group the multi-channel golf swing data labeled by hybrid categories from different golf players and swing shapes. In particular, four convolutional neural classifiers are customized: “GolfVanillaCNN” with the convolutional layers, “GolfVGG” with the stacked convolutional layers, “GolfInception” with the multi-scale convolutional layers, and “GolfResNet” with the residual learning. Testing on the real-world swing dataset sampled from the system integrating two strain gage sensors, three-axis accelerometer, and three-axis gyroscope, we explore the accuracy and performance of our convolutional neural network–based classifiers from two perspectives: classification implementations and sensor combinations. Besides, we further evaluate the performance of these four classifiers in terms of classification accuracy, precision–recall curves, and F1 scores. These common classification indicators illustrate that our convolutional neural network–based classifiers can basically group the golf swing predefined by the combination of shapes and golf players correctly and outperform support vector machine method representing traditional classification methods.
APA, Harvard, Vancouver, ISO, and other styles
32

Díaz-Vico, David, Jesús Prada, Adil Omari, and José Dorronsoro. "Deep support vector neural networks." Integrated Computer-Aided Engineering 27, no. 4 (September 11, 2020): 389–402. http://dx.doi.org/10.3233/ica-200635.

Full text
Abstract:
Kernel-based Support Vector Machines (SVMs), one of the most popular machine learning models, usually achieve top performances in two-class classification and regression problems. However, their training cost is at least quadratic in sample size, making them unsuitable for large-sample problems. In contrast, Deep Neural Networks (DNNs), with a cost linear in sample size, are able to solve big data problems relatively easily. In this work we propose to combine the advanced representations that DNNs can achieve in their last hidden layers with the hinge and ϵ-insensitive losses that are used in two-class SVM classification and regression. We can thus have much better scalability while achieving performances comparable to those of SVMs. Moreover, we also show that the resulting Deep SVM models are competitive with standard DNNs in two-class classification problems and have an edge in regression ones.
APA, Harvard, Vancouver, ISO, and other styles
33

Wang, Jinfeng, and Xuegang Wang. "Two new methods for facial expression recognition using Convolutional Neural Networks." Journal of Physics: Conference Series 2031, no. 1 (September 1, 2021): 012023. http://dx.doi.org/10.1088/1742-6596/2031/1/012023.

Full text
Abstract:
In this research, we propose two novel methods for facial expression recognition to improve recognition accuracy. The first is to add Batch Normalization (BN) layers to the CNN model, and the second is to preprocess the images before training, for example by rotating and cropping them and adding Gaussian noise, which is especially beneficial for unbalanced classes. Our model consists of 3 CNN layers, 3 BN layers, three average-pooling layers, and three fully-connected layers, and it performs well on category prediction after adopting the two methods mentioned above. Our CNN model is trained and tested on the Kaggle facial expression recognition challenge database. The implemented system can automatically recognize seven expressions in real time: anger, disgust, fear, happiness, neutral, sadness, and surprise. The experimental results demonstrate the effectiveness of our proposed approach.
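Both proposed methods map directly onto standard layers. A minimal sketch, assuming 48x48 grayscale inputs and loosely following the stated 3-conv/3-BN/3-average-pooling/3-dense layout:

```python
# A hedged sketch of the two methods: BN after each convolution, and
# rotation/crop/Gaussian-noise augmentation applied in the input pipeline.
import tensorflow as tf
from tensorflow.keras import layers

augment = tf.keras.Sequential([
    layers.RandomRotation(0.1),            # rotate
    layers.RandomCrop(44, 44),             # crop
    layers.Resizing(48, 48),
    layers.GaussianNoise(0.05),            # add Gaussian noise
])

model = tf.keras.Sequential([layers.Input(shape=(48, 48, 1)), augment])
for filters in (32, 64, 128):              # 3 conv + 3 BN + 3 avg-pool
    model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
    model.add(layers.BatchNormalization())
    model.add(layers.AveragePooling2D())
model.add(layers.Flatten())
for units in (256, 128):                   # fully connected layers
    model.add(layers.Dense(units, activation="relu"))
model.add(layers.Dense(7, activation="softmax"))   # seven expressions
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```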
APA, Harvard, Vancouver, ISO, and other styles
34

Fathima, Sheeba. "Music Genre Classification using Deep Learning." International Journal for Research in Applied Science and Engineering Technology 9, no. VII (July 10, 2021): 66–71. http://dx.doi.org/10.22214/ijraset.2021.36087.

Full text
Abstract:
Many subjects are affected by digital music production, including music genre prediction. In this research, machine learning techniques were used to classify music genres. Deep neural networks (DNNs) have recently been demonstrated to be effective in a variety of classification tasks, including music genre classification. In this paper, we propose two methods for boosting music genre classification with convolutional neural networks: 1) combining peak- and average-pooling, in a process inspired by residual learning, to provide more statistical information to higher-level neural networks; and 2) using shortcut connections to bypass one or more layers. To perform classification, the KNN output is fed into another deep neural network. Our preliminary experimental results on the GTZAN data set show that the above two methods, especially the second one, can effectively improve classification accuracy when compared to two different network topologies.
APA, Harvard, Vancouver, ISO, and other styles
35

Zakić, Milorad, and Goran Kvaščev. "Procena mesta nastanka kvara na električnom vodu primenom veštačkih neuralnih mreža" [Fault location estimation on a power line using artificial neural networks]. Energija, ekonomija, ekologija XXIV, no. 4 (December 2022): 68–74. http://dx.doi.org/10.46793/eee22-4.68z.

Full text
Abstract:
This paper deals with the application of neural networks to fault location on extra-high voltage (EHV) transmission lines. A relatively simple power system, consisting of two 220 kV power grids connected by one transmission line, was modelled using MATLAB/Simulink software. By simulating different fault scenarios (fault types, locations, resistances, and inception angles), the proposed neural network fault locator was trained using various sets of terminal line data (line-to-line voltages and phase currents). Feedforward networks were employed along with the backpropagation algorithm. An analysis of neural networks with varying numbers of hidden layers and neurons per hidden layer was performed in order to validate the choice of the neural network at each step. All analyses were carried out using the Neural Network Toolbox.
APA, Harvard, Vancouver, ISO, and other styles
36

Wang, Lingfeng. "Forecast Model of TV Show Rating Based on Convolutional Neural Network." Complexity 2021 (February 24, 2021): 1–10. http://dx.doi.org/10.1155/2021/6694538.

Full text
Abstract:
The TV show rating analysis and prediction system can collect and transmit information more quickly and upload it to the database. The convolutional neural network is a multilayer neural network structure that simulates the operating mechanism of biological vision systems; it is composed of multiple convolutional layers and downsampling layers sequentially connected. It can obtain useful feature descriptions from raw data and is an effective method for extracting features from data. At present, convolutional neural networks have become a research hotspot in speech recognition, image recognition and classification, natural language processing, and other fields, and have been widely and successfully applied in these fields. Therefore, this paper introduces the convolutional neural network structure to predict TV program rating data. First, it briefly introduces artificial neural networks and deep learning methods, focusing on the algorithmic principles of convolutional neural networks and support vector machines. Then, we adapt the convolutional neural network to fit the TV program rating data and finally apply the two prediction models to TV program rating prediction. Our improved convolutional neural network rating prediction model combines the network's ability to extract effective features with good classification and prediction capabilities to improve prediction accuracy. Through simulation comparison, we verify the feasibility and effectiveness of the proposed TV program rating prediction model.
APA, Harvard, Vancouver, ISO, and other styles
37

Tzougas, George, and Konstantin Kutzkov. "Enhancing Logistic Regression Using Neural Networks for Classification in Actuarial Learning." Algorithms 16, no. 2 (February 9, 2023): 99. http://dx.doi.org/10.3390/a16020099.

Full text
Abstract:
We developed a methodology for the neural network boosting of logistic regression aimed at learning an additional model structure from the data. In particular, we constructed two classes of neural network-based models: shallow–dense neural networks with one hidden layer and deep neural networks with multiple hidden layers. Furthermore, several advanced approaches were explored, including the combined actuarial neural network approach, embeddings and transfer learning. The model training was achieved by minimizing either the deviance or the cross-entropy loss functions, leading to fourteen neural network-based models in total. For illustrative purposes, logistic regression and the alternative neural network-based models we propose are employed for a binary classification exercise concerning the occurrence of at least one claim in a French motor third-party insurance portfolio. Finally, the model interpretability issue was addressed via the local interpretable model-agnostic explanations approach.
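In the spirit of the combined actuarial neural network approach mentioned in the abstract, a logistic regression can be boosted by adding a neural-network term to its linear predictor. A minimal Keras sketch, where the feature count and layer sizes are assumptions:

import tensorflow as tf
from tensorflow.keras import layers

n_features = 8  # placeholder; the paper uses French motor insurance features
inputs = tf.keras.Input(shape=(n_features,))
glm = layers.Dense(1)(inputs)                      # logistic-regression part
h = layers.Dense(16, activation="relu")(inputs)    # network correction part
h = layers.Dense(16, activation="relu")(h)
nn = layers.Dense(1)(h)
logit = layers.Add()([glm, nn])                    # combined linear predictor
output = layers.Activation("sigmoid")(logit)
model = tf.keras.Model(inputs, output)
model.compile(optimizer="adam", loss="binary_crossentropy")  # cross-entropy loss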
APA, Harvard, Vancouver, ISO, and other styles
39

Bukhari, Syeda Sana, Waqar Ahmad, Khurram Khan Jadoon, and Shahab U. Ansari. "Artificial Neural Network-Based Color Contrast Recommendation System." MATEC Web of Conferences 398 (2024): 01029. http://dx.doi.org/10.1051/matecconf/202439801029.

Full text
Abstract:
Color contrast pertains to graphics and the field of design: visual objects are described best when represented with well-chosen contrast combinations. Contrast suggestion is usually based on color theory, which holds that two colors exactly opposite or adjacent in hue contrast well with each other. This paper presents a Color Contrast Recommendation System (CCRS), an innovative solution based on Artificial Neural Networks (ANN) whose main aim is to help users find a suitable contrast for any base color. We use a simple neural network model with two hidden layers for a regression task; the proposed model suggests three contrast colors for the base color given by the user. To train the network, we prepared a data set of 420 color combinations that look appealing together and enhance the visuals. The proposed color contrast recommendation application based on neural networks represents a significant advancement in leveraging AI technology to streamline the design process, improve accessibility, and enhance user experiences across digital platforms.
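A minimal sketch of such a two-hidden-layer regression network; the layer widths and the RGB encoding are our assumptions:

import tensorflow as tf
from tensorflow.keras import layers

# Base color in, three suggested contrast colors out (RGB scaled to [0, 1]).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    layers.Dense(64, activation="relu"),    # first hidden layer
    layers.Dense(64, activation="relu"),    # second hidden layer
    layers.Dense(9, activation="sigmoid"),  # 3 colors x 3 RGB channels
])
model.compile(optimizer="adam", loss="mse")
# Trained on 420 pairs of (base color, three appealing contrast colors).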
APA, Harvard, Vancouver, ISO, and other styles
40

SOHN, ANDREW, and JEAN-LUC GAUDIOT. "REPRESENTING AND PROCESSING PRODUCTION SYSTEMS IN CONNECTIONIST ARCHITECTURES." International Journal of Pattern Recognition and Artificial Intelligence 04, no. 02 (June 1990): 199–214. http://dx.doi.org/10.1142/s0218001490000149.

Full text
Abstract:
Much effort has been expended on developing special architectures dedicated to the efficient execution of problems in artificial intelligence (AI), especially production systems. While artificial neural networks (ANNs) offer the promise of solving various problems in pattern recognition and classification, we demonstrate here that the ANN approach can also be applied to the AI production system paradigm. Among the various types of neural networks, a three-layer ring-structured feedback network is considered in this paper to suit the problem domain under investigation. Characteristics of the production system paradigm are identified, and various aspects of using feedback neural networks to map production systems are discussed. Two representation techniques are studied: local and hierarchical. A hierarchical representation derives features from patterns in production systems and constructs a three-dimensional space, called the feature space, in which a pattern can be uniquely defined by a vector. To demonstrate the efficient use of the neural network approach, a mapping of a generic production system is detailed throughout the paper. The results of a deterministic simulation demonstrate that the three-layer ring-structured feedback neural network architecture can be an efficient processing mechanism for the AI production system paradigm.
APA, Harvard, Vancouver, ISO, and other styles
41

Yu, Haichao, Haoxiang Li, Gang Hua, Gao Huang, and Humphrey Shi. "Boosted Dynamic Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (June 26, 2023): 10989–97. http://dx.doi.org/10.1609/aaai.v37i9.26302.

Full text
Abstract:
Early-exiting dynamic neural networks (EDNN), one type of dynamic neural network, have been widely studied recently. A typical EDNN has multiple prediction heads at different layers of the network backbone. During inference, the model exits at either the last prediction head or the first intermediate prediction head whose prediction confidence exceeds a predefined threshold. To optimize the model, these prediction heads are trained together with the network backbone on every batch of training data. This brings a train-test mismatch: all prediction heads are optimized on all types of data during training, whereas the deeper heads only see difficult inputs at test time. Treating inputs differently in the two phases causes a mismatch between the training and testing data distributions. To mitigate this problem, we formulate an EDNN as an additive model inspired by gradient boosting and propose multiple training techniques to optimize the model effectively. We name our method BoostNet. Our experiments show it achieves state-of-the-art performance on the CIFAR100 and ImageNet datasets in both anytime and budgeted-batch prediction modes. Our code is released at https://github.com/SHI-Labs/Boosted-Dynamic-Networks.
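The early-exit inference rule itself is simple. A schematic Python sketch (this is not the BoostNet training procedure, which is the paper's actual contribution):

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_predict(heads, x, threshold=0.9):
    # `heads` is a list of callables; each maps the input to class logits
    # computed at successively deeper points of the backbone.
    for head in heads[:-1]:
        probs = softmax(head(x))
        if probs.max() >= threshold:  # confident enough: exit early
            return probs
    return softmax(heads[-1](x))      # otherwise fall through to the last head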
APA, Harvard, Vancouver, ISO, and other styles
42

Curteanu, Silvia. "Direct and inverse neural network modeling in free radical polymerization." Open Chemistry 2, no. 1 (March 1, 2004): 113–40. http://dx.doi.org/10.2478/bf02476187.

Full text
Abstract:
The first part of this paper reviews the most important aspects of the use of neural networks in polymerization reaction engineering. Then, direct and inverse neural network modeling of the batch, bulk free radical polymerization of methyl methacrylate is performed. To obtain monomer conversion, number- and weight-average molecular weights, and reaction mass viscosity, both separate neural networks and a single network with multiple outputs were built (direct neural network modeling). The inverse neural network modeling gives the reaction conditions (temperature and initial initiator concentration) that assure certain values of conversion and polymerization degree at the end of the reaction. Each network is a multi-layer perceptron with one or two hidden layers and a varying number of hidden neurons; the best topology is the one with the smallest error at the end of the training phase. The possibility of obtaining accurate results with a relatively simple network architecture is demonstrated. The two types of neural network modeling, direct and inverse, represent possible alternatives to classical modeling and optimization procedures, each producing accurate results with a simple methodology.
APA, Harvard, Vancouver, ISO, and other styles
43

Pecev, Predrag, and Milos Rackovic. "LTR-MDTS structure - a structure for multiple dependent time series prediction." Computer Science and Information Systems 14, no. 2 (2017): 467–90. http://dx.doi.org/10.2298/csis150815004p.

Full text
Abstract:
The subject of the research presented in this paper is to model a neural network structure and an appropriate training algorithm best suited for the prediction / deduction of multiple dependent time series. The basic idea is to use neural networks to predict the synchronized movement of basketball referees during a basketball action. Representing the time series from this problem with traditional Multilayered Perceptron (MLP) neural networks leads to a paradoxical backward time-lapse effect, in which certain input and hidden layer nodes influence output nodes that correspond to previous moments in time. This paper describes the research and analysis of different methods of overcoming this problem, and it is essentially split into two parts. The first part covers efforts to configure the training set for standard back-propagation MLP networks in order to decrease the backward time-lapse effect that certain input and hidden layer nodes have on output nodes. The second part focuses on the results obtained with a new neural network structure called LTR-MDTS. The design of LTR-MDTS builds on standard MLP networks, with certain left-to-right synapses removed to eliminate the aforementioned backward time-lapse effect on the output nodes.
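A toy numpy sketch of the left-to-right idea: masking out synapses that would let later time steps influence outputs corresponding to earlier moments. The dimensions and the block-triangular mask are illustrative assumptions, not the LTR-MDTS implementation:

import numpy as np

T, d = 5, 4                        # 5 time steps, 4 features per step
W = np.random.randn(T * d, T * d)  # dense layer-to-layer weight matrix
# Block lower-triangular mask: output step t may only be reached from
# input steps <= t, so no synapse runs from the future to the past.
mask = np.kron(np.tril(np.ones((T, T))), np.ones((d, d)))
W_ltr = W * mask                   # keep only "left-to-right" connections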
APA, Harvard, Vancouver, ISO, and other styles
44

Ito, Yoshifusa. "Approximation Capability of Layered Neural Networks with Sigmoid Units on Two Layers." Neural Computation 6, no. 6 (November 1994): 1233–43. http://dx.doi.org/10.1162/neco.1994.6.6.1233.

Full text
Abstract:
Using only an elementary constructive method, we prove the universal approximation capability of three-layered feedforward neural networks that have sigmoid units on two layers. We regard the Heaviside function as a special case of the sigmoid function and measure the accuracy of approximation in either the supremum norm or the Lp-norm. Given a continuous function defined on a unit hypercube and the required accuracy of approximation, we can estimate the numbers of necessary units on the respective sigmoid unit layers. In the case where the sigmoid function is the Heaviside function, our result improves the estimate of Kůrková (1992). If the accuracy of approximation is measured in the Lp-norm, our estimate also improves that of Kůrková (1992), even when the sigmoid function is not the Heaviside function.
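In common notation (the symbols here are ours, not necessarily the paper's), such a three-layered network with sigmoid units on two layers computes

\[
  f(x) \;=\; \sum_{i=1}^{m} c_i\,\sigma\!\Big(\sum_{j=1}^{n} w_{ij}\,\sigma(a_j \cdot x + b_j) + d_i\Big),
  \qquad x \in [0,1]^d,
\]

where \(\sigma\) is a sigmoid (possibly the Heaviside) function, and the result bounds the widths \(m\) and \(n\) needed to reach a given approximation accuracy.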
APA, Harvard, Vancouver, ISO, and other styles
45

BORSCHBACH, M., W. M. LIPPE, and S. NIENDIEK. "A TOOL FOR ANALYZING MAGNETOENCEPHALOGRAPHY-DATA BASED ON DIFFERENT ARTIFICIAL NEURAL NETWORKS." International Journal of Software Engineering and Knowledge Engineering 13, no. 06 (December 2003): 609–26. http://dx.doi.org/10.1142/s0218194003001457.

Full text
Abstract:
The localization of intracerebral dipole sources in order to detect pathological events is one purpose of magnetoencephalography (MEG); another is the analysis of brain processes and brain structures. A system consisting of two different types of artificial neural networks is presented. A feedforward neural network with two layers and a learning rule designed for the task of Blind Signal Separation (BSS) is used to separate temporally overlapping neuron activities in the brain. Based on the separated signals, the task of the second type of neural network is to determine the position and strength of the different underlying magnetic dipoles. Several neural network concepts, and their limits and potential for both of these medical data mining tasks, are discussed briefly.
APA, Harvard, Vancouver, ISO, and other styles
46

Du, Lei, Haifeng Song, Yingying Xu, and Songsong Dai. "An Architecture as an Alternative to Gradient Boosted Decision Trees for Multiple Machine Learning Tasks." Electronics 13, no. 12 (June 12, 2024): 2291. http://dx.doi.org/10.3390/electronics13122291.

Full text
Abstract:
Deep-network-based models have achieved excellent performance in various applications by extracting discriminative feature representations with convolutional neural networks (CNN) or recurrent neural networks (RNN). However, CNNs and RNNs may not work when handling data without temporal/spatial structure, so a new feature-extraction technique is needed. Gradient Boosted Decision Trees (GBDT) can select the features with the largest information gain when building trees. In this paper, we propose an architecture based on an ensemble of decision trees and a neural network (NN) for multiple machine learning tasks, e.g., classification, regression, and ranking. It can be regarded as an extension of the widely used deep-network-based model in which GBDT replaces the CNN or RNN. The architecture consists of two main parts: (1) the decision forest layers, which focus on learning features from the input data, and (2) the fully connected layers, which focus on distilling knowledge from the decision forest layers. Powered by these two parts, the proposed model can handle data without temporal/spatial structure and can be efficiently trained by stochastic gradient descent via back-propagation. Empirical evaluations on different machine learning tasks demonstrate the effectiveness of the proposed method.
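The paper trains its decision-forest and fully connected layers jointly by back-propagation; as a simplified two-stage sketch of the same idea, GBDT leaf indices can serve as learned features for a small dense network (X_train and y_train are placeholder tabular data):

import tensorflow as tf
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder
from tensorflow.keras import layers

# Stage 1: the GBDT learns feature interactions from the tabular data.
gbdt = GradientBoostingClassifier(n_estimators=50)
gbdt.fit(X_train, y_train)
# apply() returns, for every sample, the leaf reached in each tree; the
# leaf indices act as learned features for the dense layers.
leaves = gbdt.apply(X_train)[:, :, 0]
encoder = OneHotEncoder(handle_unknown="ignore")
Z = encoder.fit_transform(leaves).toarray()

# Stage 2: fully connected layers distill knowledge from the forest.
head = tf.keras.Sequential([
    tf.keras.Input(shape=(Z.shape[1],)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
head.compile(optimizer="adam", loss="binary_crossentropy")
head.fit(Z, y_train, epochs=10)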
APA, Harvard, Vancouver, ISO, and other styles
47

Konarev, D. I., and A. A. Gulamov. "Synthesis of Neural Network Architecture for Recognition of Sea-Going Ship Images." Proceedings of the Southwest State University 24, no. 1 (June 23, 2020): 130–43. http://dx.doi.org/10.21869/2223-1560-2020-24-1-130-143.

Full text
Abstract:
Purpose of research. The current task is to monitor ships using video surveillance cameras installed along the canal, which is important for the information and communication support of navigation on the Moscow Canal. The main subtask is the direct recognition of ships in an image or video, for which implementing a neural network is a promising approach. Methods. Various neural network architectures are described; images of ships are the input data for the network. The learning sample uses the CIFAR-10 dataset. The network is built and trained using the Keras and TensorFlow machine learning libraries. Results. The implementation of convolutional artificial neural networks for image recognition problems is described, along with the advantages of this architecture when working with images. The choice of the Python language for the neural network implementation is justified, and the main machine learning libraries used, TensorFlow and Keras, are described. An experiment was conducted to train convolutional neural networks with different architectures using the Google Colaboratory service. The effectiveness of the different architectures was evaluated as the percentage of correctly recognized patterns in the test sample, and conclusions were drawn about how the parameters of a convolutional neural network influence its effectiveness. Conclusion. A network with a single convolutional layer in each cascade showed insufficient results, so three-cascade networks with two and three convolutional layers per cascade were used. Expanding the feature maps has the greatest impact on the accuracy of image recognition; increasing the number of cascades has a less noticeable effect, and increasing the number of convolutional layers in each cascade does not always improve the accuracy of the neural network. In the course of the study, a three-cascade network with two convolutional layers in each cascade and 128 feature maps was identified as the optimal architecture under the described conditions. Checking this architecture on random images of ships confirmed the correctness of the choice.
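A Keras sketch of the architecture reported as optimal, three cascades of two convolutional layers ending at 128 feature maps; the 32-64-128 filter progression and the remaining details are our assumptions:

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(32, 32, 3))       # CIFAR-10-sized images
x = inputs
for filters in (32, 64, 128):                    # three cascades
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation="softmax")(x)  # CIFAR-10 classes, incl. ships
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])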
APA, Harvard, Vancouver, ISO, and other styles
48

Ban, Jung-Chao, and Chih-Hung Chang. "On the Structure of Multilayer Cellular Neural Networks: Complexity between Two Layers." Complex Systems 24, no. 4 (December 15, 2015): 311–54. http://dx.doi.org/10.25088/complexsystems.24.4.311.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

McEneaney, John E. "Neural Networks for Readability Analysis." Journal of Educational Computing Research 10, no. 1 (January 1994): 79–93. http://dx.doi.org/10.2190/2ln8-8chq-64mu-7d9c.

Full text
Abstract:
This article describes and reports on the performance of six related artificial neural networks that have been developed for the purpose of readability analysis. Two networks employ counts of linguistic variables that simulate a traditional regression-based approach to readability. The remaining networks determine readability from "visual snapshots" of text: input text is transformed into a visual pattern representing activation levels for input-level nodes and then "blurred" slightly in an effort to promote generalization. Each network included one hidden layer of nodes in addition to the input and output layers. Of the four snapshot readability systems, two are trained to produce grade-equivalent output and two depict readability as a distribution of activation values across several grade levels. Results of preliminary trials indicate that the correlation between the visual input systems and judgements by experts is low, although in at least one case comparable to previously reported correlations between readability formulas and teacher judgement. A system using linguistic variables and numerical output correlated perfectly with a regression-based formula within the error tolerance established prior to training. The networks that produce output in the form of a readability distribution suggest a new way of reporting readability that may do greater justice to the concept of readability than traditional grade-equivalent scores while, at the same time, addressing concerns that have been voiced about the illusory precision of readability formulas.
APA, Harvard, Vancouver, ISO, and other styles
50

KHASHMAN, ADNAN. "A NEURAL NETWORK MODEL FOR CREDIT RISK EVALUATION." International Journal of Neural Systems 19, no. 04 (August 2009): 285–94. http://dx.doi.org/10.1142/s0129065709002014.

Full text
Abstract:
Credit scoring is one of the key analytical techniques in credit risk evaluation, which has been an active research area in financial risk management. This paper presents a credit risk evaluation system that uses a neural network model based on the back-propagation learning algorithm. We train and implement the neural network to decide whether to approve or reject a credit application, using seven learning schemes and real-world credit applications from the Australian credit approval dataset. A comparison of the system's performance under the different learning schemes is provided; furthermore, we compare the performance of two neural networks, with one and two hidden layers, following the ideal learning scheme. Experimental results suggest that neural networks can be effectively used in the automatic processing of credit applications.
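A minimal sketch of the comparison between networks with one and two hidden layers; the Australian dataset's 14 input attributes are real, while the layer sizes and training settings below are assumptions:

import tensorflow as tf
from tensorflow.keras import layers

def credit_model(num_hidden):
    # Back-propagation-trained approve/reject classifier; the Australian
    # credit dataset has 14 input attributes, other sizes are guesses.
    model = tf.keras.Sequential([tf.keras.Input(shape=(14,))])
    for _ in range(num_hidden):
        model.add(layers.Dense(10, activation="sigmoid"))
    model.add(layers.Dense(1, activation="sigmoid"))  # 1 = approve, 0 = reject
    model.compile(optimizer="sgd", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

one_hidden, two_hidden = credit_model(1), credit_model(2)
# Train both under the same learning scheme and compare test accuracy.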
APA, Harvard, Vancouver, ISO, and other styles