Journal articles on the topic "Deep neural networks (DNNs)"

Consult the top 50 journal articles for your research on the topic "Deep neural networks (DNNs)".

Browse journal articles across many disciplines and compile your bibliography correctly.

1

Galván, Edgar. "Neuroevolution in deep neural networks." ACM SIGEVOlution 14, no. 1 (April 2021): 3–7. http://dx.doi.org/10.1145/3460310.3460311.

Abstract:
A variety of methods have been applied to the architectural configuration and learning or training of artificial deep neural networks (DNNs). These methods play a crucial role in the success or failure of the DNNs for most problems. Evolutionary Algorithms are gaining momentum as a computationally feasible method for the automated optimisation of DNNs. Neuroevolution is a term that describes these processes. This newsletter article summarises the full version available at https://arxiv.org/abs/2006.05415.
2

Zhang, Lei, Shengyuan Zhou, Tian Zhi, Zidong Du, and Yunji Chen. "TDSNN: From Deep Neural Networks to Deep Spike Neural Networks with Temporal-Coding." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 1319–26. http://dx.doi.org/10.1609/aaai.v33i01.33011319.

Abstract:
Continuous-valued deep convolutional networks (DNNs) can be converted into accurate rate-coding based spike neural networks (SNNs). However, the substantial computational and energy costs, which are caused by multiple spikes, limit their use in mobile and embedded applications. Recent works have shown that the newly emerged temporal-coding based SNNs converted from DNNs can reduce the computational load effectively. In this paper, we propose a novel method to convert DNNs to temporal-coding SNNs, called TDSNN. Combined with the characteristic of the leaky integrate-and-fire (LIF) neural model, we put forward a new coding principle, Reverse Coding, and design a novel Ticking Neuron mechanism. According to our evaluation, our proposed method achieves a 42% reduction in total operations on average in large networks compared with DNNs, with no more than 0.5% accuracy loss. The evaluation shows that TDSNN may prove to be one of the key enablers to make the adoption of SNNs widespread.
3

Díaz-Vico, David, Jesús Prada, Adil Omari, and José Dorronsoro. "Deep support vector neural networks." Integrated Computer-Aided Engineering 27, no. 4 (September 11, 2020): 389–402. http://dx.doi.org/10.3233/ica-200635.

Abstract:
Kernel-based Support Vector Machines (SVMs), one of the most popular machine learning models, usually achieve top performance in two-class classification and regression problems. However, their training cost is at least quadratic in sample size, which makes them unsuitable for large-sample problems. Deep Neural Networks (DNNs), in contrast, have a cost linear in sample size and are able to solve big data problems relatively easily. In this work we propose to combine the advanced representations that DNNs can achieve in their last hidden layers with the hinge and ϵ-insensitive losses that are used in two-class SVM classification and regression. We can thus obtain much better scalability while achieving performance comparable to that of SVMs. Moreover, we also show that the resulting Deep SVM models are competitive with standard DNNs in two-class classification problems but have an edge in regression ones.
4

Cai, Chenghao, Yanyan Xu, Dengfeng Ke, and Kaile Su. "Deep Neural Networks with Multistate Activation Functions." Computational Intelligence and Neuroscience 2015 (2015): 1–10. http://dx.doi.org/10.1155/2015/721367.

Abstract:
We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are new kinds of activation functions which are capable of representing more than two states, including the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs perform when used to resolve classification problems. Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs perform better than the conventional DNNs, giving a relative improvement of 5.60% on phoneme error rates. Further experiments also reveal that mean-normalised SGD facilitates the training processes of DNNs with MSAFs, especially with large training sets. The models can also be directly trained without pretraining when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% on word error rates.
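Below is a minimal NumPy sketch of one plausible parameterization of an N-order multistate activation, written here as a sum of shifted logistic sigmoids so that the function saturates at more than two levels; the exact functional form, shift and order used in the paper are assumptions for illustration only.

import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def msaf(x, order=2, shift=4.0):
    # Hypothetical N-order multistate activation: a sum of `order` logistic
    # sigmoids shifted along the input axis, giving `order` + 1 plateau states
    # instead of the usual two.
    return sum(logistic(x - k * shift) for k in range(order))

# Example: a 2-order MSAF saturates near 0, 1 and 2 rather than 0 and 1.
z = np.linspace(-6.0, 14.0, 5)
print(msaf(z, order=2))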
5

Verpoort, Philipp C., Alpha A. Lee, and David J. Wales. "Archetypal landscapes for deep neural networks." Proceedings of the National Academy of Sciences 117, no. 36 (August 25, 2020): 21857–64. http://dx.doi.org/10.1073/pnas.1919995117.

Abstract:
The predictive capabilities of deep neural networks (DNNs) continue to evolve to increasingly impressive levels. However, it is still unclear how training procedures for DNNs succeed in finding parameters that produce good results for such high-dimensional and nonconvex loss functions. In particular, we wish to understand why simple optimization schemes, such as stochastic gradient descent, do not end up trapped in local minima with high loss values that would not yield useful predictions. We explain the optimizability of DNNs by characterizing the local minima and transition states of the loss-function landscape (LFL) along with their connectivity. We show that the LFL of a DNN in the shallow network or data-abundant limit is funneled, and thus easy to optimize. Crucially, in the opposite low-data/deep limit, although the number of minima increases, the landscape is characterized by many minima with similar loss values separated by low barriers. This organization is different from the hierarchical landscapes of structural glass formers and explains why minimization procedures commonly employed by the machine-learning community can navigate the LFL successfully and reach low-lying solutions.
6

Xu, Xiangxiang, Shao-Lun Huang, Lizhong Zheng, and Gregory W. Wornell. "An Information Theoretic Interpretation to Deep Neural Networks." Entropy 24, no. 1 (January 17, 2022): 135. http://dx.doi.org/10.3390/e24010135.

Abstract:
With the unprecedented performance achieved by deep learning, it is commonly believed that deep neural networks (DNNs) attempt to extract informative features for learning tasks. To formalize this intuition, we apply the local information geometric analysis and establish an information-theoretic framework for feature selection, which demonstrates the information-theoretic optimality of DNN features. Moreover, we conduct a quantitative analysis to characterize the impact of network structure on the feature extraction process of DNNs. Our investigation naturally leads to a performance metric for evaluating the effectiveness of extracted features, called the H-score, which illustrates the connection between the practical training process of DNNs and the information-theoretic framework. Finally, we validate our theoretical results by experimental designs on synthesized data and the ImageNet dataset.
7

Marrow, Scythia, Eric J. Michaud, and Erik Hoel. "Examining the Causal Structures of Deep Neural Networks Using Information Theory." Entropy 22, no. 12 (December 18, 2020): 1429. http://dx.doi.org/10.3390/e22121429.

Abstract:
Deep Neural Networks (DNNs) are often examined at the level of their response to input, such as analyzing the mutual information between nodes and data sets. Yet DNNs can also be examined at the level of causation, exploring “what does what” within the layers of the network itself. Historically, analyzing the causal structure of DNNs has received less attention than understanding their responses to input. Yet definitionally, generalizability must be a function of a DNN’s causal structure as it reflects how the DNN responds to unseen or even not-yet-defined future inputs. Here, we introduce a suite of metrics based on information theory to quantify and track changes in the causal structure of DNNs during training. Specifically, we introduce the effective information (EI) of a feedforward DNN, which is the mutual information between layer input and output following a maximum-entropy perturbation. The EI can be used to assess the degree of causal influence nodes and edges have over their downstream targets in each layer. We show that the EI can be further decomposed in order to examine the sensitivity of a layer (measured by how well edges transmit perturbations) and the degeneracy of a layer (measured by how edge overlap interferes with transmission), along with estimates of the amount of integrated information of a layer. Together, these properties define where each layer lies in the “causal plane”, which can be used to visualize how layer connectivity becomes more sensitive or degenerate over time, and how integration changes during training, revealing how the layer-by-layer causal structure differentiates. These results may help in understanding the generalization capabilities of DNNs and provide foundational tools for making DNNs both more generalizable and more explainable.
8

Shu, Hai, and Hongtu Zhu. "Sensitivity Analysis of Deep Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 4943–50. http://dx.doi.org/10.1609/aaai.v33i01.33014943.

Abstract:
Deep neural networks (DNNs) have achieved superior performance in various prediction tasks, but can be very vulnerable to adversarial examples or perturbations. Therefore, it is crucial to measure the sensitivity of DNNs to various forms of perturbations in real applications. We introduce a novel perturbation manifold and its associated influence measure to quantify the effects of various perturbations on DNN classifiers. Such perturbations include various external and internal perturbations to input samples and network parameters. The proposed measure is motivated by information geometry and provides desirable invariance properties. We demonstrate that our influence measure is useful for four model building tasks: detecting potential ‘outliers’, analyzing the sensitivity of model architectures, comparing network sensitivity between training and test sets, and locating vulnerable areas. Experiments show reasonably good performance of the proposed measure for the popular DNN models ResNet50 and DenseNet121 on CIFAR10 and MNIST datasets.
9

Nakamura, Kensuke, Bilel Derbel, Kyoung-Jae Won, and Byung-Woo Hong. "Learning-Rate Annealing Methods for Deep Neural Networks." Electronics 10, no. 16 (August 22, 2021): 2029. http://dx.doi.org/10.3390/electronics10162029.

Abstract:
Deep neural networks (DNNs) have achieved great success in the last decades. A DNN is typically optimized using stochastic gradient descent (SGD) with learning-rate annealing, which overtakes the adaptive methods in many tasks. However, there is no common choice of annealing schedule for SGD. This paper presents an empirical analysis of learning-rate annealing based on experimental results for major image-classification datasets, image classification being one of the key applications of DNNs. Our experiments involve recent deep neural network models in combination with a variety of learning-rate annealing methods. We also propose an annealing schedule that combines the sigmoid function with warmup and is shown to surpass both the adaptive methods and the other existing schedules in accuracy in most cases.
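As an illustration of the schedule family discussed in this abstract, the sketch below combines a linear warmup with a sigmoid-shaped decay; the warmup length, midpoint and steepness are illustrative assumptions rather than the authors' exact settings.

import math

def sigmoid_warmup_lr(epoch, total_epochs=100, warmup_epochs=5,
                      base_lr=0.1, steepness=10.0):
    # Linear warmup followed by a sigmoid-shaped decay of the learning rate.
    # Illustrative constants only; the paper's exact schedule may differ.
    if epoch < warmup_epochs:
        return base_lr * (epoch + 1) / warmup_epochs
    # progress in [0, 1] over the post-warmup epochs
    t = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return base_lr / (1.0 + math.exp(steepness * (t - 0.5)))

print([round(sigmoid_warmup_lr(e), 4) for e in (0, 4, 20, 50, 99)])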
10

Xu, Shenghe, Shivendra S. Panwar, Murali Kodialam, and T. V. Lakshman. "Deep Neural Network Approximated Dynamic Programming for Combinatorial Optimization." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 02 (April 3, 2020): 1684–91. http://dx.doi.org/10.1609/aaai.v34i02.5531.

Abstract:
In this paper, we propose a general framework for combining deep neural networks (DNNs) with dynamic programming to solve combinatorial optimization problems. For problems that can be broken into smaller subproblems and solved by dynamic programming, we train a set of neural networks to replace value or policy functions at each decision step. Two variants of the neural network approximated dynamic programming (NDP) methods are proposed; in the value-based NDP method, the networks learn to estimate the value of each choice at the corresponding step, while in the policy-based NDP method the DNNs only estimate the best decision at each step. The training procedure of the NDP starts from the smallest problem size and a new DNN for the next size is trained to cooperate with previous DNNs. After all the DNNs are trained, the networks are fine-tuned together to further improve overall performance. We test NDP on the linear sum assignment problem, the traveling salesman problem and the talent scheduling problem. Experimental results show that NDP can achieve considerable computation time reduction on hard problems with reasonable performance loss. In general, NDP can be applied to reducible combinatorial optimization problems for the purpose of computation time reduction.
11

Kutz, J. Nathan. "Deep learning in fluid dynamics." Journal of Fluid Mechanics 814 (January 31, 2017): 1–4. http://dx.doi.org/10.1017/jfm.2016.803.

Abstract:
It was only a matter of time before deep neural networks (DNNs) – deep learning – made their mark in turbulence modelling, or more broadly, in the general area of high-dimensional, complex dynamical systems. In the last decade, DNNs have become a dominant data-mining tool for big data applications. Although neural networks have been applied previously to complex fluid flows, the article featured here (Ling et al., J. Fluid Mech., vol. 807, 2016, pp. 155–166) is the first to apply a true DNN architecture, specifically to Reynolds-averaged Navier–Stokes turbulence models. As one often expects with modern DNNs, performance gains are achieved over competing state-of-the-art methods, suggesting that DNNs may play a critically enabling role in the future of modelling complex flows.
12

Servais, Jason, and Ehsan Atoofian. "Adaptive Computation Reuse for Energy-Efficient Training of Deep Neural Networks." ACM Transactions on Embedded Computing Systems 20, no. 6 (November 30, 2021): 1–24. http://dx.doi.org/10.1145/3487025.

Abstract:
In recent years, Deep Neural Networks (DNNs) have been deployed in a diverse set of applications, from voice recognition to scene generation, mostly due to their high accuracy. DNNs are known to be computationally intensive applications, requiring a significant power budget. There have been a large number of investigations into the energy efficiency of DNNs. However, most of them primarily focused on inference, while training of DNNs has received little attention. This work proposes an adaptive technique to identify and avoid redundant computations during the training of DNNs. Elements of activations exhibit a high degree of similarity, causing inputs and outputs of layers of neural networks to perform redundant computations. Based on this observation, we propose Adaptive Computation Reuse for Tensor Cores (ACRTC), where results of previous arithmetic operations are used to avoid redundant computations. ACRTC is an architectural technique which enables accelerators to take advantage of similarity in input operands and speed up the training process while also increasing energy efficiency. ACRTC dynamically adjusts the strength of computation reuse based on the tolerance of precision relaxation in different training phases. Over a wide range of neural network topologies, ACRTC accelerates training by 33% and saves energy by 32% with negligible impact on accuracy.
13

Jang, Hojin, Devin McCormack, and Frank Tong. "Noise-trained deep neural networks effectively predict human vision and its neural responses to challenging images." PLOS Biology 19, no. 12 (December 9, 2021): e3001418. http://dx.doi.org/10.1371/journal.pbio.3001418.

Abstract:
Deep neural networks (DNNs) for object classification have been argued to provide the most promising model of the visual system, accompanied by claims that they have attained or even surpassed human-level performance. Here, we evaluated whether DNNs provide a viable model of human vision when tested with challenging noisy images of objects, sometimes presented at the very limits of visibility. We show that popular state-of-the-art DNNs perform in a qualitatively different manner than humans—they are unusually susceptible to spatially uncorrelated white noise and less impaired by spatially correlated noise. We implemented a noise training procedure to determine whether noise-trained DNNs exhibit more robust responses that better match human behavioral and neural performance. We found that noise-trained DNNs provide a better qualitative match to human performance; moreover, they reliably predict human recognition thresholds on an image-by-image basis. Functional neuroimaging revealed that noise-trained DNNs provide a better correspondence to the pattern-specific neural representations found in both early visual areas and high-level object areas. A layer-specific analysis of the DNNs indicated that noise training led to broad-ranging modifications throughout the network, with greater benefits of noise robustness accruing in progressively higher layers. Our findings demonstrate that noise-trained DNNs provide a viable model to account for human behavioral and neural responses to objects in challenging noisy viewing conditions. Further, they suggest that robustness to noise may be acquired through a process of visual learning.
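A minimal sketch of a noise-augmentation step in the spirit of the procedure described above, mixing pixel-wise (spatially uncorrelated) Gaussian noise with low-pass-filtered (spatially correlated) noise; the noise types, levels and mixing ratio are assumptions, not the authors' protocol.

import numpy as np
from scipy.ndimage import gaussian_filter

def add_noise(images, sigma=0.2, correlated_fraction=0.5, rng=None):
    # Corrupt a batch of images (values in [0, 1]) with either pixel-wise
    # white noise or spatially correlated (low-pass filtered) noise.
    rng = rng or np.random.default_rng()
    noisy = images.copy()
    for i, img in enumerate(noisy):
        noise = rng.normal(0.0, sigma, size=img.shape)
        if rng.random() < correlated_fraction:
            noise = gaussian_filter(noise, sigma=2.0)  # spatially correlated
        noisy[i] = np.clip(img + noise, 0.0, 1.0)
    return noisy

batch = np.random.default_rng(0).random((4, 32, 32))
print(add_noise(batch).shape)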
14

Aamir, Aisha, Minija Tamosiunaite, and Florentin Wörgötter. "Caffe2Unity: Immersive Visualization and Interpretation of Deep Neural Networks." Electronics 11, no. 1 (December 28, 2021): 83. http://dx.doi.org/10.3390/electronics11010083.

Abstract:
Deep neural networks (DNNs) dominate many tasks in the computer vision domain, but it is still difficult to understand and interpret the information contained within these networks. To gain better insight into how a network learns and operates, there is a strong need to visualize these complex structures, and this remains an important research direction. In this paper, we address the problem of how the interactive display of DNNs in a virtual reality (VR) setup can be used for general understanding and architectural assessment. We compiled a static library as a plugin for the Caffe framework in the Unity gaming engine. We used routines from this plugin to create and visualize a VR-based AlexNet architecture for an image classification task. Our layered interactive model allows the user to freely navigate back and forth within the network during visual exploration. To make the DNN model even more accessible, the user can select certain connections to understand the activity flow at a particular neuron. Our VR setup also allows users to hide the activation maps/filters or even interactively occlude certain features in an image in real-time. Furthermore, we added an interpretation module and reframed the Shapley values to give a deeper understanding of the different layers. Thus, this novel tool offers more direct access to network structures and results, and its immersive operation is especially instructive for both novices and experts in the field of DNNs.
15

Jacobs, Robert A., and Christopher J. Bates. "Comparing the Visual Representations and Performance of Humans and Deep Neural Networks." Current Directions in Psychological Science 28, no. 1 (November 27, 2018): 34–39. http://dx.doi.org/10.1177/0963721418801342.

Abstract:
Although deep neural networks (DNNs) are state-of-the-art artificial intelligence systems, it is unclear what insights, if any, they provide about human intelligence. We address this issue in the domain of visual perception. After briefly describing DNNs, we provide an overview of recent results comparing human visual representations and performance with those of DNNs. In many cases, DNNs acquire visual representations and processing strategies that are very different from those used by people. We conjecture that there are at least two factors preventing them from serving as better psychological models. First, DNNs are currently trained with impoverished data, such as data lacking important visual cues to three-dimensional structure, data lacking multisensory statistical regularities, and data in which stimuli are unconnected to an observer’s actions and goals. Second, DNNs typically lack adaptations to capacity limits, such as attentional mechanisms, visual working memory, and compressed mental representations biased toward preserving task-relevant abstractions.
16

Zheng, Zhong, Xin Zhang, Jinxing Yu, Rui Guo, and Lili Zhangzhong. "Deep Neural Networks for the Classification of Pure and Impure Strawberry Purees." Sensors 20, no. 4 (February 23, 2020): 1223. http://dx.doi.org/10.3390/s20041223.

Abstract:
In this paper, a comparative study of the effectiveness of deep neural networks (DNNs) in the classification of pure and impure purees is conducted. Three different types of DNNs—the Gated Recurrent Unit (GRU), the Long Short-Term Memory (LSTM), and the temporal convolutional network (TCN)—are employed for the detection of adulteration of strawberry purees. The Strawberry dataset, a time-series spectroscopy dataset from the UCR time series classification repository, is utilized to evaluate the performance of the different DNNs. Experimental results demonstrate that the TCN is able to obtain a higher classification accuracy than the GRU and LSTM. Moreover, the TCN achieves a new state-of-the-art classification accuracy on the Strawberry dataset. These results indicate the great potential of using the TCN for the detection of adulteration of fruit purees in the future.
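For orientation, a minimal Keras sketch of one of the recurrent baselines mentioned above, a GRU classifier over the spectral time series, is given below; the series length of 235 corresponds to the UCR Strawberry data, while the layer sizes and training settings are illustrative assumptions.

import tensorflow as tf

# Assumed shapes: X_train is (n_samples, 235, 1) spectra, y_train holds 0/1 adulteration labels.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(235, 1)),
    tf.keras.layers.GRU(64),                         # recurrent encoder of the spectrum
    tf.keras.layers.Dense(1, activation="sigmoid"),  # pure vs. adulterated puree
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=50, batch_size=16, validation_split=0.2)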
17

Bassi, Pedro R. A. S., and Romis Attux. "FBDNN: filter banks and deep neural networks for portable and fast brain-computer interfaces." Biomedical Physics & Engineering Express 8, no. 3 (April 8, 2022): 035018. http://dx.doi.org/10.1088/2057-1976/ac6300.

Abstract:
Objective. To propose novel SSVEP classification methodologies using deep neural networks (DNNs) and improve performances in single-channel and user-independent brain-computer interfaces (BCIs) with small data lengths. Approach. We propose the utilization of filter banks (creating sub-band components of the EEG signal) in conjunction with DNNs. In this context, we created three different models: a recurrent neural network (FBRNN) analyzing the time domain, a 2D convolutional neural network (FBCNN-2D) processing complex spectrum features and a 3D convolutional neural network (FBCNN-3D) analyzing complex spectrograms, which we introduce in this study as a possible input for SSVEP classification. We tested our neural networks on three open datasets and conceived them so as not to require calibration from the final user, simulating a user-independent BCI. Results. The DNNs with the filter banks surpassed the accuracy of similar networks without this preprocessing step by considerable margins, and they outperformed common SSVEP classification methods (SVM and FBCCA) by even higher margins. Conclusion and significance. Filter banks allow different types of deep neural networks to more efficiently analyze the harmonic components of SSVEP. Complex spectrograms carry more information than complex spectrum features and the magnitude spectrum, allowing the FBCNN-3D to surpass the other CNNs. The performances obtained in the challenging classification problems indicate a strong potential for the construction of portable, economical, fast and low-latency BCIs.
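The filter-bank preprocessing described here amounts to band-pass filtering the EEG into several sub-band components before they are fed to a DNN; a minimal SciPy sketch follows, with the band edges, filter order and sampling rate chosen as assumptions.

import numpy as np
from scipy.signal import butter, filtfilt

def filter_bank(eeg, fs=256.0, bands=((3, 14), (9, 26), (14, 38)), order=4):
    # Split a single-channel EEG signal into band-passed sub-band components.
    # Band edges (Hz) and sampling rate are illustrative only.
    sub_bands = []
    for low, high in bands:
        b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
        sub_bands.append(filtfilt(b, a, eeg))
    return np.stack(sub_bands)  # shape: (n_bands, n_samples)

signal = np.random.default_rng(0).standard_normal(2 * 256)  # 2 s of synthetic EEG
print(filter_bank(signal).shape)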
18

Sun, Guangling, Yuying Su, Chuan Qin, Wenbo Xu, Xiaofeng Lu, and Andrzej Ceglowski. "Complete Defense Framework to Protect Deep Neural Networks against Adversarial Examples." Mathematical Problems in Engineering 2020 (May 11, 2020): 1–17. http://dx.doi.org/10.1155/2020/8319249.

Abstract:
Although Deep Neural Networks (DNNs) have achieved great success on various applications, investigations have increasingly shown DNNs to be highly vulnerable when adversarial examples are used as input. Here, we present a comprehensive defense framework to protect DNNs against adversarial examples. First, we present statistical and minor alteration detectors to filter out adversarial examples contaminated by noticeable and unnoticeable perturbations, respectively. Then, we ensemble the detectors, a deep Residual Generative Network (ResGN), and an adversarially trained targeted network, to construct a complete defense framework. In this framework, the ResGN is our previously proposed network which is used to remove adversarial perturbations, and the adversarially trained targeted network is a network that is learned through adversarial training. Specifically, once the detectors determine an input example to be adversarial, it is cleaned by the ResGN and then classified by the adversarially trained targeted network; otherwise, it is directly classified by this network. We empirically evaluate the proposed complete defense on the ImageNet dataset. The results confirm the robustness against current representative attacking methods including the fast gradient sign method, the randomized fast gradient sign method, the basic iterative method, universal adversarial perturbations, the DeepFool method, and the Carlini & Wagner method.
19

Putra, Prasetia Utama, Keisuke Shima, and Koji Shimatani. "A deep neural network model for multi-view human activity recognition." PLOS ONE 17, no. 1 (January 7, 2022): e0262181. http://dx.doi.org/10.1371/journal.pone.0262181.

Abstract:
Multiple cameras are used to resolve the occlusion problem that often occurs in single-view human activity recognition. Based on the success of learning representations with deep neural networks (DNNs), recent works have proposed DNN models to estimate human activity from multi-view inputs. However, currently available datasets are inadequate for training DNN models to obtain a high accuracy rate. Against such an issue, this study presents a DNN model, trained by employing transfer learning and shared-weight techniques, to classify human activity from multiple cameras. The model comprised pre-trained convolutional neural networks (CNNs), attention layers, long short-term memory networks with residual learning (LSTMRes), and Softmax layers. The experimental results suggested that the proposed model could achieve a promising performance on challenging MVHAR datasets: IXMAS (97.27%) and i3DPost (96.87%). A competitive recognition rate was also observed in online classification.
20

Luo, Yaoru, Guole Liu, Yuanhao Guo, and Ge Yang. "Deep Neural Networks Learn Meta-Structures from Noisy Labels in Semantic Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (June 28, 2022): 1908–16. http://dx.doi.org/10.1609/aaai.v36i2.20085.

Abstract:
How deep neural networks (DNNs) learn from noisy labels has been studied extensively in image classification but much less in image segmentation. So far, our understanding of the learning behavior of DNNs trained by noisy segmentation labels remains limited. In this study, we address this deficiency in both binary segmentation of biological microscopy images and multi-class segmentation of natural images. We generate extremely noisy labels by randomly sampling a small fraction (e.g., 10%) or flipping a large fraction (e.g., 90%) of the ground truth labels. When trained with these noisy labels, DNNs provide largely the same segmentation performance as trained by the original ground truth. This indicates that DNNs learn structures hidden in labels rather than pixel-level labels per se in their supervised training for semantic segmentation. We refer to these hidden structures in labels as meta-structures. When DNNs are trained by labels with different perturbations to the meta-structure, we find consistent degradation in their segmentation performance. In contrast, incorporation of meta-structure information substantially improves performance of an unsupervised segmentation model developed for binary semantic segmentation. We define meta-structures mathematically as spatial density distributions and show both theoretically and experimentally how this formulation explains key observed learning behavior of DNNs.
21

Wang, Li-Na, Wenxue Liu, Xiang Liu, Guoqiang Zhong, Partha Pratim Roy, Junyu Dong, and Kaizhu Huang. "Compressing Deep Networks by Neuron Agglomerative Clustering." Sensors 20, no. 21 (October 23, 2020): 6033. http://dx.doi.org/10.3390/s20216033.

Abstract:
In recent years, deep learning models have achieved remarkable successes in various applications, such as pattern recognition, computer vision, and signal processing. However, high-performance deep architectures are often accompanied by a large storage space and long computational time, which make it difficult to fully exploit many deep neural networks (DNNs), especially in scenarios in which computing resources are limited. In this paper, to tackle this problem, we introduce a method for compressing the structure and parameters of DNNs based on neuron agglomerative clustering (NAC). Specifically, we utilize the agglomerative clustering algorithm to find similar neurons, while these similar neurons and the connections linked to them are then agglomerated together. Using NAC, the number of parameters and the storage space of DNNs are greatly reduced, without the support of an extra library or hardware. Extensive experiments demonstrate that NAC is very effective for the neuron agglomeration of both the fully connected and convolutional layers, which are common building blocks of DNNs, delivering similar or even higher network accuracy. Specifically, on the benchmark CIFAR-10 and CIFAR-100 datasets, using NAC to compress the parameters of the original VGGNet by 92.96% and 81.10%, respectively, the compact network obtained still outperforms the original networks.
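A minimal sketch of the neuron-merging idea for a single fully connected layer: neurons are clustered by their incoming weight vectors and each cluster is replaced by its centroid, with outgoing weights summed; the cluster count and merging rule are simplifying assumptions rather than the exact NAC procedure.

import numpy as np
from sklearn.cluster import AgglomerativeClustering

def merge_neurons(W_in, W_out, n_clusters):
    # Compress a layer with incoming weights W_in (d_in x n) and outgoing
    # weights W_out (n x d_out) by agglomeratively clustering its n neurons.
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(W_in.T)
    W_in_new = np.zeros((W_in.shape[0], n_clusters))
    W_out_new = np.zeros((n_clusters, W_out.shape[1]))
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        W_in_new[:, c] = W_in[:, idx].mean(axis=1)  # centroid of merged neurons
        W_out_new[c] = W_out[idx].sum(axis=0)       # preserve total outgoing drive
    return W_in_new, W_out_new

rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((64, 128)), rng.standard_normal((128, 10))
print([w.shape for w in merge_neurons(W1, W2, 32)])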
22

Zhang, Duzhen, Tielin Zhang, Shuncheng Jia, and Bo Xu. "Multi-Sacle Dynamic Coding Improved Spiking Actor Network for Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 59–67. http://dx.doi.org/10.1609/aaai.v36i1.19879.

Abstract:
With the help of deep neural networks (DNNs), deep reinforcement learning (DRL) has achieved great success on many complex tasks, from games to robotic control. Compared to DNNs with partial brain-inspired structures and functions, spiking neural networks (SNNs) consider more biological features, including spiking neurons with complex dynamics and learning paradigms with biologically plausible plasticity principles. Inspired by the efficient computation of cell assembly in the biological brain, whereby memory-based coding is much more complex than readout, we propose a multiscale dynamic coding improved spiking actor network (MDC-SAN) for reinforcement learning to achieve effective decision-making. The population coding at the network scale is integrated with the dynamic neurons coding (containing 2nd-order neuronal dynamics) at the neuron scale towards a powerful spatial-temporal state representation. Extensive experimental results show that our MDC-SAN performs better than its counterpart deep actor network (based on DNNs) on four continuous control tasks from OpenAI gym. We think this is a significant attempt to improve SNNs from the perspective of efficient coding towards effective decision-making, just like that in biological networks.
23

Grill-Spector, Kalanit, Kevin S. Weiner, Jesse Gomez, Anthony Stigliani, and Vaidehi S. Natu. "The functional neuroanatomy of face perception: from brain measurements to deep neural networks." Interface Focus 8, no. 4 (June 15, 2018): 20180013. http://dx.doi.org/10.1098/rsfs.2018.0013.

Abstract:
A central goal in neuroscience is to understand how processing within the ventral visual stream enables rapid and robust perception and recognition. Recent neuroscientific discoveries have significantly advanced understanding of the function, structure and computations along the ventral visual stream that serve as the infrastructure supporting this behaviour. In parallel, significant advances in computational models, such as hierarchical deep neural networks (DNNs), have brought machine performance to a level that is commensurate with human performance. Here, we propose a new framework using the ventral face network as a model system to illustrate how increasing the neural accuracy of present DNNs may allow researchers to test the computational benefits of the functional architecture of the human brain. Thus, the review (i) considers specific neural implementational features of the ventral face network, (ii) describes similarities and differences between the functional architecture of the brain and DNNs, and (iii) provides a hypothesis for the computational value of implementational features within the brain that may improve DNN performance. Importantly, this new framework promotes the incorporation of neuroscientific findings into DNNs in order to test the computational benefits of fundamental organizational features of the visual system.
24

Harada, Akira, Shota Nishikawa, and Shoichi Yamada. "Deep Learning of the Eddington Tensor in Core-collapse Supernova Simulation." Astrophysical Journal 925, no. 2 (January 31, 2022): 117. http://dx.doi.org/10.3847/1538-4357/ac3998.

Abstract:
We trained deep neural networks (DNNs) as a function of the neutrino energy density, flux, and the fluid velocity to reproduce the Eddington tensor for neutrinos obtained in our first-principles core-collapse supernova simulation. Although the moment method, which is one of the most popular approximations for neutrino transport, requires a closure relation, none of the analytical closure relations commonly employed in the literature capture all aspects of the neutrino angular distribution in momentum space. In this paper, we develop a closure relation by using DNNs that take the neutrino energy density, flux, and the fluid velocity as the inputs and the Eddington tensor as the output. We consider two kinds of DNNs: a conventional DNN, named a component-wise neural network (CWNN), and a tensor-basis neural network (TBNN). We find that the diagonal component of the Eddington tensor is better reproduced by the DNNs than the M1 closure relation, especially for low to intermediate energies. For the off-diagonal component, the DNNs agree better with the Boltzmann solver than the M1 closure relation at large radii. In the comparison between the two DNNs, the TBNN displays slightly better performance than the CWNN. With these new closure relations at hand, based on DNNs that well reproduce the Eddington tensor at much lower costs, we have opened up a new possibility for the moment method.
25

Cheng, Hao, Dongze Lian, Shenghua Gao, and Yanlin Geng. "Utilizing Information Bottleneck to Evaluate the Capability of Deep Neural Networks for Image Classification." Entropy 21, no. 5 (May 1, 2019): 456. http://dx.doi.org/10.3390/e21050456.

Abstract:
Inspired by the pioneering work of the information bottleneck (IB) principle for Deep Neural Networks' (DNNs) analysis, we thoroughly study the relationship among the model accuracy, I(X;T) and I(T;Y), where I(X;T) and I(T;Y) are the mutual information of the DNN's output T with the input X and the label Y, respectively. Then, we design an information plane-based framework to evaluate the capability of DNNs (including CNNs) for image classification. Instead of each hidden layer's output, our framework focuses on the model output T. We successfully apply our framework to many application scenarios arising in deep learning and image classification problems, such as image classification with unbalanced data distribution, model selection, and transfer learning. The experimental results verify the effectiveness of the information plane-based framework: Our framework may facilitate a quick model selection and determine the number of samples needed for each class in the unbalanced classification problem. Furthermore, the framework explains the efficiency of transfer learning in the deep learning area.
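As a toy illustration of the quantities on the information plane, the sketch below estimates I(T;Y) from the empirical joint distribution of discrete model outputs and labels; estimating I(X;T) for continuous, high-dimensional X requires binning or dedicated estimators not shown here, and the synthetic labels are assumptions.

import numpy as np
from sklearn.metrics import mutual_info_score

# Assumed toy setup: y_true are class labels, t_pred are the model's output labels.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=1000)
t_pred = np.where(rng.random(1000) < 0.8, y_true, rng.integers(0, 10, size=1000))

# I(T;Y) in nats, estimated from the empirical joint distribution of labels.
print("I(T;Y) ~", mutual_info_score(y_true, t_pred))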
26

Kwon, Hyun, Hyunsoo Yoon, and Ki-Woong Park. "Selective Poisoning Attack on Deep Neural Networks †." Symmetry 11, no. 7 (July 8, 2019): 892. http://dx.doi.org/10.3390/sym11070892.

Abstract:
Studies related to pattern recognition and visualization using computer technology have been introduced. In particular, deep neural networks (DNNs) provide good performance for image, speech, and pattern recognition. However, a poisoning attack is a serious threat to a DNN's security. A poisoning attack reduces the accuracy of a DNN by adding malicious training data during the training process. In some situations, it may be necessary to drop the accuracy of a specifically chosen class from the model. For example, if an attacker specifically disallows nuclear facilities to be selectively recognized, it may be necessary to intentionally prevent unmanned aerial vehicles from correctly recognizing nuclear-related facilities. In this paper, we propose a selective poisoning attack that reduces the accuracy of only the chosen class in the model. The proposed method achieves this by training malicious data corresponding to only the chosen class while maintaining the accuracy of the remaining classes. For the experiment, we used TensorFlow as the machine-learning library as well as MNIST, Fashion-MNIST, and CIFAR10 as the datasets. Experimental results show that the proposed method can reduce the accuracy of the chosen class by 43.2%, 41.7%, and 55.3% in MNIST, Fashion-MNIST, and CIFAR10, respectively, while maintaining the accuracy of the remaining classes.
27

Deng, Xiang, Yun Xiao, Bo Long, and Zhongfei Zhang. "Reducing Flipping Errors in Deep Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6506–14. http://dx.doi.org/10.1609/aaai.v36i6.20603.

Abstract:
Deep neural networks (DNNs) have been widely applied in various domains in artificial intelligence including computer vision and natural language processing. A DNN is typically trained for many epochs, and then a validation dataset is used to select the DNN in an epoch (we simply call this epoch "the last epoch") as the final model for making predictions on unseen samples, while it usually cannot achieve a perfect accuracy on unseen samples. An interesting question is "how many test (unseen) samples that a DNN misclassifies in the last epoch were ever correctly classified by the DNN before the last epoch?". In this paper, we empirically study this question and find on several benchmark datasets that the vast majority of the misclassified samples in the last epoch were ever classified correctly before the last epoch, which means that the predictions for these samples were flipped from "correct" to "wrong". Motivated by this observation, we propose to restrict the behavior changes of a DNN on the correctly-classified samples so that the correct local boundaries can be maintained and the flipping error on unseen samples can be largely reduced. Extensive experiments on different benchmark datasets with different modern network architectures demonstrate that the proposed flipping error reduction (FER) approach can substantially improve the generalization, the robustness, and the transferability of DNNs without introducing any additional network parameters or inference cost, only with a negligible training overhead.
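A minimal sketch of the bookkeeping behind the empirical question above: record per-sample correctness at every epoch and count the test samples that are wrong in the last epoch but were correct at some earlier epoch; the logged array here is a synthetic assumption.

import numpy as np

# correct[e, i] is True if sample i was classified correctly at epoch e (assumed log).
rng = np.random.default_rng(0)
correct = rng.random((30, 2000)) < 0.9           # 30 epochs, 2000 test samples

wrong_last = ~correct[-1]                        # misclassified in the last epoch
ever_correct_before = correct[:-1].any(axis=0)   # correct at some earlier epoch
flipped = wrong_last & ever_correct_before

print(f"{flipped.sum()} of {wrong_last.sum()} last-epoch errors were flipped "
      "from a previously correct prediction")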
28

Deng, Hanming, Yang Hua, Tao Song, Zhengui Xue, Ruhui Ma, Neil Robertson, and Haibing Guan. "Reinforcing Neural Network Stability with Attractor Dynamics." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3765–72. http://dx.doi.org/10.1609/aaai.v34i04.5787.

Abstract:
Recent approaches interpret deep neural networks (DNNs) as dynamical systems, drawing the connection between stability in forward propagation and generalization of DNNs. In this paper, we take a step further to be the first to reinforce this stability of DNNs without changing their original structure and verify the impact of the reinforced stability on the network representation from various aspects. More specifically, we reinforce stability by modeling attractor dynamics of a DNN and propose the relu-max attractor network (RMAN), a light-weight module ready to be deployed on state-of-the-art ResNet-like networks. RMAN is only needed during training so as to modify a ResNet's attractor dynamics by minimizing an energy function together with the loss of the original learning task. Through intensive experiments, we show that RMAN-modified attractor dynamics bring a more structured representation space to ResNet and its variants, and more importantly improve the generalization ability of ResNet-like networks in supervised tasks due to reinforced stability.
29

Kanamura, Momomi, Kanata Suzuki, Yuki Suga, and Tetsuya Ogata. "Development of a Basic Educational Kit for Robotic System with Deep Neural Networks." Sensors 21, no. 11 (May 31, 2021): 3804. http://dx.doi.org/10.3390/s21113804.

Abstract:
In many robotics studies, deep neural networks (DNNs) are being actively studied due to their good performance. However, existing robotic techniques and DNNs have not been systematically integrated, and packages for beginners are yet to be developed. In this study, we proposed a basic educational kit for robotic system development with DNNs. Our goal was to educate beginners in both robotics and machine learning, especially the use of DNNs. Initially, we required the kit to (1) be easy to understand, (2) employ experience-based learning, and (3) be applicable in many areas. To clarify the learning objectives and important parts of the basic educational kit, we analyzed the research and development (R&D) of DNNs and divided the process into three steps of data collection (DC), machine learning (ML), and task execution (TE). These steps were configured under a hierarchical system flow with the ability to be executed individually at the development stage. To evaluate the practicality of the proposed system flow, we implemented it for a physical robotic grasping system using robotics middleware. We also demonstrated that the proposed system can be effectively applied to other hardware, sensor inputs, and robot tasks.
30

Villalobos, Kimberly, Vilim Štih, Amineh Ahmadinejad, Shobhita Sundaram, Jamell Dozier, Andrew Francl, Frederico Azevedo, Tomotake Sasaki, and Xavier Boix. "Do Neural Networks for Segmentation Understand Insideness?" Neural Computation 33, no. 9 (August 19, 2021): 2511–49. http://dx.doi.org/10.1162/neco_a_01413.

Abstract:
The insideness problem is an aspect of image segmentation that consists of determining which pixels are inside and outside a region. Deep neural networks (DNNs) excel in segmentation benchmarks, but it is unclear if they have the ability to solve the insideness problem as it requires evaluating long-range spatial dependencies. In this letter, we analyze the insideness problem in isolation, without texture or semantic cues, such that other aspects of segmentation do not interfere in the analysis. We demonstrate that DNNs for segmentation with few units have sufficient complexity to solve the insideness for any curve. Yet such DNNs have severe problems with learning general solutions. Only recurrent networks trained with small images learn solutions that generalize well to almost any curve. Recurrent networks can decompose the evaluation of long-range dependencies into a sequence of local operations, and learning with small images alleviates the common difficulties of training recurrent networks with a large number of unrolling steps.
31

Opschoor, Joost A. A., Philipp C. Petersen, and Christoph Schwab. "Deep ReLU networks and high-order finite element methods." Analysis and Applications 18, no. 05 (February 21, 2020): 715–70. http://dx.doi.org/10.1142/s0219530519410136.

Abstract:
Approximation rate bounds for emulations of real-valued functions on intervals by deep neural networks (DNNs) are established. The approximation results are given for DNNs based on ReLU activation functions. The approximation error is measured with respect to Sobolev norms. It is shown that ReLU DNNs allow for essentially the same approximation rates as nonlinear, variable-order, free-knot (or so-called “hp-adaptive”) spline approximations and spectral approximations, for a wide range of Sobolev and Besov spaces. In particular, exponential convergence rates in terms of the DNN size for univariate, piecewise Gevrey functions with point singularities are established. Combined with recent results on ReLU DNN approximation of rational, oscillatory, and high-dimensional functions, this corroborates that continuous, piecewise affine ReLU DNNs afford algebraic and exponential convergence rate bounds which are comparable to “best in class” schemes for several important function classes of high and infinite smoothness. Using composition of DNNs, we also prove that radial-like functions obtained as compositions of the above with the Euclidean norm and, possibly, anisotropic affine changes of co-ordinates can be emulated at exponential rate in terms of the DNN size and depth without the curse of dimensionality.
32

Jin, Wei, Yaxing Li, Han Xu, Yiqi Wang, Shuiwang Ji, Charu Aggarwal, and Jiliang Tang. "Adversarial Attacks and Defenses on Graphs." ACM SIGKDD Explorations Newsletter 22, no. 2 (January 17, 2021): 19–34. http://dx.doi.org/10.1145/3447556.3447566.

Abstract:
Deep neural networks (DNNs) have achieved significant performance in various tasks. However, recent studies have shown that DNNs can be easily fooled by small perturbations of the input, called adversarial attacks.
33

Krishnan, Gokul, Sumit K. Mandal, Chaitali Chakrabarti, Jae-Sun Seo, Umit Y. Ogras, and Yu Cao. "Impact of On-chip Interconnect on In-memory Acceleration of Deep Neural Networks." ACM Journal on Emerging Technologies in Computing Systems 18, no. 2 (April 30, 2022): 1–22. http://dx.doi.org/10.1145/3460233.

Abstract:
With the widespread use of Deep Neural Networks (DNNs), machine learning algorithms have evolved in two diverse directions—one with ever-increasing connection density for better accuracy and the other with more compact sizing for energy efficiency. The increase in connection density increases on-chip data movement, which makes efficient on-chip communication a critical function of the DNN accelerator. The contribution of this work is threefold. First, we illustrate that the point-to-point (P2P)-based interconnect is incapable of handling a high volume of on-chip data movement for DNNs. Second, we evaluate P2P and network-on-chip (NoC) interconnect (with a regular topology such as a mesh) for SRAM- and ReRAM-based in-memory computing (IMC) architectures for a range of DNNs. This analysis shows the necessity for the optimal interconnect choice for an IMC DNN accelerator. Finally, we perform an experimental evaluation for different DNNs to empirically obtain the performance of the IMC architecture with both NoC-tree and NoC-mesh. We conclude that, at the tile level, NoC-tree is appropriate for compact DNNs employed at the edge, and NoC-mesh is necessary to accelerate DNNs with high connection density. Furthermore, we propose a technique to determine the optimal choice of interconnect for any given DNN. In this technique, we use analytical models of NoC to evaluate end-to-end communication latency of any given DNN. We demonstrate that the interconnect optimization in the IMC architecture results in up to 6× improvement in energy-delay-area product for VGG-19 inference compared to the state-of-the-art ReRAM-based IMC architectures.
34

Cai, Jingyong, Masashi Takemoto, Yuming Qiu, and Hironori Nakajo. "Trigonometric Inference Providing Learning in Deep Neural Networks." Applied Sciences 11, no. 15 (July 21, 2021): 6704. http://dx.doi.org/10.3390/app11156704.

Abstract:
Despite being heavily used in the training of deep neural networks (DNNs), multipliers are resource-intensive and insufficient in many different scenarios. Previous discoveries have revealed the superiority when activation functions, such as the sigmoid, are calculated by shift-and-add operations, although they fail to remove multiplications in training altogether. In this paper, we propose an innovative approach that can convert all multiplications in the forward and backward inferences of DNNs into shift-and-add operations. Because the model parameters and backpropagated errors of a large DNN model are typically clustered around zero, these values can be approximated by their sine values. Multiplications between the weights and error signals are transferred to multiplications of their sine values, which are replaceable with simpler operations with the help of the product to sum formula. In addition, a rectified sine activation function is utilized for further converting layer inputs into sine values. In this way, the original multiplication-intensive operations can be computed through simple add-and-shift operations. This trigonometric approximation method provides an efficient training and inference alternative for devices with insufficient hardware multipliers. Experimental results demonstrate that this method is able to obtain a performance close to that of classical training algorithms. The approach we propose sheds new light on future hardware customization research for machine learning.
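The core identity behind this scheme is the product-to-sum formula sin(a)sin(b) = (cos(a-b) - cos(a+b))/2, so a product of two near-zero quantities can be approximated by the product of their sines and then evaluated without a multiplier; the sketch below only checks the numerical approximation and does not model the shift-and-add hardware mapping, with the value ranges chosen as assumptions.

import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=10_000)   # weights clustered around zero
e = rng.normal(0.0, 0.05, size=10_000)   # backpropagated errors, also near zero

exact = w * e
# sin(w)*sin(e) rewritten with the product-to-sum identity (multiplier-free form)
approx = 0.5 * (np.cos(w - e) - np.cos(w + e))

print("max abs error:", np.max(np.abs(exact - approx)))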
35

Cao, Yuan, and Quanquan Gu. "Generalization Error Bounds of Gradient Descent for Learning Over-Parameterized Deep ReLU Networks." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 3349–56. http://dx.doi.org/10.1609/aaai.v34i04.5736.

Abstract:
Empirical studies show that gradient-based methods can learn deep neural networks (DNNs) with very good generalization performance in the over-parameterization regime, where DNNs can easily fit a random labeling of the training data. Very recently, a line of work explains in theory that with over-parameterization and proper random initialization, gradient-based methods can find the global minima of the training loss for DNNs. However, existing generalization error bounds are unable to explain the good generalization performance of over-parameterized DNNs. The major limitation of most existing generalization bounds is that they are based on uniform convergence and are independent of the training algorithm. In this work, we derive an algorithm-dependent generalization error bound for deep ReLU networks, and show that under certain assumptions on the data distribution, gradient descent (GD) with proper random initialization is able to train a sufficiently over-parameterized DNN to achieve arbitrarily small generalization error. Our work sheds light on explaining the good generalization performance of over-parameterized deep neural networks.
36

Han, Pang Ying, Liew Yee Ping, Goh Fan Ling, Ooi Shih Yin, and Khoh Wee How. "Stacked deep analytic model for human activity recognition on a UCI HAR database." F1000Research 10 (October 15, 2021): 1046. http://dx.doi.org/10.12688/f1000research.73174.1.

Abstract:
Background: Owing to low cost and ubiquity, human activity recognition using smartphones is emerging as a trendy mobile application in diverse appliances such as assisted living, healthcare monitoring, etc. Analysing this one-dimensional time-series signal is rather challenging due to its spatial and temporal variances. Numerous deep neural networks (DNNs) have been applied to unveil deep features of complex real-world data. However, the drawback of DNNs is the lack of interpretation of the network's internal logic used to achieve the output. Furthermore, a huge training sample size (i.e. millions of samples) is required to ensure great performance. Methods: In this work, a simpler yet effective stacked deep network, known as Stacked Discriminant Feature Learning (SDFL), is proposed to analyse inertial motion data for activity recognition. Contrary to DNNs, this deep model extracts rich features without the prerequisite of a gigantic training sample set and tedious hyper-parameter tuning. SDFL is a stacking deep network with multiple learning modules, appearing in a serialized layout for multi-level feature learning from shallow to deeper features. In each learning module, Rayleigh coefficient optimized learning is accomplished to extract discriminant features. A subject-independent protocol is implemented where the system model (trained by data from a group of users) is used to recognize data from another group of users. Results: Empirical results demonstrate that SDFL surpasses state-of-the-art methods, including DNNs like the Convolutional Neural Network, Deep Belief Network, etc., with ~97% accuracy on the UCI HAR database with thousands of training samples. Additionally, the model training time of SDFL is merely a few minutes, compared with DNNs, which require hours for model training. Conclusions: The supremacy of SDFL is corroborated in analysing motion data for human activity recognition, requiring no GPU but only a CPU with a fast learning rate.
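Rayleigh-coefficient-optimized learning of the kind named here typically amounts to maximizing a generalized Rayleigh quotient of between-class to within-class scatter; the sketch below solves that with a generalized eigendecomposition in the standard LDA-style formulation, which is an assumption about the module rather than the paper's exact algorithm.

import numpy as np
from scipy.linalg import eigh

def rayleigh_directions(X, y, n_dims=2):
    # Directions maximizing the Rayleigh quotient of between-class vs.
    # within-class scatter (standard LDA-style formulation).
    classes = np.unique(y)
    mean = X.mean(axis=0)
    Sw = np.zeros((X.shape[1], X.shape[1]))
    Sb = np.zeros_like(Sw)
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    # Generalized eigenproblem Sb v = lambda Sw v; take the top eigenvectors.
    vals, vecs = eigh(Sb, Sw + 1e-6 * np.eye(Sw.shape[0]))
    return vecs[:, np.argsort(vals)[::-1][:n_dims]]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(i, 1.0, size=(50, 6)) for i in range(3)])
y = np.repeat([0, 1, 2], 50)
print(rayleigh_directions(X, y).shape)  # (6, 2)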
37

Pang, Ying Han, Liew Yee Ping, Goh Fan Ling, Ooi Shih Yin, and Khoh Wee How. "Stacked deep analytic model for human activity recognition on a UCI HAR database." F1000Research 10 (April 1, 2022): 1046. http://dx.doi.org/10.12688/f1000research.73174.3.

Abstract:
Background: Owing to low cost and ubiquity, human activity recognition using smartphones is emerging as a trendy mobile application in diverse appliances such as assisted living, healthcare monitoring, etc. Analysing this one-dimensional time-series signal is rather challenging due to its spatial and temporal variances. Numerous deep neural networks (DNNs) have been applied to unveil deep features of complex real-world data. However, the drawback of DNNs is the lack of interpretation of the network's internal logic used to achieve the output. Furthermore, a huge training sample size (i.e. millions of samples) is required to ensure great performance. Methods: In this work, a simpler yet effective stacked deep network, known as Stacked Discriminant Feature Learning (SDFL), is proposed to analyse inertial motion data for activity recognition. Contrary to DNNs, this deep model extracts rich features without the prerequisite of a gigantic training sample set and tedious hyper-parameter tuning. SDFL is a stacking deep network with multiple learning modules, appearing in a serialized layout for multi-level feature learning from shallow to deeper features. In each learning module, Rayleigh coefficient optimized learning is accomplished to extract discriminant features. A subject-independent protocol is implemented where the system model (trained by data from a group of users) is used to recognize data from another group of users. Results: Empirical results demonstrate that SDFL surpasses state-of-the-art methods, including DNNs like the Convolutional Neural Network, Deep Belief Network, etc., with ~97% accuracy on the UCI HAR database with thousands of training samples. Additionally, the model training time of SDFL is merely a few minutes, compared with DNNs, which require hours for model training. Conclusions: The supremacy of SDFL is corroborated in analysing motion data for human activity recognition, requiring no GPU but only a CPU with a fast learning rate.
38

Pang, Ying Han, Liew Yee Ping, Goh Fan Ling, Ooi Shih Yin, and Khoh Wee How. "Stacked deep analytic model for human activity recognition on a UCI HAR database." F1000Research 10 (February 18, 2022): 1046. http://dx.doi.org/10.12688/f1000research.73174.2.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Background Owing to its low cost and ubiquity, smartphone-based human activity recognition is becoming a popular mobile application in areas such as assisted living and healthcare monitoring. Analysing these one-dimensional time-series signals is challenging because of their spatial and temporal variances. Numerous deep neural networks (DNNs) have been applied to uncover deep features of complex real-world data. However, DNNs offer little insight into the internal logic by which the network reaches its output, and a very large training sample size (i.e. millions of samples) is required to achieve good performance. Methods In this work, a simpler yet effective stacked deep network, known as Stacked Discriminant Feature Learning (SDFL), is proposed to analyse inertial motion data for activity recognition. Unlike typical DNNs, this deep model extracts rich features without requiring a huge training set or tedious hyper-parameter tuning. SDFL is a stacked deep network with multiple learning modules arranged in series for multi-level feature learning, from shallow to deeper features. In each learning module, Rayleigh-coefficient-optimised learning is performed to extract discriminant features. A subject-independent protocol is adopted, in which a model trained on data from one group of users is used to recognise data from a different group of users. Results Empirical results demonstrate that SDFL surpasses state-of-the-art methods, including DNNs such as the Convolutional Neural Network and Deep Belief Network, achieving ~97% accuracy on the UCI HAR database with only thousands of training samples. Additionally, SDFL trains in merely a few minutes, whereas the compared DNNs require hours of training. Conclusions The strength of SDFL in analysing motion data for human activity recognition is corroborated: it learns quickly and requires no GPU, only a CPU.
39

Grant, Lauren L., and Clarissa S. Sit. "De novo molecular drug design benchmarking." RSC Medicinal Chemistry 12, no. 8 (2021): 1273–80. http://dx.doi.org/10.1039/d1md00074h.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Deep neural networks (DNNs) used for de novo drug design have different architectures and hyperparameters that impact the final output of suggested drug candidates. Herein we review benchmarking platforms that assess the utility and validity of DNNs.
40

Trimech, Imen Hamrouni, Ahmed Maalej, and Najoua Essoukri Ben Amara. "Facial Expression Recognition Using 3D Points Aware Deep Neural Network." Traitement du Signal 38, no. 2 (April 30, 2021): 321–30. http://dx.doi.org/10.18280/ts.380209.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Point cloud-based Deep Neural Networks (DNNs) have gained increasing attention as an insightful solution in the field of geometric deep learning. Point-set-aware DNNs have proven capable of handling this unstructured data type and have been successful in 3D applications such as object classification, segmentation, and recognition. However, two major challenges remain understudied when point cloud-based DNNs are used for 3D facial expression (FE) recognition: the scarcity of large labelled 3D facial datasets, and how to obtain a discriminative point-based representation of 3D faces. To address the first issue, we enlarge the dataset by generating synthetic 3D FEs; for the second, we apply a level-curve-based sampling strategy to exploit crucial geometric information. The conducted experiments show promising results, reaching 97.23% accuracy on the enlarged BU-3DFE dataset.
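The level-curve sampling idea can be pictured with a simplified Python sketch; the radial-distance banding around a reference landmark below is an assumption standing in for the paper's actual curve extraction.

import numpy as np

def level_curve_sample(points, center, n_curves=10, pts_per_curve=50):
    """Group 3D points by radial distance from a reference landmark
    (e.g. the nose tip) and keep a fixed number of points per band.
    A simplification, not the paper's exact strategy."""
    dist = np.linalg.norm(points - center, axis=1)
    edges = np.linspace(dist.min(), dist.max(), n_curves + 1)
    sampled = []
    for i in range(n_curves):
        band = points[(dist >= edges[i]) & (dist < edges[i + 1])]
        if len(band) == 0:
            continue
        idx = np.random.choice(len(band), min(pts_per_curve, len(band)),
                               replace=False)
        sampled.append(band[idx])
    return np.concatenate(sampled, axis=0)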
41

Xue, Wanqi, Bo An, and Chai Kiat Yeo. "NSGZero: Efficiently Learning Non-exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 4 (June 28, 2022): 4646–53. http://dx.doi.org/10.1609/aaai.v36i4.20389.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
How resources are deployed to secure critical targets in networks can be modelled by Network Security Games (NSGs). While recent advances in deep learning (DL) provide a powerful approach to dealing with large-scale NSGs, DL methods such as NSG-NFSP suffer from data inefficiency. Furthermore, due to centralized control, they cannot scale to scenarios with a large number of resources. In this paper, we propose a novel DL-based method, NSGZero, to learn a non-exploitable policy in NSGs. NSGZero improves data efficiency by performing planning with neural Monte Carlo Tree Search (MCTS). Our main contributions are threefold. First, we design deep neural networks (DNNs) to perform neural MCTS in NSGs. Second, we enable neural MCTS with decentralized control, making NSGZero applicable to NSGs with many resources. Third, we provide an efficient learning paradigm to achieve joint training of the DNNs in NSGZero. Compared to state-of-the-art algorithms, our method achieves significantly better data efficiency and scalability.
42

Jónsson, Hlynur, Giovanni Cherubini, and Evangelos Eleftheriou. "Convergence Behavior of DNNs with Mutual-Information-Based Regularization." Entropy 22, no. 7 (June 30, 2020): 727. http://dx.doi.org/10.3390/e22070727.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Information theory concepts are leveraged with the goal of better understanding and improving Deep Neural Networks (DNNs). The information plane of neural networks describes the behavior, during training, of the mutual information at various depths between the input/output and hidden-layer variables. Previous analysis revealed that, in networks where finiteness of the mutual information can be established, most of the training epochs are spent on compressing the input. However, the estimation of mutual information is nontrivial for high-dimensional continuous random variables. Therefore, the computation of mutual information for DNNs and its visualization on the information plane have mostly focused on low-complexity fully connected networks. In fact, even the existence of the compression phase in complex DNNs has been questioned and is viewed as an open problem. In this paper, we present the convergence of mutual information on the information plane for a high-dimensional VGG-16 Convolutional Neural Network (CNN) by resorting to Mutual Information Neural Estimation (MINE), thus confirming and extending the results obtained with low-dimensional fully connected networks. Furthermore, we demonstrate the benefits of regularizing a network, especially for a large number of training epochs, by adopting mutual information estimates as additional terms in the network's loss function. Experimental results show that the regularization stabilizes the test accuracy and significantly reduces its variance.
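As a rough illustration of using a neural mutual-information estimate as an extra loss term, here is a hedged PyTorch sketch; the statistics-network architecture, the Donsker-Varadhan bound, and the way the estimate enters the loss are generic assumptions rather than the paper's exact setup.

import torch
import torch.nn as nn

class StatisticsNetwork(nn.Module):
    """T(x, z) network for a MINE-style mutual-information estimate
    (sizes are illustrative)."""
    def __init__(self, x_dim, z_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1))

def mine_lower_bound(T, x, z):
    # Donsker-Varadhan bound: E_joint[T] - log E_marginal[exp(T)].
    joint = T(x, z).mean()
    z_shuffled = z[torch.randperm(z.size(0))]
    log_marginal = torch.logsumexp(T(x, z_shuffled), dim=0) \
        - torch.log(torch.tensor(float(z.size(0))))
    return joint - log_marginal

# In a classifier's training loop the estimate could be added to the loss,
# e.g. loss = cross_entropy + beta * mi_estimate, with the sign and weight
# chosen according to whether compression is being encouraged.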
43

Ao, Ren, Zhang Tao, Wang Yuhao, Lin Sheng, Dong Peiyan, Chen Yen-kuang, Xie Yuan, and Wang Yanzhi. "DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 5495–502. http://dx.doi.org/10.1609/aaai.v34i04.6000.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
The rapidly growing parameter volume of deep neural networks (DNNs) hinders artificial intelligence applications on resource-constrained devices, such as mobile and wearable devices. Neural network pruning, one of the mainstream model compression techniques, is under extensive study as a way to reduce model size and thus the amount of computation, so that state-of-the-art DNNs can be deployed on such devices with high runtime energy efficiency. In contrast to irregular pruning, which incurs high index storage and decoding overhead, structured pruning techniques have been proposed as promising solutions. However, prior studies on structured pruning tackle the problem mainly from the perspective of facilitating hardware implementation, without analysing in depth the characteristics of sparse neural networks. This neglect leads to an inefficient trade-off between regularity and pruning ratio, so the potential of structurally pruning neural networks is not sufficiently mined. In this work, we examine the structural characteristics of irregularly pruned weight matrices, such as the diverse redundancy of different rows, the sensitivity of different rows to pruning, and the position characteristics of the retained weights. Leveraging the gained insights as guidance, we first propose the novel block-max weight masking (BMWM) method, which can effectively retain the salient weights while imposing high regularity on the weight matrix. As a further optimization, we propose density-adaptive regular-block (DARB) pruning, which effectively exploits the intrinsic characteristics of neural networks and thereby outperforms prior structured pruning work in both pruning ratio and decoding efficiency. Our experimental results show that DARB achieves pruning ratios of 13× to 25×, a 2.8× to 4.3× improvement over state-of-the-art counterparts on multiple neural network models and tasks. Moreover, DARB achieves 14.3× higher decoding efficiency than block pruning at a higher pruning ratio.
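A minimal sketch of the block-max masking idea (keeping only the largest-magnitude weights inside fixed-size blocks of each row) could look like the following; the block size and the keep-one-per-block policy are assumptions for illustration, not the paper's exact BMWM/DARB procedure.

import numpy as np

def block_max_mask(weights, block_size, keep_per_block=1):
    """Keep the largest-magnitude weights within each fixed-size block of a
    row, yielding a regular sparsity pattern (illustrative sketch)."""
    rows, cols = weights.shape
    assert cols % block_size == 0
    mask = np.zeros_like(weights, dtype=bool)
    for r in range(rows):
        for b in range(0, cols, block_size):
            block = np.abs(weights[r, b:b + block_size])
            top = np.argsort(block)[-keep_per_block:]   # largest-magnitude indices
            mask[r, b + top] = True
    return weights * mask, mask

W = np.random.randn(4, 16)
W_pruned, mask = block_max_mask(W, block_size=4)
print(int(mask.sum()), "weights retained out of", W.size)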
44

Elangovan, Reena, Shubham Jain, and Anand Raghunathan. "Ax-BxP: Approximate Blocked Computation for Precision-reconfigurable Deep Neural Network Acceleration." ACM Transactions on Design Automation of Electronic Systems 27, no. 3 (May 31, 2022): 1–20. http://dx.doi.org/10.1145/3492733.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Precision scaling has emerged as a popular technique to optimize the compute and storage requirements of Deep Neural Networks (DNNs). Efforts toward creating ultra-low-precision (sub-8-bit) DNNs for efficient inference suggest that the minimum precision required to achieve a given network-level accuracy varies considerably across networks, and even across layers within a network. This translates to a need to support variable-precision computation in DNN hardware. Previous proposals for precision-reconfigurable hardware, such as bit-serial architectures, incur high overheads, significantly diminishing the benefits of lower precision. We propose Ax-BxP, a method for approximate blocked computation wherein each multiply-accumulate operation is performed block-wise (a block is a group of bits), facilitating re-configurability at the granularity of blocks. Further, approximations are introduced by performing only a subset of the required block-wise computations, realizing precision re-configurability with high efficiency. We design a DNN accelerator that embodies approximate blocked computation and propose a method to determine a suitable approximation configuration for any given DNN. For the AlexNet, ResNet50, and MobileNetV2 DNNs, Ax-BxP achieves improvements in system energy and performance over an 8-bit fixed-point (FxP8) baseline, with minimal loss (<1% on average) in classification accuracy. Further, by varying the approximation configurations at a finer granularity across layers and data structures within a DNN, we achieve additional improvements in system energy and performance.
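The blocked-approximation idea can be sketched on scalar unsigned integers as below; the block width, block count, and the keep-most-significant-products policy are simplifying assumptions, not the accelerator's actual dataflow.

def split_blocks(x, block_bits, n_blocks):
    """Split an unsigned fixed-point value into bit-blocks, most significant
    block first (illustrative helper)."""
    return [(x >> (i * block_bits)) & ((1 << block_bits) - 1)
            for i in reversed(range(n_blocks))]

def approx_blocked_mul(a, b, block_bits=4, n_blocks=2, kept_products=2):
    """Form the block-wise partial products and keep only the most
    significant `kept_products` of them (a simplification of Ax-BxP)."""
    A = split_blocks(a, block_bits, n_blocks)
    B = split_blocks(b, block_bits, n_blocks)
    partials = []
    for i, ai in enumerate(A):
        for j, bj in enumerate(B):
            shift = block_bits * ((n_blocks - 1 - i) + (n_blocks - 1 - j))
            partials.append((shift, (ai * bj) << shift))
    partials.sort(key=lambda p: p[0], reverse=True)
    return sum(p for _, p in partials[:kept_products])

print(approx_blocked_mul(200, 180), "vs exact", 200 * 180)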
45

Nam, Woo-Jeoung, Shir Gur, Jaesik Choi, Lior Wolf, and Seong-Whan Lee. "Relative Attributing Propagation: Interpreting the Comparative Contributions of Individual Units in Deep Neural Networks." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 03 (April 3, 2020): 2501–8. http://dx.doi.org/10.1609/aaai.v34i03.5632.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
As Deep Neural Networks (DNNs) have demonstrated superhuman performance in a variety of fields, there is increasing interest in understanding their complex internal mechanisms. In this paper, we propose Relative Attributing Propagation (RAP), which decomposes the output predictions of DNNs from a new perspective of separating the relevant (positive) and irrelevant (negative) attributions according to the relative influence between the layers. The relevance of each neuron is identified with respect to its degree of contribution, separated into positive and negative, while preserving the conservation rule. Considering the relevance assigned to neurons in terms of relative priority, RAP allows each neuron to be assigned a bi-polar importance score concerning the output, from highly relevant to highly irrelevant. Therefore, our method makes it possible to interpret DNNs with much clearer and more attentive visualizations of the separated attributions than conventional explanation methods. To verify that the attributions propagated by RAP correctly account for each meaning, we use the following evaluation metrics: (i) outside-inside relevance ratio, (ii) segmentation mIoU, and (iii) region perturbation. In all experiments and metrics, we observe a sizable gap in comparison to the existing literature.
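For intuition about propagating positive and negative attributions through a layer, here is a generic relevance-propagation sketch for a single linear layer in Python/NumPy; it follows the usual LRP pattern of splitting contributions by sign and is not the exact RAP rule from the paper.

import numpy as np

def relprop_linear(R_out, a_in, W, eps=1e-9):
    """Redistribute output relevance R_out to the inputs of y = a_in @ W,
    handling positive and negative contributions separately
    (generic sketch, not the paper's RAP rule)."""
    z = a_in[:, :, None] * W[None, :, :]          # contribution of each input unit
    z_pos = np.clip(z, 0, None)
    z_neg = np.clip(z, None, 0)
    denom_pos = z_pos.sum(axis=1) + eps
    denom_neg = z_neg.sum(axis=1) - eps
    R_pos = (z_pos / denom_pos[:, None, :]) * R_out[:, None, :]
    R_neg = (z_neg / denom_neg[:, None, :]) * R_out[:, None, :]
    return (R_pos + R_neg).sum(axis=2)            # relevance of each input unit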
46

Hussain, Farhan, and Jechang Jeong. "Efficient Deep Neural Network for Digital Image Compression Employing Rectified Linear Neurons." Journal of Sensors 2016 (2016): 1–7. http://dx.doi.org/10.1155/2016/3184840.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
A compression technique for still digital images is proposed based on deep neural networks (DNNs) employing rectified linear units (ReLUs). We exploit the capability of DNNs to find a reasonable estimate of the underlying compression/decompression relationships. We aim for a DNN for image compression that has better generalization, reduced training time, and support for real-time operation. The use of ReLUs, which map more plausibly to biological neurons, makes training of our DNN significantly faster, shortens the encoding/decoding time, and improves its generalization ability. The introduction of ReLUs establishes efficient gradient propagation, induces sparsity in the proposed network, and is computationally efficient, making these networks suitable for real-time compression systems. Experiments performed on standard real-world images show that using ReLUs instead of logistic sigmoid units speeds up training of the DNN by converging markedly faster. The evaluation of the objective and subjective quality of the reconstructed images also shows that our DNN achieves better generalization, as most of the test images were never seen by the network during training.
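A compression network of the kind described can be pictured as a small ReLU autoencoder over flattened image patches; the patch size, layer widths, and bottleneck below are illustrative assumptions, not the paper's exact topology.

import torch
import torch.nn as nn

class ReLUPatchCompressor(nn.Module):
    """Minimal ReLU autoencoder for patch-wise image compression; the
    bottleneck width sets the compression ratio (sizes are illustrative)."""
    def __init__(self, patch_dim=64, bottleneck=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(patch_dim, 32), nn.ReLU(),
            nn.Linear(32, bottleneck), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 32), nn.ReLU(),
            nn.Linear(32, patch_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Training would minimise an MSE reconstruction loss over flattened 8x8
# patches; the encoder output serves as the compressed representation.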
47

Venkat, Anand, Tharindu Rusira, Raj Barik, Mary Hall, and Leonard Truong. "SWIRL: High-performance many-core CPU code generation for deep neural networks." International Journal of High Performance Computing Applications 33, no. 6 (August 4, 2019): 1275–89. http://dx.doi.org/10.1177/1094342019866247.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Deep neural networks (DNNs) have demonstrated effectiveness in many domains including object recognition, speech recognition, natural language processing, and health care. Typically, the computations involved in DNN training and inferencing are time-consuming and require efficient implementations. Existing frameworks such as TensorFlow, Theano, Torch, Cognitive Toolkit (CNTK), and Caffe treat Graphics Processing Units (GPUs) as the status quo devices for DNN execution, leaving Central Processing Units (CPUs) behind. Moreover, existing frameworks forgo or limit cross-layer optimization opportunities that have the potential to improve performance by significantly reducing data movement through the memory hierarchy. In this article, we describe an alternative approach called SWIRL, a compiler that provides high-performance CPU implementations for DNNs. SWIRL is built on top of LATTE, an existing domain-specific language (DSL) for DNNs. SWIRL separates the DNN specification from its schedule using predefined transformation recipes for the tensors and layers commonly found in DNNs. These recipes synergize with DSL constructs to generate high-quality fused, vectorized, and parallelized code for CPUs. On an Intel Xeon Platinum 8180M CPU, SWIRL achieves performance comparable with TensorFlow integrated with MKL-DNN: on average 1.00× of TensorFlow inference and 0.99× of TensorFlow training. It also outperforms the original LATTE compiler by 1.22× and 1.30× on average for inference and training, respectively.
48

Abraham, Lizy, Steven Davy, Muhammad Zawish, Rahul Mhapsekar, John A. Finn, and Patrick Moran. "Preliminary Classification of Selected Farmland Habitats in Ireland Using Deep Neural Networks." Sensors 22, no. 6 (March 11, 2022): 2190. http://dx.doi.org/10.3390/s22062190.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Ireland has a wide variety of farmland habitats, including arable fields, grassland, hedgerows, streams, lakes, rivers, and native woodlands. Traditional methods of habitat identification rely on field surveys, which are resource-intensive, so there is a strong need for digital methods that improve the speed and efficiency of identifying and differentiating farmland habitats. This is challenging because the habitat classes contain many subcategories with nearly indistinguishable features, and because of heterogeneity among sites within the same habitat class. Therefore, this research presents a preliminary technique for accurate farmland classification using a stacked ensemble of deep convolutional neural networks (DNNs). The proposed approach has been validated on a high-resolution dataset collected using drones. The image samples were manually labelled by domain experts before being provided to the DNNs for training. Three pre-trained DNNs, customized via transfer learning, are used as the base learners. The predicted features derived from the base learners were then used to train a DNN-based meta-learner to achieve high classification rates. We analyse the obtained results in terms of convergence rate, confusion matrices, and ROC curves. This is preliminary work, and further research is needed to establish a standard technique.
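A hedged sketch of the stacked-ensemble pattern (pre-trained base CNNs whose predictions feed a small meta-learner) is given below; the ResNet-18 backbone, the torchvision >= 0.13 weights API, and the layer sizes are assumptions, since the paper's exact backbones are not restated here.

import torch
import torch.nn as nn
from torchvision import models

def make_base_learner(n_classes):
    """One transfer-learned base learner (the backbone choice is an assumption)."""
    m = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in m.parameters():
        p.requires_grad = False                     # freeze pre-trained features
    m.fc = nn.Linear(m.fc.in_features, n_classes)   # new classification head
    return m

class MetaLearner(nn.Module):
    """Small DNN mapping concatenated base-learner outputs to the final
    habitat class (sizes are illustrative)."""
    def __init__(self, n_bases, n_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bases * n_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes))

    def forward(self, base_outputs):                # list of (batch, n_classes) tensors
        return self.net(torch.cat(base_outputs, dim=1))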
49

Zhang, Xingwei, Xiaolong Zheng, and Wenji Mao. "Adversarial Perturbation Defense on Deep Neural Networks." ACM Computing Surveys 54, no. 8 (November 30, 2022): 1–36. http://dx.doi.org/10.1145/3465397.

Full text of the source
Styles: APA, Harvard, Vancouver, ISO, etc.
Abstract:
Deep neural networks (DNNs) have been shown to be easily attacked by well-designed adversarial perturbations. Image objects with small perturbations that are imperceptible to human eyes can induce DNN-based image classifiers to make erroneous predictions with high probability. Adversarial perturbations can also fool real-world machine learning systems and transfer between different architectures and datasets. Recently, defense methods against adversarial perturbations have become a hot topic and have attracted much attention. A large number of works have been put forward to defend against adversarial perturbations, enhance DNN robustness against potential attacks, or interpret the origin of adversarial perturbations. In this article, we provide a comprehensive survey of classical and state-of-the-art defense methods by illuminating their main concepts, in-depth algorithms, and fundamental hypotheses regarding the origin of adversarial perturbations. In addition, we discuss potential directions of this domain for future researchers.
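As one representative defense covered by such surveys, adversarial training against FGSM perturbations can be sketched as follows; the epsilon value and the 50/50 clean/adversarial mix are illustrative assumptions, not the survey's prescription.

import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps):
    """Fast Gradient Sign Method: perturb the input in the direction that
    increases the classification loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255):
    """One adversarial-training step mixing clean and perturbed batches
    (a common defense variant; the mix ratio is an assumption)."""
    x_adv = fgsm_perturb(model, x, y, eps)
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) \
         + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()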

To the bibliography