Follow this link to see other types of publications on the topic: Neural state-space models.

Journal articles on the topic "Neural state-space models"

Cite a source in APA, MLA, Chicago, Harvard, and many other citation styles

Choose the source type:

Consult the top 50 journal articles for your research on the topic "Neural state-space models".

Next to every source in the reference list there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scientific publication in .pdf format and read the abstract (summary) of the work online, if it is available in the metadata.

Browse journal articles from many scientific fields and compile a correct bibliography.

1

Korbicz, Józef, Marcin Mrugalski, and Thomas Parisini. "DESIGNING STATE-SPACE MODELS WITH NEURAL NETWORKS". IFAC Proceedings Volumes 35, no. 1 (2002): 459–64. http://dx.doi.org/10.3182/20020721-6-es-1901.01630.

Full text
2

Schüssler, Max. "Machine learning with nonlinear state space models". at - Automatisierungstechnik 70, no. 11 (October 27, 2022): 1027–28. http://dx.doi.org/10.1515/auto-2022-0089.

Full text
Abstract:
In this dissertation, a novel class of model structures and associated training algorithms for building data-driven nonlinear state space models is developed. The new identification procedure with the resulting model is called local model state space network (LMSSN). Furthermore, recurrent neural networks (RNNs) and their similarities to nonlinear state space models are elaborated on. The overall outstanding performance of the LMSSN is demonstrated on various applications.
3

He, Mingjian, Proloy Das, Gladia Hotan, and Patrick L. Purdon. "Switching state-space modeling of neural signal dynamics". PLOS Computational Biology 19, no. 8 (August 28, 2023): e1011395. http://dx.doi.org/10.1371/journal.pcbi.1011395.

Full text
Abstract:
Linear parametric state-space models are a ubiquitous tool for analyzing neural time series data, providing a way to characterize the underlying brain dynamics with much greater statistical efficiency than non-parametric data analysis approaches. However, neural time series data are frequently time-varying, exhibiting rapid changes in dynamics, with transient activity that is often the key feature of interest in the data. Stationary methods can be adapted to time-varying scenarios by employing fixed-duration windows under an assumption of quasi-stationarity. But time-varying dynamics can be explicitly modeled by switching state-space models, i.e., by using a pool of state-space models with different dynamics selected by a probabilistic switching process. Unfortunately, exact solutions for state inference and parameter learning with switching state-space models are intractable. Here we revisit a switching state-space model inference approach first proposed by Ghahramani and Hinton. We provide explicit derivations for solving the inference problem iteratively after applying variational approximation on the joint posterior of the hidden states and the switching process. We introduce a novel initialization procedure using an efficient leave-one-out strategy to compare among candidate models, which significantly improves performance compared to the existing method that relies on deterministic annealing. We then utilize this state-inference solution within a generalized expectation-maximization algorithm to estimate model parameters of the switching process and the linear state-space models with dynamics potentially shared among candidate models. We perform extensive simulations under different settings to benchmark performance against existing switching inference methods and further validate the robustness of our switching inference solution outside the generative switching model class. 
Finally, we demonstrate the utility of our method for sleep spindle detection in real recordings, showing how switching state-space models can be used to detect and extract transient spindles from human sleep electroencephalograms in an unsupervised manner.
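As a concrete illustration of the model class surveyed in the abstract above, a switching linear-Gaussian state-space model can be simulated in a few lines. This is a minimal sketch, not the authors' code; the two candidate dynamics, the Markov switch probabilities, and the noise variances are all illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two candidate linear-Gaussian state-space models (scalar state for brevity).
A = [0.99, 0.5]                 # per-model state transition coefficients
Q, R = 0.01, 0.1                # process and observation noise variances
P = np.array([[0.95, 0.05],     # Markov transition matrix of the switch
              [0.05, 0.95]])

T = 200
s = np.zeros(T, dtype=int)      # discrete switch state
x = np.zeros(T)                 # hidden continuous state
y = np.zeros(T)                 # observations
for t in range(1, T):
    s[t] = rng.choice(2, p=P[s[t - 1]])                    # switching process
    x[t] = A[s[t]] * x[t - 1] + rng.normal(0, np.sqrt(Q))  # selected dynamics
    y[t] = x[t] + rng.normal(0, np.sqrt(R))                # noisy observation
```

Inference for this generative model (estimating s, x, and the parameters from y alone) is the intractable part that the paper addresses with a variational approximation.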
4

Forgione, Marco, and Dario Piga. "Neural State-Space Models: Empirical Evaluation of Uncertainty Quantification". IFAC-PapersOnLine 56, no. 2 (2023): 4082–87. http://dx.doi.org/10.1016/j.ifacol.2023.10.1736.

Full text
5

Raol, J. R. "Parameter estimation of state space models by recurrent neural networks". IEE Proceedings - Control Theory and Applications 142, no. 2 (March 1, 1995): 114–18. http://dx.doi.org/10.1049/ip-cta:19951733.

Full text
6

Bendtsen, J. D., and K. Trangbaek. "Robust quasi-LPV control based on neural state-space models". IEEE Transactions on Neural Networks 13, no. 2 (March 2002): 355–68. http://dx.doi.org/10.1109/72.991421.

Full text
7

Paninski, Liam, Yashar Ahmadian, Daniel Gil Ferreira, Shinsuke Koyama, Kamiar Rahnama Rad, Michael Vidne, Joshua Vogelstein, and Wei Wu. "A new look at state-space models for neural data". Journal of Computational Neuroscience 29, no. 1-2 (August 1, 2009): 107–26. http://dx.doi.org/10.1007/s10827-009-0179-x.

Full text
8

Ghahramani, Zoubin, and Geoffrey E. Hinton. "Variational Learning for Switching State-Space Models". Neural Computation 12, no. 4 (April 1, 2000): 831–64. http://dx.doi.org/10.1162/089976600300015619.

Full text
Abstract:
We introduce a new statistical model for time series that iteratively segments data into regimes with approximately linear dynamics and learns the parameters of each of these linear regimes. This model combines and generalizes two of the most widely used stochastic time-series models—hidden Markov models and linear dynamical systems—and is closely related to models that are widely used in the control and econometrics literatures. It can also be derived by extending the mixture of experts neural network (Jacobs, Jordan, Nowlan, & Hinton, 1991) to its fully dynamical version, in which both expert and gating networks are recurrent. Inferring the posterior probabilities of the hidden states of this model is computationally intractable, and therefore the exact expectation maximization (EM) algorithm cannot be applied. However, we present a variational approximation that maximizes a lower bound on the log-likelihood and makes use of both the forward and backward recursions for hidden Markov models and the Kalman filter recursions for linear dynamical systems. We tested the algorithm on artificial data sets and a natural data set of respiration force from a patient with sleep apnea. The results suggest that variational approximations are a viable method for inference and learning in switching state-space models.
9

Aghaee, Mohammad, Stephane Krau, Melih Tamer, and Hector Budman. "Graph Neural Network Representation of State Space Models of Metabolic Pathways". IFAC-PapersOnLine 58, no. 14 (2024): 464–69. http://dx.doi.org/10.1016/j.ifacol.2024.08.380.

Full text
10

Mangion, Andrew Zammit, Ke Yuan, Visakan Kadirkamanathan, Mahesan Niranjan, and Guido Sanguinetti. "Online Variational Inference for State-Space Models with Point-Process Observations". Neural Computation 23, no. 8 (August 2011): 1967–99. http://dx.doi.org/10.1162/neco_a_00156.

Full text
Abstract:
We present a variational Bayesian (VB) approach for the state and parameter inference of a state-space model with point-process observations, a physiologically plausible model for signal processing of spike data. We also give the derivation of a variational smoother, as well as an efficient online filtering algorithm, which can also be used to track changes in physiological parameters. The methods are assessed on simulated data, and results are compared to expectation-maximization, as well as Monte Carlo estimation techniques, in order to evaluate the accuracy of the proposed approach. The VB filter is further assessed on a data set of taste-response neural cells, showing that the proposed approach can effectively capture dynamical changes in neural responses in real time.
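The model family in the abstract above admits a compact generative sketch: a latent AR(1) state modulates a log-linear conditional intensity, and spike counts are Poisson within each time bin. All parameter values below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

T, dt = 500, 0.01        # number of bins and bin width in seconds (illustrative)
a, q = 0.98, 0.02        # AR(1) coefficient and process noise variance
mu, beta = 2.0, 1.0      # baseline log-intensity and state gain

x = np.zeros(T)                    # latent state
spikes = np.zeros(T, dtype=int)    # point-process (spike count) observations
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(0, np.sqrt(q))
    lam = np.exp(mu + beta * x[t])      # conditional intensity
    spikes[t] = rng.poisson(lam * dt)   # Poisson count in this bin
```

Recovering x and the parameters from the spike counts alone is the filtering and smoothing problem the paper solves with variational Bayes.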
11

Li, Jiahao, Yang Lu, Yuan Xie, and Yanyun Qu. "MaskViM: Domain Generalized Semantic Segmentation with State Space Models". Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 5 (April 11, 2025): 4752–60. https://doi.org/10.1609/aaai.v39i5.32502.

Full text
Abstract:
Domain Generalized Semantic Segmentation (DGSS) aims to use segmentation models trained on known source domains to make predictions on unknown target domains. Currently, there are two network architectures: one based on Convolutional Neural Networks (CNNs) and the other based on Visual Transformers (ViTs). However, both CNN-based and ViT-based DGSS methods face challenges: the former lacks a global receptive field, while the latter has higher computational demands. Drawing inspiration from State Space Models (SSMs), which not only possess a global receptive field but also maintain linear complexity, we propose an SSM-based method for achieving DGSS. In this work, we first elucidate why masking makes sense in SSM-based DGSS and propose our mask learning mechanism. Leveraging this mechanism, we present our Mask Vision Mamba network (MaskViM), a model for SSM-based DGSS, and design a mask loss to optimize MaskViM. Our method achieves superior performance on four diverse DGSS settings, which demonstrates its effectiveness.
12

Timm, Luís Carlos, Daniel Takata Gomes, Emanuel Pimentel Barbosa, Klaus Reichardt, Manoel Dornelas de Souza, and José Flávio Dynia. "Neural network and state-space models for studying relationships among soil properties". Scientia Agricola 63, no. 4 (August 2006): 386–95. http://dx.doi.org/10.1590/s0103-90162006000400010.

Full text
Abstract:
The study of soil property relationships is of great importance in agronomy aiming for a rational management of environmental resources and an improvement of agricultural productivity. Studies of this kind are traditionally performed using static regression models, which do not take into account the involved spatial structure. This work has the objective of evaluating the relation between a time-consuming and "expensive" variable (like soil total nitrogen) and other simple, easier to measure variables (as for instance, soil organic carbon, pH, etc.). Two important classes of models (linear state-space and neural networks) are used for prediction and compared with standard uni- and multivariate regression models, used as reference. For an oat crop cultivated area, situated in Jaguariuna, SP, Brazil (22º41' S, 47º00' W) soil samples of a Typic Haplustox were collected from the plow layer at points spaced 2 m apart along a 194 m spatial transect. Recurrent neural networks and standard state-space models had a better predictive performance of soil total nitrogen as compared to the standard regression models. Among the standard regression models the Vector Auto-Regression model had a better predictive performance for soil total nitrogen.
13

Bao, Yajie, Javad Mohammadpour Velni, Aditya Basina, and Mahdi Shahbakhti. "Identification of State-space Linear Parameter-varying Models Using Artificial Neural Networks". IFAC-PapersOnLine 53, no. 2 (2020): 5286–91. http://dx.doi.org/10.1016/j.ifacol.2020.12.1209.

Full text
14

Chakrabarty, Ankush, Gordon Wichern, and Christopher R. Laughman. "Meta-Learning of Neural State-Space Models Using Data From Similar Systems". IFAC-PapersOnLine 56, no. 2 (2023): 1490–95. http://dx.doi.org/10.1016/j.ifacol.2023.10.1843.

Full text
15

Mentzer, Katherine L., and J. Luc Peterson. "Neural network surrogate models for equations of state". Physics of Plasmas 30, no. 3 (March 2023): 032704. http://dx.doi.org/10.1063/5.0126708.

Full text
Abstract:
Equation of state (EOS) data provide necessary information for accurate multiphysics modeling, which is necessary for fields such as inertial confinement fusion. Here, we suggest a neural network surrogate model of energy and entropy and use thermodynamic relationships to derive other necessary thermodynamic EOS quantities. We incorporate phase information into the model by training a phase classifier and using phase-specific regression models, which improves the model prediction accuracy. Our model predicts energy values to 1% relative error and entropy to 3.5% relative error in a log-transformed space. Although sound speed predictions require further improvement, the derived pressure values are accurate within 10% relative error. Our results suggest that neural network models can effectively model EOS for inertial confinement fusion simulation applications.
16

Shen, Shuaijie, Chao Wang, Renzhuo Huang, Yan Zhong, Qinghai Guo, Zhichao Lu, Jianguo Zhang, and Luziwei Leng. "SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models". Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 19 (April 11, 2025): 20380–88. https://doi.org/10.1609/aaai.v39i19.34245.

Full text
Abstract:
Known as low-energy-consumption networks, spiking neural networks (SNNs) have gained much attention over the past decades. While SNNs are increasingly competitive with artificial neural networks (ANNs) for vision tasks, they are rarely used for long sequence tasks despite their intrinsic temporal dynamics. In this work, we develop spiking state space models (SpikingSSMs) for long sequence learning by leveraging the sequence learning abilities of state space models (SSMs). Inspired by dendritic neuron structure, we hierarchically integrate neuronal dynamics with the original SSM block while realizing sparse synaptic computation. Furthermore, to resolve the conflict between event-driven neuronal dynamics and parallel computing, we propose a lightweight surrogate dynamic network that accurately predicts the after-reset membrane potential and is compatible with learnable thresholds, enabling orders-of-magnitude acceleration in training speed compared with conventional iterative methods. On the Long Range Arena benchmark, SpikingSSM achieves performance competitive with state-of-the-art SSMs while realizing 90% network sparsity on average. On language modeling, our network significantly surpasses existing spiking large language models (spikingLLMs) on the WikiText-103 dataset with only a third of the model size, demonstrating its potential as a backbone architecture for low-computation-cost LLMs.
17

Malik, Wasim Q., Leigh R. Hochberg, John P. Donoghue, and Emery N. Brown. "Modulation Depth Estimation and Variable Selection in State-Space Models for Neural Interfaces". IEEE Transactions on Biomedical Engineering 62, no. 2 (February 2015): 570–81. http://dx.doi.org/10.1109/tbme.2014.2360393.

Full text
18

SUYKENS, JOHAN A. K., BART L. R. DE MOOR, and JOOS VANDEWALLE. "Nonlinear system identification using neural state space models, applicable to robust control design". International Journal of Control 62, no. 1 (July 1995): 129–52. http://dx.doi.org/10.1080/00207179508921536.

Full text
19

Bendtsen, Jan Dimon, and Jakob Stoustrup. "Gain Scheduling Control of Nonlinear Systems Based on Neural State Space Models". IFAC Proceedings Volumes 36, no. 11 (June 2003): 573–78. http://dx.doi.org/10.1016/s1474-6670(17)35725-7.

Full text
20

Cox, Benjamin, Santiago Segarra, and Víctor Elvira. "Learning state and proposal dynamics in state-space models using differentiable particle filters and neural networks". Signal Processing 234 (September 2025): 109998. https://doi.org/10.1016/j.sigpro.2025.109998.

Full text
21

Bonatti, Colin, and Dirk Mohr. "One for all: Universal material model based on minimal state-space neural networks". Science Advances 7, no. 26 (June 2021): eabf3658. http://dx.doi.org/10.1126/sciadv.abf3658.

Full text
Abstract:
Computational models describing the mechanical behavior of materials are indispensable when optimizing the stiffness and strength of structures. The use of state-of-the-art models is often limited in engineering practice due to their mathematical complexity, with each material class requiring its own distinct formulation. Here, we develop a recurrent neural network framework for material modeling by introducing “Minimal State Cells.” The framework is successfully applied to datasets representing four distinct classes of materials. It reproduces the three-dimensional stress-strain responses for arbitrary loading paths accurately and replicates the state space of conventional models. The final result is a universal model that is flexible enough to capture the mechanical behavior of any engineering material while providing an interpretable representation of their state.
22

Wang, Zhiyuan, Xovee Xu, Goce Trajcevski, Kunpeng Zhang, Ting Zhong, and Fan Zhou. "PrEF: Probabilistic Electricity Forecasting via Copula-Augmented State Space Model". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 11 (June 28, 2022): 12200–12207. http://dx.doi.org/10.1609/aaai.v36i11.21480.

Full text
Abstract:
Electricity forecasting has important implications for key decisions in modern electricity systems, spanning power generation, transmission, distribution, and beyond. In the literature, traditional statistical approaches, machine-learning methods, and deep learning models (e.g., recurrent neural networks) are utilized to model the trends and patterns in electricity time-series data. However, they are restricted either by their deterministic forms or by independence in their probabilistic assumptions, thereby neglecting the uncertainty in, or the significant correlations between, distributions of electricity data. Ignoring these, in turn, may yield error accumulation, especially when relying on historical data and aiming at multi-step prediction. To overcome this, we propose a novel method named Probabilistic Electricity Forecasting (PrEF), which builds a non-linear neural state space model (SSM) and incorporates a copula-augmented mechanism into it, allowing it to learn uncertainty-dependency knowledge and understand interactive relationships between various factors from large-scale electricity time-series data. Our method distinguishes itself from existing models by its traceable inference procedure and its capability of providing high-quality probabilistic distribution predictions. Extensive experiments on two real-world electricity datasets demonstrate that our method consistently outperforms the alternatives.
23

CANELON, JOSE I., LEANG S. SHIEH, SHU M. GUO, and HEIDAR A. MALKI. "NEURAL NETWORK-BASED DIGITAL REDESIGN APPROACH FOR CONTROL OF UNKNOWN CONTINUOUS-TIME CHAOTIC SYSTEMS". International Journal of Bifurcation and Chaos 15, no. 08 (August 2005): 2433–55. http://dx.doi.org/10.1142/s021812740501340x.

Full text
Abstract:
This paper presents a neural network-based digital redesign approach for digital control of continuous-time chaotic systems with unknown structures and parameters. Important features of the method are that: (i) it generalizes the existing optimal linearization approach for the class of state-space models which are nonlinear in the state but linear in the input, to models which are nonlinear in both the state and the input; (ii) it develops a neural network-based universal optimal linear state-space model for unknown chaotic systems; (iii) it develops an anti-digital redesign approach for indirectly estimating an analog control law from a fast-rate digital control law without utilizing the analog models. The estimated analog control law is then converted to a slow-rate digital control law via the prediction-based digital redesign method; (iv) it develops a linear time-varying piecewise-constant low-gain tracker which can be implemented using microprocessors. Illustrative examples are presented to demonstrate the effectiveness of the proposed methodology.
24

Ruciński, Dariusz. "Artificial Neural Network based on mathematical models used in quantum computing". Studia Informatica. System and information technology 27, no. 2 (January 11, 2023): 27–48. http://dx.doi.org/10.34739/si.2022.27.02.

Full text
Abstract:
The article proposes a new approach to building a neural model of the Day-Ahead Market (DAM) system operating at TGE S.A. The motivation for the proposed method is the attempt to find a better model for the DAM system. The proposed methodology relies on mathematical models used in quantum computing. All calculations involved in training the Artificial Neural Network are based on operations described in Hilbert space. The main idea is to map the data from the decimal system to quantum states in Hilbert space and to carry out the training of the neural model of the DAM system in a special manner that relies on teaching the model for each position of the quantum register over all the data. The obtained results were compared to a "classical" neural model by means of a comparative model.
25

Xie, Yusen, and Yingjie Mi. "Optimizing inverted pendulum control: Integrating neural network adaptability". Applied and Computational Engineering 101, no. 1 (November 8, 2024): 213–23. http://dx.doi.org/10.54254/2755-2721/101/20241008.

Full text
Abstract:
This study explores the implementation and efficacy of a neural network controller for an inverted pendulum system, contrasting it with traditional state feedback control. Initially, state feedback control exhibited limitations in managing complex system dynamics. Subsequently, a neural network controller was developed, trained using datasets from both uncontrolled and refined state space models. The refined model yielded lower training loss and superior control performance. This research demonstrates the neural network controller's enhanced adaptability and precision, offering significant improvements over traditional methods in controlling dynamic systems like inverted pendulums.
26

Rashid, Mustafa, and Prashant Mhaskar. "Are Neural Networks the Right Tool for Process Modeling and Control of Batch and Batch-like Processes?" Processes 11, no. 3 (February 24, 2023): 686. http://dx.doi.org/10.3390/pr11030686.

Full text
Abstract:
The prevalence of batch and batch-like operations, in conjunction with the continued resurgence of artificial intelligence techniques for clustering and classification applications, has increasingly motivated the exploration of the applicability of deep learning for modeling and feedback control of batch and batch-like processes. To this end, the present study seeks to evaluate the viability of artificial intelligence in general, and neural networks in particular, toward process modeling and control via a case study. Nonlinear autoregressive with exogenous input (NARX) networks are evaluated in comparison with subspace models within the framework of model-based control. A batch polymethyl methacrylate (PMMA) polymerization process is chosen as a simulation test-bed. Subspace-based state-space models and NARX networks identified for the process are first compared for their predictive power. The identified models are then implemented in model predictive control (MPC) to compare the control performance for both modeling approaches. The comparative analysis reveals that the state-space models performed better than NARX networks in predictive power and control performance. Moreover, the NARX networks were found to be less versatile than state-space models in adapting to new process operation. The results of the study indicate that further research is needed before neural networks may become readily applicable for the feedback control of batch processes.
27

Faramarzi, Mojtaba, Mohammad Amini, Akilesh Badrinaaraayanan, Vikas Verma, and Sarath Chandar. "PatchUp: A Feature-Space Block-Level Regularization Technique for Convolutional Neural Networks". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 589–97. http://dx.doi.org/10.1609/aaai.v36i1.19938.

Full text
Abstract:
Large capacity deep learning models are often prone to a high generalization gap when trained with a limited amount of labeled training data. A recent class of methods to address this problem uses various ways to construct a new training sample by mixing a pair (or more) of training samples. We propose PatchUp, a hidden state block-level regularization technique for Convolutional Neural Networks (CNNs), that is applied on selected contiguous blocks of feature maps from a random pair of samples. Our approach improves the robustness of CNN models against the manifold intrusion problem that may occur in other state-of-the-art mixing approaches. Moreover, since we are mixing the contiguous block of features in the hidden space, which has more dimensions than the input space, we obtain more diverse samples for training towards different dimensions. Our experiments on CIFAR10/100, SVHN, Tiny-ImageNet, and ImageNet using ResNet architectures including PreActResnet18/34, WRN-28-10, ResNet101/152 models show that PatchUp improves upon, or equals, the performance of current state-of-the-art regularizers for CNNs. We also show that PatchUp can provide a better generalization to deformed samples and is more robust against adversarial attacks.
28

Dreyfus, Gérard, and Yizhak Idan. "The Canonical Form of Nonlinear Discrete-Time Models". Neural Computation 10, no. 1 (January 1, 1998): 133–64. http://dx.doi.org/10.1162/089976698300017926.

Full text
Abstract:
Discrete-time models of complex nonlinear processes, whether physical, biological, or economical, are usually under the form of systems of coupled difference equations. In analyzing such systems, one of the first tasks is to find a state-space description of the process—that is, a set of state variables and the associated state equations. We present a methodology for finding a set of state variables and a canonical representation of a class of systems described by a set of recurrent discrete-time, time-invariant equations. In the field of neural networks, this is of special importance since the application of standard training algorithms requires the network to be in a canonical form. Several illustrative examples are presented.
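As an elementary example of the canonical representation discussed in the abstract above: a second-order recurrent equation y_t = f(y_{t-1}, y_{t-2}, u_{t-1}) becomes a state-space model by stacking delayed outputs into a state vector:

```latex
x_t = \begin{pmatrix} y_t \\ y_{t-1} \end{pmatrix},
\qquad
x_{t+1} = \begin{pmatrix} f(x_{t,1},\, x_{t,2},\, u_t) \\ x_{t,1} \end{pmatrix},
\qquad
y_t = \begin{pmatrix} 1 & 0 \end{pmatrix} x_t .
```

Here the first state component reproduces the output and the second is its one-step delay; this is the kind of state-variable choice that the paper systematizes for general coupled difference equations.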
29

Wang, RuiXue, Kaikang Chen, Bo Zhao, Liming Zhou, Licheng Zhu, Chengxu Lv, Zhenhao Han, Kunlei Lu, Xuguang Feng, and Siyuan Zhao. "Construction of Full-Space State Model and Prediction of Plant Growth Information". Journal of the ASABE 68, no. 2 (2025): 133–46. https://doi.org/10.13031/ja.16165.

Full text
Abstract:
Highlights: This research proposes a model based on DTs and BPNN and accurately predicts the growth indexes and state of lettuce.
Abstract: This research proposed a full-space state prediction model based on Digital Twins (DTs) for intelligent prediction and optimization control of environmental parameters and crop growth in plant factories. Compared with traditional prediction models, this model significantly improved production efficiency and resource utilization in plant factories by dynamically adjusting environmental control strategies through real-time data collection and feedback. The model employed a Back Propagation Neural Network (BPNN) for accurate prediction of crop growth indexes, with experimental results showing a Root Mean Squared Error (RMSE) of 0.868 and a Mean Absolute Error (MAE) of 0.625 on the test dataset, indicating high prediction accuracy. The innovative aspect of this model lies in its integration of DTs technology, enabling full-cycle monitoring and intelligent regulation of the crop growth process and addressing the limitations of existing models in dynamic feedback and real-time adjustment capabilities. Future extensive validation and optimization of the model across different crop types and environmental conditions will further enhance its potential for application in plant factory management.
Keywords: Back propagation neural network, Digital twins technology, Lettuce, Plant factory, State prediction.
30

John, Dr Jogi, Babita Prasad, Bhushan Murkute, Manav Patil, Aditya Agrawal, and Uday Shahu. "Battery Lifespan Prediction Using Machine Learning and NASA Aging Dataset". INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 04 (April 3, 2025): 1–9. https://doi.org/10.55041/ijsrem43585.

Full text
Abstract:
NASA Battery RLU 16.5 plays a crucial role in powering space missions, ensuring reliability and longevity under extreme conditions. Accurate estimation and control of its State of Health (SOH) are essential for maintaining its performance, particularly in the harsh and unpredictable environment of space. This review paper explores the latest advancements in SOH estimation for lithium-ion batteries, focusing on methods applicable to NASA Battery RLU 16.5. Key methods discussed include machine learning models such as Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNN), and hybrid deep learning models, which have shown promising results in accurately predicting SOH and Remaining Useful Life (RUL). Additionally, optimization techniques like ant lion optimization combined with support vector regression and incremental capacity analysis offer high precision in SOH predictions. Temperature-based SOH estimation and the integration of electrochemical models also emerge as essential methods for improving accuracy. Despite the significant progress in SOH estimation, challenges such as the unpredictability of space conditions remain, necessitating further research in hybrid modeling approaches. This paper provides a comprehensive overview of the state-of-the-art SOH estimation techniques and highlights the challenges and future directions in managing NASA’s lithium-ion batteries for long-term missions. Keywords— Lithium-ion battery (LIB), Remaining Useful Life (RUL), Machine learning algorithms, Neural networks, LSTM (Long Short-Term Memory), CNN (Convolutional Neural Network), Battery degradation modeling, Hybrid neural networks, Optimization techniques, Space mission battery management
31

Wang, Niannian, Weiyi Du, Hongjin Liu, Kuankuan Zhang, Yongbin Li, Yanquan He, and Zejun Han. "Fine-Grained Leakage Detection for Water Supply Pipelines Based on CNN and Selective State-Space Models". Water 17, no. 8 (April 9, 2025): 1115. https://doi.org/10.3390/w17081115.

Testo completo
Abstract (sommario):
The water supply pipeline system is responsible for providing clean drinking water to residents, but pipeline leaks can lead to water resource wastage, increased operational costs, and safety hazards. To effectively detect the leakage level in water supply pipelines and address the difficulty of accurately distinguishing fine-grained leakage levels with traditional methods, this paper proposes a fine-grained leakage identification method based on Convolutional Neural Networks (CNN) and the Selective State Space Model (Mamba). An experimental platform was built to simulate different leakage conditions, and multi-axis sensors were used to collect data, resulting in a high-quality dataset. The signals were converted into frequency-domain images using the Short-Time Fourier Transform (STFT), and a CNN was employed to extract image features. Mamba was integrated to capture the one-dimensional temporal dynamics of the leakage signal, and the CosFace loss function was introduced to increase the inter-class distance, thereby improving fine-grained classification ability. Experimental results show that the proposed method achieves the best performance across all evaluation metrics. Compared to Support Vector Machine (SVM), Backpropagation neural network (BP), attention mechanism with the LSTM network (LSTM-AM), CNN, and inverted transformers network (iTransformer) methods, the accuracy improved by 17.9%, 15.9%, 7.8%, 3.0%, and 2.3%, respectively. Additionally, the method enhanced intra-class consistency and increased inter-class differences, performing well at all leakage levels, which could contribute to improved intelligent management of water pipeline leakage detection.
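The STFT preprocessing step described above (1-D sensor signal to time-frequency image for a CNN) can be sketched in a few lines. Window size and hop length here are illustrative choices, not the paper's values:

```python
import numpy as np

# Minimal STFT sketch: frame the signal, apply a Hann window, and take the
# magnitude of the real FFT of each frame, yielding a (frames, freq_bins)
# spectrogram image suitable as CNN input. Parameters are assumptions.

def stft_magnitude(signal, win=64, hop=32):
    window = np.hanning(win)
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = signal[start:start + win] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)

# Synthetic "leak" signal: a 50 Hz tone plus noise, sampled at 1 kHz
rng = np.random.default_rng(0)
t = np.arange(2000) / 1000.0
x = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(t.size)

spec = stft_magnitude(x)
print(spec.shape)  # (61, 33): 61 frames, win//2 + 1 frequency bins
```

With a 1 kHz sampling rate and 64-sample windows, each bin spans 15.625 Hz, so the 50 Hz tone shows up as a bright stripe near bin 3.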
Gli stili APA, Harvard, Vancouver, ISO e altri
32

Meng, Wenjie, Aiming Mu e Huajun Wang. "Efficient UNet fusion of convolutional neural networks and state space models for medical image segmentation". Digital Signal Processing 158 (marzo 2025): 104937. https://doi.org/10.1016/j.dsp.2024.104937.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
33

Kotta, Ü., F. N. Chowdhury e S. Nõmm. "On realizability of neural networks-based input–output models in the classical state-space form". Automatica 42, n. 7 (luglio 2006): 1211–16. http://dx.doi.org/10.1016/j.automatica.2006.03.003.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
34

Krikelis, Konstantinos, Jin-Song Pei, Koos van Berkel e Maarten Schoukens. "Identification of structured nonlinear state–space models for hysteretic systems using neural network hysteresis operators". Measurement 224 (gennaio 2024): 113966. http://dx.doi.org/10.1016/j.measurement.2023.113966.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
35

Pang, Shuwei, Haoyuan Lu, Qiuhong Li e Ziyu Gu. "An Improved Onboard Adaptive Aero-Engine Model Based on an Enhanced Neural Network and Linear Parameter Variance for Parameter Prediction". Energies 17, n. 12 (12 giugno 2024): 2888. http://dx.doi.org/10.3390/en17122888.

Testo completo
Abstract (sommario):
Achieving measurable and unmeasurable parameter prediction is the key process in model-based control, for which an accurate onboard model is the most important part. However, neither nonlinear models like component level models or LPV models, nor linear models like state–space models can fully meet the requirements. Hence, an original ENN-LPV linearization strategy is proposed to achieve the online modelling of the state–space model. A special network structure that has the same format as the state–space model’s calculation was applied to establish the state–space model. Importantly, the network’s modelling ability was improved through applying multiple activation functions in the single hidden layer and an experience pool that records data of past sampling instants, which strengthens the ability to capture the engine’s strongly nonlinear dynamics. Furthermore, an adaptive model, consisting of a component-level model with adaptive factors, a linear Kalman filter, a predictive model, an experience pool, and two ENN-LPV networks, was developed using the proposed linearization strategy as the core process to continuously update the Kalman filter and the predictive model. Simulations showed that the state space model built using the ENN-LPV linearization strategy had a better model identification ability in comparison with the model built using the OSELM-LPV linearization strategy, and the maximum output error between the ENN-LPV model and the simulated engine was 0.1774%. In addition, based on the ENN-LPV linearization strategy, the adaptive model was able to make accurate predictions of unmeasurable performance parameters such as thrust and high-pressure turbine inlet temperature, with a maximum prediction error within 0.5%. Thus, the effectiveness and the advantages of the proposed method are demonstrated.
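The discrete state-space form that the ENN-LPV strategy identifies online, x[k+1] = A x[k] + B u[k], y[k] = C x[k] + D u[k], can be sketched directly. The matrices below are illustrative placeholders; in the paper they are regenerated at each sampling instant from engine data:

```python
import numpy as np

# Sketch of rolling a discrete linear state-space model forward. The
# matrices are made-up stable examples, not an identified engine model.

def simulate(A, B, C, D, x0, inputs):
    """Simulate y[k] = C x[k] + D u[k] with x[k+1] = A x[k] + B u[k]."""
    x, ys = x0, []
    for u in inputs:
        ys.append(C @ x + D @ u)
        x = A @ x + B @ u
    return np.array(ys)

A = np.array([[0.9, 0.1], [0.0, 0.8]])  # stable example dynamics
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# Step response: constant unit input for 50 samples
y = simulate(A, B, C, D, np.zeros(2), [np.array([1.0])] * 50)
print(round(float(y[-1, 0]), 3))  # approaches the DC gain C(I-A)^{-1}B = 5
```

An LPV scheme like the one above replaces the fixed (A, B, C, D) with matrices that depend on scheduling parameters, which is what the network re-estimates online.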
Gli stili APA, Harvard, Vancouver, ISO e altri
36

Simionato, Riccardo, e Stefano Fasciani. "Modeling Time-Variant Responses of Optical Compressors With Selective State Space Models". Journal of the Audio Engineering Society 73, n. 3 (7 aprile 2025): 144–65. https://doi.org/10.17743/jaes.2022.0194.

Testo completo
Abstract (sommario):
This paper presents a method for modeling optical dynamic range compressors using deep neural networks with selective state space models. The proposed approach surpasses previous methods based on recurrent layers by employing a selective state space block to encode the input audio. It features a refined technique integrating feature-wise linear modulation and gated linear units to adjust the network dynamically, conditioning the compression's attack and release phases according to external parameters. The proposed architecture is well-suited for low-latency and real-time applications, which are crucial in live audio processing. The method has been validated on the analog optical compressors Tube-Tech CL 1B and Teletronix LA-2A, which possess distinct characteristics. Evaluation is performed using quantitative metrics and subjective listening tests, comparing the proposed method with other state-of-the-art models. Results show that the black-box modeling methods used here outperform all others, achieving accurate emulation of the compression process for both settings seen and unseen during training. Furthermore, a correlation is shown between this accuracy and the sampling density of the control parameters in the data set, and the settings with fast attack and slow release are identified as the most challenging to emulate.
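A scalar caricature of the "selective" recurrence underlying Mamba-style blocks may help here: unlike a fixed linear SSM, the transition and input coefficients are themselves functions of the current input, letting the model decide per step what to retain. Real selective SSM layers use discretized continuous-time parameters and vector states; weights below are arbitrary placeholders, not a trained compressor model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def selective_ssm(x, wa, wb, wc):
    """Scalar-state selective scan: h[t] = a(x[t]) h[t-1] + b(x[t]) x[t],
    where a(.) and b(.) are input-dependent gates (the 'selective' part)."""
    h, ys = 0.0, []
    for xt in x:
        a = sigmoid(wa * xt)   # input-dependent forgetting
        b = sigmoid(wb * xt)   # input-dependent write strength
        h = a * h + b * xt
        ys.append(wc * h)      # linear readout
    return np.array(ys)

x = np.linspace(-1.0, 1.0, 8)
y = selective_ssm(x, wa=1.0, wb=1.0, wc=0.5)
print(y.shape)
```

For a compressor model, the external attack/release parameters would additionally modulate the gates, which is the role the paper assigns to feature-wise linear modulation.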
Gli stili APA, Harvard, Vancouver, ISO e altri
37

Chen, Hanlin, Li'an Zhuo, Baochang Zhang, Xiawu Zheng, Jianzhuang Liu, David Doermann e Rongrong Ji. "Binarized Neural Architecture Search". Proceedings of the AAAI Conference on Artificial Intelligence 34, n. 07 (3 aprile 2020): 10526–33. http://dx.doi.org/10.1609/aaai.v34i07.6624.

Testo completo
Abstract (sommario):
Neural architecture search (NAS) can have a significant impact in computer vision by automatically designing optimal neural network architectures for various tasks. A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models. Unfortunately, this area remains largely unexplored. BNAS is more challenging than NAS due to the learning inefficiency caused by optimization requirements and the huge architecture space. To address these issues, we introduce channel sampling and operation space reduction into a differentiable NAS to significantly reduce the cost of searching. This is accomplished through a performance-based strategy used to abandon less potential operations. Two optimization methods for binarized neural networks are used to validate the effectiveness of our BNAS. Extensive experiments demonstrate that the proposed BNAS achieves a performance comparable to NAS on both CIFAR and ImageNet databases. An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model, and a 40% faster search than the state-of-the-art PC-DARTS.
Gli stili APA, Harvard, Vancouver, ISO e altri
38

Zhou, Xun, Xingyu Wu, Liang Feng, Zhichao Lu e Kay Chen Tan. "Design Principle Transfer in Neural Architecture Search via Large Language Models". Proceedings of the AAAI Conference on Artificial Intelligence 39, n. 21 (11 aprile 2025): 23000–23008. https://doi.org/10.1609/aaai.v39i21.34463.

Testo completo
Abstract (sommario):
Transferable neural architecture search (TNAS) has been introduced to design efficient neural architectures for multiple tasks, to enhance the practical applicability of NAS in real-world scenarios. In TNAS, architectural knowledge accumulated in previous search processes is reused to warm up the architecture search for new tasks. However, existing TNAS methods still search in an extensive search space, necessitating the evaluation of numerous architectures. To overcome this challenge, this work proposes a novel transfer paradigm, i.e., design principle transfer. In this work, the linguistic description of various structural components' effects on architectural performance is termed design principles. They are learned from established architectures and then can be reused to reduce the search space tasks by discarding unpromising architectures. Searching in the refined search space can boost both the search performance and efficiency for new NAS tasks. To this end, a large language model (LLM)-assisted design principle transfer (LAPT) framework is devised. In LAPT, LLM is applied to automatically reason the design principles from a set of given architectures, and then a principle adaptation method is applied to refine these principles progressively based on the search results. Experimental results demonstrate that LAPT can beat the state-of-the-art TNAS methods on most tasks and achieve comparable performance on the remainder.
Gli stili APA, Harvard, Vancouver, ISO e altri
39

Liu, Qiao, Jiaze Xu, Rui Jiang e Wing Hung Wong. "Density estimation using deep generative neural networks". Proceedings of the National Academy of Sciences 118, n. 15 (8 aprile 2021): e2101344118. http://dx.doi.org/10.1073/pnas.2101344118.

Testo completo
Abstract (sommario):
Density estimation is one of the fundamental problems in both statistics and machine learning. In this study, we propose Roundtrip, a computational framework for general-purpose density estimation based on deep generative neural networks. Roundtrip retains the generative power of deep generative models, such as generative adversarial networks (GANs) while it also provides estimates of density values, thus supporting both data generation and density estimation. Unlike previous neural density estimators that put stringent conditions on the transformation from the latent space to the data space, Roundtrip enables the use of much more general mappings where target density is modeled by learning a manifold induced from a base density (e.g., Gaussian distribution). Roundtrip provides a statistical framework for GAN models where an explicit evaluation of density values is feasible. In numerical experiments, Roundtrip exceeds state-of-the-art performance in a diverse range of density estimation tasks.
Gli stili APA, Harvard, Vancouver, ISO e altri
40

Xie, Guotian. "Redundancy-Aware Pruning of Convolutional Neural Networks". Neural Computation 32, n. 12 (dicembre 2020): 2532–56. http://dx.doi.org/10.1162/neco_a_01330.

Testo completo
Abstract (sommario):
Pruning is an effective way to slim and speed up convolutional neural networks. Generally, previous work directly pruned neural networks in the original feature space without considering the correlation of neurons. We argue that such a way of pruning still keeps some redundancy in the pruned networks. In this letter, we propose to prune in an intermediate space in which the correlation of neurons is eliminated. To achieve this goal, the input and output of a convolutional layer are first mapped to the intermediate space by an orthogonal transformation. Then neurons are evaluated and pruned in the intermediate space. Extensive experiments have shown that our redundancy-aware pruning method surpasses state-of-the-art pruning methods in both efficiency and accuracy. Notably, using our redundancy-aware pruning method, ResNet models with a three-times speed-up could achieve competitive performance with fewer floating point operations even compared to DenseNet.
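The core idea, decorrelate activations with an orthogonal transform before ranking and pruning directions, can be sketched with PCA's eigenbasis. This is a simplified stand-in for the paper's method, on synthetic activations:

```python
import numpy as np

# Two raw neurons below are nearly identical, so pruning in the raw space
# would keep redundant information. In the eigenbasis of the activation
# covariance (an orthogonal "intermediate space"), that redundancy shows up
# as one near-zero-variance direction we can drop.

rng = np.random.default_rng(0)
acts = rng.standard_normal((1000, 8))
acts[:, 1] = 0.99 * acts[:, 0] + 0.01 * acts[:, 1]  # correlated pair

cov = np.cov(acts, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # orthogonal basis, ascending order
order = np.argsort(eigvals)[::-1]        # sort directions by variance

keep = 7                                  # prune the weakest direction
basis = eigvecs[:, order[:keep]]
reduced = acts @ basis                    # activations in intermediate space
print(reduced.shape)
```

The redundant direction carries almost no variance, so dropping it loses far less information than dropping either raw neuron would.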
Gli stili APA, Harvard, Vancouver, ISO e altri
41

Zhang, Peng, Wenjie Hui, Benyou Wang, Donghao Zhao, Dawei Song, Christina Lioma e Jakob Grue Simonsen. "Complex-valued Neural Network-based Quantum Language Models". ACM Transactions on Information Systems 40, n. 4 (31 ottobre 2022): 1–31. http://dx.doi.org/10.1145/3505138.

Testo completo
Abstract (sommario):
Language modeling is essential in Natural Language Processing and Information Retrieval related tasks. After the statistical language models, Quantum Language Model (QLM) has been proposed to unify both single words and compound terms in the same probability space without extending term space exponentially. Although QLM achieved good performance in ad hoc retrieval, it still has two major limitations: (1) QLM cannot make use of supervised information, mainly due to the iterative and non-differentiable estimation of the density matrix, which represents both queries and documents in QLM. (2) QLM assumes the exchangeability of words or word dependencies, neglecting the order or position information of words. This article aims to generalize QLM and make it applicable to more complicated matching tasks (e.g., Question Answering) beyond ad hoc retrieval. We propose a complex-valued neural network-based QLM solution called C-NNQLM to employ an end-to-end approach to build and train density matrices in a light-weight and differentiable manner, and it can therefore make use of external well-trained word vectors and supervised labels. Furthermore, C-NNQLM adopts complex-valued word vectors whose phase vectors can directly encode the order (or position) information of words. Note that complex numbers are also essential in the quantum theory. We show that the real-valued NNQLM (R-NNQLM) is a special case of C-NNQLM. The experimental results on the QA task show that both R-NNQLM and C-NNQLM achieve much better performance than the vanilla QLM, and C-NNQLM’s performance is on par with state-of-the-art neural network models. We also evaluate the proposed C-NNQLM on text classification and document retrieval tasks. The results on most datasets show that the C-NNQLM can outperform R-NNQLM, which demonstrates the usefulness of the complex representation for words and sentences in C-NNQLM.
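The phase-encoding idea in C-NNQLM can be illustrated with a toy complex embedding: magnitude carries (placeholder) semantics, while phase encodes position, z = r · exp(i·p·ω). The vectors and ω below are made up for illustration, not the model's learned parameters:

```python
import numpy as np

# Hypothetical sketch: the same word at two positions gets identical
# magnitudes but phases offset by (p2 - p1) * omega, so word order is
# recoverable from the complex representation.

def complex_embed(r, position, omega=0.1):
    return r * np.exp(1j * position * omega)

r = np.ones(4)                      # placeholder magnitude vector
z1 = complex_embed(r, position=2)
z2 = complex_embed(r, position=7)

print(bool(np.allclose(np.abs(z1), np.abs(z2))))              # True
print(round(float(np.angle(z2[0]) - np.angle(z1[0])), 3))     # 0.5
```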
Gli stili APA, Harvard, Vancouver, ISO e altri
42

Lee, JoonSeong. "Analysis Methodology of Inelastic Constitutive Parameter Using State Space Method and Neural Network". International Journal of Engineering & Technology 7, n. 3.34 (1 settembre 2018): 163. http://dx.doi.org/10.14419/ijet.v7i3.34.18938.

Testo completo
Abstract (sommario):
Background/Objectives: This paper presents a method for describing the variables of an inelastic constitutive equation based on the state space method (SSM) and neural networks (NNs). The advantage of this method is that it can identify the appropriate parameters. Methods/Statistical analysis: Two SSM-based NNs are proposed. One outputs the rate of inelastic strain given the material's internal parameters, and the other outputs the next state of the inelastic strain rate and the material's internal variables. Both NNs were trained successfully using input and output data generated by Chaboche's model. Findings: The resulting NNs demonstrated their validity as a powerful material model, although training data for the proposed NN cannot easily be obtained from actual experiments. The NNs reproduced the original stress-strain curves and also produced untrained curves, demonstrating their interpolation capability, with estimates close to the training data. The author defines an implicit constitutive model and proposes an implicit viscous constitutive model using NNs: inelastic behavior is generalized in a state space representation, and the state space form is constructed by NNs from input-output data sets. The proposed model was first built from pseudo-experimental data generated by a commonly used constitutive model and proved to be a good replacement for it. It was then tested on actual experimental data, where its model errors were negligible, demonstrating its superiority over existing explicit models. Improvements/Applications: Comparison of the NN constitutive law with Chaboche's model shows that the NN constitutive law generates curves with smaller model errors on the experimental data, indicating the superiority of the neural constitutive law over explicit constitutive laws as a material model.
Gli stili APA, Harvard, Vancouver, ISO e altri
43

CHEN, CHEN-YUAN, JOHN RONG-CHUNG HSU e CHENG-WU CHEN. "FUZZY LOGIC DERIVATION OF NEURAL NETWORK MODELS WITH TIME DELAYS IN SUBSYSTEMS". International Journal on Artificial Intelligence Tools 14, n. 06 (dicembre 2005): 967–74. http://dx.doi.org/10.1142/s021821300500248x.

Testo completo
Abstract (sommario):
This paper extends the Takagi-Sugeno (T-S) fuzzy model representation to analyze the stability of interconnected systems in which there exist time delays in subsystems. A novel stability criterion which can be solved numerically is presented in terms of Lyapunov's theory for fuzzy interconnected models. In this paper, we use a linear difference inclusion (LDI) state-space representation to represent the fuzzy model. Then, the linear matrix inequality (LMI) optimization algorithm is employed to find a common solution and thereby guarantee asymptotic stability.
Gli stili APA, Harvard, Vancouver, ISO e altri
44

Abbas, H., e H. Werner. "LPV Design of Charge Control for an SI Engine Based on LFT Neural State-Space Models". IFAC Proceedings Volumes 41, n. 2 (2008): 7427–32. http://dx.doi.org/10.3182/20080706-5-kr-1001.01255.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
45

Abbas, H., e H. Werner. "Polytopic Quasi-LPV Models Based on Neural State-Space Models and Application to Air Charge Control of a SI Engine". IFAC Proceedings Volumes 41, n. 2 (2008): 6466–71. http://dx.doi.org/10.3182/20080706-5-kr-1001.01090.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
46

Wang, Yiren, Lijun Wu, Yingce Xia, Tao Qin, ChengXiang Zhai e Tie-Yan Liu. "Transductive Ensemble Learning for Neural Machine Translation". Proceedings of the AAAI Conference on Artificial Intelligence 34, n. 04 (3 aprile 2020): 6291–98. http://dx.doi.org/10.1609/aaai.v34i04.6097.

Testo completo
Abstract (sommario):
Ensemble learning, which aggregates multiple diverse models for inference, is a common practice to improve the accuracy of machine learning tasks. However, it has been observed that the conventional ensemble methods only bring marginal improvement for neural machine translation (NMT) when individual models are strong or there are a large number of individual models. In this paper, we study how to effectively aggregate multiple NMT models under the transductive setting where the source sentences of the test set are known. We propose a simple yet effective approach named transductive ensemble learning (TEL), in which we use all individual models to translate the source test set into the target language space and then finetune a strong model on the translated synthetic corpus. We conduct extensive experiments on different settings (with/without monolingual data) and different language pairs (English↔{German, Finnish}). The results show that our approach boosts strong individual models with significant improvement and benefits a lot from more individual models. Specifically, we achieve the state-of-the-art performances on the WMT2016-2018 English↔German translations.
Gli stili APA, Harvard, Vancouver, ISO e altri
47

Zhang, Dongxiang, Ziyang Xiao, Yuan Wang, Mingli Song e Gang Chen. "Neural TSP Solver with Progressive Distillation". Proceedings of the AAAI Conference on Artificial Intelligence 37, n. 10 (26 giugno 2023): 12147–54. http://dx.doi.org/10.1609/aaai.v37i10.26432.

Testo completo
Abstract (sommario):
The travelling salesman problem (TSP) is NP-hard, with an exponential search space. Recently, the adoption of encoder-decoder models as neural TSP solvers has emerged as an attractive topic because they can instantly obtain near-optimal results for small-scale instances. Nevertheless, their training efficiency and solution quality degrade dramatically when dealing with large-scale problems. To address the issue, we propose a novel progressive distillation framework, adopting curriculum learning to train TSP samples in increasing order of their problem size and progressively distilling high-level knowledge from small models to large models via a distillation loss. In other words, the trained small models are used as the teacher network to guide action selection when training large models. To accelerate training, we also propose a Delaunay-graph-based action mask and a new attention-based decoder to reduce decoding cost. Experimental results show that our approach establishes clear advantages over existing encoder-decoder models in terms of training effectiveness and solution quality. In addition, we validate its usefulness as an initial solution generator for state-of-the-art TSP solvers, whose probability of obtaining the optimal solution can be further improved in such a hybrid manner.
Gli stili APA, Harvard, Vancouver, ISO e altri
48

Rule, Michael, e Guido Sanguinetti. "Autoregressive Point Processes as Latent State-Space Models: A Moment-Closure Approach to Fluctuations and Autocorrelations". Neural Computation 30, n. 10 (ottobre 2018): 2757–80. http://dx.doi.org/10.1162/neco_a_01121.

Testo completo
Abstract (sommario):
Modeling and interpreting spike train data is a task of central importance in computational neuroscience, with significant translational implications. Two popular classes of data-driven models for this task are autoregressive point-process generalized linear models (PPGLM) and latent state-space models (SSM) with point-process observations. In this letter, we derive a mathematical connection between these two classes of models. By introducing an auxiliary history process, we represent exactly a PPGLM in terms of a latent, infinite-dimensional dynamical system, which can then be mapped onto an SSM by basis function projections and moment closure. This representation provides a new perspective on widely used methods for modeling spike data and also suggests novel algorithmic approaches to fitting such models. We illustrate our results on a phasic bursting neuron model, showing that our proposed approach provides an accurate and efficient way to capture neural dynamics.
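The autoregressive PPGLM side of the connection above has a compact generative form: the conditional intensity depends on spike history through a filter, λ[t] = exp(bias + Σⱼ h[j]·y[t−j]). A toy simulation with an assumed negative (refractory-like) history filter, using a Bernoulli approximation per time bin:

```python
import numpy as np

# Illustrative PPGLM simulation; the bias and history filter are made-up
# values, and spiking is approximated as Bernoulli per bin with
# probability min(lambda, 1).

rng = np.random.default_rng(1)
T, bias = 500, -2.0
h = np.array([-3.0, -1.0, -0.5])   # self-inhibiting history filter

y = np.zeros(T, dtype=int)
lam = np.zeros(T)
for t in range(T):
    past = y[max(0, t - len(h)):t][::-1]        # most recent spike first
    drive = bias + float(np.dot(h[:len(past)], past))
    lam[t] = np.exp(drive)                       # conditional intensity
    y[t] = rng.random() < min(lam[t], 1.0)

print(y.sum(), "spikes in", T, "bins")
```

The letter's contribution is to re-express exactly this history dependence as a latent dynamical system, which moment closure then reduces to a tractable state-space model.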
Gli stili APA, Harvard, Vancouver, ISO e altri
49

Tuli, Shikhar, Bhishma Dedhia, Shreshth Tuli e Niraj K. Jha. "FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?" Journal of Artificial Intelligence Research 77 (6 maggio 2023): 39–70. http://dx.doi.org/10.1613/jair.1.13942.

Testo completo
Abstract (sommario):
The existence of a plethora of language models makes the problem of selecting the best one for a custom task challenging. Most state-of-the-art methods leverage transformer-based models (e.g., BERT) or their variants. However, training such models and exploring their hyperparameter space is computationally expensive. Prior work proposes several neural architecture search (NAS) methods that employ performance predictors (e.g., surrogate models) to address this issue; however, such works limit analysis to homogeneous models that use fixed dimensionality throughout the network. This leads to sub-optimal architectures. To address this limitation, we propose a suite of heterogeneous and flexible models, namely FlexiBERT, that have varied encoder layers with a diverse set of possible operations and different hidden dimensions. For better-posed surrogate modeling in this expanded design space, we propose a new graph-similarity-based embedding scheme. We also propose a novel NAS policy, called BOSHNAS, that leverages this new scheme, Bayesian modeling, and second-order optimization, to quickly train and use a neural surrogate model to converge to the optimal architecture. A comprehensive set of experiments shows that the proposed policy, when applied to the FlexiBERT design space, pushes the performance frontier upwards compared to traditional models. FlexiBERT-Mini, one of our proposed models, has 3% fewer parameters than BERT-Mini and achieves 8.9% higher GLUE score. A FlexiBERT model with equivalent performance as the best homogeneous model has 2.6× smaller size. FlexiBERT-Large, another proposed model, attains state-of-the-art results, outperforming the baseline models by at least 5.7% on the GLUE benchmark.
Gli stili APA, Harvard, Vancouver, ISO e altri
50

Sensoy, Murat, Lance Kaplan, Federico Cerutti e Maryam Saleki. "Uncertainty-Aware Deep Classifiers Using Generative Models". Proceedings of the AAAI Conference on Artificial Intelligence 34, n. 04 (3 aprile 2020): 5620–27. http://dx.doi.org/10.1609/aaai.v34i04.6015.

Testo completo
Abstract (sommario):
Deep neural networks are often ignorant about what they do not know and overconfident when they make uninformed predictions. Some recent approaches quantify classification uncertainty directly by training the model to output high uncertainty for the data samples close to class boundaries or from the outside of the training distribution. These approaches use an auxiliary data set during training to represent out-of-distribution samples. However, selection or creation of such an auxiliary data set is non-trivial, especially for high dimensional data such as images. In this work we develop a novel neural network model that is able to express both aleatoric and epistemic uncertainty to distinguish decision boundary and out-of-distribution regions of the feature space. To this end, variational autoencoders and generative adversarial networks are incorporated to automatically generate out-of-distribution exemplars for training. Through extensive analysis, we demonstrate that the proposed approach provides better estimates of uncertainty for in- and out-of-distribution samples, and adversarial examples on well-known data sets against state-of-the-art approaches including recent Bayesian approaches for neural networks and anomaly detection methods.
Gli stili APA, Harvard, Vancouver, ISO e altri