To see the other types of publications on this topic, follow the link: Bayesian models of generalization.

Journal articles on the topic 'Bayesian models of generalization'

Consult the top 50 journal articles for your research on the topic 'Bayesian models of generalization.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Zhu, Lin, Xinbing Wang, Chenghu Zhou, and Nanyang Ye. "Bayesian Cross-Modal Alignment Learning for Few-Shot Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (2023): 11461–69. http://dx.doi.org/10.1609/aaai.v37i9.26355.

Abstract:
Recent advances in large pre-trained models have shown promising results in few-shot learning. However, their generalization ability on two-dimensional Out-of-Distribution (OoD) data, i.e., correlation shift and diversity shift, has not been thoroughly investigated. Research has shown that even with a significant amount of training data, few methods can achieve better performance than the standard empirical risk minimization (ERM) method in OoD generalization. This few-shot OoD generalization dilemma emerges as a challenging direction in deep neural network generalization research, where the performance suffers from overfitting on few-shot examples and OoD generalization errors. In this paper, leveraging a broader supervision source, we explore a novel Bayesian cross-modal image-text alignment learning method (Bayes-CAL) to address this issue. Specifically, the model is designed so that only text representations are fine-tuned, via a Bayesian modelling approach with gradient orthogonalization loss and invariant risk minimization (IRM) loss. The Bayesian approach is essentially introduced to avoid overfitting the base classes observed during training and to improve generalization to broader unseen classes. The dedicated loss is introduced to achieve better image-text alignment by disentangling the causal and non-causal parts of image features. Numerical experiments demonstrate that Bayes-CAL achieves state-of-the-art OoD generalization performance on two-dimensional distribution shifts. Moreover, compared with CLIP-like models, Bayes-CAL yields more stable generalization performance on unseen classes. Our code is available at https://github.com/LinLLLL/BayesCAL.
2

Tenenbaum, Joshua B., and Thomas L. Griffiths. "Generalization, similarity, and Bayesian inference." Behavioral and Brain Sciences 24, no. 4 (2001): 629–40. http://dx.doi.org/10.1017/s0140525x01000061.

Abstract:
Shepard has argued that a universal law should govern generalization across different domains of perception and cognition, as well as across organisms from different species or even different planets. Starting with some basic assumptions about natural kinds, he derived an exponential decay function as the form of the universal generalization gradient, which accords strikingly well with a wide range of empirical data. However, his original formulation applied only to the ideal case of generalization from a single encountered stimulus to a single novel stimulus, and for stimuli that can be represented as points in a continuous metric psychological space. Here we recast Shepard's theory in a more general Bayesian framework and show how this naturally extends his approach to the more realistic situation of generalizing from multiple consequential stimuli with arbitrary representational structure. Our framework also subsumes a version of Tversky's set-theoretic model of similarity, which is conventionally thought of as the primary alternative to Shepard's continuous metric space model of similarity and generalization. This unification allows us not only to draw deep parallels between the set-theoretic and spatial approaches, but also to significantly advance the explanatory power of set-theoretic models.
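
To make the computation concrete: the framework averages all hypotheses consistent with the observed examples, weighted by the size principle, under which each example contributes a likelihood factor of 1/|h|. A minimal numerical sketch in Python, using hypothetical interval hypotheses over a one-dimensional stimulus space (illustrative only, not the authors' code):

```python
# Hypotheses: all integer intervals [a, b] in a 1-D stimulus space 0..9.
hypotheses = [(a, b) for a in range(10) for b in range(a, 10)]

def generalization(y, examples):
    """P(y in C | examples): average the consistent hypotheses,
    weighted by the size principle, likelihood (1/|h|)^n."""
    n = len(examples)
    num = den = 0.0
    for a, b in hypotheses:
        if not all(a <= x <= b for x in examples):
            continue  # hypothesis inconsistent with the examples
        w = (b - a + 1) ** (-n)  # size principle (uniform prior assumed)
        den += w
        if a <= y <= b:
            num += w
    return num / den

# More examples of the same stimulus sharpen the gradient.
print(generalization(5, [3]))        # broad generalization from one example
print(generalization(5, [3, 3, 3]))  # sharper after repeated examples
```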
3

San Martín, Ernesto, Alejandro Jara, Jean-Marie Rolin, and Michel Mouchart. "On the Bayesian Nonparametric Generalization of IRT-Type Models." Psychometrika 76, no. 3 (2011): 385–409. http://dx.doi.org/10.1007/s11336-011-9213-9.

4

Linares Cedeño, Francisco X., Gabriel German, Juan Carlos Hidalgo, and Ariadna Montiel. "Bayesian analysis for a class of α-attractor inflationary models." Journal of Cosmology and Astroparticle Physics 2023, no. 03 (2023): 038. http://dx.doi.org/10.1088/1475-7516/2023/03/038.

Abstract:
We perform a Bayesian study of a generalization of the basic α-attractor T model given by the potential V(ϕ) = V₀[1 − sech^p(ϕ/√(6α) M_pl)], where ϕ is the inflaton field and the parameter α corresponds to the inverse curvature of the scalar manifold in the conformal or superconformal realizations of the attractor models. Such generalization is characterized by the power p, which includes the basic or base model for p = 2. Once the priors for the parameters of the α-attractor potential are set by numerical exploration, we perform the corresponding statistical analysis for the cases p = 1, 2, 3, 4, and derive posteriors. Considering the original α-attractor potential as the base model, we calculate the evidence for our generalization, and conclude that the p = 4 model is preferred by the CMB data. We also present constraints for the parameter α. Interestingly, all the cases studied prefer a specific value for the tensor-to-scalar ratio given by r ≃ 0.0025.
5

Ahuja, Kabir, Vidhisha Balachandran, Madhur Panwar, et al. "Learning Syntax Without Planting Trees: Understanding Hierarchical Generalization in Transformers." Transactions of the Association for Computational Linguistics 13 (February 12, 2024): 121–41. https://doi.org/10.1162/tacl_a_00733.

Abstract:
Transformers trained on natural language data have been shown to exhibit hierarchical generalization without explicitly encoding any structural bias. In this work, we investigate sources of inductive bias in transformer models and their training that could cause such preference for hierarchical generalization. We extensively experiment with transformers trained on five synthetic, controlled datasets using several training objectives and show that, while objectives such as sequence-to-sequence modeling, classification, etc., often fail to lead to hierarchical generalization, the language modeling objective consistently leads to transformers generalizing hierarchically. We then study how different generalization behaviors emerge during training by conducting pruning experiments that reveal the joint existence of subnetworks within the model implementing different generalizations. Finally, we take a Bayesian perspective to understand transformers' preference for hierarchical generalization: We establish a correlation between whether transformers generalize hierarchically on a dataset and whether the simplest explanation of that dataset is provided by a hierarchical grammar compared to regular grammars exhibiting linear generalization. Overall, our work presents new insights on the origins of hierarchical generalization in transformers and provides a theoretical framework for studying generalization in language models.
6

Gentner, Dedre. "Exhuming similarity." Behavioral and Brain Sciences 24, no. 4 (2001): 669. http://dx.doi.org/10.1017/s0140525x01350082.

Abstract:
Tenenbaum and Griffiths' paper attempts to subsume theories of similarity – including spatial models, featural models, and structure-mapping models – into a framework based on Bayesian generalization. But in so doing it misses significant phenomena of comparison. It would be more fruitful to examine how comparison processes suggest hypotheses than to try to derive similarity from Bayesian reasoning. [Shepard; Tenenbaum & Griffiths]
7

Tenenbaum, Joshua B., and Thomas L. Griffiths. "Some specifics about generalization." Behavioral and Brain Sciences 24, no. 4 (2001): 762–78. http://dx.doi.org/10.1017/s0140525x01780089.

Abstract:
We address two kinds of criticisms of our Bayesian framework for generalization: those that question the correctness or the coverage of our analysis, and those that question its intrinsic value. Speaking to the first set, we clarify the origins and scope of our size principle for weighting hypotheses or features, focusing on its potential status as a cognitive universal; outline several variants of our framework to address additional phenomena of generalization raised in the commentaries; and discuss the subtleties of our claims about the relationship between similarity and generalization. Speaking to the second set, we identify the unique contributions that a rational statistical approach to generalization offers over traditional models that focus on mental representation and cognitive processes.
8

Shalaeva, Vera, Alireza Fakhrizadeh Esfahani, Pascal Germain, and Mihaly Petreczky. "Improved PAC-Bayesian Bounds for Linear Regression." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (2020): 5660–67. http://dx.doi.org/10.1609/aaai.v34i04.6020.

Abstract:
In this paper, we improve the PAC-Bayesian error bound for linear regression derived in Germain et al. (2016). The improvements are two-fold. First, the proposed error bound is tighter, and converges to the generalization loss with a well-chosen temperature parameter. Second, the error bound also holds for training data that are not independently sampled. In particular, the error bound applies to certain time series generated by well-known classes of dynamical models, such as ARX models.
9

MacKay, David J. C. "A Practical Bayesian Framework for Backpropagation Networks." Neural Computation 4, no. 3 (1992): 448–72. http://dx.doi.org/10.1162/neco.1992.4.3.448.

Abstract:
A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible (1) objective comparisons between solutions using alternative network architectures, (2) objective stopping rules for network pruning or growing procedures, (3) objective choice of magnitude and type of weight decay terms or additive regularizers (for penalizing large weights, etc.), (4) a measure of the effective number of well-determined parameters in a model, (5) quantified estimates of the error bars on network parameters and on network output, and (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian "evidence" automatically embodies "Occam's razor," penalizing overflexible and overcomplex models. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well matched to a problem, a good correlation between generalization ability and the Bayesian evidence is obtained.
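
For linear-in-the-parameters models with Gaussian priors, the evidence has a closed form, which is enough to see the automatic Occam's razor at work. A sketch using the standard textbook formula (not MacKay's original code; the prior precision alpha and noise precision beta are assumed fixed here rather than optimized as in the paper):

```python
import numpy as np

def log_evidence(Phi, y, alpha, beta):
    """Log marginal likelihood of a Bayesian linear model:
    y ~ N(Phi w, 1/beta), w ~ N(0, I/alpha)."""
    N, M = Phi.shape
    A = alpha * np.eye(M) + beta * Phi.T @ Phi   # posterior precision
    m = beta * np.linalg.solve(A, Phi.T @ y)     # posterior mean
    fit = beta * np.sum((y - Phi @ m) ** 2) + alpha * m @ m
    return 0.5 * (M * np.log(alpha) + N * np.log(beta) - fit
                  - np.linalg.slogdet(A)[1] - N * np.log(2 * np.pi))

# Comparing polynomial degrees by evidence penalizes overflexible models.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(30)
for deg in (1, 3, 9):
    Phi = np.vander(x, deg + 1, increasing=True)
    print(deg, log_evidence(Phi, y, alpha=1.0, beta=100.0))
```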
10

Hinton, Geoffrey E., and Zoubin Ghahramani. "Generative models for discovering sparse distributed representations." Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 352, no. 1358 (1997): 1177–90. http://dx.doi.org/10.1098/rstb.1997.0101.

Abstract:
We describe a hierarchical, generative model that can be viewed as a nonlinear generalization of factor analysis and can be implemented in a neural network. The model uses bottom–up, top–down and lateral connections to perform Bayesian perceptual inference correctly. Once perceptual inference has been performed the connection strengths can be updated using a very simple learning rule that only requires locally available information. We demonstrate that the network learns to extract sparse, distributed, hierarchical representations.
11

Achcar, Jorge Alberto, Emerson Barili, and Edson Zangiacomi Martinez. "Semiparametric transformation model: A hierarchical Bayesian approach." Model Assisted Statistics and Applications 18, no. 3 (2023): 245–56. http://dx.doi.org/10.3233/mas-221408.

Abstract:
The use of semiparametric or transformation models has been considered by many authors in the analysis of lifetime data in the presence of censoring and covariates, as an alternative to and generalization of the usual proportional hazards, proportional odds, and accelerated failure time models extensively used in lifetime data analysis. Inferences for the proportional hazards model introduced by Cox (1972) are usually obtained by maximum likelihood estimation methods assuming the partial likelihood function introduced by Cox (1975). In this study, we consider a hierarchical Bayesian analysis of the proportional hazards model assuming the complete likelihood function obtained from a transformation model, treating the unknown hazard function as a latent unknown variable under a Bayesian approach. Some applications with real medical data illustrate the proposed methodology.
12

Amari, Shun-ichi. "Integration of Stochastic Models by Minimizing α-Divergence." Neural Computation 19, no. 10 (2007): 2780–96. http://dx.doi.org/10.1162/neco.2007.19.10.2780.

Abstract:
When there are a number of stochastic models in the form of probability distributions, one needs to integrate them. Mixtures of distributions are frequently used, but exponential mixtures also provide a good means of integration. This letter proposes a one-parameter family of integration, called α-integration, which includes all of these well-known integrations. These are generalizations of various averages of numbers such as arithmetic, geometric, and harmonic averages. There are psychophysical experiments that suggest that α-integrations are used in the brain. The α-divergence between two distributions is defined, which is a natural generalization of Kullback-Leibler divergence and Hellinger distance, and it is proved that α-integration is optimal in the sense of minimizing α-divergence. The theory is applied to generalize the mixture of experts and the product of experts to the α-mixture of experts. The α-predictive distribution is also stated in the Bayesian framework.
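
For orientation, the α-mean at the heart of this family can be transcribed as follows (standard form: the weights w_i sum to one and c normalizes the result; α = −1 recovers the arithmetic mixture and α = 1 the normalized geometric, i.e., exponential, mixture):

```latex
\[
  m_\alpha(x) = c \, f_\alpha^{-1}\!\Big(\sum_i w_i \, f_\alpha\big(p_i(x)\big)\Big),
  \qquad
  f_\alpha(p) =
  \begin{cases}
    p^{(1-\alpha)/2}, & \alpha \neq 1,\\
    \log p, & \alpha = 1.
  \end{cases}
\]
```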
13

Abdelmadjid, Youcefa, Mohammed Lamine Kherfi, Belal Khaldi, and Oussama Aiadi. "Understanding user intention in image retrieval: generalization selection using multiple concept hierarchies." TELKOMNIKA Telecommunication, Computing, Electronics and Control 17, no. 5 (2019): 2572–86. https://doi.org/10.12928/TELKOMNIKA.v17i5.10202.

Abstract:
Image retrieval is the technique that helps users find and retrieve desired images from a huge image database. The user first has to formulate a query that expresses his/her needs. This query may appear in textual form as in semantic retrieval (SR), in visual example form as in query by visual example (QBVE), or as a combination of these two forms named query by semantic example (QBSE). The focus of this paper lies in the techniques of analyzing queries composed of multiple semantic examples. This is a very challenging task due to the different interpretations that can be drawn from the same query. To solve such a problem, we introduce a model based on Bayesian generalization. In cognitive science, Bayesian generalization, which is the basis of most works in the literature, is a method that tries to find, in one hierarchy of concepts, the parent concept of a given set of concepts. Instead of using one single concept hierarchy, we propose a generalization that can be used with multiple hierarchies, where each one has a different semantic context and contains several abstraction levels. Our method consists of finding the optimal generalization by, first, determining the appropriate concept hierarchy, and then determining the appropriate level of generalization. Experimental evaluations demonstrate that our method, which uses multiple hierarchies, yields better results than those using only one single hierarchy.
14

Olanrewaju, Rasaki Olawale, Sodiq Olanrewaju, Adedeji Adigun Oyinloye, and Wasiu Adepoju. "On Finite and Non-Finite Bayesian Mixture Models." Journal of New Theory, no. 45 (December 31, 2023): 57–72. http://dx.doi.org/10.53570/jnt.1358754.

Abstract:
In this paper, a Bayesian paradigm for a mixture model with finite and non-finite components is expounded for a generic prior and likelihood that can have any distributional random noise. The mixture model consists of stylized properties: proportional allocation, sample size allocation, and a latent (unobserved) variable for similar probabilistic generalization. The Expectation-Maximization (EM) algorithm technique of parameter estimation was adopted to estimate the stated stylized parameters. The Markov Chain Monte Carlo (MCMC) and Metropolis-Hastings sampler algorithms were adopted as alternatives to the EM algorithm when it is not analytically feasible, that is, when the unobserved variable cannot be replaced by imposed expectations (means) and when there is a need to correct the exploration of the posterior distribution by means of an acceptance ratio quantity, respectively. Label switching for exchangeability of the posterior distribution via a truncated or alternating prior distributional form was imposed on the posterior distribution for robust tailoring of inference through the Maximum a Posteriori (MAP) index. In conclusion, it was deduced via a simulation study that the number of components grows large for all permutations to be considered for subsample permutations.
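
As a reference point for the estimation machinery described, a bare-bones EM loop for a two-component Gaussian mixture looks as follows (a generic sketch under standard assumptions, not the authors' implementation; the responsibilities are the imposed expectations of the latent variable):

```python
import numpy as np

def em_gmm(x, iters=100):
    """EM for a two-component 1-D Gaussian mixture."""
    pi, mu, var = 0.5, np.array([x.min(), x.max()]), np.array([x.var()] * 2)
    for _ in range(iters):
        # E-step: responsibility of component 1 (posterior of the latent
        # variable); the common 1/sqrt(2*pi) factor cancels in the ratio.
        p1 = pi * np.exp(-(x - mu[0]) ** 2 / (2 * var[0])) / np.sqrt(var[0])
        p2 = (1 - pi) * np.exp(-(x - mu[1]) ** 2 / (2 * var[1])) / np.sqrt(var[1])
        r = p1 / (p1 + p2)
        # M-step: re-estimate proportion, means, and variances.
        pi = r.mean()
        mu = np.array([np.average(x, weights=r), np.average(x, weights=1 - r)])
        var = np.array([np.average((x - mu[0]) ** 2, weights=r),
                        np.average((x - mu[1]) ** 2, weights=1 - r)])
    return pi, mu, var

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 700)])
print(em_gmm(x))  # recovers proportions and component parameters
```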
15

Chen, Zhiyong, Minghui Chen, and Guodong Xing. "Bayesian Estimation of Partially Linear Additive Spatial Autoregressive Models with P-Splines." Mathematical Problems in Engineering 2021 (July 15, 2021): 1–14. http://dx.doi.org/10.1155/2021/1777469.

Abstract:
In this paper, we aim to develop a partially linear additive spatial autoregressive model (PLASARM), which is a generalization of the partially linear additive model and the spatial autoregressive model. It can be used to simultaneously evaluate the linear and nonlinear effects of the covariates on the response for spatial data. To estimate the unknown parameters and approximate the nonparametric functions by Bayesian P-splines, we develop a Bayesian Markov chain Monte Carlo approach for the PLASARM and design a Gibbs sampler to explore the joint posterior distributions of the unknown parameters. Furthermore, we illustrate the performance of the proposed model and estimation method by a simulation study and an analysis of Chinese housing price data.
16

McCulloch, Robert E., and Ruey S. Tsay. "Bayesian Inference of Trend and Difference-Stationarity." Econometric Theory 10, no. 3-4 (1994): 596–608. http://dx.doi.org/10.1017/s0266466600008689.

Abstract:
This paper proposes a general Bayesian framework for distinguishing between trend- and difference-stationarity. Usually, in model selection, we assume that all of the data were generated by one of the models under consideration. In studying time series, however, we may be concerned that the process is changing over time, so that the preferred model changes over time as well. To handle this possibility, we compute the posterior probabilities of the competing models for each observation. This way we can see if different segments of the series behave differently with respect to the competing models. The proposed method is a generalization of the usual odds ratio for model discrimination in Bayesian inference. In application, we employ the Gibbs sampler to overcome the computational difficulty. The procedure is illustrated by a real example.
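
The quantity being generalized is the textbook posterior odds for two competing models given data y_1, ..., y_t:

```latex
\[
  \frac{P(M_1 \mid y_{1:t})}{P(M_2 \mid y_{1:t})}
  = \frac{P(M_1)}{P(M_2)} \times
    \frac{p(y_{1:t} \mid M_1)}{p(y_{1:t} \mid M_2)}.
\]
```

Tracking the corresponding posterior probabilities observation by observation, as the authors propose, exposes segments of the series where the preferred model changes.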
17

Aoyagi, Miki, and Kenji Nagata. "Learning Coefficient of Generalization Error in Bayesian Estimation and Vandermonde Matrix-Type Singularity." Neural Computation 24, no. 6 (2012): 1569–610. http://dx.doi.org/10.1162/neco_a_00271.

Abstract:
The term algebraic statistics arises from the study of probabilistic models and techniques for statistical inference using methods from algebra and geometry (Sturmfels, 2009). The purpose of our study is to consider the generalization error and stochastic complexity in learning theory by using the log-canonical threshold in algebraic geometry. Such thresholds correspond to the main term of the generalization error in Bayesian estimation, which is called a learning coefficient (Watanabe, 2001a, 2001b). The learning coefficient serves to measure the learning efficiencies in hierarchical learning models. In this letter, we consider learning coefficients for Vandermonde matrix-type singularities, by using a new approach: focusing on the generators of the ideal, which defines singularities. We give tight new bound values of learning coefficients for the Vandermonde matrix-type singularities and the explicit values with certain conditions. By applying our results, we can show the learning coefficients of three-layered neural networks and normal mixture models.
18

Ribeiro, Fabiano, and Manfred Opper. "Expectation Propagation with Factorizing Distributions: A Gaussian Approximation and Performance Results for Simple Models." Neural Computation 23, no. 4 (2011): 1047–69. http://dx.doi.org/10.1162/neco_a_00104.

Abstract:
We discuss the expectation propagation (EP) algorithm for approximate Bayesian inference using a factorizing posterior approximation. For neural network models, we use a central limit theorem argument to make EP tractable when the number of parameters is large. For two types of models, we show that EP can achieve optimal generalization performance when data are drawn from a simple distribution.
19

Vedadi, Elahe, Joshua V. Dillon, Philip Andrew Mansfield, Karan Singhal, Arash Afkanpour, and Warren Richard Morningstar. "Federated Variational Inference: Towards Improved Personalization and Generalization." Proceedings of the AAAI Symposium Series 3, no. 1 (2024): 323–27. http://dx.doi.org/10.1609/aaaiss.v3i1.31228.

Abstract:
Conventional federated learning algorithms train a single global model by leveraging all participating clients’ data. However, due to heterogeneity in client generative distributions and predictive models, these approaches may not appropriately approximate the predictive process, converge to an optimal state, or generalize to new clients. We study personalization and generalization in stateless cross-device federated learning setups assuming heterogeneity in client data distributions and predictive models. We first propose a hierarchical generative model and formalize it using Bayesian Inference. We then approximate this process using Variational Inference to train our model efficiently. We call this algorithm Federated Variational Inference (FedVI). We use PAC-Bayes analysis to provide generalization bounds for FedVI. We evaluate our model on FEMNIST and CIFAR-100 image classification and show that FedVI beats the state-of-the-art on both tasks.
20

Jacobsen, Daniel J., Lars Kai Hansen, and Kristoffer Hougaard Madsen. "Bayesian Model Comparison in Nonlinear BOLD fMRI Hemodynamics." Neural Computation 20, no. 3 (2008): 738–55. http://dx.doi.org/10.1162/neco.2007.07-06-282.

Abstract:
Nonlinear hemodynamic models express the BOLD (blood oxygenation level dependent) signal as a nonlinear, parametric functional of the temporal sequence of local neural activity. Several models have been proposed for both the neural activity and the hemodynamics. We compare two such combined models: the original balloon model with a square-pulse neural model (Friston, Mechelli, Turner, & Price, 2000) and an extended balloon model with a more sophisticated neural model (Buxton, Uludag, Dubowitz, & Liu, 2004). We learn the parameters of both models using a Bayesian approach, where the distribution of the parameters conditioned on the data is estimated using Markov chain Monte Carlo techniques. Using a split-half resampling procedure (Strother, Anderson, & Hansen, 2002), we compare the generalization abilities of the models as well as their reproducibility, for both synthetic and real data, recorded from two different visual stimulation paradigms. The results show that the simple model is the better one for these data.
21

Zlobin, Mykola, and Volodymyr Bazylevych. "Bayesian Optimization for Tuning Hyperparameters of Machine Learning Models: A Performance Analysis in XGBoost." Computer Systems and Information Technologies, no. 1 (March 27, 2025): 141–46. https://doi.org/10.31891/csit-2025-1-16.

Abstract:
The performance of machine learning models depends on the selection and tuning of hyperparameters. As a widely used gradient boosting method, XGBoost relies on optimal hyperparameter configurations to balance model complexity, prevent overfitting, and improve generalization. Especially in high-dimensional hyperparameter spaces, traditional approaches including grid search and random search are computationally costly and ineffective. Recent findings in automated hyperparameter tuning, specifically Bayesian optimization with the tree-structured Parzen estimator (TPE), have shown promise in raising the accuracy and efficiency of model optimization. The aim of this paper is to analyze how effective Bayesian optimization is at tuning XGBoost hyperparameters for a real classification problem. Comparing Bayesian optimization with traditional search methods helps assess its effects on model accuracy, convergence speed, and computational efficiency. As a case study, a dataset of consumer spending behaviors was used. The classification task aimed to differentiate between two transaction categories: hotels, restaurants, and cafés versus the retail sector. The performance of the model was evaluated using loss function minimization, convergence stability, and classification accuracy. This paper shows that Bayesian optimization improves XGBoost hyperparameter tuning, improving classification performance while lowering computational costs. The results offer empirical proof that Bayesian optimization outperforms traditional techniques in terms of accuracy, stability, and scalability.
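
A pipeline of this kind can be sketched with the hyperopt implementation of TPE (the dataset, search ranges, and evaluation budget below are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np
from hyperopt import fmin, tpe, hp
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Stand-in for the consumer-spending dataset used in the paper.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

space = {  # illustrative ranges, not the paper's exact search space
    "max_depth": hp.quniform("max_depth", 2, 10, 1),
    "learning_rate": hp.loguniform("learning_rate", np.log(1e-3), np.log(0.3)),
    "subsample": hp.uniform("subsample", 0.5, 1.0),
}

def objective(params):
    model = XGBClassifier(
        n_estimators=200,
        max_depth=int(params["max_depth"]),
        learning_rate=params["learning_rate"],
        subsample=params["subsample"],
    )
    # Minimize 1 - mean cross-validated accuracy.
    return 1.0 - cross_val_score(model, X, y, cv=5).mean()

best = fmin(objective, space, algo=tpe.suggest, max_evals=50)
print(best)
```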
22

Chen, Xingdi, Peng Kong, Peng Jiang, and Yanlan Wu. "Estimation of PM2.5 Concentration Using Deep Bayesian Model Considering Spatial Multiscale." Remote Sensing 13, no. 22 (2021): 4545. http://dx.doi.org/10.3390/rs13224545.

Abstract:
Directly establishing the relationship between satellite data and PM2.5 concentration through deep learning is an important means of estimating regional PM2.5 concentration. However, because deep learning methods do not account for uncertainty, they suffer from certain overfitting problems in the process of PM2.5 estimation. In response to this problem, this paper designs a deep Bayesian PM2.5 estimation model that takes multiple scales into account. The model uses a Bayesian neural network to place priors on key parameters, provide a regularization effect to the neural network, perform posterior inference over the parameters, and take into account the characteristics of data uncertainty, thereby alleviating model overfitting and improving the generalization ability of the model. In addition, different-scale Moderate-Resolution Imaging Spectroradiometer (MODIS) satellite data and ERA5 reanalysis data were used as input to the model to strengthen the model's perception of different-scale features of the atmosphere, as well as to further enhance the model's PM2.5 estimation accuracy and generalization ability. Experiments with Anhui Province as the research area showed that the R2 of this method on the independent test set was 0.78, which was higher than that of the DNN, random forest, and BNN models that do not consider the impact of the surrounding environment; moreover, the RMSE was 19.45 μg·m−3, which was also lower than that of the three compared models. In the experiments on different seasons of 2019, the estimation accuracy of all models was significantly reduced; however, compared with the other three models, the R2 of the model in this paper could still reach 0.66 or more. Thus, the model in this paper has higher accuracy and better generalization ability.
23

Candelieri, Antonio, Riccardo Perego, Ilaria Giordani, Andrea Ponti, and Francesco Archetti. "Modelling human active search in optimizing black-box functions." Soft Computing 24, no. 23 (2020): 17771–85. http://dx.doi.org/10.1007/s00500-020-05398-2.

Abstract:
Modelling human function learning has been the subject of intense research in cognitive sciences. The topic is relevant in black-box optimization, where information about the objective and/or constraints is not available and must be learned through function evaluations. In this paper, we focus on the relation between the behaviour of humans searching for the maximum and the probabilistic model used in Bayesian optimization. As surrogate models of the unknown function, both Gaussian processes and random forests have been considered: the Bayesian learning paradigm is central in the development of active learning approaches balancing exploration/exploitation in uncertain conditions towards effective generalization in large decision spaces. In this paper, we analyse experimentally how Bayesian optimization compares to humans searching for the maximum of an unknown 2D function. A set of controlled experiments with 60 subjects, using both surrogate models, confirms that Bayesian optimization provides a general model to represent individual patterns of active learning in humans.
24

Zhou, Suhua, Wenjie Han, Minghua Huang, Zhiwen Xu, Jinfeng Li, and Jiuchang Zhang. "Slope Stability Prediction Based on Incremental Learning Bayesian Model and Literature Data Mining." Applied Sciences 15, no. 5 (2025): 2423. https://doi.org/10.3390/app15052423.

Abstract:
In predicting slope stability, updating datasets with new cases necessitates retraining traditional machine learning models, consuming substantial time and resources. This paper introduces the Incremental Learning Bayesian (ILB) model, combining incremental learning theory with the naive Bayesian model, to address this issue. Key slope parameters (height, slope angle, unit weight, cohesion, internal friction angle, and pore water ratio) are used as predictive indicators. A dataset of 242 slope cases from the existing literature is compiled for training and evaluation. The ILB model's performance is assessed using accuracy, area under the ROC curve (AUC), generalization ability, and computation time, and compared to four common batch learning models: Random Forest (RF), Gradient Boosting Machine (GBM), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). Variable importance and partial dependence plots are used to explore the relationship between prediction results and parameters. Validation is performed with real slope cases from the Lala Copper Mine in Sichuan Province, China. Results show that (1) the ILB model's accuracy and AUC improve as the dataset grows; (2) the ILB model outperforms GBM, SVM, and MLP in accuracy and AUC, performing similarly to RF; (3) it demonstrates superior generalization and lower computation time than the batch learning models; and (4) internal friction angle, slope angle, and pore water ratio are the most important predictors.
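
The incremental mechanism is the same one scikit-learn exposes through partial_fit on its naive Bayes classifiers; a minimal sketch with synthetic stand-in data (the six features and batch size are illustrative, not the paper's dataset):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

# Stand-in for 242 slope cases with 6 predictors and a stable/unstable label.
X, y = make_classification(n_samples=242, n_features=6, random_state=0)

model = GaussianNB()
classes = np.unique(y)
# Feed "newly published" cases in batches; no retraining from scratch needed.
for start in range(0, len(X), 50):
    batch = slice(start, start + 50)
    model.partial_fit(X[batch], y[batch], classes=classes)
    seen = min(start + 50, len(X))
    print(f"cases seen: {seen}, accuracy so far: {model.score(X[:seen], y[:seen]):.3f}")
```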
25

Shah, S., P. J. Hazarika, S. Chakraborty, and G. G. Hamedani. "A generalization of Balakrishnan-alpha-skew-normal distribution: Properties, characterisations and applications." Journal of Statistics and Management Systems 27, no. 1 (2024): 9–33. http://dx.doi.org/10.47974/jsms-1013.

Abstract:
In this article, a generalized form of the Balakrishnan-alpha-skew-normal distribution of Hazarika et al. (2020) is proposed, and some of its simple and distributional properties are studied. Two characterizations have also been presented. Two model selection criteria, namely the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), are used to check the appropriateness of the proposed distribution by conducting data-fitting experiments and comparing it with some other related distributions. The likelihood ratio test is employed for discriminating between nested models.
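
For reference, the two criteria are defined as follows (k estimated parameters, n observations, maximized likelihood L̂; lower values indicate a better fit-complexity trade-off):

```latex
\[
  \mathrm{AIC} = 2k - 2\ln\hat{L},
  \qquad
  \mathrm{BIC} = k\ln n - 2\ln\hat{L}.
\]
```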
26

Elgohari, Hanaa, Mohamed Ibrahim, and Haitham Yousof. "A New Probability Distribution for Modeling Failure and Service Times: Properties, Copulas and Various Estimation Methods." Statistics, Optimization & Information Computing 9, no. 3 (2021): 555–86. http://dx.doi.org/10.19139/soic-2310-5070-1101.

Abstract:
In this paper, a new generalization of the Pareto type II model is introduced and studied. The new density can be “right skewed” with a heavy tail shape, and its corresponding failure rate can be “J-shaped”, “decreasing”, and “upside down (or increasing-constant-decreasing)”. The new model may be used as an “under-dispersed” and “over-dispersed” model. Bayesian and non-Bayesian estimation methods are considered. We assessed the performance of all methods via a simulation study. Bayesian and non-Bayesian estimation methods are compared in modeling real data via two applications. In modeling real data, the maximum likelihood method is the best estimation method, so we used it in comparing competitive models. Before using the maximum likelihood method, we performed simulation experiments to assess its finite sample behavior using biases and mean squared errors.
27

Kumar, Nand, et al. "Enhancing Robustness and Generalization in Deep Learning Models for Image Processing." Power System Technology 47, no. 4 (2023): 278–93. http://dx.doi.org/10.52783/pst.193.

Abstract:
In recent years, deep learning models have demonstrated remarkable success in various image processing tasks, ranging from object recognition to medical image analysis. However, their performance often degrades in the presence of unseen data or adversarial attacks, highlighting the need for enhancing robustness and generalization. This paper explores innovative approaches to address these challenges, aiming to improve the reliability and applicability of deep learning models in real-world scenarios. The first section of the paper delves into the importance of robustness and generalization in deep learning models for image processing tasks. It discusses the implications of model vulnerabilities, such as overfitting to training data and susceptibility to adversarial perturbations, on the reliability of model predictions. [1] Through a comprehensive review of existing literature, various factors influencing robustness and generalization are identified, including dataset diversity, model architecture, regularization techniques, and adversarial training methods. The paper proposes novel methodologies to enhance the robustness and generalization capabilities of deep learning models. One key approach involves the integration of diverse training data sources, including synthetic data augmentation and domain adaptation techniques, to expose the model to a wider range of scenarios and improve its ability to generalize to unseen data. Additionally, advanced regularization techniques, such as dropout and batch normalization, are explored to mitigate overfitting and improve model generalization. The paper investigates the effectiveness of adversarial training strategies in enhancing model robustness against adversarial attacks. By incorporating adversarially generated examples during training, deep learning models can learn to better resist perturbations and maintain performance under adversarial conditions. Moreover, the paper explores the potential of incorporating uncertainty estimation methods, such as Bayesian neural networks and Monte Carlo dropout, to quantify model uncertainty and improve robustness in uncertain environments. This paper presents a comprehensive investigation into enhancing robustness and generalization in deep learning models for image processing tasks. By addressing key challenges such as overfitting, dataset bias, and adversarial vulnerabilities, the proposed methodologies offer promising avenues for improving the reliability and applicability of deep learning models in real-world scenarios.
28

Karimi, Omid, Henning Omre, and Mohsen Mohammadzadeh. "Bayesian closed-skew Gaussian inversion of seismic AVO data for elastic material properties." GEOPHYSICS 75, no. 1 (2010): R1–R11. http://dx.doi.org/10.1190/1.3299291.

Abstract:
Bayesian closed-skew Gaussian inversion is defined as a generalization of traditional Bayesian Gaussian inversion, which is used frequently in seismic amplitude-versus-offset (AVO) inversion. The new model captures skewness in the variables of interest; hence, the posterior model for log-transformed elastic material properties given seismic AVO data might be a skew probability density function. The model is analytically tractable, and this makes it applicable in high-dimensional 3D inversion problems. Assessment of the posterior models in high dimensions requires numerical approximations, however. The Bayesian closed-skew Gaussian inversion approach has been applied on real elastic material properties from a well in the Sleipner field in the North Sea. A comparison with results from traditional Bayesian Gaussian inversion shows that the mean square error of predictions of P-wave and S-wave velocities are reduced by a factor of two, although somewhat less for density predictions.
29

Sanghai, S., P. Domingos, and D. Weld. "Relational Dynamic Bayesian Networks." Journal of Artificial Intelligence Research 24 (December 2, 2005): 759–97. http://dx.doi.org/10.1613/jair.1625.

Abstract:
Stochastic processes that involve the creation of objects and relations over time are widespread, but relatively poorly studied. For example, accurate fault diagnosis in factory assembly processes requires inferring the probabilities of erroneous assembly operations, but doing this efficiently and accurately is difficult. Modeled as dynamic Bayesian networks, these processes have discrete variables with very large domains and extremely high dimensionality. In this paper, we introduce relational dynamic Bayesian networks (RDBNs), which are an extension of dynamic Bayesian networks (DBNs) to first-order logic. RDBNs are a generalization of dynamic probabilistic relational models (DPRMs), which we had proposed in our previous work to model dynamic uncertain domains. We first extend the Rao-Blackwellised particle filtering described in our earlier work to RDBNs. Next, we lift the assumptions associated with Rao-Blackwellization in RDBNs and propose two new forms of particle filtering. The first one uses abstraction hierarchies over the predicates to smooth the particle filter's estimates. The second employs kernel density estimation with a kernel function specifically designed for relational domains. Experiments show these two methods greatly outperform standard particle filtering on the task of assembly plan execution monitoring.
30

Dou, Liyu, and Ulrich K. Müller. "Generalized Local‐to‐Unity Models." Econometrica 89, no. 4 (2021): 1825–54. http://dx.doi.org/10.3982/ecta17944.

Abstract:
We introduce a generalization of the popular local-to-unity model of time series persistence by allowing for p autoregressive (AR) roots and p − 1 moving average (MA) roots close to unity. This generalized local-to-unity model, GLTU(p), induces convergence of the suitably scaled time series to a continuous time Gaussian ARMA(p, p − 1) process on the unit interval. Our main theoretical result establishes the richness of this model class, in the sense that it can well approximate a large class of processes with stationary Gaussian limits that are not entirely distinct from the unit root benchmark. We show that Campbell and Yogo's (2006) popular inference method for predictive regressions fails to control size in the GLTU(2) model with empirically plausible parameter values, and we propose a limited-information Bayesian framework for inference in the GLTU(p) model and apply it to quantify the uncertainty about the half-life of deviations from purchasing power parity.
31

Wu, Bingzhe, Chaochao Chen, Shiwan Zhao, et al. "Characterizing Membership Privacy in Stochastic Gradient Langevin Dynamics." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (2020): 6372–79. http://dx.doi.org/10.1609/aaai.v34i04.6107.

Abstract:
Bayesian deep learning has recently been regarded as an intrinsic way to characterize the weight uncertainty of deep neural networks (DNNs). Stochastic Gradient Langevin Dynamics (SGLD) is an effective method to enable Bayesian deep learning on large-scale datasets. Previous theoretical studies have shown various appealing properties of SGLD, ranging from convergence properties to generalization bounds. In this paper, we study the properties of SGLD from the novel perspective of membership privacy protection (i.e., preventing the membership attack). The membership attack, which aims to determine whether a specific sample was used for training a given DNN model, has emerged as a common threat against deep learning algorithms. To this end, we build a theoretical framework to analyze the information leakage (w.r.t. the training dataset) of a model trained using SGLD. Based on this framework, we demonstrate that SGLD can prevent the information leakage of the training dataset to a certain extent. Moreover, our theoretical analysis can be naturally extended to other types of Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods. Empirical results on different datasets and models verify our theoretical findings and suggest that the SGLD algorithm can not only reduce information leakage but also improve the generalization ability of DNN models in real-world applications.
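
The SGLD update adds Gaussian noise, scaled to the step size, to a stochastic gradient step on the log posterior. A toy sketch for Bayesian linear regression (a standard normal prior and unit observation noise variance are assumed for simplicity; not the paper's experimental setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_neg_log_post(theta, xb, yb, n_total, n_batch):
    """Minibatch gradient of -log posterior for linear regression
    with a N(0, I) prior and unit observation noise."""
    resid = xb @ theta - yb
    return theta + (n_total / n_batch) * xb.T @ resid

n, d, batch, eps = 1000, 3, 50, 1e-4
X = rng.standard_normal((n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(n)

theta, samples = np.zeros(d), []
for t in range(5000):
    idx = rng.integers(0, n, batch)
    g = grad_neg_log_post(theta, X[idx], y[idx], n, batch)
    # Langevin step: half a gradient step plus injected N(0, eps) noise.
    theta = theta - 0.5 * eps * g + np.sqrt(eps) * rng.standard_normal(d)
    if t > 1000:
        samples.append(theta.copy())
print(np.mean(samples, axis=0))  # approximate posterior mean
```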
32

Arshad, Muhammad, Salman A. Cheema, Juan L. G. Guirao, Juan M. Sánchez, and Adrián Valverde. "Assisting the decision making: A generalization of choice models to handle the binary choices." AIMS Mathematics 8, no. 2 (2023): 3083–100. http://dx.doi.org/10.3934/math.2023159.

Abstract:
This research fundamentally aims at providing a generalized framework to assist the launch of paired comparison models while dealing with discrete binary choices. The purpose is served by exploiting the fundamentals of the exponential family of distributions. The proposed generalization is proved to cater to seven paired comparison models as members of this newly developed mechanism. The legitimacy of the devised scheme is demonstrated through rigorous simulation-based investigation as well as keenly pursued empirical evaluations. A detailed analysis, covering a wide range of parametric settings, through the launch of the Gibbs sampler, a notable extension of Markov Chain Monte Carlo methods, is conducted under the Bayesian paradigm. The outcomes of this research substantiate the legitimacy of the devised general structure by not only successfully retaining the preference ordering but also by staying consistent with the established theoretical framework of comparative models.
33

Ghahramani, Zoubin. "An Introduction to Hidden Markov Models and Bayesian Networks." International Journal of Pattern Recognition and Artificial Intelligence 15, no. 01 (2001): 9–42. http://dx.doi.org/10.1142/s0218001401000836.

Abstract:
We provide a tutorial on learning and inference in hidden Markov models in the context of the recent literature on Bayesian networks. This perspective makes it possible to consider novel generalizations of hidden Markov models with multiple hidden state variables, multiscale representations, and mixed discrete and continuous variables. Although exact inference in these generalizations is usually intractable, one can use approximate inference algorithms such as Markov chain sampling and variational methods. We describe how such methods are applied to these generalized hidden Markov models. We conclude this review with a discussion of Bayesian methods for model selection in generalized HMMs.
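
As a taste of the mechanics the tutorial covers, the forward algorithm computes the likelihood of an observation sequence for a discrete-output HMM by dynamic programming (a generic sketch with made-up parameters):

```python
import numpy as np

def forward(pi, A, B, obs):
    """Likelihood p(obs) for a discrete HMM.
    pi: initial state probs (K,); A: transition matrix (K, K);
    B: emission probs (K, V); obs: list of observed symbol indices."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate states, weight by emission
    return alpha.sum()

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(forward(pi, A, B, [0, 1, 1, 0]))
```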
34

Chen, Haoyu. "The advance of neural networks generalization performance." Applied and Computational Engineering 5, no. 1 (2023): 818–25. http://dx.doi.org/10.54254/2755-2721/5/20230711.

Abstract:
A big mystery in deep learning is the promising generalization performance of massive neural networks. While over-parameterization increases the tendency to overfit in other machine learning models, neural networks seem to magically overcome this hurdle and achieve small test errors in various tasks. Researchers are motivated to resolve this enigma through a variety of aspects and methods, both theoretically and empirically. This paper aims to comprehensively review the explanations for the generalization power of deep networks. First, the review compares various types of generalization bounds under PAC-Bayes analysis and non-PAC-Bayesian settings. Then, work on regularizers, both explicit (e.g., dropout) and implicit (e.g., batch normalization), and on algorithm-induced regularization is reviewed. Some researchers also explore networks' generalization ability from other perspectives, and this review discusses works that investigate the relationship between images and generalization performance. Additionally, work on adversarial examples is included in this review, since adversarial attacks have challenged networks' power to generalize well and have become an important field in understanding deep learning. By collecting works from different viewpoints, this paper finally discusses some possible directions for the future.
35

Chomacki, Leszek, Janusz Rusek, and Leszek Słowik. "Selected Artificial Intelligence Methods in the Risk Analysis of Damage to Masonry Buildings Subject to Long-Term Underground Mining Exploitation." Minerals 11, no. 9 (2021): 958. http://dx.doi.org/10.3390/min11090958.

Abstract:
This paper presents an advanced computational approach to assess the risk of damage to masonry buildings subjected to negative kinematic impacts of underground mining exploitation. The research goals were achieved using selected tools from the area of artificial intelligence (AI) methods. Ultimately, two models of damage risk assessment were built using the Naive Bayes classifier (NBC) and Bayesian Networks (BN). The first model was used to compare results obtained using the more computationally advanced Bayesian network methodology. In the case of the Bayesian network, the unknown Directed Acyclic Graph (DAG) structure was extracted using Chow-Liu’s Tree Augmented Naive Bayes (TAN-CL) algorithm. Thus, one of the methods involving Bayesian Network Structure Learning from data (BNSL) was implemented. The application of this approach represents a novel scientific contribution in the interdisciplinary field of mining and civil engineering. The models created were verified with respect to quality of fit to observed data and generalization properties. The connections in the Bayesian network structure obtained were also verified with respect to the observed relations occurring in engineering practice concerning the assessment of the damage intensity to masonry buildings in mining areas. This allowed evaluation of the model and justified the utility of the conducted research in the field of protection of mining areas. The possibility of universal application of the Bayesian network, both in the case of damage prediction and diagnosis of its potential causes, was also pointed out.
36

Meng, Yao, Xianku Zhang, Guoqing Zhang, Xiufeng Zhang, and Yating Duan. "Sparse Bayesian Relevance Vector Machine Identification Modeling and Its Application to Ship Maneuvering Motion Prediction." Journal of Marine Science and Engineering 11, no. 8 (2023): 1572. http://dx.doi.org/10.3390/jmse11081572.

Abstract:
In order to establish a sparse and accurate ship motion prediction model, a novel Bayesian probability prediction model based on the relevance vector machine (RVM) was proposed for nonparametric modeling. The sparsity, effectiveness, and generalization of RVM were verified from two aspects: (1) the processed Sinc function dataset, and (2) the tank test dataset of the KRISO container ship (KCS) model. The KCS was taken as the main research plant, and motion prediction models of the KCS were obtained. The ε-support vector regression and ν-support vector regression were taken as the compared algorithms. The sparsity, effectiveness, and generalization of the three algorithms were analyzed. According to the trained prediction models of the three algorithms, the number of relevance vectors was compared with the number of support vectors. From the prediction results on the Sinc function and tank test datasets, the highest percentage of relevance vectors in the training sample was below 17%. The final prediction results indicated that the proposed nonparametric models had good prediction performance. They could ensure good sparsity while ensuring high prediction accuracy. Compared with the SVR, the prediction accuracy can be improved by more than 14.04%, and the time consumption was also relatively lower. A training model with good sparsity can reduce prediction time. This is essential for the online prediction of ship motion.
37

Coles, Darrell, and Andrew Curtis. "Efficient nonlinear Bayesian survey design using DN optimization." GEOPHYSICS 76, no. 2 (2011): Q1–Q8. http://dx.doi.org/10.1190/1.3552645.

Abstract:
A new method for fully nonlinear, Bayesian survey design renders the optimization of industrial-scale geoscientific surveys as a practical possibility. The method, DN optimization, designs surveys to maximally discriminate between different possible models. It is based on a generalization to nonlinear design problems of the D criterion (which is for linearized design problems). The main practical advantage of DN optimization is that it uses efficient algorithms developed originally for linearized design theory, resulting in lower computing and storage costs than for other nonlinear Bayesian design techniques. In a real example in which we optimized a seafloor microseismic sensor network to monitor a fractured petroleum reservoir, we compared DN optimization with two other networks: one proposed by an industrial contractor and one optimized using a linearized Bayesian design method. Our technique yielded a network with superior expected data quality in terms of reduced uncertainties on hypocenter locations.
38

Mambo, Lewis N. K. "From Multidimensional Ornstein - Uhlenbeck Process to Bayesian Vector Autoregressive Process." Journal of Mathematics Research 15, no. 1 (2023): 32. http://dx.doi.org/10.5539/jmr.v15n1p32.

Abstract:
The main purpose of this paper is to make the connection between stochastic analysis, Bayesian statistics, and time series analysis for policy analysis. This approach addresses a central problem of mathematical modelling, the presence of uncertainties in models and parameters, which reduces the effectiveness of policy analysis and forecasting. By using the multiple Itô integral, the multidimensional Ornstein-Uhlenbeck process can be written as a vector autoregressive process with lag 1 (VAR(1)), which is a generalization of the vector autoregressive process. The limitation of this approach is that it requires strong foundations in stochastic analysis, Bayesian statistics, and time series analysis.
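
The key fact behind this connection is the exact discretization of a multidimensional OU process dX_t = A X_t dt + Σ dW_t sampled at interval Δ (our transcription of the standard result):

```latex
\[
  X_{t+\Delta} = e^{A\Delta} X_t + \varepsilon_t,
  \qquad
  \varepsilon_t \sim \mathcal{N}\!\Big(0,\; \int_0^{\Delta} e^{As}\,\Sigma\Sigma^{\top} e^{A^{\top}s}\,ds\Big),
\]
```

so the VAR(1) coefficient matrix is the matrix exponential of the drift matrix times the sampling interval.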
39

Huang, Haibing, Zujie Xu, Xiaoliang Li, et al. "Predicting Rheological Properties of Asphalt Modified with Mineral Powder: Bagging, Boosting, and Stacking vs. Single Machine Learning Models." Materials 18, no. 12 (2025): 2913. https://doi.org/10.3390/ma18122913.

Abstract:
This study systematically compares the predictive performance of single machine learning (ML) models (KNN, Bayesian ridge regression, decision tree) and ensemble learning methods (bagging, boosting, stacking) for quantifying the rheological properties of mineral powder-modified asphalt, specifically the complex shear modulus (G*) and the phase angle (δ). We used two emulsifiers and three mineral powders for fabricating modified emulsified asphalt and conducting rheological property tests. Dynamic shear rheometer (DSR) test data were preprocessed using the local outlier factor (LOF) algorithm, followed by K-fold cross-validation (K = 5) and Bayesian optimization to tune model hyperparameters. This framework uniquely employs cross-validated predictions from base models as input features for the meta-learner, reducing information leakage and enhancing generalization. Because traditional single ML models struggle to characterize these properties accurately, an innovative stacking model was developed, integrating predictions from four heterogeneous base learners (KNN, decision tree (DT), random forest (RF), and XGBoost) with a Bayesian ridge regression meta-learner. Results demonstrate that ensemble models significantly outperform single models, with the stacking model achieving the highest accuracy (R2 = 0.9727 for G* and R2 = 0.9990 for δ). Shapley additive explanations (SHAP) analysis reveals temperature and mineral powder type as key factors, addressing the “black box” limitation of ML in materials science. This study validates the stacking model as a robust framework for optimizing asphalt mixture design, offering insights into material selection and pavement performance improvement.
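
The architecture described maps naturally onto scikit-learn's stacking API, which likewise feeds out-of-fold base-model predictions to the meta-learner; a minimal sketch with placeholder data (not the authors' DSR dataset or tuned hyperparameters):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import BayesianRidge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from xgboost import XGBRegressor

# Placeholder for the preprocessed DSR measurements.
X, y = make_regression(n_samples=500, n_features=6, noise=0.1, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("knn", KNeighborsRegressor()),
        ("dt", DecisionTreeRegressor(random_state=0)),
        ("rf", RandomForestRegressor(random_state=0)),
        ("xgb", XGBRegressor(random_state=0)),
    ],
    final_estimator=BayesianRidge(),
    cv=5,  # out-of-fold predictions feed the meta-learner, limiting leakage
)
stack.fit(X, y)
print(stack.score(X, y))  # R^2 (on training data, for illustration only)
```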
40

Sinha, Samarth, Homanga Bharadhwaj, Anirudh Goyal, Hugo Larochelle, Animesh Garg, and Florian Shkurti. "DIBS: Diversity Inducing Information Bottleneck in Model Ensembles." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (2021): 9666–74. http://dx.doi.org/10.1609/aaai.v35i11.17163.

Abstract:
Although deep learning models have achieved state-of-the-art performance on a number of vision tasks, generalization over high-dimensional multi-modal data and reliable predictive uncertainty estimation are still active areas of research. Bayesian approaches, including Bayesian Neural Nets (BNNs), do not scale well to modern computer vision tasks, as they are difficult to train and have poor generalization under dataset shift. This motivates the need for effective ensembles which can generalize and give reliable uncertainty estimates. In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity-inducing adversarial loss for learning the stochastic latent variables and thereby obtain the diversity in output predictions necessary for modeling multi-modal data. We evaluate our method on benchmark datasets (MNIST, CIFAR100, TinyImageNet, and MIT Places 2) and, compared to the most competitive baselines, show significant improvements: over 10% relative improvement in classification accuracy, over 5% relative improvement in generalization under dataset shift, and over 5% better predictive uncertainty estimation as inferred by efficient out-of-distribution (OOD) detection.
APA, Harvard, Vancouver, ISO, and other styles
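The paper's diversity-inducing objective is adversarial and operates on stochastic latent variables; the sketch below illustrates only the underlying idea—penalizing agreement between ensemble members' predictions alongside the task loss. Member architecture, data, and the 0.1 trade-off weight are all assumptions, not the paper's settings.

```python
# Minimal sketch: trade task accuracy against prediction diversity in an
# ensemble. NOT the paper's exact adversarial information-bottleneck loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

members = nn.ModuleList(
    [nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3)) for _ in range(4)]
)
opt = torch.optim.Adam(members.parameters(), lr=1e-3)
x, y = torch.randn(64, 10), torch.randint(0, 3, (64,))

logits = torch.stack([m(x) for m in members])  # (members, batch, classes)
task_loss = sum(F.cross_entropy(l, y) for l in logits) / len(members)

# Diversity term: average pairwise similarity of the members' softmax outputs;
# adding it to the loss pushes members toward different predictive modes.
probs = logits.softmax(dim=-1)
sim, pairs = 0.0, 0
for i in range(len(members)):
    for j in range(i + 1, len(members)):
        sim = sim + F.cosine_similarity(probs[i], probs[j], dim=-1).mean()
        pairs += 1
loss = task_loss + 0.1 * sim / pairs  # 0.1 is an illustrative trade-off weight

opt.zero_grad()
loss.backward()
opt.step()
```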
41

Tiwari, Pradeep Kumar, Pooja Singh, Navaneetha Krishnan Rajagopal, et al. "IoT-Based Reinforcement Learning Using Probabilistic Model for Determining Extensive Exploration through Computational Intelligence for Next-Generation Techniques." Computational Intelligence and Neuroscience 2023 (October 10, 2023): 1–13. http://dx.doi.org/10.1155/2023/5113417.

Full text
Abstract:
Computational intelligence is built on several learning and optimization techniques. Incorporating cutting-edge learning techniques to balance the interaction between exploitation and exploration is therefore an inspiring field, especially when combined with IoT. The reinforcement learning techniques developed in recent years have largely focused on incorporating deep learning to improve the generalization skills of the algorithm, while ignoring the issue of detecting and taking full advantage of the exploration–exploitation dilemma. To increase the effectiveness of exploration, a deep reinforcement learning algorithm based on computational intelligence is proposed in this study, using intelligent sensors and the Bayesian approach. In addition, the technique for computing the posterior distribution of parameters in Bayesian linear regression is extended to nonlinear models such as artificial neural networks. The Bayesian Bootstrap Deep Q-Network (BBDQN) algorithm is created by combining the bootstrapped DQN with the proposed computing technique. Finally, tests in two scenarios demonstrate that, when faced with severe exploration problems, BBDQN outperforms both DQN and bootstrapped DQN in terms of exploration efficiency.
APA, Harvard, Vancouver, ISO, and other styles
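The Bayesian linear-regression machinery the abstract extends to neural networks has a closed form worth recalling; the sketch below computes the conjugate posterior over weights and draws a sample of the kind a bootstrapped head can exploit for exploration. The precisions alpha and beta are illustrative, not the paper's values.

```python
# Conjugate Bayesian linear regression: prior w ~ N(0, alpha^{-1} I),
# Gaussian likelihood with noise precision beta (both values assumed).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=50)

alpha, beta = 1.0, 100.0                     # prior and noise precisions
S_inv = alpha * np.eye(3) + beta * X.T @ X   # posterior precision
S = np.linalg.inv(S_inv)                     # posterior covariance
m = beta * S @ X.T @ y                       # posterior mean

# Sampling w ~ N(m, S) yields the randomized value estimates that
# bootstrapped-style exploration relies on (linear case only).
w_sample = rng.multivariate_normal(m, S)
print("posterior mean:", m, "sample:", w_sample)
```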
42

Asnaashari, K., and R. V. Krems. "Gradient domain machine learning with composite kernels: improving the accuracy of PES and force fields for large molecules." Machine Learning: Science and Technology 3, no. 1 (2021): 015005. http://dx.doi.org/10.1088/2632-2153/ac3845.

Full text
Abstract:
The generalization accuracy of machine learning models of potential energy surfaces (PES) and force fields (FF) for large polyatomic molecules can be improved either by increasing the number of training points or by improving the models. In order to build accurate models based on expensive ab initio calculations, much of recent work has focused on the latter. In particular, it has been shown that gradient domain machine learning (GDML) models produce accurate results for high-dimensional molecular systems with a small number of ab initio calculations. The present work extends GDML to models with composite kernels built to maximize inference from a small number of molecular geometries. We illustrate that GDML models can be improved by increasing the complexity of the underlying kernels through a greedy search algorithm using the Bayesian information criterion as the model selection metric. We show that this requires including anisotropy in the kernel functions and produces models with significantly smaller generalization errors. The results are presented for ethanol, uracil, malonaldehyde and aspirin. For aspirin, the model with composite kernels trained by forces at 1000 randomly sampled molecular geometries produces a global 57-dimensional PES with a mean absolute accuracy of 0.177 kcal mol⁻¹ (61.9 cm⁻¹) and FFs with a mean absolute error of 0.457 kcal mol⁻¹ Å⁻¹.
APA, Harvard, Vancouver, ISO, and other styles
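A minimal sketch of greedy kernel composition under a BIC criterion, in the spirit of the search described above, using scikit-learn Gaussian processes on toy 1-D data; the GDML force-domain machinery and anisotropic molecular kernels are beyond this snippet.

```python
# Greedy composite-kernel search selected by BIC on toy 1-D regression data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=40)
n = len(y)

def bic(kernel):
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
    k = len(gp.kernel_.theta)  # number of fitted kernel hyperparameters
    return k * np.log(n) - 2 * gp.log_marginal_likelihood_value_

base = [RBF(), Matern(nu=2.5), RationalQuadratic()]
best = min(base, key=bic)
for _ in range(2):  # bounded greedy search depth
    # Extend the current kernel by either summing or multiplying a base kernel.
    candidates = [best + b for b in base] + [best * b for b in base]
    challenger = min(candidates, key=bic)
    if bic(challenger) >= bic(best):
        break
    best = challenger
print("selected kernel:", best)
```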
43

Dai, J., and R. V. Krems. "Quantum Gaussian process model of potential energy surface for a polyatomic molecule." Journal of Chemical Physics 156, no. 18 (2022): 184802. http://dx.doi.org/10.1063/5.0088821.

Full text
Abstract:
With gates of a quantum computer designed to encode multi-dimensional vectors, projections of quantum computer states onto specific qubit states can produce kernels of reproducing kernel Hilbert spaces. We show that quantum kernels obtained with a fixed ansatz implementable on current quantum computers can be used for accurate regression models of global potential energy surfaces (PESs) for polyatomic molecules. To obtain accurate regression models, we apply Bayesian optimization to maximize marginal likelihood by varying the parameters of the quantum gates. This yields Gaussian process models with quantum kernels. We illustrate the effect of qubit entanglement in the quantum kernels and explore the generalization performance of quantum Gaussian processes by extrapolating global six-dimensional PESs in the energy domain.
APA, Harvard, Vancouver, ISO, and other styles
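A toy rendering of the quantum-kernel idea: encode each input into a (here single-qubit) state and use the squared state overlap as a kernel inside Gaussian-process regression. Real models use multi-qubit entangling ansatze whose gate parameters are tuned by Bayesian optimization; the feature map below is a placeholder, not the paper's ansatz.

```python
# Fidelity-kernel sketch with a hand-rolled single-qubit feature map.
import numpy as np

def state(x):
    # Single-qubit "feature map": |psi(x)> = cos(x/2)|0> + sin(x/2)|1>
    return np.array([np.cos(x / 2), np.sin(x / 2)])

def quantum_kernel(a, b):
    # Kernel entry = squared overlap |<psi(x)|psi(z)>|^2
    return np.array([[abs(state(x) @ state(z)) ** 2 for z in b] for x in a])

rng = np.random.default_rng(3)
X = rng.uniform(0, np.pi, 25)
y = np.sin(3 * X)

K = quantum_kernel(X, X) + 1e-8 * np.eye(len(X))  # jitter for stability
weights = np.linalg.solve(K, y)                   # GP posterior-mean weights

X_test = np.linspace(0, np.pi, 5)
print(quantum_kernel(X_test, X) @ weights)        # posterior-mean predictions
```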
44

Zhu, Ruiqi. "Research on Stock Price Multi-Factor Prediction Model Based on Bayesian Model Averaging." Highlights in Business, Economics and Management 33 (May 9, 2024): 211–18. http://dx.doi.org/10.54097/s0amgb53.

Full text
Abstract:
Factor mining is a crucial component in constructing stock price prediction and quantitative models, and factor mining methods based on feature dimensionality reduction, such as Principal Component Analysis (PCA) and sufficient dimension reduction, are widely used. However, the connection structure between the quantitative factors extracted by these methods and stock prices is unknown, which can lead to overfitting or underfitting in prediction models. To address this, this paper proposes a multi-factor prediction model based on Bayesian model averaging. On one hand, the proposed method employs model averaging instead of model selection, effectively balancing the variance and bias of the prediction model. On the other hand, it can adaptively choose the sub-models that play a crucial role in predicting stock prices, thereby enhancing overall prediction accuracy. Empirical data analysis indicates that, compared to PCA-based Lasso and Ridge regression, the proposed method exhibits smaller mean squared error and a certain level of robustness. Lastly, by incorporating other model averaging techniques such as the Bagging algorithm, the generalization ability of the proposed method can be further improved.
APA, Harvard, Vancouver, ISO, and other styles
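A minimal sketch of Bayesian model averaging over candidate factor subsets: each sub-model is weighted by a BIC-approximated posterior probability and predictions are averaged rather than selected. The factors and linear sub-models are synthetic stand-ins for the paper's stock-price factors.

```python
# Bayesian model averaging across linear sub-models built from factor subsets.
import numpy as np
from itertools import combinations
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(120, 4))  # four candidate factors (synthetic)
y = 2 * X[:, 0] - X[:, 2] + rng.normal(scale=0.5, size=120)
n = len(y)

bics, preds = [], []
for r in (1, 2, 3):
    for cols in combinations(range(4), r):
        m = LinearRegression().fit(X[:, cols], y)
        rss = np.sum((y - m.predict(X[:, cols])) ** 2)
        bics.append(n * np.log(rss / n) + (r + 1) * np.log(n))  # Gaussian BIC
        preds.append(m.predict(X[:, cols]))

bics = np.array(bics)
w = np.exp(-0.5 * (bics - bics.min()))  # BIC-approximated posterior weights
w /= w.sum()
bma_pred = np.average(np.stack(preds), axis=0, weights=w)
print("largest sub-model weight:", w.max())
```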
45

Wu, Qiaoyun, Dinesh Manocha, Jun Wang, and Kai Xu. "NeoNav: Improving the Generalization of Visual Navigation via Generating Next Expected Observations." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 06 (2020): 10001–8. http://dx.doi.org/10.1609/aaai.v34i06.6556.

Full text
Abstract:
We propose improving the cross-target and cross-scene generalization of visual navigation through learning an agent that is guided by conceiving the next observations it expects to see. This is achieved by learning a variational Bayesian model, called NeoNav, which generates the next expected observations (NEO) conditioned on the current observations of the agent and the target view. Our generative model is learned by optimizing a variational objective encompassing two key designs. First, the latent distribution is conditioned on current observations and the target view, leading to model-based, target-driven navigation. Second, the latent space is modeled with a Mixture of Gaussians conditioned on the current observation and the next best action. Our use of a mixture-of-posteriors prior effectively alleviates the issue of an over-regularized latent space, thus significantly boosting model generalization for new targets and in novel scenes. Moreover, the NEO generation models the forward dynamics of agent–environment interaction, which improves the quality of approximate inference and hence benefits data efficiency. We have conducted extensive evaluations on both real-world and synthetic benchmarks, and show that our model consistently outperforms state-of-the-art models in terms of success rate, data efficiency, and generalization.
APA, Harvard, Vancouver, ISO, and other styles
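The one non-standard ingredient above—a KL term against a mixture-of-Gaussians prior—has no closed form and is typically estimated by Monte Carlo, as in this PyTorch sketch; the latent dimension and the two mixture components are illustrative, not NeoNav's.

```python
# Monte Carlo estimate of KL(q || p) where p is a mixture-of-Gaussians prior.
import torch
import torch.distributions as D

q = D.Normal(torch.zeros(8), torch.ones(8))              # approximate posterior
mix = D.Categorical(torch.tensor([0.5, 0.5]))
comp = D.Independent(D.Normal(torch.randn(2, 8), torch.ones(2, 8)), 1)
prior = D.MixtureSameFamily(mix, comp)                   # MoG prior

z = q.rsample((256,))                                    # Monte Carlo samples
kl = (q.log_prob(z).sum(-1) - prior.log_prob(z)).mean()  # E_q[log q - log p]
print("estimated KL(q || MoG prior):", kl.item())
```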
46

Zhang, Min. "Deep Residual Networks and Bayesian Data Priors in the Survival Prediction and Classification." Highlights in Science, Engineering and Technology 56 (July 14, 2023): 139–47. http://dx.doi.org/10.54097/hset.v56i.10095.

Full text
Abstract:
Sepsis is a highly lethal disease in intensive care units, and patient indicators change constantly, making accurate prediction of patient mortality crucial for doctors to develop appropriate treatment plans. While machine learning and deep learning have been applied to sepsis research, model generalization performance can suffer from underfitting or gradient issues. To address these challenges, we propose two models—a deep residual network and a deep residual network incorporating a Bayesian data prior—to predict patient mortality using the MIMIC-III dataset. The accuracy of the two models on the validation set reached 0.9392 and 0.9329, respectively.
APA, Harvard, Vancouver, ISO, and other styles
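For reference, a minimal residual block of the kind such networks stack; the identity skip connection is what mitigates the gradient issues the abstract mentions. Dimensions are illustrative, not the paper's MIMIC-III architecture.

```python
# Minimal residual block: output = activation(x + F(x)).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(dim, dim), nn.BatchNorm1d(dim), nn.ReLU(),
            nn.Linear(dim, dim), nn.BatchNorm1d(dim),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.body(x))  # identity skip connection

net = nn.Sequential(nn.Linear(32, 64), ResidualBlock(64), nn.Linear(64, 2))
print(net(torch.randn(16, 32)).shape)      # torch.Size([16, 2])
```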
47

Nicora, Giovanna, Michele Catalano, Chandra Bortolotto, et al. "Bayesian Networks in the Management of Hospital Admissions: A Comparison between Explainable AI and Black Box AI during the Pandemic." Journal of Imaging 10, no. 5 (2024): 117. http://dx.doi.org/10.3390/jimaging10050117.

Full text
Abstract:
Artificial Intelligence (AI) and Machine Learning (ML) approaches that can learn from large data sources have been identified as useful tools to support clinicians in their decisional process; AI and ML implementations accelerated rapidly during the recent COVID-19 pandemic. However, many ML classifiers are “black boxes” to the final user, since their underlying reasoning process is often obscure. Additionally, the performance of such models suffers from poor generalization ability in the presence of dataset shifts. Here, we present a comparison between an explainable-by-design (“white box”) model, a Bayesian Network (BN), and a black box model, a Random Forest, both studied with the aim of supporting clinicians of Policlinico San Matteo University Hospital in Pavia (Italy) during the triage of COVID-19 patients. Our aim is to evaluate whether the BN's predictive performance is comparable with that of a widely used but less explainable ML model such as a Random Forest, and to test the generalization ability of the ML models across different waves of the pandemic.
APA, Harvard, Vancouver, ISO, and other styles
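A compact sketch of the white-box versus black-box comparison, with Gaussian naive Bayes standing in for the paper's full Bayesian Network (fitting a general BN requires a dedicated library such as pgmpy) and synthetic data in place of the triage cohort.

```python
# White-box vs. black-box classifier comparison on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("white box (naive Bayes stand-in)", GaussianNB()),
                  ("black box (random forest)", RandomForestClassifier())]:
    clf.fit(X_tr, y_tr)
    print(name, "accuracy:", clf.score(X_te, y_te))
```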
48

TIAN, LIANG, and AFZEL NOORE. "SOFTWARE RELIABILITY PREDICTION USING RECURRENT NEURAL NETWORK WITH BAYESIAN REGULARIZATION." International Journal of Neural Systems 14, no. 03 (2004): 165–74. http://dx.doi.org/10.1142/s0129065704001966.

Full text
Abstract:
A recurrent neural network modeling approach for software reliability prediction with respect to cumulative failure time is proposed. Our proposed network structure has the capability of learning and recognizing the inherent internal temporal properties of the cumulative failure time sequence. Further, by adding a penalty term on the sum of squared network connection weights, Bayesian regularization is applied to our network training scheme to improve the generalization capability and lower the susceptibility to overfitting. The performance of our proposed approach has been tested using four real-time control and flight dynamic application data sets. Numerical results show that our proposed approach is robust across different software projects, and has better performance with respect to both goodness-of-fit and next-step predictability compared to existing neural network models for failure time prediction.
APA, Harvard, Vancouver, ISO, and other styles
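The regularization scheme above reduces, in its simplest form, to adding a weight-norm penalty to the prediction loss; the PyTorch sketch below shows that term on a toy cumulative-failure-time sequence. The full Bayesian scheme also adapts the penalty weight, which here is fixed and assumed.

```python
# RNN one-step prediction of cumulative failure time with a weight penalty.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=8, batch_first=True)
head = nn.Linear(8, 1)
opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()), lr=1e-3)

seq = torch.cumsum(torch.rand(4, 10, 1), dim=1)  # toy cumulative failure times
x, target = seq[:, :-1], seq[:, 1:]              # predict the next value

out, _ = rnn(x)
mse = nn.functional.mse_loss(head(out), target)
penalty = sum((p ** 2).sum() for p in rnn.parameters())  # sum of squared weights
loss = mse + 1e-4 * penalty                              # fixed weight (assumed)

opt.zero_grad()
loss.backward()
opt.step()
```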
49

EL-Morshedy, Mahmoud, Fahad Sameer Alshammari, Abhishek Tyagi, Iberahim Elbatal, Yasser S. Hamed, and Mohamed S. Eliwa. "Bayesian and Frequentist Inferences on a Type I Half-Logistic Odd Weibull Generator with Applications in Engineering." Entropy 23, no. 4 (2021): 446. http://dx.doi.org/10.3390/e23040446.

Full text
Abstract:
In this article, we propose a new generalization of the odd Weibull-G family by consolidating two notable families of distributions. We derive various mathematical properties of the proposed family, including the quantile function, skewness, kurtosis, moments, incomplete moments, mean deviation, Bonferroni and Lorenz curves, probability weighted moments, moments of (reversed) residual lifetime, entropy, and order statistics. After producing the general class, two of the corresponding parametric statistical models are outlined. The hazard rate function of the sub-models can take a variety of shapes, such as increasing, decreasing, unimodal, and bathtub-shaped, for different values of the parameters. Furthermore, the sub-models of the introduced family are also capable of modelling symmetric and skewed data. The parameter estimation of the special models is discussed through numerous methods, namely maximum likelihood, simple least squares, weighted least squares, Cramér–von Mises, and Bayesian estimation. Under the Bayesian framework, we use informative and non-informative priors to obtain Bayes estimates of the unknown parameters under the squared error and generalized entropy loss functions. An extensive Monte Carlo simulation is conducted to assess the effectiveness of these estimation techniques. The applicability of two sub-models of the proposed family is illustrated by means of two real data sets.
APA, Harvard, Vancouver, ISO, and other styles
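To illustrate the Bayesian side of the estimation menu above without reproducing the generated family's CDF, here is a tiny Metropolis sampler for a plain Weibull model with a vague prior; under squared error loss the Bayes estimate is simply the posterior mean.

```python
# Metropolis sampler for Weibull (shape, scale); Bayes estimate under
# squared error loss = posterior mean. The Weibull is a stand-in model.
import numpy as np

rng = np.random.default_rng(5)
data = rng.weibull(a=1.5, size=100) * 2.0  # shape 1.5, scale 2.0

def log_post(shape, scale):
    if shape <= 0 or scale <= 0:
        return -np.inf
    z = data / scale
    loglik = np.sum(np.log(shape / scale) + (shape - 1) * np.log(z) - z ** shape)
    return loglik - np.log(shape) - np.log(scale)  # vague log-uniform prior

theta = np.array([1.0, 1.0])
samples = []
for _ in range(5000):
    prop = theta + rng.normal(scale=0.1, size=2)       # random-walk proposal
    if np.log(rng.uniform()) < log_post(*prop) - log_post(*theta):
        theta = prop
    samples.append(theta.copy())

post = np.array(samples[1000:])                        # drop burn-in
print("posterior means (shape, scale):", post.mean(axis=0))
```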
50

Mahajan, Akash, Srijita Das, Wencong Su, and Van-Hai Bui. "Bayesian-Neural-Network-Based Approach for Probabilistic Prediction of Building-Energy Demands." Sustainability 16, no. 22 (2024): 9943. http://dx.doi.org/10.3390/su16229943.

Full text
Abstract:
Reliable prediction of building-level energy demand is crucial for building managers to optimize and regulate energy consumption. Conventional prediction models omit the uncertainties associated with demand over time; hence, they are often inaccurate and unreliable. In this study, a Bayesian neural network (BNN)-based probabilistic prediction model is proposed to tackle this challenge. By quantifying uncertainty, BNNs provide probabilistic predictions that capture the variations in energy demand. The proposed model is trained and evaluated on a subset of the building operations dataset of Lawrence Berkeley National Laboratory (LBNL), Berkeley, California, which includes diverse attributes related to climate and key building-performance indicators. We performed thorough hyperparameter tuning and used fixed-horizon validation to evaluate the trained models on various test data to assess generalization ability. To validate the results, quantile random forest (QRF) was used as a benchmark. The study also compares the BNN with an LSTM, showing that the BNN outperforms the LSTM in uncertainty quantification.
APA, Harvard, Vancouver, ISO, and other styles
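As a lightweight stand-in for the probabilistic prediction described above, the sketch below uses Monte Carlo dropout—a common cheap approximation to a full BNN—to produce a predictive mean and uncertainty band; the architecture, dropout rate, and features are assumptions.

```python
# MC-dropout probabilistic prediction: keep dropout active at inference
# and treat repeated stochastic passes as posterior predictive samples.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 1))
x = torch.randn(32, 6)  # stand-in for climate/building-performance features

net.train()             # keep dropout stochastic at inference time
with torch.no_grad():
    draws = torch.stack([net(x) for _ in range(100)])  # 100 stochastic passes

mean = draws.mean(dim=0)  # point forecast of energy demand
std = draws.std(dim=0)    # predictive uncertainty
print(mean[:3].squeeze(), std[:3].squeeze())
```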
