
Journal articles on the topic 'Out-of-distribution generalization'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Out-of-distribution generalization.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Ye, Nanyang, Lin Zhu, Jia Wang, et al. "Certifiable Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (2023): 10927–35. http://dx.doi.org/10.1609/aaai.v37i9.26295.

Full text
Abstract:
Machine learning methods suffer from test-time performance degradation when faced with out-of-distribution (OoD) data whose distribution is not necessarily the same as the training data distribution. Although a plethora of algorithms has been proposed to mitigate this issue, it has been demonstrated that simultaneously achieving better performance than ERM on different types of distribution-shift datasets is challenging for existing approaches. Moreover, without theoretical guarantees, it is unknown how and to what extent these methods work on any given OoD datum. In this paper, we propose a certifiable out-of-distribution generalization method that provides provable OoD generalization performance guarantees via a functional optimization framework leveraging random distributions and max-margin learning for each input datum. With this approach, the proposed algorithmic scheme can provide certified accuracy for each input datum's prediction on the semantic space and achieves better performance simultaneously on OoD datasets dominated by correlation shifts or diversity shifts. Our code is available at https://github.com/ZlatanWilliams/StochasticDisturbanceLearning.
APA, Harvard, Vancouver, ISO, and other styles
2

Liu, Bowen, Haoyang Li, Shuning Wang, Shuo Nie, and Shanghang Zhang. "Subgraph Aggregation for Out-of-Distribution Generalization on Graphs." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 18 (2025): 18763–71. https://doi.org/10.1609/aaai.v39i18.34065.

Full text
Abstract:
Out-of-distribution (OOD) generalization in Graph Neural Networks (GNNs) has gained significant attention due to its critical importance in graph-based predictions in real-world scenarios. Existing methods primarily focus on extracting a single causal subgraph from the input graph to achieve generalizable predictions. However, relying on a single subgraph can lead to susceptibility to spurious correlations and is insufficient for learning invariant patterns behind graph data. Moreover, in many real-world applications, such as molecular property prediction, multiple critical subgraphs may influence the target label property. To address these challenges, we propose a novel framework, SubGraph Aggregation (SuGAr), designed to learn a diverse set of subgraphs that are crucial for OOD generalization on graphs. Specifically, SuGAr employs a tailored subgraph sampler and diversity regularizer to extract a diverse set of invariant subgraphs. These invariant subgraphs are then aggregated by averaging their representations, which enriches the subgraph signals and enhances coverage of the underlying causal structures, thereby improving OOD generalization. Extensive experiments on both synthetic and real-world datasets demonstrate that SuGAr outperforms state-of-the-art methods, achieving up to a 24% improvement in OOD generalization on graphs. To the best of our knowledge, this is the first work to study graph OOD generalization by learning multiple invariant subgraphs.
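The aggregation step is easy to picture in code. Below is a minimal PyTorch sketch, under our own assumptions, of averaging the representations of K extracted subgraphs, together with one plausible diversity regularizer that discourages the subgraph encoders from collapsing onto the same pattern; the paper's actual sampler and regularizer may differ.

```python
import torch
import torch.nn.functional as F

def aggregate_subgraphs(subgraph_embs: list[torch.Tensor]) -> torch.Tensor:
    # subgraph_embs: K tensors of shape [batch, dim], one per invariant subgraph.
    # SuGAr-style aggregation: average the subgraph representations.
    return torch.stack(subgraph_embs, dim=0).mean(dim=0)

def diversity_penalty(subgraph_embs: list[torch.Tensor]) -> torch.Tensor:
    # Hypothetical regularizer: penalize pairwise cosine similarity between
    # the batch-averaged representations of different subgraphs.
    z = torch.stack([F.normalize(e.mean(dim=0), dim=-1) for e in subgraph_embs])
    sim = z @ z.T                                  # [K, K] cosine similarities
    return (sim - torch.eye(len(subgraph_embs))).abs().mean()
```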
APA, Harvard, Vancouver, ISO, and other styles
3

Yuan, Lingxiao, Harold S. Park, and Emma Lejeune. "Towards out of distribution generalization for problems in mechanics." Computer Methods in Applied Mechanics and Engineering 400 (October 2022): 115569. http://dx.doi.org/10.1016/j.cma.2022.115569.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Liu, Anji, Hongming Xu, Guy Van den Broeck, and Yitao Liang. "Out-of-Distribution Generalization by Neural-Symbolic Joint Training." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 10 (2023): 12252–59. http://dx.doi.org/10.1609/aaai.v37i10.26444.

Full text
Abstract:
This paper develops a novel methodology to simultaneously learn a neural network and extract generalized logic rules. Different from prior neural-symbolic methods that require background knowledge and candidate logical rules to be provided, we aim to induce task semantics with minimal priors. This is achieved by a two-step learning framework that iterates between optimizing neural predictions of task labels and searching for a more accurate representation of the hidden task semantics. Notably, supervision works in both directions: (partially) induced task semantics guide the learning of the neural network and induced neural predictions admit an improved semantic representation. We demonstrate that our proposed framework is capable of achieving superior out-of-distribution generalization performance on two tasks: (i) learning multi-digit addition, where it is trained on short sequences of digits and tested on long sequences of digits; (ii) predicting the optimal action in the Tower of Hanoi, where the model is challenged to discover a policy independent of the number of disks in the puzzle.
APA, Harvard, Vancouver, ISO, and other styles
5

Yu, Yemin, Luotian Yuan, Ying Wei, et al. "RetroOOD: Understanding Out-of-Distribution Generalization in Retrosynthesis Prediction." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 1 (2024): 374–82. http://dx.doi.org/10.1609/aaai.v38i1.27791.

Full text
Abstract:
Machine learning-assisted retrosynthesis prediction models have been gaining widespread adoption, though their performance oftentimes degrades significantly when deployed in real-world applications embracing out-of-distribution (OOD) molecules or reactions. Despite steady progress on standard benchmarks, our understanding of existing retrosynthesis prediction models in the presence of distribution shifts remains limited. To this end, we first formally sort out two types of distribution shifts in retrosynthesis prediction and construct two groups of benchmark datasets. Next, through comprehensive experiments, we systematically compare state-of-the-art retrosynthesis prediction models on the two groups of benchmarks, revealing the limitations of previous in-distribution evaluation and re-examining the advantages of each model. More remarkably, we are motivated by the above empirical insights to propose two model-agnostic techniques that can improve the OOD generalization of arbitrary off-the-shelf retrosynthesis prediction algorithms. Our preliminary experiments show their high potential with an average performance improvement of 4.6%, and the established benchmarks serve as a foothold for further retrosynthesis prediction research towards OOD generalization.
APA, Harvard, Vancouver, ISO, and other styles
6

Du, Hongyi, Xuewei Li, and Minglai Shao. "Graph out-of-distribution generalization through contrastive learning paradigm." Knowledge-Based Systems 315 (April 2025): 113316. https://doi.org/10.1016/j.knosys.2025.113316.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Xu, Yiming, Bin Shi, Zhen Peng, Huixiang Liu, Bo Dong, and Chen Chen. "Out-of-Distribution Generalization on Graphs via Progressive Inference." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 12 (2025): 12963–71. https://doi.org/10.1609/aaai.v39i12.33414.

Full text
Abstract:
The development and evaluation of graph neural networks (GNNs) generally follow the independent and identically distributed (i.i.d.) assumption. Yet this assumption is often untenable in practice due to the uncontrollable data generation mechanism. In particular, when the data distribution shows a significant shift, most GNNs would fail to produce reliable predictions and may even make decisions randomly. One of the most promising solutions to improve the model generalization is to pick out causal invariant parts in the input graph. Nonetheless, we observe a significant distribution gap between the causal parts learned by existing methods and the ground-truth, leading to undesirable performance. In response to the above issues, this paper presents GPro, a model that learns graph causal invariance with progressive inference. Specifically, the complicated graph causal invariant learning is decomposed into multiple intermediate inference steps from easy to hard, and the perception of GPro is continuously strengthened through a progressive inference process to extract causal features that are stable to distribution shifts. We also enlarge the training distribution by creating counterfactual samples to enhance the capability of the GPro in capturing the causal invariant parts. Extensive experiments demonstrate that our proposed GPro outperforms the state-of-the-art methods by 4.91% on average. For datasets with more severe distribution shifts, the performance improvement can be up to 6.86%.
APA, Harvard, Vancouver, ISO, and other styles
8

Zhu, Lin, Xinbing Wang, Chenghu Zhou, and Nanyang Ye. "Bayesian Cross-Modal Alignment Learning for Few-Shot Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (2023): 11461–69. http://dx.doi.org/10.1609/aaai.v37i9.26355.

Full text
Abstract:
Recent advances in large pre-trained models have shown promising results in few-shot learning. However, their generalization ability on two-dimensional Out-of-Distribution (OoD) data, i.e., correlation shift and diversity shift, has not been thoroughly investigated. Research has shown that, even with a significant amount of training data, few methods can achieve better OoD generalization performance than the standard empirical risk minimization (ERM) method. This few-shot OoD generalization dilemma emerges as a challenging direction in deep neural network generalization research, where performance suffers from both overfitting on few-shot examples and OoD generalization errors. In this paper, leveraging a broader supervision source, we explore a novel Bayesian cross-modal image-text alignment learning method (Bayes-CAL) to address this issue. Specifically, the model is designed so that only text representations are fine-tuned, via a Bayesian modelling approach with a gradient orthogonalization loss and an invariant risk minimization (IRM) loss. The Bayesian approach is essentially introduced to avoid overfitting the base classes observed during training and to improve generalization to broader unseen classes. The dedicated loss is introduced to achieve better image-text alignment by disentangling the causal and non-causal parts of image features. Numerical experiments demonstrate that Bayes-CAL achieves state-of-the-art OoD generalization performance on two-dimensional distribution shifts. Moreover, compared with CLIP-like models, Bayes-CAL yields more stable generalization performance on unseen classes. Our code is available at https://github.com/LinLLLL/BayesCAL.
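The IRM loss named in this abstract is commonly instantiated as the IRMv1 penalty of Arjovsky et al. (2019). The following PyTorch sketch shows that generic penalty, not the Bayes-CAL implementation itself: the squared gradient of the per-environment risk with respect to a frozen scalar "dummy classifier".

```python
import torch
import torch.nn.functional as F

def irmv1_penalty(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Squared norm of the gradient of the risk w.r.t. a fixed dummy scale of 1.0;
    # summed over environments and added to the ERM risk, this gives IRMv1.
    scale = torch.ones(1, device=logits.device, requires_grad=True)
    risk = F.cross_entropy(logits * scale, labels)
    (grad,) = torch.autograd.grad(risk, scale, create_graph=True)
    return (grad ** 2).sum()
```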
APA, Harvard, Vancouver, ISO, and other styles
9

Lavda, Frantzeska, and Alexandros Kalousis. "Semi-Supervised Variational Autoencoders for Out-of-Distribution Generation." Entropy 25, no. 12 (2023): 1659. http://dx.doi.org/10.3390/e25121659.

Full text
Abstract:
Humans are able to quickly adapt to new situations, learn effectively with limited data, and create unique combinations of basic concepts. In contrast, generalizing out-of-distribution (OOD) data and achieving combinatorial generalizations are fundamental challenges for machine learning models. Moreover, obtaining high-quality labeled examples can be very time-consuming and expensive, particularly when specialized skills are required for labeling. To address these issues, we propose BtVAE, a method that utilizes conditional VAE models to achieve combinatorial generalization in certain scenarios and consequently to generate out-of-distribution (OOD) data in a semi-supervised manner. Unlike previous approaches that use new factors of variation during testing, our method uses only existing attributes from the training data but in ways that were not seen during training (e.g., small objects of a specific shape during training and large objects of the same shape during testing).
APA, Harvard, Vancouver, ISO, and other styles
10

Zhang, Xiao, Sunhao Dai, Jun Xu, Yong Liu, and Zhenhua Dong. "AdaO2B: Adaptive Online to Batch Conversion for Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 21 (2025): 22596–604. https://doi.org/10.1609/aaai.v39i21.34418.

Full text
Abstract:
Online to batch conversion involves constructing a new batch learner by utilizing a series of models generated by an existing online learning algorithm, to achieve generalization guarantees under the i.i.d. assumption. However, when applied to real-world streaming applications such as streaming recommender systems, the data stream may be sampled from time-varying distributions instead of persistently being i.i.d. This poses a challenge in terms of out-of-distribution (OOD) generalization. Existing approaches employ fixed conversion mechanisms that are unable to adapt to novel testing distributions, hindering the testing accuracy of the batch learner. To address these issues, we propose AdaO2B, an adaptive online to batch conversion approach under the bandit setting. AdaO2B is designed to be aware of distribution shifts in the testing data and achieves OOD generalization guarantees. Specifically, AdaO2B can dynamically combine the sequence of models learned by a contextual bandit algorithm and determine appropriate combination weights using a context-aware weighting function. This innovative approach allows for the conversion of a sequence of models into a batch learner that facilitates OOD generalization. Theoretical analysis provides justification for why and how the learned adaptive batch learner can achieve OOD generalization error guarantees. Experimental results demonstrate that AdaO2B significantly outperforms state-of-the-art baselines on both synthetic and real-world recommendation datasets.
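The conversion idea, turning a sequence of bandit-learned checkpoints into one batch learner via a context-aware weighting function, can be sketched as below. The class name and the linear weighting function are our assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ContextAwareEnsemble(nn.Module):
    # Combine K models from an online learner with softmax weights produced
    # by a small context-conditioned weighting function.
    def __init__(self, models: list[nn.Module], context_dim: int):
        super().__init__()
        self.models = nn.ModuleList(models)
        self.weight_fn = nn.Linear(context_dim, len(models))

    def forward(self, x: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.weight_fn(context), dim=-1)       # [batch, K]
        preds = torch.stack([m(x) for m in self.models], dim=1)  # [batch, K, out]
        return (w.unsqueeze(-1) * preds).sum(dim=1)
```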
APA, Harvard, Vancouver, ISO, and other styles
11

Su, Hang, and Wei Wang. "An Out-of-Distribution Generalization Framework Based on Variational Backdoor Adjustment." Mathematics 12, no. 1 (2023): 85. http://dx.doi.org/10.3390/math12010085.

Full text
Abstract:
In practical applications, learning models that can perform well even when the data distribution is different from the training set is essential and meaningful. Such problems are often referred to as out-of-distribution (OOD) generalization problems. In this paper, we propose a method for OOD generalization based on causal inference. Unlike the prevalent OOD generalization methods, our approach does not require the environment labels associated with the data in the training set. We analyze the causes of distributional shifts in data from a causal modeling perspective and then propose a backdoor adjustment method based on variational inference. Finally, we construct a unique network structure to simulate the variational inference process. The proposed variational backdoor adjustment (VBA) framework can be combined with any mainstream backbone network. In addition to theoretical derivation, we conduct experiments on different datasets to demonstrate that our method performs well in prediction accuracy and generalization gaps. Furthermore, by comparing the VBA framework with other mainstream OOD methods, we show that VBA performs better than mainstream methods.
APA, Harvard, Vancouver, ISO, and other styles
12

Cao, Linfeng, Aofan Jiang, Wei Li, Huaying Wu, and Nanyang Ye. "OoDHDR-Codec: Out-of-Distribution Generalization for HDR Image Compression." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (2022): 158–66. http://dx.doi.org/10.1609/aaai.v36i1.19890.

Full text
Abstract:
Recently, deep learning has been proven to be a promising approach to standard dynamic range (SDR) image compression. However, due to the wide luminance distribution of high dynamic range (HDR) images and the lack of large standard datasets, developing a deep model for HDR image compression is much more challenging. To tackle this issue, we view HDR data as distributional shifts of SDR data, so that HDR image compression can be modeled as an out-of-distribution (OoD) generalization problem. Herein, we propose a novel OoD HDR image compression framework (OoDHDR-codec). It learns the general representation across HDR and SDR environments, and allows the model to be trained effectively using a large set of SDR datasets supplemented with much fewer HDR samples. Specifically, OoDHDR-codec consists of two branches to process the data from the two environments. The SDR branch is a standard blackbox network. For the HDR branch, we develop a hybrid system that models luminance masking and tone mapping with white-box modules and performs content compression with black-box neural networks. To improve the generalization from SDR training data to HDR data, we introduce an invariance regularization term to learn the common representation for both SDR and HDR compression. Extensive experimental results show that OoDHDR-codec achieves strongly competitive in-distribution performance and state-of-the-art OoD performance. To the best of our knowledge, this is the first work to model HDR compression as an OoD generalization problem, and our OoD generalization algorithmic framework can be applied to any deep compression model in addition to the network architectural choice demonstrated in the paper. Code available at https://github.com/caolinfeng/OoDHDR-codec.
APA, Harvard, Vancouver, ISO, and other styles
13

Li, Jiacheng, and Min Yang. "Dual-branch neural operator for enhanced out-of-distribution generalization." Engineering Analysis with Boundary Elements 171 (February 2025): 106082. https://doi.org/10.1016/j.enganabound.2024.106082.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Deng, Bin, and Kui Jia. "Counterfactual Supervision-Based Information Bottleneck for Out-of-Distribution Generalization." Entropy 25, no. 2 (2023): 193. http://dx.doi.org/10.3390/e25020193.

Full text
Abstract:
Learning invariant (causal) features for out-of-distribution (OOD) generalization has attracted extensive attention recently, and among the proposals, invariant risk minimization (IRM) is a notable solution. In spite of its theoretical promise for linear regression, the challenges of using IRM in linear classification problems remain. By introducing the information bottleneck (IB) principle into the learning of IRM, the IB-IRM approach has demonstrated its power to solve these challenges. In this paper, we further improve IB-IRM from two aspects. First, we show that the key assumption of support overlap of invariant features used in IB-IRM to guarantee OOD generalization can be relaxed: it is still possible to achieve the optimal solution without this assumption. Second, we illustrate two failure modes where IB-IRM (and IRM) could fail to learn the invariant features, and to address such failures, we propose a Counterfactual Supervision-based Information Bottleneck (CSIB) learning algorithm that recovers the invariant features. By requiring counterfactual inference, CSIB works even when accessing data from a single environment. Empirical experiments on several datasets verify our theoretical results.
APA, Harvard, Vancouver, ISO, and other styles
15

Ashok, Arjun, Chaitanya Devaguptapu, and Vineeth N. Balasubramanian. "Learning Modular Structures That Generalize Out-of-Distribution (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 11 (2022): 12905–6. http://dx.doi.org/10.1609/aaai.v36i11.21589.

Full text
Abstract:
Out-of-distribution (O.O.D.) generalization remains a key challenge for real-world machine learning systems. We describe a method for O.O.D. generalization that, through training, encourages models to preserve only those features in the network that are well reused across multiple training domains. Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network to extract a modular sub-network that achieves better O.O.D. performance than the original network. Preliminary evaluation on two benchmark datasets corroborates the promise of our method.
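A probabilistic differentiable binary mask of the kind described is often realized with a Gumbel-sigmoid (binary-concrete) relaxation. The sketch below is our illustrative reconstruction, not the authors' exact parameterization.

```python
import torch
import torch.nn as nn

class DifferentiableBinaryMask(nn.Module):
    # Trainable mask logits, relaxed with logistic noise so gradients flow;
    # after training, the mask can be thresholded to extract a sub-network.
    def __init__(self, shape: tuple[int, ...], temperature: float = 0.5):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(shape))
        self.temperature = temperature

    def forward(self) -> torch.Tensor:
        u = torch.rand_like(self.logits).clamp(1e-6, 1 - 1e-6)
        noise = torch.log(u) - torch.log(1 - u)   # Logistic(0, 1) sample
        return torch.sigmoid((self.logits + noise) / self.temperature)
```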
APA, Harvard, Vancouver, ISO, and other styles
16

Zou, Xin, and Weiwei Liu. "Coverage-Guaranteed Prediction Sets for Out-of-Distribution Data." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 15 (2024): 17263–70. http://dx.doi.org/10.1609/aaai.v38i15.29673.

Full text
Abstract:
Out-of-distribution (OOD) generalization has attracted increasing research attention in recent years, due to its promising experimental results in real-world applications. In this paper, we study the confidence set prediction problem in the OOD generalization setting. Split conformal prediction (SCP) is an efficient framework for handling the confidence set prediction problem. However, the validity of SCP requires the examples to be exchangeable, which is violated in the OOD setting. Empirically, we show that trivially applying SCP results in a failure to maintain the marginal coverage when the unseen target domain is different from the source domain. To address this issue, we develop a method for forming confident prediction sets in the OOD setting and theoretically prove the validity of our method. Finally, we conduct experiments on simulated data to empirically verify the correctness of our theory and the validity of our proposed method.
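For context, the split conformal prediction baseline, whose coverage guarantee rests on exchangeability and therefore breaks under distribution shift, fits in a few lines of NumPy. This sketch uses the common 1 − p̂_y nonconformity score; the paper's OOD-corrected procedure is not shown.

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    # cal_probs: [n, K] softmax outputs on a held-out calibration split.
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]    # nonconformity scores
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)  # finite-sample correction
    q = np.quantile(scores, level, method="higher")
    # A class joins the prediction set when its nonconformity is below q;
    # marginal coverage >= 1 - alpha holds only for exchangeable data.
    return test_probs >= 1.0 - q
```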
APA, Harvard, Vancouver, ISO, and other styles
17

Bai, Haoyue, Rui Sun, Lanqing Hong, et al. "DecAug: Out-of-Distribution Generalization via Decomposed Feature Representation and Semantic Augmentation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (2021): 6705–13. http://dx.doi.org/10.1609/aaai.v35i8.16829.

Full text
Abstract:
While deep learning demonstrates its strong ability to handle independent and identically distributed (IID) data, it often struggles with out-of-distribution (OoD) generalization, where the test data come from another distribution (w.r.t. the training one). Designing a general OoD generalization framework for a wide range of applications is challenging, mainly due to the different kinds of distribution shifts in the real world, such as the shift across domains or the extrapolation of correlation. Most previous approaches can only solve one specific distribution shift, leading to unsatisfactory performance when applied to various OoD benchmarks. In this work, we propose DecAug, a novel decomposed feature representation and semantic augmentation approach for OoD generalization. Specifically, DecAug disentangles the category-related and context-related features by orthogonalizing the two gradients (w.r.t. intermediate features) of the losses for predicting category and context labels, where category-related features contain causal information about the target object, while context-related features cause distribution shifts between training and test data. Furthermore, we perform gradient-based augmentation on context-related features to improve the robustness of the learned representations. Experimental results show that DecAug outperforms other state-of-the-art methods on various OoD datasets, and is among the very few methods that can deal with different types of OoD generalization challenges.
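The gradient orthogonalization at the core of DecAug amounts to a projection step. Below is a minimal sketch under our own naming; the actual method applies it to gradients with respect to intermediate features rather than arbitrary tensors.

```python
import torch

def orthogonalize(g_context: torch.Tensor, g_category: torch.Tensor) -> torch.Tensor:
    # Remove the component of the context-loss gradient that lies along the
    # category-loss gradient, decomposing the two feature sets.
    g_cat, g_ctx = g_category.flatten(), g_context.flatten()
    proj = (g_ctx @ g_cat) / (g_cat @ g_cat).clamp_min(1e-12) * g_cat
    return (g_ctx - proj).view_as(g_context)
```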
APA, Harvard, Vancouver, ISO, and other styles
18

Ren, Yifei, and Pouya Bashivan. "How well do models of visual cortex generalize to out of distribution samples?" PLOS Computational Biology 20, no. 5 (2024): e1011145. http://dx.doi.org/10.1371/journal.pcbi.1011145.

Full text
Abstract:
Unit activity in particular deep neural networks (DNNs) is remarkably similar to the neuronal population responses to static images along the primate ventral visual cortex. Linear combinations of DNN unit activities are widely used to build predictive models of neuronal activity in the visual cortex. Nevertheless, prediction performance in these models is often investigated on stimulus sets consisting of everyday objects under naturalistic settings. Recent work has revealed a generalization gap when predicting neuronal responses to synthetically generated out-of-distribution (OOD) stimuli. Here, we investigated how the recent progress in improving DNNs' object recognition generalization, as well as various DNN design choices such as architecture, learning algorithm, and datasets, has impacted the generalization gap in neural predictivity. We came to the surprising conclusion that performance on none of the common computer vision OOD object recognition benchmarks is predictive of OOD neural predictivity performance. Furthermore, we found that adversarially robust models often yield substantially higher generalization in neural predictivity, although the degree of robustness itself was not predictive of the neural predictivity score. These results suggest that improving object recognition behavior on current benchmarks alone may not lead to more general models of neurons in the primate ventral visual cortex.
APA, Harvard, Vancouver, ISO, and other styles
19

Fan, Caoyun, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, and Yaohui Jin. "Unlock the Potential of Counterfactually-Augmented Data in Out-Of-Distribution Generalization." Expert Systems with Applications 238 (March 2024): 122066. http://dx.doi.org/10.1016/j.eswa.2023.122066.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Ramachandran, Sai Niranjan, Rudrabha Mukhopadhyay, Madhav Agarwal, C. V. Jawahar, and Vinay Namboodiri. "Understanding the Generalization of Pretrained Diffusion Models on Out-of-Distribution Data." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 13 (2024): 14767–75. http://dx.doi.org/10.1609/aaai.v38i13.29395.

Full text
Abstract:
This work tackles the important task of understanding out-of-distribution behavior in two prominent types of generative models, i.e., GANs and diffusion models. Understanding this behavior is crucial in understanding their broader utility and risks as these systems are increasingly deployed in our daily lives. Our first contribution is demonstrating that diffusion spaces outperform GANs' latent spaces in inverting high-quality OOD images. We also provide a theoretical analysis attributing this to the lack of prior holes in diffusion spaces. Our second significant contribution is to provide a theoretical hypothesis that diffusion spaces can be projected onto a bounded hypersphere, enabling image manipulation through geodesic traversal between inverted images. Our analysis shows that different geodesics share common attributes for the same manipulation, which we leverage to perform various image manipulations. We conduct thorough empirical evaluations to support and validate our claims. Finally, our third and final contribution introduces a novel approach to few-shot sampling for out-of-distribution data by inverting a few images to sample from the cluster formed by the inverted latents. The proposed technique achieves state-of-the-art results for the few-shot generation task in terms of image quality. Our research underscores the promise of diffusion spaces in out-of-distribution imaging and offers avenues for further exploration. Please find more details about our project at http://cvit.iiit.ac.in/research/projects/cvit-projects/diffusionOOD.
APA, Harvard, Vancouver, ISO, and other styles
21

Jia, Tianrui, Haoyang Li, Cheng Yang, Tao Tao, and Chuan Shi. "Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 8 (2024): 8562–70. http://dx.doi.org/10.1609/aaai.v38i8.28700.

Full text
Abstract:
Graph neural networks (GNNs) have been demonstrated to perform well in graph representation learning, but they often lack generalization capability when tackling out-of-distribution (OOD) data. Graph invariant learning methods, backed by the invariance principle among defined multiple environments, have shown effectiveness in dealing with this issue. However, existing methods heavily rely on well-predefined or accurately generated environment partitions, which are hard to obtain in practice, leading to sub-optimal OOD generalization performance. In this paper, we propose a novel graph invariant learning method based on an invariant and variant patterns co-mixup strategy, which is capable of jointly generating mixed multiple environments and capturing invariant patterns from the mixed graph data. Specifically, we first adopt a subgraph extractor to identify invariant subgraphs. Subsequently, we design a novel co-mixup strategy, i.e., jointly conducting environment mixup and invariant mixup. For the environment mixup, we mix the variant environment-related subgraphs so as to generate sufficiently diverse multiple environments, which is important to guarantee the quality of graph invariant learning. For the invariant mixup, we mix the invariant subgraphs, further encouraging the model to capture invariant patterns behind graphs while getting rid of spurious correlations for OOD generalization. We demonstrate that the proposed environment mixup and invariant mixup can mutually promote each other. Extensive experiments on both synthetic and real-world datasets demonstrate that our method significantly outperforms the state of the art under various distribution shifts.
APA, Harvard, Vancouver, ISO, and other styles
22

Zhang, Lily H., and Rajesh Ranganath. "Robustness to Spurious Correlations Improves Semantic Out-of-Distribution Detection." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (2023): 15305–12. http://dx.doi.org/10.1609/aaai.v37i12.26785.

Full text
Abstract:
Methods which utilize the outputs or feature representations of predictive models have emerged as promising approaches for out-of-distribution (OOD) detection of image inputs. However, as demonstrated in previous work, these methods struggle to detect OOD inputs that share nuisance values (e.g., background) with in-distribution inputs. The detection of shared-nuisance OOD (SN-OOD) inputs is particularly relevant in real-world applications, as anomalies and in-distribution inputs tend to be captured in the same settings during deployment. In this work, we provide a possible explanation for these failures and propose nuisance-aware OOD detection to address them. Nuisance-aware OOD detection substitutes a classifier trained via Empirical Risk Minimization (ERM) with one that (1) approximates a distribution where the nuisance-label relationship is broken and (2) yields representations that are independent of the nuisance under this distribution, both marginally and conditioned on the label. We can train a classifier to achieve these objectives using Nuisance-Randomized Distillation (NuRD), an algorithm developed for OOD generalization under spurious correlations. Output- and feature-based nuisance-aware OOD detection perform substantially better than their original counterparts, succeeding even when detection based on domain generalization algorithms fails to improve performance.
APA, Harvard, Vancouver, ISO, and other styles
23

Zhang, Jiaqiang, and Songcan Chen. "Expand Horizon: Graph Out-of-Distribution Generalization via Multi-Level Environment Inference." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 12 (2025): 13233–41. https://doi.org/10.1609/aaai.v39i12.33444.

Full text
Abstract:
Graph neural networks (GNNs) are widely used for node classification tasks, but when encountering distribution shifts due to environmental change in real-world scenarios, they tend to learn unstable correlations between features and labels. To overcome this dilemma, a powerful class of approaches views the environment as the root cause of those unstable correlations; their key focus is therefore to infer the environments involved, enabling the model to avoid capturing environment-sensitive correlations. However, their inferences rely solely on single-level information from one low-hop ego-graph, neglecting both global information and the multi-granularity information in local ego-graphs with different hops. Although applying deeper GNNs on a high-hop ego-graph could capture global information, it brings the side effect of over-smoothing node representations. To tackle these issues, we propose a novel Multi-Level Environment Inference model named MLEI, which effectively broadens the horizon of training GNNs under node-level distribution shifts. Specifically, MLEI first leverages a linear graph transformer to surpass the scope of the ego-graph, efficiently enabling high-level global environment inference. This global environment is in turn used as an overview to assist layer-by-layer environment inference on local multi-hop ego-graphs. Finally, we combine the environments from global and local views and utilize the designed objective function to capture stable predictive patterns. Extensive experiments on real-world datasets demonstrate that our model achieves satisfactory performance compared with state-of-the-art methods under various distribution shifts.
APA, Harvard, Vancouver, ISO, and other styles
24

Gwon, Kyungpil, and Joonhyuk Yoo. "Out-of-Distribution (OOD) Detection and Generalization Improved by Augmenting Adversarial Mixup Samples." Electronics 12, no. 6 (2023): 1421. http://dx.doi.org/10.3390/electronics12061421.

Full text
Abstract:
Deep neural network (DNN) models are usually built based on the i.i.d. (independent and identically distributed), also known as in-distribution (ID), assumption on the training samples and test data. However, when models are deployed in a real-world scenario with some distributional shifts, test data can be out-of-distribution (OOD), and both OOD detection and OOD generalization should be simultaneously addressed to ensure the reliability and safety of applied AI systems. Most existing OOD detectors pursue these two goals separately and are therefore sensitive to covariate shift rather than semantic shift. To alleviate this problem, this paper proposes a novel adversarial mixup (AM) training method which simply performs OOD data augmentation to synthesize differently distributed data and designs a new AM loss function to learn how to handle OOD data. The proposed AM generates OOD samples that diverge significantly from the support of the training data distribution without being completely disjoint from it, increasing the generalization capability of the OOD detector. In addition, the AM is combined with a distributional-distance-aware OOD detector at inference to detect semantic OOD samples more efficiently while being robust to covariate shift due to data tampering. Experimental evaluation validates that the designed AM is effective on both OOD detection and OOD generalization tasks compared to previous OOD detectors and data mixup methods.
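The mixup operation that AM builds on is a one-line interpolation. The sketch below shows vanilla mixup (Zhang et al., 2018) between an in-distribution batch and auxiliary data; the adversarial optimization of the mixing that defines AM is omitted.

```python
import torch

def mixup(x_id: torch.Tensor, x_aux: torch.Tensor, alpha: float = 0.4):
    # Convex combination of two batches with a Beta-distributed coefficient.
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * x_id + (1.0 - lam) * x_aux, lam
```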
APA, Harvard, Vancouver, ISO, and other styles
25

Boccato, Tommaso, Alberto Testolin, and Marco Zorzi. "Learning Numerosity Representations with Transformers: Number Generation Tasks and Out-of-Distribution Generalization." Entropy 23, no. 7 (2021): 857. http://dx.doi.org/10.3390/e23070857.

Full text
Abstract:
One of the most rapidly advancing areas of deep learning research aims at creating models that learn to disentangle the latent factors of variation from a data distribution. However, modeling joint probability mass functions is usually prohibitive, which motivates the use of conditional models assuming that some information is given as input. In the domain of numerical cognition, deep learning architectures have successfully demonstrated that approximate numerosity representations can emerge in multi-layer networks that build latent representations of a set of images with a varying number of items. However, existing models have focused on tasks that require conditionally estimating numerosity information from a given image. Here, we focus on a set of much more challenging tasks, which require conditionally generating synthetic images containing a given number of items. We show that attention-based architectures operating at the pixel level can learn to produce well-formed images approximately containing a specific number of items, even when the target numerosity was not present in the training distribution.
APA, Harvard, Vancouver, ISO, and other styles
26

Chen, Minghui, Cheng Wen, Feng Zheng, Fengxiang He, and Ling Shao. "VITA: A Multi-Source Vicinal Transfer Augmentation Method for Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (2022): 321–29. http://dx.doi.org/10.1609/aaai.v36i1.19908.

Full text
Abstract:
Invariance to diverse types of image corruption, such as noise, blurring, or colour shifts, is essential to establish robust models in computer vision. Data augmentation has been the major approach in improving the robustness against common corruptions. However, the samples produced by popular augmentation strategies deviate significantly from the underlying data manifold. As a result, performance is skewed toward certain types of corruption. To address this issue, we propose a multi-source vicinal transfer augmentation (VITA) method for generating diverse on-manifold samples. The proposed VITA consists of two complementary parts: tangent transfer and integration of multi-source vicinal samples. The tangent transfer creates initial augmented samples for improving corruption robustness. The integration employs a generative model to characterize the underlying manifold built by vicinal samples, facilitating the generation of on-manifold samples. Our proposed VITA significantly outperforms the current state-of-the-art augmentation methods, demonstrated in extensive experiments on corruption benchmarks.
APA, Harvard, Vancouver, ISO, and other styles
27

Maier, Anatol, and Christian Riess. "Reliable Out-of-Distribution Recognition of Synthetic Images." Journal of Imaging 10, no. 5 (2024): 110. http://dx.doi.org/10.3390/jimaging10050110.

Full text
Abstract:
Generative adversarial networks (GANs) and diffusion models (DMs) have revolutionized the creation of synthetically generated but realistic-looking images. Distinguishing such generated images from real camera captures is one of the key tasks in current multimedia forensics research. One particular challenge is the generalization to unseen generators or post-processing. This can be viewed as an issue of handling out-of-distribution inputs. Forensic detectors can be hardened by the extensive augmentation of the training data or specifically tailored networks. Nevertheless, such precautions only manage but do not remove the risk of prediction failures on inputs that look reasonable to an analyst but in fact are out of the training distribution of the network. With this work, we aim to close this gap with a Bayesian Neural Network (BNN) that provides an additional uncertainty measure to warn an analyst of difficult decisions. More specifically, the BNN learns the task at hand and also detects potential confusion between post-processing and image generator artifacts. Our experiments show that the BNN achieves on-par performance with the state-of-the-art detectors while producing more reliable predictions on out-of-distribution examples.
APA, Harvard, Vancouver, ISO, and other styles
28

Xin, Shiji, Yifei Wang, Jingtong Su, and Yisen Wang. "On the Connection between Invariant Learning and Adversarial Training for Out-of-Distribution Generalization." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (2023): 10519–27. http://dx.doi.org/10.1609/aaai.v37i9.26250.

Full text
Abstract:
Despite impressive success in many tasks, deep learning models are shown to rely on spurious features, which will catastrophically fail when generalized to out-of-distribution (OOD) data. Invariant Risk Minimization (IRM) is proposed to alleviate this issue by extracting domain-invariant features for OOD generalization. Nevertheless, recent work shows that IRM is only effective for a certain type of distribution shift (e.g., correlation shift) while it fails for other cases (e.g., diversity shift). Meanwhile, another line of methods, Adversarial Training (AT), has shown better domain transfer performance, suggesting that it has the potential to be an effective candidate for extracting domain-invariant features. This paper investigates this possibility by exploring the similarity between the IRM and AT objectives. Inspired by this connection, we propose Domain-wise Adversarial Training (DAT), an AT-inspired method for alleviating distribution shift via domain-specific perturbations. Extensive experiments show that our proposed DAT can effectively remove domain-varying features and improve OOD generalization under both correlation shift and diversity shift.
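The adversarial-training ingredient can be illustrated with a single-step FGSM perturbation. DAT itself uses domain-wise perturbations shared within each domain rather than the per-sample version shown here, so treat this only as a sketch of the baseline being adapted.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=4 / 255):
    # One gradient-sign step that increases the classification loss.
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    # Clamp assumes inputs normalized to [0, 1].
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()
```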
APA, Harvard, Vancouver, ISO, and other styles
29

Madan, Spandan, Mingran Cao, Will Xiao, Hanspeter Pfister, and Gabriel Kreiman. "Out-of-Distribution generalization behavior of DNN-based encoding models for the visual cortex." Journal of Vision 24, no. 10 (2024): 1148. http://dx.doi.org/10.1167/jov.24.10.1148.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Hassan, A., S. A. Dar, P. B. Ahmad, and B. A. Para. "A new generalization of Aradhana distribution: Properties and applications." Journal of Applied Mathematics, Statistics and Informatics 16, no. 2 (2020): 51–66. http://dx.doi.org/10.2478/jamsi-2020-0009.

Full text
Abstract:
In this paper, we introduce a new generalization of the Aradhana distribution called the Weighted Aradhana Distribution (WID). The statistical properties of this distribution are derived, and the model parameters are estimated by maximum likelihood estimation. A simulation study of the ML estimates of the parameters is carried out in R. Finally, an application to a real data set is presented to examine the significance of the newly introduced model.
APA, Harvard, Vancouver, ISO, and other styles
31

Chen, Zhe, Zhiquan Ding, Xiaoling Zhang, Xin Zhang, and Tianqi Qin. "Improving Out-of-Distribution Generalization in SAR Image Scene Classification with Limited Training Samples." Remote Sensing 15, no. 24 (2023): 5761. http://dx.doi.org/10.3390/rs15245761.

Full text
Abstract:
For practical maritime SAR image classification tasks with special imaging platforms, the scenes to be classified are often different from those in the training sets. The quantity and diversity of the available training data can also be extremely limited. This problem of out-of-distribution (OOD) generalization with limited training samples leads to a sharp drop in the performance of conventional deep learning algorithms. In this paper, a knowledge-guided neural network (KGNN) model is proposed to overcome these challenges. By analyzing the saliency features of various maritime SAR scenes, universal knowledge is summarized in descriptive sentences. A feature integration strategy is designed to assign this descriptive knowledge to the ResNet-18 backbone. Both the individual semantic information and the inherent relations of the entities in SAR images are addressed. The experimental results show that our KGNN method outperforms conventional deep learning models in OOD scenarios with varying training sample sizes and achieves higher robustness in handling distributional shifts caused by weather conditions, terrain type, and sensor characteristics. In addition, the KGNN model converges within far fewer epochs during training. The performance improvement indicates that the KGNN model learns representations guided by properties beneficial for OOD generalization with limited training samples.
APA, Harvard, Vancouver, ISO, and other styles
32

Sha, Naijun. "A New Inference Approach for Type-II Generalized Birnbaum-Saunders Distribution." Stats 2, no. 1 (2019): 148–63. http://dx.doi.org/10.3390/stats2010011.

Full text
Abstract:
The Birnbaum-Saunders (BS) distribution, with its generalizations, has been successfully applied in a wide variety of fields. One generalization, the type-II generalized BS (denoted GBS-II), has been developed and has attracted considerable attention in recent years. In this article, we propose a new, simple, and convenient inference procedure for the GBS-II distribution. An extensive simulation study is carried out to assess the performance of the methods under various settings of parameter values with different sample sizes. Real data are analyzed for illustrative purposes to display the efficiency of the proposed method.
APA, Harvard, Vancouver, ISO, and other styles
33

Zhou, Pengyang, Chaochao Chen, Weiming Liu, et al. "FedGOG: Federated Graph Out-of-Distribution Generalization with Diffusion Data Exploration and Latent Embedding Decorrelation." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 21 (2025): 22965–73. https://doi.org/10.1609/aaai.v39i21.34459.

Full text
Abstract:
Federated graph learning (FGL) has emerged as a promising approach to enable collaborative training of graph models while preserving data privacy. However, current FGL methods overlook the out-of-distribution (OOD) shifts that occur in real-world scenarios. The distribution shifts between the training and testing datasets in each client impact FGL performance. To address this issue, we propose the federated graph OOD generalization framework FedGOG, which includes two modules: diffusion data exploration (DDE) and latent embedding decorrelation (LED). In DDE, all clients jointly train score models to accurately estimate the global graph data distribution and sufficiently explore the sample space using score-based graph diffusion with conditional generation. In LED, each client models a global invariant GNN and a personalized spurious GNN. LED aims to decorrelate spuriousness from invariant relationships by minimizing the mutual information between the two categories of latent embeddings from the different GNN models. Extensive experiments on six benchmark datasets demonstrate the superiority of FedGOG.
APA, Harvard, Vancouver, ISO, and other styles
34

Das, Siddhant, and Markus Nöth. "Times of arrival and gauge invariance." Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 477, no. 2250 (2021): 20210101. http://dx.doi.org/10.1098/rspa.2021.0101.

Full text
Abstract:
We revisit the arguments underlying two well-known arrival-time distributions in quantum mechanics, viz., the Aharonov–Bohm–Kijowski (ABK) distribution, applicable for freely moving particles, and the quantum flux (QF) distribution. An inconsistency in the original axiomatic derivation of Kijowski’s result is pointed out, along with an inescapable consequence of the ‘negative arrival times’ inherent to this proposal (and generalizations thereof). The ABK free-particle restriction is lifted in a discussion of an explicit arrival-time set-up featuring a charged particle moving in a constant magnetic field. A natural generalization of the ABK distribution is in this case shown to be critically gauge-dependent. A direct comparison to the QF distribution, which does not exhibit this flaw, is drawn (its acknowledged drawback concerning the quantum backflow effect notwithstanding).
APA, Harvard, Vancouver, ISO, and other styles
35

Sharifi-Noghabi, Hossein, Parsa Alamzadeh Harjandi, Olga Zolotareva, Colin C. Collins, and Martin Ester. "Out-of-distribution generalization from labelled and unlabelled gene expression data for drug response prediction." Nature Machine Intelligence 3, no. 11 (2021): 962–72. http://dx.doi.org/10.1038/s42256-021-00408-w.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Bogin, Ben, Sanjay Subramanian, Matt Gardner, and Jonathan Berant. "Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering." Transactions of the Association for Computational Linguistics 9 (2021): 195–210. http://dx.doi.org/10.1162/tacl_a_00361.

Full text
Abstract:
Answering questions that involve multi-step reasoning requires decomposing them and using the answers of intermediate steps to reach the final answer. However, state-of-the-art models in grounded question answering often do not explicitly perform decomposition, leading to difficulties in generalization to out-of-distribution examples. In this work, we propose a model that computes a representation and denotation for all question spans in a bottom-up, compositional manner using a CKY-style parser. Our model induces latent trees, driven only by end-to-end supervision (the answer). We show that this inductive bias towards tree structures dramatically improves systematic generalization to out-of-distribution examples, compared to strong baselines on an arithmetic expressions benchmark as well as on CLOSURE, a dataset that focuses on systematic generalization for grounded question answering. On this challenging dataset, our model reaches an accuracy of 96.1%, significantly higher than prior models that almost perfectly solve the task on a random, in-distribution split.
APA, Harvard, Vancouver, ISO, and other styles
37

Tan, Zhi, and Zhao-Fei Teng. "Image Domain Generalization Method based on Solving Domain Discrepancy Phenomenon." 電腦學刊 (Journal of Computers) 33, no. 3 (2022): 171–85. http://dx.doi.org/10.53106/199115992022063303014.

Full text
Abstract:
To address the marked drop in recognition performance when a model trained on a known data distribution is transferred to an unknown one, a domain generalization method based on an attention mechanism and adversarial training is proposed. First, a multi-level attention module is designed to capture the underlying abstract features of the image. Second, the loss limit of the generative adversarial network is increased, and a virtual augmented domain that simulates the unknown target distribution is generated by adversarial training while preserving the consistency of data features and semantics. Finally, a data-mixing algorithm combines the source domain and the virtual augmented domain as model input to improve classifier performance. Experiments on five classic digit-recognition datasets and the CIFAR-10 family show that the model learns better decision boundaries, generates a useful virtual augmented domain, and significantly improves recognition accuracy after model transfer, exceeding the previous method's average accuracy by at least 2.5% and 3%, respectively.
APA, Harvard, Vancouver, ISO, and other styles
38

Vasiliuk, Anton, Daria Frolova, Mikhail Belyaev, and Boris Shirokikh. "Limitations of Out-of-Distribution Detection in 3D Medical Image Segmentation." Journal of Imaging 9, no. 9 (2023): 191. http://dx.doi.org/10.3390/jimaging9090191.

Full text
Abstract:
Deep learning models perform unreliably when the data come from a distribution different from the training one. In critical applications such as medical imaging, out-of-distribution (OOD) detection methods help to identify such data samples, preventing erroneous predictions. In this paper, we further investigate OOD detection effectiveness when applied to 3D medical image segmentation. We designed several OOD challenges representing clinically occurring cases and found that none of the methods achieved acceptable performance. Methods not dedicated to segmentation severely failed to perform in the designed setups; the best mean false-positive rate at a 95% true-positive rate (FPR) was 0.59. Segmentation-dedicated methods still achieved suboptimal performance, with the best mean FPR being 0.31 (lower is better). To indicate this suboptimality, we developed a simple method called Intensity Histogram Features (IHF), which performed comparably or better in the same challenges, with a mean FPR of 0.25. Our findings highlight the limitations of the existing OOD detection methods with 3D medical images and present a promising avenue for improving them. To facilitate research in this area, we release the designed challenges as a publicly available benchmark and formulate practical criteria to test the generalization of OOD detection beyond the suggested benchmark. We also propose IHF as a solid baseline to contest emerging methods.
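As the abstract stresses, IHF is deliberately simple. A plausible minimal reconstruction (ours, not the authors' exact feature set) bins the image intensities into a normalized histogram that any lightweight detector can then score.

```python
import numpy as np

def intensity_histogram_features(volume: np.ndarray, bins: int = 64) -> np.ndarray:
    # Normalized intensity histogram of a 2D/3D image as an OOD feature vector.
    lo, hi = float(volume.min()), float(volume.max())
    if hi <= lo:          # constant image: avoid a degenerate histogram range
        hi = lo + 1.0
    hist, _ = np.histogram(volume.ravel(), bins=bins, range=(lo, hi))
    return hist / hist.sum()
```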
APA, Harvard, Vancouver, ISO, and other styles
39

Yu, Bowen, Yuhong Liu, Xin Wu, Jing Ren, and Zhibin Zhao. "Trustworthy diagnosis of Electrocardiography signals based on out-of-distribution detection." PLOS ONE 20, no. 2 (2025): e0317900. https://doi.org/10.1371/journal.pone.0317900.

Full text
Abstract:
Cardiovascular disease is one of the most dangerous conditions, posing a significant threat to daily health. Electrocardiography (ECG) is crucial for heart health monitoring. It plays a pivotal role in early heart disease detection, heart function assessment, and guiding treatments. Thus, refining ECG diagnostic methods is vital for timely and accurate heart disease diagnosis. Recently, deep learning has advanced significantly in ECG signal classification and recognition. However, these methods struggle with new or Out-of-Distribution (OOD) heart diseases: a deep learning model performs well on known heart diseases but falters on unknown types, which leads to less reliable diagnoses. To address this challenge, we propose a novel trustworthy diagnosis method for ECG signals based on OOD detection. The proposed model integrates Convolutional Neural Networks (CNN) and attention mechanisms to enhance feature extraction. Meanwhile, Energy and ReAct techniques are used to recognize OOD heart diseases and to improve generalization capacity for trustworthy diagnosis. Empirical validation using both the MIT-BIH Arrhythmia Database and the INCART 12-lead Arrhythmia Database demonstrated our method's high sensitivity and specificity in diagnosing both known and out-of-distribution (OOD) heart diseases, thus verifying the model's diagnostic trustworthiness. The results not only validate the effectiveness of our approach but also highlight its potential application value in cardiac health diagnostics.
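Both scoring ingredients named here are standard: ReAct (Sun et al., 2021) clips penultimate activations at a threshold, often an upper percentile of in-distribution activations, and the energy score is a temperature-scaled log-sum-exp of the logits. The helper names below are ours.

```python
import torch

def react_energy_score(features: torch.Tensor, head: torch.nn.Module,
                       clip: float = 1.0, temperature: float = 1.0) -> torch.Tensor:
    # features: penultimate-layer activations; head: the final linear classifier.
    logits = head(features.clamp(max=clip))            # ReAct rectification
    # Energy score E(x) = -T * logsumexp(f(x)/T); lower = more in-distribution.
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)
```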
APA, Harvard, Vancouver, ISO, and other styles
40

Nguyen, Hai Van, Jau-Uei Chen, and Tan Bui-Thanh. "A model-constrained discontinuous Galerkin Network (DGNet) for compressible Euler equations with out-of-distribution generalization." Computer Methods in Applied Mechanics and Engineering 440 (May 2025): 117912. https://doi.org/10.1016/j.cma.2025.117912.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Lee, Ingyun, Wooju Lee, and Hyun Myung. "Domain Generalization with Vital Phase Augmentation." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 4 (2024): 2892–900. http://dx.doi.org/10.1609/aaai.v38i4.28070.

Full text
Abstract:
Deep neural networks have shown remarkable performance in image classification. However, their performance deteriorates significantly with corrupted input data. Domain generalization methods have been proposed to train robust models against out-of-distribution data. Data augmentation in the frequency domain is one such approach that enables a model to learn phase features to establish domain-invariant representations. This approach changes the amplitudes of the input data while preserving the phases. However, using fixed phases leads to susceptibility to phase fluctuations, because amplitude and phase fluctuations commonly occur in out-of-distribution data. In this study, to address this problem, we introduce an approach using finite variation of the phases of input data rather than maintaining fixed phases. Based on the assumption that the degree of domain-invariant features varies for each phase, we propose a method to distinguish phases based on this degree. In addition, we propose a method called vital phase augmentation (VIPAug) that applies the variation to the phases differently according to the degree of domain-invariant features of the given phases. The model comes to depend more on the vital phases that contain more domain-invariant features, attaining robustness to amplitude and phase fluctuations. We present experimental evaluations of our proposed approach, which exhibited improved performance for both clean and corrupted data. VIPAug achieved SOTA performance on the benchmark CIFAR-10 and CIFAR-100 datasets, as well as near-SOTA performance on the ImageNet-100 and ImageNet datasets. Our code is available at https://github.com/excitedkid/vipaug.
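VIPAug belongs to the family of frequency-domain augmentations that vary the amplitude spectrum while largely keeping phases. A minimal NumPy sketch of amplitude jittering with fixed phase follows; the paper's phase-importance weighting and finite phase variation are omitted.

```python
import numpy as np

def jitter_amplitude(image: np.ndarray, sigma: float = 0.1) -> np.ndarray:
    # Multiplicative noise on the amplitude spectrum, phase left untouched.
    spectrum = np.fft.fft2(image, axes=(0, 1))
    amplitude, phase = np.abs(spectrum), np.angle(spectrum)
    amplitude *= 1.0 + sigma * np.random.randn(*amplitude.shape)
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * phase), axes=(0, 1)))
```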
APA, Harvard, Vancouver, ISO, and other styles
42

He, Rundong, Yue Yuan, Zhongyi Han, et al. "Exploring Channel-Aware Typical Features for Out-of-Distribution Detection." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 11 (2024): 12402–10. http://dx.doi.org/10.1609/aaai.v38i11.29132.

Full text
Abstract:
Detecting out-of-distribution (OOD) data is essential to ensure the reliability of machine learning models deployed in real-world scenarios. Unlike most previous test-time OOD detection methods, which focus on designing OOD scores, we approach OOD detection from the perspective of typicality and regard a feature's high-probability region as its typical set. However, the existing typical-feature-based OOD detection method rests on an assumption: that the proportion of the typical set is fixed for every channel. According to our experimental analysis, each channel contributes differently to OOD detection. Adopting a fixed proportion for all channels causes some channels to lose too many typical features or to incorporate too many abnormal features, lowering performance. Exploring channel-aware typical features is therefore crucial for better separating ID and OOD data. Driven by this insight, we propose expLoring channel-Aware tyPical featureS (LAPS). First, LAPS obtains the channel-aware typical set by calibrating the channel-level typical set with the global typical set using the mean and standard deviation. Then, LAPS rectifies the features into the channel-aware typical sets to obtain channel-aware typical features. Finally, LAPS leverages the channel-aware typical features to compute the energy score for OOD detection. Theoretical and visual analyses verify that LAPS achieves a better bias-variance trade-off. Experiments verify the effectiveness and generalization of LAPS across different architectures and OOD scores.
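A minimal sketch of per-channel typical-set rectification follows, assuming the typical sets are plain mean ± lam*std intervals estimated on in-distribution features; the paper's calibration between channel-level and global typical sets is more involved than this.

```python
import numpy as np

def channel_typical_bounds(id_features, lam=1.0):
    # per-channel typical interval [mu - lam*sigma, mu + lam*sigma],
    # estimated from an (n, d) matrix of in-distribution features
    mu = id_features.mean(axis=0)
    sigma = id_features.std(axis=0)
    return mu - lam * sigma, mu + lam * sigma

def rectify(features, lo, hi):
    # clamp each channel into its typical set; the rectified features
    # then feed an energy score as in the earlier sketch
    return np.clip(features, lo, hi)
```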
APA, Harvard, Vancouver, ISO, and other styles
43

Ding, Kun, Haojian Zhang, Qiang Yu, Ying Wang, Shiming Xiang, and Chunhong Pan. "Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 2 (2024): 1528–36. http://dx.doi.org/10.1609/aaai.v38i2.27918.

Full text
Abstract:
We propose a general method for boosting the generalization ability of pre-trained vision-language models (VLMs) while fine-tuning on downstream few-shot tasks. The idea is to exploit out-of-distribution (OOD) detection to predict whether a sample belongs to the base distribution or a novel distribution, and then to use the score produced by a dedicated competition-based scoring function to fuse the zero-shot and few-shot classifiers. The fused classifier is dynamic: it biases towards the zero-shot classifier when a sample is more likely drawn from the distribution the model was pre-trained on, leading to improved base-to-novel generalization ability. Our method operates only at test time, so it can boost existing methods without time-consuming re-training. Extensive experiments show that even weak distribution detectors can still improve VLMs' generalization ability. Specifically, with the help of OOD detectors, the harmonic-mean accuracy of CoOp and ProGrad increases by 2.6 and 1.5 percentage points, respectively, over 11 recognition datasets in the base-to-novel setting.
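The fusion step itself reduces to a score-gated convex combination of the two heads' outputs. A sketch under that assumption (the competition-based scoring function is the paper's contribution and is not reproduced here; base_score stands in for whatever the detector supplies):

```python
import numpy as np

def fuse_classifiers(zero_shot_logits, few_shot_logits, base_score):
    # base_score in [0, 1]: higher means the sample more likely follows
    # the pre-training (base) distribution, so the zero-shot head gets
    # more weight; lower defers to the fine-tuned few-shot head
    w = np.clip(base_score, 0.0, 1.0)
    return w * zero_shot_logits + (1.0 - w) * few_shot_logits
```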
APA, Harvard, Vancouver, ISO, and other styles
44

Simmachan, Teerawat, and Wikanda Phaphan. "Generalization of Two-Sided Length Biased Inverse Gaussian Distributions and Applications." Symmetry 14, no. 10 (2022): 1965. http://dx.doi.org/10.3390/sym14101965.

Full text
Abstract:
The notion of a length-biased distribution can be used to develop adequate models; the length-biased distribution is a special case of the weighted distribution. In this work, a new class of length-biased distribution, namely the two-sided length-biased inverse Gaussian (TS-LBIG) distribution, was introduced. The motivating physical phenomenon is that of cracks developing from two sides. Since the probability density function of the original TS-LBIG distribution cannot be written in closed form, a generalized form was further introduced. Important properties such as the moment-generating function and survival function likewise cannot be given in closed form, so we offered a different approach to the problem. Some distributional properties were investigated, and the parameters were estimated by the method of moments. Monte Carlo simulation studies were carried out to appraise the performance of the suggested estimators in terms of bias, variance, and mean squared error. An application to a real dataset was presented for illustration. The results showed that the suggested estimators performed better than those of the original study, and the proposed distribution provided a more appropriate model than the other candidate distributions based on the Akaike information criterion.
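For orientation, the standard one-sided length-biased construction that this family generalizes weights a base density f by x (a weighted distribution with weight w(x) = x); the two-sided inverse Gaussian variant in the paper is more involved.

```latex
% Standard length-biased density derived from a base density f with mean mu
f_{\mathrm{LB}}(x) = \frac{x\, f(x)}{\mu}, \qquad
\mu = \mathbb{E}[X] = \int_0^{\infty} x\, f(x)\, dx .
```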
APA, Harvard, Vancouver, ISO, and other styles
45

Nain, Philippe. "On a generalization of the preemptive resume priority." Advances in Applied Probability 18, no. 1 (1986): 255–73. http://dx.doi.org/10.2307/1427245.

Full text
Abstract:
This paper considers a queueing system with two classes of customers and a single server, where the service policy is of threshold type. As soon as the amount of work required by the class 1 customers exceeds a fixed threshold, the class 1 customers get the server's attention; otherwise the class 2 customers have priority. Under this service mechanism, service interruptions can occur for both classes of customers, and the interruption discipline is preemptive resume priority (PRP). This model, which turns out to be a generalization of the PRP queueing system, has potential applications in computer systems and communication networks. For Poisson inputs and exponential (arbitrary) service-time distributions for class 1 (class 2) customers, we derive the Laplace–Stieltjes transform of the stationary joint distribution of the server's workload by reducing the analysis to the resolution of a boundary value problem. Explicit formulas are obtained.
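The threshold discipline is straightforward to mimic in a toy discrete-time simulation, which can serve as a numerical sanity check on exact transform results; the step size, rates, and threshold K below are all illustrative assumptions, not values from the paper.

```python
import random

def simulate_threshold_priority(T=100000.0, dt=0.01, lam1=0.3, lam2=0.3,
                                mu1=1.0, mu2=1.0, K=2.0, seed=0):
    # V1, V2: remaining workload of each class; class 1 is served whenever
    # V1 > K, otherwise class 2 has priority (preemptive resume is implicit,
    # since unfinished workload simply persists across interruptions)
    rng = random.Random(seed)
    V1 = V2 = 0.0
    time_on_class1 = 0.0
    for _ in range(int(T / dt)):
        # Bernoulli(lam*dt) arrivals approximate Poisson streams;
        # each arrival brings an exponential amount of work
        if rng.random() < lam1 * dt:
            V1 += rng.expovariate(mu1)
        if rng.random() < lam2 * dt:
            V2 += rng.expovariate(mu2)
        if V1 > K:
            V1 = max(0.0, V1 - dt)
            time_on_class1 += dt
        elif V2 > 0:
            V2 = max(0.0, V2 - dt)
        elif V1 > 0:
            V1 = max(0.0, V1 - dt)
            time_on_class1 += dt
    return time_on_class1 / T  # fraction of time spent serving class 1
```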
APA, Harvard, Vancouver, ISO, and other styles
46

Nain, Philippe. "On a generalization of the preemptive resume priority." Advances in Applied Probability 18, no. 01 (1986): 255–73. http://dx.doi.org/10.1017/s0001867800015652.

Full text
Abstract:
This paper considers a queueing system with two classes of customers and a single server, where the service policy is of threshold type. As soon as the amount of work required by the class 1 customers exceeds a fixed threshold, the class 1 customers get the server's attention; otherwise the class 2 customers have priority. Under this service mechanism, service interruptions can occur for both classes of customers, and the interruption discipline is preemptive resume priority (PRP). This model, which turns out to be a generalization of the PRP queueing system, has potential applications in computer systems and communication networks. For Poisson inputs and exponential (arbitrary) service-time distributions for class 1 (class 2) customers, we derive the Laplace–Stieltjes transform of the stationary joint distribution of the server's workload by reducing the analysis to the resolution of a boundary value problem. Explicit formulas are obtained.
APA, Harvard, Vancouver, ISO, and other styles
47

Zhang, Weifeng, Zhiyuan Wang, Kunpeng Zhang, Ting Zhong, and Fan Zhou. "DyCVAE: Learning Dynamic Causal Factors for Non-stationary Series Domain Generalization (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (2023): 16382–83. http://dx.doi.org/10.1609/aaai.v37i13.27051.

Full text
Abstract:
Learning domain-invariant representations is a major task in out-of-distribution generalization. To address this issue, recent efforts have taken causality into account, aiming to learn the causal factors relevant to the task. However, extending existing generalization methods to non-stationary time series may be ineffective because, as recent studies point out, they fail to model the underlying causal factors under temporal-domain shifts in addition to source-domain shifts. To this end, we propose DyCVAE, a novel model that learns dynamic causal factors. Results on synthetic and real datasets demonstrate the effectiveness of the proposed model for the task of generalization in the time-series domain.
APA, Harvard, Vancouver, ISO, and other styles
48

Chen, Zhengyu, Teng Xiao, Kun Kuang, et al. "Learning to Reweight for Generalizable Graph Neural Network." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 8 (2024): 8320–28. http://dx.doi.org/10.1609/aaai.v38i8.28673.

Full text
Abstract:
Graph Neural Networks (GNNs) show promising results on graph tasks. However, the generalization ability of existing GNNs degrades when distribution shifts exist between testing and training graph data. The fundamental reason for this severe degeneration is that most GNNs are designed under the i.i.d. hypothesis; in such a setting, GNNs tend to exploit subtle statistical correlations in the training set for predictions, even when these correlations are spurious. In this paper, we study the generalization ability of GNNs in Out-Of-Distribution (OOD) settings. To solve this problem, we propose Learning to Reweight for Generalizable Graph Neural Network (L2R-GNN), which enhances generalization so as to achieve satisfactory performance on unseen testing graphs whose distributions differ from those of the training graphs. We propose a novel nonlinear graph decorrelation method, which substantially improves out-of-distribution generalization and compares favorably to previous methods in restraining the over-reduced sample size. The variables of the graph representation are clustered based on the stability of their correlations, and the graph decorrelation method learns weights to remove correlations between the variables of different clusters rather than between any two variables. Besides, we introduce an effective stochastic algorithm based on bi-level optimization for the L2R-GNN framework, which enables simultaneously learning the optimal weights and GNN parameters while avoiding over-fitting. Experiments show that L2R-GNN greatly outperforms baselines on various graph prediction benchmarks under distribution shifts.
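A sketch of the core reweighting idea: learn nonnegative sample weights that suppress cross-cluster correlations of the representation variables. The objective below (weighted cross-covariance penalized between different clusters only) is an illustrative reading of the abstract, not the paper's exact formulation.

```python
import torch

def decorrelation_loss(Z, w, clusters):
    # Z: (n, d) graph representations; w: (n,) nonnegative sample weights;
    # clusters: list of index lists partitioning the d variables
    w = w / w.sum()
    mean = (w[:, None] * Z).sum(dim=0)
    Zc = Z - mean
    cov = (w[:, None] * Zc).T @ Zc  # weighted covariance, (d, d)
    loss = Z.new_zeros(())
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            # penalize correlations only across different clusters
            loss = loss + cov[clusters[i]][:, clusters[j]].pow(2).sum()
    return loss
```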
APA, Harvard, Vancouver, ISO, and other styles
49

Wang, Da, Lin Li, Wei Wei, Qixian Yu, Jianye Hao, and Jiye Liang. "Improving Generalization in Offline Reinforcement Learning via Latent Distribution Representation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 20 (2025): 21053–61. https://doi.org/10.1609/aaai.v39i20.35402.

Full text
Abstract:
Dealing with the distribution shift is a significant challenge when building offline reinforcement learning (RL) models that can generalize from a static dataset to out-of-distribution (OOD) scenarios. Previous approaches have employed pessimism or conservatism strategies. More recently, data-driven work has taken a distributional perspective, treating offline data as a domain adaptation problem. However, these methods use heuristic techniques to simulate distribution shifts, resulting in a limited diversity of artificially created distribution gaps. In this paper, we propose a novel perspective: offline datasets inherently contain multiple latent distributions, with behavior data from diverse policies potentially following different distributions and data from the same policy across various time phases also exhibiting distribution variance. We introduce the Latent Distribution Representation Learning (LAD) framework, which aims to characterize the multiple latent distributions within offline data and reduce the distribution gaps between any pair of them. LAD consists of a min-max adversarial process: it first identifies the "worst-case" distributions to enlarge the diversity of distribution gaps and then reduces these gaps to learn invariant representations for generalization. We derive a generalization error bound to support LAD theoretically and verify its effectiveness through extensive experiments.
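The gap measure at the heart of such a min-max scheme can be any divergence between batches of representations; below is a sketch using a squared RBF-kernel MMD as an assumed stand-in (the paper's actual gap measure and adversarial procedure may differ). The adversary would pick the pair of latent groups maximizing this quantity, while the encoder is trained to minimize it.

```python
import torch

def mmd2(X, Y, sigma=1.0):
    # squared maximum mean discrepancy between two representation
    # batches X (n, d) and Y (m, d), with an RBF kernel
    def k(A, B):
        return torch.exp(-torch.cdist(A, B) ** 2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```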
APA, Harvard, Vancouver, ISO, and other styles
50

Welleck, Sean, Peter West, Jize Cao, and Yejin Choi. "Symbolic Brittleness in Sequence Models: On Systematic Generalization in Symbolic Mathematics." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (2022): 8629–37. http://dx.doi.org/10.1609/aaai.v36i8.20841.

Full text
Abstract:
Neural sequence models trained with maximum likelihood estimation have led to breakthroughs in many tasks, where success is defined by the gap between training and test performance. However, their ability to achieve stronger forms of generalization remains unclear. We consider the problem of symbolic mathematical integration, as it requires generalizing systematically beyond the training set. We develop a methodology for evaluating generalization that takes advantage of the problem domain's structure and access to a verifier. Despite promising in-distribution performance of sequence-to-sequence models in this domain, we demonstrate challenges in achieving robustness, compositionality, and out-of-distribution generalization, through both carefully constructed manual test suites and a genetic algorithm that automatically finds large collections of failures in a controllable manner. Our investigation highlights the difficulty of generalizing well with the predominant modeling and learning approach, and the importance of evaluating beyond the test set, across different aspects of generalization.
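The "access to a verifier" that the methodology leans on is what makes symbolic integration a convenient testbed: a candidate antiderivative can be checked by differentiation. A minimal sympy sketch (illustrative; the paper's verification pipeline is more elaborate):

```python
import sympy as sp

x = sp.Symbol('x')

def verify_integral(integrand, candidate):
    # a proposed antiderivative is correct iff its derivative minus the
    # integrand simplifies to zero (up to the usual simplification limits)
    return sp.simplify(sp.diff(candidate, x) - integrand) == 0

print(verify_integral(sp.cos(x) * sp.exp(sp.sin(x)), sp.exp(sp.sin(x))))  # True
```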
APA, Harvard, Vancouver, ISO, and other styles