Journal articles on the topic 'Identical and Independent Distributed (IID)'

Consult the top 50 journal articles for your research on the topic 'Identical and Independent Distributed (IID).'


1

Wu, Jikun, JiaHao Yu, and YuJun Zheng. "Research on Federated Learning Algorithms in Non-Independent Identically Distributed Scenarios." Highlights in Science, Engineering and Technology 85 (March 13, 2024): 104–12. http://dx.doi.org/10.54097/7newsv97.

Full text
Abstract:
Federated learning is a form of distributed learning in which multiple distributed devices train models locally. After receiving the local parameters, a server aggregates them and repeats the process over multiple iterations until the model converges to a final stable state. In practice, however, because clients have different preferences and differing local data, the data in federated learning may not be independent and identically distributed (non-IID). The main research work of this article is as follows: 1) analyze and summarize the methods and techniques proposed in past experiments for solving the non-IID data problem; 2) perform in-depth research on the basic federated learning methods for non-IID data, such as FedAvg and FedProx; 3) using the FedAvg algorithm on the CIFAR-10 dataset, simulate non-IID data by controlling the number of classes held by each client and by partitioning the dataset according to a Dirichlet distribution, and provide a detailed analysis of the influence of these data distributions on the accuracy and loss of model training.
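The Dirichlet-based partition used to simulate non-IID clients on CIFAR-10 can be sketched as follows (a minimal NumPy illustration, not the paper's code; `dirichlet_partition` and the toy labels are hypothetical stand-ins):

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients with per-class proportions drawn
    from a Dirichlet(alpha) distribution; smaller alpha gives more skewed
    (more strongly non-IID) clients."""
    rng = np.random.default_rng(seed)
    n_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(n_clients)]
    for c in range(n_classes):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(n_clients))  # class-c share per client
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(ix) for ix in client_indices]

# toy labels standing in for CIFAR-10's 10 classes (100 samples per class)
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, n_clients=5, alpha=0.5)
```

With small `alpha` (e.g. 0.1), most clients end up holding only a few classes, which is the regime the paper analyzes.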
APA, Harvard, Vancouver, ISO, and other styles
2

Collins, Megan. "Distribution and Properties of the Critical Values of Random Polynomials With Non-Independent and Non-Identically Distributed Roots." PUMP Journal of Undergraduate Research 3 (November 6, 2020): 244–76. http://dx.doi.org/10.46787/pump.v3i0.2282.

Full text
Abstract:
This paper considers the pairing between the distribution of the roots and the distribution of the critical values of random polynomials. The primary model of random polynomial considered in this paper consists of monic polynomials of degree n with a single complex variable z where the roots of the random polynomial are complex valued random variables that are chosen from two independent sequences of iid, complex valued random variables. The distribution of the random variables from each of the two sequences are different, producing roots of the random polynomial which have non-identical distributions. Furthermore, both the iid, complex valued random variables from one of the sequences of random variables and the complex conjugates of those random variables are roots of the random polynomial. Hence, this model of monic random polynomials of degree n has roots that are random variables which are not independent, due to the dependence based on complex conjugates, and the non-identical distributions which arise from the use of the two independent sequences. This paper also describes the relationship between the roots and critical values of monic random polynomials of degree n where the roots are chosen to be random variables and their complex conjugates where the random variables are from a sequence of iid complex valued random variables.
3

Aggarwal, Meenakshi, Vikas Khullar, Nitin Goyal, Abdullah Alammari, Marwan Ali Albahar, and Aman Singh. "Lightweight Federated Learning for Rice Leaf Disease Classification Using Non Independent and Identically Distributed Images." Sustainability 15, no. 16 (2023): 12149. http://dx.doi.org/10.3390/su151612149.

Full text
Abstract:
Rice (Oryza sativa L.) is a vital food source all over the world, contributing 15% of the protein and 21% of the energy intake per person in Asia, where most rice is produced and consumed. However, bacterial, fungal, and other microbial diseases that have a negative effect on the health of plants and crop yield are a major problem for rice farmers. It is challenging to diagnose these diseases manually, especially in areas with a shortage of crop protection experts. Automating disease identification and providing readily available decision-support tools are essential for enabling effective rice leaf protection measures and minimising rice crop losses. Although there are numerous classification systems for the diagnosis of rice leaf disease, no reliable, secure method has been identified that meets these needs. This paper proposes a lightweight federated deep learning architecture that maintains data privacy constraints for rice leaf disease classification. The distributed client–server design of this framework protects the data privacy of all clients, and the validity of the federated deep learning models was examined using independent and identically distributed (IID) and non-IID data. To validate the framework's efficacy, the researchers conducted experiments in a variety of settings, including conventional learning, federated learning via a single client, and federated learning via multiple clients. The study began by extracting features from various pre-trained models, ultimately selecting EfficientNetB3, with an impressive 99% accuracy, as the baseline model. Subsequently, experiments were conducted using the federated learning (FL) approach with both IID and non-IID datasets. The FL approach, along with a dense neural network trained and evaluated on an IID dataset, achieved outstanding training and evaluation accuracies of 99% with minimal losses of 0.006 and 0.03, respectively.
Similarly, on a non-IID dataset, the FL approach maintained a high training accuracy of 99% with a loss of 0.04 and an evaluation accuracy of 95% with a loss of 0.08. These results indicate that the FL approach performs nearly as well as the base model, EfficientNetB3, highlighting its effectiveness in handling both IID and non-IID data. It was found that federated deep learning models with multiple clients outperformed conventional pre-trained models. The unique characteristics of the proposed framework, such as its data privacy for edge devices with limited resources, set it apart from the existing classification schemes for rice leaf diseases. The framework is the best alternative solution for the early classification of rice leaf disease because of these additional features.
4

Alotaibi, Basmah, Fakhri Alam Khan, and Sajjad Mahmood. "Communication Efficiency and Non-Independent and Identically Distributed Data Challenge in Federated Learning: A Systematic Mapping Study." Applied Sciences 14, no. 7 (2024): 2720. http://dx.doi.org/10.3390/app14072720.

Full text
Abstract:
Federated learning has emerged as a promising approach for collaborative model training across distributed devices. However, it faces challenges such as Non-Independent and Identically Distributed (non-IID) data and communication overhead. This study aims to provide in-depth knowledge of the federated learning environment by identifying the most used techniques for overcoming non-IID data challenges and techniques that provide communication-efficient solutions in federated learning. The study highlights the most used non-IID data types, learning models, and datasets in federated learning. A systematic mapping study was performed using six digital libraries, and 193 studies were identified and analyzed after the inclusion and exclusion criteria were applied. We identified that enhancing the aggregation method and clustering are the most widely used techniques for non-IID data problems (used in 18% and 16% of the selected studies, respectively), and quantization was the most common technique in studies that provide communication-efficient solutions in federated learning (used in 27% and 15% of the selected studies). Additionally, our work shows that label distribution skew is the most used case to simulate a non-IID environment, specifically the quantity label imbalance. The convolutional neural network (CNN) is the most commonly used learning model, and the image datasets MNIST and CIFAR-10 are the most widely used datasets when evaluating the proposed approaches. Furthermore, we believe the research community needs to consider the client's limited resources and the importance of their updates when addressing non-IID and communication challenges to prevent the loss of valuable and unique information. The outcome of this systematic study will benefit federated learning users, researchers, and providers.
5

Zhu, Feng, Jiangshan Hao, Zhong Chen, Yanchao Zhao, Bing Chen, and Xiaoyang Tan. "STAFL: Staleness-Tolerant Asynchronous Federated Learning on Non-iid Dataset." Electronics 11, no. 3 (2022): 314. http://dx.doi.org/10.3390/electronics11030314.

Full text
Abstract:
With the development of the Internet of Things, edge computing applications pay increasing attention to privacy and real-time performance. Federated learning, a promising machine learning method that can protect user privacy, has begun to be widely studied. However, traditional synchronous federated learning methods are easily affected by stragglers, and non-independent and identically distributed datasets also reduce the convergence speed. In this paper, we propose an asynchronous federated learning method, STAFL, where users can upload their updates at any time and the server immediately aggregates the updates and returns the latest global model. Moreover, STAFL judges each user's data distribution from the user's update and dynamically adjusts the aggregation parameters according to the user's network weight and staleness, minimizing the impact of non-independent and identically distributed datasets on asynchronous updates. The experimental results show that our method performs better on non-independent and identically distributed datasets than existing methods.
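The staleness-discounted aggregation idea can be sketched as below (an illustrative sketch, not STAFL's actual update rule; the polynomial decay in `staleness_weight` and the mixing constant are assumptions):

```python
import numpy as np

def staleness_weight(staleness, a=0.5):
    """Illustrative polynomial decay: the older (more stale) an update,
    the smaller its aggregation weight."""
    return (1.0 + staleness) ** (-a)

def async_aggregate(global_model, client_update, staleness, base_mix=0.5):
    """Server-side mixing step in the spirit of staleness-tolerant
    asynchronous FL: the global model moves toward the client's model by
    an amount discounted by the update's staleness."""
    alpha = base_mix * staleness_weight(staleness)
    return (1.0 - alpha) * global_model + alpha * client_update

w = np.zeros(3)
w_fresh = async_aggregate(w, np.ones(3), staleness=0)   # fresh update: large step
w_stale = async_aggregate(w, np.ones(3), staleness=10)  # stale update: small step
```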
6

Tayyeh, Huda Kadhim, and Ahmed Sabah Ahmed AL-Jumaili. "Balancing Privacy and Performance: A Differential Privacy Approach in Federated Learning." Computers 13, no. 11 (2024): 277. http://dx.doi.org/10.3390/computers13110277.

Full text
Abstract:
Federated learning (FL), a decentralized approach to machine learning, facilitates model training across multiple devices while ensuring data privacy. However, achieving a delicate balance between privacy preservation and model convergence remains a major problem. Understanding how different hyperparameters affect this balance is crucial for optimizing FL systems. This article examines the impact of various hyperparameters, such as the privacy budget (ϵ), the clipping norm (C), and the number of randomly chosen clients (K) per communication round. Through a comprehensive set of experiments, we compare training scenarios under both independent and identically distributed (IID) and non-independent and identically distributed (Non-IID) data settings. Our findings reveal that the combination of ϵ and C significantly influences the global noise variance, affecting the model's performance in both IID and Non-IID scenarios. Stricter privacy conditions lead to fluctuating, non-converging loss behavior, particularly in Non-IID settings. We also consider the number of clients (K) and its impact on loss fluctuations and convergence improvement, particularly under strict privacy measures. Thus, Non-IID settings are more responsive to stricter privacy regulations, yet with a higher client interaction volume they can also offer better convergence. Collectively, this work extends knowledge of privacy-preserving approaches in FL and offers useful suggestions towards an ideal privacy–convergence balance.
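The interplay of the clipping norm C and the added noise can be illustrated with a standard DP-style clip-and-noise aggregation step (a sketch of the generic mechanism, not the paper's exact implementation; the noise-scale convention is an assumption):

```python
import numpy as np

def clip_and_noise(updates, C, noise_multiplier, rng):
    """Clip each client update to L2 norm C, average, then add Gaussian
    noise. Larger C or smaller noise_multiplier (looser privacy) means
    less distortion; tighter settings add more noise."""
    clipped = [u / max(1.0, np.linalg.norm(u) / C) for u in updates]
    mean = np.mean(clipped, axis=0)
    sigma = noise_multiplier * C / len(updates)  # one common convention
    return mean + rng.normal(0.0, sigma, size=mean.shape)

rng = np.random.default_rng(0)
updates = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
noisy = clip_and_noise(updates, C=1.0, noise_multiplier=0.1, rng=rng)
```

The first update (norm 5) is scaled down to norm 1; the second (norm 0.5) is untouched.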
7

Lyons, Russell. "Factors of IID on Trees." Combinatorics, Probability and Computing 26, no. 2 (2016): 285–300. http://dx.doi.org/10.1017/s096354831600033x.

Full text
Abstract:
Classical ergodic theory for integer-group actions uses entropy as a complete invariant for isomorphism of IID (independent, identically distributed) processes (a.k.a. product measures). This theory holds for amenable groups as well. Despite recent spectacular progress of Bowen, the situation for non-amenable groups, including free groups, is still largely mysterious. We present some illustrative results and open questions on free groups, which are particularly interesting in combinatorics, statistical physics and probability. Our results include bounds on minimum and maximum bisection for random cubic graphs that improve on all past bounds.
8

Gao, Huiguo, Mengyuan Lee, Guanding Yu, and Zhaolin Zhou. "A Graph Neural Network Based Decentralized Learning Scheme." Sensors 22, no. 3 (2022): 1030. http://dx.doi.org/10.3390/s22031030.

Full text
Abstract:
As an emerging paradigm considering data privacy and transmission efficiency, decentralized learning aims to acquire a global model using the training data distributed over many user devices. It is a challenging problem, since link loss, partial device participation, and non-independent and identically distributed (non-iid) data all deteriorate the performance of decentralized learning algorithms. Existing work is often restricted to linear models or shows poor performance on non-iid data. Therefore, in this paper, we propose a decentralized learning scheme based on distributed parallel stochastic gradient descent (DPSGD) and graph neural networks (GNN) to deal with the above challenges. Specifically, each user device participating in the learning task utilizes local training data to compute local stochastic gradients and updates its own local model. Then, each device utilizes the GNN model and exchanges the model parameters with its neighbors to reach the average of the resultant global models. The iteration repeats until the algorithm converges. Extensive simulation results over both iid and non-iid data validate the algorithm's convergence to near-optimal results and its robustness to both link loss and partial device participation.
9

Zhang, You, Jin Wang, Liang-Chih Yu, Dan Xu, and Xuejie Zhang. "Multi-Attribute Multi-Grained Adaptation of Pre-Trained Language Models for Text Understanding from Bayesian Perspective." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 24 (2025): 25967–75. https://doi.org/10.1609/aaai.v39i24.34791.

Full text
Abstract:
Current neural networks often employ multi-domain-learning or attribute-injecting mechanisms to incorporate non-independent and identically distributed (non-IID) information for text understanding tasks by capturing individual characteristics and the relationships among samples. However, the extent of the impact of non-IID information and how these methods affect pre-trained language models (PLMs) remains unclear. This study revisits the assumption that non-IID information enhances PLMs to achieve performance improvements from a Bayesian perspective, which unearths and integrates non-IID and IID features. Furthermore, we propose a multi-attribute multi-grained framework for PLM adaptations (M2A), which combines multi-attribute and multi-grained views to mitigate uncertainty in a lightweight manner. We evaluate M2A through prevalent text-understanding datasets and demonstrate its superior performance, mainly when data are implicitly non-IID and PLMs scale larger.
10

Liu, Ying, Zhiqiang Wang, Shufang Pang, and Lei Ju. "Distributed Malicious Traffic Detection." Electronics 13, no. 23 (2024): 4720. http://dx.doi.org/10.3390/electronics13234720.

Full text
Abstract:
With the wide deployment of edge devices, distributed network traffic data are rapidly increasing. Traditional detection methods for malicious traffic rely on centralized training, in which a single server is often used to aggregate private traffic data from edge devices so as to extract and identify features. However, these methods face difficulties in data collection, heavy computational complexity, and high privacy risks. To address these issues, this paper proposes a federated learning-based distributed malicious traffic detection framework, FL-CNN-Traffic. In this framework, edge devices utilize a convolutional neural network (CNN) to process local detection, data collection, feature extraction, and training. A server aggregates model updates from edge devices using four federated learning algorithms (FedAvg, FedProx, Scaffold, and FedNova) to build a global model. This framework allows multiple devices to collaboratively train a model without sharing private traffic data, addressing the "Data Silo" problem while ensuring privacy. Evaluations on the USTC-TFC2016 dataset show that for independent and identically distributed (IID) data, this framework can reach or exceed the performance of centralized deep learning methods. For non-IID data, this framework outperforms other neural networks based on federated learning, with accuracy improvements ranging from 2.59% to 4.73%.
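The FedAvg aggregation performed by the server can be sketched as follows (a minimal NumPy illustration of the standard algorithm, not the paper's code):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: average client model parameters, weighted by
    each client's local dataset size."""
    total = float(sum(client_sizes))
    agg = np.zeros_like(client_weights[0], dtype=float)
    for w, n in zip(client_weights, client_sizes):
        agg += (n / total) * w
    return agg

# a client with 3x the data pulls the global model 3x as hard
agg = fedavg([np.zeros(2), np.ones(2)], [1, 3])
```

FedProx, Scaffold, and FedNova modify the local objective or the update normalization, but share this size-weighted averaging skeleton.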
11

Bejenar, Iuliana, Lavinia Ferariu, Carlos Pascal, and Constantin-Florin Caruntu. "Aggregation Methods Based on Quality Model Assessment for Federated Learning Applications: Overview and Comparative Analysis." Mathematics 11, no. 22 (2023): 4610. http://dx.doi.org/10.3390/math11224610.

Full text
Abstract:
Federated learning (FL) offers the possibility of collaboration between multiple devices while maintaining data confidentiality, as required by the General Data Protection Regulation (GDPR). Though FL can keep local data private, it may encounter problems when dealing with non-independent and identically distributed data (non-IID), insufficient local training samples or cyber-attacks. This paper introduces algorithms that can provide a reliable aggregation of the global model by investigating the accuracy of models received from clients. This allows reducing the influence of less confident nodes, who were potentially attacked or unable to perform successful training. The analysis includes the proposed FedAcc and FedAccSize algorithms, together with their new extension based on the Lasso regression, FedLasso. FedAcc and FedAccSize set the confidence in each client based only on local models’ accuracy, while FedLasso exploits additional details related to predictions, like predicted class probabilities, to support a refined aggregation. The ability of the proposed algorithms to protect against intruders or underperforming clients is demonstrated experimentally using testing scenarios involving independent and identically distributed (IID) data as well as non-IID data. The comparison with the established FedAvg and FedAvgM algorithms shows that exploiting the quality of the client models is essential for reliable aggregation, which enables rapid and robust improvement in the global model.
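The idea of weighting clients by local-model quality can be sketched as below (an illustrative sketch; the proportional weighting rule is an assumption, not the exact FedAcc/FedAccSize formula):

```python
import numpy as np

def accuracy_weighted_aggregate(client_models, accuracies):
    """Aggregation in the spirit of FedAcc: clients whose local models
    reach higher validation accuracy receive proportionally larger
    weight, shrinking the influence of attacked or underperforming
    nodes."""
    w = np.asarray(accuracies, dtype=float)
    w = w / w.sum()  # normalize accuracies into mixing weights
    agg = np.zeros_like(client_models[0], dtype=float)
    for wi, m in zip(w, client_models):
        agg += wi * m
    return agg

# a 90%-accurate client dominates a 10%-accurate (possibly attacked) one
agg = accuracy_weighted_aggregate([np.ones(2), np.zeros(2)], [0.9, 0.1])
```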
12

Liang, Han-Ying, Jong-Il Baek, and Josef Steinebach. "Law of the iterated logarithm for self-normalized sums and their increments." Studia Scientiarum Mathematicarum Hungarica 43, no. 1 (2006): 79–114. http://dx.doi.org/10.1556/sscmath.43.2006.1.6.

Full text
Abstract:
Let X_1, X_2, … be independent, but not necessarily identically distributed, random variables in the domain of attraction of a stable law with index 0 < α < 2. This paper uses M_n = max_{1 ≤ i ≤ n} |X_i| to establish a self-normalized law of the iterated logarithm (LIL) for partial sums. Self-normalized increments of partial sums are studied as well. In particular, the results of Horváth and Shao [9] on self-normalized sums of independent and identically distributed random variables are extended and complemented. As applications, some corresponding results for self-normalized weighted sums of iid random variables are also obtained.
13

Jahani, Khalil, Behzad Moshiri, and Babak Hossein Khalaj. "A Survey on Data Distribution Challenges and Solutions in Vertical and Horizontal Federated Learning." Journal of Artificial Intelligence, Applications, and Innovations 1, no. 2 (2024): 55–71. https://doi.org/10.61838/jaiai.1.2.5.

Full text
Abstract:
Federated learning is a novel way of training machine learning models on data that is distributed across multiple devices, such as smartphones and IoT sensors, without compromising privacy, efficiency, or security. However, federated learning faces a significant challenge when the data on each device is not independent and identically distributed (non-IID), which means that the data may have different distributions, sizes, or qualities. non-IID data is a major challenge for federated learning, as it affects the accuracy and participation of the local devices. Most existing methods focus on improving the model, algorithm, or framework of federated learning to deal with non-IID data. However, there is a lack of systematic and up-to-date reviews on this topic. In this paper, we survey different approaches to address the challenge of non-IID data in Vertical Federated Learning (VFL) and Horizontal Federated Learning (HFL). We organize the existing literature based on the perspective of the researcher and the sub-tasks involved in each approach. Our goal is to provide a comprehensive and systematic overview of the problem and its solutions.
14

Chen, Zengjing, and Feng Hu. "A law of the iterated logarithm under sublinear expectations." Journal of Financial Engineering 01, no. 02 (2014): 1450015. http://dx.doi.org/10.1142/s2345768614500159.

Full text
Abstract:
In this paper, with the notion of independent and identically distributed (IID) random variables under sublinear expectations initiated by Peng, we develop a law of the iterated logarithm (LIL) for capacities. It turns out that our theorem is a natural extension of the Kolmogorov and the Hartman–Wintner LIL.
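For context, the classical Hartman–Wintner LIL that this result extends states that, for IID random variables X_i with mean 0 and finite variance σ²,

```latex
\limsup_{n \to \infty} \frac{S_n}{\sqrt{2 n \log \log n}} = \sigma \quad \text{a.s.},
\qquad \text{where } S_n = \sum_{i=1}^{n} X_i .
```

The paper replaces the classical expectation with Peng's sublinear expectation and proves an analogous bound for the associated capacities.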
15

Wu, Xia, Lei Xu, and Liehuang Zhu. "Local Differential Privacy-Based Federated Learning under Personalized Settings." Applied Sciences 13, no. 7 (2023): 4168. http://dx.doi.org/10.3390/app13074168.

Full text
Abstract:
Federated learning is a distributed machine learning paradigm which utilizes multiple clients' data to train a model. Although federated learning does not require clients to disclose their original data, studies have shown that attackers can infer clients' privacy by analyzing the local models shared by clients. Local differential privacy (LDP) can help to solve the above privacy issue. However, most of the existing federated learning studies based on LDP rarely consider the diverse privacy requirements of clients. In this paper, we propose an LDP-based federated learning framework that can meet the personalized privacy requirements of clients. We consider both independent identically distributed (IID) datasets and non-independent identically distributed (non-IID) datasets, and design model perturbation methods for each. Moreover, we propose two model aggregation methods, namely a weighted average method and a probability-based selection method. The main idea is to weaken the impact of those privacy-conscious clients, who choose relatively small privacy budgets, on the federated model. Experiments on three commonly used datasets, namely MNIST, Fashion-MNIST, and forest cover types, show that the proposed aggregation methods perform better than the classic arithmetic average method in the personalized privacy-preserving scenario.
16

Sharma, Shagun, and Kalpna Guleria. "A Distributed Privacy Preserved Federated Learning Approach for Revolutionizing Pneumonia Detection in Isolated Heterogenous Data Silos." International Journal of Mathematical, Engineering and Management Sciences 10, no. 5 (2025): 1324–50. https://doi.org/10.33889/ijmems.2025.10.5.063.

Full text
Abstract:
Pneumonia is a respiratory lung infection that ranges in severity from mild to lethal outcomes. The analysis of tomographic images is the most significant method of pneumonia detection. Image analysis requires expertise and proficiency to diagnose the disease correctly. Medical reports with multiple diseases have overlapping symptoms, which may lead to misdiagnosis and deferred identification. Misdiagnosis results in increased healthcare costs, worsened medical conditions, and legal implications. Centralized deep learning enhances the feature extraction process and optimally improves prediction outcomes; however, these models raise data privacy concerns due to centralized storage systems. Healthcare departments follow the Health Insurance Portability and Accountability Act (HIPAA) to protect patient data and improve the portability and continuity of health insurance coverage. In the proposed work, federated learning has been utilized to enhance data privacy and deal with imbalanced and diverse data silos. This distributed privacy-preserved model has been employed with a pooled dataset curated from multiple sources in a 5-client architecture. The model was implemented with the FedAvg aggregation technique in independent and identically distributed (IID) and non-IID data distributions. The outcomes of the model exhibit 87.62% accuracy with IID and 86.15% accuracy with non-IID distributions. The comparison of these outcomes with the existing studies shows that the proposed model outperforms them, exhibiting better performance and resulting in minimum losses of 0.4041 and 0.4139 with IID and non-IID distributions, respectively.
17

Lee, Suchul. "Distributed Detection of Malicious Android Apps While Preserving Privacy Using Federated Learning." Sensors 23, no. 4 (2023): 2198. http://dx.doi.org/10.3390/s23042198.

Full text
Abstract:
Recently, deep learning has been widely used to solve existing computing problems through large-scale data mining. Conventional training of a deep learning model is performed on a central (cloud) server equipped with high computing power, by integrating data at high computational intensity. However, integrating raw data from multiple clients raises privacy concerns that are receiving increasing attention. In federated learning (FL), clients train deep learning models in a distributed fashion using their local data; instead of sending raw data to a central server, they send the parameter values of the trained local model to a central server for integration. Because FL does not transmit raw data to the outside, it is free from privacy issues. In this paper, we perform an experimental study that explores the dynamics of the FL-based Android malicious app detection method under three data distributions across clients: (i) independent and identically distributed (IID), (ii) non-IID, and (iii) non-IID and unbalanced. Our experiments demonstrate that the application of FL is feasible and efficient in detecting malicious Android apps in a distributed manner on cellular networks.
18

Vasiuta, K. S., U. R. Zbezhkhovska, V. V. Slobodianiuk, V. S. Zahryvyi, and V. I. Chystov. "A Method of Covert Information Transmission in Systems with Orthogonal Frequency Division Multiplexing (OFDM) Modulation." Science and Technology of the Air Force of Ukraine, no. 2(43) (May 11, 2021): 132–39. http://dx.doi.org/10.30748/nitps.2021.43.18.

Full text
Abstract:
The article analyzes the possibilities of using signals with Orthogonal Frequency Division Multiplexing (OFDM) modulation to organize covert radio communication in special-purpose systems. The "image" of an OFDM-modulated signal in pseudo-phase space is constructed and analyzed. Visual analysis of the "image" showed that its structure is close to that of regular processes. The covertness of systems with OFDM modulation is assessed using the non-parametric Brock–Dechert–Scheinkman (BDS) statistic. A method for increasing the covertness of OFDM signals using Chebyshev polynomials is proposed. A comparative analysis of the Independent and Identically Distributed (IID) covertness of chaotic and harmonic OFDM signals is carried out. The results show that the proposed method of forming OFDM signals provides a higher level of IID covertness.
19

Kokic, P. N., and N. C. Weber. "Rates of strong convergence for U-statistics in finite populations." Journal of the Australian Mathematical Society. Series A. Pure Mathematics and Statistics 50, no. 3 (1991): 468–80. http://dx.doi.org/10.1017/s1446788700033024.

Full text
Abstract:
Let U_{Nn} be a U-statistic based on a simple random sample of size n selected without replacement from a finite population of size N. Rates-of-convergence results in the strong law are obtained for U_{Nn}, similar to those known for classical U-statistics based on samples of independent and identically distributed (iid) random variables.
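A degree-2 U-statistic of the kind generalized here can be computed as follows (a minimal illustration in the classical IID setting; the function names are hypothetical):

```python
from itertools import combinations

def u_statistic(sample, kernel):
    """Degree-2 U-statistic: the average of a symmetric kernel over all
    unordered pairs of observations."""
    pairs = list(combinations(sample, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)

# The kernel h(x, y) = (x - y)^2 / 2 recovers the unbiased sample variance.
data = [1.0, 2.0, 3.0, 4.0]
var_u = u_statistic(data, lambda x, y: 0.5 * (x - y) ** 2)
```

The finite-population version replaces the IID sample by a without-replacement draw, which is what breaks independence and motivates the paper's separate convergence-rate analysis.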
20

Othman, Abdul Rahman, Choo Heng Lai, Sonia Aissa, and Nora Muda. "Approximation of the Sum of Independent Lognormal Variates using Lognormal Distribution by Maximum Likelihood Estimation Approached." Sains Malaysiana 52, no. 1 (2023): 295–304. http://dx.doi.org/10.17576/jsm-2023-5201-24.

Full text
Abstract:
Three methods of approximating the sum of lognormal variates by a lognormal distribution were studied: the Wilkinson approximation, a Monte Carlo version of the Wilkinson approximation, and an approximation using estimated maximum likelihood lognormal parameters. The lognormal variates were generated empirically using Monte Carlo simulation under several conditions, such as the number of lognormal variates in the sum, the number of sample points in the variates, and whether the variates are independent and identically distributed (IID) or independent but not identically distributed (NIID) with lognormal parameters. All three lognormal approximation methods were evaluated using the Anderson–Darling test. Results show that the approximation using estimated maximum likelihood lognormal parameters produced Type I errors close to the 0.05 target and is considered the best approximation.
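The maximum-likelihood fit of a lognormal to the simulated sums can be sketched as below (an illustrative NumPy sketch; the parameter choices are arbitrary, and `fit_lognormal_mle` is a hypothetical helper exploiting the fact that the lognormal MLE is the mean and standard deviation of the logs):

```python
import numpy as np

def fit_lognormal_mle(samples):
    """MLE of lognormal parameters: for lognormally distributed data,
    the MLE of (mu, sigma) is the sample mean and (biased) standard
    deviation of the logarithms of the data."""
    logs = np.log(samples)
    return logs.mean(), logs.std()

rng = np.random.default_rng(1)
# approximate the sum of 4 IID lognormal(0, 0.5) variates by one lognormal
sums = rng.lognormal(0.0, 0.5, size=(20000, 4)).sum(axis=1)
mu_hat, sigma_hat = fit_lognormal_mle(sums)
```

The fitted (mu_hat, sigma_hat) then define the single lognormal whose goodness of fit the paper checks with the Anderson–Darling test.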
21

Gifuni, Angelo, Antonio Sorrentino, Giuseppe Ferrara, and Maurizio Migliaccio. "An Estimate of the Probability Density Function of the Sum of a Random Number N of Independent Random Variables." Journal of Computational Engineering 2015 (April 6, 2015): 1–12. http://dx.doi.org/10.1155/2015/801652.

Full text
Abstract:
A new estimate of the probability density function (PDF) of the sum of a random number N of independent and identically distributed (IID) random variables X_i (i = 1, 2, …, N) is shown. The sum PDF is represented as a sum of normal PDFs weighted according to the distribution of N. The analytical model is verified by numerical simulations. The comparison is made by the Chi-Square Goodness-of-Fit test.
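The normal-mixture representation can be sketched as follows (a minimal illustration under the stated assumptions; `compound_sum_pdf` is a hypothetical helper, and the n = 0 atom at zero is omitted):

```python
import numpy as np

def compound_sum_pdf(x, n_pmf, mu, sigma2):
    """Estimate the PDF of S = X_1 + ... + X_N (N random; X_i IID with
    mean mu and variance sigma2) as a sum of normal PDFs, each term
    weighted by the probability that N takes that value."""
    pdf = np.zeros_like(x, dtype=float)
    for n, p in n_pmf.items():
        if n == 0:
            continue  # atom at zero omitted in this sketch
        m, v = n * mu, n * sigma2  # mean and variance of an n-term sum
        pdf += p * np.exp(-(x - m) ** 2 / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)
    return pdf

x = np.linspace(-12.0, 12.0, 4801)  # grid with step 0.005
pdf = compound_sum_pdf(x, {1: 0.5, 2: 0.5}, mu=0.0, sigma2=1.0)
```

Each fixed-N term is treated as (approximately) normal, so the accuracy of the mixture improves as the typical N grows.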
APA, Harvard, Vancouver, ISO, and other styles
22

Meng, Xutao, Yong Li, Jianchao Lu, and Xianglin Ren. "An Optimization Method for Non-IID Federated Learning Based on Deep Reinforcement Learning." Sensors 23, no. 22 (2023): 9226. http://dx.doi.org/10.3390/s23229226.

Full text
Abstract:
Federated learning (FL) is a distributed machine learning paradigm that enables a large number of clients to collaboratively train models without sharing data. However, when the private dataset between clients is not independent and identically distributed (non-IID), the local training objective is inconsistent with the global training objective, which possibly causes the convergence speed of FL to slow down, or even not converge. In this paper, we design a novel FL framework based on deep reinforcement learning (DRL), named FedRLCS. In FedRLCS, we primarily improved the greedy strategy and action space of the double DQN (DDQN) algorithm, enabling the server to select the optimal subset of clients from a non-IID dataset to participate in training, thereby accelerating model convergence and reaching the target accuracy in fewer communication epochs. In simulation experiments, we partition multiple datasets with different strategies to simulate non-IID on local clients. We adopt four models (LeNet-5, MobileNetV2, ResNet-18, ResNet-34) on the four datasets (CIFAR-10, CIFAR-100, NICO, Tiny ImageNet), respectively, and conduct comparative experiments with five state-of-the-art non-IID FL methods. Experimental results show that FedRLCS reduces the number of communication rounds required by 10–70% with the same target accuracy without increasing the computation and storage costs for all clients.
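The label-skew partitioning commonly used to simulate non-IID clients in experiments like these (including the Dirichlet-based strategy) can be sketched as below. The function name `dirichlet_partition`, the `alpha` value, and the stand-in label array are illustrative assumptions, not the paper's code:

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients so that each class's samples are
    divided according to a Dirichlet(alpha) draw; smaller alpha -> more skew."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(n_clients))   # class share per client
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for cid, part in enumerate(np.split(idx, cuts)):
            client_idx[cid].extend(part.tolist())
    return client_idx

labels = np.repeat(np.arange(10), 500)   # stand-in for CIFAR-10 style labels
parts = dirichlet_partition(labels, n_clients=5, alpha=0.3)
```

With a small `alpha`, each client ends up dominated by a few classes, which is exactly the heterogeneity that makes local objectives diverge from the global one.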
APA, Harvard, Vancouver, ISO, and other styles
23

Hu, Hengrui, Anai N. Kothari, and Anjishnu Banerjee. "A Novel Algorithm for Personalized Federated Learning: Knowledge Distillation with Weighted Combination Loss." Algorithms 18, no. 5 (2025): 274. https://doi.org/10.3390/a18050274.

Full text
Abstract:
Federated learning (FL) offers a privacy-preserving framework for distributed machine learning, enabling collaborative model training across diverse clients without centralizing sensitive data. However, statistical heterogeneity, characterized by non-independent and identically distributed (non-IID) client data, poses significant challenges, leading to model drift and poor generalization. This paper proposes a novel algorithm, pFedKD-WCL (Personalized Federated Knowledge Distillation with Weighted Combination Loss), which integrates knowledge distillation with bi-level optimization to address non-IID challenges. pFedKD-WCL leverages the current global model as a teacher to guide local models, optimizing both global convergence and local personalization efficiently. We evaluate pFedKD-WCL on the MNIST dataset and a synthetic dataset with non-IID partitioning, using multinomial logistic regression (MLR) and multilayer perceptron models (MLP). Experimental results demonstrate that pFedKD-WCL outperforms state-of-the-art algorithms, including FedAvg, FedProx, PerFedAvg, pFedMe, and FedGKD in terms of accuracy and convergence speed. For example, on MNIST data with an extreme non-IID setting, pFedKD-WCL achieves accuracy improvements of 3.1%, 3.2%, 3.9%, 3.3%, and 0.3% for an MLP model with 50 clients compared to FedAvg, FedProx, PerFedAvg, pFedMe, and FedGKD, respectively, while gains reach 24.1%, 22.6%, 2.8%, 3.4%, and 25.3% for an MLR model with 50 clients.
APA, Harvard, Vancouver, ISO, and other styles
24

Zhang, Xufei, and Yiqing Shen. "Non-IID federated learning with Mixed-Data Calibration." Applied and Computational Engineering 45, no. 1 (2024): 168–78. http://dx.doi.org/10.54254/2755-2721/45/20241048.

Full text
Abstract:
Federated learning (FL) is a privacy-preserving and collaborative machine learning approach for decentralized data across multiple clients. However, the presence of non-independent and non-identically distributed (non-IID) data among clients poses challenges to the performance of the global model. To address this, we propose Mixed Data Calibration (MIDAC). MIDAC mixes M data points to neutralize sensitive information in each individual data point and uses the mixed data to calibrate the global model on the server in a privacy-preserving way. MIDAC improves global model accuracy with low computational overhead while preserving data privacy. Our experiments on CIFAR-10 and BloodMNIST datasets validate the effectiveness of MIDAC in improving the accuracy of federated learning models under non-IID data distributions.
APA, Harvard, Vancouver, ISO, and other styles
25

Navarro, Jorge, and Juan Fernández-Sánchez. "On the extension of signature-based representations for coherent systems with dependent non-exchangeable components." Journal of Applied Probability 57, no. 2 (2020): 429–40. http://dx.doi.org/10.1017/jpr.2020.20.

Full text
Abstract:
The signature representation shows that the reliability of the system is a mixture of the reliability functions of the k-out-of-n systems. The first representation was obtained for systems with independent and identically distributed (IID) components, and it was later extended to exchangeable (EXC) components. The purpose of the present paper is to extend it to the class of systems with identically distributed (ID) components which have a diagonal-dependent copula. We prove that this class is much larger than the class with EXC components. This extension is used to compare systems with non-EXC components.
APA, Harvard, Vancouver, ISO, and other styles
26

Singh, A., and H. Shankar. "Signal Combining Diversity for the Fisher-Snedecor Composite Fading Model in the Presence of Interference." Izvestiya Vysshikh Uchebnykh Zavedenii. Radioelektronika 66, no. 8 (2023): 459–65. http://dx.doi.org/10.20535/s0021347023070014.

Full text
Abstract:
The paper presents the performance of a Fisher-Snedecor (F) fading model for selection combining (SC) diversity in an interference-limited system. The probability density function (PDF) and cumulative distribution function (CDF) without SC diversity under single-user interference are given. Expressions for the CDF and PDF with SC diversity are also derived. The PDF is presented for various combinations of the desired-signal parameters and the multipath fading parameters of the interfering signal. Results are also illustrated in terms of the outage probability (OP) for both independent but not necessarily identically distributed (INID) and independent identically distributed (IID) channels. Better OP performance is obtained with a larger number of diversity branches.
APA, Harvard, Vancouver, ISO, and other styles
27

Stojanović, Vladica, Eugen Ljajko, and Marina Tošić. "Parameters Estimation in Non-Negative Integer-Valued Time Series: Approach Based on Probability Generating Functions." Axioms 12, no. 2 (2023): 112. http://dx.doi.org/10.3390/axioms12020112.

Full text
Abstract:
This manuscript deals with a parameter estimation of a non-negative integer-valued (NNIV) time series based on the so-called probability generating function (PGF) method. The theoretical background of the PGF estimation technique for a very general, stationary class of NNIV time series is described, as well as the asymptotic properties of the obtained estimates. After that, a particular emphasis is given to PGF estimators of independent identically distributed (IID) and integer-valued non-negative autoregressive (INAR) series. A Monte Carlo study of the PGF estimates thus obtained, based on numerical integration of the appropriate objective function, is also presented. For this purpose, numerical quadrature formulas were computed using Gegenbauer orthogonal polynomials. Finally, the application of the PGF estimators in the dynamic analysis of some actual data is given.
APA, Harvard, Vancouver, ISO, and other styles
28

Agrawal, Shaashwat, Sagnik Sarkar, Mamoun Alazab, Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu, and Quoc-Viet Pham. "Genetic CFL: Hyperparameter Optimization in Clustered Federated Learning." Computational Intelligence and Neuroscience 2021 (November 18, 2021): 1–10. http://dx.doi.org/10.1155/2021/7156420.

Full text
Abstract:
Federated learning (FL) is a distributed model for deep learning that integrates client-server architecture, edge computing, and real-time intelligence. FL has the capability of revolutionizing machine learning (ML) but lacks practicality of implementation due to technological limitations, communication overhead, non-IID (non-independent and identically distributed) data, and privacy concerns. Training an ML model over heterogeneous non-IID data highly degrades the convergence rate and performance. The existing traditional and clustered FL algorithms exhibit two main limitations: inefficient client training and static hyperparameter utilization. To overcome these limitations, we propose a novel hybrid algorithm, namely, genetic clustered FL (Genetic CFL), that clusters edge devices based on the training hyperparameters and genetically modifies the parameters clusterwise. Then, we introduce an algorithm that drastically increases the individual cluster accuracy by integrating density-based clustering and genetic hyperparameter optimization. The results are benchmarked using the MNIST handwritten digit dataset and the CIFAR-10 dataset. The proposed genetic CFL shows significant improvements and works well with realistic cases of non-IID and ambiguous data. An accuracy of 99.79% is observed on the MNIST dataset and 76.88% on the CIFAR-10 dataset with only 10 training rounds.
APA, Harvard, Vancouver, ISO, and other styles
29

Wang, Zhao, Yifan Hu, Shiyang Yan, Zhihao Wang, Ruijie Hou, and Chao Wu. "Efficient Ring-Topology Decentralized Federated Learning with Deep Generative Models for Medical Data in eHealthcare Systems." Electronics 11, no. 10 (2022): 1548. http://dx.doi.org/10.3390/electronics11101548.

Full text
Abstract:
By leveraging deep learning technologies, data-driven-based approaches have reached great success with the rapid increase of data generated for medical applications. However, security and privacy concerns are obstacles for data providers in many sensitive data-driven scenarios, such as rehabilitation and 24 h on-the-go healthcare services. Although many federated learning (FL) approaches have been proposed with DNNs for medical applications, these works still suffer from low usability of data due to data incompleteness, low quality, insufficient quantity, sensitivity, etc. Therefore, we propose a ring-topology-based decentralized federated learning (RDFL) scheme for deep generative models (DGM), where DGM is a promising solution for solving the aforementioned data usability issues. Our RDFL schemes provide communication efficiency and maintain training performance to boost DGMs in target tasks compared with existing FL works. A novel ring FL topology and a map-reduce-based synchronizing method are designed in the proposed RDFL to improve the decentralized FL performance and bandwidth utilization. In addition, an inter-planetary file system (IPFS) is introduced to further improve communication efficiency and FL security. Extensive experiments have been conducted to demonstrate the superiority of RDFL with either independent and identically distributed (IID) datasets or non-independent and identically distributed (Non-IID) datasets.
APA, Harvard, Vancouver, ISO, and other styles
30

Guo, Huizhen, and Nabendu Pal. "On a Normal Mean with Known Coefficient of Variation." Calcutta Statistical Association Bulletin 54, no. 1-2 (2003): 17–30. http://dx.doi.org/10.1177/0008068320030102.

Full text
Abstract:
This paper deals with estimation of θ when iid (independent and identically distributed) observations are available from a N(θ, cθ²) distribution, where c > 0 is assumed to be known. Using the equivariance principle under the group of scale and direction transformations, we first characterize the class of equivariant estimators of θ. We then investigate a few equivariant estimators, including the maximum likelihood estimator, in terms of standardized bias and standardized mean squared error.
APA, Harvard, Vancouver, ISO, and other styles
31

Zhou, Yueying, Gaoxiang Duan, Tianchen Qiu, et al. "Personalized Federated Learning Incorporating Adaptive Model Pruning at the Edge." Electronics 13, no. 9 (2024): 1738. http://dx.doi.org/10.3390/electronics13091738.

Full text
Abstract:
Edge devices employing federated learning encounter several obstacles, including (1) the non-independent and identically distributed (Non-IID) nature of client data, (2) limitations due to communication bottlenecks, and (3) constraints on computational resources. To surmount the Non-IID data challenge, personalized federated learning has been introduced, which involves training tailored networks at the edge; nevertheless, these methods often exhibit inconsistency in performance. In response to these concerns, a novel framework for personalized federated learning that incorporates adaptive pruning of edge-side data is proposed in this paper. This approach, through a two-stage pruning process, creates customized models while ensuring strong generalization capabilities. Concurrently, by utilizing sparse models, it significantly condenses the model parameters, markedly diminishing both the computational burden and communication overhead on edge nodes. This method achieves a remarkable compression ratio of 3.7% on the Non-IID dataset FEMNIST, with the training accuracy remaining nearly unaffected. Furthermore, the total training duration is reduced by 46.4% when compared with the standard baseline method.
APA, Harvard, Vancouver, ISO, and other styles
32

Sharma, Shagun, and Kalpna Guleria. "A Collaborative Privacy Preserved Federated Learning Framework for Pneumonia Detection using Diverse Chest X-ray Data Silos." International Journal of Mathematical, Engineering and Management Sciences 10, no. 2 (2025): 464–85. https://doi.org/10.33889/ijmems.2025.10.2.023.

Full text
Abstract:
Pneumonia detection from chest X-rays remains one of the most challenging tasks in the traditional centralized framework due to the requirement of data consolidation at the central location, raising data privacy and security concerns. The amalgamation of healthcare data at centralized storage leads to regulatory concerns raised by the governments of various countries. To address these challenges, a decentralized, federated learning framework has been proposed for early pneumonia detection in chest X-ray images with a 5-client architecture. This model enhances data privacy while performing collaborative learning with diverse data silos, resulting in improved predictions. The proposed federated learning framework has been trained with a pre-trained EfficientNetB3 model under the Independent and Identically Distributed (IID) and non-IID data distributions, while model updates have been performed using federated proximal aggregation. The proximal term has been set to 0.05, achieving an accuracy of 99.32% on IID data and 96.14% on non-IID data. In addition, the proximal term has also been set to 0.5, resulting in accuracy levels of 92.05% and 96.98% in the IID and non-IID data distributions, respectively. The results of the proposed model demonstrate the effectiveness of the federated learning model in pneumonia detection, highlighting its potential for real-world applications in decentralized healthcare configurations.
APA, Harvard, Vancouver, ISO, and other styles
33

Choi, Jai Won, Balgobin Nandram, and Boseung Choi. "Combining Correlated P-values From Primary Data Analyses." International Journal of Statistics and Probability 11, no. 6 (2022): 12. http://dx.doi.org/10.5539/ijsp.v11n6p12.

Full text
Abstract:
Research results on the same subject, extracted from scientific papers or clinical trials, are combined to determine a consensus. We are primarily concerned with combining p-values from experiments that may be correlated. We have two methods, a non-Bayesian method and a Bayesian method. We use a model to combine these results and assume the combined results follow a certain distribution, for example, chi-square or normal. The distribution requires independent and identically distributed (iid) random variables. When the data are correlated or non-iid, we cannot assume such a distribution. In order to do so, the combined results from the model need to be adjusted, and the adjustment is done “indirectly” through two test statistics. Specifically, one test statistic (TS**) is obtained for the non-iid data and the other (TS) is obtained for iid data. We use the ratio between the two test statistics to adjust the model test statistic (TS**) for its non-iid violation. The adjusted TS** is named the “effective test statistic” (ETS), which is then used for statistical inferences with the assumed distribution. As it is difficult to estimate the correlation, to provide a more coherent method for combining p-values, we also introduce a novel Bayesian method for both iid and non-iid data. Examples are used to illustrate the non-Bayesian method, and additional examples are given to illustrate the Bayesian method.
APA, Harvard, Vancouver, ISO, and other styles
34

Lee, Yi-Chen, Wei-Che Chien, and Yao-Chung Chang. "FedDB: A Federated Learning Approach Using DBSCAN for DDoS Attack Detection." Applied Sciences 14, no. 22 (2024): 10236. http://dx.doi.org/10.3390/app142210236.

Full text
Abstract:
The rise of Distributed Denial of Service (DDoS) attacks on the internet has necessitated the development of robust and efficient detection mechanisms. DDoS attacks continue to present a significant threat, making it imperative to find efficient ways to detect and prevent these attacks promptly. Traditional machine learning approaches raise privacy concerns when handling sensitive data. In response, federated learning has emerged as a promising paradigm, allowing model training across decentralized devices without centralizing data. However, challenges such as the non-IID (Non-Independent and Identically Distributed) problem persist due to data distribution imbalances among devices. In this research, we propose personalized federated learning (PFL) as a solution for detecting DDoS attacks. PFL preserves data privacy by keeping sensitive information localized on individual devices during model training, thus addressing privacy concerns that are inherent in traditional approaches. In this paper, we propose federated learning with DBSCAN clustering (FedDB). By combining personalized training with model aggregation, our approach effectively mitigates the common challenge of non-IID data in federated learning setups. The integration of DBSCAN clustering further enhances our method by effectively handling data distribution imbalances and improving the overall detection accuracy. Results indicate that our proposed model improves performance, achieving relatively consistent accuracy across all clients, demonstrating that our method effectively overcomes the non-IID problem. Evaluation of our approach utilizes the CICDDOS2019 dataset. Through comprehensive experimentation, we demonstrate the efficacy of personalized federated learning in enhancing detection accuracy while safeguarding data privacy and mitigating non-IID concerns.
APA, Harvard, Vancouver, ISO, and other styles
35

Na, Kyungmin, Dohyoung Kim, and Youngho Lee. "Comparison of Federated Learning and Fair Federated Learning for Pneumonia Patient Classification." Journal of Health Informatics and Statistics 50, no. 1 (2025): 31–38. https://doi.org/10.21032/jhis.2025.50.1.31.

Full text
Abstract:
Objectives: This study aims to compare the performance of federated learning (FL) and fair federated learning (FFL) in classifying pneumonia patients based on chest X-ray data. The primary focus is on assessing the accuracy and fairness of these models in handling imbalanced and distributed data in real-world healthcare settings. Methods: We used a large chest X-ray dataset to evaluate the performance of FL and FFL models. The models were built using the ResNet50 architecture, and experiments were conducted under both independent and identically distributed (IID) and non-IID data conditions. The FFL approach applied optimized loss functions to address data imbalance and ensure fair contribution from each client, regardless of the local data distribution. Results: Our findings indicate that FFL consistently outperforms traditional FL models, particularly in non-IID environments. The FFL model demonstrated higher accuracy in pneumonia classification, achieving a significant improvement in model fairness and performance across different client datasets. The use of the ResNet50 architecture further enhanced the model’s ability to handle complex X-ray image patterns. Conclusions: FFL offers a superior solution for handling imbalanced medical data compared to conventional FL models. Its ability to maintain fairness while improving classification accuracy makes it an ideal approach for decentralized healthcare systems, ensuring better patient outcomes while preserving data privacy.
APA, Harvard, Vancouver, ISO, and other styles
36

Yan, Jiaxing, Yan Li, Sifan Yin, et al. "An Efficient Greedy Hierarchical Federated Learning Training Method Based on Trusted Execution Environments." Electronics 13, no. 17 (2024): 3548. http://dx.doi.org/10.3390/electronics13173548.

Full text
Abstract:
With the continuous development of artificial intelligence, effectively solving the problem of data islands while protecting user data privacy has become a top priority. Federated learning is an effective solution to the two significant dilemmas of data islands and data privacy protection. However, there are still some security problems in federated learning. Therefore, this study simulates real-world data distributions in a hardware-based trusted execution environment through two processing methods: independent identically distributed (IID) and non-independent identically distributed (non-IID). The basic model uses ResNet164 and innovatively introduces a greedy hierarchical training strategy to gradually train and aggregate complex models, ensuring that the training of each layer is optimized while protecting privacy. The experimental results show that under an IID data distribution, the final accuracy of the greedy hierarchical model reaches 86.72%, which is close to the accuracy of the unpruned model at 89.60%. In contrast, under the non-IID condition, the model’s performance decreases. Overall, the TEE-based hierarchical federated learning method shows reasonable practicability and effectiveness in a resource-constrained environment. Through this study, the advantages of the greedy hierarchical federated learning model with regard to enhancing data privacy protection, optimizing resource utilization, and improving model training efficiency are further verified, providing new ideas and methods for solving the data island and data privacy protection problems.
APA, Harvard, Vancouver, ISO, and other styles
37

Chen, Jing. "Law of Large Numbers under Choquet Expectations." Abstract and Applied Analysis 2014 (2014): 1–7. http://dx.doi.org/10.1155/2014/179506.

Full text
Abstract:
With a new notion of independence of random variables, we establish the nonadditive version of the weak law of large numbers (LLN) for independent and identically distributed (IID) random variables under Choquet expectations induced by 2-alternating capacities. Moreover, we weaken the moment assumptions to the first absolute moment and characterize the approximate distributions of random variables as well. Naturally, our theorem can be viewed as an extension of the classical LLN to the case where the probability is no longer additive.
APA, Harvard, Vancouver, ISO, and other styles
38

Rai, Sumit, Arti Kumari, and Dilip K. Prasad. "Client Selection in Federated Learning under Imperfections in Environment." AI 3, no. 1 (2022): 124–45. http://dx.doi.org/10.3390/ai3010008.

Full text
Abstract:
Federated learning promises an elegant solution for learning global models across distributed and privacy-protected datasets. However, challenges related to skewed data distribution, limited computational and communication resources, data poisoning, and free-riding clients affect the performance of federated learning. Selection of the best clients for each round of learning is critical in alleviating these problems. We propose a novel sampling method named the irrelevance sampling technique. Our method is founded on defining a novel irrelevance score that incorporates the client characteristics in a single floating-point value, which can elegantly classify the client into three pools defined by numerical sign for easy sampling. It is a computationally inexpensive, intuitive, and privacy-preserving sampling technique that selects a subset of clients based on the quality and quantity of data on edge devices. It achieves 50–80% faster convergence even under highly skewed data distribution in the presence of free riders based on lack of data and severe class imbalance, under both Independent and Identically Distributed (IID) and Non-IID conditions. It shows good performance on practical application datasets.
APA, Harvard, Vancouver, ISO, and other styles
39

Lu, Chenyang, Su Deng, Yahui Wu, Haohao Zhou, and Wubin Ma. "Federated Learning Based on OPTICS Clustering Optimization." Discrete Dynamics in Nature and Society 2022 (May 12, 2022): 1–10. http://dx.doi.org/10.1155/2022/7151373.

Full text
Abstract:
Federated learning (FL) has emerged for solving the problem of data fragmentation and isolation in machine learning based on privacy protection. Each client node uploads the trained model parameter information to the central server based on the local training data, and the central server aggregates the parameter information to achieve the purpose of common training. In the real environment, the distribution of data among nodes is often inconsistent. By analyzing the influence of non-independent identically distributed (non-IID) data on the accuracy of FL, it is shown that the accuracy of the model obtained by the traditional FL method is low. Therefore, we proposed diversified sampling strategies to simulate the non-IID data situation and came up with the OPTICS (ordering points to identify the clustering structure)-based clustering optimization federated learning method (OCFL), which solves the problem that the learning accuracy is reduced when the data of different nodes are non-IID in FL. Experiments indicate that OCFL greatly improves the model accuracy and training speed compared with the traditional FL algorithm.
APA, Harvard, Vancouver, ISO, and other styles
40

Wu, Jun, Jingrui He, and Elizabeth Ainsworth. "Non-IID Transfer Learning on Graphs." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 9 (2023): 10342–50. http://dx.doi.org/10.1609/aaai.v37i9.26231.

Full text
Abstract:
Transfer learning refers to the transfer of knowledge or information from a relevant source domain to a target domain. However, most existing transfer learning theories and algorithms focus on IID tasks, where the source/target samples are assumed to be independent and identically distributed. Very little effort is devoted to theoretically studying the knowledge transferability on non-IID tasks, e.g., cross-network mining. To bridge the gap, in this paper, we propose rigorous generalization bounds and algorithms for cross-network transfer learning from a source graph to a target graph. The crucial idea is to characterize the cross-network knowledge transferability from the perspective of the Weisfeiler-Lehman graph isomorphism test. To this end, we propose a novel Graph Subtree Discrepancy to measure the graph distribution shift between source and target graphs. Then the generalization error bounds on cross-network transfer learning, including both cross-network node classification and link prediction tasks, can be derived in terms of the source knowledge and the Graph Subtree Discrepancy across domains. This thereby motivates us to propose a generic graph adaptive network (GRADE) to minimize the distribution shift between source and target graphs for cross-network transfer learning. Experimental results verify the effectiveness and efficiency of our GRADE framework on both cross-network node classification and cross-domain recommendation tasks.
APA, Harvard, Vancouver, ISO, and other styles
41

Deng, Zilong, Yizhang Wang, and Mustafa Muwafak Alobaedy. "Federated k-means based on clusters backbone." PLOS One 20, no. 6 (2025): e0326145. https://doi.org/10.1371/journal.pone.0326145.

Full text
Abstract:
Federated clustering is a distributed clustering approach that does not require the transmission of raw data and is widely used. However, it struggles to handle non-IID data effectively because it is difficult to obtain accurate global consistency measures under Non-Independent and Identically Distributed (Non-IID) conditions. To address this issue, we propose a federated k-means clustering algorithm based on a cluster backbone, called FKmeansCB. First, we add Laplace noise to all the local data and run k-means clustering on the client side to obtain cluster centers, which faithfully represent the cluster backbone (i.e., the data structures of the clusters). The cluster backbone represents the client’s features and can approximately capture the features of differently labeled data points in non-IID situations. We then upload these cluster centers to the server. Subsequently, the server aggregates all cluster centers and runs the k-means clustering algorithm to obtain global cluster centers, which are then sent back to the clients. Finally, each client assigns all data points to the nearest global cluster center to produce the final clustering results. We have validated the performance of our proposed algorithm using six datasets, including the large-scale MNIST dataset. Compared with the leading non-federated and federated clustering algorithms, FKmeansCB offers significant advantages in both clustering accuracy and running time.
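The client-side/server-side flow described in this abstract (local k-means on optionally noised data, centers uploaded, server k-means over the pooled centers) can be sketched as below. This is a generic sketch under assumed names (`kmeans`, `federated_kmeans`) and toy data, not the paper's FKmeansCB implementation:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns the k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)  # squared distances
        assign = d.argmin(1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = X[assign == j].mean(0)
    return centers

def federated_kmeans(client_data, k, noise_scale=0.0, seed=0):
    """Each client runs k-means locally (optionally on Laplace-noised data);
    only the centers are sent to the server, which clusters the pooled
    centers to obtain global centers."""
    rng = np.random.default_rng(seed)
    local = []
    for X in client_data:
        Xn = X + rng.laplace(0.0, noise_scale, X.shape) if noise_scale else X
        local.append(kmeans(Xn, k, seed=seed))
    return kmeans(np.vstack(local), k, seed=seed)

# Two non-IID clients: each holds points from (mostly) one cluster
rng = np.random.default_rng(1)
clients = [rng.normal(0.0, 0.5, (200, 2)), rng.normal(10.0, 0.5, (200, 2))]
global_centers = federated_kmeans(clients, k=2)
```

Even though neither client sees both clusters, the pooled local centers let the server recover both global cluster locations, which is the intuition behind the "cluster backbone".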
APA, Harvard, Vancouver, ISO, and other styles
42

Salha, Raid B., Hazem I. El Shekh Ahmed, and Hossam O. EL-Sayed. "Adaptive Kernel Estimation of the Conditional Quantiles." International Journal of Statistics and Probability 5, no. 1 (2015): 79. http://dx.doi.org/10.5539/ijsp.v5n1p79.

Full text
Abstract:
In this paper, we define the adaptive kernel estimation of the conditional distribution function (cdf) for independent and identically distributed (iid) data using a varying bandwidth. The bias, variance, and mean squared error of the proposed estimator are investigated. Moreover, the asymptotic normality of the proposed estimator is investigated. The results of the simulation study show that the adaptive kernel estimator of the conditional quantiles with varying bandwidth has better performance than the kernel estimator with fixed bandwidth.
APA, Harvard, Vancouver, ISO, and other styles
43

Lv, Yankai, Haiyan Ding, Hao Wu, Yiji Zhao, and Lei Zhang. "FedRDS: Federated Learning on Non-IID Data via Regularization and Data Sharing." Applied Sciences 13, no. 23 (2023): 12962. http://dx.doi.org/10.3390/app132312962.

Full text
Abstract:
Federated learning (FL) is an emerging decentralized machine learning framework enabling private global model training by collaboratively leveraging local client data without transferring it centrally. Unlike traditional distributed optimization, FL trains the model at the local client and then aggregates it at the server. While this approach reduces communication costs, the local datasets of different clients are non-Independent and Identically Distributed (non-IID), which may make the local models inconsistent. The present study suggests an FL algorithm that leverages regularization and data sharing (FedRDS). The local loss function is adapted by introducing a regularization term in each round of training so that the local model gradually moves closer to the global model. However, when the gap between client data distributions becomes large, adding regularization terms increases the degree of client drift. Based on this, we used a data-sharing method in which a portion of server data is taken out as a shared dataset during initialization. We then evenly distributed these data to each client to mitigate the problem of client drift by reducing the difference in client data distributions. Analysis of experimental outcomes indicates that FedRDS surpasses some known FL methods in various image classification tasks, enhancing both communication efficiency and accuracy.
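The server-side aggregation step that schemes like FedRDS build on (FedAvg-style weighted averaging of client parameters) can be sketched as below. This is a generic sketch with an assumed helper name (`fedavg`) and toy values, not the paper's code:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg-style aggregation: per-parameter average across clients,
    weighted by each client's local dataset size."""
    total = float(sum(client_sizes))
    n_params = len(client_weights[0])
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(n_params)
    ]

# Two toy clients, one parameter tensor each (illustrative values):
# the second client holds 3x more data, so its weights count 3x as much
agg = fedavg([[np.array([1.0, 2.0])], [np.array([3.0, 4.0])]], [1, 3])
```

On top of this aggregation, FedRDS modifies the local objective with a regularization term pulling local weights toward the current global model (in the spirit of FedProx's proximal term) and shares a small server-held dataset to reduce distribution gaps between clients.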
44

Li, Hai, Yutong Chen, Kaihong Feng, and Ming Jin. "Low-Altitude Windshear Wind Speed Estimation Method based on KASPICE-STAP." Sensors 23, no. 1 (2022): 54. http://dx.doi.org/10.3390/s23010054.

Full text
Abstract:
To address the problem of low-altitude windshear wind speed estimation for airborne weather radar without independent and identically distributed (IID) training samples, this paper proposes a low-altitude windshear wind speed estimation method based on knowledge-aided sparse iterative covariance-based estimation STAP (KASPICE-STAP). Firstly, a clutter dictionary composed of clutter space–time steering vectors is constructed using prior knowledge of the position of ground clutter echo signals in the space–time spectrum. Secondly, the SPICE algorithm is used to obtain the clutter covariance matrix iteratively. Finally, the STAP processor is designed to eliminate the ground clutter echo signal, after which the wind speed is estimated. The simulation results show that the proposed method can accurately estimate low-altitude windshear wind speed without IID training samples.
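The final STAP filtering step mentioned here can be sketched with the standard minimum-variance (Capon-type) weight formula. This is a generic illustration: in the cited method the covariance matrix comes from the SPICE iteration, whereas here any Hermitian positive-definite `R` is accepted, and the identity-covariance example below is purely hypothetical.

```python
import numpy as np

def stap_weights(R, s):
    """Minimum-variance STAP filter weights w = R^{-1}s / (s^H R^{-1} s)
    for a space-time steering vector s and clutter-plus-noise
    covariance R; the constraint w^H s = 1 keeps the target response
    distortionless while suppressing clutter."""
    Rinv_s = np.linalg.solve(R, s)
    return Rinv_s / (s.conj() @ Rinv_s)
```

By construction the weights satisfy the unit-gain constraint in the steering direction, which is easy to verify numerically.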
45

Li, Ling, Lidong Zhu, and Weibang Li. "Cloud–Edge–End Collaborative Federated Learning: Enhancing Model Accuracy and Privacy in Non-IID Environments." Sensors 24, no. 24 (2024): 8028. https://doi.org/10.3390/s24248028.

Full text
Abstract:
Cloud–edge–end computing architecture is crucial for large-scale edge data processing and analysis. However, the diversity of terminal nodes and task complexity in this architecture often result in non-independent and identically distributed (non-IID) data, making it challenging to balance data heterogeneity and privacy protection. To address this, we propose a privacy-preserving federated learning method based on cloud–edge–end collaboration. Our method fully considers the three-tier architecture of cloud–edge–end systems and the non-IID nature of terminal node data. It enhances model accuracy while protecting the privacy of terminal node data. The proposed method groups terminal nodes based on the similarity of their data distributions and constructs edge subnetworks for training in collaboration with edge nodes, thereby mitigating the negative impact of non-IID data. Furthermore, we enhance WGAN-GP with an attention mechanism to generate balanced synthetic data while preserving key patterns from the original datasets, reducing the adverse effects of non-IID data on global model accuracy while preserving data privacy. In addition, we introduce data resampling and loss function weighting strategies to mitigate model bias caused by imbalanced data distribution. Experimental results on real-world datasets demonstrate that our proposed method significantly outperforms existing approaches in terms of model accuracy, F1-score, and other metrics.
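The distribution-similarity grouping of terminal nodes described in this abstract can be sketched with a simple greedy scheme over normalized label histograms. The cosine-similarity criterion, the greedy first-fit strategy, and the `threshold` value are stand-in assumptions; the paper's actual grouping rule may differ.

```python
import numpy as np

def group_clients(label_hists, threshold=0.9):
    """Greedily group clients whose normalized label histograms are
    cosine-similar: each client joins the first existing group whose
    representative it matches, else it starts a new group."""
    hists = [h / np.linalg.norm(h) for h in label_hists]
    groups = []                      # each group: list of client indices
    for i, h in enumerate(hists):
        for g in groups:
            rep = hists[g[0]]        # compare against the group's first member
            if float(h @ rep) >= threshold:
                g.append(i)
                break
        else:                        # no break: no similar group was found
            groups.append([i])
    return groups
```

Clients with nearly identical label mixes end up in the same edge subnetwork, while clients holding disjoint classes are kept apart.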
46

Layne, Elliot, Erika N. Dort, Richard Hamelin, Yue Li, and Mathieu Blanchette. "Supervised learning on phylogenetically distributed data." Bioinformatics 36, Supplement_2 (2020): i895—i902. http://dx.doi.org/10.1093/bioinformatics/btaa842.

Full text
Abstract:
Motivation: The ability to develop robust machine-learning (ML) models is considered imperative to the adoption of ML techniques in biology and medicine. This challenge is particularly acute when the data available for training are not independent and identically distributed (iid), in which case trained models are vulnerable to out-of-distribution generalization problems. Of particular interest are problems where data correspond to observations made on phylogenetically related samples (e.g. antibiotic resistance data). Results: We introduce DendroNet, a new approach to training neural networks in the context of evolutionary data. DendroNet explicitly accounts for the relatedness of the training/testing data, while allowing the model to evolve along the branches of the phylogenetic tree, hence accommodating potential changes in the rules that relate genotypes to phenotypes. Using simulated data, we demonstrate that DendroNet produces models that can be significantly better than non-phylogenetically aware approaches. DendroNet also outperforms other approaches at two biological tasks of significant practical importance: antibiotic resistance prediction in bacteria and trophic level prediction in fungi. Availability and implementation: https://github.com/BlanchetteLab/DendroNet.
47

Sharma, Shagun, Kalpna Guleria, Ayush Dogra, et al. "A privacy-preserved horizontal federated learning for malignant glioma tumour detection using distributed data-silos." PLOS ONE 20, no. 2 (2025): e0316543. https://doi.org/10.1371/journal.pone.0316543.

Full text
Abstract:
Malignant glioma is the uncontrollable growth of cells in the spinal cord and brain that look similar to normal glial cells. Glial cells are an essential part of the nervous system and prominently support the brain's functioning. However, as glioma evolves, tumours form that invade healthy brain tissue, leading to neurological impairment, seizures, hormonal dysregulation, and venous thromboembolism. Medical tests, including magnetic resonance imaging (MRI), computed tomography (CT) scans, biopsy, and electroencephalograms, are used for the early detection of glioma. However, these tests are expensive and may cause irritation and allergic reactions due to ionizing radiation. Deep learning models are well suited to disease prediction; however, they require substantial memory and storage to amalgamate patient information at a centralized location. They also raise patient data-privacy concerns, leading to anonymous information generalization, regulatory compliance issues, and data leakage challenges. Therefore, in the proposed work, a distributed and privacy-preserved horizontal federated learning-based malignant glioma detection model has been developed by employing 5- and 10-client architectures under independent and identically distributed (IID) and non-IID distributions. Initially, MRI scans of non-tumour and glioma tumours were collected and pre-processed by performing data balancing and image resizing. A pre-trained MobileNetV2 base model was configured and applied to the federated learning (FL) framework, with the learning rate, optimizer, batch size, local epochs, global epochs, aggregation method, and rounds set to 0.001, Adam, 32, 10, 10, FedAVG, and 10, respectively.
The proposed model provided the most prominent accuracy with the 5-client architecture, at 99.76% and 99.71% for the IID and non-IID distributions, respectively. These outcomes demonstrate that the model is highly optimized and yields improved outcomes compared to the state-of-the-art models.
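The FedAVG aggregation named in this abstract's configuration can be sketched as a dataset-size-weighted average of client models. The list-of-arrays model representation is an assumption for illustration; the paper trains MobileNetV2, but the averaging rule is the same.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """FedAVG server step: average the clients' models layer by layer,
    weighting each client by the size of its local dataset. Each model
    is represented as a list of numpy arrays (one per layer)."""
    total = float(sum(client_sizes))
    return [
        sum(n / total * w[layer] for w, n in zip(client_weights, client_sizes))
        for layer in range(len(client_weights[0]))
    ]
```

A client holding three times as much data contributes three times the weight to the aggregated model.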
48

Rychlik, Tomasz, and Magdalena Szymkowiak. "Bounds on the Lifetime Expectations of Series Systems with IFR Component Lifetimes." Entropy 23, no. 4 (2021): 385. http://dx.doi.org/10.3390/e23040385.

Full text
Abstract:
We consider series systems built of components whose lifetimes are independent and identically distributed (iid) with an increasing failure rate (IFR). We determine sharp upper bounds for the expectations of the system lifetimes, expressed in terms of the mean and various scale units based on absolute central moments of the component lifetimes. We further establish analogous bounds under the more stringent assumption that the component lifetimes have an increasing density (ID) function. We also indicate the relationship between the IFR property of the components and the generalized cumulative residual entropy of the series system lifetime.
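A series system fails as soon as any component fails, so its lifetime is the minimum of the iid component lifetimes; the expectation that the paper bounds can be checked numerically. The sketch below is a generic Monte Carlo check, not part of the cited analysis; the exponential distribution in the usage example is the boundary IFR case, for which the exact mean of the minimum of n unit-mean lifetimes is 1/n.

```python
import numpy as np

def series_lifetime_mean(sampler, n_components, n_trials=100_000, seed=0):
    """Monte Carlo estimate of E[min(X_1, ..., X_n)], the expected
    lifetime of a series system with iid component lifetimes drawn
    by `sampler(rng, shape)`."""
    rng = np.random.default_rng(seed)
    draws = sampler(rng, (n_trials, n_components))
    # system lifetime per trial = earliest component failure
    return draws.min(axis=1).mean()
```

For four unit-mean exponential components the estimate lands near the exact value 1/4.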
49

Efthymiadis, Filippos, Aristeidis Karras, Christos Karras, and Spyros Sioutas. "Advanced Optimization Techniques for Federated Learning on Non-IID Data." Future Internet 16, no. 10 (2024): 370. http://dx.doi.org/10.3390/fi16100370.

Full text
Abstract:
Federated learning enables model training on multiple clients locally, without the need to transfer their data to a central server, thus ensuring data privacy. In this paper, we investigate the impact of Non-Independent and Identically Distributed (non-IID) data on the performance of federated training, where we find a reduction in accuracy of up to 29% for neural networks trained in environments with skewed non-IID data. Two optimization strategies are presented to address this issue. The first strategy focuses on applying a cyclical learning rate to determine the learning rate during federated training, while the second strategy develops a sharing and pre-training method on augmented data in order to improve the efficiency of the algorithm in the case of non-IID data. By combining these two methods, experiments show that the accuracy on the CIFAR-10 dataset increased by about 36% while achieving faster convergence by reducing the number of required communication rounds by 5.33 times. The proposed techniques lead to improved accuracy and faster model convergence, thus representing a significant advance in the field of federated learning and facilitating its application to real-world scenarios.
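The cyclical learning rate strategy from this abstract can be sketched with the standard triangular policy: the rate ramps linearly between a lower and an upper bound every cycle. The bounds and `step_size` below are illustrative assumptions, not the values tuned in the paper.

```python
def cyclical_lr(step, base_lr=1e-4, max_lr=1e-2, step_size=200):
    """Triangular cyclical learning rate: rises linearly from base_lr
    to max_lr over step_size steps, then falls back, repeating every
    2*step_size steps."""
    cycle = step // (2 * step_size)
    x = abs(step / step_size - 2 * cycle - 1)       # position in cycle, in [0, 1]
    return base_lr + (max_lr - base_lr) * (1 - x)
```

In a federated setting each client would query this schedule with its current local step count instead of using a fixed rate.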
50

Taheri, Seyed Iman, Mohammadreza Davoodi, and Mohd Hasan Ali. "Mitigating Cyber Anomalies in Virtual Power Plants Using Artificial-Neural-Network-Based Secondary Control with a Federated Learning-Trust Adaptation." Energies 17, no. 3 (2024): 619. http://dx.doi.org/10.3390/en17030619.

Full text
Abstract:
Virtual power plants (VPPs) are susceptible to cyber anomalies due to their extensive communication layer. FL-trust, an improved federated learning (FL) approach, has been recently introduced as a mitigation system for cyber-attacks. However, current FL-trust enhancements, relying solely on proportional-integral (PI) control, exhibit drawbacks such as sensitivity to controller gain fluctuations and a slow response to sudden disturbances, and conventional FL-trust is not directly applicable to the non-independent and identically distributed (non-IID) datasets common in VPPs. To address these limitations, we introduce an artificial neural network (ANN)-based technique to adapt FL-trust to non-IID datasets. The ANN is designed as an intelligent anomaly mitigation control method, employing a dynamic recurrent neural network with exogenous inputs. We consider the effects of the most common VPP attacks, poisoning attacks, on the distributed cooperative controller at the secondary control level. The ANN is trained offline and tested online in the simulated VPP. Using MATLAB simulations on a HOMER-modeled VPP, the proposed technique demonstrates its superior ability to sustain normal VPP operation amidst cyber anomalies, outperforming a PI-based mitigation system in accuracy and detection speed.
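The FL-trust aggregation underlying this work scores each client update by its similarity to a trusted server-side update, which limits the influence of poisoned updates. The sketch below follows the generic FLTrust recipe (ReLU of cosine similarity as the trust score, magnitude rescaling to the root update's norm) on flat numpy vectors; it is an illustration of that recipe, not the paper's ANN-adapted variant.

```python
import numpy as np

def fltrust_aggregate(client_updates, server_update):
    """FLTrust-style robust aggregation: weight each client update by
    ReLU(cosine similarity to the server's root update), rescale its
    magnitude to the root update's norm, and take the weighted mean."""
    g0 = server_update
    g0_norm = np.linalg.norm(g0)
    scores, scaled = [], []
    for g in client_updates:
        cos = float(g @ g0) / (np.linalg.norm(g) * g0_norm + 1e-12)
        scores.append(max(cos, 0.0))                 # ReLU trust score
        scaled.append(g * (g0_norm / (np.linalg.norm(g) + 1e-12)))
    total = sum(scores) + 1e-12
    return sum(s * g for s, g in zip(scores, scaled)) / total
```

An update pointing opposite to the trusted direction (e.g. from a poisoning client) receives a trust score of zero and is excluded from the aggregate.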