
Journal articles on the topic 'High-dimensional sparse graph'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic 'High-dimensional sparse graph.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of an academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Zou, Yuanhang, Zhihao Ding, Jieming Shi, Shuting Guo, Chunchen Su, and Yafei Zhang. "EmbedX: A Versatile, Efficient and Scalable Platform to Embed Both Graphs and High-Dimensional Sparse Data." Proceedings of the VLDB Endowment 16, no. 12 (2023): 3543–56. http://dx.doi.org/10.14778/3611540.3611546.

Full text
Abstract:
In modern online services, it is of growing importance to process web-scale graph data and high-dimensional sparse data together into embeddings for downstream tasks, such as recommendation, advertisement, prediction, and classification. There exist learning methods and systems for either high-dimensional sparse data or graphs, but not both. There is an urgent need in industry to have a system to efficiently process both types of data for higher business value, which, however, is challenging. The data in Tencent contains billions of samples with sparse features in very high dimensions, and graphs also have billions of nodes and edges. Moreover, learning models often perform expensive operations with high computational costs. It is difficult to store, manage, and retrieve massive sparse data and graph data together, since they exhibit different characteristics. We present EmbedX, an industrial distributed learning framework from Tencent, which is versatile and efficient to support embedding on both graphs and high-dimensional sparse data. EmbedX consists of distributed server layers for graph and sparse data management, and optimized parameter and graph operators, to efficiently support 4 categories of methods, including deep learning models on high-dimensional sparse data, network embedding methods, graph neural networks, and in-house developed joint learning models on both types of data. Extensive experiments on massive Tencent data and public data demonstrate the superiority of EmbedX. For instance, on a Tencent dataset with 1.3 billion nodes, 35 billion edges, and 2.8 billion samples with sparse features in 1.6 billion dimensions, EmbedX performs an order of magnitude faster for training and our joint models achieve superior effectiveness. EmbedX is deployed in Tencent. A/B tests on real use cases further validate the power of EmbedX. EmbedX is implemented in C++ and open-sourced at https://github.com/Tencent/embedx.
APA, Harvard, Vancouver, ISO, and other styles
2

Xie, Anze, Anders Carlsson, Jason Mohoney, et al. "Demo of marius." Proceedings of the VLDB Endowment 14, no. 12 (2021): 2759–62. http://dx.doi.org/10.14778/3476311.3476338.

Full text
Abstract:
Graph embeddings have emerged as the de facto representation for modern machine learning over graph data structures. The goal of graph embedding models is to convert high-dimensional sparse graphs into low-dimensional, dense and continuous vector spaces that preserve the graph structure properties. However, learning a graph embedding model is a resource intensive process, and existing solutions rely on expensive distributed computation to scale training to instances that do not fit in GPU memory. This demonstration showcases Marius: a new open-source engine for learning graph embedding models over billion-edge graphs on a single machine. Marius is built around a recently-introduced architecture for machine learning over graphs that utilizes pipelining and a novel data replacement policy to maximize GPU utilization and exploit the entire memory hierarchy (including disk, CPU, and GPU memory) to scale to large instances. The audience will experience how to develop, train, and deploy graph embedding models using Marius' configuration-driven programming model. Moreover, the audience will have the opportunity to explore Marius' deployments on applications including link-prediction on WikiKG90M and reasoning queries on a paleobiology knowledge graph. Marius is available as open source software at https://marius-project.org.
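As a rough illustration of the embedding objective described above, the following sketch trains low-dimensional node vectors by link prediction with negative sampling. It is a toy example in plain NumPy, not the Marius engine; the graph, dimensions, and learning rate are all invented for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    num_nodes, dim, lr = 100, 16, 0.05
    edges = [(int(rng.integers(num_nodes)), int(rng.integers(num_nodes))) for _ in range(500)]
    emb = 0.1 * rng.standard_normal((num_nodes, dim))    # dense, low-dimensional node vectors

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    for epoch in range(5):
        for u, v in edges:
            neg = int(rng.integers(num_nodes))            # one negative sample per observed edge
            for w, label in ((v, 1.0), (neg, 0.0)):
                grad = sigmoid(emb[u] @ emb[w]) - label   # gradient of the logistic loss
                gu, gw = grad * emb[w], grad * emb[u]
                emb[u] -= lr * gu                         # pull linked nodes together,
                emb[w] -= lr * gw                         # push sampled non-links apart
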
APA, Harvard, Vancouver, ISO, and other styles
3

Liu, Jianyu, Guan Yu, and Yufeng Liu. "Graph-based sparse linear discriminant analysis for high-dimensional classification." Journal of Multivariate Analysis 171 (May 2019): 250–69. http://dx.doi.org/10.1016/j.jmva.2018.12.007.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Wang, Li-e., and Xianxian Li. "A Clustering-Based Bipartite Graph Privacy-Preserving Approach for Sharing High-Dimensional Data." International Journal of Software Engineering and Knowledge Engineering 24, no. 07 (2014): 1091–111. http://dx.doi.org/10.1142/s0218194014500363.

Full text
Abstract:
Driven by mutual benefits, there is a demand for transactional data sharing among organizations or parties for research or business analysis purposes. It becomes an essential concern to provide privacy-preserving data sharing while maintaining data utility, due to the fact that transactional data may contain sensitive personal information. Existing privacy-preserving methods, such as k-anonymity and l-diversity, cannot handle high-dimensional sparse data well, since they would bring about much data distortion in the anonymization process. In this paper, we use bipartite graphs with node attributes to model high-dimensional sparse data, and then propose a privacy-preserving approach for sharing transactional data in a new vision, in which the bipartite graph is anonymized into a weighted bipartite graph by clustering node attributes. Our approach can maintain privacy of the associations between entities and resist certain attackers with knowledge of partial items. Experiments have been performed on real-life data sets to measure the information loss and the accuracy of answering aggregate queries. Experimental results show that the approach improves the balance of performance between privacy protection and data utility.
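To make the modelling step concrete, here is a minimal sketch of representing sparse transactions as a bipartite graph between record nodes and item nodes. It is a toy networkx example with invented records and items; the paper's clustering-based anonymization of attribute nodes is not shown.

    import networkx as nx

    # tiny stand-in for high-dimensional sparse transactional data
    transactions = {
        "r1": {"milk", "bread", "aspirin"},
        "r2": {"bread", "bandage"},
        "r3": {"milk", "aspirin"},
    }

    B = nx.Graph()
    B.add_nodes_from(transactions, bipartite=0)                  # record side
    items = set().union(*transactions.values())
    B.add_nodes_from(items, bipartite=1)                         # item/attribute side
    B.add_edges_from((r, it) for r, its in transactions.items() for it in its)

    # each record stays sparse: only its purchased items appear as edges
    print(B.degree("r1"))   # 3
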
APA, Harvard, Vancouver, ISO, and other styles
5

Ni, Li, Peng Manman, and Wu Qiang. "A Spectral Clustering Algorithm for Non-Linear Graph Embedding in Information Networks." Applied Sciences 14, no. 11 (2024): 4946. http://dx.doi.org/10.3390/app14114946.

Full text
Abstract:
With the development of network technology, information networks have become one of the most important means for people to understand society. As the scale of information networks expands, the construction of network graphs and high-dimensional feature representation will become major factors affecting the performance of spectral clustering algorithms. To address this issue, in this paper, we propose a spectral clustering algorithm based on similarity graphs and non-linear deep embedding, named SEG_SC. This algorithm introduces a new spectral clustering model that explores the underlying structure of graphs through sparse similarity graphs and deep graph representation learning, thereby enhancing graph clustering performance. Experimental analysis with multiple types of real datasets shows that this model surpasses several advanced benchmark algorithms and performs well in clustering on medium- to large-scale information networks.
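The baseline pipeline that SEG_SC builds on, spectral clustering over a sparse similarity graph, can be sketched in a few lines. This toy example (random data, arbitrary neighbour and cluster counts) omits the paper's deep, non-linear graph embedding step.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph
    from sklearn.cluster import KMeans

    X = np.random.default_rng(1).standard_normal((300, 20))       # toy high-dimensional data
    A = kneighbors_graph(X, n_neighbors=10, mode="connectivity")   # sparse k-NN similarity graph
    A = 0.5 * (A + A.T)                                            # symmetrize
    deg = np.asarray(A.sum(axis=1)).ravel()
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(A.shape[0]) - (A.toarray() * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)                                 # normalized Laplacian spectrum
    U = vecs[:, :4]                                                # smallest eigenvectors as embedding
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(U)
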
APA, Harvard, Vancouver, ISO, and other styles
6

Saul, Lawrence K. "A tractable latent variable model for nonlinear dimensionality reduction." Proceedings of the National Academy of Sciences 117, no. 27 (2020): 15403–8. http://dx.doi.org/10.1073/pnas.1916012117.

Full text
Abstract:
We propose a latent variable model to discover faithful low-dimensional representations of high-dimensional data. The model computes a low-dimensional embedding that aims to preserve neighborhood relationships encoded by a sparse graph. The model both leverages and extends current leading approaches to this problem. Like t-distributed Stochastic Neighborhood Embedding, the model can produce two- and three-dimensional embeddings for visualization, but it can also learn higher-dimensional embeddings for other uses. Like LargeVis and Uniform Manifold Approximation and Projection, the model produces embeddings by balancing two goals—pulling nearby examples closer together and pushing distant examples further apart. Unlike these approaches, however, the latent variables in our model provide additional structure that can be exploited for learning. We derive an Expectation–Maximization procedure with closed-form updates that monotonically improve the model’s likelihood: In this procedure, embeddings are iteratively adapted by solving sparse, diagonally dominant systems of linear equations that arise from a discrete graph Laplacian. For large problems, we also develop an approximate coarse-graining procedure that avoids the need for negative sampling of nonadjacent nodes in the graph. We demonstrate the model’s effectiveness on datasets of images and text.
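The kind of linear-algebra step the abstract mentions, solving a sparse, diagonally dominant system built from a discrete graph Laplacian, can be sketched as follows. The neighbour graph, shift, and right-hand side are invented placeholders; in the paper they come from the model's E-step rather than from random data.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import cg

    rng = np.random.default_rng(0)
    n = 200
    rows = rng.integers(0, n, 800)
    cols = rng.integers(0, n, 800)
    W = sp.coo_matrix((np.ones(800), (rows, cols)), shape=(n, n))
    W = ((W + W.T) > 0).astype(float).tocsr()                # symmetric sparse neighbour graph
    L = sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W      # discrete graph Laplacian
    A = L + 0.1 * sp.eye(n)                                  # shift keeps the system diagonally dominant
    b = rng.standard_normal(n)                               # one coordinate of the target embedding
    y, info = cg(A, b, atol=1e-8)                            # sparse solve with conjugate gradient
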
APA, Harvard, Vancouver, ISO, and other styles
7

Li, Xinyu, Xiaoguang Gao, and Chenfeng Wang. "A Novel BN Learning Algorithm Based on Block Learning Strategy." Sensors 20, no. 21 (2020): 6357. http://dx.doi.org/10.3390/s20216357.

Full text
Abstract:
Learning accurate Bayesian Network (BN) structures of high-dimensional and sparse data is difficult because of high computation complexity. To learn the accurate structure for high-dimensional and sparse data faster, this paper adopts a divide and conquer strategy and proposes a block learning algorithm with a mutual information based K-means algorithm (BLMKM algorithm). This method utilizes an improved K-means algorithm to block the nodes in BN and a maximum minimum parents and children (MMPC) algorithm to obtain the whole skeleton of BN and find possible graph structures based on separated blocks. Then, a pruned dynamic programming algorithm is performed sequentially for all possible graph structures to get possible BNs and find the best BN by scoring function. Experiments show that for high-dimensional and sparse data, the BLMKM algorithm can achieve the same accuracy in a reasonable time compared with non-blocking classical learning algorithms. Compared to the existing block learning algorithms, the BLMKM algorithm has a time advantage on the basis of ensuring accuracy. The analysis of the real radar effect mechanism dataset proves that BLMKM algorithm can quickly establish a global and accurate causality model to find the cause of interference, predict the detecting result, and guide the parameters optimization. BLMKM algorithm is efficient for BN learning and has practical application value.
APA, Harvard, Vancouver, ISO, and other styles
8

Dobson, Andrew, and Kostas Bekris. "Improved Heuristic Search for Sparse Motion Planning Data Structures." Proceedings of the International Symposium on Combinatorial Search 5, no. 1 (2021): 196–97. http://dx.doi.org/10.1609/socs.v5i1.18334.

Full text
Abstract:
Sampling-based methods provide efficient, flexible solutions for motion planning, even for complex, high-dimensional systems. Asymptotically optimal planners ensure convergence to the optimal solution, but produce dense structures. This work shows how to extend sparse methods achieving asymptotic near-optimality using multiple-goal heuristic search during graph construction. The resulting method produces identical output to the existing Incremental Roadmap Spanner approach but in an order of magnitude less time.
APA, Harvard, Vancouver, ISO, and other styles
9

Li, Peng, Mosharaf Md Parvej, Chenghao Zhang, Shufang Guo, and Jing Zhang. "Advances in the Development of Representation Learning and Its Innovations against COVID-19." COVID 3, no. 9 (2023): 1389–415. http://dx.doi.org/10.3390/covid3090096.

Full text
Abstract:
In bioinformatics research, traditional machine-learning methods have demonstrated efficacy in addressing Euclidean data. However, real-world data often encompass non-Euclidean forms, such as graph data, which contain intricate structural patterns or high-order relationships that elude conventional machine-learning approaches. Representation learning seeks to derive valuable data representations for enhancing predictive or analytic tasks, capturing vital patterns and structures. This method has proven particularly beneficial in bioinformatics and biomedicine, as it effectively handles high-dimensional and sparse data, detects complex biological patterns, and optimizes predictive performance. In recent years, graph representation learning has become a popular research topic. It involves the embedding of graphs into a low-dimensional space while preserving the structural and attribute information of the graph, enabling better feature extraction for downstream tasks. This study extensively reviews representation learning advancements, particularly in the research of representation methods since the emergence of COVID-19. We begin with an analysis and classification of neural-network-based language model representation learning techniques as well as graph representation learning methods. Subsequently, we explore their methodological innovations in the context of COVID-19, with a focus on the domains of drugs, public health, and healthcare. Furthermore, we discuss the challenges and opportunities associated with graph representation learning. This comprehensive review presents invaluable insights for researchers as it documents the development of COVID-19 and offers experiential lessons to preempt future infectious diseases. Moreover, this study provides guidance regarding future bioinformatics and biomedicine research methodologies.
APA, Harvard, Vancouver, ISO, and other styles
10

Li, Ying, Xiaojun Xu, and Jianbo Li. "High-Dimensional Sparse Graph Estimation by Integrating DTW-D Into Bayesian Gaussian Graphical Models." IEEE Access 6 (2018): 34279–87. http://dx.doi.org/10.1109/access.2018.2849213.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Merkel, Nikolai, Pierre Toussing, Ruben Mayer, and Hans-Arno Jacobsen. "Can Graph Reordering Speed Up Graph Neural Network Training? An Experimental Study." Proceedings of the VLDB Endowment 18, no. 2 (2024): 293–307. https://doi.org/10.14778/3705829.3705846.

Full text
Abstract:
Graph neural networks (GNNs) are a type of neural network capable of learning on graph-structured data. However, training GNNs on large-scale graphs is challenging due to iterative aggregations of high-dimensional features from neighboring vertices within sparse graph structures combined with neural network operations. The sparsity of graphs frequently results in suboptimal memory access patterns and longer training time. Graph reordering is an optimization strategy aiming to improve the graph data layout. It has been shown to be effective in speeding up graph analytics workloads, but its effect on the performance of GNN training has not been investigated yet. The generalization of reordering to GNN performance is nontrivial, as multiple aspects must be considered: GNN hyper-parameters such as the number of layers, the number of hidden dimensions, and the feature size used in the GNN model, neural network operations, large intermediate vertex states, and GPU acceleration. In our work, we close this gap by performing an empirical evaluation of 12 reordering strategies in two state-of-the-art GNN systems, PyTorch Geometric and Deep Graph Library. Our results show that graph reordering is effective in reducing training time for both CPU- and GPU-based training. Further, we find that GNN hyper-parameters influence the effectiveness of reordering, that reordering metrics play an important role in selecting a reordering strategy, that lightweight reordering performs better for GPU-based than for CPU-based training, and that invested reordering time can in many cases be amortized.
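A lightweight, degree-based reordering of the kind evaluated in such studies can be sketched as follows. The sizes are toy values; a real system would relabel edges, features, and labels consistently and then hand the reordered graph to the GNN framework.

    import numpy as np

    rng = np.random.default_rng(0)
    num_nodes = 1000
    edges = rng.integers(0, num_nodes, size=(5000, 2))      # COO edge list
    feats = rng.standard_normal((num_nodes, 64))             # node feature matrix

    deg = np.bincount(edges.ravel(), minlength=num_nodes)
    order = np.argsort(-deg)                                  # high-degree nodes first
    new_id = np.empty(num_nodes, dtype=np.int64)
    new_id[order] = np.arange(num_nodes)                      # old id -> new id

    edges_reordered = new_id[edges]                           # relabel both endpoints
    feats_reordered = feats[order]                            # keep features aligned with new ids
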
APA, Harvard, Vancouver, ISO, and other styles
12

Kefato, Zekarias, and Sarunas Girdzijauskas. "Gossip and Attend: Context-Sensitive Graph Representation Learning." Proceedings of the International AAAI Conference on Web and Social Media 14 (May 26, 2020): 351–59. http://dx.doi.org/10.1609/icwsm.v14i1.7305.

Full text
Abstract:
Graph representation learning (GRL) is a powerful technique for learning low-dimensional vector representations of high-dimensional and often sparse graphs. Most studies explore the structure and metadata associated with the graph using random walks and employ unsupervised or semi-supervised learning schemes. Learning in these methods is context-free, resulting in only a single representation per node. Recently, studies have questioned the adequacy of a single representation and proposed context-sensitive approaches, which are capable of extracting multiple node representations for different contexts. This proved to be highly effective in applications such as link prediction and ranking. However, most of these methods rely on additional textual features that require complex and expensive RNNs or CNNs to capture high-level features or rely on a community detection algorithm to identify multiple contexts of a node. In this study, we show that in order to extract high-quality context-sensitive node representations it is not necessary to rely on supplementary node features, nor to employ computationally heavy and complex models. We propose Goat, a context-sensitive algorithm inspired by gossip communication and a mutual attention mechanism simply over the structure of the graph. We show the efficacy of Goat using 6 real-world datasets on link prediction and node clustering tasks and compare it against 12 popular and state-of-the-art (SOTA) baselines. Goat consistently outperforms them and achieves up to 12% and 19% gain over the best performing methods on link prediction and clustering tasks, respectively.
APA, Harvard, Vancouver, ISO, and other styles
13

Li, Pei Heng, Taeho Lee, and Hee Yong Youn. "Dimensionality Reduction with Sparse Locality for Principal Component Analysis." Mathematical Problems in Engineering 2020 (May 20, 2020): 1–12. http://dx.doi.org/10.1155/2020/9723279.

Full text
Abstract:
Various dimensionality reduction (DR) schemes have been developed for projecting high-dimensional data into low-dimensional representation. The existing schemes usually preserve either only the global structure or local structure of the original data, but not both. To resolve this issue, a scheme called sparse locality for principal component analysis (SLPCA) is proposed. In order to effectively consider the trade-off between the complexity and efficiency, a robust L2,p-norm-based principal component analysis (R2P-PCA) is introduced for global DR, while sparse representation-based locality preserving projection (SR-LPP) is used for local DR. Sparse representation is also employed to construct the weighted matrix of the samples. Being parameter-free, this allows the construction of an intrinsic graph more robust against the noise. In addition, simultaneous learning of projection matrix and sparse similarity matrix is possible. Experimental results demonstrate that the proposed scheme consistently outperforms the existing schemes in terms of clustering accuracy and data reconstruction error.
APA, Harvard, Vancouver, ISO, and other styles
14

Zhang, Tianjiao, Jixiang Ren, Liangyu Li, et al. "scZAG: Integrating ZINB-Based Autoencoder with Adaptive Data Augmentation Graph Contrastive Learning for scRNA-seq Clustering." International Journal of Molecular Sciences 25, no. 11 (2024): 5976. http://dx.doi.org/10.3390/ijms25115976.

Full text
Abstract:
Single-cell RNA sequencing (scRNA-seq) is widely used to interpret cellular states, detect cell subpopulations, and study disease mechanisms. In scRNA-seq data analysis, cell clustering is a key step that can identify cell types. However, scRNA-seq data are characterized by high dimensionality and significant sparsity, presenting considerable challenges for clustering. In the high-dimensional gene expression space, cells may form complex topological structures. Many conventional scRNA-seq data analysis methods focus on identifying cell subgroups rather than exploring these potential high-dimensional structures in detail. Although some methods have begun to consider the topological structures within the data, many still overlook the continuity and complex topology present in single-cell data. We propose a deep learning framework that begins by employing a zero-inflated negative binomial (ZINB) model to denoise the highly sparse and over-dispersed scRNA-seq data. Next, scZAG uses an adaptive graph contrastive representation learning approach that combines approximate personalized propagation of neural predictions graph convolution (APPNPGCN) with graph contrastive learning methods. By using APPNPGCN as the encoder for graph contrastive learning, we ensure that each cell’s representation reflects not only its own features but also its position in the graph and its relationships with other cells. Graph contrastive learning exploits the relationships between nodes to capture the similarity among cells, better representing the data’s underlying continuity and complex topology. Finally, the learned low-dimensional latent representations are clustered using Kullback–Leibler divergence. We validated the superior clustering performance of scZAG on 10 common scRNA-seq datasets in comparison to existing state-of-the-art clustering methods.
APA, Harvard, Vancouver, ISO, and other styles
15

Chen, Dongming, Mingshuo Nie, Hupo Zhang, Zhen Wang, and Dongqi Wang. "Network Embedding Algorithm Taking in Variational Graph AutoEncoder." Mathematics 10, no. 3 (2022): 485. http://dx.doi.org/10.3390/math10030485.

Full text
Abstract:
Complex networks with node attribute information are employed to represent complex relationships between objects. Research of attributed network embedding fuses the topology and the node attribute information of the attributed network in the common latent representation space, to encode the high-dimensional sparse network information to the low-dimensional dense vector representation, effectively improving the performance of the network analysis tasks. The current research on attributed network embedding is presently facing problems of high-dimensional sparsity of the attribute eigenmatrix and underutilization of attribute information. In this paper, we propose a network embedding algorithm taking in a variational graph autoencoder (NEAT-VGA). This algorithm first pre-processes the attribute features, i.e., the attribute feature learning of the network nodes. Then, the feature learning matrix and the adjacency matrix of the network are fed into the variational graph autoencoder algorithm to obtain the Gaussian distribution of the potential vectors, which more easily generates high-quality node embedding representation vectors. Then, the embedding of the nodes obtained by sampling this Gaussian distribution is reconstructed with structural and attribute losses. The loss function is minimized by iterative training until the low-dimensional vector representation, containing network structure information and attribute information of nodes, can be obtained, and the performance of the algorithm is evaluated by link prediction experimental results.
APA, Harvard, Vancouver, ISO, and other styles
16

Chen, Pingfei, Xuyang Li, Yong Peng, Xiangsuo Fan, and Qi Li. "WSSGCN: Hyperspectral Forest Image Classification via Watershed Superpixel Segmentation and Sparse Graph Convolutional Networks." Forests 16, no. 5 (2025): 827. https://doi.org/10.3390/f16050827.

Full text
Abstract:
Hyperspectral image classification is crucial in remote sensing but faces challenges in forest ecosystem studies due to high-dimensional data, spectral variability, and spatial heterogeneity. Watershed Superpixel Segmentation and Sparse Graph Convolutional Networks (WSSGCN), a novel framework designed for efficient forest image classification, is introduced in this paper. Watershed superpixel segmentation is first used by the method to divide hyperspectral images into semantically consistent regions, reducing computational complexity while preserving terrain boundary information. On this basis, a dual-branch model is designed: a local branch with multi-scale convolutional neural networks (CNN) extracts spatial–spectral features, while a global branch constructs superpixel graphs and uses GCNs to model the global context. To enhance efficiency, a sparse tensor-based storage method is proposed for the adjacency matrix, reducing complexity from quadratic to linear. Additionally, an attention-based adaptive fusion strategy dynamically balances local and global features. Experiments on multiple datasets show that WSSGCN outperforms mainstream methods in overall accuracy (OA), average accuracy (AA), and Kappa coefficient. Notably, it achieves a 3.5% OA improvement and a 0.04 Kappa coefficient increase compared to SPEFORMER on the WHU-Hi-HongHu dataset. Practicality in resource-limited scenarios is ensured by sparse graph modeling. This work offers an efficient solution for forest monitoring, supporting applications like biodiversity assessment and deforestation tracking, and advances remote sensing-based forest ecosystem analysis. The proposed approach shows strong potential for real-world ecological conservation and forest management.
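The sparse-adjacency idea mentioned in the abstract, storing only the edges of the superpixel graph and propagating features through a normalized sparse matrix, can be sketched as follows. The random region-adjacency graph, feature sizes, and single ReLU propagation step are illustrative only and are not the WSSGCN architecture.

    import numpy as np
    import scipy.sparse as sp

    rng = np.random.default_rng(0)
    n, d = 500, 32                                   # number of superpixels and feature width
    src = rng.integers(0, n, 2000)
    dst = rng.integers(0, n, 2000)
    A = sp.coo_matrix((np.ones(2000), (src, dst)), shape=(n, n)).tocsr()
    A = ((A + A.T) > 0).astype(float)                # undirected region-adjacency graph, only edges stored
    A = A + sp.eye(n)                                # self-loops, as in a GCN layer
    d_inv_sqrt = 1.0 / np.sqrt(np.asarray(A.sum(axis=1)).ravel())
    A_hat = sp.diags(d_inv_sqrt) @ A @ sp.diags(d_inv_sqrt)   # symmetric normalization, still sparse
    X = rng.standard_normal((n, d))                  # superpixel features
    W = 0.1 * rng.standard_normal((d, 16))
    H = np.maximum(A_hat @ X @ W, 0.0)               # one propagation step: ReLU(A_hat X W)
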
APA, Harvard, Vancouver, ISO, and other styles
17

Cai, Jun, Xin Xu, Hongpeng Zhu, and Jian Cheng. "An Efficient Compressive Sensing Event-Detection Scheme for Internet of Things System Based on Sparse-Graph Codes." Sensors 23, no. 10 (2023): 4620. http://dx.doi.org/10.3390/s23104620.

Full text
Abstract:
This work studied the event-detection problem in an Internet of Things (IoT) system, where a group of sensor nodes are placed in the region of interest to capture sparse active event sources. Using compressive sensing (CS), the event-detection problem is modeled as recovering the high-dimensional integer-valued sparse signal from incomplete linear measurements. We show that the sensing process in an IoT system produces an equivalent integer CS using sparse graph codes at the sink node, for which one can devise a simple deterministic construction of a sparse measurement matrix and an efficient integer-valued signal recovery algorithm. We validated the determined measurement matrix, uniquely determined the signal coefficients, and performed an asymptotic analysis to examine the performance of the proposed approach, namely event detection with integer sum peeling (ISP), with the density evolution method. Simulation results show that the proposed ISP approach achieves a significantly higher performance compared to the existing literature in various simulation scenarios and matches the theoretical results.
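The peeling idea behind sparse-graph-code decoding can be conveyed with a highly simplified sketch: a sparse binary matrix measures an integer-valued sparse signal, and any measurement that touches exactly one unresolved non-zero is "peeled" to reveal it. The sketch below cheats by using the true support as an oracle purely to stay short; the actual ISP decoder detects singletons from the measurements themselves and uses a deterministically constructed matrix, so recovery here is only partial and probabilistic.

    import numpy as np
    import scipy.sparse as sp

    rng = np.random.default_rng(0)
    n, m, k = 1000, 120, 10                           # signal length, measurements, active events
    x = np.zeros(n, dtype=int)
    x[rng.choice(n, k, replace=False)] = rng.integers(1, 5, k)   # integer-valued sparse signal

    A = sp.random(m, n, density=0.01, format="csr", random_state=0)
    A.data[:] = 1.0                                   # sparse binary measurement matrix
    y = A @ x                                         # incomplete linear measurements

    x_hat = np.zeros(n, dtype=int)
    residual = np.asarray(y, dtype=float).copy()
    unresolved = set(int(j) for j in np.flatnonzero(x))   # oracle support, only to shorten the sketch
    changed = True
    while changed and unresolved:
        changed = False
        for i in range(m):
            cols = A.indices[A.indptr[i]:A.indptr[i + 1]]
            touched = [int(j) for j in cols if int(j) in unresolved]
            if len(touched) == 1:                     # a singleton measurement: peel it
                j = touched[0]
                x_hat[j] = int(round(residual[i]))
                residual -= x_hat[j] * A[:, j].toarray().ravel()
                unresolved.discard(j)
                changed = True
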
APA, Harvard, Vancouver, ISO, and other styles
18

Ye, Nanjun. "A Penetrative Multidimensional Data Analytics Model for Complex Relationship Mining over Knowledge Graphs." Journal of Computing and Electronic Information Management 17, no. 2 (2025): 34–41. https://doi.org/10.54097/87rgwp44.

Full text
Abstract:
This study proposes a deep multidimensional data analytics framework for extracting intricate relationships from knowledge graphs, which tackles the challenge of discovering hidden connections in heterogeneous and high-dimensional datasets. The proposed method unifies three principal elements: Dynamic Meta-Path Penetration, Nested Subgraph Extraction, and Tensor-Graph Fusion, which together permit a structured investigation of hidden connections. Dynamic Meta-Path Penetration applies reinforcement learning to traverse the graph, directed by a reward system prioritizing informative routes. Nested Subgraph Extraction hierarchically aggregates multi-hop dependencies by employing Graph Neural Networks, which identifies structural patterns within localized subgraphs. Tensor-Graph Fusion performs joint factorization on the knowledge graph adjacency tensor and multidimensional data tensors, thereby merging structural and attribute-based information within a common latent space. The PPA-GNN layer coordinates these elements by traversing the graph, eliminating unnecessary connections, and merging cross-modal attributes, thus producing embeddings that capture intricate relationships. Additionally, the penetration depth is established as a metric to measure the minimal distance needed to uncover hidden relationships. Experiments on benchmark datasets show our model achieves better performance than state-of-the-art methods in relationship mining tasks, especially in cases with sparse or noisy data. The framework’s ability to integrate heterogeneous data sources and dynamically adapt to graph structures makes it suitable for applications in recommendation systems, biomedical discovery, and social network analysis. This study propels the discipline forward by introducing a cohesive framework for penetrative analytics, which connects graph-based and tensor-based approaches.
APA, Harvard, Vancouver, ISO, and other styles
19

Yang, Qian, Jiaming Zhang, Junjie Zhang, et al. "Graph Transformer Network Incorporating Sparse Representation for Multivariate Time Series Anomaly Detection." Electronics 13, no. 11 (2024): 2032. http://dx.doi.org/10.3390/electronics13112032.

Full text
Abstract:
Cyber–physical systems (CPSs) serve as the pivotal core of Internet of Things (IoT) infrastructures, such as smart grids and intelligent transportation, deploying interconnected sensing devices to monitor operating status. With increasing decentralization, the surge in sensor devices expands the potential vulnerability to cyber attacks. It is imperative to conduct anomaly detection research on the multivariate time series data that these sensors produce to bolster the security of distributed CPSs. However, the high dimensionality, absence of anomaly labels in real-world datasets, and intricate non-linear relationships among sensors present considerable challenges in formulating effective anomaly detection algorithms. Recent deep-learning methods have achieved progress in the field of anomaly detection. Yet, many methods either rely on statistical models that struggle to capture non-linear relationships or use conventional deep learning models like CNN and LSTM, which do not explicitly learn inter-variable correlations. In this study, we propose a novel unsupervised anomaly detection method that integrates Sparse Autoencoder with Graph Transformer network (SGTrans). SGTrans leverages Sparse Autoencoder for the dimensionality reduction and reconstruction of high-dimensional time series, thus extracting meaningful hidden representations. Then, the multivariate time series are mapped into a graph structure. We introduce a multi-head attention mechanism from Transformer into graph structure learning, constructing a Graph Transformer network forecasting module. This module performs attentive information propagation between long-distance sensor nodes and explicitly models the complex temporal dependencies among them to enhance the prediction of future behaviors. Extensive experiments and evaluations on three publicly available real-world datasets demonstrate the effectiveness of our approach.
APA, Harvard, Vancouver, ISO, and other styles
20

Li, Shuang, Bing Liu, and Chen Zhang. "Regularized Embedded Multiple Kernel Dimensionality Reduction for Mine Signal Processing." Computational Intelligence and Neuroscience 2016 (2016): 1–12. http://dx.doi.org/10.1155/2016/4920670.

Full text
Abstract:
Traditional multiple kernel dimensionality reduction models are generally based on graph embedding and the manifold assumption. But such an assumption might be invalid for some high-dimensional or sparse data due to the curse of dimensionality, which has a negative influence on the performance of multiple kernel learning. In addition, some models might be ill-posed if the rank of matrices in their objective functions was not high enough. To address these issues, we extend the traditional graph embedding framework and propose a novel regularized embedded multiple kernel dimensionality reduction method. Different from the conventional convex relaxation technique, the proposed algorithm directly takes advantage of a binary search and an alternative optimization scheme to obtain optimal solutions efficiently. The experimental results demonstrate the effectiveness of the proposed method for supervised, unsupervised, and semisupervised scenarios.
APA, Harvard, Vancouver, ISO, and other styles
21

Wang, Zhan, Qiuqi Ruan, and Gaoyun An. "Face Recognition Using Double Sparse Local Fisher Discriminant Analysis." Mathematical Problems in Engineering 2015 (2015): 1–9. http://dx.doi.org/10.1155/2015/636928.

Full text
Abstract:
Local Fisher discriminant analysis (LFDA) was proposed for dealing with the multimodal problem. It not only combines the idea of locality preserving projections (LPP) for preserving the local structure of the high-dimensional data but also combines the idea of Fisher discriminant analysis (FDA) for obtaining the discriminant power. However, LFDA also suffers from the undersampled problem, as do many dimensionality reduction methods. Meanwhile, the projection matrix is not sparse. In this paper, we propose double sparse local Fisher discriminant analysis (DSLFDA) for face recognition. The proposed method firstly constructs a sparse and data-adaptive graph with a nonnegative constraint. Then, DSLFDA reformulates the objective function as a regression-type optimization problem. The undersampled problem is avoided naturally and the sparse solution can be obtained by adding an l1 penalty to the regression-type problem. Experiments on Yale, ORL, and CMU PIE face databases are implemented to demonstrate the effectiveness of the proposed method.
APA, Harvard, Vancouver, ISO, and other styles
22

Sturtevant, Nathan. "A Sparse Grid Representation for Dynamic Three-Dimensional Worlds." Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 7, no. 1 (2011): 73–78. http://dx.doi.org/10.1609/aiide.v7i1.12438.

Full text
Abstract:
Grid representations offer many advantages for path planning. Lookups in grids are fast, due to the uniform memory layout, and it is easy to modify grids. But, grids often have significant memory requirements, they cannot directly represent more complex surfaces, and path planning is slower due to their high granularity representation of the world. The speed of path planning on grids has been addressed using abstract representations, such as has been documented in work on Dragon Age: Origins. The abstract representation used in this game was compact, preventing permanent changes to the grid. In this paper we introduce a sparse grid representation, where grid cells are only stored where necessary. From this sparse representation we incrementally build an abstract graph which represents possible movement in the world at a high-level of granularity. This sparse representation also allows the representation of three-dimensional worlds. This representation allows the world to be incrementally changed in under a millisecond, reducing the maximum memory required to store a map and abstraction from Dragon Age: Origins by nearly one megabyte. Fundamentally, the representation allows previously allocated but unused memory to be used in ways that result in higher-quality planning and more intelligent agents.
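The core data structure described here, a grid that allocates cells only where they are needed, can be sketched with a hash map keyed by integer cell coordinates. The class and method names below are invented for illustration and are not the paper's implementation.

    from typing import Dict, Tuple

    Cell = Tuple[int, int, int]

    class SparseGrid:
        def __init__(self, cell_size: float = 1.0):
            self.cell_size = cell_size
            self.cells: Dict[Cell, bool] = {}            # storage only where a cell was ever set

        def key(self, x: float, y: float, z: float) -> Cell:
            s = self.cell_size
            return (int(x // s), int(y // s), int(z // s))

        def set_blocked(self, x, y, z, blocked=True):
            self.cells[self.key(x, y, z)] = blocked      # incremental world edits are O(1)

        def is_blocked(self, x, y, z) -> bool:
            return self.cells.get(self.key(x, y, z), False)   # unseen cells default to free space

    grid = SparseGrid(0.5)
    grid.set_blocked(1.2, 3.4, 0.0)
    print(grid.is_blocked(1.3, 3.4, 0.1))   # True: both points fall in the same 0.5-unit cell
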
APA, Harvard, Vancouver, ISO, and other styles
23

Majeed, Abdul, and Sungchang Lee. "A Fast Global Flight Path Planning Algorithm Based on Space Circumscription and Sparse Visibility Graph for Unmanned Aerial Vehicle." Electronics 7, no. 12 (2018): 375. http://dx.doi.org/10.3390/electronics7120375.

Full text
Abstract:
This paper proposes a new flight path planning algorithm that finds collision-free, optimal/near-optimal and flyable paths for unmanned aerial vehicles (UAVs) in three-dimensional (3D) environments with fixed obstacles. The proposed algorithm significantly reduces pathfinding computing time without significantly degrading path lengths by using space circumscription and a sparse visibility graph in the pathfinding process. We devise a novel method by exploiting the information about obstacle geometry to circumscribe the search space in the form of a half cylinder from which a working path for UAV can be computed without sacrificing the guarantees on near-optimality and speed. Furthermore, we generate a sparse visibility graph from the circumscribed space and find the initial path, which is subsequently optimized. The proposed algorithm effectively resolves the efficiency and optimality trade-off by searching the path only from the high priority circumscribed space of a map. The simulation results obtained from various maps, and comparison with the existing methods show the effectiveness of the proposed algorithm and verify the aforementioned claims.
APA, Harvard, Vancouver, ISO, and other styles
24

Li, Zixuan, Hao Li, Kenli Li, Fan Wu, Lydia Chen, and Keqin Li. "Locality Sensitive Hash Aggregated Nonlinear Neighborhood Matrix Factorization for Online Sparse Big Data Analysis." ACM/IMS Transactions on Data Science 2, no. 4 (2021): 1–27. http://dx.doi.org/10.1145/3497749.

Full text
Abstract:
Matrix factorization (MF) can extract the low-rank features and integrate the information of the data manifold distribution from high-dimensional data, which can consider the nonlinear neighborhood information. Thus, MF has drawn wide attention for low-rank analysis of sparse big data, e.g., Collaborative Filtering (CF) Recommender Systems, Social Networks, and Quality of Service. However, the following two problems exist: (1) huge computational overhead for the construction of the Graph Similarity Matrix (GSM) and (2) huge memory overhead for the intermediate GSM. Therefore, GSM-based MF, e.g., kernel MF, graph regularized MF, and so on, cannot be directly applied to the low-rank analysis of sparse big data on cloud and edge platforms. To solve this intractable problem for sparse big data analysis, we propose Locality Sensitive Hashing (LSH) aggregated MF (LSH-MF), which can solve the following problems: (1) The proposed probabilistic projection strategy of LSH-MF can avoid the construction of the GSM. Furthermore, LSH-MF can satisfy the requirement for the accurate projection of sparse big data. (2) To run LSH-MF for fine-grained parallelization and online learning on GPUs, we also propose CULSH-MF, which works on CUDA parallelization. Experimental results show that CULSH-MF can not only reduce the computational time and memory overhead but also obtain higher accuracy. Compared with deep learning models, CULSH-MF can not only save training time but also achieve the same accuracy performance.
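The central trick, using locality-sensitive hashing to gather candidate neighbours without ever materializing the n-by-n graph similarity matrix, can be sketched with random sign projections. The data, number of hyperplanes, and bucketing scheme are illustrative; the CUDA-parallel factorization itself is not shown.

    import numpy as np
    from collections import defaultdict

    rng = np.random.default_rng(0)
    X = rng.standard_normal((2000, 50))             # sparse or implicit data would be handled the same way
    planes = rng.standard_normal((50, 12))          # 12 random hyperplanes -> 12-bit signatures

    signs = (X @ planes) > 0
    codes = np.packbits(signs, axis=1)              # compact per-row hash codes

    buckets = defaultdict(list)
    for i, code in enumerate(map(bytes, codes)):
        buckets[code].append(i)                     # rows sharing a code are candidate neighbours

    # candidate neighbours of row 0 come only from its own bucket, never from a full GSM
    cands = buckets[bytes(codes[0])]
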
APA, Harvard, Vancouver, ISO, and other styles
25

Muralinath, Rashmi N., Vishwambhar Pathak, and Prabhat K. Mahanti. "Metastable Substructure Embedding and Robust Classification of Multichannel EEG Data Using Spectral Graph Kernels." Future Internet 17, no. 3 (2025): 102. https://doi.org/10.3390/fi17030102.

Full text
Abstract:
Classification of neurocognitive states from Electroencephalography (EEG) data is complex due to inherent challenges such as noise, non-stationarity, non-linearity, and the high-dimensional and sparse nature of connectivity patterns. Graph-theoretical approaches provide a powerful framework for analysing the latent state dynamics using connectivity measures across spatio-temporal-spectral dimensions. This study applies the graph Koopman embedding kernels (GKKE) method to extract latent neuro-markers of seizures from epileptiform EEG activity. EEG-derived graphs were constructed using correlation and mean phase locking value (mPLV), with adjacency matrices generated via threshold-binarised connectivity. Graph kernels, including Random Walk, Weisfeiler–Lehman (WL), and spectral-decomposition (SD) kernels, were evaluated for latent space feature extraction by approximating Koopman spectral decomposition. The potential of graph Koopman embeddings in identifying latent metastable connectivity structures has been demonstrated with empirical analyses. The robustness of these features was evaluated using classifiers such as Decision Trees, Support Vector Machine (SVM), and Random Forest, on Epilepsy-EEG from the Children’s Hospital Boston’s (CHB)-MIT dataset and cognitive-load-EEG datasets from online repositories. The classification workflow combining the mPLV connectivity measure, the WL graph Koopman kernel, and a Decision Tree (DT) outperformed the alternative combinations, particularly considering the accuracy (91.7%) and F1-score (88.9%). The comparative investigation presented in the results section shows that employing cost-sensitive learning improved the F1-score for the mPLV-WL-DT workflow to 91%, compared to 88.9% without cost-sensitive learning. This work advances EEG-based neuro-marker estimation, facilitating reliable assistive tools for prognosis and cognitive training protocols.
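As a concrete picture of the graph-construction step, the following sketch builds a threshold-binarized adjacency matrix from multichannel signals using plain correlation. The data and threshold are toy values; the study's mPLV measure and Koopman kernel embedding would replace the simple correlation used here.

    import numpy as np

    rng = np.random.default_rng(0)
    signals = rng.standard_normal((23, 2560))       # 23 channels, 10 s at 256 Hz (toy data)

    C = np.corrcoef(signals)                        # channel-by-channel correlation
    np.fill_diagonal(C, 0.0)                        # ignore self-connections
    A = (np.abs(C) > 0.2).astype(int)               # threshold-binarized adjacency matrix
    degrees = A.sum(axis=1)                         # simple per-electrode graph features
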
APA, Harvard, Vancouver, ISO, and other styles
26

Yang, Ye, Yongli Hu, and Fei Wu. "Sparse and Low-Rank Subspace Data Clustering with Manifold Regularization Learned by Local Linear Embedding." Applied Sciences 8, no. 11 (2018): 2175. http://dx.doi.org/10.3390/app8112175.

Full text
Abstract:
Data clustering is an important research topic in data mining and signal processing communications. Among all the data clustering methods, the subspace spectral clustering methods based on the self-expression model, e.g., the Sparse Subspace Clustering (SSC) and the Low Rank Representation (LRR) methods, have attracted a lot of attention and shown good performance. The key step of SSC and LRR is to construct a proper affinity or similarity matrix of data for spectral clustering. Recently, a Laplacian graph constraint was introduced into the basic SSC and LRR and obtained considerable improvement. However, the current graph construction methods do not fully exploit and reveal the non-linear properties of the clustering data, which is common for high dimensional data. In this paper, we introduce the classic manifold learning method, the Local Linear Embedding (LLE), to learn the non-linear structure underlying the data and use the learned local geometry of the manifold as a regularization for SSC and LRR, which results in the proposed LLE-SSC and LLE-LRR clustering methods. Additionally, to solve the complex optimization problem involved in the proposed models, an efficient algorithm is also proposed. We test the proposed data clustering methods on several types of public databases. The experimental results show that our methods outperform typical subspace clustering methods with Laplacian graph constraint.
APA, Harvard, Vancouver, ISO, and other styles
27

Liang, Zexiao, Ruyi Gong, Guoliang Tan, Shiyin Ji, and Ruidian Zhan. "A Frequency Domain Kernel Function-Based Manifold Dimensionality Reduction and Its Application for Graph-Based Semi-Supervised Classification." Applied Sciences 14, no. 12 (2024): 5342. http://dx.doi.org/10.3390/app14125342.

Full text
Abstract:
With the increasing demand for high-resolution images, handling high-dimensional image data has become a key aspect of intelligence algorithms. One effective approach is to preserve the high-dimensional manifold structure of the data and find the accurate mappings in a lower-dimensional space. However, various non-sparse, high-energy occlusions in real-world images can lead to erroneous calculations of sample relationships, invalidating the existing distance-based manifold dimensionality reduction techniques. Many types of noise are difficult to capture and filter in the original domain but can be effectively separated in the frequency domain. Inspired by this idea, a novel approach is proposed in this paper, which obtains the high-dimensional manifold structure according to the correlations between data points in the frequency domain and accurately maps it to a lower-dimensional space, named Frequency domain-based Manifold Dimensionality Reduction (FMDR). In FMDR, samples are first transformed into the frequency domain. Then, interference is filtered based on the distribution in the frequency domain, thereby emphasizing discriminative features. Subsequently, an innovative kernel function is proposed for measuring the similarities between samples according to the correlations in the frequency domain. With the assistance of these correlations, a graph structure can be constructed and utilized to find the mapping in a low-dimensional space. To further demonstrate the effectiveness of the proposed algorithm, FMDR is employed for the semi-supervised classification problems in this paper. Experiments using public image datasets indicate that, compared to baseline algorithms and state-of-the-art methods, our approach achieves superior recognition performance. Even with very few labeled data, the advantages of FMDR are still maintained. The effectiveness of FMDR in dimensionality reduction and feature extraction of images makes it widely applicable in fields such as image processing and image recognition.
APA, Harvard, Vancouver, ISO, and other styles
28

Bradley, Patrick, Sina Keller, and Martin Weinmann. "Unsupervised Feature Selection Based on Ultrametricity and Sparse Training Data: A Case Study for the Classification of High-Dimensional Hyperspectral Data." Remote Sensing 10, no. 10 (2018): 1564. http://dx.doi.org/10.3390/rs10101564.

Full text
Abstract:
In this paper, we investigate the potential of unsupervised feature selection techniques for classification tasks, where only sparse training data are available. This is motivated by the fact that unsupervised feature selection techniques combine the advantages of standard dimensionality reduction techniques (which only rely on the given feature vectors and not on the corresponding labels) and supervised feature selection techniques (which retain a subset of the original set of features). Thus, feature selection becomes independent of the given classification task and, consequently, a subset of generally versatile features is retained. We present different techniques relying on the topology of the given sparse training data. Thereby, the topology is described with an ultrametricity index. For the latter, we take into account the Murtagh Ultrametricity Index (MUI) which is defined on the basis of triangles within the given data and the Topological Ultrametricity Index (TUI) which is defined on the basis of a specific graph structure. In a case study addressing the classification of high-dimensional hyperspectral data based on sparse training data, we demonstrate the performance of the proposed unsupervised feature selection techniques in comparison to standard dimensionality reduction and supervised feature selection techniques on four commonly used benchmark datasets. The achieved classification results reveal that involving supervised feature selection techniques leads to similar classification results as involving unsupervised feature selection techniques, while the latter perform feature selection independently from the given classification task and thus deliver generally versatile features.
APA, Harvard, Vancouver, ISO, and other styles
29

Bambi, Jonas, Yudi Santoso, Hanieh Sadri, et al. "A Methodological Approach to Extracting Patterns of Service Utilization from a Cross-Continuum High Dimensional Healthcare Dataset to Support Care Delivery Optimization for Patients with Complex Problems." BioMedInformatics 4, no. 2 (2024): 946–65. http://dx.doi.org/10.3390/biomedinformatics4020053.

Full text
Abstract:
Background: Optimizing care for patients with complex problems entails the integration of clinically appropriate problem-specific clinical protocols, and the optimization of service-system-encompassing clinical pathways. However, alignment of service system operations with Clinical Practice Guidelines (CPGs) is far more challenging than the time-bounded alignment of procedures with protocols. This is due to the challenge of identifying longitudinal patterns of service utilization in the cross-continuum data to assess adherence to the CPGs. Method: This paper proposes a new methodology for identifying patients’ patterns of service utilization (PSUs) within sparse high-dimensional cross-continuum health datasets using graph community detection. Result: The results show that by using iterative graph community detection and graph metrics, combined with input from clinical and operational subject matter experts, it is possible to extract meaningful, functionally integrated PSUs. Conclusions: This introduces the possibility of influencing the reorganization of some services to provide better care for patients with complex problems. Additionally, this introduces a novel analytical framework relying on patients’ service pathways as a foundation to generate the basic entities required to evaluate conformance of interventions to cohort-specific clinical practice guidelines, which will be further explored in our future research.
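A minimal sketch of the community-detection step on a patient-similarity graph follows, using the Louvain method from networkx on a tiny invented dataset; the real analysis iterates this with graph metrics and clinical input on a sparse, high-dimensional service-utilization graph.

    import networkx as nx

    # patient -> services used, a tiny stand-in for the cross-continuum dataset
    usage = {
        "p1": {"ER", "detox", "housing"},
        "p2": {"ER", "detox"},
        "p3": {"clinic", "pharmacy"},
        "p4": {"clinic", "pharmacy", "lab"},
    }

    G = nx.Graph()
    for a in usage:
        for b in usage:
            if a < b:
                shared = len(usage[a] & usage[b])
                if shared:
                    G.add_edge(a, b, weight=shared)     # edge weight = number of shared services

    communities = nx.community.louvain_communities(G, weight="weight", seed=0)
    # expect {p1, p2} and {p3, p4} to emerge as separate utilization patterns
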
APA, Harvard, Vancouver, ISO, and other styles
30

Zhou, Ruqin, and Wanshou Jiang. "A Ridgeline-Based Terrain Co-registration for Satellite LiDAR Point Clouds in Rough Areas." Remote Sensing 12, no. 13 (2020): 2163. http://dx.doi.org/10.3390/rs12132163.

Full text
Abstract:
It is still a completely new and challenging task to register extensive, enormous and sparse satellite light detection and ranging (LiDAR) point clouds. Aimed at this problem, this study provides a ridgeline-based terrain co-registration method in preparation for satellite LiDAR point clouds in rough areas. This method has several merits: (1) only ridgelines are extracted as neighbor information for feature description and their intersections are extracted as keypoints, which can greatly reduce the number of points for subsequent processing, and the extracted keypoints are of high repeatability and distinctiveness; (2) a new local-reference frame (LRF) construction method is designed by combining both three dimensional (3D) coordinate and normal vector covariance matrices, which effectively improves its direction consistency; (3) a minimum cost–maximum flow (MCMF) graph-matching strategy is adopted to maximize similarity sum in a sparse-matching graph. It can avoid the problem of “many-to-many” and “one to many” caused by traditional matching strategies; (4) a transformation matrix-based clustering is adopted with a least square (LS)-based registration, where mismatches are eliminated and correct pairs fully participate in optimal parameter evaluation to improve its stability. Experiments on simulated satellite LiDAR point clouds show that this method can effectively remove mismatches and estimate optimal parameters with high accuracy, especially in rough areas.
APA, Harvard, Vancouver, ISO, and other styles
31

Grassi, Mario, and Barbara Tarantino. "SEMdag: Fast learning of Directed Acyclic Graphs via node or layer ordering." PLOS ONE 20, no. 1 (2025): e0317283. https://doi.org/10.1371/journal.pone.0317283.

Full text
Abstract:
A Directed Acyclic Graph (DAG) offers an easy approach to define causal structures among gathered nodes: causal linkages are represented by arrows between the variables, leading from cause to effect. Recently, industry and academics have paid close attention to DAG structure learning from observable data, and many techniques have been put forward to address the problem. We provide a two-step approach, named SEMdag(), that can be used to quickly learn high-dimensional linear SEMs. It is included in the R package SEMgraph and employs a two-stage order-based search using previous knowledge (Knowledge-based, KB) or a data-driven method (Bottom-up, BU), under the premise that a linear SEM with equal variance error terms is assumed. We evaluated our framework’s ability to find plausible DAGs against six well-known causal discovery techniques (ARGES, GES, PC, LiNGAM, CAM, NOTEARS). We conducted a series of experiments using observed expression (or RNA-seq) data, taking into account a pair of training and testing datasets for four distinct diseases: Amyotrophic Lateral Sclerosis (ALS), Breast cancer (BRCA), Coronavirus disease (COVID-19) and ST-elevation myocardial infarction (STEMI). The results show that the SEMdag() procedure can recover a graph structure with good disease prediction performance evaluated by a conventional supervised learning algorithm (RF): in the scenario where the initial graph is sparse, the BU approach may be a better choice than the KB one; in the case where the graph is denser, both BU and KB report high performance, with the highest score for the KB approach based on topological layers. Besides its superior disease predictive performance compared to previous research, SEMdag() offers the user the flexibility to define distinct structure learning algorithms and can handle high dimensional issues with less computing load. The SEMdag() function is implemented in the R package SEMgraph, easily available at https://CRAN.R-project.org/package=SEMgraph.
APA, Harvard, Vancouver, ISO, and other styles
32

Liang, Jiaxuan, Jun Wang, Guoxian Yu, Shuyin Xia, and Guoyin Wang. "Multi-Granularity Causal Structure Learning." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 12 (2024): 13727–35. http://dx.doi.org/10.1609/aaai.v38i12.29278.

Full text
Abstract:
Unveiling, modeling, and comprehending the causal mechanisms underpinning natural phenomena stand as fundamental endeavors across myriad scientific disciplines. Meanwhile, new knowledge emerges when discovering causal relationships from data. Existing causal learning algorithms predominantly focus on the isolated effects of variables, overlooking the intricate interplay of multiple variables and their collective behavioral patterns. Furthermore, the ubiquity of high-dimensional data exacts a substantial temporal cost for causal algorithms. In this paper, we develop a novel method called MgCSL (Multi-granularity Causal Structure Learning), which first leverages a sparse auto-encoder to explore coarse-graining strategies and causal abstractions from micro-variables to macro-ones. MgCSL then takes multi-granularity variables as inputs to train multilayer perceptrons and to delve into the causality between variables. To enhance the efficacy on high-dimensional data, MgCSL introduces a simplified acyclicity constraint to adeptly search the directed acyclic graph among variables. Experimental results show that MgCSL outperforms competitive baselines, and finds explainable causal connections on fMRI datasets.
APA, Harvard, Vancouver, ISO, and other styles
33

Majeed, Abdul, and Seong Oun Hwang. "Path Planning Method for UAVs Based on Constrained Polygonal Space and an Extremely Sparse Waypoint Graph." Applied Sciences 11, no. 12 (2021): 5340. http://dx.doi.org/10.3390/app11125340.

Full text
Abstract:
Finding an optimal/quasi-optimal path for Unmanned Aerial Vehicles (UAVs) utilizing full map information yields time performance degradation in large and complex three-dimensional (3D) urban environments populated by various obstacles. A major portion of the computing time is usually wasted on modeling and exploration of spaces that have a very low possibility of providing optimal/sub-optimal paths. However, computing time can be significantly reduced by searching for paths solely in the spaces that have the highest priority of providing an optimal/sub-optimal path. Many Path Planning (PP) techniques have been proposed, but a majority of the existing techniques equally evaluate many spaces of the maps, including unlikely ones, thereby creating time performance issues. Ignoring high-probability spaces and instead exploring too many spaces on maps while searching for a path yields extensive computing-time overhead. This paper presents a new PP method that finds optimal/quasi-optimal and safe (e.g., collision-free) working paths for UAVs in a 3D urban environment encompassing substantial obstacles. By using Constrained Polygonal Space (CPS) and an Extremely Sparse Waypoint Graph (ESWG) while searching for a path, the proposed PP method significantly lowers pathfinding time complexity without degrading the length of the path by much. We suggest an intelligent method exploiting obstacle geometry information to constrain the search space in a 3D polygon form from which a quasi-optimal flyable path can be found quickly. Furthermore, we perform task modeling with an ESWG using as few nodes and edges from the CPS as possible, and we find an abstract path that is subsequently improved. The results achieved from extensive experiments, and comparison with prior methods certify the efficacy of the proposed method and verify the above assertions.
APA, Harvard, Vancouver, ISO, and other styles
34

Kiesel, Scott, Ethan Burns, and Wheeler Ruml. "Abstraction-Guided Sampling for Motion Planning." Proceedings of the International Symposium on Combinatorial Search 3, no. 1 (2021): 162–63. http://dx.doi.org/10.1609/socs.v3i1.18265.

Full text
Abstract:
Motion planning in continuous space is a fundamental robotics problem that has been approached from many perspectives. Rapidly-exploring Random Trees (RRTs) use sampling to efficiently traverse the continuous and high-dimensional state space. Heuristic graph search methods use lower bounds on solution cost to focus effort on portions of the space that are likely to be traversed by low-cost solutions. In this work, we bring these two ideas together in a technique called f-biasing: we use estimates of solution cost, computed as in heuristic search, to guide sparse sampling, as in RRTs. We see this new technique as strengthening the connections between motion planning in robotics and combinatorial search in artificial intelligence.
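As a toy illustration of f-biasing (not the authors' exact procedure), the sketch below assigns each abstract region an f = g + h estimate and samples regions with probability decreasing in f, so tree growth concentrates where low-cost solutions are expected; the regions and temperature parameter are hypothetical.

```python
import math
import random

# Hypothetical abstract regions of the state space with f = g + h estimates,
# computed as in heuristic search (lower f = more promising region).
f_values = {
    (0, 0): 10.0,
    (0, 1): 12.0,
    (1, 0): 18.0,
    (1, 1): 25.0,
}

def f_biased_region(f_values, temperature=5.0):
    """Sample a region with probability proportional to exp(-f / temperature)."""
    regions = list(f_values.keys())
    weights = [math.exp(-f / temperature) for f in f_values.values()]
    return random.choices(regions, weights=weights, k=1)[0]

# Draw biased samples to seed RRT extensions toward promising regions.
print([f_biased_region(f_values) for _ in range(5)])
```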
APA, Harvard, Vancouver, ISO, and other styles
35

Federico, Anthony, Joseph Kern, Xaralabos Varelas, and Stefano Monti. "Structure Learning for Gene Regulatory Networks." PLOS Computational Biology 19, no. 5 (2023): e1011118. http://dx.doi.org/10.1371/journal.pcbi.1011118.

Full text
Abstract:
Inference of biological network structures is often performed on high-dimensional data, yet is hindered by the limited sample size of high throughput “omics” data typically available. To overcome this challenge, often referred to as the “small n, large p problem,” we exploit known organizing principles of biological networks that are sparse, modular, and likely share a large portion of their underlying architecture. We present SHINE—Structure Learning for Hierarchical Networks—a framework for defining data-driven structural constraints and incorporating a shared learning paradigm for efficiently learning multiple Markov networks from high-dimensional data at large p/n ratios not previously feasible. We evaluated SHINE on Pan-Cancer data comprising 23 tumor types, and found that learned tumor-specific networks exhibit expected graph properties of real biological networks, recapture previously validated interactions, and recapitulate findings in literature. Application of SHINE to the analysis of subtype-specific breast cancer networks identified key genes and biological processes for tumor maintenance and survival as well as potential therapeutic targets for modulating known breast cancer disease genes.
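SHINE builds on sparse Markov network (Gaussian graphical model) estimation; as a minimal, hedged sketch of that building block only, the snippet below fits scikit-learn's GraphicalLasso on synthetic data and reads edges off the sparse precision matrix. The hierarchical constraints and shared-learning machinery of SHINE are not reproduced, and the data sizes are illustrative.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
# Synthetic "expression" matrix: n samples x p variables (toy sizes; SHINE
# targets the much harder p >> n regime by adding structural constraints).
n, p = 200, 10
X = rng.standard_normal((n, p))
X[:, 1] += 0.8 * X[:, 0]            # plant one conditional dependence

model = GraphicalLasso(alpha=0.1).fit(X)
precision = model.precision_        # sparse inverse covariance matrix
edges = np.abs(precision) > 1e-3    # nonzero off-diagonal entries ~ Markov network edges
np.fill_diagonal(edges, False)
print(np.argwhere(edges))
```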
APA, Harvard, Vancouver, ISO, and other styles
36

Wang, Zhen, Yongjie Wang, Xinli Xiong, Qiankun Ren, and Jun Huang. "A Novel Framework for Enhancing Decision-Making in Autonomous Cyber Defense Through Graph Embedding." Entropy 27, no. 6 (2025): 622. https://doi.org/10.3390/e27060622.

Full text
Abstract:
Faced with the challenges posed by sophisticated cyber attacks and the dynamic characteristics of cyberspace, autonomous cyber defense (ACD) technology has shown its effectiveness. However, traditional decision-making methods for ACD are unable to effectively characterize the network topology and internode dependencies, which makes it difficult for defenders to identify key nodes and critical attack paths. Therefore, this paper proposes an enhanced decision-making method combining graph embedding with reinforcement learning algorithms. By constructing a game model for cyber confrontations, this paper models the important topological elements for decision-making, which guide the defender to dynamically optimize its strategy based on topology awareness. We enhance reinforcement learning with the Node2vec algorithm to characterize network information for the defender. Node attributes and network structural features are embedded into low-dimensional vectors instead of traditional one-hot encodings, which addresses the perceptual bottleneck in high-dimensional sparse environments. Meanwhile, the algorithm training environment Cyberwheel is extended with new fine-grained defense mechanisms to enhance the utility and portability of ACD. In experiments, our graph-embedding-based decision-making method is compared with traditional perception methods, and the results verify its superior performance in defensive strategy selection. We also analyze and compare different parameters of the Node2vec graph representation model to assess their impact on embedding effectiveness for ACD decision-making.
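A hedged sketch of the Node2vec idea used here: generate random walks over the network and embed nodes with a skip-gram model so the RL agent observes a dense vector instead of a one-hot encoding. For brevity the walks below are uniform (the p/q biasing of full Node2vec is omitted), the graph is a stand-in, and all parameters are illustrative.

```python
import random
import networkx as nx
from gensim.models import Word2Vec

G = nx.karate_club_graph()   # stand-in for the defended network topology

def random_walks(G, num_walks=10, walk_length=20):
    """Uniform random walks; full Node2vec biases transitions with p/q parameters."""
    walks = []
    for _ in range(num_walks):
        for node in G.nodes():
            walk = [node]
            while len(walk) < walk_length:
                nbrs = list(G.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            walks.append([str(n) for n in walk])
    return walks

emb = Word2Vec(random_walks(G), vector_size=32, window=5, min_count=0, sg=1, epochs=5)
state_vector = emb.wv["0"]    # dense node feature fed to the RL policy
print(state_vector.shape)     # (32,) instead of a one-hot over all nodes
```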
APA, Harvard, Vancouver, ISO, and other styles
37

Lu, Pengli, Junxia Yang, and Teng Zhang. "Identifying influential nodes in complex networks based on network embedding and local structure entropy." Journal of Statistical Mechanics: Theory and Experiment 2023, no. 8 (2023): 083402. http://dx.doi.org/10.1088/1742-5468/acdceb.

Full text
Abstract:
The identification of influential nodes in complex networks remains a crucial research direction, as it paves the way for analyzing and controlling information diffusion. Existing network embedding algorithms can represent high-dimensional, sparse networks in low-dimensional, dense vector spaces that preserve the network structure with high accuracy. In this work, a novel centrality approach based on network embedding and local structure entropy, called ELSEC, is proposed to capture richer information and evaluate the importance of nodes from both local and global perspectives. In short, local structure entropy is first used to measure the self-importance of nodes. Second, the network is mapped to a vector space using the Node2vec network embedding algorithm to calculate the Manhattan distance between nodes, and the global importance of nodes is defined by combining the correlation coefficients. To reveal the effectiveness of ELSEC, we select three types of key-node identification algorithms as comparison approaches, including methods based on node centrality, optimal decycling based algorithms, and graph partition based methods, and conduct experiments on ten real networks covering correlation, ranking monotonicity, accuracy of high-ranking nodes, and the size of the giant connected component. Experimental results show that the ELSEC algorithm has excellent ability to identify influential nodes.
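An illustrative sketch of the two ingredients: a degree-based local structure entropy per node and Manhattan distances between node embeddings. The entropy formulation and the random embeddings below are assumptions for demonstration; the exact ELSEC definitions and weighting are in the paper.

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()

def local_structure_entropy(G, node):
    """Entropy of the degree distribution over a node's ego network (one common
    formulation; the precise ELSEC definition may differ)."""
    ego = list(G.neighbors(node)) + [node]
    degs = np.array([G.degree(v) for v in ego], dtype=float)
    p = degs / degs.sum()
    return float(-(p * np.log(p)).sum())

# Placeholder embeddings (random here); ELSEC uses Node2vec vectors instead.
rng = np.random.default_rng(1)
emb = {v: rng.standard_normal(16) for v in G.nodes()}

def manhattan(u, v):
    return float(np.abs(emb[u] - emb[v]).sum())

# Combine local self-importance with a global, embedding-based term.
scores = {v: local_structure_entropy(G, v)
             + np.mean([manhattan(v, u) for u in G.nodes() if u != v])
          for v in G.nodes()}
print(sorted(scores, key=scores.get, reverse=True)[:5])  # candidate influential nodes
```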
APA, Harvard, Vancouver, ISO, and other styles
38

Wang, Binhao, Jianwei Liu, Bing Kuang, Yuwei Li, and Xianfeng Luo. "Research on 3D laser SLAM algorithm based on graph optimization." Journal of Physics: Conference Series 2880, no. 1 (2024): 012001. http://dx.doi.org/10.1088/1742-6596/2880/1/012001.

Full text
Abstract:
To address the problem that the positioning accuracy of laser Simultaneous Localization and Mapping (SLAM) is degraded by the misclassification of close feature points and incorrect feature matching in environments with redundant features, a 3D laser SLAM algorithm based on graph optimization is proposed. First, to reduce feature extraction errors, a distance threshold is used to filter close point clouds and improve the accuracy of feature classification, enhancing the reliability of the optoelectronic measurement data. Second, to improve the consistency of feature matching, a two-stage fine matching based on geometric consistency is introduced on top of the rough matching based on a k-dimensional tree (KD-Tree), improving laser point alignment accuracy. The proposed method is compared with mainstream localization algorithms on the KITTI dataset. Compared with Lidar-IMU Odometry with Sparse Adaptive Mapping and Fast LiDAR-Inertial Odometry, the positioning error of the proposed method is reduced by 21.88% and 24.43% on average, which shows the superiority of the algorithm in optoelectronics and optical technology applications and provides new ideas and methods for the development of high-precision laser navigation technology.
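KD-Tree-based rough matching is a standard nearest-neighbor association step in scan alignment; the sketch below uses scipy's cKDTree on synthetic point clouds and applies a distance threshold to reject unreliable correspondences. The point clouds, threshold value, and the geometric-consistency fine matching stage are not taken from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
source = rng.uniform(0, 50, size=(500, 3))                 # current scan (synthetic)
target = source + rng.normal(0, 0.05, size=source.shape)   # previous scan, perturbed

tree = cKDTree(target)
dists, idx = tree.query(source, k=1)                        # rough nearest-neighbor matching

# Distance-threshold filtering: drop associations that are too far apart before
# any fine matching / alignment step.
threshold = 0.5
keep = dists < threshold
pairs = np.column_stack([np.nonzero(keep)[0], idx[keep]])
print(f"{pairs.shape[0]} correspondences kept out of {source.shape[0]}")
```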
APA, Harvard, Vancouver, ISO, and other styles
39

Yang, Yazhi, Jiandong Shi, Siwei Zhou, and Shasha Yang. "Geometric Matrix Completion via Graph-Based Truncated Norm Regularization for Learning Resource Recommendation." Mathematics 12, no. 2 (2024): 320. http://dx.doi.org/10.3390/math12020320.

Full text
Abstract:
In the competitive landscape of online learning, developing robust and effective learning resource recommendation systems is paramount, yet the field faces challenges due to high-dimensional, sparse matrices and intricate user–resource interactions. Our study focuses on geometric matrix completion (GMC) and introduces a novel approach, graph-based truncated norm regularization (GBTNR) for problem solving. GBTNR innovatively incorporates truncated Dirichlet norms for both user and item graphs, enhancing the model’s ability to handle complex data structures. This method synergistically combines the benefits of truncated norm regularization with the insightful analysis of user–user and resource–resource graph relationships, leading to a significant improvement in recommendation performance. Our model’s unique application of truncated Dirichlet norms distinctively positions it to address the inherent complexities in user and item data structures more effectively than existing methods. By bridging the gap between theoretical robustness and practical applicability, the GBTNR approach offers a substantial leap forward in the field of learning resource recommendations. This advancement is particularly critical in the realm of online education, where understanding and adapting to diverse and intricate user–resource interactions is key to developing truly personalized learning experiences. Moreover, our work includes a thorough theoretical analysis, complete with proofs, to establish the convergence property of the GMC-GBTNR model, thus reinforcing its reliability and effectiveness in practical applications. Empirical validation through extensive experiments on diverse real-world datasets affirms the model’s superior performance over existing methods, marking a groundbreaking advancement in personalized education and deepening our understanding of the dynamics in learner–resource interactions.
APA, Harvard, Vancouver, ISO, and other styles
40

Bambi, Jonas, Hanieh Sadri, Ken Moselle, et al. "Approaches to Extracting Patterns of Service Utilization for Patients with Complex Conditions: Graph Community Detection vs. Natural Language Processing Clustering." BioMedInformatics 4, no. 3 (2024): 1884–900. http://dx.doi.org/10.3390/biomedinformatics4030103.

Full text
Abstract:
Background: As patients interact with a healthcare service system, patterns of service utilization (PSUs) emerge. These PSUs are embedded in the sparse high-dimensional space of longitudinal cross-continuum health service encounter data. Once extracted, PSUs can provide quality assurance/quality improvement (QA/QI) efforts with the information required to optimize service system structures and functions. This may improve outcomes for complex patients with chronic diseases. Method: Working with longitudinal cross-continuum encounter data from a regional health service system, various pattern detection analyses were conducted, employing (1) graph community detection algorithms, (2) natural language processing (NLP) clustering, and (3) a hybrid NLP–graph method. Result: These approaches produced similar PSUs, as determined from a clinical perspective by clinical subject matter experts and service system operations experts. Conclusions: The similarity in the results provides validation for the methodologies. Moreover, the results stress the need to engage with clinical or service system operations experts, both in providing the taxonomies and ontologies of the service system, the cohort definitions, and determining the level of granularity that produces the most clinically meaningful results. Finally, the uniqueness of each approach provides an opportunity to take advantage of the various analytical capabilities that each approach brings, which will be further explored in our future research.
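A minimal sketch of the graph community detection side of this comparison: a small hypothetical service-transition graph (service names and edge weights are invented) partitioned with networkx's greedy modularity algorithm, each community standing in for one candidate pattern of service utilization. The NLP clustering and hybrid pipelines are not shown.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical encounter-transition graph: nodes are service points, edge weights
# count how often patients moved between them in longitudinal encounter data.
G = nx.Graph()
G.add_weighted_edges_from([
    ("ER", "Psych Inpatient", 40), ("Psych Inpatient", "Community MH", 35),
    ("Community MH", "ER", 12), ("GP", "Home Care", 30),
    ("Home Care", "Outpatient Clinic", 25), ("GP", "Outpatient Clinic", 20),
    ("ER", "GP", 3),
])

communities = greedy_modularity_communities(G, weight="weight")
for i, c in enumerate(communities):
    print(i, sorted(c))   # each community ~ one candidate pattern of service utilization
```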
APA, Harvard, Vancouver, ISO, and other styles
41

R., Kalai Selvi, and Malathy G. "Data Structure Innovations for Machine Learning and AI Algorithms." International Journal of Innovative Science and Research Technology (IJISRT) 10, no. 1 (2025): 2640–43. https://doi.org/10.5281/zenodo.14890846.

Full text
Abstract:
With the increasing complexity and size of data in machine learning (ML) and artificial intelligence (AI) applications, efficient data structures have become critical for enhancing performance, scalability, and memory management. Traditional data structures often fail to meet the specific requirements of modern ML and AI algorithms, particularly in terms of speed, flexibility, and storage efficiency. This paper explores recent innovations in data structures tailored for ML and AI tasks, including dynamic data structures, compressed storage techniques, and specialized graph-based structures. We present a detailed review of advanced data structures such as KD-trees, hash maps, Bloom filters, sparse matrices, and priority queues, and how they contribute to the performance improvements in common AI applications like deep learning, reinforcement learning, and large-scale data analysis. Furthermore, we propose a new hybrid data structure that combines the strengths of multiple existing structures to address challenges related to real-time processing, memory constraints, and high-dimensional data.
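As a quick, generic example of one structure from this list, the sketch below stores a mostly-zero feature matrix in compressed sparse row (CSR) form with scipy, which keeps only the nonzero entries and makes matrix-vector products cheap; the data are invented.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy high-dimensional, mostly-zero feature matrix (e.g., bag-of-words rows).
dense = np.zeros((4, 10_000))
dense[0, 17] = 1.0
dense[1, 42] = 3.0
dense[3, 9_999] = 2.0

sparse = csr_matrix(dense)
print(sparse.nnz, "nonzeros stored instead of", dense.size, "cells")

# Operations touch only the stored nonzeros.
w = np.random.default_rng(0).standard_normal(10_000)
print(sparse @ w)
```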
APA, Harvard, Vancouver, ISO, and other styles
42

Song, Wenjun, Xinqi Liu, and Qiuwen Zhang. "Fast Coding Unit Partitioning Method for Video-Based Point Cloud Compression: Combining Convolutional Neural Networks and Bayesian Optimization." Electronics 14, no. 7 (2025): 1295. https://doi.org/10.3390/electronics14071295.

Full text
Abstract:
As 5G technology and 3D capture techniques have been rapidly developing, there has been a remarkable increase in the demand for effectively compressing dynamic 3D point cloud data. Video-based point cloud compression (V-PCC), which is an innovative method for 3D point cloud compression, makes use of High-Efficiency Video Coding (HEVC) to carry out the compression of 3D point clouds. This is accomplished through the projection of the point clouds onto two-dimensional video frames. However, V-PCC faces significant coding complexity, particularly for dynamic 3D point clouds, which can be up to four times more complex to process than a conventional video. To address this challenge, we propose an adaptive coding unit (CU) partitioning method that integrates occupancy graphs, convolutional neural networks (CNNs), and Bayesian optimization. In this approach, the coding units (CUs) are first divided into dense regions, sparse regions, and complex composite regions by calculating the occupancy rate R of the CUs, and then an initial classification decision is made using a convolutional neural network (CNN) framework. For regions where the CNN outputs low-confidence classifications, Bayesian optimization is employed to refine the partitioning and enhance accuracy. The findings from the experiments show that the suggested method can efficiently decrease the coding complexity of V-PCC, all the while maintaining a high level of coding quality. Specifically, the average coding time of the geometric graph is reduced by 57.37%, the attribute graph by 54.43%, and the overall coding time by 54.75%. Although the BD rate slightly increases compared with that of the baseline V-PCC method, the impact on video quality is negligible. Additionally, the proposed algorithm outperforms existing methods in terms of geometric compression efficiency and computational time savings. This study’s innovation lies in combining deep learning with Bayesian optimization to deliver an efficient CU partitioning strategy for V-PCC, improving coding speed and reducing computational resource consumption, thereby advancing the practical application of V-PCC.
APA, Harvard, Vancouver, ISO, and other styles
43

Bi, Yifei, Jianing Luo, Jiwei Zhu, Junxiu Liu, and Wei Li. "Decentralized Multi-Robot Navigation Based on Deep Reinforcement Learning and Trajectory Optimization." Biomimetics 10, no. 6 (2025): 366. https://doi.org/10.3390/biomimetics10060366.

Full text
Abstract:
Multi-robot systems offer significant decision-making capabilities and broad applications, but avoiding collisions during movement remains a critical challenge. Existing decentralized obstacle avoidance strategies, while low in computational cost, often fail to ensure safety effectively. To address this issue, this paper leverages graph neural networks (GNNs) and deep reinforcement learning (DRL) to aggregate high-dimensional features as inputs for reinforcement learning (RL) to generate paths. It also introduces safety constraints through an artificial potential field (APF) to optimize these trajectories, and a constrained nonlinear optimization method further refines the APF-adjusted paths, resulting in the GNN-RL-APF-Lagrangian algorithm. By combining APF and nonlinear optimization techniques, experimental results demonstrate that this method significantly enhances the safety and obstacle avoidance capabilities of multi-robot systems in complex environments. The proposed GNN-RL-APF-Lagrangian algorithm achieves a 96.43% success rate in sparse obstacle environments and 89.77% in dense obstacle scenarios, representing improvements of 59% and 60%, respectively, over baseline GNN-RL approaches. The method maintains scalability up to 30 robots while preserving distributed execution properties.
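A minimal 2D numpy sketch of the artificial potential field adjustment described above: an attractive force toward the goal plus repulsive forces from nearby obstacles. Gains, influence radius, and step size are made-up values; the GNN/RL policy and the Lagrangian-constrained refinement are not shown.

```python
import numpy as np

def apf_step(pos, goal, obstacles, k_att=1.0, k_rep=50.0, influence=3.0, step=0.1):
    """One artificial-potential-field adjustment of a 2D waypoint (illustrative)."""
    force = k_att * (goal - pos)                       # attraction toward the goal
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 1e-6 < d < influence:                       # repulsion only near obstacles
            force += k_rep * (1.0 / d - 1.0 / influence) * diff / d**3
    return pos + step * force / max(np.linalg.norm(force), 1e-6)

pos = np.array([0.0, 0.0])
goal = np.array([10.0, 10.0])
obstacles = [np.array([5.0, 5.0])]
for _ in range(5):
    pos = apf_step(pos, goal, obstacles)
print(pos)   # nudged toward the goal while being pushed away from the obstacle
```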
APA, Harvard, Vancouver, ISO, and other styles
44

Norville, Zane, Michelle Hedlund, and Vivek Buch. "1299 Network-Wide Morphological Dynamics Predict Functional Cognitive Performance." Neurosurgery 71, Supplement_1 (2025): 214. https://doi.org/10.1227/neu.0000000000003360_1299.

Full text
Abstract:
INTRODUCTION: Learning-related morphological changes in brain connectivity remain poorly understood. Network analysis techniques provide robust interpretation of high-dimensional neural data during skill acquisition. This insight may reveal novel anatomical and personalized targets for restoring performance in individuals with intellectual disabilities, who today possess sparse therapeutic offerings. METHODS: Task-paired electrode data were collected from 23 subjects undergoing sEEG epilepsy evaluation. Each subject performed several trials of a temporal expectancy task paradigm in which reaction time was assessed following visual stimulus color-change. Subjects were categorized into “Learners” (n = 17; improved at the task) or “Non-Learners” (n = 6; did not improve) based on trial reaction time slopes. During analysis, each sEEG channel represented a node in a graph. Edge weights between each pair of channels were assigned with phase-locking value to form functional connectivity matrices, which were calculated in four frequency bands (alpha, beta, low-gamma, high-gamma). Changes in modularity (a network-wide measure of community segregation) and efficiency (a measure of cross-network navigation ease) over a task session were compared between Learners and Non-Learners for periods prior to the presentation stimulus (pre-trial), go-cue, and response. RESULTS: Learners demonstrated increased high-gamma modularity in the pre-trial period (p < 0.05), increased alpha efficiency in the go-cue period (p < 0.05), and increased alpha and high-gamma modularity in the response period (p < 0.05) relative to Non-Learners. CONCLUSIONS: Dynamic learning was associated with increases in modular community segregation and brain-wide functional efficiency over the duration of a cognitive task. These network analytical findings demonstrate the potential impact of graph network decoding on understanding dynamic learning and could lead to novel therapeutic control signals for future restoration strategies.
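The edge weights above are phase-locking values (PLV) between channel pairs; as a hedged, minimal sketch of that computation only, the snippet below band-pass filters two synthetic signals, extracts instantaneous phase with the Hilbert transform, and averages the phase-difference phasor. Sampling rate, band, and signals are invented.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 500                                       # sampling rate in Hz (illustrative)
t = np.arange(0, 2, 1 / fs)
rng = np.random.default_rng(0)
shared = np.sin(2 * np.pi * 10 * t)            # shared alpha-band component
x = shared + 0.5 * rng.standard_normal(t.size)
y = shared + 0.5 * rng.standard_normal(t.size)

def plv(x, y, fs, band=(8.0, 12.0)):
    """Phase-locking value of two signals within a frequency band."""
    b, a = butter(4, band, btype="bandpass", fs=fs)
    phase_x = np.angle(hilbert(filtfilt(b, a, x)))
    phase_y = np.angle(hilbert(filtfilt(b, a, y)))
    return float(np.abs(np.mean(np.exp(1j * (phase_x - phase_y)))))

print(plv(x, y, fs))   # near 1.0 for strongly phase-locked channels
```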
APA, Harvard, Vancouver, ISO, and other styles
45

Zhang, Zhihao. "A Method of Recommending Physical Education Network Course Resources Based on Collaborative Filtering Technology." Scientific Programming 2021 (October 28, 2021): 1–9. http://dx.doi.org/10.1155/2021/9531111.

Full text
Abstract:
Current research on e-learning shows that existing e-learning systems used for recommending learning resources offer only two search methods: Top-N and keyword search. These methods cannot effectively recommend learning resources to learners. Therefore, this paper applies collaborative filtering recommendation technology to the personalized recommendation of learning resources. We obtain users' content and functional interests and predict their comprehensive interest over web and big data through an infinite deep neural network. Based on a collaborative knowledge graph and the collaborative filtering algorithm, the semantic information of teaching network resources is extracted from the knowledge graph. Following the principles of nearest-neighbor recommendation, the course attribute value preference matrix (APM) is obtained first. Next, the predicted course values are sorted in descending order, and the top T courses with the highest predicted values are selected as the final recommended course set for the target learners. Each course has its own online classroom; the teacher publishes online class details ahead of time, and students can purchase online access to the classroom number and password. The experimental results show that the optimal number of clusters k is 9. Furthermore, for extremely sparse matrices, the collaborative filtering method is better suited to clustering in the transformed low-dimensional space. The average recommendation satisfaction of the collaborative filtering method is approximately 43.6%, which demonstrates high recommendation quality.
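A minimal user-based collaborative filtering sketch in the spirit of the nearest-neighbor recommendation step: cosine similarity over a toy learner-course rating matrix and a top-T selection of unseen courses. The matrix, neighborhood scheme, and T are invented; the APM construction and knowledge-graph features are not reproduced.

```python
import numpy as np

# Toy ratings: rows are learners, columns are courses; 0 = not yet rated.
R = np.array([
    [5, 4, 0, 0, 1],
    [4, 5, 1, 0, 0],
    [0, 1, 5, 4, 0],
    [1, 0, 4, 5, 0],
], dtype=float)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

target = 0
sims = np.array([cosine(R[target], R[j]) for j in range(R.shape[0])])
sims[target] = 0.0                               # exclude the learner themself

# Predicted scores: similarity-weighted average of neighbors' ratings.
pred = sims @ R / (sims.sum() + 1e-9)
unseen = np.where(R[target] == 0)[0]
top_T = unseen[np.argsort(-pred[unseen])][:2]    # recommend the top T unseen courses
print("recommend courses:", top_T)
```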
APA, Harvard, Vancouver, ISO, and other styles
46

Sanz Ilundain, Iñigo, Laura Hernández-Lorenzo, Cristina Rodríguez-Antona, Jesús García-Donas, and José L. Ayala. "Autoencoder techniques for survival analysis on renal cell carcinoma." PLOS One 20, no. 5 (2025): e0321045. https://doi.org/10.1371/journal.pone.0321045.

Full text
Abstract:
Survival is the gold standard in oncology for determining the real impact of therapies on patient outcomes. Thus, identifying molecular predictors of survival (such as genetic alterations or transcriptomic patterns of gene expression) is one of the most relevant fields in current research. Statistical methods and metrics to analyze time-to-event data are crucial in understanding disease progression and the effectiveness of treatments. However, in the medical field, data is often high-dimensional, complicating the application of such methodologies. In this study, we addressed this challenge by compressing the high-dimensional transcriptomic data of patients treated with immunotherapy (avelumab + axitinib) and a TKI (sunitinib) into latent, meaningful features using autoencoders. We applied a semi-parametric statistical approach based on the Cox Proportional Hazards model, coupled with Breslow’s estimator, to predict each patient’s Progression-Free Survival (PFS) and determine survival functions. Our analysis explored various penalty configurations and their combinations. Given the complexity of transcriptomic data, we extended our model to incorporate both tabular data and its graph variant, where edges represent protein-protein interactions between genes, offering a more insightful approach. Recognizing the interpretability challenges inherent in neural networks, particularly autoencoders, we analyzed the mutual information between genes in the original data and their latent feature representations to clarify which genes are most associated with specific latent variables. The results indicate that different types of autoencoders are better suited for different tasks: denoising autoencoders excel at accurate reconstruction, while the sparse variant is more effective at producing meaningful representations. Additionally, combining these penalties enhances both reconstruction quality and the interpretability of latent features. The interpretable models identified genes such as LRP2 and ACE2 as highly relevant to renal cell carcinoma. This research underscores the utility of autoencoders in managing high-dimensional data problems.
APA, Harvard, Vancouver, ISO, and other styles
47

Yue, Yuyu, Jixin Zhang, Mingwu Zhang, and Jia Yang. "An Abnormal Account Identification Method by Topology Feature Analysis for Blockchain-Based Transaction Network." Electronics 13, no. 8 (2024): 1416. http://dx.doi.org/10.3390/electronics13081416.

Full text
Abstract:
Cryptocurrency, as one of the most successful applications of blockchain technology, has played a vital role in promoting the development of the digital economy. However, its anonymity, the large scale of cryptographic transactions, and decentralization have also brought new challenges in identifying abnormal accounts and preventing abnormal transaction behaviors, such as money laundering, extortion, and market manipulation. Recently, some researchers have proposed efficient and accurate abnormal transaction detection based on machine learning. In reality, however, abnormal accounts and transactions are far less common than normal ones, so it is difficult for previous methods to detect abnormal accounts when trained on such imbalanced data. To address these issues, in this paper, we propose a method for identifying abnormal accounts using topology analysis of cryptographic transactions. We treat the accounts and transactions in the blockchain as graph nodes and edges. Since abnormal accounts may exhibit distinctive topology features, we extract topology features from the transaction graph. By analyzing the topology features of transactions, we find that the high-dimensional sparse topology features can be compressed using the singular value decomposition method for feature dimension reduction. Subsequently, we use a generative adversarial network to generate samples resembling abnormal accounts, which are added to the training dataset to balance abnormal and normal accounts. Finally, we utilize several machine learning techniques to detect abnormal accounts in the blockchain. Our experimental results demonstrate that our method significantly improves the accuracy and recall rate for detecting abnormal accounts in blockchain compared with state-of-the-art methods.
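A minimal sketch of the dimension-reduction step only: compressing high-dimensional sparse topology features with truncated SVD before classification. The feature matrix below is random and the component count is illustrative; the GAN-based oversampling and downstream classifiers are not shown.

```python
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

# Synthetic sparse topology features: 1,000 accounts x 5,000 dimensions, ~1% nonzero.
X = sparse_random(1_000, 5_000, density=0.01, format="csr", random_state=0)

svd = TruncatedSVD(n_components=32, random_state=0)
X_low = svd.fit_transform(X)                 # dense 1,000 x 32 representation
print(X_low.shape, "explained variance:", svd.explained_variance_ratio_.sum())
```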
APA, Harvard, Vancouver, ISO, and other styles
48

Javidian, Mohammad Ali, Marco Valtorta, and Pooyan Jamshidi. "AMP Chain Graphs: Minimal Separators and Structure Learning Algorithms." Journal of Artificial Intelligence Research 69 (October 7, 2020): 419–70. http://dx.doi.org/10.1613/jair.1.12101.

Full text
Abstract:
This paper deals with chain graphs (CGs) under the Andersson–Madigan–Perlman (AMP) interpretation. We address the problem of finding a minimal separator in an AMP CG, namely, finding a set Z of nodes that separates a given non-adjacent pair of nodes such that no proper subset of Z separates that pair. We analyze several versions of this problem and offer polynomial time algorithms for each. These include finding a minimal separator from a restricted set of nodes, finding a minimal separator for two given disjoint sets, and testing whether a given separator is minimal. To address the problem of learning the structure of AMP CGs from data, we show that the PC-like algorithm is order dependent, in the sense that the output can depend on the order in which the variables are given. We propose several modifications of the PC-like algorithm that remove part or all of this order-dependence. We also extend the decomposition-based approach for learning Bayesian networks (BNs) to learn AMP CGs, which include BNs as a special case, under the faithfulness assumption. We prove the correctness of our extension using the minimal separator results. Experiments using standard benchmarks and synthetically generated models and data demonstrate the competitive performance of our decomposition-based method, called LCD-AMP, in comparison with the (modified versions of the) PC-like algorithm. The LCD-AMP algorithm usually outperforms the PC-like algorithm, and our modifications of the PC-like algorithm learn structures that are more similar to the underlying ground truth graphs than the original PC-like algorithm, especially in high-dimensional settings. In particular, we empirically show that the results of both algorithms are more accurate and stable when the sample size is reasonably large and the underlying graph is sparse.
APA, Harvard, Vancouver, ISO, and other styles
49

Chen, Junting, Liyun Zhong, and Caiyun Cai. "Using Exponential Kernel for Semi-Supervised Word Sense Disambiguation." Journal of Computational and Theoretical Nanoscience 13, no. 10 (2016): 6929–34. http://dx.doi.org/10.1166/jctn.2016.5649.

Full text
Abstract:
Word sense disambiguation (WSD) in natural language text is a fundamental semantic understanding task at the lexical level in natural language processing (NLP) applications. Kernel methods such as support vector machines (SVMs) have been successfully applied to WSD. This is mainly due to their relatively high classification accuracy as well as their ability to handle high-dimensional and sparse data. A significant challenge in WSD is to reduce the need for labeled training data while maintaining an acceptable performance. In this paper, we present a semi-supervised technique using the exponential kernel for WSD. Specifically, the semantic similarities between terms are first determined with both labeled and unlabeled training data by means of a diffusion process on a graph defined by lexicon and co-occurrence information, and the exponential kernel is then constructed based on the learned semantic similarity. Finally, the SVM classifier trains a model for each class during the training phase, and this model is then applied to all test examples in the test phase. The main feature of this approach is that it takes advantage of the exponential kernel to reveal the semantic similarities between terms in an unsupervised manner, which provides a kernel framework for semi-supervised learning. Experiments on several SENSEVAL benchmark data sets demonstrate that the proposed approach is sound and effective.
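The exponential (diffusion) kernel over a term graph can be written as K = exp(βA), summing contributions from walks of all lengths between terms; a small numpy/scipy sketch on a toy co-occurrence adjacency matrix follows. The graph, β, and the choice of the raw adjacency rather than a normalized Laplacian are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

# Toy symmetric term co-occurrence adjacency matrix (4 terms).
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

beta = 0.5
K = expm(beta * A)   # exponential diffusion kernel: sum_k (beta^k / k!) * A^k
# K[i, j] aggregates walks of all lengths, so terms that never co-occur directly
# can still receive a nonzero similarity; K is positive semidefinite and can be
# supplied to an SVM as a precomputed kernel.
print(np.round(K, 2))
```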
APA, Harvard, Vancouver, ISO, and other styles
50

Zhao, Haitao, Sujay Datta, and Zhong-Hui Duan. "An Integrated Approach of Learning Genetic Networks From Genome-Wide Gene Expression Data Using Gaussian Graphical Model and Monte Carlo Method." Bioinformatics and Biology Insights 17 (January 2023): 117793222311529. http://dx.doi.org/10.1177/11779322231152972.

Full text
Abstract:
Global genetic networks provide additional information for the analysis of human diseases, beyond the traditional analysis that focuses on single genes or local networks. The Gaussian graphical model (GGM) is widely applied to learn genetic networks because it defines an undirected graph decoding the conditional dependence between genes. Many algorithms based on the GGM have been proposed for learning genetic network structures. Because the number of gene variables is typically far greater than the number of samples collected, and a real genetic network is typically sparse, the graphical lasso implementation of the GGM has become a popular tool for inferring the conditional interdependence among genes. However, graphical lasso, although showing good performance on low-dimensional data sets, is computationally expensive and inefficient, or even unable to work directly, on genome-wide gene expression data sets. In this study, the Monte Carlo Gaussian graphical model (MCGGM) method is proposed to learn global genetic networks of genes. This method uses a Monte Carlo approach to sample subnetworks from genome-wide gene expression data and graphical lasso to learn the structures of the subnetworks. The learned subnetworks are then integrated to approximate a global genetic network. The proposed method was evaluated with a relatively small real data set of RNA-seq expression levels. The results indicate that the proposed method shows a strong ability to decode interactions with high conditional dependence among genes. The method was then applied to genome-wide data sets of RNA-seq expression levels. The gene interactions with high interdependence from the estimated global networks show that most of the predicted gene-gene interactions have been reported in the literature as playing important roles in different human cancers. The results also validate the ability and reliability of the proposed method to identify high conditional dependences among genes in large-scale data sets.
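A hedged sketch of the Monte Carlo idea described above: repeatedly sample random gene subsets, fit graphical lasso on each subnetwork, and accumulate edge evidence into an approximate global network. Data, subset size, number of subnetworks, regularization strength, and the edge-count aggregation rule are all illustrative choices, not the authors' exact settings.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
n, p = 120, 60                              # samples x genes (toy sizes)
X = rng.standard_normal((n, p))
X[:, 1] += 0.9 * X[:, 0]                    # plant one strong conditional dependence

edge_counts = np.zeros((p, p))
n_subnets, subnet_size = 200, 15
for _ in range(n_subnets):
    genes = rng.choice(p, size=subnet_size, replace=False)
    prec = GraphicalLasso(alpha=0.2, max_iter=200).fit(X[:, genes]).precision_
    hits = np.abs(prec) > 1e-3
    np.fill_diagonal(hits, False)
    edge_counts[np.ix_(genes, genes)] += hits

# Edges supported across many sampled subnetworks approximate the global network.
i, j = np.unravel_index(np.argmax(edge_counts), edge_counts.shape)
print("most strongly supported edge:", (i, j))
```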
APA, Harvard, Vancouver, ISO, and other styles