Journal articles on the topic 'Omic data'

1

Oromendia, Ana, Dorina Ismailgeci, Michele Ciofii, Taylor Donnelly, Linda Bojmar, John Jyazbek, Arnaub Chatterjee, David Lyden, Kenneth H. Yu, and David Paul Kelsen. "Error-free, automated data integration of exosome cargo protein data with extensive clinical data in an ongoing, multi-omic translational research study." Journal of Clinical Oncology 38, no. 15_suppl (May 20, 2020): e16743-e16743. http://dx.doi.org/10.1200/jco.2020.38.15_suppl.e16743.

Abstract:
e16743 Background: Major advances in understanding the biology of cancer have come from genomic analysis of tumor and normal tissue. Integrating extensive patient-related data with deep analysis of omic data is crucial to informing omic data interpretation. Currently, such integrations are a highly manual, asynchronous, and costly process as well as error-prone and time-consuming. To develop new blood assays that may detect very early stage PDAC, a multi-omic investigation with deep clinical annotation is needed. Using pilot data from an on-going study, we test a new platform allowing automated error-free integration of an extensive clinical database with extensive omic data. Methods: Demographic, clinical, family pedigree and pathology data were collected on the Rave EDC platform. Exosomes were purified from 46 plasma samples from 14 controls and 24 PDAC patients and cargo proteins were quantified via SILAC. The Rave Omics platform was used to ingest and integrate clinical and omic data, run quality checks and generate integrated clinical-omic datasets. Data fidelity was validated by systematically computing differences between corresponding values in the source files and those present in the extracted data object (integrated data). The root mean squared error (RMSE) was calculated for numeric values in each sample. Additional validation was conducted by manual inspection to ascertain data integrity. Results: We demonstrated automatic integration, without human intervention, of a subset of the clinical data and all available SILAC data into an analysis-ready data object. Data transfer was completely faithful, with 100% concordance between the source and the integrated data without loss of features. All proteins (n = 1515) and clinical variables (n = 64) were imported. Their nomenclature and corresponding sample values (n = 69690) and clinical values (n = 2432) matched exactly between datasets.
In all samples, the RMSE was exactly zero, indicating no deviation between data sources. Conclusions: We demonstrated that automatic, efficient, and reliable integration of clinical-omic data is achievable during an in-flight PDAC trial. Automatic exploratory analytics supporting biomarker discovery are currently being used to uncover associations between omic and clinical features. The Rave Omics platform is disease-agnostic and we plan to expand to trials of varying size, indication, and completion status where systematic, automated integration of clinical and (multi)omic data is needed.
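The fidelity check described in this abstract (per-sample RMSE between source values and integrated values) can be illustrated with a minimal sketch. This is not the Rave Omics implementation; the function names, the dictionary layout, and the toy values below are all hypothetical and chosen only to show the arithmetic.

```python
import math


def rmse(source, integrated):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(source) != len(integrated):
        raise ValueError("sequences must have the same length")
    if not source:
        raise ValueError("sequences must be non-empty")
    squared = [(s - i) ** 2 for s, i in zip(source, integrated)]
    return math.sqrt(sum(squared) / len(squared))


def per_sample_rmse(source_table, integrated_table):
    """Map each sample ID to the RMSE of its values across the two tables."""
    if source_table.keys() != integrated_table.keys():
        raise ValueError("sample IDs differ between tables")
    return {
        sample_id: rmse(source_table[sample_id], integrated_table[sample_id])
        for sample_id in source_table
    }


# Hypothetical toy data: two samples, three protein quantities each.
source = {"S1": [1.0, 2.5, 3.0], "S2": [0.2, 0.4, 0.8]}
integrated = {"S1": [1.0, 2.5, 3.0], "S2": [0.2, 0.4, 0.8]}

result = per_sample_rmse(source, integrated)
print(result)
```

An RMSE of exactly zero for a sample means every integrated value equals its source value, which is the condition the abstract reports for all samples.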
2

Ugidos, Manuel, Sonia Tarazona, José M. Prats-Montalbán, Alberto Ferrer, and Ana Conesa. "MultiBaC: A strategy to remove batch effects between different omic data types." Statistical Methods in Medical Research 29, no. 10 (March 4, 2020): 2851–64. http://dx.doi.org/10.1177/0962280220907365.

Abstract:
Diversity of omic technologies has expanded in recent years together with the number of omic data integration strategies. However, multiomic data generation is costly, and many research groups cannot afford research projects where many different omic techniques are generated, at least at the same time. As most researchers share their data in public repositories, different omic datasets of the same biological system obtained at different labs can be combined to construct a multiomic study. However, data obtained at different labs or moments in time are typically subjected to batch effects that need to be removed for successful data integration. While there are methods to correct batch effects on the same data types obtained in different studies, they cannot be applied to correct lab or batch effects across omics. This impairs multiomic meta-analysis. Fortunately, in many cases, at least one omics platform (i.e., gene expression) is repeatedly measured across labs, together with the additional omic modalities that are specific to each study. This creates an opportunity for batch analysis. We have developed MultiBaC (Multiomics Batch-effect Correction), a strategy to correct batch effects from multiomic datasets distributed across different labs or data acquisition events. Our strategy is based on the existence of at least one shared data type which allows data prediction across omics. We validate this approach both on simulated data and on a case where the multiomic design is fully shared by two labs, hence batch effect correction within the same omic modality using traditional methods can be compared with the MultiBaC correction across data types. Finally, we apply MultiBaC to a true multiomic data integration problem to show that we are able to improve the detection of meaningful biological effects.
3

Rappoport, Nimrod, and Ron Shamir. "NEMO: cancer subtyping by integration of partial multi-omic data." Bioinformatics 35, no. 18 (January 30, 2019): 3348–56. http://dx.doi.org/10.1093/bioinformatics/btz058.

Abstract:
Motivation: Cancer subtypes were usually defined based on molecular characterization of single omic data. Increasingly, measurements of multiple omic profiles for the same cohort are available. Defining cancer subtypes using multi-omic data may improve our understanding of cancer, and suggest more precise treatment for patients. Results: We present NEMO (NEighborhood based Multi-Omics clustering), a novel algorithm for multi-omics clustering. Importantly, NEMO can be applied to partial datasets in which some patients have data for only a subset of the omics, without performing data imputation. In extensive testing on ten cancer datasets spanning 3168 patients, NEMO achieved results comparable to the best of nine state-of-the-art multi-omics clustering algorithms on full data and showed an improvement on partial data. On some of the partial data tests, PVC, a multi-view algorithm, performed better, but it is limited to two omics and to positive partial data. Finally, we demonstrate the advantage of NEMO in detailed analysis of partial data of AML patients. NEMO is fast and much simpler than existing multi-omics clustering algorithms, and avoids iterative optimization. Availability and implementation: Code for NEMO and for reproducing all NEMO results in this paper is in github: https://github.com/Shamir-Lab/NEMO. Supplementary information: Supplementary data are available at Bioinformatics online.
4

Canela, Núria Anela. "A pioneering multi-omics data platform sheds light on the understanding of biological systems." Project Repository Journal 20, no. 1 (July 4, 2024): 20–23. http://dx.doi.org/10.54050/prj2021863.

Abstract:
The GLOMICAVE project has developed an innovative multi-omics data analysis digital platform, relying on big data analytics and artificial intelligence and using large-scale publicly available and experimental omic datasets. The project aimed to maximise the utility of omic data at a massive level and discover new links between animal and vegetable genotype and phenotype, understanding biological systems as a whole.
5

Lancaster, Samuel M., Akshay Sanghi, Si Wu, and Michael P. Snyder. "A Customizable Analysis Flow in Integrative Multi-Omics." Biomolecules 10, no. 12 (November 27, 2020): 1606. http://dx.doi.org/10.3390/biom10121606.

Abstract:
The number of researchers using multi-omics is growing. Though still expensive, every year it is cheaper to perform multi-omic studies, often exponentially so. In addition to its increasing accessibility, multi-omics reveals a view of systems biology to an unprecedented depth. Thus, multi-omics can be used to answer a broad range of biological questions in finer resolution than previous methods. We used six omic measurements—four nucleic acid (i.e., genomic, epigenomic, transcriptomics, and metagenomic) and two mass spectrometry (proteomics and metabolomics) based—to highlight an analysis workflow on this type of data, which is often vast. This workflow is not exhaustive of all the omic measurements or analysis methods, but it will provide an experienced or even a novice multi-omic researcher with the tools necessary to analyze their data. This review begins with analyzing a single ome and study design, and then synthesizes best practices in data integration techniques that include machine learning. Furthermore, we delineate methods to validate findings from multi-omic integration. Ultimately, multi-omic integration offers a window into the complexity of molecular interactions and a comprehensive view of systems biology.
6

Morota, Gota. "30 Mutli-omic data integration in quantitative genetics." Journal of Animal Science 97, Supplement_2 (July 2019): 15. http://dx.doi.org/10.1093/jas/skz122.027.

Abstract:
The advent of high-throughput technologies has generated diverse omic data including single-nucleotide polymorphisms, copy-number variation, gene expression, methylation, and metabolites. The next major challenge is how to integrate those multi-omic data for downstream analyses to enhance our biological insights. This emerging approach is known as multi-omic data integration, which is in contrast to studying each omic data type independently. I will discuss challenging issues in developing algorithms and methods for multi-omic data integration. The particular focus will be given to the potential for combining diverse types of FAANG data and the utility of multi-omic data integration in association analysis and phenotypic prediction.
7

Escriba-Montagut, Xavier, Yannick Marcon, Augusto Anguita-Ruiz, Demetris Avraam, Jose Urquiza, Andrei S. Morgan, Rebecca C. Wilson, Paul Burton, and Juan R. Gonzalez. "Federated privacy-protected meta- and mega-omics data analysis in multi-center studies with a fully open-source analytic platform." PLOS Computational Biology 20, no. 12 (December 9, 2024): e1012626. https://doi.org/10.1371/journal.pcbi.1012626.

Abstract:
The importance of maintaining data privacy and complying with regulatory requirements is highlighted especially when sharing omic data between different research centers. This challenge is even more pronounced in the scenario where a multi-center effort for collaborative omics studies is necessary. OmicSHIELD is introduced as an open-source tool aimed at overcoming these challenges by enabling privacy-protected federated analysis of sensitive omic data. In order to ensure this, multiple security mechanisms have been included in the software. This innovative tool is capable of managing a wide range of omic data analyses specifically tailored to biomedical research. These include genome and epigenome wide association studies and differential gene expression analyses. OmicSHIELD is designed to support both meta- and mega-analysis, so that it offers a wide range of capabilities for different analysis designs. We present a series of use cases illustrating some examples of how the software addresses real-world analyses of omic data.
8

Meunier, Lea, Guillaume Appe, Abdelkader Behdenna, Valentin Bernu, Helia Brull Corretger, Prashant Dhillon, Eleonore Fox, et al. "Abstract 6209: From data disparity to data harmony: A comprehensive pan-cancer omics data collection." Cancer Research 84, no. 6_Supplement (March 22, 2024): 6209. http://dx.doi.org/10.1158/1538-7445.am2024-6209.

Abstract:
In cancer research, the exponential growth of omics datasets offers a significant opportunity for scientific advancement. However, challenges such as the lack of uniform standards, in both clinical and omic data, hinder the effective utilization of these datasets, thus impeding our understanding of cancer biology and the development of innovative therapeutic approaches. Addressing these challenges, we have created a novel collection of pan-cancer omics datasets with extensive clinical data harmonization and consistent omic data normalization. Here, we focused on patient-derived gene expression microarray datasets from the Gene Expression Omnibus database. To navigate the complexities presented by the diverse clinical descriptions inherent in these datasets, we leveraged our proprietary ontology, machine learning models, and domain expert quality control processes to homogenize the clinical data elements. Datasets were then selected based on sample composition, molecular data compatibility, and clinical data availability, then passed through a uniform preprocessing and normalization pipeline to maximize data quality. Finally, gene names were aligned on a single annotation reference, and potential batch effects were adjusted before expression data were merged together. We obtained a total of 32,825 transcriptomic sample profiles from 470 datasets, covering 13,435 genes and 45 clinical data elements, across 30 cancer types. Healthy tissue was favored over adjacent tissue, to minimize the risk of introducing biases related to cancer patient background genomic profiles into downstream analyses. We compared our collection with The Cancer Genome Atlas (TCGA), the most commonly used RNA-seq transcriptomic dataset in cancer research. It covers 30 out of the 33 TCGA cancer types, with on average 4.2 times more samples per cancer type ([0.3; 45.5], median 3.4).
Despite the two data collections being based on distinct technologies, we observed a Pearson correlation of 0.69 over the 11,753 genes in common, and a 100% overlap of the differentially expressed genes between genders. This consistency highlights cross-technology reliability and complementarity. We have built and continuously enriched a comprehensive dataset collection enabling the secondary analysis of high-quality omic data. This initial work, focused on microarray datasets, allows us to streamline design, exploration and validation of various omics data-driven studies in cancer research. Our ongoing efforts involve not only the continued integration of microarray datasets but also the integration of pan-cancer RNA-seq and single-cell data. This initiative is set to expand further, encompassing a broader range of omics datasets in the future. Citation Format: Lea Meunier, Guillaume Appe, Abdelkader Behdenna, Valentin Bernu, Helia Brull Corretger, Prashant Dhillon, Eleonore Fox, Julien Haziza, Charles Lescure, Camille Marijon, Clemence Petit, Solene Weill, Akpeli Nordor. From data disparity to data harmony: A comprehensive pan-cancer omics data collection [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 6209.
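The cross-technology comparison in this abstract rests on a Pearson correlation computed over the genes the two collections share. The abstract does not state which per-gene summary was correlated, so the sketch below assumes one numeric value per gene per collection; the function names, gene values, and dictionary layout are hypothetical and purely illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlation_over_shared_genes(profile_a, profile_b):
    """Restrict both profiles to their common genes, then correlate them."""
    shared = sorted(profile_a.keys() & profile_b.keys())
    r = pearson([profile_a[g] for g in shared], [profile_b[g] for g in shared])
    return r, len(shared)


# Hypothetical per-gene summaries (for example, mean expression) per collection.
microarray = {"TP53": 1.0, "BRCA1": 2.0, "EGFR": 3.0, "MYC": 9.0}
rnaseq = {"TP53": 2.0, "BRCA1": 4.0, "EGFR": 6.0, "KRAS": 1.0}

r, n_shared = correlation_over_shared_genes(microarray, rnaseq)
print(f"r = {r:.2f} over {n_shared} shared genes")
```

Genes present in only one collection (MYC and KRAS in the toy data) are dropped before the correlation is computed, mirroring the restriction to "genes in common" in the abstract.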
9

Quackenbush, John. "Data standards for 'omic' science." Nature Biotechnology 22, no. 5 (May 2004): 613–14. http://dx.doi.org/10.1038/nbt0504-613.

10

Boekel, Jorrit, John M. Chilton, Ira R. Cooke, Peter L. Horvatovich, Pratik D. Jagtap, Lukas Käll, Janne Lehtiö, Pieter Lukasse, Perry D. Moerland, and Timothy J. Griffin. "Multi-omic data analysis using Galaxy." Nature Biotechnology 33, no. 2 (February 2015): 137–39. http://dx.doi.org/10.1038/nbt.3134.

11

Krittanawong, Chayakrit. "Big Data Analytics, the Microbiome, Host-omic and Bug-omic Data and Risk for Cardiovascular Disease." Heart, Lung and Circulation 27, no. 3 (March 2018): e26-e27. http://dx.doi.org/10.1016/j.hlc.2017.07.012.

12

Zhu, Shuwei, Wenping Wang, Wei Fang, and Meiji Cui. "Autoencoder-assisted latent representation learning for survival prediction and multi-view clustering on multi-omics cancer subtyping." Mathematical Biosciences and Engineering 20, no. 12 (2023): 21098–119. http://dx.doi.org/10.3934/mbe.2023933.

Abstract:
Cancer subtyping (or cancer subtype identification) based on multi-omics data has played an important role in advancing diagnosis, prognosis and treatment, which has triggered the development of advanced multi-view clustering algorithms. However, the high dimensionality and heterogeneity of multi-omics data strongly affect the performance of these methods. In this paper, we propose to learn an informative latent representation based on an autoencoder (AE) to naturally capture nonlinear omic features in lower dimensions, which is helpful for identifying the similarity of patients. Moreover, to take advantage of survival information or clinical information, a multi-omic survival analysis approach is embedded when integrating the similarity graph of heterogeneous data at the multi-omics level. Then, the clustering method is performed on the integrated similarity to generate subtype groups. In the experimental part, the effectiveness of the proposed framework is confirmed by evaluating five different multi-omics datasets, taken from The Cancer Genome Atlas. The results show that the AE-assisted multi-omics clustering method can identify clinically significant cancer subtypes.
13

Demirel, Habibe Cansu, Muslum Kaan Arici, and Nurcan Tuncbag. "Computational approaches leveraging integrated connections of multi-omic data toward clinical applications." Molecular Omics 18, no. 1 (2022): 7–18. http://dx.doi.org/10.1039/d1mo00158b.

Abstract:
Data integration approaches are crucial for transforming multi-omic data sets into clinically interpretable knowledge. This review presents a detailed and extensive guideline to catalog the recent computational multi-omic data integration methods.
14

Chu, Su, Mengna Huang, Rachel Kelly, Elisa Benedetti, Jalal Siddiqui, Oana Zeleznik, Alexandre Pereira, et al. "Integration of Metabolomic and Other Omics Data in Population-Based Study Designs: An Epidemiological Perspective." Metabolites 9, no. 6 (June 18, 2019): 117. http://dx.doi.org/10.3390/metabo9060117.

Abstract:
It is not controversial that study design considerations and challenges must be addressed when investigating the linkage between single omic measurements and human phenotypes. It follows that such considerations are just as critical, if not more so, in the context of multi-omic studies. In this review, we discuss (1) epidemiologic principles of study design, including selection of biospecimen source(s) and the implications of the timing of sample collection, in the context of a multi-omic investigation, and (2) the strengths and limitations of various techniques of data integration across multi-omic data types that may arise in population-based studies utilizing metabolomic data.
15

Chorna, Nataliya, and Filipa Godoy-Vitorino. "A Protocol for the Multi-Omic Integration of Cervical Microbiota and Urine Metabolomics to Understand Human Papillomavirus (HPV)-Driven Dysbiosis." Biomedicines 8, no. 4 (April 8, 2020): 81. http://dx.doi.org/10.3390/biomedicines8040081.

Abstract:
The multi-omic integration of microbiota data with metabolomics has gained popularity. This protocol is based on a human multi-omics study, integrating cervicovaginal microbiota, HPV status and neoplasia, with urinary metabolites. Indeed, to understand the biology of the infections and to develop adequate interventions for cervical cancer prevention, studies are needed to characterize in detail the cervical microbiota and understand the systemic metabolome. This article is a detailed protocol for the multi-omic integration of cervical microbiota and urine metabolome to shed light on the systemic effects of cervical dysbioses associated with Human Papillomavirus (HPV) infections. This methods article suggests detailed sample collection and laboratory processes of metabolomics, DNA extraction for microbiota, HPV typing, and the bioinformatic analyses of the data, both to characterize the metabolome, the microbiota, and joint multi-omic analyses, useful for the development of new point-of-care diagnostic tests based on these approaches.
16

Shah, Tariq, Jinsong Xu, Xiling Zou, Yong Cheng, Mubasher Nasir, and Xuekun Zhang. "Omics Approaches for Engineering Wheat Production under Abiotic Stresses." International Journal of Molecular Sciences 19, no. 8 (August 14, 2018): 2390. http://dx.doi.org/10.3390/ijms19082390.

Abstract:
Abiotic stresses caused by environmental factors such as drought, salt, water submergence and heavy metals greatly influence wheat productivity. Effective management at the molecular level is mandatory for a thorough understanding of plant response to abiotic stress. Understanding the molecular mechanism of stress tolerance is complex and requires information at the omic level. Enormous progress has been made in the omics field in the areas of genomics, transcriptomics and proteomics. The rising field of ionomics is also being utilized for examining abiotic stress resilience in wheat. Omic approaches produce a huge amount of data, and sufficient developments in computational tools have been accomplished for efficient analysis. However, the integration of omic-scale information to address complex genetic and physiological questions is still a challenge. In this review, we report advances in omic tools from the perspective of conventional and present-day approaches being utilized to dissect abiotic stress tolerance in wheat. Attention was given to methodologies such as quantitative trait loci (QTL), genome-wide association studies (GWAS) and genomic selection (GS). Comparative genomics and candidate gene methodologies are also discussed with regard to the identification of potential genomic loci, genes and biochemical pathways involved in stress resilience in wheat. This review also gives an extensive list of accessible online omic resources for wheat and their effective use. We have also addressed the significance of genomics in the integrated approach and identified high-throughput multi-dimensional phenotyping as a significant limiting factor for the enhancement of abiotic stress resistance in wheat.
17

Ali, Johar, and Ome Kalsoom Afridi. "Omic or Multi-omics Approach Can Save The Mankind." Current Trends in OMICS 1, no. 1 (August 16, 2021): 01–07. http://dx.doi.org/10.32350/cto.11.01.

Abstract:
The publication of the first draft of the human genome has led to the explosion of high-throughput technologies including genomics, epigenomics, transcriptomics, proteomics, and metabolomics, aiming to characterize the various biological molecules (DNA, RNA, proteins, and metabolites). These high-throughput technologies, collectively called omics, revolutionized medical research in the last two decades. The advent of next generation sequencing (NGS) reduced the time and economic cost of traditional sequencing and has led to the emergence of genomics as the first discipline of omics. Following the emergence of genomics, a number of projects such as The Cancer Genome Atlas (TCGA), 1000 Genome Project (1KGP), and the International Cancer Genome Consortium have been accomplished. These projects contributed significantly to the understanding of genetic variations in different cancers; for instance, TCGA produced over 2.5 petabytes of big data. Furthermore, the big data produced by these mega projects has been made publicly available to clinicians and researchers to fast-track the diagnosis and prognosis of complex rare diseases. In developed countries, a multi-omics approach has been applied holistically to clinical practice for the diagnosis and prognosis of various cancers and rare Mendelian diseases.
18

Pan, Jianqiao, Baoshan Ma, Xiaoyu Hou, Chongyang Li, Tong Xiong, Yi Gong, and Fengju Song. "The construction of transcriptional risk scores for breast cancer based on lightGBM and multiple omics data." Mathematical Biosciences and Engineering 19, no. 12 (2022): 12353–70. http://dx.doi.org/10.3934/mbe.2022576.

Abstract:
Background: Polygenic risk score (PRS) can evaluate the individual-level genetic risk of breast cancer. However, standalone single nucleotide polymorphism (SNP) data used for PRS may not provide satisfactory prediction accuracy. Additionally, current PRS models based on linear regression have insufficient power to leverage non-linear effects from thousands of associated SNPs. Here, we proposed a transcriptional risk score (TRS) based on multiple omics data to estimate the risk of breast cancer. Methods: The multiple omics data and clinical data of breast invasive carcinoma (BRCA) were collected from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO). First, we developed a novel TRS model for BRCA utilizing single omic data and the LightGBM algorithm. Subsequently, we built a combination model of TRS derived from each omic data type to further improve the prediction accuracy. Finally, we performed association analysis and prognosis prediction to evaluate the utility of the TRS generated by our method. Results: The proposed TRS model achieved better predictive performance than the linear models and other ML methods in a single omic dataset. An independent validation dataset also verified the effectiveness of our model. Moreover, the combination of the TRS can efficiently strengthen prediction accuracy. The analysis of prevalence and the associations of the TRS with phenotypes including case-control and cancer stage indicated that the risk of breast cancer increases with increasing TRS. The survival analysis also suggested that TRS for the cancer stage is an effective prognostic metric of breast cancer patients. Conclusions: Our proposed TRS model expanded the current definition of PRS from standalone SNP data to multiple omics data and outperformed the linear models, which may provide a powerful tool for diagnostic and prognostic prediction of breast cancer.
19

Sangaralingam, Ajanthah, Abu Z. Dayem Ullah, Jacek Marzec, Emanuela Gadaleta, Ai Nagano, Helen Ross-Adams, Jun Wang, Nicholas R. Lemoine, and Claude Chelala. "‘Multi-omic’ data analysis using O-miner." Briefings in Bioinformatics 20, no. 1 (August 4, 2017): 130–43. http://dx.doi.org/10.1093/bib/bbx080.

20

Madrid-Márquez, Laura, Cristina Rubio-Escudero, Beatriz Pontes, Antonio González-Pérez, José C. Riquelme, and Maria E. Sáez. "MOMIC: A Multi-Omics Pipeline for Data Analysis, Integration and Interpretation." Applied Sciences 12, no. 8 (April 14, 2022): 3987. http://dx.doi.org/10.3390/app12083987.

Abstract:
Background and Objectives: The burst of high-throughput omics technologies has given rise to a new era in systems biology, offering an unprecedented scenario for deriving meaningful biological knowledge through the integration of different layers of information. Methods: We have developed a new software tool, MOMIC, that guides the user through the application of different analyses on a wide range of omic data, from independent single-omics analysis to the combination of heterogeneous data at different molecular levels. Results: The proposed pipeline is developed as a collection of Jupyter notebooks that are easily editable, reproducible and well documented. It can be modified to accommodate new analysis workflows and data types. It is accessible via momic.us.es, and as a Docker project available on GitHub that can be locally installed. Conclusions: MOMIC offers a complete analysis environment for analysing and integrating multi-omics data in a single, easy-to-use platform.
21

von der Heyde, Silvia, Margarita Krawczyk, Julia Bischof, Thomas Corwin, Peter Frommolt, Jonathan Woodsmith, and Hartmut Juhl. "Clinically relevant multi-omic analysis of colorectal cancer." Journal of Clinical Oncology 38, no. 15_suppl (May 20, 2020): e16063-e16063. http://dx.doi.org/10.1200/jco.2020.38.15_suppl.e16063.

Abstract:
e16063 Background: Cancer is a highly heterogeneous disease, both intra- and inter-individually consisting of complex phenotypes and systems biology. Although genomic data has contributed greatly towards the identification of cancer-specific mutations and the progress of precision medicine, genomic alterations are only one of several important biological drivers of cancer. Furthermore, single-layer omics represent only a small piece of the cancer biology puzzle and provide only partial clues to connecting genotype with clinically relevant phenotypic data. A more integrated approach is urgently needed to unravel the underpinnings of molecular signatures and the phenotypic manifestation of cancer hallmarks. Methods: Here we characterize a colorectal cancer (CRC) cohort of 500 patients across multiple distinct omic data types. Across this CRC cohort, we defined clinically relevant whole genome sequencing based metrics such as micro-satellite-instability (MSI) status, and furthermore investigate gene expression at the transcript level using RNA-Seq, as well as at the proteomic level using tandem mass spectrometry. We further characterized a subgroup of 100 of these patients through 16s rRNA sequencing to identify associated microbiome profiles. Results: We combined these analyses with comprehensive clinical data to observe the impact of ascertained molecular signatures on the CRC patient cohort. Here, we report how patient survival correlates both with specific molecular events across individual omic data types, as well as with combined multi-omic analyses. Conclusions: This project highlights the utility of integrating multiple distinct data types to obtain a more comprehensive overview of the molecular mechanisms underpinning colo-rectal cancer. Furthermore, through combining identified aberrant molecular mechanisms with clinical reports, multi-omic data can be prioritized through their impact on patient cohort survival.
22

Badimon, Lina, Guiomar Mendieta, Soumaya Ben-Aicha, and Gemma Vilahur. "Post-Genomic Methodologies and Preclinical Animal Models: Chances for the Translation of Cardioprotection to the Clinic." International Journal of Molecular Sciences 20, no. 3 (January 25, 2019): 514. http://dx.doi.org/10.3390/ijms20030514.

Abstract:
Although many cardioprotective strategies have demonstrated benefits in animal models of myocardial infarction, they have failed to demonstrate cardioprotection in the clinical setting, highlighting that new therapeutic targets and treatment strategies aimed at reducing infarct size are urgently needed. Completion of the Human Genome Project in 2001 fostered the post-genomic research era with the consequent development of high-throughput “omics” platforms including transcriptomics, proteomics, and metabolomics. Implementation of these holistic approaches within the field of cardioprotection has enlarged our understanding of ischemia/reperfusion injury, with each approach capturing a different angle of the global picture of the disease. It has also contributed to identifying potential prognostic/diagnostic biomarkers and discovering novel molecular therapeutic targets. In this latter regard, “omic” data analysis in the setting of ischemic conditioning has made it possible to identify potential therapeutic candidates, including non-coding RNAs and molecular chaperones, amenable to pharmacological development. Such discoveries must be tested and validated in a relevant and reliable myocardial infarction animal model before moving towards the clinical setting. Moreover, efforts should also focus on integrating all “omic” datasets rather than working exclusively on a single “omic” approach. In the following manuscript, we will discuss the power of implementing “omic” approaches in preclinical animal models to identify novel molecular targets for cardioprotection of interest for drug development.
23

Palsson, Bernhard, and Karsten Zengler. "The challenges of integrating multi-omic data sets." Nature Chemical Biology 6, no. 11 (October 18, 2010): 787–89. http://dx.doi.org/10.1038/nchembio.462.

24

Yurkovich, James T., and Bernhard O. Palsson. "Quantitative -omic data empowers bottom-up systems biology." Current Opinion in Biotechnology 51 (June 2018): 130–36. http://dx.doi.org/10.1016/j.copbio.2018.01.009.

25

Yang, Xiaoxi, Yuqi Wen, Xinyu Song, Song He, and Xiaochen Bo. "Exploring the classification of cancer cell lines from multiple omic views." PeerJ 8 (August 18, 2020): e9440. http://dx.doi.org/10.7717/peerj.9440.

Abstract:
Background Cancer classification is of great importance to understanding its pathogenesis, making diagnoses, and developing treatments. The accumulation of extensive omics data from abundant cancer cell lines provides a basis for large-scale, low-cost classification of cancer. However, the reliability of cell lines as in vitro models of cancer has been controversial. Methods In this study, we explore the classification of pan-cancer cell lines with single and integrated multiple omics data from the Cancer Cell Line Encyclopedia (CCLE) database. The representative omics data of cancer (mRNA data, miRNA data, copy number variation data, DNA methylation data and reverse-phase protein array data) were taken into the analysis. The TumorMap web tool was used to illustrate the landscape of molecular classification. The molecular classification of patient samples was compared with that of cancer cell lines. Results Eighteen molecular clusters were identified using integrated multiple omics clustering. Three pan-cancer clusters were found in integrated multiple omics clustering. By comparing with single omics clustering, we found that integrated clustering could capture both shared and complementary information from each omics data type. Omics contribution analysis for clustering indicated that, although all five omics data types were of value, mRNA and proteomics data were particularly important. While the classifications were generally consistent, samples from cancer patients were more diverse than cancer cell lines. Conclusions The clustering analysis based on integrated omics data provides a novel multi-dimensional map of cancer cell lines that can reflect the extent to which pan-cancer cell lines represent primary tumors, and an approach to evaluate the importance of omic features in cancer classification.
26

Alizadeh, Madeline, Natalia Sampaio Moura, Alyssa Schledwitz, Seema A. Patil, Jacques Ravel, and Jean-Pierre Raufman. "Big Data in Gastroenterology Research." International Journal of Molecular Sciences 24, no. 3 (January 27, 2023): 2458. http://dx.doi.org/10.3390/ijms24032458.

Abstract:
Studying individual data types in isolation provides only limited and incomplete answers to complex biological questions and particularly falls short in revealing sufficient mechanistic and kinetic details. In contrast, multi-omics approaches to studying health and disease permit the generation and integration of multiple data types on a much larger scale, offering a comprehensive picture of biological and disease processes. Gastroenterology and hepatobiliary research are particularly well-suited to such analyses, given the unique position of the luminal gastrointestinal (GI) tract at the nexus between the gut (mucosa and luminal contents), brain, immune and endocrine systems, and GI microbiome. The generation of ‘big data’ from multi-omic, multi-site studies can enhance investigations into the connections between these organ systems and organisms and more broadly and accurately appraise the effects of dietary, pharmacological, and other therapeutic interventions. In this review, we describe a variety of useful omics approaches and how they can be integrated to provide a holistic depiction of the human and microbial genetic and proteomic changes underlying physiological and pathophysiological phenomena. We highlight the potential pitfalls and alternatives to help avoid the common errors in study design, execution, and analysis. We focus on the application, integration, and analysis of big data in gastroenterology and hepatobiliary research.
27

O'Hara, Eóin, André L. A. Neves, Yang Song, and Le Luo Guan. "The Role of the Gut Microbiome in Cattle Production and Health: Driver or Passenger?" Annual Review of Animal Biosciences 8, no. 1 (February 15, 2020): 199–220. http://dx.doi.org/10.1146/annurev-animal-021419-083952.

Abstract:
Ruminant production systems face significant challenges currently, driven by heightened awareness of their negative environmental impact and the rapidly rising global population. Recent findings have underscored how the composition and function of the rumen microbiome are associated with economically valuable traits, including feed efficiency and methane emission. Although omics-based technological advances in the last decade have revolutionized our understanding of host-associated microbial communities, there remains incongruence over the correct approach for analysis of large omic data sets. A global approach that examines host/microbiome interactions in both the rumen and the lower digestive tract is required to harness the full potential of the gastrointestinal microbiome for sustainable ruminant production. This review highlights how the ruminant animal production community may identify and exploit the causal relationships between the gut microbiome and host traits of interest for a practical application of omic data to animal health and production.
28

Yugi, Katsuyuki, Satoshi Ohno, James R. Krycer, David E. James, and Shinya Kuroda. "Rate-oriented trans-omics: integration of multiple omic data on the basis of reaction kinetics." Current Opinion in Systems Biology 15 (June 2019): 109–20. http://dx.doi.org/10.1016/j.coisb.2019.04.005.

29

Baena-Miret, Sergi, Ferran Reverter, and Esteban Vegas. "A framework for block-wise missing data in multi-omics." PLOS ONE 19, no. 7 (July 23, 2024): e0307482. http://dx.doi.org/10.1371/journal.pone.0307482.

Abstract:
High-throughput technologies have generated vast amounts of omic data. It is a consensus that the integration of diverse omics sources improves predictive models and biomarker discovery. However, managing multiple omics data poses challenges such as data heterogeneity, noise, high dimensionality and missing data, especially in block-wise patterns. This study addresses the challenges of high dimensionality and block-wise missing data through a regularization- and constraint-based approach. The methodology is implemented in the R package bwm for binary and continuous response variables, and applied to breast cancer and exposome multi-omics datasets, achieving strong performance even in scenarios with missing data present in all omics. In the binary classification task, our proposed model achieves accuracy in the range of 86% to 92% and F1 in the range of 68% to 79%. In the regression task, the correlation between true and predicted responses is in the range of 72% to 76%. However, there is a slight decline in performance metrics as the percentage of missing data increases. In scenarios where block-wise missing data affects multiple omics, the model performance actually surpasses that of scenarios where missing data is present in only one omics. One possible explanation for this might be that the other scenarios introduce a greater diversity of observation profiles, leading to a more robust model. Depending on the specific omics being studied, there is greater consistency in feature selection when comparing block-wise missing data scenarios.
30

Futorian, David, Oren Fischman, Gali Arad, Nitzan Simchi, Omri Erez, Eran Seger, Rozanne Groen, and Kirill Pevzner. "Abstract 5410: Predictive biomarker discovery method to bridge the gap between preclinical disease model dose-response and clinical trials." Cancer Research 83, no. 7_Supplement (April 4, 2023): 5410. http://dx.doi.org/10.1158/1538-7445.am2023-5410.

Abstract:
The translational gap of drug and biomarker discovery remains one of the biggest challenges in the pharmaceutical industry. On one hand, high-throughput screening and omics methods facilitate the generation of in-vitro and in-vivo dose-response data for a particular drug or combination. On the other hand, clinical omics techniques allow for the creation of large-scale treatment-naive clinical datasets, with the TCGA and CPTAC as prominent examples. However, the question of which disease model best represents the response/resistance mechanisms of an oncology patient remains challenging. In this study we present a machine learning (ML) based computational technique for integrating omics from preclinical dose-response studies with clinical treatment-naive samples to create putative predictive biomarkers, and exemplify its application in 2 case studies: PARP and AKT inhibitors. We utilize the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (GDSC) as a resource for in-vitro dose-response data for PARPi and AKTi, coupled with multi-omic molecular data. We utilize the treatment-naive molecular characterization from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data of breast cancer patients. Multi-omic analysis of these datasets derived 2 putative predictive biomarkers for PARPi and AKTi. The multi-omic analysis resembles the data availability at the critical drug development stage of transition from preclinical response models to a clinical trial. These biomarkers are tested on a validation dataset obtained from the iSPY adaptive clinical trial and achieve superior results compared to the original enrolment biomarkers and even the retrospectively derived biomarkers. This study presents a novel computational approach to bridge the gap between preclinical dose-response and clinical datasets and suggests an efficient way to discover predictive biomarkers based on data accessible in a preclinical setting.
Citation Format: David Futorian, Oren Fischman, Gali Arad, Nitzan Simchi, Omri Erez, Eran Seger, Rozanne Groen, Kirill Pevzner. Predictive biomarker discovery method to bridge the gap between preclinical disease model dose-response and clinical trials. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5410.
31

Kemmo Tsafack, Ulrich Kemmo, Kwang Woo Ahn, Anne E. Kwitek, and Chien-Wei Lin. "Meta-Analytic Gene-Clustering Algorithm for Integrating Multi-Omics and Multi-Study Data." Bioengineering 11, no. 6 (June 8, 2024): 587. http://dx.doi.org/10.3390/bioengineering11060587.

Abstract:
Gene pathways and gene-regulatory networks are used to describe the causal relationship between genes, based on biological experiments. However, many genes are still to be studied to define novel pathways. To address this, a gene-clustering algorithm has been used to group correlated genes together, based on the similarity of their gene expression level. The existing methods cluster genes based on only one type of omics data, which ignores the information from other types. A large sample size is required to achieve an accurate clustering structure for thousands of genes, which can be challenging due to the cost of multi-omics data. Meta-analysis has been used to aggregate the data from multiple studies and improve the analysis results. We propose a computationally efficient meta-analytic gene-clustering algorithm that combines multi-omics datasets from multiple studies, using the fixed effects linear models and a modified weighted correlation network analysis framework. The simulation study shows that the proposed method outperforms existing single omic-based clustering approaches when multi-omics data and/or multiple studies are available. A real data example demonstrates that our meta-analytic method outperforms single-study based methods.
32

Li, Jin, Feng Chen, Hong Liang, and Jingwen Yan. "MoNET: an R package for multi-omic network analysis." Bioinformatics 38, no. 4 (October 25, 2021): 1165–67. http://dx.doi.org/10.1093/bioinformatics/btab722.

Abstract:
Motivation The increasing availability of multi-omic data has enabled the discovery of disease biomarkers at different scales. Understanding the functional interaction between multi-omic biomarkers is becoming increasingly important due to its great potential for providing insights into the underlying molecular mechanism. Results Leveraging multiple biological network databases, we integrated the relationship between single nucleotide polymorphisms (SNPs), genes/proteins and metabolites, and developed an R package, Multi-omic Network Explorer Tool (MoNET), for multi-omic network analysis. This new tool enables users not only to track down the interaction of SNPs/genes at the metabolome level, but also to trace back to the potential risk variants/regulators given altered genes/metabolites. MoNET is expected to advance our understanding of multi-omic findings by unveiling their transomic interactions and is likely to generate new hypotheses for further validation. Availability and implementation The MoNET package is freely available on https://github.com/JW-Yan/MONET. Supplementary information Supplementary data are available at Bioinformatics online.
33

Zhou, Juexiao, Siyuan Chen, Yulian Wu, Haoyang Li, Bin Zhang, Longxi Zhou, Yan Hu, et al. "PPML-Omics: A privacy-preserving federated machine learning method protects patients’ privacy in omic data." Science Advances 10, no. 5 (February 2, 2024). http://dx.doi.org/10.1126/sciadv.adh8601.

Abstract:
Modern machine learning models toward various tasks with omic data analysis give rise to threats of privacy leakage of patients involved in those datasets. Here, we proposed a secure and privacy-preserving machine learning method (PPML-Omics) by designing a decentralized differential private federated learning algorithm. We applied PPML-Omics to analyze data from three sequencing technologies and addressed the privacy concern in three major tasks of omic data under three representative deep learning models. We examined privacy breaches in depth through privacy attack experiments and demonstrated that PPML-Omics could protect patients’ privacy. In each of these applications, PPML-Omics was able to outperform methods of comparison under the same level of privacy guarantee, demonstrating the versatility of the method in simultaneously balancing the privacy-preserving capability and utility in omic data analysis. Furthermore, we gave the theoretical proof of the privacy-preserving capability of PPML-Omics, suggesting the first mathematically guaranteed method with robust and generalizable empirical performance in protecting patients’ privacy in omic data.
34

Itai, Yonatan, Nimrod Rappoport, and Ron Shamir. "Integration of gene expression and DNA methylation data across different experiments." Nucleic Acids Research, July 3, 2023. http://dx.doi.org/10.1093/nar/gkad566.

Abstract:
Integrative analysis of multi-omic datasets has proven to be extremely valuable in cancer research and precision medicine. However, obtaining multimodal data from the same samples is often difficult. Integrating multiple datasets of different omics remains a challenge, with only a few available algorithms developed to solve it. Here, we present INTEND (IntegratioN of Transcriptomic and EpigeNomic Data), a novel algorithm for integrating gene expression and DNA methylation datasets covering disjoint sets of samples. To enable integration, INTEND learns a predictive model between the two omics by training on multi-omic data measured on the same set of samples. In comprehensive testing on 11 TCGA (The Cancer Genome Atlas) cancer datasets spanning 4329 patients, INTEND achieves significantly superior results compared with four state-of-the-art integration algorithms. We also demonstrate INTEND’s ability to uncover connections between DNA methylation and the regulation of gene expression in the joint analysis of two lung adenocarcinoma single-omic datasets from different sources. INTEND’s data-driven approach makes it a valuable multi-omic data integration tool. The code for INTEND is available at https://github.com/Shamir-Lab/INTEND.
35

Flores, Javier E., Daniel M. Claborne, Zachary D. Weller, Bobbie-Jo M. Webb-Robertson, Katrina M. Waters, and Lisa M. Bramer. "Missing data in multi-omics integration: Recent advances through artificial intelligence." Frontiers in Artificial Intelligence 6 (February 9, 2023). http://dx.doi.org/10.3389/frai.2023.1098308.

Abstract:
Biological systems function through complex interactions between various ‘omics (biomolecules), and a more complete understanding of these systems is only possible through an integrated, multi-omic perspective. This has presented the need for the development of integration approaches that are able to capture the complex, often non-linear, interactions that define these biological systems and are adapted to the challenges of combining the heterogeneous data across ‘omic views. A principal challenge to multi-omic integration is missing data, because not all biomolecules are measured in all samples. Owing to cost, instrument sensitivity, or other experimental factors, data for a biological sample may be missing for one or more ‘omic technologies. Recent methodological developments in artificial intelligence and statistical learning have greatly facilitated the analyses of multi-omics data; however, many of these techniques assume access to completely observed data. A subset of these methods incorporate mechanisms for handling partially observed samples, and these methods are the focus of this review. We describe recently developed approaches, noting their primary use cases and highlighting each method's approach to handling missing data. We additionally provide an overview of the more traditional missing data workflows and their limitations, and we discuss potential avenues for further developments as well as how the missing data issue and its current solutions may generalize beyond the multi-omics context.
36

Drouard, Gabin, Juha Mykkänen, Jarkko Heiskanen, Joona Pohjonen, Saku Ruohonen, Katja Pahkala, Terho Lehtimäki, et al. "Exploring machine learning strategies for predicting cardiovascular disease risk factors from multi-omic data." BMC Medical Informatics and Decision Making 24, no. 1 (May 2, 2024). http://dx.doi.org/10.1186/s12911-024-02521-3.

Abstract:
Background Machine learning (ML) classifiers are increasingly used for predicting cardiovascular disease (CVD) and related risk factors using omics data, although these outcomes often exhibit categorical nature and class imbalances. However, little is known about which ML classifier, omics data, or upstream dimension reduction strategy has the strongest influence on prediction quality in such settings. Our study aimed to illustrate and compare different machine learning strategies to predict CVD risk factors under different scenarios. Methods We compared the use of six ML classifiers in predicting CVD risk factors using blood-derived metabolomics, epigenetics and transcriptomics data. Upstream omic dimension reduction was performed using either unsupervised or semi-supervised autoencoders, whose downstream ML classifier performance we compared. CVD risk factors included systolic and diastolic blood pressure measurements and ultrasound-based biomarkers of left ventricular diastolic dysfunction (LVDD; E/e' ratio, E/A ratio, LAVI) collected from 1,249 Finnish participants, of whom 80% were used for model fitting. We predicted individuals with low, high or average levels of CVD risk factors, the latter class being the most common. We constructed multi-omic predictions using a meta-learner that weighted single-omic predictions. Model performance comparisons were based on the F1 score. Finally, we investigated whether learned omic representations from pre-trained semi-supervised autoencoders could improve outcome prediction in an external cohort using transfer learning. Results Depending on the ML classifier or omic used, the quality of single-omic predictions varied. Multi-omics predictions outperformed single-omics predictions in most cases, particularly in the prediction of individuals with high or low CVD risk factor levels. Semi-supervised autoencoders improved downstream predictions compared to the use of unsupervised autoencoders. In addition, median gains in Area Under the Curve by transfer learning compared to modelling from scratch ranged from 0.09 to 0.14 and 0.07 to 0.11 units for transcriptomic and metabolomic data, respectively. Conclusions By illustrating the use of different machine learning strategies in different scenarios, our study provides a platform for researchers to evaluate how the choice of omics, ML classifiers, and dimension reduction can influence the quality of CVD risk factor predictions.
37

Arehart, Christopher H., John D. Sterrett, Rosanna L. Garris, Ruth E. Quispe-Pilco, Christopher R. Gignoux, Luke M. Evans, and Maggie A. Stanislawski. "Poly-omic risk scores predict inflammatory bowel disease diagnosis." mSystems, December 14, 2023. http://dx.doi.org/10.1128/msystems.00677-23.

Abstract:
Inflammatory bowel disease (IBD) is characterized by complex etiology and a disrupted colonic ecosystem. We provide a framework for the analysis of multi-omic data, which we apply to study the gut ecosystem in IBD. Specifically, we train and validate models using data on the metagenome, metatranscriptome, virome, and metabolome from the Human Microbiome Project 2 IBD multi-omic database, with 1,785 repeated samples from 130 individuals (103 cases and 27 controls). After splitting the participants into training and testing groups, we used mixed-effects least absolute shrinkage and selection operator regression to select features for each omic. These features, with demographic covariates, were used to generate separate single-omic prediction scores. All four single-omic scores were then combined into a final regression to assess the relative importance of the individual omics and the predictive benefits when considered together. We identified several species, pathways, and metabolites known to be associated with IBD risk, and we explored the connections between data sets. Individually, metabolomic and viromic scores were more predictive than metagenomics or metatranscriptomics, and when all four scores were combined, we predicted disease diagnosis with a Nagelkerke’s R² of 0.46 and an area under the curve of 0.80 (95% confidence interval: 0.63, 0.98). Our work supports that some single-omic models for complex traits are more predictive than others, that incorporating multiple omic data sets may improve prediction, and that each omic data type provides a combination of unique and redundant information. This modeling framework can be extended to other complex traits and multi-omic data sets. IMPORTANCE Complex traits are characterized by many biological and environmental factors, such that multi-omic data sets are well-positioned to help us understand their underlying etiologies. We applied a prediction framework across multiple omics (metagenomics, metatranscriptomics, metabolomics, and viromics) from the gut ecosystem to predict inflammatory bowel disease (IBD) diagnosis. The predicted scores from our models highlighted key features and allowed us to compare the relative utility of each omic data set in single-omic versus multi-omic models. Our results emphasized the importance of metabolomics and viromics over metagenomics and metatranscriptomics for predicting IBD status. The greater predictive capability of metabolomics and viromics is likely because these omics serve as markers of lifestyle factors such as diet. This study provides a modeling framework for multi-omic data, and our results show the utility of combining multiple omic data types to disentangle complex disease etiologies and biological signatures.
38

Downing, Tim, and Nicos Angelopoulos. "A primer on correlation-based dimension reduction methods for multi-omics analysis." Journal of The Royal Society Interface 20, no. 207 (October 2023). http://dx.doi.org/10.1098/rsif.2023.0344.

Abstract:
The continuing advances of omic technologies mean that it is now more tangible to measure the numerous features collectively reflecting the molecular properties of a sample. When multiple omic methods are used, statistical and computational approaches can exploit these large, connected profiles. Multi-omics is the integration of different omic data sources from the same biological sample. In this review, we focus on correlation-based dimension reduction approaches for single omic datasets, followed by methods for pairs of omics datasets, before detailing further techniques for three or more omic datasets. We also briefly detail network methods when three or more omic datasets are available and which complement correlation-oriented tools. To aid readers new to this area, these are all linked to relevant R packages that can implement these procedures. Finally, we discuss scenarios of experimental design and present road maps that simplify the selection of appropriate analysis methods. This review will help researchers navigate emerging methods for multi-omics and integrating diverse omic datasets appropriately. This raises the opportunity of implementing population multi-omics with large sample sizes as omics technologies and our understanding improve.
39

Liu, Yufang, Yongkai Chen, Haoran Lu, Wenxuan Zhong, Guo-Cheng Yuan, and Ping Ma. "Orthogonal multimodality integration and clustering in single-cell data." BMC Bioinformatics 25, no. 1 (April 25, 2024). http://dx.doi.org/10.1186/s12859-024-05773-y.

Abstract:
Multimodal integration combines information from different sources or modalities to gain a more comprehensive understanding of a phenomenon. The challenges in multi-omics data analysis lie in the complexity, high dimensionality, and heterogeneity of the data, which demand sophisticated computational tools and visualization methods for proper interpretation of multi-omics data. In this paper, we propose a novel method, termed Orthogonal Multimodality Integration and Clustering (OMIC), for analyzing CITE-seq data. Our approach enables researchers to integrate multiple sources of information while accounting for the dependence among them. We demonstrate the effectiveness of our approach using CITE-seq data sets for cell clustering. Our results show that our approach outperforms existing methods in terms of accuracy, computational efficiency, and interpretability. We conclude that our proposed OMIC method provides a powerful tool for multimodal data analysis that greatly improves the feasibility and reliability of integrated data.
40

Hernández-Lemus, Enrique, and Soledad Ochoa. "Methods for multi-omic data integration in cancer research." Frontiers in Genetics 15 (September 19, 2024). http://dx.doi.org/10.3389/fgene.2024.1425456.

Abstract:
Multi-omics data integration is a term that refers to the process of combining and analyzing data from different omic experimental sources, such as genomics, transcriptomics, methylation assays, and microRNA sequencing, among others. Such data integration approaches have the potential to provide a more comprehensive functional understanding of biological systems and have numerous applications in areas such as disease diagnosis, prognosis and therapy. However, quantitative integration of multi-omic data is a complex task that requires the use of highly specialized methods and approaches. Here, we discuss a number of data integration methods that have been developed with multi-omics data in view, including statistical methods, machine learning approaches, and network-based approaches. We also discuss the challenges and limitations of such methods and provide examples of their applications in the literature. Overall, this review aims to provide an overview of the current state of the field and highlight potential directions for future research.
41

Nardini, Christine, Jennifer Dent, and Paolo Tieri. "Editorial: Multi-omic data integration." Frontiers in Cell and Developmental Biology 3 (July 7, 2015). http://dx.doi.org/10.3389/fcell.2015.00046.

42

Muller, Efrat, Itamar Shiryan, and Elhanan Borenstein. "Multi-omic integration of microbiome data for identifying disease-associated modules." Nature Communications 15, no. 1 (March 23, 2024). http://dx.doi.org/10.1038/s41467-024-46888-3.

Abstract:
Multi-omic studies of the human gut microbiome are crucial for understanding its role in disease across multiple functional layers. Nevertheless, integrating and analyzing such complex datasets poses significant challenges. Most notably, current analysis methods often yield extensive lists of disease-associated features (e.g., species, pathways, or metabolites), without capturing the multi-layered structure of the data. Here, we address this challenge by introducing “MintTea”, an intermediate integration-based approach combining canonical correlation analysis extensions, consensus analysis, and an evaluation protocol. MintTea identifies “disease-associated multi-omic modules”, comprising features from multiple omics that shift in concord and that collectively associate with the disease. Applied to diverse cohorts, MintTea captures modules with high predictive power, significant cross-omic correlations, and alignment with known microbiome-disease associations. For example, analyzing samples from a metabolic syndrome study, MintTea identifies a module with serum glutamate- and TCA cycle-related metabolites, along with bacterial species linked to insulin resistance. In another dataset, MintTea identifies a module associated with late-stage colorectal cancer, including Peptostreptococcus and Gemella species and fecal amino acids, in line with these species’ metabolic activity and their coordinated gradual increase with cancer development. This work demonstrates the potential of advanced integration methods in generating systems-level, multifaceted hypotheses underlying microbiome-disease interactions.
43

Zhang, Qiang, Xiang-He Meng, Chuan Qiu, Hui Shen, Qi Zhao, Lan-Juan Zhao, Qing Tian, Chang-Qing Sun, and Hong-Wen Deng. "Integrative analysis of multi-omics data to detect the underlying molecular mechanisms for obesity in vivo in humans." Human Genomics 16, no. 1 (May 14, 2022). http://dx.doi.org/10.1186/s40246-022-00388-x.

Abstract:
Background: Obesity is a complex, multifactorial condition in which genetics plays an important role. Most systematic studies currently focus on an individual omics aspect and provide insightful yet limited knowledge about the comprehensive and complex crosstalk between various omics levels. Subjects and methods: Therefore, we performed a comprehensive trans-omics study with various omics data from 104 subjects, to identify interactions/networks and particularly causal regulatory relationships within and especially those between omic molecules, with the purpose of discovering molecular genetic mechanisms underlying obesity etiology in vivo in humans. Results: By applying differential analysis, we identified 8 differentially expressed hub genes (DEHGs), 14 differentially methylated regions (DMRs) and 12 differentially accumulated metabolites (DAMs) for obesity individually. By integrating those multi-omics biomarkers using Mendelian Randomization (MR) and network MR analyses, we identified 18 causal pathways with mediation effect. For the 20 biomarkers involved in those 18 pairs, 17 biomarkers were implicated in the pathophysiology of obesity or related diseases. Conclusions: The integration of trans-omics and MR analyses may provide us a holistic understanding of the underlying functional mechanisms, molecular regulatory information flow and the interactive molecular systems among different omic molecules for obesity risk and other complex diseases/traits.
44

Madhumita, Archit Dwivedi, and Sushmita Paul. "Recursive integration of synergised graph representations of multi-omics data for cancer subtypes identification." Scientific Reports 12, no. 1 (September 17, 2022). http://dx.doi.org/10.1038/s41598-022-17585-2.

Abstract:
Cancer subtypes identification is one of the critical steps toward advancing personalized anti-cancerous therapies. Accumulation of a massive amount of multi-platform omics data measured across the same set of samples provides an opportunity to look into this deadly disease from several views simultaneously. Few integrative clustering approaches are developed to capture shared information from all the views to identify cancer subtypes. However, they have certain limitations. The challenge here is identifying the most relevant feature space from each omic view and systematically integrating them. Both the steps should lead toward a global clustering solution with biological significance. In this respect, a novel multi-omics clustering algorithm named RISynG (Recursive Integration of Synergised Graph-representations) is presented in this study. RISynG represents each omic view as two representation matrices that are Gramian and Laplacian. A parameterised combination function is defined to obtain a synergy matrix from these representation matrices. Then a recursive multi-kernel approach is applied to integrate the most relevant, shared, and complementary information captured via the respective synergy matrices. At last, clustering is applied to the integrated subspace. RISynG is benchmarked on five multi-omics cancer datasets taken from The Cancer Genome Atlas. The experimental results demonstrate RISynG’s efficiency over the other approaches in this domain.
45

S, Kishaanth, Abishek VP, Lokeswari Y. Venkataramana, and Venkata Vara Prasad D. "Enhancing Breast Cancer Survival Prognosis through Omic and Non-Omic Data Integration." Clinical Breast Cancer, August 2024. http://dx.doi.org/10.1016/j.clbc.2024.08.009.

46

Knepper, Mark A. "Utilizing Omic Data to Understand Integrative Physiology." Physiology, February 12, 2025. https://doi.org/10.1152/physiol.00045.2024.

Abstract:
Over the past several decades, physiological research has undergone a progressive shift toward greater-and-greater reductionism, culminating in the rise of ‘molecular physiology.’ The introduction of Omic techniques, chiefly protein mass spectrometry and next-generation DNA sequencing (NGS), has further accelerated this trend, adding massive amounts of information about individual genes, mRNA transcripts, and proteins. However, the long-term goal of understanding physiological and pathophysiological processes at a whole-organism level has not been fully realized. This review summarizes the major protein mass spectrometry and NGS techniques relevant to physiology and explores the challenges of merging data from Omic methodologies with data from traditional hypothesis-driven research to broaden the understanding of physiological mechanisms. It summarizes recent progress in large-scale data integration through: 1) creation of online user-friendly Omic data resources with cross-indexing across data sets to democratize access to Omic data; 2) application of Bayesian methods to combine data from multiple Omic data sets with knowledge from hypothesis-driven studies in order to address specific physiological and pathophysiological questions; and 3) application of concepts from Natural Language Processing to probe the literature and to create user-friendly causal graphs representing physiological mechanisms. Progress in development of so-called “Large Language Models”, e.g. ChatGPT, for knowledge integration is also described along with a discussion of the shortcomings of Large Language Models with regard to management and integration of physiological data.
47

Stassen, Shobana V., Gwinky G. K. Yip, Kenneth K. Y. Wong, Joshua W. K. Ho, and Kevin K. Tsia. "Generalized and scalable trajectory inference in single-cell omics data with VIA." Nature Communications 12, no. 1 (September 20, 2021). http://dx.doi.org/10.1038/s41467-021-25773-3.

Abstract:
Inferring cellular trajectories using a variety of omic data is a critical task in single-cell data science. However, accurate prediction of cell fates, and thereby biologically meaningful discovery, is challenged by the sheer size of single-cell data, the diversity of omic data types, and the complexity of their topologies. We present VIA, a scalable trajectory inference algorithm that overcomes these limitations by using lazy-teleporting random walks to accurately reconstruct complex cellular trajectories beyond tree-like pathways (e.g., cyclic or disconnected structures). We show that VIA robustly and efficiently unravels the fine-grained sub-trajectories in a 1.3-million-cell transcriptomic mouse atlas without losing the global connectivity at such a high cell count. We further apply VIA to discovering elusive lineages and less populous cell fates missed by other methods across a variety of data types, including single-cell proteomic, epigenomic, multi-omics datasets, and a new in-house single-cell morphological dataset.
48

Habowski, A. N., T. J. Habowski, and M. L. Waterman. "GECO: gene expression clustering optimization app for non-linear data visualization of patterns." BMC Bioinformatics 22, no. 1 (January 25, 2021). http://dx.doi.org/10.1186/s12859-020-03951-2.

Abstract:
Background: Due to continued advances in sequencing technology, the limitation in understanding biological systems through an “-omics” lens is no longer the generation of data, but the ability to analyze it. Importantly, much of this rich -omics data is publicly available waiting to be further investigated. Although many code-based pipelines exist, there is a lack of user-friendly and accessible applications that enable rapid analysis or visualization of data. Results: GECO (Gene Expression Clustering Optimization; http://www.theGECOapp.com) is a minimalistic GUI app that utilizes non-linear reduction techniques to rapidly visualize expression trends in many types of biological data matrices (such as bulk RNA-seq or proteomics). The required input is a data matrix with samples and any type of expression level of genes/protein/other with a unique ID. The output is an interactive t-SNE or UMAP analysis that clusters genes (or proteins/other unique IDs) based on their expression patterns across the multiple samples, enabling visualization of expression trends. Customizable settings for dimensionality reduction and data normalization, along with visualization parameters including coloring and filters, ensure adaptability to a variety of user-uploaded data. Conclusion: This local and cloud-hosted web browser app enables investigation of any -omic data matrix in a rapid and code-independent manner. With the continued growth of available -omic data, the ability to quickly evaluate a dataset, including specific genes of interest, is more important than ever. GECO is intended to supplement traditional statistical analysis methods and is particularly useful when visualizing clusters of genes with similar trajectories across many samples (e.g., multiple cell types, time course, dose response). Users will be empowered to investigate -omic data with a new lens of visualization and analysis that has the potential to uncover genes of interest, cohorts of co-regulated gene programs, and previously undetected patterns of expression.
49

Bornhofen, Elesandro, Dario Fè, Istvan Nagy, Ingo Lenk, Morten Greve, Thomas Didion, Christian S. Jensen, Torben Asp, and Luc Janss. "Genetic architecture of inter-specific and -generic grass hybrids by network analysis on multi-omics data." BMC Genomics 24, no. 1 (April 25, 2023). http://dx.doi.org/10.1186/s12864-023-09292-7.

Full text
Abstract:
Background: Understanding the mechanisms underlying forage production and its biomass nutritive quality at the omics level is crucial for boosting the output of high-quality dry matter per unit of land. Despite the advent of multiple omics integration for the study of biological systems in major crops, investigations on forage species are still scarce. Results: Our results identified substantial changes in gene co-expression and metabolite-metabolite network topologies as a result of genetic perturbation by hybridizing L. perenne with another species within the genus (L. multiflorum) relative to across genera (F. pratensis). However, conserved hub genes and hub metabolomic features were detected between pedigree classes, some of which were highly heritable and displayed one or more significant edges with agronomic traits in a weighted omics-phenotype network. In spite of tagging relevant biological molecules such as the light-induced rice 1 (LIR1), hub features were not necessarily better explanatory variables for omics-assisted prediction than features stochastically sampled and all available regressors. Conclusions: The utilization of computational techniques for the reconstruction of co-expression networks facilitates the identification of key omic features that serve as central nodes and demonstrate correlation with the manifestation of observed traits. Our results also indicate a robust association between early multi-omic traits measured in a greenhouse setting and phenotypic traits evaluated under field conditions.
50

Wang, Ruo Han, Jianping Wang, and Shuai Cheng Li. "Probabilistic tensor decomposition extracts better latent embeddings from single-cell multiomic data." Nucleic Acids Research, July 5, 2023. http://dx.doi.org/10.1093/nar/gkad570.

Abstract:
Single-cell sequencing technology enables the simultaneous capture of multiomic data from multiple cells. The captured data can be represented by tensors, i.e. the higher-rank matrices. However, the existing analysis tools often take the data as a collection of two-order matrices, renouncing the correspondences among the features. Consequently, we propose a probabilistic tensor decomposition framework, SCOIT, to extract embeddings from single-cell multiomic data. SCOIT incorporates various distributions, including Gaussian, Poisson, and negative binomial distributions, to deal with sparse, noisy, and heterogeneous single-cell data. Our framework can decompose a multiomic tensor into a cell embedding matrix, a gene embedding matrix, and an omic embedding matrix, allowing for various downstream analyses. We applied SCOIT to eight single-cell multiomic datasets from different sequencing protocols. With cell embeddings, SCOIT achieves superior performance for cell clustering compared to nine state-of-the-art tools under various metrics, demonstrating its ability to dissect cellular heterogeneity. With the gene embeddings, SCOIT enables cross-omics gene expression analysis and integrative gene regulatory network study. Furthermore, the embeddings allow cross-omics imputation simultaneously, outperforming current imputation methods with the Pearson correlation coefficient increased by 3.38–39.26%; moreover, SCOIT accommodates the scenario that subsets of the cells are with merely one omic profile available.