Journal articles: 'Molecular biology – Data processing – Research'

1

Ahdesmäki, Miika J. "Improved PDX and CDX Data Processing—Letter." Molecular Cancer Research 16, no. 11 (2018): 1813. http://dx.doi.org/10.1158/1541-7786.mcr-18-0534.

2

Khandelwal, Garima, and Crispin Miller. "Improved PDX and CDX Data Processing—Response." Molecular Cancer Research 16, no. 11 (2018): 1814. http://dx.doi.org/10.1158/1541-7786.mcr-18-0535.

3

Fodje, Michel, Kathryn Janzen, Shanunivan Labiuk, James Gorin, and Pawel Grochulski. "AutoProcess: Automated strategy calculation, data processing & structure solution." Acta Crystallographica Section A Foundations and Advances 70, a1 (2014): C791. http://dx.doi.org/10.1107/s2053273314092080.

Abstract:

Two critical aspects of macromolecular crystallography experiments are (1) determining the optimal parameters and strategy for collecting good-quality data and (2) optimal processing of the collected data to facilitate structure determination. These tasks can be daunting to inexperienced crystallographers and often lead to inefficiencies as valuable beam-time is used up. To support automation, remote access and high-throughput crystallography, we have developed a software system for automation of all data processing tasks required at the synchrotron. AutoProcess is layered on the XDS data processing package and makes use of other utilities such as BEST from the European Molecular Biology Laboratory (EMBL), CCP4 utilities and SHELX. The software can be used from the command line as a standalone application but can also be run as a service on a high-performance computing cluster, and integrated into beamline control and information management systems such as MxDC and MxLIVE to allow users to determine the optimal strategy for data collection, and/or process full datasets with the click of a button. Users are automatically presented with a graphical data processing report as well as reflection output files in popular formats. For small molecule and peptide structures, an unrefined initial structure with an electron density map is automatically generated using only the raw diffraction images and the chemical composition of the molecule. Future developments will include sub-structure solution for MAD/SAD/SIRAS data. The software is freely available under an open-source license from the authors. The Canadian Light Source is supported by the Natural Sciences and Engineering Research Council of Canada, the National Research Council Canada, the Canadian Institutes of Health Research, the Province of Saskatchewan, Western Economic Diversification Canada, and the University of Saskatchewan.

4

Likhoshway, E. V. "Actual trends in water ecosystem biology development." Marine Biological Journal 2, no. 4 (2017): 3–14. http://dx.doi.org/10.21072/mbj.2017.02.4.01.

Abstract:

This synopsis characterizes new trends in oceanology that have arisen in recent years from the practical application of current methods of data acquisition and processing. First of all, these are methods of massively parallel sequencing, “-omics”, and bioinformatics methods of data storage and analysis. The identification of biologically active substances in the aquatic environment and the results of laboratory experiments show the existence of molecular signal transduction both at the level of population and interspecies relations between microorganisms and at the level of their trophic connections. “From molecules to ecosystem” is a current trend in the biology of marine ecosystems. The unification and analysis of large databases, including space imagery and “cloud” technologies, have created a new research direction in ecoinformatics, which makes it possible to understand the structural and functional organization of aquatic ecosystems as a whole.

5

Gîfu, Daniela, Diana Trandabăț, Kevin Cohen, and Jingbo Xia. "Special Issue on the Curative Power of Medical Data." Data 4, no. 2 (2019): 85. http://dx.doi.org/10.3390/data4020085.

Abstract:

With the massive amounts of medical data made available online, language technologies have proven to be indispensable in processing biomedical and molecular biology literature, health data or patient records. With a huge number of reports, evaluating their impact has long ceased to be a trivial task. Linking the contents of these documents to each other, as well as to specialized ontologies, could enable access to and the discovery of structured clinical information and could foster a major leap in natural language processing and in health research. The aim of this Special Issue, “Curative Power of Medical Data” in Data, is to gather innovative approaches for the exploitation of biomedical data using semantic web technologies and linked data by developing community involvement in biomedical research. This Special Issue contains four surveys, which cover a wide range of topics, from the analysis of the writing style of biomedical articles, to automatically generating tests from medical references, constructing a gold-standard biomedical corpus, and the visualization of biomedical data.

6

Kahsay, Robel, Jeet Vora, Rahi Navelkar, et al. "GlyGen data model and processing workflow." Bioinformatics 36, no. 12 (2020): 3941–43. http://dx.doi.org/10.1093/bioinformatics/btaa238.

Abstract:

Summary: Glycoinformatics plays a major role in glycobiology research, and the development of a comprehensive glycoinformatics knowledgebase is critical. This application note describes the GlyGen data model, processing workflow and the data access interfaces featuring programmatic use case example queries based on specific biological questions. The GlyGen project is a data integration, harmonization and dissemination project for carbohydrate and glycoconjugate-related data retrieved from multiple international data sources including UniProtKB, GlyTouCan, UniCarbKB and other key resources. Availability and implementation: The GlyGen web portal is freely available at https://glygen.org. The data portal, web services, SPARQL endpoint and GitHub repository are also freely available at https://data.glygen.org, https://api.glygen.org, https://sparql.glygen.org and https://github.com/glygener, respectively. All code is released under the GNU General Public License version 3 (GNU GPLv3) and is available on GitHub at https://github.com/glygener. The datasets are made available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Supplementary information: Supplementary data are available at Bioinformatics online.

7

Bensmail, Halima, and Abdelali Haoudi. "Postgenomics: Proteomics and Bioinformatics in Cancer Research." Journal of Biomedicine and Biotechnology 2003, no. 4 (2003): 217–30. http://dx.doi.org/10.1155/s1110724303209207.

Abstract:

Now that the human genome is completed, the characterization of the proteins encoded by the sequence remains a challenging task. The study of the complete protein complement of the genome, the “proteome,” referred to as proteomics, will be essential if new therapeutic drugs and new disease biomarkers for early diagnosis are to be developed. Research efforts are already underway to develop the technology necessary to compare the specific protein profiles of diseased versus nondiseased states. These technologies provide a wealth of information and rapidly generate large quantities of data. Processing the large amounts of data will lead to useful predictive mathematical descriptions of biological systems which will permit rapid identification of novel therapeutic targets and identification of metabolic disorders. Here, we present an overview of the current status and future research approaches in defining the cancer cell's proteome in combination with different bioinformatics and computational biology tools toward a better understanding of health and disease.

8

Feltus, Frank A., Joseph R. Breen, Juan Deng, et al. "The Widening Gulf between Genomics Data Generation and Consumption: A Practical Guide to Big Data Transfer Technology." Bioinformatics and Biology Insights 9s1 (January 2015): BBI.S28988. http://dx.doi.org/10.4137/bbi.s28988.

Abstract:

In the last decade, high-throughput DNA sequencing has become a disruptive technology and pushed the life sciences into a distributed ecosystem of sequence data producers and consumers. Given the power of genomics and declining sequencing costs, biology is an emerging “Big Data” discipline that will soon enter the exabyte data range when all subdisciplines are combined. These datasets must be transferred across commercial and research networks in creative ways since sending data without thought can have serious consequences on data processing time frames. Thus, it is imperative that biologists, bioinformaticians, and information technology engineers recalibrate data processing paradigms to fit this emerging reality. This review attempts to provide a snapshot of Big Data transfer across networks, which is often overlooked by many biologists. Specifically, we discuss four key areas: 1) data transfer networks, protocols, and applications; 2) data transfer security including encryption, access, firewalls, and the Science DMZ; 3) data flow control with software-defined networking; and 4) data storage, staging, archiving and access. A primary intention of this article is to orient the biologist in key aspects of the data transfer process in order to frame their genomics-oriented needs to enterprise IT professionals.

9

Zinchenko, M. O., K. B. Sukhomlin, O. P. Zinchenko, and V. S. Tepliuk. "The biology of Simulium noelleri and Simulium dolini: morphological, ecological and molecular data." Biosystems Diversity 29, no. 2 (2021): 180–84. http://dx.doi.org/10.15421/012122.

Abstract:

Molecular genetic research has revolutionized the taxonomy and systematics of the Simuliidae family. Simulium noelleri Friederichs, 1920 is a species of blackfly, common in the Holarctic, reported for 33 countries. In 1954, Topchiev recorded it in Ukraine for the first time. Simulium dolini Usova et Sukhomlin, 1989 has been recorded at the borders of Ukraine and Belarus. It was described for the first time by Usova and Sukhomlin in 1989 from the collection from the territory of Volyn region in 1985. Usova and Sukhomlin, Yankovsky, Adler state that S. noelleri and S. dolini are different species by the morphological characteristics that differ in all phases of development. Adults differ in the structure of the genital appendages, palps, the margin and shape of the face and forehead, the colour of the legs; the larva – in the pattern on the frontal capsule, the number of rays in the fans, mandibular teeth and the hypostoma, the structure of the hind organ of attachment; pupae – in the branching way of gills. Molecular data are becoming an increasingly important tool in insect taxonomy. Therefore, we had to check that these two closely related species also have genetic difference. The development of S. noelleri and S. dolini was studied in four small rivers of Volyn region, Ukraine (Chornohuzka, Konopelka, Putylivka, Omelyanivka) in the period from 2017 to 2019. During initial processing of insect samples, we used the standard protocols EPPO PM7/129. We obtained the nucleotide sequence of S. dolini. It was proved that the populations of S. noelleri and S. dolini from medium and small rivers of Volyn differ in biological, morphological, behavioural and genetic characteristics. Comparison of the species S. noelleri with the data of the GenBank confirms the identification of three distinct morphotypes from Volyn, Great Britain and Canada. As a result of the conducted researches, it was confirmed that two close species of S. dolini and S. noelleri from the noelleri species group differ in the structure of mitochondrial DNA, which confirms their independent taxonomic status. Additional studies comprising more individuals from larger areas of Europe are required to verify the taxonomic position of these two species.

10

Vowinckel, Jakob, Thomas Corwin, Jonathan Woodsmith, et al. "Proteome and phospho-proteome profiling for deeper phenotype characterization of colorectal cancer heterogeneity." Journal of Clinical Oncology 39, no. 15_suppl (2021): e15536-e15536. http://dx.doi.org/10.1200/jco.2021.39.15_suppl.e15536.

Abstract:

Background: The rise of precision oncology therapeutics requires deep understanding of the molecular mechanisms implicated in cancer biology. Colorectal cancer (CRC) is one of the first solid tumors to be molecularly characterized by defined genes and pathways. Advances in tumor profiling have revealed a profound molecular heterogeneity in CRC leading to the definition of several consensus molecular subtypes (CMS). However, this molecular heterogeneity is still largely defined on the genomic and transcriptomic level. To complement the understanding of genetically defined molecular subgroups, we performed large-scale deep proteomic and phospho-proteomic profiling of CRC patient biopsies and adjacent healthy control tissue, which has enabled us to explore the phenotype and obtain more functional insights in cancer biology. Methods: Sample processing from 5-10 mg of tissue per sample was performed using a liquid handling robot. Phospho-peptide enrichment was carried out with a Kingfisher Flex device and MagReSyn Ti-IMAC magnetic beads. Data-Independent Acquisition (DIA) LC-MS/MS was performed on multiple platforms consisting of a Thermo Scientific Q Exactive HF-X mass spectrometer coupled to a Waters M-Class LC. Chromatography was operated at 5 µL/min, and separation was achieved using 45 min (whole proteome) and 60 min (phospho-proteome) gradients. Results: Indivumed has built IndivuType, the world’s first multi-omics database for individualized cancer therapy, analyzing the highest quality cancer biospecimens to generate the most comprehensive dataset, including genomics, transcriptomics, proteomics, and clinical outcome information.
Enabled by the DIA technology, a mass spectrometric method developed by Biognosys that obtains peptide fragmentation data in a highly parallelized way with high sensitivity, more than 7,000 proteins in the whole proteome and 20,000 phospho-peptides in the phospho-proteome workflow were profiled across more than 900 resected tissue samples of various CMS of CRC. The resulting proteome and phospho-proteome data were integrated into the IndivuType database and cross-analyzed with genomic and transcriptomic markers. Through this combined analysis, novel insights in clinically relevant signaling pathways in CRC subtypes were revealed. Conclusions: The deep phenotypic profiling of cancer samples, using next generation proteomics and phospho-proteomics, has enabled us to go beyond the genomic level in the characterization of tumor molecular heterogeneity. This multi-omics approach provides a solid foundation to advance the understanding of cancer biology, unravel key molecular events, and support the identification of novel therapeutic targets for precision medicine in CRC.

11

Albeck, John G., Michael Pargett, and Alexander E. Davies. "Experimental and engineering approaches to intracellular communication." Essays in Biochemistry 62, no. 4 (2018): 515–24. http://dx.doi.org/10.1042/ebc20180024.

Abstract:

Communication between and within cells is essential for multicellular life. While intracellular signal transduction pathways are often specified in molecular terms, the information content they transmit remains poorly defined. Here, we review research efforts to merge biological experimentation with concepts of communication that emerge from the engineering disciplines of signal processing and control theory. We discuss the challenges of performing experiments that quantitate information transfer at the molecular level, and we highlight recent studies that have advanced toward a clearer definition of the information content carried by signaling molecules. Across these studies, we emphasize a theme of increasingly well-matched experimental and theoretical approaches to decode the data streams directing cellular behavior.

12

Van Chi, Phan, and Le Thi Bich Thao. "Proteogenomics and its applications in biology and precision medicine." Vietnam Journal of Biotechnology 19, no. 1 (2021): 1–14. http://dx.doi.org/10.15625/1811-4989/15386.

Abstract:

In this review, we briefly discuss proteogenomics, the integration of proteomics with genomics and transcriptomics. Its underlying technologies are next-generation sequencing (NGS) and mass spectrometry (MS), together with processing of the resulting data; it is an emerging field that promises to accelerate fundamental research related to transcription and translation, as well as its applications. By combining genomic and proteomic information, scientists are achieving new results due to a more complete and unified understanding of complex molecular biological processes. Part of this review introduces some of the results of using proteogenomics in solving problems such as annotation, gene/genome re-annotation, including editing of open reading frames (ORFs), or improving a process to detect new genes in a number of different organisms, including humans. In particular, the paper also discusses the potential of proteogenomics through research achievements on the human genome/proteome in precision medicine, especially in projects on phylogenetic and diagnostic research and cancer treatment. The challenges and future of proteogenomics are also discussed and documented.

13

Kachala, Michael, John Westbrook, and Dmitri Svergun. "Extension of the sasCIF format and its applications for data processing and deposition." Journal of Applied Crystallography 49, no. 1 (2016): 302–10. http://dx.doi.org/10.1107/s1600576715024942.

Abstract:

Recent advances in small-angle scattering (SAS) experimental facilities and data analysis methods have prompted a dramatic increase in the number of users and of projects conducted, causing an upsurge in the number of objects studied, experimental data available and structural models generated. To organize the data and models and make them accessible to the community, the Task Forces on SAS and hybrid methods for the International Union of Crystallography and the Worldwide Protein Data Bank envisage developing a federated approach to SAS data and model archiving. Within the framework of this approach, the existing databases may exchange information and provide independent but synchronized entries to users. At present, ways of exchanging information between the various SAS databases are not established, leading to possible duplication and incompatibility of entries, and limiting the opportunities for data-driven research for SAS users. In this work, a solution is developed to resolve these issues and provide a universal exchange format for the community, based on the use of the widely adopted crystallographic information framework (CIF). The previous version of the sasCIF format, implemented as an extension of the core CIF dictionary, has been available since 2000 to facilitate SAS data exchange between laboratories. The sasCIF format has now been extended to describe comprehensively the necessary experimental information, results and models, including relevant metadata for SAS data analysis and for deposition into a database. Processing tools for these files (sasCIFtools) have been developed, and these are available both as standalone open-source programs and integrated into the SAS Biological Data Bank, allowing the export and import of data entries as sasCIF files. Software modules to save the relevant information directly from beamline data-processing pipelines in sasCIF format are also developed. 
This update of sasCIF and the relevant tools are an important step in the standardization of the way SAS data are presented and exchanged, to make the results easily accessible to users and to promote further the application of SAS in the structural biology community.

14

Chen, Long, Chunhua Zhang, Yanling Wang, et al. "Data mining and pathway analysis of glucose-6-phosphate dehydrogenase with natural language processing." Molecular Medicine Reports 16, no. 2 (2017): 1900–1910. http://dx.doi.org/10.3892/mmr.2017.6785.

15

Gudenas, Brian, Bernhard Englinger, Anthony P. Y. Liu, et al. "ETMR-06. DISSECTING THE MOLECULAR AND DEVELOPMENTAL BASIS OF PINEOBLASTOMA THROUGH GENOMICS." Neuro-Oncology 22, Supplement_3 (2020): iii323–iii324. http://dx.doi.org/10.1093/neuonc/noaa222.210.

Abstract:

Pineoblastoma (PB) is an aggressive embryonal brain tumor comprising 1% of pediatric CNS tumors. The clinico-molecular heterogeneity and developmental origins underlying PB are poorly understood; therefore, we have assembled a molecular cohort of histologically defined PBs (n=43) with corresponding outcome data. Methylation profiling revealed four molecularly and clinically distinct PB subgroups, including two novel entities. Mutational and transcriptional analysis identified characteristic molecular features of each subgroup, such as mutations in the miRNA processing pathway or FOXR2 proto-oncogene overexpression. Furthermore, subgroups exhibited differences in propensity for metastasis, cytogenetics, and clinical outcomes. To dissect PB developmental origins and resolve PB subgroup biology, we have employed a combination of single-cell genomics and genetically engineered mouse modeling. We created a single-cell transcriptional atlas of the developing murine pineal gland across 11 timepoints and are currently integrating these data with single nuclei RNA-seq data of human PB (n=25). Single-cell analysis of the developing pineal gland revealed three distinct populations of pinealocytes, referred to as early, mid and late pinealocytes, which segregate by developmental stage yet lie along a single developmental trajectory. Preliminary results implicate significant associations between PBs and the early pinealocyte population as well as subgroup-specific differences in intratumoral heterogeneity. Furthermore, this knowledge has informed the downstream generation of biologically faithful disease models, including a transgenic mouse model of the PB-RB subgroup. Remarkably, this model shows up-regulation of key markers of PB such as Crx, Asmt and Otx2 and substantiates early pinealocytes as the probable cell-of-origin for this PB subgroup.

16

Hammersley, A. P. "FIT2D: a multi-purpose data reduction, analysis and visualization program." Journal of Applied Crystallography 49, no. 2 (2016): 646–52. http://dx.doi.org/10.1107/s1600576716000455.

Abstract:

FIT2D is one of the principal area detector data reduction, analysis and visualization programs used at the European Synchrotron Radiation Facility and is also used by more than 400 research groups worldwide, including many other synchrotron radiation facilities. It has been developed for X-ray science, but is applicable to other structural techniques and is used in analysing electron diffraction data and microscopy, and neutron diffraction and scattering data. FIT2D works for both interactive and 'batch'-style data processing. Calibration and correction of detector distortions, integration of two-dimensional data to a variety of one-dimensional scans, and one- and two-dimensional model fitting are the main uses. Many other general-purpose image processing and image visualization operations are available. Commands are available through a 'graphical user interface' and operations common to certain types of analysis are grouped within 'interfaces'. Executable versions for most workstation and personal computer systems, and web page documentation, are available at http://www.esrf.eu/computing/scientific/FIT2D.

17

Grunspan, Daniel Z., Benjamin L. Wiggins, and Steven M. Goodreau. "Understanding Classrooms through Social Network Analysis: A Primer for Social Network Analysis in Education Research." CBE—Life Sciences Education 13, no. 2 (2014): 167–78. http://dx.doi.org/10.1187/cbe.13-08-0162.

Abstract:

Social interactions between students are a major and underexplored part of undergraduate education. Understanding how learning relationships form in undergraduate classrooms, as well as the impacts these relationships have on learning outcomes, can inform educators in unique ways and improve educational reform. Social network analysis (SNA) provides the necessary tool kit for investigating questions involving relational data. We introduce basic concepts in SNA, along with methods for data collection, data processing, and data analysis, using a previously collected example study on an undergraduate biology classroom as a tutorial. We conduct descriptive analyses of the structure of the network of costudying relationships. We explore generative processes that create observed study networks between students and also test for an association between network position and success on exams. We also cover practical issues, such as the unique aspects of human subjects review for network studies. Our aims are to convince readers that using SNA in classroom environments allows rich and informative analyses to take place and to provide some initial tools for doing so, in the process inspiring future educational studies incorporating relational data.

18

Li, Zhi, Tianyue Zhang, Haojie Lei, et al. "Research on Gastric Cancer’s Drug-resistant Gene Regulatory Network Model." Current Bioinformatics 15, no. 3 (2020): 225–34. http://dx.doi.org/10.2174/1574893614666190722102557.

Abstract:

Objective: Based on bioinformatics, differentially expressed gene data of drug resistance in gastric cancer were analyzed, screened and mined through modeling and network modeling to find valuable data associated with multi-drug resistance of gastric cancer. Methods: First, data sets were preprocessed from three aspects: data processing, data annotation and classification, and functional clustering. Secondly, based on the preprocessed data, each classified primary gene regulatory network was constructed by mining interactions among the genes. This paper computed the values of each node in each classified primary gene regulatory network and ranked these nodes according to their scores. On this basis, the appropriate core node was selected and the corresponding core network was developed. Results and Conclusion: Finally, the core network modules were mined and analyzed. After the correlation analysis, the result showed that the constructed network module had 20 core genes. This module contained valuable data associated with multi-drug resistance in gastric cancer.

19

Torkamaneh, Davoud, Jérôme Laroche, and François Belzile. "Fast-GBS v2.0: an analysis toolkit for genotyping-by-sequencing data." Genome 63, no. 11 (2020): 577–81. http://dx.doi.org/10.1139/gen-2020-0077.

Abstract:

Genotyping-by-sequencing (GBS) is a rapid, flexible, low-cost, and robust genotyping method that simultaneously discovers variants and calls genotypes within a broad range of samples. These characteristics make GBS an excellent tool for many applications and research questions from conservation biology to functional genomics in both model and non-model species. Continued improvement of GBS relies on a more comprehensive understanding of data analysis, development of fast and efficient bioinformatics pipelines, accurate missing data imputation, and active post-release support. Here, we present the second generation of Fast-GBS (v2.0) that offers several new options (e.g., processing paired-end reads and imputation of missing data) and features (e.g., summary statistics of genotypes) to improve the GBS data analysis process. The performance assessment analysis showed that Fast-GBS v2.0 outperformed other available analytical pipelines, such as GBS-SNP-CROP and Gb-eaSy. Fast-GBS v2.0 provides an analysis platform that can be run with different types of sequencing data, modest computational resources, and allows for missing-data imputation for various species in different contexts.

20

Aquili, Luca. "The Role of Tryptophan and Tyrosine in Executive Function and Reward Processing." International Journal of Tryptophan Research 13 (January 2020): 117864692096482. http://dx.doi.org/10.1177/1178646920964825.

Abstract:

The serotonergic precursor tryptophan and the dopaminergic precursor tyrosine have been shown to be important modulators of mood, behaviour and cognition. Specifically, research on the function of tryptophan has characterised this molecule as particularly relevant in the context of pathological disorders such as depression. Moreover, a large body of evidence has now been accumulated to suggest that tryptophan may also be involved in executive function and reward processing. Despite some clear differentiation with tryptophan, the data reviewed in this paper illustrate that tyrosine shares similar functions with tryptophan in the regulation of executive function and reward, and that these processes in turn, rather than acting in isolation, causally influence each other.

21

Playdon, Mary C., Amit D. Joshi, Fred K. Tabung, et al. "Metabolomics Analytics Workflow for Epidemiological Research: Perspectives from the Consortium of Metabolomics Studies (COMETS)." Metabolites 9, no. 7 (2019): 145. http://dx.doi.org/10.3390/metabo9070145.

Abstract:

The application of metabolomics technology to epidemiological studies is emerging as a new approach to elucidate disease etiology and for biomarker discovery. However, analysis of metabolomics data is complex and there is an urgent need for the standardization of analysis workflow and reporting of study findings. To inform the development of such guidelines, we conducted a survey of 47 cohort representatives from the Consortium of Metabolomics Studies (COMETS) to gain insights into the current strategies and procedures used for analyzing metabolomics data in epidemiological studies worldwide. The results indicated a variety of applied analytical strategies, from biospecimen and data pre-processing and quality control to statistical analysis and reporting of study findings. These strategies included methods commonly used within the metabolomics community and applied in epidemiological research, as well as novel approaches to pre-processing pipelines and data analysis. To help with these discrepancies, we propose use of open-source initiatives such as the online web-based tool COMETS Analytics, which includes helpful tools to guide analytical workflow and the standardized reporting of findings from metabolomics analyses within epidemiological studies. Ultimately, this will improve the quality of statistical analyses, research findings, and study reproducibility.

22

Qin, Li-Xuan, Huei-Chung Huang, and Colin B. Begg. "Cautionary Note on Using Cross-Validation for Molecular Classification." Journal of Clinical Oncology 34, no. 32 (2016): 3931–38. http://dx.doi.org/10.1200/jco.2016.68.1031.

Abstract:

Purpose: Reproducibility of scientific experimentation has become a major concern because of the perception that many published biomedical studies cannot be replicated. In this article, we draw attention to the connection between inflated overoptimistic findings and the use of cross-validation for error estimation in molecular classification studies. We show that, in the absence of careful design to prevent artifacts caused by systematic differences in the processing of specimens, established tools such as cross-validation can lead to a spurious estimate of the error rate in the overoptimistic direction, regardless of the use of data normalization as an effort to remove these artifacts. Methods: We demonstrated this important yet overlooked complication of cross-validation using a unique pair of data sets on the same set of tumor samples. One data set was collected with uniform handling to prevent handling effects; the other was collected without uniform handling and exhibited handling effects. The paired data sets were used to estimate the biologic effects of the samples and the handling effects of the arrays in the latter data set, which were then used to simulate data using virtual rehybridization following various array-to-sample assignment schemes. Results: Our study showed that (1) cross-validation tended to underestimate the error rate when the data possessed confounding handling effects; (2) depending on the relative amount of handling effects, normalization may further worsen the underestimation of the error rate; and (3) balanced assignment of arrays to comparison groups allowed cross-validation to provide an unbiased error estimate. Conclusion: Our study demonstrates the benefits of balanced array assignment for reproducible molecular classification and calls for caution on the routine use of data normalization and cross-validation in such analysis.

23

Urbano, Ferdinando, Francesca Cagnacci, Clément Calenge, Holger Dettki, Alison Cameron, and Markus Neteler. "Wildlife tracking data management: a new vision." Philosophical Transactions of the Royal Society B: Biological Sciences 365, no. 1550 (2010): 2177–85. http://dx.doi.org/10.1098/rstb.2010.0081.

Abstract:

To date, the processing of wildlife location data has relied on a diversity of software and file formats. Data management and the following spatial and statistical analyses were undertaken in multiple steps, involving many time-consuming importing/exporting phases. Recent technological advancements in tracking systems have made large, continuous, high-frequency datasets of wildlife behavioural data available, such as those derived from the global positioning system (GPS) and other animal-attached sensor devices. These data can be further complemented by a wide range of other information about the animals' environment. Management of these large and diverse datasets for modelling animal behaviour and ecology can prove challenging, slowing down analysis and increasing the probability of mistakes in data handling. We address these issues by critically evaluating the requirements for good management of GPS data for wildlife biology. We highlight that dedicated data management tools and expertise are needed. We explore current research in wildlife data management. We suggest a general direction of development, based on a modular software architecture with a spatial database at its core, where interoperability, data model design and integration with remote-sensing data sources play an important role in successful GPS data handling.

24

Meissner, Tobias, Anja Seckinger, Kari Hemminki, et al. "Profound Impact of Sample Processing Delay on Gene Expression of Multiple Myeloma Plasma Cells." Blood 126, no. 23 (2015): 2996. http://dx.doi.org/10.1182/blood.v126.23.2996.2996.

Abstract:

Introduction: Gene expression profiling (GEP) has significantly contributed to the elucidation of the molecular heterogeneity of multiple myeloma plasma cells (MMPC), and only recently has it been recommended for risk stratification. Prior to GEP, MMPC need to be enriched, resulting in an inability to immediately freeze bone marrow aspirates or use RNA stabilization reagents. As a result, in multi-center MM trials, sample processing delay due to shipping may be an important confounder of molecular analyses and risk stratification based on GEP data. In order to determine the impact of "shipping delay" on MMPC gene expression, we analyzed a set of 573 newly diagnosed German MM patients including 230 in-house and 343 shipped samples. Materials and Methods: We included publicly available GEP data of newly diagnosed MM patients treated in the GMMG HD4 and MM5 trials. All samples had been processed in a central laboratory in Heidelberg and include 85 HD4 and 145 MM5 in-house and 97 HD4 and 246 MM5 shipped samples. Prediction of sample status was done on publicly available GEP, including data from the UK, UAMS and MMRC. Differential gene expression was assessed using empirical Bayes statistics in linear models for microarray data. A predictor for shipment status was generated on the MM5 cohort using prediction analysis for microarrays. Pathway enrichment analysis was done using WebGestalt. Risk signatures and molecular subgroups were obtained as previously described. Fisher's exact test was used to compare the subgroup distribution between cohorts. If applicable, results were corrected for multiple testing using the Benjamini-Hochberg method. In all statistical tests, an effect was considered statistically significant if the P-value of its corresponding statistical test was not greater than 5%. Results: Applying Goeman's global test on the MM5 set showed that "shipping delay" significantly impacted global gene expression (P <0.001).
Compared to 145 in-house samples, we detected 3301 down-regulated and 3501 up-regulated genes in 246 shipped samples. For 4280 genes we confirmed differential expression in an independent set of 85 in-house and 97 shipped samples. Of these genes 2040 had a >1.5-fold and 826 a >2-fold difference in expression level. Differentially expressed genes were enriched in processes like ribosome biogenesis, cell cycle, and apoptosis. We observed significantly lower proliferation rates in shipped samples (P <0.001). We did not detect significant differences in the distribution of molecular subgroups between in-house and shipped samples in the combined set of HD4 and MM5. Among GEP based risk predictors the IFM-15 seemed to underestimate high risk in shipped samples, whereas the GEP70 and the EMC-92 gene signatures were more robust. In order to provide a tool to assess the "shipping effect" in public repositories, we generated a 17-gene predictor for shipped samples with a 10-fold cross validation error rate of 0.06 for the training set and an error rate of 0.15 for the validation set. Applying the predictor to further publicly available data sets we detected the "shipping effect" signature in 11% of cases of the UAMS set, 94% of the UK set and 57% of the MMRC set. Conclusion: Our study shows that "shipping delay" widely influences gene expression of MMPC with different impact on molecular classification and risk stratification. Based on available data, currently no clear circumvention of the shipping impact on MMPC can be recommended. It should be avoided if possible or at least be taken into account. Disclosures Seckinger: Takeda: Other: Travel grant. Salwender: Celgene: Honoraria; Janssen Cilag: Honoraria; Bristol Myers Squibb: Honoraria; Amgen: Honoraria; Novartis: Honoraria.
Goldschmidt:Novartis: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding, Speakers Bureau; Millenium: Honoraria, Research Funding, Speakers Bureau; Onyx: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Bristol-Myers Squibb: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Celgene: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding, Speakers Bureau; Janssen-Cilag: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding, Speakers Bureau; Amgen: Consultancy, Membership on an entity's Board of Directors or advisory committees; Chugai: Honoraria, Research Funding, Speakers Bureau; Takeda: Consultancy, Membership on an entity's Board of Directors or advisory committees. Morgan:MMRF: Honoraria; Bristol Myers Squibb: Honoraria, Membership on an entity's Board of Directors or advisory committees; University of Arkansas for Medical Sciences: Employment; CancerNet: Honoraria; Takeda: Honoraria, Membership on an entity's Board of Directors or advisory committees; Weismann Institute: Honoraria; Celgene: Honoraria, Membership on an entity's Board of Directors or advisory committees. Hose:Takeda: Other: Travel grant; EngMab AG: Research Funding. Weinhold:Janssen Cilag: Other: Advisory Board; University of Arkansas for Medical Sciences: Employment.
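
The Methods of the abstract above mention correction for multiple testing with the Benjamini-Hochberg method and a 5% significance threshold. As an illustration only, the following is a minimal Python sketch of that adjustment; the function name `benjamini_hochberg` and the example p-values are assumptions made for this sketch and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (illustrative, not study data).
p = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(p)
# "Not greater than 5%" is the threshold wording used in the abstract.
significant = [q <= 0.05 for q in adj]
```

In this toy input the third and fourth genes have raw p-values below 0.05 but are no longer significant after adjustment, which is the behaviour the correction is meant to provide.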

25

Coffa, Jordy, Mark A. van de Wiel, Begoña Diosdado, Beatriz Carvalho, Jan Schouten, and Gerrit A. Meijer. "MLPAnalyzer: Data Analysis Tool for Reliable Automated Normalization of MLPA Fragment Data." Analytical Cellular Pathology 30, no. 4 (2008): 323–35. http://dx.doi.org/10.1155/2008/605109.

Abstract:

Background: Multiplex Ligation dependent Probe Amplification (MLPA) is a rapid, simple, reliable and customized method for detection of copy number changes of individual genes at a high resolution and allows for high throughput analysis. This technique is typically applied for studying specific genes in large sample series. The large amount of data, dissimilarities in PCR efficiency among the different probe amplification products, and sample-to-sample variation pose a challenge to data analysis and interpretation. We therefore set out to develop an MLPA data analysis strategy and tool that is simple to use, while still taking into account the above-mentioned sources of variation. Materials and Methods: MLPAnalyzer was developed in Visual Basic for Applications, and can accept a large number of file formats directly from capillary sequence systems. Sizes of all MLPA probe signals are determined and filtered, quality control steps are performed, and variation in peak intensity related to size is corrected for. DNA copy number ratios of test samples are computed, displayed in a table view and a set of comprehensive figures is generated. To validate this approach, MLPA reactions were performed using a dedicated MLPA mix on 6 different colorectal cancer cell lines. The generated data were normalized using our program and results were compared to previously performed array-CGH results using both statistical methods and visual examination. Results and Discussion: Visual examination of bar graphs and direct ratios for both techniques showed very similar results, while the average Pearson moment correlation over all MLPA probes was found to be 0.42. Our results thus show that automated MLPA data processing following our suggested strategy may be of significant use, especially when handling large MLPA data sets, when samples are of different quality, or interpretation of MLPA electropherograms is too complex.
It remains, however, important to recognize that automated MLPA data processing may only be successful when a dedicated experimental setup is also considered.
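
The abstract above states that DNA copy number ratios of test samples are computed after normalization. The sketch below shows one generic way such a ratio can be formed in Python: each probe is scaled by the median of reference probes within its sample, then divided by the same quantity in a normal sample. This is an assumed, simplified scheme for illustration and is not the MLPAnalyzer algorithm; the function name, probe names and peak heights are all hypothetical.

```python
from statistics import median


def copy_number_ratios(test_peaks, normal_peaks, reference_probes):
    """Return the test/normal ratio per probe after within-sample scaling."""

    def normalise(peaks):
        # Scale by the median reference-probe height to remove
        # sample-to-sample differences in overall signal.
        scale = median(peaks[p] for p in reference_probes)
        return {probe: height / scale for probe, height in peaks.items()}

    test_norm = normalise(test_peaks)
    normal_norm = normalise(normal_peaks)
    return {probe: test_norm[probe] / normal_norm[probe] for probe in test_peaks}


# Hypothetical peak heights (arbitrary units) for illustration only.
normal = {"ref1": 1000.0, "ref2": 2000.0, "ref3": 1500.0, "geneA": 1200.0, "geneB": 800.0}
test = {"ref1": 500.0, "ref2": 1000.0, "ref3": 750.0, "geneA": 900.0, "geneB": 200.0}
ratios = copy_number_ratios(test, normal, ["ref1", "ref2", "ref3"])
```

With these made-up inputs the reference probes come out at a ratio of 1, while `geneA` and `geneB` come out at 1.5 and 0.5, the pattern one would read as a gain and a loss relative to the normal sample.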

26

Le Merrer, Julie, Jérôme A. J. Becker, Katia Befort, and Brigitte L. Kieffer. "Reward Processing by the Opioid System in the Brain." Physiological Reviews 89, no. 4 (2009): 1379–412. http://dx.doi.org/10.1152/physrev.00005.2009.

Abstract:

The opioid system consists of three receptors, mu, delta, and kappa, which are activated by endogenous opioid peptides processed from three protein precursors, proopiomelanocortin, proenkephalin, and prodynorphin. Opioid receptors are recruited in response to natural rewarding stimuli and drugs of abuse, and both endogenous opioids and their receptors are modified as addiction develops. Mechanisms whereby aberrant activation and modifications of the opioid system contribute to drug craving and relapse remain to be clarified. This review summarizes our present knowledge on brain sites where the endogenous opioid system controls hedonic responses and is modified in response to drugs of abuse in the rodent brain. We review 1) the latest data on the anatomy of the opioid system, 2) the consequences of local intracerebral pharmacological manipulation of the opioid system on reinforced behaviors, 3) the consequences of gene knockout on reinforced behaviors and drug dependence, and 4) the consequences of chronic exposure to drugs of abuse on expression levels of opioid system genes. Future studies will establish key molecular actors of the system and neural sites where opioid peptides and receptors contribute to the onset of addictive disorders. Combined with data from human and nonhuman primate (not reviewed here), research in this extremely active field has implications both for our understanding of the biology of addiction and for therapeutic interventions to treat the disorder.

27

Dill-McFarland, Kimberly A., Stephan G. König, Florent Mazel, et al. "An integrated, modular approach to data science education in microbiology." PLOS Computational Biology 17, no. 2 (2021): e1008661. http://dx.doi.org/10.1371/journal.pcbi.1008661.

Abstract:

We live in an increasingly data-driven world, where high-throughput sequencing and mass spectrometry platforms are transforming biology into an information science. This has shifted major challenges in biological research from data generation and processing to interpretation and knowledge translation. However, postsecondary training in bioinformatics, or more generally data science for life scientists, lags behind current demand. In particular, development of accessible, undergraduate data science curricula has the potential to improve research and learning outcomes as well as better prepare students in the life sciences to thrive in public and private sector careers. Here, we describe the Experiential Data science for Undergraduate Cross-Disciplinary Education (EDUCE) initiative, which aims to progressively build data science competency across several years of integrated practice. Through EDUCE, students complete data science modules integrated into required and elective courses augmented with coordinated cocurricular activities. The EDUCE initiative draws on a community of practice consisting of teaching assistants (TAs), postdocs, instructors, and research faculty from multiple disciplines to overcome several reported barriers to data science for life scientists, including instructor capacity, student prior knowledge, and relevance to discipline-specific problems. Preliminary survey results indicate that even a single module improves student self-reported interest and/or experience in bioinformatics and computer science. Thus, EDUCE provides a flexible and extensible active learning framework for integration of data science curriculum into undergraduate courses and programs across the life sciences.

28

Lagana, Alessandro, Ben Readhead, Deepak Perumal, et al. "Towards a Network-Based Molecular Taxonomy of Newly Diagnosed Multiple Myeloma." Blood 126, no. 23 (2015): 840. http://dx.doi.org/10.1182/blood.v126.23.840.840.

Abstract:

Recent advances in computational biology have led to the development of novel and sophisticated methods to model large datasets measured from complex organisms based on integrative network biology. Networks can provide valuable insight into key biological processes and allow for a deeper understanding of the complexity of cellular systems and disease mechanisms. We developed and applied a network biology approach to infer an improved molecular model and understanding of newly diagnosed multiple myeloma (MM). We constructed the first co-expression network of MM based on RNA-seq data from the current release (IA4) of the Multiple Myeloma Research Foundation (MMRF) CoMMpass Study dataset. The data set consists of 92 samples from newly diagnosed MM patients. Whole Exome Sequencing (WES) data available for 77 out of 92 samples allowed the integration of somatic mutations into the network. Our analysis organized 23,033 genes into 50 co-expression modules. We then evaluated the molecular activity of co-expression modules for concordance with molecular traits. We performed module enrichment analysis against Gene Ontology terms, pathways, chromosome locations, protein-protein interaction networks, MM-associated gene sets and drug-target databases. Analysis of the newly diagnosed multiple myeloma network model (MMNet) revealed known and novel molecular features of multiple myeloma. The integration of MMNet with somatic mutations data unveiled a significant association between mutation burden and the activation of several modules. Fundamental biological processes such as DNA repair, cell cycle, signal transduction, NF-kappaB cascade and MAPK signaling characterized such modules. Interestingly, a number of mutated genes demonstrated pluripotent associations with co-expression module activity.
For example, FGFR3 was correlated with expression of several modules, including one that was enriched for RNA processing and translation-related processes and included the known MM-associated genes FRZB and CCND3. Similarly, the frequently mutated gene DIS3 was significantly associated with five different modules, including the translation-related module and a module enriched for the 1q locus. Our results have identified novel key driver genes that may inform therapy prioritization. The MMNet topology revealed a far greater molecular heterogeneity in primary MM, underscoring opportunities to improve the molecular taxonomy of this disease. We identified several modules associating with previously described MM classes, including a module enriched for genes up-regulated in the UAMS MS class characterized by spiked expression of WHSC1 and FGFR3. Module connectivity confirmed the central role of both genes, WHSC1 being the top hub gene, i.e. the most connected gene in the module, and FGFR3 being among the top 10 hubs. Consistent with previous findings, this module was characterized by negative correlation with aneuploidy. We found other modules enriched for genes dysregulated in other UAMS classes, such as MF, CD1 and CD2. We also identified several modules associating with relevant biological processes such as apoptosis, cell communication, Wnt and Toll-like receptor signaling. Correlation of module expression with clinical traits identified insights into genetic subgroups of MM that have not been previously described. For example, we found a module positively correlated with African American ethnicity. This module was also characterized by enrichment for genes in the fragile regions 5q31 and 6q21. These findings may provide important and exciting insights into the biology of MM among African Americans as they are at increased risk for MM.
Our integrative network analysis of the CoMMpass dataset uncovers novel and complex patterns of genomic perturbation, key drivers and associations between clinical traits and genetic markers in newly diagnosed MM patients. Disclosures Chari: Celgene: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; Millennium/Takeda: Consultancy, Research Funding; Biotest: Other: Institutional Research Funding; Array Biopharma: Consultancy, Other: Institutional Research Funding, Research Funding; Novartis: Consultancy, Research Funding; Onyx: Consultancy, Research Funding. Jagannath:BMS: Honoraria; MERCK: Honoraria; Novartis Pharmaceuticals Corporation: Honoraria; Celgene: Honoraria; Janssen: Honoraria. Dudley:Ayasdi, Inc: Other: Equity; Personalis: Patents & Royalties; NuMedii, Inc: Patents & Royalties; GlaxoSmithKline: Consultancy; Janssen Pharmaceuticals: Consultancy; Ecoeos, Inc: Other: Equity.

29

Peris-Díaz, Manuel D., Shannon R. Sweeney, Olga Rodak, Enrique Sentandreu, and Stefano Tiziani. "R-MetaboList 2: A Flexible Tool for Metabolite Annotation from High-Resolution Data-Independent Acquisition Mass Spectrometry Analysis." Metabolites 9, no. 9 (2019): 187. http://dx.doi.org/10.3390/metabo9090187.

Abstract:

Technological advancements have permitted the development of innovative multiplexing strategies for data independent acquisition (DIA) mass spectrometry (MS). Software solutions and extensive compound libraries facilitate the efficient analysis of MS1 data, regardless of the analytical platform. However, the development of comparable tools for DIA data analysis has significantly lagged. This research introduces an update to the former MetaboList R package and a workflow for full-scan MS1 and MS/MS DIA processing of metabolomic data from multiplexed liquid chromatography high-resolution mass spectrometry (LC-HRMS) experiments. Compared with the former version, the update adds new functions that address isolated MS1 and MS/MS workflows, processing of MS/MS data from stepped collision energies, performance scoring of metabolite annotations, and batch job analysis. The flexibility and efficiency of this strategy were assessed through the study of the metabolite profiles of human urine, leukemia cell culture, and medium samples analyzed by either liquid chromatography quadrupole time-of-flight (q-TOF) or quadrupole orbital (q-Orbitrap) instruments. This open-source alternative was designed to promote global metabolomic strategies based on recursive retrospective research of multiplexed DIA analysis.

30

Song, Bosheng, Zimeng Li, Xuan Lin, Jianmin Wang, Tian Wang, and Xiangzheng Fu. "Pretraining model for biological sequence data." Briefings in Functional Genomics 20, no. 3 (2021): 181–95. http://dx.doi.org/10.1093/bfgp/elab025.

Abstract:

With the development of high-throughput sequencing technology, biological sequence data reflecting life information have become increasingly accessible. Particularly against the background of the COVID-19 pandemic, biological sequence data play an important role in detecting diseases, analyzing mechanisms and discovering specific drugs. In recent years, pretraining models that have emerged in natural language processing have attracted widespread attention in many research fields, not only because they decrease training cost but also because they improve performance on downstream tasks. Pretraining models are used for embedding biological sequences and extracting features from large biological sequence corpora to comprehensively understand the biological sequence data. In this survey, we provide a broad review of pretraining models for biological sequence data. We first introduce biological sequences and corresponding datasets, including a brief description and an accessible link. Subsequently, we systematically summarize popular pretraining models for biological sequences based on four categories: CNN, word2vec, LSTM and Transformer. Then, we present some applications of the proposed pretraining models on downstream tasks to explain the role of pretraining models. Next, we provide a novel pretraining scheme for protein sequences and a multitask benchmark for protein pretraining models. Finally, we discuss the challenges and future directions in pretraining models for biological sequences.

31

Armstrong, Nicola J., and Mark A. van de Wiel. "Microarray Data Analysis: From Hypotheses to Conclusions Using Gene Expression Data." Analytical Cellular Pathology 26, no. 5-6 (2004): 279–90. http://dx.doi.org/10.1155/2004/943940.

Abstract:

We review several commonly used methods for the design and analysis of microarray data. To begin with, some experimental design issues are addressed. Several approaches for pre‐processing the data (filtering and normalization) before the statistical analysis stage are then discussed. A common first step in this type of analysis is gene selection based on statistical testing. Two approaches, permutation and model‐based methods are explained and we emphasize the need to correct for multiple testing. Moreover, powerful approaches based on gene sets are mentioned. Clustering of either genes or samples is frequently performed when analyzing microarray data. We summarize the basics of both supervised and unsupervised clustering (classification). The latter may be of use for creating diagnostic arrays, for example. Construction of biological networks, such as pathways, is a statistically challenging but complex task that is a relatively new development and hence mentioned only briefly. We finish with some remarks on literature and software. The emphasis in this paper is on the philosophy behind several statistical issues and on a critical interpretation of microarray related analysis methods.
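
The abstract above names permutation methods as one approach to gene selection by statistical testing. As a hedged illustration, the following Python sketch runs an exact two-sided permutation test on the difference in group means for a single gene by enumerating every relabelling of the samples. The function name and expression values are assumptions for this sketch; in a real microarray analysis the test would be repeated per gene and followed by the multiple-testing correction the abstract emphasizes.

```python
from itertools import combinations


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for the difference in means."""
    pooled = group_a + group_b
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        # Absolute difference between the two group means.
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    extreme = 0
    n_relabellings = 0
    # Enumerate every way of choosing which samples carry label "a".
    for idx in combinations(range(len(pooled)), n_a):
        n_relabellings += 1
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            extreme += 1
    return extreme / n_relabellings


# Hypothetical log-expression values for one gene in two conditions.
p_value = exact_permutation_p_value([5.1, 5.4, 5.0, 5.3], [3.9, 4.2, 4.0, 4.1])
```

With four samples per group there are only 70 relabellings, so the smallest attainable two-sided p-value is 2/70; exhaustive enumeration is practical only for small groups, and larger designs would sample relabellings at random instead.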

32

Mabrouk, Mai S., Safaa M. Naeem, and Mohamed A. Eldosoky. "DIFFERENT GENOMIC SIGNAL PROCESSING METHODS FOR EUKARYOTIC GENE PREDICTION: A SYSTEMATIC REVIEW." Biomedical Engineering: Applications, Basis and Communications 29, no. 01 (2017): 1730001. http://dx.doi.org/10.4015/s1016237217300012.

Abstract:

The field of bioinformatics has now firmly established itself as a discipline in molecular biology and incorporates a wide range of subject areas, from structural biology and genomics to gene expression studies. Bioinformatics is the application of computer technology to the management of biological information. Genomic signal processing (GSP) techniques have been applied widely in bioinformatics and will continue to play an essential part in the investigation of biomedical problems. GSP refers to the use of digital signal processing (DSP) methods for the analysis of genomic data (e.g. DNA sequences). Recently, applications of GSP in bioinformatics have received considerable attention, such as identification of DNA protein-coding regions, identification of reading frames, cancer detection and others. Cancer, known medically as malignant neoplasm, is one of the most dangerous diseases that the world faces and has raised the death rate in recent years, so detecting it at an early stage offers a promising way to take action against this risk. GSP is a method that can be used to detect cancerous cells, which are often caused by genetic abnormality. This systematic review discusses some of the GSP applications in bioinformatics generally. The GSP techniques used for cancer detection in particular are presented to collect recent results and summarize what has been achieved so far in this new subject of research.
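
The abstract above describes GSP as applying DSP methods to DNA sequences, for example to identify protein-coding regions. One widely used idea of this kind is to map the sequence to four binary indicator sequences (one per base) and measure the Fourier power at a period of three bases, which tends to be elevated in coding regions. The Python sketch below illustrates that idea only; it is not claimed to be a method from the review, and the function name and toy sequences are assumptions.

```python
import cmath


def period3_power(sequence):
    """Summed DFT power of the four base-indicator sequences at period 3."""
    n = len(sequence)
    total = 0.0
    for base in "ACGT":
        # DFT coefficient at frequency 1/3 of the 0/1 indicator for this base:
        # only positions holding `base` contribute a term.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * pos / 3)
            for pos, symbol in enumerate(sequence)
            if symbol == base
        )
        total += abs(coeff) ** 2
    return total / n


# Toy sequences: one with a strict 3-base repeat, one with a 4-base repeat.
periodic = "ATG" * 20
flat = "ACGT" * 15
power_periodic = period3_power(periodic)
power_flat = period3_power(flat)
```

In practice the measure would be computed over a sliding window along a genome so that peaks can be located; the whole-sequence version here only shows that a 3-base repeat gives a large value while a 4-base repeat of the same length gives a value near zero.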

33

Pecht, Tal, Anna C. Aschenbrenner, Thomas Ulas, and Antonella Succurro. "Modeling population heterogeneity from microbial communities to immune response in cells." Cellular and Molecular Life Sciences 77, no. 3 (2019): 415–32. http://dx.doi.org/10.1007/s00018-019-03378-w.

Abstract:

Heterogeneity is universally observed in all natural systems and across multiple scales. Understanding population heterogeneity is an intriguing and attractive topic of research in different disciplines, including microbiology and immunology. Microbes and mammalian immune cells present obviously rather different system-specific biological features. Nevertheless, as typically occurs in science, similar methods can be used to study both types of cells. This is particularly true for mathematical modeling, in which key features of a system are translated into algorithms to challenge our mechanistic understanding of the underlying biology. In this review, we first present a broad overview of the experimental developments that allowed observing heterogeneity at the single cell level. We then highlight how this “data revolution” requires the parallel advancement of algorithms and computing infrastructure for data processing and analysis, and finally present representative examples of computational models of population heterogeneity, from microbial communities to immune response in cells.

34

Sahlabadi, Amirhossein, Ravie Chandren Muniyandi, Mahdi Sahlabadi, and Hossein Golshanbafghy. "Framework for Parallel Preprocessing of Microarray Data Using Hadoop." Advances in Bioinformatics 2018 (March 29, 2018): 1–9. http://dx.doi.org/10.1155/2018/9391635.

Abstract:

Nowadays, microarray technology has become one of the popular ways to study gene expression and diagnose disease. The National Center for Biotechnology Information (NCBI) hosts public databases containing large volumes of biological data that must be preprocessed, since they carry high levels of noise and bias. Robust Multiarray Average (RMA) is one of the standard and popular methods used to preprocess the data and remove the noise. Most preprocessing algorithms are time-consuming and unable to handle a large number of datasets with thousands of experiments. Parallel processing can be used to address these issues. Hadoop is a well-known distributed file system framework that provides a parallel environment in which to run the experiment. In this research, for the first time, the capability of Hadoop and the statistical power of R have been leveraged to parallelize the available preprocessing algorithm, RMA, to efficiently process microarray data. The experiment was run on a cluster containing 5 nodes, each with 16 cores and 16 GB of memory. It compares the efficiency and performance of parallelized RMA using Hadoop with parallelized RMA using the affyPara package as well as sequential RMA. The results show that the speed-up rate of the proposed approach outperforms the sequential approach and the affyPara approach.
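
RMA, the preprocessing method named in the abstract above, is commonly described as background correction, quantile normalization and median-polish summarization. To make the middle step concrete, the following is a minimal pure-Python sketch of quantile normalization across arrays. It is an illustration under simplifying assumptions (no special handling of tied values, no background correction or summarization) and does not reproduce the Hadoop or affyPara implementations compared in the paper; the function name and intensities are hypothetical.

```python
def quantile_normalize(arrays):
    """Force every array to share the same empirical distribution."""
    n_probes = len(arrays[0])
    # Sorting each array is independent of the others, so this step could
    # be distributed across workers (an assumption about where parallelism
    # might help, not a description of the paper's design).
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Rank each probe within its array, then substitute the reference value.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical probe intensities for three arrays (illustrative only).
arrays = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.5, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(arrays)
```

After the call, each normalized array contains the same set of values, and only the position of each value (the probe's rank within its own array) differs between arrays.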

35

Poos, Alexandra M., Jan-Philipp Mallm, Stephan M. Tirier, et al. "A Comprehensive Analysis of Single-Cell Chromatin Accessibility and Gene Expression Identifies Intra-Tumor Heterogeneity and Molecular Treatment Responses in Relapsed/Refractory Multiple Myeloma." Blood 134, Supplement_1 (2019): 575. http://dx.doi.org/10.1182/blood-2019-130051.

Abstract:

Introduction: Multiple myeloma (MM) is a heterogeneous malignancy of clonal plasma cells that accumulate in the bone marrow (BM). Despite new treatment approaches, in most patients resistant subclones are selected by therapy, resulting in the development of refractory disease. While the subclonal architecture in newly diagnosed patients has been investigated in great detail, intra-tumor heterogeneity in relapsed/refractory (RR) MM is poorly characterized. Recent technological and computational advances provide the opportunity to systematically analyze tumor samples at single-cell (sc) level with high accuracy and throughput. Here, we present a pilot study for an integrative analysis of sc Assay for Transposase-Accessible Chromatin with high-throughput sequencing (scATAC-seq) and scRNA-seq with the aim to comprehensively study the regulatory landscape, gene expression, and evolution of individual subclones in RRMM patients. Methods: We have included 20 RRMM patients with longitudinally collected paired BM samples. scATAC- and scRNA-seq data were generated using the 10X Genomics platform. Pre-processing of the sc-seq data was performed with the CellRanger software (reference genome GRCh38). For downstream analyses the R-packages Seurat and Signac (Satija Lab) as well as Cicero (Trapnell Lab) were used. For all patients bulk whole genome sequencing (WGS) data was available, which we used for confirmatory studies of intra-tumor heterogeneity. Results: A comprehensive study at the sc level requires extensive quality controls (QC). All scATAC-seq files passed the QC, including the detected number of cells, number of fragments in peaks or the ratio of mononucleosomal to nucleosome-free fragments. Yet, unsupervised clustering of the differentially accessible regions resulted in two main clusters, strongly associated with sample processing time. Delay of sample processing by 1-2 days, e.g.
due to shipment from participating centers, resulted in global change of chromatin accessibility with more than 10,000 regions showing differences compared to directly processed samples. The corresponding scRNA-seq files also consistently failed QC, including detectable genes per cell and the percentage of mitochondrial RNA. We excluded these samples from the study. Analysing scATAC-seq data, we observed distinct clusters before and after treatment of RRMM, indicating clonal adaptation or selection in all samples. Treatment with carfilzomib resulted in highly increased co-accessibility and &gt;100 genes were differentially accessible upon treatment. These genes are related to the activation of immune cells (including T-, and B-cells), cell-cell adhesion, apoptosis and signaling pathways (e.g. NFκB) and include several chaperone proteins (e.g. HSPH1) which were upregulated in the scRNA-seq data upon proteasome inhibition. The power of our comprehensive approach for detection of individual subclones and their evolution is exemplarily illustrated in a patient who was treated with a MEK inhibitor and achieved complete remission. This patient showed two main clusters in the scATAC-seq data before treatment, suggesting presence of two subclones. Using copy number profiles based on WGS and scRNA-seq data and performing a trajectory analysis based on scATAC-seq data, we could confirm two different subclones. At relapse, a seemingly independent dominant clone emerged. Upon comprehensive integration of the datasets, one of the initial subclones could be identified as the precursor of this dominant clone. We observed increased accessibility for 108 regions (e.g. JUND, HSPA5, EGR1, FOSB, ETS1, FOXP2) upon MEK inhibition. The most significant differentially accessible region in this clone and its precursor included the gene coding for krüppel-like factor 2 (KLF2). scRNA-seq data showed overexpression of KLF2 in the MEK-inhibitor resistant clone, confirming KLF2 scATAC-seq data. 
KLF2 has been reported to play an essential role together with KDM3A and IRF1 in MM cell survival and adhesion to stromal cells in the BM. Conclusions: Our data strongly suggest using only immediately processed samples for single cell technologies. Integrating scATAC- and scRNA-seq together with bulk WGS data showed that detection of individual clones and longitudinal changes in the activity of cis-regulatory regions and gene expression is feasible and informative in RRMM. Disclosures: Goldschmidt: John-Hopkins University: Research Funding; Novartis: Membership on an entity's Board of Directors or advisory committees, Research Funding; John-Hopkins University: Research Funding; Bristol-Myers Squibb: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Mundipharma: Research Funding; Takeda: Membership on an entity's Board of Directors or advisory committees, Research Funding; MSD: Research Funding; Molecular Partners: Research Funding; Dietmar-Hopp-Stiftung: Research Funding; Janssen: Consultancy, Research Funding; Chugai: Honoraria, Research Funding; Janssen: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Sanofi: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Amgen: Consultancy, Research Funding; Celgene: Honoraria, Membership on an entity's Board of Directors or advisory committees, Research Funding; Adaptive Biotechnology: Membership on an entity's Board of Directors or advisory committees.


36

Rodiño-Janeiro, Bruno Kotska, Cristina Pardo-Camacho, Javier Santos, and Cristina Martínez. "Mucosal RNA and protein expression as the next frontier in IBS: abnormal function despite morphologically intact small intestinal mucosa." American Journal of Physiology-Gastrointestinal and Liver Physiology 316, no. 6 (2019): G701–G719. http://dx.doi.org/10.1152/ajpgi.00186.2018.


Abstract:

Irritable bowel syndrome (IBS) is one of the commonest gastrointestinal disorders. Although long considered a purely functional disorder, intense research in past years has yielded a very complex and varied array of observations indicating the presence of structural and molecular abnormalities underlying the characteristic motor and sensory changes and clinical manifestations. Analysis of gene and protein expression in the intestinal mucosa has shed light on the molecular mechanisms implicated in IBS physiopathology. This analysis uncovers constitutive and inductive genetic and epigenetic marks in the small and large intestine that highlight the role of the epithelial barrier, immune activation, mucosal processing of foods and toxins, and several new molecular pathways in the origin of IBS. The incorporation of innovative high-throughput techniques into IBS research is beginning to provide new insights into highly structured and interconnected molecular mechanisms modulating gene and protein expression at the tissue level. Integration and correlation of these molecular mechanisms with clinical and environmental data, applying systems biology/medicine and data mining tools, emerge as crucial steps that will allow us to reach a meaningful and more definitive understanding of the detailed development of IBS, reveal the real mechanisms and causality of the disease, and identify more specific diagnostic biomarkers and effective treatments.


37

Juhás, P., T. Davis, C. L. Farrow, and S. J. L. Billinge. "PDFgetX3: a rapid and highly automatable program for processing powder diffraction data into total scattering pair distribution functions." Journal of Applied Crystallography 46, no. 2 (2013): 560–66. http://dx.doi.org/10.1107/s0021889813005190.


Abstract:

PDFgetX3 is a new software application for converting X-ray powder diffraction data to an atomic pair distribution function (PDF). PDFgetX3 has been designed for ease of use, speed and automated operation. The software can readily process hundreds of X-ray patterns within a few seconds and is thus useful for high-throughput PDF studies that measure numerous data sets as a function of time, temperature or other environmental parameters. In comparison to the preceding programs, PDFgetX3 requires fewer inputs and less user experience and it can be readily adopted by novice users. The live-plotting interactive feature allows the user to assess the effects of calculation parameters and select their optimum values. PDFgetX3 uses an ad hoc data correction method, where the slowly changing structure-independent signal is filtered out to obtain coherent X-ray intensities that contain structure information. The output from PDFgetX3 has been verified by processing experimental PDFs from inorganic, organic and nanosized samples and comparing them with their counterparts from a previous established software. In spite of the different algorithm, the obtained PDFs were nearly identical and yielded highly similar results when used in structure refinement. PDFgetX3 is written in the Python language and features a well documented reusable code base. The software can be used either as a standalone application or as a library of PDF processing functions that can be called from other Python scripts. The software is free for open academic research but requires paid license for commercial use.


38

Lyu, Yuan, Steven Kopcho, Folnetti A. Alvarez, Bryson C. Okeoma, and Chioma M. Okeoma. "Development of a Cationic Amphiphilic Helical Peptidomimetic (B18L) As A Novel Anti-Cancer Drug Lead." Cancers 12, no. 9 (2020): 2448. http://dx.doi.org/10.3390/cancers12092448.


Abstract:

BST-2 is a novel driver of cancer progression whose expression confers oncogenic properties to breast cancer cells. As such, targeting BST-2 in tumors may be an effective therapeutic approach against breast cancer. Here, we sought to develop a potent cytotoxic anti-cancer agent using the second-generation BST-2-based anti-adhesion peptide, B18, as a backbone. To this end, we designed a series of five B18-derived peptidomimetics. Among these, B18L, a cationic amphiphilic α-helical peptidomimetic, was selected as the drug lead because it displayed superior anti-cancer activity against both drug-resistant and drug-sensitive cancer cells, with minimal toxicity on normal cells. Probing the mechanism of action using molecular dynamics simulations and biochemical and membrane biophysics studies, we observed that B18L binds BST-2 and possesses membranolytic characteristics. Furthermore, molecular biology studies show that B18L dysregulates cancer signaling pathways, resulting in decreased Src and Erk1/2 phosphorylation, increased expression of pro-apoptotic Bcl2 proteins and caspase 3 cleavage products, as well as processing of the caspase substrate, poly (ADP-ribose) polymerase-1 (PARP-1), to the characteristic apoptotic fragment. These data indicate that through the coordinated regulation of membrane, mitochondrial and signaling events, B18L executes cancer cell death and thus has the potential to be developed into a potent and selective anti-cancer compound.


39

Blaschke, Christian, Lynette Hirschman, Alexander Yeh, and Alfonso Valencia. "Critical Assessment of Information Extraction Systems in Biology." Comparative and Functional Genomics 4, no. 6 (2003): 674–77. http://dx.doi.org/10.1002/cfg.337.


Abstract:

An increasing number of groups are now working in the area of text mining, focusing on a wide range of problems and applying both statistical and linguistic approaches. However, it is not possible to compare the different approaches, because there are no common standards or evaluation criteria; in addition, the various groups are addressing different problems, often using private datasets. As a result, it is impossible to determine how well the existing systems perform, and particularly what performance level can be expected in real applications. This is similar to the situation in text processing in the late 1980s, prior to the Message Understanding Conferences (MUCs). With the introduction of a common evaluation and standardized evaluation metrics as part of these conferences, it became possible to compare approaches, to identify those techniques that did or did not work and to make progress. This progress has resulted in a common pipeline of processes and a set of shared tools available to the general research community. The field of biology is ripe for a similar experiment. Inspired by this example, the BioLINK group (Biological Literature, Information and Knowledge [1]) is organizing a CASP-like evaluation for the text data-mining community applied to biology. The two main tasks specifically address two major bottlenecks for text mining in biology: (1) the correct detection of gene and protein names in text; and (2) the extraction of functional information related to proteins based on the GO classification system. For further information and participation details, see http://www.pdg.cnb.uam.es/BioLink/BioCreative.eval.html


40

Stanstrup, Jan, Corey Broeckling, Rick Helmus, et al. "The metaRbolomics Toolbox in Bioconductor and beyond." Metabolites 9, no. 10 (2019): 200. http://dx.doi.org/10.3390/metabo9100200.


Abstract:

Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse Metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics specific packages primarily available on CRAN, Bioconductor and GitHub.


41

Tasić, Srdjan, and Irena Tasić. "THE APPLICATION OF BIOINFORMATICS IN THE MOLECULAR CHARACTERIZATION OF Bacillus licheniformis." Knowledge International Journal 28, no. 4 (2018): 1367–70. http://dx.doi.org/10.35120/kij28041367s.


Abstract:

Bioinformatics is the application of information technology in biology and includes the processes of gathering, processing and analysing experimental results. Bioinformatics now entails the creation and advancement of databases, algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management and analysis of biological data. Computers are necessary in microbiology because the manual comparison of multiple sequences has become impractical. The research subject was the characterisation of the strain ST51 isolated from the thermal water well in Vranjska Banja, southeastern Serbia. Molecular characterisation of these three strains was performed by analysis of the tuf gene, which encodes the elongation factor Tu. The DNA sequences were compared to those deposited in GenBank databases using the BLAST program (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn). The biochemical characterisation was performed using the API 50 CHB system (bioMerieux) and APIWEB™ software Ver. 4.1. The molecular characterisation of the strain ST51 showed the highest level of similarity to the Bacillus licheniformis strain ATCC 14580 (99% identical). The biochemical characterisation confirmed that the strain ST51 belongs to the species Bacillus licheniformis. Given that the conducted analyses yielded a substantial amount of data, they were processed and compared using biostatistical methods and tools in order to achieve the highest probability for the resulting taxonomic classification. Modern research involves the analysis of a significant number of variables, which is why considerably more statistical analyses are involved in their interpretation and presentation. Our results indicate that different methods are needed for proper determination and characterisation of isolates/strains. Regarding taxonomy, molecular methods are the most precise, while for physiological specificity biochemical methods are more reliable.


42

Manojlovic, Zarko, Austin Christofferson, Christophe Legendre, et al. "In-Depth Molecular Profiling of Multiple Myeloma in African Americans." Blood 126, no. 23 (2015): 2973. http://dx.doi.org/10.1182/blood.v126.23.2973.2973.


Abstract:

Introduction: Multiple Myeloma (MM) is a complex malignancy of plasma cells with well-described hyperdiploidy and immunoglobulin gene rearrangements. To promote rapid advances in the field, the Multiple Myeloma Research Foundation (MMRF) initiated an intensive and comprehensive longitudinal study (CoMMpass) designed to create a rich discovery ecosystem through in-depth clinical and molecular profiling, in order to understand the molecular perturbations of the disease in the context of therapy. The large study population empowered us to stratify mutational landscapes among different ethnicities and to assess their influence on broader disparities in tumor dynamics. Methods: Clinical data, tumor/normal sample collection, and mutational landscape analysis described in this abstract are derived from the MMRF CoMMpass IA7 release, which comprises 93 self-identified African American (AA) and 377 European American (EA) patients. The post-processing and primary analysis were done on baseline samples only. Whole exome sequencing was analyzed for the detection of somatic events. Secondary analysis was performed using the MutSigCV (Mutation Significance) algorithm and GISTIC (Genomic Identification of Significant Targets in Cancer) to determine the significance of coding mutations and copy number events. Results: Our preliminary comparison analysis of CoMMpass IA7 data demonstrated that overall there was no statistical difference (p=0.5973) in nonsilent mutation burden between the two stratified groups, AA (μ=63.9 mutations/patient) vs EA (μ=73.7 mutations/patient). However, we observed several notable differences. The most notable population difference (p<0.0001) was TP53 mutation occurrence, which was more common in EA, 5.6% (21/377), compared with AA, 2.1% (2/93). Furthermore, MutSig analysis also revealed a novel candidate, PTCHD3 (p = 7.07E-06, q = 3.33E-02), with 6% occurrence in AA compared to only 0.04% in EA.
Furthermore, we looked for differences in mutation occurrence of at least 5-fold between the two stratified populations. We identified several additional genes that have higher mutation frequencies in tumors from AA patients, including ANKRD26, BCL7A and BRWD3. Interestingly, the majority are implicated in epigenetic regulatory mechanisms. Moreover, despite the complex karyotype in MM translocations, our analysis demonstrated a lower frequency of 14q32 translocations in tumors from AA patients (10%) compared to tumors from EA patients (37%). These data would independently validate our previously reported differences in 14q32 breakpoints between these two populations. Conclusion: Data from this comprehensive, diverse, multi-institutional longitudinal study have afforded us the opportunity to study population differences among MM patients in an unprecedented way. Taking advantage of high-quality, high-resolution whole exome and whole-genome data has allowed us to identify potential differences in the genomic and molecular landscape of MM from AA and EA patients. These data may help us further understand the incidence and outcomes among patients from these populations. As CoMMpass continues to mature, these datasets will be retested, which may cause some of these preliminary reported events to either level off or increase in power. Disclosures: Mulligan: Millennium Pharmaceuticals, Inc., Cambridge, MA, USA, a wholly owned subsidiary of Takeda Pharmaceutical Company Limited: Employment. Lonial: Janssen: Consultancy, Research Funding; Bristol-Myers Squibb: Consultancy, Research Funding; Novartis: Consultancy, Research Funding; Millennium: Consultancy, Research Funding; Celgene: Consultancy, Research Funding; Onyx: Consultancy, Research Funding. Keats: Translational Genomic Research Institute: Employment.


43

Hajibaba, Majid, Mohsen Sharifi, and Saeid Gorgin. "The Influence of Memory-Aware Computation on Distributed BLAST." Current Bioinformatics 14, no. 2 (2019): 157–63. http://dx.doi.org/10.2174/1574893613666180601080811.


Abstract:

Background: One of the pivotal challenges in genomic research today is the fast processing of voluminous data such as that generated by high-throughput Next-Generation Sequencing technologies. However, BLAST (Basic Local Alignment Search Tool), a long-established and renowned tool in Bioinformatics, has been shown to be very slow in this regard. Objective: To improve the performance of BLAST on voluminous data, we have applied a novel memory-aware technique to BLAST for faster parallel processing. Method: We have used a master-worker model alongside a memory-aware technique in which the master partitions the whole data into equal chunks, one chunk for each worker, and each worker then further splits and formats its allocated data chunk according to the size of its memory. Each worker searches every split of its data one by one through a list of queries. Results: We have chosen a list of queries with different lengths to run insensitive searches in a huge database called UniProtKB/TrEMBL. Our experiments show a 20 percent improvement in performance when workers used our proposed memory-aware technique compared to when they were not memory aware. Experiments show an even higher performance improvement, approximately 50 percent, when we applied our memory-aware technique to mpiBLAST. Conclusion: We have shown that memory-awareness in formatting a bulky database, when running BLAST, can improve performance significantly, while preventing unexpected crashes in low-memory environments. Even though distributed computing attempts to mitigate search time by partitioning and distributing database portions, our memory-aware technique alleviates the negative effects of page faults on performance.
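The master-worker scheme described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (partition_equal, split_by_memory, search_worker), the character-count memory budget, and the substring match that stands in for a real BLAST search are all assumptions made for illustration only.

```python
from typing import List, Tuple


def partition_equal(records: List[str], n_workers: int) -> List[List[str]]:
    """Master step: split records into n_workers contiguous, near-equal chunks."""
    base, extra = divmod(len(records), n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)  # spread the remainder over the first chunks
        chunks.append(records[start:start + size])
        start += size
    return chunks


def split_by_memory(chunk: List[str], memory_budget: int) -> List[List[str]]:
    """Worker step: split a chunk into pieces whose total length (in characters,
    a hypothetical proxy for memory use) stays within memory_budget.
    A single record larger than the budget still gets a piece of its own."""
    pieces, current, used = [], [], 0
    for rec in chunk:
        if current and used + len(rec) > memory_budget:
            pieces.append(current)
            current, used = [], 0
        current.append(rec)
        used += len(rec)
    if current:
        pieces.append(current)
    return pieces


def search_worker(chunk: List[str], queries: List[str],
                  memory_budget: int) -> List[Tuple[str, str]]:
    """Search every memory-sized piece, one by one, against the list of queries.
    Substring matching is a placeholder for an actual alignment search."""
    hits = []
    for piece in split_by_memory(chunk, memory_budget):
        for query in queries:
            for rec in piece:
                if query in rec:
                    hits.append((query, rec))
    return hits


if __name__ == "__main__":
    database = ["AAAA", "CCGG", "TTAA", "GGCC", "ACGT"]
    for worker_chunk in partition_equal(database, n_workers=2):
        print(search_worker(worker_chunk, ["AA", "GG"], memory_budget=8))
```

The design point the sketch tries to capture is that the second split happens on the worker, which knows its own memory limit, so no piece has to be paged in and out while the queries are run against it.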


44

Escher, Claudia, Jakob Vowinckel, Karel Novy, et al. "Next generation proteomics in precision oncology: 1000s of proteome and phosphoproteome profiles of tumors and matching healthy tissues as meaningful layer in multi-omics database." Journal of Clinical Oncology 38, no. 15_suppl (2020): e15672-e15672. http://dx.doi.org/10.1200/jco.2020.38.15_suppl.e15672.


Abstract:

Background: The rise of precision oncology therapeutics requires deep understanding of all molecular mechanisms involved in cancer biology. IndivuType offers the world’s first multi-omics database for individualized cancer therapy, analyzing the highest quality cancer biospecimens to generate the most comprehensive dataset, including genomics (WGS), transcriptomics, proteomics, and clinical outcome information. Indivumed is committed to the quality of the IndivuType ecosystem, starting with stringent SOP-driven sample collection combined with thorough validation of clinical information and data integrity. The availability of multi-omics data from the same tumor can provide a comprehensive molecular picture of cancer for a given patient. Protein expression and activation are directly related to cellular function and hence provide actionable information about druggable targets. Until recently, proteomics technology could not match the scale of next-gen sequencing, and consequently precision medicine has almost exclusively been based on gene-level data. Here we present the first large-scale data set for protein expression and phosphorylation. Enabled by the data independent acquisition (DIA) workflow, a mass spectrometric method provided by Biognosys that obtains peptide fragmentation data in a highly parallelized way with high sensitivity, more than 7,000 proteins in the whole proteome (WP) and 20,000 phospho-peptides in the phospho-proteome (PP) workflow were profiled. Methods: Sample processing from 5 mg of tissue per sample was performed using a liquid handling robot. Phospho-peptide enrichment was carried out with a Kingfisher Flex device and MagReSyn Ti-IMAC magnetic beads. DIA LC-MS/MS was performed on multiple platforms consisting of a Thermo Scientific Q Exactive HF-X mass spectrometer coupled to a Waters M-Class LC. Chromatography was operated at 5 µL/min, and separation was achieved using 45 min (WP) and 60 min (PP) gradients.
Results: Several thousands of high-quality patient samples of various cancer types have been analyzed to date. The resulting proteome and phospho-proteome data has been integrated into the IndivuType database, thereby providing a solid foundation to advance our understanding of cancer. Conclusions: With the ongoing addition of more samples and associated deep and rich data, the platform could unravel key molecular events and is expected to transform knowledge into actionable treatments and personalized therapies.


45

Stroman, Patrick W., Howard J. M. Warren, Gabriela Ioachim, Jocelyn M. Powers, and Kaitlin McNeil. "A comparison of the effectiveness of functional MRI analysis methods for pain research: The new normal." PLOS ONE 15, no. 12 (2020): e0243723. http://dx.doi.org/10.1371/journal.pone.0243723.


Abstract:

Studies of the neural basis of human pain processing present many challenges because of the subjective and variable nature of pain, and the inaccessibility of the central nervous system. Neuroimaging methods, such as functional magnetic resonance imaging (fMRI), have provided the ability to investigate these neural processes, and yet commonly used analysis methods may not be optimally adapted for studies of pain. Here we present a comparison of model-driven and data-driven analysis methods, specifically for the study of human pain processing. Methods are tested using data from healthy control participants in two previous studies, with separate data sets spanning the brain, and the brainstem and spinal cord. Data are analyzed by fitting time-series responses to predicted BOLD responses in order to identify significantly responding regions (model-driven), as well as with connectivity analyses (data-driven) based on temporal correlations between responses in spatially separated regions, and with connectivity analyses based on structural equation modeling, allowing for multiple source regions to explain the signal variations in each target region. The results are assessed in terms of the amount of signal variance that can be explained in each region, and in terms of the regions and connections that are identified as having BOLD responses of interest. The characteristics of BOLD responses in identified regions are also investigated. The results demonstrate that data-driven approaches are more effective than model-driven approaches for fMRI studies of pain.


46

Mölder, Felix, Kim Philipp Jablonski, Brice Letcher, et al. "Sustainable data analysis with Snakemake." F1000Research 10 (January 18, 2021): 33. http://dx.doi.org/10.12688/f1000research.29032.1.


Abstract:

Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that data analyses should ideally be conducted in a reproducible way. Reproducibility enables technical validation and regeneration of results on the original or even new data. However, reproducibility alone is by no means sufficient to deliver an analysis that is of lasting impact (i.e., sustainable) for the field, or even just one research group. We postulate that it is equally important to ensure adaptability and transparency. The former describes the ability to modify the analysis to answer extended or slightly different research questions. The latter describes the ability to understand the analysis in order to judge whether it is not only technically, but methodologically valid. Here, we analyze the properties needed for a data analysis to become reproducible, adaptable, and transparent. We show how the popular workflow management system Snakemake can be used to guarantee this, and how it enables an ergonomic, combined, unified representation of all steps involved in data analysis, ranging from raw data processing, to quality control and fine-grained, interactive exploration and plotting of final results.


47

Mölder, Felix, Kim Philipp Jablonski, Brice Letcher, et al. "Sustainable data analysis with Snakemake." F1000Research 10 (April 19, 2021): 33. http://dx.doi.org/10.12688/f1000research.29032.2.


Abstract:

Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that data analyses should ideally be conducted in a reproducible way. Reproducibility enables technical validation and regeneration of results on the original or even new data. However, reproducibility alone is by no means sufficient to deliver an analysis that is of lasting impact (i.e., sustainable) for the field, or even just one research group. We postulate that it is equally important to ensure adaptability and transparency. The former describes the ability to modify the analysis to answer extended or slightly different research questions. The latter describes the ability to understand the analysis in order to judge whether it is not only technically, but methodologically valid. Here, we analyze the properties needed for a data analysis to become reproducible, adaptable, and transparent. We show how the popular workflow management system Snakemake can be used to guarantee this, and how it enables an ergonomic, combined, unified representation of all steps involved in data analysis, ranging from raw data processing, to quality control and fine-grained, interactive exploration and plotting of final results.


48

Ntini, Evgenia, and Annalisa Marsico. "Functional impacts of non-coding RNA processing on enhancer activity and target gene expression." Journal of Molecular Cell Biology 11, no. 10 (2019): 868–79. http://dx.doi.org/10.1093/jmcb/mjz047.


Abstract:

Tight regulation of gene expression is orchestrated by enhancers. Through recent research advancements, it is becoming clear that enhancers are not solely distal regulatory elements harboring transcription factor binding sites and decorated with specific histone marks, but they rather display signatures of active transcription, showing distinct degrees of transcription unit organization. Thereby, a substantial fraction of enhancers give rise to different species of non-coding RNA transcripts with an unprecedented range of potential functions. In this review, we bring together data from recent studies indicating that non-coding RNA transcription from active enhancers, as well as enhancer-produced long non-coding RNA transcripts, may modulate or define the functional regulatory potential of the cognate enhancer. In addition, we summarize supporting evidence that RNA processing of the enhancer-associated long non-coding RNA transcripts may constitute an additional layer of regulation of enhancer activity, which contributes to the control and final outcome of enhancer-targeted gene expression.


49

Misra, Biswapriya B., Carl Langefeld, Michael Olivier, and Laura A. Cox. "Integrated omics: tools, advances and future approaches." Journal of Molecular Endocrinology 62, no. 1 (2019): R21–R45. http://dx.doi.org/10.1530/jme-18-0055.


Abstract:

With the rapid adoption of high-throughput omic approaches to analyze biological samples such as genomics, transcriptomics, proteomics and metabolomics, each analysis can generate tera- to peta-byte sized data files on a daily basis. These data file sizes, together with differences in nomenclature among these data types, make the integration of these multi-dimensional omics data into biologically meaningful context challenging. Variously named as integrated omics, multi-omics, poly-omics, trans-omics, pan-omics or shortened to just ‘omics’, the challenges include differences in data cleaning, normalization, biomolecule identification, data dimensionality reduction, biological contextualization, statistical validation, data storage and handling, sharing and data archiving. The ultimate goal is toward the holistic realization of a ‘systems biology’ understanding of the biological question. Commonly used approaches are currently limited by the 3 i’s – integration, interpretation and insights. Post integration, these very large datasets aim to yield unprecedented views of cellular systems at exquisite resolution for transformative insights into processes, events and diseases through various computational and informatics frameworks. With the continued reduction in costs and processing time for sample analyses, and increasing types of omics datasets generated such as glycomics, lipidomics, microbiomics and phenomics, an increasing number of scientists in this interdisciplinary domain of bioinformatics face these challenges. We discuss recent approaches, existing tools and potential caveats in the integration of omics datasets for development of standardized analytical pipelines that could be adopted by the global omics research community.

50

Barus, Ewi Mellysa, and Terry Noviar Panggabean. "Pengaruh Laboratorium Virtual Berbasis Problem Solving Terhadap Kemampuan Berpikir Kritis." Jurnal Pendidikan Biologi 9, no. 3 (2020): 11. http://dx.doi.org/10.24114/jpb.v9i3.20004.

Abstract:

Adequate facilities and infrastructure are needed in cell and molecular biology courses to support the achievement of learning objectives. Observations in the Pharmacy Department at Imelda University indicate that practicums are poorly implemented because supporting facilities and infrastructure are lacking, which in turn limits improvement in students' critical thinking skills. The ability to think critically is the main factor affecting student understanding. A medium that can address this gap is a problem solving-based virtual laboratory that encourages students' critical thinking skills. This study aims to identify the effect of a problem solving-based virtual biology laboratory on the critical thinking skills of students in the Pharmacy Undergraduate Study Program at Imelda University, Medan. The research used a one-group pre-test/post-test design. Data collection began in January 2020 and was followed by data processing and analysis. The sample comprised all 63 first-semester students of the Pharmacy Department at Imelda University, selected by total sampling. The statistical analysis used was a one-sample t-test. The results indicate a positive and significant effect (p = 0.00 < 0.05) of problem solving-based virtual laboratories on students' critical thinking skills in the cell and molecular biology courses at the Imelda University Pharmacy Department. With the virtual laboratory, students learn the principles of science in a fast, effective and fun way through virtual laboratory interactions and navigation.
