Journal articles on the topic 'Sequence submission'

Author: Grafiati

Published: 5 June 2025

Last updated: 24 June 2025

Consult the top 50 journal articles for your research on the topic 'Sequence submission.'

1

Gruenstaeudl, Michael. "annonex2embl: automatic preparation of annotated DNA sequences for bulk submissions to ENA." Bioinformatics 36, no. 12 (2020): 3841–48. http://dx.doi.org/10.1093/bioinformatics/btaa209.

Abstract:

Motivation: The submission of annotated sequence data to public sequence databases constitutes a central pillar in biological research. The surge of novel DNA sequences awaiting database submission due to the application of next-generation sequencing has increased the need for software tools that facilitate bulk submissions. This need has yet to be met with the concurrent development of tools to automate the preparatory work preceding such submissions. Results: The author introduces annonex2embl, a Python package that automates the preparation of complete sequence flatfiles for large-scale sequence submissions to the European Nucleotide Archive. The tool enables the conversion of DNA sequence alignments that are co-supplied with sequence annotations and metadata to submission-ready flatfiles. Among other features, the software automatically accounts for length differences among the input sequences while maintaining correct annotations, automatically interlaces metadata to each record and displays a design suitable for easy integration into bioinformatic workflows. As proof of its utility, annonex2embl is employed in preparing a dataset of more than 1500 fungal DNA sequences for database submission. Availability and implementation: annonex2embl is freely available via the Python package index at http://pypi.python.org/pypi/annonex2embl. Supplementary information: Supplementary data are available at Bioinformatics online.

2

Wong, Mark, and Rhodri Leng. "On the design of linked datasets mapping networks of collaboration in the genomic sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa." F1000Research 8 (July 26, 2019): 1200. http://dx.doi.org/10.12688/f1000research.18656.1.

Abstract:

This paper describes a unique two-step methodology used to construct six linked bibliometric datasets covering the sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa genomes. First, we retrieved all sequence submission data from the European Nucleotide Archive (ENA), including accession numbers associated with each species. Second, we used these accession numbers to construct queries to retrieve peer-reviewed scientific publications that first linked to these sequence lengths in the scientific literature. For each species, this resulted in two associated datasets: 1) A .csv file documenting the PMID of each article describing new sequences, all paper authors, all institutional affiliations of each author, countries of institution, year of first submission to the ENA, and the year of article publication, and 2) A .csv file documenting all institutions submitting to the ENA, number of nucleotides sequenced, number of submissions per institution in a given year, and years of submission to the database. In several upcoming publications, we utilise these datasets to understand how institutional collaboration shaped sequencing efforts, and to systematically identify important institutions and changes in network structures over time. This paper, therefore, should aid researchers who would like to use these data for future analyses by making the methodology that underpins it transparent. Further, by detailing our methodology, researchers may be able to utilise our approach to construct similar datasets in the future.

3

Wong, Mark, and Rhodri Leng. "On the design of linked datasets mapping networks of collaboration in the genomic sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa." F1000Research 8 (October 23, 2020): 1200. http://dx.doi.org/10.12688/f1000research.18656.2.

Abstract:

This paper describes a unique two-step methodology used to construct six linked bibliometric datasets covering the sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa genomes. First, we retrieved all sequence submission data from the European Nucleotide Archive (ENA), including accession numbers associated with each species. Second, we used these accession numbers to construct queries to retrieve peer-reviewed scientific publications that first linked to these sequence lengths in the scientific literature. For each species, this resulted in two associated datasets: 1) A .csv file documenting the PMID of each article describing new sequences, all paper authors, all institutional affiliations of each author, countries of institution, year of first submission to the ENA, and the year of article publication, and 2) A .csv file documenting all institutions submitting to the ENA, number of nucleotides sequenced, number of submissions per institution in a given year, and years of submission to the database. In several upcoming publications, we utilise these datasets to understand how institutional collaboration shaped sequencing efforts, and to systematically identify important institutions and changes in network structures over time. This paper, therefore, should aid researchers who would like to use these data for future analyses by making the methodology that underpins it transparent. Further, by detailing our methodology, researchers may be able to utilise our approach to construct similar datasets in the future.

4

Griffin, Hugh G., and Annette M. Griffin. "WWW Nucleotide sequence submission System." Molecular Biotechnology 5, no. 1 (1996): 71. http://dx.doi.org/10.1007/bf02762417.

5

Wong, Mark, and Rhodri Leng. "On the design of linked datasets mapping networks of collaboration in the genomic sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa." F1000Research 8 (February 28, 2023): 1200. http://dx.doi.org/10.12688/f1000research.18656.3.

Abstract:

This data note describes a unique two-step methodology to construct six linked datasets covering the sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa genomes. The datasets were used as evidence in a project that investigated the history of genomic science. To design the datasets, we first retrieved all sequence submission data from the European Nucleotide Archive (ENA), including accession numbers associated with each of our three species. Second, we used these accession numbers to construct queries to retrieve peer-reviewed scientific publications that first described these sequence submissions in the scientific literature. For each species, this resulted in two associated datasets: 1) A .csv file documenting the PMID of each article describing new sequences, all paper authors, all institutional affiliations of each author, countries of institution, year of first submission to the ENA (when available), and the year of article publication, and 2) A .csv file documenting all institutions submitting to the ENA, number of nucleotides sequenced and years of submission to the database. We utilised these datasets to understand how institutional collaboration shaped sequencing efforts, and to systematically identify important institutions and changes in the structure of research communities throughout the history of genomics and across our three target species. This data note, therefore, should aid researchers who would like to use these data for future analyses by making the methodology that underpins it transparent. Further, by detailing our methodology, researchers may be able to utilise our approach to construct similar datasets in the future.

6

Levy, Richard, Jessie Berta-Thompson, Gary Olds, and Andrew Wilson. "Specimods: A web-based tool for producing Genbank submission files for sequenced museum specimens." Biodiversity Information Science and Standards 6 (September 9, 2022): e94596. https://doi.org/10.3897/biss.6.94596.

Abstract:

Publishing sequence data on GenBank and other sequence databases is not only a requirement for journal publication but is also incredibly valuable for continuing research in organismal evolution and ecology. The process of formatting and submitting sequence data from vouchered specimens is burdensome and tedious. Improperly formatted submissions lead to lengthy delays and frustration for both submitters and reviewers. Specimods is a new tool that utilizes the Global Biodiversity Information Facility (GBIF) API to produce formatted source modifier files that will facilitate the batch upload of specimen sequences to GenBank, using the sequence submission platform. Once a template CSV is populated with the specimen catalog numbers, institution codes, SeqID, sequence descriptions, and the sequences, Specimods will pull specimen metadata from GBIF to produce two files: a FASTA file with a complete definition line (SeqID, organism name, sequence title) and the sequences, as well as a source modifier file containing specimen metadata. Currently, Specimods supports Internal Transcribed Spacer (ITS) (1, 2, or both), SSUrRNA_18s, and LSUrRNA_28s sequences. The application uses an Express server and Node.js with data linked to a user account stored in a MySQL relational database. This presentation will demonstrate the tool using data from fungal specimens and show how submitting dozens of sequences with metadata can be accomplished in a matter of minutes.
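
The two output files described in this abstract can be illustrated with a minimal Python sketch. It is not Specimods' code (the abstract describes a Node.js application) and it does not call the GBIF API; the record fields, the `[organism=...]` definition-line layout, and the modifier column names are assumptions chosen for illustration.

```python
import csv
import io


def fasta_record(seq_id, organism, title, sequence, width=70):
    """Format one FASTA record: a definition line carrying the SeqID,
    organism name and sequence title, then the sequence wrapped at `width`."""
    header = f">{seq_id} [organism={organism}] {title}"
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header, *chunks]) + "\n"


def source_modifier_table(records, columns):
    """Build a tab-delimited table with one row per sequence.
    The first column holds the sequence ID; the rest hold specimen metadata."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sequence_ID", *columns])
    for record in records:
        writer.writerow([record["seq_id"], *(record.get(c, "") for c in columns)])
    return buffer.getvalue()


# Hypothetical specimen record; in the tool described above, the metadata
# would come from GBIF rather than being typed in by hand.
records = [
    {
        "seq_id": "Seq1",
        "organism": "Amanita muscaria",
        "title": "internal transcribed spacer region",
        "sequence": "ACGT" * 20,
        "Specimen_voucher": "XYZ:Fungi:12345",
        "Country": "USA",
    }
]

fasta_text = "".join(
    fasta_record(r["seq_id"], r["organism"], r["title"], r["sequence"])
    for r in records
)
table_text = source_modifier_table(records, ["Specimen_voucher", "Country"])
print(fasta_text)
print(table_text)
```

Keeping the sequence ID as the shared key between the FASTA file and the table is what lets a submission platform match each sequence to its metadata row.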

7

Leng, Rhodri, Gil Viry, Miguel García-Sancho, James Lowe, Mark Wong, and Niki Vermeulen. "The Sequences and the Sequencers." Historical Studies in the Natural Sciences 52, no. 3 (2022): 277–319. http://dx.doi.org/10.1525/hsns.2022.52.3.277.

Abstract:

This special issue on sequences and sequencers uses new analytical approaches to re-assess the history of genomics. Historical attention has largely focused on a few central characters and institutions: those that participated in the Human Genome Project (HGP), especially its final stages. Our analysis, based on an assessment of almost 13.5 million DNA sequence submissions and 30,000 publications of human, yeast, and pig DNA sequences, followed overlapping chronologies starting before and finishing after the concerted efforts to sequence the genomes of each species: 1980 to 2000 in yeast, 1985 to 2005 for the human, and 1990 to 2015 for the pig. Our main conclusion is that when broader sequencing practices, especially those addressed to nonhuman species, are taken into account, the large-scale center model that characterized the organization of the HGP falls short in representing genomics as a whole. Instead of taking the HGP as a model, we describe an iterative process in which the practices of sequence submission and publication were entangled. Analysis of co-authorship networks between institutions derived from our data shows how linked sequence submission and publication were to medical, biochemical, and agricultural research. Our analysis thus reveals the utility of big data and mixed-methods approaches for addressing science as a multidimensional endeavor with a history shaped by co-constitutive, synchronic interactions among different elements (such as communities, species, and disciplines) as much as diachronic trajectories over time. This perspective enables us to better capture interdisciplinary and interspecies work, and offers a more fluid portrayal of the connections between scientific practices and agricultural, industrial, and medical goals. This essay is part of a special issue entitled The Sequences and the Sequencers: A New Approach to Investigating the Emergence of Yeast, Human, and Pig Genomics, edited by Miguel García-Sancho and James Lowe.

8

Gilna, P., L. J. Tomlinson, and C. Burks. "Submission of Nucleotide Sequence Data to GenBank(R)." Microbiology 135, no. 7 (1989): 1779–86. http://dx.doi.org/10.1099/00221287-135-7-1779.

9

Tuli, Mary Ann, Tomas P. Flores, and Graham N. Cameron. "Submission of nucleotide sequence data to EMBL/GenBank/DDBJ." Molecular Biotechnology 6, no. 1 (1996): 47–51. http://dx.doi.org/10.1007/bf02762322.

10

Shadek, Teuku Fadjar, Shodik Nuryadhin, and Ainin Najmi. "PENGEMBANGAN SISTEM PENGAJUAN UTTP DENGAN MENGGUNAKAN PROGRAM BOOTSTRAP (PHP) DALAM RANGKA PENINGKATAN DAYA SAING DAN INOVASI PADA PT KALIBRASI INDONESIA MANDIRI [Development of a UTTP submission system using Bootstrap (PHP) to increase competitiveness and innovation at PT Kalibrasi Indonesia Mandiri]." Jurnal Sistem Informasi dan Informatika (Simika) 6, no. 1 (2023): 69–79. http://dx.doi.org/10.47080/simika.v6i1.2361.

Abstract:

PT Kalibrasi Indonesia Mandiri is a unit of the legal metrology agency in Serang Regency whose functions are to control, manage, and determine the types of measuring instruments, scales, and equipment (UTTP) by setting mass, volume, and length standards. It operates under the Department of Industry, Trade, Cooperatives and SMEs of Banten province as one of its Technical Implementation Units (UPT). The acceptance of UTTP (Measuring, Measure, Weighing, and Equipment) submissions from institutions or companies is still manual, with no supporting system. For this reason, the authors designed a system to process the UTTP submissions received at the Technical Implementation Unit in a systematized, online manner. The solution is a web-based application, hence the title "Web-based UTTP Submission Information System at PT Kalibrasi Indonesia Mandiri Serang". The system was designed with UML (Unified Modeling Language) tools, including flowmaps, use case diagrams, class diagrams, activity diagrams, sequence diagrams, file structures, and input/output designs. The result of this study is the design of a UTTP submission information system that helps work units maintain consistency and performance in receiving submissions and preparing test result reports.

11

Spinard, Edward, Mark Dinhobl, Cassidy N. G. Erdelyan, et al. "A Standardized Pipeline for Assembly and Annotation of African Swine Fever Virus Genome." Viruses 16, no. 8 (2024): 1293. http://dx.doi.org/10.3390/v16081293.

Abstract:

Obtaining a complete good-quality sequence and annotation for the long double-stranded DNA genome of the African swine fever virus (ASFV) from next-generation sequencing (NGS) technology has proven difficult, despite the increasing availability of reference genome sequences and the increasing affordability of NGS. A gap analysis conducted by the global African swine fever research alliance (GARA) partners identified that a standardized, automatic pipeline for NGS analysis was urgently needed, particularly for new outbreak strains. Whilst there are several diagnostic and research labs worldwide that collect isolates of the ASFV from outbreaks, many do not have the capability to analyze, annotate, and format NGS data from outbreaks for submission to NCBI, and some publicly available ASFV genomes have missing or incorrect annotations. We developed an automated, standardized pipeline for the analysis of NGS reads that directly provides users with assemblies and annotations formatted for their submission to NCBI. This pipeline is freely available on GitHub and has been tested through the GARA partners by examining two previously sequenced ASFV genomes; this study also aimed to assess the accuracy and limitations of two strategies present within the pipeline: reference-based (Illumina reads) and de novo assembly (Illumina and Nanopore reads) strategies.

12

Hankeln, Wolfgang, Norma Johanna Wendel, Jan Gerken, et al. "CDinFusion – Submission-Ready, On-Line Integration of Sequence and Contextual Data." PLoS ONE 6, no. 9 (2011): e24797. http://dx.doi.org/10.1371/journal.pone.0024797.

13

Fukuda, Asami, Yuichi Kodama, Jun Mashima, Takatomo Fujisawa, and Osamu Ogasawara. "DDBJ update: streamlining submission and access of human data." Nucleic Acids Research 49, no. D1 (2020): D71–D75. http://dx.doi.org/10.1093/nar/gkaa982.

Abstract:

The Bioinformation and DDBJ Center (DDBJ Center, https://www.ddbj.nig.ac.jp) provides databases that capture, preserve and disseminate diverse biological data to support research in the life sciences. This center collects nucleotide sequences with annotations, raw sequencing data, and alignment information from high-throughput sequencing platforms, and study and sample information, in collaboration with the National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). This collaborative framework is known as the International Nucleotide Sequence Database Collaboration (INSDC). In collaboration with the National Bioscience Database Center (NBDC), the DDBJ Center also provides a controlled-access database, the Japanese Genotype–phenotype Archive (JGA), which archives and distributes human genotype and phenotype data, requiring authorized access. The NBDC formulates guidelines and policies for sharing human data and reviews data submission and use applications. To streamline all of the processes at NBDC and JGA, we have integrated the two systems by introducing a unified login platform with a group structure in September 2020. In addition to the public databases, the DDBJ Center provides a computer resource, the NIG supercomputer, for domestic researchers to analyze large-scale genomic data. This report describes updates to the services of the DDBJ Center, focusing on the NBDC and JGA system enhancements.

14

Stricker, Amber, Dale Polson, Michael Murtaugh, Jane Christopher-Hennings, and Travis Clement. "Variation in porcine reproductive and respiratory syndrome virus open reading frame 5 diagnostic sequencing." Journal of Swine Health and Production 23, no. 1 (2015): 18–27. http://dx.doi.org/10.54846/jshap/840.

Abstract:

Objective: To assess porcine reproductive and respiratory syndrome virus (PRRSV) open reading frame 5 (ORF5) sequencing variation, within and among state diagnostic laboratories, that may contribute to observed differences in sequence homology among isolates. Materials and methods: PRRS virus-positive blood samples were collected from individual pigs on three different farms and submitted on three independent occasions to three diagnostic laboratories for PRRSV ORF5 nucleotide sequencing. The PRRSV isolates on each farm were genetically disparate. Vaccine viruses (Ingelvac PRRS MLV and Ingelvac PRRS ATP; Boehringer Ingelheim Vetmedica, Inc, St Joseph, Missouri) were submitted as positive controls. Results: Full-length ORF5 sequences were obtained from all samples. Positive-control vaccine virus sequencing was precise and highly accurate, with all laboratories on all occasions obtaining nearly identical sequences. The analytical specificity of field PRRSV sequencing was robust, with a median variation among laboratories for the same farm sample, across all pigs and submission dates, of one base difference per 603-base sequence (0.2%). Seventy-five percent of sequences had fewer than six base differences, and the greatest difference was 2.2%. However, 16% of samples in one submission from one farm appeared to be misidentified in the reports of one laboratory. Implications: Inter- and intra-laboratory ORF5 sequencing results are reproducible, reliable, and do not contribute significantly to estimated PRRSV diversity. Tracking errors may occur which can lead to confusion or inappropriate reaction by key decision makers. Submitters should retain aliquots of all samples to enable further investigation of a diagnostic error not related to the sequencing procedure.

15

Guard, Jean, Deana R. Jones, Richard K. Gast, Javier S. Garcia, and Michael J. Rothrock. "Serotype Screening of Salmonella enterica Subspecies I by Intergenic Sequence Ribotyping (ISR): Critical Updates." Microorganisms 11, no. 1 (2022): 97. http://dx.doi.org/10.3390/microorganisms11010097.

Abstract:

(1) Background: Foodborne illness from Salmonella enterica subspecies I is most associated with approximately 32 out of 1600 serotypes. While whole genome sequencing and other nucleic acid-based methods are preferred for serotyping, they require expertise in bioinformatics and often submission to an external agency. Intergenic Sequence Ribotyping (ISR) assigns serotype to Salmonella in coordination with information freely available at the National Center for Biotechnology Information. ISR requires updating because it was developed from 26 genomes while there are now currently 1804 genomes and 1685 plasmids. (2) Methods: Serotypes available for sequencing were analyzed by ISR to confirm primer efficacy and to identify any issues in application. Differences between the 2012 and 2022 ISR database were tabulated, nomenclature edited, and instances of multiple serotypes aligning to a single ISR were examined. (3) Results: The 2022 ISR database has 268 sequences and 40 of these were assigned new NCBI accession numbers that were not previously available. Extending boundaries of sequences resolved hdfR cross-alignment and reduced multiplicity of alignment for 37 ISRs. Comparison of gene cyaA sequences and some cell surface epitopes provided evidence that homologous recombination was potentially impacting results for this subset. There were 99 sequences that still had no match with an NCBI submission. (4) The 2022 ISR database is available for use as a serotype screening method for Salmonella enterica subspecies I. Finding that 36.9% of the sequences in the ISR database still have no match within the NCBI Salmonella enterica database suggests that there is more genomic heterogeneity yet to characterize.

16

Garg, Akhil, Detlef Leipe, and Peter Uetz. "The disconnect between DNA and species names: lessons from reptile species in the NCBI taxonomy database." Zootaxa 4706, no. 3 (2019): 401–7. http://dx.doi.org/10.11646/zootaxa.4706.3.1.

Abstract:

We compared the species names in the Reptile Database, a dedicated taxonomy database, with those in the NCBI taxonomy database, which provides the taxonomic backbone for the GenBank sequence database. About 67% of the known ~11,000 reptile species are represented with at least one DNA sequence and a binary species name in GenBank. However, a common problem arises through the submission of preliminary species names (such as “Pelomedusa sp. A CK-2014”) to GenBank and thus the NCBI taxonomy. These names cannot be assigned to any accepted species names and thus create a disconnect between DNA sequences and species. While these names of unknown taxonomic meaning sometimes get updated, often they remain in GenBank which now contains sequences from ~1,300 such “putative” reptile species tagged by informal names (~15% of its reptile names). We estimate that NCBI/GenBank probably contain tens of thousands of such “disconnected” entries. We encourage sequence submitters to update informal species names after they have been published, otherwise the disconnect will cause increasing confusion and possibly misleading taxonomic conclusions.

17

Bardou, Philippe, Sandrine Laguerre, Sarah Maman Haddad, et al. "MINTIA: a metagenomic INserT integrated assembly and annotation tool." PeerJ 9 (September 27, 2021): e11885. http://dx.doi.org/10.7717/peerj.11885.

Abstract:

The earth harbors trillions of bacterial species adapted to very diverse ecosystems thanks to specific metabolic function acquisition. Most of the genes responsible for these functions belong to uncultured bacteria and are still to be discovered. Functional metagenomics based on activity screening is a classical way to retrieve these genes from microbiomes. This approach is based on the insertion of large metagenomic DNA fragments into a vector and transformation of a host to express heterologous genes. Metagenomic libraries are then screened for activities of interest, and the metagenomic DNA inserts of active clones are extracted to be sequenced and analysed to identify genes that are responsible for the detected activity. Hundreds of metagenomics sequences found using this strategy have already been published in public databases. Here we present the MINTIA software package enabling biologists to easily generate and analyze large metagenomic sequence sets, retrieved after activity-based screening. It filters reads, performs assembly, removes cloning vector, annotates open reading frames and generates user friendly reports as well as files ready for submission to international sequence repositories. The software package can be downloaded from https://github.com/Bios4Biol/MINTIA.

18

Francois, Clementine M., Faustine Durand, Emeric Figuet, and Nicolas Galtier. "Prevalence and Implications of Contamination in Public Genomic Resources: A Case Study of 43 Reference Arthropod Assemblies." G3: Genes|Genomes|Genetics 10, no. 2 (2019): 721–30. http://dx.doi.org/10.1534/g3.119.400758.

Abstract:

Thanks to huge advances in sequencing technologies, genomic resources are increasingly being generated and shared by the scientific community. The quality of such public resources is therefore of critical importance. Errors due to contamination are particularly worrying; they are widespread, propagate across databases, and can compromise downstream analyses, especially the detection of horizontally-transferred sequences. However, we still lack consistent and comprehensive assessments of contamination prevalence in public genomic data. Here we applied a standardized procedure for foreign sequence annotation to 43 published arthropod genomes from the widely used Ensembl Metazoa database. This method combines information on sequence similarity and synteny to identify contaminant and putative horizontally-transferred sequences in any genome assembly, provided that an adequate reference database is available. We uncovered considerable heterogeneity in quality among arthropod assemblies, some being devoid of contaminant sequences, whereas others included hundreds of contaminant genes. Contaminants far outnumbered horizontally-transferred genes and were a major confounder of their detection, quantification and analysis. We strongly recommend that automated standardized decontamination procedures be systematically embedded into the submission process to genomic databases.

19

Dittami, Simon M., and Erwan Corre. "Detection of bacterial contaminants and hybrid sequences in the genome of the kelp Saccharina japonica using Taxoblast." PeerJ 5 (November 17, 2017): e4073. http://dx.doi.org/10.7717/peerj.4073.

Abstract:

Modern genome sequencing strategies are highly sensitive to contamination making the detection of foreign DNA sequences an important part of analysis pipelines. Here we use Taxoblast, a simple pipeline with a graphical user interface, for the post-assembly detection of contaminating sequences in the published genome of the kelp Saccharina japonica. Analyses were based on multiple blastn searches with short sequence fragments. They revealed a number of probable bacterial contaminations as well as hybrid scaffolds that contain both bacterial and algal sequences. This or similar types of analysis, in combination with manual curation, may thus constitute a useful complement to standard bioinformatics analyses prior to submission of genomic data to public repositories. Our analysis pipeline is open-source and freely available at http://sdittami.altervista.org/taxoblast and via SourceForge (https://sourceforge.net/projects/taxoblast).

20

Barker, D. J., R. Natarajan, J. Robinson, and S. G. Marsh. "Streamlining submission to the IPD-IMGT/HLA database: The sequence feature annotation tool." Human Immunology 85 (September 2024): 110918. http://dx.doi.org/10.1016/j.humimm.2024.110918.

21

Bossinger, June A. "The Annotator's Assistant: an expert system for direct submission of genetic sequence data." Bioinformatics 4, no. 1 (1988): 197–202. http://dx.doi.org/10.1093/bioinformatics/4.1.197.

22

Crossley, Beate M., Jianfa Bai, Amy Glaser, et al. "Guidelines for Sanger sequencing and molecular assay monitoring." Journal of Veterinary Diagnostic Investigation 32, no. 6 (2020): 767–75. http://dx.doi.org/10.1177/1040638720905833.

Abstract:

Genetic sequencing, or DNA sequencing, using the Sanger technique has become widely used in the veterinary diagnostic community. This technology plays a role in verification of PCR results and is used to provide the genetic sequence data needed for phylogenetic analysis, epidemiologic studies, and forensic investigations. The Laboratory Technology Committee of the American Association of Veterinary Laboratory Diagnosticians has prepared guidelines for sample preparation, submission to sequencing facilities or instrumentation, quality assessment of nucleic acid sequence data performed, and for generating basic sequencing data and phylogenetic analysis for diagnostic applications. This guidance is aimed at assisting laboratories in providing consistent, high-quality, and reliable sequence data when using Sanger-based genetic sequencing as a component of their laboratory services.

23

Gupta, Vikas, Joana Paupério, Josephine Burgin, Suran Jayathilaka, and Guy Cochrane. "The ENA Source Attribute Helper: An API for improved biological source data." Biodiversity Information Science and Standards 6 (August 2, 2022): e91118. https://doi.org/10.3897/biss.6.91118.

Abstract:

Metadata management for sequence data is essential for the accurate description of Earth's biodiversity. Within metadata attributes, those that reference the biological sources of sequences and samples and allow linking to the specimen or sample of origin are fundamental for facilitating connections between molecular biology, taxonomy, systematic biology and biodiversity research, increasing the discoverability and usability of data by researchers worldwide.Sequence data is publicly archived at the International Nucleotide Sequence Database Collaboration (INSDC) that includes the National Centre for Biotechnology Information (NCBI), the DNA Data Bank of Japan (DDBJ) and the European Nucleotide Archive (ENA). Sequences stored at INSDC have associated a considerable range of metadata, including attributes related to its biological source, such as references to natural history collections or culture collections. But, these source attributes are not always submitted or may be incomplete, limiting the association of the sequence records to the original source material, hampering further data connections (e.g., biological data associated with the voucher or species distribution data). Therefore, we have developed the ENA Source Attribute Helper API, a tool that aims to assist users on the submission of accurate attributes referring to the biological source of samples and sequence data. This tool was developed within the scope of BiCIKL (Biodiversity Community Integrated Knowledge Library) (Penev et al. 2022), a Horizon 2020 project which targets building a wide, biodiversity related community for connecting data along the different axes of biodiversity research.The first version of the tool was designed to support correct annotation of the attributes that identify the source material from which the sample or sequence were obtained, namely /specimen_voucher, /culture_collection, and /biomaterial (INSDC 2021). 
These attributes follow a Darwin Core Triplet format (Wieczorek et al. 2012), composed of institution code, collection code and the specimen, culture, or material identifier, respectively. Since the submission of the biological source attributes to the INSDC may be performed either when data are initially uploaded or in subsequent updates, using a variety of tools, we developed the API as an open source tool that is publicly accessible and may be used as a free-standing service. The API is built using Representational State Transfer (REST) API architecture and is designed to use the data available in NCBI BioCollections (Sharma et al. 2018). NCBI BioCollections is a curated database of metadata for natural history collections, associated with records in INSDC, that includes the institution and collection codes. The API's main functions include the querying of the metadata (the API presents both exact matches and similar matches) for the institutions and collections based on the user input, validation of institution and collection codes in the attribute strings provided by the user, and the construction of the attribute string based on the user-provided information. The API does not include the search or validation of the voucher specimen codes. The API is designed so that it can be extended easily for future enhancements, and it is initially expected to promote and support the submission and any subsequent curation of better structured and more richly described source data. We expect this tool to contribute to better connected biodiversity data and hence provide a stronger foundation to strengthen the value of natural history collections, taxonomic expertise, and biodiversity knowledge.
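The triplet format described in this abstract can be illustrated with a minimal Python sketch. It is not the ENA Source Attribute Helper API and makes no network calls; the names `SourceTriplet`, `build_attribute` and `parse_attribute` are hypothetical, the colon separator is assumed from the INSDC qualifier convention, and the example codes are invented for illustration. Like the API described above, the sketch checks only the shape of the string, not whether the voucher identifier exists.

```python
from typing import NamedTuple


class SourceTriplet(NamedTuple):
    """Hypothetical container for a Darwin Core Triplet."""
    institution_code: str
    collection_code: str
    identifier: str  # specimen, culture, or material identifier


def build_attribute(triplet: SourceTriplet) -> str:
    """Join the three parts with colons (assumed separator)."""
    for part in triplet:
        if not part or ":" in part:
            raise ValueError(f"invalid triplet part: {part!r}")
    return ":".join(triplet)


def parse_attribute(value: str) -> SourceTriplet:
    """Split a full three-part attribute string back into its parts.

    Shorter forms (for example, an identifier on its own) are
    deliberately rejected in this sketch.
    """
    parts = [p.strip() for p in value.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected institution:collection:identifier, got {value!r}")
    return SourceTriplet(*parts)


# Invented example codes, for illustration only.
example = SourceTriplet("USNM", "Birds", "12345")
attribute = build_attribute(example)
round_trip = parse_attribute(attribute)
```

Validating the institution and collection codes themselves would require a lookup against curated data such as NCBI BioCollections, which this sketch does not attempt.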


24

Karia, R., and M. Plested. "PHP87 THE IMPACT OF THE SUBMISSION SEQUENCE – WHICH APPRAISING BODY TO SUBMIT TO FIRST?" Value in Health 12, no. 3 (2009): A94–A95. http://dx.doi.org/10.1016/s1098-3015(10)73536-1.


25

Kahn, Patricia, David Hazledine, and Graham Cameron. "A new system for submission of nucleotide sequence data to the EMBL Data Library." Plant Molecular Biology 11, no. 4 (1988): 541–46. http://dx.doi.org/10.1007/bf00039035.


26

Komarov, V. A., M. I. Kurashkin, and I. V. Yakushev. "Conformity assessment of vehicles used in the agricultural sector." Machinery and Equipment for Rural Area, no. 3 (March 23, 2021): 20–24. http://dx.doi.org/10.33267/2072-9642-2021-3-20-24.


Abstract:

Regulatory requirements for compliance and certification of vehicles, taking into account international experience, are discussed. The sequence of resolving issues on compliance and certification of vehicles is presented using the example of the Saransk dump truck plant. The results of research on the average number of defects per vehicle and the number of vehicles accepted on first submission under various quality management systems are presented.


27

Giongo, Adriana, Heather L. Tyler, Ursula N. Zipperer, and Eric W. Triplett. "Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission." Standards in Genomic Sciences 2, no. 3 (2010): 309–17. http://dx.doi.org/10.4056/sigs.972221.


28

Du, Chunguang, Edward Buckler, and Spencer Muse. "Development of a Maize Molecular Evolutionary Genomic Database." Comparative and Functional Genomics 4, no. 2 (2003): 246–49. http://dx.doi.org/10.1002/cfg.282.


Abstract:

PANZEA is the first public database for studying maize genomic diversity. It was initiated as a repository of genomic diversity for an NSF Plant Genome project on ‘Maize Evolutionary Genomics’. PANZEA is hosted at the Bioinformatics Research Center, North Carolina State University, and is open to the public (http://statgen.ncsu.edu/panzea). PANZEA is designed to capture the interrelationships between germplasm, molecular diversity, phenotypic diversity and genome structure. It has the ability to store, integrate and visualize DNA sequence, enzymatic, SSR (simple sequence repeat) marker, germplasm and phenotypic data. The relational data model is selected and implemented in Oracle. An automated DNA sequence data submission tool has been created that allows project researchers to remotely submit their DNA sequence data directly to PANZEA. On-line database search forms and reports have been created to allow users to search or download germplasm, DNA sequence, gene/locus data and much more, directly from the web.


29

Shi, Linchun, Haimei Chen, Mei Jiang, et al. "CPGAVAS2, an integrated plastome sequence annotator and analyzer." Nucleic Acids Research 47, W1 (2019): W65–W73. http://dx.doi.org/10.1093/nar/gkz345.


Abstract:

We previously developed a web server CPGAVAS for annotation, visualization and GenBank submission of plastome sequences. Here, we upgrade the server into CPGAVAS2 to address the following challenges: (i) inaccurate annotation in the reference sequence likely causing the propagation of errors; (ii) difficulty in the annotation of small exons of genes petB, petD and rps16 and trans-splicing gene rps12; (iii) lack of annotation for other genome features and their visualization, such as repeat elements; and (iv) lack of modules for diversity analysis of plastomes. In particular, CPGAVAS2 provides two reference datasets for plastome annotation. The first dataset contains 43 plastomes whose annotations have been validated or corrected by RNA-seq data. The second one contains 2544 plastomes curated with sequence alignment. Two new algorithms are also implemented to correctly annotate small exons and trans-splicing genes. Tandem and dispersed repeats are identified, and the results are displayed on a circular map together with the annotated genes. DNA-seq and RNA-seq data can be uploaded for identification of single-nucleotide polymorphism sites and RNA-editing sites. The results of two case studies show that CPGAVAS2 annotates better than several other servers. CPGAVAS2 will likely become an indispensable tool for plastome research and can be accessed from http://www.herbalgenomics.org/cpgavas2.


30

Bolaji, Ayooluwa J., and Ana T. Duggan. "In silico analyses identify sequence contamination thresholds for Nanopore-generated SARS-CoV-2 sequences." PLOS Computational Biology 20, no. 8 (2024): e1011539. http://dx.doi.org/10.1371/journal.pcbi.1011539.


Abstract:

The SARS-CoV-2 pandemic has brought molecular biology and genomic sequencing into the public consciousness and lexicon. With an emphasis on rapid turnaround, genomic data informed both diagnostic and surveillance decisions for the current pandemic at a previously unheard-of scale. The surge in the submission of genomic data to publicly available databases proved essential as comparing different genome sequences offers a wealth of knowledge, including phylogenetic links, modes of transmission, rates of evolution, and the impact of mutations on infection and disease severity. However, the scale of the pandemic has meant that sequencing runs are rarely repeated due to limited sample material and/or the availability of sequencing resources, resulting in the upload of some imperfect runs to public repositories. As a result, it is crucial to investigate the data obtained from these imperfect runs to determine whether the results are reliable prior to depositing them in a public database. Numerous studies have identified a variety of sources of contamination in public next-generation sequencing (NGS) data as the number of NGS studies increases along with the diversity of sequencing technologies and procedures. For this study, we conducted an in silico experiment with known SARS-CoV-2 sequences produced from Oxford Nanopore Technologies sequencing to investigate the effect of contamination on lineage calls and single nucleotide variants (SNVs). A contamination threshold below which runs are expected to generate accurate lineage calls and maintain genome-relatedness and integrity was identified. Together, these findings provide a benchmark below which imperfect runs may be considered robust for reporting results to both stakeholders and public repositories and reduce the need for repeat or wasted runs.


31

Baron, Michael D., and Arnaud Bataille. "A curated dataset of peste des petits ruminants virus sequences for molecular epidemiological analyses." PLOS ONE 17, no. 2 (2022): e0263616. http://dx.doi.org/10.1371/journal.pone.0263616.


Abstract:

Peste des petits ruminants (PPR) is a highly contagious and devastating viral disease infecting predominantly sheep and goats. Tracking outbreaks of disease and analysing the movement of the virus often involves sequencing part or all of the genome and comparing the sequence obtained with sequences from other outbreaks, obtained from the public databases. However, there are a very large number (>1800) of PPRV sequences in the databases, a large majority of them relatively short, and not always well-documented. There is also a strong bias in the composition of the dataset, with countries with good sequencing capabilities (e.g. China, India, Turkey) being overrepresented, and most sequences coming from isolates in the last 20 years. In order to facilitate future analyses, we have prepared sets of PPRV sequences, sets which have been filtered for sequencing errors and unnecessary duplicates, and for which date and location information has been obtained, either from the database entry or from other published sources. These sequence datasets are freely available for download, and include smaller datasets which maximise phylogenetic information from the minimum number of sequences, and which will be useful for simple lineage identification. Their utility is illustrated by uploading the data to the MicroReact platform to allow simultaneous viewing of lineage date and geographic information on all the viruses for which we have information. While preparing these datasets, we identified a significant number of public database entries which contain clear errors, and propose guidelines on checking new sequences and completing metadata before submission.


32

Ji, Xiaolei, Chen Guo, Yaoyao Dai, et al. "Genomic Characterization and Molecular Evolution of Sapovirus in Children under 5 Years of Age." Viruses 16, no. 1 (2024): 146. http://dx.doi.org/10.3390/v16010146.


Abstract:

Sapovirus (SaV) is a type of gastroenteric virus that can cause acute gastroenteritis. It is highly contagious, particularly among children under the age of 5. In this study, a total of 712 stool samples from children under the age of 5 with acute gastroenteritis were collected. Out of these samples, 28 tested positive for SaV, resulting in a detection rate of 3.93% (28/712). Samples with Ct < 30 were collected for library construction and high-throughput sequencing, resulting in the acquisition of nine complete genomes. According to BLAST, eight of them were identified as GI.1, while the remaining one was GI.6. The GI.6 strain sequence reported in our study represents the first submission of a complete GI.6 genome sequence from mainland China to the GenBank database, thus filling the data gap in our country. Sequence identity analysis revealed significant nucleotide variations between the two genotypes of SaV and their corresponding prototype strains. Phylogenetic and genetic evolution analyses showed no evidence of recombination events in the obtained sequences. Population dynamics analysis demonstrated potential competitive inhibition between two lineages of GI.1. Our study provides insights into the molecular epidemiological and genetic evolution characteristics of SaV prevalent in the Nantong region of China, laying the foundation for disease prevention and control, as well as pathogen tracing related to SaV in this area.


33

Yeh, Eric, William Jarrold, and Joshua Jordan. "Leveraging Psycholinguistic Resources and Emotional Sequence Models for Suicide Note Emotion Annotation." Biomedical Informatics Insights 5s1 (January 2012): BII.S8979. http://dx.doi.org/10.4137/bii.s8979.


Abstract:

We describe the submission entered by SRI International and UC Davis for the I2B2 NLP Challenge Track 2. Our system is based on a machine learning approach and employs a combination of lexical, syntactic, and psycholinguistic features. In addition, we model the sequence and locations of occurrence of emotions found in the notes. We discuss the effect of these features on the emotion annotation task, as well as the nature of the notes themselves. We also explore the use of bootstrapping to help account for what appeared to be annotator fatigue in the data. We conclude with a discussion of future avenues for improving the approach for this task, and also discuss how annotations at the word span level may be more appropriate for this task than annotations at the sentence level.


34

Ashelford, Kevin E., Nadia A. Chuzhanova, John C. Fry, Antonia J. Jones, and Andrew J. Weightman. "At Least 1 in 20 16S rRNA Sequence Records Currently Held in Public Repositories Is Estimated To Contain Substantial Anomalies." Applied and Environmental Microbiology 71, no. 12 (2005): 7724–36. http://dx.doi.org/10.1128/aem.71.12.7724-7736.2005.


Abstract:

A new method for detecting chimeras and other anomalies within 16S rRNA sequence records is presented. Using this method, we screened 1,399 sequences from 19 phyla, as defined by the Ribosomal Database Project, release 9, update 22, and found 5.0% to harbor substantial errors. Of these, 64.3% were obvious chimeras, 14.3% were unidentified sequencing errors, and 21.4% were highly degenerate. In all, 11 phyla contained obvious chimeras, accounting for 0.8 to 11% of the records for these phyla. Many chimeras (43.1%) were formed from parental sequences belonging to different phyla. While most comprised two fragments, 13.7% were composed of at least three fragments, often from three different sources. A separate analysis of the Bacteroidetes phylum (2,739 sequences) also revealed 5.8% of records to be anomalous, of which 65.4% were apparently chimeric. Overall, we conclude that, as a conservative estimate, 1 in every 20 public database records is likely to be corrupt. Our results support concerns recently expressed over the quality of the public repositories. With 16S rRNA sequence data increasingly playing a dominant role in bacterial systematics and environmental biodiversity studies, it is vital that steps be taken to improve screening of sequences prior to submission. To this end, we have implemented our method as a program with a simple-to-use graphic user interface that is capable of running on a range of computer platforms. The program is called Pintail, is released under the open source GNU General Public License, and is freely available from our website at http://www.cardiff.ac.uk/biosi/research/biosoft/.
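The percentages in this abstract can be turned back into approximate record counts. The short Python sketch below does that arithmetic; the counts are back-calculated by rounding and are not figures reported in the abstract, and the helper name `implied_count` is hypothetical.

```python
def implied_count(total: int, percent: float) -> int:
    """Nearest whole number of records implied by a reported percentage."""
    return round(total * percent / 100)


# Main screen: 1,399 sequences, 5.0% with substantial errors.
anomalous = implied_count(1399, 5.0)
breakdown = {
    "obvious chimeras": implied_count(anomalous, 64.3),
    "unidentified sequencing errors": implied_count(anomalous, 14.3),
    "highly degenerate": implied_count(anomalous, 21.4),
}

# Separate Bacteroidetes screen: 2,739 sequences, 5.8% anomalous,
# of which 65.4% were apparently chimeric.
bacteroidetes_anomalous = implied_count(2739, 5.8)
bacteroidetes_chimeric = implied_count(bacteroidetes_anomalous, 65.4)

# Consistency check: the three categories should account for every
# anomalous record, and recomputed shares should match the abstract.
categories_sum = sum(breakdown.values())
recomputed = {k: round(100 * v / anomalous, 1) for k, v in breakdown.items()}
```

Under these rounding assumptions the three categories add up to the implied number of anomalous records, and the recomputed shares agree with the percentages quoted above to one decimal place.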


35

Xu, Xingjian, Lili Hao, Junwei Zhu, et al. "Database Resources of the BIG Data Center in 2018." Nucleic Acids Research 46, no. D1 (2017): D14–D20. http://dx.doi.org/10.1093/nar/gkx897.


Abstract:

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides free and open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn.


36

Bateman, Alex, Maria-Jesus Martin, Sandra Orchard, et al. "UniProt: the universal protein knowledgebase in 2021." Nucleic Acids Research 49, no. D1 (2020): D480–D489. http://dx.doi.org/10.1093/nar/gkaa1100.


Abstract:

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from the literature to add to reviewed entries and supplement these in unreviewed entries with annotations provided by automated systems such as the newly implemented Association-Rule-Based Annotator (ARBA). We have developed a credit-based publication submission interface to allow the community to contribute publications and annotations to UniProt entries. We describe how UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.


37

Wang, Shiliang, Jaideep P. Sundaram, and Timothy B. Stockwell. "VIGOR extended to annotate genomes for additional 12 different viruses." Nucleic Acids Research 40, W1 (2012): W186–W192. http://dx.doi.org/10.1093/nar/gks528.


Abstract:

A gene prediction program, VIGOR (Viral Genome ORF Reader), was developed at J. Craig Venter Institute in 2010 and has been successfully performing gene calling in coronavirus, influenza, rhinovirus and rotavirus for projects at the Genome Sequencing Center for Infectious Diseases. VIGOR uses sequence similarity search against custom protein databases to identify protein coding regions, start and stop codons and other gene features. Ribonucleic acid editing and other features are accurately identified based on sequence similarity and signature residues. VIGOR produces four output files: a gene prediction file, a complementary DNA file, an alignment file, and a gene feature table file. The gene feature table can be used to create a GenBank submission. VIGOR takes a single input: viral genomic sequences in FASTA format. VIGOR has been extended to predict genes for 12 viruses: measles virus, mumps virus, rubella virus, respiratory syncytial virus, alphavirus and Venezuelan equine encephalitis virus, norovirus, metapneumovirus, yellow fever virus, Japanese encephalitis virus, parainfluenza virus and Sendai virus. VIGOR accurately detects complex gene features such as ribonucleic acid editing, stop codon leakage and ribosomal shunting. Precisely identifying the mat_peptide cleavage for some viruses is a built-in feature of VIGOR. The gene predictions for these viruses have been evaluated by testing from 27 to 240 genomes from GenBank.


38

Cherry, Colin, Saif M. Mohammad, and Berry De Bruijn. "Binary Classifiers and Latent Sequence Models for Emotion Detection in Suicide Notes." Biomedical Informatics Insights 5s1 (January 2012): BII.S8933. http://dx.doi.org/10.4137/bii.s8933.


Abstract:

This paper describes the National Research Council of Canada's submission to the 2011 i2b2 NLP challenge on the detection of emotions in suicide notes. In this task, each sentence of a suicide note is annotated with zero or more emotions, making it a multi-label sentence classification task. We employ two distinct large-margin models capable of handling multiple labels. The first uses one classifier per emotion, and is built to simplify label balance issues and to allow extremely fast development. This approach is very effective, scoring an F-measure of 55.22 and placing fourth in the competition, making it the best system that does not use web-derived statistics or re-annotated training data. Second, we present a latent sequence model, which learns to segment the sentence into a number of emotion regions. This model is intended to gracefully handle sentences that convey multiple thoughts and emotions. Preliminary work with the latent sequence model shows promise, resulting in comparable performance using fewer features.


39

Barley, Oliver R., and Craig A. Harms. "Different Methods of Winning, Losing, and Training in Combat Sports and Their Relationship with Overall Competitive Winningness." Translational Sports Medicine 2024 (February 21, 2024): 1–10. http://dx.doi.org/10.1155/2024/5531981.


Abstract:

This study aimed to investigate how overall competitive winningness in combat sports depended on patterns of victory and loss, as well as training habits. Competitors (N = 280) from several combat sports participated in the study. The online survey included questions on self-reported patterns of victory (and loss), training habits, general demographics (e.g., age), and sport-specific information (e.g., stage of career and competitive style). Overall, it was found across four models that reflected diversity of winningness in combat sports that the most important predictors of competitive winningness were loss by points (negative), loss by submission (negative), loss (negative) or victory (positive) by throw or technical fall, and loss (negative) or victory (positive) by knockout. The findings applied to amateur and regional/state athletes, and rarely to karate or tae kwon do. Findings around demographics or training habits were largely unremarkable, outside of a relationship between higher training loads and less career winning in wrestlers. Results show that while winning via a finishing sequence (e.g., knockout or submission) is preferable to the judge’s decision or points, the matter of victory is less important than the methods by which an athlete loses. In grappling-only sports, we observed a trend that more losses via finishing sequence were worse for careers than losing by points. In fact, having most of one’s losses coming via judge’s decision or points was beneficial in wrestling and judo, perhaps due to athletes taking fewer risks and having better defence. These findings may aid practitioners in developing effective tactics and training programs.


40

Osborne, Brian I. "Three sets of Macintosh AppleScripts for the automatic submission of sequence data to the Internet BLAST server." Bioinformatics 12, no. 3 (1996): 257–58. http://dx.doi.org/10.1093/bioinformatics/12.3.257.


41

Shomer, Benny. "EMBL sequence submission system—an object-oriented approach to developing interactive data collection systems through the WWW." Bioinformatics 13, no. 1 (1997): 55–60. http://dx.doi.org/10.1093/bioinformatics/13.1.55.


42

Gupta, Vikas, Joana Paupério, Josephine Burgin, Suran Jayathilaka, and Guy Cochrane. "ENA Source Attribute Helper: An Application Programming Interface to facilitate accurate reference to biological source data." F1000Research 11 (September 13, 2022): 1042. http://dx.doi.org/10.12688/f1000research.123934.1.


Abstract:

Background: Metadata attributes of sequences that accurately reference their biological sources, as specimens or other materials of origin, and link with natural history collections, are essential to facilitate the connections between different fields in life sciences and promote reusability of data. However, metadata used to reference the biological source of sequences available within the molecular data repositories are not always well structured or comprehensive. Methods: Within the scope of the Horizon 2020 project Biodiversity Community Integrated Knowledge Library (BiCIKL), we have developed a tool, the European Nucleotide Archive (ENA) Source Attribute Helper Application Programming Interface (API), to help users accurately report biological source-related sequence and sample attributes. This tool currently focuses on the attributes that identify the specimens, cultures or other materials from which the sequence data were derived, and uses curated data to obtain the unique codes for the institutions and collections holding the vouchers. The API's main functions include the presentation of metadata associated with queried institutions or collections, validation of institution and collection codes in the attribute strings provided by the user, and the construction of an attribute string based on user-entered data. The API does not, however, support the search of voucher specimen codes, as these need to be obtained directly from the voucher institutions. We describe the API and discuss use cases for its different endpoints. The API is available at https://www.ebi.ac.uk/ena/sah/api/. Conclusions: We expect the API to promote and support the initial submission and any subsequent curation of biological source attributes, and thereby contribute to better links between sequence data and natural history collections, and hence on to taxonomy and biodiversity research, towards increasing the discoverability, reusability and impact of data.


43

Poling, Shannon, Robert Blum, Adam Dodson, Shivani Patel, and Rahul Koka. "Board 531 - Technology Innovations Abstract A Better Training Manikin for Intubating Neonates with Pierre Robin Sequence (Submission #298)." Simulation in Healthcare: The Journal of the Society for Simulation in Healthcare 8, no. 6 (2013): 624. http://dx.doi.org/10.1097/01.sih.0000441729.26299.4f.


44

Balavenkataraman, Kadhirvelu Vishnukumar, Kessy Abarenkov, Allan Zirk, et al. "Enabling Community Curation of Biological Source Annotations of Molecular Data Through PlutoF and the ELIXIR Contextual Data Clearinghouse." Biodiversity Information Science and Standards 6 (August 23, 2022): e93595. https://doi.org/10.3897/biss.6.93595.


Abstract:

The advancements in sequencing technologies have greatly contributed to the documentation of Earth's biodiversity. However, for exploring the full potential of molecular resources for biodiversity, there needs to be a good linkage between sequence data and its biological source, contributing to a network of connected data in the biodiversity research cycle. This requires a foundation of well-structured and accessible annotations in the molecular sequence repositories. The International Nucleotide Sequence Database Collaboration (INSDC), of which the European Nucleotide Archive (ENA) is the European node, holds a large amount of annotations associated with sequence data, relating to its biological source (e.g., specimens in natural history collections). However, for a number of records, these annotations may be incomplete (e.g., missing voucher information), ambiguous or even inaccurate. Therefore, we have implemented a workflow that allows third-party annotations to be attached to sequence and sample records using two existing services, the PlutoF platform and the ELIXIR Contextual Data ClearingHouse. This work was developed within the scope of the BiCIKL (Biodiversity Community Integrated Knowledge Library) project, which aims to establish open science practices in the biodiversity domain. PlutoF is an online data management platform that also provides computing services for biology-related research. PlutoF features allow registered users to enter their own data and access public data at INSDC. Users can enter and manage a range of data, such as taxonomic classifications, occurrences, etc. This platform also includes a module that allows the addition of third-party annotations (on material source, taxonomic identification, etc.) linked to specimens or sequence records. This module was already in use by the UNITE community for annotation of INSDC rDNA Internal Transcribed Spacer sequence datasets (Abarenkov et al. 2021).
These UNITE annotations are displayed in the National Center for Biotechnology Information (NCBI) records through links to the PlutoF platform. However, there was a need for an automated solution that allowed third-party annotations to be attached to any sequence or sample record at INSDC. This was implemented through the operation of the ELIXIR Contextual Data ClearingHouse (hereafter Clearinghouse). The Clearinghouse provides a simple RESTful Application Programming Interface (API) to support the submission of additions and improvements to current metadata attributes, such as information on material sources, on records publicly available in the ELIXIR data resources. The Clearinghouse enables the submission of these corrected metadata from databases (such as the PlutoF platform) to the primary data repositories. The workflow developed is shown in Fig. 1 and consists of the following steps: i) users annotate sequence metadata that is regularly downloaded from INSDC using NCBI's E-utilities; ii) an annotation proposal is created and a verification notification is sent to an assigned reviewer; iii) the reviewer evaluates the annotation proposal and accepts it or rejects it with comments; iv) if the annotation proposal is accepted, the annotated fields that may be mapped to ENA fields are then pushed to the Clearinghouse using its RESTful API. The annotations, when received at ENA, are then reviewed before being displayed. This workflow is implemented through a web interface in PlutoF, which allows user-friendly and effortless reporting of corrections or additions to biological source metadata in sequence records. Overall, we expect this tool to contribute to the enrichment of metadata associated with sequence records, and therefore increase the links between the molecular and biodiversity resources, and enable sequencing data to deliver their full potential for biodiversity conservation.
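Steps ii) to iv) of the workflow in this abstract amount to a small review state machine, which can be sketched in Python as follows. This is an illustration only, not PlutoF or Clearinghouse code: it makes no API calls, and the names `AnnotationProposal`, `Status`, `review`, `fields_to_push` and `MAPPABLE_FIELDS`, along with the example accession and field values, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    PENDING_REVIEW = "pending_review"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


# Hypothetical set of annotation fields that can be mapped to ENA fields.
MAPPABLE_FIELDS = {"specimen_voucher", "culture_collection"}


@dataclass
class AnnotationProposal:
    record_accession: str            # sequence or sample record being annotated
    fields: Dict[str, str]           # proposed corrections or additions
    status: Status = Status.PENDING_REVIEW
    reviewer_comment: Optional[str] = None


def review(proposal: AnnotationProposal, accept: bool,
           comment: Optional[str] = None) -> AnnotationProposal:
    """Step iii: the reviewer accepts, or rejects with a comment."""
    if proposal.status is not Status.PENDING_REVIEW:
        raise ValueError("proposal has already been reviewed")
    if not accept and not comment:
        raise ValueError("a rejection needs a comment")
    proposal.status = Status.ACCEPTED if accept else Status.REJECTED
    proposal.reviewer_comment = comment
    return proposal


def fields_to_push(proposal: AnnotationProposal) -> Dict[str, str]:
    """Step iv: only accepted proposals yield fields, and only mappable ones."""
    if proposal.status is not Status.ACCEPTED:
        return {}
    return {k: v for k, v in proposal.fields.items() if k in MAPPABLE_FIELDS}


# Invented example values, for illustration only.
proposal = AnnotationProposal(
    "XX000000",
    {"specimen_voucher": "USNM:Birds:12345", "free_text_note": "label checked"},
)
before_review = fields_to_push(proposal)
review(proposal, accept=True)
after_review = fields_to_push(proposal)
```

Keeping the mapping filter separate from the review decision mirrors the abstract's distinction between accepting a proposal and pushing only those fields that have an ENA counterpart.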


45

Kanimozhi, R., D. Arvind Prasath, R. Dhandapani, and Santhosh Sigamani. "Response Surface Optimization of Culture Conditions of Microcystis sp. to Enhance its Biomass Production and Explore its Potential as Antimicrobials." Nature Environment and Pollution Technology 21, no. 4 (2022): 1865–73. http://dx.doi.org/10.46488/nept.2022.v21i04.041.

Abstract:

The menace of drug-resistant bacteria is an issue of global concern. The growth mechanism of the alga Microcystis sp. encompasses the capacity to disrupt bacterial pathogens, and this approach is explored in the study. Microcystis sp. biomass production was optimized via DoE-RSM (Design of Experiment-Response Surface Methodology), and its in vitro antimicrobial activity against drug-tolerant microbes was then assessed. This investigation aimed to increase the biomass output by optimizing essential media components (NaNO₃, K₂HPO₄, and MgSO₄) as the variables. A maximal biomass yield of 262 mg·L⁻¹ was achieved under the optimized conditions, and the Microcystis sp. displayed notable antimicrobial action against Staphylococcus aureus and Pseudomonas aeruginosa. Hence, the Microcystis sp. could be an ideal biocontrol agent to mitigate drug-tolerant microbes. Partial sequencing was performed and the gene sequences were subjected to BLAST at NCBI; the microbial isolate was identified as Microcystis aeruginosa, and the accession number MT792731.1 was obtained for this sequence submission.

46

Gardner, Daniel, Michael Abato, Kevin H. Knuth, Robert DeBellis, and Steven M. Erde. "Dynamic publication model for neurophysiology databases." Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 356, no. 1412 (2001): 1229–47. http://dx.doi.org/10.1098/rstb.2001.0911.

Abstract:

We have implemented a pair of database projects, one serving cortical electrophysiology and the other invertebrate neurones and recordings. The design for each combines aspects of two proven schemes for information interchange. The journal article metaphor determined the type, scope, organization and quantity of data to comprise each submission. Sequence databases encouraged intuitive tools for data viewing, capture, and direct submission by authors. Neurophysiology required transcending these models with new datatypes. Time-series, histogram and bivariate datatypes, including illustration-like wrappers, were selected by their utility to the community of investigators. As interpretation of neurophysiological recordings depends on context supplied by metadata attributes, searches are via visual interfaces to sets of controlled-vocabulary metadata trees. Neurones, for example, can be specified by metadata describing functional and anatomical characteristics. Permanence is advanced by data model and data formats largely independent of contemporary technology or implementation, including Java and the XML standard. All user tools, including dynamic data viewers that serve as a virtual oscilloscope, are Java-based, free, multiplatform, and distributed by our application servers to any contemporary networked computer. Copyright is retained by submitters; viewer displays are dynamic and do not violate copyright of related journal figures. Panels of neurophysiologists view and test schemas and tools, enhancing community support.

47

Mohammad, Taghreed Khudhur. "Molecular Identification of Staphylococcus spp Isolated from Clinical samples." Journal of Biotechnology Research Center 12, no. 1 (2018): 14–24. http://dx.doi.org/10.24126/jobrc.2018.12.1.544.

Abstract:

The analysis of 16S rRNA gene sequences has been the technique generally used to study and confirm the identification and taxonomy of staphylococci. However, bacterial species cannot always be distinguished from each other using cultural methods. Thus, clinical samples were collected from 190 cases, of which only 31 were positive for staphylococcal infections (urinary tract infection, wounds, burns, otitis media, and diarrhea infections). These were subjected to microbiological analysis, which included culture on mannitol salt agar and HiCrome UTI Agar medium; all the isolates gave a positive golden-yellow result and were identified as Staphylococcus spp. DNA was extracted from Staphylococcus spp. and the 16S rRNA gene was amplified using specific primers; the genes were then sequenced on an ABI 3730XL instrument (Applied Biosystems, Macrogen company). The DNA sequencing results for the flank sense of the 16S rRNA gene from the 31 Staphylococcus strains confirmed the identification to species level: Staphylococcus haemolyticus, Staphylococcus aureus, Staphylococcus epidermidis, and Staphylococcus sciuri. Analysis of the sequences showed two substitutions (transversion, transition) in the Staphylococcus aureus strains with sequence ID LC090540.1, located in the nucleotide range 4 to 636, with 100% compatibility with NCBI, whereas no substitution appeared in the Staphylococcus haemolyticus strains, which have the sequence ID LN998078.1 and 99% compatibility with NCBI. The sequence ID KR812401.1, which relates to the strain Staphylococcus sciuri, likewise showed no substitution after sequencing analysis.
In the partial 16S rRNA gene of the Staphylococcus epidermidis strains, 13 transversion and 5 transition substitutions were detected in the nucleotide range 6 to 1026 (sequence ID KF575160.1) when compared with data obtained from GenBank. These findings lead to the conclusion that the assay allows rapid, confirmed detection and avoids the possible misidentification of Staphylococcus species based on cultural analysis. The study aimed to propose partial sequencing of the gene as an alternative molecular tool for the analysis of Staphylococcus species and for decreasing the possibility of misidentification. New submissions of local Iraqi Staphylococcus clinical isolates from the current study were successfully recorded for four isolates, Staphylococcus sciuri, Staphylococcus epidermidis, Staphylococcus aureus, and Staphylococcus haemolyticus, with GenBank accession numbers KY938530.1, KY938529.1, KY938528.1, and KY938527.1, respectively.

48

Beekman, Jared A., Ronald F. A. Woodaman, and Dennis M. Buede. "A Review of Probabilistic Opinion Pooling Algorithms with Application to Insider Threat Detection." Decision Analysis 17, no. 1 (2020): 39–55. http://dx.doi.org/10.1287/deca.2019.0399.

Abstract:

We retrospectively explore the effectiveness of various probabilistic opinion pools against a set of insider threat detection modeling data from a recently completed, multiyear, sponsored research effort. We explored four opinion pools: the linear opinion pool (likely the most popular), the beta-transformed linear opinion pool, the geometric opinion pool, and a multiplicative method based on odds called Bordley’s formula. The data for our study came from our recent work in the inference of insider threats for our research sponsor. In this work, we created a multimodeling inference enterprise modeling (MIEM) process to either predict threats within a population or, given the threats, predict how well the enterprise system can detect those threats. As part of larger research challenges designed by the research sponsor, we applied the MIEM process quarterly to respond to a sequence of varying challenge problems (CPs). Via MIEM, we developed multiple, independent computation forecast models. These models generated certainty intervals to answer CP questions. These intervals were fused into a single interval for each question via an expert panel prior to submission. The sponsors scored the responses against ground truth. In this paper, we (a) ask which pooling functions work best on these data and consider why, and (b) compare this performance to the actual submissions to determine if one of the pooling functions performed better than our judgment-based fusion.
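Two of the pooling functions named in this abstract have simple textbook forms, sketched below for a binary event. This is a minimal illustration, not the authors' implementation: it assumes non-negative weights that sum to 1 and input probabilities strictly between 0 and 1, and the function names are chosen here for illustration. The beta-transformed pool and Bordley's formula are not shown.

```python
import math


def linear_pool(probs, weights):
    """Linear opinion pool: weighted arithmetic mean of the probabilities."""
    return sum(w * p for p, w in zip(probs, weights))


def geometric_pool(probs, weights):
    """Geometric opinion pool for a binary event.

    Takes the weighted geometric mean of the probabilities for the event
    and for its complement, then renormalises so the two sum to 1.
    """
    yes = math.prod(p ** w for p, w in zip(probs, weights))
    no = math.prod((1 - p) ** w for p, w in zip(probs, weights))
    return yes / (yes + no)
```

With two equally weighted forecasts of 0.5 and 0.9, the linear pool gives 0.7 and the geometric pool gives 0.75, which shows that the two rules can disagree on the same inputs.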

49

Lieu, N., L. Theis, C. Yeung, et al. "O075 Update in the management of Pierre Robin Sequence." Sleep Advances 5, Supplement_1 (2024): A27. https://doi.org/10.1093/sleepadvances/zpae070.075.

Abstract:

Background: Pierre Robin sequence (PRS) is a congenital triad of micrognathia, glossoptosis and airway obstruction. Current treatment approaches in Australia include nasopharyngeal airway, continuous positive airway pressure, and surgery including mandibular distraction osteogenesis. The Tübingen Palatal Plate (TPP) offers an orthodontic method for jaw distraction that has hitherto not been available in Australia. Since October 2023, we have assembled the team required to implement this treatment. This study evaluates the outcomes of patients managed with TPP at our centre, to date. Methods: Retrospective analysis comparing treatment outcomes for patients diagnosed with PRS managed at a tertiary paediatric centre from October 2023 to October 2024. Progress to date: Of ten patients with PRS treated at our centre since October 2023, seven (70%) were treated with TPP. Eight (80%) patients underwent polysomnography (PSG) prior to intervention. The mean apnoea hypopnea index (AHI) was 49.6 events/hr (±24.1; 16.4-88.5) and obstructive apnoea hypopnea index (OAHI) was 43.2 events/hr (±24.4; 15.5-88.5). A repeat PSG within 3 weeks post intervention demonstrated improvement in AHI by 24 events/hr (±24.9; -3.8-66; p=0.06) and OAHI by 18 events/hr (±9.2; 4.9-31.4; p=0.09). At the time of submission, 2 patients have completed TPP treatment with final PSG AHI/OAHI values of 4.5/3.4 and 9.4/3 respectively. This demonstrates mean reduction in AHI scores of 28.8 events/hr (±21.6; 11.9-45.7; p=0.33) and OAHI 22.7 events/hr (±15.5; 12.5-32.9; p=0.27). Intended outcome and impact: We have successfully implemented TPP therapy for PRS in Australia, confirming its effectiveness as an orthodontic method of jaw distraction.

50

Blaxter, Mark, Joana Pauperio, Conrad Schoch, and Kerstin Howe. "Taxonomy Identifiers (TaxId) for Biodiversity Genomics: a guide to getting TaxId for submission of data to public databases." Wellcome Open Research 9 (October 15, 2024): 591. http://dx.doi.org/10.12688/wellcomeopenres.22949.1.

Abstract:

Biodiversity genomics critically depends on correct taxonomic identification of the sample from which data are derived, and on tracking of that taxonomic information through systems that archive data and report on genome sequencing efforts. For submission of data to the International Nucleotide Sequence Database Collaboration (INSDC) databases (DNA DataBank of Japan [DDBJ], European Nucleotide Archive [ENA] and National Center for Biotechnology Information [NCBI]), samples and data derived from them must be assigned a species-level NCBI Taxonomy taxonomic identifier (TaxId, sometimes referred to as taxId or txid). We thus need to be able to identify the TaxId for a target species efficiently. Because the NCBI Taxonomy does not include all known species and cannot preemptively represent unknown taxa, we also need an efficient process for generating new TaxIds for species not yet listed. This document provides workflows for different kinds of TaxId acquisition scenarios and was created to guide users in these processes. Although developed for European projects such as Darwin Tree of Life and the European Reference Genome Atlas, the workflows are universally applicable and describe the use of ENA in resolving taxonomic issues. Too Long: Didn't Read (TL;DR): Use the ENA REST API programmatically to retrieve TaxIds for target species and confirm that sequence data can be submitted to those TaxIds. Use the NCBI Web interface to NCBI Taxonomy to identify potential homotypic synonyms. Request a new TaxId from ENA for a species not yet in NCBI Taxonomy, and for species-like entries for which the full Linnaean binomen is not determined (see https://ena-docs.readthedocs.io/en/latest/faq/taxonomy_requests.html#creating-taxon-requests). Discuss directly with the NCBI Taxonomy curators or the curators at ENA and NCBI whenever you think there is an opportunity to improve their database.
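The first TL;DR recommendation, retrieving a TaxId programmatically and checking that data can be submitted to it, might look like the sketch below. The abstract does not give the endpoint or response layout, so both are assumptions here: the base URL and the `taxId` and `submittable` field names should be verified against the current ENA documentation before use. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the ENA taxonomy REST service; verify in the ENA docs.
ENA_TAXONOMY_URL = "https://www.ebi.ac.uk/ena/taxonomy/rest/scientific-name/"


def parse_taxa(payload):
    """Return (taxId, submittable) pairs from a JSON response body.

    Assumes the body is a JSON list of records, each with a 'taxId' key and
    a 'submittable' key holding the string 'true' or 'false'.
    """
    records = json.loads(payload)
    return [
        (str(r["taxId"]), str(r.get("submittable", "")).lower() == "true")
        for r in records
    ]


def lookup_taxid(scientific_name, timeout=30):
    """Query the service for a scientific name (requires network access)."""
    url = ENA_TAXONOMY_URL + urllib.parse.quote(scientific_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_taxa(response.read().decode("utf-8"))
```

Keeping the parsing separate from the network call means the response handling can be checked offline against a saved response, which is useful if the service layout differs from what is assumed here.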
