Journal articles on the topic 'Calling sequence'

Consult the top 50 journal articles for your research on the topic 'Calling sequence.'

1

Zhao, Zhijia, Bo Wu, Mingzhou Zhou, Yufei Ding, Jianhua Sun, Xipeng Shen, and Youfeng Wu. "Call sequence prediction through probabilistic calling automata." ACM SIGPLAN Notices 49, no. 10 (December 31, 2014): 745–62. http://dx.doi.org/10.1145/2714064.2660221.

2

Chang, Chun-Tien, Chi-Neu Tsai, Chuan Yi Tang, Chun-Houh Chen, Jang-Hau Lian, Chi-Yu Hu, Chia-Lung Tsai, et al. "Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling." Scientific World Journal 2012 (2012): 1–10. http://dx.doi.org/10.1100/2012/365104.

Abstract:
The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.
3

Woerner, August E., Jennifer Churchill Cihlar, Utpal Smart, and Bruce Budowle. "Numt identification and removal with RtN!" Bioinformatics 36, no. 20 (July 24, 2020): 5115–16. http://dx.doi.org/10.1093/bioinformatics/btaa642.

Abstract:
Motivation: Assays in mitochondrial genomics rely on accurate read mapping and variant calling. However, there are known and unknown nuclear paralogs that have fundamentally different genetic properties than that of the mitochondrial genome. Such paralogs complicate the interpretation of mitochondrial genome data and confound variant calling. Results: Remove the Numts! (RtN!) was developed to categorize reads from massively parallel sequencing data not based on the expected properties and sequence identities of paralogous nuclear encoded mitochondrial sequences, but instead using sequence similarity to a large database of publicly available mitochondrial genomes. RtN! removes low-level sequencing noise and mitochondrial paralogs while not impacting variant calling, while competing methods were shown to remove true variants from mitochondrial mixtures. Availability and implementation: https://github.com/Ahhgust/RtN Supplementary information: Supplementary data are available at Bioinformatics online.
4

O'Connell, Jared, and Jonathan Marchini. "Joint Genotype Calling With Array and Sequence Data." Genetic Epidemiology 36, no. 6 (July 20, 2012): 527–37. http://dx.doi.org/10.1002/gepi.21657.

5

Tessier, Laurence, Olivier Côté, and Dorothee Bienzle. "Sequence variant analysis of RNA sequences in severe equine asthma." PeerJ 6 (October 11, 2018): e5759. http://dx.doi.org/10.7717/peerj.5759.

Abstract:
Background: Severe equine asthma is a chronic inflammatory disease of the lung in horses similar to low-Th2 late-onset asthma in humans. This study aimed to determine the utility of RNA-Seq to call gene sequence variants, and to identify sequence variants of potential relevance to the pathogenesis of asthma. Methods: RNA-Seq data were generated from endobronchial biopsies collected from six asthmatic and seven non-asthmatic horses before and after challenge (26 samples total). Sequences were aligned to the equine genome with Spliced Transcripts Alignment to Reference software. Read preparation for sequence variant calling was performed with Picard tools and Genome Analysis Toolkit (GATK). Sequence variants were called and filtered using GATK and Ensembl Variant Effect Predictor (VEP) tools, and two RNA-Seq predicted sequence variants were investigated with both PCR and Sanger sequencing. Supplementary analysis of novel sequence variant selection with VEP was based on a score of <0.01 predicted with Sorting Intolerant from Tolerant software, missense nature, location within the protein coding sequence and presence in all asthmatic individuals. For select variants, effect on protein function was assessed with Polymorphism Phenotyping 2 and screening for non-acceptable polymorphism 2 software. Sequences were aligned and 3D protein structures predicted with Geneious software. Difference in allele frequency between the groups was assessed using a Pearson’s Chi-squared test with Yates’ continuity correction, and difference in genotype frequency was calculated using the Fisher’s exact test for count data. Results: RNA-Seq variant calling and filtering correctly identified substitution variants in PACRG and RTTN. Sanger sequencing confirmed that the PACRG substitution was appropriately identified in all 26 samples while the RTTN substitution was identified correctly in 24 of 26 samples. These variants of uncertain significance had substitutions that were predicted to result in loss of function and to be non-neutral. Amino acid substitutions projected no change of hydrophobicity and isoelectric point in PACRG, and a change in both for RTTN. For PACRG, no difference in allele frequency between the two groups was detected but a higher proportion of asthmatic horses had the altered RTTN allele compared to non-asthmatic animals. Discussion: RNA-Seq was sensitive and specific for calling gene sequence variants in this disease model. Even moderate coverage (<10–20 counts per million) yielded correct identification in 92% of samples, suggesting RNA-Seq may be suitable to detect sequence variants in low coverage samples. The impact of amino acid alterations in PACRG and RTTN proteins, and possible association of the sequence variants with asthma, is of uncertain significance, but their role in ciliary function may be of future interest.
6

Bedo, Justin, Benjamin Goudey, Jeremy Wazny, and Zeyu Zhou. "Information theoretic alignment free variant calling." PeerJ Computer Science 2 (July 25, 2016): e71. http://dx.doi.org/10.7717/peerj-cs.71.

Abstract:
While traditional methods for calling variants across whole genome sequence data rely on alignment to an appropriate reference sequence, alternative techniques are needed when a suitable reference does not exist. We present a novel alignment and assembly free variant calling method based on information theoretic principles designed to detect variants that have strong statistical evidence for their ability to segregate samples in a given dataset. Our method uses the context surrounding a particular nucleotide to define variants. Given a set of reads, we model the probability of observing a given nucleotide conditioned on the surrounding prefix and suffixes of length k as a multinomial distribution. We then estimate which of these contexts are stable intra-sample and varying inter-sample using a statistic based on the Kullback–Leibler divergence. The utility of the variant calling method was evaluated through analysis of a pair of bacterial datasets and a mouse dataset. We found that our variants are highly informative for supervised learning tasks with performance similar to standard reference based calls and another reference free method (DiscoSNP++). Comparisons against reference based calls showed our method was able to capture very similar population structure on the bacterial dataset. The algorithm’s focus on discriminatory variants makes it suitable for many common analysis tasks for organisms that are too diverse to be mapped back to a single reference sequence.
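Illustrative note: the abstract above describes modelling the nucleotide seen in a fixed context as a multinomial distribution and comparing samples with a Kullback–Leibler-based statistic. The sketch below is only a minimal illustration of that idea, not the authors' statistic. The counts, the pseudocount smoothing, and the function names are all assumptions made for the example.

```python
import math

NUCLEOTIDES = "ACGT"


def nucleotide_distribution(counts, pseudocount=1.0):
    """Smoothed multinomial estimate of the base seen in one context.

    The pseudocount is an assumption of this sketch; it keeps every
    probability non-zero so the divergence below is always finite.
    """
    total = sum(counts.get(n, 0) + pseudocount for n in NUCLEOTIDES)
    return {n: (counts.get(n, 0) + pseudocount) / total for n in NUCLEOTIDES}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(p[n] * math.log(p[n] / q[n]) for n in NUCLEOTIDES)


# Hypothetical counts of the base observed between one prefix/suffix pair
# in three samples (made-up numbers, not data from the paper).
sample_a = {"A": 48, "C": 1, "G": 0, "T": 1}
sample_b = {"A": 2, "C": 0, "G": 47, "T": 1}
sample_c = {"A": 45, "C": 2, "G": 1, "T": 2}

p_a = nucleotide_distribution(sample_a)
p_b = nucleotide_distribution(sample_b)
p_c = nucleotide_distribution(sample_c)

# A context that separates samples gives a large divergence (a vs b);
# a context that looks the same in both gives a small one (a vs c).
divergence_ab = kl_divergence(p_a, p_b)
divergence_ac = kl_divergence(p_a, p_c)
```

In this toy setting a context would be interesting when the divergence between samples is large while repeated draws from the same sample stay close to each other.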
7

Mukbil, Awad, Umut Durak, and Sven Hartmann. "Conformance testing of FMI calling sequence for simulation environments." International Journal of Modeling, Simulation, and Scientific Computing 10, no. 02 (April 2019): 1950008. http://dx.doi.org/10.1142/s1793962319500089.

Abstract:
Exchanging simulation models is currently of utmost importance. To improve interoperability between suppliers and original equipment manufacturers (OEMs), the functional mock-up interface (FMI) is exchanged in a standard format called functional mock-up unit (FMU). Since its first release, many simulation tools took the initiative to support FMI. However, since then, there have been many complaints stating that exchanging models via FMI does not work as stably as expected. The reason usually turned out to be the implementations of tool vendors, which sometimes fail to comply fully with the standard. This paper introduces a methodology for testing FMI compliance of importing simulation tools using a set of reference FMUs. The standard defines the calling sequence of FMI functions as a state machine. Therefore, conformance testing (also called fault detection) from automata theory is utilized to produce reference FMUs based on the FMI state machine.
8

Dewal, N., Y. Hu, M. L. Freedman, T. LaFramboise, and I. Pe'er. "Calling amplified haplotypes in next generation tumor sequence data." Genome Research 22, no. 2 (November 16, 2011): 362–74. http://dx.doi.org/10.1101/gr.122564.111.

9

Ruark, Elise, Esty Holt, Anthony Renwick, Márton Münz, Matthew Wakeling, Sian Ellard, Shazia Mahamdallie, Shawn Yost, and Nazneen Rahman. "ICR142 Benchmarker: evaluating, optimising and benchmarking variant calling using the ICR142 NGS validation series." Wellcome Open Research 3 (August 31, 2018): 108. http://dx.doi.org/10.12688/wellcomeopenres.14754.1.

Abstract:
Evaluating, optimising and benchmarking of next generation sequencing (NGS) variant calling performance are essential requirements for clinical, commercial and academic NGS pipelines. Such assessments should be performed in a consistent, transparent and reproducible fashion, using independently, orthogonally generated data. Here we present ICR142 Benchmarker, a tool to generate outputs for assessing variant calling performance using the ICR142 NGS validation series, a dataset of exome sequence data from 142 samples together with Sanger sequence data at 704 sites. ICR142 Benchmarker provides summary and detailed information on the sensitivity, specificity and false detection rates of variant callers. ICR142 Benchmarker also automatically generates a single page report highlighting key performance metrics and how performance compares to widely-used open-source tools. We used ICR142 Benchmarker with VCF files outputted by GATK, OpEx and DeepVariant to create a benchmark for variant calling performance. This evaluation revealed pipeline-specific differences and shared challenges in variant calling, for example in detecting indels in short repeating sequence motifs. We next used ICR142 Benchmarker to perform regression testing with versions 0.5.2 and 0.6.1 of DeepVariant. This showed that v0.6.1 improves variant calling performance, but there was evidence of some minor changes in indel calling behaviour that may benefit from attention in future updates. The data also allowed us to evaluate filters to optimise DeepVariant calling, and we recommend using 30 as the QUAL threshold for base substitution calls when using DeepVariant v0.6.1. Finally, we used ICR142 Benchmarker with VCF files from two commercial variant calling providers to facilitate optimisation of their in-house pipelines and to provide transparent benchmarking of their performance. ICR142 Benchmarker consistently and transparently analyses variant calling performance based on the ICR142 NGS validation series, using the standard VCF input and outputting informative metrics to enable user understanding of pipeline performance. ICR142 Benchmarker is freely available at https://github.com/RahmanTeamDevelopment/ICR142_Benchmarker/releases.
10

Ruark, Elise, Esty Holt, Anthony Renwick, Márton Münz, Matthew Wakeling, Sian Ellard, Shazia Mahamdallie, Shawn Yost, and Nazneen Rahman. "ICR142 Benchmarker: evaluating, optimising and benchmarking variant calling performance using the ICR142 NGS validation series." Wellcome Open Research 3 (October 31, 2018): 108. http://dx.doi.org/10.12688/wellcomeopenres.14754.2.

Abstract:
Evaluating, optimising and benchmarking of next generation sequencing (NGS) variant calling performance are essential requirements for clinical, commercial and academic NGS pipelines. Such assessments should be performed in a consistent, transparent and reproducible fashion, using independently, orthogonally generated data. Here we present ICR142 Benchmarker, a tool to generate outputs for assessing germline base substitution and indel calling performance using the ICR142 NGS validation series, a dataset of Illumina platform-based exome sequence data from 142 samples together with Sanger sequence data at 704 sites. ICR142 Benchmarker provides summary and detailed information on the sensitivity, specificity and false detection rates of variant callers. ICR142 Benchmarker also automatically generates a single page report highlighting key performance metrics and how performance compares to widely-used open-source tools. We used ICR142 Benchmarker with VCF files outputted by GATK, OpEx and DeepVariant to create a benchmark for variant calling performance. This evaluation revealed pipeline-specific differences and shared challenges in variant calling, for example in detecting indels in short repeating sequence motifs. We next used ICR142 Benchmarker to perform regression testing with DeepVariant versions 0.5.2 and 0.6.1. This showed that v0.6.1 improves variant calling performance, but there was evidence of minor changes in indel calling behaviour that may benefit from attention. The data also allowed us to evaluate filters to optimise DeepVariant calling, and we recommend using 30 as the QUAL threshold for base substitution calls when using DeepVariant v0.6.1. Finally, we used ICR142 Benchmarker with VCF files from two commercial variant calling providers to facilitate optimisation of their in-house pipelines and to provide transparent benchmarking of their performance. ICR142 Benchmarker consistently and transparently analyses variant calling performance based on the ICR142 NGS validation series, using the standard VCF input and outputting informative metrics to enable user understanding of pipeline performance. ICR142 Benchmarker is freely available at https://github.com/RahmanTeamDevelopment/ICR142_Benchmarker/releases.
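Illustrative note: the abstract above reports sensitivity, specificity and false detection rates for variant callers. The sketch below shows how such metrics are commonly derived from confusion-matrix counts. It assumes the usual textbook definitions (false detection rate taken as FP / (TP + FP)), which may not match the tool's exact definitions, and the counts are hypothetical.

```python
def calling_metrics(tp, fp, tn, fn):
    """Common summary metrics from confusion-matrix counts.

    tp/fn: true variant sites called / missed by the pipeline
    tn/fp: non-variant sites correctly left uncalled / wrongly called
    Returns NaN for a metric whose denominator is zero.
    """
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "false_detection_rate": fp / (tp + fp) if (tp + fp) else nan,
    }


# Hypothetical counts for one pipeline compared against validated sites.
metrics = calling_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all three together matters because a caller can trade sensitivity for a lower false detection rate simply by raising a quality threshold.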
11

Ngeno, K. "Variant Calling pipeline for Next Generation Sequence Data – A review." Journal of Animal Science and Veterinary Medicine 3, no. 4 (August 30, 2018): 90–93. http://dx.doi.org/10.31248/jasvm2018.092.

12

Illingworth, C. J. R. "SAMFIRE: multi-locus variant calling for time-resolved sequence data." Bioinformatics 32, no. 14 (April 22, 2016): 2208–9. http://dx.doi.org/10.1093/bioinformatics/btw205.

13

Prodanov, Timofey, and Vikas Bansal. "Sensitive alignment using paralogous sequence variants improves long-read mapping and variant calling in segmental duplications." Nucleic Acids Research 48, no. 19 (October 9, 2020): e114-e114. http://dx.doi.org/10.1093/nar/gkaa829.

Abstract:
The ability to characterize repetitive regions of the human genome is limited by the read lengths of short-read sequencing technologies. Although long-read sequencing technologies such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies can potentially overcome this limitation, long segmental duplications with high sequence identity pose challenges for long-read mapping. We describe a probabilistic method, DuploMap, designed to improve the accuracy of long-read mapping in segmental duplications. It analyzes reads mapped to segmental duplications using existing long-read aligners and leverages paralogous sequence variants (PSVs), sequence differences between paralogous sequences, to distinguish between multiple alignment locations. On simulated datasets, DuploMap increased the percentage of correctly mapped reads with high confidence for multiple long-read aligners including Minimap2 (74.3–90.6%) and BLASR (82.9–90.7%) while maintaining high precision. Across multiple whole-genome long-read datasets, DuploMap aligned an additional 8–21% of the reads in segmental duplications with high confidence relative to Minimap2. Using DuploMap-aligned PacBio circular consensus sequencing reads, an additional 8.9 Mb of DNA sequence was mappable, variant calling achieved a higher F1 score and 14 713 additional variants supported by linked-read data were identified. Finally, we demonstrate that a significant fraction of PSVs in segmental duplications overlaps with variants and adversely impacts short-read variant calling.
14

Park, Su-Young, and Chai-Yeoung Jung. "Genotype-Calling System for Somatic Mutation Discovery in Cancer Genome Sequence." Journal of the Korea Institute of Information and Communication Engineering 17, no. 12 (December 31, 2013): 3009–15. http://dx.doi.org/10.6109/jkiice.2013.17.12.3009.

15

Bailey, Mark W., and Jack W. Davidson. "Target-sensitive construction of diagnostic programs for procedure calling sequence generators." ACM SIGPLAN Notices 31, no. 5 (May 1996): 249–57. http://dx.doi.org/10.1145/249069.231431.

16

Vancuren, Sarah J., Scott J. Dos Santos, and Janet E. Hill. "Evaluation of variant calling for cpn60 barcode sequence-based microbiome profiling." PLOS ONE 15, no. 7 (July 9, 2020): e0235682. http://dx.doi.org/10.1371/journal.pone.0235682.

17

Flickinger, Matthew, Goo Jun, Gonçalo R. Abecasis, Michael Boehnke, and Hyun Min Kang. "Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data." American Journal of Human Genetics 97, no. 2 (August 2015): 284–90. http://dx.doi.org/10.1016/j.ajhg.2015.07.002.

18

Matsumoto, Yui K., and Kazuo Okanoya. "Mice modulate ultrasonic calling bouts according to sociosexual context." Royal Society Open Science 5, no. 6 (June 2018): 180378. http://dx.doi.org/10.1098/rsos.180378.

Abstract:
Mice produce various sounds within the ultrasonic range in social contexts. Although these sounds are often used as an index of sociability in biomedical research, their biological significance remains poorly understood. We previously showed that mice repeatedly produced calls in a sequence (i.e. calling bout), which can vary in their structure, such as Simple, Complex or Harmonics. In this study, we investigated the use of the three types of calling bouts in different sociosexual interactions, including both same- and opposite-sex contexts. In same-sex contexts, males typically produced a Simple calling bout, whereas females mostly produced a Complex one. By contrast, in the opposite-sex context, they produced all the three types of calling bouts, but the use of each calling type varied according to the progress and mode of sociosexual interaction (e.g. Harmonic calling bout was specifically produced during reproductive behaviour). These results indicate that mice change the structure of calling bout according to sociosexual contexts, suggesting the presence of multiple functional signals in their ultrasonic communication.
19

Subudhi, Sharmila, Suvasini Panigrahi, and Tanmay Kumar Behera. "Detection of Mobile Phone Fraud Using Possibilistic Fuzzy C-Means Clustering and Hidden Markov Model." International Journal of Synthetic Emotions 7, no. 2 (July 2016): 23–44. http://dx.doi.org/10.4018/ijse.2016070102.

Abstract:
This paper presents a novel approach for fraud detection in mobile phone networks by using a combination of Possibilistic Fuzzy C-Means clustering and Hidden Markov Model (HMM). The clustering technique is first applied on two calling features extracted from the past call records of a subscriber generating a behavioral profile for the user. The HMM parameters are computed from the profile, which are used to generate some profile sequences for training. The trained HMM model is then applied for detecting fraudulent activities on incoming call sequences. A calling instance is detected as forged when the new sequence is not accepted by the trained model with sufficiently high probability. The efficacy of the proposed system is demonstrated by extensive experiments carried out with Reality Mining dataset. Furthermore, the comparative analysis performed with other clustering methods and another approach recently proposed in the literature justifies the effectiveness of the proposed algorithm.
20

López, C., M. Eizaguirre, and R. Albajes. "Courtship and mating behaviour of the Mediterranean corn borer, Sesamia nonagrioides (Lepidoptera: Noctuidae)." Spanish Journal of Agricultural Research 1, no. 1 (March 1, 2003): 43. http://dx.doi.org/10.5424/sjar/2003011-8.

Abstract:
The behavioural sequence of courtship and mating of Sesamia nonagrioides Lefèbvre males and females in laboratory cages is described. The attractant capacity of both sexes was studied by tethering males or females. The courtship and mating sequence of tethered females was the same as untethered ones, but tethered males were absolutely inactive. Age (1 day versus 2 days) did not affect the mating rate. Differences in calling time onset and first calling age were found between north-eastern Spanish and Greek populations. The role of hair pencils in courtship flight is discussed, rejecting that they stimulate calling behaviour of females or that they attract females. The influence of adult population density was studied in laboratory and field tests, and was found to have no effect on the mating rate in the controls. However, moth density influenced mating rates where mating disruption treatments were applied. Several consequences for investigations into pheromone composition and its use in monitoring and population control are reported in the discussion.
21

Kingwara, Leonard, Muthoni Karanja, Catherine Ngugi, Geoffrey Kangogo, Kipkerich Bera, Maureen Kimani, Nancy Bowen, Dorcus Abuya, Violet Oramisi, and Irene Mukui. "From Sequence Data to Patient Result: A Solution for HIV Drug Resistance Genotyping With Exatype, End to End Software for Pol-HIV-1 Sanger Based Sequence Analysis and Patient HIV Drug Resistance Result Generation." Journal of the International Association of Providers of AIDS Care (JIAPAC) 19 (January 1, 2020): 232595822096268. http://dx.doi.org/10.1177/2325958220962687.

Abstract:
Introduction: With the rapid scale-up of antiretroviral therapy (ART) to treat HIV infection, there are ongoing concerns regarding probable emergence and transmission of HIV drug resistance (HIVDR) mutations. This scale-up has led to an increased need for routine HIVDR testing to inform the clinical decision on a regimen switch. Although the majority of wet laboratory processes are standardized, slow, labor-intensive data transfer and subjective manual sequence interpretation steps are still required to finalize and release patient results. We thus set out to validate the applicability of a software package to generate HIVDR patient results from raw sequence data independently. Methods: We assessed the performance characteristics of Hyrax Bioscience’s Exatype (a sequence data to patient result, fully automated sequence analysis software, which consolidates RECall, MEGA X and the Stanford HIV database) against the standard method (RECall and Stanford database). Exatype is a web-based HIV drug resistance bioinformatic pipeline available at sanger.exatype.com. To validate Exatype, we used a test set of 135 remnant HIV viral load samples at the National HIV Reference Laboratory (NHRL). Result: We analyzed, and successfully generated results of 126 sequences out of 135 specimens by both Standard and Exatype software. Result production using Exatype required minimal hands-on time in comparison to the Standard (6 computation-hours using the standard method versus 1.5 Exatype computation-hours). Concordance between the 2 systems was 99.8% for 311,227 bases compared. 99.7% of the 0.2% discordant bases were attributed to nucleotide mixtures as a result of the sequence editing in RECall. Both methods identified similar (99.1%) critical antiretroviral resistance-associated mutations resulting in a 99.2% concordance of resistance susceptibility interpretations. The base-calling comparison between the 2 methods had Cohen’s kappa (0.97 to 0.99), implying an almost perfect agreement with minimal base calling variation. On a predefined dataset, RECall editing displayed the highest probability to score mixtures accurately (1 vs. 0.71) and the lowest chance to inaccurately assign mixtures to pure nucleotides (0.002–0.0008). This advantage is attributable to the manual sequence editing in RECall. Conclusion: The reduction in hands-on time needed is a benefit when using the Exatype HIV DR sequence analysis platform and result generation tool. There is a minimal difference in base calling between Exatype and standard methods. Although the discrepancy has minimal impact on drug resistance interpretation, allowance of sequence editing in Exatype as RECall can significantly improve its performance.
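Illustrative note: the abstract above summarises base-calling agreement between two methods with Cohen's kappa. The sketch below computes the standard unweighted kappa for two equal-length strings of base calls. The sequences are made up for the example, and the function name is an assumption; this is not code from the study.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Unweighted Cohen's kappa for two equal-length sequences of base calls."""
    if len(calls_a) != len(calls_b) or len(calls_a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of positions with identical calls.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement from each method's own symbol frequencies.
    freq_a = Counter(calls_a)
    freq_b = Counter(calls_b)
    expected = sum(freq_a[s] * freq_b[s] for s in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods emit one identical symbol everywhere
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls; "R" is the IUPAC code for an A/G mixture,
# the kind of position where the two methods might disagree.
method_one = "ACGTACGTAC"
method_two = "ACGTACGTAR"

kappa = cohens_kappa(method_one, method_two)
print(f"kappa = {kappa:.3f}")
```

Kappa is lower than the raw 90% agreement here because it discounts the matches two callers would reach by chance given their base frequencies.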
22

VanRaden, P. M., D. M. Bickhart, and J. R. O'Connell. "Calling known variants and identifying new variants while rapidly aligning sequence data." Journal of Dairy Science 102, no. 4 (April 2019): 3216–29. http://dx.doi.org/10.3168/jds.2018-15172.

23

Wang, Jing, Jingyang Gao, and Cheng Ling. "Deletion genotype calling on the basis of sequence visualisation and image classification." International Journal of Data Mining and Bioinformatics 20, no. 2 (2018): 109. http://dx.doi.org/10.1504/ijdmb.2018.093682.

24

Herzeel, Charlotte, Pascal Costanza, Dries Decap, Jan Fostier, and Joke Reumers. "elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling." PLOS ONE 10, no. 7 (July 16, 2015): e0132868. http://dx.doi.org/10.1371/journal.pone.0132868.

25

Giddings, Michael C., Robert L. Brumley, Michael Haker, and Lloyd M. Smith. "An adaptive, object oriented strategy for base calling in DNA sequence analysis." Nucleic Acids Research 21, no. 19 (1993): 4530–40. http://dx.doi.org/10.1093/nar/21.19.4530.

26

Sueur, J., and T. Aubin. "Acoustic communication in the Palaearctic red cicada, Tibicina haematodes: chorus organisation, calling-song structure, and signal recognition." Canadian Journal of Zoology 80, no. 1 (January 1, 2002): 126–36. http://dx.doi.org/10.1139/z01-212.

Abstract:
Males of the Palaearctic red cicada, Tibicina haematodes, produce calling songs that are attractive to both sexes. For the first time we (i) describe the organisation of the chorus formed by aggregating males, (ii) analysed the physical characteristics of the calling song, and (iii) used playback experiments of natural, modified, and allospecific signals to investigate the signal-recognition process. Males overlap each other's calling song and try to call first and last during a chorus, leading to what we term domino and last-word effects, respectively. The calling song consists of a two-part sequence made up of a succession of pulses. It is characterized by slow and fast amplitude modulations and three frequency bands. The structure of the signal varied among individuals in both temporal and frequency parameters. Our playback experiments showed that males make a rough analysis of frequency and duration features of the signal. They pay no attention to amplitude modulations. Because males are not capable of precise analysis, they reply to various allospecific calling songs. Females' analysis of the calling song being difficult to test, the role of this signal in sexual selection still needs to be documented.
27

King, David J., Graham Freimanis, Lidia Lasecka-Dykes, Amin Asfor, Paolo Ribeca, Ryan Waters, Donald P. King, and Emma Laing. "A Systematic Evaluation of High-Throughput Sequencing Approaches to Identify Low-Frequency Single Nucleotide Variants in Viral Populations." Viruses 12, no. 10 (October 20, 2020): 1187. http://dx.doi.org/10.3390/v12101187.

Abstract:
High-throughput sequencing approaches such as those provided by Illumina are an efficient way to understand sequence variation within viral populations. However, challenges exist in distinguishing process-introduced error from biological variance, which significantly impacts our ability to identify sub-consensus single-nucleotide variants (SNVs). Here we have taken a systematic approach to evaluate laboratory and bioinformatic pipelines to accurately identify low-frequency SNVs in viral populations. Artificial DNA and RNA “populations” were created by introducing known SNVs at predetermined frequencies into template nucleic acid before being sequenced on an Illumina MiSeq platform. These were used to assess the effects of abundance and starting input material type, technical replicates, read length and quality, short-read aligner, and percentage frequency thresholds on the ability to accurately call variants. Analyses revealed that the abundance and type of input nucleic acid had the greatest impact on the accuracy of SNV calling as measured by a micro-averaged Matthews correlation coefficient score, with DNA and high RNA inputs (10^7 copies) allowing for variants to be called at a 0.2% frequency. Reduced input RNA (10^5 copies) required more technical replicates to maintain accuracy, while low RNA inputs (10^3 copies) suffered from consensus-level errors. Base errors at specific motifs were also identified in all technical replicates and can be excluded to further increase SNV calling accuracy. These findings indicate that samples with low RNA inputs should be excluded for SNV calling and reinforce the importance of optimising the technical and bioinformatics steps in pipelines that are used to accurately identify sequence variants.
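Illustrative note: the abstract above scores SNV calling accuracy with a micro-averaged Matthews correlation coefficient (MCC). One common reading of "micro-averaged" is to pool the confusion counts over all replicates and compute a single MCC from the pooled totals; the sketch below assumes that reading, which may differ from the authors' procedure. The replicate counts are hypothetical.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


def micro_averaged_mcc(replicates):
    """Pool counts over replicates, then compute one MCC (assumed meaning)."""
    pooled = {key: sum(rep[key] for rep in replicates)
              for key in ("tp", "fp", "tn", "fn")}
    return matthews_corrcoef(**pooled)


# Hypothetical per-replicate counts of SNV calls against known spiked-in SNVs.
replicates = [
    {"tp": 8, "fp": 1, "tn": 90, "fn": 1},
    {"tp": 7, "fp": 2, "tn": 89, "fn": 2},
]

score = micro_averaged_mcc(replicates)
print(f"micro-averaged MCC = {score:.3f}")
```

MCC is a reasonable choice for this kind of data because true SNV sites are rare relative to invariant sites, and MCC stays informative under that class imbalance.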
28

Li, Wei, Peng Liu, and Hao Chen. "Research and Implementation of Malicious Code Behavior Analysis." Applied Mechanics and Materials 182-183 (June 2012): 1938–42. http://dx.doi.org/10.4028/www.scientific.net/amm.182-183.1938.

Abstract:
With the rapid development of network technology, viruses, Trojans, and other malicious code are evolving at an unprecedented pace, constantly threatening the security of both collective and personal information. Behavior-based malware analysis aims to determine whether code is malicious from its behavioral characteristics, which can effectively address the zero-day attacks that traditional anti-virus technology can hardly prevent. This paper studies how to monitor and record the API calling sequence of a running program, and how to derive behavioral eigenvectors by analyzing the calling sequence of sensitive APIs. This allows the behavior of malicious code to be tracked and provides support and a theoretical basis for addressing the potential threat of malicious code.
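The abstract does not describe how the behavior vectors are built from the API calling sequence. As a rough sketch of one possible approach, the snippet below keeps only calls from a "sensitive" list and counts consecutive pairs (bigrams) among them. The list of sensitive APIs, the bigram encoding, and the function name `behavior_vector` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Assumed example set of "sensitive" Windows API names (illustrative only).
SENSITIVE_APIS = (
    "CreateFileW",
    "WriteProcessMemory",
    "CreateRemoteThread",
    "RegSetValueExW",
)


def behavior_vector(call_sequence, sensitive=SENSITIVE_APIS):
    """Encode a recorded API call sequence as bigram counts over sensitive APIs.

    Calls outside the sensitive list are dropped first, so the vector captures
    the order in which sensitive APIs follow one another.
    """
    filtered = [call for call in call_sequence if call in sensitive]
    bigrams = Counter(zip(filtered, filtered[1:]))
    # Fixed ordering: one slot per ordered pair (a, b) of sensitive APIs.
    return [bigrams[(a, b)] for a in sensitive for b in sensitive]


# Hypothetical trace recorded while a program runs.
trace = [
    "CreateFileW", "ReadFile", "WriteProcessMemory",
    "CreateRemoteThread", "WriteProcessMemory", "CreateRemoteThread",
]
vector = behavior_vector(trace)
```

A fixed-length vector like this can be compared across programs or passed to a classifier. Longer n-grams would preserve more ordering information at the cost of a larger, sparser vector.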
29

Preheim, Sarah P., Allison R. Perrotta, Antonio M. Martin-Platero, Anika Gupta, and Eric J. Alm. "Distribution-Based Clustering: Using Ecology To Refine the Operational Taxonomic Unit." Applied and Environmental Microbiology 79, no. 21 (August 23, 2013): 6593–603. http://dx.doi.org/10.1128/aem.00342-13.

Abstract:
16S rRNA sequencing, commonly used to survey microbial communities, begins by grouping individual reads into operational taxonomic units (OTUs). There are two major challenges in calling OTUs: identifying bacterial population boundaries and differentiating true diversity from sequencing errors. Current approaches to identifying taxonomic groups or eliminating sequencing errors rely on sequence data alone, but both of these activities could be informed by the distribution of sequences across samples. Here, we show that using the distribution of sequences across samples can help identify population boundaries even in noisy sequence data. The logic underlying our approach is that bacteria in different populations will often be highly correlated in their abundance across different samples. Conversely, 16S rRNA sequences derived from the same population, whether slightly different copies in the same organism, variation of the 16S rRNA gene within a population, or sequences generated randomly in error, will have the same underlying distribution across sampled environments. We present a simple OTU-calling algorithm (distribution-based clustering) that uses both genetic distance and the distribution of sequences across samples and demonstrate that it is more accurate than other methods at grouping reads into OTUs in a mock community. Distribution-based clustering also performs well on environmental samples: it is sensitive enough to differentiate between OTUs that differ by a single base pair yet predicts fewer overall OTUs than most other methods. The program can decrease the total number of OTUs with redundant information and improve the power of many downstream analyses to describe biologically relevant trends.
30

Perez-Enciso, Miguel. "229 DNA sequence assisted prediction: the uncomfortable truth." Journal of Animal Science 97, Supplement_3 (December 2019): 55. http://dx.doi.org/10.1093/jas/skz258.112.

Abstract:
The benefit of using whole genome sequence to improve genomic prediction, relative to high density SNP arrays, has been well below expectations, despite some overoptimistic computer simulations. Why is this so? First, NGS data are massive and noisy, and their bioinformatics analysis is expensive when applied at the scale needed in animal breeding. SNP calling is a tricky procedure that is especially sensitive to low depth sequencing. This makes NGS data far more expensive than array genotyping. Second, rare variants are the most frequent class of variants. Population genetics theory dictates that the number of SNPs of a given frequency f is inversely proportional to f. For prediction purposes, it is clear that rare variants are not useful, because it is very likely that they do not segregate in both testing and training subpopulations. Third, sequence contains highly repetitive information: the number of new SNPs decreases quickly as new samples are added and, further, low effective population sizes in domestic animals cause disequilibrium to be large. What can we do about it? First, high density data can be imputed up to sequence; this has a mild, and limited, effect on improving accuracy. Second, sequence at very low depth on numerous animals can be obtained. This is an extremely risky option that I discourage due to strong biases in heterozygous genotype calling. Third, predictions can be constructed using some sort of prior information (e.g., based on known causative genes or from GWAS studies) together with high density, perhaps custom designed arrays. I believe this is the most promising approach.
31

Kalbfleisch, Ted, and Michael P. Heaton. "Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes." F1000Research 2 (November 14, 2013): 244. http://dx.doi.org/10.12688/f1000research.2-244.v1.

Abstract:
Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these species have provided unique insights into mammalian gene function. However, the number of species with reference genomes is small compared to those needed for studying molecular evolutionary relationships in the tree of life. For example, among the even-toed ungulates there are approximately 300 species whose phylogenetic relationships have been calculated in the 10k trees project. Only six of these have reference genomes: cattle, swine, sheep, goat, water buffalo, and bison. Although reference sequences will eventually be developed for additional hoof stock, the resources in terms of time, money, infrastructure and expertise required to develop a quality reference genome may be unattainable for most species for at least another decade. In this work we mapped 35 Gb of next generation sequence data of a Katahdin sheep to its own species’ reference genome (Ovis aries Oar3.1) and to that of a species that diverged 15 to 30 million years ago (Bos taurus UMD3.1). In total, 56% of reads covered 76% of UMD3.1 to an average depth of 6.8 reads per site, 83 million variants were identified, of which 78 million were homozygous and likely represent interspecies nucleotide differences. Excluding genome repeat regions and sex chromosomes, approximately 3.7 million heterozygous sites were identified in this animal vs. bovine UMD3.1, representing polymorphisms occurring in sheep. Of these, 41% could be readily mapped to orthologous positions in ovine Oar3.1 with 80% corroborated as heterozygous. These variant sites, identified via interspecies mapping could be used for comparative genomics, disease association studies, and ultimately to understand mammalian gene function.
32

Kalbfleisch, Ted, and Michael P. Heaton. "Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes." F1000Research 2 (February 10, 2014): 244. http://dx.doi.org/10.12688/f1000research.2-244.v2.

Abstract:
Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these species have provided unique insights into mammalian gene function. However, the number of species with reference genomes is small compared to those needed for studying molecular evolutionary relationships in the tree of life. For example, among the even-toed ungulates there are approximately 300 species whose phylogenetic relationships have been calculated in the 10k trees project. Only six of these have reference genomes: cattle, swine, sheep, goat, water buffalo, and bison. Although reference sequences will eventually be developed for additional hoof stock, the resources in terms of time, money, infrastructure and expertise required to develop a quality reference genome may be unattainable for most species for at least another decade. In this work we mapped 35 Gb of next generation sequence data of a Katahdin sheep to its own species’ reference genome (Ovis aries Oar3.1) and to that of a species that diverged 15 to 30 million years ago (Bos taurus UMD3.1). In total, 56% of reads covered 76% of UMD3.1 to an average depth of 6.8 reads per site, 83 million variants were identified, of which 78 million were homozygous and likely represent interspecies nucleotide differences. Excluding repeat regions and sex chromosomes, nearly 3.7 million heterozygous sites were identified in this animal vs. bovine UMD3.1, representing polymorphisms occurring in sheep. Of these, 41% could be readily mapped to orthologous positions in ovine Oar3.1 with 80% corroborated as heterozygous. These variant sites, identified via interspecies mapping could be used for comparative genomics, disease association studies, and ultimately to understand mammalian gene function.
33

Xu, Huilei, John DiCarlo, Ravi Satya, Quan Peng, and Yexun Wang. "Comparison of somatic mutation calling methods in amplicon and whole exome sequence data." BMC Genomics 15, no. 1 (2014): 244. http://dx.doi.org/10.1186/1471-2164-15-244.

34

Holm, Ingrid A., Timothy W. Yu, and Steven Joffe. "From Sequence Data to Returnable Results: Ethical Issues in Variant Calling and Interpretation." Genetic Testing and Molecular Biomarkers 21, no. 3 (March 2017): 178–83. http://dx.doi.org/10.1089/gtmb.2016.0413.

35

Schilbert, Hanna Marie, Andreas Rempel, and Boas Pucker. "Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data." Plants 9, no. 4 (April 2, 2020): 439. http://dx.doi.org/10.3390/plants9040439.

Abstract:
High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step.
36

Yang, Jianfeng, Xiaofan Ding, Xing Sun, Shui-Ying Tsang, and Hong Xue. "SAMSVM: A tool for misalignment filtration of SAM-format sequences with support vector machine." Journal of Bioinformatics and Computational Biology 13, no. 06 (December 2015): 1550025. http://dx.doi.org/10.1142/s0219720015500250.

Abstract:
Sequence alignment/map (SAM) formatted sequences [Li H, Handsaker B, Wysoker A et al., Bioinformatics 25(16):2078–2079, 2009.] have taken on a main role in bioinformatics since the development of massive parallel sequencing. However, because misalignment of sequences poses a significant problem in analysis of sequencing data that could lead to false positives in variant calling, the exclusion of misaligned reads is a necessity in analysis. In this regard, the multiple features of SAM-formatted sequences can be treated as vectors in a multi-dimension space to allow the application of a support vector machine (SVM). Applying the LIBSVM tools developed by Chang and Lin [Chang C-C, Lin C-J, ACM Trans Intell Syst Technol 2:1–27, 2011.] as a simple interface for support vector classification, the SAMSVM package has been developed in this study to enable misalignment filtration of SAM-formatted sequences. Cross-validation between two simulated datasets processed with SAMSVM yielded accuracies that ranged from 0.89 to 0.97 with F-scores ranging from 0.77 to 0.94 in 14 groups characterized by different mutation rates from 0.001 to 0.1, indicating that the model built using SAMSVM was accurate in misalignment detection. Application of SAMSVM to actual sequencing data resulted in filtration of misaligned reads and correction of variant calling.
37

Rautiainen, Mikko, Veli Mäkinen, and Tobias Marschall. "Bit-parallel sequence-to-graph alignment." Bioinformatics 35, no. 19 (March 9, 2019): 3599–607. http://dx.doi.org/10.1093/bioinformatics/btz162.

Abstract:
Motivation: Graphs are commonly used to represent sets of sequences. Either edges or nodes can be labeled by sequences, so that each path in the graph spells a concatenated sequence. Examples include graphs to represent genome assemblies, such as string graphs and de Bruijn graphs, and graphs to represent a pan-genome and hence the genetic variation present in a population. Being able to align sequencing reads to such graphs is a key step for many analyses and its applications include genome assembly, read error correction and variant calling with respect to a variation graph. Results: We generalize two linear sequence-to-sequence algorithms to graphs: the Shift-And algorithm for exact matching and Myers’ bitvector algorithm for semi-global alignment. These linear algorithms are both based on processing w sequence characters with a constant number of operations, where w is the word size of the machine (commonly 64), and achieve a speedup of up to w over naive algorithms. For a graph with |V| nodes and |E| edges and a sequence of length m, our bitvector-based graph alignment algorithm reaches a worst case runtime of O(|V| + ⌈m/w⌉ |E| log w) for acyclic graphs and O(|V| + m |E| log w) for arbitrary cyclic graphs. We apply it to five different types of graphs and observe a speedup between 3-fold and 20-fold compared with a previous (asymptotically optimal) alignment algorithm. Availability and implementation: https://github.com/maickrau/GraphAligner Supplementary information: Supplementary data are available at Bioinformatics online.
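To make the bit-parallel idea concrete, here is a minimal sketch of the classic Shift-And algorithm for exact matching on an ordinary linear text. It is the sequence-to-sequence starting point named in the abstract, not the graph generalization the paper contributes, and the function name `shift_and` is an illustrative choice.

```python
def shift_and(pattern, text):
    """Return the start index of every exact occurrence of pattern in text.

    Bit i of `state` is set when pattern[0..i] matches the text ending at the
    current position, so one shift and one AND update all prefixes at once.
    """
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set wherever pattern[i] == c.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit marking a match of the full pattern
    state = 0
    hits = []
    for j, ch in enumerate(text):
        # Extend every live prefix by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits


matches = shift_and("ACA", "ACACA")
```

Python integers are unbounded, so this sketch does not split the pattern into machine words. In a fixed-width implementation a pattern of length m occupies ⌈m/w⌉ words, which is where the ⌈m/w⌉ factor in the stated runtime comes from.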
38

Lepais, Olivier, Emilie Chancerel, Christophe Boury, Franck Salin, Aurélie Manicki, Laura Taillebois, Cyril Dutech, et al. "Fast sequence-based microsatellite genotyping development workflow." PeerJ 8 (May 4, 2020): e9085. http://dx.doi.org/10.7717/peerj.9085.

Abstract:
Application of high-throughput sequencing technologies to microsatellite genotyping (SSRseq) has been shown to remove many of the limitations of electrophoresis-based methods and to refine inference of population genetic diversity and structure. We present here a streamlined SSRseq development workflow that includes microsatellite development, multiplexed marker amplification and sequencing, and automated bioinformatics data analysis. We illustrate its application to five groups of species across phyla (fungi, plant, insect and fish) with different levels of genomic resource availability. We found that relying on previously developed microsatellite assays is not optimal and leads to a low number of reliably genotyped loci. In contrast, de novo ad hoc primer design gives highly multiplexed microsatellite assays that can be sequenced to produce high quality genotypes for 20–40 loci. We highlight critical upfront development factors to consider for effective SSRseq setup in a wide range of situations. Sequence analysis accounting for all linked polymorphisms along the sequence quickly generates a powerful multi-allelic haplotype-based genotypic dataset, calling for new theoretical and analytical frameworks to extract more information from multi-nucleotide polymorphism marker systems.
39

Wang, Jing, Cheng Ling, and Jingyang Gao. "CNNdel: Calling Structural Variations on Low Coverage Data Based on Convolutional Neural Networks." BioMed Research International 2017 (2017): 1–8. http://dx.doi.org/10.1155/2017/6375059.

Abstract:
Many structural variation (SV) detection methods have been proposed due to the popularization of next-generation sequencing (NGS). These SV calling methods use different SV-property-dependent features; however, they all suffer from poor accuracy when running on low coverage sequences. The union of results from these tools achieves fairly high sensitivity but still produces low accuracy on low coverage sequence data. That is, the results of these methods contain many false positives. In this paper, we present CNNdel, an approach for calling deletions from paired-end reads. CNNdel gathers SV candidates reported by multiple tools and then extracts features from aligned BAM files at the positions of candidates. With labeled feature-expressed candidates as a training set, CNNdel trains convolutional neural networks (CNNs) to distinguish true unlabeled candidates from false ones. Results show that CNNdel works well with NGS reads from 26 low coverage genomes of the 1000 Genomes Project. The paper demonstrates that convolutional neural networks can automatically assign the priority of SV features and reduce the false positives efficaciously.
40

VanRaden, P. M., D. M. Bickhart, and J. R. O'Connell. "0302 Identifying and calling insertions, deletions, and single-base mutations efficiently from sequence data." Journal of Animal Science 94, suppl_5 (October 1, 2016): 144. http://dx.doi.org/10.2527/jam2016-0302.

41

Chu, Chong, Jin Zhang, and Yufeng Wu. "GINDEL: Accurate Genotype Calling of Insertions and Deletions from Low Coverage Population Sequence Reads." PLoS ONE 9, no. 11 (November 25, 2014): e113324. http://dx.doi.org/10.1371/journal.pone.0113324.

42

Stephens, Alex J., John Inman-Bamber, Philip M. Giffard, and Flavia Huygens. "High-Resolution Melting Analysis of the spa Repeat Region of Staphylococcus aureus." Clinical Chemistry 54, no. 2 (February 1, 2008): 432–36. http://dx.doi.org/10.1373/clinchem.2007.093658.

Abstract:
Background: The staphylococcal protein A (spa) locus of Staphylococcus aureus contains a complex repeat structure and is commonly used for single-locus sequence-based genotyping. The real-time PCR platform supports genotyping methods that are single step and closed tube and potentially can be carried out simultaneously with diagnosis. We describe here a method for genotyping S. aureus using high-resolution melting (HRM) analysis of the spa polymorphic region X. Methods: The conventional PCR spa assay was modified and optimized for the Rotor-Gene 6000 instrument (Corbett Life Science). HRM analysis on the Corbett Rotor-Gene 6000 instrument was used to test 22 known spa sequences obtained from 44 diverse methicillin-resistant S. aureus (MRSA) isolates. Criteria for calling pairs of melting curves “same” or “different” were developed empirically by converting the data to difference graph format with one curve defined as the control. HRM curve comparison between runs was done to determine the portability of the method. The assay performance was assessed by genotyping uncharacterized isolates, carrying out blind trials, and comparing HRM profiles from different runs. Results: HRM analysis of 44 diverse MRSA isolates generated 20 profiles from 22 spa sequence types. The 2 unresolved HRM spa types differed by only 1 bp. Two blind trials demonstrated complete reproducibility with respect to calling the different spa types. Interrun comparisons of HRM curves were successfully developed, indicating the robustness of the method. Conclusion: Analysis of the spa locus by HRM resolves spa sequence variants. This single- and closed-tube single-step method for S. aureus genotyping can be easily combined with the interrogation of other genetic markers.
43

Bao, Riyue, Lei Huang, Jorge Andrade, Wei Tan, Warren A. Kibbe, Hongmei Jiang, and Gang Feng. "Review of Current Methods, Applications, and Data Management for the Bioinformatics Analysis of Whole Exome Sequencing." Cancer Informatics 13s2 (January 2014): CIN.S13779. http://dx.doi.org/10.4137/cin.s13779.

Abstract:
The advent of next-generation sequencing technologies has greatly promoted advances in the study of human diseases at the genomic, transcriptomic, and epigenetic levels. Exome sequencing, where the coding region of the genome is captured and sequenced at a deep level, has proven to be a cost-effective method to detect disease-causing variants and discover gene targets. In this review, we outline the general framework of whole exome sequence data analysis. We focus on established bioinformatics tools and applications that support five analytical steps: raw data quality assessment, preprocessing, alignment, post-processing, and variant analysis (detection, annotation, and prioritization). We evaluate the performance of open-source alignment programs and variant calling tools using simulated and benchmark datasets, and highlight the challenges posed by the lack of concordance among variant detection tools. Based on these results, we recommend adopting multiple tools and resources to reduce false positives and increase the sensitivity of variant calling. In addition, we briefly discuss the current status and solutions for big data management, analysis, and summarization in the field of bioinformatics.
44

Raymond, Chase Wesley. "Negotiating entitlement to language: Calling 911 without English." Language in Society 43, no. 1 (January 24, 2014): 33–59. http://dx.doi.org/10.1017/s0047404513000869.

Abstract:
When individuals in the United States dial the emergency service telephone number, they immediately encounter some version of the English-language institutional opening “Nine-one-one, what is your emergency?”. What happens, though, when the one placing the call is not a speaker of English? How do callers and call-takers adapt to overcome this added communicative barrier so that they are able to effectively assess the emergency situation at hand? The present study describes the structure of a language negotiation sequence, which serves to evaluate callers' entitlement to receive service in a language other than the institutional default—in our case, requests for Spanish in lieu of English. We illustrate both how callers initially design requests for language, as well as how call-takers subsequently respond to those differing request formulations. Interactions are examined qualitatively and quantitatively to underscore the context-based contingencies surrounding call-takers' preference for English over the use of translation services. The results prove informative not only in terms of how bilingual talk is organized within social institutions, but also more generally with regard to how humans make active use of a variety of resources in their attempts to engage in interaction with one another. (Entitlement, discourse/social interaction, conversation analysis, requests, language contact, institutional talk, Spanish (in the US))
45

Ip, Eddie K. K., Clinton Hadinata, Joshua W. K. Ho, and Eleni Giannoulatou. "dv-trio: a family-based variant calling pipeline using DeepVariant." Bioinformatics 36, no. 11 (April 21, 2020): 3549–51. http://dx.doi.org/10.1093/bioinformatics/btaa116.

Abstract:
Motivation: In 2018, Google published an innovative variant caller, DeepVariant, which converts pileups of sequence reads into images and uses a deep neural network to identify single-nucleotide variants and small insertion/deletions from next-generation sequencing data. This approach outperforms existing state-of-the-art tools. However, DeepVariant was designed to call variants within a single sample. In disease sequencing studies, the ability to examine a family trio (father-mother-affected child) provides greater power for disease mutation discovery. Results: To further improve DeepVariant’s variant calling accuracy in family-based sequencing studies, we have developed a family-based variant calling pipeline, dv-trio, which incorporates the trio information from the Mendelian genetic model into variant calling based on DeepVariant. Availability and implementation: dv-trio is available via an open source BSD3 license at GitHub (https://github.com/VCCRI/dv-trio/). Contact: e.giannoulatou@victorchang.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
46

Szövényi, Gergely, Gellért Puskás, and Kirill Márk Orci. "Isophya nagyi, a new phaneropterid bush-cricket (Orthoptera: Tettigonioidea) from the Eastern Carpathians (Caliman Mountains, North Romania)." Zootaxa 3521, no. 1 (October 18, 2012): 67. http://dx.doi.org/10.11646/zootaxa.3521.1.5.

Abstract:
This study describes Isophya nagyi sp. n. from the Caliman Mountains (Eastern Carpathians, Romania). This species was discovered on the basis of the special rhythmic pattern of its male calling song. Regarding morphology, Isophya nagyi is similar to the species of the Isophya camptoxypha species-group (I. ciucasi, I. sicula, I. posthumoidalis, I. camptoxypha); however, the male stridulatory file contains more stridulatory pegs (105–130) compared to the other members of the species group (50–80 pegs). Calling males produce a long sequence of evenly repeated syllables (repetition rate varies between 60–80 syllables at 21–24 °C), and most importantly syllables are composed of three characteristic impulse groups, contrary to songs of the other species where syllables are composed of two elements or the song consists of two syllable types. Besides the description of the basic morphological features and pair-forming acoustic signals of the new species, a calling song based key is given for the I. camptoxypha species group.
47

Ruark, Elise, Anthony Renwick, Matthew Clarke, Katie Snape, Emma Ramsay, Anna Elliott, Sandra Hanks, Ann Strydom, Sheila Seal, and Nazneen Rahman. "The ICR142 NGS validation series: a resource for orthogonal assessment of NGS analysis." F1000Research 5 (March 22, 2016): 386. http://dx.doi.org/10.12688/f1000research.8219.1.

Abstract:
To provide a useful community resource for orthogonal assessment of NGS analysis software, we present the ICR142 NGS validation series. The dataset includes high-quality exome sequence data from 142 samples together with Sanger sequence data at 730 sites; 409 sites with variants and 321 sites at which variants were called by an NGS analysis tool, but no variant is present in the corresponding Sanger sequence. The dataset includes 286 indel variants and 275 negative indel sites, and thus the ICR142 validation dataset is of particular utility in evaluating indel calling performance. The FASTQ files and Sanger sequence results can be accessed in the European Genome-phenome Archive under the accession number EGAS00001001332.
48

Ruark, Elise, Anthony Renwick, Matthew Clarke, Katie Snape, Emma Ramsay, Anna Elliott, Sandra Hanks, Ann Strydom, Sheila Seal, and Nazneen Rahman. "The ICR142 NGS validation series: a resource for orthogonal assessment of NGS analysis." F1000Research 5 (September 5, 2018): 386. http://dx.doi.org/10.12688/f1000research.8219.2.

Abstract:
To provide a useful community resource for orthogonal assessment of NGS analysis software, we present the ICR142 NGS validation series. The dataset includes high-quality exome sequence data from 142 samples together with Sanger sequence data at 704 sites; 416 sites with variants and 288 sites at which variants were called by an NGS analysis tool, but no variant is present in the corresponding Sanger sequence. The dataset includes 293 indel variants and 247 negative indel sites, and thus the ICR142 validation dataset is of particular utility in evaluating indel calling performance. The FASTQ files and Sanger sequence results can be accessed in the European Genome-phenome Archive under the accession number EGAS00001001332.
49

Sueur, Jérôme, and Stéphane Puissant. "Similar look but different song: a new Cicadetta species in the montana complex (Insecta, Hemiptera, Cicadidae)." Zootaxa 1442, no. 1 (April 5, 2007): 55–68. http://dx.doi.org/10.11646/zootaxa.1442.1.5.

Abstract:
The Cicadetta montana species complex includes six cicada species from the West-Palaearctic region. Based on acoustic diagnostic characters, a seventh species Cicadetta cantilatrix sp. nov. belonging to the complex is described. The type-locality is in France but the species distribution area extends to Poland, Germany, Switzerland, Austria, Slovenia, Macedonia and Montenegro. The calling song sequence consists of two phrases with different echemes. This calling pattern clearly differs from those produced by all other members of the complex, including C. cerdaniensis, previously mistaken with the new species. This description increases the acoustic diversity observed within a single cicada genus and supports the hypothesis that sound communication may play a central role in speciation.
50

Chen, Lixin, Pingfang Liu, Thomas C. Evans, and Laurence M. Ettwiller. "Response to Comment on “DNA damage is a pervasive cause of sequencing errors, directly confounding variant identification”." Science 361, no. 6409 (September 27, 2018): eaat0958. http://dx.doi.org/10.1126/science.aat0958.

Abstract:
Following the Comment of Stewart et al., we repeated our analysis on sequencing runs from The Cancer Genome Atlas (TCGA) using their suggested parameters. We found signs of oxidative damage in all sequence contexts and irrespective of the sequencing date, reaffirming that DNA damage affects mutation-calling pipelines in their ability to accurately identify somatic variations.