Academic literature on the topic 'High-throughput sequencing data'
Journal articles on the topic "High-throughput sequencing data"
Campagne, Fabien, Kevin C. Dorff, Nyasha Chambwe, James T. Robinson, and Jill P. Mesirov. "Compression of Structured High-Throughput Sequencing Data." PLoS ONE 8, no. 11 (November 18, 2013): e79871. http://dx.doi.org/10.1371/journal.pone.0079871.
Parrish, Nathaniel, Benjamin Sudakov, and Eleazar Eskin. "Genome reassembly with high-throughput sequencing data." BMC Genomics 14, Suppl 1 (2013): S8. http://dx.doi.org/10.1186/1471-2164-14-s1-s8.
Fonseca, Nuno A., Johan Rung, Alvis Brazma, and John C. Marioni. "Tools for mapping high-throughput sequencing data." Bioinformatics 28, no. 24 (October 11, 2012): 3169–77. http://dx.doi.org/10.1093/bioinformatics/bts605.
Maruki, Takahiro, and Michael Lynch. "Genotype-Frequency Estimation from High-Throughput Sequencing Data." Genetics 201, no. 2 (July 29, 2015): 473–86. http://dx.doi.org/10.1534/genetics.115.179077.
Dalca, A. V., and M. Brudno. "Genome variation discovery with high-throughput sequencing data." Briefings in Bioinformatics 11, no. 1 (January 1, 2010): 3–14. http://dx.doi.org/10.1093/bib/bbp058.
Ares, Manuel. "Methods for Processing High-Throughput RNA Sequencing Data." Cold Spring Harbor Protocols 2014, no. 11 (November 2014): pdb.top083352. http://dx.doi.org/10.1101/pdb.top083352.
David, Matei, Harun Mustafa, and Michael Brudno. "Detecting Alu insertions from high-throughput sequencing data." Nucleic Acids Research 41, no. 17 (August 5, 2013): e169. http://dx.doi.org/10.1093/nar/gkt612.
Numanagić, Ibrahim, James K. Bonfield, Faraz Hach, Jan Voges, Jörn Ostermann, Claudio Alberti, Marco Mattavelli, and S. Cenk Sahinalp. "Comparison of high-throughput sequencing data compression tools." Nature Methods 13, no. 12 (October 24, 2016): 1005–8. http://dx.doi.org/10.1038/nmeth.4037.
Fiume, M., V. Williams, A. Brook, and M. Brudno. "Savant: genome browser for high-throughput sequencing data." Bioinformatics 26, no. 16 (June 20, 2010): 1938–44. http://dx.doi.org/10.1093/bioinformatics/btq332.
Numanagić, Ibrahim, Salem Malikić, Victoria M. Pratt, Todd C. Skaar, David A. Flockhart, and S. Cenk Sahinalp. "Cypiripi: exact genotyping of CYP2D6 using high-throughput sequencing data." Bioinformatics 31, no. 12 (June 13, 2015): i27–i34. http://dx.doi.org/10.1093/bioinformatics/btv232.
Dissertations / Theses on the topic "High-throughput sequencing data"
Roguski, Łukasz. "High-throughput sequencing data compression." Doctoral thesis, Universitat Pompeu Fabra, 2017. http://hdl.handle.net/10803/565775.
Thanks to advances in sequencing technologies, biomedical research has undergone a revolution in recent years, one result of which has been an explosion in the volume of genomic data generated worldwide. The sequencing data generated in medium-scale experiments typically range from ten to a hundred gigabytes in size and are stored in several files, in the different formats produced by each experiment. The current de facto standard formats for representing genomic data are textual. For practical reasons, the data need to be stored in compressed form. In most cases, these compression methods rely on general-purpose text compressors such as gzip. However, such compressors cannot exploit the information models specific to sequencing data. As a result, they provide limited functionality and insufficient savings in storage space. This explains why relatively basic operations, such as processing, storing, and transferring genomic data, have become one of the main bottlenecks in current analysis pipelines. This thesis therefore focuses on efficient storage and compression methods for data generated in sequencing experiments. First, we propose a novel general-purpose compressor for FASTQ files. Compared with gzip, this compressor significantly reduces the size of the compressed file. In addition, the tool processes data at high speed. Next, we present compression methods that exploit the high sequence redundancy present in sequencing data. These methods achieve the best compression ratio among current state-of-the-art FASTQ compressors, without using any external reference. We also show lossy compression approaches for storing auxiliary sequencing data, which reduce the data size even further. Finally, we contribute a flexible compression framework and data format. This framework makes it possible to generate, semi-automatically, compression solutions that are not tied to any specific genomic data file format. To ease complex data management, multiple datasets in heterogeneous formats can be stored in configurable containers, with the option of running custom queries over the stored data. Moreover, we show that simple solutions based on our framework can achieve results comparable to state-of-the-art format-specific compressors. In summary, the solutions developed and described in this thesis can be easily incorporated into genomic data analysis pipelines. Taken together, they provide a solid foundation for developing comprehensive approaches to the efficient storage and management of genomic data.
Durif, Ghislain. "Multivariate analysis of high-throughput sequencing data." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSE1334/document.
The statistical analysis of Next-Generation Sequencing data raises many computational challenges regarding modeling and inference, especially because of the high dimensionality of genomic data. The research work in this manuscript concerns hybrid dimension reduction methods that rely on both compression (representation of the data in a lower-dimensional space) and variable selection. Developments are made concerning the sparse Partial Least Squares (PLS) regression framework for supervised classification and the sparse matrix factorization framework for unsupervised exploration. In both situations, our main purpose is to focus on the reconstruction and visualization of the data. First, we present a new sparse PLS approach, based on an adaptive sparsity-inducing penalty, that is suitable for logistic regression to predict the label of a discrete outcome. For instance, such a method can be used for prediction (fate of patients or specific type of unidentified single cells) based on gene expression profiles. The main issue in such a framework is to account for the response in order to discard irrelevant variables. We highlight the direct link between the derivation of the algorithms and the reliability of the results. Then, motivated by questions regarding single-cell data analysis, we propose a flexible model-based approach for the factorization of count matrices that accounts for over-dispersion as well as zero-inflation (both characteristic of single-cell data), for which we derive an estimation procedure based on variational inference. In this scheme, we consider probabilistic variable selection based on a spike-and-slab model suitable for count data. The interest of our procedure for data reconstruction, visualization, and clustering is illustrated by simulation experiments and by preliminary results on single-cell data analysis.
All proposed methods were implemented in two R packages, "plsgenomics" and "CMF", based on high-performance computing.
Zhang, Xuekui. "Mixture models for analysing high throughput sequencing data." Thesis, University of British Columbia, 2011. http://hdl.handle.net/2429/35982.
Hoffmann, Steve. "Genome Informatics for High-Throughput Sequencing Data Analysis." Doctoral thesis, Universitätsbibliothek Leipzig, 2014. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-152643.
This thesis presents three different algorithmic and statistical strategies for the analysis of high-throughput sequencing data. First, we introduce a heuristic method based on enhanced suffix arrays that aligns short sequences to large genomes. The method is based on the idea of an error-tolerant traversal of a suffix array for reference genomes, combined with Chang's concept of matching statistics and a bit-vector-based alignment algorithm by Myers. The method supports paired-end and mate-pair alignments and offers methods for detecting primer sequences and for trimming poly-A signals. In independent benchmarks as well, the method shows high sensitivity and specificity on simulated and real datasets. For a large number of sequencing protocols, it achieves better results than other well-known short-read alignment programs. Second, we present a dynamic-programming-based algorithm for the spliced alignment problem. The advantage of this algorithm is its ability to identify not only collinear splicing events, i.e., splicing events on the same genomic strand, but also circular and other non-collinear splicing events. The method is highly accurate: while it achieves results comparable to other methods in detecting collinear splice variants, it outperforms its competitors in sensitivity and specificity when predicting non-collinear splice variants. Applying this algorithm led to the identification of new isoforms. In our publication, we report a new isoform of the tumor suppressor gene p53. Since this gene is one of the best-studied genes in the human genome, applying our algorithm could help identify many further isoforms of less prominent genes. Third, we present a data-adaptive model for identifying single nucleotide variations (SNVs). We show that our model, which is based on empirical log-likelihoods, automatically adapts to the quality of the sequencing experiments and makes a "decision" about which potential variations to classify as SNVs. In our simulations, this method is on par with currently used approaches. Finally, we present a selection of biological results related to the particular features of the presented alignment methods.
Stromberg, Michael Peter. "Enabling high-throughput sequencing data analysis with MOSAIK." Thesis, Boston College, 2010. http://hdl.handle.net/2345/1332.
During the last few years, numerous new sequencing technologies have emerged that require tools that can process large amounts of read data quickly and accurately. Regardless of the downstream methods used, reference-guided aligners are at the heart of all next-generation analysis studies. I have developed a general reference-guided aligner, MOSAIK, to support all current sequencing technologies (Roche 454, Illumina, Applied Biosystems SOLiD, Helicos, and Sanger capillary). The calibrated alignment qualities calculated by MOSAIK allow the user to fine-tune the alignment accuracy for a given study. MOSAIK is a highly configurable and easy-to-use suite of alignment tools that is used in hundreds of labs worldwide. MOSAIK is an integral part of our genetic variant discovery pipeline. From SNP and short-INDEL discovery to structural variation discovery, alignment accuracy is an essential requirement and enables our downstream analyses to provide accurate calls. In this thesis, I present three major studies that were formative during the development of MOSAIK and our analysis pipeline. In addition, I present a novel algorithm that identifies mobile element insertions (non-LTR retrotransposons) in the human genome using split-read alignments in MOSAIK. This algorithm has a low false discovery rate (4.4%) and enabled our group to be the first to determine the number of mobile elements that differentially occur between any two individuals.
Thesis (PhD) — Boston College, 2010
Submitted to: Boston College. Graduate School of Arts and Sciences
Discipline: Biology
Xing, Zhengrong. "Poisson multiscale methods for high-throughput sequencing data." Thesis, The University of Chicago, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10195268.
In this dissertation, we focus on the problem of analyzing data from high-throughput sequencing experiments. With the emergence of more capable hardware and more efficient software, these sequencing data provide information at an unprecedented resolution. However, statistical methods developed for such data rarely tackle the data at such high resolutions, and often make approximations that only hold under certain conditions.
We propose a model-based approach to dealing with such data, starting from a single sample. By taking into account the inherent structure present in such data, our model can accurately capture important genomic regions. We also present the model in such a way that makes it easily extensible to more complicated and biologically interesting scenarios.
Building upon the single-sample model, we then turn to the statistical question of detecting differences between multiple samples. Such questions often arise in the context of expression data, where much emphasis has been put on the problem of detecting differential expression between two groups. By extending the framework for a single sample to incorporate additional group covariates, our model provides a systematic approach to estimating and testing for such differences. We then apply our method to several empirical datasets, and discuss the potential for further applications to other biological tasks.
We also seek to address a different statistical question, where the goal here is to perform exploratory analysis to uncover hidden structure within the data. We incorporate the single-sample framework into a commonly used clustering scheme, and show that our enhanced clustering approach is superior to the original clustering approach in many ways. We then apply our clustering method to a few empirical datasets and discuss our findings.
Finally, we apply the shrinkage procedure used within the single-sample model to tackle a completely different statistical issue: nonparametric regression with heteroskedastic Gaussian noise. We propose an algorithm that accurately recovers both the mean and variance functions given a single set of observations, and demonstrate its advantages over state-of-the-art methods through extensive simulation studies.
Fritz, Markus Hsi-Yang. "Exploiting high throughput DNA sequencing data for genomic analysis." Thesis, University of Cambridge, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610819.
Woolford, Julie Ruth. "Statistical analysis of small RNA high-throughput sequencing data." Thesis, University of Cambridge, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610375.
Kircher, Martin. "Understanding and improving high-throughput sequencing data production and analysis." Doctoral thesis, Universitätsbibliothek Leipzig, 2011. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-71102.
Ainsworth, David. "Computational approaches for metagenomic analysis of high-throughput sequencing data." Thesis, Imperial College London, 2016. http://hdl.handle.net/10044/1/44070.
Books on the topic "High-throughput sequencing data"
Rodríguez-Ezpeleta, Naiara, Michael Hackenberg, and Ana M. Aransay. Bioinformatics for high throughput sequencing. New York, NY: Springer, 2012.
Rodríguez-Ezpeleta, Naiara, Ana M. Aransay, and Michael Hackenberg. Bioinformatics for High Throughput Sequencing. Springer, 2014.
Taberlet, Pierre, Aurélie Bonin, Lucie Zinger, and Eric Coissac. Environmental DNA. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780198767220.001.0001.
Pezzella, Francesco, Mahvash Tavassoli, and David J. Kerr, eds. Oxford Textbook of Cancer Biology. Oxford University Press, 2019. http://dx.doi.org/10.1093/med/9780198779452.001.0001.
Book chapters on the topic "High-throughput sequencing data"
Glass, Elizabeth M., and Folker Meyer. "Analysis of Metagenomics Data." In Bioinformatics for High Throughput Sequencing, 219–29. New York, NY: Springer New York, 2011. http://dx.doi.org/10.1007/978-1-4614-0782-9_13.
Sexton, David. "Computational Infrastructure and Basic Data Analysis for High-Throughput Sequencing." In Bioinformatics for High Throughput Sequencing, 55–65. New York, NY: Springer New York, 2011. http://dx.doi.org/10.1007/978-1-4614-0782-9_4.
Zhang, Michael Q. "Dissecting Splicing Regulatory Network by Integrative Analysis of CLIP-Seq Data." In Bioinformatics for High Throughput Sequencing, 209–18. New York, NY: Springer New York, 2011. http://dx.doi.org/10.1007/978-1-4614-0782-9_12.
Paszkiewicz, Konrad, and David J. Studholme. "High-Throughput Sequencing Data Analysis Software: Current State and Future Developments." In Bioinformatics for High Throughput Sequencing, 231–48. New York, NY: Springer New York, 2011. http://dx.doi.org/10.1007/978-1-4614-0782-9_14.
Mane, Shrinivasrao P., Thero Modise, and Bruno W. Sobral. "Analysis of High-Throughput Sequencing Data." In Methods in Molecular Biology, 1–11. Totowa, NJ: Humana Press, 2010. http://dx.doi.org/10.1007/978-1-60761-682-5_1.
Young, Matthew D., Davis J. McCarthy, Matthew J. Wakefield, Gordon K. Smyth, Alicia Oshlack, and Mark D. Robinson. "Differential Expression for RNA Sequencing (RNA-Seq) Data: Mapping, Summarization, Statistical Analysis, and Experimental Design." In Bioinformatics for High Throughput Sequencing, 169–90. New York, NY: Springer New York, 2011. http://dx.doi.org/10.1007/978-1-4614-0782-9_10.
Weese, David, and Enrico Siragusa. "Full-Text Indexes for High-Throughput Sequencing." In Algorithms for Next-Generation Sequencing Data, 41–75. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-59826-0_2.
Hoffmann, Steve. "Computational Analysis of High Throughput Sequencing Data." In Methods in Molecular Biology, 199–217. Totowa, NJ: Humana Press, 2011. http://dx.doi.org/10.1007/978-1-61779-027-0_9.
Välimäki, Niko, and Simon J. Puglisi. "Distributed String Mining for High-Throughput Sequencing Data." In Lecture Notes in Computer Science, 441–52. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-33122-0_35.
Rieder, Dietmar, and Francesca Finotello. "Analysis of High-Throughput RNA Bisulfite Sequencing Data." In Methods in Molecular Biology, 143–54. New York, NY: Springer New York, 2017. http://dx.doi.org/10.1007/978-1-4939-6807-7_10.
Conference papers on the topic "High-throughput sequencing data"
Mangul, Serghei, and Alex Zelikovsky. "Poster: Haplotype discovery from high-throughput sequencing data." In 2011 IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences (ICCABS). IEEE, 2011. http://dx.doi.org/10.1109/iccabs.2011.5729908.
Holt, James, Shunping Huang, Leonard McMillan, and Wei Wang. "Read Annotation Pipeline for High-Throughput Sequencing Data." In BCB'13: ACM-BCB2013. New York, NY, USA: ACM, 2013. http://dx.doi.org/10.1145/2506583.2506645.
Chung, Wei-Chun, Yu-Jung Chang, Chien-Chih Chen, Der-Tsai Lee, and Jan-Ming Ho. "Optimizing a MapReduce module of preprocessing high-throughput DNA sequencing data." In 2013 IEEE International Conference on Big Data. IEEE, 2013. http://dx.doi.org/10.1109/bigdata.2013.6691694.
Jiangyu, Li, Wang Xiaolei, Zhao Dongsheng, Mao Yiqing, and Qian Cheng. "A fast microbial detection algorithm based on high-throughput sequencing data." In ICBCB '17: 2017 5th International Conference on Bioinformatics and Computational Biology. New York, NY, USA: ACM, 2017. http://dx.doi.org/10.1145/3035012.3035014.
Wang, Xin, Mingxiang Teng, Guohua Wang, Yuming Zhao, Xu Han, Weixing Feng, Lang Li, Jeremy Sanford, and Yunlong Liu. "xIP-seq Platform: An Integrative Framework for High-Throughput Sequencing Data Analysis." In 2009 Ohio Collaborative Conference on Bioinformatics (OCCBIO). IEEE, 2009. http://dx.doi.org/10.1109/occbio.2009.20.
Puljiz, Zrinka, and Haris Vikalo. "Iterative learning of single individual haplotypes from high-throughput DNA sequencing data." In 2014 8th International Symposium on Turbo Codes and Iterative Information Processing (ISTC). IEEE, 2014. http://dx.doi.org/10.1109/istc.2014.6955103.
Milicchio, Franco, Iain E. Buchan, and Mattia C. F. Prosperi. "A* fast and scalable high-throughput sequencing data error correction via oligomers." In 2016 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). IEEE, 2016. http://dx.doi.org/10.1109/cibcb.2016.7758117.
Chen, Chien-Chih, Yu-Jung Chang, Wei-Chun Chung, Der-Tsai Lee, and Jan-Ming Ho. "CloudRS: An error correction algorithm of high-throughput sequencing data based on scalable framework." In 2013 IEEE International Conference on Big Data. IEEE, 2013. http://dx.doi.org/10.1109/bigdata.2013.6691642.
Chung, Wei-Chun, Yu-Jung Chang, D. T. Lee, and Jan-Ming Ho. "Using geometric structures to improve the error correction algorithm of high-throughput sequencing data on MapReduce framework." In 2014 IEEE International Conference on Big Data (Big Data). IEEE, 2014. http://dx.doi.org/10.1109/bigdata.2014.7004306.
Zhang, Xiaodong, Chong Chu, Yao Zhang, Yufeng Wu, and Jingyang Gao. "Concod: Accurate consensus-based approach of calling deletions from high-throughput sequencing data." In 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2016. http://dx.doi.org/10.1109/bibm.2016.7822495.