Dissertations / Theses on the topic 'DNA nucleotide sequencing'

Consult the top 41 dissertations / theses for your research on the topic 'DNA nucleotide sequencing.'

1

Boufounos, Petros T. "Signal processing for DNA sequencing." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/17536.

Abstract:
Thesis (M.Eng. and S.B.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 83-86). DNA sequencing is the process of determining the sequence of chemical bases in a particular DNA molecule, nature's blueprint of how life works. The advancement of biological science has created a vast demand for sequencing methods, which needs to be addressed by automated equipment. This thesis addresses one part of that process, known as base calling: the conversion of the electrical signal collected by the sequencing equipment (the electropherogram) into a sequence of letters drawn from {A, T, C, G} that corresponds to the sequence of the molecule being sequenced. This work formulates the problem as a pattern recognition problem and observes its striking resemblance to the speech recognition problem. We therefore propose combining Hidden Markov Models and Artificial Neural Networks to solve it. In the formulation we derive an algorithm for training both models together. Furthermore, we devise a method to create very accurate training data, requiring minimal hand-labeling. We compare our method with the de facto standard, PHRED, and produce comparable results. Finally, we propose alternative HMM topologies that have the potential to significantly improve the performance of the method.
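The abstract above frames base calling as a pattern recognition problem solved with Hidden Markov Models. As a rough illustration only, the sketch below decodes a toy observation sequence with the Viterbi algorithm. It is not the thesis's model: the neural network and the joint training are omitted, and the four one-per-base states, the uniform transition matrix, the emission probabilities, and the idea of reducing each peak to its strongest dye channel are all assumptions made for the example.

```python
import math

BASES = "ATCG"  # hidden states: one per base (a toy assumption)

def viterbi(observations, start_p, trans_p, emit_p):
    """Return the most likely base sequence for a list of observed symbols."""
    # Log-probability of the best path ending in each state after the first symbol.
    scores = {b: math.log(start_p[b]) + math.log(emit_p[b][observations[0]])
              for b in BASES}
    back = []  # one dict of back-pointers per later observation
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for cur in BASES:
            prev = max(BASES, key=lambda p: scores[p] + math.log(trans_p[p][cur]))
            new_scores[cur] = (scores[prev] + math.log(trans_p[prev][cur])
                               + math.log(emit_p[cur][obs]))
            pointers[cur] = prev
        scores = new_scores
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(BASES, key=lambda b: scores[b])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))

# Hypothetical toy model: each observation is the strongest dye channel at a peak.
start_p = {b: 0.25 for b in BASES}
trans_p = {p: {c: 0.25 for c in BASES} for p in BASES}
emit_p = {b: {o: (0.85 if o == b.lower() else 0.05) for o in "atcg"} for b in BASES}

called = viterbi(list("gattaca"), start_p, trans_p, emit_p)
print(called)
```

With uniform transitions the decoder simply follows the strongest channel; a non-uniform transition matrix or richer state topology is where an HMM would start to add value over peak picking.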
2

Fritz, Markus Hsi-Yang. "Exploiting high throughput DNA sequencing data for genomic analysis." Thesis, University of Cambridge, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.610819.

3

Kislyuk, Andrey O. "Algorithm development for next generation sequencing-based metagenome analysis." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/42779.

Abstract:
We present research on the design, development and application of algorithms for DNA sequence analysis, with a focus on environmental DNA (metagenomes). We present an overview and primer on algorithm development for bioinformatics of metagenomes; work on frameshift detection in DNA sequencing data; work on a computational pipeline for the assembly, feature prediction, annotation and analysis of bacterial genomes; work on unsupervised phylogenetic clustering of metagenomic fragments using Markov Chain Monte Carlo methods; and work on estimation of bacterial genome plasticity and diversity, potential improvements to the measures of core and pan-genomes.
4

Musgrave-Brown, Esther. "Development and application of methods for targeted DNA sequencing of pooled samples." Thesis, University of Cambridge, 2014. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.648613.

5

Andrews, Daniel James. "Statistical models of PCR for quantification of target DNA by sequencing." Thesis, University of Cambridge, 2015. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.708581.

6

Lin, Cheng-Hsien Kenny. "An ASIC application for DNA sequencing by Smith-Waterman algorithm (DNASSWA)." [St. Lucia, Qld.], 2004. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe18716.pdf.

7

Wang, Yikang. "Synthesis of some novel phospholipids, and nucleotide analogues, as potential chemotherapeutic agents and for DNA sequencing." Thesis, University of Southampton, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.295106.

8

Stranneheim, Henrik. "Enabling massive genomic and transcriptomic analysis." Doctoral thesis, KTH, Genteknologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-45957.

Abstract:
In recent years there have been tremendous advances in our ability to rapidly and cost-effectively sequence DNA. This has revolutionized the fields of genetics and biology, leading to a deeper understanding of the molecular events in life processes. The rapid advances have enormously expanded sequencing opportunities and applications, but also imposed heavy strains on steps prior to sequencing, as well as the subsequent handling and analysis of the massive amounts of sequence data that are generated, in order to exploit the full capacity of these novel platforms. The work presented in this thesis (based on six appended papers) has contributed to balancing the sequencing process by developing techniques to accelerate the rate-limiting steps prior to sequencing, facilitating sequence data analysis and applying the novel techniques to address biological questions. Papers I and II describe techniques to eliminate expensive and time-consuming preparatory steps through automating library preparation procedures prior to sequencing. The automated procedures were benchmarked against standard manual procedures and were found to substantially increase throughput while maintaining high reproducibility. In Paper III, a novel algorithm for fast classification of sequences in complex datasets is described. The algorithm was first optimized and validated using a synthetic metagenome dataset and then shown to enable faster analysis of an experimental metagenome dataset than conventional long-read aligners, with similar accuracy. Paper IV presents an investigation of the molecular effects on the p53 gene of exposing human skin to sunlight during the course of a summer holiday. There was evidence of previously accumulated persistent p53 mutations in 14% of all epidermal cells. Most of these mutations are likely to be passenger events, as the affected cell compartments showed no apparent growth advantage. An annual rate of 35,000 novel sun-induced persistent p53 mutations was estimated to occur in sun-exposed skin of a human individual. Paper V assesses the effect of using RNA obtained from whole cell extracts (total RNA) or cytoplasmic RNA on quantifying transcripts detected in subsequent analysis. Overall, more differentially detected genes were identified when using the cytoplasmic RNA. The main reason is the reduced complexity of cytoplasmic RNA, but the difference also appears to be due, at least partly, to the nuclear retention of transcripts with long, structured 5’- and 3’-untranslated regions or long protein coding sequences. The last paper, VI, describes whole-genome sequencing of a large, consanguineous family with a history of Leber hereditary optic neuropathy (LHON) on the maternal side. The analysis identified new candidate genes, which could be important in the aetiology of LHON. However, these candidates require further validation before any firm conclusions can be drawn regarding their contribution to the manifestation of LHON.
9

Fletcher, Jeremy Charles. "THE USE OF PYROSEQUENCING FOR THE ANALYSIS OF Y CHROMOSOME SINGLE NUCLEOTIDE POLYMORPHISMS." Master's thesis, University of Central Florida, 2004. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4487.

Abstract:
The potential value of the Y chromosome for forensic applications has been recognized for some time, with current work dedicated to Short Tandem Repeat analysis and Single Nucleotide Polymorphism (SNP) discovery. This study evaluated whether two different SNP analysis methods could be used in forensic applications and ultimately be developed into an established system for Y chromosome SNP analysis. The two principal SNP analysis systems examined were single base extension and Pyrosequencing. Pyrosequencing was determined to be superior to single base extension, owing to the wealth of information provided by sequencing and the flexibility of designing primers for analysis. Using Pyrosequencing, 50 Y chromosome loci were examined, and the minimum set of loci giving maximum diversity was chosen for the development of a Y chromosome SNP analysis system. Thirteen loci were selected based on their ability to discriminate 60 different individuals from three different racial groups into 15 different haplogroups. The system uses nested PCR to amplify all 13 loci, which are then sequenced in groups of one to three loci in a single reaction. The Y chromosome SNP analysis system developed here has potential for forensic application: it has been shown to be successful in the analysis of blood, buccal swabs, semen, and saliva; it works with as little as 5 pg of starting DNA; and it amplifies only male DNA in male/female mixtures in which the female portion of the sample overwhelms the male portion 30,000 to 1.
M.S., Department of Chemistry, Arts and Sciences.
10

Hasmats, Johanna. "Analysis of genetic variations in cancer." Doctoral thesis, KTH, Genteknologi, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-104438.

Abstract:
The aim of this thesis is to apply recently developed technologies for genomic variation analyses, and to ensure quality of the generated information for use in preclinical cancer research. Faster access to a patient's full genomic sequence at a lower cost makes it possible for end users such as clinicians and physicians to gain a more complete understanding of the disease status of a patient and adjust treatment accordingly. Correct biological interpretation is important in this context, and can only be provided through fast and simple access to relevant high-quality data. Therefore, we here propose and validate new bioinformatic strategies for biomarker selection for prediction of response to cancer therapy. We initially explored the use of bioinformatic tools to select interesting targets for toxicity in carboplatin and paclitaxel on a smaller scale. From our findings we then extended the analysis to the entire exome to look for biomarkers as targets for adverse effects from carboplatin and gemcitabine. To investigate any bias introduced by the methods used for targeting the exome, we analyzed the mutation profiles in cancer patients by comparing whole genome amplified DNA to unamplified DNA. In addition, we applied RNA-seq to the same patients to further validate the variations obtained by sequencing of DNA. The understanding of the human cancer genome is growing rapidly, thanks to methodological development of analysis tools. The next step is to implement these tools as part of a chain from diagnosis of patients to genomic research to personalized treatment.
11

Baudet, Christian. "Uma abordagem para detecção e remoção de artefatos em sequencias ESTs." [s.n.], 2006. http://repositorio.unicamp.br/jspui/handle/REPOSIP/276264.

Abstract:
Advisor: Zanoni Dias. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação, 2006. Expressed Sequence Tag (EST) sequencing [2] is a technique that works with cDNA libraries and aims to obtain a good approximation of the gene index, the list of genes in the genome of the organism under study. Before they are analyzed, the sequences obtained by EST sequencing must be processed to remove artifacts. An artifact is a stretch of sequence that does not belong to the studied organism or that has low quality or low complexity; vector fragments, adapters, and poly-A tails are examples. Artifacts must be removed because their presence introduces noise into the analysis of the project's sequences. For example, artifacts frequently cause clustering errors: they can inappropriately join two sequences in the same cluster or separate them into different clusters when they should be together. Given the importance of sequence cleaning, the main objective of this work was to develop an efficient set of procedures to detect and remove sequence artifacts, built on a new artifact detection strategy. Usually, each EST sequencing project has its own set of procedures divided into several steps. These steps are generally linked, and the result of one step can influence the result of the next. Our strategy was to perform each step independently, ensuring that any execution order of the steps leads to the same result. In addition to evaluating the new strategy, this work studied two types of artifacts in detail: low quality and slippage. For each one, algorithms were proposed and validated through tests with sequences from real sequencing projects. The final set of procedures developed in this work was evaluated using the sequences of the SUCEST project [100, 103, 113] and produced good results. The clustering obtained from sequences processed by our methods has better internal and external consistency and a lower redundancy rate than the original clustering of the SUCEST project.
Master's degree (Mestrado) in Computer Science.
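The order-independence property described in this abstract can be illustrated with a small sketch. One way to obtain it, assumed here for illustration and not taken from the dissertation, is to have every detector scan the original read and report intervals, and to mask the read only once, using the union of all intervals. The detector names, the poly-A minimum length of 10, and the quality threshold of 20 are hypothetical choices.

```python
import re
from itertools import permutations

def find_poly_a_tail(seq, quals, min_len=10):
    """Interval covering a trailing run of at least min_len A's, if any."""
    m = re.search(r"A{%d,}$" % min_len, seq)
    return [(m.start(), len(seq))] if m else []

def find_low_quality(seq, quals, threshold=20):
    """Intervals of consecutive bases whose quality score is below threshold."""
    intervals, start = [], None
    for i, q in enumerate(quals):
        if q < threshold and start is None:
            start = i
        elif q >= threshold and start is not None:
            intervals.append((start, i))
            start = None
    if start is not None:
        intervals.append((start, len(quals)))
    return intervals

def clean(seq, quals, detectors):
    """Run every detector on the ORIGINAL read, then mask the union of hits."""
    masked = set()
    for detect in detectors:
        for lo, hi in detect(seq, quals):
            masked.update(range(lo, hi))
    return "".join("N" if i in masked else base for i, base in enumerate(seq))

# Toy read: three low-quality bases at the 5' end and a 12-base poly-A tail.
seq = "GATTACAGGCT" + "A" * 12
quals = [10, 10, 10] + [30] * (len(seq) - 3)

# Every ordering of the detectors gives the same masked read.
results = {clean(seq, quals, order)
           for order in permutations([find_poly_a_tail, find_low_quality])}
print(results)
```

Because no detector ever sees another detector's output, the result is a set union and therefore independent of execution order. Masking with `N` rather than trimming is a design choice of this sketch; it keeps coordinates stable.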
12

Ståhl, Patrik L. "Methods for Analyzing Genomes." Doctoral thesis, KTH, Genteknologi, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-12407.

Abstract:
The human genome reference sequence has given us a two‐dimensional blueprint of our inherited code of life, but we need to employ modern‐day technology to expand our knowledge into a third dimension. Inter‐individual and intra‐individual variation has been shown to be larger than anticipated, and the mode of genetic regulation more complex. Therefore, the methods that were once used to explain our fundamental constitution are now used to decipher our differences. Over the past four years, throughput from DNA‐sequencing platforms has increased a thousand‐fold, bearing evidence of a rapid development in the field of methods used to study DNA and the genomes it constitutes. The work presented in this thesis has been carried out as an integrated part of this technological evolution, contributing to it, and applying the resulting solutions to answer difficult biological questions. Papers I and II describe a novel approach for microarray readout based on immobilization of magnetic particles, applicable to diagnostics. As benchmarked on canine mitochondrial DNA, and human genomic DNA from individuals with cystic fibrosis, it allows for visual interpretation of genotyping results without the use of machines or expensive equipment. Paper III outlines an automated and cost‐efficient method for enrichment and titration of clonally amplified DNA‐libraries on beads. The method uses fluorescent labeling and a flow‐cytometer to separate DNA‐beads from empty ones. At the same time the fraction of either bead type is recorded, and a titration curve can be generated. In paper IV we combined the highly discriminating multiplex genotyping of trinucleotide threading with the digital readout made possible by massively parallel sequencing. From this we were able to characterize the allelic distribution of 88 obesity related SNPs in a population of 462 individuals enrolled at a childhood obesity center. 
Paper V employs the throughput of present-day DNA sequencing as it probes deep into sun‐exposed skin to find clues on the effects of sunlight during the course of a summer holiday. The tumor suppressor p53 gene was targeted, only to find that despite its well‐documented involvement in the disease progression of cancers, an estimated 35,000 novel sun‐induced persistent p53 mutations are added and phenotypically tolerated in the skin of every individual every year. The last paper, VI, describes a novel approach for finding breast cancer biomarkers. In this translational study we used differential protein expression profiles and sequence capture to select and enrich for 52 candidate genes in DNA extracted from ten tumors. Two of the genes turned out to harbor protein‐altering mutations in multiple individuals.
13

Dordio, Ana Mafalda Duarte. "Deteção e caracterização molecular de Babesia spp. em Canis familiaris e de outros agentes transmitidos por ixodídeos na Área Metropolitana de Lisboa e Oeste, Portugal." Master's thesis, Universidade de Lisboa, Faculdade de Medicina Veterinária, 2018. http://hdl.handle.net/10400.5/14764.

Abstract:
Integrated Master's dissertation in Veterinary Medicine. English title: Detection and molecular characterization of Babesia spp. in Canis familiaris and other pathogens transmitted by ticks in the metropolitan area of Lisbon and Western region, Portugal. Canine vector-borne diseases (CVBD) are caused by a wide range of pathogens transmitted by arthropods and have been an issue of growing importance worldwide in recent years. Canine babesiosis belongs to this group of diseases and is caused by several species of the genus Babesia. It is currently known that canine babesiosis is caused by Babesia canis and Babesia microti-like in northern Portugal, and by Babesia vogeli throughout the country. However, information is scarce on the Babesia species involved, their molecular prevalence, geographic distribution, and the severity of the clinical manifestations. Detecting other vector-borne pathogens is equally important, since dogs can be infected with several of them at once, which makes the clinical approach and treatment a challenge for veterinarians. To give a realistic picture of the pathogens present, this study included two groups of dogs, both from the metropolitan area of Lisbon and the Western region of Portugal: 94 apparently healthy dogs from shelters that had previous contact with ticks, and 49 dogs clinically suspected of canine babesiosis. The presence of Babesia spp. and co-infecting agents was assessed by blood smear examination under the optical microscope, conventional PCR, RFLP, and DNA nucleotide sequencing. Babesia canis was detected only in the group of sick dogs, in two animals (1.40%) with clinical signs compatible with acute canine babesiosis, while B. vogeli was detected in one dog suspected of disease and three apparently healthy dogs (2.81%). Single infections were detected in 35 animals (24.64%): Hepatozoon canis in 17 (11.97%), Anaplasma platys in 4 (2.82%), Ehrlichia canis in 1 (0.70%), and Mycoplasma haemocanis in 7 (4.93%). Coinfections were detected in 13 animals (9.15%): H. canis and A. platys in 5 (3.52%); H. canis and Mycoplasma haematoparvum in 5 (5.52%); and A. platys and M. haematoparvum in 1 (0.70%). Among the apparently healthy dogs, the prevalence of single infections and coinfections was 26.6% and 12.7%, respectively. This is the first molecular identification of B. canis and M. haematoparvum in dogs from southern Portugal. Identifying vector-borne pathogens helps guide veterinarians' clinical approach and reinforces the importance of a One Health approach to prevent the risk of transmission.
14

Syaifudin, Mochamad. "Species-specific DNA markers for improving the genetic management of tilapia." Thesis, University of Stirling, 2015. http://hdl.handle.net/1893/22624.

Abstract:
The tilapias are a group of African and Middle Eastern cichlid fish that are widely cultured in developed and developing countries. With many different species and sub-species, and extensive use of interspecies hybrids, identification of tilapia species is of importance in aquaculture and in wild populations where introductions occur. This research set out to distinguish between tilapia species and sub-species by retrieving species-specific nuclear DNA markers (SNPs) using two approaches: (i) sequencing of the coding regions of the ADA gene; and (ii) next-generation sequencing, both standard RADseq and double-digest RADseq (ddRADseq). The mitochondrial DNA (mtDNA) marker cytochrome c oxidase subunit I (COI) was used to verify tilapia species status. ADA gene sequence analysis was partially successful, generating SNP markers that distinguished some species pairs. Most species could also be discriminated using the COI sequence. Reference based analysis (RBA: using only markers found in the O. niloticus genome sequence) of standard RADseq data identified 1,613 SNPs in 1,002 shared RAD loci among seven species. De novo based analysis (DBA: based on the entire data set) identified 1,358 SNPs in 825 loci and RBA detected 938 SNPs in 571 shared RAD loci from ddRADseq among 10 species. Phylogenetic trees based on shared SNP markers indicated similar patterns to most prior phylogenies based on other characteristics. The standard RADseq detected 677 species-specific SNP markers from the entire data set (seven species), while the ddRADseq retrieved 38 (among ten species). Furthermore, 37 such SNP markers were identified from ddRADseq data from a subset of four economically important species which are often involved in hybridization in aquaculture, and larger numbers of SNP markers distinguished between species pairs in this group. 
In summary, these SNPs are a valuable resource in further investigating hybridization and introgression in a range of captive and wild stocks of tilapias.
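The abstract does not spell out how a SNP qualifies as species-specific. A common working definition, assumed here purely for illustration, is a shared locus at which one species is fixed for an allele that no other sampled species carries. The sketch below applies that definition to invented genotype data; the species labels, locus names, and data layout are hypothetical and do not come from the thesis.

```python
def species_specific_snps(genotypes):
    """genotypes: {species: {locus: set of alleles observed in that species}}.

    Returns {species: [loci]} where the species is fixed for one allele
    that no other species carries at that locus.
    """
    result = {sp: [] for sp in genotypes}
    # Only loci genotyped in every species can be compared.
    shared = set.intersection(*(set(calls) for calls in genotypes.values()))
    for locus in sorted(shared):
        for sp, calls in genotypes.items():
            alleles = calls[locus]
            if len(alleles) != 1:
                continue  # polymorphic within the species, so not diagnostic
            others = set().union(
                *(g[locus] for other, g in genotypes.items() if other != sp)
            )
            if alleles.isdisjoint(others):
                result[sp].append(locus)
    return result

# Invented toy data for three species at three shared loci.
toy = {
    "sp1": {"snp1": {"A"}, "snp2": {"C"}, "snp3": {"G", "T"}},
    "sp2": {"snp1": {"G"}, "snp2": {"C"}, "snp3": {"G"}},
    "sp3": {"snp1": {"G"}, "snp2": {"T"}, "snp3": {"G"}},
}
markers = species_specific_snps(toy)
print(markers)
```

Under this definition the number of diagnostic markers can only shrink as more species are added to the comparison, which is one plausible reading of why the ten-species ddRADseq set yields fewer markers than the seven-species RADseq set.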
15

Roy, Christian K. "Putting the Pieces Together: Exons and piRNAs: A Dissertation." eScholarship@UMMS, 2014. https://escholarship.umassmed.edu/gsbs_diss/726.

Abstract:
Analysis of gene expression has undergone a technological revolution. What was impossible 6 years ago is now routine. High-throughput DNA sequencing machines capable of generating hundreds of millions of reads allow, indeed force, a major revision toward the study of the genome’s functional output—the transcriptome. This thesis examines the history of DNA sequencing, measurement of gene expression by sequencing, isoform complexity driven by alternative splicing and mammalian piRNA precursor biogenesis. Examination of these topics is framed around development of a novel RNA-templated DNA-DNA ligation assay (SeqZip) that allows for efficient analysis of abundant, complex, and functional long RNAs. The discussion focuses on the future of transcriptome analysis, development and applications of SeqZip, and challenges presented to biomedical researchers by extremely large and rich datasets.
16

Roy, Christian K. "Putting the Pieces Together: Exons and piRNAs: A Dissertation." eScholarship@UMMS, 2005. http://escholarship.umassmed.edu/gsbs_diss/726.

17

Wang, Wei. "Unveiling Molecular Mechanisms of piRNA Pathway from Small Signals in Big Data: A Dissertation." eScholarship@UMMS, 2015. https://escholarship.umassmed.edu/gsbs_diss/805.

Full text
Abstract:
PIWI-interacting RNAs (piRNAs) are a group of 23–35 nucleotide (nt) short RNAs that protect animal gonads from transposon activity. In the Drosophila germ line, piRNAs fall into two categories, primary and secondary, based on their origins. Primary piRNAs are generated from transcripts of specific genomic regions called piRNA clusters, which are enriched in transposon fragments that are unlikely to retain transposition activity. The transcription and maturation of primary piRNAs from those cluster transcripts are poorly understood. Once produced, a group of primary piRNAs associates with Piwi proteins and directs them to repress transposons at the transcriptional level in the nucleus. Besides their direct role in repressing transposons, primary piRNAs can also initiate the production of secondary piRNAs. piRNAs with this function are loaded into a second PIWI protein named Aubergine (Aub). Like Piwi, Aub is guided by piRNAs to identify its targets through base-pairing. Unlike Piwi, Aub functions in the cytoplasm by cleaving transposon mRNAs. The 5' cleavage products are not degraded but loaded into the third PIWI protein, Argonaute3 (Ago3). It is believed that an unidentified nuclease trims the 3' ends of those cleavage products to 23–29 nt, yielding mature piRNAs that remain in Ago3. Such piRNAs, whose 5' ends are generated by another PIWI protein, are named secondary piRNAs. Intriguingly, secondary piRNAs loaded into Ago3 also cleave transposon mRNAs or piRNA cluster transcripts and produce more secondary piRNAs that are loaded into Aub. This reciprocal feed-forward loop, named the “Ping-Pong cycle”, amplifies piRNA abundance. By dissecting and analyzing data from large-scale deep sequencing of piRNAs and transposon transcripts, my dissertation research elucidates the biogenesis of germline piRNAs in Drosophila. How primary piRNAs are processed into mature piRNAs remains enigmatic.
I discover that the primary piRNA signal on the genome displays a fixed periodicity of ~26 nt. Such phasing depends on Zucchini, Armitage and other primary piRNA pathway components. Further analysis suggests that secondary piRNAs bound to Ago3 can initiate phased primary piRNA production from cleaved transposon RNAs. The first ~26 nt become a secondary piRNA that binds Aub, while the subsequent piRNAs bind Piwi, allowing piRNAs to spread beyond the site of RNA cleavage. This process adds sequence diversity to the piRNA pool, allowing adaptation to changes in transposon sequence. We further find that most Piwi-associated piRNAs are generated from the cleavage products of Ago3, rather than being processed from piRNA cluster transcripts as the previous model suggests. The cardinal function of Ago3 is to produce antisense piRNAs that direct transcriptional silencing by Piwi, rather than to make piRNAs that guide post-transcriptional silencing by Aub. Although Ago3 slicing is required to efficiently trigger phased piRNA production, an alternative, slicing-independent pathway suffices to generate Piwi-bound piRNAs that repress transcription of a subset of transposon families. The alternative pathway may help flies silence newly acquired transposons for which they lack extensively complementary piRNAs. The Ping-Pong model holds that the first ten nucleotides of Aub-bound piRNAs are complementary to the first ten nt of Ago3-bound piRNAs. Supporting this view, piRNAs bound to Aub typically begin with uridine (1U), while piRNAs bound to Ago3 often have adenine at position 10 (10A). Furthermore, the majority of Ping-Pong piRNAs form this 1U:10A pair. The Ping-Pong model proposes that the 10A is a consequence of 1U. By statistically quantifying those target piRNAs not paired to g1U, we discover that 10A is not directly caused by 1U.
Instead, fly Aub, as well as its homologs Siwi in silkmoth and MILI in mice, has an intrinsic preference for adenine at the t1 position of its target RNAs. In turn, this t1A (g10A after loading) piRNA directly gives rise to a 1U piRNA in the next Ping-Pong cycle, maximizing the affinity between piRNAs and PIWI proteins.
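The Ping-Pong signature described in this abstract (complementarity over the first ten nucleotides, with 1U on one partner facing 10A on the other) can be illustrated with a small check. This is only a sketch under stated assumptions: the two sequences and the function names are invented for illustration and are not taken from the dissertation, and a real analysis would work on mapped reads rather than raw strings.

```python
# Toy check of the Ping-Pong signature on two RNA strings (5'->3').
# The sequences below are invented for illustration only.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def is_ping_pong_pair(first: str, second: str, overlap: int = 10) -> bool:
    """True if the first `overlap` nt of both piRNAs are reverse complements."""
    if len(first) < overlap or len(second) < overlap:
        return False
    return first[:overlap] == reverse_complement(second[:overlap])


def has_1u_10a(aub_pirna: str, ago3_pirna: str) -> bool:
    """True if the Aub-bound piRNA starts with U and the Ago3-bound one has A at position 10."""
    return aub_pirna[0] == "U" and ago3_pirna[9] == "A"


aub_bound = "UGCAUGGCAUAGCUAGCUAGCUAGC"   # hypothetical Aub-bound piRNA
ago3_bound = "AUGCCAUGCAGGAUCCGAUUACGA"   # hypothetical Ago3-bound piRNA

print(is_ping_pong_pair(aub_bound, ago3_bound))
print(has_1u_10a(aub_bound, ago3_bound))
```

Because the two 5' ends overlap by exactly ten nucleotides, position 1 of one partner always faces position 10 of the other, which is why 1U and 10A co-occur in a paired duplex regardless of which one causes the other.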
18

Wang, Wei. "Unveiling Molecular Mechanisms of piRNA Pathway from Small Signals in Big Data: A Dissertation." eScholarship@UMMS, 2010. http://escholarship.umassmed.edu/gsbs_diss/805.

Abstract:
PIWI-interacting RNAs (piRNAs) are a group of 23–35 nucleotide (nt) short RNAs that protect animal gonads from transposon activity. In the Drosophila germ line, piRNAs fall into two categories, primary and secondary, based on their origins. Primary piRNAs are generated from transcripts of specific genomic regions called piRNA clusters, which are enriched in transposon fragments that are unlikely to retain transposition activity. The transcription and maturation of primary piRNAs from those cluster transcripts are poorly understood. Once produced, a group of primary piRNAs associates with Piwi proteins and directs them to repress transposons at the transcriptional level in the nucleus. Besides their direct role in repressing transposons, primary piRNAs can also initiate the production of secondary piRNAs. piRNAs with this function are loaded into a second PIWI protein named Aubergine (Aub). Like Piwi, Aub is guided by piRNAs to identify its targets through base-pairing. Unlike Piwi, Aub functions in the cytoplasm by cleaving transposon mRNAs. The 5' cleavage products are not degraded but loaded into the third PIWI protein, Argonaute3 (Ago3). It is believed that an unidentified nuclease trims the 3' ends of those cleavage products to 23–29 nt, yielding mature piRNAs that remain in Ago3. Such piRNAs, whose 5' ends are generated by another PIWI protein, are named secondary piRNAs. Intriguingly, secondary piRNAs loaded into Ago3 also cleave transposon mRNAs or piRNA cluster transcripts and produce more secondary piRNAs that are loaded into Aub. This reciprocal feed-forward loop, named the “Ping-Pong cycle”, amplifies piRNA abundance. By dissecting and analyzing data from large-scale deep sequencing of piRNAs and transposon transcripts, my dissertation research elucidates the biogenesis of germline piRNAs in Drosophila. How primary piRNAs are processed into mature piRNAs remains enigmatic.
I discover that the primary piRNA signal on the genome displays a fixed periodicity of ~26 nt. Such phasing depends on Zucchini, Armitage and other primary piRNA pathway components. Further analysis suggests that secondary piRNAs bound to Ago3 can initiate phased primary piRNA production from cleaved transposon RNAs. The first ~26 nt become a secondary piRNA that binds Aub, while the subsequent piRNAs bind Piwi, allowing piRNAs to spread beyond the site of RNA cleavage. This process adds sequence diversity to the piRNA pool, allowing adaptation to changes in transposon sequence. We further find that most Piwi-associated piRNAs are generated from the cleavage products of Ago3, rather than being processed from piRNA cluster transcripts as the previous model suggests. The cardinal function of Ago3 is to produce antisense piRNAs that direct transcriptional silencing by Piwi, rather than to make piRNAs that guide post-transcriptional silencing by Aub. Although Ago3 slicing is required to efficiently trigger phased piRNA production, an alternative, slicing-independent pathway suffices to generate Piwi-bound piRNAs that repress transcription of a subset of transposon families. The alternative pathway may help flies silence newly acquired transposons for which they lack extensively complementary piRNAs. The Ping-Pong model holds that the first ten nucleotides of Aub-bound piRNAs are complementary to the first ten nt of Ago3-bound piRNAs. Supporting this view, piRNAs bound to Aub typically begin with uridine (1U), while piRNAs bound to Ago3 often have adenine at position 10 (10A). Furthermore, the majority of Ping-Pong piRNAs form this 1U:10A pair. The Ping-Pong model proposes that the 10A is a consequence of 1U. By statistically quantifying those target piRNAs not paired to g1U, we discover that 10A is not directly caused by 1U.
Instead, fly Aub, as well as its homologs Siwi in silkmoth and MILI in mice, has an intrinsic preference for adenine at the t1 position of its target RNAs. In turn, this t1A (g10A after loading) piRNA directly gives rise to a 1U piRNA in the next Ping-Pong cycle, maximizing the affinity between piRNAs and PIWI proteins.
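The ~26 nt phasing reported in this abstract is a statement about distances between piRNA 5' ends along a transcript. A minimal sketch of how such a periodicity might be spotted is given below. The coordinates are made up, the function name is hypothetical, and the dissertation's actual statistics are not reproduced here; published phasing analyses typically use 3'-to-5' distances with normalization, which this toy version omits.

```python
from collections import Counter


def five_prime_distance_histogram(positions):
    """Count distances between consecutive distinct 5' ends on one strand."""
    ordered = sorted(set(positions))
    return Counter(b - a for a, b in zip(ordered, ordered[1:]))


# Hypothetical 5'-end coordinates of piRNAs mapped to one transposon transcript.
five_prime_ends = [100, 126, 152, 179, 205]

histogram = five_prime_distance_histogram(five_prime_ends)
most_common_distance, count = histogram.most_common(1)[0]
print(most_common_distance, count)
```

A sharp peak in this histogram near one piRNA length is what a phased, head-to-tail processing mechanism would predict; a flat histogram would argue against it.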
19

Castro, Lara Reinel de. "Análise exômica em pacientes portadores de cardiomiopatia hipertrófica." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/98/98131/tde-26042016-090654/.

Abstract:
Hypertrophic cardiomyopathy (HCM) is a genetically determined disease characterized by primary ventricular hypertrophy, with an estimated prevalence of 0.2% in the general population. Any carrier has a 50% chance of passing the disease on to their children, which makes the genetic study of affected individuals and their relatives increasingly relevant. Several mutations that cause HCM have been described, most of them in genes encoding sarcomere proteins and some rarer ones in non-sarcomeric genes. The aim of this study is to sequence the exonic regions of candidate genes, including the main genes involved in myocardial hypertrophy, using next-generation sequencing (NGS); to test the applicability and feasibility of this approach for identifying already confirmed mutations; and to propose probable new HCM-causing mutations. Methods and results: 66 unrelated patients with HCM were studied and had blood collected to obtain DNA for analysis of the exonic regions of 82 candidate genes on the MiSeq (Illumina) platform. We identified 99 probably pathogenic mutations, related or not to HCM, in 54 of the patients (81.8%), distributed across 42 different genes. Of these mutations, 27 had already been published, 17 of them described as causing HCM. In 28 patients (42.4%), mutations were identified in the three main sarcomeric genes related to HCM (MYH7, MYBPC3, TNNT2). We also identified a large number of non-synonymous variants of uncertain clinical significance and some mutations related to other diseases.
Conclusion: sequencing the exons of candidate genes with NGS proved to be a promising technique for faster and more sensitive genetic diagnosis of HCM. The amount of data generated is the main limiting factor so far, especially for genetically complex diseases that involve multiple genes and where bioinformatics resources are limited.
20

Cappi, Carolina. "Variações raras no genoma de pacientes com transtorno obsessivo-compulsivo." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/5/5142/tde-09092013-160344/.

Abstract:
Studies of rare genetic variation have successfully characterized regions of the genome and molecular pathways that confer risk for developmental neuropsychiatric disorders. Within this context, whole-exome sequencing for rare single-nucleotide variants (SNVs) and de novo mutations (DNMs) has become an essential approach for gene discovery in psychiatric disease. To date, few if any studies of SNVs and de novo variation using this technology have been reported for obsessive-compulsive disorder (OCD). In the present study, all coding regions of the genome were sequenced in 20 sporadic probands with OCD and their parents to investigate de novo and inherited rare variation in the probands. Based on the observation that the products of genes associated with the same disease tend to interact heavily with each other in a network of protein-protein interactions (PPIs), a PPI network was built from the genes (proteins) carrying non-synonymous de novo variation in order to identify the most important genes in the network. To investigate the relevance of this network to OCD further, degree-aware disease gene prioritization (DADA) was applied to rank all genes with non-synonymous de novo variation by their relatedness to a set of genes previously identified in an OCD genetics meta-analysis. In addition, pathway analyses using de novo and inherited variation were completed.
Altogether, 10 non-synonymous de novo variations (9 missense, 1 nonsense) were validated by Sanger sequencing. The genes WWP1, AP1G1 and CR1, from the initial list of genes with non-synonymous de novo variation, were highly interconnected in the PPI network. WWP1 had the highest rank in the DADA analysis. Results of the pathway analyses suggest enrichment of genes involved in the immune system. In short, all but one of the genes with non-synonymous de novo variation in the present study are expressed in the human brain and implicated in synaptogenesis and neuronal apoptosis.
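The notion of a gene being "highly interconnected" in a PPI network comes down to node degree, the number of distinct interaction partners. The sketch below ranks nodes by degree on an invented edge list with placeholder gene names. It is not the DADA algorithm from the abstract, which additionally adjusts for degree bias when prioritizing disease genes, and the function name is hypothetical.

```python
from collections import defaultdict


def degree_ranking(edges):
    """Rank nodes of an undirected network by number of distinct partners."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbours[a].add(b)
            neighbours[b].add(a)
    # Highest degree first; ties broken alphabetically for reproducibility.
    return sorted(
        ((gene, len(partners)) for gene, partners in neighbours.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Placeholder interactions, not real PPI data.
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_B", "GENE_A"),  # duplicate of the first edge, counted once
]

ranking = degree_ranking(toy_edges)
print(ranking)
```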
21

Silva, Artur Guazzelli Leme. "A influência de polimorfismos de base única na metilação de DNA em genes de receptores olfatórios." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/46/46131/tde-19072018-104722/.

Abstract:
Olfactory receptor (OR) genes belong to a large family of membrane proteins comprising about 1,000 genes in the mouse genome. OR genes are expressed in olfactory sensory neurons (OSNs) in a monogenic and monoallelic fashion. However, the mechanisms that govern this peculiar mode of expression remain unclear, in particular the role played by DNA methylation. We first determined the DNA methylation pattern in the coding sequence (CDS) and promoter regions of the odorant receptor gene Olfr17. In main olfactory epithelium (MOE) cells from adult mice, the CpG methylation level in the CDS is 58%, whereas it is much lower in the promoter region of the gene. In embryonic MOE (E15.5) and liver, Olfr17 methylation levels are similar to those observed in adult MOE. We next analyzed whether DNA methylation regulates Olfr17 expression. We isolated GFP+ neurons from transgenic mice in which the neurons that express Olfr17 also express GFP, and analyzed the methylation pattern of Olfr17, which is active in these cells. We found that the overall methylation pattern, in both the coding and promoter regions, is not altered when the gene is active. This result indicates that changes in DNA methylation are not required for Olfr17 activation. Finally, we found that the Olfr17 promoter regions of two mouse strains, C57BL/6 and 129, differ by two single-nucleotide polymorphisms (SNPs) that alter the CpG content.
These SNPs create two additional CpGs in the 129 allele that are absent from the C57BL/6 allele. These CpGs are frequently methylated, making the 129 Olfr17 promoter significantly more methylated than the C57BL/6 promoter. We next performed RT-qPCR experiments to measure the expression levels of the 129 and C57BL/6 Olfr17 alleles in the MOE. These experiments showed that the expression level of the 129 allele, which carries three methylated CpGs in its promoter, is lower than that of the C57BL/6 allele, which carries a single, sparsely methylated CpG. Our results suggest that these promoter modifications influence the probability with which an OR gene is chosen for expression in the MOE.
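How a single-base change can create a new methylation target is easy to see by counting CpG dinucleotides in two alleles. The sketch below uses two short invented sequences that differ at two positions. They are not the real Olfr17 promoter sequences and are chosen only so that one allele has one CpG and the other three, mirroring the counts given in the abstract.

```python
def cpg_positions(sequence: str):
    """Return 0-based start positions of every CpG dinucleotide in a DNA string."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


# Invented promoter fragments; the two alleles differ at positions 7 and 13 (A -> G).
allele_b6 = "ATCGTTCATGTTCAGA"    # stands in for the C57BL/6 allele
allele_129 = "ATCGTTCGTGTTCGGA"   # stands in for the 129 allele

snp_positions = [i for i, (x, y) in enumerate(zip(allele_b6, allele_129)) if x != y]

print(cpg_positions(allele_b6))
print(cpg_positions(allele_129))
print(snp_positions)
```

In this toy case each SNP turns a CA into a CG, so the variant allele gains two sites that can be methylated while the rest of the sequence is unchanged.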
22

Reis, Viviane Neri de Souza. "Variações de novo e raras no genoma de pacientes com transtornos do espectro do autismo verbais e não verbais." Universidade de São Paulo, 2014. http://www.teses.usp.br/teses/disponiveis/5/5142/tde-08122014-121628/.

Abstract:
Twin and family studies have shown that autism spectrum disorders (ASD) have a large genetic component (~50%), but their etiology remains unknown, possibly because ASD are complex, polygenic and multifactorial conditions with a very heterogeneous phenotype. Recently, rare copy number variations (CNVs) and single-nucleotide variants (SNVs), both inherited and de novo, have been associated with ASD, suggesting new candidate genes and loci. Because they are very rare, the vast majority of the alterations described are private to individual patients, so grouping variants by gene and searching for over-represented biological functions or pathways has become an approach to understanding the possible pathogenic mechanisms of ASD. As ASD is clinically very heterogeneous, using specific endophenotypes to group genomic alterations can help discriminate pathways and biological processes related to phenotypic dimensions. Considering previous studies in autism and the nature of common and rare variants, we sequenced the exome of one family with two siblings with syndromic ASD (pilot sequencing) and of 18 trios of sporadic ASD cases, searching for very rare and/or de novo variations with probable functional impact in Brazilian patients. We also analyzed whether the over-represented biological pathways differ between gene networks grown from the genes carrying rare and de novo variants in two groups of ASD patients: (1) cases with little or no communication, called nonverbal, and (2) cases with average to good communication, called verbal.
In the pilot exome sequencing (syndromic ASD family), we found a duplication in 4p16.3 and a deletion in 8p23.3 in both siblings, alterations previously reported in patients with syndromic features and ASD. The analysis of SNVs and indels showed 1 de novo variation and 117 rare nonsynonymous variations inherited from one of the parents in the sister, and 150 rare nonsynonymous variations inherited from one of the parents in the brother. Pathway analysis revealed that the genes carrying rare point mutations were over-represented in different chromosomal regions in each sibling (chromosome 1 in the female patient and chromosome 16 in the male patient), which may be related to their phenotypic differences. In the exome sequencing of the trios, de novo alterations were found in 9 of the patients: 1 de novo CNV (a 1.5 Mb deletion) in region 3q29, previously associated with syndromes and developmental disorders, and 8 genes altered by de novo point mutations, one of which is GABBR2, a gene with previous evidence of association with ASD. Pathway and network analysis of the rare inherited variants showed that many of the genes related to the verbal and nonverbal groups are already associated with ASD or interact with genes associated with ASD. These pathway and gene network analyses need to be replicated in larger samples, but our preliminary results contribute alterations in genes from pathways related to neurogenesis and synaptogenesis, regardless of phenotype, which may reflect a specific set of genes and/or a number of alterations related to the severity of ASD.
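The trio design used in this study rests on a simple set comparison: a variant seen in the child but in neither parent is a de novo candidate, while one shared with a parent is inherited. The sketch below shows only that core logic on invented variants. The `Variant` tuple and the function name are assumptions made for illustration, and real pipelines also filter on genotype quality, read depth and population frequency before calling anything de novo.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def split_trio_variants(
    child: Set[Variant], mother: Set[Variant], father: Set[Variant]
) -> Tuple[Set[Variant], Set[Variant]]:
    """Split the child's variants into de novo candidates and inherited ones."""
    parental = mother | father
    de_novo = child - parental      # absent from both parents
    inherited = child & parental    # present in at least one parent
    return de_novo, inherited


# Invented calls for one trio.
child_calls = {Variant("chr1", 1000, "A", "G"), Variant("chr9", 500, "C", "T")}
mother_calls = {Variant("chr1", 1000, "A", "G")}
father_calls = {Variant("chr2", 42, "G", "A")}

de_novo, inherited = split_trio_variants(child_calls, mother_calls, father_calls)
print(de_novo)
print(inherited)
```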
23

Lopalco, Maria. "A new class of enzymatically cleavable nucleotides for DNA sequencing by synthesis." Thesis, University of Edinburgh, 2009. http://hdl.handle.net/1842/15233.

Abstract:
A new family of nucleotides has been designed, synthesised and evaluated for DNA sequencing by synthesis. Each of the nucleotide analogues had a free 3’-OH group and a fluorophore attached to the base through a small peptide linkage that was designed to stop multiple base additions. Analysis by HPLC and MALDI-TOF proved the modified nucleotides were incorporated into a growing DNA strand by a DNA polymerase and that the peptide linker was cleaved with high efficiency after incubation of the extended primer with a protease. Enzymatically cleavable nucleotides were successfully applied to chip-based sequencing by synthesis cycles. The fluorescent emission revealed the identity of the incorporated nucleotide and the removal of the fluorophore by a protease ensured the detection of the next base in the following cycle. Additionally, a practical microwave-mediated solid-phase protocol has been developed for the synthesis of cyanine dyes spanning the visible and the near infrared spectrum to allow the preparation of a series of fluorescent nucleotides for the design of a four colour DNA sequencing technology.
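The cycle described in this abstract (incorporate one labelled nucleotide, read its fluorescence, remove the fluorophore, repeat) can be sketched as a toy simulation. Everything below is an illustrative assumption: the dye colours are arbitrary, the function name is hypothetical, and the chemistry is idealized as error-free with exactly one incorporation per cycle.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Arbitrary colour assignment, one dye per base (four-colour scheme).
DYE_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def sequence_by_synthesis(template_3_to_5: str) -> str:
    """Return the strand synthesized (5'->3') against a template read 3'->5'."""
    read = []
    for template_base in template_3_to_5:
        # 1. The polymerase adds one labelled nucleotide; the bulky
        #    dye-linker is assumed to block any further addition.
        incorporated = COMPLEMENT[template_base]
        dye_on_strand = DYE_FOR_BASE[incorporated]
        # 2. Imaging: the emission colour identifies the base just added.
        read.append(BASE_FOR_DYE[dye_on_strand])
        # 3. Cleavage (a protease in the abstract) removes the dye,
        #    clearing the signal before the next cycle starts.
        dye_on_strand = None
    return "".join(read)


print(sequence_by_synthesis("TACG"))
```

The read reports the newly synthesized strand, so the template sequence is recovered by taking its complement.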
24

Hansen, Tarrant William. "Evaluation of molecular methods used for the rapid detection of multi-drug resistant Mycobacterium tuberculosis." Thesis, Queensland University of Technology, 2008. https://eprints.qut.edu.au/20723/1/Tarrant_Hansen_Thesis.pdf.

Abstract:
Tuberculosis remains a major public health issue globally, with an estimated 9.2 million new cases in 2006. A new threat to TB control is the emergence of drug-resistant strains. These strains are harder to cure because standard first-line anti-tuberculosis treatments are ineffective. Multi Drug Resistant Tuberculosis (MDR-TB) is defined as Mycobacterium tuberculosis that has developed resistance to at least rifampicin and isoniazid, and these strains now account for more than 5% of cases worldwide. Mutations within the Rifampicin Resistance Determining Region (RRDR) of the rpoB gene are present in more than 95% of strains that show rifampicin resistance by conventional drug susceptibility testing. As rifampicin mono-resistance is extremely rare, and rifampicin resistance is usually associated with isoniazid resistance, the RRDR of the rpoB gene is a very useful surrogate marker for MDR-TB. Many molecular assays based on this principle have been attempted, with varied levels of success. The three methods evaluated in this study are DNA sequencing of the rpoB, katG and inhA genes, the Genotype MTBDRplus line probe assay (Hain Lifesciences) and a novel method incorporating Real-Time PCR with High Resolution Melt analysis targeted at the RRDR using the Rotorgene 6000 (Corbett Lifesciences). The sensitivity for the detection of rifampicin resistance was far better using DNA sequencing or the commercially available line probe assay than using the Real-Time PCR method developed in this study.
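Comparing assays by sensitivity, as this abstract does, relies on the standard definitions: sensitivity is the fraction of truly resistant isolates that the assay flags, and specificity is the fraction of susceptible isolates it correctly leaves unflagged. The sketch below applies those formulas to invented counts. None of the numbers come from the thesis, and conventional drug susceptibility testing is assumed as the reference standard.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of reference-positive samples that the assay detects."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of reference-negative samples that the assay calls negative."""
    return true_negatives / (true_negatives + false_positives)


# Invented counts for one hypothetical assay against the reference method.
tp, fn = 19, 1    # resistant isolates: detected / missed
tn, fp = 45, 5    # susceptible isolates: correctly negative / falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
```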
25

Hansen, Tarrant William. "Evaluation of molecular methods used for the rapid detection of multi-drug resistant Mycobacterium tuberculosis." Queensland University of Technology, 2008. http://eprints.qut.edu.au/20723/.

Abstract:
Tuberculosis remains a major public health issue globally, with an estimated 9.2 million new cases in 2006. A new threat to TB control is the emergence of drug-resistant strains. These strains are harder to cure because standard first-line anti-tuberculosis treatments are ineffective. Multi Drug Resistant Tuberculosis (MDR-TB) is defined as Mycobacterium tuberculosis that has developed resistance to at least rifampicin and isoniazid, and these strains now account for more than 5% of cases worldwide. Mutations within the Rifampicin Resistance Determining Region (RRDR) of the rpoB gene are present in more than 95% of strains that show rifampicin resistance by conventional drug susceptibility testing. As rifampicin mono-resistance is extremely rare, and rifampicin resistance is usually associated with isoniazid resistance, the RRDR of the rpoB gene is a very useful surrogate marker for MDR-TB. Many molecular assays based on this principle have been attempted, with varied levels of success. The three methods evaluated in this study are DNA sequencing of the rpoB, katG and inhA genes, the Genotype MTBDRplus line probe assay (Hain Lifesciences) and a novel method incorporating Real-Time PCR with High Resolution Melt analysis targeted at the RRDR using the Rotorgene 6000 (Corbett Lifesciences). The sensitivity for the detection of rifampicin resistance was far better using DNA sequencing or the commercially available line probe assay than using the Real-Time PCR method developed in this study.
26

Yamamoto, Guilherme Lopes. "Aplicabilidade clínica da técnica de sequenciamento de nova geração com enfoque em displasias esqueléticas." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/5/5141/tde-18122017-091713/.

Abstract:
INTRODUCTION: In the last decade a new technique, next-generation sequencing, has emerged which, unlike the traditional Sanger method, performs parallel, high-throughput sequencing of multiple genes, or even all human genes, at a lower cost and with faster analysis. This technique enabled the discovery of new genes responsible for several Mendelian diseases and has been quickly incorporated into clinical practice.
OBJECTIVES: to compare the results of Sanger sequencing and next-generation sequencing in control samples; to introduce next-generation sequencing into clinical practice for cohorts with genetic skeletal disorders and RASopathies; and to evaluate the diagnostic yield of this technique in samples provided without clinical data. METHODS: Next-generation sequencing (as a customized gene panel or an exome) was performed on samples with mutations previously identified by Sanger sequencing and on two groups of Mendelian diseases, 144 patients with skeletal disorders and 79 patients with RASopathies, as well as on 90 samples with no known clinical data (45 cases and 45 controls). Sanger sequencing was applied to 29 samples from patients with skeletal disorders and to 81 samples to confirm variants identified by next-generation sequencing. RESULTS: The sensitivity of next-generation sequencing was estimated at 95.92% and the specificity at 98.77%. In the skeletal disorder cohort, the diagnostic yield was 69% (20/29) for samples sequenced by Sanger, 60% (75/125) for the customized panel and 63% (12/19) for the exome. In the RASopathy cohort, the diagnostic yield of exome sequencing was 46% (36/79). As a result of this study, two new genes associated with RASopathies (LZTR1 and SOS2) and one associated with a skeletal dysplasia (PCYT1A) were identified. In the analysis of the samples without prior knowledge of the clinical hypothesis, an overall diagnostic yield of 46.67% (21/45) was obtained, rising to 73.08% (14/26) in the group with a hypothesis of inborn errors of metabolism. CONCLUSIONS: It was demonstrated that the sensitivity and specificity of next-generation sequencing are high and similar to values reported by other groups in the literature.
The technique was considered appropriate not only for research, as demonstrated here by the identification of three new genes associated with Mendelian diseases, but also for clinical analysis. In this study the technique was successfully applied in the clinical context, both by customized panel and by exome, with a positivity similar to that obtained by the Sanger technique. Even in the analysis of samples with no previous clinical history, it was possible to identify pathogenic variants in almost half of the cases, and an even greater percentage was obtained when the disease was an inborn error of metabolism. This sensitivity is comparable to that obtained by tandem mass spectrometry applied to multi-condition screening, suggesting that the next generation sequencing technique may in the future be incorporated into the neonatal screening program, expanding the use of genetic testing as a complement to traditional biochemical tests.
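The diagnostic yields quoted in this abstract are simple proportions of solved to tested samples. As a minimal sketch (the function name `diagnostic_yield` and the dictionary layout are illustrative choices, not part of the thesis), the reported counts can be turned back into percentages like this:

```python
def diagnostic_yield(solved: int, tested: int) -> float:
    """Percentage of tested samples in which a causative variant was found."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * solved / tested


# Counts copied from the abstract as (solved, tested); labels are illustrative.
cohorts = {
    "skeletal disorders, Sanger": (20, 29),
    "skeletal disorders, custom panel": (75, 125),
    "skeletal disorders, exome": (12, 19),
    "RASopathies, exome": (36, 79),
    "samples without clinical data": (21, 45),
}

for name, (solved, tested) in cohorts.items():
    print(f"{name}: {solved}/{tested} = {diagnostic_yield(solved, tested):.2f}%")
```

Rounded to whole percentages, these values match the 69%, 60%, 63%, and 46% figures given in the abstract.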
27

Benazir, Katarina Marquez. "Molecular Marker Applications in Oat (Avena Sativa L.) Breeding and Germplasm Diagnostics." Thèse, Université d'Ottawa / University of Ottawa, 2014. http://hdl.handle.net/10393/31148.

Abstract:
The ability to identify germplasm and select traits accurately is fundamental to successful plant breeding. Pedigrees and molecular markers facilitate these processes; however misleading experimental results can occur when incorrect relationships and/or cultivar names are recorded. Molecular markers can identify these inconsistencies, and with advances in genotyping technology these diagnostics can be done faster and more objectively. This study aimed to develop molecular marker assays and graphical genotyping methodologies for cultivar identification, seed purity assessment and trait selection in oat (Avena sativa L.). KBioscience’s Allele-Specific PCR (KASP™) and genotyping-by-sequencing (GBS) technologies were applied to a set of current Canadian oat cultivars to evaluate their utility for identifying cultivars and detecting intra-cultivar variation. Both KASP™ and GBS detected different extents of heterogeneity among a set of 160 seeds that originated from four seed sources of four cultivars. In both cases, the detected variation did not appear to be limited to a specific cultivar or seed source, reinforcing that all cultivars are heterogeneous. Graphical genotyping localized heterogeneity to specific chromosome regions, thereby distinguishing physical contamination from true genetic heterogeneity and heterozygosity. Pre-existing genotype data for 700 oat cultivars and breeding lines were also used to construct graphical genotypes for pedigree validation and discovery of potential sources for favourable quantitative trait loci (QTL) alleles. This methodology used historical QTLs and anchoring markers to identify 25 putative “high oil” allele carriers. The results from this study will provide diagnostic tools for cultivar identification and pedigree validation, in addition to meaningful information about existing heterogeneity and possible QTL locations in current cultivars.
28

Baratela, Wagner Antonio da Rosa. "Estudo genético-clínico das displasias esqueléticas, com enfoque nas osteocondrodisplasias com acometimento do esqueleto axial, associado ao envolvimento epifisário e/ou metafisário." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/5/5141/tde-02072018-120400/.

Abstract:
INTRODUÇÃO: As osteocondrodisplasias constituem um grupo heterogêneo de doenças que comprometem a formação, crescimento e desenvolvimento do sistema esquelético. O diagnóstico definitivo, principalmente nas formas com acometimento de coluna, epífise e/ou metáfise, é desafiador, devido à heterogeneidade genética, raridade de algumas formas específicas e da sobreposição de fenótipos clínico-radiológicos. OBJETIVOS: avaliar as características clínico-radiológicas e as bases moleculares de um grupo de pacientes com osteocondrodisplasias com envolvimento do esqueleto axial associado a anomalias epifisárias e/ou metafisárias. MÉTODOS: Foram avaliados clínica-radiologicamente 65 pacientes de 58 famílias (com mediana de idade de 9 anos e 2 meses) com osteocondrodisplasias de acometimento espôndilo-epi-metafisários pertencentes a 15 diferentes grupos da Nosologia de anomalias ósseas genéticas. Os pacientes foram classificados em cinco categorias (I,IIA,IIB,III,IV), levando-se em consideração a certeza do diagnóstico de uma determinada displasia esquelética pelos aspectos clínico-radiológicos (I,IIA,IIB,IV), o conhecimento (I,IIA,IIB) ou não (III,IV) da base molecular e o tamanho/número de éxons a serem analisados. De acordo com esta classificação, um fluxograma foi estabelecido para a decisão de qual técnica de sequenciamento seria empregada: Sanger (I) ou sequenciamento de nova geração, sob a forma de um painel de genes customizado (IIA,III) ou do exoma (IIB,III,IV). RESULTADOS: dentre os 64 pacientes analisados por um teste molecular, variantes consideradas causativas do fenótipo foram encontradas em 61 (95%). De acordo com as categorias estabelecidas, a positividade do teste genético inicial e o número de indivíduos analisados foram: categoria I (13/17), IIA (32/36), IIB (2/2), III (1/3), IV (6/6). 
Em sete casos (três na categoria I, dois na categoria IIA e dois na categoria III), nos quais a estratégia molecular inicial não obteve sucesso, testes moleculares adicionais foram realizados como complementação, possibilitando a identificação da base molecular das osteocondrodisplasias em questão. Em um paciente na categoria I e nos dois indivíduos da categoria IIA, nos quais não foi possível identificar uma variante patogênica, nenhum teste genético adicional foi realizado. Dentre os casos com resultados positivos, em um paciente na categoria IIB, o resultado final mudou a hipótese diagnóstica inicial; dois pacientes classificados como IIA e III, o quadro clínico presente era leve, contrastando com o quadro típico descrito nas osteocondrodisplasias causadas por mutações nos genes INPPL1 e HSPG2. No grupo de osteocondrodisplasias sem etiologia conhecida (categoria IV), foram identificadas variantes em três novos genes (PCYT1A, FN1 e LONP1). Uma re-análise nos casos negativos permitiu a identificação de mais um paciente com mutação em FN1. Mutações novas em genes previamente conhecidos foram encontradas nos genes COL2A1, COL11A1, CHST3, SLC26A2, HSPG2, ACAN, TRPV4, COMP, SMARCAL1, INPPL1, NPR2, LIFR e CANT1. CONCLUSÕES: o presente estudo mostrou que uma caracterização clínico-radiológica bem elaborada permite uma maior concordância com os testes moleculares. Em algumas situações, a sobreposição fenotípica impediu que o diagnóstico definitivo fosse estabelecido apenas por critérios clínico-radiológicos, exigindo o emprego de testes moleculares adicionais para o correto diagnóstico e, consequentemente, um aconselhamento genético mais preciso. 
Os achados clínico-radiológicos e moleculares contribuíram para ampliar o espectro fenotípico de algumas osteocondrodisplasias, particularmente aquelas decorrentes de variantes nos genes HSPG2 e INPPL1, além de desvendar as bases genéticas de três osteocondrodisplasias<br>INTRODUCTION: Osteochondrodysplasias comprise a heterogeneous group of bone disorders affecting formation, growth, and development of the skeleton. An accurate diagnosis can be challenging due to genetic heterogeneity, the rarity of specific types, and overlapping clinical and radiographic phenotypes. OBJECTIVES: to evaluate the clinical and radiographic characteristics, as well as the molecular basis, of a group of patients with spondylo-epi- and/or metaphyseal osteochondrodysplasias. METHODS: Sixty-five patients from 58 families (with a median age of 9 years 2 months) with spondylo-epi-metaphyseal osteochondrodysplasias, from 15 different groups of the Nosology of Genetic Bone Disorders, were enrolled. Patients were classified into five categories (I, IIA, IIB, III, IV), considering the diagnostic certainty of a specific osteochondrodysplasia based on clinical and radiographic findings (I, IIA, IIB, IV), whether the molecular basis was previously known (I, IIA, IIB) or not (III, IV), and the size/number of exons to be tested. According to this classification, a flowchart was established to guide the molecular testing strategy: Sanger sequencing (I), or next generation sequencing in the form of a custom targeted gene panel (IIA, III) or exome sequencing (IIB, III, IV). RESULTS: Phenotype-causing variants were found in 61 out of 64 patients (95%). Based on the category system described above, the initial molecular testing positivity was: category I (13/17), IIA (32/36), IIB (2/2), III (1/3), and IV (6/6).
In seven cases (three in category I, two in category IIA, and two in category III) in which the initial testing strategy was not successful, additional genetic tests identified the causative mutation. For one patient in category I and two in category IIA, in whom no pathogenic mutation was found, no further molecular studies were performed. Among the positive cases, one patient in category IIB had the initial clinical diagnosis changed by the molecular testing; two patients, from categories IIA and III, showed a milder phenotype, in contrast to the expected clinical findings in INPPL1 and HSPG2 osteochondrodysplasias. The molecular basis of three different osteochondrodysplasias (category IV) was unraveled by this study (genes PCYT1A, FN1, and LONP1). A re-analysis of the negative cases allowed the identification of another patient harboring a mutation in FN1. Novel variants were found in previously known genes: COL2A1, COL11A1, CHST3, SLC26A2, HSPG2, ACAN, TRPV4, COMP, SMARCAL1, INPPL1, NPR2, LIFR, and CANT1. CONCLUSIONS: The present study showed that a well-elaborated clinical-radiological characterization allows greater concordance with the molecular testing. In some cases, phenotypic overlap prevented the definitive diagnosis from being established by clinical-radiological criteria alone, requiring additional molecular tests for the correct diagnosis and, consequently, more accurate genetic counseling. The clinical-radiographic and molecular findings contributed to expanding the clinical phenotype in certain groups, especially for the variants found in patients with HSPG2 and INPPL1 osteochondrodysplasias, besides unraveling the molecular basis of three osteochondrodysplasias.
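The per-category counts in this abstract can be reconciled with the overall figure of 61 out of 64 patients. The following is a small bookkeeping sketch, assuming the seven cases solved by additional testing are counted on top of the initial positives; the variable names are illustrative and not taken from the thesis.

```python
# Initial molecular test results per category as (positive, tested), from the abstract.
initial = {
    "I": (13, 17),
    "IIA": (32, 36),
    "IIB": (2, 2),
    "III": (1, 3),
    "IV": (6, 6),
}

# Cases solved only after additional genetic tests, per category (from the abstract).
solved_by_additional_tests = {"I": 3, "IIA": 2, "III": 2}

tested_total = sum(tested for _, tested in initial.values())
positive_initial = sum(positive for positive, _ in initial.values())
positive_final = positive_initial + sum(solved_by_additional_tests.values())
unsolved = tested_total - positive_final

print(f"tested: {tested_total}")
print(f"positive after initial test: {positive_initial}")
print(f"positive after additional tests: {positive_final}")
print(f"unsolved: {unsolved}")
print(f"overall positivity: {100 * positive_final / tested_total:.0f}%")
```

The three unsolved patients correspond to the one category I and two category IIA patients for whom no further studies were performed.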
29

Piazza, João Paulo. "Uma metodologia para determinação do organismo de origem de sequencias de DNA com aplicação em projetos EST." [s.n.], 2004. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275787.

Abstract:
Orientador: João Carlos Setubal<br>Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação<br>Made available in DSpace on 2018-08-31T09:21:06Z (GMT). No. of bitstreams: 1 Piazza_JoaoPaulo_M.pdf: 1307969 bytes, checksum: 885944b1beb24b7a3979738e217bfb50 (MD5) Previous issue date: 2004<br>Resumo: Este trabalho apresenta uma nova metodologia para a determinação computacional do organismo de origem de seqüência de DNA, implementada na forma de um programa chamado QUEST. O QUEST é baseado em dois princípios: a extração de informações intrínsecas a cada seqüência, chamadas de características, e a extração de diferentes tipos de características e sua combinação para se chegar a melhores resultados. São utilizados 7 diferentes programas como extratores de características, alguns desenvolvidos por terceiros (Glimmer e ESTScan) e outros desenvolvidos pelo autor. As características foram combinadas utilizando vários classificadores diferentes, variando desde uma soma simples até os baseados em vetores de suporte. O QUEST requer seqüências para treinamento. Em comparação com as abordagens baseadas em similaridade, as vantagens principais da QUEST estão no fornecimento de previsões para as taxas de erro e na capacidade de lidar com seqüências sem similaridades significativas em bancos de seqüência. O QUEST foi aplicado ao problema de determinar automaticamente contaminantes em projetos EST. São apresentados resultados de experimentos simulados e de um projeto EST real (o projeto EST de Schistosoma mansoni). Nos experimentos simulados foram atingidas taxas de falsos positivos mais falsos negativos de aproximadamente 10%. No projeto de S.mansoni o QUEST sugere que a contaminação em seqüências supostamente legítimas poderia ser de pelo menos 6%. No teste com S.mansoni, o QUEST foi 10 vezes mais rápido que o tempo necessário para executar o BLASTX em todas as seqüências testadas. 
O QUEST tem outras aplicações, incluindo a determinação do organismo de origem na nova abordagem genômica chamada de genômica ambiental (também chamada de metagenômica).<br>Abstract: This work presents a new methodology for computationally determining the organismal origin of DNA sequences, which we call QUEST. QUEST is based on two principles: extracting intrinsic information from each sequence, called features, and extracting different kinds of features and combining them to achieve a better result. We use 7 different programs as feature extractors, some third-party (Glimmer and ESTScan) and others developed by the author. We combine features using many different standard classifiers, ranging from a simple sum to support vector machines. QUEST requires training sequences. In comparison to similarity-based approaches, QUEST has the main advantages of providing predicted error rates and of being able to deal with sequences without a significant match in sequence databases. We applied QUEST to the problem of automatically determining contaminants in EST projects. We present results from a simulated experiment and from a real EST project (the Schistosoma mansoni EST project). In the simulated experiment we achieved rates of false positives plus false negatives of around 10%. In the S. mansoni project QUEST suggests that contamination in supposedly bona fide sequences may be at least 6%. In the S. mansoni test, QUEST was 10 times faster than running BLASTX on all tested sequences. QUEST has a number of other applications, including the determination of organismal origin in the new approach to genomics called environmental genomics (also called metagenomics).<br>Mestrado<br>Mestre em Ciência da Computação
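The abstract describes combining several feature extractors, with a simple sum as the most basic classifier. The sketch below illustrates only that idea. It is not QUEST's implementation: the two toy features (GC content and dinucleotide overlap), the function names, and the training sequences are all assumptions made for illustration, whereas the thesis uses seven extractors, including Glimmer and ESTScan.

```python
from typing import Callable, Dict, List

Scores = Dict[str, float]               # organism label -> score
Extractor = Callable[[str], Scores]     # sequence -> per-organism scores


def gc_fraction(seq: str) -> float:
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def make_gc_extractor(training: Dict[str, List[str]]) -> Extractor:
    """Toy feature: closeness of the query GC content to each organism's mean."""
    means = {org: sum(gc_fraction(s) for s in seqs) / len(seqs)
             for org, seqs in training.items()}

    def extract(seq: str) -> Scores:
        gc = gc_fraction(seq)
        return {org: 1.0 - abs(gc - mean) for org, mean in means.items()}

    return extract


def make_kmer_extractor(training: Dict[str, List[str]], k: int = 2) -> Extractor:
    """Toy feature: fraction of query k-mers seen in each organism's training set."""
    seen = {org: {s.upper()[i:i + k] for s in seqs for i in range(len(s) - k + 1)}
            for org, seqs in training.items()}

    def extract(seq: str) -> Scores:
        seq = seq.upper()
        kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        if not kmers:
            return {org: 0.0 for org in seen}
        return {org: sum(km in known for km in kmers) / len(kmers)
                for org, known in seen.items()}

    return extract


def classify_by_simple_sum(seq: str, extractors: List[Extractor]) -> str:
    """Add up the scores from every extractor and return the best-scoring organism."""
    totals: Scores = {}
    for extractor in extractors:
        for org, score in extractor(seq).items():
            totals[org] = totals.get(org, 0.0) + score
    return max(totals, key=lambda org: totals[org])


# Hypothetical training data: an AT-rich "host" and a GC-rich "contaminant".
training = {
    "host": ["ATATATTTAAAT", "TTTAAATATATA"],
    "contaminant": ["GCGCGGCCGCGC", "CCGGCGCGGGCC"],
}
extractors = [make_gc_extractor(training), make_kmer_extractor(training)]

print(classify_by_simple_sum("ATTTATAAATAT", extractors))
print(classify_by_simple_sum("GGCCGCGCGC", extractors))
```

A simple sum treats every extractor as equally reliable. The abstract notes that classifiers up to support vector machines were also tried, which would instead learn how to weight the features from the training sequences.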
30

Silva, Amanda Santiago Ferreira Lantyer. "Filogeografia e genômica populacional de anuros neotropicais : estruturação, diversidade e demografia histórica /." Rio Claro, 2019. http://hdl.handle.net/11449/182576.

Abstract:
Orientador: Célio Fernando Baptista Haddad<br>Resumo: Nesta tese exploramos a diversificação de espécies de anuros da Tribo Lophyohylini distribuídas na Mata Atlântica (MA) e Caatinga, biomas do Brasil. Para tanto, utilizamos dados genômicos e métodos de filogeografia estatística para analisá-los, uma abordagem robusta e até então ainda escassa na filogeografia dos organismos sul-americanos. Através do sequenciamento Sanger amostramos trechos de DNA mitocondrial (mtDNA) e através do sequenciamento de alto rendimento obtivemos polimorfismos de nucleotídeo único (SNPs) do DNA nuclear (nDNA). No primeiro capítulo, reunimos 82 amostras da espécie Aparasphenodon brunoi que apresenta distribuição nas terras baixas e costeiras do bioma MA, sendo considerada ideal para investigar o papel da instabilidade climática do Pleistoceno em seu processo de diversificação. Ainda no primeiro capítulo, para avaliar a influência do comportamento filopátrico, associado às diferenças no modo reprodutivo, na dispersão e consequentemente, no fluxo gênico entre populações, reunimos 72 amostras de A. arapapa e A. aff. brunoi. Estas tratam-se de espécies simpátricas, de distribuição mais restrita e com modos reprodutivos contrastantes o que as tornam ideais para esse tipo de comparação. De acordo com a inferência baseada em modelos demográficos, por volta do último máximo glacial (UMG) a população norte de A. brunoi se manteve estável enquanto que a do sul experimentou um gargalo seguido de expansão recente. Além disso, a população norte de A. brunoi apres... (Resumo completo, clicar acesso eletrônico abaixo)<br>Abstract: In this dissertation we explored the diversification of anuran species of the Lophyohylini Tribe distributed in the Atlantic Forest (AF) and Caatinga biomes of Brazil. To do so, we used genomic data and methods of statistical phylogeography to analyze them, a robust and until then still scarce approach in the phylogeography of South American organisms. 
Through the Sanger sequencing we sampled mitochondrial DNA (mtDNA) and through high-throughput sequencing we obtained single nucleotide polymorphisms (SNPs) from nuclear DNA (nDNA). In the first chapter, we collected 82 samples of the species Aparasphenodon brunoi that is distributed in the low and coastal lands of the AF biome, being considered ideal to investigate the role of Pleistocene climatic instability in its diversification process. In the first chapter, to evaluate the influence of philopatric behavior, associated with differences in reproductive mode, dispersion and consequent gene flow among populations, we also collected 72 samples of A. arapapa and A. aff. brunoi species. These are sympatric species with a narrower distribution and with contrasting reproductive modes, making them ideal for this type of comparison. According to the inference based on demographic models, around the last glacial maximum (LGM) the northern population of A. brunoi remained stable while the southern one experienced a bottleneck followed by recent expansion. In addition, the northern population of A. brunoi exhibits greater nuclear diver... (Complete abstract click electronic access below)<br>Doutor
31

Guo, Wenjing. "Design and synthesis of novel nucleotide analogs and protein conjugates for DNA sequencing." Thesis, 2016. https://doi.org/10.7916/D8ZP4686.

Abstract:
Sequencing by Synthesis (SBS), a DNA sequencing methodology based on the DNA polymerase reaction, is a promising paradigm for deciphering large-scale genomes. This thesis describes the design and synthesis of a variety of nucleotide reversible terminators (NRTs) with different characteristics. One set of NRTs possesses a phosphate moiety attached to the 2’ position of the sugar to block further incorporation in the polymerase reaction, with the potential for fluorescent tag attachment at the same site or on the base through a cleavable linker for detection. The other set of NRTs possesses an azido-methyl moiety that blocks the 3’-hydroxyl group for detection by surface-enhanced Raman scattering. Each NRT has been tested in proof-of-principle SBS experiments. In addition, a set of 5’-phosphate tagged nucleotides has been developed and tested for nanopore electronic detection. A new set of NRTs, 2’-O-monophosphate 3’-hydroxyl nucleoside 5’-triphosphates (2’-P-NTPs), has been synthesized and its application for SBS has been investigated (chapter 2). These NRTs contain a phosphate at the 2’ position of the sugar ring, which serves as the removable capping group during the polymerase reaction. This moiety is positioned close to the 3’-hydroxyl group so as to block further nucleotide incorporation in the polymerase reaction. It nonetheless should allow improved binding to the polymerase relative to nucleotides with blocking groups at the 3’ position, since polymerases have strict requirements for the 3’-OH binding pocket. 2’-P-NTPs can be incorporated into the growing nucleic acid strand at temperatures ranging from 37 °C to 65 °C with Stoffel fragment modified 19 (SfM19) polymerase. After incorporation, the phosphate capping moiety on the 2’ position of the DNA extension product can be efficiently removed by enzymatic phosphatase reaction, permitting the next incorporation step.
Fluorescently labeled 2’-P-NTPs have the potential for sequencing DNA and direct sequencing of RNA-like templates. As an alternative to fluorescence-based SBS, a Raman spectroscopy detection method was developed using an azido moiety (N3) as both a 3’-OH blocking group and a label with an intense, narrow and unique Raman shift at 2125 cm⁻¹, where virtually all biological molecules are transparent (chapter 3). First, the four 3’-O-azidomethyl nucleotide reversible terminators (N3-dNTPs) were demonstrated to produce surface-enhanced Raman scattering (SERS) at 2125 cm⁻¹. These four nucleotide analogues were used as substrates for the polymerase to perform a complete 4-step SBS reaction. SERS was used to monitor the appearance of the azide-specific Raman peak at 2125 cm⁻¹ as a result of polymerase-mediated primer extension by a single N3-dNTP and the disappearance of this Raman peak upon cleavage of the azido label to permit the next nucleotide incorporation, thereby determining the DNA sequence. Due to the small size of the azido label, the N3-dNTPs are efficient substrates for the DNA polymerase. In the SBS cycles, the natural nucleotides are restored after each incorporation and cleavage, producing a growing DNA strand that bears no modifications and will not impede further polymerase reactions. Thus, with further improvements in SERS for this moiety, this approach has the potential to provide an attractive alternative to fluorescence-based SBS. Chapter 4 describes the design, synthesis and characterization of a new set of 5’-phosphate labeled nano-tag nucleotides (NTNs) for single molecule electronic SBS by nanopore detection. Four modified oligonucleotide polymers that produce distinct electrical current blockade signals in nanopores were designed as the nano-tags.
While most of the NTNs flow rapidly through the pore, those complementary to the nucleotide on the DNA template are captured by the polymerase and will have at least 10-fold longer dwell times in the pore, which affords enough time for measuring and discriminating the signals. Since the nano-tags are automatically removed during the polymerase extension reaction in real time, only natural DNA strands are produced. Thus this SBS method should decrease the overall sequencing time and increase the read length.
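The incorporate, detect, cleave cycle described for the azido-labelled terminators can be pictured with a toy simulation. This is only a sketch under a strong assumption that the abstract does not state: because all four N3-dNTPs carry the same label, the sketch offers them one at a time, so the nucleotide whose addition produces the peak identifies the base. The function name and the boolean "peak" model are illustrative and involve no real spectroscopy.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sbs_read(template: str, order: str = "ACGT") -> str:
    """Toy SBS readout: return the strand synthesized against `template`.

    Assumption (not from the abstract): each reversible terminator is offered
    separately, and a peak appears only if that nucleotide was incorporated.
    """
    read = []
    for template_base in template.upper():
        for nucleotide in order:
            # Incorporation happens only for the complementary nucleotide.
            incorporated = COMPLEMENT[template_base] == nucleotide
            # The azide label is on the strand only after incorporation,
            # so the modelled peak is present exactly in that case.
            peak_detected = incorporated
            if peak_detected:
                read.append(nucleotide)
                # Cleavage step: the blocking group is removed, the peak
                # disappears, and the strand is ready for the next cycle.
                break
    return "".join(read)


print(sbs_read("TACG"))
```

Each template position costs one full cycle, because the 3’-OH blocking group allows only a single incorporation before cleavage.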
32

Payne, Christina M. "Molecular dynamics simulation of a nanoscale device for fast sequencing of DNA." Diss., 2007. http://etd.library.vanderbilt.edu/ETD-db/available/etd-11282007-144800/.

33

"Bioinformatics analyses for next-generation sequencing of plasma DNA." 2012. http://library.cuhk.edu.hk/record=b5549423.

Abstract:
1997年,Dennis等證明胚胎DNA在孕婦母體中存在的事實開啟了產前無創診斷的大門。起初的應用包括性別鑒定和恒河猴血型系統的識別。隨著二代測序的出現和發展,對外周血游離DNA更加成熟的分析和應用應運而生。例如當孕婦懷孕十二周時, 應用二代測序技術在母體外周血DNA中預測胎兒21號染色體是否是三倍體, 其準確性達到98%。本論文的第一部分介紹如何應用母體外周血DNA構建胎兒的全基因組遺傳圖譜。這項研究極具挑戰,原因是孕後12周,胎兒對外周血DNA貢獻很小,大多數在10%左右,另外外周血中的胎兒DNA大多數短於200 bp。目前的演算法和程式都不適合於從母體外周血DNA中構建胎兒的遺傳圖譜。在這項研究中,根據母親和父親的基因型,用生物資訊學手段先構建胎兒可能有的遺傳圖譜,然後將母體外周血DNA的測序資訊比對到這張可能的遺傳圖譜上。如果在母親純和遺傳背景下,決定父親的特異遺傳片段,只要定性檢測父親的特異遺傳片段是否在母體外周血中存在。如果在母親雜合遺傳背景下,決定母親的遺傳特性,就要進行定量分析。我開發了單倍型相對劑量分析方案,統計學上判斷母親外周血中的兩條單倍型相對劑量水準,顯著增加的單倍型即為最大可能地遺傳給胎兒的單倍型。單倍型相對劑量分析方案可以加強測序資訊的分析效率,降低測序數據波動,比單個位點分析更加穩定,強壯。<br>隨著靶標富集測序出現,測序價格急劇下降。第一部分運用母親父親的多態位點基因型的組合加上測序的資訊可以計算出胎兒DNA在母體外周血中的濃度。但是該方法的局限是要利用母親父親的多態位點的基因型,而不能直接從測序的資訊中推測胎兒DNA在母體外周血中的濃度。本論文的第二部分,我開發了基於二項分佈的混合模型直接預測胎兒DNA在母體外周血中的濃度。當混合模型的似然值達到最大的時候,胎兒DNA在母體外周血中的濃度得到最優估算。由於靶標富集測序可以提供高倍覆蓋的測序資訊,從而有機會直接根據概率模型識別出母親是純和而且胎兒是雜合的有特異信息量的位點。<br>除了母體外周血DNA水準分析推動產前無創診斷外,表觀遺傳學的分析也不容忽視。 在本論文的第三部分,我開發了Methyl-Pipe軟體,專門用於全基因組的甲基化的分析。甲基化測序數據分析比一般的基因組測序分析更加複雜。由於重亞硫酸鹽測序文庫的沒有甲基化的胞嘧啶轉化成尿嘧啶,最後以胸腺嘧啶的形式存在PCR產物中, 但是對於甲基化的胞嘧啶則保持不變。 因此,為了實現將重亞硫酸鹽處理過的測序序列比對到參考基因組。首先,分別將Watson和Crick鏈的參考基因組中胞嘧啶轉化成全部轉化為胸腺嘧啶,同時也將測序序列中的胞嘧啶轉化成胸腺嘧啶。然後將轉化後的測序序列比對到參考基因組上。最後根據比對到基因組上的測序序列中的胞嘧啶和胸腺嘧啶的含量推到全基因組的甲基化水準和甲基化特定模式。Methyl-Pipe可以用於識別甲基化水平顯著性差異的基因組區別,因此它可以用於識別潛在的胎兒特異的甲基化位點用於產前無創診斷。<br>The presence of fetal DNA in the cell-free plasma of pregnant women was first described in 1997. The initial clinical applications of this phenomenon focused on the detection of paternally inherited traits such as sex and rhesus D blood group status. The development of massively parallel sequencing technologies has allowed more sophisticated analyses on circulating cell-free DNA in maternal plasma. For example, through the determination of the proportional representation of chromosome 21 sequences in maternal plasma, noninvasive prenatal diagnosis of fetal Down syndrome can be achieved with an accuracy of >98%. 
In the first part of my thesis, I have developed bioinformatics algorithms to perform genome-wide construction of the fetal genetic map from the massively parallel sequencing data of the maternal plasma DNA sample of a pregnant woman. The construction of the fetal genetic map through the maternal plasma sequencing data is very challenging because fetal DNA only constitutes approximately 10% of the maternal plasma DNA. Moreover, as the fetal DNA in maternal plasma exists as short fragments of less than 200 bp, existing bioinformatics techniques for genome construction are not applicable for this purpose. For the construction of the genome-wide fetal genetic map, I have used the genomes of the father and the mother as scaffolds and calculated the fractional fetal DNA concentration. First, I looked at the paternal-specific sequences in maternal plasma to determine which portions of the father’s genome had been passed on to the fetus. For the determination of the maternal inheritance, I have developed the Relative Haplotype Dosage (RHDO) approach. This method is based on the principle that the portion of the maternal genome inherited by the fetus would be present in slightly higher concentration in the maternal plasma. The use of haplotype information makes more efficient use of the sequencing data. Thus, the maternal inheritance can be determined with a much lower sequencing depth than by looking at individual loci in the genome. This algorithm makes it feasible to use genome-wide scanning to diagnose fetal genetic disorders prenatally in a noninvasive way.<br>With the emergence of targeted massively parallel sequencing, the sequencing cost per base is falling dramatically.
Even though the first part of the thesis has already developed a method to estimate the fractional fetal DNA concentration using parental genotype information, that method cannot deduce the fractional fetal DNA concentration directly from sequencing data without prior knowledge of the genotypes. In the second part of this thesis, I propose FetalQuant, a method based on a statistical mixture model, which uses maximum likelihood to estimate the fractional fetal DNA concentration directly from targeted massively parallel sequencing of maternal plasma DNA. This method improves on existing methods by obviating the need for genotype information without loss of accuracy. Furthermore, by using Bayes’ rule, this method can distinguish the informative SNPs where the mother is homozygous and the fetus is heterozygous, which could potentially be used to detect dominantly inherited disorders.<br>Besides genetic analysis at the DNA level, epigenetic markers are also valuable for the development of noninvasive diagnosis. In the third part of this thesis, I have also developed a bioinformatics algorithm to efficiently analyze genome-wide DNA methylation status based on the massively parallel sequencing of bisulfite-converted DNA. DNA methylation is one of the most important mechanisms for regulating gene expression. The study of DNA methylation for different genes is important for understanding different physiological and pathological processes. Currently, the most popular method for analyzing DNA methylation status is bisulfite sequencing. The principle of this method is that unmethylated cytosine residues are chemically converted to uracil on bisulfite treatment, whereas methylated cytosines remain unchanged. The converted uracil and unconverted cytosine can then be discriminated on sequencing.
With the emergence of massively parallel sequencing platforms, it is possible to perform this bisulfite sequencing analysis on a genome-wide scale. However, the bioinformatics analysis of the genome-wide bisulfite sequencing data is much more complicated than analyzing the data from individual loci. Thus, I have developed Methyl-Pipe, a bioinformatics program for analyzing the DNA methylation status of genome-wide methylation status of DNA samples based on massively parallel sequencing. In the first step of this algorithm, an in-silico converted reference genome is produced by converting all the cytosine residues to thymine residues. Then, the sequenced reads of bisulfite-converted DNA sequences are aligned to this modified reference sequence. Finally, post-processing of the alignments removes non-unique and low-quality mappings and characterizes the methylation pattern in genome-wide manner. Making use of this new program, potential fetal-specific hypomethylated regions which can be used as blood biomarkers can be identified in a genome-wide manner.<br>Detailed summary in vernacular field only.<br>Detailed summary in vernacular field only.<br>Detailed summary in vernacular field only.<br>Jiang, Peiyong.<br>Thesis (Ph.D.)--Chinese University of Hong Kong, 2012.<br>Includes bibliographical references (leaves 100-105).<br>Abstracts also in Chinese.<br>Chapter SECTION I : --- BACKGROUND --- p.1<br>Chapter CHAPTER 1: --- Circulating nucleic acids and Next-generation sequencing --- p.2<br>Chapter 1.1 --- Circulating nucleic acids --- p.2<br>Chapter 1.2 --- Next-generation sequencing --- p.3<br>Chapter 1.3 --- Bioinformatics analyses --- p.9<br>Chapter 1.4 --- Applications of the NGS --- p.11<br>Chapter 1.5 --- Aims of this thesis --- p.12<br>Chapter SECTION II : --- Mathematically decoding fetal genome in maternal plasma --- p.14<br>Chapter CHAPTER 2: --- Characterizing the maternal and fetal genome in plasma at single base resolution --- p.15<br>Chapter 2.1 --- 
Introduction --- p.15<br>Chapter 2.2 --- SNP categories and principle --- p.17<br>Chapter 2.3 --- Clinical cases and SNP genotyping --- p.20<br>Chapter 2.4 --- Sequencing depth and fractional fetal DNA concentration determination --- p.24<br>Chapter 2.5 --- Filtering of genotyping errors for maternal genotypes --- p.26<br>Chapter 2.6 --- Constructing fetal genetic map in maternal plasma --- p.27<br>Chapter 2.7 --- Sequencing error estimation --- p.36<br>Chapter 2.8 --- Paternal-inherited alleles --- p.38<br>Chapter 2.9 --- Maternally-derived alleles by RHDO analysis --- p.39<br>Chapter 2.10 --- Recombination breakpoint simulation and detection --- p.49<br>Chapter 2.11 --- Prenatal diagnosis of β-thalassaemia --- p.51<br>Chapter 2.12 --- Discussion --- p.53<br>Chapter SECTION III : --- Statistical model for fractional fetal DNA concentration estimation --- p.56<br>Chapter CHAPTER 3: --- FetalQuant: deducing the fractional fetal DNA concentration from massively parallel sequencing of maternal plasma DNA --- p.57<br>Chapter 3.1 --- Introduction --- p.57<br>Chapter 3.2 --- Methods --- p.60<br>Chapter 3.2.1 --- Maternal-fetal genotype combinations --- p.60<br>Chapter 3.2.2 --- Binomial mixture model and likelihood --- p.64<br>Chapter 3.2.3 --- Fractional fetal DNA concentration fitting --- p.66<br>Chapter 3.3 --- Results --- p.71<br>Chapter 3.3.1 --- Datasets --- p.71<br>Chapter 3.3.2 --- Evaluation of FetalQuant algorithm --- p.75<br>Chapter 3.3.3 --- Simulation --- p.78<br>Chapter 3.3.4 --- Sequencing depth and the number of SNPs required by FetalQuant --- p.81<br>Chapter 3.5 --- Discussion --- p.85<br>Chapter SECTION IV : --- NGS-based data analysis pipeline development --- p.88<br>Chapter CHAPTER 4: --- Methyl-Pipe: Methyl-Seq bioinformatics analysis pipeline --- p.89<br>Chapter 4.1 --- Introduction --- p.89<br>Chapter 4.2 --- Methods --- p.89<br>Chapter 4.2.1 --- Overview of Methyl-Pipe --- p.90<br>Chapter 4.3 --- Results and discussion --- p.96<br>Chapter SECTION 
V : --- CONCLUDING REMARKS --- p.97<br>Chapter CHAPTER 5: --- Conclusion and future perspectives --- p.98<br>Chapter 5.1 --- Conclusion --- p.98<br>Chapter 5.2 --- Future perspectives --- p.99<br>Reference --- p.100
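The three Methyl-Pipe steps described in the abstract above (in-silico C-to-T conversion of the reference, alignment of bisulfite reads to the converted reference, and post-processing that drops non-unique mappings) can be illustrated with a toy sketch. This is not the Methyl-Pipe code: the function names, the exact-match "aligner" and the example sequences are assumptions made for illustration, and only forward-strand reads are handled.

```python
def convert_c_to_t(seq: str) -> str:
    """In-silico bisulfite conversion: every cytosine becomes thymine."""
    return seq.upper().replace("C", "T")


def align_unique(read: str, converted_ref: str):
    """Toy aligner (assumption): exact match of the converted read against
    the converted reference. Returns the 0-based start position, or None if
    the read maps nowhere or to more than one place."""
    converted_read = convert_c_to_t(read)
    hits = []
    start = converted_ref.find(converted_read)
    while start != -1:
        hits.append(start)
        start = converted_ref.find(converted_read, start + 1)
    return hits[0] if len(hits) == 1 else None


def call_methylation(reads, reference):
    """For each reference cytosine covered by a uniquely mapped read, count
    reads showing C (methylated) and reads showing T (unmethylated)."""
    reference = reference.upper()
    converted_ref = convert_c_to_t(reference)
    counts = {}  # reference position -> [methylated, unmethylated]
    for read in reads:
        read = read.upper()
        start = align_unique(read, converted_ref)
        if start is None:
            continue  # unmapped or non-unique mapping is discarded
        for offset, base in enumerate(read):
            pos = start + offset
            if reference[pos] != "C":
                continue
            tally = counts.setdefault(pos, [0, 0])
            if base == "C":
                tally[0] += 1  # cytosine survived bisulfite treatment
            elif base == "T":
                tally[1] += 1  # cytosine was converted, read as thymine
    return counts


reference = "ACGTTCGAAC"
reads = ["ACGTTTGA", "ATGTTTGA"]
print(call_methylation(reads, reference))  # {1: [1, 1], 5: [0, 2]}
```

In this toy example the cytosine at position 1 is methylated in one of two reads and the cytosine at position 5 is unmethylated in both. A real pipeline would also need to handle the reverse strand, mismatches and base qualities.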
34

Erturk, Ece. "Photochemical and Enzymatic Method for DNA Methylation Profiling and Walking Approach for Increasing Read Length of DNA Sequencing by Synthesis." Thesis, 2018. https://doi.org/10.7916/D88W4WQ5.

Abstract:
The first half of this dissertation demonstrates development of a novel method for DNA methylation profiling based on site specific conversion of cytosine in CpG sites catalyzed by DNA methyltransferases. DNA methylation, a chemical process by which DNA bases are modified by methyl groups, is one of the key epigenetic mechanisms used by cells to regulate gene expression. It predominantly occurs at the 5-position of cytosines in CpG sites and is essential in normal development. Aberrant methylation is associated with many diseases including cancer. Bisulfite Genomic Sequencing (BGS), the gold standard in DNA methylation profiling, works on the principle of converting unmethylated cytosines to uracils using sodium bisulfite under strong basic conditions that cause extensive DNA damage limiting its applications. This dissertation focuses on the research and development of a new method for single cell whole-genome DNA methylation profiling that will convert the unmethylated cytosines in CpG sites to thymine analogs with the aid of DNA methyltransferase and photo-irradiation. Previously we synthesized a model deoxycytidine containing an optimized allyl chemical group at the 5-position and demonstrated that this molecule undergoes photo-conversion to its deoxythymidine analog (C to T conversion) with irradiation at 300 nm. The C to T conversion also proved feasible using synthetic DNA molecules. In this thesis, we demonstrate the conversion of a novel modified deoxycytidine molecule (PhAll-dC) using 350 nm photo-irradiation and a triplet photosensitizer (thioxanthone, TX) to avoid potential DNA damage. The new photoproduct was identified as the deoxythymidine analog of the starting molecule as assessed by IR, MS and NMR. An AdoMet analog containing the optimized chemical group was also synthesized and tested for enzymatic transfer to the C5-position of CpG cytosines using DNA methyltransferases. 
DNA methyltransferase M.SssI was engineered for more efficient enzymatic transfer. In the future, we will incorporate a triplet photosensitizer into the photoreactive moiety on AdoMet to increase energy transfer efficiency for photo-conversion of C to the T analog. Incorporating this into an overall method followed by amplification and sequencing should allow us to assess the methylation status of all CpGs in the genome in an efficient manner. The second half of this dissertation demonstrates development of a DNA sequencing by synthesis (SBS) method, The Sequence Walking Approach, using novel nucleotide reversible terminators (NRTs) together with natural nucleotides. Following the completion of The Human Genome Project, next generation DNA sequencing technologies emerged to overcome the limitations of Sanger Sequencing, the prominent DNA sequencing technology of the time. These technologies led to significant improvements in throughput, accuracy and economics of DNA sequencing. Today, fluorescence-based sequencing by synthesis methods dominate the high-throughput sequencing market. One of the major challenges facing fluorescence-based SBS methods is their read length limitation, which poses a major barrier for applications such as de novo genome assembly and resolving structurally complex regions of the genome. In this regard, we have developed a novel SBS method called ‘The Sequence Walking Approach’ to overcome current challenges in increasing the single pass read length of DNA sequencing. Our method utilizes three dNTPs together with one nucleotide reversible terminator in reactions called ‘walks’ that terminate at predetermined bases instead of after each incorporation. In this method, the primer extended via 4-color SBS is stripped off and replaced by the original primer for walking reactions. By reducing the accumulation of cleavage artifacts of incorporated NRTs in a single run, our method aims to reach longer read lengths. 
In this thesis, we have demonstrated a variation of The Sequence Walking Approach in which 4-color sequencing steps are interspersed with walking steps over a continuous length of DNA without stripping off extended primers and reannealing the original primer. The improvements introduced with this method will enable the use of fluorescence-based SBS in many applications such as detection of genomic variants and de novo genome assemblies while preserving low costs and high accuracy.
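The "walk" idea in the abstract above, where three natural dNTPs are supplied with a single reversible terminator so that extension stops only at a predetermined base, can be pictured with a toy simulation. This is a minimal sketch under stated assumptions, not the author's protocol: the function names and the template sequence are hypothetical, the polymerase is assumed to be error-free, and the cleavage chemistry that unblocks the 3' end between walks is not modeled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def walk(template: str, primer_len: int, terminator: str) -> str:
    """Simulate one walk and return the bases added to the primer.

    `template` is written 3'->5' so that index i pairs with the i-th base
    of the growing strand. Extension starts after `primer_len` bases and
    continues with natural nucleotides until the base that is supplied
    only as a reversible terminator is incorporated."""
    added = []
    for template_base in template[primer_len:]:
        incoming = COMPLEMENT[template_base]
        added.append(incoming)
        if incoming == terminator:
            break  # 3'-blocked nucleotide incorporated: the walk ends here
    return "".join(added)


def run_walks(template: str, primer_len: int, terminators) -> list:
    """Run consecutive walks (terminator assumed cleaved between walks) and
    return how many bases each walk advanced."""
    position = primer_len
    lengths = []
    for terminator in terminators:
        added = walk(template, position, terminator)
        lengths.append(len(added))
        position += len(added)
    return lengths


template = "TTGACCTGAAC"  # hypothetical template, written 3'->5'
print(walk(template, 2, "G"))                  # CTG
print(run_walks(template, 2, ["G", "G", "G"]))  # [3, 1, 5]
```

Each walk advances the primer by a variable number of bases while consuming only one terminator incorporation, which is why, in this simplified picture, fewer cleavage events accumulate per base of distance covered.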
35

Hsieh, Min-Kang. "Design and Synthesis of 3'-Oxygen-Modified Cleavable Nucleotide Reversible Terminators for Scarless DNA Sequencing by Synthesis." Thesis, 2018. https://doi.org/10.7916/D8XD2HST.

Abstract:
This dissertation describes the design and synthesis of novel cleavable fluorescent/anchor modified nucleotide reversible terminators using 3’-O-dithiomethyl (3’-O-DTM; 3’-O-SS) as a linker to directly or indirectly attach a fluorescent reporter to achieve scarless DNA Sequencing by Synthesis (SBS). To develop these nucleotide analogues for four-color SBS, two nucleotide analogues (3’-O-ROX-SS-dATP and 3’-O-BodipyFL-SS-dTTP) with directly attached fluorescent dyes and two other nucleotide analogues with directly attached biotin or trans-cyclooctene (TCO) as anchors (3’-O-Biotin-SS-dCTP and 3’-O-TCO-SS-dGTP) were successfully designed and synthesized. The nucleotide analogues with a PEG-elongated linker (3’-O-ROX-PEG4-SS-dATP, 3’-O-BodipyFL-PEG4-SS-dTTP, 3’-O-Biotin-PEG4-SS-dCTP and 3’-O-TCO-PEG4-SS-dGTP) were also designed and synthesized to optimize their incorporation efficiency in polymerase reactions. In our design, Biotin and TCO were demonstrated to be anchor moieties with high efficiency and specificity for binding with fluorescently labeled streptavidin and tetrazine, respectively. The DNA extension products produced by polymerase incorporation of 3’-O-Biotin-SS-dCTP and 3’-O-Biotin-PEG4-SS-dCTP were accurately identified by binding to Cy5-labeled streptavidin, while the DNA extension products produced by polymerase incorporation of 3’-O-TCO-SS-dGTP and 3’-O-TCO-PEG4-SS-dGTP were identified with equal precision by reaction with TAMRA-labeled tetrazine. A proof-of-concept experiment was conducted to demonstrate four-color scarless SBS using the novel nucleotide analogues described above.
36

Ren, Jianyi. "Design and Synthesis of Novel Cleavable Fluorescent Nucleotide Reversible Terminators Using Disulfide Linkers for DNA Sequencing by Synthesis." Thesis, 2018. https://doi.org/10.7916/D82R544K.

Abstract:
High-throughput DNA sequencing technology has advanced rapidly in the past few decades and is the driving force for personalized precision medicine. In this thesis, a set of novel disulfide linker-based nucleotide reversible terminators (NRTs) has been designed and synthesized for application in DNA sequencing by synthesis (SBS), which is the dominant sequencing platform. The design and synthesis principles are outlined as follows. Four nucleotides (A, C, G, T) are modified as NRTs for the DNA extension reaction catalyzed by polymerase by attaching a cleavable fluorophore to a specific location on the base and blocking the 3′-OH group with a small chemically reversible moiety so that the resulting molecules are still recognized by DNA polymerase as substrates. In these fluorescent NRTs, the fluorophores are attached through a disulfide (-SS-) cleavable linker to the 5-position of cytosine and thymine, and to the 7-position of deaza-adenine and deaza-guanine, and a small disulfide moiety is used to cap the 3′-OH group of the deoxyribose. The resulting fluorescent NRTs (3′-O-tert-butyldithiomethyl-dNTP-SS-fluorophores) are shown to be good substrates in DNA polymerase catalyzed reactions. The fluorophore and the 3′-O-tert-butyldithiomethyl group on a DNA extension product, which is generated by incorporating the 3′-O-tert-butyldithiomethyl-dNTP-SS-fluorophore in a polymerase reaction, are removed simultaneously and rapidly by treatment with a reducing agent, tris(3-hydroxypropyl)phosphine, in aqueous buffer solution. This one-step dual-cleavage reaction thus allows the reinitiation of the polymerase reaction and increases the SBS efficiency. DNA templates consisting of homopolymer regions were accurately sequenced by using this class of fluorescent nucleotide analogues on a DNA chip and a four-color fluorescent scanner. 
Compared with existing fluorescent NRTs, the unique disulfide linkers used to synthesize the NRTs described in this thesis are cleaved efficiently under DNA-compatible conditions and leave shorter scars on the DNA extension strand, further improving DNA SBS technology.
37

Fox, Samuel E. "Transcriptomic analysis using high-throughput sequencing and DNA microarrays." Thesis, 2011. http://hdl.handle.net/1957/23741.

Abstract:
Transcriptomics and gene expression profiling enable the elucidation of the genetic response of an organism to various environmental cues. Transcriptomics makes it possible to decipher differences in how two closely related organisms respond to the same environment and, conversely, to elucidate the genetic responses of the same organism to different environmental cues. Two major methods are used to study transcriptomes: high-throughput sequencing and microarray analysis. High-throughput sequencing technologies such as the Illumina platform are relatively new, and protocols must be developed for the analysis of transcriptomes (RNA sequencing, or RNA-seq). An RNA-seq protocol was developed and refined for the Illumina sequencing platform. This protocol was then utilized for the de novo sequencing of the steelhead salmon transcriptome. Hatchery steelhead exhibit a reduced fitness compared to wild steelhead that has been shown to be genetically based. Consequently, the steelhead transcriptome was assembled, annotated, and used to identify gene expression differences between hatchery and wild fish. We uncovered many differentially expressed genes involved in metabolic processes and growth and development. This work has created a better understanding of the genetic differences between hatchery and wild steelhead salmon. Brachypodium distachyon is a monocot grass important as a model for cereal crops and potential biofuel feedstocks. To better understand the genetic response of this plant to different environmental cues, a comprehensive assessment of the transcriptomic response was conducted under a variety of conditions including diurnal/circadian light/dark/temperature environments and different abiotic stress conditions. Using a whole-genome tiling DNA microarray, we found that the majority of transcripts in Brachypodium exhibit a daily rhythm in their abundance that is conserved between rice and Brachypodium. 
We also identified numerous cis-regulatory elements dictating these rhythmic expression patterns, as well as the genetic response to abiotic stresses such as salinity, drought, cold, heat, and high light. We uncovered a core set of genes that responds to all stresses, indicating a core stress response. A large number of transcription factors were uncovered as potential nodes for regulating the abiotic stress response in Brachypodium. Moreover, promoter elements that drive specific responses to discrete abiotic stresses were uncovered. Altogether, the transcriptome analyses in this work further our understanding of how particular organisms respond to environmental cues and better elucidate the relationship between genes and the environment.<br>Graduation date: 2012<br>Access restricted to the OSU Community at author's request from Oct. 5, 2011 - April 5, 2012.
38

"Plasma DNA sequencing: a tool for noninvasive prenatal diagnosis and research into circulating nucleic acids." Thesis, 2010. http://library.cuhk.edu.hk/record=b6075311.

Abstract:
In the first part of this thesis, two chromosome Y-specific genes (SRY and TSPY) were chosen as the molecular targets to investigate the characteristics of fetal-specific DNA fragments in maternal plasma. By employing touchdown ligation-mediated PCR coupled with cloning and sequencing, the end property and the fragment species of fetal DNA were studied.<br>Noninvasive prenatal detection of fetal chromosomal aneuploidies is a much sought-after goal in fetomaternal medicine. The discovery of fetal DNA in the plasma of pregnant women has offered new opportunities for this purpose. However, the fact that fetal DNA amounts to just a minor fraction of all DNA in maternal plasma makes it challenging for locus-specific DNA assays to detect the small increase in sequences derived from a trisomic chromosome. On the other hand, although the clinical applications of plasma DNA for prenatal diagnosis are expanding rapidly, the biological properties of circulating DNA in plasma remain unclear. Recently, next-generation sequencing technologies have transformed the landscape of biomedical research through the ultra-high-throughput sequence information generated in a single run. Massively parallel sequencing allows us to study plasma DNA at an unprecedented resolution and also precisely detect fetal chromosomal aneuploidies in a locus-independent way.<br>Our group has demonstrated the use of massively parallel sequencing to quantify maternal plasma DNA sequences for the noninvasive prenatal detection of fetal trisomy 21. In the second part of this thesis, the clinical utility of this new sequencing approach was extended to the prenatal detection of fetal trisomy 18 and 13. A region-selection method was developed to minimize the effects of GC content on the diagnostic sensitivity and precision for the prenatal diagnosis of trisomy 13. 
To facilitate the next-generation sequencing-based maternal plasma DNA analysis for clinical implementation, two measures, i.e., lowering the starting volume of maternal plasma and barcoding multiple maternal plasma samples, were investigated.<br>Taken together, the results presented in this thesis have demonstrated the clinical utility of massively parallel sequencing of maternal plasma DNA and have also provided us a better understanding of the biology of circulating DNA molecules.<br>The third part of this thesis focuses on the massively parallel paired-end sequencing of plasma DNA. By analyzing millions of sequenced DNA fragments, the biological properties of maternal plasma DNA were elucidated, such as the size distribution of fetal-derived and maternally-contributed DNA molecules and the potential effect of epigenetic modification on DNA fragmentation. Moreover, the plasma DNA from hematopoietic stem cell transplant patients was characterized by paired-end sequencing approach. These sequencing data not only confirmed the predominant hematopoietic origin of cell-free DNA but also revealed the size difference between hematologically-derived and other tissue-derived DNA molecules in plasma.<br>Zheng, Wenli.<br>Adviser: Lo Yu Ming Dennis.<br>Source: Dissertation Abstracts International, Volume: 73-03, Section: B, page: .<br>Thesis (Ph.D.)--Chinese University of Hong Kong, 2010.<br>Includes bibliographical references (leaves 261-275).<br>Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.<br>Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [201-] System requirements: Adobe Acrobat Reader. Available via World Wide Web.<br>Abstract also in Chinese.
39

Wang, Xuanting. "Identification of SARS-CoV-2 Polymerase and Exonuclease Inhibitors and Novel Methods for Single-Color Fluorescent DNA Sequencing by Synthesis." Thesis, 2021. https://doi.org/10.7916/d8-n6ah-nt76.

Abstract:
This dissertation is divided into two main sections describing major portions of my Ph.D. research: (1) development of two enzymatic assays for identifying inhibitors of SARS-CoV-2 RNA dependent RNA polymerase (RdRp) and the associated proofreading exonuclease complexes, two key enzymatic activities of SARS-CoV-2, the virus responsible for the COVID-19 pandemic and (2) the design and implementation of four novel single-color fluorescent DNA sequencing by synthesis (SBS) methods, including the synthesis of many of the key nucleotide analogues required for these studies. In response to the COVID-19 pandemic, the first part of my research is focused on the discovery of potential therapeutics for combating coronavirus infections. Chapter 1 describes the identification of several polymerase and exonuclease inhibitors for SARS-CoV-2 using novel mass spectrometry-based molecular assays. SARS-CoV-2 has an exonuclease complex, which removes nucleotide inhibitors such as Remdesivir that are incorporated into the viral RNA during replication, reducing the efficacy of these drugs for treating COVID-19. Combinations of inhibitors of both the viral RdRp and the exonuclease could overcome this deficiency. Chapter 1 reports the identification of hepatitis C virus NS5A inhibitors Pibrentasvir and Ombitasvir as SARS-CoV-2 exonuclease inhibitors. In the presence of identified exonuclease inhibitors, RNAs terminated with the active forms of the prodrugs like Sofosbuvir, Remdesivir and Favipiravir were largely protected from excision by the exonuclease, while in the absence of exonuclease inhibitors, there was rapid excision. Viral cell culture studies also demonstrate significant synergy using this combination strategy. This study supports the use of combination drugs that inhibit both the SARS-CoV-2 polymerase and exonuclease for effective COVID-19 treatment. Chapters 2-6 describe the single-color DNA SBS studies. 
Chapter 2 provides essential background on the structure of DNA, the DNA polymerase reaction, and several key DNA sequencing technologies, with an emphasis on the design of nucleotide analogues for the DNA SBS approach. Chapter 3 delineates a one-color fluorescent DNA SBS method based on a set of nucleotide reversible terminators (NRTs) comprising two orthogonal cleavable linkers, one fluorescent dye and one anchor. Chapter 4 describes a one-color hybrid DNA sequencing approach using a set of dideoxynucleotide analogues bearing two orthogonal cleavable linkers, one fluorophore and one anchor as well as a set of unlabeled NRTs. By introducing a pH responsive fluorophore into the design of nucleotide analogues, Chapter 5 demonstrates a novel type of single-color DNA SBS method using a set of NRTs comprising one pH-responsive fluorescent dye or one non-responsive fluorescent dye tethered with one cleavable linker. Chapter 6 presents another option for the single-color DNA sequencing technique using a set of deoxynucleotide analogues comprising the above pH responsive or non-responsive dyes tethered with a cleavable linker, along with a set of unlabeled NRTs. The one-color SBS approaches have the potential for higher sensitivity, miniaturization and cost effectiveness compared with four-color SBS methods. Finally, Chapter 7 summarizes the SARS-CoV-2 antiviral drug discovery and one-color sequencing techniques and discusses potential follow-up research on these projects.
40

"Development of bioinformatics algorithms for trisomy 13 and 18 detection by next generation sequencing of maternal plasma DNA." 2011. http://library.cuhk.edu.hk/record=b5894869.

Abstract:
Chen, Zhang.<br>Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.<br>Includes bibliographical references (p. 109-114).<br>Abstracts in English and Chinese.<br>ABSTRACT --- p.I<br>ABSTRACT (IN CHINESE) --- p.III<br>ACKNOWLEDGEMENTS --- p.IV<br>PUBLICATIONS --- p.VI<br>CONTRIBUTORS --- p.VII<br>TABLE OF CONTENTS --- p.VIII<br>LIST OF TABLES --- p.XIII<br>LIST OF FIGURES --- p.XIV<br>LIST OF ABBREVIATIONS --- p.XVI<br>Chapter SECTION I : --- BACKGROUND --- p.1<br>Chapter CHAPTER 1: --- PRENATAL DIAGNOSIS OF FETAL TRISOMY BY NEXT GENERATION SEQUENCING TECHNOLOGY --- p.2<br>Chapter 1.1 --- FETAL TRISOMY --- p.2<br>Chapter 1.2 --- CONVENTIONAL PRENATAL DIAGNOSIS OF FETAL TRISOMIES --- p.3<br>Chapter 1.3 --- CELL FREE FETAL DNA AND ITS APPLICATION IN PRENATAL DIAGNOSIS --- p.5<br>Chapter 1.4 --- NEXT GENERATION SEQUENCING TECHNOLOGY --- p.5<br>Chapter 1.5 --- SUBSTANTIAL BIAS IN THE NEXT GENERATION SEQUENCING PLATFORM --- p.9<br>Chapter 1.6 --- PRENATAL DIAGNOSIS OF TRISOMY BY NEXT GENERATION SEQUENCING --- p.10<br>Chapter 1.7 --- AIMS OF THIS THESIS --- p.11<br>Chapter SECTION II : --- MATERIALS AND METHODS --- p.13<br>Chapter CHAPTER 2: --- METHODS FOR NONINVASIVE PRENATAL DIAGNOSIS OF FETAL TRISOMY MATERNAL PLASMA DNA SEQUENCING --- p.14<br>Chapter 2.1 --- STUDY DESIGN AND PARTICIPANTS --- p.14<br>Chapter 2.1.1 --- Ethics Statement --- p.14<br>Chapter 2.1.2 --- Study design, setting and participants --- p.14<br>Chapter 2.2 --- MATERNAL PLASMA DNA SEQUENCING --- p.17<br>Chapter 2.3 --- SEQUENCING DATA ANALYSIS --- p.18<br>Chapter SECTION III : --- TRISOMY 13 AND 18 DETECTION BY THE T21 BIOINFORMATICS ANALYSIS PIPELINE --- p.21<br>Chapter CHAPTER 3: --- THE T21 BIOINFORMATICS ANALYSIS PIPELINE FOR TRISOMY 13 AND 18 DETECTION --- p.22<br>Chapter 3.1 --- INTRODUCTION --- p.22<br>Chapter 3.2 --- METHODS --- p.23<br>Chapter 3.2.1 --- Bioinformatics analysis pipeline for trisomy 13 and 18 detection --- p.23<br>Chapter 3.3 --- RESULTS --- p.23<br>Chapter 3.3.1 --- Performance of 
the T21 bioinformatics analysis pipeline for trisomy 13 and 18 detection --- p.23<br>Chapter 3.3.2 --- The precision of quantifying chr13 and chr18 --- p.27<br>Chapter 3.4 --- DISCUSSION --- p.29<br>Chapter SECTION IV : --- IMPROVING THE T21 BIOINFORMATICS ANALYSIS PIPELINE FOR TRISOMY 13 AND 18 DETECTION --- p.30<br>Chapter CHAPTER 4: --- IMPROVING THE ALIGNMENT --- p.31<br>Chapter 4.1 --- INTRODUCTION --- p.31<br>Chapter 4.2 --- METHODS --- p.32<br>Chapter 4.2.1 --- Allowing mismatches in the index sequences --- p.32<br>Chapter 4.2.2 --- Calculating the mappability of the human reference genome --- p.33<br>Chapter 4.2.3 --- Aligning reads to the non-repeat masked human reference genome --- p.34<br>Chapter 4.2.4 --- Trisomy 13 and 18 detection --- p.34<br>Chapter 4.3 --- RESULTS --- p.34<br>Chapter 4.3.1 --- Increasing read numbers by allowing mismatches in the index sequences --- p.34<br>Chapter 4.3.2 --- Increasing read numbers by using the non-masked reference genome for alignment
--- p.38<br>Chapter 4.3.3 --- Allowing mismatches in the read alignment --- p.42<br>Chapter 4.3.4 --- The performance of trisomy 13 and 18 detection after improving the alignment --- p.47<br>Chapter 4.4 --- DISCUSSION --- p.50<br>Chapter CHAPTER 5: --- REDUCING THE GC BIAS BY CORRECTION OF READ COUNTS --- p.53<br>Chapter 5.1 --- INTRODUCTION --- p.53<br>Chapter 5.2 --- METHODS --- p.54<br>Chapter 5.2.1 --- Read alignment --- p.54<br>Chapter 5.2.2 --- Calculating the correlation between GC content and read counts --- p.55<br>Chapter 5.2.3 --- GC correction in read counts --- p.55<br>Chapter 5.2.4 --- Trisomy 13 and 18 detection --- p.56<br>Chapter 5.3 --- RESULTS --- p.56<br>Chapter 5.3.1 --- GC bias in plasma DNA sequencing --- p.56<br>Chapter 5.3.2 --- Correcting the GC bias in read counts by linear regression --- p.59<br>Chapter 5.3.3 --- Correcting the GC bias in read counts by LOESS regression --- p.65<br>Chapter 5.3.4 --- Bin size --- p.72<br>Chapter 5.4 --- DISCUSSION --- p.75<br>Chapter CHAPTER 6: --- REDUCING THE GC BIAS BY MODIFYING THE GENOMIC REPRESENTATION CALCULATION --- p.77<br>Chapter 6.1 --- INTRODUCTION --- p.77<br>Chapter 6.2 --- METHODS --- p.78<br>Chapter 6.2.1 --- Modifying the genomic representation calculation --- p.78<br>Chapter 6.2.2 --- Trisomy 13 and 18 detection --- p.78<br>Chapter 6.2.3 --- Combining GC correction and modified genomic representation --- p.78<br>Chapter 6.3 --- RESULTS --- p.79<br>Chapter 6.3.1 --- Reducing the GC bias by modifying genomic representation calculation --- p.79<br>Chapter 6.3.2 --- Combining GC correction and modified genomic representation --- p.86<br>Chapter 6.4 --- DISCUSSION --- p.89<br>Chapter CHAPTER 7: --- IMPROVING THE STATISTICS FOR TRISOMY 13 AND 18 DETECTION --- p.91<br>Chapter 7.1 --- INTRODUCTION --- p.91<br>Chapter 7.2 --- METHODS --- p.92<br>Chapter 7.2.1 --- Comparing chr13 or chr18 with other chromosomes within the sample --- p.92<br>Chapter 7.2.2 --- Comparing chr13 or chr18 with the 
artificial chromosomes --- p.92<br>Chapter 7.3 --- RESULTS --- p.93<br>Chapter 7.3.1 --- Determining the trisomy 13 and 18 status by comparing chromosomes within the samples --- p.93<br>Chapter 7.3.2 --- Determining the trisomy 13 and 18 status by comparing chr13 or chr18 with artificial chromosomes --- p.97<br>Chapter 7.4 --- DISCUSSION --- p.100<br>Chapter SECTION V : --- CONCLUDING REMARKS --- p.102<br>Chapter CHAPTER 8: --- CONCLUSION AND FUTURE PERSPECTIVES --- p.103<br>Chapter 8.1 --- THE PERFORMANCE OF THE T21 BIOINFORMATICS ANALYSIS PIPELINE DEVELOPED FOR TRISOMY 21 DETECTION IS SUBOPTIMAL FOR TRISOMY 13 AND 18 DETECTION --- p.103<br>Chapter 8.2 --- THE ALIGNMENT COULD BE IMPROVED BY ALLOWING ONE MISMATCH IN THE INDEX AND USING THE NON-REPEAT MASKED HUMAN REFERENCE GENOME AS THE ALIGNMENT REFERENCE --- p.104<br>Chapter 8.3 --- THE PRECISION OF QUANTIFYING CHR13 AND CHR18 COULD BE IMPROVED BY THE GC CORRECTION OR THE MODIFIED GENOMIC REPRESENTATION --- p.104<br>Chapter 8.4 --- THE STATISTICS FOR TRISOMY 13 AND 18 DETECTION COULD BE IMPROVED BY COMPARING CHR13 OR CHR18 WITH ARTIFICIAL CHROMOSOMES WITHIN THE SAMPLE --- p.105<br>Chapter 8.5 --- PROSPECTS FOR FUTURE WORK --- p.106<br>REFERENCE --- p.109
41

Andere, Anne A. "De novo genome assembly of the blow fly Phormia regina (Diptera: Calliphoridae)." Thesis, 2014. http://hdl.handle.net/1805/5630.

Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)<br>Phormia regina (Meigen), commonly known as the black blow fly, is a dipteran that belongs to the family Calliphoridae. Calliphorids play an important role in various research fields including ecology, medical studies, and veterinary and forensic sciences. P. regina, a non-model organism, is one of the most common forensically relevant insects in North America and is typically used to assist in estimating postmortem intervals (PMI). To better understand the roles P. regina plays in these research fields, we reconstructed its genome using next generation sequencing technologies. The focus was on generating a reference genome through de novo assembly of high-throughput short read sequences. Following assembly, genetic markers were identified in the form of microsatellites and single nucleotide polymorphisms (SNPs) to aid in future population genetic surveys of P. regina. A total of 530 million 100 bp paired-end reads were obtained from five pooled male and female P. regina flies using the Illumina HiSeq2000 sequencing platform. A 524 Mbp draft genome with 11,037 predicted genes was assembled using both sexes. The draft reference genome assembled in this study provides an important resource for investigating the genetic diversity that exists between and among blow fly species, and it supports the study of the genetic basis of their adaptations, population structure and evolution. These genomic tools will facilitate genome-wide studies using modern genomic techniques and refine the understanding of the evolutionary processes underlying genomic evolution in blow flies and other insect species.
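As a rough illustration of the figures quoted in the abstract above, the mean sequencing depth implied by the read set can be estimated as (number of reads × read length) / genome size. The sketch below rests on two assumptions not stated in the abstract: that "530 million" counts individual 100 bp reads rather than read pairs, and that the 524 Mbp draft assembly length approximates the true genome size. The function name is hypothetical.

```python
def mean_coverage(read_count: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average number of times each genome position is covered by a read."""
    return read_count * read_length_bp / genome_size_bp


# Figures from the abstract; their interpretation is an assumption (see above).
depth = mean_coverage(530_000_000, 100, 524_000_000)
print(round(depth))  # 101
```

Under these assumptions the data set corresponds to roughly 100-fold coverage. If the 530 million figure counted read pairs instead, the estimate would double.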