Theses on the topic "High-throughput sequencing"
Browse the top 50 theses for your research on the topic "High-throughput sequencing".
Ric, Audrey Marie Amélie. « Caractérisation d'aptamères par électrophorèse capillaire couplée au séquençage haut-débit Illumina ». Thesis, Toulouse 3, 2017. http://www.theses.fr/2017TOU30388/document.
Aptamers are short single-stranded DNA or RNA oligomers that can interact strongly and specifically with certain targets when they fold into three-dimensional structures. The objective of this thesis was to complete existing studies on the use of capillary electrophoresis (CE) in order to develop a method for selecting aptamers by CE coupled to laser-induced fluorescence and Illumina high-throughput sequencing. In a first step, we developed a method of detection and separation by capillary electrophoresis coupled with dual UV-LEDIF detection of a DNA library interacting with a target, thrombin. This model has already been studied, and two aptamers have been published for it. We used aptamer T29 in our study because it has the best affinity. Capillary electrophoresis is a powerful analytical tool that improves the efficiency of aptamer selection and allows the interaction parameters to be determined. We were thus able to determine the affinity constant KD by CE-UV-LEDIF on the model target, thrombin. Moreover, we show how Tris buffer can degrade single-stranded DNA during capillary electrophoresis, and we propose as an alternative a dibasic sodium phosphate buffer, which avoids this degradation. Finally, we explain the difficulty of amplifying, by qPCR and PCR, an aptamer such as T29 that has a G-quadruplex structure. We showed that Illumina high-throughput sequencing allowed us to find a correlation between the number of sequenced molecules and the number of sequences obtained. Analysis of the sequences obtained shows a significant proportion (20%) of T29 sequences that do not correspond to the sequence of this aptamer. This shows that the PCR and high-throughput sequencing steps used to detect G-quadruplexes can introduce bias in the identification of these molecules.
Kopylova, Evguenia. « Algorithmes bio-informatiques pour l'analyse de données de séquençage à haut débit ». PhD thesis, Université des Sciences et Technologie de Lille - Lille I, 2013. http://tel.archives-ouvertes.fr/tel-00919185.
Kopylova, Evguenia. « Algorithmes bio-informatiques pour l’analyse de données de séquençage à haut débit ». Thesis, Lille 1, 2013. http://www.theses.fr/2013LIL10181/document.
Sequence alignment algorithms are at the heart of bioinformatic sequence analysis. In this thesis we focus on the alignment of millions of short sequences produced by Next-Generation Sequencing (NGS) technologies, in particular for the analysis of metagenomic and metatranscriptomic data, that is, the DNA and RNA directly extracted from an environment. Our algorithms address two major challenges. First, all NGS technologies today are susceptible to sequencing errors in the form of nucleotide substitutions, insertions and deletions. Second, metagenomic samples can contain hundreds of unknown organisms, and the standard approach to identifying them is to align against known, closely related species. To overcome these challenges we designed a new approximate matching technique based on the universal Levenshtein automaton, which quickly locates short regions of similarity (seeds) between two sequences while allowing 1 error of any type. Using seeds to detect possible high-scoring alignments is a widely used heuristic for rapid sequence alignment, although most existing software is optimized for high-similarity searches and applies exact seeds. Furthermore, we describe a new indexing data structure based on the Burst trie, which optimizes the search for approximate seeds. We demonstrate the efficacy of our method in two software tools, SortMeRNA and SortMeDNA. The former can quickly filter ribosomal RNA fragments from metatranscriptomic data, and the latter performs full alignment for genomic and metagenomic data.
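To make the idea of a seed "allowing 1 error of any type" concrete, here is a minimal Python sketch. It does not implement the universal Levenshtein automaton or the Burst trie index described in the abstract; it swaps in a plain linear scan with a direct one-edit comparison, and the function names `within_one_edit` and `approx_seed_hits` are hypothetical.

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one substitution, insertion or deletion."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if i == len(a):
        return True  # identical, or b has one extra trailing character
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]  # one extra character in b at position i


def approx_seed_hits(seed: str, reference: str) -> list[int]:
    """Start positions of reference windows within one edit of the seed.

    Naive scan (an assumption made for brevity); a real tool would use an index.
    """
    hits = set()
    for length in (len(seed) - 1, len(seed), len(seed) + 1):
        if length <= 0:
            continue
        for start in range(len(reference) - length + 1):
            if within_one_edit(seed, reference[start:start + length]):
                hits.add(start)
    return sorted(hits)
```

For example, `approx_seed_hits("ACCTA", "GGGACGTAGGG")` includes position 3, where the window `ACGTA` differs from the seed by a single substitution.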
Latypova, Martin Xénia. « Etude fonctionnelle de variants identifiés par séquençage haut-débit : apports et perspectives ». Thesis, Nantes, 2018. http://www.theses.fr/2018NANT1024.
Technological advances have opened unparalleled opportunities to detect genetic variation. Interpreting these data with in vivo disease modeling approaches provides helpful input to inform clinical practice in medical genetics. Neurodevelopmental disorders, including intellectual disability and autism spectrum disorder, pose a major challenge for genomic data interpretation and disease modeling, given the extensive locus heterogeneity, the high contribution of de novo variation to etiologic burden and the low accessibility of the cell types of interest. Using anatomical surrogate phenotypes in zebrafish, we established relevance to disease and tested the pathogenicity of point mutations in the novel neurodevelopmental disease-causing genes RORA and SIN3B. First, we categorized the RORA-associated disorder into two clinical subtypes, depending on whether cerebellar features are present in addition to intellectual disability and autism spectrum disorder. Nonsynonymous variant testing in zebrafish indicated diverse directions of variant effect, consistent with the clinical subtypes observed. Additionally, we supported the involvement of SIN3B in a syndromic intellectual disability by demonstrating that disruption of craniofacial architecture, a comorbid feature, was caused by sin3b targeting in zebrafish. This work highlights the utility of the zebrafish model organism as an informative experimental tool for variant interpretation in genomic medicine, especially in neurodevelopmental disorders.
Vervier, Kevin. « Méthodes d’apprentissage structuré pour la microbiologie : spectrométrie de masse et séquençage haut-débit ». Thesis, Paris, ENMP, 2015. http://www.theses.fr/2015ENMP0081/document.
The use of high-throughput technologies is changing scientific practices and the landscape of microbiology. On the one hand, mass spectrometry is already used in clinical microbiology laboratories. On the other hand, dramatic progress in sequencing technologies over the last ten years allows cheap and fast characterization of microbial diversity in complex clinical samples. Consequently, both technologies are being considered for future diagnostic solutions. This thesis aims to contribute to new in vitro diagnostics (IVD) systems based on high-throughput technologies, such as mass spectrometry or next-generation sequencing, and to their applications in microbiology. Because of the volume of data generated by these new technologies and the complexity of the measured parameters, we develop innovative and versatile statistical learning methods for applications in IVD and microbiology. Statistical learning is well suited to tasks relying on high-dimensional raw data that can hardly be used directly by medical experts, such as mass-spectrum classification or assigning a sequencing read to the right organism. Here, we propose to use additional known structures in order to improve the quality of the answer. For instance, we convert a sequencing read (raw data) into a vector in a nucleotide composition space and use it as a structured input for machine learning approaches. We also add prior information related to the hierarchical structure that organizes the reachable microorganisms (structured output).
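The conversion of a read into a vector in a nucleotide composition space can be illustrated with a k-mer frequency profile. The sketch below is only one plausible reading of that step, not the thesis's actual feature extraction; the function name `kmer_profile`, the default `k`, and the choice to skip k-mers containing ambiguous bases are all assumptions.

```python
from itertools import product


def kmer_profile(read: str, k: int = 2) -> list[float]:
    """Normalized k-mer frequencies of a read, in lexicographic k-mer order.

    The vector has 4**k entries; k-mers containing characters other than
    A, C, G, T are skipped (an assumption made for this sketch).
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    for i in range(len(read) - k + 1):
        j = index.get(read[i:i + k].upper())
        if j is not None:
            counts[j] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]
```

A fixed-length vector such as `kmer_profile(read, 4)` (256 entries) can then be passed to any standard classifier, whatever the read length.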
Haidar, Zahraa. « Identification de gènes responsables de maladies neurologiques héréditaires par séquençage à haut débit ». Thesis, Aix-Marseille, 2019. http://www.theses.fr/2019AIXM0662.
My work is a joint PhD between Saint Joseph University in Beirut (Lebanon) and Aix Marseille University in Marseille (France). My PhD project aims to identify genes responsible for rare neurological diseases by next-generation sequencing (NGS) in consanguineous Lebanese families. Neurological diseases are characterized by extensive phenotypic and genetic heterogeneity, and affect the structure and function of different regions of the central and peripheral nervous system. During my PhD work, I studied several of these families, trying to identify the molecular basis of the disease using NGS technologies. First, I performed the bioinformatics analysis of the exome and genome data, as well as Sanger sequencing to study the familial segregation of the candidate variants identified by NGS. For some diseases in which a new mutation or gene was identified, I carried out further functional studies in order to understand the underlying pathophysiological mechanisms.
Becmeur-Lefebvre, Mathilde. « Identification de nouveaux genes responsables d'anomalies du développement par séquençage haut débit d'exome ». Thesis, Bourgogne Franche-Comté, 2019. http://www.theses.fr/2019UBFCK080.
Multiple congenital anomalies (MCA) are often genetic conditions, with a risk of recurrence. The etiologic diagnosis of these conditions in fetuses is necessary to allow genetic counseling for future pregnancies. With current diagnostic tests (fetal autopsy, cytogenetic tests and targeted molecular tests), the diagnostic rate in MCA fetuses is about 30%, allowing genetic counseling in only one third of families. Exome sequencing (ES) has made it possible to identify the molecular basis of many new syndromes. We aimed to assess the contribution of a solo ES strategy, using an original multistep approach, to the identification of new developmental genes in fetuses presenting with MCA without etiological diagnosis after standard investigations. We performed solo ES in 95 MCA fetuses from 10 prenatal diagnostic centers in France. First, we focused on OMIM disease-related genes, with a first step using bioinformatic scores and public databases independently of phenotype, a second step using genotype-phenotype correlation and a third step of research analysis extended to the whole exome. Variant confirmation and parental segregation were done by Sanger sequencing. ES allowed the identification of causative variants in 23 fetuses (24%), variants of unknown significance (VUS) in 7 fetuses (7%) and variants in new candidate genes in 6 fetuses (6%). Among causative variants, most were from autosomal recessive inheritance (50%), 42% were sporadic and 4% were from autosomal dominant inheritance. The additional strategy identified 17/23 causative variants, including 2 new causative variants not identified by the classical approach because of an atypical or extreme fetal phenotype, and 2 new VUS. No new candidate gene was identified by this strategy. To conclude, solo ES with the classical and additional strategies shows low efficiency in identifying new genes implicated in embryonic development, but allows the clinical spectrum of well-known pediatric pathologies to be extended to the prenatal period. Trio ES or genome sequencing would now be interesting strategies to explore.
Mersch, Marjorie. « Analyse de la méthylation de l'ADN par séquençage haut-débit chez la Poule ». Thesis, Toulouse, INPT, 2018. http://www.theses.fr/2018INPT0107/document.
Anticipating the impact of environmental changes (in climate and feed) is a crucial issue for livestock production systems, including poultry. The influence of the environment on phenotypes is partly mediated by epigenetic phenomena, including DNA methylation, which may be involved in the regulation of gene expression. These mechanisms do not affect the DNA sequence but can be inherited through mitosis or meiosis. The interactions between epigenomes and gene expression are increasingly being studied in animal models and in plants. However, the mechanisms by which DNA methylation regulates genome expression are relatively unknown in birds. This thesis work is based on two experimental designs carried out in chicken, aiming to characterize the methylome by high-throughput sequencing. The methylation patterns across the genome, and their link with expression, were established first by whole-genome bisulfite sequencing (WGBS) in whole embryos, then by reduced representation bisulfite sequencing (RRBS) of the hypothalamus of adults. To date, no specific chicken RRBS study has been published. These two analyses were carried out by developing an optimized bioinformatics pipeline, available to the scientific community. Overall, the pattern of methylation in chicken is like that in mammals: CpG islands (regions rich in CG dinucleotides, which are often poorly methylated and are found mainly in the promoter regions of the genome) are generally poorly methylated in promoters in both WGBS and RRBS data. Embryo methylome analyses confirmed the absence of a dosage-compensation phenomenon on sex chromosomes, as well as the presence of a hypermethylated region on the Z chromosome. The analyses of RRBS data revealed an overall hypermethylation of CGs across the genome, suggesting a methylation response to environmental stress. From the analysis of WGBS data, we found that the level of methylation in promoters was negatively correlated with the expression of the associated gene. For the first time, allele-specific methylation was also detected between chicken lines, at a frequency comparable to that observed in humans. On the RRBS data, preliminary results on the methylome response to environmental stresses showed the complex nature of this relationship. The use of a low-energy diet appeared to lead to greater mobilization of body fat, while individuals under heat stress had a lighter body weight. Integrating these data with phenotypic measurements would make it possible to link methylation and environment. Beyond the fundamental aspect of this thesis, the method developed in this work could be applied to livestock systems to breed animals better adapted to a changing environment, by improving production traits.
Fermey, Pierre. « Identification de nouvelles bases moléculaires des cancers précoces par séquençage à haut débit ». Thesis, Normandie, 2017. http://www.theses.fr/2017NORMR110/document.
One of the greatest advances in oncology and genetics over the past 20 years has been the identification of hereditary forms of cancer and of cancer genes. Nevertheless, in a majority of patients suspected of having an inherited form of cancer, analyses of the genes known to be involved in Mendelian predispositions to cancer often remain negative. Today, thanks to the emergence of high-throughput sequencing (NGS), it is possible to sequence all exons of an individual (exome) or several hundred genes in a short period of time and at a reasonable cost. In this context, we applied several strategies based on these new tools in order to identify new molecular bases of early-onset cancers. First, we applied an intra-familial exome analysis strategy to an atypical family with chondrosarcomas of the chest, for which no molecular basis could be identified. Using this strategy, we were able to identify a truncating alteration of the EXT2 gene (NM_000401.3: c.237G>A; p.Trp79*). The documented loss-of-function alterations of this gene are implicated in a disease called multiple osteochondromas (OM), associated with benign lesions. Interestingly, these patients showed no clinical signs of OM, indicating a potential phenotypic extension of EXT2 mutations. In addition, this work allowed us to change the clinical management of this family. We then used a strategy of subtractive exome analysis of an affected child/healthy parents trio in order to identify de novo mutations in a young patient who developed a medulloblastoma of the cerebellum at 8 years of age, followed by a meningioma at 22 years of age. The analysis of the trio revealed a de novo mutation affecting a highly conserved amino acid of the HID-1 protein. HID-1 is specifically expressed in neuronal and secretory cells, and seems to function around the Golgi apparatus to regulate the sorting of newly formed vesicles. Our hypothesis is that a defect of the HID-1 protein linked to a mutation of the HID-1 gene could alter the secretory pathway, thereby contributing to the development of the tumor. This work, which is still ongoing, demonstrates the strength of the trio strategy for the rapid identification of de novo mutations and illustrates the difficulty of interpreting variants detected in genes not yet implicated in cancer. Then, thanks to the recruitment of the Laboratory of Molecular Genetics of the CHU of Rouen, we collected a cohort of 10 patients who developed an adrenocortical carcinoma (ACC) at a very early age and for whom no molecular basis could be identified. Despite subtractive and inter-familial exome analyses, we were unable to highlight new molecular bases for these cases of pediatric ACC. Finally, under the assumption that rare or private mutations in a limited number of genes involved in cancer could contribute to inherited forms of cancer, we undertook a project to sequence 201 genes involved in cancer in patients who developed tumors at a pediatric age. The first results of this project confirmed the robustness of this technique and suggested a phenotypic extension of the DICER1 mutation spectrum, as well as an oligogenic contribution of DNA repair genes in pediatric tumors. Soon, these results will be compiled in a database and will undergo statistical analysis, with the objective of identifying enrichment of rare variants in specific genes or biological pathways in these patients compared to control individuals.
Nguyen, Quang Nam. « Utilisation du séquençage à haut débit pour la sélection et l'ingénierie des aptamères ». Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS238.
SELEX is a directed molecular evolution technique which, after several rounds of selection, enriches a library of random nucleic acids in sequences able to bind a target specifically. Sequencing techniques are then used to identify these sequences, called "aptamers". Since the arrival of High-Throughput Sequencing (HTS), it has become possible to analyse millions of sequences. The aim of the thesis was to develop methods for processing and analysing HTS data, in order to facilitate the identification of the best aptamers within a SELEX. During this thesis, a semi-automatic binding test on adherent living cells was developed to measure the affinity of aptamers identified in SELEX directed against specific cells (cell-SELEX). Then, the evolution of sequence enrichment during a cell-SELEX was analysed by HTS. This analysis made it possible to design a new phylogenetic approach named FREDROGRAM. This evolutionary approach allowed us to identify variants of an aptamer family with a better affinity. Finally, HTS of two SELEX directed against proteins contributed to a better understanding of the impact of selection parameters on the library and to the identification of new aptamers, notably while reducing the number of SELEX rounds. To conclude, this work shows the importance of HTS in identifying the best aptamers and suggests new protocols for monitoring future SELEX experiments in a different manner.
Mambu, Mambueni Hendrick. « Identification de nouveaux variants rares associés à la spondyloarthrite par séquençage haut-débit ». Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASL064.
Spondyloarthritis (SpA) is a multifactorial disease with an estimated heritability of over 90%, mainly related to HLA-B27. All identified susceptibility factors, including HLA-B27, explain less than one third of the heritability. The involvement of rare variants could explain part of this missing heritability. The aim of this work was to identify rare variants associated with SpA via a combined family analysis and high-throughput sequencing approach. First, we sequenced a 1.4 Mb region significantly linked to SpA at 13q13 in 71 patients and 21 healthy controls from families with a high linkage score in this region. We identified a rare variant in the FREM2 gene present in 9 patients from a family with high linkage to the region and not found in other families or isolated cases of SpA. We then sequenced the exome of 48 patients from 20 multiplex families. Unfortunately, we did not observe any recurrent variants between families. We then focused on a second, previously known genetic linkage peak on chromosome 9. The study of the family most linked to this region, which includes 12 patients, led to the identification of several rare coding variants segregating with the disease. However, subsequent studies have shown equivalent allelic frequencies of these variants between cases and controls. Finally, whole genome sequencing of 413 patients from 76 multiplex families with 4 or more patients was performed. We identified 1203 rare, coding, non-synonymous variants shared by at least all affected family members. Genetic and functional validation analyses of these variants are underway, as is the analysis of non-coding variants. In conclusion, these different approaches suggest significant genetic heterogeneity in SpA and also highlight the difficulty of confirming the involvement of rare variants in complex diseases.
Liais, Etienne. « Identification et caractérisation de virus aviaires par des approches de séquençage à haut débit ». Thesis, Toulouse, INPT, 2014. http://www.theses.fr/2014INPT0134/document.
Infectious diseases are considered the most prevalent cause of mortality in humans as well as other animals worldwide. Since the advent of high-throughput sequencing technologies, diagnostic methods for these conditions have quickly changed and evolved, as the continuously decreasing cost of mass sequencing is making this tool available to larger numbers of people. As part of my thesis project, an Illumina®-based sequencing method (on a MiSeq machine) was designed for diagnostic purposes in clinical cases in poultry. We first used this method to identify the causative agent of the fulminating disease of guinea fowl. This validated the use of our protocol to identify the pathogenic infectious agent behind a specific condition. This newly identified Coronavirus was further analysed and characterised. In a second study we used an unbiased mass sequencing approach to describe the RNA virus populations present in the duck respiratory tract during clinical episodes (respiratory illness or egg drops). Data showed an important viral diversity and we identified some candidate pathogens. Taken together, these results validate the use of high-throughput sequencing as a powerful diagnostic tool.
Mirauta, Bogdan. « Etude du transcriptome à partir de données de comptages issues de séquençage haut débit ». Thesis, Paris 6, 2014. http://www.theses.fr/2014PA066424/document.
In this thesis we address the problem of reconstructing the transcription profile from RNA-Seq reads in cases where the reference genome is available but without making use of existing annotation. The first two chapters consist of an introduction to the biological context, high-throughput sequencing and the statistical methods that can be used in the analysis of series of counts. Then we present our contributions: the RNA-Seq read count model, the inference of the transcription profile using Particle Gibbs, and the reconstruction of DE regions. The analysis of several data sets showed that using Negative Binomial distributions to model the read count emission is not generally valid. We develop a mechanistic model which accounts for the randomness generated within all RNA-Seq protocol steps. Such a model is particularly important for assessing the credibility intervals associated with the transcription level and coverage changes. Next, we describe a State Space Model in which the read count profile plays the role of the observations and the transcription profile that of the latent variable. For the transition kernel we design a mixture model combining the possibility of making, between two adjacent positions, no move, a drift move or a shift move. We detail our approach for the reconstruction of the transcription profile and the estimation of parameters using the Particle Gibbs algorithm. In the fifth chapter we complete the results by presenting an approach for analysing differences in expression without making use of existing annotation. The proposed method first approximates these differences for each base pair and then aggregates continuous DE regions.
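The mixture transition kernel described above (no move, drift move or shift move between adjacent positions) can be sketched as a simple sampler. Everything numeric here is an assumption for illustration: the abstract does not give the mixture weights or the drift and shift distributions, so the log-normal drift, the exponential shift and the parameter names (`p_drift`, `p_shift`, `drift_sd`, `shift_mean`) are hypothetical.

```python
import math
import random


def transition(level, rng, p_drift=0.1, p_shift=0.01, drift_sd=0.1, shift_mean=10.0):
    """Sample the transcription level at the next position.

    Mixture of three moves (weights and distributions are illustrative only):
    - shift: jump to a fresh level drawn from an exponential distribution,
    - drift: small multiplicative change (log-normal perturbation),
    - no move: keep the current level.
    """
    u = rng.random()
    if u < p_shift:
        return rng.expovariate(1.0 / shift_mean)
    if u < p_shift + p_drift:
        return level * math.exp(rng.gauss(0.0, drift_sd))
    return level


def simulate_profile(n_positions, start_level, seed=0, **kwargs):
    """Simulate a latent transcription profile along n_positions bases."""
    rng = random.Random(seed)
    profile = [start_level]
    for _ in range(n_positions - 1):
        profile.append(transition(profile[-1], rng, **kwargs))
    return profile
```

Such a kernel produces piecewise-constant profiles with occasional slow changes and abrupt jumps, which is the qualitative behaviour one would expect at transcript boundaries; the Particle Gibbs inference itself is not shown.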
Mirauta, Bogdan. « Etude du transcriptome à partir de données de comptages issues de séquençage haut débit ». Electronic Thesis or Diss., Paris 6, 2014. http://www.theses.fr/2014PA066424.
Da, Silva Ophélie. « Structure de l'écosystème planctonique : apport des données à haut débit de séquençage et d'imagerie ». Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS183.
Planktonic organisms are key actors in oceanic ecosystems, which support trophic networks and play a major role in biogeochemical cycles and climate regulation. While the spatio-temporal distribution of planktonic diversity can be investigated at several levels, from the gene to the ecosystem, identifying the underlying mechanisms is challenging. Indeed, the structure of diversity results from different evolutionary and ecological processes that can act simultaneously. Since the beginning of the 21st century, the oceanic environment has been increasingly monitored. Numerous observation platforms have been deployed, leading to the acquisition of a large amount of data for multiple environmental characteristics. At the same time, technologies for studying living organisms have been developed. Thus, an unprecedented sampling of planktonic organisms has taken place. In particular, high-throughput sequencing and imaging data provide molecular, taxonomic and functional information at several biological levels. The objective of this thesis was to explore the structure of planktonic ecosystems using high-throughput sequencing and imaging data. Coupling with environmental data could contribute to a better understanding of the spatial distribution of planktonic diversity, from species to communities. In the first part, the genetic diversity of protists was studied at the species level. The hypothesis was that metagenomics could provide access to the poorly characterized spatial organization of the intraspecific protist genetic diversity, as well as to the mechanisms underlying it. In a second part, the link between genetic diversity and functional diversity was explored. Transparency was targeted. This functional trait is little explored at the community level and its molecular basis is poorly identified. A data-driven approach allowed this trait to emerge from imaging data, leading to the exploration of its biogeography and molecular basis. In the last part, the high potential of complementarity between sequencing, imaging and environmental datasets was explored, in order to highlight the multi-scale structure of the planktonic ecosystem and to identify its global structure. Finally, all the results were discussed to highlight the contributions that these data can provide to the understanding of planktonic ecosystems, as well as the limitations they can face.
Gicquel, Evelyne. « Etude par approches globales de la sélectivité d’atteinte dans les dystrophies des ceintures ». Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLE041.
Limb Girdle Muscular Dystrophies are a group of genetic diseases affecting the muscles of the body with different degrees of severity. The factors behind these differences in impairment have not been identified. The objective of this thesis work is to identify the molecular differences that exist under normal conditions between muscles known to differ in impairment when genetic deficiencies associated with Limb Girdle Muscular Dystrophy are present. We based our work on the assumption that the differences in impairment between muscles are caused by mechanisms that modify the expression of protective or sensitizing genes in the muscle. Therefore, we explored these mechanisms through a global approach. Analyses by high-throughput sequencing in primate muscles allowed the identification of several genes and regulatory elements whose expression differs between the sensitive and the resistant muscles. These genes interact in a common network of interactions, which could be targeted for therapeutic purposes. Some of these differences were shown to be conserved in the mouse. We then explored the mechanisms by which the identified regulatory elements may be involved in the selectivity of impairment. The results of this thesis provide a deeper understanding of the pathophysiological mechanisms of Limb Girdle Muscular Dystrophies. They will also pave the way for the development of new treatments for this group of diseases.
Hurel, Julie. « Détection d'organismes génétiquement modifiés (OGM) inconnus par analyse statistique de données de séquençage haut débit ». Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1B027.
The European Union has adopted a very restrictive policy towards the dissemination and use of genetically modified organisms (GMOs), whose use in food is not well accepted by consumers. Although a maximum threshold exists for a food to be labelled "GM-free", only known GMOs are easily detectable. A GMO consists mainly of a host genome and a sequence inserted by a non-natural process that confers a particular property on the organism, such as resistance to certain diseases. In recent years, GMOs have been produced whose inserted sequence is not known and which are not detectable by the approaches used until now (PCR-type). Hence the need for a tool to detect unknown GMOs, the subject of this thesis, based on recent advances in high-throughput sequencing. Statistically, each organism has a specific frequency of nucleotide use in its genome. Any introduction of foreign genetic material will locally alter the nucleotide use frequencies in that region, resulting in frequencies that differ from those of the host organism. Based on this observation, a tool for detecting unknown GMOs was developed for bacterial sequencing data, covering cases where the GMO results from the insertion of a foreign gene, or from the truncation or fusion of a gene that may belong to the host genome. The tool was tested on 4 GMO bacterial genomes, 7 wild bacterial genomes and 42 synthetic bacterial genomes. The results demonstrate the effectiveness of the method, with only one false-positive gene and more than 99% of the genes of GMO inserts identified.
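The principle that foreign material locally alters nucleotide use frequencies can be illustrated with a sliding-window comparison against the genome-wide composition. This is a deliberately simplified sketch, not the thesis's statistical method: it uses single-nucleotide frequencies and an L1 distance, and the names `composition` and `window_scores`, as well as the window and step sizes, are assumptions.

```python
from collections import Counter


def composition(seq):
    """Relative frequencies of A, C, G and T in a sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: (counts[b] / total if total else 0.0) for b in "ACGT"}


def window_scores(genome, window=100, step=50):
    """Score each window by its L1 distance to the genome-wide composition.

    Returns a list of (start, distance) pairs; high distances mark regions
    whose nucleotide usage departs from the host background.
    """
    background = composition(genome)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        local = composition(genome[start:start + window])
        distance = sum(abs(local[b] - background[b]) for b in "ACGT")
        scores.append((start, distance))
    return scores
```

On a toy genome such as `"AT" * 200 + "G" * 100 + "AT" * 200`, the highest-scoring window is the one starting at position 400, which covers the inserted G-rich block. A realistic detector would more likely rely on longer k-mers or codon usage and on a proper significance test.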
Lacoste, Deixonne Caroline. « Apport du séquençage haut débit dans l'amélioration de la prise en charge des maladies monogéniques ». Thesis, Aix-Marseille, 2016. http://www.theses.fr/2016AIXM5062/document.
The spread of Next Generation Sequencing (NGS) technologies is changing the indications for molecular diagnostics and prompting laboratories to rethink their diagnostic strategies, which until now were based on routine Sanger sequencing. Several high-throughput approaches are available, from the sequencing of a gene panel to a whole exome or even a whole genome. In all cases, a tremendous amount of data is generated, which has to be filtered, interpreted and analyzed with powerful bioinformatics tools. In part 1, existing strategies and the difficulties and challenges of high-throughput sequencing for the molecular diagnosis of genetic diseases are discussed. In part 2, the setup and technical validation of this diagnostic approach in the Molecular Genetics Laboratory of the Timone Hospital in Marseille are presented and illustrated by 3 examples of complex diagnoses solved thanks to NGS. NGS promises to significantly shorten the time needed for analysis and reporting of results, and to expand the number of genes tested. It also promises to increase the proportion of positive diagnoses. Finally, NGS can identify new variants and new genes involved in human pathology, and will thus globally improve patient clinical care.
Brinda, Karel. « Nouvelles techniques informatiques pour la localisation et la classification de données de séquençage haut débit ». Thesis, Paris Est, 2016. http://www.theses.fr/2016PESC1027/document.
Since their emergence around 2006, Next-Generation Sequencing technologies have been revolutionizing biological and medical research. Quickly obtaining an extensive amount of short or long reads from almost any biological sample enables detecting genomic variants, revealing the composition of species in a metagenome, deciphering cancer biology, decoding the evolution of living or extinct species, or understanding human migration patterns and human history in general. The pace at which the throughput of sequencing technologies is increasing surpasses the growth of storage and computer capacities, which continues to create new computational challenges in NGS data processing. In this thesis, we present novel computational techniques for the problems of read mapping and taxonomic classification. With more than a hundred published mappers, read mapping might be considered fully solved. However, the vast majority of mappers follow the same paradigm, and little attention has been paid to non-standard mapping approaches. Here, we propound so-called dynamic mapping, which we show to significantly improve the resulting alignments compared to traditional mapping approaches. Dynamic mapping is based on exploiting the information from previously computed alignments, helping to improve the mapping of subsequent reads. We provide the first comprehensive overview of this method and demonstrate its qualities using Dynamic Mapping Simulator, a pipeline that compares various dynamic mapping scenarios to static mapping and iterative referencing. An important component of a dynamic mapper is an online consensus caller, i.e., a program collecting alignment statistics and guiding updates of the reference in an online fashion. We provide OCOCO, the first online consensus caller, which implements smart statistics for individual genomic positions using compact bit counters. 
Beyond its application to dynamic mapping, OCOCO can be employed as an online SNP caller in various analysis pipelines, enabling SNPs to be called from a stream without saving the alignments on disk. Metagenomic classification of NGS reads is another major problem studied in the thesis. Given a database of thousands of reference genomes placed on a taxonomic tree, the task is to rapidly assign a huge number of NGS reads to tree nodes, and possibly to estimate the relative abundance of the species involved. In this thesis, we propose improved computational techniques for this task. In a series of experiments, we show that spaced seeds consistently improve the classification accuracy. We provide Seed-Kraken, a spaced seed extension of Kraken, the most popular classifier at present. Furthermore, we suggest a new indexing strategy based on a BWT-index, obtaining a much smaller and more informative index compared to Kraken. We provide a modified version of BWA that improves the BWT-index for quick k-mer look-up.
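The idea of an online consensus caller that keeps small per-position counters and updates the reference as alignments stream in can be sketched as follows. This is an illustration only and makes no claim about how OCOCO is implemented: the class name, the halving rule on counter saturation, the strict-majority update and the `min_coverage` parameter are all assumptions, and only ungapped alignments are handled.

```python
class OnlineConsensus:
    """Streaming per-position nucleotide counters with a small fixed capacity.

    Hypothetical sketch: counter width, halving on saturation and the
    majority rule are illustrative assumptions.
    """

    BASES = "ACGT"

    def __init__(self, reference, bits=4, min_coverage=2):
        self.consensus = list(reference)
        self.max_count = (1 << bits) - 1  # largest value a counter may hold
        self.min_coverage = min_coverage
        self.counts = [[0, 0, 0, 0] for _ in reference]

    def add_alignment(self, pos, read):
        """Record an ungapped alignment of read at 0-based reference position pos."""
        for offset, base in enumerate(read):
            i = pos + offset
            if not 0 <= i < len(self.consensus) or base not in self.BASES:
                continue
            c = self.counts[i]
            b = self.BASES.index(base)
            if c[b] == self.max_count:
                # Halve all counters so their ratios are roughly kept
                # while the saturated counter regains headroom.
                c[:] = [n // 2 for n in c]
            c[b] += 1
            self._update(i)

    def _update(self, i):
        """Switch the consensus base when one base holds a strict majority."""
        c = self.counts[i]
        best = max(range(4), key=c.__getitem__)
        if sum(c) >= self.min_coverage and 2 * c[best] > sum(c):
            self.consensus[i] = self.BASES[best]

    def sequence(self):
        return "".join(self.consensus)


caller = OnlineConsensus("ACGTACGT", bits=2)
caller.add_alignment(0, "ACCT")
after_one_read = caller.sequence()
caller.add_alignment(0, "ACCT")
after_two_reads = caller.sequence()
```

Because the consensus is updated after every alignment, no alignment needs to be stored, which is the property the abstract highlights for calling SNPs from a stream.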
Bisseux, Maxime. « Dynamique de la circulation des Entérovirus de l'homme à l'environnement : Etude par séquençage haut débit ». Thesis, Université Clermont Auvergne (2017-2020), 2017. http://www.theses.fr/2017CLFAS013.
Enteroviruses (EVs) are picornaviruses (non-enveloped, positive-sense RNA viruses) characterized by a large genetic and antigenic diversity (116 types classified within 4 taxonomic species, EV-A to D) and rapid evolution. Human infections are frequent, highly contagious through stools, and occur as outbreaks. The infections are mainly asymptomatic or benign, but severe or fatal cases can be reported in young children. Poliomyelitis is the model EV infection. Combined with clinical and virological surveillance, mass vaccination is closer than ever to achieving the goal of the WHO Global Polio Eradication Initiative. However, the detection of wild-type polioviruses in polio-free countries and the recent worldwide emergence of non-polio enteroviruses (EV-A71, EV-D68) associated with severe clinical manifestations underscore the importance of surveilling EV circulation in the general population. The aim of the PhD thesis was the detection and identification of EV strains in wastewater treated in the sewage treatment plant at Clermont-Ferrand (France). The viral data were compared with those reported through clinical surveillance to obtain a comprehensive picture of viral circulation in the local population. A method was developed to concentrate viruses from raw and treated wastewater, and molecular assays were used to detect EVs and 6 other human enteric viruses. Viral genomes were detected in all samples from October 2014 to October 2015, with a median of 6 and 4 different viruses in raw and treated wastewater respectively. Phylogenetic analysis of viral sequences (EV, hepatitis A and E viruses) determined in wastewater and reported in patients during the sampling period showed the efficiency of the method for surveilling enteric viruses in the community. The EV diversity in raw wastewater was analyzed by sequencing of amplicons with the Illumina high-throughput technology (metabarcoding). 
The analysis revealed a large viral diversity and the silent circulation of 25 types not detected from hospital data (in particular 9 EV-C types, including sequences of vaccine poliovirus 1). The phylogenetic analyses of intra-typic variants showed different epidemic patterns in the predominant EV types circulating over the study period. The data demonstrate the feasibility and sensitivity of the strategy developed for the detection and characterization of EVs in wastewater and open the prospect of implementing environmental surveillance of non-polio EV infections for epidemiological studies, epidemic prevention, and health alerts. Combining the surveillance of enteric viruses in the environment and in the clinical setting allows a better understanding of their prevalence. This global approach to virus circulation and ecological health represents an important investment for laboratories, which will require integration in national and international collaboration networks beyond the scope of enterovirus surveillance.
Caporossi, Alban. « Apport du séquençage haut débit dans l'analyse bioinformatique du génome du virus de l'hépatite C ». Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAS021/document.
High-throughput sequencing has been used in this work to reconstruct, with adapted methods, the whole genome of the hepatitis C virus (HCV), particularly for accurately typing the virus. In one study, we thus managed to detect a recombinant form of HCV circulating within a patient. In another study, we typed several HCV strains of different genotypes and detected their resistance mutations. Finally, a last study based on this approach enabled us to uncover an HCV strain belonging to a new subtype. High-throughput sequencing has also been used in this work to detect multiple infections and analyze viral evolution, with targeted HCV genes and non-specific methods, for 2 HCV patients under treatment. This retrospective study enabled us to define the composition of each temporal sample, assess their nucleotide diversity, investigate viral population genetic structure and temporal evolution, and date secondary infections. The results of this analysis support the hypothesis that treatment resistance arises through selective sweeps.
Chaara, Wahiba. « Caractérisation de la diversité du répertoire TCR par modélisation de données de séquençage haut-débit ». Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066410/document.
T lymphocytes (LT) are key players in the immune system, a complex and dynamic system evolving over the organism's life. The concept of "lymphocyte repertoire" designates a collection of lymphocytes sharing the same phenotype, the same function or any other criterion. Each LT is characterized by a unique membrane receptor, called TCR, allowing it to specifically recognize antigens. TCRs are characterized by variable regions produced by a series of somatic rearrangements that occur during thymic differentiation; these regions underlie the diversity of LT recognition. The "TCR repertoire" approach focuses the clonal characterisation of LT populations on the diversity of the TCRs expressed at the scale of the population. High-throughput sequencing of TCR chains (RepSeq) describes this diversity with unprecedented precision. However, this approach requires adapted tools to decipher the diversity of the analysed TCR repertoire in a relevant way. My thesis aimed to: i) deepen the concept of diversity of the lymphocyte repertoire, ii) develop an appropriate methodology to optimally exploit RepSeq data while taking into account the limits of this technology, and iii) develop a tool providing immunologists with a thorough characterisation of their TCR repertoires of interest.
Chaara, Wahiba. « Caractérisation de la diversité du répertoire TCR par modélisation de données de séquençage haut-débit ». Electronic Thesis or Diss., Paris 6, 2016. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2016PA066410.pdf.
Karaouzene, Thomas. « Bioinformatique et infertilité : analyse des données de séquençage haut-débit et caractérisation moléculaire du gène DPY19L2 ». Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAS041/document.
In the last decade, the investigation of genetic diseases has been revolutionized by the rise of high-throughput sequencing (HTS). Thanks to these new techniques, it is now possible to analyze all the coding sequences of an individual (exome sequencing) or even the sequences of their entire genome or transcriptome. Understanding a pathology and the genes associated with it now depends on our ability to identify causal variants within a plethora of technical artifacts and benign variants. HTS is expected to be particularly useful in the field of infertility, as this pathology is expected to be highly genetically heterogeneous and only a few genes have so far been associated with it. My thesis focuses on male infertility and is divided into two main parts: HTS data analysis of infertile men and the molecular characterization of a specific phenotype, globozoospermia. Several thousand distinct variants can be identified in a single exome, so effective bioinformatics is essential to obtain a short and actionable list of variants. For this purpose I developed an HTS data analysis pipeline that successively performs all bioinformatics analysis steps: 1) read mapping against a reference genome, 2) genotype calling, 3) variant annotation and 4) filtering of the variants considered non-relevant for the analysis. Performing all these independent steps within a single pipeline is a good way to calibrate them and therefore to reduce the number of erroneous calls. This pipeline has been used in five studies and allowed the identification of variants impacting candidate genes that may explain the patients' infertility phenotype. 
All these variants have been experimentally validated using Sanger sequencing. I also took part in the genetic and molecular investigations which demonstrated that the absence of the DPY19L2 gene induces male infertility due to globozoospermia, the presence in the ejaculate of only round-headed and acrosomeless spermatozoa. Most patients with globozoospermia have a homozygous deletion of the whole gene. I contributed to the characterization of the mechanisms responsible for this recurrent deletion; then, using Dpy19l2 knockout (KO) mice, I carried out a comparative study of the testicular transcriptome of wild-type and Dpy19l2 -/- KO mice. This study highlighted a dysregulation of 76 genes in KO mice. Among them, 23 are involved in nucleic acid and protein binding, which may explain the acrosome anchoring defects observed in the sperm of globozoospermic patients. My work allowed a better understanding of globozoospermia and the development of an HTS data analysis pipeline. The latter allowed the identification of more than 15 human gametogenesis genes involved in different infertility phenotypes.
Glouzon, Jean-Pierre. « Étude de la dynamique des populations du viroïde de la mosaïque latente du pêcher par séquençage à haut débit et segmentation ». Mémoire, Université de Sherbrooke, 2012. http://hdl.handle.net/11143/6582.
Mbareche, Hamza. « Molecular tools for the study of fungal aerosols ». Doctoral thesis, Université Laval, 2019. http://hdl.handle.net/20.500.11794/35697.
Since the rapid development of high-throughput sequencing methods in molecular ecology, fungi have been the underdogs of the microbial world, especially in bioaerosol studies. In particular, studies describing fungal exposure in different occupational environments have been limited by traditional culture methods that underestimate the broad spectrum of fungi present in the air. There are potential risks in the human inhalation of fungal spores in an occupational scenario where the quantity and diversity of fungi are high. Although some health problems are already known to be associated with fungal exposure in certain work environments, the risk may be underestimated due to the methods used. Applying high-throughput sequencing to soil samples has helped explain the role of fungi in ecosystems. However, the literature is not decisive about which genomic region to target for the enrichment and sequencing of fungi. The present thesis deals with the challenge of determining which of the two universally used regions, ITS1 and ITS2, is best suited for the study of fungal aerosols. In tandem with this challenge came another: addressing the loss of fungal cells during the centrifugation of liquid impaction air samples for purposes of concentration. This thesis describes a new filtration-based method to circumvent such losses during centrifugation. These two challenges represent the first part of the thesis, which focuses on methodology development. In synopsis, the treatment of air samples prior to DNA extraction is considered, along with the identification of the best region to target in amplicon-based high-throughput sequencing. In the second part of the thesis, the focus turns to the application of the developed methodology to characterize fungal exposure in three different work environments: compost, biomethanization, and dairy farms. All three are of special interest due to potentially high fungal exposure. 
Results show that ITS1 outperformed ITS2 in disclosing higher levels of fungal diversity in aerosol samples. Because the taxonomic profiles disclosed by the two regions are complementary, the author suggests using both regions to cover the greatest possible number of taxa when taxonomy is the main interest of the study. However, ITS1 should be the first choice in other studies, mainly because of the high diversity it reveals and its concordance with results obtained via shotgun metagenomic profiling. In addition, the new filtration-based approach proposed in this work might be the best alternative available for compensating for the loss of propagules in centrifugation done prior to DNA extraction. Taken together, these methods allowed an in-depth characterization of fungal exposure in occupational environments.
Muller, Etienne. « Les défis du séquençage à haut débit dans l'exploration génétique des cancers du sein et de l'ovaire ». Thesis, Normandie, 2017. http://www.theses.fr/2017NORMR100/document.
Breast and ovarian cancers appear in 5 to 10% of cases in a context of genetic predisposition, of which only a small proportion is explained by the presence of a pathogenic variant in the BRCA1, BRCA2 and PALB2 genes. High-throughput sequencing can explore this missing heritability, but represents a new challenge in computing, statistics and biology. Three approaches using this new technology have been used to investigate new predisposition factors. First, the risks associated with 34 genes known or suspected to be involved in predisposition were estimated from the analysis of 5,131 index cases and the development of a new statistical approach. Second, the participation of mosaic neo-mutations in the syndrome was explored in 1,750 index cases from the previous study, with a software tool developed specifically for detecting poorly represented variants: outLyzer. Finally, the exploration of the missing heritability by sequencing was extended to a panel of 201 genes involved in cancer, in 118 patients selected for the early onset of their disease, which is highly suggestive of a predisposition factor. The results of this work validated the relevance of studying PALB2, RAD51C and RAD51D for patient management, and also suggested an underestimated involvement of mosaic variants. However, other highly penetrant genetic factors very likely remain to be discovered, whose risk modulation is based on an oligogenic model.
Nguyen, Do Ngoc Linh. « Mise au point de l’analyse par séquençage à haut-débit du microbiote fongique et bactérien respiratoire chez les patients atteints de mucoviscidose ». Thesis, Lille 2, 2016. http://www.theses.fr/2016LIL2S011/document.
Chronic pulmonary infection results in an irreversible decline in lung function in patients with cystic fibrosis (CF). While several bacteria are known as the main causes of these infections (for example Pseudomonas aeruginosa, Staphylococcus aureus, Burkholderia cepacia, Achromobacter xylosoxidans...), some fungal genera including filamentous fungi (such as Aspergillus, Scedosporium...) have more recently also been identified as emerging or re-emerging pathogens able to cause invasive mycosis. Thus, the identification of the microorganisms involved in respiratory colonizations and/or infections has become essential. Culture methods remain the gold standard for the diagnosis of microbial infections. However, they cannot identify non-culturable or difficult-to-cultivate microorganisms. Thanks to the development of high-throughput sequencing (next generation sequencing or NGS), recent studies have shown that the lung of patients with CF harbours a complex poly-microbial flora, also called the CF lung microbiota, which includes not only bacteria but also fungi (yeasts and/or filamentous fungi), viruses and phages. Dysbiosis (loss of abundance and/or diversity) of the lung microbiota has been associated with decreased lung function and poor clinical status in patients. While the lung bacteriota and its role in pathogenesis have been widely studied, few research studies focus on the fungal component (mycobiota/mycobiome) of the lungs. Our thesis (PhD work) focuses on NGS analysis of the pro- and eukaryotic lung microbiota in CF patients, in particular on the comparison of different methodological approaches to optimize and standardize the NGS protocol. This project has been developed under the supervision of Pr. Laurence Delhaes in the "Biology and Diversity of Eukaryotic Emerging Pathogens" team directed by Dr. 
Eric Viscogliosi. Firstly, we present the state of the art of current knowledge on the risk of fungal colonization and infection in CF, as well as the development of the new concepts of lung microbiota and lung mycobiota on which our team focuses. Secondly, we applied the NGS approach to study the pro- and eukaryotic microbiota in sputum samples from the lungs of CF patients. NGS is a powerful technique, but biases may be introduced at numerous methodological steps. One of the most important biases is that this technique cannot differentiate between living microorganisms, dead or damaged cells, and extracellular DNA. Since the CF lung microbiota is often exposed to high-dose intravenous antibiotics, analysis by NGS might inaccurately evaluate the abundance and diversity of the lung microbiota. Pretreatment of samples with propidium monoazide (PMA), which makes it possible to selectively target the DNA of viable cells, could be a solution to overcome this limitation. Our study aimed to determine whether sample pretreatment with PMA modified the lung pro- and eukaryotic microbiota analyzed by NGS. We discuss the clinical relevance of this "PMA - NGS" approach in the context of CF patients for better quantification of living microorganisms.
Limasset, Antoine. « Nouvelles approches pour l'exploitation des données de séquences génomique haut débit ». Thesis, Rennes 1, 2017. http://www.theses.fr/2017REN1S049/document.
Novel approaches for the exploitation of high-throughput sequencing data. In this thesis we discuss computational methods to deal with DNA sequences provided by high-throughput sequencers. We mostly focus on the reconstruction of genomes from DNA fragments (genome assembly) and closely related problems. These tasks combine huge amounts of data with combinatorial problems. Various graph structures are used to handle this problem, presenting trade-offs between scalability and assembly quality. This thesis introduces several contributions in order to cope with these tasks. First, novel representations of assembly graphs are proposed to allow better scaling. We also present novel uses of those graphs apart from assembly, and we propose tools to use such graphs as references when a fully assembled genome is not available. Finally, we show how to use those methods to produce less fragmented assemblies while remaining tractable.
Padioleau, Ismaël. « Étude génomique de l'interférence entre la réplication et la transcription comme source du stress réplicatif ». Thesis, Montpellier, 2017. http://www.theses.fr/2017MONTT053/document.
Oncogene activation promotes aberrant cell proliferation, increasing replication stress and DNA damage. It has been proposed that genomic instability leads to checkpoint inhibition and promotes cancer development (Halazonetis et al. 2008). However, the link between aberrant proliferation, replication stress and DNA breaks is still unclear. We hypothesized that aberrant proliferation leads to more incidents in which DNA and RNA polymerases collide and stall. When the two polymerases meet, the accumulation of positively supercoiled DNA between them induces fork stalling, resulting in the formation of fragile structures such as single-stranded DNA (ssDNA). The ssDNA formed at stalled forks could be a source of DNA breaks, promoting the development of cancer cells. To validate this hypothesis, biologists from our team have worked on HeLa cell lines with increased replication-transcription conflicts. I performed the bioinformatics analysis of the following genomic data: (i) DRIP-seq: positioning of R-loops on the genome using immunoprecipitation of DNA/RNA hybrids; (ii) γ-H2AX ChIP-Seq: γ-H2AX is a histone mark found at DNA breaks; (iii) pRPA ChIP-Seq: positioning of stalled forks using the substrate of ATR kinase, phospho-RPA (S33), as a marker. Each dataset was produced on control cells and on two cell lines in which TOP1 and ASF/SF2 were depleted by an inducible shRNA (shTOP1 and shASF). Topoisomerase 1 is a topological enzyme that unwinds DNA when supercoiling accumulates. ASF/SF2 is part of the splicing complexes that process mRNPs (messenger ribonucleoprotein particles) to prevent the accumulation of R-loops during transcription. Using these data and others from the literature, I determined that the regions at higher risk of inducing replication stress are located downstream of highly transcribed and early replicated genes, preferentially where DNA and RNA polymerases collide head-on. 
I also showed that cancer-related genes are enriched in these regions of the genome.
Doan, Trung-Tung. « Epidémiologie moléculaire et métagénomique à haut débit sur la grille ». Phd thesis, Université Blaise Pascal - Clermont-Ferrand II, 2012. http://tel.archives-ouvertes.fr/tel-00778073.
Morisse, Pierre. « Correction de données de séquençage de troisième génération ». Thesis, Normandie, 2019. http://www.theses.fr/2019NORMR043/document.
This thesis falls within the broad field of high-throughput sequencing data analysis. More specifically, it deals with long reads from third-generation sequencing technologies. The aspects tackled mainly concern error correction and its impact on downstream analyses such as de novo assembly. As a first step, one of the objectives of this thesis is to evaluate and compare the quality of the error correction provided by state-of-the-art tools, whether they employ a hybrid strategy (using complementary short reads) or a self-correction strategy (relying only on the information contained in the long read sequences). Such an evaluation makes it easy to identify which method is best suited to a given case, according to the genome complexity, the sequencing depth, or the error rate of the reads. Moreover, developers can thus identify the limiting factors of the existing methods, in order to guide their work and propose new solutions that overcome these limitations. A new evaluation tool, providing a wider variety of metrics than the only tool previously available, was thus developed. This tool combines a multiple sequence alignment approach and a segmentation strategy, which drastically reduces the evaluation runtime. With the help of this tool, we present a benchmark of all the state-of-the-art error correction methods on various datasets from several organisms, ranging from the bacterium A. baylyi to human. This benchmark revealed two major limiting factors of the existing tools: reads displaying error rates above 30%, and reads reaching more than 50,000 base pairs. The second objective of this thesis is thus the error correction of highly noisy long reads. To this end, a hybrid error correction tool, combining different state-of-the-art strategies, was developed in order to overcome the limiting factors of existing methods. 
More precisely, this tool combines a short-read alignment strategy with the use of a variable-order de Bruijn graph. This graph is used to link the aligned short reads, and thus to correct the uncovered regions of the long reads. This method can process reads displaying error rates as high as 44%, and scales better to larger genomes, while reducing the runtime of the error correction compared to the most efficient state-of-the-art tools. Finally, the third objective of this thesis is the error correction of extremely long reads. To this end, a self-correction tool was developed by combining, once again, different state-of-the-art methodologies. More precisely, it uses an overlapping strategy and a two-phase error correction process based on multiple sequence alignment and local de Bruijn graphs. To allow this method to scale to extremely long reads, the aforementioned segmentation strategy was generalized. This self-correction method can process reads reaching up to 340,000 base pairs, and scales very well to complex organisms such as human.
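The step in which a de Bruijn graph links aligned short reads to fill an uncovered region can be illustrated with a minimal sketch. It deliberately swaps in a fixed-order graph and a plain breadth-first search rather than the variable-order graph described in the abstract, and the function names, the `max_len` bound and the toy reads are assumptions made for illustration.

```python
from collections import defaultdict, deque


def build_graph(reads, k):
    """Fixed-order de Bruijn graph: each (k-1)-mer maps to the set of
    (k-1)-mers that follow it in at least one read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def bridge(graph, source, target, max_len=1000):
    """Breadth-first search from the source node to the target node.

    Returns the sequence spelled by a shortest path, or None when the
    two anchors cannot be linked within max_len bases.
    """
    queue = deque([(source, source)])
    seen = {source}
    while queue:
        node, seq = queue.popleft()
        if node == target:
            return seq
        if len(seq) >= max_len:
            continue
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, seq + nxt[-1]))
    return None


# Toy short reads overlapping along the sequence ACGTTGCATGGACC.
short_reads = ["ACGTTGCA", "TTGCATGG", "CATGGACC"]
graph = build_graph(short_reads, k=4)
filled = bridge(graph, "ACG", "ACC")
```

Here the two anchors stand for the ends of short reads aligned on either side of an uncovered stretch of a long read; the path between them gives a candidate corrected sequence. A real corrector would also have to choose among several paths and tolerate sequencing errors in the short reads.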
Langouët, Maéva. « Démembrement génétique des déficiences intellectuelles et compréhension des bases physiopathologiques associées, à l'ère du séquençage à haut débit ». Thesis, Paris 5, 2014. http://www.theses.fr/2014PA05T045/document.
Intellectual deficiency (ID) is characterized by a broad range of deficits in higher brain functions that result in significant limitations in the adaptive and cognitive capacities required for competence in daily living, communication, social interaction and integration, self-direction, and work (DSM-V). ID affects approximately 3% of the population. Identifying the causes of ID is essential to improve patient care without the risk of missing a curable cause, but also to provide genetic counselling to the family for future pregnancies. Little is known about the biological bases of these conditions. Indeed, despite recent advances in cytogenetics and molecular genetics, the cause of the mental handicap remains unexplained in 40% of cases. Understanding the molecular bases of these disorders is therefore an important medical challenge for the coming years. In addition, the identification of ID genes and the analysis of the cellular mechanisms underlying these conditions should provide significant insight into the molecular and cellular pathways involved in cognition, and may lead to new therapeutic trials aiming at improving the daily living of these patients and their families. The PhD work presented here reports on the analysis, using Whole Exome Sequencing (WES), of five different families presenting with syndromic ID. The first part develops results from the analysis of three consanguineous families with an autosomal recessive form of ID. The second part presents the study of 2 unrelated male ID patients who presented the same clinical features. Overall, this work allowed i) the identification of two genes previously associated with ID (WDR62 and AP4M1), ii) the identification of two candidate genes (RAD54B and HERC2), potential modifiers of the phenotype, iii) the definition of a novel mode of inheritance, and finally iv) the characterization of two new ID genes (TTI2 and NONO), followed by the functional analysis of the effects of the mutations in patients' cells and in the Nonogt mouse model.
Ferreira, de Carvalho Julie. « Évolution du génome des spartines polyploïdes envahissant les marais salés : apport des nouvelles techniques de séquençage haut-débit ». Phd thesis, Université Rennes 1, 2013. http://tel.archives-ouvertes.fr/tel-00795861.
Bernard, Elsa. « Etude de l'épissage grâce à des techniques de régression parcimonieuse dans l'ère du séquençage haut débit de l'ARN ». Thesis, Paris Sciences et Lettres (ComUE), 2016. http://www.theses.fr/2016PSLEM063/document.
The numbers of protein-coding genes in a human, a nematode and a fruit fly are roughly equal. The paradoxical miscorrelation between the number of genes in an organism's genome and its phenotypic complexity finds an explanation in the alternative nature of splicing in higher organisms. Alternative splicing largely increases the functional diversity of proteins encoded by a limited number of genes. It is known to be involved in cell fate decision and embryonic development, but also appears to be dysregulated in inherited and acquired human genetic disorders, in particular in cancers. High-throughput RNA sequencing technologies allow us to measure and question splicing at an unprecedented resolution. However, while the cost of sequencing RNA decreases and throughput increases, many computational challenges arise from the discrete and local nature of the data. In particular, the task of inferring alternative transcripts requires a non-trivial deconvolution procedure. In this thesis, we contribute to deciphering alternative transcript expressions and alternative splicing events from high-throughput RNA sequencing data. We propose new methods to accurately and efficiently detect and quantify alternative transcripts. Our methodological contributions largely rely on sparse regression techniques and take advantage of network flow optimization techniques. Besides, we investigate means to query splicing abnormalities for clinical diagnosis purposes. We suggest an experimental protocol that can be easily implemented in routine clinical practice, and present new statistical models and algorithms to quantify splicing events and measure how abnormal these events might be in patient data compared to wild-type situations.
Mouden, Charlotte. « Holoprosencéphalie : identification de nouveaux gènes et redéfinition du mode de transmission par des approches de séquençage haut-débit ». Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1B012/document.
Holoprosencephaly (HPE) is the most common developmental disorder affecting the brain in humans. HPE is characterised by a lack of interhemispheric separation, on a varying scale of severity. When HPE is not due to chromosomal aberrations, a genetic origin is suspected. Alterations of fourteen genes have been implicated in HPE, mainly involved in the SHH, NODAL, FGF and NOTCH signalling pathways, with an unclear mode of inheritance. In order to increase the molecular diagnosis yield and to improve genetic counselling, the goal of the GPLD team (IGDR) is to identify new genes. In one inbred family, a deleterious homozygous mutation in the STIL gene has been identified. The STIL protein is involved in primary cilia assembly, through which SHH signalling transits. In another inbred family, a homozygous candidate mutation was located in FAT1, a protocadherin involved in brain development that causes HPE-like phenotypes in animal models. For other non-consanguineous families, exome sequencing data were analysed in trios. All children of these families have a previously identified mutation in an HPE gene that is transmitted from a healthy parent. The approach consisted in searching for additional genetic events, under the hypothesis of a multigenic inheritance. Thus, a digenic inheritance of mutations in SHH and DISP1 has been identified in one family. Further associations of candidate mutations have been identified in others, one also involving FAT1. In conclusion, this work provides new elements for understanding the genetic bases of HPE, and particularly new arguments in favour of a multigenic inheritance. The study of these complex genetic bases requires the development of new analytical methods that could be of use for other developmental disorders in which a multigenic inheritance is suspected.
Osman, Naoum Jorge. « Étude des populations bactériennes des écosystèmes des sols oligotrophes en utilisant des technologies de séquençage à haut débit ». Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLS196/document.
“What microbes are where, and how do they live there” is now an essential question for understanding life on Earth, even when comparing seemingly similar ecosystems in different locations. Soil bacterial populations are known to play important roles in biogeochemical cycles, soil maintenance, climatic effects and agriculture. I used pyrosequencing of PCR-amplified 16S rDNA from total extracted DNA in order to reveal the bacterial populations living in four different unusual and oligotrophic environments: A. Saline areas are widely distributed on Earth and are represented by both saline lakes and saline soils. We examined the bacterial composition of estuary sediments, brackish and sandy soil samples from the Camargue region (Rhône delta in southern France) sampled in two consecutive years. Members belonging to the Proteobacteria, Bacteroidetes, Chloroflexi, Firmicutes, Acidobacteria and Actinobacteria phyla were found principally in saline sediment and soil samples. We found that members of these phyla were associated principally with halophilic bacteria, sulphate-reducing bacteria (SRB), nitrate-reducing bacteria and coliforms, and that their varying proportions were likely affected by salinity and geographical location. B. Bacterial populations associated with the rhizosphere of plants are known to play essential roles in biogeochemical cycles, plant nutrition and disease biocontrol. We examined the bacterial populations of the rhizosphere of rice (Oryza sativa) growing in the Camargue region in 2013 and 2014. The most abundant bacterial populations were found to be members of the Proteobacteria, Acidobacteria, Chloroflexi and Gemmatimonadetes phyla. Genera belonging to these phyla were found to participate in soil biogeochemical processes such as nitrification, denitrification and oxidation, as well as to act as biocontrol agents.
The bacterial populations were found to vary significantly by geographical location as well as by year of collection. C. We examined the surface soils from “Padza de Dapani” on the island of Mayotte off the east coast of Africa, as this region is not a true (hot) desert but resembles one due to extensive soil erosion. In the acidic, oligotrophic and mineralized soil samples from Mayotte, members of the Actinobacteria, Proteobacteria and Acidobacteria phyla dominated the bacterial populations. Interestingly, members of the genera Acinetobacter, Arthrobacter, Burkholderia and Bacillus were found to be predominant in our samples, as is also observed in hot (Asian) deserts; they may play roles in soil mineral weathering, which could help in understanding desertification processes. D. Earth’s arid regions comprise >30% of the continental surface, and their oligotrophic soils are subjected to harsh environmental factors such as low average annual rainfall, high UV exposure and large temperature fluctuations. We examined the bacterial populations present in the rhizosphere of pioneer plants and in surface soils in the Jizan desert of Saudi Arabia. The most abundant bacteria belonged to the Bacteroidetes, Proteobacteria and Firmicutes phyla, and the populations differed between the plant rhizosphere and surface sand, with the exception of the plant Panicum turgidum, whose rhizosphere contains high proportions (70%) of members of the Flavobacterium genus.
Molet, Lucie. « Génotypage des papillomavirus humains par séquençage haut-débit : conséquences dans le dépistage du cancer du col de l’utérus et apport conceptuel au virome cutané ». Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS231/document.
Human papillomaviruses (HPV) are classified into 5 genera: α, β, γ, μ and η. Their genome comprises six early genes, including the two oncogenes E6 and E7, and two late genes encoding the L1 and L2 capsid proteins. β- and γ-HPV constitute an important part of the cutaneous virome; usually asymptomatic, they can manifest as papillomatosis such as warts and are associated with certain skin cancers, especially in immunocompromised patients. α-HPV has a mucosal tropism; high-risk (HR) α-HPV16 and 18 are involved in 99% of cervical cancers. Detection of α-HR-HPV in cervical samples guides the management of women whose Pap smear result shows atypical squamous cells of undetermined significance (ASCUS), although genotyping targets only the most common HPV types. Genotyping of β- and γ-HPV becomes necessary for the study of the virome, especially in contexts of susceptibility to HPV pathogenesis such as WHIM syndrome (Warts, Hypogammaglobulinemia, Infections and Myelokathexis). WHIM syndrome is a rare congenital immunodeficiency caused by a gain-of-function mutation of CXCR4, the receptor of the chemokine CXCL12, and manifests in 70% of cases as extensive cutaneous papillomatosis and ano-genital lesions that often evolve into cancer.
Studies in our laboratory have identified the intrinsic role of the dysregulated CXCL12/CXCR4 axis in viral pathogenesis, demonstrating in particular the beneficial effect of blocking this axis with a CXCR4 antagonist (AMD3100) in vitro and in vivo on HPV-associated oncogenesis. Our objectives were: (i) to identify HPV whose genotype could not be determined (HPV-X) by a conventional test (INNO-LiPA HPV Genotyping Extra II®) in cervical samples with a Pap smear report of ASCUS; (ii) to characterize the HPV virome of a patient suffering from WHIM syndrome during a 7-month compassionate AMD3100 clinical trial, to assess its impact on HPV-associated abnormalities. In both cases, we developed a high-throughput sequencing genotyping method on Illumina MiSeq®. The distribution of genotypes and their nucleotide polymorphism were studied by comparative and phylogenetic analyses. (i) In the 54 ASCUS/HPV-X samples investigated, our strategy identified a majority of low-risk HPV, with multiple infection (2 to 7 genotypes) in 41% of cases, and also the existence of quasi-species (41% of cervical smears) comprising up to 17 variants for the same genotype. Thus, probable competition or hybridization defects of the minority variants may explain the lack of performance of the INNO-LiPA test. (ii) In the WHIM patient, sequencing was supplemented with type-specific qPCRs, allowing a qualitative and quantitative study. AMD3100 did not qualitatively modify the cutaneous HPV virome, composed of 16 types, mainly β- and γ-HPV. In contrast, the quantitative analysis shows changes in the relative proportions of viral genomes, suggesting a treatment effect on the expression of certain types that can be selectively associated with papillomatosis. In this respect, one of the HPVs belonging to the cutaneous virome of the patient was found to be one of only two types present in a deep wart biopsy. This result supports the hypothesis of HPV selection in the lesion process.
In addition, the oncogenic proteins E6 and E7 of this virus carry mutations, relative to the sequence of the reference HPV genome, which could promote the pathogenic potential of this viral variant; this hypothesis is under investigation. In conclusion, the high-throughput sequencing techniques that we have developed have made it possible to better characterize the composition of the HPV virome, demonstrating its complexity both in viral genotypes and in their variants (i.e. the quasi-species concept), the dynamics of which may underlie the pathogenic potential of this HPV virome.
Mansour-Hendili, Lamisse. « Mise en place d’une stratégie de validation fonctionnelle de variations de signification incertaine dans les pathologies constitutionnelles du globule rouge ». Electronic Thesis or Diss., Paris 12, 2022. http://www.theses.fr/2022PA120057.
The deployment of next generation sequencing (NGS) over the past ten years in hospital genetic laboratories in France and around the world has revolutionized the management of rare diseases, including constitutional hemolytic anemia (CHA). It has led to a multiplication of variations of uncertain significance (VUS), requiring the implementation of functional tests to permit their re-classification. The objective of this work is to propose a realistic and effective strategy for the functional exploration of VUS associated with CHA. This approach is based on family genetic studies, the study of transcripts on Paxgene tubes, the on-site development of methods such as the LORRCA MaxSis for the study of RBC membrane properties, the improvement of techniques such as RBC density measurement by phthalate gradient, and the establishment of a collaborative network (for example with the CNRS in Roscoff for electrophysiological studies). We have shown the value of NGS in these patients with suspected CHA and have highlighted associations of variations of interest in different genes of RBC pathologies in the same patient (Mansour-Hendili et al 2020). We identified a new pathological entity in two patients with “autoimmune direct antiglobulin test negative” hemolytic anemia who did not respond to immunomodulators. This is a mechanism of acquired spherocytosis by point mutation of the ANK1 gene, probably due to clonal hematopoiesis in the elderly (submission in progress). Whole genome sequencing led to a diagnosis for a child suffering from unexplained transfusion-dependent hemolysis with neurodevelopmental delay due to the VPS4A gene (Lunati-Rozie et al 2021). Via a patient recall system, additional explorations have been carried out. Twenty-five patients underwent a transcript study, allowing the reclassification of sixteen variations. Ten family studies have been carried out, one of which excludes the deleterious nature of a VUS of the GATA1 gene.
We have shown the value of measuring RBC density as a screening tool for RBC membrane diseases. Its use as a functional test in the case of associations of variations in RBC membrane genes has highlighted the usefulness of the dense cell rate as a differential marker of the presence/absence of the associated variation. Concerning the LORRCA, osmoscan profiles make it possible to discriminate patients with associations of variations from “positive” controls without association. Stability studies conducted for these phenotypic tests at different storage times and temperatures show the importance of pre-analytical conditions. We illustrated this problem with the known KCNN4 gene mutation p.R352H, described with normal osmoscan and ektacytometry profiles: on two independent samples, with manipulations performed on D0 without storage, we twice found abnormal osmoscan profiles. In addition, we show the value of studying the electrophysiological properties of the PIEZO1 and KCNN4 channels, carried out in Roscoff, for the classification of VUS (the case of one patient with a new KCNN4 mutation and thrombosis, Mansour-Hendili et al 2021). For the associations of variations of interest, the profiles are more complex to interpret but also show differences compared to well-chosen controls. This work has made it possible to demonstrate the usefulness, in addition to family and transcript studies, of RBC phenotypic diagnostic or monitoring tools (LORRCA, RBC density) to help with the functional validation of isolated or associated VUS in CHA patients. This requires a means of recalling patients, adequate positive controls (intrafamilial cases) and compliance with pre-analytical conditions. The establishment of collaborative networks also brings real usefulness and reciprocal intellectual and human added value. The return to the phenotype is an essential recourse for the classification of VUS, in particular in CHA.
Seesao, Yuwalee. « Caractérisation des Anisakidae dans les poissons marins : développement d’une méthode d’identification par séquençage à haut-débit et étude de prévalence ». Thesis, Lille 2, 2015. http://www.theses.fr/2015LIL2S043/document.
The genera Anisakis, Pseudoterranova, Hysterothylacium and Contracaecum, members of the Anisakidae family, are nematodes whose larvae are recovered from numerous fish and cephalopod species. These larvae may induce digestive and/or allergic pathologies in humans. In France, the consumption of fishery products, mostly raw or undercooked, is increasing. This PhD work is part of the research program Fish-Parasites funded by the ANR (ANR-10-ALIA-004). It aimed to assess risks related to the consumption of fishery products, and its main goal was to study the distribution of Anisakids in fishery products. The sampling plan was based on a risk-ranking analysis using data on fishery product consumption, fishing areas and prevalence data from previous work on Anisakids. A total of 1,768 fish from 18 species were collected. All the organs were dissected for parasite isolation. Parasites were identified using two approaches: i) single analysis by Sanger sequencing for organs containing fewer than 11 nematodes; ii) pooled analysis by high-throughput sequencing (HTS) for the remainder. Setting up the HTS method required the development of numerous tools (creation of a database of reference sequences, primer design, set-up of the sequencing template preparation and development of an automatic analytical pipeline). From the sequencing results, prevalence data were acquired and structured for parasites potentially pathogenic for humans and recovered from commonly consumed fishery products. Of the 1,768 sampled fish, two species were not parasitized at all: plaice (Pleuronectes platessa) and aquacultured Atlantic salmon (Salmo salar). 43.30% of the fish were not infected by Anisakids.
Concerning infected fish, 28.62% were contaminated in the visceral organs, 22.96% in both visceral organs and fillets and, finally, 5.49% of the fish were infected by Anisakids only in fillets. The five fish species with an elevated prevalence in their fillets were, in decreasing order: blue ling (100%), megrim (70%), saithe (63%), monkfish (61%) and hake (60%). The most frequently identified Anisakids were Anisakis simplex, Anisakis pegreffii, Hysterothylacium aduncum and Pseudoterranova krabbei. Anisakis was recovered in all locations and generally in higher quantities, Contracaecum was mainly recovered from the liver, Hysterothylacium from the body cavity and Pseudoterranova from both fillets and body cavity. Anisakis simplex was isolated from all the fishing areas except the Gulf of Lion, and it accounted for the largest number of individuals. Faroese waters were the region with the greatest diversity of Anisakids within a single sampled fish species (blue ling). The multivariate logistic statistical study showed that fish species and size affect the prevalence of Anisakis and Pseudoterranova in fish fillets. HTS with the PGM™ Ion Torrent proved to be a powerful and innovative tool for the analysis of large numbers of Anisakid samples at low cost, in a shorter time and with results equivalent to those of the individual Sanger sequencing method.
Mallaret, Martial. « Identification de nouveaux gènes d'ataxies cérébelleuses récessives et intérêt du séquençage haut débit dans le diagnostic des ataxies d'origine génétique ». Thesis, Strasbourg, 2015. http://www.theses.fr/2015STRAJ090/document.
Hereditary cerebellar ataxias are a group of neurodegenerative or neurodevelopmental diseases responsible for major disability. Using exome sequencing, we found mutations in the WWOX gene in two consanguineous families presenting with cerebellar ataxia, epilepsy and mental retardation. Until recently, this gene was recognized only as a tumor suppressor. Using a targeted capture strategy covering 57 ataxia genes, next-generation sequencing in 155 patients yielded a positive diagnosis in 20.6% of cases, including several new mutations in ANO10 and SYNE1. Multicentre studies make it possible to extend clinical knowledge, notably of severe phenotypes, especially in ARCA1. We validated, in a blinded manner, a clinico-biological algorithm for the diagnosis of recessive ataxias published by Anheim in the New England Journal of Medicine (2012).
Fenouil, Romain. « Etude des mécanismes de la régulation transcriptionnelle et développement d'outils bioinformatiques pour le traitement des données de séquençage haut débit ». Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4089.
Mechanisms underlying the regulation of genetic expression are crucial for cell maintenance and adaptation to the environment (differentiation, development...). Molecular approaches reveal a great diversity of factors involved in this process (TFs, epigenetics, nucleosomes) and several layers of regulation (transcription initiation, elongation, splicing, maturation) which contribute to the observed transcriptome complexity. During my thesis, we studied the mechanisms of transcription regulation in mammals during lymphocyte differentiation. Briefly, we described the recruitment of GTFs and the transcriptional activity occurring on promoters and enhancers. We also revealed that CpG islands (CGIs) are major regulatory elements in mammals, which contribute to nucleosome depletion in a transcription-independent manner on a significant number of promoters. Together with our collaborators, we also studied the mechanisms of transcription elongation, alternative splicing, and the complex combinatorial patterns of PTMs that can be set on the CTD of RNA Polymerase II and on histone tails. In the context of the transition from pre-genomic studies to genome-wide experiments, an important part of my work consisted in the development of bioinformatics tools for the processing and analysis of experimental datasets from ChIP-on-chip and HTS technologies (ChIP-Seq, MNase-Seq, RNA-Seq).
Lussac-Sorton, Florian. « Effet des thérapies protéiques sur l’axe intestin-poumon dans la mucoviscidose ». Electronic Thesis or Diss., Bordeaux, 2024. http://www.theses.fr/2024BORD0428.
Cystic fibrosis is a multi-organ disease whose progression is mainly determined by respiratory and digestive damage. The advent of CFTR modulators has revolutionized its prognosis over the last decade. Alterations in microbiota and mycobiota have been well described in cystic fibrosis, as have, to a lesser extent, the inter-organ connections defining the gut-lung axis. Yet there is still a lack of data regarding the impact of protein therapies on the mycobiota-microbiota. The LUM-IVA-BIOTA protocol is a national multicentre study designed to decipher the evolution of mycobiota-microbiota and inflammation in patients receiving the lumacaftor-ivacaftor (LUM/IVA) combination, using targeted metagenomic approaches. The first part of this work, involving patients over 12 years of age, showed that the effect of LUM/IVA on lung microbiota depended on chronic colonization with Pseudomonas aeruginosa. The next stage of the project involved a cohort of children aged 2 to 11 and demonstrated improvements in lung and gut microbiotas on LUM/IVA with changes in the gut-lung axis, while confirming the key role of P. aeruginosa colonization in improving lung microbiota. These results highlight that protein therapies can improve the diversity of the microbial flora within the gut-lung axis if initiated early on, before the turning point of P. aeruginosa colonization. These findings need to be further confirmed in patients receiving next-generation modulators such as the elexacaftor-tezacaftor-ivacaftor triple therapy.
Chalabi, Smahane. « Caractérisation de la reprogrammation de l'expression des gènes chez les blés allopyloïdes ». Thesis, Evry-Val d'Essonne, 2014. http://www.theses.fr/2014EVRY0040.
Polyploidy is a major evolutionary force, especially in angiosperms, all species of which have undergone recurrent polyploidization events during their evolution. In order to understand the reprogramming of gene expression in response to polyploidy in the economically important wheat species (genera Triticum and Aegilops), I used an original model that consists in decreasing and then re-increasing ploidy levels. The allotetraploid T. turgidum (BBAA) is extracted from the allohexaploid bread wheat T. aestivum (BBAADD), which decreases the ploidy level. This extracted allotetraploid is then crossed with the diploid species Ae. tauschii (DD) to synthesize an allohexaploid wheat, which re-increases the ploidy level. The reprogramming of gene expression in response to decreasing and re-increasing ploidy levels was characterized here using first microarray technologies and then massively parallel mRNA sequencing (RNA-Seq), which was made possible by the recent 'draft' hexaploid wheat genome sequence and the subsequent availability of the three homoeolog sequences (Ah, Bh, Dh) of 8605 genes. Adequate bioinformatics and statistics methods have been adopted and/or developed and used. My work reveals a partitioning of the global expression of genes into that of their constituent homoeologs in different allopolyploid wheats. Most homoeologs contribute equally to the overall gene expression, and a low proportion reveals a bias towards one homoeolog, without showing a global dominance of a specific sub-genome. The partitioning and concerted expression of homoeologs is also established in wheat. Most homoeologs increase their expression when separated and reduce their expression levels when joined together at a higher ploidy level.
For most genes, Ah and Bh homoeolog expression in allohexaploid wheat is equal to 2/3 of their expression level in the extracted allotetraploid wheat, whereas the Dh homoeolog expression level is equal to 1/3 of that in the wheat diploid genome. This concerted change in homoeolog expression maintains the global gene expression at nearly similar levels across different ploidy levels. The results obtained in this work contribute to our understanding of global gene expression regulation and its partitioning between constituent homoeologs at different ploidy levels. Functional analysis of the different gene expression categories would reveal important gene functional categories that are regulated in response to polyploidy.
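The 2/3 and 1/3 ratios quoted in this abstract can be checked with a short worked equation. This is only an illustrative sketch under two assumptions that the abstract does not state: that the A and B homoeologs of a gene each contribute the same amount $e$ in the extracted allotetraploid, and that the total expression of the gene is the same in the allotetraploid ($E_{4x}$) and in the diploid ($E_{2x}$). The symbols $e$, $e_A$, $e_B$, $e_D$, $E_{2x}$, $E_{4x}$ and $E_{6x}$ are introduced here for illustration only.

```latex
\begin{align*}
E_{4x} &= e_A + e_B = 2e, \qquad E_{2x} = e_D = 2e \\
E_{6x} &= \tfrac{2}{3}e_A + \tfrac{2}{3}e_B + \tfrac{1}{3}e_D
        = \tfrac{2}{3}e + \tfrac{2}{3}e + \tfrac{1}{3}(2e) = 2e
\end{align*}
```

Under these assumptions the allohexaploid total $E_{6x}$ equals the allotetraploid and diploid totals, which is consistent with the statement that global gene expression stays at nearly similar levels across ploidy levels.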
Vandenborght, Louise-Eva. « Étude des microbiotes endogènes et exogènes de patients atteints de maladie respiratoire chronique ». Thesis, Bordeaux, 2019. http://www.theses.fr/2019BORD0410.
Microorganisms, some of which preceded us by billions of years, have been identified from the North Pole to the bottom of the ocean and even in the atmosphere. The advent of High Throughput Sequencing (HTS) technologies has facilitated the discovery and study of these microbial communities, including those of the human body. Thus, the human body appears to be composed of ten times more microorganisms than its own cells. Notably, the intestinal microbiota is one of the most investigated; it is composed of a significant biomass. In contrast, studies on the pulmonary microbiota are only in their early stages, particularly in the context of chronic respiratory diseases (CRD). While until recently lungs were considered sterile organs, they are now found to harbour a poly-microbial community of bacteria, viruses, phages and fungi. In the case of CRD, such as asthma and cystic fibrosis studied in this PhD work, the composition of the pulmonary microbiota determined by NGS appears to correlate with the clinical course of the patients. In cystic fibrosis (the most common genetic disease in the Caucasian population) and asthma (a multifactorial disease attributed to environmental factors associated with a genetic predisposition, whose prevalence is constantly increasing), changes in abundance and diversity (known as dysbiosis) of bacterial communities (endogenous microbiota) are well documented. However, the endogenous mycobiota (the fungal community that resides in the lungs) has been much less investigated. Indeed, it presents methodological challenges inherent to its very low biomass, but also to the structure of the fungal wall. In addition, very little information exists on the relationship between the endogenous pulmonary microbiota of patients and the corresponding indoor environment (or exogenous microbiota), whereas this specific fungal exposure represents a known risk factor for the development of a CRD.
The study of such a microbial exposome also presents methodological challenges, in terms of sampling and characterization of microbial communities that are also of low biomass. The main objective of this work was initially to optimize mycobiota analysis, using an artificial fungal community, from the extraction steps to the selection of the most efficient fungal targets to perform targeted metagenomics in the context of cystic fibrosis and asthma. Concerning the study of the microbial exposome, a device for the collection of microorganisms present in the patient's indoor environment was evaluated during this work in order to provide an optimized, standardized and scientifically relevant tool. Secondly, the NGS characterization of the endogenous microbiota and mycobiota of cystic fibrosis patients was investigated by taking into account their clinical state, in particular the existence of a pulmonary exacerbation. Correlations involving fungal genera known to be medically relevant, such as Aspergillus, Scedosporium and Candida, have been identified by focusing on the inter-kingdom network between bacteria and fungi identified in patients. This allowed us to confirm (from our experimental data) the role of the endogenous mycobiota in the ecological model “Climax/Attack” adapted to cystic fibrosis. [...]
Ben, Nsira Nadia. « Algorithme de recherche incrémentale d'un motif dans un ensemble de séquences d'ADN issues de séquençages à haut débit ». Thesis, Normandie, 2017. http://www.theses.fr/2017NORMR143/document.
This thesis addresses the problem of on-line pattern matching in highly similar sequences produced by Next Generation Sequencing (NGS) technologies. These sequences differ only by a very small amount. There is thus a strong need for efficient algorithms for performing fast pattern matching in such specific sets of sequences. We develop new algorithms to address this problem. This thesis is divided into five parts. In the first part, we present the state of the art on the most popular algorithms for the pattern-finding problem and the related indexes. Then, in the three following parts, we develop three algorithms directly dedicated to the on-line search for patterns in a set of highly similar sequences. Finally, in the fifth part, we conduct an experimental study on these algorithms. This study shows that our algorithms are efficient in practice in terms of computation time.
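To make the problem statement in this abstract concrete, here is a minimal Python sketch of the naive baseline: scanning every sequence of the set independently for a pattern. It is not one of the thesis's algorithms, which the abstract does not describe; the function name `find_occurrences` and the toy sequences are hypothetical. The sketch mainly shows the redundancy (re-scanning nearly identical text once per sequence) that dedicated methods for highly similar sequences would aim to avoid.

```python
from typing import Dict, List


def find_occurrences(pattern: str, sequences: Dict[str, str]) -> Dict[str, List[int]]:
    """Return, for each named sequence, the 0-based start positions of pattern."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    hits: Dict[str, List[int]] = {}
    for name, seq in sequences.items():
        positions: List[int] = []
        start = seq.find(pattern)
        while start != -1:
            positions.append(start)
            # Advance by one so that overlapping occurrences are also reported.
            start = seq.find(pattern, start + 1)
        hits[name] = positions
    return hits


# Hypothetical toy data: a reference and two variants with one substitution each.
reads = {
    "ref": "ACGTACGTTAGC",
    "var1": "ACGTACGATAGC",
    "var2": "ACGTTCGTTAGC",
}
result = find_occurrences("ACGT", reads)
```

Each sequence is scanned in full even though the three differ at only one position, so the work grows with the number of sequences rather than with the amount of variation between them.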
Morales, Raul. « L'apport du séquençage à haut débit dans la recherche de nouvelles associations génotype-phénotype dans les myopathies : cas particuliers des titinopathies ». Thesis, Montpellier, 2019. http://www.theses.fr/2019MONTT083.
Unravelling new phenotype/genotype correlations in myopathies with next generation sequencing. Titinopathies. Next-generation sequencing (NGS) is a revolutionary technology that allows the simultaneous analysis of multiple genes. The main NGS challenge is the clinical interpretation of the molecular findings to differentiate pathogenic, likely pathogenic, and non-clinically significant DNA variants. The purpose of this study was the detection of gene variants that cause myopathy in a cohort of 156 pediatric and adult patients by targeted NGS. We identified a pathogenic variant in 74 of the 156 patients (47.4%). The highest rate of positive diagnosis was in patients with congenital myopathies. In particular, 82% of patients with severe hypotonia at birth had a pathogenic variant, suggesting that NGS should be considered the first diagnostic tool in this group. The most interesting result was the identification in 10% of patients of new or unusual phenotype/genotype associations concerning several genes. In some cases, this contributed to expanding the phenotype associated with a specific gene. In the others, the disease severity or the inheritance mode differed from the classical clinical descriptions. The stepwise diagnostic methodology used in this work and the selected patient population could explain the remarkably high percentage of expanded phenotype-genotype associations (more than 10% of cases).
Indeed, in addition to the classical in silico predictions of the variant and familial segregation, the procedure for validating the association between atypical or incomplete phenotypes and a genetic variant included also deep phenotyping (i.e., detailed clinical examination, whole-body muscle MRI, and additional techniques for muscular biopsy analysis, such as electron microscopy), multidisciplinary concertation, exhaustive and continuous literature update and if necessary, advice from international laboratories that are experts in specific genes.Since the widespread use of Next Generation Sequencing (NGS), the exhaustive analysis of the 364 exons of the TTN gene revealed an increasingly number of reported TTN variants. However, it is often difficult to assess the pathogenicity of the identified variants, in particular missense ones, due to the clinical heterogeneity of the pathology, the large size of the gene, the frequency of TTN variants in the general population. The aim of the study is to develop a workflow for interpreting TTN variants. This workflow is based on deep phenotyping, evaluation of the effects of TTN variants predicted to affect transcription and/or splicing on skeletal muscle titin transcripts, and evaluation of the consequences of some variants on protein by Western Blot studies. Our study highlights the importance of analyzing the consequences of variants not only on protein, but also on transcripts, to have an exhaustive understanding of the consequences of variants at the transcriptional and protein level
Gondard, Mathilde. « A la découverte des agents pathogènes et microorganismes des tiques par séquençage de nouvelle génération et QPCR microfluidique à haut débit ». Thesis, Paris Est, 2017. http://www.theses.fr/2017PESC1017.
Vector-borne diseases are illnesses caused by pathogens transmitted by haematophagous arthropods, which provide active transmission (mechanical or biological) of infectious agents from one vertebrate to another. Among these vectors, ticks are known to carry and transmit the greatest variety of pathogens of public health and veterinary importance. They transmit microorganisms responsible for bacterial (Lyme borreliosis, rickettsioses), parasitic (babesiosis, theileriosis), or viral diseases (tick-borne encephalitis). The Antilles are located in the heart of the Caribbean Neotropical Zone. This area can be considered at risk for the emergence of vector-borne diseases, mainly because of favorable environmental conditions and intercontinental exchanges (e.g. legal and illegal animal trade, migratory birds). However, the epidemiological situation of the Caribbean area with regard to tick-borne diseases is still poorly documented. Indeed, most field studies have focused only on animal pathogens such as Ehrlichia ruminantium, Babesia (bovis and bigemina) and Anaplasma marginale, and questions about the risk of emergence or re-emergence of tick-borne diseases remain unanswered. Thus, it is crucial to develop efficient epidemiological surveillance tools that would enable the detection of new, known or unexpected pathogens present in ticks. In this context, the main objective of my thesis was to obtain an overview of pathogens of medical and veterinary interest present in Caribbean ticks using new high-throughput technologies. We first used a high-throughput sequencing approach to determine the pathogens (bacteria, parasites, and viruses) present in ticks collected in Guadeloupe and Martinique. This analysis revealed a great diversity of pathogenic agents in our samples and highlighted the presence of four viruses belonging to recently described viral families associated with arthropods.
The sequencing results, combined with data available in the literature, allowed us to draw up the most exhaustive list possible of pathogens potentially transmitted by ticks and requiring health surveillance in the Caribbean area. From this pathogen inventory, we developed a system for high-throughput screening of infectious agents applicable to the whole Caribbean area. This molecular tool is a microfluidic system based on the BioMark™ dynamic array technology (Fluidigm Corporation), which enables high-throughput real-time PCR to simultaneously detect 48 to 96 targets in 48 to 96 samples. Two different chips were developed, one for monitoring bacteria and parasites and one for viruses. Their efficiency was tested on tick samples collected in both Guadeloupe and Martinique. This large-scale screening provided a comprehensive overview of the epidemiological situation of 45 bacteria, 17 parasites and 31 viruses potentially transmitted by ticks in the French West Indies. The high-throughput detection tool developed during my thesis represents a major improvement in epidemiological surveillance technology, enabling the rapid and concomitant monitoring of a wide range of pathogens. It will soon be applied to high-throughput screening of infectious agents found in ticks collected throughout the Caribbean, including Trinidad and Tobago, St. Kitts, Barbados, and St. Lucia, thanks to the collaboration with the CaribVet network and local veterinarians.
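The scale of the screening described above can be illustrated with a short calculation. This sketch assumes that every sample on a chip is tested against every target on that chip, and takes 48 × 48 and 96 × 96 as the two ends of the range stated in the abstract. The function name `reactions_per_run` is hypothetical.

```python
def reactions_per_run(n_targets: int, n_samples: int) -> int:
    """Number of individual real-time PCR reactions in one run, assuming
    each sample is combined with each target exactly once."""
    if n_targets <= 0 or n_samples <= 0:
        raise ValueError("n_targets and n_samples must be positive")
    return n_targets * n_samples


# Two ends of the range given in the abstract (assumed chip layouts).
small_run = reactions_per_run(48, 48)
large_run = reactions_per_run(96, 96)

# Infectious agents covered by the screening, as listed in the abstract.
screened_agents = 45 + 17 + 31  # bacteria + parasites + viruses

print(small_run, large_run, screened_agents)  # 2304 9216 93
```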
Gagné, Patrick. « Élaboration d’une méthodologie d’analyses bio-informatiques pour des données de séquençage Illumina dans un contexte de metabarcoding ». Master's thesis, Université Laval, 2020. http://hdl.handle.net/20.500.11794/66444.
Karkar, Adnane. « Leucodystrophies : aspects génétiques et moléculaires au Maroc ». Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCC258.
Leukodystrophies are hereditary disorders affecting the white matter (WM) of the central nervous system, with or without damage to the peripheral nervous system. These disorders share abnormalities of the glial cells and the myelin sheath. Magnetic resonance imaging (MRI) is the main tool for detecting WM abnormalities, so MRI combined with a thorough clinical examination can help to provide an accurate medical diagnosis. However, confirming a specific leukodystrophy still relies on molecular biology, through detection of the mutated gene underlying the patient's phenotype. Since no previous study on leukodystrophies had been carried out in Morocco, we sought to characterize Moroccan patients with these disorders in order to establish a molecular diagnosis of the most frequent forms. We collected samples from families in which one or more members were suspected of having a leukodystrophy. We then sequenced them using an approach in which leukodystrophies with a positive biochemical marker were analyzed directly by the Sanger method, and leukodystrophies without known biochemical markers were analyzed by Next-Generation Sequencing (NGS). This approach allowed us not only to identify leukodystrophies in a sample of the Moroccan population but also to identify new mutations.