Dissertations / Theses on the topic 'Prediction of transcription factor binding sites'
Consult the top 50 dissertations / theses for your research on the topic 'Prediction of transcription factor binding sites.'
Robert, Christelle L. R. S. "Computational Prediction of Transcription Factor Binding Sites in Bacterial Genomes." Thesis, University of Dundee, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.521672.
Morozov, Vyacheslav. "Computational Methods for Inferring Transcription Factor Binding Sites." Thesis, Université d'Ottawa / University of Ottawa, 2012. http://hdl.handle.net/10393/23382.
Sealfon, Rachel (Rachel Sima). "Predicting enhancer regions and transcription factor binding sites in D. melanogaster." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/62434.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 71-75).
Identifying regions in the genome that have regulatory function is important to the fundamental biological problem of understanding the mechanisms through which a regulatory sequence drives specific spatial and temporal patterns of gene expression in early development. The modENCODE project aims to comprehensively identify functional elements in the C. elegans and D. melanogaster genomes. The genome-wide binding locations of all known transcription factors as well as of other DNA-binding proteins are currently being mapped within the context of this project [8]. The large quantity of new data that is becoming available through the modENCODE project and other experimental efforts offers the potential for gaining insight into the mechanisms of gene regulation. Developing improved approaches to identify functional regions and understand their architecture based on available experimental data represents a critical part of the modENCODE effort. Towards this goal, I use a machine learning approach to study the predictive power of experimental and sequence-based combinations of features for predicting enhancers and transcription factor binding sites.
by Rachel Sealfon.
S.M.
Sandelin, Albin. "In silico prediction of cis-regulatory elements." Stockholm, 2004. http://diss.kib.ki.se/2004/91-7349-879-3/.
Jayaram, N. "Improving the prediction of transcription factor binding sites to aid the interpretation of non-coding single nucleotide variants." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/1556214/.
Rezwan, Faisal Ibne. "Improving computational predictions of Cis-regulatory binding sites in genomic data." Thesis, University of Hertfordshire, 2011. http://hdl.handle.net/2299/7133.
Parmar, Victor. "Predicting transcription factor binding sites using phylogenetic footprinting and a probabilistic framework for evolutionary turnover." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=87000.
The identification of transcription factor binding sites (TFBS), particularly in higher eukaryotic genomes, has been an enormous challenge. Computational methods that identify sequence conservation between the genomes of different species have been very successful, because sites found in such highly conserved regions are likely to be functional (transcription factors bind to the genome at these sites and regulate protein production). In this thesis, we present a probabilistic algorithm for TFBS prediction that also takes evolutionary turnover into account. Our algorithm is validated through simulations, and the results of its application to ChIP-chip data are presented.
Kiełbasa, Szymon M. "Bioinformatics of eukaryotic gene regulation." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2006. http://dx.doi.org/10.18452/15562.
Understanding the mechanisms which control gene expression is one of the fundamental problems of molecular biology. Detailed experimental studies of regulation are laborious due to the complex and combinatorial nature of interactions among the molecules involved. Therefore, computational techniques are used to suggest candidate mechanisms for further investigation. This thesis presents three methods that improve predictions of the regulation of gene transcription. The first approach finds binding sites recognized by a transcription factor based on statistical over-representation of short motifs in a set of promoter sequences. A successful application of this method to several gene families of yeast is shown. More advanced techniques are needed for the analysis of gene regulation in higher eukaryotes. Hundreds of profiles recognized by transcription factors are provided by libraries. Dependencies between them result in multiple predictions of the same binding sites, which later need to be filtered out. The second method presented here offers a way to reduce the number of profiles by identifying similarities between them. Still, the complex nature of the interactions between transcription factors makes reliable prediction of binding sites difficult. Exploiting independent sources of information reduces the false prediction rate. The third method proposes a novel approach associating gene annotations with regulation by multiple transcription factors and the binding sites recognized by them. The utility of the method is demonstrated on several well-known sets of transcription factors. RNA interference provides a way of efficiently down-regulating gene expression. Difficulties in predicting efficient siRNA sequences motivated the development of a library containing siRNA sequences and related experimental details described in the literature. This library, presented in the last chapter, is publicly available at http://www.human-sirna-database.net
Gebhardt, Marie Luise. "Enrichment of miRNA targets in REST-regulated genes allows filtering of miRNA target predictions." Doctoral thesis, Humboldt-Universität zu Berlin, Lebenswissenschaftliche Fakultät, 2016. http://dx.doi.org/10.18452/17407.
Predictions of miRNA binding sites suffer from high false positive rates (24-70%), and measuring biological interactions of miRNAs and target transcripts on a genome-wide scale remains challenging. This thesis addresses the question of whether the ever-growing body of ChIP-sequencing data can be used to filter miRNA target predictions by making use of the underlying regulatory network of miRNAs and transcription factors. First, different methods for associating ChIP-sequencing peaks with target genes were tested. Target gene lists of the transcriptional repressor RE1-silencing transcription factor (REST/NRSF) were generated from ChIP-sequencing data. An enrichment analysis tool based on predictions from TargetScanHuman was developed and applied to find 'enrichment' miRNAs with over-represented targets in the REST gene lists. The detected miRNAs were shown to be part of a highly regulated REST-miRNA network. Possible functions could be assigned to them, and their role in the regulatory network and in special network motifs (incoherent feedforward loop of type 2) was analyzed. It turned out that miRNA target predictions for genes shared by enrichment miRNAs and REST had a higher proportion of true positive associations than the TargetScanHuman background; the procedure therefore makes filtering possible.
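The enrichment step described in this abstract can be illustrated with a generic over-representation test. The sketch below is not the tool developed in the thesis; it assumes a plain hypergeometric model, and the function name `hypergeom_sf` and all numbers in the usage note are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, targets, drawn):
    """P(X >= k) under a hypergeometric model (illustrative assumption).

    population: total number of genes considered
    targets:    genes predicted to be targets of one miRNA
    drawn:      size of the gene list under study (e.g. a REST target list)
    k:          observed number of miRNA targets inside that gene list
    """
    upper = min(targets, drawn)
    total = comb(population, drawn)
    # Sum the exact probabilities of seeing k, k+1, ..., upper targets.
    tail = sum(
        comb(targets, i) * comb(population - targets, drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total
```

For example, `hypergeom_sf(40, 20000, 500, 800)` would give the probability of finding at least 40 predicted targets in a list of 800 genes by chance, if 500 of 20000 genes were predicted targets. A small value suggests over-representation; correction for testing many miRNAs is left out of this sketch.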
Pape, Utz J. "Statistics for transcription factor binding sites." Berlin: Freie Universität Berlin, 2009. http://d-nb.info/1023329476/34.
Klein, Holger. "Co-occurrence of transcription factor binding sites." Berlin: Freie Universität Berlin, 2010. http://d-nb.info/1024541517/34.
Maynou, Fernàndez Joan. "Computational representation and discovery of transcription factor binding sites." Doctoral thesis, Universitat Politècnica de Catalunya, 2016. http://hdl.handle.net/10803/387550.
How, when and where proteins are produced has been one of the major challenges in molecular biology. Studies of the control of gene expression are essential for a better understanding of protein synthesis. Gene regulation is a highly controlled process that starts with DNA transcription. In this process, genes, the basic units of heredity, are copied into ribonucleic acid (RNA). The first step is controlled by the binding of proteins, called transcription factors (TFs), to a DNA (deoxyribonucleic acid) sequence in the regulatory region of the gene. These sequences are called binding sites and are specific to each protein. The binding of transcription factors to their corresponding binding sites initiates transcription. Binding sites are very short (5 to 20 base pairs long) and highly degenerate sequences. Such sequences can occur at random every hundred base pairs or so. In addition, a transcription factor can bind to different sites. Because of this high variability, it is difficult to establish a consensus sequence. The study and identification of binding sites is therefore important for understanding the control of gene expression. The importance of identifying regulatory sequences has led projects such as ENCODE (Encyclopedia of DNA Elements) to devote major efforts to mapping the binding sequences of a large set of transcription factors in order to identify regulatory regions. Access to genomic sequences and advances in gene expression analysis technologies have also enabled the development of computational methods for motif discovery. Thanks to these advances, a large number of algorithms have been applied in recent years to motif discovery in prokaryotes and simple eukaryotes. Despite the simplicity of these organisms, the false positive rate is high relative to the true positives.
Therefore, studying more complex organisms requires more sensitive methods. In this thesis we approached the problem of detecting binding sequences from several angles. Specifically, we developed a set of motif detection tools based on linear and non-linear models. Transcription factor binding sequences were characterized using two approaches. The first is based on the information inherently contained in each position of the binding sequences. The second, in contrast, characterizes the binding sequence by means of a covariance model. Building on both characterizations, we proposed a new set of computational methods for detecting binding sequences. First, a new method based on a parametric measure of uncertainty (Rényi entropy) was developed. This detection algorithm evaluates the total variation of the Rényi entropy of a set of binding sequences when a candidate sequence is added to the set. The method performed well for binding sequences with little or no correlation between positions. Correlation between positions was considered through a linear model, Q-residuals, and two non-linear models, alpha-Divergence and SIGMA. Q-residuals is a new motif discovery methodology based on constructing a subspace from the covariance of numerical DNA sequences. When the number of available sequences is small, Q-residuals performed significantly better and faster than the methodologies it was compared with. Alpha-Divergence evaluates the total variation of the parametric divergence in a set of binding sequences when a candidate sequence is added. Given an optimal q-value, alpha-Divergence outperformed the compared methodologies for most of the transcription factor binding sequences considered.
Finally, a new computational method, SIGMA, was developed in order to improve the detection power
Sanchez, Galan Frauca Javier. "Large scale identification of transcription factor binding sites in DNA sequences." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=86960.
In this thesis we present a new approach that uses elements of the three identification methods to develop a large-scale approach that assesses the over-representation of TFBS in DNA sequences. Results of application of this new method are presented for five biological datasets, including a set of regions bound by estrogen receptor (ER). We also present new results, yet to be validated experimentally, from two interesting biological datasets. The first is a dataset containing coding regions under non-coding selection (called CRUNCS). The other is a set of genes regulated by proteins called angiopoietins.
Finally, new public bioinformatics software for estimating the over-representation of TFBSs in DNA sequences, which we call the Genome-Wide Analysis of TFBS Over-Representation (GATOR), is introduced.
To this day, gene regulation remains one of the most studied processes in molecular biology. One of its main categories of actors, proteins called transcription factors, plays an essential role in controlling the rate of gene expression by binding to specific sites on the DNA sequence. These sites are short sequences (5 to 15 base pairs) and are commonly called transcription factor binding sites (TFBSs). Interactions between these proteins and DNA play a fundamental role at several stages of cellular development and in the response to various types of stress. Various computational methods that exploit the specific characteristics of TFBSs have been developed and tested with the aim of identifying such binding sites. Examples include the identification of TFBSs using phylogenetic footprinting, cis-regulatory modules and statistical over-representation.
In this thesis we present a new approach that uses elements of the three aforementioned identification methods to develop a large-scale approach that assesses the over-representation of TFBSs in DNA sequences. Results of applying this new method are presented for five biological datasets. Among them are a set of binding-site regions bound by estrogen receptors (ER), a dataset containing coding regions under non-coding selection (called CRUNCS) and, finally, a set of genes regulated by proteins called angiopoietins.
Finally, we present new public bioinformatics software that estimates the over-representation of TFBSs in DNA sequences, which we have named the Genome-Wide Analysis of TFBS Over-Representation (GATOR).
Jaini, Suma. "Methods for functional characterization of transcription factor binding sites in bacteria." Thesis, Boston University, 2014. https://hdl.handle.net/2144/11097.
Understanding gene regulation is necessary to gain insight into and model important cellular processes, including disease. The current inability to combat many diseases is partly due to an incomplete understanding of gene circuitry. The regulatory mechanisms of Mycobacterium tuberculosis, the causative agent of tuberculosis, are not properly understood. A transcriptional regulatory network (TRN) is a network comprising transcription factors (TFs) and their target genes that provides a powerful framework for analyzing the complete regulatory system. Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq) is becoming the method of choice for identifying genome-wide TFBS. Therefore, we use ChIP-Seq on known transcription factors to reconstruct the TRN of Mycobacterium tuberculosis (Mtb) and other bacteria. ChIP-Seq reveals various transcription factor binding sites (TFBS) but does not provide any information on the mechanism by which the corresponding TFs regulate the genes. Techniques to gain more insight into the mechanisms include microarrays, knockout studies and qPCR. However, these techniques provide a static view of the network. They also provide information at the RNA level and mask regulation happening at the protein level. Therefore, in order to understand the mechanism of regulation at the protein level as well as to capture the network dynamics, we built a synthetic gene circuit in Mycobacterium smegmatis and defined input-output relationships between key TFs and their target promoters. We validated this system on kstR, a TF which is a known repressor. KstR regulates genes involved in cholesterol degradation and is shown to de-repress itself and its regulon genes in the presence of cholesterol as well as in hypoxia, where there are no exogenous lipids. We explored the possibility of other by-products that may be responsible for the de-repression of kstR and its regulon.
The data suggest that propionyl-CoA, a by-product of the degradation of cholesterol, odd-numbered fatty acids and branched-chain amino acids, is causing the de-repression of kstR and its regulon. ChIP-Seq data on transcription factors in Mtb as well as E. coli show that many TFBS are located immediately upstream of open reading frame start sites, consistent with our understanding of prokaryotic gene regulation. However, the data also suggest that many TFBS are located inside and also downstream of open reading frames. One of our hypotheses is that these novel TFBS might be indirect binding sites that mediate chromatin looping. Therefore, we developed a 3C (Chromosome Conformation Capture) method to understand regulation in the third dimension by analyzing chromosomal interactions. We optimized the protocol in E. coli and validated it using a known interaction mediated by the repressor GalR. We then identified two regions, 20 kbp apart, containing TFBS of StpA, a nucleoid-associated protein, which are not directly involved in regulation of their downstream genes. The data from a 3C experiment on an E. coli strain with inducible StpA suggest that these two regions interact by an unknown mechanism. However, the interaction was not lost when a similar experiment was done in a StpA knockout strain, suggesting that StpA may not be the sole TF responsible for this interaction. Lastly, we developed a Hi-C method on E. coli genomic DNA to identify long-range interactions in a genome-wide and unbiased manner.
Gazzillo, Lisa Christine. "The Mapping of Transcription Factor Binding Sites in the Turkey Prolactin Gene." Thesis, Virginia Tech, 2000. http://hdl.handle.net/10919/35719.
Master of Science
Pairó, Castiñeira Erola. "Detection of Transcription Factor Binding Sites by Means of Multivariate Signal Processing Techniques." Doctoral thesis, Universitat de Barcelona, 2015. http://hdl.handle.net/10803/336663.
Schmidt, Jens. "Discovery of Putative STAT5 Transcription Factor Binding Sites in Mice with Diabetic Nephropathy." Ohio University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1385482459.
Lee, Tek Hyung. "A regulatory role for repeated decoy transcription factor binding sites in target gene expression." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/76563.
Cataloged from PDF version of thesis.
Includes bibliographical references.
Repetitive DNA sequences are prevalent in both prokaryote and eukaryote genomes and the majority of repeats are concentrated in intergenic regions. These tandem repeats (TRs) are highly variable as the number of repeated units changes frequently due to recombination events and/or polymerase slippage during replication. While TRs have been traditionally regarded as non-functional 'junk' DNA, variability in the number of TRs present within or close to genes is known to lead to gross phenotypic changes and disease. However, whether intergenic TRs have a functional role is less understood. Recent studies reveal that many intergenic TRs contain transcription factor (TF) binding sites and that several TRs of TF binding sites indeed influence gene expression. A possible mechanism is that TRs serve as TF decoys, competing with a promoter for TF binding. We utilized a synthetic system in budding yeast to examine whether repeated binding sites serve as decoys and alter the expression of genes regulated by the sequestered TF. Combining experiments with kinetic modeling suggests that repeated decoy binding sites sequester activators more strongly than a promoter binding site, although both binding sites are identical in sequence. This strong binding converts a graded dose-response between activator and promoter to a sigmoidal-like response. We further find that the tight activator-decoy interaction becomes weaker with increasing activator levels, suggesting that activator binding at the repeated decoy site array might be anti-cooperative. Finally, we show that the high affinity of repeated decoy sites qualitatively changes the behavior of a transcriptional positive feedback loop from a graded to a bimodal, all-or-none response. Taken together, repeated TF binding sites play an unappreciated role as a gene regulator.
Since repeated decoy sites are hypervariable in number, this variability can lead to qualitative changes in gene expression and potentially phenotypic variation over short evolutionary time scales.
by Tek Hyung Lee.
Ph.D.
Piper, Jason. "The demarcation of transcription factor binding sites through the analysis of DNase-seq data." Thesis, University of Warwick, 2014. http://wrap.warwick.ac.uk/71314/.
Ochs, Sharon D. "Elucidating transcription factor regulation by TCDD within the hs1,2 enhancer." Wright State University / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=wright1333992865.
Zandvakili, Arya. "The Role of Affinity and Arrangement of Transcription Factor Binding Sites in Determining Hox-regulated Gene Expression Patterns." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1535708748728472.
Teo, William J. "Screening of potential upstream regulators and identification of DNA binding sites for the tooth transcription factor Krox-26." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp05/MQ63008.pdf.
Beach, Michael. "Unraveling the molecular physiology of the β-cell: genome wide analysis of binding sites for the transcription factor PDX1." Thesis, University of British Columbia, 2009. http://hdl.handle.net/2429/15879.
Full textTria, Fernando Domingues Kümmel. "Análise in silico de regiões promotoras de genes de Xylella fastidiosa." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/95/95131/tde-13082013-194053/.
Xylella fastidiosa is a gram-negative, non-flagellated bacterium responsible for economically important diseases such as Pierce's disease in grapevines and Citrus Variegated Chlorosis (CVC) in sweet orange trees. In the present work we performed in silico analysis of promoter sequences of protein-coding genes from this phytopathogen, including those involved in virulence and pathogenic mechanisms, in an attempt to better understand the underlying transcriptional regulatory dynamics. Two strategies for cis-regulatory element prediction were applied to promoter sequences from the genome of strain 9a5c, a proven causal agent of CVC. The first one, known as phylogenetic footprinting, involved the prediction of regulatory motifs conserved in promoter sequences of orthologous transcription units from X. fastidiosa and a set of 7 comparative species. The criteria for identifying orthologous transcription units, i.e., those from different species whose promoter sequences share at least one common regulatory motif, were studied based on regulatory information available for model organisms: Pseudomonas aeruginosa, Bacillus subtilis and Escherichia coli. The results obtained with the phylogenetic footprinting analysis permitted us to assess the underlying transcriptional regulatory network of the species in a comprehensive (genome-wide) manner, with a total of 2990 regulatory interactions corresponding to 80 predicted motifs distributed over promoter sequences of 56.8% of all transcription units. In the second strategy, regulatory information from E. coli was recovered and used, through a scanning process, to expand the knowledge of ten regulons in X. fastidiosa, some of whose regulatory interactions had previously been described by independent studies. We emphasize some genes related to host invasion and colonization present in the regulons of Fur and CRP, two global transcription regulators.
Lastly, comparative analyses of corresponding regulatory regions among strains were performed, and differences possibly associated with phenotypic variation were identified between 9a5c and J1a12, a non-virulent strain isolated from orange trees, and between 9a5c and Temecula1, a strain associated with Pierce's disease in grapevines.
Carbonari, Gioia. "Identification of the N-Linked Glycosylation Sites of the Transcription Factor Rest and Effect of Glycosylation on DNA Binding and Transcriptional Activity." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2012. http://amsdottorato.unibo.it/4288/.
Langer, Björn. "Phenotype-related regulatory element and transcription factor identification via phylogeny-aware discriminative sequence motif scoring." Doctoral thesis, Center for Systems Biology Dresden, 2017. https://tud.qucosa.de/id/qucosa%3A31172.
Yang, Doo Seok. "Computational Study of Nucleosome Positioning Sequence Patterns and the Effects of the Nucleosome Positioning on the Availability of the Transcription Factor Binding Sites in Study Systems." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36580.
Bukka, Prasanna L. "Regulation of parathyroid hormone-related peptide gene expression in osteoblast-like cells : the role of an intronic minisatellite and Sp1 transcription factor binding sites in the promoter region." Thesis, McGill University, 2003. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=19451.
Hassan, Faizule. "Adenovirus Mediated Delivery of Decoy Hyper Binding Sites for Sequestration of an Oncogenic Transcription Factor HMGA as a Potential Novel Cancer Therapy and Antibacterial Activity of Local Mushrooms." Miami University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=miami1511449587326648.
Repapi, Emmanouela. "An integrated genomic approach for the identification and analysis of single nucleotide polymorphisms that affect cancer in humans." Thesis, University of Oxford, 2013. http://ora.ox.ac.uk/objects/uuid:16f4482e-7f83-46c9-88d9-583c4154e044.
Full textNeto, Antonio Ferrão. "Predição computacional de sítios de ligação de fatores de transcrição baseada em gramáticas regulares estocásticas." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/95/95131/tde-02012018-144349/.
Transcription factors (TFs) are proteins that bind to specific and well-conserved nucleotide sequences in the DNA, called transcription factor binding sites (TFBS), contained in regions of gene regulation known as cis-regulatory modules (CRM). On recognizing a TFBS, the transcription factor binds to that site and positively or negatively influences gene transcription. There are experimental procedures for identifying TFBS in a genome, such as footprinting, ChIP-chip or ChIP-Seq. However, implementing these techniques is costly and time-consuming. Alternatively, one may take the TFBS sequences already known for a particular transcription factor, apply supervised learning techniques to create a computational model of that site, and then perform computational prediction across the genome. However, most existing software tools for this purpose, such as those based on PWMs (position weight matrices), assume independence between nucleotide positions in the site, which is not necessarily true. This project aimed to evaluate the use of stochastic regular grammars (SRGs) as an alternative to PWMs for this problem, since SRGs are able to characterize dependencies between consecutive positions in the sites. Although the differences in performance were subtle, SRGs appear to be more suitable than PWMs when dependencies between bases are stronger, and PWMs more suitable in other cases. Finally, a computational TFBS prediction tool was created based on both SRGs and PWMs.
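The independence assumption mentioned in this abstract is easiest to see in a PWM scan, where the score of a window is just the sum of per-position terms. The following is a minimal sketch of a generic log-odds PWM, not the tool built in the thesis; the function names, the pseudocount of 1 and the uniform background of 0.25 are all illustrative assumptions.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length known sites (illustrative)."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window; each position contributes independently."""
    width = len(pwm)
    return [
        (start, sum(pwm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, pwm):
    """Return the (start, score) pair with the highest log-odds score."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])
```

Because the window score is a plain sum over positions, such a model cannot express that a base at one position makes a particular base at the next position more likely. A model with transitions between consecutive positions, such as the stochastic regular grammars evaluated in the thesis, is one way to capture that.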
Liao, Yi-Sian, and 廖一憲. "Prediction of Transcription Factor Binding Sites from Unaligned Gene Sequences." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/41692945232299039263.
National Tsing Hua University
Department of Electrical Engineering
97
To understand the regulation of gene transcription, transcription factor binding sites (motifs) are helpful information. cDNA microarray hybridization (ChIP array) has become a popular tool for recognizing motifs in gene sequences. However, the ChIP array can only map the probable sequence to within 1-2 kilobases of resolution. Our goal is to find the motif binding site without knowing the motif length. To reach this goal we designed a computational program, based on a discriminator and a binomial model, to find the most probable patterns. We compare its performance with that of a program called constraint-less Cosmo [1]. The simulation results indicate that our program performs better than Cosmo.
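The abstract does not spell out its binomial model, so the sketch below shows only one common way such a model is used: testing whether a candidate k-mer occurs more often than expected by chance. It assumes a uniform background, treats overlapping windows as independent trials (a simplification), and uses hypothetical function names.

```python
from math import comb


def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def kmer_overrepresentation(sequences, kmer):
    """Return (observed count, binomial tail probability) for one k-mer.

    Assumption: each window matches the k-mer with probability 0.25**k,
    independently of the other windows.
    """
    k = len(kmer)
    windows = sum(max(len(seq) - k + 1, 0) for seq in sequences)
    observed = sum(
        1
        for seq in sequences
        for start in range(len(seq) - k + 1)
        if seq[start:start + k] == kmer
    )
    return observed, binomial_sf(observed, windows, 0.25**k)
```

Running this for several values of k and comparing the tail probabilities is one simple way to rank candidate patterns when the motif length is unknown, although the thesis may handle that step differently.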
Benner, Philipp. "Combining Prior Information for the Prediction of Transcription Factor Binding Sites." 2015. https://ul.qucosa.de/id/qucosa%3A21541.
Liu, Chen-Yei, and 劉承業. "An integrated computational tool for predicting transcription factor binding and microRNA target sites of vertebrate genomes." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/15515659671163337497.
National Taiwan Ocean University
Institute of Biotechnology
95
Both transcription factors and microRNAs play important roles in the regulation of gene expression. We developed a computational tool for analyzing the upstream and downstream sequences of genes of interest. We used PWMs (position weight matrices) based on current biochemical studies to identify transcription factor binding sites. We also used the miRanda program to analyze 3' UTR sequences and spot potential microRNA target sequences that might regulate the expression of the genes harboring them. Comparative genomics is employed, comparing sequences between species to detect the transcription factor and microRNA binding sites conserved during evolution. Finally, we built a web tool to assist users in discovering transcription factor binding and microRNA target sites for a given gene set.
Wu, Ping-Cheng, and 吳秉承. "Incorporating sequence motifs to improve accuracy of predicting transcription factor binding sites using ChIP-seq data." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/94459583035276354379.
National Taiwan University
Graduate Institute of Bio-Industrial Mechatronics Engineering
104
Transcription factors (TFs) regulate gene expression in living organisms and influence multiple biological processes. Chromatin immunoprecipitation sequencing (ChIP-seq) is a technology that has been widely used to find the transcription factor binding sites (TFBSs) of a specific TF among the DNA sequences of a genome. However, the accuracy of the TFBSs identified by ChIP-seq has not been systematically evaluated. This thesis therefore used TFBS information provided by the TRANSFAC database to validate the TFBSs identified using ChIP-seq alone at multiple false discovery rate (FDR) cutoffs. Moreover, a method incorporating de novo motif discovery was proposed to improve the accuracy of the predicted TFBSs. ChIP-seq data sampled from different cell lines were collected from the ENCODE database. In general, ~60% of the peak regions identified using ChIP-seq alone with a strict FDR cutoff (FDR = 0) contained at least one TFBS of the specific TF across multiple cell lines. In addition, with our proposed method the prediction accuracy improved over the results obtained using ChIP-seq alone, though the degree of improvement was affected by the FDR cutoffs used and the motifs discovered. In conclusion, this thesis identified the accuracy problem of the ChIP-seq platform through large-scale observation of the data, and addresses it by proposing a method that incorporates de novo motif discovery. The observed results can serve as an important foundation for developing bioinformatics tools for TFBS prediction in the future.
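The "fraction of peak regions containing at least one TFBS" statistic reported in this abstract can be approximated in a few lines. The sketch below is only an illustration: it assumes the motif is given as an IUPAC consensus string and matched exactly on either strand, whereas the thesis relies on TRANSFAC information and discovered motifs. All names and the example consensus are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(consensus):
    """Reverse complement of an IUPAC consensus string."""
    return consensus.translate(COMPLEMENT)[::-1]


def motif_regex(consensus):
    """Compile a pattern matching the consensus on either strand."""
    forward = "".join(IUPAC[code] for code in consensus)
    reverse = "".join(IUPAC[code] for code in reverse_complement(consensus))
    return re.compile(f"{forward}|{reverse}")


def fraction_with_motif(peaks, consensus):
    """Fraction of peak sequences with at least one consensus match."""
    pattern = motif_regex(consensus)
    hits = sum(1 for peak in peaks if pattern.search(peak.upper()))
    return hits / len(peaks)
```

A call such as `fraction_with_motif(peak_sequences, "TGASTCA")` then returns a value between 0 and 1 that can be compared across FDR cutoffs. Exact consensus matching is stricter than PWM-based matching, so it would tend to undercount weaker sites.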
Austin, Ryan. "The de novo Prediction of Functionally Significant Sequence Motifs in Arabidopsis thaliana." Thesis, 2009. http://hdl.handle.net/1807/19021.
Liu, Kai-Wei, and 劉凱維. "Mapping of Transcription Factor Binding Sites and DNA-Binding Motifs." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/74621550301522822856.
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
96
Transcription factors (TFs) play an essential role in gene regulation by activating or inhibiting the expression of the corresponding genes. Transcription factors carry out their functions by docking at a specific region in the DNA sequence, normally referred to as a transcription factor binding site (TFBS). Since the complete network of interactions between TFs and genes is still largely unknown, identifying the key residues in the DNA binding domain of a TF can give biochemists valuable information for designing experiments to verify the interactions between the TF and the corresponding genes. Furthermore, once the key residues in the DNA binding domain are identified, we can establish a mapping between the DNA binding motifs and the TFBS motifs. In the study reported in this thesis, we propose a novel approach to achieve these objectives. The approach begins by clustering the TFBSs with the same binding type. Sequence alignment with a strict criterion is then applied to the corresponding DNA binding domains of the TFBSs in the same cluster in order to identify the key residues in the DNA binding domains. For those TFs whose tertiary structure is present in the Protein Data Bank (PDB), we have examined the physicochemical significance of the key residues identified.
Quon, Gerald T. "The landscape of false-positive transcription factor binding site predictions in yeast." 2007. http://link.library.utoronto.ca/eir/EIRdetail.cfm?Resources__ID=452973&T=F.
Shih, Chih-Yuan, and 石智遠. "Isolation and Characterization of Binding Sites for Transcription Factor FOXP2." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/46154906962594778308.
National Tsing Hua University
Department of Life Science
92
FOXP2 belongs to a family of transcription factors containing the DNA-binding forkhead/winged helix domain. The known FOXP2 mutations associated with speech and language disorder are a missense mutation in the KE family and a t(5;7)(q22;q31.2) translocation with a breakpoint in individual CS. Nevertheless, nothing is known about the target genes regulated by FOXP2. In this study, using a whole-genome PCR procedure, we searched human genomic DNA for potential FOXP2 target sites. A number of related sequences that interacted with FOXP2 were identified in vitro by band shift and DNase I footprint analysis. Transient transfection assays in 293T cells further confirmed that the FOXP2 binding sites could also function in vivo. Promoter database analysis reveals that FOXP2 binding sites are present in the upstream regions of several candidate target genes. A sequence comparison based on several of the novel sequences yielded a putative consensus binding sequence of 5’-TGTTTGT-3’. Remarkably, this sequence is similar to the consensus sequences for forkhead proteins. These DNA binding sites may help identify novel targets of FOXP2 and aid in further understanding FOXP2 function during the development of speech and language.
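Deriving a consensus from several aligned binding sequences, as in the sequence comparison described here, can be sketched as a per-column majority vote. The example sites below are invented for illustration and are not the sequences isolated in the thesis; the majority-vote rule is likewise an assumption about how such a consensus might be computed.

```python
from collections import Counter


def consensus(sites):
    """Return the most frequent base in each column of equal-length sites."""
    if not sites:
        return ""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    # zip(*sites) yields one tuple of bases per alignment column.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*sites))


# Hypothetical pre-aligned sites (illustrative only).
EXAMPLE_SITES = ["TGTTTGT", "TGTTTAT", "TATTTGT", "TGTTTGC"]
```

For these made-up sites, `consensus(EXAMPLE_SITES)` yields `TGTTTGT`. Columns with ties fall back to whichever base was seen first, so a real analysis might report IUPAC ambiguity codes instead.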
LEE, M. I., and 李美宜. "Recognizing Cancer-related Genes based on Transcription Factor Binding Sites." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/11997758257999628220.
Taichung Healthcare and Management University
Department of Information Science and Applications, Master's Program
93
The purpose of transcription factors (TFs) is to regulate the expression of other genes. They are also key to controlling whether mutations occur in promoter regions. Current research on TFs mainly focuses on predicting motifs using algorithms such as Multiple EM for Motif Elicitation (MEME), genetic algorithms (GA), and the Gibbs sampler. In this thesis, we propose a new approach to predict possible cancer-related genes based on transcription factor binding sites (TFBS). The experimentally determined TFBS that bind promoter regions and the known cancer-related genes were collected from the TFSEARCH and CHIP websites, respectively. The TFBS that result in mutation of genes are selected. We then analyze the occurrence frequencies of these TFBS to investigate the relations between TFBS and possible cancer-related genes. We also discuss the two-factor case, analyzing the relations between pairs of TFBS and possible cancer-related genes. Our results show that the TFBS-based approach is a reliable method for recognizing possible cancer-related genes.
Hsu, Jen-Jay, and 徐振傑. "Prediction of DNA Binding Transcription Factor segments under Specified structure." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/46288973583523923659.
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
96
This thesis discusses the design of a predictor aimed at identifying the secondary structures in a transcription factor that are involved in interaction with DNA. In particular, the design of the predictor has been optimized for identifying the alpha-helix structures involved in interaction with DNA, owing to their prevalence. The predictor is built on a support vector machine (SVM), and the study reported in this thesis focuses on the features exploited for making predictions. Two datasets were used in the experiments. The first dataset was derived from the TF-DNA complexes deposited in the Protein Data Bank (PDB) and the second from the TF sequences deposited in SWISS-PROT. In identifying the alpha-helix structures involved in interaction with DNA, the proposed predictor delivered a sensitivity of 75%, precision of 80%, and specificity of 92% on the first dataset, and a sensitivity of 65%, precision of 85%, and specificity of 98% on the second.
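The three figures reported in this abstract follow the standard confusion-matrix definitions, which the sketch below computes. The function name and the example counts are hypothetical; they are not the thesis's data.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, precision and specificity from confusion counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives.
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),  # share of real positives found
        "precision": _ratio(tp, tp + fp),    # share of predictions that are right
        "specificity": _ratio(tn, tn + fp),  # share of real negatives rejected
    }


# Hypothetical counts, chosen only to show the arithmetic.
EXAMPLE = classification_metrics(tp=30, fp=5, tn=55, fn=10)
```

Here sensitivity is 30/40 = 0.75, precision is 30/35 and specificity is 55/60. High specificity alongside lower sensitivity, as in the reported results, usually indicates that negatives greatly outnumber positives.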
Wu, Po-Chun, and 吳柏均. "Investigating Variations of Transcription Factor Binding Sites by 1000 Genomes Data." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/31281686353680898713.
National Taiwan University
Graduate Institute of Biomedical Electronics and Bioinformatics
103
Gene regulation is essential for maintaining cellular functions, so how biological systems regulate gene expression is an important research topic. Gene regulation operates at many levels, including gene expression, mRNA transcription and splicing, and post-translational modification. This study explores the activation and inactivation of gene expression through the interaction between transcription factors and double-stranded DNA. Among the three billion base pairs of the human genome, biologically significant fragments such as genes or transcription factor binding sites account for only a small portion of the DNA. Transcription factor binding motifs are about 5 to 15 nucleotides long. How to identify transcription factor binding sites, and how they achieve gene regulation, is therefore an important research issue. The binding strength between transcription factors and their binding sites may also affect the regulation of gene expression. The Human Genome Sequencing Project was launched in the 1990s. Limited by the technology of the time, the project consumed a great deal of money and manpower. Sequencing of the 23 human chromosomes, three billion bases in total, was completed in 2001, a considerable milestone in human genome research. With the development of biotechnology and the falling cost of computation, genome sequencing technology began to advance rapidly. The 1000 Genomes Project started in 2008 and planned to use faster and easier sequencing technology to sequence more than a thousand human genomes within three years. By 2012, a total of 1,092 human genomes had been published. The latest dataset of the project contains 2,504 human genomes. The completion of the human genome allows researchers to perform high-throughput screening of transcription factor binding sites.
The growing number of individual genome datasets provides a wealth of research themes and lets us glimpse the differences between individuals in transcription factor binding sites. The objective of this study is to use the data of the 1000 Genomes Project to explore individual variations in transcription factor binding sites and their possible applications in genetic tests. This study collected the binding site data of 34 human transcription factors from the JASPAR database and combined this information with the variant data of the 1000 Genomes Project. The analysis shows that only about 3% of positions in the JASPAR-denoted transcription factor binding sites carry individual variations. Furthermore, the positions with individual variations are not consistent with the original motifs of the transcription factor binding sites: some individual variations occur at positions where the corresponding motif implies that variation is not allowed. To investigate the rationale behind this inconsistency, this study used an online tool named PiDNA, which predicts the binding motif of a DNA-binding protein from protein-DNA complex structures. This study employed such binding motifs to explore potential minor forms that might have been omitted previously. Finally, the study discusses future applications in personal genetic diagnosis and how to use existing bioinformatics tools and public databases to assess the importance of variants observed in transcription factor binding sites. It is expected that this study can provide novel insights for individual genetic tests in personalized medicine.
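The kind of figure quoted in this abstract, the share of binding-site positions that carry a variant, can be computed as in the sketch below. The data layout (half-open `(chromosome, start, end)` sites and a set of `(chromosome, position)` variants) and the example values are assumptions for illustration, not the JASPAR or 1000 Genomes file formats.

```python
def variant_fraction(sites, variants):
    """Fraction of binding-site positions that coincide with a known variant.

    sites    -- iterable of (chromosome, start, end), half-open coordinates
    variants -- set of (chromosome, position) pairs
    Overlapping sites are counted independently in this simple sketch.
    """
    total = 0
    varied = 0
    for chrom, start, end in sites:
        for pos in range(start, end):
            total += 1
            if (chrom, pos) in variants:
                varied += 1
    return varied / total if total else 0.0


# Hypothetical example: two 10-bp sites and three variants, one outside any site.
SITES = [("chr1", 10, 20), ("chr1", 30, 40)]
VARIANTS = {("chr1", 12), ("chr1", 35), ("chr1", 50)}
```

With these made-up inputs, 2 of the 20 site positions carry a variant, so `variant_fraction(SITES, VARIANTS)` is 0.1. Comparing each varied position with the motif's per-column conservation would be the natural next step toward the inconsistency the study reports.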
Wang, Mei-Huei, and 王美惠. "Analysis of Transcription Factor Binding Sites by Using Sequential Pattern Mining." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/59563074202747148930.
Chang Gung University
Graduate Institute of Information Management
94
Transcription is the process by which an RNA product is produced from a given DNA. In this process, transcription factors affect the expression of genes by binding to specific regions with consensus patterns in the upstream region of genes. These consensus patterns are therefore also known as transcription factor binding sites (TFBS). By analyzing how transcription factors act on DNA binding sites and how they collaborate in ordered coordination, we can gain insight into the gene regulation process. Many computational studies on the combinations of and relationships between transcription factor binding sites are based on association rule mining, and their results are provided to biologists for further research. However, the order of the transcription factor binding sites cannot be mined by association rule mining, and the number of rules it produces is enormous. This study uses the known TFBS in the TRANSFAC database to mark the TFBS positions in the upstream sequences of genes. A sequential pattern mining technique is proposed to analyze the permutation of transcription factor binding sites in the upstream regions of genes. The differences between sequential pattern mining and association rule mining are explored. The results show that sequential pattern mining finds the combinations and permutations of transcription factor binding sites more efficiently, and thus saves the time a biologist would otherwise spend on experimental validation.
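The distinction drawn here between association rules, which ignore order, and sequential patterns, which respect it, can be made concrete with a small sketch that computes both kinds of support for one candidate pattern. The promoter annotations and TF names below are hypothetical, and this is only a support count, not a full mining algorithm.

```python
def contains_in_order(sequence, pattern):
    """True if pattern occurs in sequence as an ordered (gapped) subsequence."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the matched item,
    # so later pattern items must appear after earlier ones.
    return all(item in remaining for item in pattern)


def sequential_support(sequences, pattern):
    """Fraction of sequences containing the pattern in the given order."""
    if not sequences:
        return 0.0
    return sum(contains_in_order(s, pattern) for s in sequences) / len(sequences)


def itemset_support(sequences, pattern):
    """Fraction of sequences containing all pattern items in any order."""
    if not sequences:
        return 0.0
    return sum(set(pattern) <= set(s) for s in sequences) / len(sequences)


# Hypothetical TFBS orderings along three upstream regions.
PROMOTERS = [["E2F", "NFY", "SP1"], ["NFY", "E2F"], ["E2F", "SP1", "NFY"]]
```

For the pattern `("E2F", "NFY")`, the order-blind support is 1.0 because every promoter contains both sites, while the ordered support is 2/3 because the second promoter has them reversed.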
Zhao, Xiaoyan. "Improved Algorithms for Discovery of Transcription Factor Binding Sites in DNA Sequences." Thesis, 2010. http://hdl.handle.net/1969.1/ETD-TAMU-2010-12-8834.
Wei-Hao, Yuan. "Extracting Transcription Factor Binding Sites from Unaligned Gene Sequences with Statistical Models." 2006. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0016-1303200709314000.
Yuan, Wei-Hao, and 袁偉豪. "Extracting Transcription Factor Binding Sites from Unaligned Gene Sequences with Statistical Models." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/09117995586690727615.
National Tsing Hua University
Department of Electrical Engineering
94
Transcription factor binding sites (motifs) are crucial in the regulation of gene transcription. Recently, chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP array) has been used to identify potential regulatory sequences, but the procedure can only map probable protein-DNA interaction loci at a resolution of 1-2 kilobases. To find the exact binding motifs, a computational method is needed to examine the ChIP-array binding sequences and search for possible motifs representing the transcription factor binding sites. In this thesis, we design a program that finds accurate motif sites in the yeast genome using dependency graphs and their expanded Bayesian networks. The program incorporates a binomial probability model to build significant initial motif sets. Finally, we compare our results with those obtained from well-known programs and show that our program outperforms them in consistency with known specificities.
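One common way a binomial model is used to judge whether a candidate motif is significant is the upper-tail probability of seeing it in at least k of n sequences by chance. The sketch below shows that calculation; whether the thesis applies the model in exactly this way is an assumption, and the per-sequence chance probability is a made-up parameter.

```python
import math


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p).

    k -- number of sequences in which the candidate motif was observed
    n -- total number of sequences examined
    p -- assumed chance probability that one sequence contains the motif
    """
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )
```

A candidate seen in 8 of 10 sequences, when chance alone gives 0.1 per sequence, has a tail probability well below 1e-5, so it would be a plausible seed for the initial motif set. For large n, a log-space or library implementation would be numerically safer than this direct sum.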
Church, William David. "Mapping the YY1 and p65 binding sites on the transcription factor LSF." Thesis, 2013. https://hdl.handle.net/2144/14244.
Hsu, Chih-Kai, and 許智凱. "A two-way predicting computational tool website for transcription factor binding site and vertebrate genomes." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/60071928811643946109.
National Taiwan Ocean University
Institute of Biotechnology
97
Transcriptional control by transcription factors plays a central role in the regulation of gene expression. Given a list of genes involved in a biological phenomenon, generated in a gene expression profiling experiment, a commonly asked question is which transcription factors control the battery of genes, and how. eMOG (Extraction of Motif or Gene) is a web tool that we developed to analyze the upstream sequences of a given gene list. eMOG scans the upstream sequences of the genes and, by evaluating a probability score, discovers the over-represented known transcription factor binding sites (TFBS). Furthermore, eMOG allows users to supply TF names to predict the genes that are potentially regulated by the given TFs. Finally, the user can visualize the TFBS patterns on the upstream sequences of genes using Scalable Vector Graphics (SVG). We tested eMOG on 115 human genes whose upstream sequences are bound by the E2F family, a TF family that regulates entry into S phase of the cell cycle. eMOG revealed four TFBS (E2F, CREB, NF-Y, Nrf-1) that are over-represented in the upstream sequences of those 115 genes. Moreover, using reverse eMOG, we discovered another 27 genes that are potentially under the transcriptional control of these four TFs. Functional analysis of these 27 genes reveals that 14 are known to be directly related to cell cycle control and two are associated with membrane receptors. Interestingly, using the same approach, 26 mouse genes were found to be potentially under the transcriptional control of the same four TFs. The functions of 11 of these 26 mouse genes are known to be related to cell cycle control.
Fu, Changjui, and 傅昶瑞. "A mixed 0-1 linear programming approach for finding transcription factor binding sites." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/80211213408434043290.
National Chiao Tung University
Institute of Information Management
95
Discriminating transcription factor binding sites (TFBS) in multiple DNA sequences is essential for the functional analysis of gene expression. Enumeration methods that search all possible patterns have the best precision among current algorithms, but they require exponential computational time and have difficulty searching for longer patterns. A predefined shared pattern can notably prune the search space, but such information is often unavailable. Finding unframed TFBS today still relies on heuristic approaches that compromise accuracy. To find TFBS effectively, this study develops a mixed 0-1 linear programming approach to solve a series of problems, including fixed-pattern TFBS finding, ambiguous-spacer TFBS finding and pattern-free TFBS finding. The proposed method has the following advantages over current methods: (1) a pattern-driven instead of sample-driven (or sequence-driven) design; (2) a globally optimal solution is guaranteed; (3) structural features of motifs can be embedded to facilitate the search process. With the pattern-free approach, we can also successfully determine TFBS with dispersed spacers. We ran several experiments on each kind of TFBS-finding program, and in these examples the real TFBS were successfully determined in an acceptable computational time.
Gonsalves, Sarah E. "Identification of Heat Shock Factor Binding Sites in the Drosophila Genome." Thesis, 2012. http://hdl.handle.net/1807/34017.