
Dissertations / Theses on the topic 'Bioinformatics. eng'


Consult the top 42 dissertations / theses for your research on the topic 'Bioinformatics. eng.'


1

Yamasaki, Lílian Hiromi Tomonari. "Análise por ferramentas de bioinformática da proteína não-estrutural 5A do vírus da hepatite C genótipo 1 e 3 em amostras pré-tratamento /." São José do Rio Preto : [s.n.], 2010. http://hdl.handle.net/11449/94818.

Abstract:
Abstract (translated from the Portuguese Resumo): Hepatitis C virus (HCV) infection has been considered a major public health problem since the virus was discovered in 1989. However, the therapy most widely used today, based on Peginterferon, succeeds in only about 50% of patients with genotype 1. Although the mechanisms behind this viral resistance remain unclear, both viral and host factors are thought to be involved. The non-structural 5A protein (NS5A) takes part in several cellular processes and is an essential component of HCV, yet its structure and function are still poorly understood. The aims of this study were therefore to build a theoretical model of NS5A and to investigate its structural and functional properties in silico. A total of 345 HCV NS5A protein sequences from 23 patients infected with genotype 1 or 3 were analysed. Amino acid and secondary structure compositions differed between the genotypes, which may indicate differences in protein-protein interactions between them and could be related to the difference in treatment resistance rates. Functional analysis was performed with ProtFun, which suggested that NS5A is involved in the cellular functions of central intermediary metabolism, translation, growth, transport, binding and hormone. These functions varied between domains, supporting the hypothesis that NS5A is a multifunctional protein. PROSITE analysis indicated several glycosylation, phosphorylation and myristoylation sites that are highly conserved and may play an important role in stabilising structure and function, making them possible targets for new antivirals. Some of them lie in regions related to treatment response. Other... (For the complete Portuguese abstract, see the electronic access link.)
Abstract: Hepatitis C virus (HCV) infects almost 3% of people worldwide and is considered the main cause of chronic liver disease and liver transplants. To date there is no effective vaccine, and the most widely used current therapy, based on Peginterferon, is successful in only 50% of patients infected by genotype 1. Although the causes of this treatment resistance are unclear, it is suggested that host and virus factors may participate in this mechanism. The non-structural 5A (NS5A) protein is involved in several cellular and viral processes and is a critical component of HCV. However, its structure and function are still uncertain. In view of these facts, the aims of the present study were to build a model of the NS5A protein and to investigate its structural and functional features using in silico tools. In total, 345 sequences of HCV NS5A protein from 23 patients infected by genotypes 1 or 3 were analyzed. The residue and secondary structure composition of all sequences showed differences between genotypes. This may indicate differences in interactions between genotypes, which could be related to the distinct rates of treatment resistance. In addition, among the amino acids that varied between genotypes, some lie in regions that studies have suggested are related to virus persistence. Functional analysis was performed with ProtFun. It suggested that NS5A is involved in central intermediary metabolism, translation, growth, transport, binding and hormone functions in the cell. These functions vary between the domains, strengthening the hypothesis that NS5A is a multifunctional protein. A PROSITE motif search indicated many glycosylation, phosphorylation and myristoylation sites, which are highly conserved and may play an important role in structural stabilization and... (For the complete abstract, see the electronic access link.)
Advisor: Paula Rahal
Co-advisor: Helen Andrade Arcuri
Examining committee: Fernanda Canduri, Carlos Alberto Montanari
Degree: Master's
2

Acencio, Marcio Luis. "Construção e análise da rede integrada de interações entre genes humanos envolvida com a regulação da transição G1/S do ciclo celular pela adesão à matriz extracelular /." Botucatu : [s.n.], 2011. http://hdl.handle.net/11449/102703.

Abstract:
Advisor: Ney Lemke
Examining committee: José Carlos Merino Mombach, Marcos Roberto de Mattos Fontes, Tie Koide, Deilson Elgui de Oliveira
Abstract (translated from the Portuguese Resumo): Virtually all normal cells, with the exception of hematopoietic cells, must be attached to the extracellular matrix in order to proliferate. In the absence of adhesion, these cells stop proliferating and eventually undergo apoptosis. After oncogenic transformation, however, cells acquire the ability to proliferate without adhesion to the extracellular matrix. This ability, whose molecular basis lies in abnormal regulation of the G1/S transition of the cell cycle by adhesion, is one of the fundamental properties of cancer cells and also a prerequisite for them to acquire metastatic capability. Since metastases account for approximately 90% of cancer deaths, elucidating the molecular mechanisms underlying the regulation of the G1/S cell cycle transition by adhesion to the extracellular matrix is essential for developing drugs that can inhibit metastasis formation. To elucidate these mechanisms, this work adopted a strictly computational approach based on network theory and machine learning, developing new methods for (i) building networks that represent the probable regulation between two different processes (here, regulation of the G1/S transition by adhesion to the extracellular matrix), (ii) predicting oncogenic interactions, (iii) determining subnetworks of oncogenic signalling pathways between two genes of interest in a network (named graph2sig) and (iv) predicting potential drug targets. The constructed network potentially involved in the regulation of the G1/S cell cycle transition by the extracellular matrix (Gccam) has ~2,000 genes and ~20,000 interactions and represents ~78% of the biological processes known to be involved in this regulation... (For the complete Portuguese abstract, see the electronic access link.)
Abstract: Virtually all normal cells, excluding the hematopoietic cells, require anchorage to the extracellular matrix for their proliferation and survival. When such cells are deprived of anchorage, they arrest in the G1 phase of the cell cycle and eventually undergo apoptosis. Cancer cells, on the other hand, acquire the ability to perform anchorage-independent proliferation as a result of the disruption of the regulation of the G1/S cell cycle transition by adhesion to the extracellular matrix. Anchorage-independent proliferation is the foundation for the tumorigenicity and metastatic capability of cancer cells. As metastases are the cause of 90% of human cancer deaths, it is crucial to decipher the molecular mechanisms underlying the regulation of the G1/S cell cycle transition by adhesion to the extracellular matrix. In order to decipher such mechanisms, we developed in the present work machine learning and graph theory-based computational methods for (i) the construction of networks representing the regulatory relationships between two biological processes of interest, (ii) the prediction of oncogenic interactions, (iii) the extraction of oncogenic signaling subnetworks between two genes and (iv) the prediction of druggable genes. The network representing the regulatory relationships between the G1/S cell cycle transition and adhesion to the extracellular matrix, Gccam, comprises 2,000 genes and 20,000 interactions. Moreover, 78% of the known biological processes involved in the regulation of the G1/S cell cycle transition by adhesion to the extracellular matrix are embedded in Gccam. Through the prediction of oncogenic interactions and the extraction of oncogenic signaling subnetworks between EGFR and CDC6, genes that encode proteins likely to be relevant to anchorage-independent proliferation, we postulate the following hypotheses for the molecular mechanisms underlying anchorage-independent proliferation: cancer... (For the complete abstract, see the electronic access link.)
Degree: Doctorate
3

Gustafsson, Johan. "Finding potential electroencephalography parameters for identifying clinical depression." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-256392.

Abstract:
This master's thesis describes signal processing parameters of electroencephalography (EEG) signals that differ significantly between an animal model of clinical depression and a non-depressed animal model. The signal from the depressed model had weaker power in the gamma band (30 - 80 Hz) than the non-depressed model during wakefulness and stronger power in the delta band (1.5 - 4 Hz) during sleep. The report describes how visualisation was used to understand the shape of the signal, which helps with interpreting results and with developing parameters. A generic tool for time-frequency analysis was improved to cope with the size of the week-long EEG dataset. A method was developed for evaluating how well the EEG parameters separate the strains using recordings that are as short as possible. This project shows that it is possible to separate an animal model of depression from a non-depressed animal model based on its EEG, and that EEG classifiers may work as indicative classifiers for depression. Not much data is needed. Further studies are needed to verify that the results are not overly sensitive to the recording setup and to study to what extent the results are translational. Some of the EEG parameters with significant differences described here may be limited to describing the difference between the two strains, FSL and SD. However, the classifiers have reasonable biological explanations that make them good candidates for translational EEG-based classifiers for clinical depression.
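To make the band-power comparison in this abstract concrete, here is a minimal Python sketch of how power in the delta (1.5 - 4 Hz) and gamma (30 - 80 Hz) bands could be estimated from a signal. It is an illustration only, not the thesis's tool: the function name `band_power`, the sampling rate and the synthetic signal are all assumptions, and a plain periodogram stands in for whatever time-frequency method the author used.

```python
import numpy as np

def band_power(signal, fs, low, high):
    """Mean periodogram power of `signal` between `low` and `high` Hz."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()  # remove the DC offset
    spectrum = np.abs(np.fft.rfft(signal)) ** 2 / (fs * signal.size)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    mask = (freqs >= low) & (freqs <= high)
    return spectrum[mask].mean()

# Synthetic stand-in for an EEG trace (hypothetical values):
# a strong 3 Hz (delta) component plus a weaker 40 Hz (gamma) component.
fs = 250                        # assumed sampling rate in Hz
t = np.arange(10 * fs) / fs     # 10 seconds of samples
eeg = 2.0 * np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

delta = band_power(eeg, fs, 1.5, 4.0)
gamma = band_power(eeg, fs, 30.0, 80.0)
print(f"delta power: {delta:.4f}, gamma power: {gamma:.4f}")
```

A per-band value like this, computed separately for sleep and wake segments, is the kind of parameter that could then be compared between strains.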
4

Martin, Paul. "Post-GWAS bioinformatics and functional analysis of disease susceptibility loci." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/postgwas-bioinformatics-and-functional-analysis-of-disease-susceptibility-loci(cc0e6cee-5c32-4b75-b3d3-f7c18b6f126d).html.

Abstract:
Genome-wide association studies (GWAS) have been tremendously successful in identifying genetic variants associated with complex diseases, such as rheumatoid arthritis (RA). However, the majority of these associations lie outside traditional protein coding regions and do not necessarily represent the causal effect. Therefore, the challenges post-GWAS are to identify causal variants, link them to target genes and explore the functional mechanisms involved in disease. The aim of the work presented here is to use high level bioinformatics to help address these challenges. There is now an increasing amount of experimental data generated by several large consortia with the aim of characterising the non-coding regions of the human genome, which has the ability to refine and prioritise genetic associations. However, whilst being publicly available, manually mining and utilising it to full effect can be prohibitive. I developed an automated tool, ASSIMILATOR, which quickly and effectively facilitated the mining and rapid interpretation of this data, inferring the likely functional consequence of variants and informing further investigation. This was used in a large extended GWAS in RA which assessed the functional impact of associated variants at the 22q12 locus, showing evidence that they could affect gene regulation. Environmental factors, such as vitamin D, can also affect gene regulation, increasing the risk of disease but are generally not incorporated into most GWAS. Vitamin D deficiency is common in RA and can regulate genes through vitamin D response elements (VDREs). I interrogated a large, publicly available VDRE ChIP-Seq dataset using a permutation testing approach to test for VDRE enrichment in RA loci. 
This study was the first comprehensive analysis of VDREs and RA associated variants and showed that they are enriched for VDREs, suggesting an involvement of vitamin D in RA. Indeed, evidence suggests that disease associated variants affect gene regulation through enhancer elements. These can act over large distances through physical interactions. A newly developed technique, Capture Hi-C, was used to identify regions of the genome which physically interact with associated variants for four autoimmune diseases. This study showed the complex physical interactions between genetic elements, which could be mediated by regions associated with disease. This work is pivotal in fully characterising genetic associations and determining their effect on disease. Further work has re-defined the 6q23 locus, a region associated with multiple diseases, resulting in a major re-evaluation of the likely causal gene in RA from TNFAIP3 to IL20RA, a druggable target, illustrating the huge potential of this research. Furthermore, it has been used to study the genetic associations unique to multiple sclerosis in the same region, showing chromatin interactions which support previously implicated genes and identify novel candidates. This could help improve our understanding and treatment of the disease. Bioinformatics is fundamental to fully exploit new and existing datasets and has made many positive impacts on our understanding of complex disease. This empowers researchers to fully explore disease aetiology and to further the discovery of new therapies.
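The permutation testing approach mentioned in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the author's pipeline: it treats the genome as a single coordinate axis, relocates each locus uniformly at random while keeping its length, and uses invented coordinates. The function names `count_hits` and `permutation_p_value` are hypothetical.

```python
import random
from bisect import bisect_left

def count_hits(loci, sites):
    """Number of (start, end) loci containing at least one site; `sites` must be sorted."""
    hits = 0
    for start, end in loci:
        i = bisect_left(sites, start)          # first site at or after the locus start
        if i < len(sites) and sites[i] <= end:
            hits += 1
    return hits

def permutation_p_value(loci, sites, genome_length, n_perm=1000, seed=0):
    """One-sided empirical p-value for enrichment of binding sites within loci."""
    rng = random.Random(seed)
    sites = sorted(sites)
    observed = count_hits(loci, sites)
    as_extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in loci:
            length = end - start
            new_start = rng.randint(0, genome_length - length)  # random relocation
            shuffled.append((new_start, new_start + length))
        if count_hits(shuffled, sites) >= observed:
            as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (as_extreme + 1) / (n_perm + 1)

# Toy data (invented coordinates): four disease loci and five binding sites.
loci = [(1000, 2000), (50000, 51000), (400000, 401000), (700000, 701000)]
sites = [1500, 50500, 400200, 700900, 900000]
observed, p_value = permutation_p_value(loci, sites, genome_length=1_000_000)
print(f"loci overlapping a site: {observed}, empirical p = {p_value:.4f}")
```

A real analysis would usually permute within chromosomes and match the null loci on properties such as gene density, which this sketch deliberately omits.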
5

Anwar, Maryam. "A bioinformatics approach to building an otic gene regulatory network." Thesis, King's College London (University of London), 2016. https://kclpure.kcl.ac.uk/portal/en/theses/a-bioinformatics-approach-to-building-an-otic-gene-regulatory-network(506591df-8f55-4585-a7a1-86d18d65af1a).html.

Abstract:
During development, the coordinated and sequential action of signals and regulatory factors controls how cells become different from each other and acquire specific fates. This information can be integrated in gene regulatory networks (GRNs) that model these processes over time and consider temporal and spatial changes of gene expression and how these are regulated. During early development, vertebrate sensory organs arise from the pre-placodal region at the border of the neural plate. Subsequently, FGF signalling plays a crucial role in inducing otic-epibranchial progenitors that ultimately give rise to the otic and epibranchial placodes. Downstream of FGF signalling, many transcription factors are activated. However, their regulatory relationships are not very clear. This project uses a bioinformatics approach to establish a GRN to model how multipotent progenitors transit through sequential regulatory states until they are committed to the ear lineage. To this end, using systematic perturbation experiments, new ear-specific genes have been identified some of which respond early to FGF. Focussing on these early genes, I have used phylogenetic footprinting combined with histone ChIP-seq to identify novel enhancers. Subsequently, I have investigated transcription factor binding sites within these enhancers to identify a small group of common regulators. In parallel, using mRNA-seq and perturbation data, I have reverse-engineered GRNs that recapitulate known interactions and predict new ones. Using a combination of these approaches, I have ultimately enriched a preliminary literature-based GRN by placing otic genes and their interactions into a hierarchy. Thus, this network is a resource for identifying key otic regulators and their targets and provides guidelines for future experiments.
6

Martínez, Fundichely Alexander 1978. "Bioinformatic characterization and analysis of polymorphic inversions in the human genome." Doctoral thesis, Universitat Pompeu Fabra, 2013. http://hdl.handle.net/10803/384837.

Abstract:
Amid the great interest in characterizing genomic structural variants (SVs) in the human genome, inversions present unique challenges and have been little studied. This thesis developed "GRIAL", a new algorithm designed specifically to detect and accurately map inversions from paired-end mapping (PEM) data, the most widely used method for detecting SVs. GRIAL uses geometrical rules to cluster, merge and refine both breakpoints of putative inversions. In this way, we were able to predict hundreds of inversions in the human genome. In addition, thanks to the different GRIAL quality scores, we were able to identify spurious PEM patterns and their causes, and to discard a large fraction of the predicted inversions as false positives. Furthermore, we created "InvFEST", the first database of human polymorphic inversions, which represents the most reliable catalogue of inversions and integrates all the associated information from multiple sources. Currently, InvFEST combines information from 30 different studies and contains 1092 candidate inversions, which are categorized based on internal scores and manual curation. Finally, the analysis of all the data generated has provided information on the genomic patterns of inversions, contributing decisively to the understanding of the map of human polymorphic inversions.
Abstract (translated from Spanish): Among the structural variants studied in the human genome, inversions are those with the least consolidated results, and they remain one of the main challenges today. This thesis addresses the topic by implementing "GRIAL", a new algorithm designed specifically to detect inversions as precisely as possible using paired-end mapping (PEM), the most widely used method for studying structural variation. GRIAL relies on geometrical rules to group the PEM patterns that point to a possible inversion breakpoint; it also joins the breakpoints that correspond to independent inversions and refines their location as exactly as possible. Using it allowed us to predict hundreds of inversions. A major contribution of our method is the creation of reliability scores for the predictions, through which we identify incorrect inversion patterns and their causes. This allowed us to filter our results, eliminating a large number of possibly false predictions. In addition, "InvFEST" was created: the first database dedicated specifically to polymorphic inversions in the human genome, which represents the most reliable catalogue of inversions and also attaches to each known inversion the associated information available. InvFEST currently contains a catalogue of 1092 classified inversions, built from data from 30 different studies, and maintains their classification by level of certainty. Finally, the analysis of all the information generated allowed us to describe some patterns of polymorphic inversions in the human genome, thereby contributing to the understanding of this structural variant and the state of knowledge about it in human genome studies.
Keyword (translated from Catalan): Genomic inversion
7

Codicè, Francesco. "Rete neurale per la predizione end-to-end dello stato di ossidazione delle cisteine e la connettività dei ponti disolfuro." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20593/.

Abstract:
Abstract (translated from Italian): Proteins are macromolecules fundamental to a great many biological processes essential for living organisms, and they can perform a variety of functions: for example, they can act as antibodies that protect organisms from external pathogens, or they can play structural roles. Proteins consist of chains of amino acids that determine their shape, that is, the way a protein folds, which in turn determines the function it performs. Among the factors that matter for protein conformation is the amino acid cysteine. Its distinctive functional feature is that it can form strong bonds called disulfide bridges. Disulfide bridges play an important structural and functional role in proteins. These covalent bonds form through the oxidation of pairs of cysteines. Because these bonds are of particular interest, several in silico computational methods have been developed to predict, given a protein's sequence, which cysteines are involved in them. The problem is commonly tackled with two approaches: in the first, given the amino acid sequence of a protein, the oxidation state of the cysteines is predicted, that is, a binary prediction is made for each cysteine of whether or not it is involved in a disulfide bridge; in the second, the connectivity pattern of the cysteines is predicted, that is, which pairs of cysteines are linked by disulfide bridges. This thesis describes the construction of a neural network based on multitask learning, in other words the training of a single model to make different predictions simultaneously by sharing part of the model's parameters. It describes the construction of a multitask neural network that predicts, in a single model, both the oxidation state of the cysteines and their connectivity pattern, starting from the amino acid sequence.
8

Duck, Geraint. "Extraction of database and software usage patterns from the bioinformatics literature." Thesis, University of Manchester, 2015. https://www.research.manchester.ac.uk/portal/en/theses/extraction-of-database-and-software-usage-patterns-from-the-bioinformatics-literature(fac16cb8-5b5b-4732-b7af-77a41cc64487).html.

Abstract:
Method forms the basis of scientific research, enabling criticism, selection and extension of current knowledge. However, methods are usually confined to the literature, where they are often difficult to find, understand, compare, or repeat. Bioinformatics and computational biology provide a rich opportunity for resource creation and discovery, with a rapidly expanding "resourceome". Many of these resources are difficult to find due to the large choice available, and there are only a limited number of sufficiently populated lists that can help inform resource selection. Text mining has enabled large scale data analysis and extraction from within the scientific literature, and as such can provide a way to help explore the vast wealth of resources available, which form the basis of bioinformatics methods. As such, this thesis aims to survey the computational biology literature, using text mining to extract database and software resource name mentions. By evaluating the common pairs and patterns of usage of these resources within such articles, an abstract approximation of the in silico methods employed within the target domain is developed. Specifically, this thesis provides an analysis of the difficulties of resource name extraction from the literature, then using this knowledge to develop bioNerDS - a rule-based system that can detect database and software name mentions within full-text documents (with a final F-score of 67%). bioNerDS is then applied to the full-text document corpus from PubMed Central, the results of which are then explored to identify the differences in resource usage between different domains (bioinformatics, biology and medicine) through time, different journals and different document sections. In particular, the well established resources (e.g., BLAST, GO and GenBank) remain pervasive throughout the domains, although they are seeing a slight decline in usage. 
Statistical programs see high levels of usage, with R in bioinformatics and SPSS in medicine being frequently mentioned throughout the literature. An overview of the common resource pairs has been generated by pairing database and software names which directly co-occur after one another in text. Combining and aggregating these resource pairs together across the literature enables the generation of a network of common resource patterns within computational biology, which provides an abstract representation of the common in silico methods used. For example, sequence alignment tools remain an important part of several computational biology analysis pipelines, and GO is a strong network sink (primarily used for data annotation). The networks also show the emergence of proteomics and next generation sequencing resources, and provide a specialised overview of a typical phylogenetics method. This work performs an analysis of common resource usage patterns, and thus provides an important first step towards in silico method extraction using text-mining. This should have future implications in community best practice, both for resource and method selection.
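The pairing-and-aggregation step described in this abstract can be illustrated with a short Python sketch. It assumes that resource name mentions have already been extracted per document (the job bioNerDS performs in the thesis) and only shows how consecutive mentions could be turned into a directed network and how a sink such as GO would show up. The mention lists are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter

def resource_pairs(mentions):
    """Ordered pairs of distinct resource names that directly follow one another."""
    return [(a, b) for a, b in zip(mentions, mentions[1:]) if a != b]

def build_network(documents):
    """Aggregate directed pair counts across all documents."""
    edges = Counter()
    for mentions in documents:
        edges.update(resource_pairs(mentions))
    return edges

def sinks(edges):
    """Resources that receive edges but never start one."""
    sources = {a for a, _ in edges}
    targets = {b for _, b in edges}
    return sorted(targets - sources)

# Invented mention sequences, one list per document, in order of appearance.
docs = [
    ["GenBank", "BLAST", "GO"],
    ["BLAST", "GO"],
    ["GenBank", "BLAST", "R", "GO"],
]
network = build_network(docs)
for (src, dst), count in network.most_common():
    print(f"{src} -> {dst}: {count}")
print("sinks:", sinks(network))
```

Edge weights here are raw counts; a fuller analysis might normalise them by how often each resource is mentioned overall.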
9

Nambiar, Kate. "Bioinformatic analysis of peptide microarray immunoassay data for serological diagnosis of infectious diseases." Thesis, University of Brighton, 2017. https://research.brighton.ac.uk/en/studentTheses/de2be38a-5941-4bc5-adb9-1c200b07c193.

Abstract:
Understanding the antibody-antigen interactions that occur in infectious diseases is important for understanding aetiology, can help facilitate diagnosis, and could offer potential targets for vaccine or therapeutic antibody development. Peptide arrays (collections of short peptides immobilised on solid planar supports) offer a high-throughput and highly parallel method of identifying immunogenic epitopes and relating patterns of antibody identification to clinical disease states. As technology advances, the density and complexity of peptide arrays become ever higher. Managing the large volume of data that modern high-density microarrays generate requires sophisticated bioinformatics in order to minimise errors and biases. In this thesis I introduce a new software package, pmpa, that uses R, the open source statistical programming platform, and an object-oriented framework from the Bioconductor project. The package facilitates analysis of peptide microarray data, including functions for reading scanned data files, quality assessment and pre-processing. It is both flexible and modular, integrating with existing software in the Bioconductor repository. Data pre-processing is key to any microarray analysis. Noise due to technical variation can obscure true biological effects if careful steps are not taken. The aim of pre-processing is to minimise noise while preserving biological variation. No consensus exists as to the optimal method of pre-processing, making comparison between studies difficult. This thesis explores two key aspects of pre-processing, background correction and normalisation, using two experimental datasets: a titration series of an anti-C. difficile Toxin B monoclonal antibody, and a dataset with an anti-Toxin A antibody spiked into non-immune sera. These are used to examine biases introduced by the pre-processing and whether the methods improve measures such as precision and differential identification.
Finally, the analysis method is applied to two studies identifying antibody signatures in infectious diseases: the first investigating immune responses to C. difficile, a major hospital-acquired infection and the leading identifiable cause of antibiotic-associated diarrhoea, and the second characterising antibody signatures that define paediatric tuberculosis infection. The real-world application of the methodology identifies signatures of immune responses characterising clinical disease (e.g. relapsing vs. single-episode C. difficile infection), but also highlights a number of limitations of the technique, such as batch confounding and response variability.
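As a rough illustration of the two pre-processing steps this abstract names, background correction and normalisation, the sketch below applies one simple variant of each: subtracting local background with a floor, then median-centring each array on the log2 scale. These particular choices are assumptions made for the example and are not claimed to be the methods evaluated in the thesis. The pmpa package itself is written in R; Python is used here only for the sketch, and the intensity values are invented.

```python
import numpy as np

def preprocess(foreground, background, floor=1.0):
    """Background-subtract, log2-transform and median-centre each array.

    Rows are peptides, columns are arrays (samples).
    """
    fg = np.asarray(foreground, dtype=float)
    bg = np.asarray(background, dtype=float)
    corrected = np.maximum(fg - bg, floor)   # clamp so the logarithm is defined
    logged = np.log2(corrected)
    # Subtract each array's median so arrays share a common centre.
    return logged - np.median(logged, axis=0, keepdims=True)

# Invented intensities: array 2 is simply twice as bright as array 1,
# a purely technical difference that normalisation should remove.
foreground = [[1100, 2200], [600, 1200], [350, 700]]
background = [[100, 200], [100, 200], [100, 200]]
normalised = preprocess(foreground, background)
print(normalised)
```

After this step the two columns coincide, which shows the intent of normalisation: technical scale differences disappear while the relative ordering of peptides within an array is preserved.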
10

Xiang, Jie. "Functional genomic study and bioinformatic analysis on natural bioactive peptides." Thesis, Queen's University Belfast, 2017. https://pure.qub.ac.uk/portal/en/theses/functional-genomic-study-and-bioinformatic-analysis-on-natural-bioactive-peptides(4c3d14a3-7484-463b-8c82-cdc9abc5052e).html.

Abstract:
An extraordinarily diverse range of animal toxin components has emerged over the long-term evolution of animals; these toxins have unique functions that help animals survive in the wild. Specifically, the granular glands of amphibians contain multiple chemical compounds, among which peptides are one of the major types of constituent. This thesis is divided into six chapters. Chapter 1, a general introduction, describes the detailed background on peptide-based therapeutic agents, the promising therapeutic application of animal toxins, and a brief review of anuran skins and their secretions. Chapter 2 presents the materials and methods employed in this project. In chapter 3, a novel bradykinin-related peptide was isolated and identified from the skin secretion of Odorrana livida using shotgun cloning and a tandem mass spectrometry fragmentation sequencing approach. This peptide exhibited a dose-dependent contractile effect on rat bladder and ileum and increased the contraction frequency of rat uterus in ex vivo smooth muscle preparations. It also showed vasorelaxant activity on rat tail artery smooth muscle. In addition, the peptide was modified by substituting the penultimate amino acid in the amino terminus from phenylalanine to leucine. The analogue completely lacked the parental peptide's activity, but showed an inhibitory effect on bradykinin-induced rat tail artery smooth muscle relaxation. By using specific antagonists for bradykinin B1 and B2 receptors, we found that the bradykinin B2 receptor is highly likely to be involved in the rat tail artery effects caused by this novel bradykinin-related peptide and its analogue. Chapters 4 and 5 describe the discovery of two pairs of novel antimicrobial peptides belonging to the Bombinin and Bombinin H families, respectively, from the skin secretion of the Bombina genus.
In chapter 4, sequence modification was applied to bombinin HL by replacing the L-isomer leucine with the D-isomer leucine at the second position from the amino terminus. Both the wild-type and modified peptides displayed a well-defined α-helical structure in a bacterial membrane-mimicking environment. BHL-bombinin displayed broad-spectrum bactericidal activity against a wide range of microorganisms, while bombinin H only exhibited a mild bacteriostatic effect on Gram-positive bacteria. Synergistic antimicrobial effects were observed between BHL-bombinin and bombinin H and between bombinin H and ampicillin. In addition, haemolytic and cytotoxic examination of these three peptides showed highly synergistic selectivity and low cytotoxicity on mammalian cells. In chapter 5, sequence modification was applied to BHK-bombinin by replacing the glutamic acid with lysine at the 23rd position from the amino terminus, which increased the net charge and expanded the nonpolar face of the original peptide. According to the results of in vitro functional analysis, this modification strategy significantly improved the selectivity index of the peptide, with increased antimicrobial activity and decreased haemolytic activity. The combined antimicrobial evaluation of both natural and modified peptides showed synergistic inhibitory activity against both Gram-positive bacteria and yeast. In summary, this thesis demonstrates a combined strategy of molecular cloning and mass spectrometry for identifying novel host-defence peptides from amphibian skin secretions. In vitro and ex vivo functional evaluations were subsequently employed, which not only give us a better understanding of the diversity of naturally sourced bioactive peptides but also emphasise the research value of characterising their mechanisms in depth.
11

Frousios, Kimon. "Bioinformatic analysis of genomic sequencing data : read alignment and variant evaluation." Thesis, King's College London (University of London), 2014. http://kclpure.kcl.ac.uk/portal/en/theses/bioinformatic-analysis-of-genomic-sequencing-data(e3a55df7-543e-4eaa-a81e-6534eacf6250).html.

Abstract:
The invention and rise in popularity of Next Generation Sequencing technologies has led to a steep increase of sequencing data and the rise of new challenges. This thesis aims to contribute methods for the analysis of NGS data, and focuses on two of the challenges presented by these data. The first challenge regards the need for NGS reads to be aligned to a reference sequence, as their short length complicates direct assembly. A great number of tools exist that carry out this task quickly and efficiently, yet they all rely on the mere count of mismatches in order to assess alignments, ignoring the knowledge that genome composition and mutation frequencies are biased. Thus, the use of a scoring matrix that incorporates the mutation and composition biases observed among humans was tested with simulated reads. The scoring matrix was implemented and incorporated into the in-house algorithm REAL, allowing side-by-side comparison of the performance of the biased model and the mismatch count. The algorithm REAL was also used to investigate the applicability of NGS RNA-seq data to the understanding of the relationship between genomic expression and the compartmentalisation of genomic base composition into isochores. The second challenge regards the evaluation of the variants (SNPs) that are discovered by sequencing. NGS technologies have caused a sharp rise in the rate with which new SNPs are discovered, rendering impossible the experimental validation of each one. Several tools exist that take into account various properties of the genome, the transcripts and the protein products relevant to the location of a SNP and attempt to predict the SNP's impact. These tools are valuable in screening and prioritising SNPs likely to have a causative association with a genetic disease of interest. Despite the number of individual tools and the diversity of their resources, no attempt had been made to draw a consensus among them. 
Two consensus approaches were considered, one based on a very simplistic vote majority of the tools considered, and one based on machine learning. Both methods proved to offer highly competitive classification both against the individual tools and against other consensus methods that were published in the meantime.
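As an illustration of the simpler of the two consensus approaches mentioned in the abstract, the following minimal sketch shows a plain majority vote over per-tool impact calls. The SNP identifiers, labels and tie-handling rule are assumptions made for the example; the abstract does not state which tools were combined or how ties were resolved.

```python
from collections import Counter


def majority_vote(calls):
    """Return the label given by most tools; a tie yields 'uncertain'.

    The tie rule is an assumption for this sketch, not taken from the thesis.
    """
    if not calls:
        raise ValueError("at least one tool call is required")
    ranked = Counter(calls).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "uncertain"
    return ranked[0][0]


# Hypothetical calls from several impact-prediction tools for three SNPs.
predictions = {
    "snp_a": ["damaging", "damaging", "benign"],
    "snp_b": ["benign", "benign", "benign"],
    "snp_c": ["damaging", "benign"],
}

consensus = {snp: majority_vote(calls) for snp, calls in predictions.items()}
print(consensus)
```

A vote of this kind needs no training data, which is one reason it is a natural baseline against a machine-learning consensus.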
12

Bray, Tracey. "From structure to function in proteins : a computational study." Thesis, University of Manchester, 2010. https://www.research.manchester.ac.uk/portal/en/theses/from-structure-to-function-a-computational-study(5a78c88c-f890-4c2f-9122-a7adec5d2ca0).html.

Abstract:
The study of proteins and their function is key to understanding how the cell works in normal and disease states. Historically, the study of protein function was limited to biochemical characterisation, but increases in computing power and in the number of available protein sequences and structures have allowed the relationship between sequence, structure and function to be explored. As the number of sequences and structures grows beyond the capacity for experimental groups to study them, computational approaches to inferring function become more important. Enzymes make up approximately half of the known protein sequences and structures, and most of the work in this thesis focuses on the relationship between the sequence, structure and function in enzymes. Firstly, the differences in sequence and structural features between enzymes of the six main functional classes are explored. Features that exhibited the most significant differences between the six classes were further studied to explore their link with function. This study suggested reasons as to why groups of functionally similar but non-homologous enzymes share similar sequence and structural features. A computational tool to predict EC class was then developed in an attempt to exploit the differences in these features. In order to calculate features relating to a particular active site to be used in the EC class prediction method, it was first necessary to predict the active site location. A comprehensive analysis of currently available functional site prediction tools identified an approach previously developed by this group as amongst the best-performing methods. Here, a tool was created to deliver this approach via a publicly available web server, which was subsequently used in the attempt to predict EC class. The study of differences in sequence and structural features between classes revealed differences in oligomeric status between functions. 
High-order oligomers were linked to an increase in metabolic control in the lyases, possibly via mechanisms such as cooperativity. To further test this idea, it was necessary to be able to computationally identify oligomeric enzymes that act cooperatively. Since no such method currently exists, the degree of coupling of dynamic fluctuations between subunits was explored as a possible way of detecting cooperativity. Whilst this was unsuccessful, the study highlighted the existence of a pattern of correlated motions that were conserved over a wide range of non-homologous and functionally diverse proteins. These observations shed further light on the link between sequence, structure and function and highlight the functional importance of dynamics in protein structures.
13

Feichtinger, Julia. "Development of a bioinformatic analytical approach to identify novel human cancer testis gene candidates." Thesis, Bangor University, 2012. https://research.bangor.ac.uk/portal/en/theses/development-of-bioinfirmatic-analytical-approach-to-identify-novel-human-cancer-testis-gene-candidates(09065b27-9fc0-49df-a9eb-fb7ef8aab878).html.

Abstract:
The identification of tumour antigens (TAs) represents an ongoing challenge to the development of novel cancer diagnostic, prognostic and therapeutic strategies. A group of proteins, the cancer testis (CT) antigens, are promising targets for such clinical applications. Their encoding genes show expression restricted to the immunologically privileged testes, but their expression is also found in cells with a cancerous phenotype. To facilitate and automate the identification of novel CT genes, bioinformatic analytical pipelines based on publicly available microarray and expressed sequence tag (EST) data were developed and implemented as web tools to support wider application. Human germline-associated datasets were generated and the developed screening pipelines were subsequently used to analyse these datasets, leading to the identification of a novel cohort of meiosis-specific genes, the meiCT genes, that exhibit the characteristics of CT genes and may have oncogenic features. In general, frequent germline gene expression found in cancer could reflect a soma-to-germline transformation occurring in human cells in the course of the development of cancer. The expression of germline-specific genes, in particular of meiotic genes, could lead to the production of proteins that cause oncogenic events and thus contribute to tumorigenesis and to the acquisition of tumour characteristics.
14

Santos, Anselmo Azevedo dos. "Exploração de uma biblioteca genômica de Passiflora edulis f. flavicarpa por sequenciamento de BAC-ends." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/11/11137/tde-22082013-160154/.

Abstract:
Yellow passion fruit (Passiflora edulis f. flavicarpa) is a fruit crop of considerable economic importance in Brazil. It is used to produce juice concentrate and is also marketed for consumption as fresh fruit. In the pharmaceutical industry, it is used to produce passiflora extract. The aim of this study was to explore the BAC (Bacterial Artificial Chromosome) genomic library (Ped-B-Flav) using BAC-end sequencing (BES) to provide initial insights into the composition and organization of the species' genome, and to generate new candidate molecular markers. Altogether, 9,979 sequencing reactions were performed, with an average efficiency of 89%, resulting in 8,821 high-quality BES ranging in length from 100 bp to 1,255 bp, with an average of 596 bp, totalling some 5.7 Mb of genomic information. In all, we identified 610 potential new microsatellite markers. Tetranucleotide motifs (28.9%) were the most abundant, and AATT was the most frequently observed motif, with 131 occurrences. We identified and classified 4,394 (19.69%) repetitive elements. Gypsy (10.08%) and copia-like (7.93%) retrotransposons were the most abundant. Furthermore, we found 767 (8.7%) sequences highly similar to protein-coding regions. These sequences were classified and annotated according to the Gene Ontology controlled vocabulary. Comparative genomic mapping revealed three regions showing microsynteny with the genome of Populus trichocarpa, one with the Vitis vinifera genome and one with the Arabidopsis thaliana genome. In addition, it revealed a series of rearranged regions in comparison with the reference genomes. This study showed that Passiflora edulis BES are an excellent source of information on the genome of the species, especially with regard to genetic diversity, identification of transposable elements and potential for the development of new genetic markers. It was also possible, using these sequences, to identify regions showing microsynteny between the yellow passion fruit genome and those of other, related plant species.
15

Sanja, Brdar. "Non-negative matrix factorization for integrative clustering." Phd thesis, Univerzitet u Novom Sadu, Fakultet tehničkih nauka u Novom Sadu, 2016. https://www.cris.uns.ac.rs/record.jsf?recordId=101841&source=NDLTD&language=en.

Abstract:
Integrative approaches are motivated by the desired improvement of robustness, stability and accuracy. Clustering, the prevailing technique for preliminary and exploratory analysis of experimental data, may benefit from integration across multiple partitions. In this thesis we propose integration methods based on non-negative matrix factorization that can fuse clusterings stemming from different data sets, different data preprocessing steps or different sub-samples of objects or features. The proposed methods are evaluated from several points of view on typical machine learning data sets, synthetic data and, above all, on data coming from the bioinformatics realm, whose rise is fuelled by technological revolutions in molecular biology. For the vast amounts of 'omics' data that are nowadays available, sophisticated computational methods are necessary. We evaluated the methods on problems from cancer genomics, functional genomics and metagenomics.

The subject of this doctoral dissertation is clustering algorithms, that is, the grouping of data, and the possibilities for improving them through an integrative approach in order to increase reliability and robustness to noise and extreme values in the data, and to enable data fusion. The dissertation proposes methods based on non-negative matrix factorization. The methods were successfully implemented and analysed in detail on diverse data from the UCI repository and on synthetic data typically used to evaluate new algorithms and to compare them with existing methods. The larger part of the dissertation is devoted to applications in bioinformatics, a domain that abounds in heterogeneous data and numerous challenging tasks. The evaluation was carried out on data from functional genomics, cancer genomics and metagenomics.
16

Shen, Yingjia. "Genome wide studies of mRNA 3'-end processing signals and alternative polyadenylation in plants." Miami University / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=miami1260664627.

17

Bull, Simon. "Predicting drug target proteins and their properties." Thesis, University of Manchester, 2015. https://www.research.manchester.ac.uk/portal/en/theses/predicting-drug-target-proteins-and-their-properties(4a57420f-ba76-4b24-bb3a-f8f8627aac75).html.

Abstract:
The discovery of drug targets is a vital component in the development of therapeutic treatments, as it is only through the modulation of a target’s activity that a drug can alleviate symptoms or cure. Accurate identification of drug targets is therefore an important part of any development program, and has an outsized impact on the program’s success due to its position as the first step in the pipeline. This makes the stringent selection of potential targets all the more vital when attempting to control the increasing cost and time needed to successfully complete a development program, and in order to increase the throughput of the entire drug discovery pipeline. In this work, a computational approach was taken to the investigation of protein drug targets. First, a new heuristic, Leaf, for the approximation of a maximum independent set was developed, and evaluated in terms of its ability to remove redundancy from protein datasets, the goal being to generate the largest possible non-redundant dataset. The ability of Leaf to remove redundancy was compared to that of pre-existing heuristics and an optimal algorithm, Cliquer. Not only did Leaf find unbiased non-redundant sets that were around 10% larger than the commonly used PISCES algorithm, it found ones that were no more than one protein smaller than the maximum possible found by Cliquer. Following this, the human proteome was mined to discover properties of proteins that may be important in determining their suitability for pharmaceutical modulation. Data was gathered concerning each protein’s sequence, post-translational modifications, secondary structure, germline variants, expression profile and target status. The data was then analysed to determine features for which the target and non-target proteins had significantly different values. 
This analysis was repeated for subsets of the proteome consisting of all GPCRs, ion channels, kinases and proteases, as well as for a subset consisting of all proteins that are implicated in cancer. Next, machine learning was used to quantify the proteins in each dataset in terms of their potential to serve as a drug target. For each dataset, this was accomplished by first inducing a random forest that could distinguish between its targets and non-targets, and then using the random forest to quantify the drug target likeness of the non-targets. The properties that can best differentiate targets from non-targets were primarily found to be those that are directly related to a protein’s sequence (e.g. secondary structure). Germline variants, expression levels and interactions between proteins had minimal discriminative power. Overall, the best indicators of drug target likeness were found to be the proteins’ hydrophobicities, in vivo half-lives, propensity for being membrane bound and the fraction of non-polar amino acids in their sequences. In terms of predicting potential targets, datasets of proteases, ion channels and cancer proteins were able to induce random forests that were highly capable of distinguishing between targets and non-targets. The non-target proteins predicted to be targets by these random forests comprise the set of the most suitable potential future drug targets, and are therefore likely to produce the best results if used as the basis for building a drug development programme.
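To make the redundancy-removal step in the abstract concrete, the sketch below treats proteins as graph nodes joined by an edge when they are too similar, and approximates a maximum independent set greedily. This is a generic minimum-degree heuristic shown only for illustration; it is not the Leaf algorithm itself, whose details the abstract does not give, and the protein names and edges are hypothetical.

```python
def greedy_independent_set(adjacency):
    """Approximate a maximum independent set.

    Repeatedly keep the remaining node with the fewest remaining
    neighbours (ties broken by name), then discard its neighbours.
    `adjacency` maps each node to the set of nodes it is redundant with
    and is assumed to be symmetric.
    """
    remaining = {node: set(neigh) for node, neigh in adjacency.items()}
    kept = []
    while remaining:
        node = min(remaining, key=lambda n: (len(remaining[n]), n))
        kept.append(node)
        removed = remaining[node] | {node}
        for gone in removed:
            remaining.pop(gone, None)
        for neigh in remaining.values():
            neigh -= removed
    return sorted(kept)


# Hypothetical redundancy graph: a chain P1-P2-P3-P4-P5 plus an isolated P6.
graph = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2", "P4"},
    "P4": {"P3", "P5"},
    "P5": {"P4"},
    "P6": set(),
}

kept = greedy_independent_set(graph)
print(kept)
```

No two proteins in the returned set share an edge, so the set is non-redundant under the chosen similarity threshold; an exact solver would be needed to confirm whether it is also the largest such set.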
18

Stoney, Ruth. "Using pathway networks to model context dependent cellular function." Thesis, University of Manchester, 2018. https://www.research.manchester.ac.uk/portal/en/theses/using-pathway-networks-to-model-context-dependent-cellular-function(562db48d-5e8b-40bb-8457-47c9a3455f9c).html.

Abstract:
Molecular networks are commonly used to explore cellular organisation and disease mechanisms. Function is studied using molecular interaction networks, such as protein-protein networks. Although much biological insight has been gained using these models of molecular function, they are hindered by their reliance on available experimental data and an inability to capture the complexity of biological processes. Functional modules can be identified based on molecular network topology, making it essential that the edges accurately depict molecular interactions. However, these networks struggle to depict the temporal nature of interactions, giving the impression that all interactions are constant. This misrepresentation can result in functionally heterogeneous clusters. The notoriously inaccurate nature of experimental protein interaction data, along with variable conformity among network clusters and functional modules further impedes functional module extraction. Representation of genes by single nodes artificially merges the functions of pleiotropic genes, distorting the arrangement of function within molecular networks. This thesis therefore explores a more suitable model for representing function. Pathways are composed of sets of proteins that are known to interact within a particular cellular context, corresponding to a discernible biological function. Their representation of context dependent cellular activity makes them ideal for use as nodes within a new pathway level model. Using combinatorial algorithms a reduced redundancy pathway set was produced to represent global cellular systems. Enrichment analysis provides reliable functional annotations for each pathway node, attributing independent functions to pleiotropic genes. Edges are based on functional semantic similarity, generating a network representation of functional organisation. Both yeast and human biological systems are presented as functionally connected pathway networks. 
Pathway annotation and experimentation with semantic similarity measures provides insight into the cross-talk between biological processes. Pathway functional modules elucidate the intracellular implementation of processes. Disease modules highlight the effects of functional perturbations and disease mechanisms. The pathway model provides a complementary, high-level functional model that begins to bridge the gap between molecular data and phenotype. The utilisation of pathway data provides a large, well-validated data source, avoiding the inaccuracies inherent with molecular data. Pathway models better represent components of biological complexity such as pleiotropy and linear implementation of functions.
19

Nelson, Michael Graham. "Bioinformatic approaches to detect transposable element insertions in high throughput sequence data from Saccharomyces and Drosophila." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/bioinformatic-approaches-to-detect-transposable-element-insertions-in-high-throughput-sequence-data-from-saccharomyces-and-drosophila(df6427f7-2f8e-4de5-81eb-51f6bfab514a).html.

Abstract:
Transposable elements (TEs) are mutagenic mobile DNA sequences whose excision and insertion are powerful drivers of evolution. Some TE families are known to target specific genome features, and studying their insertion preferences can provide information about both TE biology and the state of the genome at these locations. To investigate this, collecting large numbers of insertion sites for TEs in natural populations is required. Genome resequencing data can potentially provide a rich source of such insertion sites. The field of detecting these "non-reference" TE insertions is an active area of research, with many methods being released and no comprehensive review performed. To drive forward knowledge of TE biology and the field of non-reference TE detection, we created McClintock, an integrated pipeline of six TE detection methods. McClintock lowers the barriers against use of these methods by automating the creation of the diverse range of input files required whilst also setting up all methods to run simultaneously and standardising the output. To test McClintock and its component methods, it was run on both simulated and real Saccharomyces cerevisiae data. Tests on simulated data reveal the general properties of component methods' predictions as well as the limitations of simulated data for testing software systems. Overlap between results from the McClintock component methods show many insertions detected by only one method, highlighting the need to run multiple TE detection methods to fully understand a resequenced sample. Utilising the well characterised properties of S. cerevisiae TE insertion preferences, real yeast population resequencing data can act as a biological validation for the predictions of McClintock. All component methods recreated previously known biological properties of S. cerevisiae TE insertions in natural population data. To demonstrate the versatility of McClintock, we applied the system to Drosophila melanogaster resequencing data. 
27 Schneider's cell lines were sequenced and analysed with McClintock. In addition to demonstrating the scalability of McClintock to larger genomes with more TE families, this exposed ongoing transposition in S2 cell lines. Likewise, the use of non-reference TE insertions as variable sites allowed us to recreate the relationships between S2 sub-lines, confirming that S1, S2, and S3 were most likely established separately. The results also suggest that there are several S2 sub-lines in use and that these sub-lines can differ from each other in TE content by hundreds of non-reference TE copies. Overall this thesis demonstrates that the McClintock pipeline can highlight problems in TE detection from genome data as well as revealing that much can still be learned from this data source.
20

Zhang, Jianzhe. "Development of an Apache Spark-Based Framework for Processing and Analyzing Neuroscience Big Data: Application in Epilepsy Using EEG Signal Data." Case Western Reserve University School of Graduate Studies / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=case1597089028333942.

21

McDowall, Mark. "Human protein-protein interaction prediction." Thesis, University of Dundee, 2011. https://discovery.dundee.ac.uk/en/studentTheses/697e465a-edbd-41d2-acda-5910a49e4157.

Abstract:
Protein-protein interactions are essential for the survival of all living cells, allowing for processes such as cell signalling, metabolism and cell division to occur. Yet in humans there are only >38k annotated interactions of an interactome estimated to range between 150k and 600k interactions, out of a potential 300M protein pairs. Experimental methods to define the human interactome generate high-quality results, but are expensive and slow. Computational methods play an important role in filling the gap. To further this goal, the prediction of human protein-protein interactions was investigated by the development of new predictive modules and the analysis of diverse datasets within the framework of the previously established PIPs protein-protein interaction predictor (Scott and Barton, 2007). New features considered include the semantic similarity of Gene Ontology annotating terms, clustering of interaction networks, primary sequences and gene co-expression. Integrating the new features in a naive Bayesian manner as part of the PIPs 2 predictor resulted in two sets of predictions. With a conservative threshold, the union of both sets is >300k predicted human interactions with an intersect of >94k interactions, of which a subset have been experimentally validated. The PIPs 2 predictor is also capable of making predictions in organisms that have no annotated interactions. This is achieved by training the PIPs 2 predictor on a set of evidence and annotated interactions in another organism, resulting in a ranking of protein pairs in the original organism of interest. Such an approach allows predictions to be made across the whole proteome of a poorly characterised organism, rather than being limited only to proteins with known orthologues. The work described here has increased the coverage of the human interactome and introduced a method to predict interactions in organisms that have previously had limited or no annotated interactions. 
The thesis aims to provide a stepping stone towards the completion of the human interactome and a way of predicting interactions in organisms that have been less well studied, but are often clinically relevant.
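The naive Bayesian integration of evidence mentioned in the abstract can be sketched as multiplying a prior odds of interaction by one likelihood ratio per feature, under the usual assumption that the features are conditionally independent. The prior and the likelihood ratios below are illustrative values only; the abstract does not report the actual figures used by PIPs 2.

```python
import math


def posterior_odds(prior_odds, likelihood_ratios):
    """Combine independent evidence: posterior odds = prior odds x product of LRs."""
    return prior_odds * math.prod(likelihood_ratios)


# Illustrative prior: roughly one true interaction per 1,000 protein pairs.
prior = 1 / 1000

# Hypothetical likelihood ratios for three features of one protein pair,
# e.g. annotation similarity, co-expression and network clustering.
ratios = [20, 10, 8]

odds = posterior_odds(prior, ratios)
probability = odds / (1 + odds)
print(f"posterior odds = {odds:.2f}, probability = {probability:.2f}")
```

Because the prior odds are so low, several moderately informative features must agree before a pair's posterior odds exceed 1, which is why a conservative threshold still leaves most candidate pairs unpredicted.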
22

Denson, Marian. "Rational design of immunotherapy to treat fungal allergy." Thesis, University of Manchester, 2013. https://www.research.manchester.ac.uk/portal/en/theses/rational-design-of-immunotherapy-to-treat-fungal-allergy(ff331eb5-0b27-4a41-823f-b767f5273508).html.

Abstract:
Background: Asthma affects 5.4 million people in the UK. Asthma subgroups are also susceptible to inhalation of fungal spores (Aspergillus fumigatus) and development of pulmonary fungal aspergilloma; presenting a life threatening but poorly understood condition. NHS costs for corticosteroids, bronchodilators and antifungal agents that are only partially effective continue to rise. Allergy immunotherapy development is of great interest as it is specific to the allergen and can harness key adaptive immune T-cells to down-regulate inflammatory responses. Immunotherapy has been used with varying degrees of success for treatment of grass, pollen, venom, cat and dog allergens however to date has not been directed to fungal allergens. The study aims were: 1) to further understand the A. fumigatus allergens and the protein epitopes responsible for generating immune responses. 2) To genotype participating ABPA/SAFS patients to observe any HLA associations. Methods: 37 subjects with fungal sensitivity were recruited to the study which received permission from the local ethics committee (UHSM LREC). Computer bioinformatic predictions using Propred software identified several potential fungal T cell peptide epitopes; of which 8 peptides were soluble and tested in vitro for specific T-cell proliferation responses by flow cytometric analysis. Skin prick tests determined subject responses to fungal allergens including A. fumigatus, and DNA analysis determined subject HLA type. Results: 5 of 8 soluble peptides were Aspergillus fumigatus derived and 3 from Alternaria alternata. All 8 peptides induced higher CD4 proliferative responses in ABPA/SAFS patients, compared to healthy controls from highest significance to lowest as follows: peptide 1.1 > 9.1 > 8.1 > 2.1 > 9.1.1 > 4.1 > 4.1.1 and 10.1.1. 73% subjects elicited skin responses to A. fumigatus. DNA HLA typing identified alleles associated with ABPA/SAFS but not all allele sub types. 
Discussion: The ABPA/SAFS group consistently raised T-cell responses to fungal peptides compared to controls. This demonstrates peripheral CD4s retain memory for fungal specificity and clearly respond when challenged with fungal epitopes in vitro. This concept underpins the rationale to further characterize the responding CD4 cells and pursuing bioinformatics approaches for immunotherapy investigations for fungal allergy.
23

Soul, Jamie. "A systems biology approach to knee osteoarthritis." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/a-systems-biology-approach-to-knee-osteoarthritis(0b229b46-7be4-4fdb-9a14-062c3dcfcf05).html.

Abstract:
A hallmark of the joint disease osteoarthritis (OA) is the degradation of the articular cartilage in the affected joint, accompanied by debilitating pain and decreased mobility. At present there are no disease-modifying drugs for the treatment of osteoarthritis. This represents a significant, unmet medical need as there is a large and increasing prevalence of OA. Using a systems biology approach, we aimed to better understand the pathogenic mechanisms of OA and ultimately aid development of therapeutics. This thesis focuses on the analysis of gene expression data from human OA cartilage obtained at total knee replacement (TKR). This transcriptomics approach gives a genome-wide overview of changes, but can be challenging to interpret. Network-based algorithms provide a framework for the fusion of knowledge, allowing effective interpretation. The PhenomeExpress algorithm was developed as part of this thesis to aid the interpretation of gene expression data. PhenomeExpress uses known disease gene associations to identify relevant dysregulated pathways in the data. PhenomeExpress was further developed into an 'app' for Cytoscape, the widely used network analysis and visualisation platform. To investigate the processes that occur during the degradation of cartilage, we examined the gene expression of damaged and intact OA cartilage using RNA-Seq and identified key altered pathways with PhenomeExpress. A regulatory network driven by four transcription factors accounts for a significant proportion of the observed differential expression of damage-associated genes in the pathways identified by PhenomeExpress. We further explored the role of the cytokines IL-1β and TNF, which have been reported to drive the progression of OA. Comparison of the expression response of in vitro cytokine-treated explants with the in vivo damage response revealed major differences, providing little evidence for any significant role of IL-1β and TNF as drivers of OA damage in vivo. 
Finally, we examined the heterogeneity of OA through analysis of cartilage expression profiles at TKR. Through a network-based clustering method, we found two subgroups of patients on the basis of their gene expression profiles. These subgroups were found to have distinct OA expression perturbations and we identified TGFβ and S100A8/9 signalling as potentially explaining the observed differential expression. We developed an RT-qPCR-based classifier that allowed classification of new samples into these subgroups, allowing future assessment of the clinical significance of these subgroups. The work presented in this thesis includes a novel, widely accessible tool for the analysis of disease gene expression data, which we used to give new insights into the pathogenesis of osteoarthritis. We have produced a rich dataset for future research and our analysis of this data has increased our understanding of cartilage damage processes and the heterogeneity of OA.
24

Herzel, Lydia. "Co-transcriptional splicing in two yeasts." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-179274.

Abstract:
Cellular function and physiology are largely established through regulated gene expression. The first step in gene expression, transcription of the genomic DNA into RNA, is a process that is highly aligned at the levels of initiation, elongation and termination. In eukaryotes, protein-coding genes are exclusively transcribed by RNA polymerase II (Pol II). Upon transcription of the first 15-20 nucleotides (nt), the emerging nascent RNA 5’ end is modified with a 7-methylguanosyl cap. This is one of several RNA modifications and processing steps that take place during transcription, i.e. co-transcriptionally. For example, protein-coding sequences (exons) are often disrupted by non-coding sequences (introns) that are removed by RNA splicing. The two transesterification reactions required for RNA splicing are catalyzed through the action of a large macromolecular machine, the spliceosome. Several non-coding small nuclear RNAs (snRNAs) and proteins form functional spliceosomal subcomplexes, termed snRNPs. Sequentially with intron synthesis, different snRNPs recognize sequence elements within introns, first the 5’ splice site (5’ SS) at the intron start, then the branchpoint and at the end the 3’ splice site (3’ SS). Multiple conformational changes and concerted assembly steps lead to formation of the active spliceosome, cleavage of the exon-intron junction, intron lariat formation and finally exon-exon ligation with cleavage of the 3’ intron-exon junction. Estimates on pre-mRNA splicing duration range from 15 sec to several minutes or, in terms of distance relative to the 3’ SS, the earliest detected splicing events were 500 nt downstream of the 3’ SS. However, the use of indirect assays, model genes and transcription induction/blocking leaves the question of when pre-mRNA splicing of endogenous transcripts occurs unanswered. In recent years, global studies concluded that the majority of introns are removed during the course of transcription. 
In principle, co-transcriptional splicing reduces the need for post-transcriptional processing of the pre-mRNA. This could allow for quicker transcriptional responses to stimuli and optimal coordination between the different steps. In order to gain insight into how pre-mRNA splicing might be functionally linked to transcription, I wanted to determine when co-transcriptional splicing occurs, how transcripts with multiple introns are spliced and if and how the transcription termination process is influenced by pre-mRNA splicing. I chose two yeast species, S. cerevisiae and S. pombe, to study co-transcriptional splicing. Small genomes, short genes and introns, but a very different number of intron-containing genes and multi-intron genes in S. pombe, made the combination of both model organisms a promising system to study by next-generation sequencing and to learn about co-transcriptional splicing in a broad context with applicability to other species. I used nascent RNA-Seq to characterize co-transcriptional splicing in S. pombe and developed two strategies to obtain single-molecule information on co-transcriptional splicing of endogenous genes: (1) with paired-end short read sequencing, I obtained the 3’ nascent transcript ends, which reflect the position of Pol II molecules during transcription, and the splicing status of the nascent RNAs. This is detected by sequencing the exon-intron or exon-exon junctions of the transcripts. Thus, this strategy links Pol II position with intron splicing of nascent RNA. The increase in the fraction of spliced transcripts with further distance from the intron end provides valuable information on when co-transcriptional splicing occurs. (2) with Pacific Biosciences sequencing (PacBio) of full-length nascent RNA, it is possible to determine the splicing pattern of transcripts with multiple introns, e.g. sequentially with transcription or also non-sequentially. 
Part of transcription termination is cleavage of the nascent transcript at the polyA site. The splicing status of cleaved and non-cleaved transcripts can provide insights into links between splicing and transcription termination and can be obtained from PacBio data. I found that co-transcriptional splicing in S. pombe is as prevalent as in other species and that most introns are removed co-transcriptionally. Co-transcriptional splicing levels are dependent on intron position, adjacent exon length, and GC-content, but not splice site sequence. A high level of co-transcriptional splicing is correlated with high gene expression. In addition, I identified low abundance circular RNAs in intron-containing, as well as intronless genes, which could be side-products of RNA transcription and splicing. The analysis of co-transcriptional splicing patterns of 88 endogenous S. cerevisiae genes showed that the majority of intron splicing occurs within 100 nt downstream of the 3’ SS. Saturation levels vary, and confirm results of a previous study. The onset of splicing is very close to the transcribing polymerase (within 27 nt) and implies that spliceosome assembly and conformational rearrangements must be completed immediately upon synthesis of the 3’ SS. For S. pombe genes with multiple introns, most detected transcripts were completely spliced or completely unspliced. A smaller fraction showed partial splicing with the first intron being most often not spliced. Close to the polyA site, most transcripts were spliced; however, uncleaved transcripts were often completely unspliced. This suggests a beneficial influence of pre-mRNA splicing for efficient transcript termination. Overall, sequencing of nascent RNA with the two strategies developed in this work offers significant potential for the analysis of co-transcriptional splicing, transcription termination and also RNA polymerase pausing by profiling nascent 3’ ends. 
I could define the position of pre-mRNA splicing during the process of transcription and provide evidence for fast and efficient co-transcriptional splicing in S. cerevisiae and S. pombe, which is associated with highly expressed genes in both organisms. Differences in S. pombe co-transcriptional splicing could be linked to gene architecture features, like intron position, GC-content and exon length.
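The first strategy in the abstract above (linking Pol II position to splicing status) can be illustrated with a minimal sketch. It assumes each nascent read has already been reduced to a pair: the distance in nt of its 3' end from the 3' SS, and whether the intron was spliced. The function name, the bin width and the toy reads are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def spliced_fraction_by_distance(reads, bin_width=50):
    """Return {bin_start: fraction of spliced reads}.

    reads: iterable of (distance_nt, is_spliced) pairs, where distance_nt is
    the distance of the nascent 3' end (Pol II position) from the 3' SS.
    """
    totals = defaultdict(int)
    spliced = defaultdict(int)
    for distance_nt, is_spliced in reads:
        if distance_nt < 0:
            continue  # Pol II has not yet transcribed past the 3' SS
        bin_start = (distance_nt // bin_width) * bin_width
        totals[bin_start] += 1
        if is_spliced:
            spliced[bin_start] += 1
    return {b: spliced[b] / totals[b] for b in sorted(totals)}


# Toy reads (made up for illustration only).
toy_reads = [
    (10, False), (30, False), (40, True),
    (60, True), (80, False),
    (120, True), (130, True),
    (-5, False),
]
profile = spliced_fraction_by_distance(toy_reads)
for bin_start, fraction in profile.items():
    print(bin_start, round(fraction, 2))
```

A rising fraction across bins is the kind of signal the abstract describes; real data would also need coverage thresholds per bin, which this sketch omits.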
25

Socrates, Vimig. "Neuro-Integrative Connectivity: A Scientific Workflow-Based Neuroinformatics Platform For Brain Network Connectivity Studies Using EEG Data." Case Western Reserve University School of Graduate Studies / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=case1561655750151063.

26

Xu, Guorong. "Computational Pipeline for Human Transcriptome Quantification Using RNA-seq Data." ScholarWorks@UNO, 2011. http://scholarworks.uno.edu/td/343.

Abstract:
The main theme of this thesis research is the development of a computational pipeline for processing next-generation RNA sequencing (RNA-seq) data. RNA-seq experiments generate tens of millions of short reads for each DNA/RNA sample. The alignment of a large volume of short reads to a reference genome is a key step in next-generation sequencing (NGS) data analysis. Although storing alignment information in the Sequence Alignment/Map (SAM) or Binary SAM (BAM) format is now standard, biomedical researchers still have difficulty accessing useful information. In order to help biomedical researchers conveniently access essential information from NGS data files in SAM/BAM format, we have developed a Graphical User Interface (GUI) software tool named SAMMate to pipeline human transcriptome quantification. SAMMate allows researchers to easily process NGS data files in SAM/BAM format and is compatible with both single-end and paired-end sequencing technologies. It also allows researchers to accurately calculate gene expression abundance scores.
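As a rough illustration of how an abundance score can be derived from SAM records, the sketch below counts mapped reads whose start position falls inside a gene interval and converts the count to RPKM (reads per kilobase per million mapped reads). RPKM is used here as one common measure; the abstract does not say which score SAMMate computes. The function names, the gene coordinates and the SAM lines are hypothetical.

```python
def count_reads_per_gene(sam_lines, genes):
    """Count mapped reads per gene.

    genes: {name: (chrom, start, end)} with 1-based inclusive coordinates.
    A read is assigned to a gene if its leftmost mapped position (SAM POS)
    lies inside the interval, which is a deliberate simplification.
    """
    counts = {name: 0 for name in genes}
    total_mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        flag, chrom, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:
            continue  # FLAG bit 0x4 marks an unmapped read
        total_mapped += 1
        for name, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start <= pos <= end:
                counts[name] += 1
    return counts, total_mapped


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Toy input (made up): three mapped reads and one unmapped read.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t150\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t16\tchr1\t900\t60\t50M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1\t5000\t60\t50M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
toy_genes = {"geneA": ("chr1", 100, 1099)}  # 1000 bp long

counts, total_mapped = count_reads_per_gene(toy_sam, toy_genes)
score = rpkm(counts["geneA"], 1000, total_mapped)
print(counts, total_mapped, round(score, 2))
```

A production pipeline would read BAM through a dedicated library and handle spliced alignments and paired-end fragments; this sketch only shows the arithmetic.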
27

Sheth-Ughade, Parita. "Immunological responses to fungal epitope peptides." Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/immunological-responses-to-fungal-epitope-peptides(1f8234cb-77e4-4577-a6ba-e57d502048a4).html.

Abstract:
Introduction: Fungi are common aeroallergens responsible for at least 3% – 10% of allergic diseases worldwide, with the proportion hugely variable in different populations. Treatment is complicated by the viable nature and disease-causing ability of the allergen and is often only palliative. Thus, this study aimed to serve as a pilot investigation to design novel anti-allergy therapeutics to cure allergy at the molecular level. It investigates the effect of wild type fungal peptides and corresponding variant peptides on allergy associated immunological responses – cellular and cytokine based – to use such variant peptides to cause the delicate shift from an allergic to a normal immune response. Further, the study explores the role of bioinformatics in investigating allergy and designing novel therapeutics. Methods: This study used ProPred, a bioinformatics software, to predict wild type peptides from selected allergens of Aspergillus fumigatus and Alternaria alternata for a target population. These were then modified to generate single amino acid variants. Both these peptide sets were tested to compare the cellular and cytokine patterns they generated in sensitised (n = 3) and healthy volunteers (n = 3) to check for anti-allergy responses that may be exerted by certain variants. The recruited population was also subjected to skin prick testing (SPT, n = 46) to check for co-sensitisation patterns and HLA typing (n = 40) to evaluate ProPred accuracy for peptide prediction. This study also attempted an in silico search for unknown Penicillium chrysogenum allergens by comparing known Penicillium and A. fumigatus allergens to identify probable agents of co-sensitization. Results: Of the wild type and variant peptides tested in this study, one variant peptide – peptide 1.1v from Asp f 2 – was successfully identified to change the cellular and cytokine profile to promote an anti-allergic response when compared to its corresponding wild type form (1.1o). 
This candidate is a good target for further investigation for use in peptide immunotherapy. Further, 8 shared allergens between A. fumigatus and P. chrysogenum were identified that may possibly be agents of co-sensitization between these species. SPT results indicated maximum subject co-sensitization between A. fumigatus and Candida albicans and P. chrysogenum. HLA typing results demonstrated the efficiency of ProPred to be 96.29%, thus implying that bioinformatics can effectively be used to study allergy in this novel manner. Conclusion: This study has demonstrated that variant peptides with a single amino acid change can cause the delicate shift from an allergic to a healthy immune response in sensitised subjects. This approach – in combination with other allergy associated factors such as epitope specificity for HLA types and inherent co-sensitization patterns in a population – can effectively be used to design peptide candidates for immunotherapy to target allergy at the molecular level. With promising results obtained in this pilot study, this approach warrants further investigation in immunotherapy. This study has also demonstrated that bioinformatics can be effectively used to design and execute allergy studies in a targeted and inexpensive manner.
28

Heard, Stephanie. "Plant pathogen sensing for early disease control." Thesis, University of Manchester, 2014. https://www.research.manchester.ac.uk/portal/en/theses/plant-pathogen-sensing-for-early-disease-control(48949f80-2596-4ce2-912a-6513e72f6a8d).html.

Abstract:
Sclerotinia sclerotiorum, a fungal pathogen of over 400 plant species, has been estimated to cost UK-based farmers approximately £20 million per year during severe outbreaks (Oerke and Dehne 2004). S. sclerotiorum disease incidence is difficult to predict as outbreaks are often sporadic. Ascospores released from the fruiting bodies or apothecia can be dispersed for tens of kilometres. This makes disease control problematic and with no S. sclerotiorum resistant varieties available, growers are forced to spray fungicides up to three times per flowering season in anticipation of the arrival of this devastating disease. This thesis reports the development of the first in-field S. sclerotiorum biosensor which aims to enable rapid detection of airborne ascospores, promoting a more accurate disease risk assessment and fungicide spraying regime. The sensor is designed to detect the presence of oxalic acid, the main pathogenicity factor secreted during early S. sclerotiorum ascospore germination. Upon electrochemical detection of this analyte in the biosensor, a binary output is relayed to the farmer to warn of a disease risk. This project focused on the development of a nutrient matrix which was designed to be contained within the biosensor. The role of this matrix was to promote the growth of captured airborne S. sclerotiorum ascospores and induce high levels of oxalic acid secretion. The use of the designed biological matrix to promote oxalic acid production was tested during three field trials in S. sclerotiorum artificially inoculated fields. This thesis describes the use of contemporary pathogenomics technologies to further investigate candidate genes involved in pathogenicity alongside the secretion of oxalic acid. A pre-described bioinformatics pipeline was used to predict the S. sclerotiorum secretome to identify potential effector proteins as well as explore proteins which are unique to S. sclerotiorum to be used as other novel targets for detection. 
GFP tagged constructs were designed to investigate the expression of the putative targets for S. sclerotiorum detection. The transcriptomes of wild type and oxalic acid deficient S. sclerotiorum strains during infection as well as during a saprotrophic stage were investigated. This study provided expression support for not only some of the unannotated genes identified in the putative secretome, but also some candidate genes speculated to be involved in infection.
29

Loke, Johnny Chee Heng. "COMPILATION OF mRNA POLYADENYLATION SIGNALS IN ARABIDOPSIS THALIANA REVEALED NEW SIGNAL ELEMENTS AND POTENTIAL SECONDARY STRUCTURES." Miami University / OhioLINK, 2004. http://rave.ohiolink.edu/etdc/view?acc_num=miami1103223217.

30

Kruse, Colin Peter Singer. "Data-Enabled Approach to Characterize Dynamic Regulatory Pathways in Two Kingdoms." Ohio University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1573746719306039.

31

Linheiro, Raquel. "Computational analysis of transposable element target site preferences in Drosophila melanogaster." Thesis, University of Manchester, 2011. https://www.research.manchester.ac.uk/portal/en/theses/computational-analysis-of-transposable-element-target-site-preferences-in-drosophila-melanogaster(33ac0a41-2fbd-4974-b6b6-db4e1e48a7b0).html.

Abstract:
Transposable elements (TEs) are mobile DNA sequences that are a source of mutations and can target specific sites in the host genome. Understanding the molecular mechanisms of TE target site preferences is a fundamental challenge in functional and evolutionary genomics. Here we used accurately mapped TE insertions in the Drosophila melanogaster genome, from large-scale gene disruption and resequencing projects, to better understand TE insertion site mechanisms. First we test predictions of the palindromic target site model for DNA transposon insertion using artificially generated P-element insertions. We provide evidence that the P-element targets a 14 bp palindromic motif that can be identified at the primary sequence level and that differs significantly from random base composition in the D. melanogaster genome. This sequence also predicts local spacing, hotspots and strand orientation of P-element insertions. Next, we combine artificial P-element insertions with data from genome-wide studies on sequence properties of promoter regions, in an attempt to decode the genomic factors associated with P-element promoter targeting. Our results indicate that the P-element insertions are affected by nucleosome positioning and the presence of chromatin marks made by the Polycomb and trithorax protein groups. We provide the first genome-wide study which shows that core promoter architecture and chromatin structure impact P-element target preferences, shedding light on the nuclear processes that influence its pattern of TE insertions across the D. melanogaster genome. In an effort to understand the natural insertion preferences of a wide range of TEs, we then used genome resequencing data to identify insertion sites not present in the reference strain. We found that both Illumina and 454 sequencing platforms showed consistent results in terms of target site duplication (TSD) and target site motif (TSM) discovery. 
We found that TSMs typically extend the TSD and are palindromic for both DNA and LTR elements with a variable center that depends on the length of the TSD. Additionally, we found that TEs from the same subclass present similar TSDs and TSMs. Finally, by correlating results on P-element insertion sites from natural strains with gene disruption experiments, we show that there is an overlap in target site preferences between artificial and natural insertion events and that P-element targeting of promoter regions of genes is a natural characteristic of this element that is influenced by the same features as the artificially generated insertions. Together, the results presented in this thesis provide important new findings about the target preferences of TEs in one of the best-studied and most important model organisms, and provide a platform for understanding target site preferences of TEs in other species using genomic data.
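In sequence analysis, "palindromic" usually means that a site reads the same as its reverse complement. The sketch below, offered under that assumption, scores how far a candidate target site departs from a perfect palindrome. The function names and test sequences are illustrative; the 14 bp P-element motif itself is not reproduced here because the abstract does not give it.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def palindrome_mismatches(seq):
    """Count positions where seq differs from its reverse complement.

    Zero means a perfect palindrome. Each mismatching pair of positions is
    counted at both positions, and the central base of an odd-length
    sequence always mismatches because no base is its own complement.
    """
    seq = seq.upper()
    return sum(a != b for a, b in zip(seq, reverse_complement(seq)))


def is_palindromic(seq, max_mismatches=0):
    """True if seq is a palindrome within the allowed mismatch count."""
    return palindrome_mismatches(seq) <= max_mismatches


print(is_palindromic("GAATTC"))          # EcoRI site, a textbook palindrome
print(palindrome_mismatches("GATTACA"))  # clearly non-palindromic
```

For motif discovery one would apply such a score to a position weight matrix built from many aligned insertion sites, not to single sequences.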
32

Sheppard, Sarah E. "Application of a Naïve Bayes Classifier to Assign Polyadenylation Sites from 3' End Deep Sequencing Data: A Dissertation." eScholarship@UMMS, 2013. http://escholarship.umassmed.edu/gsbs_diss/653.

Abstract:
Cleavage and polyadenylation of a precursor mRNA is important for transcription termination, mRNA stability, and regulation of gene expression. This process is directed by a multitude of protein factors and cis elements in the pre-mRNA sequence surrounding the cleavage and polyadenylation site. Importantly, the location of the cleavage and polyadenylation site helps define the 3’ untranslated region of a transcript, which is important for regulation by microRNAs and RNA binding proteins. Additionally, these sites have generally been poorly annotated. To identify 3’ ends, many techniques utilize an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Previously, simple heuristic filters relying on the number of adenines in the genomic sequence downstream of a putative polyadenylation site have been used to remove these sites of internal priming. However, these simple filters may not remove all sites of internal priming and may also exclude true polyadenylation sites. Therefore, I developed a naïve Bayes classifier to identify putative sites from oligo-dT primed 3’ end deep sequencing as true or false/internally primed. Notably, this algorithm uses a combination of sequence elements to distinguish between true and false sites. Finally, the resulting algorithm is highly accurate in multiple model systems and facilitates identification of novel polyadenylation sites.
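To make the classification idea concrete, here is a minimal Bernoulli naïve Bayes sketch with Laplace smoothing. It is not the dissertation's classifier: the abstract does not list the sequence elements used, so the two binary features (an upstream AAUAAA-like hexamer and an adenine-rich downstream region) and the training rows are assumptions chosen purely for illustration.

```python
import math


def train_naive_bayes(samples, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    samples: list of tuples of 0/1 features; labels: list of class labels.
    Returns {class: (prior, [P(feature_j = 1 | class), ...])}.
    """
    n_features = len(samples[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [s for s, y in zip(samples, labels) if y == c]
        prior = len(rows) / len(samples)
        # Laplace smoothing avoids zero probabilities for unseen values.
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (prior, probs)
    return model


def predict(model, sample):
    """Return the class with the highest log posterior for one sample."""
    best_class, best_score = None, -math.inf
    for c, (prior, probs) in model.items():
        score = math.log(prior)
        for x, p in zip(sample, probs):
            score += math.log(p if x else 1.0 - p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical features: (has_upstream_hexamer, a_rich_downstream).
toy_samples = [(1, 0), (1, 0), (1, 1), (1, 0), (0, 1), (0, 1), (1, 1), (0, 1)]
toy_labels = ["true_site"] * 4 + ["internal_priming"] * 4

model = train_naive_bayes(toy_samples, toy_labels)
print(predict(model, (1, 0)))
print(predict(model, (0, 1)))
```

Unlike a single threshold on downstream adenine count, the model weighs several features jointly, which is the contrast the abstract draws with the earlier heuristic filters.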
33

Gardner, Allison. "Characterising and predicting amyloid mutations in proteins." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/characterising-and-predicting-amyloid-mutations-in-proteins(5fb5b725-ac9e-499b-81ee-f9ce7cbcb19e).html.

Abstract:
A database, AmyProt, was developed that collated details of 32 human amyloid proteins associated with disease and 488 associated mutations and polymorphisms, of which 316 are classified as amyloid. A detailed profile of the mutations was developed in terms of location within domains and secondary structures of the proteins and functional effects of the mutations. The data was used to test the hypothesis that mutations that enhance amyloidosis in human amyloid proteins have distinctive characteristics, in terms of specific location within proteins and physico-chemical characteristics, which differentiate them from non-amyloid forming polymorphisms in amyloid proteins and from disease mutations and polymorphisms in non-amyloid disease linked proteins. The aim was to use these characteristics to train a prediction algorithm for amyloid mutations that will provide a more accurate prediction than current general disease prediction tools and amyloid prediction tools that focus on aggregating regions. 66 location specific features and changes upon mutation of 366 amino acid propensities, derived from the amino acid index database AAindex, were analysed. A significant proportion of mutations were located within aggregating regions; however, the majority of mutations were not associated with these regions. An analysis of motifs showed that amyloid mutations had a significant association with transmembrane helix motifs such as GxxxG. Statistical analysis of substitution mutations, using substitution matrices, showed that amyloid mutations have a decrease in α-helix propensity and overall secondary structure propensity compared to the disease mutations and disease and amyloid polymorphisms. Machine learning was used to reduce the large set of features to a set of 18 features. 
These included location near transmembrane helices, secondary structure features; transmembrane and extracellular domains and 4 amino acid propensities: knowledge-based membrane propensity scale from 3D helix; α-helix propensity; partition coefficient; normalized frequency of coil. The AmyProt mutations and non-amyloid polymorphisms were used to train and test the novel amyloid mutation prediction tool, AmyPred, the first tool developed purely to predict amyloid mutations. AmyPred predicts the amyloidogenicity of mutations as a consensus by majority vote (CMV) and mean probability (CMP) of 5 classifiers. Validation of AmyPred with 27 amyloid mutations and 20 non-amyloid mutations from APP, Tau and TTR proteins gave classification accuracies of 0.7/0.71 (CMV/CMP), with an MCC of 0.4 (CMV) and 0.41 (CMP). AmyPred outperformed other tools such as SIFT (0.37) and PolyPhen (0.36) and the amyloid consensus prediction tool, MetAmyl (0.13). Finally, AmyPred was used to analyse p53 mutations to characterize amyloid and non-amyloid mutations within this protein.
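The two consensus rules and the Matthews correlation coefficient (MCC) mentioned above can be sketched as follows. This assumes each of the five classifiers outputs a probability that a mutation is amyloid and that 0.5 is the decision threshold; neither detail is stated in the abstract, and the probabilities and confusion-matrix counts below are made-up examples, not AmyPred results.

```python
import math


def consensus_majority_vote(probabilities, threshold=0.5):
    """CMV: positive if more than half of the classifiers vote positive."""
    votes = sum(p >= threshold for p in probabilities)
    return votes > len(probabilities) / 2


def consensus_mean_probability(probabilities, threshold=0.5):
    """CMP: positive if the mean classifier probability reaches the threshold."""
    return sum(probabilities) / len(probabilities) >= threshold


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy probabilities from five hypothetical classifiers: three weak "yes"
# votes and two confident "no" votes, so the two rules disagree.
toy_probs = [0.9, 0.6, 0.55, 0.1, 0.2]
cmv = consensus_majority_vote(toy_probs)
cmp_ = consensus_mean_probability(toy_probs)
print(cmv, cmp_)
print(round(mcc(tp=5, tn=3, fp=1, fn=1), 3))
```

MCC is a sensible companion to accuracy for a validation set like the one described (27 versus 20 examples), because it uses all four confusion-matrix cells and stays informative when classes are unbalanced.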
34

Han, Nam Shik. "Systematic approaches for modelling and visualising responses to perturbation of transcriptional regulatory networks." Thesis, University of Manchester, 2013. https://www.research.manchester.ac.uk/portal/en/theses/systematic-approaches-for-modelling-and-visualising-responses-to-perturbation-of-transcriptional-regulatory-networks(3f4cf115-3b68-457f-8fd6-0f7609d5b9bc).html.

Abstract:
One of the greatest challenges in modern biology is to understand quantitatively the mechanisms underlying messenger ribonucleic acid (mRNA) transcription within the cell. To this end, integrated functional genomics attempts to use the vast wealth of data produced by modern large scale genomic projects to understand how the genome is deployed to create a diversity of tissues and species. The expression levels of tens or hundreds of thousands of genes are profiled at multiple time points or different experimental conditions in the genomic projects. The profiling results are deposited in large scale quantitative data files that are not possible to analyse without systematic computational methods. In particular, it is much more difficult to experimentally measure the concentration level of transcription factor proteins and their affinity for the promoter region of genes, while it is relatively easy to measure the result of transcription using experimental techniques such as microarrays. In the absence of such biological experiments, it becomes necessary to use in silico techniques to determine the transcription factor regulatory activities given existing gene expression profile data. It therefore presents significant challenges and opportunities to the computer science community. This PhD Project made use of one such in silico technique to determine the differences (if any) in transcription factor regulatory activities of different experimental conditions and time points. The research aim of the Project was to understand the transcriptional regulatory mechanism that controls the sophisticated process of gene expression in cells. In particular, differences in downstream signalling from transcription factors can play a role in predisposition to diseases such as parasitic disease, cancer, and neuroendocrine disease. 
To address this question I have had access to large integrated genomics datasets generated in studies on parasitic disease, lung cancer, and endocrine (hormone) disease. The current state-of-the-art takes existing knowledge and asks "How do these data relate to what we already know?" By applying machine learning approaches the project explored the role that such data can play in uncovering new biological knowledge.
35

Swainston, Neil. "Systems biology informatics for the development and use of genome-scale metabolic models." Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/systems-biology-informatics-for-the-development-and-use-of-genomescale-metabolic-models(ba1622ca-af31-486c-81f7-063a11f51e1f).html.

Abstract:
Systems biology attempts to understand biological systems through the generation of predictive models that allow the behaviour of the system to be simulated in silico. Metabolic systems biology has in recent years focused upon the reconstruction and constraint-based analysis of genome-scale metabolic networks, which provide computational and mathematical representations of the known metabolic capabilities of a given organism. This thesis initially concerns itself with the development of such metabolic networks, first considering the community-driven development of consensus networks of the metabolic functions of Saccharomyces cerevisiae. This is followed by a consideration of automated approaches to network reconstruction that can be applied to facilitate what has, until recently, been an arduous manual process. The use of such large-scale networks in the generation of dynamic kinetic models is then considered. The development of such models is dependent upon the availability of experimentally determined parameters, from omics approaches such as transcriptomics, proteomics and metabolomics, and from kinetic assays. A discussion of the challenges faced with developing informatics infrastructure to support the acquisition, analysis and dissemination of quantitative proteomics and enzyme kinetics data follows, along with the introduction of novel software approaches to address these issues. The requirement for integrating experimental data with kinetic models is considered, along with approaches to construct, parameterise and simulate kinetic models from the network reconstructions and experimental data discussed previously. Finally, future requirements for metabolic systems biology informatics are considered, in the context of experimental data management, modelling infrastructure, and data integration required to bridge the gap between experimental and modelling approaches.
36

Chan, Pedro. "A computational investigation of solubility, functionality and the adaptation in subcellular compartments of proteins." Thesis, University of Manchester, 2011. https://www.research.manchester.ac.uk/portal/en/theses/a-computational-investigation-of-solubility-functionality-and-the-adaptation-in-subcellular-compartments-of-proteins(29ba40c2-0e8b-459a-803b-529da885289a).html.

Abstract:
A cell is considered to be the smallest unit of life. It carries out a variety of biochemical reactions through the activities of proteins and protein enzymes. In order to perform functions, proteins must be in their native folded state together with the correct environmental conditions. A slight change in pH or temperature could cause disruption to the electrostatic interactions within the protein, thus leading to conformational change and the loss of activity. Studies have shown that solubility could be enhanced by increasing the number of charges on the protein surface. And from the studies of extremophiles, we learned that the presence of non-polar aromatic residues could be a key for thermostable proteins. Thus, charges are important to determine the function and adaptation of proteins. Over the decades, a large amount of protein sequence and structure information relating to molecular biology has been produced. By employing algorithms, computational and statistical techniques, it is possible to analyse these data to solve biological problems. Often these investigations are based mainly on sequences since their numbers outstrip the number of available structures. However, adding structures would allow us to investigate problems such as the relationship between charges, sequence, structure and functions, which is the aim of this study. In this thesis, the relationships between proteins and function were examined by various electrostatic features derived from charges and also geometric properties from structures. One interesting finding is that the averaged value of pH of maximum stability of proteins within a subcellular location was highly correlated to the pH of that subcellular compartment, which was due to pKas (of histidines), and their locations on the proteins. We also found that the size of the largest non-charged patch on the protein surface correlates with solubility and provides a predictor with a maximum accuracy of 76%. 
The use of novel charge-based methods shows little improvement in distinguishing between enzymes and non-enzymes. However, the method of using real charges with a grid size of 1 angstrom has paved the way for using charge and dipole patterns from enzyme active sites to distinguish different enzymes. Finally, a web-tool for displaying conserved residues on 3D protein structures is made available to the public for identifying residues that may be of functional importance.
37

NI, JIAQIAN. "Plasma Biomarkers for Age-Related Macular Degeneration." Cleveland State University / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=csu1236700270.

38

Sarafraz, Farzaneh. "Finding conflicting statements in the biomedical literature." Thesis, University of Manchester, 2012. https://www.research.manchester.ac.uk/portal/en/theses/finding-conflicting-statements-in-the-biomedical-literature(963e490a-eeea-4f4c-864d-fb318899beed).html.

Abstract:
The main archive of life sciences literature currently contains more than 18,000,000 references, and it is virtually impossible for any human to stay up-to-date with this large number of papers, even in a specific sub-domain. Not every fact that is reported in the literature is novel and distinct. Scientists report repeat experiments, or refer to previous findings. Given the large number of publications, it is not surprising that information on certain topics is repeated over a number of publications. From consensus to contradiction, there are all shades of agreement between the claimed facts in the literature, and considering the volume of the corpus, conflicting findings are not unlikely. Finding such claims is particularly interesting for scientists, as they can present opportunities for knowledge consolidation and future investigations. In this thesis we present a method to extract and contextualise statements about molecular events as expressed in the biomedical literature, and to find those that potentially conflict with each other. The approach uses a system that detects event negations and speculation, and combines those with contextual features (e.g. type of event, species, and anatomical location) to build a representational model for establishing relations between different biological events, including relations concerning conflicts. In the detection of negations and speculations, rich lexical, syntactic, and semantic features have been exploited, including the syntactic command relation. Different parts of the proposed method have been evaluated in the context of the BioNLP 09 challenge. The average F-measures for event negation and speculation detection were 63% (with precision of 88%) and 48% (with precision of 64%) respectively. 
An analysis of a set of 50 extracted event pairs identified as potentially conflicting revealed that 32 of them showed some degree of conflict (64%); 10 event pairs (20%) needed a more complex biological interpretation to decide whether there was a conflict. We also provide an open source integrated text mining framework for extracting events and their context on a large-scale basis using a pipeline of tools that are available or have been developed as part of this research, along with 72,314 potentially conflicting molecular event pairs that have been generated by mining the entire body of accessible biomedical literature. We conclude that, whilst automated conflict mining would need more comprehensive context extraction, it is feasible to provide a support environment for biologists to browse potential conflicting statements and facilitate data and knowledge consolidation.
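The abstract reports F-measure and precision but not recall. As a rough illustration only, the sketch below solves the balanced F1 formula, F1 = 2PR / (P + R), for recall. It assumes the reported F-measure is the balanced F1, and because the abstract gives averaged figures, the implied recall is approximate. The function name `recall_from_f1` is a hypothetical helper, not part of the thesis.

```python
def recall_from_f1(f1: float, precision: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given F1 and precision P."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall exists for this F1/precision pair")
    return f1 * precision / denom


# Figures quoted in the abstract (averaged values, so results are approximate).
negation_recall = recall_from_f1(0.63, 0.88)
speculation_recall = recall_from_f1(0.48, 0.64)

# Share of the 50 manually analysed event pairs that showed some conflict.
conflict_rate = 32 / 50

print(f"implied negation recall:    {negation_recall:.2f}")
print(f"implied speculation recall: {speculation_recall:.2f}")
print(f"conflicting pairs:          {conflict_rate:.0%}")
```

Under these assumptions, the high precision and lower F-measure imply a recall of roughly one half for negation detection and somewhat less for speculation detection.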
39

Oyeyemi, Oyebode. "Modelling HIV-1 interaction with the host system." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/modelling-hiv1-interaction-with-the-host-system(41095e34-78dd-4b75-bd25-9695a4cc768f).html.

Abstract:
Human immunodeficiency virus (HIV-1) is the pathogenic agent of HIV infection that precedes the total breakdown of cellular immunity, a condition known as acquired immunodeficiency syndrome (AIDS). The pandemic nature of the disease has prompted intense research into its biology. Already, much is known about HIV-1 infection, life cycle, and progression to AIDS. Systems biology enables the combination of complex data from these studies into a framework where their effect on the various levels of cellular organization (i.e. pathways, cells, tissues, organs and the whole body) could be studied in silico. In this thesis, first, we reviewed our knowledge of the HIV-1 Human Interaction Database. We examined its contents and identified processes that HIV-1 was not previously known to interact with. Then, we attempted an in silico dynamic model of HIV-1 interaction. We built a model of HIV-1 interaction with the CD4 T cell activation pathway comprising 137 nodes (16 HIV-1, 121 human) and 336 interactions. The model reproduced expected patterns of T cell activation. Using interaction graph properties, we identified 26 host cell factors, including MAPK1&3, Ikkb-Ikky-Ikka and PKA, which contribute to the net activation or inhibition of viral proteins. By following a logical Boolean formalism, we identified 9 host cell factors essential to the functions of viral proteins in the activation pathway. This was the first attempt to model dynamic viral-host interaction relationships. Then, we organized HIV-1 interacting host genes into modules to represent cellular processes needed by the virus. We combined HIV-1 interactions with host gene GO annotations to classify host genes according to these needed cellular processes. We obtained 201 modules and found that the same set of viral proteins does not interact with host genes having similar modules, suggesting intelligence in the virus's co-ordination of host processes. This work is one of a growing list that explores coordination of HIV-1 interactions. 
But more importantly, it would be beneficial to functionally downsize the large dynamic HIV-1 interaction network. Finally, in our discussion, we discuss our results and suggest possible ways in which our work on dynamic models could be improved. This work is opening up a new field of systems virology that studies the effect of viruses on the host in terms of its temporal and spatial aspects.
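To make the idea of a logical Boolean formalism concrete, here is a minimal sketch of a synchronous Boolean network with single-node knockouts. It is not the thesis model: the five nodes (`signal`, `host_a`, `host_b`, `host_c`, `viral_x`) and their update rules are hypothetical placeholders, and calling a host factor "essential" when knocking it out leaves the viral node inactive is an assumption made for illustration.

```python
from typing import Callable, Dict, FrozenSet, List

State = Dict[str, bool]
Rules = Dict[str, Callable[[State], bool]]

# Hypothetical toy rules; node names are placeholders, not taken from the thesis.
RULES: Rules = {
    "signal": lambda s: s["signal"],                  # external input, held constant
    "host_a": lambda s: s["signal"],
    "host_b": lambda s: s["signal"],
    "host_c": lambda s: s["host_a"] or s["host_b"],   # redundant upstream inputs
    "viral_x": lambda s: s["host_c"] and s["host_a"],
}


def step(state: State, rules: Rules, knocked_out: FrozenSet[str]) -> State:
    """Synchronous update: every node reads the previous state."""
    return {n: (False if n in knocked_out else rule(state)) for n, rule in rules.items()}


def run(initial: State, rules: Rules,
        knocked_out: FrozenSet[str] = frozenset(), max_steps: int = 50) -> State:
    """Iterate until a fixed point; return the last state if none is reached."""
    state = {n: (False if n in knocked_out else v) for n, v in initial.items()}
    for _ in range(max_steps):
        nxt = step(state, rules, knocked_out)
        if nxt == state:
            return state
        state = nxt
    return state


def essential_factors(initial: State, rules: Rules,
                      target: str, candidates: List[str]) -> List[str]:
    """Candidates whose knockout leaves the target node inactive."""
    return [c for c in candidates if not run(initial, rules, frozenset({c}))[target]]


initial = {"signal": True, "host_a": False, "host_b": False,
           "host_c": False, "viral_x": False}
essential = essential_factors(initial, RULES, "viral_x", ["host_a", "host_b", "host_c"])
print(essential)
```

In this toy network `host_b` is not essential because `host_a` can activate `host_c` on its own, which shows how redundancy in a pathway separates essential factors from merely contributing ones.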
40

"Development of bioinformatics approaches for biological database integration and genome-wide identification of RNA editing sites." 2014. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1291609.

Abstract:
Guo, Mengbiao. Thesis M.Phil. Chinese University of Hong Kong 2014. Includes bibliographical references (leaves 95-105). Abstracts also in Chinese. Title from PDF title page (viewed on 28 October 2016).
41

"Bioinformatics analyses of high-throughput genomic and transcriptomic data from nasopharyngeal carcinoma cell line, xenografts and associated Epstein-Barr virus." 2014. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1291547.

Abstract:
This thesis presents the construction of a computational system for studying nasopharyngeal carcinoma (NPC) using high-throughput sequencing data. The system involves several components, including discovery of a gene fusion in an NPC cell line, construction of the Epstein-Barr virus (EBV) genome, and evaluation of alignment approaches for contaminated sequencing data. We discovered a gene fusion (UBR5-ZNF423) in an NPC cell line (C666-1), which was verified by lab experiments and found in 8.3% of primary tumors. Regulation of this fusion gene was found to affect the growth of cancer cells. We constructed the EBV genome in C666-1. It serves as an important reference for studying this important NPC cell line, which was the only NPC cell line in the world for a long time. We also evaluated three mapping approaches. Two of them are designed to filter out potential mouse contamination reads in human sequencing data, which can originate from NPC human-in-mouse xenografts. We found that special care should always be applied to contaminated data. Although direct mapping can give acceptable results in most cases, the combined-based approach is suggested. It can effectively reduce false positive variants while retaining a sufficient number of true positive variants. 
The filtering approach is an alternative to the combined-based approach that can also effectively reduce contamination when memory is not sufficient.
This thesis uses computational methods to study nasopharyngeal carcinoma systematically, with data generated by high-throughput sequencing. Its chapters cover the search for fusion genes in an NPC cell line, the assembly of the genome of EBV, a virus that is latent in humans and can cause NPC, and the evaluation of several alignment approaches for contaminated sequences. We discovered a fusion gene (UBR5-ZNF423) in the NPC cell line C666-1 and confirmed it experimentally; the fusion was also found in 8.3% of primary tumor samples. In addition, regulation of this fusion gene was found to affect the growth of cancer cells. C666-1 was for a long time the only NPC cell line in the world and therefore has great reference value; in this study we assembled the EBV genome in C666-1 to serve as a reference for studying the cell line. We also evaluated three alignment approaches, two of which are designed to filter mouse genome contamination out of human sequencing data; such contamination can arise from xenografts, in which human tumors are grown in mice. We recommend using one of the special approaches rather than direct alignment whenever circumstances allow. Although direct alignment already performs reasonably, the combined-genome approach effectively reduces false positive variants while retaining enough true positive variants, so it should be used whenever possible. The filtering approach is another special approach that also effectively reduces false positive variants; it requires less memory than the combined-genome approach and can be used when computer memory is insufficient.
Tso, Kai Yuen. Thesis M.Phil. Chinese University of Hong Kong 2014. Includes bibliographical references (leaves 112-120). Abstracts also in Chinese. Title from PDF title page (viewed on 24 October 2016). Detailed summary in vernacular field only.
42

"Integrative Bioinformatics Analyses Using Next-Generation Sequencing: Super-Enhancer Characterization in Skeletal Muscle Differentiation and Genomic and Methylomic Exploration in Plasma DNA." 2016. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1292440.
