
Dissertations / Theses on the topic 'Multiple data sources'


Consult the top 50 dissertations / theses for your research on the topic 'Multiple data sources.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Zhang, Ping. "Learning from Multiple Knowledge Sources." Diss., Temple University Libraries, 2013. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/214795.

Full text
Abstract:
Computer and Information Science, Ph.D.
In supervised learning, it is usually assumed that true labels are readily available from a single annotator or source. However, recent advances in corroborative technology have given rise to situations where the true label of the target is unknown. In such problems, multiple sources or annotators are often available that provide noisy labels of the targets. In these multi-annotator problems, building a classifier in the traditional single-annotator manner, without regard for the annotator properties, may not be effective in general. In recent years, how to make the best use of the labeling information provided by multiple annotators to approximate the hidden true concept has drawn the attention of researchers in machine learning and data mining. In our previous work, we developed a probabilistic method (the MAP-ML algorithm) that iteratively evaluates the different annotators and estimates the hidden true labels. However, that method assumes the error rate of each annotator is consistent across all the input data. This is an impractical assumption in many cases, since annotator knowledge can fluctuate considerably depending on the groups of input instances. In this dissertation, one of our proposed methods, the GMM-MAPML algorithm, follows MAP-ML but relaxes the data-independent assumption; that is, we assume an annotator may not be consistently accurate across the entire feature space. GMM-MAPML uses a Gaussian mixture model (GMM) and the Bayesian information criterion (BIC) to find the model that best approximates the distribution of the instances. It then alternates between the maximum a posteriori (MAP) estimation of the hidden true labels and the maximum-likelihood (ML) estimation of the quality of the multiple annotators at each Gaussian component. Recent studies show that employing more annotators, regardless of their expertise, does not necessarily result in improved aggregate performance. In this dissertation, we also propose a novel algorithm that integrates multiple annotators by Aggregating Experts and Filtering Novices, which we call AEFN. AEFN iteratively evaluates annotators, filters out the low-quality annotators, and re-estimates the labels based only on information obtained from the good annotators. The noisy annotations we integrate can come from any combination of human annotators and previously existing machine-based classifiers, and thus AEFN can be applied to many real-world problems. Emotional speech classification, CASP9 protein disorder prediction, and biomedical text annotation experiments show a significant performance improvement of the proposed methods (GMM-MAPML and AEFN) compared to the majority-voting baseline and the previous data-independent MAP-ML method. Recent experiments include predicting novel drug indications (i.e., drug repositioning) for both approved drugs and new molecules by integrating multiple chemical, biological, or phenotypic data sources.
Temple University--Theses
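The model-selection step at the heart of GMM-MAPML can be illustrated with a short, hedged sketch: scikit-learn's GaussianMixture with BIC on placeholder data, with the alternating MAP/ML estimation only approximated here by a per-component majority-vote comparison.

```python
# Minimal sketch of the GMM + BIC model-selection step described above,
# using scikit-learn rather than the thesis implementation. The feature
# matrix X and the annotator label matrix are placeholders.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))            # instance features (placeholder data)
labels = rng.integers(0, 2, (500, 3))    # noisy labels from 3 annotators (placeholder)

# Pick the number of Gaussian components that minimizes BIC.
candidates = [GaussianMixture(n_components=k, random_state=0).fit(X) for k in range(1, 7)]
best = min(candidates, key=lambda g: g.bic(X))
component = best.predict(X)              # each instance is assigned to a component

# Per-component annotator quality against a majority-vote estimate of the
# hidden label -- a crude stand-in for the alternating MAP/ML estimation.
majority = (labels.mean(axis=1) >= 0.5).astype(int)
for c in np.unique(component):
    mask = component == c
    agreement = (labels[mask] == majority[mask, None]).mean(axis=0)
    print(f"component {c}: per-annotator agreement {np.round(agreement, 2)}")
```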
APA, Harvard, Vancouver, ISO, and other styles
2

Brizzi, Francesco. "Estimating HIV incidence from multiple sources of data." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/273803.

Full text
Abstract:
This thesis develops novel statistical methodology for estimating the incidence and the prevalence of Human Immunodeficiency Virus (HIV) using routinely collected surveillance data. The robust estimation of HIV incidence and prevalence is crucial to correctly evaluate the effectiveness of targeted public health interventions and to accurately predict the HIV-related burden imposed on healthcare services. Bayesian CD4-based multi-state back-calculation methods are a key tool for monitoring the HIV epidemic, providing estimates of HIV incidence and diagnosis rates by disentangling their competing contributions to the observed surveillance data. Improving the effectiveness of public health interventions requires targeting specific age groups at high risk of infection; however, existing methods are limited in that they do not allow such subgroups to be identified. Therefore, the methodological focus of this thesis lies in developing a rigorous statistical framework for age-dependent back-calculation, in order to achieve the joint estimation of age- and time-dependent HIV incidence and diagnosis rates. Key challenges we specifically addressed include ensuring the computational feasibility of the proposed methods, an issue that has previously hindered extensions of back-calculation, and achieving the joint modelling of time- and age-specific incidence. The suitability of non-parametric bivariate smoothing methods for modelling the age- and time-specific incidence has been investigated in detail within comprehensive simulation studies. Furthermore, in order to enhance the generalisability of the proposed model, we developed back-calculation methods that can admit surveillance data less rich in detail; these handle surveillance data collected from an intermediate point of the epidemic, or only available on a coarse scale, and concern both age-dependent and age-independent back-calculation. The applicability of the proposed methods is illustrated using routinely collected surveillance data from England and Wales, for the HIV epidemic among men who have sex with men (MSM).
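For orientation, the back-calculation idea that such methods extend can be written schematically as a convolution of past infections with a diagnosis-delay distribution; this is the classical form, not the age- and CD4-stratified model developed in the thesis.

```latex
% Schematic back-calculation relation (classical form): expected diagnoses
% at time t arise from past infections h(s) convolved with the
% diagnosis-delay distribution f(. | s); observed diagnoses are Poisson.
\mu(t) \;=\; \int_{0}^{t} h(s)\, f(t - s \mid s)\, \mathrm{d}s ,
\qquad
D(t) \sim \mathrm{Poisson}\bigl(\mu(t)\bigr)
```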
APA, Harvard, Vancouver, ISO, and other styles
3

Yeang, Chen-Hsiang 1969. "Inferring regulatory networks from multiple sources of genomic data." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/28731.

Full text
Abstract:
Thesis (Sc. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 279-299).
This thesis addresses the problems of modeling the gene regulatory system from multiple sources of large-scale datasets. In the first part, we develop a computational framework of building and validating simple, mechanistic models of gene regulation from multiple sources of data. These models, which we call physical network models, annotate the network of molecular interactions with several types of attributes (variables). We associate model attributes with physical interaction and knock-out gene expression data according to the confidence measures of data and the hypothesis that gene regulation is achieved via molecular interaction cascades. By applying standard model inference algorithms, we are able to obtain the configurations of model attributes which optimally fit the data. Because existing datasets do not provide sufficient constraints to the models, there are many optimal configurations which fit the data equally well. In the second part, we develop an information theoretic score to measure the expected capacity of new knock-out experiments in terms of reducing the model uncertainty. We collaborate with biologists to perform suggested knock-out experiments and analyze the data. The results indicate that we can reduce model uncertainty by incorporating new data. The first two parts focus on the regulatory effects along single pathways. In the third part, we consider the combinatorial effects of multiple transcription factors on transcription control. We simplify the problem by characterizing a combinatorial function of multiple regulators in terms of the properties of single regulators: the function of a regulator and its direction of effectiveness. With this characterization, we develop an incremental algorithm to identify the regulatory models from protein-DNA binding and gene expression data. These models to a large extent agree with the knowledge of gene regulation pertaining to the corresponding regulators. The three works in this thesis provide a framework of modeling gene regulatory networks.
APA, Harvard, Vancouver, ISO, and other styles
4

Liao, Zhining. "Query processing for data integration from multiple data sources over the Internet." Thesis, University of Ulster, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.422192.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Dembowski, John S. "dbUNiFier: A Framework for Automated Unification of Textual Data in Multiple Remote Data Sources." UNF Digital Commons, 2003. http://digitalcommons.unf.edu/etd/366.

Full text
Abstract:
Over time, advances in database technology and utilization have resulted in a rapid increase in the number and types of data sources. Simultaneously, numerous methods of unifying these various data sources have emerged. Research has shown that a more comprehensive set of data attribute matches between multiple schemas can be detected by combining a number of the unification methodologies as opposed to using a single method. In this research project, a unification framework, dbUNiFier, has been proposed as an approach to allow for easy integration of both existing and future unification methods and data sources.
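As a toy illustration of why combining unification methodologies detects more attribute matches than any single method, the following sketch unions an edit-distance matcher with a token-overlap matcher; the attribute names and threshold are invented for the example and are not taken from dbUNiFier.

```python
# Toy illustration of combining two attribute-matching heuristics
# (edit-distance similarity and token overlap). Thresholds and column
# names are invented; this is not the dbUNiFier framework itself.
from difflib import SequenceMatcher

def edit_sim(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def token_sim(a: str, b: str) -> float:
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def match(attrs_a, attrs_b, threshold=0.6):
    pairs = []
    for a in attrs_a:
        for b in attrs_b:
            score = max(edit_sim(a, b), token_sim(a, b))  # union of the two methods
            if score >= threshold:
                pairs.append((a, b, round(score, 2)))
    return pairs

# "birth_date" vs "date_of_birth" is caught by token overlap but not by
# edit distance alone, which is the point of combining methods.
print(match(["cust_name", "birth_date"], ["customer_name", "date_of_birth"]))
```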
APA, Harvard, Vancouver, ISO, and other styles
6

Gupta, Sunil Kumar. "Unsupervised modeling of multiple data sources : a latent shared subspace approach." Thesis, Curtin University, 2011. http://hdl.handle.net/20.500.11937/2583.

Full text
Abstract:
The growing number of information sources has given rise to joint analysis. While the research community has mainly focused on analyzing data from a single source, there have been relatively few attempts to jointly analyze multiple data sources by exploiting their shared statistical strengths. In general, the data from these sources emerge without labeling information, and thus it is imperative to perform the joint analysis in an unsupervised manner. This thesis addresses the above problem and presents a general shared subspace learning framework for jointly modeling multiple related data sources. Since the data sources are related, there exist common structures across these sources, which can be captured through a shared subspace. However, each source also has some individual structures, which can be captured through an individual subspace. Incorporating these concepts in nonnegative matrix factorization (NMF) based subspace learning, we develop a nonnegative shared subspace learning model for two data sources and demonstrate its application to tag-based social media retrieval. Extending this model, we impose additional regularization constraints of mutual orthogonality on the shared and individual subspaces and show that, compared to its unregularized counterpart, the new regularized model effectively deals with the problem of negative knowledge transfer – a key issue faced by transfer learning methods. The effectiveness of the regularized model is demonstrated through retrieval and clustering applications for a variety of data sets. To take advantage of more than one auxiliary source, we extend the above models from two sources to multiple sources, with the added flexibility of allowing sources to have arbitrary sharing configurations. The usefulness of this model is demonstrated through improved performance achieved with multiple auxiliary sources. In addition, this model is used to relate items from disparate media types, allowing us to perform cross-media retrieval using tags. Departing from the nonnegative models, we use a linear-Gaussian framework and develop Bayesian shared subspace learning, which not only models mixed-sign data but also learns probabilistic subspaces. Learning the subspace dimensionalities for the shared subspace models plays an important role in optimum knowledge transfer but requires model selection – a task that is computationally intensive and time consuming. To this end, we propose a nonparametric Bayesian joint factor analysis model that circumvents the problem of model selection by using a hierarchical beta process prior, inferring subspace dimensionalities automatically from the data. The effectiveness of this model is shown on both synthetic and real data sets. For the synthetic data set, successful recovery of both shared and individual subspace dimensionalities is demonstrated, whilst for the real data sets, the model outperforms recent state-of-the-art techniques for text modeling and image retrieval.
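A compact sketch of the basic two-source idea (a shared nonnegative basis plus per-source individual bases, fitted with plain multiplicative updates) is given below; it omits the orthogonality regularization, the multi-source generalization and the Bayesian variants described in the abstract, and is not the thesis code.

```python
# Sketch of a two-source nonnegative factorization with a shared basis W
# and individual bases V1, V2:  X1 ~ W@Hs1 + V1@H1,  X2 ~ W@Hs2 + V2@H2.
# Plain Lee-Seung-style multiplicative updates on the summed Frobenius
# objective; regularized and Bayesian variants are not reproduced here.
import numpy as np

def shared_nmf(X1, X2, k_shared=5, k_ind=3, iters=200, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    d = X1.shape[0]
    W = rng.random((d, k_shared))
    V1, V2 = rng.random((d, k_ind)), rng.random((d, k_ind))
    Hs1, H1 = rng.random((k_shared, X1.shape[1])), rng.random((k_ind, X1.shape[1]))
    Hs2, H2 = rng.random((k_shared, X2.shape[1])), rng.random((k_ind, X2.shape[1]))
    for _ in range(iters):
        R1, R2 = W @ Hs1 + V1 @ H1, W @ Hs2 + V2 @ H2     # current reconstructions
        Hs1 *= (W.T @ X1) / (W.T @ R1 + eps)
        Hs2 *= (W.T @ X2) / (W.T @ R2 + eps)
        H1 *= (V1.T @ X1) / (V1.T @ R1 + eps)
        H2 *= (V2.T @ X2) / (V2.T @ R2 + eps)
        R1, R2 = W @ Hs1 + V1 @ H1, W @ Hs2 + V2 @ H2
        W *= (X1 @ Hs1.T + X2 @ Hs2.T) / (R1 @ Hs1.T + R2 @ Hs2.T + eps)  # shared basis
        V1 *= (X1 @ H1.T) / (R1 @ H1.T + eps)
        V2 *= (X2 @ H2.T) / (R2 @ H2.T + eps)
    return W, (V1, Hs1, H1), (V2, Hs2, H2)

X1 = np.abs(np.random.default_rng(1).normal(size=(40, 60)))
X2 = np.abs(np.random.default_rng(2).normal(size=(40, 80)))
W, src1, src2 = shared_nmf(X1, X2)
print(np.linalg.norm(X1 - (W @ src1[1] + src1[0] @ src1[2])))  # residual for source 1
```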
APA, Harvard, Vancouver, ISO, and other styles
7

Argudo, Medrano Oscar. "Realistic reconstruction and rendering of detailed 3D scenarios from multiple data sources." Doctoral thesis, Universitat Politècnica de Catalunya, 2018. http://hdl.handle.net/10803/620733.

Full text
Abstract:
During the last years, we have witnessed significant improvements in digital terrain modeling, mainly through photogrammetric techniques based on satellite and aerial photography, as well as laser scanning. These techniques allow the creation of Digital Elevation Models (DEM) and Digital Surface Models (DSM) that can be streamed over the network and explored through virtual globe applications like Google Earth or NASA WorldWind. The resolution of these 3D scenes has improved noticeably in the last years, reaching in some urban areas resolutions of 1 m or less for DEM and buildings, and less than 10 cm per pixel in the associated aerial imagery. However, in rural, forest or mountainous areas, the typical resolution for elevation datasets ranges between 5 and 30 meters, and the typical resolution of the corresponding aerial photographs ranges between 25 cm and 1 m. This current level of detail is only sufficient for aerial points of view, but as the viewpoint approaches the surface the terrain loses its realistic appearance. One approach to augment the detail on top of currently available datasets is adding synthetic details in a plausible manner, i.e. including elements that match the features perceived in the aerial view. By combining the real dataset with the instancing of models on the terrain and other procedural detail techniques, the effective resolution can potentially become arbitrary. There are several applications that do not need an exact reproduction of the real elements but would greatly benefit from plausibly enhanced terrain models: videogames and entertainment applications, visual impact assessment (e.g. how a new ski resort would look), virtual tourism, simulations, etc. In this thesis we propose new methods and tools to help the reconstruction and synthesis of high-resolution terrain scenes from currently available data sources, in order to achieve realistic-looking ground-level views. In particular, we decided to focus on rural scenarios, mountains and forest areas. Our main goal is the combination of plausible synthetic elements and procedural detail with publicly available real data to create detailed 3D scenes of existing locations. Our research has focused on the following contributions:
- An efficient pipeline for aerial imagery segmentation
- Plausible terrain enhancement from high-resolution examples
- Super-resolution of DEM by transferring details from the aerial photograph
- Synthesis of arbitrary tree picture variations from a reduced set of photographs
- Reconstruction of 3D tree models from a single image
- A compact and efficient tree representation for real-time rendering of forest landscapes
APA, Harvard, Vancouver, ISO, and other styles
8

Tremblay, Monica Chiarini. "Uncertainty in the information supply chain : integrating multiple health care data sources." [Tampa, Fla.] : University of South Florida, 2007. http://purl.fcla.edu/usf/dc/et/SFE0002086.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Endress, William. "Merging Multiple Telemetry Files from Widely Separated Sources for Improved Data Integrity." International Foundation for Telemetering, 2012. http://hdl.handle.net/10150/581824.

Full text
Abstract:
Merging telemetry data from multiple data sources into a single file provides the ability to fill in gaps in the data and to reduce noise by taking advantage of the multiple sources. This is desirable when analyzing the data, as there is only one file to work from. Also, analysts will spend less time trying to explain away gaps and spikes in the data that are attributable to dropped and noisy telemetry frames, leading to more accurate reports. This paper discusses the issues and solutions for doing the merge.
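A rough sketch of the kind of merge described here, assuming the streams have already been decoded to time-stamped values; the inline series stand in for frames received at separate sites.

```python
# Rough sketch of merging several telemetry streams of the same parameter:
# align the sources on a common time base, take a median across sources to
# suppress noise spikes, and interpolate short gaps that remain.
import numpy as np
import pandas as pd

t = pd.date_range("2012-01-01 12:00:00", periods=50, freq="100ms")
rng = np.random.default_rng(0)
truth = pd.Series(np.sin(np.linspace(0, 3, 50)), index=t)

site_a = (truth + rng.normal(0, 0.02, 50)).drop(t[10:15])   # dropout at site A
site_b = truth + rng.normal(0, 0.02, 50)
site_b.iloc[30] = 9.9                                       # noise spike at site B
site_c = (truth + rng.normal(0, 0.02, 50)).drop(t[40:44])   # dropout at site C

aligned = pd.concat([site_a, site_b, site_c], axis=1, keys=["a", "b", "c"])
merged = aligned.median(axis=1, skipna=True)    # spike suppressed, gaps covered by other sites
merged = merged.interpolate(limit=5)            # bridge any short gap common to all sources
print(merged.head())
```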
APA, Harvard, Vancouver, ISO, and other styles
10

Cooke, Payton. "Comparative Analysis of Multiple Data Sources for Travel Time and Delay Measurement." Thesis, The University of Arizona, 2016. http://hdl.handle.net/10150/622847.

Full text
Abstract:
Arterial performance measurement is an essential tool for both researchers and practitioners, guiding decisions on traffic management, future improvements, and public information. Link travel time and intersection control delay are two primary performance measures that are used to evaluate arterial level of service. Despite recent technological advancements, collecting travel time and intersection delay data can be a time-consuming and complicated process. Limited budgets, numerous available technologies, a rapidly changing field, and other challenges make performance measurement and comparison of data sources difficult. Three common data collection sources (probe vehicles, Bluetooth media access control readers, and manual queue length counts) are often used for performance measurement and validation of new data methods. Comparing these and other data sources is important as agencies and researchers collect arterial performance data. This study provides a methodology for comparing data sources, using statistical tests and linear correlation to compare methods and identify strengths and weaknesses. Additionally, this study examines data normality as an issue that is seldom considered, yet can affect the performance of statistical tests. These comparisons can provide insight into the selection of a particular data source for use in the field or for research. Data collected along Grant Road in Tucson, Arizona, was used as a case study to evaluate the methodology and the data sources. For evaluating travel time, GPS probe vehicle and Bluetooth sources produced similar results. Bluetooth can provide a greater volume of data more easily in addition to samples large enough for more rigorous statistical evaluation, but probe vehicles are more versatile and provide higher resolution data. For evaluating intersection delay, probe vehicle and queue count methods did not always produce similar results.
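The comparison methodology outlined in the abstract (check normality, then choose a parametric or non-parametric test, and report linear correlation on matched runs) can be sketched as follows; the travel-time arrays are placeholders, not the Grant Road data.

```python
# Hedged sketch of comparing two travel-time data sources: normality check,
# then Welch's t-test or Mann-Whitney U, plus Pearson correlation on paired
# runs. The arrays below are simulated placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
probe = rng.normal(210, 25, 40)        # probe-vehicle travel times (s), placeholder
bluetooth = rng.normal(205, 30, 300)   # Bluetooth MAC-match travel times (s), placeholder

normal = (stats.shapiro(probe).pvalue > 0.05 and
          stats.shapiro(bluetooth).pvalue > 0.05)
if normal:
    test = stats.ttest_ind(probe, bluetooth, equal_var=False)   # Welch's t-test
else:
    test = stats.mannwhitneyu(probe, bluetooth)
print("difference between sources significant?", test.pvalue < 0.05)

# Correlation only makes sense on paired observations (e.g., the same runs).
paired_probe, paired_bt = probe[:30], bluetooth[:30]
r, p = stats.pearsonr(paired_probe, paired_bt)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```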
APA, Harvard, Vancouver, ISO, and other styles
11

Larsson, Jimmy. "Taxonomy Based Image Retrieval : Taxonomy Based Image Retrieval using Data from Multiple Sources." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-180574.

Full text
Abstract:
With a multitude of images available on the Internet, how do we find what we are looking for? This project tries to determine how much the precision and recall of search queries are improved by using a word taxonomy with traditional Text-Based Image Search and Content-Based Image Search. By applying a word taxonomy to different data sources, a strong keyword filter and a keyword extender were implemented and tested. The results show that, depending on the implementation, either the precision or the recall can be increased. By using a similar approach in real-life implementations, it is possible to push images with higher precision to the front while keeping a high recall value, thus increasing the experienced relevance of image search.
APA, Harvard, Vancouver, ISO, and other styles
12

Gadaleta, Emanuela. "A multidisciplinary computational approach to model cancer-omics data : organising, integrating and mining multiple sources of data." Thesis, Queen Mary, University of London, 2015. http://qmro.qmul.ac.uk/xmlui/handle/123456789/8141.

Full text
Abstract:
It is imperative that the cancer research community has the means with which to effectively locate, access, manage, analyse and interpret the plethora of data values being generated by novel technologies. This thesis addresses this unmet requirement by using pancreatic cancer and breast cancer as prototype malignancies to develop a generic integrative transcriptomic model. The analytical workflow was initially applied to publicly available pancreatic cancer data from multiple experimental types. The transcriptomic landscape of comparative groups was examined both in isolation and relative to each other. The main observations included (i) a clear separation of profiles based on experimental type, (ii) identification of three subgroups within normal tissue samples resected adjacent to pancreatic cancer, each showing disruptions to biofunctions previously associated with pancreatic cancer (iii) and that cell lines and xenograft models are not representative of changes occurring during pancreatic tumourigenesis. Previous studies examined transcriptomic profiles across 306 biological and experimental samples, including breast cancer. The plethora of clinical and survival data readily available for breast cancer, compared to the paucity of publicly available pancreatic cancer data, allowed for expansion of the pipeline’s infrastructure to include functionalities for cross-platform and survival analysis. Application of this enhanced pipeline to multiple cohorts of triple negative and basal-like breast cancers identified differential risk groups within these breast cancer subtypes. All of the main experimental findings of this thesis are being integrated with the Pancreatic Expression Database and the Breast Cancer Campaign Tissue Bank bioinformatics portal, which enhances the sharing capacity of this information and ensures its exposure to a wider audience.
APA, Harvard, Vancouver, ISO, and other styles
13

Stich, Dennis [Verfasser], and George [Akademischer Betreuer] Craig. "Convection initiation : detection and nowcasting with multiple data sources / Dennis Stich. Betreuer: George Craig." München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2013. http://d-nb.info/1036161099/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Solera, Melissa Viola Eitzel. "Synthesizing multiple data sources to understand the population and community ecology of California trees." Thesis, University of California, Berkeley, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3686015.

Full text
Abstract:
In this work, I answer timely questions regarding tree growth, tree survival, and community change in California tree species, using a variety of sophisticated statistical and remote sensing tools. In Chapter 1, I address tree growth for a single tree species with a thorough explanation of hierarchical state-space models for forest inventory data. Understanding tree growth as a function of tree size is important for a multitude of ecological and management applications. Determining what limits growth is of central interest, and forest inventory permanent plots are an abundant source of long-term information but are highly complex. Observation error and multiple sources of shared variation make these data challenging to use for growth estimation. I account for these complexities and incorporate potential limiting factors into a hierarchical state-space model. I estimate the diameter growth of white fir in the Sierra Nevada of California from forest inventory data, showing that estimating such a model is feasible in a Bayesian framework using readily available modeling tools. In this forest, white fir growth depends strongly on tree size, total plot basal area, and unexplained variation between individual trees. Plot-level resource supply variables do not have a strong impact on inventory-size trees. This approach can be applied to other networks of permanent forest plots, leading to greater ecological insights on tree growth.
In Chapter 2, I expand my state-space modeling to examine survival in seven tree species, as well as investigating the results of modeling them in aggregate and comparing with the individual species models. Declining tree survival is a complex, well-recognized problem, but studies have been largely limited to relatively rare old-growth forests or low-diversity systems, and to models which are species-aggregated or cannot easily accommodate yearly climate variables. I estimate survival models for a relatively diverse second-growth forest in the Sierra Nevada of California using a hierarchical state-space framework. I account for a mosaic of measurement intervals and random plot variation, and I directly include yearly stand development variables alongside climate variables and topographic proxies for nutrient limitation. My model captures the expected dependence of survival on tree size. At the community level, stand development variables account for decreasing survival trends, but species-specific models reveal a diversity of factors influencing survival. Species time trends in survival do not always conform to existing theories of Sierran forest dynamics, and size relationships with survival differ for each species. Within species, low survival is concentrated in susceptible subsets of the population and single estimates of annual survival rates do not reflect this heterogeneity in survival. Ultimately only full population dynamics integrating these results with models of recruitment can address the potential for community shifts over time.
In Chapter 3, I combine statistical modeling with remote sensing techniques to investigate whether topographic variables influence changes in woody cover. In the North Coast of California, changes in fire management have resulted in conversion of oak woodland into coniferous forest, but the controls on this slow transition are unknown. Historical aerial imagery, in combination with Object-Based Image Analysis (OBIA), allows us to classify land cover types from the 1940s and compare these maps with recent cover. Few studies have used these maps to model drivers of cover change, partly due to two statistical challenges: 1) appropriately accounting for spatial autocorrelation and 2) appropriately modeling percent cover, which is bounded between 0 and 100 and not normally distributed. I study the change in woody cover in California's North Coast using historical and recent high-spatial-resolution imagery. I classify the imagery using eCognition Developer and aggregate the resulting maps to the scale of a Digital Elevation Model (DEM) in order to understand topographic drivers of woody cover change. I use Generalized Additive Models (GAMs) with a quasi-binomial probability distribution to account for spatial autocorrelation and the boundedness of the percent woody cover variable. I find that historical woody cover has a consistent positive effect on current woody cover, and that the spatial term in the model is significant even after controlling for historical cover. Specific topographic variables emerge as important for different sites at different scales, but no overall pattern emerges across sites or scales for any of the topographic variables I tested. This GAM framework for modeling historical data is flexible and could be used with more variables, more flexible relationships with predictor variables, and larger scales. Modeling drivers of woody cover change from historical ecology data sources can be a valuable way to plan restoration and enhance ecological insight into landscape change.
I conclude that these techniques are promising but a framework is needed for sensitivity analysis, as modeling results can depend strongly on variable selection and model structure. (Abstract shortened by UMI.)
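A much-simplified sketch of the Chapter 3 modelling idea is shown below: a quasi-binomial regression of current cover on historical cover and one topographic covariate, with the dispersion estimated from the Pearson chi-square. The GAM smooths and spatial terms are omitted, and the data and covariate names are invented.

```python
# Simplified sketch of a quasi-binomial model of fractional woody cover:
# logit link, dispersion estimated from the Pearson chi-square. The GAM
# smooths and spatial terms of the chapter are omitted; data are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 400
hist_cover = rng.uniform(0, 1, n)            # historical woody cover fraction
slope = rng.uniform(0, 40, n)                # illustrative topographic covariate (degrees)
eta = -0.5 + 2.5 * hist_cover + 0.01 * slope
p = 1 / (1 + np.exp(-eta))
cover_now = np.clip(p + rng.normal(0, 0.15, n), 0.001, 0.999)  # current cover fraction

X = sm.add_constant(np.column_stack([hist_cover, slope]))
model = sm.GLM(cover_now, X, family=sm.families.Binomial())
fit = model.fit(scale="X2")                  # quasi-binomial: scale from Pearson chi2
print(fit.summary())
```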
APA, Harvard, Vancouver, ISO, and other styles
15

Marx, Edgard Luiz. "Babel: An Extensible Framework for Easy RDF Publication from Multiple Data Sources Using Templates." Pontifícia Universidade Católica do Rio de Janeiro, 2012. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=20949@1.

Full text
Abstract:
The vast majority of data on the Web today is not Semantic Web ready. To facilitate and promote the conversion of data, stored in relational databases and spreadsheets in particular, we introduce the Babel approach. Differently from existing approaches, notably RDB2RDF, Babel outputs data in a wider range of formats that include OWL, RDFa, RSS and (X)HTML, in addition to RDF. The main contribution of Babel, however, is its ease of use. Babel smoothes the learning curve by altogether eliminating the need to become acquainted with complex mapping techniques, which are substituted by the use of templates.
APA, Harvard, Vancouver, ISO, and other styles
16

Ferreirone, Mariano. "Extraction and integration of constraints in multiple data sources using ontologies and knowledge graphs." Electronic Thesis or Diss., Université de Lorraine, 2025. http://www.theses.fr/2025LORR0013.

Full text
Abstract:
This thesis explores the introduction of Shapes Constraint Language (SHACL) graphs in semantic environments that present heterogeneous sources and different context requirements. This research presents a proposal for the enrichment of Semantic Web based systems, providing benefits for several domains such as Industry 4.0. The thesis starts with a wide review of current work related to the validation of semantic constraints on knowledge representation models. Based on a systematic literature review, a taxonomy describing the related types of work is proposed. The open challenges related to the creation of shape graphs and their inclusion in existing semantic environments are highlighted. The needs for a shape graph representation that is able to serve different contexts and for the integration of shape graphs stand out. Based on the Shapes Constraint Language standards, a semantic restriction model is presented that represents groups of shapes which share a target and may hold inter-shape conflicts. A pre-validation configuration process to activate the model's sub-graph that best fits the current context is described. Moreover, an approach for the integration of these graphs when they belong to a common environment is proposed. The integration procedure resolves constraint conflicts through the specialization of shapes. A practical use case based on the French Ski School demonstrates the usability of the proposed contributions. Evaluations of the correctness and consistency of the generated shape graph are carried out. The implemented procedures' performance is also evaluated. The thesis concludes by summarizing the contributions and suggesting future research directions to further improve the integration and representation of SHACL graphs.
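A minimal SHACL validation round-trip with rdflib and pySHACL is sketched below; the "maximum group size" shape is a made-up example loosely inspired by the ski-school use case, not one of the thesis' shape graphs.

```python
# Minimal SHACL validation round-trip with rdflib + pySHACL. The shape and
# data below are invented for illustration only.
from rdflib import Graph
from pyshacl import validate

shapes_ttl = """
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

ex:LessonShape a sh:NodeShape ;
    sh:targetClass ex:Lesson ;
    sh:property [
        sh:path ex:groupSize ;
        sh:datatype xsd:integer ;
        sh:maxInclusive 12 ;
    ] .
"""

data_ttl = """
@prefix ex:  <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:lesson1 a ex:Lesson ; ex:groupSize "15"^^xsd:integer .
"""

shapes = Graph().parse(data=shapes_ttl, format="turtle")
data = Graph().parse(data=data_ttl, format="turtle")
conforms, _, report = validate(data, shacl_graph=shapes)
print(conforms)      # False: group size 15 violates maxInclusive 12
print(report)
```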
APA, Harvard, Vancouver, ISO, and other styles
17

Rezazadeh, Arezou. "Error exponent analysis for the multiple access channel with correlated sources." Doctoral thesis, Universitat Pompeu Fabra, 2019. http://hdl.handle.net/10803/667611.

Full text
Abstract:
Due to the delay constraints of modern communication systems, studying reliable communication with finite-length codewords is much needed. Error exponents are one approach to study the finite-length regime from the information-theoretic point of view. In this thesis, we study achievable exponents for single-user communication and for the multiple-access channel with both independent and correlated sources. By studying different coding schemes, including independent and identically distributed, independent and conditionally distributed, message-dependent, generalized constant-composition and conditional constant-composition ensembles, we derive and analyze a number of achievable exponents for both single-user and multi-user communication.
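For context, the classical single-user random-coding exponent for an i.i.d. ensemble, which exponents of this kind generalize to the multiple-access channel with correlated sources, can be written in Gallager's form:

```latex
% Classical random-coding error exponent for a single-user DMC W(y|x)
% with i.i.d. input distribution Q at rate R (Gallager's form); the thesis
% studies generalizations of such exponents to the MAC with correlated sources.
E_r(R) = \max_{0 \le \rho \le 1} \max_{Q} \Bigl[ E_0(\rho, Q) - \rho R \Bigr],
\qquad
E_0(\rho, Q) = -\log \sum_{y} \Bigl( \sum_{x} Q(x)\, W(y \mid x)^{\frac{1}{1+\rho}} \Bigr)^{1+\rho}.
```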
APA, Harvard, Vancouver, ISO, and other styles
18

Comino, Trinidad Marc. "Algorithms for the reconstruction, analysis, repairing and enhancement of 3D urban models from multiple data sources." Doctoral thesis, Universitat Politècnica de Catalunya, 2020. http://hdl.handle.net/10803/670373.

Full text
Abstract:
Over the last few years, there has been a notorious growth in the field of digitization of 3D buildings and urban environments. The substantial improvement of both scanning hardware and reconstruction algorithms has led to the development of representations of buildings and cities that can be remotely transmitted and inspected in real time. Among the applications that implement these technologies are several GPS navigators and virtual globes such as Google Earth or the tools provided by the Institut Cartogràfic i Geològic de Catalunya. In particular, in this thesis, we conceptualize cities as a collection of individual buildings. Hence, we focus on the individual processing of one structure at a time, rather than on the larger-scale processing of urban environments. Nowadays, there is a wide diversity of digitization technologies, and the choice of the appropriate one is key for each particular application. Roughly, these techniques can be grouped around three main families:
- Time-of-flight (terrestrial and aerial LiDAR).
- Photogrammetry (street-level, satellite, and aerial imagery).
- Human-edited vector data (cadastre and other map sources).
Each of these has its advantages in terms of covered area, data quality, economic cost, and processing effort. Plane- and car-mounted LiDAR devices are optimal for sweeping huge areas, but acquiring and calibrating such devices is not a trivial task. Moreover, the capturing process is done by scan lines, which need to be registered using GPS and inertial data. As an alternative, terrestrial LiDAR devices are more accessible but cover smaller areas, and their sampling strategy usually produces massive point clouds with over-represented planar regions. A more inexpensive option is street-level imagery. A dense set of images captured with a commodity camera can be fed to state-of-the-art multi-view stereo algorithms to produce realistic-enough reconstructions. One other advantage of this approach is capturing high-quality color data, whereas the geometric information is usually lacking. In this thesis, we analyze in depth some of the shortcomings of these data-acquisition methods and propose new ways to overcome them. Mainly, we focus on the technologies that allow high-quality digitization of individual buildings. These are terrestrial LiDAR for geometric information and street-level imagery for color information. Our main goal is the processing and completion of detailed 3D urban representations. For this, we will work with multiple data sources and combine them when possible to produce models that can be inspected in real time. Our research has focused on the following contributions:
- Effective and feature-preserving simplification of massive point clouds.
- Developing normal estimation algorithms explicitly designed for LiDAR data.
- Low-stretch panoramic representation for point clouds.
- Semantic analysis of street-level imagery for improved multi-view stereo reconstruction.
- Color improvement through heuristic techniques and the registration of LiDAR and imagery data.
- Efficient and faithful visualization of massive point clouds using image-based techniques.
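As a small illustration of the per-point processing involved in such pipelines (a generic baseline, not the LiDAR-specific estimator developed in the thesis), a k-nearest-neighbour PCA normal estimate can be written as follows.

```python
# Plain k-nearest-neighbour PCA normal estimation for a point cloud --
# a generic baseline rather than the thesis' LiDAR-specific estimator.
# Points are random placeholders.
import numpy as np
from scipy.spatial import cKDTree

def estimate_normals(points, k=16):
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)            # k neighbours per point (incl. itself)
    normals = np.empty_like(points)
    for i, nb in enumerate(idx):
        nbrs = points[nb] - points[nb].mean(axis=0)
        # Normal = direction of least variance of the local neighbourhood.
        _, _, vt = np.linalg.svd(nbrs, full_matrices=False)
        normals[i] = vt[-1]
    return normals

pts = np.random.default_rng(0).random((1000, 3))
print(estimate_normals(pts)[:3])
```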
APA, Harvard, Vancouver, ISO, and other styles
19

Smith, Tiziana. "Estimating hydrologic fluxes, crop water use, and agricultural land use in China from multiple data sources." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/104166.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2016. Thesis: S.M. in Technology and Policy, Massachusetts Institute of Technology, Institute for Data, Systems, and Society, Technology and Policy Program, 2016. Includes bibliographical references (pages 95-99).
Crop production has significantly altered the terrestrial environment by changing land use (Ramankutty et al., 2008) and by altering the water cycle through both co-opting rainfall and surface water withdrawals (Postel et al., 1996). As the world's population continues to grow and individual diets become more resource-intensive, the demand for food - and the land and water necessary to produce it - will continue to increase. Quantitative data about water availability, water use, and agricultural land use are needed to develop sustainable water and agricultural planning and policies. However, existing large-scale data are susceptible to errors and can be physically inconsistent. China is an example of a large area where food demand is expected to increase and a lack of data clouds the resource management dialogue. Some assert that China will have insufficient land and water resources to feed itself, posing a threat to global food security if it seeks to increase food imports (Brown and Starke, 1995). Others believe resources are plentiful (Lomborg, 2001). Without quantitative data, it is difficult to discern whether these concerns are realistic or overly dramatized. This thesis presents a quantitative approach to characterize hydrologic fluxes, crop water use, and agricultural land use, and applies the methodology in China using data from around the year 2000. The approach uses the principles of water balance and of crop water requirements to assimilate existing data with a least-squares estimation technique, producing new estimates of water and land use variables that are physically consistent while minimizing differences from measured data. We argue that this technique for estimating water fluxes and agricultural land use can provide a useful basis for resource management and policy, both in China and around the world.
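A toy version of the reconciliation idea is sketched below: observed fluxes are adjusted as little as possible, in a weighted least-squares sense, so that the water balance closes exactly. The fluxes, uncertainties and single-constraint setup are invented for illustration and are far simpler than the estimation problem in the thesis.

```python
# Toy data-reconciliation step: adjust observed water fluxes as little as
# possible (weighted least squares) so that P - ET - R - dS = 0 holds exactly.
# Observations and uncertainties below are invented.
import numpy as np

# x = [P, ET, R, dS] in mm/yr, with observation standard deviations.
x_obs = np.array([650.0, 420.0, 260.0, -10.0])
sigma = np.array([30.0, 60.0, 25.0, 20.0])
A = np.array([[1.0, -1.0, -1.0, -1.0]])      # water-balance constraint A @ x = 0
b = np.array([0.0])

# Closed-form minimum of (x - x_obs)' Sigma^-1 (x - x_obs)  subject to  A x = b.
Sigma = np.diag(sigma**2)
lam = np.linalg.solve(A @ Sigma @ A.T, A @ x_obs - b)
x_hat = x_obs - Sigma @ A.T @ lam

print("imbalance before:", (A @ x_obs - b).item())   # -20 mm/yr
print("imbalance after: ", (A @ x_hat - b).item())   # ~0
print("adjusted fluxes: ", np.round(x_hat, 1))
```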
APA, Harvard, Vancouver, ISO, and other styles
20

Latifi, Hooman [Verfasser], and Barbara [Akademischer Betreuer] Koch. "The use of nonparametric methods for small-scale forest inventory by means of multiple remote sensing data sources." Freiburg : Universität, 2011. http://d-nb.info/1122592647/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Chen, Yixuan. "Dissecting Genetic Basis of Complex Traits by Haplotype-based Association Studies and Integrated Information from Multiple Data Sources." Case Western Reserve University School of Graduate Studies / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=case1294872871.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Lee, Sang Gu. "Integrating Data from Multiple Sources to Estimate Transit-Land Use Interactions and Time-Varying Transit Origin-Destination Demand." Diss., The University of Arizona, 2012. http://hdl.handle.net/10150/265832.

Full text
Abstract:
This research contributes to a very active body of literature on the application of Automated Data Collection Systems (ADCS) and openly shared data to public transportation planning. It also addresses the interaction between transit demand and land use patterns, a key component of generating time-varying origin-destination (O-D) matrices at a route level. An origin-destination (O-D) matrix describes the travel demand between two different locations and is indispensable information for most transportation applications, from strategic planning to traffic control and management. A transit passenger's O-D pair at the route level simply indicates the origin and destination stop along the considered route. Observing existing land use types (e.g., residential, commercial, institutional) within the catchment area of each stop can help in identifying existing transit demand at any given time or over time. The proposed research addresses incorporation of an alighting probability matrix (APM) - tabulating the probabilities that a passenger alights at stops downstream of the boarding at a specified stop - into a time-varying O-D estimation process, based on the passenger's trip purpose or activity locations represented by the interactions between transit demand and land use patterns. In order to examine these interactions, this research also uses a much larger dataset that has been automatically collected from various electronic technologies: Automated Fare Collection (AFC) systems and Automated Passenger Counter (APC) systems, in conjunction with other readily available data such as Google's General Transit Feed Specification (GTFS) and parcel-level land use data. The large and highly detailed datasets have the capability of rectifying limitations of manual data collection (e.g., on-board survey) as well as enhancing any existing decision-making tools. This research proposes use of Google's GTFS for a bus stop aggregation model (SAM) based on distance between individual stops, textual similarity, and common service areas. By measuring land use types within a specified service area based on SAM, this research helps in advancing our understanding of transit demand in the vicinity of bus stops. In addition, a systematic matching technique for aggregating stops (SAM) allows us to analyze the symmetry of boarding and alightings, which can observe a considerable passenger flow between specific time periods and symmetry by time period pairs (e.g., between AM and PM peaks) on an individual day. This research explores the potential generation of a time-varying O-D matrix from APC data, in conjunction with integrated land use and transportation models. This research aims at incorporating all valuable information - the time-varying alighting probability matrix (TAPM) that represents on-board passengers' trip purpose - into the O-D estimation process. A practical application is based on APC data on a specific transit route in the Minneapolis - St. Paul metropolitan area. This research can also provide other practical implications. It can help transit agencies and policy makers to develop decision-making tools to support transit planning, using improved databases with transit-related ADCS and parcel-level land use data. 
As a result, this work not only has direct implications for the design and operation of future urban public transport systems (e.g., more precise bus scheduling, improve service to public transport users), but also for urban planning (e.g., for transit oriented urban development) and travel forecasting.
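The core bookkeeping behind an APM-based O-D estimate can be sketched in a few lines; the four-stop boardings and alighting probability matrix below are made-up numbers, not data from the study route.

```python
# Sketch of an alighting-probability-based O-D estimate for one time period
# on a 4-stop route. Boardings and the APM are invented for illustration.
import numpy as np

boardings = np.array([40.0, 25.0, 15.0, 0.0])     # APC boardings per stop
# APM[i, j]: probability that a passenger boarding at stop i alights at stop j
# (upper-triangular, since alighting can only happen downstream; rows sum to 1).
APM = np.array([
    [0.0, 0.3, 0.3, 0.4],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
])

OD = np.diag(boardings) @ APM                     # expected stop-to-stop flows
print(np.round(OD, 1))
print("implied alightings per stop:", np.round(OD.sum(axis=0), 1))
# In practice these implied alightings would be balanced against the APC
# alighting counts (e.g., by iterative proportional fitting) per time period.
```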
APA, Harvard, Vancouver, ISO, and other styles
23

Taylor, Shauna Rae. "Pregnancy-associated intimate partner violence: an examination of multiple dimensions of intimate partner abuse victimization using three unique data sources." Orlando, Fla.: University of Central Florida, 2009. http://purl.fcla.edu/fcla/etd/CFE0002560.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Gonzalez, Sergio E. (Sergio Ezequiel). "On creating cleantech confluences : best practices and partnerships to mobilize multiple sources of private capital into early-stage clean technologies." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/104811.

Full text
Abstract:
Thesis: S.M. in Technology and Policy, Massachusetts Institute of Technology, School of Engineering, Institute for Data, Systems, and Society, Technology and Policy Program, 2016. Includes bibliographical references (pages 81-84).
During the 2015 Paris Climate Change Conference, world climate scientists and policymakers agreed that global temperatures must not exceed a two degree Celsius increase above pre-industrial levels within the next 30 years. It is estimated that this will require investments of $40 trillion, or $1.3 trillion per year, in new and mature clean technologies. Currently, only about $0.3 trillion of investment goes to clean technology a year, and the majority of that funding goes to mature, proven technologies. There is an investment gap in clean technologies, and the gap is especially pronounced for new and unproven technologies that are necessary to bring down costs of the entire system and produce quicker breakthroughs in CO₂ mitigation. The gap is partly due to the large losses sustained by venture capitalists - one of the greatest sources of early-stage capital - who invested heavily in clean technology companies in the years leading up to the 2008 recession. After the market crashed, federal and state governments ended up being among the few remaining supporters of these technology companies because of their public benefits. However, in order to stay below 2 degrees Celsius of warming, venture capitalists and other private venture investors must be engaged to invest in the clean technology sector again. Public sector funds are not sufficient. In a sector that has produced few winners while receiving substantial government support, the challenge could not be greater. To address this challenge, we ask three questions of three key actors: How can entrepreneurs attract private investment and scale up past the Valley of Death? How can venture capitalists build the ability and confidence to invest in the cleantech sector again? How can policymakers address the failure modes that may still exist if investors and entrepreneurs follow best practices? To explore this issue, we conducted interviews, reviewed literature, compiled data from online sources, and compiled information from conferences and workshops. Our findings reveal a "Cleantech Confluence", or a preliminary set of best practices and partnerships. When simultaneously implemented, the Confluence can mobilize multiple sources of private capital into early-stage clean technologies.
APA, Harvard, Vancouver, ISO, and other styles
25

Khombe, Moses. "A Desk Study of the Education Policy Implications of Using Data from Multiple Sources: Example of Primary School Teacher Supply and Demand in Malawi." BYU ScholarsArchive, 2014. https://scholarsarchive.byu.edu/etd/4366.

Full text
Abstract:
Malawi, as a country with very limited resources, needs to have educational policies in place to maximize the effectiveness of the public education system. Policymakers depend on accurate data, but variations in data between sources leave policymakers uncertain as they attempt to craft policies to address the growing educational crisis in Malawi. A desk study was performed to evaluate the policy implications of employing data from multiple sources, using primary school teacher supply and demand in Malawi as an illustration. This study examined one national organization, Malawi's Ministry of Education, Science, and Technology (MoEST); three international aid and assistance organizations (IAAOs), namely the UK Department for International Development (DFID), the Japan International Cooperation Agency (JICA), and the United States Agency for International Development (USAID); and one global organization, the United Nations Educational, Scientific and Cultural Organization (UNESCO). The study documented differences and similarities between the data sources. Among the factors considered were the nature of each institution and the effect it could have on data collection, aggregation, analysis and reporting; the definitions used by each organization and their implications for data use; and each organization's methods of collection, aggregation, analysis and reporting. The study found significant variations in the teacher supply and demand data presented by the five organizations, with variations of up to 333% between sources. To address this problem, it is recommended that the Government of Malawi (GoM) establish a central agency to standardize education data. Three policy scenarios are detailed, presenting the probable outcome of various actions the GoM could take regarding this recommendation.
APA, Harvard, Vancouver, ISO, and other styles
26

Zuccato, Diego. "Progettazione e realizzazione di un portale multi-Ente [Design and implementation of a multi-organization portal]." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/17920/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Fischer, Elisabeth [author]. "Teaching Quality in Higher Education : A Field Study Investigating Effects between Input, Process, and Output Variables Using Multiple Data Sources / Elisabeth Fischer." Kassel : Universitätsbibliothek Kassel, 2019. http://d-nb.info/1201508843/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Taylor, La'Shan Denise. "Assessing Health Status, Disease Burden, and Quality of Life in Appalachia Tennessee: The Importance of Using Multiple Data Sources in Health Research." Digital Commons @ East Tennessee State University, 2009. https://dc.etsu.edu/etd/1889.

Full text
Abstract:
As the US population ages, public health agencies must examine better ways to measure the impact of adverse health outcomes on a population. Many reports have asserted that more adverse health events occur in Appalachia. However, few studies have assessed the quality of life and burden of disease on those residing in Appalachia. Therefore, the overall aim of this dissertation was to assess the health status, burden of disease, and quality of life in Appalachia using available data and improved health outcome assessment measures. For this dissertation, 3 secondary data sources collected by the State of Tennessee and the National Center for Health Statistics (NCHS) were used. These data were used to calculate the index of disparity and absolute and relative disparity measures within the study area of 8 Appalachian counties in upper east Tennessee. Vital statistics data for the selected area were also used to calculate Disability Adjusted Life Years (DALYs) by gender for all-cause mortality and stroke mortality. The Behavioral Risk Factor Surveillance System (BRFSS) data were used for prevalence data and to determine what factors impact Health Related Quality of Life (HRQOL) within the study area. The index of disparity (ID) showed that disparity was greatest for stroke mortality in the study area and Tennessee, and lowest for all-cause mortality and for the US. The highest number of DALYs was found in the 45-59 age group of the Appalachian study population. Finally, mean general health status did not vary significantly by gender; however, predictors of reporting excellent to good health status did vary by gender. Predictors of fair to poor general health status were low income, having diabetes, or having had a stroke or heart attack. The results of this dissertation are intended to assist health professionals with the creation of health interventions and policy development within the Appalachian area. This dissertation proposes a more comprehensive health status monitoring system for assessing health disparity at a regional level.
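As a point of reference for the DALY calculations mentioned in the abstract, the following minimal sketch applies the standard DALY = YLL + YLD decomposition; all counts, disability weights, and life-expectancy figures are hypothetical and are not taken from the dissertation.

```python
# Hedged sketch: standard DALY decomposition (DALY = YLL + YLD).
# All numbers below are hypothetical; they do not come from the dissertation.

def years_of_life_lost(deaths, life_expectancy_at_death):
    """YLL = number of deaths x standard life expectancy at age of death."""
    return deaths * life_expectancy_at_death

def years_lived_with_disability(incident_cases, disability_weight, avg_duration_years):
    """YLD = incident cases x disability weight x average duration of the condition."""
    return incident_cases * disability_weight * avg_duration_years

def daly(deaths, life_expectancy_at_death, incident_cases, disability_weight, avg_duration_years):
    yll = years_of_life_lost(deaths, life_expectancy_at_death)
    yld = years_lived_with_disability(incident_cases, disability_weight, avg_duration_years)
    return yll + yld

# Example: stroke burden in one hypothetical age-sex stratum.
print(daly(deaths=120, life_expectancy_at_death=14.5,
           incident_cases=300, disability_weight=0.32, avg_duration_years=6.0))
```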
APA, Harvard, Vancouver, ISO, and other styles
29

Lai, Eva K. M. "Integrating multiple sources of data to construct a time series of recreational catch and effort for the West Coast bioregion of Western Australia." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2022. https://ro.ecu.edu.au/theses/2524.

Full text
Abstract:
There is growing attention on reconstructing recreational catch to quantify the impacts of recreational fishing. In the absence of quantitative catch and effort data, time series of reconstructed catch and effort are determined from sporadic data sources. The aim of this study was to reconstruct a time series of recreational catches for key species from a boat-based recreational fishery in the West Coast Bioregion of Western Australia from 1993/94 to 2017/18, based on data collected by various survey approaches. Prior to the reconstruction, there was a need to develop a thorough understanding of the survey methods and the data collected to ensure scientific credibility and stakeholder acceptance of the reconstruction results. Thus, preparatory aspects of the reconstruction included a statistical comparison of catch and effort estimates from four Access Point surveys between 1996/97 and 2009/10, which highlighted the need to monitor recreational fisheries at appropriate intervals, particularly when the impacts of management changes on fish populations need to be considered. A further aspect was an assessment of biases in survey design and estimation methods, corroborating independent survey methods involving an off-site Phone-Diary survey and on-site Bus-route Access Point and Remote Camera surveys conducted concurrently over a 12-month period in 2011/12. It was found that the relatively low cost and the availability of a licensing database as a sampling frame favoured the Phone-Diary survey as a long-term monitoring tool for this boat-based recreational fishery. These two studies provided the information needed to develop the reconstruction method and to understand the limitations of the data and the difficulties of the reconstruction. To reconstruct a time series of recreational catches, a method was developed to integrate data gathered from aperiodic Access Point surveys between 1996/97 and 2009/10 and biennial Phone-Diary surveys from 2011/12 to 2017/18. The catch reconstruction models included parameters for participation (number of fishers), based on the Estimated Residential Population and the number of Recreational Boat Fishing Licences; effort (number of days fished per fisher per year); and catch (number of fish per fisher per day), based on the probability of a catch and the non-zero catch rate. For many of the species considered, the reconstructed annual catch was variable and aligned with estimates from the Access Point and Phone-Diary surveys. Catch information from periodic Phone-Diary surveys provided the necessary input for the catch reconstructions, which in turn provided an approach to fill in the gaps between surveys.
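To make the multiplicative structure of the reconstruction model concrete, here is a minimal sketch assuming annual catch is participation times effort times the expected catch per fisher-day; all parameter values are placeholders, not estimates from the thesis.

```python
# Hedged sketch of the multiplicative structure of the catch-reconstruction model described
# above: annual catch = participation x effort x (probability of a catch x non-zero catch
# rate). All numbers are placeholders, not estimates from the thesis.

def reconstruct_annual_catch(n_fishers, days_per_fisher, p_nonzero_catch, mean_nonzero_catch_per_day):
    """Expected annual retained catch (number of fish) for one species in one year."""
    expected_catch_per_fisher_day = p_nonzero_catch * mean_nonzero_catch_per_day
    return n_fishers * days_per_fisher * expected_catch_per_fisher_day

catch_estimate = reconstruct_annual_catch(
    n_fishers=112_000,               # e.g. licensed boat fishers in that year (hypothetical)
    days_per_fisher=4.2,             # days fished per fisher per year (hypothetical)
    p_nonzero_catch=0.55,            # probability a fisher-day lands the species (hypothetical)
    mean_nonzero_catch_per_day=2.8,  # fish per successful fisher-day (hypothetical)
)
print(round(catch_estimate))
```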
APA, Harvard, Vancouver, ISO, and other styles
30

Pfeifroth, Uwe Anton [author], Bodo [academic supervisor] [reviewer] Ahrens, and Andreas [reviewer] Fink. "The diurnal cycle of clouds and precipitation : an evaluation of multiple data sources / Uwe Anton Pfeifroth. Supervisor: Bodo Ahrens. Reviewers: Bodo Ahrens; Andreas Fink." Frankfurt am Main : Universitätsbibliothek Johann Christian Senckenberg, 2016. http://d-nb.info/1112601627/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Pfeifroth, Uwe [author], Bodo [academic supervisor] [reviewer] Ahrens, and Andreas [reviewer] Fink. "The diurnal cycle of clouds and precipitation : an evaluation of multiple data sources / Uwe Anton Pfeifroth. Supervisor: Bodo Ahrens. Reviewers: Bodo Ahrens; Andreas Fink." Frankfurt am Main : Universitätsbibliothek Johann Christian Senckenberg, 2016. http://nbn-resolving.de/urn:nbn:de:hebis:30:3-414318.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Melamid, Elan. "What works? Integrating multiple data sources and policy research methods in assessing need and evaluating outcomes in community-based child and family service systems." Santa Monica, Calif. : RAND, 2002. http://www.rand.org/publications/RGSD/RGSD161/RGSD161.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Bacher, Raphael. "Méthodes pour l'analyse des champs profonds extragalactiques MUSE : démélange et fusion de données hyperspectrales ; détection de sources étendues par inférence à grande échelle [Methods for the analysis of MUSE extragalactic deep fields: unmixing and fusion of hyperspectral data; detection of extended sources by large-scale inference]." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAT067/document.

Full text
Abstract:
This work takes place in the context of the study of the hyperspectral deep fields produced by the European 3D spectrograph MUSE. These fields allow exploration of the young, remote Universe and the study of the physical and chemical properties of the first galactic and extra-galactic structures. The first part of the thesis deals with the estimation of a spectral signature for each galaxy. As MUSE is a ground-based instrument, atmospheric turbulence strongly degrades its spatial resolution, generating spectral mixing of multiple sources. To overcome this, data fusion approaches based on a linear mixing model and complementary data from the Hubble Space Telescope are proposed, allowing the spectral separation of the sources. The second goal of this thesis is to detect the Circum-Galactic Medium (CGM). The CGM, which is formed of clouds of gas surrounding some galaxies, is characterized by a spatially extended, faint spectral signature. To detect this kind of signal, a hypothesis testing approach is proposed, based on a max-test strategy over a dictionary, with the test statistic learned from the data. This method is then extended to better take into account the spatial structure of the targets, thus improving the detection power while still ensuring global error control. All these developments are integrated into the software library of the MUSE consortium so that they can be used by the astrophysical community. Moreover, these works can easily be extended beyond MUSE data to other application fields that need faint extended source detection and source separation methods.
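A rough illustration of the linear-mixing idea behind the fusion approach: if source spatial profiles are assumed known (for example derived from HST imagery matched to the MUSE PSF), each source spectrum can be recovered by least squares. The sketch below is synthetic and illustrative only; it is not the consortium's pipeline.

```python
# Hedged sketch of spectral separation under a linear mixing model: MUSE-like voxels are
# modelled as spatial profiles (assumed known) times unknown source spectra, plus noise.
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_src, n_wave = 400, 3, 3600          # pixels in a subfield, blended sources, channels

A = np.abs(rng.normal(size=(n_pix, n_src)))  # assumed spatial intensity profiles (e.g. from HST)
true_spectra = np.abs(rng.normal(size=(n_src, n_wave)))
cube = A @ true_spectra + 0.05 * rng.normal(size=(n_pix, n_wave))  # noisy data cube

# Least-squares estimate of each source spectrum, solved jointly over all wavelengths.
est_spectra, *_ = np.linalg.lstsq(A, cube, rcond=None)
print(est_spectra.shape)                     # (n_src, n_wave): one spectrum per blended source
```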
APA, Harvard, Vancouver, ISO, and other styles
34

Punase, Shubha. "Utilizing Multiple Data Sources In The Preparation Of A Vision Zero Plan For The City Of Alexandria: Investigating The Relationship Between Transportation Infrastructure, Socio-Economic Characteristics, And Crash Outcomes In The City." Thesis, Virginia Tech, 2016. http://hdl.handle.net/10919/78329.

Full text
Abstract:
“Vision Zero,” first adopted by Sweden in 1997, is a road safety policy that aims to achieve a transportation system with zero fatalities or serious injuries across all modes of transportation. It takes a proactive approach to road safety by identifying risk and taking steps to prevent injuries. Historically, traffic-related crashes have disproportionately impacted vulnerable communities and system users, including people of color, low-income individuals, seniors, children, and pedestrians, bicyclists, and transit users (who typically walk to and from public transport). These inequities are addressed in the Vision Zero framework by prioritizing interventions in the areas that need safety improvements the most. In 2016, the Alexandria City Council voted unanimously to develop a “Vision Zero” policy and program as part of its updated transportation master plan. This required an initial equity analysis to assess the impact of traffic crashes on traditionally underserved communities and groups (groups from at least one of these categories: low income; minority; elderly; children; limited English proficiency; persons with disabilities; and/or pedestrians, bicyclists, and transit users).
This study combines three different methods to investigate the equity issues regarding traffic safety: 1) a descriptive analysis of the spatial pattern of crashes and their relationship with the demographic profiles of neighborhoods at the census block group level (for the 2010-2014 period); 2) a descriptive analysis of crash trends in Alexandria; and 3) exploratory regression analyses for two different units of analysis (an aggregate regression analysis of crashes at the census block group level, and a disaggregate regression analysis of individual-level crash reports). The analysis found that the elderly, school-aged children, rail/subway users, and pedestrians had a higher risk of fatalities and severe injuries in traffic crashes. Higher job densities, alcohol impairment, and speeding were significantly related to higher KSI, whereas smaller block sizes (a higher number of street segments per square mile of census block group area), higher housing density, and use of safety equipment were related to lower KSI. Master of Urban and Regional Planning.
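For the aggregate (block-group level) exploratory regression described above, a count model such as a Poisson GLM is a common choice. The sketch below is a hypothetical illustration; the input file, column names, and exposure offset are assumptions, not the thesis specification.

```python
# Hedged sketch: crash counts per census block group regressed on built-environment and
# socio-economic covariates with a Poisson GLM. All names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm

bg = pd.read_csv("block_groups.csv")    # hypothetical input: one row per census block group
X = sm.add_constant(bg[["job_density", "housing_density", "segments_per_sqmi", "pct_low_income"]])
y = bg["ksi_crashes"]                   # killed or seriously injured (KSI) crashes, 2010-2014

poisson_fit = sm.GLM(y, X, family=sm.families.Poisson(),
                     offset=np.log(bg["area_sqmi"])).fit()   # area as an exposure offset
print(poisson_fit.summary())
```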
APA, Harvard, Vancouver, ISO, and other styles
35

Reed, Jesse. "Approaches to Multiple-source Localization and Signal Classification." Thesis, Virginia Tech, 2009. http://hdl.handle.net/10919/33081.

Full text
Abstract:
Source localization with a wireless sensor network remains an important area of research as the number of applications with this problem increases. This work considers the problem of source localization by a network of passive wireless sensors. The primary means by which localization is achieved is direction-finding at each sensor and, in some cases, range estimation as well. Both single- and multiple-target scenarios are considered in this research. In single-source environments, a solution is presented that outperforms the classic least-squared-error estimation technique by combining direction and range estimates to perform localization. In multiple-source environments, two solutions to the complex data association problem are addressed. The first proposed technique offers a less complex solution to the data association problem than a brute-force approach, at the expense of some degradation in performance. For the second technique, signal classification is considered as another approach to the data association problem. Environments in which each signal possesses unique features can be exploited to separate signals at each sensor by their characteristics, which mitigates the complexity of the data association problem and in many cases improves the accuracy of the localization. Two approaches to signal-selective localization are considered in this work: the first is based on the well-known cyclic MUSIC algorithm, and the second combines beamforming and modulation classification. Finally, the implementation of a direction-finding system is discussed. This system includes a uniform circular array as a radio frequency front end and the Universal Software Radio Peripheral as a data processor. Master of Science.
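As a simple illustration of combining direction estimates from several passive sensors, the sketch below computes a least-squares triangulation fix from bearing lines; it shows the generic idea only and is not the estimator developed in the thesis.

```python
# Hedged sketch of bearings-only triangulation: each sensor's bearing defines a line, and
# the source position is the least-squares intersection of those lines.
import numpy as np

def triangulate(sensor_xy, bearings_rad):
    """Least-squares intersection of bearing lines from each sensor.

    sensor_xy    : (N, 2) sensor positions
    bearings_rad : (N,) bearings measured counter-clockwise from the x-axis
    """
    sensor_xy = np.asarray(sensor_xy, dtype=float)
    # Unit normal to each bearing line; the source should satisfy n_i . (x - p_i) = 0.
    normals = np.column_stack([-np.sin(bearings_rad), np.cos(bearings_rad)])
    b = np.sum(normals * sensor_xy, axis=1)
    est, *_ = np.linalg.lstsq(normals, b, rcond=None)
    return est

sensors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
true_src = np.array([40.0, 70.0])
bearings = [np.arctan2(true_src[1] - y, true_src[0] - x) for x, y in sensors]
print(triangulate(sensors, bearings))   # recovers approximately (40, 70) in the noise-free case
```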
APA, Harvard, Vancouver, ISO, and other styles
36

Perry, Sarah Louise. "Developing productive habitat models of megafauna distribution in the Irish Sea using multiple source sightings data." Thesis, Aberystwyth University, 2017. http://hdl.handle.net/2160/d7d3da18-3d46-4eb7-813f-83aa750c4963.

Full text
Abstract:
Until recently, monitoring of the marine environment and its megafauna population relied on field observations and catch or strandings information. However, new approaches are being developed which take into consideration the need for information across whole sea areas and over wide spatial and temporal scales for the purposes of marine spatial planning and ecological research. This thesis investigates the use of wide-scale predictive species distribution modelling, using multiple-source sightings information combined with environmental variable data, for monitoring marine megafauna. The development of new techniques for marine ecological studies is fundamental to the future of successful marine management, and the development of remote sensing techniques and the availability of remotely sensed data have opened up the opportunity to investigate the marine environment on a much wider scale. Remotely sensed data derived from satellites are now widely accessible and available, including measurements of sea surface temperature and productivity within our oceans. Presence-only species distribution modelling approaches were combined with historical species occurrence data alongside environmental predictor variables, including remotely sensed sea surface temperature (SST) and chlorophyll-a concentration, to explore the importance of predictor variables and to produce predictive maps of marine megafauna occurrence in the Irish Sea. This study found that multiple-source presence-only data for marine megafauna commonly found in the wider Irish Sea can be a valuable resource for use in species distribution models, producing ecologically meaningful results.
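One common way to build a presence-only habitat model is to contrast sighting records with random background points over environmental predictors. The sketch below assumes hypothetical input files and predictor names; it is a generic presence-background illustration, not the modelling workflow used in the thesis.

```python
# Hedged sketch of a presence-background habitat model: presence records vs. random
# background points over environmental predictors, scored on a prediction grid.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

presences = pd.read_csv("sightings_env.csv")        # hypothetical: sightings with SST, chl-a, depth
background = pd.read_csv("background_env.csv")      # hypothetical: random points, same predictors
predictors = ["sst", "chlorophyll_a", "depth"]

X = pd.concat([presences[predictors], background[predictors]])
y = np.r_[np.ones(len(presences)), np.zeros(len(background))]   # 1 = presence, 0 = background

sdm = LogisticRegression(max_iter=1000).fit(X, y)

grid = pd.read_csv("irish_sea_grid_env.csv")        # hypothetical grid of environmental values
grid["suitability"] = sdm.predict_proba(grid[predictors])[:, 1]
grid.to_csv("predicted_suitability.csv", index=False)
```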
APA, Harvard, Vancouver, ISO, and other styles
37

Bedewy, Ahmed M. "Optimizing Data Freshness in Information Update Systems." The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1618573325086709.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Ahmed, Sarah N. "Evaluation of nitrogen nonpoint-source loadings using high resolution land use data in a GIS: a multiple watershed study for the State of Maryland." College Park, Md.: University of Maryland, 2008. http://hdl.handle.net/1903/8615.

Full text
Abstract:
Thesis (M.S.) -- University of Maryland, College Park, 2008. Thesis research directed by: Dept. of Civil and Environmental Engineering. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
APA, Harvard, Vancouver, ISO, and other styles
39

El Moussawi, Ali Hassan. "SIMD-aware word length optimization for floating-point to fixed-point conversion targeting embedded processors." Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S150/document.

Full text
Abstract:
In order to cut down their cost and/or their power consumption, many embedded processors do not provide hardware support for floating-point arithmetic. However, applications in many domains, such as signal processing, are generally specified using floating-point arithmetic for the sake of simplicity. Porting these applications to such embedded processors requires a software emulation of floating-point arithmetic, which can greatly degrade performance. To avoid this, the application is converted to use fixed-point arithmetic instead. Floating-point to fixed-point conversion involves a subtle tradeoff between performance and precision; it enables the use of narrower data word lengths at the cost of degrading computation accuracy.
Besides, most embedded processors provide support for SIMD (Single Instruction Multiple Data) as a means to improve performance. In fact, this allows the execution of one operation on multiple data in parallel, thus ultimately reducing the execution time. However, the application should usually be transformed in order to take advantage of the SIMD instruction set. This transformation, known as Simdization, is affected by the data word lengths; narrower word lengths enable a higher SIMD parallelism rate. Hence the tradeoff between precision and Simdization. Much existing work has aimed at providing or improving methodologies for automatic floating-point to fixed-point conversion on the one side, and Simdization on the other. In the state of the art, both transformations are considered separately even though they are strongly related. In this context, we study the interactions between these transformations in order to better exploit the performance/accuracy tradeoff. First, we propose an improved SLP (Superword Level Parallelism) extraction algorithm (a Simdization technique). Then, we propose a new methodology to jointly perform floating-point to fixed-point conversion and SLP extraction. Finally, we implement this work as a fully automated source-to-source compiler flow. Experimental results, targeting four different embedded processors, show the validity of our approach in efficiently exploiting the performance/accuracy tradeoff compared to a typical approach that considers both transformations independently.
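A toy Q-format helper illustrates the basic float-to-fixed trade-off the abstract describes: narrower fractional word lengths lose accuracy but permit more SIMD lanes per machine word. This is a sketch of the general idea, not the compiler flow developed in the thesis.

```python
# Hedged sketch of float-to-fixed quantization: values are scaled by 2^frac_bits and stored
# as saturated signed integers. Word-length choices trade accuracy for SIMD parallelism.

def to_fixed(x, int_bits, frac_bits):
    """Quantize x to a signed fixed-point value with the given integer/fraction split."""
    scale = 1 << frac_bits
    lo = -(1 << (int_bits + frac_bits - 1))
    hi = (1 << (int_bits + frac_bits - 1)) - 1
    q = int(round(x * scale))
    return max(lo, min(hi, q))          # saturate on overflow

def to_float(q, frac_bits):
    return q / (1 << frac_bits)

# Narrower fractional parts lose accuracy but allow, e.g., four 8-bit lanes per 32-bit SIMD word.
for frac in (14, 6):
    q = to_fixed(0.708, int_bits=2, frac_bits=frac)
    print(frac, q, to_float(q, frac), abs(0.708 - to_float(q, frac)))
```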
APA, Harvard, Vancouver, ISO, and other styles
40

Shih, Tsung-Ta, and 石宗達. "Integrating Erasable itemsets from Multiple Data Sources." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/72445825366424330796.

Full text
Abstract:
Master's thesis, National University of Kaohsiung, Department of Computer Science and Information Engineering, academic year 104. Erasable-itemset mining is a new and interesting problem suited to factory production planning. It finds the itemsets (components) that can be eliminated when the products generated from them yield profit below a given threshold. Erasable itemsets can be used when a factory needs to renew products or downsize production while still keeping operations running and remaining profitable. A company may have several factories, each of which may derive its own erasable itemsets over a time period, and a manager of the company needs to know the overall erasable itemsets integrated from all the factories. In this thesis, we therefore consider erasable-itemset integration, merging the erasable itemsets from multiple sources. The approach uses the known erasable itemsets in each factory as reference information to reduce rescanning of the individual data sources. We start from two-factory erasable-itemset merging and propose an efficient integration approach. Itemsets are classified as erasable or non-erasable, so four cases can arise for an itemset across two factories. Depending on the case, the merged erasable itemsets can either be obtained directly or require rescanning only part of the data sources, which reduces mining time. The proposed two-factory integration approach can further be extended to process more than two sets of erasable itemsets. Four experiments were conducted, and their results show that the proposed algorithm executes faster than the batch approach in the multiple data-source environment for erasable itemsets.
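For readers unfamiliar with the underlying notion, the sketch below checks erasability for a single itemset under the usual definition (the profit of all products using any of its items must not exceed a threshold fraction of total profit); the product data are invented and the merging algorithm itself is not reproduced.

```python
# Hedged sketch of the basic erasable-itemset test that the merging algorithm builds on.
# Each product is a (set of component items, profit) pair; data are invented.

def gain(itemset, products):
    """Profit lost if every product containing any item of `itemset` is dropped."""
    return sum(profit for components, profit in products
               if itemset & set(components))

def is_erasable(itemset, products, threshold):
    total_profit = sum(profit for _, profit in products)
    return gain(set(itemset), products) <= threshold * total_profit

factory = [({"a", "b"}, 100), ({"b", "c"}, 400), ({"d"}, 50), ({"e"}, 30)]
print(is_erasable({"d", "e"}, factory, threshold=0.2))   # True:  gain 80  <= 0.2 * 580
print(is_erasable({"b"}, factory, threshold=0.2))        # False: gain 500 >  0.2 * 580
```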
APA, Harvard, Vancouver, ISO, and other styles
41

Gasca-Aragon, Hugo. "Data combination from multiple sources under measurement error." 2012. https://scholarworks.umass.edu/dissertations/AAI3556250.

Full text
Abstract:
Regulatory agencies are responsible for monitoring the performance of particular measurement communities. In order to achieve their objectives, they sponsor intercomparison exercises between the members of these communities. The Intercomparison Exercise Program for Organic Contaminants in the Marine Environment is an ongoing NIST/NOAA program. It was started in 1986 and there have been 19 studies to date. Using these data as motivation, we review the theory and practices applied to their analysis. It is a common practice to apply some kind of filter to the comparison study data. These filters range from outlier detection and exclusion to exclusion of the entire data set from a participant when its measurements are very "different". When the measurements are not so "different", the usual assumption is that the laboratories are unbiased; then the simple mean, the weighted mean, or the one-way random effects model is applied to obtain estimates of the true value. Instead, we explore methods to analyze these data under weaker assumptions and apply them to some of the available data. More specifically, we explore the estimation of models assessing the laboratories' performance and ways to use those fitted models in estimating a consensus value for new study material. This is done in various ways, starting with models that allow a separate bias for each lab with each compound at each point in time and then considering generalizations of that. This is done first by exploiting models where, for a particular compound, the bias may be shared over labs or over time, and then by modeling systematic biases (which depend on the concentration) by combining data from different labs. As seen in the analyses, the latter models may be more realistic. Due to uncertainty in the certified reference material, analyzing systematic biases leads to a linear regression problem with measurement error. This work has two differences from the standard work in this area. First, it allows heterogeneity in the material being delivered to the lab, whether it be control or study material. Secondly, we make use of Fieller's method for estimation, which has not been used in this context before, although others have suggested it. One challenge in using Fieller's method is that explicit expressions for the variance and covariance of the sample variance and covariance of independent but non-identically distributed random variables are needed. These are developed. Simulations are used to compare the performance of moment/Wald, Fieller, and bootstrap methods for obtaining confidence intervals for the slope in the measurement model. These suggest that Fieller's method performs better than the bootstrap technique. We also explore four estimators for the variance of the error in the equation in this context and determine that the estimator based on the modified squared residuals outperforms the others. Homogeneity is a desirable property in control and study samples. Special experiments with nested designs must be conducted for homogeneity analysis and assessment purposes. However, simulation shows that heterogeneity has low impact on the performance of the studied estimators. This work shows that a biased but consistent estimator for the heterogeneity variance can be obtained from the current experimental design.
APA, Harvard, Vancouver, ISO, and other styles
42

Gasca-Aragon, Hugo. "Data Combination from Multiple Sources Under Measurement Error." 2013. https://scholarworks.umass.edu/open_access_dissertations/687.

Full text
Abstract:
Regulatory agencies are responsible for monitoring the performance of particular measurement communities. In order to achieve their objectives, they sponsor intercomparison exercises between the members of these communities. The Intercomparison Exercise Program for Organic Contaminants in the Marine Environment is an ongoing NIST/NOAA program. It was started in 1986 and there have been 19 studies to date. Using these data as motivation, we review the theory and practices applied to their analysis. It is a common practice to apply some kind of filter to the comparison study data. These filters range from outlier detection and exclusion to exclusion of the entire data set from a participant when its measurements are very "different". When the measurements are not so "different", the usual assumption is that the laboratories are unbiased; then the simple mean, the weighted mean, or the one-way random effects model is applied to obtain estimates of the true value. Instead, we explore methods to analyze these data under weaker assumptions and apply them to some of the available data. More specifically, we explore the estimation of models assessing the laboratories' performance and ways to use those fitted models in estimating a consensus value for new study material. This is done in various ways, starting with models that allow a separate bias for each lab with each compound at each point in time and then considering generalizations of that. This is done first by exploiting models where, for a particular compound, the bias may be shared over labs or over time, and then by modeling systematic biases (which depend on the concentration) by combining data from different labs. As seen in the analyses, the latter models may be more realistic. Due to uncertainty in the certified reference material, analyzing systematic biases leads to a linear regression problem with measurement error. This work has two differences from the standard work in this area. First, it allows heterogeneity in the material being delivered to the lab, whether it be control or study material. Secondly, we make use of Fieller's method for estimation, which has not been used in this context before, although others have suggested it. One challenge in using Fieller's method is that explicit expressions for the variance and covariance of the sample variance and covariance of independent but non-identically distributed random variables are needed. These are developed. Simulations are used to compare the performance of moment/Wald, Fieller, and bootstrap methods for obtaining confidence intervals for the slope in the measurement model. These suggest that Fieller's method performs better than the bootstrap technique. We also explore four estimators for the variance of the error in the equation in this context and determine that the estimator based on the modified squared residuals outperforms the others. Homogeneity is a desirable property in control and study samples. Special experiments with nested designs must be conducted for homogeneity analysis and assessment purposes. However, simulation shows that heterogeneity has low impact on the performance of the studied estimators. This work shows that a biased but consistent estimator for the heterogeneity variance can be obtained from the current experimental design.
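The Fieller construction used for the slope can be illustrated generically: for a ratio of two correlated estimates, the confidence limits are the roots of a quadratic. The sketch below assumes this standard form and made-up summary statistics; it is not the author's code.

```python
# Hedged sketch of a Fieller-type confidence interval for a ratio a/b of two (possibly
# correlated) estimates, in the spirit of the slope interval discussed above.
import math

def fieller_interval(a, b, var_a, var_b, cov_ab, t_crit):
    """Roots of (b^2 - t^2 v_b) r^2 - 2 (ab - t^2 v_ab) r + (a^2 - t^2 v_a) = 0."""
    g = (t_crit ** 2) * var_b / b ** 2
    if g >= 1:
        raise ValueError("Denominator not significantly different from zero; "
                         "the Fieller set is unbounded.")
    A = b ** 2 - t_crit ** 2 * var_b
    B = a * b - t_crit ** 2 * cov_ab
    C = a ** 2 - t_crit ** 2 * var_a
    disc = math.sqrt(max(B ** 2 - A * C, 0.0))
    return (B - disc) / A, (B + disc) / A

# Example with made-up summary statistics (ratio estimate a/b = 1.02/0.98).
print(fieller_interval(a=1.02, b=0.98, var_a=0.004, var_b=0.003, cov_ab=0.001, t_crit=2.05))
```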
APA, Harvard, Vancouver, ISO, and other styles
43

Chang, Hung-Ye, and 張弘燁. "Chronic Pain Recognition with multiple sources data fusion." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/t7752j.

Full text
Abstract:
Master's thesis, National Chiao Tung University, Institute of Photonic System, academic year 107. Chronic pain is a common condition, not only in Taiwan but worldwide, with over 1.5 billion patients; it is clearly an important issue. Previous studies have identified several features capable of distinguishing patients, and other studies reveal that multimodal information helps improve the identification of clinical disease. Combining multiple sources is believed to let each source complement the others, which benefits model training. Based on this prior research, we set out to explore the chronic pain field further. In this study we collected EEG, ECG, and saliva data, and performed quantitative sensory testing. First, we applied machine learning to coherence features derived from the EEG to classify the patient groups. Second, we combined coherence with information from other domains to optimize the accuracy. Our study shows that multi-source information is advantageous for identifying chronic pain patients: the accuracy reached about 87% and 88% in the CM and FM group comparisons.
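As a sketch of the kind of pipeline described (coherence features from EEG channel pairs fed to a classifier), the example below uses synthetic signals, an assumed sampling rate and frequency band, and a generic SVM; none of the settings are taken from the thesis.

```python
# Hedged sketch: magnitude-squared coherence between EEG channel pairs as features,
# classified with an SVM. Data are synthetic; shapes and parameters are placeholders.
import numpy as np
from itertools import combinations
from scipy.signal import coherence
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def coherence_features(eeg, fs=500.0):
    """eeg: (n_channels, n_samples) for one subject -> mean coherence per channel pair."""
    feats = []
    for i, j in combinations(range(eeg.shape[0]), 2):
        f, cxy = coherence(eeg[i], eeg[j], fs=fs, nperseg=256)
        feats.append(cxy[(f >= 8) & (f <= 13)].mean())   # e.g. alpha-band coherence
    return np.array(feats)

rng = np.random.default_rng(1)
subjects = rng.normal(size=(40, 8, 5000))                # 40 subjects, 8 channels (synthetic)
X = np.vstack([coherence_features(s) for s in subjects])
y = rng.integers(0, 2, size=40)                          # 0 = control, 1 = patient (synthetic)

print(cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean())
```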
APA, Harvard, Vancouver, ISO, and other styles
44

Gao, Feng. "On combining data from multiple sources with unknown relative weights." 1993. http://catalog.hathitrust.org/api/volumes/oclc/29243848.html.

Full text
Abstract:
Thesis (Ph. D.)--University of Wisconsin--Madison, 1993. Typescript. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (leaves 174-178).
APA, Harvard, Vancouver, ISO, and other styles
45

Lin, Yung-Sheng, and 林永盛. "Constructing PM2.5 Map Based on ETL-Based Multiple Data Sources Fusion." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/ptp5wj.

Full text
Abstract:
Master's thesis, National Taipei University, Department of Computer Science and Information Engineering, academic year 105. In recent years, air quality and environmental quality have drawn growing public concern, and many people pay attention to the air quality of their surroundings. As an air pollutant, PM2.5 affects the human body directly yet is invisible to the naked eye. Many epidemiological studies have demonstrated the health hazards of PM2.5, including premature death, bronchitis, asthma, cardiovascular disease, and lung cancer. Most current PM2.5 monitoring stations are fixed, so they cannot truly reflect the air quality of people's immediate surroundings. In this thesis, we therefore develop a method that lets everyone easily monitor the air quality around them. We construct a comprehensive PM2.5 map by fusing the sensing data from multiple monitoring stations, and we also implement a Mobile PM2.5 Sensor (MPM) to detect air quality. The MPM sends air quality readings to the user's smartphone, which forwards the user's GPS location and sensor data to a cloud server; the PM2.5 map is then built with the Google Maps API, so anyone can obtain the current PM2.5 concentration for a region through the map. We also develop an extract-transform-load (ETL) based cloud computing platform that integrates PM2.5 data sources from different organizations. To construct the comprehensive PM2.5 map, we fuse the integrated air data with the Inverse Distance Weighting (IDW) method, so that a PM2.5 value can be estimated anywhere, even where there are no monitoring stations. Finally, we develop a user application to present the system.
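The IDW fusion step can be illustrated with a few lines of code: each query location receives a distance-weighted average of nearby readings. The sketch treats coordinates as planar and uses invented station values; it illustrates the method, not the platform's implementation.

```python
# Hedged sketch of Inverse Distance Weighting (IDW) for filling a PM2.5 map between stations.
import numpy as np

def idw(query_xy, station_xy, station_pm25, power=2.0, eps=1e-12):
    """PM2.5 estimate at query_xy as a distance-weighted average of station readings."""
    d = np.linalg.norm(np.asarray(station_xy, dtype=float) - np.asarray(query_xy, dtype=float), axis=1)
    if np.any(d < eps):                        # query point sits on a station
        return float(station_pm25[int(np.argmin(d))])
    w = 1.0 / d ** power
    return float(np.dot(w, station_pm25) / w.sum())

stations = [(121.52, 25.04), (121.55, 25.06), (121.50, 25.08)]   # hypothetical coordinates
readings = np.array([18.0, 35.0, 27.0])                          # hypothetical ug/m3 readings
print(idw((121.53, 25.05), stations, readings))
```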
APA, Harvard, Vancouver, ISO, and other styles
46

Venter, Yolande. "Investigating excessive aggression during the preschool years through multiple data sources." Diss., 2012. http://hdl.handle.net/10500/6027.

Full text
Abstract:
Although aggression as a social phenomenon is widely researched, this research study aimed to illuminate the importance of early identification of excessively aggressive children specifically. The aim was to explore and gain an in-depth understanding of excessive aggressive behaviour during the preschool years. A qualitative research methodology was employed consisting of a parent interview, observations of the research participant, and numerous play sessions consisting of various activities, including free drawings, 'Draw-a-Person', a family drawing, the 'Children's Apperception Test', and free play activities. The study explored various factors possibly leading to the onset and continuation of excessive aggressive behaviour. It seems clear that no single factor is responsible for the display of excessive aggression; rather, multiple factors contribute to the problem of aggression as a whole. Play therapy is suggested as an effective method in the assessment and counselling of excessive aggressive behaviour in preschool children. Psychology; M.Sc. (Psychology).
APA, Harvard, Vancouver, ISO, and other styles
47

Mehrotra, Shashank. "Evaluation and Validation of Distraction Detection Algorithms on Multiple Data Sources." 2018. https://scholarworks.umass.edu/masters_theses_2/710.

Full text
Abstract:
This study aims to evaluate algorithms designed to detect distracted driving, including a comparison of how efficiently they detect the state of distraction and the likelihood of a crash. Four algorithms that utilize measures of cumulative glance, past glance behavior, and glance eccentricity were used to understand the distracted state of the driver and were validated on two separate data sources (i.e., simulator and naturalistic data). Additionally, an independent method for distraction detection was designed using data mining methods. This approach utilized measures such as steering degree, lane offset, lateral and longitudinal velocity, and acceleration. The results showed a higher likelihood of detecting distracted events when cumulative glances were considered; however, the detected state of distraction was higher when glance eccentricity was added. Additionally, it was observed that the glance-based measures used by the four legacy algorithms were better detectors of the state of distraction than the data mining method that used vehicular measures. This research has implications for understanding the state of distraction, the predictive power of different methods, and comparing approaches in different contexts (naturalistic vs. simulator). These findings provide fundamental building blocks towards designing advanced mitigation systems that give drivers feedback in instances of high crash likelihood.
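A cumulative-glance detector of the general kind evaluated above can be sketched as a sliding-window sum of off-road glance time compared against a threshold; the window length, threshold, and sampling rate below are placeholders, not the legacy algorithms' actual parameters.

```python
# Hedged sketch of a cumulative-glance style distraction detector: flag distraction when
# off-road glance time within a sliding window exceeds a threshold. Parameters are invented.
from collections import deque

def detect_distraction(off_road_flags, dt=0.1, window_s=6.0, threshold_s=2.0):
    """off_road_flags: per-frame booleans (True = eyes off the forward roadway)."""
    window = deque()
    cum_off_road = 0.0
    alarms = []
    for off in off_road_flags:
        window.append(off)
        cum_off_road += dt if off else 0.0
        if len(window) > int(window_s / dt):      # drop the oldest frame from the window
            if window.popleft():
                cum_off_road -= dt
        alarms.append(cum_off_road >= threshold_s)
    return alarms

# Example: a 3-second off-road glance inside a 10-second drive sampled at 10 Hz.
flags = [False] * 40 + [True] * 30 + [False] * 30
print(sum(detect_distraction(flags)))    # number of frames flagged as distracted
```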
APA, Harvard, Vancouver, ISO, and other styles
48

Chen, Yu. "Biological knowledge discovery through mining multiple sources of high-throughput data." 2004. http://etd.utk.edu/2004/ChenYu.pdf.

Full text
Abstract:
Thesis (Ph. D.)--University of Tennessee, Knoxville, 2004. Title from title page screen (viewed Sep. 23, 2004). Thesis advisor: Dong Xu. Document formatted into pages (xiii, 149 p. : ill. (chiefly col.)). Vita. Includes bibliographical references (p. 132-145).
APA, Harvard, Vancouver, ISO, and other styles
49

Huang, Wen-Jing. "Development of statewide truck travel demand models using multiple data sources." 1998. http://catalog.hathitrust.org/api/volumes/oclc/40452138.html.

Full text
Abstract:
Thesis (Ph. D.)--University of Wisconsin--Madison, 1998. Typescript. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (leaves 234-239).
APA, Harvard, Vancouver, ISO, and other styles
50

"Integrative Analyses of Diverse Biological Data Sources." Doctoral diss., 2011. http://hdl.handle.net/2286/R.I.9224.

Full text
Abstract:
The technology expansion seen in the last decade for genomics research has permitted the generation of large-scale data sources pertaining to molecular biological assays, genomics, proteomics, transcriptomics and other modern omics catalogs. New methods to analyze, integrate and visualize these data types are essential to unveil relevant disease mechanisms. Towards these objectives, this research focuses on data integration within two scenarios: (1) transcriptomic, proteomic and functional information and (2) real-time sensor-based measurements motivated by single-cell technology. To assess relationships between protein abundance and transcriptomic and functional data, a nonlinear model was explored at static and temporal levels. The successful integration of these heterogeneous data sources through the stochastic gradient boosted tree approach and its improved predictability are some highlights of this work. Through the development of an innovative validation subroutine based on a permutation approach and the use of external information (i.e., operons), the lack of a priori knowledge for undetected proteins was overcome. The integrative methodologies allowed for the identification of undetected proteins for Desulfovibrio vulgaris and Shewanella oneidensis for further biological exploration in laboratories towards finding functional relationships. In an effort to better understand diseases such as cancer at different developmental stages, the Microscale Life Science Center headquartered at Arizona State University is pursuing single-cell studies by developing novel technologies. This research arranged and applied a statistical framework that tackled the following challenges: random noise, heterogeneous dynamic systems with multiple states, and understanding cell behavior within and across different Barrett's esophageal epithelial cell lines using oxygen consumption curves. These curves were characterized with good empirical fit using nonlinear models with simple structures, which allowed extraction of a large number of features. Application of a supervised classification model to these features and the integration of experimental factors allowed for identification of subtle patterns among different cell types, visualized through multidimensional scaling. Motivated by the challenges of analyzing real-time measurements, we further explored a unique two-dimensional representation of multiple time series using a wavelet approach, which showcased promising results towards less complex approximations. Also, the benefits of external information were explored to improve the image representation. Dissertation/Thesis. Ph.D. Industrial Engineering 2011.
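The stochastic gradient boosted tree step can be illustrated with a minimal, hypothetical fit of protein abundance on transcriptomic-style features; the file and feature names are invented, and the study's own validation scheme (permutation subroutine, operon information) is not reproduced.

```python
# Hedged sketch: stochastic gradient boosted trees predicting protein abundance from
# transcriptomic-style features. Input file and column names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

data = pd.read_csv("omics_features.csv")        # one row per gene/protein (hypothetical)
features = ["mrna_level", "codon_adaptation_index", "operon_size", "gene_length"]
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["protein_abundance"], test_size=0.2, random_state=0)

gbt = GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                subsample=0.7,   # "stochastic" boosting: subsample rows per tree
                                max_depth=3, random_state=0)
gbt.fit(X_train, y_train)
print(r2_score(y_test, gbt.predict(X_test)))
```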
APA, Harvard, Vancouver, ISO, and other styles
