Dissertations / Theses on the topic 'Multivariate analysis – Data processing'
Consult the top 50 dissertations / theses for your research on the topic 'Multivariate analysis – Data processing.'
Jonsson, Pär. "Multivariate processing and modelling of hyphenated metabolite data." Doctoral thesis, Umeå universitet, Kemi, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-663.
Siluyele, Ian John. "Power studies of multivariate two-sample tests of comparison." Thesis, University of the Western Cape, 2007. http://etd.uwc.ac.za/index.php?module=etd&action=viewtitle&id=gen8Srv25Nme4_6355_1255091702.
Multivariate two-sample tests provide a means of testing whether two multivariate distributions match. Although many tests exist in the literature, relatively little is known about the relative power of these procedures. The thesis uses a Monte Carlo study to contrast the effectiveness, in terms of power, of seven such tests. The relative power of the tests was investigated against location, scale, and correlation alternatives.
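As an illustration of what a Monte Carlo power study involves, the sketch below estimates the power of one simple two-sample test against a location alternative. It is not taken from the thesis: the permutation test on the distance between mean vectors, the Gaussian data, and all function names and parameter values are assumptions chosen for brevity, and the seven tests actually compared are not named in the abstract.

```python
import numpy as np

def mean_diff_stat(x, y):
    """Squared Euclidean distance between the two sample mean vectors."""
    d = x.mean(axis=0) - y.mean(axis=0)
    return float(d @ d)

def permutation_pvalue(x, y, rng, n_perm=99):
    """Permutation p-value for the mean-difference statistic."""
    observed = mean_diff_stat(x, y)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))  # random relabelling of the pooled sample
        if mean_diff_stat(pooled[idx[:n]], pooled[idx[n:]]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

def estimate_power(shift, n=30, dim=3, n_sim=100, alpha=0.05, seed=0):
    """Fraction of simulated datasets in which the test rejects at level alpha."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        x = rng.standard_normal((n, dim))
        y = rng.standard_normal((n, dim)) + shift  # location alternative
        if permutation_pvalue(x, y, rng) <= alpha:
            rejections += 1
    return rejections / n_sim

if __name__ == "__main__":
    for shift in (0.0, 0.5, 1.0):
        print(f"shift={shift}: estimated power={estimate_power(shift):.2f}")
```

With a shift of zero the estimate approximates the test's size rather than its power; scale or correlation alternatives could be simulated the same way by changing how `y` is generated.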
Vitale, Raffaele. "Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation." Doctoral thesis, Universitat Politècnica de València, 2017. http://hdl.handle.net/10251/90442.
This doctoral thesis, conceived mainly to support and strengthen the relationship between academia and industry, was developed in collaboration with Shell Global Solutions (Amsterdam, The Netherlands) in an effort to apply, and possibly extend, well-established latent-variable approaches (namely Principal Component Analysis, PCA; Partial Least Squares regression, PLS; and PLS Discriminant Analysis, PLSDA) to the resolution of complex problems, not only in process improvement and optimization but also in the wider field of multivariate data analysis. To this end, every chapter proposes new, efficient algorithmic solutions to disparate tasks, from calibration transfer in spectroscopy to real-time modelling of data streams. The manuscript is divided into the following six parts, each focused on a topic of interest: Part I - Preface, which summarizes this research work and gives its main objectives and justification, together with a brief introduction to PCA, PLS and PLSDA; Part II - On kernel-based extensions of PCA, PLS and PLSDA, which presents the potential of kernel techniques, possibly coupled with specific variants of the recently rediscovered pseudo-sample projection formulated by the English statistician John C. Gower, and compares their performance with that of more classical methodologies in four application scenarios: segmentation of Red-Green-Blue (RGB) images, discrimination and monitoring of batch processes, and analysis of mixture designs of experiments; Part III - On the selection of the number of factors in PCA by permutation testing, which offers extensive guidance on how to select PCA components by permutation testing, together with a complete illustration of an original algorithmic procedure implemented for that purpose; Part IV - On modelling common and distinctive sources of variability in multi-set data analysis, which discusses several practical aspects of the analysis of common and distinctive components of two data blocks (carried out by methods such as Simultaneous Component Analysis, SCA; DIStinctive and COmmon Simultaneous Component Analysis, DISCO-SCA; Adapted Generalised Singular Value Decomposition, Adapted GSVD; ECO-POWER; Canonical Correlation Analysis, CCA; and 2-block Orthogonal Projections to Latent Structures, O2PLS). It also presents a new computational strategy for determining the number of common factors underlying two data matrices that share the same row or column dimension, and two novel approaches to calibration transfer between near-infrared spectrometers; Part V - On the on-the-fly processing and modelling of high-dimensional data streams, which designs the On-The-Fly Processing (OTFP) tool, a new system for the rational handling of multi-channel measurements recorded in real time; Part VI - Epilogue, which presents the final conclusions, outlines future perspectives, and includes the annexes.
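To make the idea of choosing the number of PCA components by permutation testing concrete, here is a minimal sketch. It is a generic textbook-style variant and not the original procedure developed in the thesis: each column is permuted independently to destroy correlations, and a component is retained while its eigenvalue exceeds what the permuted data produce. The function names, the sequential stopping rule and the default parameters are all assumptions.

```python
import numpy as np

def explained_variance(x):
    """Covariance eigenvalues of x, from the singular values of the centered data."""
    xc = x - x.mean(axis=0)
    s = np.linalg.svd(xc, compute_uv=False)
    return s**2 / (len(x) - 1)

def n_components_by_permutation(x, n_perm=200, alpha=0.05, seed=0):
    """Return (number of retained components, per-component permutation p-values)."""
    rng = np.random.default_rng(seed)
    observed = explained_variance(x)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # Permuting each column separately keeps the marginals but breaks correlations.
        permuted = np.column_stack([rng.permutation(col) for col in x.T])
        exceed += explained_variance(permuted) >= observed
    p_values = (exceed + 1) / (n_perm + 1)
    # Retain leading components until the first non-significant one.
    k = 0
    for p in p_values:
        if p > alpha:
            break
        k += 1
    return k, p_values

if __name__ == "__main__":
    # Synthetic data with two latent factors, each driving four of eight variables.
    rng = np.random.default_rng(1)
    scores = rng.standard_normal((200, 2))
    loadings = np.zeros((2, 8))
    loadings[0, :4] = 1.0
    loadings[1, 4:] = 1.0
    data = scores @ loadings + 0.3 * rng.standard_normal((200, 8))
    print(n_components_by_permutation(data)[0])
```

This simple rule is known to be conservative for later components, because every permuted eigenvalue is computed from data with no structure at all; more refined schemes deflate the data after each accepted component.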
Vitale, R. (2017). Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/90442
Doshi, Punit Rameshchandra. "Adaptive prefetching for visual data exploration." Link to electronic thesis, 2003. http://www.wpi.edu/Pubs/ETD/Available/etd-0131103-203307.
Keywords: Adaptive prefetching; Large-scale multivariate data visualization; Semantic caching; Hierarchical data exploration; Exploratory data analysis. Includes bibliographical references (p. 66-70).
Cannon, Paul C. "Extending the information partition function: modeling interaction effects in highly multivariate, discrete data." Diss., 2008. http://contentdm.lib.byu.edu/ETD/image/etd2263.pdf.
Forshed, Jenny. "Processing and analysis of NMR data : Impurity determination and metabolic profiling." Doctoral thesis, Stockholm : Dept. of analytical chemistry, Stockholm university, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-712.
Guamán, Novillo Ana Verónica. "Multivariate Signal Processing for Quantitative and Qualitative Analysis of Ion Mobility Spectrometry data, applied to Biomedical Applications and Food Related Applications." Doctoral thesis, Universitat de Barcelona, 2015. http://hdl.handle.net/10803/349210.
The aim of this thesis is to develop new methodologies for multivariate signal processing of IMS spectra. Three IMS spectrometers were compared in this work; such a comparison by means of multivariate processing is practically unprecedented in this field. The study used three amines and determined the limit of detection. The results showed that the three spectrometers performed similarly, even though their operating conditions differ. A specific technique was proposed to remove low-frequency noise coupled to the IMS spectrum. Using PCA or ICA (multivariate methods) was found to improve the signal-to-noise ratio markedly compared with conventional techniques. Spectral alignment was studied, and solutions based on various state-of-the-art methods were proposed. Including reference compounds to verify that the alignment is adequate proved advantageous. Where this is not possible, alignment in stages is advised: first within a single sample, then between samples. Qualitative models were built to differentiate or discriminate classes from IMS measurements. Two multivariate models with cross-validation techniques were proposed. The results show the great potential of IMS in this respect. The quantitative performance of IMS with multivariate methods was evaluated and compared with the univariate methods customary in the IMS field. The results showed that univariate models cannot resolve typical IMS behaviours such as nonlinearity and mixture effects, whereas multivariate techniques performed better. Multivariate techniques that project the data onto a new subspace, such as PLS, were compared with deconvolution techniques such as MCR in its two versions, ALS and Lasso. The results were quite similar; however, MCR offers an important advantage in that it makes the results easier to interpret.
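The abstract reports that PCA improves the signal-to-noise ratio of IMS spectra. One common way this works, sketched below, is to reconstruct each spectrum from a few leading principal components so that noise in the discarded components is dropped. This is an assumed, generic PCA denoising scheme rather than the thesis's specific technique; the synthetic two-peak spectra, the function names and the choice of two components are hypothetical.

```python
import numpy as np

def pca_denoise(spectra, n_components):
    """Reconstruct each spectrum (one per row) from its leading principal components."""
    mean = spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    k = n_components
    return mean + (u[:, :k] * s[:k]) @ vt[:k]

def make_synthetic_spectra(n_spectra=50, n_points=200, noise=0.05, seed=0):
    """Hypothetical spectra: two Gaussian peaks with random amplitudes plus white noise."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_points)
    peaks = np.stack([
        np.exp(-((axis - 0.3) / 0.03) ** 2),
        np.exp(-((axis - 0.6) / 0.05) ** 2),
    ])
    clean = rng.uniform(0.5, 1.5, (n_spectra, 2)) @ peaks
    noisy = clean + noise * rng.standard_normal(clean.shape)
    return clean, noisy

if __name__ == "__main__":
    clean, noisy = make_synthetic_spectra()
    denoised = pca_denoise(noisy, n_components=2)
    print("error before:", np.linalg.norm(noisy - clean))
    print("error after: ", np.linalg.norm(denoised - clean))
```

The number of components has to be chosen with care on real data: too few removes signal, too many keeps noise.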
Cannon, Paul C. "Extending the Information Partition Function: Modeling Interaction Effects in Highly Multivariate, Discrete Data." BYU ScholarsArchive, 2007. https://scholarsarchive.byu.edu/etd/1234.
Oller, Moreno Sergio. "Data processing for Life Sciences measurements with hyphenated Gas Chromatography-Ion Mobility Spectrometry." Doctoral thesis, Universitat de Barcelona, 2018. http://hdl.handle.net/10803/523539.
Recent advances in chemical instrumentation and progress in computational capabilities open new possibilities for the analysis of data from various fields of the life sciences, such as biology, medicine and food science. One technique that has benefited from these advances is gas chromatography-ion mobility spectrometry (GC-IMS). This technique is useful for detecting volatile organic compounds in complex samples. IMS is an analytical technique for characterizing chemical substances based on the velocity of gas-phase ions in an electric field, and it can rapidly detect traces of some volatiles at ppb concentrations. To increase its selectivity, a gas chromatograph can be used to pre-separate the sample, at the expense of analysis time. Despite improvements in instrumentation and greater computational power, better algorithms are needed to extract all the information from the samples. In particular, GC-IMS has not received much attention compared with other analytical techniques. This work addresses several problems in GC-IMS data analysis. Regarding pre-processing, we explore baseline estimation algorithms and propose an improvement adapted to the needs of the instrument. This algorithm is also used on gas chromatography-mass spectrometry (GC-MS) samples, as it suits both techniques. We characterize the spectral misalignments that arise in a study lasting several months, and propose an alignment method based on monotonic cubic splines to correct them, together with an optimal time interval between two calibrant samples. We explore the use of multivariate curve resolution (MCR) methods to deconvolve overlapping peaks and extract them as pure components. We propose the use of a moving window along the retention time. This improvement makes it possible to extract more information about the analytes. Finally, we apply some of these developments to two applications: fraud prevention in olive oil classification, measured with GC-IMS, and the search for prostate cancer biomarkers in urine volatiles, carried out with GC-MS.
Alexander, Miranda Abhilash. "Spectral factor model for time series learning." Doctoral thesis, Universite Libre de Bruxelles, 2011. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209812.
Today's computerized processes generate massive amounts of streaming data.
In many applications, data is collected for modeling the processes. The process model is hoped to drive objectives such as decision support, data visualization, business intelligence, automation and control, pattern recognition and classification, etc. However, we face significant challenges in data-driven modeling of processes. Apart from the errors, outliers and noise in the data measurements, the main challenge is due to a large dimensionality, which is the number of variables each data sample measures. The samples often form a long temporal sequence called a multivariate time series where any one sample is influenced by the others.
We wish to build a model that will ensure robust generation, reviewing, and representation of new multivariate time series that are consistent with the underlying process.
In this thesis, we adopt a modeling framework to extract characteristics from multivariate time series that correspond to dynamic variation-covariation common to the measured variables across all the samples. Those characteristics of a multivariate time series are named its 'commonalities' and a suitable measure for them is defined. What makes the multivariate time series model versatile is the assumption regarding the existence of a latent time series of known or presumed characteristics and much lower dimensionality than the measured time series; the result is the well-known 'dynamic factor model'.
Original variants of existing methods for estimating the dynamic factor model are developed: The estimation is performed using the frequency-domain equivalent of the dynamic factor model named the 'spectral factor model'. To estimate the spectral factor model, ideas are sought from the asymptotic theory of spectral estimates. This theory is used to attain a probabilistic formulation, which provides maximum likelihood estimates for the spectral factor model parameters. Then, maximum likelihood parameters are developed with all the analysis entirely in the spectral-domain such that the dynamically transformed latent time series inherits the commonalities maximally.
The main contribution of this thesis is a learning framework using the spectral factor model. We term learning as the ability of a computational model of a process to robustly characterize the data the process generates for purposes of pattern matching, classification and prediction. Hence, the spectral factor model could be claimed to have learned a multivariate time series if the latent time series when dynamically transformed extracts the commonalities reliably and maximally. The spectral factor model will be used for mainly two multivariate time series learning applications: First, real-world streaming datasets obtained from various processes are to be classified; in this exercise, human brain magnetoencephalography signals obtained during various cognitive and physical tasks are classified. Second, the commonalities are put to test by asking for reliable prediction of a multivariate time series given its past evolution; share prices in a portfolio are forecasted as part of this challenge.
For both spectral factor modeling and learning, an analytical solution as well as an iterative solution are developed. While the analytical solution is based on low-rank approximation of the spectral density function, the iterative solution is based on the expectation-maximization algorithm. For the human brain signal classification exercise, a strategy for comparing similarities between the commonalities for various classes of multivariate time series processes is developed. For the share price prediction problem, a vector autoregressive model whose parameters are enriched with the maximum likelihood commonalities is designed. In both these learning problems, the spectral factor model gives commendable performance with respect to competing approaches.
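The analytical solution is described as a low-rank approximation of the spectral density function. The sketch below shows one way to read that statement: estimate the cross-spectral density matrix at each frequency, then keep only its largest eigen-components. It is an assumption-laden illustration, not the thesis's estimator: the averaged-periodogram estimate, the Hann window, the segment length, the missing normalization constant and the function names are all choices made here for brevity.

```python
import numpy as np

def cross_spectral_density(x, seg_len=64):
    """Averaged-periodogram estimate of the spectral density matrix (unnormalized).

    x has shape (n_samples, n_channels); the result has shape
    (n_freqs, n_channels, n_channels), one Hermitian matrix per frequency.
    """
    n_seg = x.shape[0] // seg_len
    segs = x[: n_seg * seg_len].reshape(n_seg, seg_len, x.shape[1])
    segs = segs - segs.mean(axis=1, keepdims=True)
    window = np.hanning(seg_len)[None, :, None]
    f = np.fft.rfft(segs * window, axis=1)  # (n_seg, n_freqs, n_channels)
    # Outer product over channels at each frequency, averaged over segments.
    return np.einsum("sfi,sfj->fij", f, f.conj()) / n_seg

def low_rank_spectrum(s, rank):
    """Keep the `rank` largest eigen-components of each Hermitian matrix S(f)."""
    w, v = np.linalg.eigh(s)  # eigenvalues in ascending order, batched over frequency
    w_top = w[..., -rank:]
    v_top = v[..., -rank:]
    return np.einsum("fik,fk,fjk->fij", v_top, w_top, v_top.conj())

if __name__ == "__main__":
    # Five observed series driven by one latent moving-average series plus noise.
    rng = np.random.default_rng(0)
    latent = np.convolve(rng.standard_normal(4096), [1.0, 0.7, 0.3], mode="same")[:, None]
    loadings = np.array([[1.0, -0.8, 0.6, 1.2, 0.9]])
    x = latent @ loadings + 0.1 * rng.standard_normal((4096, 5))
    s = cross_spectral_density(x)
    s1 = low_rank_spectrum(s, rank=1)
    print("relative error of rank-1 fit:", np.linalg.norm(s - s1) / np.linalg.norm(s))
```

When a single latent series drives all channels, the rank-1 part captures almost all of the spectral density, which is the sense in which a low-dimensional latent series can carry the commonalities.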
Doctorate in Sciences
Ablin, Pierre. "Exploration of multivariate EEG/MEG signals using non-stationary models." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLT051.
Independent Component Analysis (ICA) models a set of signals as linear combinations of independent sources. This analysis method plays a key role in electroencephalography (EEG) and magnetoencephalography (MEG) signal processing. Applied to such signals, it makes it possible to isolate interesting brain sources, locate them, and separate them from artifacts. ICA belongs to the toolbox of many neuroscientists, and is part of the processing pipeline of many research articles. Yet the most widely used algorithms date back to the 1990s. They are often quite slow and stick to the standard ICA model, without more advanced features. The goal of this thesis is to develop practical ICA algorithms to help neuroscientists. We follow two axes. The first is speed. We consider the optimization problems solved by two of the ICA algorithms most widely used by practitioners: Infomax and FastICA. We develop a novel technique based on preconditioning the L-BFGS algorithm with Hessian approximations. The resulting algorithm, Picard, is tailored for real-data applications, where the independence assumption is never entirely true. On M/EEG data, it converges faster than the 'historical' implementations. Another way to accelerate ICA is to use incremental methods, which process a few samples at a time instead of the whole dataset. Such methods have attracted great interest in recent years because they scale well to very large datasets. We propose an incremental algorithm for ICA with important descent guarantees. As a consequence, the proposed algorithm is simple to use and has no critical, hard-to-tune parameter such as a learning rate. Along the second axis, we propose to incorporate noise in the ICA model. Such a model is notoriously hard to fit under the standard non-Gaussian hypothesis of ICA, and would make estimation extremely slow. Instead, we rely on a spectral diversity assumption, which leads to a practical algorithm, SMICA. The noise model opens the door to new possibilities, such as finer estimation of the sources and the use of ICA as a statistically sound dimension reduction technique. Thorough experiments on M/EEG datasets demonstrate the usefulness of this approach. All algorithms developed in this thesis are open source and available online. The Picard algorithm is included in the largest M/EEG processing Python library, MNE, and in the Matlab library EEGLAB.
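For readers unfamiliar with the standard ICA model that the abstract starts from, the following sketch unmixes two synthetic signals with a minimal symmetric FastICA iteration (tanh nonlinearity). It illustrates only the classical fixed-point algorithm mentioned as a baseline; it is not Picard, the incremental algorithm, or SMICA, and the function names, sources and mixing matrix are assumptions made for the example.

```python
import numpy as np

def whiten(x):
    """Center x (n_signals, n_samples) and transform it to identity covariance."""
    xc = x - x.mean(axis=1, keepdims=True)
    cov = xc @ xc.T / xc.shape[1]
    w, v = np.linalg.eigh(cov)
    return (v / np.sqrt(w)) @ v.T @ xc

def sym_decorrelate(w):
    """Return (W W^T)^(-1/2) W, which makes the rows of W orthonormal."""
    s, u = np.linalg.eigh(w @ w.T)
    return (u / np.sqrt(s)) @ u.T @ w

def fast_ica(x, n_iter=200, tol=1e-8, seed=0):
    """Estimate independent sources from mixtures x of shape (n_signals, n_samples)."""
    rng = np.random.default_rng(seed)
    z = whiten(x)
    n = z.shape[0]
    w = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        y = w @ z
        g = np.tanh(y)
        g_prime = 1.0 - g**2
        # Fixed-point update: E[g(y) z^T] - E[g'(y)] W, row by row.
        w_new = g @ z.T / z.shape[1] - g_prime.mean(axis=1)[:, None] * w
        w_new = sym_decorrelate(w_new)
        # Converged when every new row is parallel (up to sign) to the old one.
        delta = np.max(np.abs(np.abs(np.sum(w_new * w, axis=1)) - 1.0))
        w = w_new
        if delta < tol:
            break
    return w @ z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sources = np.vstack([rng.laplace(size=5000), rng.uniform(-1, 1, 5000)])
    mixing = np.array([[1.0, 0.5], [0.3, 1.0]])
    estimated = fast_ica(mixing @ sources)
    print(np.abs(np.corrcoef(np.vstack([estimated, sources]))[:2, 2:]).round(2))
```

The recovered sources are defined only up to order, sign and scale, which is why the example compares them with the true sources through absolute correlations.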
Siepka, Damian. "Development of multidimensional spectral data processing procedures for analysis of composition and mixing state of aerosol particles by Raman and FTIR spectroscopy." Thesis, Lille 1, 2017. http://www.theses.fr/2017LIL10188/document.
Suitably adjusted multivariate data processing methods and procedures can significantly improve the process of obtaining knowledge about a sample's composition. Spectroscopic techniques are capable of fast analysis of various samples and were developed for research and industrial purposes. This creates great possibilities for advanced molecular analysis of complex samples, such as atmospheric aerosols. Airborne particles affect air quality, human health and ecosystem condition, and play an important role in the Earth's climate system. The purpose of this thesis is twofold. On an analytical level, a functional algorithm was established for evaluating the quantitative composition of atmospheric particles from measurements of individual particles by Raman microspectroscopy (RMS). On a constructive level, a readily accessible analytical system for Raman and FTIR data processing was developed. The potential of single-particle analysis by RMS was exploited by applying the designed analytical algorithm, which combines multicurve resolution with multivariate data treatment, to describe the chemical mixing of aerosol particles efficiently. The algorithm was applied to particles collected in a copper mine in Bolivia and provides a new way of describing a sample. The new user-friendly software, which includes pre-treatment algorithms and several easy-to-access, common multivariate data treatments, is equipped with a graphical interface. The software was applied to some challenging aspects of pattern recognition in Raman and FTIR spectroscopy for coal mine particles, biogenic particles and organic pigments.
Derksen, Timothy J. (Timothy John). "Processing of outliers and missing data in multivariate manufacturing data." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/38800.
Includes bibliographical references (leaf 64).
by Timothy J. Derksen.
M.Eng.
Jonsson, Pär. "Multivariate processing and modelling of hyphenated metabolite data." Umeå: Dept. of Chemistry, Umeå University, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-663.
Oliveira, Irene. "Correlated data in multivariate analysis." Thesis, University of Aberdeen, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.401414.
Prelorendjos, Alexios. "Multivariate analysis of metabonomic data." Thesis, University of Strathclyde, 2014. http://oleg.lib.strath.ac.uk:80/R/?func=dbin-jump-full&object_id=24286.
Tavares, Nuno Filipe Ramalho da Cunha. "Multivariate analysis applied to clinical analysis data." Master's thesis, Faculdade de Ciências e Tecnologia, 2014. http://hdl.handle.net/10362/12288.
Folate, vitamin B12, iron and hemoglobin are essential for metabolic functions in the body. A deficiency of any of these can cause several known pathologies and, if untreated, can be responsible for severe morbidity and even death. The objective of this study is to characterize a population residing in the metropolitan area of Lisbon and Setubal with respect to serum levels of folate, vitamin B12, iron and hemoglobin, and to find evidence of correlations between these parameters and illnesses, mainly cardiovascular, gastrointestinal and neurological diseases and anemia. Clinical analysis data were collected and submitted to multivariate analysis. First, the data were screened with the Spearman correlation and the Kruskal-Wallis analysis of variance to study correlations and variability between groups. To characterize the population, we used cluster analysis with Ward's linkage method. Finally, a sensitivity analysis was performed to strengthen the results. The Spearman correlation showed a positive correlation of iron with ferritin, transferrin and hemoglobin. The Kruskal-Wallis test showed significant differences in these biomarkers between persons aged 0 to 29, 30 to 59, and over 60 years. Cluster analysis proved to be a useful tool for characterizing a population based on its biomarkers, showing evidence of low folate levels in the population in general and hemoglobin levels below the reference values. Iron and vitamin B12 were within the reference range for most of the population. Low levels of the parameters were registered mainly in patients with cardiovascular, gastrointestinal and neurological diseases and anemia.
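The two screening statistics named in the abstract are simple enough to compute by hand, as sketched below. The sketch assumes there are no tied values, so that ranks are unambiguous and no tie correction is needed; real clinical data usually contain ties and would call for a statistics library. The function names and the toy numbers are hypothetical, and Ward clustering is not covered here.

```python
import numpy as np

def ranks(values):
    """Ranks 1..n of the values (assumes no ties, for brevity)."""
    order = np.argsort(values)
    r = np.empty(len(values))
    r[order] = np.arange(1, len(values) + 1)
    return r

def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction) for two or more groups."""
    pooled = np.concatenate(groups)
    r = ranks(pooled)
    n_total = len(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = r[start:start + len(g)].sum()  # ranks belonging to this group
        h += rank_sum**2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * h - 3.0 * (n_total + 1)

if __name__ == "__main__":
    # Toy values only: a monotone relation gives rho = 1.
    print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
    # Three fully separated groups of three observations each.
    print(kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```

Under the null hypothesis, H is approximately chi-square distributed with (number of groups - 1) degrees of freedom, which is how a p-value for differences between age groups would be obtained.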
Rehman, Naveed Ur. "Data-driven time-frequency analysis of multivariate data." Thesis, Imperial College London, 2011. http://hdl.handle.net/10044/1/9116.
Full textDroop, Alastair Philip. "Correlation Analysis of Multivariate Biological Data." Thesis, University of York, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.507622.
Full textCollins, Gary Stephen. "Multivariate analysis of flow cytometry data." Thesis, University of Exeter, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.324749.
Full textZhu, Liang. "Semiparametric analysis of multivariate longitudinal data." Diss., Columbia, Mo. : University of Missouri-Columbia, 2008. http://hdl.handle.net/10355/6044.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on August 3, 2009). Vita. Includes bibliographical references.
Haydock, Richard. "Multivariate analysis of Raman spectroscopy data." Thesis, University of Nottingham, 2015. http://eprints.nottingham.ac.uk/30697/.
Lans, Ivo A. van der. "Nonlinear multivariate analysis for multiattribute preference data." [Leiden] : DSWO Press, Leiden University, 1992. http://catalog.hathitrust.org/api/volumes/oclc/28733326.html.
Yang, Di. "Analysis guided visual exploration of multivariate data." Worcester, Mass. : Worcester Polytechnic Institute, 2007. http://www.wpi.edu/Pubs/ETD/Available/etd-050407-005925/.
Snavely, Anna Catherine. "Multivariate Data Analysis with Applications to Cancer." Thesis, Harvard University, 2012. http://dissertations.umi.com/gsas.harvard:10371.
Bolton, Richard John. "Multivariate analysis of multiproduct market research data." Thesis, University of Exeter, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.302542.
Durif, Ghislain. "Multivariate analysis of high-throughput sequencing data." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSE1334/document.
The statistical analysis of Next-Generation Sequencing data raises many computational challenges regarding modeling and inference, especially because of the high dimensionality of genomic data. The research work in this manuscript concerns hybrid dimension reduction methods that rely on both compression (representation of the data in a lower-dimensional space) and variable selection. Developments are made concerning the sparse Partial Least Squares (PLS) regression framework for supervised classification and the sparse matrix factorization framework for unsupervised exploration. In both situations, our main purpose is to focus on the reconstruction and visualization of the data. First, we present a new sparse PLS approach, based on an adaptive sparsity-inducing penalty, that is suitable for logistic regression to predict the label of a discrete outcome. For instance, such a method can be used for prediction (fate of patients or specific type of unidentified single cells) based on gene expression profiles. The main issue in such a framework is to account for the response when discarding irrelevant variables. We highlight the direct link between the derivation of the algorithms and the reliability of the results. Then, motivated by questions regarding single-cell data analysis, we propose a flexible model-based approach for the factorization of count matrices that accounts for over-dispersion as well as zero-inflation (both characteristic of single-cell data), for which we derive an estimation procedure based on variational inference. In this scheme, we consider probabilistic variable selection based on a spike-and-slab model suitable for count data. The value of our procedure for data reconstruction, visualization and clustering is illustrated by simulation experiments and by preliminary results on single-cell data analysis.
All proposed methods were implemented in two R packages, "plsgenomics" and "CMF", based on high-performance computing.
Tardif, Geneviève. "Multivariate Analysis of Canadian Water Quality Data." Thesis, Université d'Ottawa / University of Ottawa, 2015. http://hdl.handle.net/10393/32245.
Bergfors, Linus. "Explorative Multivariate Data Analysis of the Klinthagen Limestone Quarry Data." Thesis, Uppsala University, Department of Information Technology, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-122575.
Current quarry planning at Klinthagen is rough, which provides an opportunity to introduce new methods to improve quarry gain and efficiency. Nordkalk AB, active at Klinthagen, wishes to start a new quarry at a nearby location. To exploit future quarries efficiently and ensure production quality, multivariate statistics may help gather important information.
In this thesis, the multivariate statistical approaches of Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression were evaluated on the Klinthagen bore data. PCA data were spatially interpolated by Kriging, which was also evaluated and compared to inverse distance weighting (IDW) interpolation.
Principal component analysis supplied an overview of the relations between the variables, but also revealed the problems involved in linking geophysical data to geochemical data and the inaccuracy introduced by poor data quality.
The PLS regression further emphasised the geochemical-geophysical problems, but also showed good precision when applied to strictly geochemical data.
Spatial interpolation by Kriging did not result in significantly better approximations than the less complex control interpolation by IDW.
In order to improve the information content of the data when modelled by PCA, a more discrete sampling method would be advisable. Data quality may cause trouble, though with today's sampling technique it was considered to be of less consequence.
When a single geophysical component is to be predicted from chemical variables, further geophysical data are needed to complement the existing data in order to achieve satisfactory PLS models.
The stratified rock composition caused trouble in the spatial interpolation. Further investigations should be performed to develop more suitable interpolation techniques.
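The IDW interpolation used as the control method in this abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the thesis's implementation: the function name `idw`, the exponent of 2, and the sample points are all hypothetical.

```python
import math


def idw(points, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    points: list of ((x, y), value) pairs with known values.
    target: (x, y) location to estimate.
    power:  distance exponent (2 is a common default, assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in points:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a sample: return its value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical bore-hole scores at three locations (illustration only).
samples = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0), ((0.0, 2.0), 30.0)]
print(idw(samples, (1.0, 1.0)))
```

Unlike Kriging, this estimate uses no model of spatial covariance; nearby samples simply count for more, which is why it serves as a simple baseline.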
Lee, Yau-wing. "Modelling multivariate survival data using semiparametric models." Click to view the E-thesis via HKUTO, 2000. http://sunzi.lib.hku.hk/hkuto/record/B4257528X.
Irick, Nancy. "Post Processing Data Analysis." International Foundation for Telemetering, 2009. http://hdl.handle.net/10150/606091.
Once the test is complete, the job of the data analyst begins. Files from the various acquisition systems are collected. It is the job of the analyst to put these files together in a readable format so that the success or failure of the test can be determined. This paper will discuss the process of breaking down these files, comparing data from different systems, and methods of presenting the data.
李友榮 and Yau-wing Lee. "Modelling multivariate survival data using semiparametric models." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2000. http://hub.hku.hk/bib/B4257528X.
Billah, Baki. "The analysis of multivariate incomplete failure time data." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1995. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp04/mq25823.pdf.
Rawizza, Mark Alan. "Time-series analysis of multivariate manufacturing data sets." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/10895.
Ritchie, Elspeth Kathryn. "Application of multivariate data analysis in biopharmaceutical production." Thesis, University of Newcastle upon Tyne, 2016. http://hdl.handle.net/10443/3356.
Lawal, Najib. "Modelling and multivariate data analysis of agricultural systems." Thesis, University of Manchester, 2015. https://www.research.manchester.ac.uk/portal/en/theses/modelling-and-multivariate-data-analysis-of-agricultural-systems(f6b86e69-5cff-4ffb-a696-418662ecd694).html.
Hopkins, Julie Anne. "Sampling designs for exploratory multivariate analysis." Thesis, University of Sheffield, 2000. http://etheses.whiterose.ac.uk/14798/.
Zhou, Feifei, and 周飞飞. "Cure models for univariate and multivariate survival data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2011. http://hub.hku.hk/bib/B45700977.
Petersson, Henrik. "Multivariate Exploration and Processing of Sensor Data-applications with multidimensional sensor systems." Doctoral thesis, Linköpings universitet, Tillämpad Fysik, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-14879.
A sensor is a component that converts a physical, chemical, or biological quantity or quality into a readable signal. Sensors are today an important part of many high-tech products, and sensor research is an active field. The complexity of sensor-based systems is increasing, and it is becoming possible to record more and more types of measurement signals. The measurement signals are not always directly interpretable, which makes signal processing an essential tool for extracting the information sought. Signal processing of sensor signals is unfortunately not an uncomplicated procedure, and there are many aspects to consider. For this reason, signal processing and analysis of sensor signals has developed into a research field of its own. This thesis deals with methods for analyzing complex multidimensional sensor signals. An introduction is given to methods for classifying and quantifying properties of measured objects on the basis of measurements. An overview is given of the effects that can arise from imperfections in the sensors, and methods for avoiding or mitigating the problems these imperfections can cause are discussed. Particular emphasis is placed on methods that are directly applicable and useful for systems of chemical sensors. The thesis includes four papers, each of which illustrates how the methods described can be used in practical situations.
Nicolini, Olivier. "LIBS Multivariate Analysis with Machine Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286595.
Laser-Induced Breakdown Spectroscopy (LIBS) is a spectroscopic technique used for chemical analysis of materials. By analyzing the spectrum obtained with this technique, it is possible to understand the chemical composition of a sample. The ability to analyze materials in a contactless, online manner without sample preparation makes LIBS one of the most interesting techniques for chemical composition analysis. Despite its inherent advantages, LIBS analysis suffers from poor accuracy and limited reproducibility of results because of interference effects caused by the sample's chemical composition or other experimental factors. How to improve the accuracy of the analysis by extracting useful information from high-dimensional LIBS data remains the main challenge of this technique. In the present work, with the aim of proposing a robust analysis method, I present a pipeline for multivariate regression on LIBS data consisting of preprocessing, feature selection, and regression. First, the raw data are preprocessed by applying intensity filtering, normalization, and baseline correction to mitigate the effect of interference factors such as fluctuations in laser energy or the presence of a baseline in the spectrum. Feature selection makes it possible to find the most informative lines for an element, which are then used as input to the subsequent regression phase to predict the element concentration. Partial Least Squares (PLS) and Elastic Net showed the best predictive ability among the regression methods investigated, while Interval PLS (iPLS) and Iterative Predictor Weighting PLS (IPW-PLS) proved to be the best feature selection algorithms for this type of data. By applying these feature selection algorithms to the whole LIBS spectrum before regression with PLS or Elastic Net, it is possible to obtain accurate predictions in a robust way.
Ehlers, Rene. "Maximum likelihood estimation procedures for categorical data." Pretoria : [s.n.], 2002. http://upetd.up.ac.za/thesis/available/etd-07222005-124541.
Cai, Jianwen. "Generalized estimating equations for censored multivariate failure time data /." Thesis, Connect to this title online; UW restricted, 1992. http://hdl.handle.net/1773/9581.
Nothnagel, Carien. "Multivariate data analysis using spectroscopic data of fluorocarbon alcohol mixtures / Nothnagel, C." Thesis, North-West University, 2012. http://hdl.handle.net/10394/7064.
Thesis (M.Sc. (Chemistry))--North-West University, Potchefstroom Campus, 2012.
Ahmadi-Nedushan, Behrooz 1966. "Multivariate statistical analysis of monitoring data for concrete dams." Thesis, McGill University, 2002. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=82815.
Statistical models such as multiple linear regression and back-propagation neural networks have been used to estimate the response of individual instruments. Multiple linear regression models are of two kinds: (1) Hydro-Seasonal-Time (HST) models and (2) models that consider concrete temperatures as predictors.
Univariate, bivariate, and multivariate methods are proposed for the identification of anomalies in the instrumentation data. The source of these anomalies can be bad readings, faulty instruments, or changes in dam behavior.
The proposed methodologies are applied to three different dams, Idukki, Daniel Johnson and Chute-a-Caron, which are respectively an arch, a multiple-arch, and a gravity dam. Displacements, strains, flow rates, and crack openings of these three dams are analyzed.
This research also proposes various multivariate statistical analyses and artificial neural networks techniques to analyze dam monitoring data. One of these methods, Principal Component Analysis (PCA) is concerned with explaining the variance-covariance structure of a data set through a few linear combinations of the original variables. The general objectives are (1) data reduction and (2) data interpretation. Other multivariate analysis methods such as canonical correlation analysis, partial least squares and nonlinear principal component analysis are discussed. The advantages of methodologies for noise reduction, the reduction of number of variables that have to be monitored, the prediction of response parameters, and the identification of faulty readings are discussed. Results indicated that dam responses are generally correlated and that only a few principal components can summarize the behavior of a dam.
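To make the PCA description above concrete, the sketch below finds the first principal component of a small data set by power iteration on the sample covariance matrix, using only the Python standard library. It is an illustration under stated assumptions, not the method used in the thesis: the function names and the instrument readings are hypothetical, and a production analysis would normally use a full eigendecomposition from a numerical library.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of observations (one row per observation)."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    return [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
        for a in range(p)
    ]


def first_principal_component(cov, iterations=500):
    """Leading eigenvector and eigenvalue of `cov` by power iteration."""
    p = len(cov)
    # Non-uniform start vector; convergence still assumes it is not
    # orthogonal to the leading eigenvector.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            raise ValueError("start vector lies in the null space of cov")
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)
    )
    return v, eigenvalue


# Hypothetical readings from three correlated instruments (illustration only).
readings = [
    [1.0, 2.1, 0.9],
    [2.0, 3.9, 2.1],
    [3.0, 6.2, 2.9],
    [4.0, 7.8, 4.2],
    [5.0, 10.1, 5.0],
]
cov = covariance_matrix(readings)
component, variance = first_principal_component(cov)
total_variance = sum(cov[j][j] for j in range(len(cov)))
print(f"share of variance explained: {variance / total_variance:.3f}")
```

The share printed at the end is the leading eigenvalue divided by the trace of the covariance matrix, which is the sense in which a few components can summarize many correlated responses.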
Wang, Lianming. "Statistical analysis of multivariate interval-censored failure time data." Diss., Columbia, Mo. : University of Missouri-Columbia, 2006. http://hdl.handle.net/10355/4375.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on May 2, 2007). Vita. Includes bibliographical references.
Das, Mitali. "Motion within music : the analysis of multivariate MIDI data." Thesis, University of York, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.367466.
Chen, Man-Hua. "Statistical analysis of multivariate interval-censored failure time data." Diss., Columbia, Mo. : University of Missouri-Columbia, 2007. http://hdl.handle.net/10355/4776.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on March 6, 2009). Includes bibliographical references.
Edberg, Alexandra. "Monitoring Kraft Recovery Boiler Fouling by Multivariate Data Analysis." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230906.
This work concerns fouling in the recovery boiler at Montes del Plata, Uruguay. Multivariate data analysis was used to analyze the large amount of available data in order to investigate how different parameters affect the fouling problems. Principal Component Analysis (PCA) and Partial Least Square Projection (PLS) were used in this work. PCA was used to compare mean values between time periods with high and low fouling problems, while PLS was used to study the correlation between the variables and thereby indicate which parameters might be changed to improve the availability of the recovery boiler. The results show that the recovery boiler tends to have fouling problems that may depend on the distribution of air, on the black liquor pressure, or on the dry solids content of the black liquor. The results also show that multivariate data analysis is a useful tool for analyzing these types of fouling problems.
Sheppard, Therese. "Extending covariance structure analysis for multivariate and functional data." Thesis, University of Manchester, 2010. https://www.research.manchester.ac.uk/portal/en/theses/extending-covariance-structure-analysis-for-multivariate-and-functional-data(e2ad7f12-3783-48cf-b83c-0ca26ef77633).html.
陳志昌 and Chee-cheong Chan. "Compositional data analysis of voting patterns." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1993. http://hub.hku.hk/bib/B31977236.