Dissertations / Theses on the topic 'Partial least square analysis'
Moller, Jurgen Johann. "The implementation of noise addition partial least squares." Thesis, Stellenbosch : University of Stellenbosch, 2009. http://hdl.handle.net/10019.1/3362.
When determining the chemical composition of a specimen, traditional laboratory techniques are often both expensive and time consuming. It is therefore preferable to employ more cost effective spectroscopic techniques such as near infrared (NIR). Traditionally, the calibration problem has been solved by means of multiple linear regression to specify the model between X and Y. Traditional regression techniques, however, quickly fail when using spectroscopic data, as the number of wavelengths can easily be several hundred, often exceeding the number of chemical samples. This scenario, together with the high level of collinearity between wavelengths, will necessarily lead to singularity problems when calculating the regression coefficients. Ways of dealing with the collinearity problem include principal component regression (PCR), ridge regression (RR) and PLS regression. Both PCR and RR require a significant amount of computation when the number of variables is large. PLS overcomes the collinearity problem in a similar way to PCR, by modelling both the chemical and spectral data as functions of common latent variables. The quality of the employed reference method greatly impacts the coefficients of the regression model and, therefore, the quality of its predictions. With both X and Y subject to random error, the quality of the predictions of Y will be reduced with an increase in the level of noise. Previously conducted research focussed mainly on the effects of noise in X. This paper focuses on a method proposed by Dardenne and Fernández Pierna, called Noise Addition Partial Least Squares (NAPLS), that attempts to deal with the problem of poor reference values. Some aspects of the theory behind PCR, PLS and model selection are discussed. This is then followed by a discussion of the NAPLS algorithm. Both PLS and NAPLS are implemented on various datasets that arise in practice, in order to determine cases where NAPLS will be beneficial over conventional PLS.
For each dataset, specific attention is given to the analysis of outliers, influential values and the linearity between X and Y, using graphical techniques. Lastly, the performance of the NAPLS algorithm is evaluated for various
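The calibration setting described in this abstract (more wavelengths than samples, strong collinearity) can be illustrated with a minimal sketch of single-response PLS using NIPALS-style deflation. This is not the thesis's implementation and it does not implement NAPLS; the function names `pls1_fit` and `pls1_predict` and the synthetic "spectra" are assumptions made purely for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by successive deflation of X and y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))  # weight vectors
    P = np.zeros((n_vars, n_components))  # X loadings
    q = np.zeros(n_components)            # y loadings
    for a in range(n_components):
        w = Xk.T @ yk                     # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                        # latent score vector
        tt = t @ t
        P[:, a] = Xk.T @ t / tt
        q[a] = yk @ t / tt
        W[:, a] = w
        Xk = Xk - np.outer(t, P[:, a])    # deflate X
        yk = yk - q[a] * t                # deflate y
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in original X space
    return coef, x_mean, y_mean

def pls1_predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

# Hypothetical synthetic data: 30 samples, 200 collinear "wavelengths"
# driven by 3 latent factors (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_samples, n_wavelengths, n_latent = 30, 200, 3
T = rng.normal(size=(n_samples, n_latent))
L = rng.normal(size=(n_latent, n_wavelengths))
X = T @ L + 0.01 * rng.normal(size=(n_samples, n_wavelengths))
y = T @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=n_samples)

# X'X is singular here, so ordinary least squares has no unique solution.
rank_X = np.linalg.matrix_rank(X - X.mean(axis=0))

coef, x_mean, y_mean = pls1_fit(X, y, n_components=3)
y_hat = pls1_predict(X, coef, x_mean, y_mean)
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"rank of centred X: {rank_X} (of {n_wavelengths} columns)")
print(f"in-sample R^2 with 3 latent variables: {r2:.3f}")
```

The in-sample fit shown here says nothing about predictive quality; in practice the number of latent variables would be chosen by cross-validation or a similar model selection step.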
Krämer, Nicole. "Analysis of high dimensional data with partial least squares and boosting." [S.l.] : [s.n.], 2006. http://opus.kobv.de/tuberlin/volltexte/2007/1484.
Li, Siqing. "Kernel-based least-squares approximations: theories and applications." HKBU Institutional Repository, 2018. https://repository.hkbu.edu.hk/etd_oa/539.
Zhou, Yue. "Analysis of Additive Risk Model with High Dimensional Covariates Using Partial Least Squares." Digital Archive @ GSU, 2006. http://digitalarchive.gsu.edu/math_theses/6.
Wang, Hailun. "Some Conclusions of Statistical Analysis of the Spectropscopic Evaluation of Cervical Cancer." Digital Archive @ GSU, 2008. http://digitalarchive.gsu.edu/math_theses/58.
Edberg, Alexandra. "Monitoring Kraft Recovery Boiler Fouling by Multivariate Data Analysis." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230906.
This work concerns fouling in the recovery boiler at Montes del Plata, Uruguay. Multivariate data analysis was used to analyse the large amount of available data in order to investigate how different parameters affect the fouling problems. Principal Component Analysis (PCA) and Partial Least Squares projection (PLS) were used in this work. PCA was used to compare mean values between time periods with high and low fouling problems, while PLS was used to study the correlation between the variables and thereby indicate which parameters might be changed to improve the availability of the recovery boiler. The results show that the recovery boiler tends to have fouling problems that may depend on the distribution of air, on the black liquor pressure, or on the dry solids content of the black liquor. The results also show that multivariate data analysis is a useful tool for analysing these types of fouling problems.
Yue, Weiping. "Predicting the citation impact of clinical neurology journals using structural equation modeling with partial least squares." Awarded by: University of New South Wales. School of Biotechnology and Biomolecular Sciences, 2004. http://handle.unsw.edu.au/1959.4/20821.
Plard, Jérôme. "Apport de la chimiométrie et des plans d’expériences pour l’évaluation de la qualité de l’huile d’olive au cours de différents processus de vieillissement." Thesis, Aix-Marseille, 2014. http://www.theses.fr/2014AIXM4315/document.
Olive oil is an important component of the Mediterranean diet. As oil ages, it deteriorates and loses its properties. It is therefore important to know how the composition of the oil evolves with storage and manufacturing conditions. This monitoring was carried out on two oils produced by different processes: green fruity oil, obtained from olives harvested before maturity, and black fruity oil, obtained from olives harvested at maturity and fermented for a few days under controlled conditions. To reach an advanced state of aging quickly, both oils were artificially aged by a thermal process (heating to 180 °C under a supply of O2) and by a photochemical process (under a UV lamp and a supply of O2). These aging treatments were performed on different volumes to determine the impact of the surface/weight ratio. In parallel, samples of both oils were stored for 24 months under different storage conditions determined using an experimental design. The parameters that most affect the conservation of olive oil are oxygen, light and temperature. Their influence was determined by monitoring key quality criteria. The responses of the experimental design helped to highlight the interactions between these parameters. Analysing the oil composition and all the quality criteria requires large amounts of solvent and a great deal of time. To overcome these drawbacks, chemometric models were built to determine these criteria from the near- and mid-infrared spectra of the samples. Natural aging is far less advanced than accelerated aging, so predictive models were established separately from the results of natural aging and of accelerated aging
Mat, Roni Mohd Saiyidi. "An analysis of insider dysfunctional behavours in an accounting information system environment." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2015. https://ro.ecu.edu.au/theses/1640.
Patten, Kyle. "An analysis of the modeling used to determine customer satisfaction." Thesis, Kansas State University, 2014. http://hdl.handle.net/2097/35765.
Department of Agricultural Economics
Kevin Dhuyvetter
Many companies use surveys to establish customer satisfaction metrics. This OEM has been using surveys to analyze customer satisfaction with their products, services, and distribution channel for several decades. Satisfaction metrics are established for the brand, product, and channel partners. The product metric is derived from a survey question asking customers how satisfied they are with the product. Subsequent questions ask about satisfaction with specific functional areas of the product. It is common practice to use Partial Least Squares (PLS) regression analysis to evaluate what impact the functional area questions have on the overall satisfaction question. The model results are used to understand which areas of the machine should be focused on to improve customers’ experiences with the machine. These results are compared to other data sources such as warranty, field reports, customer focus groups, etc. The results from these models are sometimes questioned based on what common intuition would suggest. Typically the top three drivers of the product metric are understandable, but there are often one or two key areas that do not make logical sense. The objective of this thesis was to understand whether PLS modeling is appropriate given the nature of customer survey data. Models were estimated using existing survey data on a specific model in the tractor product line. PLS models assume data are linear with no bounds. This in itself likely makes this type of model inappropriate for analyzing customer survey data. Responses are bounded on an 11-point scale from 0 to 10; however, the PLS model, being unbounded, assumes there can be a score under 0 or over 10. The model also assumes a linear slope, which would imply that each one-point step in a covariate answer from 0 to 10 has the same effect on the response variable. This research has found that the effect of each covariate answer is in fact non-linear.
For example, a customer answering a 2 to quality of manufacturing workmanship has a different impact on the overall satisfaction score than a customer who answers 8. Finally, this research discovered that the PLS models produce negative coefficients of significant value that are not reported to the enterprise. Binary and ordered logistic (logit) models were estimated as an alternative to PLS. Logistic models are non-linear and are commonly used to evaluate bounded data. Response data were separated into two groups based on Net Promoter Score (NPS) Methodology (Reicheld 2006). Using the NPS methodology, 0-6 scores are considered detractors, 7-8 scores are considered passives, and 9-10 scores are considered promoters. The logistic models demonstrate that the top two drivers to customer satisfaction scores are still quality of manufacturing workmanship and reliability/operational availability (similar to results of the PLS model). The unresolved problems question on the survey was included in the models and demonstrated that the predicted probability of a customer being a promoter is much higher in both binary and ordered logit models if no unresolved problems exist. Finally, the model found engine oil consumption remained negative and is statistically significant suggesting that even with the alternative modeling approach there still may be data issues related to the survey. It is recommended that the OEM implement logistic modeling for analyzing customer survey data. It is also recommended that a new survey design be constructed to eliminate issues with correlated data that can lead to spurious and unexplainable results.
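The score grouping described in this abstract (0-6 detractors, 7-8 passives, 9-10 promoters) can be sketched as follows. The function names and the sample responses are hypothetical; the summary score uses the standard Net Promoter Score definition (percentage of promoters minus percentage of detractors), which the abstract itself does not spell out. A binary logit model like the one mentioned above could then be fitted on a promoter/non-promoter indicator derived from these categories.

```python
def nps_category(score):
    """Map a 0-10 survey answer to its Net Promoter Score group."""
    if not 0 <= score <= 10:
        raise ValueError("survey scores must lie on the 0-10 scale")
    if score <= 6:
        return "detractor"
    if score <= 8:
        return "passive"
    return "promoter"

def net_promoter_score(scores):
    """Percentage of promoters minus percentage of detractors."""
    categories = [nps_category(s) for s in scores]
    promoters = categories.count("promoter")
    detractors = categories.count("detractor")
    return 100.0 * (promoters - detractors) / len(categories)

# Hypothetical responses, for illustration only.
responses = [10, 9, 9, 8, 3]
groups = [nps_category(s) for s in responses]
is_promoter = [int(g == "promoter") for g in groups]  # binary outcome for a logit model
print(groups)
print(net_promoter_score(responses))
```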
Nguyen, Nga. "Multivariate analysis and GIS in generating vulnerability map of acid sulfate soils." Thesis, KTH, Mark- och vattenteknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-170472.
Velayutham, Sunitadevi. "The influence of classroom environment on students’ motivation and self-regulation." Thesis, Curtin University, 2012. http://hdl.handle.net/20.500.11937/1717.
Ozer, Semih. "Analysis Of Critical Factors Affecting Customer Satisfaction In Modular Kitchen Sector." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/2/12610593/index.pdf.
Sinioja, Tim. "Source characterization of soils contaminated with Polycyclic Aromatic Compounds (PACs) by use of Partial Least Squares Discriminant Analysis (PLS-DA)." Thesis, Örebro universitet, Institutionen för naturvetenskap och teknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-64627.
Hassling, Andreas, and Simon Flink. "SYSTEM IDENTIFICATION OF A WASTE-FIRED CFB BOILER : Using Principal Component Analysis (PCA) and Partial Least Squares Regression modeling (PLS-R)." Thesis, Mälardalens högskola, Akademin för ekonomi, samhälle och teknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-34979.
Song, Hyojong. "An Exploratory Study of Macro-Social Correlates of Online Property Crime." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6954.
Rech, André Machado. "Caracterização de bebidas à base de soja empregando espectroscopia no infravermelho médio com transformada de Fourier por reflexão total atenuada e quimiometria." Biblioteca Digital de Teses e Dissertações da UFRGS, 2018. http://hdl.handle.net/10183/180663.
In this work, strategies were studied for the characterization of soy-based beverages (SBB) by means of Fourier transform infrared spectroscopy with attenuated total reflectance (FTIR-ATR). Twenty commercial SBB samples were used, covering 7 different flavors and 7 different brands. The properties studied were the contents of total sugar, reducing sugars, non-reducing sugars, and total protein. Multivariate regression models were built by partial least squares (PLS), and interval partial least squares (iPLS) and synergy interval partial least squares (siPLS) were evaluated for variable selection. Variable selection by siPLS gave the best results for the models built. Among the properties evaluated, total sugar content gave models with low cross-validation and prediction errors (RMSECV and RMSEP) and determination coefficients (R2cv and R2prev) close to one. For total protein, the models gave promising results: they also had low cross-validation and prediction errors (RMSECV and RMSEP) and determination coefficients (R2cv and R2prev) close to one, even though the actual samples did not show ideal variability in protein concentration. For reducing and non-reducing sugars, the regression models did not give good results. The proposed methodology therefore shows potential for routine simultaneous determination of total sugar and protein, in view of the requirements for nutritional information on SBB labeling, with the added advantages of infrared spectroscopy: fast analysis, high analytical frequency, small sample amounts, low cost, non-destructive measurement, and environmental friendliness.
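The figures of merit named in this abstract (RMSECV and a cross-validated determination coefficient) can be sketched as below. As a stand-in for the PLS models of the thesis, the example uses a univariate straight-line calibration with leave-one-out cross-validation; the helper names and the six data points are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def loo_predictions(x, y):
    """Leave-one-out predictions from a straight-line fit (stand-in for PLS)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i           # hold out sample i
        slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
        preds[i] = slope * x[i] + intercept     # predict the held-out sample
    return preds

# Hypothetical calibration data: instrument signal vs. reference content.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
reference = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

cv_pred = loo_predictions(signal, reference)
rmsecv = rmse(reference, cv_pred)
r2cv = r_squared(reference, cv_pred)
print(f"RMSECV = {rmsecv:.3f}, R2cv = {r2cv:.4f}")
```

RMSEP would be computed with the same `rmse` function, but on an independent prediction set that played no part in building the model.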
Johnson, Mikael. "Acoustic Emission in Composite Laminates - Numerical Simulations and Experimental Characterization." Doctoral thesis, KTH, Solid Mechanics, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3452.
Full textLe, floch Edith. "Méthodes multivariées pour l'analyse jointe de données de neuroimagerie et de génétique." Thesis, Paris 11, 2012. http://www.theses.fr/2012PA112214/document.
Brain imaging is increasingly recognised as an interesting intermediate phenotype for understanding the complex path between genetics and behavioural or clinical phenotypes. In this context, a first goal is to propose methods to identify the part of genetic variability that explains some neuroimaging variability. Classical univariate approaches often ignore the potential joint effects that may exist between genes or the potential covariations between brain regions. Our first contribution is to improve the sensitivity of the univariate approach by taking advantage of the multivariate nature of the genetic data in a local way. Indeed, we adapt cluster-inference techniques from neuroimaging to Single Nucleotide Polymorphism (SNP) data, by looking for 1D clusters of adjacent SNPs associated with the same imaging phenotype. Then, we push the concept of clusters further and combine voxel clusters and SNP clusters, using a simple 4D cluster test that jointly detects brain and genome regions with high associations. We obtain promising preliminary results on both simulated and real datasets. Our second contribution is to investigate exploratory multivariate methods to increase the detection power of imaging genetics studies, by accounting for the potential multivariate nature of the associations, at a longer range, on both the imaging and the genetics sides. Recently, Partial Least Squares (PLS) regression and Canonical Correlation Analysis (CCA) have been proposed to analyse genetic and transcriptomic data. Here, we propose to transpose this idea to the genetics vs. imaging context. Moreover, we investigate the use of different regularisation strategies and dimension reduction techniques combined with PLS or CCA, to address the overfitting issues due to the very high dimensionality of the data. We propose a comparison study of the different strategies on both a simulated dataset and a real fMRI and SNP dataset.
Univariate selection appears to be necessary to reduce the dimensionality. However, the generalisable and significant association uncovered on the real dataset by the two-step approach combining univariate filtering and L1-regularised PLS suggests that discovering meaningful imaging genetics associations calls for a multivariate approach
Venaik, Sunil. "A Model of Global Marketing in Multinational Firms: An Emprirical Investigation." Awarded by: University of New South Wales. AGSM, 1999. http://handle.unsw.edu.au/1959.4/17479.
Loftus, John. "On the development of control systems technology for fermentation processes." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/on-the-development-of-control-systems-technology-for-fermentation-processes(61955790-a48b-4703-8942-bfe47a38a6c2).html.
Yoon, Jisu [author], Tatyana Krivobokova [academic advisor], Stephan Klasen [academic advisor], and Axel Dreher [academic advisor]. "Partial Least Squares and Principal Component Analysis with Non-metric Variables for Composite Indices / Jisu Yoon. Reviewers: Tatyana Krivobokova ; Stephan Klasen ; Axel Dreher. Advisor: Tatyana Krivobokova." Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2015. http://d-nb.info/1076160972/34.
Abrahamsson, Sandra. "Utformning av mjukvarusensorer för avloppsvatten med multivariata analysmetoder." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-207863.
Studies of real processes are based on measured data. In the past, the amount of available data was very limited. With modern technology, however, the information that can be obtained from measurements is more readily available, which considerably improves the possibility of understanding and describing processes. Multivariate analysis is often used when large datasets that contain many variables are evaluated. In this thesis, the multivariate analysis methods PCA (principal component analysis) and PLS (partial least squares projection to latent structures) have been applied to wastewater data collected at Hammarby Sjöstadsverk WWTP (wastewater treatment plant). Wastewater treatment plants are required to monitor and control their systems in order to reduce their environmental impact. With improved knowledge of the processes involved, the impact can be significantly decreased without affecting plant efficiency. Several variables are easy to measure directly in the water, while others require extensive laboratory analysis. Some of the parameters in the latter category are the contents of phosphorus and nitrogen in the water, both of which are important for the wastewater treatment results. The concentrations of these substances in the inlet water vary during the day and are difficult to monitor properly. The purpose of this study was to investigate whether it is possible, from the more easily measured variables, to obtain information on those that require more extensive analysis. This was done by using multivariate analysis to create models attempting to explain the variation in these variables. The models are commonly referred to as soft sensors, since they do not actually use any physical sensor to measure the relevant variable. Data were collected from March 11 to March 15, 2013 in the wastewater at different stages of the treatment process, and a number of multivariate models were created.
The result shows that it is possible to obtain information about the variables with PLS models based on easy-to-measure variables. The best created model was the one explaining the concentration of nitrogen in the inlet water.
Vitale, Raffaele. "Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation." Doctoral thesis, Universitat Politècnica de València, 2017. http://hdl.handle.net/10251/90442.
This doctoral thesis, conceived primarily to support and strengthen the relationship between academia and industry, was developed in collaboration with Shell Global Solutions (Amsterdam, The Netherlands) in an effort to apply and possibly extend well-established latent-variable-based approaches (namely Principal Component Analysis, PCA; Partial Least Squares regression, PLS; and Partial Least Squares Discriminant Analysis, PLSDA) to solve complex problems not only in process improvement and optimisation, but also in the wider setting of multivariate data analysis. To this end, every chapter proposes new, efficient algorithmic solutions for disparate tasks, from calibration transfer in spectroscopy to real-time modelling of data streams. The manuscript is divided into the following six parts, each focused on a different topic of interest: Part I - Preface, which summarises this research work and gives its main objectives and justifications together with a brief introduction to PCA, PLS and PLSDA; Part II - On kernel-based extensions of PCA, PLS and PLSDA, which presents the potential of kernel techniques, possibly coupled with specific variants of the recently rediscovered pseudo-sample projection formulated by the English statistician John C. Gower, and compares their performance with more classical methodologies in four applications to different scenarios: segmentation of Red-Green-Blue (RGB) images, discrimination and monitoring of batch processes, and analysis of mixture designs of experiments; Part III - On the selection of the number of factors in PCA by permutation testing, which provides an extensive guide to selecting PCA components by permutation testing and a complete illustration of an original algorithmic procedure implemented for that purpose; Part IV - On modelling common and distinctive sources of variability in multi-set data analysis, which discusses several practical aspects of the analysis of common and distinctive components of two data blocks (carried out by methods such as Simultaneous Component Analysis, SCA; DIStinctive and COmmon Simultaneous Component Analysis, DISCO-SCA; Adapted Generalised Singular Value Decomposition, Adapted GSVD; ECO-POWER; Canonical Correlation Analysis, CCA; and 2-block Orthogonal Projections to Latent Structures, O2PLS). It also presents a new computational strategy for determining the number of common factors underlying two data matrices that share the same row or column dimension, and two novel approaches to calibration transfer between near-infrared spectrometers; Part V - On the on-the-fly processing and modelling of high-dimensional data streams, which designs the On-The-Fly Processing (OTFP) tool, a new system for the rational handling of multi-channel measurements recorded in real time; Part VI - Epilogue, which presents the final conclusions, outlines future perspectives, and includes the annexes.
Vitale, R. (2017). Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/90442
Hennerdal, Aron. "Investigation of multivariate prediction methods for the analysis of biomarker data." Thesis, Linköping University, The Department of Physics, Chemistry and Biology, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-5889.
The paper describes predictive modelling of biomarker data stemming from patients suffering from multiple sclerosis. Improvements of multivariate analyses of the data are investigated with the goal of increasing the capability to assign samples to correct subgroups from the data alone.
The effects of different preceding scalings of the data are investigated and combinations of multivariate modelling methods and variable selection methods are evaluated. Attempts at merging the predictive capabilities of the method combinations through voting-procedures are made. A technique for improving the result of PLS-modelling, called bagging, is evaluated.
The best multivariate analysis methods among those tried are found to be Partial least squares (PLS) and Support vector machines (SVM). It is concluded that the scaling has little effect on the prediction performance for most methods. The method combinations have interesting properties: the default variable selections of the multivariate methods are not always the best. Bagging improves performance, but at a high cost. No reasons for drastically changing the work flows of the biomarker data analysis are found, but slight improvements are possible. Further research is needed.
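The bagging idea mentioned in this abstract (bootstrap aggregating) can be sketched as follows: the same model is refitted on many bootstrap resamples of the training set and the predictions are averaged. For brevity, ordinary least squares is swapped in as the base learner instead of PLS; the function names and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept (stand-in base learner, not PLS)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def bagged_predict(X_train, y_train, X_new, n_models=50, seed=0):
    """Average the predictions of models fitted on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = np.empty((n_models, len(X_new)))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)        # sample n rows with replacement
        coef = fit_ols(X_train[idx], y_train[idx])
        preds[m] = predict_ols(coef, X_new)
    return preds.mean(axis=0)

# Hypothetical data: y = 1 + 2*x1 - x2 plus a little noise.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(40, 2))
y_train = 1.0 + 2.0 * X_train[:, 0] - X_train[:, 1] + 0.1 * rng.normal(size=40)
X_new = np.array([[0.0, 0.0], [1.0, -1.0]])     # noise-free targets: 1.0 and 4.0

bagged = bagged_predict(X_train, y_train, X_new)
print(bagged)
```

The cost noted in the abstract is visible in the loop: `n_models` separate fits are needed for every prediction task, so the computational effort grows roughly linearly with the number of bootstrap models.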
Durif, Ghislain. "Multivariate analysis of high-throughput sequencing data." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSE1334/document.
The statistical analysis of Next-Generation Sequencing data raises many computational challenges regarding modeling and inference, especially because of the high dimensionality of genomic data. The research work in this manuscript concerns hybrid dimension reduction methods that rely on both compression (representation of the data into a lower dimensional space) and variable selection. Developments are made concerning: the sparse Partial Least Squares (PLS) regression framework for supervised classification, and the sparse matrix factorization framework for unsupervised exploration. In both situations, our main purpose will be to focus on the reconstruction and visualization of the data. First, we will present a new sparse PLS approach, based on an adaptive sparsity-inducing penalty, that is suitable for logistic regression to predict the label of a discrete outcome. For instance, such a method will be used for prediction (fate of patients or specific type of unidentified single cells) based on gene expression profiles. The main issue in such framework is to account for the response to discard irrelevant variables. We will highlight the direct link between the derivation of the algorithms and the reliability of the results. Then, motivated by questions regarding single-cell data analysis, we propose a flexible model-based approach for the factorization of count matrices, that accounts for over-dispersion as well as zero-inflation (both characteristic of single-cell data), for which we derive an estimation procedure based on variational inference. In this scheme, we consider probabilistic variable selection based on a spike-and-slab model suitable for count data. The interest of our procedure for data reconstruction, visualization and clustering will be illustrated by simulation experiments and by preliminary results on single-cell data analysis.
All proposed methods were implemented into two R-packages "plsgenomics" and "CMF" based on high performance computing
Bringmann, Philipp. "Adaptive least-squares finite element method with optimal convergence rates." Doctoral thesis, Humboldt-Universität zu Berlin, 2021. http://dx.doi.org/10.18452/22350.
The least-squares finite element methods (LSFEMs) are based on the minimisation of the least-squares functional consisting of the squared norms of the residuals of first-order systems of partial differential equations. This functional provides a reliable and efficient built-in a posteriori error estimator and allows for adaptive mesh-refinement. The established convergence analysis with rates for adaptive algorithms, as summarised in the axiomatic framework by Carstensen, Feischl, Page, and Praetorius (Comp. Math. Appl., 67(6), 2014), fails for two reasons. First, the least-squares estimator lacks prefactors in terms of the mesh-size, which seemingly prevents a reduction under mesh-refinement. Second, the first-order divergence LSFEMs measure the flux or stress errors in the H(div) norm and, thus, involve a data resolution error of the right-hand side f. These difficulties led to a twofold paradigm shift in the convergence analysis with rates for adaptive LSFEMs in Carstensen and Park (SIAM J. Numer. Anal., 53(1), 2015) for the lowest-order discretisation of the 2D Poisson model problem with homogeneous Dirichlet boundary conditions. Accordingly, a novel explicit residual-based a posteriori error estimator accomplishes the reduction property. Furthermore, a separate marking strategy in the adaptive algorithm ensures sufficient data resolution. This thesis presents the generalisation of these techniques to three linear model problems, namely, the Poisson problem, the Stokes equations, and the linear elasticity problem. It verifies the axioms of adaptivity with separate marking by Carstensen and Rabus (SIAM J. Numer. Anal., 55(6), 2017) in three spatial dimensions. The analysis covers discretisations with arbitrary polynomial degree and inhomogeneous Dirichlet and Neumann boundary conditions. Numerical experiments confirm the theoretically proven optimal convergence rates of the h-adaptive algorithm.
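For orientation, the least-squares functional mentioned in this abstract can be written out for the Poisson model problem. The notation below ($\sigma$ for the flux, $u$ for the potential, $\mathcal{T}$ for the mesh, $\eta$ for the estimator) follows the common first-order div formulation and is an assumption here; the abstract does not fix a notation, and the thesis may use different symbols or sign conventions.

```latex
% Poisson model problem  -\Delta u = f  in \Omega,  u = 0  on \partial\Omega,
% rewritten as the first-order system  \sigma = \nabla u,  f + \operatorname{div}\sigma = 0.
\[
  LS(f;\sigma,u)
  := \| f + \operatorname{div}\sigma \|_{L^2(\Omega)}^2
   + \| \sigma - \nabla u \|_{L^2(\Omega)}^2 .
\]
% The discrete minimiser (\sigma_h, u_h) yields the built-in estimator by
% localising the same functional to each element T of the mesh \mathcal{T}:
\[
  \eta^2(T)
  := \| f + \operatorname{div}\sigma_h \|_{L^2(T)}^2
   + \| \sigma_h - \nabla u_h \|_{L^2(T)}^2 ,
  \qquad
  LS(f;\sigma_h,u_h) = \sum_{T \in \mathcal{T}} \eta^2(T) .
\]
```

Neither term carries a power of the local mesh-size as a prefactor, which is the first of the two obstacles to the standard convergence analysis named in the abstract.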
Muratori, Giacomo. "Application of multivariate statistical methods to the modelling of a flue gas treatment stage in a waste-to-energy plant." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/17262/.
Lopez, Montero Eduardo. "Use of multivariate statistical methods for control of chemical batch processes." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/use-of-multivariate-statistical-methods-for-control-of-chemical-batch-processes(6cf45624-2388-4e85-b4c6-99503547ad06).html.
Aloglu, Ahmet Kemal. "Characterization of Foods by Chromatographic and Spectroscopic Methods Coupled to Chemometrics." Ohio University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou152293360889416.
Yan, Lipeng. "The application of multivariate statistical analysis and optimization to batch processes." Thesis, University of Manchester, 2015. https://www.research.manchester.ac.uk/portal/en/theses/the-application-of-multivariate-statistical-analysis-and-optimization-to-batch-processes(e6dbe45d-94bb-4e84-a12f-542876af54f5).html.
Appiagyei, Kwadjo. "Evaluating integrated reporting quality, its determinants and its effect on sustainability in a mandatory reporting environment." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2020. https://ro.ecu.edu.au/theses/2378.
Hellberg, Sven. "A multivariate approach to QSAR." Doctoral thesis, Umeå universitet, Kemiska institutionen, 1986. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-100713.
Diss. (summary) Umeå: Umeå University, 1986, with 8 accompanying papers
Padalkar, Mugdha Vijay. "DEVELOPMENT OF NON-DESTRUCTIVE INFRARED FIBER OPTIC METHOD FOR ASSESSMENT OF LIGAMENT AND TENDON COMPOSITION." Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/378679.
Ph.D.
More than 350,000 anterior cruciate ligament (ACL) injuries occur every year in the United States. A torn ACL is typically replaced with an allograft or autograft tendon (patellar, quadriceps or hamstring), with the choice of tissue generally dictated by surgeon preference. Despite the number of ACL reconstructions performed every year, the process of ligamentization, transformation of a tendon graft to a healthy functional ligament, is poorly understood. Previous research studies have relied on mechanical, biochemical and histological studies. However, these methods are destructive. Clinically, magnetic resonance imaging (MRI) is the most common method of graft evaluation, but it lacks adequate resolution and molecular specificity. There is a need for objective methodology to study the ligament repair process that would ideally be non- or minimally invasive. Development of such a method could lead to a better understanding of the effects of therapeutic interventions and rehabilitation protocols in animal models of ligamentization, and ultimately, in clinical studies. Fourier transform infrared (FT-IR) spectroscopy is a technique sensitive to molecular structure and composition in tissues. FT-IR fiber optic probes combined with arthroscopy could prove to be an important tool where minimally invasive tissue assessment is required, such as assessment of graft composition during the ligamentization process. Spectroscopic methods have been used to differentiate normal and diseased connective tissues, but have not been applied to investigate ligamentization, or to investigate differences in tendons and ligaments. In the proposed studies, we hypothesize that infrared spectroscopy can provide molecular information about the compositional differences between tendons and ligaments, which can serve as a foundation to non-destructively monitor the tissue transformation that occurs during ligamentization.
Temple University--Theses
Skoglund, Ingegerd. "Algorithms for a Partially Regularized Least Squares Problem." Licentiate thesis, Linköping : Linköpings universitet, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-8784.
Champion, Patrick D. "An analysis of Fourier transform infrared spectroscopy data to predict herpes simplex virus 1 infection." Atlanta, Ga. : Georgia State University, 2008. http://digitalarchive.gsu.edu/math_theses/62/.
Title from title page (Digital Archive@GSU, viewed July 29, 2010). Yu-Sheng Hsu, committee chair; Gary Hastings, Jun Han, committee members. Includes bibliographical references (p. 41).
Bergfors, Linus. "Explorative Multivariate Data Analysis of the Klinthagen Limestone Quarry Data." Thesis, Uppsala University, Department of Information Technology, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-122575.
Current quarry planning at Klinthagen is rough, which provides an opportunity to introduce new methods to improve quarry yield and efficiency. Nordkalk AB, active at Klinthagen, wishes to start a new quarry at a nearby location. To exploit future quarries efficiently and ensure production quality, multivariate statistics may help gather important information.
In this thesis, the potential of two multivariate statistical approaches, Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression, was evaluated on the Klinthagen bore data. PCA data were spatially interpolated by Kriging, which was also evaluated and compared to inverse distance weighting (IDW) interpolation.
Principal component analysis supplied an overview of the relations between the variables, but also visualised the problems involved in linking geophysical data to geochemical data and the inaccuracy introduced by poor data quality.
The PLS regression further emphasised the geochemical-geophysical problems, but also showed good precision when applied to strictly geochemical data.
Spatial interpolation by Kriging did not result in significantly better approximations than the less complex control interpolation by IDW.
In order to improve the information content of the data when modelled by PCA, a more discrete sampling method would be advisable. The data quality may cause trouble, though with today's sampling technique it was considered to be of less consequence.
Faced with a single geophysical component to be predicted from chemical variables, further geophysical data are needed to complement the existing data in order to achieve satisfactory PLS models.
The stratified rock composition caused trouble in spatial interpolation. Further investigations should be performed to develop more suitable interpolation techniques.
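The abstract uses IDW interpolation as the simpler control method against Kriging. A minimal sketch of inverse distance weighting in plain Python is given here for orientation only; the function name `idw_interpolate`, the `power` parameter and the borehole values are hypothetical and are not taken from the thesis.

```python
import math


def idw_interpolate(samples, query, power=2.0):
    """Estimate a value at `query` = (x, y) from (x, y, value) samples.

    Each sample is weighted by 1 / distance**power, so nearby samples
    dominate the estimate. All names here are illustrative assumptions.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            # The query coincides with a sample: return its value directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Invented borehole positions and measured values, for illustration only.
boreholes = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]
estimate = idw_interpolate(boreholes, (1.0, 0.0))
```

Unlike Kriging, this scheme uses no model of spatial correlation, which is why it serves as a less complex baseline.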
Liggett, Rachel Esther. "Multivariate Approaches for Relating Consumer Preference to Sensory Characteristics." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1282868174.
Nothnagel, Carien. "Multivariate data analysis using spectroscopic data of fluorocarbon alcohol mixtures / Nothnagel, C." Thesis, North-West University, 2012. http://hdl.handle.net/10394/7064.
Thesis (M.Sc. (Chemistry))--North-West University, Potchefstroom Campus, 2012.
Flaxman, Teresa. "Neuromuscular Strategies for Regulating Knee Joint Moments in Healthy and Injured Populations." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36102.
Alves, Evandro Roberto. "Sistemas de análises químicas em fluxo explorando multi-impulsão, interface única ou quimiometria." Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/64/64135/tde-14052010-092516/.
Multi-pumping flow systems (MPFS) have as a unique feature the use of solenoid pumps as fluid propelling devices, which deliver pulsed flows. This flow regime was evaluated in order to improve mixing conditions between the involved solutions, heat transfer and gas diffusion. The association of chemometric methods of analysis with MPFS was demonstrated in the spectrophotometric determination of glucose, fructose and glycerol in musts and sugar cane juices. The method involved metaperiodate oxidation of the carbohydrates followed by oxidation of iodide by the remaining metaperiodate, yielding the [I3-] complex, which was monitored spectrophotometrically. Data treatment involved multivariate calibration relying on the PLS algorithm, and results were in agreement with liquid anion chromatography with pulsed amperometric detection. The proposed system is simple and rugged, allowing 120 samples to be run per hour. The pulsed flow led to enhanced heat transfer and gas diffusion, in view of the enhanced radial mass transport. These aspects were verified in the spectrophotometric determination of total reducing sugars (TRS) and ethanol. The proposed MPFS for TRS determination involved in-line hydrolysis of sucrose and alkaline degradation of the carbohydrates. The intrinsic characteristics of pulsed flow allowed the use of lower temperatures in bath thermostatization during the hydrolysis and degradation steps, as well as a lower alkalinity. The MPFS for spectrophotometric determination of ethanol, involving diffusion towards an acceptor stream, reduction of Cr(VI) to Cr(III) under acidic conditions, and Cr(III) monitoring, proved to be efficient and amenable to analytical procedures involving gas diffusion. After optimization of the main parameters, the system was compared with a multicommuted flow system (MCFA) that exploits laminar flow. Better analytical results were obtained with the proposed system, which demonstrated fair sensitivity.
Regarding flow systems exploiting a single reaction interface (SIFA), their potential was demonstrated by implementing analytical procedures for simultaneous determinations without requiring reconfiguration of the flow manifold. In the proposed system the optimization step was simplified, and the approach was evaluated for the spectrophotometric determination of aluminum, total iron and phosphate. The system has a simple configuration and allows 130, 140 and 90 samples to be run per hour for aluminum, total iron and phosphate, respectively.
Wu, MeiMei. "Investigating the adoption of banking services delivered over remote channels : the case of Chinese Internet banking customers." Thesis, Loughborough University, 2012. https://dspace.lboro.ac.uk/2134/9387.
Andersson, David. "Multivariate design of molecular docking experiments : An investigation of protein-ligand interactions." Doctoral thesis, Umeå universitet, Kemiska institutionen, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-35736.
Larsson, Daniel. "Multivariat dataanalys för att undersöka skillnader i undervisnings- och bedömningspraxis i kursen kemi 2." Thesis, Linnéuniversitetet, Institutionen för didaktik och lärares praktik (DLP), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-70394.
Ajib, Budour. "Contribution à la modélisation de la qualité de l'orge et du malt pour la maîtrise du procédé de maltage." Thesis, Université de Lorraine, 2013. http://www.theses.fr/2013LORR0315/document.
In a continuously growing market, and in order to meet brewers' needs for high-quality malt, control of the malting process is a great challenge. Malt quality is highly dependent on the operating conditions of the malting process, especially the steeping conditions, but also on the quality of the raw material, barley. In this study, we established polynomial models that relate the operating conditions to the malt quality. These models were coupled with our genetic algorithms to determine the optimal steeping conditions, either to obtain a targeted malt quality (friability) or to allow malting at low water content while maintaining acceptable malt quality (to reduce water consumption and control the environmental costs of malt production). However, the variability of the raw material is a limiting factor for our approach. The established models are very sensitive to the species (spring and winter barley) and to the barley variety, and they are especially dependent on the crop year. Variations in crop properties from one year to another are poorly characterized and are not incorporated in our models. They thus prevent us from capitalizing on experimental information over time. Some structural properties of barley (porosity, hardness) were considered as new factors to better characterize barley, but they did not explain the observed variations. To characterize barley, 394 samples from three crop years (2009, 2010 and 2011) were analysed by MIR spectroscopy. PCA confirmed the significant effect of crop year, species, variety and sometimes place of harvest on the properties of barley. A PLS regression allowed, for some years and for some species, prediction of the protein and beta-glucan contents of barley from MIR spectra.
These results still face product variability; however, the new PLS models are very promising and could be exploited to implement control strategies in the malting process using MIR spectroscopic measurements.
Tano, Kent. "Multivariate modelling and monitoring of mineral processes using partial least square regression." Licentiate thesis, Luleå tekniska universitet, Institutionen för samhällsbyggnad och naturresurser, 1996. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-16872.
Mohiddin, Syed B. "Development of novel unsupervised and supervised informatics methods for drug discovery applications." The Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=osu1138385657.
Fortes, Paula Regina. "Calibração multivariada e cinética diferencial em sistemas de análises em fluxo com detecção espectrofotométrica." Universidade de São Paulo, 2006. http://www.teses.usp.br/teses/disponiveis/64/64132/tde-14082006-120124/.
Differential kinetic analysis can be implemented in a flow system analyser, and this was demonstrated by designing an improved spectrophotometric catalytic determination of iron and vanadium in Fe-V alloys. The method relied on the influence of Fe2+ and VO2+ on the rate of iodide oxidation by Cr2O7 under acidic conditions; therefore the Jones reductor was needed. To this end, a flow injection system (FIA) and a multi-pumping flow system (MPFS) were dimensioned and evaluated. In both systems, the alloy solution was inserted into an acidic KI solution that also acted as carrier stream, and a dichromate solution was added by confluence. Successive measurements were performed during sample passage through the detector, each one related to a different yet reproducible condition for reaction development. Data treatment involved multivariate calibration by the PLS algorithm. The FIA system was less suited to multi-parametric determination, as the laminar flow regime could not provide suitable kinetic information. On the other hand, the MPFS demonstrated that pulsed flow led to enhanced figures of merit due to the chaotic movement of its fluid elements. The proposed MPFS is very simple and rugged, allowing 50 samples to be run per hour, with a consumption of 48 mg KI per determination. The first two latent variables carry ca. 94% of the analytical information, indicating that the intrinsic dimensionality of the data set is two. Results are in agreement with inductively coupled argon plasma optical emission spectrometry.
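The abstract above relies on multivariate calibration by the PLS algorithm and on latent variables without spelling either out. As a rough illustration only, the following sketch implements single-response PLS (PLS1) in the NIPALS style using plain Python. The function names `fit_pls1` and `predict_pls1` and the toy data are hypothetical; this is not the software or the data used in the thesis.

```python
import math


def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS-style deflation; returns means and components."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Work on mean-centred copies of the data.
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    f = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X-space of maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in y that X can explain
        w = [v / norm for v in w]
        # Scores (latent variable), X-loadings and y-loading.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict y for one sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in components:
        score = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * score
        e = [e[j] - score * load[j] for j in range(len(e))]
    return y_hat


# Toy data in which y = 2*x1 + x2 exactly (invented for illustration).
X_train = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y_train = [2.0 * a + b for a, b in X_train]
model = fit_pls1(X_train, y_train, n_components=2)
prediction = predict_pls1(model, [6.0, 7.0])
```

With as many components as the rank of the centred data, PLS1 coincides with ordinary least squares; in practice far fewer latent variables than measured channels are retained, which is the sense in which a data set can have an intrinsic dimensionality of two.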
Sasaki, Milton Katsumi. "Projeto e desenvolvimento de um sistema de análises químicas por injeção em fluxo para determinações espectrofotométricas simultâneas de cobre e de níquel explorando cinética diferencial e calibração multivariada." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/64/64135/tde-08022012-094037/.
Differential kinetic analysis exploits the differences in reaction rates between the analytes and a common reactant system; prior analyte separation steps can then be waived. Flow-injection systems (FIA) are considered an important tool for methods involving such a strategy because they allow precise control of sample/reagent dispersion and timing. The aim of this work was to exploit these two favorable aspects for the simultaneous determination of copper and nickel using the 5-Br-PADAP chromogenic reagent. Three sample aliquots were simultaneously inserted by means of a proportional injector into the reagent carrier stream (75 mg L-1 5-Br-PADAP + 0.5 mol L-1 acetic acid / acetate, pH 4.7) of a single-line FIA system. During transport towards detection, the established zones coalesce, resulting in a complex zone that was monitored at 562 nm. The local maximum and minimum values of the resulting concentration/time function were considered for multivariate calibration using the PLS-2 (partial least squares - 2) chemometric tool. The reagent concentration, buffering capacity, temperature, flow rate, and lengths of the analytical path and sampling loops, as well as the initial distance between plugs, were established and evaluated for the construction of mathematical models. To this end, 24 mixed Cu2+ and Ni2+ standard solutions (0.00 - 1.60 mg L-1, also 0.1% v/v HNO3) were used. Two latent variables were enough to capture > 98% of the variance inherent in the data set, and average prediction errors (RMSEP) were estimated as 0.025 and 0.071 mg L-1 for Cu and Ni, emphasizing the good precision of the calibration model. The proposed system presents good figures of merit: physical stability when kept in operation for four uninterrupted hours, consumption of 314 µg 5-Br-PADAP per sample, sample throughput of 33 h-1 (165 data, 66 determinations) and errors in absorbance readings typically < 5%.
However, the predictions made by the proposed model were inaccurate when compared with results obtained by ICP OES. Thus, further studies involving this type of matrix, as well as techniques for masking potential interferents, are recommended.
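The abstract reports prediction errors as RMSEP (root mean square error of prediction). For readers unfamiliar with the figure, a minimal sketch of how it is usually computed follows. The function name `rmsep` and the concentration values are invented for illustration and are not data from the thesis.

```python
import math


def rmsep(predicted, reference):
    """Root mean square error of prediction over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical reference and model-predicted concentrations in mg/L.
reference_mg_per_l = [0.40, 0.80, 1.20, 1.60]
predicted_mg_per_l = [0.45, 0.76, 1.22, 1.63]
error = rmsep(predicted_mg_per_l, reference_mg_per_l)
```

The result carries the same unit as the concentrations, which is why it can be read directly against the calibrated range.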
Feng, Zijie. "Machine learning methods for seasonal allergic rhinitis studies." Thesis, Linköpings universitet, Statistik och maskininlärning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173090.