Academic literature on the topic 'Pre data processing'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Pre data processing.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Pre data processing"

1

S, Vishesh, Manu Srinath, Akshatha C. Kumar, and Nandan A.S. "Data Warehousing Architecture and Pre-Processing." IJARCCE 6, no. 5 (May 30, 2017): 13–18. http://dx.doi.org/10.17148/ijarcce.2017.6503.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Jarolímek, Jan, Jan Pavlík, Jana Kholova, and Swarna Ronanki. "Data Pre-processing for Agricultural Simulations." Agris on-line Papers in Economics and Informatics 11, no. 01 (March 30, 2019): 49–53. http://dx.doi.org/10.7160/aol.2019.110105.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Garcia de la Nava, J., S. van Hijum, and O. Trelles. "PreP: gene expression data pre-processing." Bioinformatics 19, no. 17 (November 20, 2003): 2328–29. http://dx.doi.org/10.1093/bioinformatics/btg318.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Bilalli, Besim, Alberto Abelló, Tomàs Aluja-Banet, and Robert Wrembel. "Intelligent assistance for data pre-processing." Computer Standards & Interfaces 57 (March 2018): 101–9. http://dx.doi.org/10.1016/j.csi.2017.05.004.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Wei, Bo, Kai Li, Chengwen Luo, Weitao Xu, Jin Zhang, and Kuan Zhang. "No Need of Data Pre-processing." ACM Transactions on Internet of Things 2, no. 4 (November 30, 2021): 1–26. http://dx.doi.org/10.1145/3467980.

Full text
Abstract:
Device-free context awareness is important to many applications. There are two broadly used approaches to device-free context awareness: video-based and radio-based. Video-based approaches can deliver good performance, but privacy is a serious concern. Radio-based context awareness has instead drawn researchers' attention, because it does not violate privacy and radio signals can penetrate obstacles. Existing works design explicit methods for each radio-based application. Furthermore, they use an additional step to extract features before conducting classification and exploit deep learning as a classification tool. Although this feature-extraction step helps explore patterns in raw signals, it generates unnecessary noise and information loss. The raw CSI signal without initial data processing was, however, considered to contain no usable patterns. In this article, we are the first to propose an innovative deep learning-based general framework for both signal processing and classification. The key novelty of this article is that the framework can be generalised to all radio-based context awareness applications using raw CSI. We also eliminate the extra work of extracting features from raw radio signals. We conduct extensive evaluations to show the superior performance of our proposed method and its generalisation.
APA, Harvard, Vancouver, ISO, and other styles
6

Martens, Harald, Martin Høy, Barry M. Wise, Rasmus Bro, and Per B. Brockhoff. "Pre-whitening of data by covariance-weighted pre-processing." Journal of Chemometrics 17, no. 3 (March 2003): 153–65. http://dx.doi.org/10.1002/cem.780.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Gheshlaghi, F., and J. C. Santamarina. "Data Pre‐Processing in Cross‐Hole Geotomography." Journal of Environmental and Engineering Geophysics 3, no. 1 (March 1998): 41–47. http://dx.doi.org/10.4133/jeeg3.1.41.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Hu, Li Li, Sheng Suo Niu, and Zhi Rui Liang. "WAMS/PMU Data Pre-Processing and Compression." Advanced Materials Research 986-987 (July 2014): 1700–1703. http://dx.doi.org/10.4028/www.scientific.net/amr.986-987.1700.

Full text
Abstract:
With the spread of Wide Area Measurement System applications, large volumes of experimental PMU data are generated. To analyze, transmit, and apply these data efficiently, this paper first examines the characteristics of PMU data, pre-processes them with a waveform-difference method, and then compresses them with the Huffman algorithm. Comparing the results with and without the pre-processing step shows that the compression ratio is further improved.
APA, Harvard, Vancouver, ISO, and other styles
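The pre-processing and compression pipeline this abstract describes (waveform differencing followed by Huffman coding) can be sketched roughly as follows; the sawtooth sample signal and the helper names are illustrative, not taken from the paper:

```python
import heapq
from collections import Counter

def delta_encode(samples):
    """Waveform-difference step: keep the first sample, then successive differences."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def huffman_code_lengths(symbols):
    """Total encoded length in bits under an optimal Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate case: one symbol, 1 bit each
        return len(symbols)
    # Heap items: (weight, tiebreak, {symbol: code_depth}); the integer
    # tiebreak stops Python from ever comparing the dicts.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)      # merge the two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**c1, **c2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    _, _, depths = heap[0]
    return sum(depths[s] * freq[s] for s in freq)

# A slowly varying sawtooth: differencing concentrates the symbol
# distribution, so Huffman coding compresses the result much further.
signal = [90 + (i % 20) for i in range(200)]
raw_bits = huffman_code_lengths(signal)
diff_bits = huffman_code_lengths(delta_encode(signal))
print(raw_bits, diff_bits)
```

On this signal the raw values are 20 equally frequent symbols, while the differenced stream is dominated by a single symbol (+1), which is why the second pass is much cheaper.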
9

Yan, Xiaowei, Chengqi Zhang, and Shichao Zhang. "Toward databases mining: Pre-processing collected data." Applied Artificial Intelligence 17, no. 5-6 (May 2003): 545–61. http://dx.doi.org/10.1080/713827171.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Moustakidis, Serafeim, Athanasios Anagnostis, Apostolos Chondronasios, Patrik Karlsson, and Kostas Hrissagis. "Excitation-invariant pre-processing of thermographic data." Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability 232, no. 4 (April 23, 2018): 435–46. http://dx.doi.org/10.1177/1748006x18770888.

Full text
Abstract:
There is a large number of industries that make extensive use of composite materials in their respective sectors. This rise in composites' use has necessitated the development of new non-destructive inspection techniques that focus on manufacturing quality assurance, as well as in-service damage testing. Active infrared thermography is now a popular nondestructive testing method for detecting defects in composite structures. Non-uniform emissivity, uneven heating of the test surface, and variation in the thermal properties of the test material are some of the crucial factors in experimental thermography. These unwanted thermal effects are typically handled by applying a number of well-established thermographic techniques, including pulse phase thermography and thermographic signal reconstruction. This article addresses the problem of induced uneven heating at the pre-processing phase, prior to the application of the thermographic processing techniques. To accomplish this, a number of excitation-invariant pre-processing techniques were developed and tested, addressing the unwanted effect of non-uniform excitation in the collected thermographic data. Various fitting approaches were validated for modelling the non-uniform heating effect, and new normalization approaches were proposed within a time-dependent framework. The proposed pre-processing techniques were validated on a composite test sample with pre-determined defects. The results demonstrated the effectiveness of the proposed processing algorithms in removing the unwanted heat-distribution effect and improving the signal-to-noise ratio of the produced infrared images.
APA, Harvard, Vancouver, ISO, and other styles
More sources

Dissertations / Theses on the topic "Pre data processing"

1

Bilalli, Besim. "Learning the impact of data pre-processing in data analysis." Doctoral thesis, Universitat Politècnica de Catalunya, 2018. http://hdl.handle.net/10803/587221.

Full text
Abstract:
There is a clear correlation between data availability and data analytics, and hence, with the increase in data availability (unavoidable, according to Moore's law), the need for data analytics increases too. This certainly engages many more people, not necessarily experts, in performing analytics tasks. However, the different, challenging, and time-consuming steps of the data analytics process overwhelm non-experts, who require support (e.g., through automation or recommendations). A very important and time-consuming step that stands out from the rest is data pre-processing. Data pre-processing is challenging but at the same time has a heavy impact on the overall analysis. In this regard, previous works have focused on providing user assistance in data pre-processing, but without being concerned with its impact on the analysis. Hence, the goal has generally been to enable analysis through data pre-processing, not to improve it. In contrast, this thesis aims at developing methods that provide assistance in data pre-processing with the sole goal of improving the result of the overall analysis (e.g., increasing the predictive accuracy of a classifier). To this end, we propose a method and define an architecture that leverages ideas from meta-learning to learn the relationship between transformations (i.e., pre-processing operators) and mining algorithms (i.e., classification algorithms). This eventually enables ranking and recommending transformations according to their potential impact on the analysis. To reach this goal, we first study the currently available methods and systems that provide user assistance, either for the individual steps of data analytics or for the whole process altogether. Next, we classify the metadata these different systems use and then specifically focus on the metadata used in meta-learning. We apply a method to study the predictive power of these metadata and we extract and select the metadata that are most relevant.
Finally, we focus on the user assistance in the pre-processing step. We devise an architecture and build a tool, PRESISTANT, that given a classification algorithm is able to recommend pre-processing operators that once applied, positively impact the final results (e.g., increase the predictive accuracy). Our results show that providing assistance in data pre-processing with the goal of improving the result of the analysis is feasible and also very useful for non-experts. Furthermore, this thesis is a step towards demystifying the non-trivial task of pre-processing that is an exclusive asset in the hands of experts.
APA, Harvard, Vancouver, ISO, and other styles
2

Khondoker, Md Mizanur Rahman. "Statistical methods for pre-processing microarray gene expression data." Thesis, University of Edinburgh, 2006. http://hdl.handle.net/1842/12367.

Full text
Abstract:
A novel method is developed for combining multiple laser scans of microarrays to correct for “signal saturation” and “signal deterioration” effects in the gene expression measurement. A multivariate nonlinear functional regression model with Cauchy distributed errors having additive plus multiplicative scale is proposed as a model for combining multiple scan data. The model has been found to flexibly describe the nonlinear relationship in multiple scan data. The heavy tailed Cauchy distribution with additive plus multiplicative scale provides a basis for objective and robust estimation of gene expression from multiple scan data adjusting for censoring and deterioration bias in the observed intensity. Through combining multiple scans, the model reduces sampling variability in the gene expression estimates. A unified approach for nonparametric location and scale normalisation of log-ratio data is considered. A Generalised Additive Model for Location, Scale and Shape (GAMLSS) is proposed. GAMLSS uses a nonparametric approach for modelling both location and scale of log-ratio data, in contrast to the general tendency of using a parametric transformation, such as arcsinh, for variance stabilisation. Simulation studies demonstrate GAMLSS to be more powerful than the parametric method when a GAMLSS location and scale model, fitted to real data, is assumed correct. GAMLSS has been found to be as powerful as the parametric approach even when the parametric model is appropriate. Finally, we investigate the optimality of different estimation methods for analysing functional regression models. Alternative estimators are available in the literature to deal with the problems of identifiability and consistency. We investigated these estimators in terms of unbiasedness and efficiency for a specific case involving multiple laser scans of microarrays, and found that, in addition to being consistent, named methods are highly efficient and unbiased.
APA, Harvard, Vancouver, ISO, and other styles
3

Mukun, Wang, Liu Gozhi, and Li Zhenglian. "HARDWARE PRE-PROCESSING FOR DATA OF UNDERWATER MEASURING SYSTEM." International Foundation for Telemetering, 1991. http://hdl.handle.net/10150/612907.

Full text
Abstract:
International Telemetering Conference Proceedings / November 04-07, 1991 / Riviera Hotel and Convention Center, Las Vegas, Nevada
The synchro double-pulse signal mode is frequently used in Short Base Line (SBL) underwater positioning systems to obtain both the distance and depth of a target simultaneously. However, this signal mode also introduces ranging ambiguity, which reduces the effective positioning distance to much less than that limited by the period of the synchro signal. This paper presents a hardware distance-gate data acquisition scheme. It orders the original data sent to the computer as "direct first pulse -- depth information pulse (or first pulse reflected by the water surface)…" to guarantee the effective positioning distance of the system. It has the advantage of reducing the processing time of the computer, thus ensuring the real-time functioning of the system.
APA, Harvard, Vancouver, ISO, and other styles
4

Giovanelli, Joseph. "AutoML: A new methodology to automate data pre-processing pipelines." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20422/.

Full text
Abstract:
It is well known that we are living in the Big Data era. Indeed, the exponential growth of Internet of Things, Web of Things, and pervasive computing systems has greatly increased the amount of stored data. Thanks to the availability of data, the Data Scientist has become one of the most sought-after figures, capable of transforming data, performing analysis on it, and applying Machine Learning techniques to improve the business decisions of companies. Yet, Data Scientists do not scale: it is almost impossible to balance their number against the effort required to analyze the ever-growing sizes of available data. Furthermore, today more and more non-experts use Machine Learning tools to perform data analysis without the required knowledge. To this end, tools that help them throughout the Machine Learning process have been developed, typically referred to as AutoML tools. However, even with such tools, raw data (i.e., not yet pre-processed) are rarely ready to be consumed, and generally perform poorly when consumed in raw form. A pre-processing phase (i.e., the application of a set of transformations), which improves the quality of the data and makes it suitable for algorithms, is usually required. Most AutoML tools do not consider this preliminary part, even though it has already been shown to improve the final performance. Moreover, the few works that do support pre-processing provide just the application of a fixed series of transformations, decided a priori, without considering the nature of the data, the algorithm used, or simply that the order of the transformations could affect the final result. In this thesis we propose a new methodology that provides a series of pre-processing transformations tailored to the specific case at hand. Our approach analyzes the nature of the data, the algorithm we intend to use, and the impact that the order of transformations could have.
APA, Harvard, Vancouver, ISO, and other styles
5

Patel, Ankur. "3D morphable models : data pre-processing, statistical analysis and fitting." Thesis, University of York, 2011. http://etheses.whiterose.ac.uk/1576/.

Full text
Abstract:
This thesis presents research aimed at using a 3D linear statistical model (known as a 3D morphable model) of an object class (which could be faces, bodies, cars, etc) for robust shape recovery. Our aim is to use this recovered information for the purposes of potentially useful applications like recognition and synthesis. With a 3D morphable model as its central theme, this thesis includes: a framework for the groupwise processing of a set of meshes in dense correspondence; a new method for model construction; a new interpretation of the statistical constraints afforded by the model and addressing of some key limitations associated with using such models in real world applications. In Chapter 1 we introduce 3D morphable models, touch on the current state-of-the-art and emphasise why these models are an interesting and important research tool in the computer vision and graphics community. We then talk about the limitations of using such models and use these limitations as a motivation for some of the contributions made in this thesis. Chapter 2 presents an end-to-end system for obtaining a single (possibly symmetric) low resolution mesh topology and texture parameterisation which are optimal with respect to a set of high resolution input meshes in dense correspondence. These methods result in data which can be used to build 3D morphable models (at any resolution). In Chapter 3 we show how the tools of thin-plate spline warping and Procrustes analysis can be used to construct a morphable model as a shape space. We observe that the distribution of parameter vector lengths follows a chi-square distribution and discuss how the parameters of this distribution can be used as a regularisation constraint on the length of parameter vectors. In Chapter 4 we take the idea introduced in Chapter 3 further by enforcing a hard constraint which restricts faces to points on a hyperspherical manifold within the parameter space of a linear statistical model. 
We introduce tools from differential geometry (log and exponential maps for a hyperspherical manifold) which are necessary for developing our methodology and provide empirical validation to justify our choice of manifold. Finally, we show how to use these tools to perform model fitting, warping and averaging operations on the surface of this manifold. Chapter 5 presents a method to simplify a 3D morphable model without requiring knowledge of the training meshes used to build the model. This extends the simplification ideas in Chapter 2 into a statistical setting. The proposed method is based on iterative edge collapse and we show that the expected value of the Quadric Error Metric can be computed in closed form for a linear deformable model. The simplified models can be used to achieve efficient multiscale fitting and super-resolution. In Chapter 6 we consider the problem of model dominance and show how shading constraints can be used to refine morphable model shape estimates, offering the possibility of exceeding the maximum possible accuracy of the model. We present an optimisation scheme based on surface normal error as opposed to image error. This ensures the fullest possible use of the information conveyed by the shading in an image. In addition, our framework allows non-model based estimation of per-vertex bump and albedo maps. This means the recovered model is capable of describing shape and reflectance phenomena not present in the training set. We explore the use of the recovered shape and reflectance information for face recognition and synthesis. Finally, in Chapter 7 we provide concluding remarks and discuss directions for future research.
APA, Harvard, Vancouver, ISO, and other styles
6

Hutton, Tanya. "Sensitivity of fusion neutronics to the pre-processing of nuclear data." Thesis, University of Birmingham, 2016. http://etheses.bham.ac.uk//id/eprint/6724/.

Full text
Abstract:
Nuclear data are the foundation of simulation and design in the nuclear industry. The success of commercialising thermonuclear fusion will be based on a set of highly accurate simulations used in design, optimisation and safety analyses. This work focuses on the often overlooked, pre-processing stage of nuclear data. The effect of legacy methods in a fusion context is a concern within the community, but has never been quantified. The sensitivity of fusion neutronics to pre-processing was determined using a set of codes and methods developed as part of this thesis. Legacy pre-processing methods demonstrated a difference between the processed and unprocessed distributions of up to 20%. Simple Monte-Carlo radiation transport simulations exhibited sensitivity within energy distributions for small models (< 5 mfp). Alternative data formats did not improve simulation results sufficiently to justify their implementation. Complex, fusion specific models showed a general insensitivity to the pre-processing when run to the current levels of statistical precision. Future recommendations are to process all future data libraries into the cumulative tabulated probability format. Improved methods are not required at this stage as the core data libraries are incomplete and sometimes inaccurate. Only after the libraries have improved will pre-processing become significant.
APA, Harvard, Vancouver, ISO, and other styles
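The "cumulative tabulated probability format" this abstract recommends can be illustrated with a small sketch: a tabulated per-bin probability table is pre-processed into a cumulative table, which Monte-Carlo transport codes can then sample with a single binary search. The energy bins and weights below are invented, not nuclear data:

```python
import bisect
import random

# A tabulated probability over energy bins (illustrative values only).
energies = [0.0, 1.0, 2.0, 3.0, 4.0]        # bin edges (MeV)
weights  = [0.1, 0.4, 0.3, 0.2]             # per-bin probability

# Pre-processing step: turn the per-bin table into a cumulative table.
cdf = []
total = 0.0
for w in weights:
    total += w
    cdf.append(total)

def sample(u):
    """Inverse-transform sampling: locate the bin, interpolate within it."""
    i = bisect.bisect_left(cdf, u)
    lo = cdf[i - 1] if i > 0 else 0.0
    frac = (u - lo) / (cdf[i] - lo)
    return energies[i] + frac * (energies[i + 1] - energies[i])

random.seed(0)
draws = [sample(random.random()) for _ in range(10000)]
mean_energy = sum(draws) / len(draws)
print(mean_energy)
```

With these weights the distribution's mean is 2.1 MeV (0.1·0.5 + 0.4·1.5 + 0.3·2.5 + 0.2·3.5), so a large sample average should land close to that value.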
7

Pettersson, Hanna. "Estimation and Pre-Processing of Sensor Data in Heavy Duty Vehicle Platooning." Thesis, Linköpings universitet, Reglerteknik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-79038.

Full text
Abstract:
Today, a rapid development towards fuel-efficient technological aids for vehicles is in progress. One step towards this is the development of platooning systems. The main concept of platooning is to let several heavy duty vehicles (HDVs) drive in a convoy and share important information with each other via wireless communication. This thesis describes one out of three subsystems in a project developed to handle the process from raw sensor data to control signal. The goal of the project is to achieve safe and smooth control with the main purpose of reduced fuel consumption. This subsystem processes the raw sensor data received from the different HDVs. The purpose is to estimate the positions and velocities of the vehicles in a platoon, taking into account that packet loss, out-of-sequence measurements and irrelevant information can occur. This is achieved by filtering the information from different sensors in an Extended Kalman Filter and converting it into a local coordinate system with the origin in the ego vehicle. Moreover, the estimates are sorted and categorized into classes with respect to the status of the vehicles. The result of the thesis is useful estimates that are independent of external effects, in a local reference system with origin in the host vehicle. This information can then be used for further sensor fusion and implementation of a Model Predictive Controller (MPC) in two other subsystems. These three subsystems result in smooth and safe control with an average reduced fuel consumption of approximately 11.1% when the vehicles drive with a distance of 0.5 seconds in a simulated environment.
APA, Harvard, Vancouver, ISO, and other styles
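The estimation step this abstract describes can be illustrated with a minimal constant-velocity Kalman filter (the linear special case of the Extended Kalman Filter used in the thesis), where a lost packet is handled by running only the prediction step. All noise parameters and the simulated measurements below are made up for illustration:

```python
import numpy as np

# Constant-velocity state [position, velocity]; a plain Kalman filter.
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])      # state transition
H = np.array([[1.0, 0.0]])                 # only position is measured
Q = 0.01 * np.eye(2)                       # process noise (illustrative)
R = np.array([[0.5]])                      # measurement noise (illustrative)

def kf_step(x, P, z=None):
    """One predict(+update) cycle; z=None models a lost packet."""
    x = F @ x                               # predict
    P = F @ P @ F.T + Q
    if z is not None:                       # update only if a packet arrived
        y = z - H @ x                       # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
    return x, P

rng = np.random.default_rng(0)
x, P = np.array([0.0, 0.0]), np.eye(2)
true_pos, true_vel = 0.0, 20.0              # vehicle cruising at 20 m/s
for k in range(200):
    true_pos += true_vel * dt
    # Every seventh packet is dropped, mimicking wireless packet loss.
    z = None if k % 7 == 3 else np.array([true_pos + rng.normal(0, 0.7)])
    x, P = kf_step(x, P, z)
print(x)  # estimated [position, velocity]
```

After a couple hundred steps the filter's position and velocity estimates should track the simulated vehicle closely despite the dropped packets.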
8

Abdul, Rahim Siti Khatijah Nor. "Transformation of the university examination timetabling problem space through data pre-processing." Thesis, University of Nottingham, 2015. http://eprints.nottingham.ac.uk/28895/.

Full text
Abstract:
This research investigates Examination Timetabling or Scheduling, with the aim of producing good quality, feasible timetables that satisfy hard constraints and various soft constraints. A novel approach to scheduling, that of transformation of the problem space, has been developed and evaluated for its effectiveness. The examination scheduling problem involves many constraints due to many relationships between students and exams, making it complex and expensive in terms of time and resources. Despite the extensive research in this area, it has been observed that most of the published methods do not produce good quality timetables consistently due to the utilisation of random-search. In this research we have avoided random-search and instead have proposed a systematic, deterministic approach to solving the examination scheduling problem. We pre-process data and constraints to generate more meaningful aggregated data constructs with better expressive power that minimise the need for cross-referencing original student and exam data at a later stage. Using such aggregated data and custom-designed mechanisms, the timetable construction is done systematically, while assuring its feasibility. Later, the timetable is optimized to improve the quality, focusing on maximizing the gap between consecutive exams. Our solution is always reproducible and displays a deterministic optimization pattern on all benchmark datasets. Transformation of the problem space into new aggregated data constructs through pre-processing represents the key novel contribution of this research.
APA, Harvard, Vancouver, ISO, and other styles
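The "aggregated data constructs" the abstract mentions can be illustrated by the classic conflict-matrix pre-processing for exam timetabling; the enrolment data below is invented, and the thesis's actual constructs are richer than this sketch:

```python
from collections import defaultdict
from itertools import combinations

# Raw enrolment records: student -> exams taken (made-up data).
enrolments = {
    "s1": ["math", "physics", "chem"],
    "s2": ["math", "chem"],
    "s3": ["physics", "bio"],
    "s4": ["math", "bio"],
}

# Pre-processing in the spirit of the thesis: aggregate raw student-exam
# records into a conflict matrix counting, for each pair of exams, how
# many students sit both. Later scheduling stages can consult this
# compact construct instead of re-scanning the raw enrolment data.
conflicts = defaultdict(int)
for exams in enrolments.values():
    for a, b in combinations(sorted(exams), 2):
        conflicts[(a, b)] += 1

# Two exams can share a timeslot only if their conflict count is zero.
print(dict(conflicts))
```

Here "chem" and "math" clash for two students, so any feasible timetable must separate them, while exam pairs absent from the matrix are free to share a slot.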
9

Rahman, M. Mostafizur. "Machine learning based data pre-processing for the purpose of medical data mining and decision support." Thesis, University of Hull, 2014. http://hydra.hull.ac.uk/resources/hull:10103.

Full text
Abstract:
Building an accurate and reliable model for prediction for different application domains, is one of the most significant challenges in knowledge discovery and data mining. Sometimes, improved data quality is itself the goal of the analysis, usually to improve processes in a production database and the designing of decision support. As medicine moves forward there is a need for sophisticated decision support systems that make use of data mining to support more orthodox knowledge engineering and Health Informatics practice. However, the real-life medical data rarely complies with the requirements of various data mining tools. It is often inconsistent, noisy, containing redundant attributes, in an unsuitable format, containing missing values and imbalanced with regards to the outcome class label. Many real-life data sets are incomplete, with missing values. In medical data mining the problem with missing values has become a challenging issue. In many clinical trials, the medical report pro-forma allow some attributes to be left blank, because they are inappropriate for some class of illness or the person providing the information feels that it is not appropriate to record the values for some attributes. The research reported in this thesis has explored the use of machine learning techniques as missing value imputation methods. The thesis also proposed a new way of imputing missing value by supervised learning. A classifier was used to learn the data patterns from a complete data sub-set and the model was later used to predict the missing values for the full dataset. The proposed machine learning based missing value imputation was applied on the thesis data and the results are compared with traditional Mean/Mode imputation. Experimental results show that all the machine learning methods which we explored outperformed the statistical method (Mean/Mode). The class imbalance problem has been found to hinder the performance of learning systems. 
In fact, most of the medical datasets are found to be highly imbalanced in their class labels. The solution to this problem is to reduce the gap between the minority class samples and the majority class samples. Over-sampling can be applied to increase the number of minority class samples to balance the data. The alternative to over-sampling is under-sampling, where the number of majority class samples is reduced. The thesis proposed a cluster-based under-sampling technique to reduce the gap between the majority and minority samples. Different under-sampling and over-sampling techniques were explored as ways to balance the data. The experimental results show that for the thesis data the newly proposed modified cluster-based under-sampling technique performed better than other class balancing techniques. Further research found that the class imbalance problem not only affects the classification performance but also has an adverse effect on feature selection. The thesis proposed a new framework for feature selection for class-imbalanced datasets. The research found that, using the proposed framework, the classifier needs fewer attributes to show high accuracy, and more attributes are needed if the data is highly imbalanced. The research described in the thesis makes the following four novel contributions: a) an improved data mining methodology for mining medical data; b) a machine learning based missing value imputation method; c) a cluster-based semi-supervised class balancing method; d) a feature selection framework for class-imbalanced datasets. The performance analysis and comparative study show that the proposed missing value imputation, class balancing, and feature selection framework can provide an effective approach to data preparation for building medical decision support.
APA, Harvard, Vancouver, ISO, and other styles
10

Hafner, F. W. (Bill). "Advanced Data Acquisition and Processing System (ADAPS) – The Current State of the System." International Foundation for Telemetering, 1999. http://hdl.handle.net/10150/607323.

Full text
Abstract:
International Telemetering Conference Proceedings / October 25-28, 1999 / Riviera Hotel and Convention Center, Las Vegas, Nevada
The technology growth in the Aerospace industry, as manifested and embodied in the current fighter technology, presents many challenges in the area of flight test and data processing. Past papers have delineated the concepts brought to bear in the design and implementation of the AFFTC’s latest generation of telemetry data systems in the Advanced Data Acquisition and Processing System (ADAPS) program. The current deployed system incorporates the planned approach of commercial-off-the-shelf (COTS) and government-off-the-shelf (GOTS) elements as basic to the system solution. The state of the program has advanced through full development, delivery and performance testing. The system is currently deployed in support of flight testing at Edwards AFB. This paper will present the status of the program.
APA, Harvard, Vancouver, ISO, and other styles

Books on the topic "Pre data processing"

1

Image acquisition and pre-processing for machine vision. Bellingham, Wash: SPIE, 2010.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Ahmad, Masri Bin. PC graphical pre- and post-processing of finite element data. Manchester: UMIST, 1996.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

Conference "Computers and the Humanities : Today's Research, Tomorrow's Teaching" (1986 University of Toronto). Computers and the Humanities : Today's Research, Tomorrow's Teaching: Conference pre-prints. Toronto, Ont: Centre for Computing in the Humanities, New College, University of Toronto, 1986.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
4

2008, International Pre-Olympic Congress on Computer Science (2008 Nanjing China). Proceedings of 2008 International Pre-Olympic Congress on Computer Science: Nanjing, China, August 4-7, 2008. Liverpool: World Academic Union (World Academic Press), 2008.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

Dombroff, Mark A. Litigation organization and management. 2nd ed. Englewood Cliffs, N.J: Prentice Hall Law & Business, 1991.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Shi, Feng. Learn About Text Pre-Processing in R With Data From How ISIS Uses Twitter Dataset (2016). 1 Oliver's Yard, 55 City Road, London EC1Y 1SP United Kingdom: SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526488909.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Shi, Feng. Learn About Text Pre-Processing in Python With Data From How ISIS Uses Twitter Dataset (2016). 1 Oliver's Yard, 55 City Road, London EC1Y 1SP United Kingdom: SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526497864.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Martinez, Julia. Learn Data Preparation and Pre-Processing in SPSS With an Online Survey of Correlates of Heavy Drinking (2012). 1 Oliver's Yard, 55 City Road, London EC1Y 1SP United Kingdom: SAGE Publications Ltd., 2019. http://dx.doi.org/10.4135/9781526474902.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Joint International Pre-Olympic Conference of Sports Science and Sports Engineering (1st 2008 Nanjing Shi, China). Proceedings of First Joint International Pre-Olympic Conference of Sports Science and Sports Engineering: Nanjing, China, August 4-7, 2008. Liverpool: World Academic Union (World Academic Press), 2008.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
10

Getting it right in print: Digital pre-press for graphic designers. London: Laurence King, 2004.

Find full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Pre data processing"

1

Christen, Peter. "Data Pre-Processing." In Data Matching, 39–67. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-31164-2_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Finlay, Steven. "Data Pre-processing." In Credit Scoring, Response Modelling and Insurance Rating, 144–59. London: Palgrave Macmillan UK, 2010. http://dx.doi.org/10.1057/9780230298989_6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Malley, Brian, Daniele Ramazzotti, and Joy Tzung-yu Wu. "Data Pre-processing." In Secondary Analysis of Electronic Health Records, 115–41. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-43742-2_12.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Kuhn, Max, and Kjell Johnson. "Data Pre-processing." In Applied Predictive Modeling, 27–59. New York, NY: Springer New York, 2013. http://dx.doi.org/10.1007/978-1-4614-6849-3_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Maimon, Oded, and Mark Last. "Automated Data Pre-Processing." In Knowledge Discovery and Data Mining, 23–29. Boston, MA: Springer US, 2001. http://dx.doi.org/10.1007/978-1-4757-3296-2_2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Hong, Wei-Chiang. "Data Pre-processing Methods." In Hybrid Intelligent Technologies in Energy Demand Forecasting, 45–67. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-36529-5_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Finlay, Steven. "Data Transformation (Pre-processing)." In Credit Scoring, Response Modeling, and Insurance Rating, 144–64. London: Palgrave Macmillan UK, 2012. http://dx.doi.org/10.1057/9781137031693_6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

de Prada, Cesar, and Daniel Sarabia. "Data Pre-treatment." In Resource Efficiency of Processing Plants, 181–210. Weinheim, Germany: Wiley-VCH Verlag GmbH & Co. KGaA, 2018. http://dx.doi.org/10.1002/9783527804153.ch8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Wu, Xindong. "Induction as Pre-processing." In Methodologies for Knowledge Discovery and Data Mining, 114–22. Berlin, Heidelberg: Springer Berlin Heidelberg, 1999. http://dx.doi.org/10.1007/3-540-48912-6_16.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Badia, Antonio. "Data Cleaning and Pre-processing." In Data-Centric Systems and Applications, 77–169. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-57592-2_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Pre data processing"

1

Ribeiro, Marcela X., Mônica R. P. Ferreira, Caetano Traina, and Agma J. M. Traina. "Data pre-processing." In the 5th international conference. New York, New York, USA: ACM Press, 2008. http://dx.doi.org/10.1145/1456223.1456277.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Frigui, H. "Pre-processing for data clustering." In IEEE Annual Meeting of the Fuzzy Information Processing Society, 2004 (NAFIPS '04). IEEE, 2004. http://dx.doi.org/10.1109/nafips.2004.1337437.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Batista, Lu, and Lu Alexandre. "Text Pre-processing for Lossless Compression." In 2008 Data Compression Conference DCC. IEEE, 2008. http://dx.doi.org/10.1109/dcc.2008.78.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Vu, Minh Thanh, Tobias J. Oechtering, and Mikael Skoglund. "Gaussian Hierarchical Identification with Pre-processing." In 2018 Data Compression Conference (DCC). IEEE, 2018. http://dx.doi.org/10.1109/dcc.2018.00036.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Lv, Qiujuan, and Suping Fang. "ICT slice profile data pre-processing." In 2010 IEEE International Conference on Mechatronics and Automation (ICMA). IEEE, 2010. http://dx.doi.org/10.1109/icma.2010.5588019.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Saleem, Asma, Khadim Hussain Asif, Ahmad Ali, Shahid Mahmood Awan, and Mohammed A. Alghamdi. "Pre-processing Methods of Data Mining." In 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing (UCC). IEEE, 2014. http://dx.doi.org/10.1109/ucc.2014.57.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Gan, Jiangying, and Zhijun Xu. "Pre-Processing VDIF Data in FPGA." In 2018 Progress in Electromagnetics Research Symposium (PIERS-Toyama). IEEE, 2018. http://dx.doi.org/10.23919/piers.2018.8597678.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Taleb, Ikbal, Rachida Dssouli, and Mohamed Adel Serhani. "Big Data Pre-processing: A Quality Framework." In 2015 IEEE International Congress on Big Data (BigData Congress). IEEE, 2015. http://dx.doi.org/10.1109/bigdatacongress.2015.35.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Bhattacharjee, Sangita, and Indra Kanta Maitra. "Pre-processing of Digital Mammogram – A Review." In Smart Technologies in Data Science and Communication 2017. Science & Engineering Research Support soCiety, 2017. http://dx.doi.org/10.14257/astl.2017.147.23.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Billal, Belainine, Alexsandro Fonseca, and Fatiha Sadat. "Efficient natural language pre-processing for analyzing large data sets." In 2016 IEEE International Conference on Big Data (Big Data). IEEE, 2016. http://dx.doi.org/10.1109/bigdata.2016.7841060.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "Pre data processing"

1

Hennekens, S. M., W. A. Ozinga, and J. H. J. Schaminée. BioScore 3 - Plants : Background and pre-processing of distribution data. Wageningen: Wageningen University & Research, Statutory Research Tasks Unit for Nature & the Environment, 2017. http://dx.doi.org/10.18174/428824.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Taasevigen, Danny J. User's Guide to Pre-Processing Data in Universal Translator 2 for the Energy Charting and Metrics Tool (ECAM). Office of Scientific and Technical Information (OSTI), November 2011. http://dx.doi.org/10.2172/1032417.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Bauer, Andrew, James Forsythe, Jayanarayanan Sitaraman, Andrew Wissink, Buvana Jayaraman, and Robert Haehnel. In situ analysis and visualization to enable better workflows with CREATE-AV™ Helios. Engineer Research and Development Center (U.S.), June 2021. http://dx.doi.org/10.21079/11681/40846.

Full text
Abstract:
The CREATE-AV™ Helios CFD simulation code has been used to accurately predict rotorcraft performance under a variety of flight conditions. The Helios package contains a suite of tools that contain almost the entire set of functionality needed for a variety of workflows. These workflows include tools customized to properly specify many in situ analysis and visualization capabilities appropriate for rotorcraft analysis. In situ is the process of computing analysis and visualization information during a simulation run before data is saved to disk. In situ has been referred to with a variety of terms including co-processing, covisualization, coviz, etc. In this paper we describe the customization of the pre-processing GUI and corresponding development of the Helios solver code-base to effectively implement in situ analysis and visualization to reduce file IO and speed up workflows for CFD analysts. We showcase how the workflow enables the wide variety of Helios users to effectively work in post-processing tools they are already familiar with, as opposed to forcing them to learn new tools in order to post-process in situ data extracts being produced by Helios. These data extracts include various sources of information customized to Helios, such as knowledge about the near- and off-body grids, internal surface extracts with patch information, and volumetric extracts meant for fast post-processing of data. Additionally, we demonstrate how in situ can be used by workflow automation tools to help convey information to the user that would be much more difficult when using full data dumps.
APA, Harvard, Vancouver, ISO, and other styles
4

Lasko, Kristofer, and Sean Griffin. Monitoring Ecological Restoration with Imagery Tools (MERIT) : Python-based decision support tools integrated into ArcGIS for satellite and UAS image processing, analysis, and classification. Engineer Research and Development Center (U.S.), April 2021. http://dx.doi.org/10.21079/11681/40262.

Full text
Abstract:
Monitoring the impacts of ecosystem restoration strategies requires both short-term and long-term land surface monitoring. The combined use of unmanned aerial systems (UAS) and satellite imagery enable effective landscape and natural resource management. However, processing, analyzing, and creating derivative imagery products can be time consuming, manually intensive, and cost prohibitive. In order to provide fast, accurate, and standardized UAS and satellite imagery processing, we have developed a suite of easy-to-use tools integrated into the graphical user interface (GUI) of ArcMap and ArcGIS Pro as well as open-source solutions using NodeOpenDroneMap. We built the Monitoring Ecological Restoration with Imagery Tools (MERIT) using Python and leveraging third-party libraries and open-source software capabilities typically unavailable within ArcGIS. MERIT will save US Army Corps of Engineers (USACE) districts significant time in data acquisition, processing, and analysis by allowing a user to move from image acquisition and preprocessing to a final output for decision-making with one application. Although we designed MERIT for use in wetlands research, many tools have regional or global relevancy for a variety of environmental monitoring initiatives.
APA, Harvard, Vancouver, ISO, and other styles
5

Cairo, Jessica, Iulia Gherman, and Paul Cook. The effects of consumer freezing of food on its use-by date. Food Standards Agency, July 2021. http://dx.doi.org/10.46756/sci.fsa.ret874.

Full text
Abstract:
The current Food Standards Agency consumer guidance states that consumers can freeze pre-packed food right up to the “use-by” date and, once food has been defrosted, it should be consumed within 24 hours. This strategic review has collated relevant data to determine whether there is an increased risk in relation to freezing ready-to-eat and non-ready-to-eat foods on the use-by date compared to the day before the use-by date. The review has focused on how the shelf-life of a food is determined and the effects of freezing, thawing and refrigeration on foodborne pathogens, including Bacillus spp., Campylobacter spp., Clostridium botulinum, Clostridium perfringens, Listeria monocytogenes, Salmonella, pathogenic Escherichia coli and Shigella spp. In the UK, food business operators are responsible for setting the safe shelf-life of a food which, in practice, should take into consideration the consumer habits, as well as the factors affecting shelf-life, such as food product characteristics, food processing techniques, transport, retail and domestic food storage temperatures, and type of packaging. Some countries, such as Ireland, New Zealand and Canada specifically recommend including safety margins within shelf lives. This is used to maintain brand integrity because it ensures that the food is consumed in its optimum condition. The FSA has collaborated with other organisations in the production of several guidance documents; however, there is no explicit requirement for the consideration of a margin of safety when setting shelf-life. There is also no legal requirement in the UK to consider a safety margin when setting shelf-life. According to regulations, pathogens should not be present in sufficient levels to cause foodborne illness on the use-by date, as food should still be safe to eat on that day. 
Given that these requirements are met, the risk assessed in this report arises from the processes of freezing, thawing and subsequent refrigerated storage for a further 24 hours, and the potential for these to increase pathogen levels. In this review, it was found that there is a risk of additional growth of certain pathogens during the refrigerated storage period although the impact of freezing and thawing on the extent of this growth was not readily evident. This risk would relate specifically to ready-to-eat foods as cooking of non-ready-to-eat foods after defrosting would eliminate pathogens. This report explores the potential issues related to consumer freezing on the use-by date and identifies additional information or research required to understand the risks involved. Overall, there is little evidence to suggest a significant change in risk between consumers freezing ready-to-eat food on the use-by date compared to freezing the food on the day before the use-by date. Specific areas that merit further research include the risks due to low temperature survival and growth of L. monocytogenes. There is also a lack of research on the effects of freezing, defrosting and refrigeration on the growth and toxin production of non-proteolytic C. botulinum, and the growth of Salmonella during domestic freezing and thawing. Finally, more information on how food business operators set shelf-life would enable a better understanding of the process and the extent of the safety margin when determining shelf-life of ready-to-eat and non-ready-to-eat foods.
APA, Harvard, Vancouver, ISO, and other styles
6

Federal Information Processing Standards Publication: 1,200 bits per second two-wire duplex modems for data communications use on telephone-type circuits. Gaithersburg, MD: National Institute of Standards and Technology, 1992. http://dx.doi.org/10.6028/nist.fips.162.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Federal Information Processing Standards Publication: 2,400 bits per second two-wire duplex modems for data communications use on telephone-type circuits. Gaithersburg, MD: National Institute of Standards and Technology, 1992. http://dx.doi.org/10.6028/nist.fips.163.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Federal Information Processing Standards Publication: 9,600 bits per second four-wire duplex modems for data communications use on telephone-type circuits. Gaithersburg, MD: National Institute of Standards and Technology, 1992. http://dx.doi.org/10.6028/nist.fips.167.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Federal Information Processing Standards Publication: 4,800 and 9,600 bits per second two-wire duplex modems for data communications use on telephone-type circuits. Gaithersburg, MD: National Institute of Standards and Technology, 1992. http://dx.doi.org/10.6028/nist.fips.166.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Federal Information Processing Standards Publication: 12,000 and 14,400 bits per second four-wire duplex modems for data communications use on telephone-type circuits. Gaithersburg, MD: National Institute of Standards and Technology, 1992. http://dx.doi.org/10.6028/nist.fips.168.

Full text
APA, Harvard, Vancouver, ISO, and other styles