Academic literature on the topic 'Tabular data'

Journal articles on the topic "Tabular data"

1

Altman, Naomi, and Martin Krzywinski. "Tabular data." Nature Methods 14, no. 4 (March 30, 2017): 329–30. http://dx.doi.org/10.1038/nmeth.4239.

2

Eken, Süleyman, Ahmet Sayar, and Kürşat Topçuoğlu. "AutoTest: Automation to Test Tabular Data Quality." International Journal of Computer and Electrical Engineering 6, no. 4 (2014): 365–68. http://dx.doi.org/10.7763/ijcee.2014.v6.854.

3

Altman, Naomi, and Martin Krzywinski. "Author Correction: Tabular data." Nature Methods 16, no. 7 (June 12, 2019): 658. http://dx.doi.org/10.1038/s41592-019-0474-z.

4

Badaro, Gilbert, and Paolo Papotti. "Transformers for tabular data representation." Proceedings of the VLDB Endowment 15, no. 12 (August 2022): 3746–49. http://dx.doi.org/10.14778/3554821.3554890.

Abstract:
In the last few years, the natural language processing community witnessed advances in neural representations of free texts with transformer-based language models (LMs). Given the importance of knowledge available in relational tables, recent research efforts extend LMs by developing neural representations for tabular data. In this tutorial, we present these proposals with two main goals. First, we introduce to a database audience the potentials and the limitations of current models. Second, we demonstrate the large variety of data applications that benefit from the transformer architecture. The tutorial aims at encouraging database researchers to engage and contribute to this new direction, and at empowering practitioners with a new set of tools for applications involving text and tabular data.
5

Kelly, James P., Bruce L. Golden, Arjang A. Assad, and Edward K. Baker. "Controlled Rounding of Tabular Data." Operations Research 38, no. 5 (October 1990): 760–72. http://dx.doi.org/10.1287/opre.38.5.760.

6

Menon, Anil, and Nadhamuni Nerella. "Communicating Tabular Data Using ORACLE." Pharmaceutical Development and Technology 5, no. 3 (January 2000): 423–31. http://dx.doi.org/10.1081/pdt-100100559.

7

Humpal, John J. "Numbers, Statistics, and Tabular Data." Radiology 218, no. 1 (January 2001): 12. http://dx.doi.org/10.1148/radiology.218.1.r01ja6812.

8

Morrison, Philip S. "Symbolic Representation of Tabular Data." New Zealand Journal of Geography 79, no. 1 (May 15, 2008): 11–18. http://dx.doi.org/10.1111/j.0028-8292.1985.tb00199.x.

9

Gonzalez, Joe Fred, and Lawrence H. Cox. "Software for tabular data protection." Statistics in Medicine 24, no. 4 (2005): 659–69. http://dx.doi.org/10.1002/sim.2043.

10

C., Keerthy, and Sabitha S. "Privacy Preserved Data Publishing Techniques for Tabular Data." International Journal of Computer Applications 151, no. 9 (October 17, 2016): 1–6. http://dx.doi.org/10.5120/ijca2016911874.


Dissertations / Theses on the topic "Tabular data"

1

Xu, Lei. "Synthesizing tabular data using conditional GAN." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/128349.

Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2020
In data science, the ability to model the distribution of rows in tabular data and generate realistic synthetic data enables various important applications including data compression, data disclosure, and privacy-preserving machine learning. However, because tabular data usually contains a mix of discrete and continuous columns, building such a model is a non-trivial task. Continuous columns may have multiple modes, while discrete columns are sometimes imbalanced, making modeling difficult. To address this problem, I took two major steps. (1) I designed SDGym, a thorough benchmark, to compare existing models, identify different properties of tabular data and analyze how these properties challenge different models. Our experimental results show that statistical models, such as Bayesian networks, that are constrained to a fixed family of available distributions cannot model tabular data effectively, especially when both continuous and discrete columns are included. Recently proposed deep generative models are capable of modeling more sophisticated distributions, but cannot outperform Bayesian network models in practice, because the network structure and learning procedure are not optimized for tabular data which may contain non-Gaussian continuous columns and imbalanced discrete columns. (2) To address these problems, I designed CTGAN, which uses a conditional generative adversarial network to address the challenges in modeling tabular data. Because CTGAN uses reversible data transformations and is trained by re-sampling the data, it can address common challenges in synthetic data generation. I evaluated CTGAN on the benchmark and showed that it consistently and significantly outperforms existing statistical and deep learning models.
2

Liu, Zhicheng. "Network-based visual analysis of tabular data." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/43687.

Abstract:
Tabular data is pervasive in the form of spreadsheets and relational databases. Although tables often describe multivariate data without explicit network semantics, it may be advantageous to explore the data modeled as a graph or network for analysis. Even when a given table design conveys some static network semantics, analysts may want to look at multiple networks from different perspectives, at different levels of abstraction, and with different edge semantics. This dissertation is motivated by the observation that a general approach for performing multi-dimensional and multi-level network-based visual analysis on multivariate tabular data is necessary. We present a formal framework based on the relational data model that systematically specifies the construction and transformation of graphs from relational data tables. In the framework, a set of relational operators provide the basis for rich expressive power for network modeling. Powered by this relational algebraic framework, we design and implement a visual analytics system called Ploceus. Ploceus supports flexible construction and transformation of networks through a direct manipulation interface, and integrates dynamic network manipulation with visual exploration for a seamless analytic experience.
3

Caspár, Sophia. "Visualization of tabular data on mobile devices." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-68036.

Abstract:
This thesis evaluates various ways of displaying tabular data on mobile devices using different responsive table solutions. It also presents a tool to help web developers and designers in the process of choosing and implementing a suitable table approach. The proposed solution for this thesis is a web system called The Visualizing Wizard that allows the user to answer some questions about the intended table and then get a recommended responsive table solution generated based on the answers. The system uses a rule-based approach via Prolog to match the answers to a set of rules and provide an appropriate result. In order to determine which table solutions are more appropriate to use for which type of data a statistical analysis and user tests were performed. The statistical analysis contains an investigation to identify the most common table approaches and data types used on various websites. The result indicates that solutions such as "squish", "collapse by rows", "click" and "scroll" are most common. The most common table categories are product comparison, product offerings, sports and stock market/statistics. This information was used to implement and establish user tests to collect feedback and opinions. The data and statistics gathered from the user tests were mapped into sets of rules to answer the question of which responsive table solution is more appropriate to use for which type of data. This serves as the foundation for The Visualizing Wizard.
4

Braunschweig, Katrin. "Recovering the Semantics of Tabular Web Data." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-184502.

Abstract:
The Web provides a platform for people to share their data, leading to an abundance of accessible information. In recent years, significant research effort has been directed especially at tables on the Web, which form a rich resource for factual and relational data. Applications such as fact search and knowledge base construction benefit from this data, as it is often less ambiguous than unstructured text. However, many traditional information extraction and retrieval techniques are not well suited for Web tables, as they generally do not consider the role of the table structure in reflecting the semantics of the content. Tables provide a compact representation of similarly structured data. Yet, on the Web, tables are very heterogeneous, often with ambiguous semantics and inconsistencies in the quality of the data. Consequently, recognizing the structure and inferring the semantics of these tables is a challenging task that requires a designated table recovery and understanding process. In the literature, many important contributions have been made to implement such a table understanding process that specifically targets Web tables, addressing tasks such as table detection or header recovery. However, the precision and coverage of the data extracted from Web tables is often still quite limited. Due to the complexity of Web table understanding, many techniques developed so far make simplifying assumptions about the table layout or content to limit the amount of contributing factors that must be considered. Thanks to these assumptions, many sub-tasks become manageable. However, the resulting algorithms and techniques often have a limited scope, leading to imprecise or inaccurate results when applied to tables that do not conform to these assumptions. In this thesis, our objective is to extend the Web table understanding process with techniques that enable some of these assumptions to be relaxed, thus improving the scope and accuracy. 
We have conducted a comprehensive analysis of tables available on the Web to examine the characteristic features of these tables, but also identify unique challenges that arise from these characteristics in the table understanding process. To extend the scope of the table understanding process, we introduce extensions to the sub-tasks of table classification and conceptualization. First, we review various table layouts and evaluate alternative approaches to incorporate layout classification into the process. Instead of assuming a single, uniform layout across all tables, recognizing different table layouts enables a wide range of tables to be analyzed in a more accurate and systematic fashion. In addition to the layout, we also consider the conceptual level. To relax the single concept assumption, which expects all attributes in a table to describe the same semantic concept, we propose a semantic normalization approach. By decomposing multi-concept tables into several single-concept tables, we further extend the range of Web tables that can be processed correctly, enabling existing techniques to be applied without significant changes. Furthermore, we address the quality of data extracted from Web tables, by studying the role of context information. Supplementary information from the context is often required to correctly understand the table content, however, the verbosity of the surrounding text can also mislead any table relevance decisions. We first propose a selection algorithm to evaluate the relevance of context information with respect to the table content in order to reduce the noise. Then, we introduce a set of extraction techniques to recover attribute-specific information from the relevant context in order to provide a richer description of the table content. With the extensions proposed in this thesis, we increase the scope and accuracy of Web table understanding, leading to a better utilization of the information contained in tables on the Web.
5

Cappuzzo, Riccardo. "Deep learning models for tabular data curation." Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS047.

Abstract:
Data curation is a pervasive and far-reaching topic, affecting everything from academia to industry. Current solutions rely on manual work by domain users, but they are not adequate. We investigate how to apply deep learning to tabular data curation. We focus our work on developing unsupervised data curation systems and on designing curation systems that intrinsically model categorical values in their raw form. We first implement EmbDI to generate embeddings for tabular data, and address the tasks of entity resolution and schema matching. We then turn to the data imputation problem, using graph neural networks in a multi-task learning framework called GRIMP.
6

Baxter, Jay. "BayesDB : querying the probable implications of tabular data." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/91451.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
BayesDB, a Bayesian database table, lets users query the probable implications of their tabular data as easily as an SQL database lets them query the data itself. Using the built-in Bayesian Query Language (BQL), users with little statistics knowledge can solve basic data science problems, such as detecting predictive relationships between variables, inferring missing values, simulating probable observations, and identifying statistically similar database entries. BayesDB is suitable for analyzing complex, heterogeneous data tables with no preprocessing or parameter adjustment required. This generality rests on the model independence provided by BQL, analogous to the physical data independence provided by the relational model. SQL enables data filtering and aggregation tasks to be described independently of the physical layout of data in memory and on disk. Non-experts rely on generic indexing strategies for good-enough performance, while experts customize schemes and indices for performance-sensitive applications. Analogously, BQL enables analysis tasks to be described independently of the models used to solve them. Non-statisticians can rely on a general-purpose modeling method called CrossCat to build models that are good enough for a broad class of applications, while experts can customize the schemes and models when needed. This thesis defines BQL, describes an implementation of BayesDB, quantitatively characterizes its scalability and performance, and illustrates its efficacy on real-world data analysis problems in the areas of healthcare economics, statistical survey data analysis, web analytics, and predictive policing.
7

Jiang, Ji Chu. "High Precision Deep Learning-Based Tabular Data Extraction." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/41699.

Abstract:
Advances in AI methodologies and computing power enable automation and propel the Industry 4.0 phenomenon. Information and data are digitized more than ever, and millions of documents are processed every day, fueled by the growth of institutions, organizations, and their supply chains. Processing documents is a time-consuming, laborious task, so automating data processing is highly important for optimizing supply chain efficiency across all industries. Document analysis for data extraction is an impactful field, and this thesis aims to achieve the vital steps in an ideal data extraction pipeline. Data is often stored in tables, since a table is a structured format in which the user can easily associate values and attributes. Tables can contain vital information such as specifications, dimensions, and cost. Focusing on table analysis and recognition in documents is therefore a cornerstone of data extraction. This thesis applies deep learning methodologies to automate the two main problems within table analysis for data extraction: table detection and table structure detection. Table detection is identifying and localizing the boundaries of the table. The output of the table detection model is fed into the table structure detection model for structure format analysis, so the table detection model must have high localization performance; otherwise it would affect the rest of the data extraction pipeline. Our table detection model improves bounding box localization performance by incorporating a Kullback–Leibler loss function that calculates the divergence between the probability distributions of the ground truth and predicted bounding boxes, and by adding a voting procedure to the non-maximum suppression step to produce better-localized merged bounding box proposals.
This model improved the precision of table detection by 1.2% while achieving the same recall as other state-of-the-art models on the public ICDAR2013 dataset, and achieved state-of-the-art results of 99.8% precision on the ICDAR2017 dataset. Furthermore, our model showed large improvements, especially at higher intersection over union (IoU) thresholds: at 95% IoU, it improved by 10.9% on the ICDAR2013 dataset and by 8.4% on the ICDAR2017 dataset. Table structure detection is recognizing the internal layout of a table. Researchers often approach this by detecting the rows and columns. However, to correctly map the location of each individual cell in the semantic extraction step, the rows and columns would have to be combined to form a matrix, which introduces additional error. Alternatively, we propose a model that directly detects each individual cell. Our model is an ensemble of state-of-the-art models: Hybrid Task Cascade as the detector and dual ResNeXt101 backbones arranged in a CBNet architecture. There is a lack of quality labeled data for table cell structure detection, so we hand-labeled the ICDAR2013 dataset, and we wish to establish a strong baseline for this dataset. Our model was compared with other state-of-the-art models that excelled at table or table structure detection. Our model yielded a precision of 89.2% and a recall of 98.7% on the ICDAR2013 cell structure dataset.
8

Rahman, Md Anisur. "Tabular Representation of Schema Mappings: Semantics and Algorithms." Thèse, Université d'Ottawa / University of Ottawa, 2011. http://hdl.handle.net/10393/20032.

Abstract:
Our thesis investigates a mechanism for representing schema mappings in tabular form and examines the utility of the new representation. A schema mapping is a high-level specification that describes the relationship between two database schemas. Schema mappings constitute essential building blocks of data integration, data exchange and peer-to-peer data sharing systems. Global-and-local-as-view (GLAV) is one of the approaches for specifying schema mappings. Tableaux are used for expressing queries and functional dependencies on a single database in a tabular form. In our thesis, we first introduce a tabular representation of GLAV mappings. We find that this tabular representation helps to solve many mapping-related algorithmic and semantic problems. For example, a well-known problem is to find the minimal instance of the target schema for a given instance of the source schema and a set of mappings between the source and the target schema. Second, we show that our proposed tabular mapping can be used as an operator on an instance of the source schema to produce an instance of the target schema which is 'minimal' and 'most general' in nature. There exists a tableaux-based mechanism for finding equivalence of two queries. Third, we extend that mechanism to deduce equivalence between two schema mappings using their corresponding tabular representations. Sometimes a schema mapping contains redundant conjuncts, which make data exchange, data integration and data sharing operations more time consuming. Fourth, we present an algorithm that utilizes the tabular representations to reduce the number of constraints in the schema mappings. At present, either schema-level mappings or data-level mappings are used for data sharing purposes. Fifth, we introduce and give the semantics of bi-level mappings, which combine the schema-level and data-level mappings. We also show that bi-level mappings are more effective for data sharing systems.
Finally, we implemented our algorithms and developed a software prototype to evaluate our proposed strategies.
9

Baena, Mirabete Daniel. "Exact and heuristic methods for statistical tabular data protection." Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/456809.

Abstract:
One of the main purposes of National Statistical Agencies (NSAs) is to provide citizens and researchers with a large amount of trustworthy, high-quality statistical information. NSAs must guarantee that no confidential individual information can be obtained from the released statistical outputs. The discipline of statistical disclosure control (SDC) aims to prevent confidential information from being derived from released data while, at the same time, preserving as much of the data utility as possible. NSAs work with two types of data: microdata and tabular data. Microdata files contain records of individuals or respondents (persons or enterprises) with attributes. For instance, a national census might collect attributes such as age, address, salary, etc. Tabular data contains aggregated information obtained by crossing one or more categorical variables from those microdata files. Several SDC methods are available to ensure that no confidential individual information can be obtained from the released microdata or tabular data. This thesis focuses on tabular data protection, although the research carried out can be applied to other classes of problems. Controlled Tabular Adjustment (CTA) and the Cell Suppression Problem (CSP) have been the focus of most recent research in the tabular data protection field. Both methods formulate Mixed Integer Linear Programming problems (MILPs), which are challenging even for tables of moderate size. Even finding a feasible initial solution may be a challenging task for large instances. Because many end users give priority to fast executions and are thus satisfied, in practice, with suboptimal solutions, the first result of this thesis is an improvement of a known and successful heuristic for finding feasible solutions of MILPs, called the feasibility pump.
The new approach, based on the computation of analytic centers, is named the Analytic Center Feasibility Pump. The second contribution consists in the application of the fix-and-relax heuristic (FR) to the CTA method. FR (alone or in combination with other heuristics) is shown to be competitive compared to CPLEX branch-and-cut in terms of quickly finding either a feasible solution or a good upper bound. The last contribution of this thesis deals with general Benders decomposition, which is improved with the application of stabilization techniques. A stabilized Benders decomposition is presented, which focuses on finding new solutions in the neighborhood of "good" points. This approach is efficiently applied to the solution of realistic and real-world CSP instances, outperforming alternative approaches. The first two contributions are already published in indexed journals (Operations Research Letters and Computers and Operations Research). The third contribution is a working paper to be submitted soon.
10

Karlsson, Anton, and Torbjörn Sjöberg. "Synthesis of Tabular Financial Data using Generative Adversarial Networks." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273633.

Abstract:
Digitalization has led to tons of available customer data and possibilities for data-driven innovation. However, the data needs to be handled carefully to protect the privacy of the customers. Generative Adversarial Networks (GANs) are a promising recent development in generative modeling. They can be used to create synthetic data which facilitate analysis while ensuring that customer privacy is maintained. Prior research on GANs has shown impressive results on image data. In this thesis, we investigate the viability of using GANs within the financial industry. We investigate two state-of-the-art GAN models for synthesizing tabular data, TGAN and CTGAN, along with a simpler GAN model that we call WGAN. A comprehensive evaluation framework is developed to facilitate comparison of the synthetic datasets. The results indicate that GANs are able to generate quality synthetic datasets that preserve the statistical properties of the underlying data and enable a viable and reproducible subsequent analysis. It was however found that all of the investigated models had problems with reproducing numerical data.
Digitaliseringen har fört med sig stora mängder tillgänglig kunddata och skapat möjligheter för datadriven innovation. För att skydda kundernas integritet måste dock uppgifterna hanteras varsamt. Generativa Motstidande Nätverk (GANs) är en ny lovande utveckling inom generativ modellering. De kan användas till att syntetisera data som underlättar dataanalys samt bevarar kundernas integritet. Tidigare forskning på GANs har visat lovande resultat på bilddata. I det här examensarbetet undersöker vi gångbarheten av GANs inom finansbranchen. Vi undersöker två framstående GANs designade för att syntetisera tabelldata, TGAN och CTGAN, samt en enklare GAN modell som vi kallar för WGAN. Ett omfattande ramverk för att utvärdera syntetiska dataset utvecklas för att möjliggöra jämförelse mellan olika GANs. Resultaten indikerar att GANs klarar av att syntetisera högkvalitativa dataset som bevarar de statistiska egenskaperna hos det underliggande datat, vilket möjliggör en gångbar och reproducerbar efterföljande analys. Alla modellerna som testades uppvisade dock problem med att återskapa numerisk data.

Books on the topic "Tabular data"

1

Ye, Andre, and Zian Wang. Modern Deep Learning for Tabular Data. Berkeley, CA: Apress, 2023. http://dx.doi.org/10.1007/978-1-4842-8692-0.

2

Gilbert, G. Nigel. Analyzing tabular data: Loglinear and logistic models for social researchers. London: UCL Press, 1993.

3

United States. Bureau of Mines., ed. MULSIM/PC: A personal-computer-based structural analysis program for mine design in deep tabular deposits. Washington, D.C. (810 7th St., N.W., Washington 20241-0001): U.S. Dept. of the Interior, Bureau of Mines, 1992.

5

Donato, D. A. MULSIM/PC: A personal computer-based structural analysis program for mine design in deep tabular deposits. Washington, D.C: U.S. Dept. of the Interior, Bureau of Mines, 1992.

6

Spamer, Earle E. Geology of the Grand Canyon: A guide and index to published graphic and tabular data (excluding paleontology). Boulder, Colo: Geological Society of America, 1990.

7

Taylor, Richard B. GS MRDS: A system based on the data fields used in the national MRDS system but using dBASE III and a microcomputer (IBM PC or compatible) for organizing data on mineral resource occurrences and providing tabular and graphic output. Denver, Colo: U.S. Dept. of the Interior, Geological Survey, 1986.

9

Selner, G. I., Bruce R. Johnson, and Geological Survey (U.S.), eds. GS MRDS: A system based on the data fields used in the national MRDS system but using dBASE III and a microcomputer (IBM PC or compatible) for organizing data on mineral resource occurrences and providing tabular and graphic output. Denver, Colo: U.S. Dept. of the Interior, Geological Survey, 1986.

10

Haworth, Lauren E. PROC TABULATE by example. Cary, NC: SAS Institute, 1999.


Book chapters on the topic "Tabular data"

1

Stowell, Sarah. "Tabular Data." In Using R for Statistics, 73–86. Berkeley, CA: Apress, 2014. http://dx.doi.org/10.1007/978-1-4842-0139-8_6.

2

Domingo-Ferrer, Josep. "Tabular Data." In Encyclopedia of Database Systems, 1. New York, NY: Springer New York, 2014. http://dx.doi.org/10.1007/978-1-4899-7993-3_1493-2.

3

Willenborg, Leon, and Ton de Waal. "Tabular Data." In Statistical Disclosure Control in Practice, 87–111. New York, NY: Springer New York, 1996. http://dx.doi.org/10.1007/978-1-4612-4028-0_6.

4

Dalgaard, Peter. "Tabular data." In Statistics and Computing, 145–54. New York, NY: Springer New York, 2008. http://dx.doi.org/10.1007/978-0-387-79054-1_8.

5

Domingo-Ferrer, Josep. "Tabular Data." In Encyclopedia of Database Systems, 2908. Boston, MA: Springer US, 2009. http://dx.doi.org/10.1007/978-0-387-39940-9_1493.

6

Speegle, Darrin, and Bryan Clair. "Tabular Data." In Probability, Statistics, and Data, 335–70. Boca Raton: Chapman and Hall/CRC, 2021. http://dx.doi.org/10.1201/9781003004899-10.

7

Domingo-Ferrer, Josep. "Tabular Data." In Encyclopedia of Database Systems, 3874. New York, NY: Springer New York, 2018. http://dx.doi.org/10.1007/978-1-4614-8265-9_1493.

8

Keydana, Sigrid. "Tabular Data." In Deep Learning and Scientific Computing with R torch, 201–18. Boca Raton: Chapman and Hall/CRC, 2023. http://dx.doi.org/10.1201/9781003275923-20.

9

Gilbert, Nigel. "Modelling mobility and change." In Analyzing Tabular Data, 82–100. London: Routledge, 2022. http://dx.doi.org/10.4324/9781003259701-7.

10

Gilbert, Nigel. "Choosing and fitting models." In Analyzing Tabular Data, 66–81. London: Routledge, 2022. http://dx.doi.org/10.4324/9781003259701-6.


Conference papers on the topic "Tabular data"

1

Wang, Tengyun, Jibing Wu, Kaiming Xiao, Ningchao Ge, Hang Zhang, Tao Qiu, and Yifan Zeng. "Enhancing Tabular Data Generation through Data and Knowledge Dual-Driven Approaches." In 2024 10th International Conference on Big Data and Information Analytics (BigDIA), 140–45. IEEE, 2024. https://doi.org/10.1109/bigdia63733.2024.10808817.

2

Wen, Yizhu, Yiwei Wang, Kai Yi, Jing Ke, and Yiqing Shen. "Diffimpute: Tabular Data Imputation with Denoising Diffusion Probabilistic Model." In 2024 IEEE International Conference on Multimedia and Expo (ICME), 1–6. IEEE, 2024. http://dx.doi.org/10.1109/icme57554.2024.10687685.

3

Bonnier, Thomas. "Revisiting Multimodal Transformers for Tabular Data with Text Fields." In Findings of the Association for Computational Linguistics ACL 2024, 1481–500. Stroudsburg, PA, USA: Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.findings-acl.87.

4

Yu, Na, Ke Xu, Kaixuan Chen, Shunyu Liu, Tongya Zheng, and Mingli Song. "Multi-Channel Graph Fusion Representation for Tabular Data Imputation." In 2024 International Joint Conference on Neural Networks (IJCNN), 1–8. IEEE, 2024. http://dx.doi.org/10.1109/ijcnn60899.2024.10651425.

5

Apellaniz, Patricia A., Juan Parras, and Santiago Zazo. "An Improved Tabular Data Generator with VAE-GMM Integration." In 2024 32nd European Signal Processing Conference (EUSIPCO), 1886–90. IEEE, 2024. http://dx.doi.org/10.23919/eusipco63174.2024.10715230.

6

Lin, Tong, Jason Yan, David Jurgens, and Sabina J. Tomkins. "Tab2Text - A framework for deep learning with tabular data." In Findings of the Association for Computational Linguistics: EMNLP 2024, 12925–35. Stroudsburg, PA, USA: Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.findings-emnlp.756.

7

Sukhobok, Dina, Nikolay Nikolov, and Dumitru Roman. "Tabular Data Anomaly Patterns." In 2017 International Conference on Big Data Innovations and Applications (Innovate-Data). IEEE, 2017. http://dx.doi.org/10.1109/innovate-data.2017.10.

8

Zhu, Yujin, Zilong Zhao, Robert Birke, and Lydia Y. Chen. "Permutation-Invariant Tabular Data Synthesis." In 2022 IEEE International Conference on Big Data (Big Data). IEEE, 2022. http://dx.doi.org/10.1109/bigdata55660.2022.10020639.

9

Clausner, Christian, Justin Hayes, and Apostolos Antonacopoulos. "Crowdsourcing Historical Tabular Data." In the 5th International Workshop. New York, New York, USA: ACM Press, 2019. http://dx.doi.org/10.1145/3352631.3352643.

10

Top, J. "Adding semantics to tabular agrifood data." In Scientific Symposium FAIR Data Sciences for Green Life Sciences. Wageningen University & Research, 2018. http://dx.doi.org/10.18174/fairdata2018.16285.


Reports on the topic "Tabular data"

1

Garton, Timothy. Data enrichment and enhanced accessibility of waterborne commerce numerical data: spatially depicting the National Waterway Network. Engineer Research and Development Center (U.S.), December 2020. http://dx.doi.org/10.21079/11681/39223.

Abstract:
This report provides methodologies and processes of data enrichment and enhanced accessibility of Waterborne Commerce and Statistics Center (WCSC) maintained databases. These databases house tabular and statistical data that reports on The U.S. Army Corps of Engineers (USACE) Civil Works Division National Waterway Network (NWN), which geospatially represents approximately 1,000 harbors and 25,000 miles of channels and waterways. WCSC is a division of The Institute for Water Resources (IWR). They have been tasked with the international collection, maintenance, and archival of all records involving commercial movements and commerce that occur on federal waterways. The current records structure is a large, tabular dataset and limited to the systems and processes put in place prior to the computing standards and capabilities available today. Methods have been tested and utilized to bring the tabular datasets into an optimized, modern geospatial network and expanded upon to create a higher resolution than previously maintained by the WCSC. This report will expand upon the applied methodologies to optimize data queries and the overall enhancement of the data system to allow for linkages to various other sources of information for commerce data enhancement for decision support assistance.
2

Hazen, T. C. Operations Support of Phase 2 Integrated Demonstration In Situ Bioremediation. Volume 2, Final report: Data in tabular form, Disks 2,3,4. Office of Scientific and Technical Information (OSTI), September 1993. http://dx.doi.org/10.2172/10161904.

3

Hazen, T. C. Operations Support of Phase 2 Integrated Demonstration In Situ Bioremediation. Volume 3, Final report: Data in graphical form, Disks 1,2,3,4; Averaged data in tabular form, Disks 1,2. Office of Scientific and Technical Information (OSTI), September 1993. http://dx.doi.org/10.2172/10161897.

4

Hazen, T. C. Operations Support of Phase 2 Integrated Demonstration In Situ Bioremediation. Volume 4, Final report: Averaged data in tabular form, Disks 3,4; Averaged data in graphical form, Disks 1,2,3,4. Office of Scientific and Technical Information (OSTI), September 1993. http://dx.doi.org/10.2172/10161901.

5

Hazen, T. C. Operations Support of Phase 2 Integrated Demonstration In Situ Bioremediation. Volume 1, Final report: Final report text data in tabular form, Disk 1. Office of Scientific and Technical Information (OSTI), September 1993. http://dx.doi.org/10.2172/10161907.

6

Leis, Flamberg, and Rose. BB78ES8 Vintage Line Pipe Properties via Battelle's Archives. Chantilly, Virginia: Pipeline Research Council International, Inc. (PRCI), October 2008. http://dx.doi.org/10.55274/r0011082.

Abstract:
Pipeline fitness-for-service and maintenance prioritization both require knowledge of the line pipe's mechanical properties and fracture resistance. This report presents such information for a broad range of steels, including vintage Grades A and B as well as early X-grades up through X52 produced from 1930 to 1970. Thirty-six data sets for steels over this range of grades were presented in terms of full-range curves for steels produced up through 1960, while results for another ~150 steels produced until 1970 were summarized in tabular format. Strength and toughness (via CVN and/or DWTT) data were presented with pipe vintage, line-pipe geometry, seam type, steel chemistry, and supplier.
7

Kokurina, O., and A. Burov. Analytical report on the results of an empirical study of the characteristics and level of sociopolitical stability of student youth as a factor of sustainable development of the Russian statehood in the context of current global challenges. SIB-Expertise, December 2022. http://dx.doi.org/10.12731/er0623.06122022.

Abstract:
The analytical report summarizes and interprets the results of an empirical study of the characteristics and level of sociopolitical stability of student youth as a factor in the sustainable development of Russian statehood in the context of contemporary global challenges. The study was conducted as a sociological survey. The report also lists the most significant channels, forms, and directions through which the state youth policy and state national policy of the Russian Federation influence sociopolitical stability. Drawing on the collected and processed material, it presents the respondents' proposals for improving the sociopolitical stability of student youth in modern Russia. The results of the study are given in tabular form, with comments on specific and general findings.
8

Casper, Gary, Stefanie Nadeau, and Thomas Parr. Acoustic amphibian monitoring, 2019 data summary: Isle Royale National Park. National Park Service, December 2022. http://dx.doi.org/10.36967/2295506.

Abstract:
Amphibians are a Vital Sign indicator for monitoring long-term ecosystem health in seven national park units that comprise the Great Lakes Network. We present here the results for 2019 amphibian monitoring at Isle Royale National Park (ISRO). Appendices contain tabular summaries for six years of cumulative results. The National Park Service Great Lakes Inventory and Monitoring Network established 10 permanent acoustic amphibian monitoring sites at ISRO in 2015. Acoustic samples are collected by placing automated recorders with omnidirectional stereo microphones at each of the 10 sampling sites. Temperature loggers co-located with the recorders also collect air temperature during the sampling period. The monitoring program detected all seven species of frog and toad known to occur at ISRO in 2019, with Eastern American Toad, Green Frog and Spring Peeper occurring at almost every site sampled, and Wood Frog at six sites. Gray Treefrog, Mink Frog, and Boreal Chorus Frog were found at only one or two sites each. Northern Leopard Frog has yet to be confirmed at ISRO in this GLKN monitoring program. We expanded analyses and reporting in 2018 to address calling phenology and to provide a second metric for tracking changes in abundance (as opposed to occupancy) across years. Occupancy analyses track whether or not a site was occupied by a species. Abundance is tracked by assessing how the maximum call intensity changes on sites across years, and by how many automated detections are reported from sites across years. Using two independent survey methods, manual and automated, with large sample sizes continues to return reliable results, providing a confident record of site occupancy for most species. There were no significant data collection issues in 2019. Three units stopped collecting data early but these data gaps did not compromise sampling rigor or analysis. 
Since temperature logs show that the threshold of ≥40°F was often exceeded by 1 April in 2019, a 15 March start date for data collection may be considered if park personnel judge that snow and ice cover would also be sufficiently reduced by that date. We also recommend ensuring that temperature logger solar shields do not hang where they can bang against anything in a breeze, as this contaminates the soundscape.
9

Casper, Gary, Stefanie Nadeau, and Thomas Parr. Acoustic amphibian monitoring, 2019 data summary: Sleeping Bear Dunes National Lakeshore. National Park Service, December 2022. http://dx.doi.org/10.36967/2295512.

Abstract:
Amphibians are a Vital Sign indicator for monitoring long-term ecosystem health in seven national park units that comprise the Great Lakes Network. We present here the results for 2019 amphibian monitoring at Sleeping Bear Dunes National Lakeshore (SLBE). Appendices contain tabular summaries for six years of cumulative results. The National Park Service Great Lakes Inventory and Monitoring Network established 10 permanent acoustic amphibian monitoring sites at SLBE in 2013. Acoustic samples are collected by placing automated recorders with omnidirectional stereo microphones at each of the 10 sampling sites. Temperature loggers co-located with the recorders also collect air temperature during the sampling period. We expanded analyses and reporting in 2018 to address calling phenology and to provide a second metric for tracking changes in abundance across years. Occupancy analyses track whether or not a site was occupied by a species. Abundance is tracked by assessing how the maximum call intensity changes on sites across years, and by how many automated detections are reported from sites across years. Using two independent survey methods, manual and automated, with large sample sizes continues to return reliable results, providing a confident record of site occupancy for most species. The monitoring program detected five of the six species of frog and toad known to occur at SLBE in 2019, with Eastern American Toad, Gray Treefrog, Green Frog and Spring Peeper occurring at almost every site sampled. Wood Frog was found at one new site, and Northern Leopard Frog was not confirmed in 2019 but was detected at five sites in 2018. There were no significant data collection issues in 2019 except for late deployment of SLBE11, which limited data analyses for this site. Remaining sites successfully collected data as programmed. Cumulative data collection result summaries since inception are provided in appendices. 
Since temperature logs show that the threshold of ≥40°F was often exceeded by 1 April in 2019, making 15 March a start date for data collection may be considered if park personnel feel snow and ice cover would be reduced enough by that date as well.
10

Casper, Gary, Stefanie Nadeau, and Thomas Parr. Acoustic amphibian monitoring, 2019 data summary: Pictured Rocks National Lakeshore. National Park Service, December 2022. http://dx.doi.org/10.36967/2295509.

Abstract:
Amphibians are a Vital Sign indicator for monitoring long-term ecosystem health in seven national park units that comprise the Great Lakes Network. We present here the results for 2019 amphibian monitoring at Pictured Rocks National Lakeshore (PIRO). Appendices contain tabular summaries for six years of cumulative results. The National Park Service Great Lakes Inventory and Monitoring Network established 10 permanent acoustic amphibian monitoring sites at PIRO in 2013. Acoustic samples are collected by placing automated recorders with omnidirectional stereo microphones at each of the 10 sampling sites. Temperature loggers co-located with the recorders also collect air temperature during the sampling period. We expanded analyses and reporting in 2018 to address calling phenology and to provide a second metric for tracking changes in abundance across years. Occupancy analyses track whether or not a site was occupied by a species. Abundance is tracked by assessing how the maximum call intensity changes on sites across years, and by how many automated detections are reported from sites across years. Using two independent survey methods, manual and automated, with large sample sizes continues to return reliable results, providing a confident record of site occupancy for most species. The monitoring program detected five of the six species of frog and toad known to occur at PIRO in 2019, with Eastern American Toad, Gray Treefrog, Green Frog, and Spring Peeper occurring at almost every site sampled. Wood Frog was found at five sites. Mink Frog is known to occur at Sand Point but has never been confirmed at sites monitored by this GLKN program. Additional species of potential occurrence remain hypothetical (i.e., Northern Leopard Frog). The only significant data collection issue in 2019 was at PIRO02, where the equipment recorded only intermittently resulting in only partial data analysis possible. Remaining sites successfully collected data as programmed. 
Cumulative program result summaries since inception are provided in appendices. Temperature logs in 2019 showed that the threshold of ≥40°F was uniformly exceeded by 1 May, hence we recommend making 10 April the target start date for data collection in future. This could be accomplished by fall deployment of recorders on delayed starts. We also recommend making sure that recorders are mounted 6–10 feet high to better survey the soundscape with less interference from foliage, and that temperature loggers be placed within solar shields.