Academic literature on the topic 'Data-to-text'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Data-to-text.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Data-to-text"

1

Yang, Sen, and Yang Liu. "Data-to-text Generation via Planning." Journal of Physics: Conference Series 1827, no. 1 (March 1, 2021): 012190. http://dx.doi.org/10.1088/1742-6596/1827/1/012190.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Puduppully, Ratish, and Mirella Lapata. "Data-to-text Generation with Macro Planning." Transactions of the Association for Computational Linguistics 9 (2021): 510–27. http://dx.doi.org/10.1162/tacl_a_00381.

Full text
Abstract:
Recent approaches to data-to-text generation have adopted the very successful encoder-decoder architecture or variants thereof. These models generate text that is fluent (but often imprecise) and perform quite poorly at selecting appropriate content and ordering it coherently. To overcome some of these issues, we propose a neural model with a macro planning stage followed by a generation stage reminiscent of traditional methods which embrace separate modules for planning and surface realization. Macro plans represent high level organization of important content such as entities, events, and their interactions; they are learned from data and given as input to the generator. Extensive experiments on two data-to-text benchmarks (RotoWire and MLB) show that our approach outperforms competitive baselines in terms of automatic and human evaluation.
APA, Harvard, Vancouver, ISO, and other styles
3

Zhang, Dell, Jiahao Yuan, Xiaoling Wang, and Adam Foster. "Probabilistic Verb Selection for Data-to-Text Generation." Transactions of the Association for Computational Linguistics 6 (December 2018): 511–27. http://dx.doi.org/10.1162/tacl_a_00038.

Full text
Abstract:
In data-to-text Natural Language Generation (NLG) systems, computers need to find the right words to describe phenomena seen in the data. This paper focuses on the problem of choosing appropriate verbs to express the direction and magnitude of a percentage change (e.g., in stock prices). Rather than simply using the same verbs again and again, we present a principled data-driven approach to this problem based on Shannon’s noisy-channel model so as to bring variation and naturalness into the generated text. Our experiments on three large-scale real-world news corpora demonstrate that the proposed probabilistic model can be learned to accurately imitate human authors’ pattern of usage around verbs, outperforming the state-of-the-art method significantly.
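The verb-choice problem described in this abstract can be illustrated with a small sketch. The verb inventory, probabilities, and magnitude cut-off below are invented for illustration; the paper itself estimates such distributions from news corpora with a noisy-channel model.

```python
import random

# Toy probabilistic verb selection for a percentage change.
# Distributions and the 5% magnitude threshold are hypothetical.
VERBS = {
    ("up", "large"): [("soar", 0.5), ("surge", 0.3), ("jump", 0.2)],
    ("up", "small"): [("rise", 0.6), ("edge up", 0.4)],
    ("down", "large"): [("plunge", 0.6), ("tumble", 0.4)],
    ("down", "small"): [("dip", 0.5), ("slip", 0.5)],
}

def choose_verb(change_pct: float, rng: random.Random) -> str:
    direction = "up" if change_pct >= 0 else "down"
    magnitude = "large" if abs(change_pct) >= 5.0 else "small"
    verbs, weights = zip(*VERBS[(direction, magnitude)])
    # Sampling rather than taking the argmax brings variation
    # and naturalness into the generated text.
    return rng.choices(verbs, weights=weights, k=1)[0]

rng = random.Random(42)
print(choose_verb(-7.2, rng))  # one of "plunge" / "tumble"
print(choose_verb(1.3, rng))   # one of "rise" / "edge up"
```

Sampling from a conditional distribution, instead of always emitting the single most likely verb, is what lets the generated text vary the way human-authored financial news does.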
APA, Harvard, Vancouver, ISO, and other styles
4

Rüdiger, Matthias, David Antons, and Torsten Oliver Salge. "From Text to Data: On The Role and Effect of Text Pre-Processing in Text Mining Research." Academy of Management Proceedings 2017, no. 1 (August 2017): 16353. http://dx.doi.org/10.5465/ambpp.2017.16353abstract.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Iso, Hayate, Yui Uehara, Tatsuya Ishigaki, Hiroshi Noji, Eiji Aramaki, Ichiro Kobayashi, Yusuke Miyao, Naoaki Okazaki, and Hiroya Takamura. "Learning to Select, Track, and Generate for Data-to-Text." Journal of Natural Language Processing 27, no. 3 (September 15, 2020): 599–626. http://dx.doi.org/10.5715/jnlp.27.599.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Riza, Lala Septem, Muhammad Ridwan, Enjun Junaeti, and Khyrina Airin Fariza Abu Samah. "Development of data-to-text (D2T) on generic data using fuzzy sets." International Journal of Advanced Technology and Engineering Exploration 8, no. 75 (February 28, 2021): 382–90. http://dx.doi.org/10.19101/ijatee.2020.762134.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Puduppully, Ratish, Li Dong, and Mirella Lapata. "Data-to-Text Generation with Content Selection and Planning." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6908–15. http://dx.doi.org/10.1609/aaai.v33i01.33016908.

Full text
Abstract:
Recent advances in data-to-text generation have led to the use of large-scale datasets and neural network models which are trained end-to-end, without explicitly modeling what to say and in what order. In this work, we present a neural network architecture which incorporates content selection and planning without sacrificing end-to-end training. We decompose the generation task into two stages. Given a corpus of data records (paired with descriptive documents), we first generate a content plan highlighting which information should be mentioned and in which order and then generate the document while taking the content plan into account. Automatic and human-based evaluation experiments show that our model outperforms strong baselines, improving the state-of-the-art on the recently released RotoWIRE dataset.
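The two-stage decomposition the authors describe, first a content plan and then realization, can be sketched in miniature. The records, the selection heuristic, and the templates below are all invented stand-ins; in the paper both stages are learned neural models trained end-to-end.

```python
# Toy sketch of "what to say and in what order" followed by generation.
records = [
    {"entity": "Hawks", "type": "TEAM_PTS", "value": 95},
    {"entity": "Magic", "type": "TEAM_PTS", "value": 88},
    {"entity": "Millsap", "type": "PLAYER_PTS", "value": 28},
    {"entity": "Millsap", "type": "PLAYER_REB", "value": 4},
]

def content_plan(records):
    # Stand-in for learned content selection: keep salient records
    # (here, high values) and order team results before player stats.
    selected = [r for r in records if r["value"] >= 20]
    return sorted(selected, key=lambda r: r["type"] != "TEAM_PTS")

TEMPLATES = {
    "TEAM_PTS": "The {entity} scored {value} points.",
    "PLAYER_PTS": "{entity} led with {value} points.",
}

def realize(plan):
    # Stand-in for the learned generator: verbalize each planned record.
    return " ".join(TEMPLATES[r["type"]].format(**r) for r in plan)

print(realize(content_plan(records)))
# The Hawks scored 95 points. The Magic scored 88 points. Millsap led with 28 points.
```

The point of the decomposition is that content selection errors (mentioning the rebound record, or the teams in the wrong order) become visible in the intermediate plan rather than being buried inside a single end-to-end decoder.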
APA, Harvard, Vancouver, ISO, and other styles
8

Guru, D. S., K. Swarnalatha, N. Vinay Kumar, and Basavaraj S. Anami. "Effective Technique to Reduce the Dimension of Text Data." International Journal of Computer Vision and Image Processing 10, no. 1 (January 2020): 67–85. http://dx.doi.org/10.4018/ijcvip.2020010104.

Full text
Abstract:
In this article, features are selected using feature clustering and ranking of features for imbalanced text data. Initially, the text documents are represented in lower dimension using the term class relevance (TCR) method. The class-wise clustering is recommended to balance the documents in each class. Subsequently, the clusters are treated as classes and the documents of each cluster are represented in the lower dimensional form using the TCR again. The features are clustered and for each feature cluster the cluster representative is selected and these representatives are used as selected features of the documents. Hence, this proposed model reduces the dimension to a smaller number of features. For selecting the cluster representative, four feature evaluation methods are used and classification is done by using an SVM classifier. The performance of the method is compared with the global feature ranking method. The experiment is conducted on two benchmark datasets: the Reuters-21578 and the TDT2 dataset. The experimental results show that this method performs well when compared to the other existing works.
APA, Harvard, Vancouver, ISO, and other styles
9

Al Rababaa, Mamoun Suleiman, and Essam Said Hanandeh. "The Automated VSMs to Categorize Arabic Text Data Sets." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 13, no. 1 (March 31, 2014): 4074–81. http://dx.doi.org/10.24297/ijct.v13i1.2925.

Full text
Abstract:
Text Categorization is one of the most important tasks in information retrieval and data mining. This paper aims at investigating different variations of vector space models (VSMs) using the KNN algorithm. We used 242 Arabic abstract documents that were used by Hmeidi and Kanaan (1997). The bases of our comparison are the most popular text evaluation measures: the Recall, Precision, and F1 measures. The experimental results on the Saudi data sets reveal that the Cosine coefficient outperformed the Dice and Jaccard coefficients.
APA, Harvard, Vancouver, ISO, and other styles
10

Gkatzia, Dimitra, Oliver Lemon, and Verena Rieser. "Data-to-Text Generation Improves Decision-Making Under Uncertainty." IEEE Computational Intelligence Magazine 12, no. 3 (August 2017): 10–17. http://dx.doi.org/10.1109/mci.2017.2708998.

Full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Dissertations / Theses on the topic "Data-to-text"

1

Kyle, Cameron. "Data to information to text summaries of financial data." Master's thesis, University of Cape Town, 2018. http://hdl.handle.net/11427/29643.

Full text
Abstract:
The field of auditing is becoming increasingly dependent on information technology as auditors are forced to follow the increasingly complex information processing of their clients. There exists a need for a system that can convert vast quantities of data generated by existing systems and data analytics techniques into usable information and then into a format that is easy for someone not trained in data analytics to understand. This is possible through Natural Language Generation (NLG). The field of auditing has not previously been applied to this pipeline. This research looks at the auditing of Investment Fund Management, where one specific procedure is the comparison of two time series (one of the fund being tested and another of the benchmark it is supposed to follow) to identify potential misstatements in the investment fund. We solve this problem through a combination of incremental innovations on existing techniques in the text planning stage as well as pre-NLG processing steps, with effective leveraging of accepted sentence planning and realisation techniques. Additionally, fuzzy logic is used to provide a more human decision system. This allows the system to transform data into information and then into text. This has been evaluated by experts and achieved positive results with regard to audit impact, readability and understandability, while falling slightly short of the stated accuracy targets. These preliminary results are positive in general and are therefore encouraging for further development.
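As a rough illustration of the fuzzy-logic step mentioned in this abstract, the sketch below grades how far a fund deviates from its benchmark instead of applying a hard pass/fail cut-off. The membership breakpoints and toy series are invented; the thesis's actual pre-NLG processing and decision system are more elaborate.

```python
# Fuzzy membership in a hypothetical "potential misstatement" set:
# 0.0 up to a 1% gap, rising linearly to 1.0 at a 5% gap.
def suspicion(gap_pct: float) -> float:
    g = abs(gap_pct)
    if g <= 1.0:
        return 0.0
    if g >= 5.0:
        return 1.0
    return (g - 1.0) / 4.0

# Toy fund and benchmark time series (indexed values).
fund = [100.0, 101.0, 103.0, 99.0]
benchmark = [100.0, 100.5, 101.0, 101.5]

# Percentage gap of the fund relative to its benchmark at each point.
gaps = [100.0 * (f - b) / b for f, b in zip(fund, benchmark)]
flags = [suspicion(g) for g in gaps]

for t, (g, s) in enumerate(zip(gaps, flags)):
    print(f"t={t}: gap={g:+.2f}%  suspicion={s:.2f}")
```

A graded score like this gives the downstream text planner something more human to verbalize ("the fund drifted moderately from its benchmark in March") than a binary exception flag would.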
APA, Harvard, Vancouver, ISO, and other styles
2

Gkatzia, Dimitra. "Data-driven approaches to content selection for data-to-text generation." Thesis, Heriot-Watt University, 2015. http://hdl.handle.net/10399/3003.

Full text
Abstract:
Data-to-text systems are powerful in generating reports from data automatically and thus they simplify the presentation of complex data. Rather than presenting data using visualisation techniques, data-to-text systems use human language, which is the most common way for human-human communication. In addition, data-to-text systems can adapt their output content to users’ preferences, background or interests and therefore they can be pleasant for users to interact with. Content selection is an important part of every data-to-text system, because it is the module that decides which from the available information should be conveyed to the user. This thesis makes three important contributions. Firstly, it investigates data-driven approaches to content selection with respect to users’ preferences. It develops, compares and evaluates two novel content selection methods. The first method treats content selection as a Markov Decision Process (MDP), where the content selection decisions are made sequentially, i.e. given the already chosen content, decide what to talk about next. The MDP is solved using Reinforcement Learning (RL) and is optimised with respect to a cumulative reward function. The second approach considers all content selection decisions simultaneously by taking into account data relationships and treats content selection as a multi-label classification task. The evaluation shows that the users significantly prefer the output produced by the RL framework, whereas the multi-label classification approach scores significantly higher than the RL method in automatic metrics. The results also show that the end users’ preferences should be taken into account when developing Natural Language Generation (NLG) systems. NLG systems are developed with the assistance of domain experts, however the end users are normally non-experts. Consider for instance a student feedback generation system, where the system imitates the teachers. 
The system will produce feedback based on the lecturers’ rather than the students’ preferences although students are the end users. Therefore, the second contribution of this thesis is an approach that adapts the content to “speakers” and “hearers” simultaneously. It initially considers two types of known stakeholders: lecturers and students. It develops a novel approach that analyses the preferences of the two groups using Principal Component Regression and uses the derived knowledge to hand-craft a reward function that is then optimised using RL. The results show that the end users prefer the output generated by this system, rather than the output that is generated by a system that mimics the experts. Therefore, it is possible to model the middle ground of the preferences of different known stakeholders. In most real world applications however, first-time users are generally unknown, which is a common problem for NLG and interactive systems: the system cannot adapt to user preferences without prior knowledge. This thesis contributes a novel framework for addressing unknown stakeholders such as first-time users, using Multi-objective Optimisation to minimise regret for multiple possible user types. In this framework, the content preferences of potential users are modelled as objective functions, which are simultaneously optimised using Multi-objective Optimisation. This approach outperforms two meaningful baselines and minimises regret for unknown users.
APA, Harvard, Vancouver, ISO, and other styles
3

Turner, Ross. "Georeferenced data-to-text: techniques and application." Thesis, Available from the University of Aberdeen Library and Historic Collections Digital Resources, 2009. http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?application=DIGITOOL-3&owner=resourcediscovery&custom_att_2=simple_viewer&pid=56243.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Štajner, Sanja. "New data-driven approaches to text simplification." Thesis, University of Wolverhampton, 2015. http://hdl.handle.net/2436/554413.

Full text
Abstract:
Many texts we encounter in our everyday lives are lexically and syntactically very complex. This makes them difficult to understand for people with intellectual or reading impairments, and difficult for various natural language processing systems to process. This motivated the need for text simplification (TS) which transforms texts into their simpler variants. Given that this is still a relatively new research area, many challenges are still remaining. The focus of this thesis is on better understanding the current problems in automatic text simplification (ATS) and proposing new data-driven approaches to solving them. We propose methods for learning sentence splitting and deletion decisions, built upon parallel corpora of original and manually simplified Spanish texts, which outperform the existing similar systems. Our experiments in adaptation of those methods to different text genres and target populations report promising results, thus offering one possible solution for dealing with the scarcity of parallel corpora for text simplification aimed at specific target populations, which is currently one of the main issues in ATS. The results of our extensive analysis of the phrase-based statistical machine translation (PB-SMT) approach to ATS reject the widespread assumption that the success of that approach largely depends on the size of the training and development datasets. They indicate more influential factors for the success of the PB-SMT approach to ATS, and reveal some important differences between cross-lingual MT and the monolingual v MT used in ATS. Our event-based system for simplifying news stories in English (EventSimplify) overcomes some of the main problems in ATS. It does not require a large number of handcrafted simplification rules nor parallel data, and it performs significant content reduction. 
The automatic and human evaluations conducted show that it produces grammatical text and increases readability, preserving and simplifying relevant content and reducing irrelevant content. Finally, this thesis addresses another important issue in TS which is how to automatically evaluate the performance of TS systems given that access to the target users might be difficult. Our experiments indicate that existing readability metrics can successfully be used for this task when enriched with human evaluation of grammaticality and preservation of meaning.
APA, Harvard, Vancouver, ISO, and other styles
5

Jones, Greg, 1963-2017. "RADIX 95n: Binary-to-Text Data Conversion." Thesis, University of North Texas, 1991. https://digital.library.unt.edu/ark:/67531/metadc500582/.

Full text
Abstract:
This paper presents Radix 95n, a binary to text data conversion algorithm. Radix 95n (base 95) is a variable length encoding scheme that offers slightly better efficiency than is available with conventional fixed length encoding procedures. Radix 95n advances previous techniques by allowing a greater pool of 7-bit combinations to be made available for 8-bit data translation. Since 8-bit data (i.e. binary files) can prove to be difficult to transfer over 7-bit networks, the Radix 95n conversion technique provides a way to convert data such as compiled programs or graphic images to printable ASCII characters and allows for their transfer over 7-bit networks.
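To make the idea concrete, here is a minimal base-95 round trip over the 95 printable ASCII characters. This is a simplified fixed-interpretation sketch, not the thesis's exact variable-length Radix 95n scheme, and the helper names are invented.

```python
# Minimal base-95 binary-to-text conversion using the 95 printable
# ASCII characters (codes 32..126), so 8-bit data survives 7-bit links.
ALPHABET = [chr(i) for i in range(32, 127)]

def encode_base95(data: bytes) -> str:
    # Interpret the bytes as one big integer, then rewrite it in base 95.
    n = int.from_bytes(data, "big")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, r = divmod(n, 95)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))

def decode_base95(text: str, length: int) -> bytes:
    # The original byte length is passed in, because leading zero
    # bytes vanish in the integer interpretation.
    n = 0
    for ch in text:
        n = n * 95 + (ord(ch) - 32)
    return n.to_bytes(length, "big")

payload = bytes([0x00, 0xFF, 0x10, 0x80])
encoded = encode_base95(payload)
assert decode_base95(encoded, len(payload)) == payload
```

Because 95 symbols are available instead of the 64 used by fixed-length schemes like Base64, each output character carries more information, which is the efficiency gain the abstract refers to.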
APA, Harvard, Vancouver, ISO, and other styles
6

Štajner, Sanja. "New data-driven approaches to text simplification." Thesis, University of Wolverhampton, 2016. http://hdl.handle.net/2436/601113.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Rose, Øystein. "Text Mining in Health Records : Classification of Text to Facilitate Information Flow and Data Overview." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9629.

Full text
Abstract:

This project consists of two parts. In the first part we apply techniques from the field of text mining to classify sentences in encounter notes of the electronic health record (EHR) into classes of subjective, objective, and plan character. This is a simplification of the SOAP standard, and is applied due to the way GPs structure the encounter notes. Structuring the information in a subjective, objective, and plan way may enhance future information flow between the EHR and the personal health record (PHR). In the second part of the project we apply the most adequate classifier to encounter notes from patient histories of patients suffering from diabetes. We believe that the distribution of sentences of a subjective, objective, and plan character changes according to different phases of diseases. In our work we experiment with several preprocessing techniques, classifiers, and amounts of data. Of the classifiers considered, we find that Complement Naive Bayes (CNB) produces the best result, both when the preprocessing of the data has taken place and when it has not. On the raw dataset, CNB yields an accuracy of 81.03%, while on the preprocessed dataset, CNB yields an accuracy of 81.95%. The Support Vector Machines (SVM) classifier algorithm yields results comparable to the results obtained by use of CNB, while the J48 classifier algorithm performs worse. Concerning preprocessing techniques, we find that use of techniques reducing the dimensionality of the datasets improves the results for smaller attribute sets, but worsens the result for larger attribute sets. The trend is opposite for preprocessing techniques that expand the set of attributes. However, finding the ratio between the size of the dataset and the number of attributes, where the preprocessing techniques improve the result, is difficult. Hence, preprocessing techniques are not applied in the second part of the project.
From the result of the classification of the patient histories we have extracted graphs that show how the sentence class distribution after the first diagnosis of diabetes is set. Although no empiric research is carried out, we believe that such graphs may, through further research, facilitate the recognition of points of interest in the patient history. From the same results we also create graphs that show the average distribution of sentences of subjective, objective, and plan character for 429 patients after the first diagnosis of diabetes is set. From these graphs we find evidence that there is an overrepresentation of subjective sentences in encounter notes where the diagnosis of diabetes is first set. However, we believe that similar experiments for several diseases, may uncover patterns or trends concerning the diseases in focus.

APA, Harvard, Vancouver, ISO, and other styles
8

Ma, Yimin. "Text classification on imbalanced data: Application to systematic reviews automation." Thesis, University of Ottawa (Canada), 2007. http://hdl.handle.net/10393/27532.

Full text
Abstract:
Systematic Review is the basic process of Evidence-based Medicine, and consequently there is urgent need for tools assisting and eventually automating a large part of this process. In the traditional Systematic Review System, reviewers or domain experts manually classify literature into a relevant class and an irrelevant class through a series of systematic review levels. In our work with TrialStat, we apply text classification techniques to a Systematic Review System in order to minimize the human effort in identifying relevant literature. In most cases, the relevant articles are a small portion of the Medline corpus. The first essential issue for this task is achieving high recall for those relevant articles. We also face two technical challenges: handling imbalanced data, and reducing the size of the labeled training set. To address these issues, we first study the feature selection and sample selection bias caused by the skewed data. We then experimented with different feature selection, sample selection, and classification methods to find the ones that can properly handle these problems. In order to minimize the labeled training set size, we also experimented with active learning techniques. Active learning selects the most informative instances to be labeled, so that the required training examples are reduced while the performance is guaranteed. By using an active learning technique, we saved 86% of the effort required to label the training examples. The best testing result was obtained by combining the feature selection method Modified BNS, the sample selection method clustering-based sample selection, and active learning with Naive Bayes as the classifier. We achieved 100% recall for the minority class with an overall accuracy of 58.43%. By achieving work saved over sampling (WSS) of 53.4%, we saved half of the workload for the reviewers.
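The active-learning idea this abstract relies on can be sketched as pool-based uncertainty sampling: ask the reviewer to label the article the classifier is least sure about. The probabilities and article identifiers below are hypothetical; the thesis combines active learning with Naive Bayes and specific feature- and sample-selection methods.

```python
def uncertainty(p_relevant: float) -> float:
    # Uncertainty peaks at 1.0 when the classifier is maximally
    # unsure (p = 0.5) and falls to 0.0 at p = 0.0 or p = 1.0.
    return 1.0 - 2.0 * abs(p_relevant - 0.5)

def query_next(pool: dict) -> str:
    # Pick the unlabeled article whose predicted probability of
    # relevance is closest to the 0.5 decision boundary.
    return max(pool, key=lambda doc: uncertainty(pool[doc]))

# Hypothetical classifier scores for four unlabeled articles.
pool = {"art_1": 0.97, "art_2": 0.08, "art_3": 0.55, "art_4": 0.72}
print(query_next(pool))  # "art_3": nearest to the boundary
```

Labeling only the most informative articles, rather than a random sample, is how the reported 86% saving in labeling effort becomes possible.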
APA, Harvard, Vancouver, ISO, and other styles
9

Salah, Aghiles. "Von Mises-Fisher based (co-)clustering for high-dimensional sparse data : application to text and collaborative filtering data." Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB093/document.

Full text
Abstract:
Cluster analysis or clustering, which aims to group together similar objects, is undoubtedly a very powerful unsupervised learning technique. With the growing amount of available data, clustering is increasingly gaining in importance in various areas of data science for several reasons, such as automatic summarization, dimensionality reduction, visualization, outlier detection, speeding up search engines, organization of huge data sets, etc. Existing clustering approaches are, however, severely challenged by the high dimensionality and extreme sparsity of the data sets arising in some current areas of interest, such as Collaborative Filtering (CF) and text mining. Such data often consist of thousands of features and more than 95% zero entries. In addition to being high dimensional and sparse, the data sets encountered in the aforementioned domains are also directional in nature. In fact, several previous studies have empirically demonstrated that directional measures (which measure the distance between objects relative to the angle between them), such as the cosine similarity, are substantially superior to other measures, such as Euclidean distortions, for clustering text documents or assessing the similarities between users/items in CF. This suggests that in such a context only the direction of a data vector (e.g., a text document) is relevant, not its magnitude. It is worth noting that the cosine similarity is exactly the scalar product between unit-length data vectors, i.e., L2-normalized vectors. Thus, from a probabilistic perspective, using the cosine similarity is equivalent to assuming that the data are directional data distributed on the surface of a unit hypersphere.
Despite the substantial empirical evidence that certain high dimensional sparse data sets, such as those encountered in the above domains, are better modeled as directional data, most existing models in text mining and CF are based on popular assumptions such as Gaussian, Multinomial or Bernoulli distributions, which are inadequate for L2-normalized data. In this thesis, we focus on the two challenging tasks of text document clustering and item recommendation, which are still attracting a lot of attention in the domains of text mining and CF, respectively. In order to address the above limitations, we propose a suite of new models and algorithms which rely on the von Mises-Fisher (vMF) assumption that arises naturally for directional data lying on a unit hypersphere.
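The abstract's observation that cosine similarity is just the dot product of L2-normalized vectors can be checked directly; the toy vectors below are invented for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def cosine(u, v):
    return dot(u, v) / (norm(u) * norm(v))

u = [3.0, 0.0, 4.0]
v = [1.0, 2.0, 2.0]

# L2-normalize: project each vector onto the unit hypersphere.
u_hat = [a / norm(u) for a in u]
v_hat = [a / norm(v) for a in v]

# Cosine similarity equals the plain dot product of the unit vectors,
# so only direction matters, not magnitude.
assert math.isclose(cosine(u, v), dot(u_hat, v_hat))
assert math.isclose(cosine([10 * a for a in u], v), cosine(u, v))
```

This is exactly why modeling such data on the unit hypersphere (the von Mises-Fisher assumption) is a better fit than Gaussian or Multinomial assumptions: after normalization, all that distinguishes two documents is the angle between them.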
APA, Harvard, Vancouver, ISO, and other styles
10

Natarajan, Jeyakumar. "Text mining of biomedical literature and its applications to microarray data analysis and interpretation." Thesis, University of Ulster, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445041.

Full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Books on the topic "Data-to-text"

1

McLean, Gerald. More like this: Development of digital file management methodologies (including linking to text data) for integrating text and images. London: LCP, 2003.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Keyes, Bettye. Voice writing method. Little Rock, Ark: VoiceCAT Corp., 2005.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

Fogelman-Soulié, Françoise, 1948-, North Atlantic Treaty Organization Public Diplomacy Division, and ebrary Inc, eds. Mining massive data sets for security: Advances in data mining, search, social networks and text mining, and their applications to security. Amsterdam: IOS Press, 2008.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
4

Thanassoulis, Emmanuel. Introduction to the theory and application of data envelopment analysis: A foundation text with integrated software. Norwell, Mass: Kluwer Academic Publishers, 2001.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

Analyzing streams of language: Twelve steps to the systematic coding of text, talk, and other verbal data. New York: Longman, 2003.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Bateson, Teresa M. Report on parsing and construction of prototype that will accept freeform text data from publishers' sites and parse this data for automatic entry to book database. [s.l: The Author], 2001.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
7

Schneider, G. Michael. Modula-2 supplement to accompany 'Concepts in data structures and software development': A text for the second course in Computer Science. St. Paul, MN: West Publishing Co., 1991.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
8

Schneider, G. Michael. ADA supplement to accompany 'Concepts in data structures and software development': A text for the second course in Computer Science. St. Paul, MN: West Publishing Co., 1991.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
9

Kōpasu to tekisuto mainingu: Corpus & text mining. Tōkyō-to Bunkyō-ku: Kyōritsu Shuppan, 2012.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
10

Marchese, Francis T. Knowledge Visualization Currents: From Text to Art to Culture. London: Springer London, 2013.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Book chapters on the topic "Data-to-text"

1

Gardent, Claire. "Syntax and Data-to-Text Generation." In Statistical Language and Speech Processing, 3–20. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-11397-5_1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Ruiz, José Antonio Álvarez. "Learning to Discriminate Text from Synthetic Data." In Lecture Notes in Computer Science, 270–81. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-32060-6_23.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Suchowolec, Karolina, Christian Lang, Roman Schneider, and Horst Schwinn. "Shifting Complexity from Text to Data Model." In Lecture Notes in Computer Science, 203–12. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-59888-8_18.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Upadhyay, Ashish, Stewart Massie, Ritwik Kumar Singh, Garima Gupta, and Muneendra Ojha. "A Case-Based Approach to Data-to-Text Generation." In Case-Based Reasoning Research and Development, 232–47. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-86957-1_16.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Balbi, Simona, and Emilio Meglio. "Contributions of Textual Data Analysis to Text Retrieval." In Classification, Clustering, and Data Mining Applications, 511–20. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-642-17103-1_48.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Rebuffel, Clément, Laure Soulier, Geoffrey Scoutheeten, and Patrick Gallinari. "A Hierarchical Model for Data-to-Text Generation." In Lecture Notes in Computer Science, 65–80. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-45439-5_5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Pauws, Steffen, Albert Gatt, Emiel Krahmer, and Ehud Reiter. "Making Effective Use of Healthcare Data Using Data-to-Text Technology." In Data Science for Healthcare, 119–45. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-05249-2_4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Rezk, Martín, Jungyeul Park, Yoon Yongun, Kyungtae Lim, John Larsen, YoungGyun Hahm, and Key-Sun Choi. "Korean Linked Data on the Web: Text to RDF." In Semantic Technology, 368–74. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37996-3_31.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Manthan, C. S., K. Roopa, H. S. Bindu, B. M. Apoorva, and Mala D. Madar. "Pseudonymization of Text and Image Data to Provide Confidentiality." In Lecture Notes on Data Engineering and Communications Technologies, 553–64. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-33-4968-1_43.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Song, Jianjie, and Hean Liu. "Application of Text Data Mining to Education in Long-Distance." In Lecture Notes in Electrical Engineering, 745–51. Dordrecht: Springer Netherlands, 2011. http://dx.doi.org/10.1007/978-94-007-1839-5_80.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Data-to-text"

1

Sanchez, D., M. J. Martin-Bautista, I. Blanco, and C. Justicia de la Torre. "Text Knowledge Mining: An Alternative to Text Data Mining." In 2008 IEEE International Conference on Data Mining Workshops. IEEE, 2008. http://dx.doi.org/10.1109/icdmw.2008.57.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Perez-Beltrachini, Laura, and Claire Gardent. "Analysing Data-To-Text Generation Benchmarks." In Proceedings of the 10th International Conference on Natural Language Generation. Stroudsburg, PA, USA: Association for Computational Linguistics, 2017. http://dx.doi.org/10.18653/v1/w17-3537.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Ma, Long, and Yanqing Zhang. "Using Word2Vec to process big text data." In 2015 IEEE International Conference on Big Data (Big Data). IEEE, 2015. http://dx.doi.org/10.1109/bigdata.2015.7364114.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Puduppully, Ratish, Li Dong, and Mirella Lapata. "Data-to-text Generation with Entity Modeling." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-1195.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Lin, Shuai, Wentao Wang, Zichao Yang, Xiaodan Liang, Frank F. Xu, Eric Xing, and Zhiting Hu. "Data-to-Text Generation with Style Imitation." In Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.findings-emnlp.144.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Reiter, Ehud. "An architecture for data-to-text systems." In the Eleventh European Workshop. Morristown, NJ, USA: Association for Computational Linguistics, 2007. http://dx.doi.org/10.3115/1610163.1610180.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Tang, Yun, Juan Pino, Changhan Wang, Xutai Ma, and Dmitriy Genzel. "A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks." In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9415058.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Jiang, Eric P. "Learning to integrate unlabeled data in text classification." In 2010 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT 2010). IEEE, 2010. http://dx.doi.org/10.1109/iccsit.2010.5564473.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Wang, Hechong, Wei Zhang, Yuesheng Zhu, and Zhiqiang Bai. "Data-to-Text Generation with Attention Recurrent Unit." In 2019 International Joint Conference on Neural Networks (IJCNN). IEEE, 2019. http://dx.doi.org/10.1109/ijcnn.2019.8852343.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Liu, Mengzhu, Zhaonan Mu, Jieping Sun, and Cheng Wang. "Data-to-text Generation with Pointer-Generator Networks." In 2020 IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA). IEEE, 2020. http://dx.doi.org/10.1109/aeeca49918.2020.9213600.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "Data-to-text"

1

Currie, Janet, Henrik Kleven, and Esmée Zwiers. Technology and Big Data Are Changing Economics: Mining Text to Track Methods. Cambridge, MA: National Bureau of Economic Research, January 2020. http://dx.doi.org/10.3386/w26715.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Fadaie, K. Final text of ISO TR 19121, Geographic information - Imagery and gridded data, as sent to ISO for publication. Natural Resources Canada/ESS/Scientific and Technical Publishing Services, 2000. http://dx.doi.org/10.4095/219711.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Gates, Allison, Michelle Gates, Shannon Sim, Sarah A. Elliott, Jennifer Pillay, and Lisa Hartling. Creating Efficiencies in the Extraction of Data From Randomized Trials: A Prospective Evaluation of a Machine Learning and Text Mining Tool. Agency for Healthcare Research and Quality (AHRQ), August 2021. http://dx.doi.org/10.23970/ahrqepcmethodscreatingefficiencies.

Full text
Abstract:
Background. Machine learning tools that semi-automate data extraction may create efficiencies in systematic review production. We prospectively evaluated an online machine learning and text mining tool’s ability to (a) automatically extract data elements from randomized trials, and (b) save time compared with manual extraction and verification. Methods. For 75 randomized trials published in 2017, we manually extracted and verified data for 21 unique data elements. We uploaded the randomized trials to ExaCT, an online machine learning and text mining tool, and quantified performance by evaluating the tool’s ability to identify the reporting of data elements (reported or not reported), and the relevance of the extracted sentences, fragments, and overall solutions. For each randomized trial, we measured the time to complete manual extraction and verification, and to review and amend the data extracted by ExaCT (simulating semi-automated data extraction). We summarized the relevance of the extractions for each data element using counts and proportions, and calculated the median and interquartile range (IQR) across data elements. We calculated the median (IQR) time for manual and semi-automated data extraction, and overall time savings. Results. The tool identified the reporting (reported or not reported) of data elements with median (IQR) 91 percent (75% to 99%) accuracy. Performance was perfect for four data elements: eligibility criteria, enrolment end date, control arm, and primary outcome(s). Among the top five sentences for each data element at least one sentence was relevant in a median (IQR) 88 percent (83% to 99%) of cases. Performance was perfect for four data elements: funding number, registration number, enrolment start date, and route of administration. Among a median (IQR) 90 percent (86% to 96%) of relevant sentences, pertinent fragments had been highlighted by the system; exact matches were unreliable (median (IQR) 52 percent [32% to 73%]). A median 48 percent of solutions were fully correct, but performance varied greatly across data elements (IQR 21% to 71%). Using ExaCT to assist the first reviewer resulted in a modest time savings compared with manual extraction by a single reviewer (17.9 vs. 21.6 hours total extraction time across 75 randomized trials). Conclusions. Using ExaCT to assist with data extraction resulted in modest gains in efficiency compared with manual extraction. The tool was reliable for identifying the reporting of most data elements. The tool’s ability to identify at least one relevant sentence and highlight pertinent fragments was generally good, but changes to sentence selection and/or highlighting were often required.
APA, Harvard, Vancouver, ISO, and other styles
4

Ripey, Mariya. NUMBERS IN THE NEWS TEXT (BASED ON MATERIAL OF ONE ISSUE OF NATIONWIDE NEWSPAPER “DAY”). Ivan Franko National University of Lviv, March 2021. http://dx.doi.org/10.30970/vjo.2021.50.11106.

Full text
Abstract:
The article analyzes the numerical content of publications in one issue of the daily all-Ukrainian newspaper “Den” (March 13–14, 2020). The author aims to identify the main thematic groups of numerical designations and to consider cases of justified and unsuccessful use of numbers. Applying the content analysis method, the author identifies publications that contain numerical notations, determines the number of such notations, and assigns them to the main thematic groups. The thematic group “time” clearly dominates (58.6% of all numerical designations), indicating that timing is the most important task of a newspaper text. The second largest group is “measure” (15.8%), covering dimensions and proportions and measurements of distance, weight, volume, and more. The third largest group is “money” (8.2%), the fourth is “numbering” (5.2%), and the fifth is “people” (4.4%). The author stresses that numbers in a journalistic text are both a source of information and a hook for the reader, and that vivid figures give the text a sense of accuracy. When incorporating numerical data into a text, journalists must adhere to certain rules: writing ordinal numbers with incremental graduation; presenting dates; indicating whether whole numbers are combined (or not combined) with units of physical quantities, monetary units, etc.; writing a numeral at the beginning of a sentence; and presenting data in a unified way. This greatly facilitates the reader’s perception of the information.
APA, Harvard, Vancouver, ISO, and other styles
5

DiGrande, Laura, Christine Bevc, Jessica Williams, Lisa Carley-Baxter, Craig Lewis-Owen, and Suzanne Triplett. Pilot Study on the Experiences of Hurricane Shelter Evacuees. RTI Press, September 2019. http://dx.doi.org/10.3768/rtipress.2019.rr.0035.1909.

Full text
Abstract:
Community members who evacuate to shelters may represent the most socially and economically vulnerable group within a hurricane’s affected geographic area. Disaster research has established associations between socioeconomic conditions and adverse effects, but data are overwhelmingly collected retrospectively on large populations and lack further explication. As Hurricane Florence approached North Carolina in September 2018, RTI International developed a pilot survey for American Red Cross evacuation shelter clients. Two instruments, an interviewer-led paper questionnaire and a short message service (SMS text) questionnaire, were tested. A total of 200 evacuees completed the paper survey, but only 34 participated in the SMS text portion of the study. Data confirmed that the sample represented very marginalized coastline residents: 60 percent were unemployed, 70 percent had no family or friends to stay with during evacuation, 65 percent could not afford to evacuate to another location, 36 percent needed medicine/medical care, and 11 percent were homeless. Although 19 percent of participants had a history of evacuating for prior hurricanes/disasters and 14 percent had previously utilized shelters, we observed few associations between previous experiences and current evacuation resources, behaviors, or opinions about safety. This study demonstrates that, for vulnerable populations exposed to storms of increasing intensity and frequency, traditional survey research methods are best employed to learn about their experiences and needs.
APA, Harvard, Vancouver, ISO, and other styles
6

Acred, Aleksander, Milena Devineni, and Lindsey Blake. Opioid Free Anesthesia to Prevent Post Operative Nausea/Vomiting. University of Tennessee Health Science Center, July 2021. http://dx.doi.org/10.21007/con.dnp.2021.0006.

Full text
Abstract:
Purpose The purpose of this study is to compare the incidence of post-operative nausea and vomiting (PONV) in opioid-utilizing and opioid-free general anesthesia. Background PONV is an extremely common, potentially dangerous side effect of general anesthesia. PONV is caused by a collection of anesthetic and surgical interventions. Current practice to prevent PONV is to use 1-2 antiemetics during surgery, identify high risk patients and utilize tracheal intubation over laryngeal airways when indicated. Current research suggests minimizing the use of volatile anesthetics and opioids can reduce the incidence of PONV, but this does not reflect current practice. Methods In this scoping review, the MeSH search terms used to collect data were “anesthesia”, “postoperative nausea and vomiting”, “morbidity”, “retrospective studies”, “anesthesia, general”, “analgesics, opioid”, “pain postoperative”, “pain management” and “anesthesia, intravenous”. The Discovery Search engine, AccessMedicine and UpToDate were the search engines used to research this data. Filters were applied to these searches to ensure all the literature was peer-reviewed, full-text and preferably from academic journals. Results Opioid free anesthesia was found to decrease PONV by 69%. PONV incidence was overwhelming decreased with opioid free anesthesia in every study that was reviewed. Implications The future direction of opioid-free anesthesia and PONV prevention are broad topics to discuss, due to the nature of anesthesia. Administration of TIVA, esmolol and ketamine, as well as the decision to withhold opioids, are solely up to the anesthesia provider’s discretion. Increasing research and education in the importance of opioid-free anesthesia to decrease the incidence of PONV will be necessary to ensure anesthesia providers choose this protocol in their practice.
APA, Harvard, Vancouver, ISO, and other styles
7

Paynter, Robin A., Celia Fiordalisi, Elizabeth Stoeger, Eileen Erinoff, Robin Featherstone, Christiane Voisin, and Gaelen P. Adam. A Prospective Comparison of Evidence Synthesis Search Strategies Developed With and Without Text-Mining Tools. Agency for Healthcare Research and Quality (AHRQ), March 2021. http://dx.doi.org/10.23970/ahrqepcmethodsprospectivecomparison.

Full text
Abstract:
Background: In an era of explosive growth in biomedical evidence, improving systematic review (SR) search processes is increasingly critical. Text-mining tools (TMTs) are a potentially powerful resource to improve and streamline search strategy development. Two types of TMTs are especially of interest to searchers: word frequency (useful for identifying most used keyword terms, e.g., PubReminer) and clustering (visualizing common themes, e.g., Carrot2). Objectives: The objectives of this study were to compare the benefits and trade-offs of searches with and without the use of TMTs for evidence synthesis products in real world settings. Specific questions included: (1) Do TMTs decrease the time spent developing search strategies? (2) How do TMTs affect the sensitivity and yield of searches? (3) Do TMTs identify groups of records that can be safely excluded in the search evaluation step? (4) Does the complexity of a systematic review topic affect TMT performance? In addition to quantitative data, we collected librarians' comments on their experiences using TMTs to explore when and how these new tools may be useful in systematic review search creation. Methods: In this prospective comparative study, we included seven SR projects, and classified them into simple or complex topics. The project librarian used conventional “usual practice” (UP) methods to create the MEDLINE search strategy, while a paired TMT librarian simultaneously and independently created a search strategy using a variety of TMTs. TMT librarians could choose one or more freely available TMTs per category from a pre-selected list in each of three categories: (1) keyword/phrase tools: AntConc, PubReMiner; (2) subject term tools: MeSH on Demand, PubReMiner, Yale MeSH Analyzer; and (3) strategy evaluation tools: Carrot2, VOSviewer. We collected results from both MEDLINE searches (with and without TMTs), coded every citation’s origin (UP or TMT respectively), deduplicated them, and then sent the citation library to the review team for screening. When the draft report was submitted, we used the final list of included citations to calculate the sensitivity, precision, and number-needed-to-read for each search (with and without TMTs). Separately, we tracked the time spent on various aspects of search creation by each librarian. Simple and complex topics were analyzed separately to provide insight into whether TMTs could be more useful for one type of topic or another. Results: Across all reviews, UP searches seemed to perform better than TMT, but because of the small sample size, none of these differences was statistically significant. UP searches were slightly more sensitive (92% [95% confidence intervals (CI) 85–99%]) than TMT searches (84.9% [95% CI 74.4–95.4%]). The mean number-needed-to-read was 83 (SD 34) for UP and 90 (SD 68) for TMT. Keyword and subject term development using TMTs generally took less time than those developed using UP alone. The average total time was 12 hours (SD 8) to create a complete search strategy by UP librarians, and 5 hours (SD 2) for the TMT librarians. TMTs neither affected search evaluation time nor improved identification of exclusion concepts (irrelevant records) that can be safely removed from the search set. Conclusion: Across all reviews but one, TMT searches were less sensitive than UP searches. For simple SR topics (i.e., single indication–single drug), TMT searches were slightly less sensitive, but reduced time spent in search design. For complex SR topics (e.g., multicomponent interventions), TMT searches were less sensitive than UP searches; nevertheless, in complex reviews, they identified unique eligible citations not found by the UP searches. TMT searches also reduced time spent in search strategy development. For all evidence synthesis types, TMT searches may be more efficient in reviews where comprehensiveness is not paramount, or as an adjunct to UP for evidence syntheses, because they can identify unique includable citations. If TMTs were easier to learn and use, their utility would be increased.
APA, Harvard, Vancouver, ISO, and other styles
