Journal articles on the topic 'Semantics - Data processing. eng'

Consult the top 50 journal articles for your research on the topic 'Semantics - Data processing. eng.'

1

Tutzauer, P., and N. Haala. "PROCESSING OF CRAWLED URBAN IMAGERY FOR BUILDING USE CLASSIFICATION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-1/W1 (May 31, 2017): 143–49. http://dx.doi.org/10.5194/isprs-archives-xlii-1-w1-143-2017.

Abstract:
Recent years have shown a shift from purely geometric 3D city models to data with semantics. This is induced by new applications (e.g. Virtual/Augmented Reality) and also a requirement for concepts like Smart Cities. However, essential urban semantic data like building use categories is often not available. We present a first step in bridging this gap by proposing a pipeline that uses crawled urban imagery and links it with ground-truth cadastral data as input for automatic building use classification. We aim to extract this city-relevant semantic information automatically from Street View (SV) imagery. Convolutional Neural Networks (CNNs) have proved to be extremely successful for image interpretation; however, they require a huge amount of training data. The main contribution of the paper is the automatic provision of such training datasets by linking semantic information, as already available from databases provided by national mapping agencies or city administrations, to the corresponding façade images extracted from SV. Finally, we present first investigations with a CNN and an alternative classifier as a proof of concept.
2

Terziyan, Vagan, and Anton Nikulin. "Semantics of Voids within Data: Ignorance-Aware Machine Learning." ISPRS International Journal of Geo-Information 10, no. 4 (2021): 246. http://dx.doi.org/10.3390/ijgi10040246.

Abstract:
Operating with ignorance is an important concern of geographical information science when the objective is to discover knowledge from the imperfect spatial data. Data mining (driven by knowledge discovery tools) is about processing available (observed, known, and understood) samples of data aiming to build a model (e.g., a classifier) to handle data samples that are not yet observed, known, or understood. These tools traditionally take semantically labeled samples of the available data (known facts) as an input for learning. We want to challenge the indispensability of this approach, and we suggest considering the things the other way around. What if the task would be as follows: how to build a model based on the semantics of our ignorance, i.e., by processing the shape of “voids” within the available data space? Can we improve traditional classification by also modeling the ignorance? In this paper, we provide some algorithms for the discovery and visualization of the ignorance zones in two-dimensional data spaces and design two ignorance-aware smart prototype selection techniques (incremental and adversarial) to improve the performance of the nearest neighbor classifiers. We present experiments with artificial and real datasets to test the concept of the usefulness of ignorance semantics discovery.
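As an illustration of the 'voids' idea above: one simple reading is to rank points of a 2-D data space by their distance to the nearest observed sample and treat the deepest points as centres of ignorance zones, where a classifier has no evidential support. The sketch below follows that reading only; the grid search, the toy two-cluster data, and all names are illustrative assumptions, not the authors' algorithms.

```python
import numpy as np

def ignorance_centres(X, grid_steps=50, top_k=3):
    """Grid points of a 2-D data space ranked by distance to the nearest
    observed sample: the deepest points of the 'voids' in the data.
    (A toy reading of ignorance-zone discovery, not the paper's method.)"""
    lo, hi = X.min(axis=0), X.max(axis=0)
    xs = np.linspace(lo[0], hi[0], grid_steps)
    ys = np.linspace(lo[1], hi[1], grid_steps)
    grid = np.array([(x, y) for x in xs for y in ys])
    # Distance from every grid point to its nearest observed sample.
    d = np.linalg.norm(grid[:, None, :] - X[None, :, :], axis=2).min(axis=1)
    return grid[np.argsort(d)[-top_k:]]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal((0, 0), 0.3, (30, 2)),
               rng.normal((3, 3), 0.3, (30, 2))])
print(ignorance_centres(X))   # points far from both observed clusters
```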
3

Lin, Yi, Hongwei Ding, and Yang Zhang. "Prosody Dominates Over Semantics in Emotion Word Processing: Evidence From Cross-Channel and Cross-Modal Stroop Effects." Journal of Speech, Language, and Hearing Research 63, no. 3 (2020): 896–912. http://dx.doi.org/10.1044/2020_jslhr-19-00258.

Abstract:
Purpose: Emotional speech communication involves multisensory integration of linguistic (e.g., semantic content) and paralinguistic (e.g., prosody and facial expressions) messages. Previous studies on linguistic versus paralinguistic salience effects in emotional speech processing have produced inconsistent findings. In this study, we investigated the relative perceptual saliency of emotion cues in a cross-channel auditory-alone task (i.e., semantics–prosody Stroop task) and a cross-modal audiovisual task (i.e., semantics–prosody–face Stroop task). Method: Thirty normal Chinese adults participated in two Stroop experiments with spoken emotion adjectives in Mandarin Chinese. Experiment 1 manipulated auditory pairing of emotional prosody (happy or sad) and lexical semantic content in congruent and incongruent conditions. Experiment 2 extended the protocol to cross-modal integration by introducing visual facial expression during auditory stimulus presentation. Participants were asked to judge emotional information for each test trial according to the instruction of selective attention. Results: Accuracy and reaction time data indicated that, despite an increase in cognitive demand and task complexity in Experiment 2, prosody was consistently more salient than semantic content for emotion word processing and did not take precedence over facial expression. While congruent stimuli enhanced performance in both experiments, the facilitatory effect was smaller in Experiment 2. Conclusion: Together, the results demonstrate the salient role of paralinguistic prosodic cues in emotion word processing and the congruence facilitation effect in multisensory integration. Our study contributes tonal language data on how linguistic and paralinguistic messages converge in multisensory speech processing and lays a foundation for further exploring the brain mechanisms of cross-channel/modal emotion integration with potential clinical applications.
4

Bhatt, Nirav, and Amit Thakkar. "An efficient approach for low latency processing in stream data." PeerJ Computer Science 7 (March 10, 2021): e426. http://dx.doi.org/10.7717/peerj-cs.426.

Abstract:
Stream data is data that is generated continuously from different data sources and is ideally defined as data that has no discrete beginning or end. Processing stream data is a part of big data analytics that aims at querying the continuously arriving data and extracting meaningful information from the stream. Although earlier processing of such streams used batch analytics, nowadays there are applications such as the stock market, patient monitoring, and traffic analysis where it makes a drastic difference whether the output is generated within hours or within minutes. The primary goal of any real-time stream processing system is to process the stream data as soon as it arrives. Correspondingly, analytics of the stream data also needs to consider surrounding dependent data. For example, stock market analytics results are often useless if we do not consider the associated or dependent parameters which affect the result. In a real-world application, these dependent stream data usually arrive from a distributed environment. Hence, a stream processing system has to be designed which can deal with delays in the arrival of such data from distributed sources. We have designed a stream processing model which can deal with all the possible latency and provide an end-to-end low-latency system. We have performed stock market prediction by considering affecting parameters, such as the USD, oil price, and gold price, with an equal arrival rate. We have calculated the Normalized Root Mean Square Error (NRMSE), which simplifies comparison among models with different scales. A comparative analysis of the experiment presented in the report shows a significant improvement in the result when considering the affecting parameters. In this work, we have used a statistical approach to forecast the probability of possible data latency arising from distributed sources. Moreover, we have performed preprocessing of stream data to ensure at-least-once delivery semantics. In the direction of providing low latency in processing, we have also implemented exactly-once processing semantics. Extensive experiments have been performed with varying sizes of the window and data arrival rate. We have concluded that system latency can be reduced when the window size is equal to the data arrival rate.
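The NRMSE figure mentioned in this abstract normalizes RMSE so that models over differently scaled series (e.g., USD, oil, and gold prices) become comparable. A minimal sketch, assuming normalization by the observed range (the abstract does not say which normalizer the authors use):

```python
import numpy as np

def nrmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Normalized RMSE; dividing by the observed range is one common
    convention (an assumption here, not confirmed by the paper)."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (y_true.max() - y_true.min())

# Example: two differently scaled series become comparable on one footing.
gold = np.array([1800.0, 1825.0, 1810.0, 1840.0])
gold_hat = np.array([1795.0, 1830.0, 1805.0, 1835.0])
print(nrmse(gold, gold_hat))
```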
5

Huynh, Dat, and Sy Nguyen-Ky. "Engaging Building Automation Data Visualisation Using Building Information Modelling and Progressive Web Application." Open Engineering 10, no. 1 (2020): 434–42. http://dx.doi.org/10.1515/eng-2020-0054.

Abstract:
The integration of the building information modelling (BIM) and assets database has enabled a potential pathway for building stakeholders to add value to the building lifecycle management (BLM) processes. However, the obtaining, storing, processing, and distributing of data have not been ratified in a detailed and semantic manner in any official guideline. This paper suggests a framework for the efficient development of an interoperable visualisation of a building "digital twin" through an intuitive interface, to contribute to the idea above. The framework was applied in two case studies as examples, where rich visualisations of two buildings in the Kanta-Häme region of Finland were constructed using their architectural models and building indoor climate metrics. By constructing an engaging interface, relevant metrics and constant feedback from buildings' occupants are gathered, meaningful data is selected, processed, and displayed to improve the facility management (FM) process. The yielded result is a progressive web application (PWA), where valuable sets of building performance data are visualised and a promptly communicable channel between owners/occupants and building system is delivered. Further development of this application in practice and research work is also proposed to harness the data-driven monitoring and automation in buildings to the greatest extent.
6

Schuetz, C. G., B. Neumayr, M. Schrefl, E. Gringinger, and S. Wilson. "Semantics-based summarisation of ATM information." Aeronautical Journal 123, no. 1268 (2019): 1639–65. http://dx.doi.org/10.1017/aer.2019.74.

Abstract:
Pilot briefings, in their traditional form, drown pilots in a sea of information. Rather than unfocused swathes of air traffic management (ATM) information, pilots require only the information for their specific flight, preferably with an emphasis on the most important information. In this paper, we introduce the notion of ATM information cubes, in analogy to the well-established concept of online analytical processing (OLAP) cubes in data warehousing. We propose a framework with merge and abstraction operations for the combination and summarisation of the information in ATM information cubes to obtain management summaries of relevant information. To this end, we adopt the concept of a semantic data container: a package of data items with a semantic description of the contents. The semantic descriptions then serve to hierarchically organise semantic containers along the dimensions of an ATM information cube. Leveraging this hierarchical organisation, a merge operation combines ATM information from individual semantic containers and collects the data items into composite containers. An abstraction operation summarises the data items within a semantic container, replacing individual data items with more abstract data items carrying summary information.
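To picture the merge and abstraction operations, the following toy sketch treats a semantic container as data items plus a dictionary-valued semantic description; merge pools containers that agree on a dimension value, and abstraction replaces a container's items with a summary item. All names and structures here are hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class SemanticContainer:
    description: dict            # e.g. {"region": "EDDF"}
    items: list = field(default_factory=list)

def merge(containers, on):
    """Combine containers that agree on dimension `on` into composites."""
    groups = {}
    for c in containers:
        groups.setdefault(c.description.get(on), []).append(c)
    return [SemanticContainer({on: key},
                              [i for c in grp for i in c.items])
            for key, grp in groups.items()]

def abstraction(container, summarise):
    """Replace the items with a single, more abstract summary item."""
    return SemanticContainer(container.description,
                             [summarise(container.items)])

# Usage: merge NOTAM-like items per region, then keep only a count summary.
a = SemanticContainer({"region": "EDDF"}, ["notam-1", "notam-2"])
b = SemanticContainer({"region": "EDDF"}, ["notam-3"])
merged = merge([a, b], on="region")
summary = abstraction(merged[0], lambda items: f"{len(items)} NOTAMs")
print(summary.items)   # ['3 NOTAMs']
```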
7

Lutfallah, Susan, Candice Fast, Chitra Rangan, and Lori Buchanan. "Semantic neighbourhoods." Mental Lexicon 13, no. 3 (2018): 388–93. http://dx.doi.org/10.1075/ml.18015.lut.

Abstract:
The contributions of semantic processing have come under increasing attention in recent years (Yap, Pexman, Wellsby, Hargreaves, & Huff, 2012), and variables that measure the semantic content of words are a requirement of this increased experimental attention. The density and size of semantic neighborhoods derived from computational models have been shown to predict reaction times across a range of psycholinguistic tasks (e.g., Danguecan & Buchanan, 2016), and the distance between two words in semantic space has been shown to predict priming (Kenett, Levi, Anaki & Faust, 2017). The data to support the construction of stimulus sets that use these variables are complicated to obtain. The app that we describe here makes these measures of semantics available for 100,000 English words.
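As a rough illustration of the kind of measure involved, a word's semantic neighbourhood density can be computed from vector-space models as the mean similarity of its nearest neighbours. The sketch below uses toy random vectors as stand-ins for a real co-occurrence model and is only an assumption about the general approach, not the authors' exact computation.

```python
import numpy as np

def neighbourhood_density(word, vectors, k=5):
    """Mean cosine similarity of the k nearest neighbours of `word`."""
    v = vectors[word]
    sims = []
    for other, u in vectors.items():
        if other == word:
            continue
        sims.append(np.dot(v, u) / (np.linalg.norm(v) * np.linalg.norm(u)))
    return float(np.mean(sorted(sims, reverse=True)[:k]))

# Toy example with random vectors standing in for a co-occurrence model.
rng = np.random.default_rng(0)
vocab = {w: rng.normal(size=50) for w in ["cat", "dog", "pet", "car", "road"]}
print(neighbourhood_density("cat", vocab, k=2))
```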
8

Dezfuli, Mohammad G., and Mostafa S. Haghjoo. "WD-PWS: The First Semantics for Querying over Probabilistic Data Streams with Continuous Distributions." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 21, no. 05 (2013): 695–722. http://dx.doi.org/10.1142/s0218488513500335.

Abstract:
Many emerging applications need continuous querying over uncertain event streams, mostly for online monitoring. These streaming uncertain events may come from radars, sensors, or even software hooks. The uncertainty is usually due to measurement errors, inherent ambiguities, and privacy-preserving reasons. To cover new requirements, we have designed and implemented a new system called Probabilistic Data Stream Management System (PDSMS) in Ref. 1. PDSMS is a data processing engine which runs continuous queries over probabilistic streams. However, the lack of a semantics for probabilistic databases which supports continuous distributions prevented us from having a strong foundation for our query operators. It also precluded us from proving consistency and correctness of query operations, especially after optimization and adaptation. In fact, in the probabilistic database literature, there is no semantics available which covers continuous distributions. This limitation is very restrictive as, in the real world, uncertainty is usually modeled by continuous distributions. In this paper, after presenting a basic probabilistic data model for PDSMS, we focus on querying and formally present the first semantics for probabilistic query operations which supports continuous distributions as well as discrete ones. Using this new semantics, we define our query operators (e.g. select, project, and join) formally, without ambiguity, and compatibly with operators in relational algebra. Thus, we can leverage many transformation rules in relational algebra as well. This new semantics allows us to have different strictness levels and consistency between operators. We also prove many strictness theorems about different alternatives for query operators.
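A sampling-based picture of what a select operator over continuous distributions must decide: for each tuple whose attribute is a distribution rather than a value, the predicate holds only with some probability. This Monte Carlo sketch is purely illustrative; the paper defines the semantics analytically, not by sampling, and all names here are invented.

```python
import random

def prob_select(tuples, pred, n_samples=1000):
    """Selection over tuples whose attribute is a continuous distribution:
    estimate, per tuple, the probability that the predicate holds.
    (Monte Carlo illustration only, not the paper's formal semantics.)"""
    out = []
    for t in tuples:
        mu, sigma = t["speed"]          # attribute as a Gaussian (mu, sigma)
        hits = sum(pred(random.gauss(mu, sigma)) for _ in range(n_samples))
        p = hits / n_samples
        if p > 0:
            out.append({**t, "prob": p})
    return out

radar = [{"id": 1, "speed": (62.0, 3.0)}, {"id": 2, "speed": (45.0, 2.0)}]
print(prob_select(radar, lambda s: s > 60))   # id 1, with high probability
```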
9

Rubí, Jesús Noel Sárez, and Paulo Roberto de Lira Gondim. "Interoperable Internet of Medical Things platform for e-Health applications." International Journal of Distributed Sensor Networks 16, no. 1 (2020): 155014771988959. http://dx.doi.org/10.1177/1550147719889591.

Abstract:
The development of information and telecommunication technologies has given rise to new platforms for e-Health. However, some difficulties have been detected since each manufacturer implements its own communication protocols and defines its own data formats. A semantic incongruence is observed between platforms since no common healthcare domain vocabulary is shared between manufacturers and stakeholders. Despite the existence of standards for semantic and platform interoperability (e.g. openEHR for healthcare, Semantic Sensor Network for Internet of Medical Things platforms, and machine-to-machine standards), no approach has combined them for granting interoperability or considered the whole integration of legacy Electronic Health Record Systems currently used worldwide. Moreover, the heterogeneity in the large volume of health data generated by Internet of Medical Things platforms must be attenuated for the proper application of big data processing techniques. This article proposes the joint use of openEHR and Semantic Sensor Network semantics for the achievement of interoperability at the semantic level and the use of a machine-to-machine architecture for the definition of an interoperable Internet of Medical Things platform.
10

Ponciano, Jean-Jacques, Claire Prudhomme, and Frank Boochs. "From Acquisition to Presentation—The Potential of Semantics to Support the Safeguard of Cultural Heritage." Remote Sensing 13, no. 11 (2021): 2226. http://dx.doi.org/10.3390/rs13112226.

Abstract:
The signing of the 2019 Declaration of Cooperation on advancing the digitization of cultural heritage in Europe shows the important role that the 3D digitization process plays in the safeguarding and sustainability of cultural heritage. The digitization also aims at sharing and presenting cultural heritage. However, the processing steps from data acquisition to its presentation require an interdisciplinary collaboration, where understanding and collaborative work are difficult due to the different fields of expert knowledge involved. This study proposes an end-to-end method from cultural data acquisition to its presentation thanks to explicit semantics representing the different fields of expert knowledge intervening in this process. This method is composed of three knowledge-based processing steps: (i) a recommendation process of acquisition technology to support cultural data acquisition; (ii) an object recognition process to structure the unstructured acquired data; and (iii) an enrichment process based on Linked Open Data to document cultural objects with further information, such as geospatial, cultural, and historical information. The proposed method was applied in two case studies concerning the watermills of Ephesos terrace house 2 and the first Sacro Monte chapel in Varallo. These application cases show the proposed method's ability to recognize and document digitized cultural objects in different contexts thanks to the semantics.
11

Feng, Chen, Markus F. Damian, and Qingqing Qu. "Parallel Processing of Semantics and Phonology in Spoken Production: Evidence from Blocked Cyclic Picture Naming and EEG." Journal of Cognitive Neuroscience 33, no. 4 (2021): 725–38. http://dx.doi.org/10.1162/jocn_a_01675.

Abstract:
Spoken language production involves lexical-semantic access and phonological encoding. A theoretically important question concerns the relative time course of these two cognitive processes. The predominant view has been that semantic and phonological codes are accessed in successive stages. However, recent evidence seems difficult to reconcile with a sequential view but rather suggests that both types of codes are accessed in parallel. Here, we used ERPs combined with the “blocked cyclic naming paradigm” in which items overlapped either semantically or phonologically. Behaviorally, both semantic and phonological overlap caused interference relative to unrelated baseline conditions. Crucially, ERP data demonstrated that the semantic and phonological effects emerged at a similar latency (∼180 msec after picture onset) and within a similar time window (180–380 msec). These findings suggest that access to phonological information takes place at a relatively early stage during spoken planning, largely in parallel with semantic processing.
12

Bashar, Md Abul, and Richi Nayak. "Active Learning for Effectively Fine-Tuning Transfer Learning to Downstream Task." ACM Transactions on Intelligent Systems and Technology 12, no. 2 (2021): 1–24. http://dx.doi.org/10.1145/3446343.

Abstract:
The language model (LM) has become a common method of transfer learning in Natural Language Processing (NLP) tasks when working with small labeled datasets. An LM is pretrained using an easily available large unlabelled text corpus and is fine-tuned with the labelled data to apply to the target (i.e., downstream) task. As an LM is designed to capture the linguistic aspects of semantics, it can be biased to linguistic features. We argue that exposing an LM during fine-tuning to instances that capture diverse semantic aspects (e.g., topical, linguistic, semantic relations) present in the dataset will improve its performance on the underlying task. We propose a Mixed Aspect Sampling (MAS) framework to sample instances that capture different semantic aspects of the dataset and use an ensemble classifier to improve the classification performance. Experimental results show that MAS performs better than random sampling as well as state-of-the-art active learning models on abuse detection tasks, where it is hard to collect labelled data for building an accurate classifier.
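One illustrative reading of aspect-diverse sampling: cluster the pool separately in each aspect's feature space and label the instances nearest the cluster centres, so every aspect contributes coverage. This sketch is an assumption about the general idea, not the authors' MAS code.

```python
import numpy as np
from sklearn.cluster import KMeans

def mixed_aspect_sample(aspect_features, n_per_aspect=10, seed=0):
    """Pick instances covering diverse semantic aspects: cluster each
    aspect's feature space and take the points nearest the centroids.
    (Illustrative reading of the MAS idea, not the authors' exact code.)"""
    chosen = set()
    for X in aspect_features:               # one feature matrix per aspect
        km = KMeans(n_clusters=n_per_aspect, n_init=10,
                    random_state=seed).fit(X)
        for c in km.cluster_centers_:
            chosen.add(int(np.argmin(np.linalg.norm(X - c, axis=1))))
    return sorted(chosen)

# Usage: topical and linguistic aspects as two feature spaces over 200 docs.
rng = np.random.default_rng(0)
topical, linguistic = rng.normal(size=(200, 20)), rng.normal(size=(200, 30))
idx = mixed_aspect_sample([topical, linguistic], n_per_aspect=5)
print(len(idx), "instances selected for labelling")
```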
13

Royle, Phaedra, John E. Drury, and Karsten Steinhauer. "ERPs and task effects in the auditory processing of gender agreement and semantics in French." Neural Correlates of Lexical Processing 8, no. 2 (2013): 216–44. http://dx.doi.org/10.1075/ml.8.2.05roy.

Abstract:
We investigated task effects on violation ERP responses to Noun-Adjective gender mismatches and lexical/conceptual semantic mismatches in a combined auditory/visual paradigm in French. Participants listened to sentences while viewing pictures of objects. This paradigm was designed to investigate language processing in special populations (e.g., children) who may not be able to read or to provide stable behavioural judgment data. Our main goal was to determine how ERP responses to our target violations might differ depending on whether participants performed a judgment task (Task) versus listening for comprehension (No-Task). Characterizing the influence of the presence versus absence of judgment tasks on violation ERP responses allows us to meaningfully interpret data obtained using this paradigm without a behavioural task and relate them to judgment-based paradigms in the ERP literature. We replicated previously observed ERP patterns for semantic and gender mismatches, and found that the task especially affected the later P600 component.
14

Giuliani, Gregory, Gilberto Camara, Brian Killough, and Stuart Minchin. "Earth Observation Open Science: Enhancing Reproducible Science Using Data Cubes." Data 4, no. 4 (2019): 147. http://dx.doi.org/10.3390/data4040147.

Abstract:
Earth Observation Data Cubes (EODC) have emerged as a promising solution to efficiently and effectively handle Big Earth Observation (EO) Data generated by satellites and made freely and openly available from different data repositories. The aim of this Special Issue, “Earth Observation Data Cube”, in Data, is to present the latest advances in EODC development and implementation, including innovative approaches for the exploitation of satellite EO data using multi-dimensional (e.g., spatial, temporal, spectral) approaches. This Special Issue contains 14 articles covering a wide range of topics such as Synthetic Aperture Radar (SAR), Analysis Ready Data (ARD), interoperability, thematic applications (e.g., land cover, snow cover mapping), capacity development, semantics, processing techniques, as well as national implementations and best practices. These papers made significant contributions to the advancement of a more Open and Reproducible Earth Observation Science, reducing the gap between users’ expectations for decision-ready products and current Big Data analytical capabilities, and ultimately unlocking the information power of EO data by transforming them into actionable knowledge.
15

Venu Gopalachari, M., and Porika Sammulal. "An Effective Hybrid Recommender Using Metadata-based Conceptualization and Temporal Semantics." International Journal of Recent Contributions from Engineering, Science & IT (iJES) 4, no. 3 (2016): 4. http://dx.doi.org/10.3991/ijes.v4i3.5943.

Abstract:
Modern recommender systems target the satisfaction of the end user through personalization techniques that collect the history of the user's navigation. But sole dependency on a user profile based on navigation alone cannot guarantee the quality of recommendations, because the processing lacks the semantics of various aspects such as the demographics of the user, the time of usage, and the concept of need. Although the literature provides many techniques to conceptualize the process, they incur high computational complexity because content data is considered as input information. In this paper a hybrid recommender framework is developed that considers metadata-based conceptual semantics and temporal patterns on top of the usage history. This framework also includes an online process that identifies the conceptual drift of the usage dynamically. The experimental results show the effectiveness of the proposed framework when compared to existing modern recommenders, and also indicate that the proposed model can resolve the cold-start problem with accurate suggestions while reducing computational complexity.
16

Fafalios, Pavlos, Manolis Baritakis, and Yannis Tzitzikas. "Exploiting Linked Data for Open and Configurable Named Entity Extraction." International Journal on Artificial Intelligence Tools 24, no. 02 (2015): 1540012. http://dx.doi.org/10.1142/s0218213015400126.

Abstract:
Named Entity Extraction (NEE) is the process of identifying entities in texts and, very commonly, linking them to related (Web) resources. This task is useful in several applications, e.g. for question answering, annotating documents, post-processing of search results, etc. However, existing NEE tools lack an open or easy configuration although this is very important for building domain-specific applications. For example, supporting a new category of entities, or specifying how to link the detected entities with online resources, is either impossible or very laborious. In this paper, we show how we can exploit semantic information (Linked Data) in real time for handily configuring a NEE system and we propose a generic model for configuring such services. To explicitly define the semantics of the proposed model, we introduce an RDF/S vocabulary, called "Open NEE Configuration Model", which allows a NEE service to describe (and publish as Linked Data) its entity mining capabilities, but also to be dynamically configured. To allow relating the output of a NEE process with an applied configuration, we propose an extension of the Open Annotation Data Model which also enables an application to run advanced queries over the annotated data. As a proof of concept, we present X-Link, a fully configurable NEE framework that realizes this approach. Contrary to existing tools, X-Link allows the user to easily define the categories of entities that are interesting for the application at hand by exploiting one or more semantic Knowledge Bases. The user is also able to update a category and specify how to semantically link and enrich the identified entities. This enhanced configurability allows X-Link to be easily configured for different contexts for building domain-specific applications. To test the approach, we conducted a task-based evaluation with users that demonstrates its usability, and a case study that demonstrates its feasibility.
17

Ramage, Amy E., Semra Aytur, and Kirrie J. Ballard. "Resting-State Functional Magnetic Resonance Imaging Connectivity Between Semantic and Phonological Regions of Interest May Inform Language Targets in Aphasia." Journal of Speech, Language, and Hearing Research 63, no. 9 (2020): 3051–67. http://dx.doi.org/10.1044/2020_jslhr-19-00117.

Abstract:
Purpose: Brain imaging has provided puzzle pieces in the understanding of language. In neurologically healthy populations, the structure of certain brain regions is associated with particular language functions (e.g., semantics, phonology). In studies on focal brain damage, certain brain regions or connections are considered sufficient or necessary for a given language function. However, few of these account for the effects of lesioned tissue on the "functional" dynamics of the brain for language processing. Here, functional connectivity (FC) among semantic–phonological regions of interest (ROIs) is assessed to fill a gap in our understanding about the neural substrates of impaired language and whether connectivity strength can predict language performance on a clinical tool in individuals with aphasia. Method: Clinical assessment of language, using the Western Aphasia Battery–Revised, and resting-state functional magnetic resonance imaging data were obtained for 30 individuals with chronic aphasia secondary to left-hemisphere stroke and 18 age-matched healthy controls. FC between bilateral ROIs was contrasted by group and used to predict Western Aphasia Battery–Revised scores. Results: Network coherence was observed in healthy controls and participants with stroke. The left–right premotor cortex connection was stronger in healthy controls, as reported by New et al. (2015) in the same data set. FC of (a) connections between temporal regions, in the left hemisphere and bilaterally, predicted lexical–semantic processing for auditory comprehension and (b) ipsilateral connections between temporal and frontal regions in both hemispheres predicted access to semantic–phonological representations and processing for verbal production. Conclusions: Network connectivity of brain regions associated with semantic–phonological processing is predictive of language performance in poststroke aphasia. The most predictive connections involved right-hemisphere ROIs, particularly those for which structural adaptations are known to associate with recovered word retrieval performance. Predictions may be made, based on these findings, about which connections have potential as targets for neuroplastic functional changes with intervention in aphasia. Supplemental Material: https://doi.org/10.23641/asha.12735785
18

Misale, Claudia, Maurizio Drocco, Marco Aldinucci, and Guy Tremblay. "A Comparison of Big Data Frameworks on a Layered Dataflow Model." Parallel Processing Letters 27, no. 01 (2017): 1740003. http://dx.doi.org/10.1142/s0129626417400035.

Abstract:
In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data, and execution models (for which only informal, and often confusing, semantics is generally provided), all share a common underlying model, namely, the Dataflow model. The model we propose shows how various tools share the same expressiveness at different levels of abstraction. The contribution of this work is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm), thus making it easier to understand high-level data-processing applications written in such frameworks. Second, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level.
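The shared Dataflow model can be pictured as operators wired into a graph through which items flow. A minimal, framework-agnostic sketch (the `Node` abstraction and the pipeline are hypothetical illustrations, not the paper's formalism):

```python
class Node:
    """A dataflow operator: applies `fn` and forwards to its successors."""
    def __init__(self, fn):
        self.fn, self.next = fn, []

    def to(self, node):
        self.next.append(node)
        return node

    def push(self, item):
        out = self.fn(item)
        if out is not None:          # None means "drop this item"
            for n in self.next:
                n.push(out)

# A tiny word-filtering pipeline: tokenize -> filter -> sink.
results = []
source = Node(lambda line: line.lower().split())
keep = Node(lambda words: [w for w in words if len(w) > 3] or None)
sink = Node(lambda words: results.append(words))
source.to(keep).to(sink)

for line in ["Big Data tools share one model", "the Dataflow model"]:
    source.push(line)
print(results)
```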
19

Legalov, Alexander I., Ivan V. Matkovskii, Mariya S. Ushakova, and Darya S. Romanova. "Dynamically Changing Parallelism with the Asynchronous Sequential Data Flows." Modeling and Analysis of Information Systems 27, no. 2 (2020): 164–79. http://dx.doi.org/10.18255/1818-1015-2020-2-164-179.

Abstract:
A statically typed version of the data-driven functional parallel computing model is proposed. It enables a representation of dynamically changing parallelism by means of asynchronous serial data flows. We consider the features of the syntax and semantics of the statically typed data-driven functional parallel programming language Smile, which supports asynchronous sequential flows. Our main idea is to apply Hoare's concept of communicating sequential processes to computation control on data readiness. It is assumed that on data readiness a control signal is emitted to inform the processes about the occurrence of certain events. The special feature of our approach is that the model is extended with special asynchronous containers that can generate events on their partial filling. These containers are a stream and a swarm, each of which has its own specifics. A stream is used to process data of identical type. The data come sequentially and asynchronously at arbitrary time moments. The number of incoming data elements is initially unknown, so the processing completes on the signal of the end of the stream. A swarm is used to contain independent data of the same type and may be used for performing massively parallel operations. Unlike a stream, the swarm's size is fixed and known in advance. General principles of operations on asynchronous sequential flows with an arbitrary order of data arrival are described. The use of streams and swarms in various situations is considered. We propose the language constructions which allow us to operate on swarms and streams and describe the specifics of their application. We provide sample functions to illustrate the use of different approaches to the description of parallelism: recursive processing of asynchronous flows, processing of flows in an arbitrary or predefined order of operations, direct access and access by reference to the elements of streams and swarms, and pipelining of calculations. We give a preliminary parallelism assessment which depends on the ratio of the rates of data arrival and their processing. The proposed methods can be used in the development of future languages and toolkits for architecture-independent parallel programming.
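The stream behaviour described here (items of one type arriving asynchronously, with completion signalled by an end-of-stream event) maps naturally onto an asynchronous queue with a sentinel. The following Python analogy is only a sketch of that behaviour, not the Smile language itself; a swarm would differ in having a size fixed in advance.

```python
import asyncio

END = object()   # end-of-stream control signal

async def producer(stream: asyncio.Queue):
    for x in [3, 1, 4, 1, 5]:          # arrival times are arbitrary
        await asyncio.sleep(0.01)
        await stream.put(x)
    await stream.put(END)              # signal that the stream is complete

async def consumer(stream: asyncio.Queue):
    total = 0
    while (item := await stream.get()) is not END:
        total += item                  # process each datum on readiness
    return total

async def main():
    stream = asyncio.Queue()
    _, total = await asyncio.gather(producer(stream), consumer(stream))
    print(total)                       # 14

asyncio.run(main())
```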
20

Parmar, Monika, et al. "Hashing based Data Transaction and Optimized Storage for IoT Applications." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 5 (2021): 1206–15. http://dx.doi.org/10.17762/turcomat.v12i5.1786.

Abstract:
Blockchain technology, the underlying technology of cryptocurrencies, has recently become very popular and is being used in IoT and other fields. There have been shortfalls, however, which impede its implementation, including the volume of storage space required. Transactions will be produced at a significant rate due to the huge number of connected systems that often act as data processors across many networks. In IoT, the storage issue will become more intense. Current data storage platforms have a wide range of features to respond to a wide spectrum of uses. Nevertheless, new classes of systems have arisen, e.g., blockchain with data version control, fork semantics, tamper evidence, or some variation thereof, and distributed analysis. They present new challenges for storage solutions to serve such systems effectively by integrating the criteria mentioned in the processing. This paper discusses the potential security and privacy concerns of IoT applications; it is shown that in the first step storage is enhanced by 50%, and in the next step it is improved further and takes only 256 bytes irrespective of the input data size.
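The fixed-size behaviour in the last sentence is characteristic of cryptographic hashing: whatever the transaction payload, the stored digest has constant length. A minimal sketch using SHA-256, which yields 32-byte digests; how the paper arrives at its 256-byte record is not specified here, so the mapping is an assumption:

```python
import hashlib
import json

def transaction_record(payload: dict) -> bytes:
    """Store a fixed-size, tamper-evident digest instead of the raw payload."""
    raw = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(raw).digest()   # always 32 bytes, any input size

small = transaction_record({"sensor": 7, "t": 1})
large = transaction_record({"sensor": 7, "blob": "x" * 1_000_000})
print(len(small), len(large))   # 32 32
```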
21

Brier, Matthew R., Mandy J. Maguire, Gail D. Tillman, John Hart, and Michael A. Kraut. "Event-related potentials in semantic memory retrieval." Journal of the International Neuropsychological Society 14, no. 5 (2008): 815–22. http://dx.doi.org/10.1017/s135561770808096x.

Abstract:
The involvement of the left temporal lobe in semantics and object naming has been repeatedly demonstrated in the context of language comprehension; however, its role in the mechanisms and time course for the retrieval of an integrated object memory from its constituent features has not been well delineated. In this study, 19 young adults were presented with two features of an object (e.g., "desert" and "humps") and asked to determine whether these two features were congruent to form a retrieval of a specific object ("camel") or incongruent and formed no retrieval, while event-related potentials (ERP) were recorded. Beginning around 750 ms, the ERP retrieval and nonretrieval waveforms over the left anterior fronto-temporal region show significant differences, indicating distinct processes for retrievals and nonretrievals. In addition to providing further data implicating the left frontal-anterior temporal region in object memory/retrieval, the results provide insight into the time course of semantic processing related to object memory retrieval in this region. The likely semantic process at 750 ms in this task would be coactivation of feature representations common to the same object. The consistency of this finding suggests that the process is stable across individuals. The potential clinical applications are discussed. (JINS, 2008, 14, 815–822.)
22

Belinkov, Yonatan, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, and James Glass. "On the Linguistic Representational Power of Neural Machine Translation Models." Computational Linguistics 46, no. 1 (2020): 1–52. http://dx.doi.org/10.1162/coli_a_00367.

Abstract:
Despite the recent success of deep neural networks in natural language processing and other spheres of artificial intelligence, their interpretability remains a challenge. We analyze the representations learned by neural machine translation (NMT) models at various levels of granularity and evaluate their quality through relevant extrinsic properties. In particular, we seek answers to the following questions: (i) How accurately is word structure captured within the learned representations, which is an important aspect in translating morphologically rich languages? (ii) Do the representations capture long-range dependencies, and effectively handle syntactically divergent languages? (iii) Do the representations capture lexical semantics? We conduct a thorough investigation along several parameters: (i) Which layers in the architecture capture each of these linguistic phenomena; (ii) How does the choice of translation unit (word, character, or subword unit) impact the linguistic properties captured by the underlying representations? (iii) Do the encoder and decoder learn differently and independently? (iv) Do the representations learned by multilingual NMT models capture the same amount of linguistic information as their bilingual counterparts? Our data-driven, quantitative evaluation illuminates important aspects in NMT models and their ability to capture various linguistic phenomena. We show that deep NMT models trained in an end-to-end fashion, without being provided any direct supervision during the training process, learn a non-trivial amount of linguistic information. Notable findings include the following observations: (i) Word morphology and part-of-speech information are captured at the lower layers of the model; (ii) In contrast, lexical semantics or non-local syntactic and semantic dependencies are better represented at the higher layers of the model; (iii) Representations learned using characters are more informed about word-morphology compared to those learned using subword units; and (iv) Representations learned by multilingual models are richer compared to bilingual models.
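The extrinsic evaluation described above is commonly realized as probing: freeze the trained model, extract per-word activations from a chosen layer, and train a light classifier to predict a linguistic property; higher probe accuracy means more of that information is present in the layer. A generic sketch with synthetic stand-ins for the NMT activations (the real study uses the trained models' representations):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-ins for per-word activations from two NMT layers and POS labels;
# in the real study these come from the trained translation model.
rng = np.random.default_rng(0)
pos_labels = rng.integers(0, 5, size=2000)
layer1 = rng.normal(size=(2000, 256)) + pos_labels[:, None] * 0.05
layer4 = rng.normal(size=(2000, 256)) + pos_labels[:, None] * 0.01

for name, feats in [("layer1", layer1), ("layer4", layer4)]:
    Xtr, Xte, ytr, yte = train_test_split(feats, pos_labels, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    # Higher probe accuracy = more POS information present in that layer.
    print(name, round(probe.score(Xte, yte), 3))
```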
23

Eiter, Thomas, Michael Fink, and Hans Tompits. "A knowledge-based approach for selecting information sources." Theory and Practice of Logic Programming 7, no. 3 (2007): 249–300. http://dx.doi.org/10.1017/s1471068406002754.

Abstract:
Through the Internet and the World-Wide Web, a vast number of information sources has become available, which offer information on various subjects by different providers, often in heterogeneous formats. This calls for tools and methods for building an advanced information-processing infrastructure. One issue in this area is the selection of suitable information sources in query answering. In this paper, we present a knowledge-based approach to this problem, in the setting where one among a set of information sources (prototypically, data repositories) should be selected for evaluating a user query. We use extended logic programs (ELPs) to represent rich descriptions of the information sources, an underlying domain theory, and user queries in a formal query language (here, XML-QL, but other languages can be handled as well). Moreover, we use ELPs for declarative query analysis and generation of a query description. Central to our approach are declarative source-selection programs, for which we define syntax and semantics. Due to the structured nature of the considered data items, the semantics of such programs must carefully respect implicit context information in source-selection rules, and furthermore combine it with possible user preferences. A prototype implementation of our approach has been realized exploiting the DLV KR system and its PLP front-end for prioritized ELPs. We describe a representative example involving specific movie databases, and report about experimental results.
24

Huang, Qunying, and Xinyi Liu. "Semantic trajectory inference from geo-tagged tweets." Abstracts of the ICA 1 (July 15, 2019): 1–2. http://dx.doi.org/10.5194/ica-abs-1-129-2019.

Abstract:
Individual travel trajectories denote a series of places people visit along the time. These places (e.g., home, workspace, and park) reflect people's corresponding activities (e.g., dwelling, work, and entertainment), which are discussed as semantic knowledge and could be implicit under raw data (Yan et al. 2013, Cai et al. 2016). Traditional survey data directly describe people's activities at certain places, while costing tremendous labors and resources (Huang and Wong 2016). GPS data such as taxi logs record exact origin-destination pairs as well as people's stay time along the way, from which semantics can be easily inferred combining with geographical context data (Yan et al. 2013). Research has been done to understand the activity sequences indicated by either individual or collective spatiotemporal (ST) travel trajectories using those dense data. Different models are proposed for trajectory mining and activity inference, including location categorization, frequent region detection, and so on (Njoo et al. 2015). A typical method for matching a location or region with a known activity type is to detect stay points and stay intervals of trajectories and to find geographical context of these stay occurrences (Furtado et al. 2013, Njoo et al. 2015, Beber et al. 2016, Beber et al. 2017).

However, limited progress has been made to mine semantics of trajectory data collected from social media platforms. Specifically, detection of stay points and their intervals could be inaccurate using online trajectories because of data sparsity. Huang et al. (2014) define the notion of activity zone to detect activity types from digital footprints. In this method, individual travel trajectories first are aggregated using a spatial clustering method such as density-based spatial clustering of applications with noise (DBSCAN). Then produced clusters are classified based on a regional land use map and the Google Places application programming interface (API). Such land use data are only published at specific places, such as the state cartography office's website at University of Wisconsin-Madison. Researchers need to search for those data based on their study area. Moreover, while major land use maps can be searched for large areas such as the whole United States, detailed land use data for statewide or citywide areas are made in diverse standards, which adds extra work to classify activity zones consistently. Besides, the Google Places API is a service that Google opened for developers and will return information about a place, given the place location (e.g., address or GPS coordinates), in the search request. However, API keys need to be generated before we can use these interfaces and each user can only make a limited number of free-of-charge requests every day (i.e., 1,000 requests per 24-hour period). In sum, previous methods to detect activity zone types using social media data are not sufficient and can hardly achieve effective data fusion. Compared to the high cost of using officially published datasets, emerging Volunteered Geographic Information (VGI) data offer an alternative to infer the types of an individual's activities performed in each zone (i.e., cluster).

Using geo-tagged tweets as an example, this research proposes a framework for mining social media data, detecting individual semantic travel trajectories, and individual representative daily travel trajectory paths by fusing with VGI data, specifically OpenStreetMap (OSM) datasets. First, inactive users and abnormal users (e.g., users representing a company with an account being shared by many employees) are removed through data pre-processing (Step 1 in Figure 1). Next, a multi-scale spatial clustering method is developed to aggregate online trajectories captured through geo-tagged tweets of a group of users into collective spatial hot-spots (i.e., activity zones; Step 2). By integrating multiple OSM datasets, the activity type (e.g., dwelling, service, transportation, and work) of each collective zone then can be identified (Step 3). Each geo-tagged tweet of an individual, represented as a ST point, is then attached with a collective activity zone that either includes or overlaps a buffer zone of the ST point. Herein, the buffer zone is generated by using the point as the centroid and a predefined threshold as the radius. Given an individual's ST points with semantics (i.e., activity type information) derived from the attached collective activity zone, a semantic activity clustering method is then developed to detect daily representative activity clusters of the individual (Step 4). Finally, individual representative daily semantic travel trajectory paths (i.e., semantic travel trajectories, defined as chronological travel activity sequences) are constructed between every two subsequent activity clusters (Step 5). Experiments with historic geo-tagged tweets collected within Madison, Wisconsin reveal that: 1) the proposed method can detect most significant activity zones with accurate zone types identified (Figure 2); and 2) the semantic activity clustering method based on the derived activity zones can aggregate individual travel trajectories into activity clusters more efficiently compared to DBSCAN and varying DBSCAN (VDBSCAN).
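Step 2 of the framework aggregates geo-tagged points into activity zones with density-based clustering. A minimal sketch using scikit-learn's DBSCAN on toy coordinates; the eps and min_samples values are illustrative choices, not the paper's parameters:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy geo-tagged points (lat, lon): two dense spots plus one outlier.
points = np.array([
    [43.073, -89.401], [43.0731, -89.4012], [43.0729, -89.4008],   # campus
    [43.058, -89.430], [43.0581, -89.4301], [43.0579, -89.4299],   # home
    [43.100, -89.500],                                             # noise
])

# eps in degrees (roughly 50 m here) is an illustrative assumption.
labels = DBSCAN(eps=0.0005, min_samples=3).fit_predict(points)
print(labels)   # e.g. [0 0 0 1 1 1 -1]: two activity zones, one noise point
```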
25

Sickinger, Pawel. "Aiming for Cognitive Equivalence – Mental Models as a Tertium Comparationis for Translation and Empirical Semantics." Research in Language 15, no. 2 (2017): 213–36. http://dx.doi.org/10.1515/rela-2017-0013.

Abstract:
This paper introduces my concept of cognitive equivalence (cf. Mandelblit, 1997), an attempt to reconcile elements of Nida’s dynamic equivalence with recent innovations in cognitive linguistics and cognitive psychology, and building on the current focus on translators’ mental processes in translation studies (see e.g. Göpferich et al., 2009, Lewandowska-Tomaszczyk, 2010). My approach shares its general impetus with Lewandowska-Tomaszczyk’s concept of re-conceptualization, but is independently derived from findings in cognitive linguistics and simulation theory (see e.g. Langacker, 2008; Feldman, 2006; Barsalou, 1999; Zwaan, 2004). Against this background, I propose a model of translation processing focused on the internal simulation of reader reception and the calibration of these simulations to achieve similarity between ST and TT impact. The concept of cognitive equivalence is exemplarily tested by exploring a conceptual / lexical field (MALE BALDNESS) through the way that English, German and Japanese lexical items in this field are linked to matching visual-conceptual representations by native speaker informants. The visual data gathered via this empirical method can be used to effectively triangulate the linguistic items involved, enabling an extra-linguistic comparison across languages. Results show that there is a reassuring level of inter-informant agreement within languages, but that the conceptual domain for BALDNESS is linguistically structured in systematically different ways across languages. The findings are interpreted as strengthening the call for a cognition-focused, embodied approach to translation.
26

Jin, Min. "Semantics in XML Data Processing." Journal of the Korea Academia-Industrial cooperation Society 12, no. 3 (2011): 1327–35. http://dx.doi.org/10.5762/kais.2011.12.3.1327.

27

Gupta, Dhruv. "Search and analytics using semantic annotations." ACM SIGIR Forum 53, no. 2 (2019): 100–101. http://dx.doi.org/10.1145/3458553.3458567.

Abstract:
Current information retrieval systems are limited to text in documents for helping users with their information needs. With the progress in the field of natural language processing, there now exists the possibility of enriching large document collections with accurate semantic annotations. Annotations in the form of part-of-speech tags, temporal expressions, numerical values, geographic locations, and other named entities can help us look at terms in text with additional semantics. This doctoral dissertation presents methods for search and analysis of large semantically annotated document collections. Concretely, we make contributions along three broad directions: indexing, querying, and mining of large semantically annotated document collections. Indexing Annotated Document Collections. Knowledge-centric tasks such as information extraction, question answering, and relationship extraction require a user to retrieve text regions within documents that detail relationships between entities. Current search systems are ill-equipped to handle such tasks, as they can only provide phrase querying with Boolean operators. To enable knowledge acquisition at scale, we propose gyani, an indexing infrastructure for knowledge-centric tasks. gyani enables search for structured query patterns by allowing regular expression operators to be expressed between word sequences and semantic annotations. To implement grep-like search capabilities over large annotated document collections, we present a data model and index design choices involving word sequences, annotations, and their combinations. We show that by using our proposed indexing infrastructure we bring about drastic speedups in crucial knowledge-centric tasks: 95× in information extraction, 53× in question answering, and 12× in relationship extraction. Hyper-phrase queries are multi-phrase set queries that naturally arise when attempting to spot knowledge graph facts or subgraphs in large document collections. An example hyper-phrase query for the fact 〈mahatma gandhi, nominated for, nobel peace prize〉 is: 〈{ mahatma gandhi, m k gandhi, gandhi }, { nominated, nominee, nomination received }, { nobel peace prize, nobel prize for peace, nobel prize in peace }〉. Efficient execution of hyper-phrase queries is of essence when attempting to verify and validate claims concerning named entities or emerging named entities. To do so, it is required that the fact concerning the entity can be contextualized in text. To acquire text regions given a hyper-phrase query, we propose a retrieval framework using combinations of n-gram and skip-gram indexes. Concretely, we model the combinatorial space of the phrases in the hyper-phrase query to be retrieved using vertical and horizontal operators and propose a dynamic programming approach for optimized query processing. We show that using our proposed optimizations we can retrieve sentences in support of knowledge graph facts and subgraphs from large document collections within seconds. Querying Annotated Document Collections. Users often struggle to convey their information needs in short keyword queries. This often results in a series of query reformulations, in an attempt to find relevant documents. To assist users navigate large document collections and lead them to their information needs with ease, we propose methods that leverage semantic annotations. As a first step, we focus on temporal information needs. 
Specifically, we leverage temporal expressions in large document collections to serve time-sensitive queries better. Time-sensitive queries, e.g., summer olympics, implicitly carry a temporal dimension for document retrieval. To help users explore longitudinal document collections, we propose a method that generates time intervals of interest as query reformulations. For instance, for the query world war, time intervals of interest are: [1914; 1918] and [1939; 1945]. The generated time intervals are immediately useful in search-related tasks such as temporal query classification and temporal diversification of documents. As a second and final step, we focus on helping the user in navigating large document collections by generating semantic aspects. The aspects are generated using semantic annotations in the form of temporal expressions, geographic locations, and other named entities. Concretely, we propose the xFactor algorithm that generates semantic aspects in two steps. In the first step, xFactor computes the salience of annotations in models informed of their semantics. Thus, the temporal expressions 1930s and 1939 are considered similar as well as entities such as usain bolt and justin gatlin are considered related when computing their salience. Second, the xFactor algorithm computes the co-occurrence salience of annotations belonging to different types by using an efficient partitioning procedure. For instance, the aspect 〈{usain bolt}, {beijing, london}, [2008; 2012]〉 signifies that the entity, locations, and the time interval are observed frequently in isolation as well as together in the documents retrieved for the query olympic medalists. Mining Annotated Document Collections. Large annotated document collections are a treasure trove of historical information concerning events and entities. In this regard, we first present EventMiner, a clustering algorithm, that mines events for keyword queries by using annotations in the form of temporal expressions, geographic locations, and other disambiguated named entities present in a pseudo-relevant set of documents. EventMiner aggregates the annotation evidences by mathematically modeling their semantics. Temporal expressions are modeled in an uncertainty- and proximity-aware time model. Geographic locations are modeled as minimum bounding rectangles over their geographic co-ordinates. Other disambiguated named entities are modeled as a set of links corresponding to their Wikipedia articles. For a set of history-oriented queries concerning entities and events, we show that our approach can truly identify event clusters when compared to approaches that disregard annotation semantics. Second and finally, we present jigsaw, an end-to-end query-driven system that generates structured tables for user-defined schema from unstructured text. To define the table schema, we describe query operators that help perform structured search on annotated text and fill in table cell values. To resolve table cell values whose values can not be retrieved, we describe methods for inferring null values using local context. jigsaw further relies on semantic models for text and numbers to link together near-duplicate rows. This way, jigsaw is able to piece together paraphrased, partial, and redundant text regions retrieved in response to structured queries to generate high-quality tables within seconds. This doctoral dissertation was supervised by Klaus Berberich at the Max Planck Institute for Informatics and htw saar in Saarbrücken, Germany.
This doctoral dissertation was supervised by Klaus Berberich at the Max Planck Institute for Informatics and htw saar in Saarbrücken, Germany. The thesis is available online at: https://people.mpi-inf.mpg.de/~dhgupta/pub/dhruv-thesis.pdf.
APA, Harvard, Vancouver, ISO, and other styles
28

Tohti, Turdi, Jimmy Huang, Askar Hamdulla, and Xing Tan. "Text Filtering through Multi-Pattern Matching: A Case Study of Wu–Manber–Uy on the Language of Uyghur." Information 10, no. 8 (2019): 246. http://dx.doi.org/10.3390/info10080246.

Full text
Abstract:
Given its generality in applications and its time efficiency on big datasets, the technique of text filtering through pattern matching has in recent years attracted increasing attention from the information retrieval and Natural Language Processing (NLP) research communities at large. That being the case, however, it has yet to be seen how this technique and its algorithms (e.g., Wu–Manber, which is also considered in this paper) can be applied and adapted properly and effectively to Uyghur, a low-resource language mostly spoken by the ethnic Uyghur group, with a population of more than eleven million, in Xinjiang, China. We observe that, technically, the challenge is mainly caused by two factors: (1) vowel weakening and (2) mismatching in semantics between affixes and stems. Accordingly, in this paper we propose Wu–Manber–Uy, an improved variant of Wu–Manber dedicated particularly to the Uyghur language. Wu–Manber–Uy implements a stem deformation-based pattern expansion strategy, specifically to reduce the mismatching of patterns caused by vowel weakening and spelling errors. A two-way strategy that monitors and controls changes in the lexical meaning of stems during word-building is also used in Wu–Manber–Uy. Extra consideration with respect to Word2vec and the dictionary is incorporated into the system for processing Uyghur. The experimental results we have obtained consistently demonstrate the high performance of Wu–Manber–Uy.
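For readers unfamiliar with the base algorithm, a minimal Wu–Manber matcher can be sketched as follows. This shows only the classic block-shift algorithm, in Python for readability; Wu–Manber–Uy's stem-deformation pattern expansion and Uyghur-specific handling are not reproduced here.

```python
def wu_manber(text, patterns, B=2):
    """Multi-pattern search; returns (start, pattern) matches."""
    m = min(len(p) for p in patterns)          # scan window length
    assert m >= B, "patterns must be at least B characters long"
    default = m - B + 1
    shift, bucket = {}, {}
    for p in patterns:
        for q in range(B, m + 1):              # blocks within the first m chars
            shift[p[q - B:q]] = min(shift.get(p[q - B:q], default), m - q)
        bucket.setdefault(p[m - B:m], []).append(p)
    out, pos = [], m - 1                       # pos = last char of the window
    while pos < len(text):
        block = text[pos - B + 1:pos + 1]
        s = shift.get(block, default)
        if s:                                  # safe to jump ahead
            pos += s
        else:                                  # candidate window: verify fully
            start = pos - m + 1
            out += [(start, p) for p in bucket.get(block, [])
                    if text.startswith(p, start)]
            pos += 1
    return out

print(wu_manber("uyghur text filtering with pattern matching",
                ["pattern", "text", "match"]))
# [(7, 'text'), (27, 'pattern'), (35, 'match')]
```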
APA, Harvard, Vancouver, ISO, and other styles
29

Ait-Mlouk, Addi, Xuan-Son Vu, and Lili Jiang. "WINFRA: A Web-Based Platform for Semantic Data Retrieval and Data Analytics." Mathematics 8, no. 11 (2020): 2090. http://dx.doi.org/10.3390/math8112090.

Full text
Abstract:
Given the huge amount of heterogeneous data stored in different locations, such data need to be federated and semantically interconnected for further use. This paper introduces WINFRA, a comprehensive open-access platform for semantic web data and advanced analytics based on natural language processing (NLP) and data mining techniques (e.g., association rules, clustering, and classification based on associations). The system is designed to facilitate federated data analysis, knowledge discovery, information retrieval, and new techniques for dealing with semantic web and knowledge graph representation. The processing step integrates data from multiple sources virtually by creating virtual databases. Afterwards, an RDF Generator produces RDF files for the different data sources, together with SPARQL queries, to support semantic data search and knowledge graph representation. Furthermore, several application cases are provided to demonstrate how the platform facilitates advanced data analytics over semantic data and to showcase our proposed approach toward semantic association rules.
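The RDF-generation and SPARQL-querying step can be pictured with rdflib. This is only a sketch under invented names (one toy stand-in for a "virtual" source); WINFRA's actual RDF Generator and federated virtual databases are considerably more elaborate.

```python
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")          # hypothetical vocabulary
g = Graph()
rows = [("alice", "Lund"), ("bob", "Umea")]    # stand-in for one data source
for name, city in rows:
    person = EX[name]
    g.add((person, RDF.type, EX.Person))
    g.add((person, EX.livesIn, Literal(city)))

query = """
    PREFIX ex: <http://example.org/>
    SELECT ?p ?c WHERE { ?p a ex:Person ; ex:livesIn ?c }"""
for row in g.query(query):
    print(row.p, row.c)                        # semantic search over the graph
```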
APA, Harvard, Vancouver, ISO, and other styles
30

Thirunarayan, Krishnaprasad, and Amit Sheth. "Semantics-Empowered Big Data Processing with Applications." AI Magazine 36, no. 1 (2015): 39–54. http://dx.doi.org/10.1609/aimag.v36i1.2566.

Full text
Abstract:
We discuss the nature of big data and address the role of semantics in analyzing and processing big data that arises in the context of physical-cyber-social systems. To handle volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle variety, we resort to semantic models and annotations of data so that intelligent processing can be done independent of the heterogeneity of data formats and media. To handle velocity, we seek to use continuous semantics capability to dynamically create event- or situation-specific models and recognize relevant new concepts, entities, and facts. To handle veracity, we explore trust models and approaches to glean trustworthiness. These four V's of big data are harnessed by semantics-empowered analytics to derive value in support of applications transcending the physical-cyber-social continuum.
APA, Harvard, Vancouver, ISO, and other styles
31

Hupkes, Dieuwke, Sara Veldhoen, and Willem Zuidema. "Visualisation and 'Diagnostic Classifiers' Reveal How Recurrent and Recursive Neural Networks Process Hierarchical Structure." Journal of Artificial Intelligence Research 61 (April 30, 2018): 907–26. http://dx.doi.org/10.1613/jair.1.11196.

Full text
Abstract:
We investigate how neural networks can learn and process languages with hierarchical, compositional semantics. To this end, we define the artificial task of processing nested arithmetic expressions and study whether different types of neural networks can learn to compute their meaning. We find that recursive neural networks can implement a generalising solution to this problem, and we visualise this solution by breaking it up into three steps: project, sum, and squash. As a next step, we investigate recurrent neural networks and show that a gated recurrent unit, which processes its input incrementally, also performs very well on this task: the network learns to predict the outcome of the arithmetic expressions with high accuracy, although performance deteriorates somewhat with increasing length. To develop an understanding of what the recurrent network encodes, visualisation techniques alone do not suffice. Therefore, we develop an approach in which we formulate and test multiple hypotheses about the information encoded and processed by the network. For each hypothesis, we derive predictions about features of the hidden state representations at each time step, and we train 'diagnostic classifiers' to test those predictions. Our results indicate that the networks follow a strategy similar to our hypothesised 'cumulative strategy', which explains the high accuracy of the network on novel expressions, the generalisation to longer expressions than seen in training, and the mild deterioration with increasing length. This in turn shows that diagnostic classifiers can be a useful technique for opening up the black box of neural networks. We argue that diagnostic classification, unlike most visualisation techniques, does scale up from small networks in a toy domain to larger and deeper recurrent networks dealing with real-life data, and may therefore contribute to a better understanding of the internal dynamics of current state-of-the-art models in natural language processing.
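The diagnostic-classifier recipe itself is compact: record hidden states, label each with the feature a hypothesis predicts, and fit a simple probe. The sketch below uses synthetic stand-ins for the hidden states, so it illustrates only the recipe, not the paper's GRU; high held-out accuracy would support a hypothesis, chance-level accuracy would refute it.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
H = rng.normal(size=(2000, 64))       # stand-in hidden states (steps x dim)
w = rng.normal(size=64)
y = (H @ w > 0).astype(int)           # synthetic feature, linearly decodable
# In the paper, y would be e.g. the hypothesised cumulative result per step.

H_tr, H_te, y_tr, y_te = train_test_split(H, y, test_size=0.25, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(H_tr, y_tr)
print("diagnostic accuracy:", probe.score(H_te, y_te))
```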
APA, Harvard, Vancouver, ISO, and other styles
32

Effenberger, Charlotte. "Linguistic Approach to Semantic Correlation Rules." SHS Web of Conferences 102 (2021): 02004. http://dx.doi.org/10.1051/shsconf/202110202004.

Full text
Abstract:
As communication between humans and machines in natural language still seems essential, especially for end users, Natural Language Processing (NLP) methods are used to classify and interpret it. NLP, as a technology, combines grammatical, semantic, and pragmatic analyses with statistics or machine learning to make language logically understandable by machines and to allow new interpretations of data, in contrast to predefined logical structures. Some NLP methods do not go far beyond retrieving an indexation of content; indexation is therefore considered a very simple linguistic approach. Semantic correlation rules offer the possibility of retrieving simple semantic relations without a special tool, by using a set of predefined rules. This paper therefore examines to what extent Semantic Correlation Rules (SCRs) are able to retrieve linguistic semantic relations, and to what extent a simple NLP method can be set up to allow further interpretation of data. To do so, a simple linguistic model was built from an indexation enriched with semantic relations that give data more context. These semantic relations were then queried by SCRs to set up an NLP method.
APA, Harvard, Vancouver, ISO, and other styles
33

Xhafa, Fatos, and Leonard Barolli. "Semantics, intelligent processing and services for big data." Future Generation Computer Systems 37 (July 2014): 201–2. http://dx.doi.org/10.1016/j.future.2014.02.004.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Benz, Anton, and Reinhard Blutner. "Papers on pragmasemantics." ZAS Papers in Linguistics 51 (January 1, 2009): 216. http://dx.doi.org/10.21248/zaspil.51.2009.371.

Full text
Abstract:
Optimality theory as used in linguistics (Prince & Smolensky, 1993/2004; Smolensky & Legendre, 2006) and cognitive psychology (Gigerenzer & Selten, 2001) is a theoretical framework that aims to integrate constraint-based knowledge representation systems, generative grammar, cognitive skills, and aspects of neural network processing. In recent years, considerable progress has been made in overcoming the artificial separation between the linguistic disciplines, which are mainly concerned with describing natural language competence, and the psychological disciplines, which are interested in real language performance.
 
 The semantics and pragmatics of natural language is a research topic that calls for an integration of philosophical, linguistic, and psycholinguistic aspects, including their neural underpinnings. Recent work on experimental pragmatics in particular (e.g. Noveck & Sperber, 2005; Garrett & Harnish, 2007) has shown that real progress in the area of pragmatics is not possible without using data from all available domains, including data from language acquisition and from actual language generation and comprehension performance. It is a conceivable research programme to use the optimality-theoretic framework to realize this integration.
 
 Game theoretic pragmatics is a relatively young development in pragmatics. The idea of viewing communication as a strategic interaction between speaker and hearer is not new; it is already present in Grice's (1975) classical paper on conversational implicatures. What game theory offers is a mathematical framework in which strategic interaction can be precisely described. It is a leading paradigm in economics, as witnessed by a series of Nobel prizes in the field, and it is of growing importance to other disciplines of the social sciences. In linguistics, its main applications so far have been pragmatics and theoretical typology. For pragmatics, game theory promises a firm foundation, and a rigor which will hopefully allow studying pragmatic phenomena with the same precision as that achieved in formal semantics.
 
 The development of game theoretic pragmatics is closely connected to the development of bidirectional optimality theory (Blutner, 2000). It can be easily seen that the game theoretic notion of a Nash equilibrium and the optimality theoretic notion of a strongly optimal form-meaning pair are closely related to each other. The main impulse that bidirectional optimality theory gave to research on game theoretic pragmatics stemmed from serious empirical problems that resulted from interpreting the principle of weak optimality as a synchronic interpretation principle.
 
 In this volume, we have collected papers that are concerned with several aspects of game and optimality theoretic approaches to pragmatics.
APA, Harvard, Vancouver, ISO, and other styles
35

Kim, Hyeon Gyu. "Exploiting Window Query Semantics in Scalable Data Stream Processing." International Journal of Control and Automation 8, no. 11 (2015): 13–20. http://dx.doi.org/10.14257/ijca.2015.8.11.02.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

NISHIMURA, SUSUMU, and ATSUSHI OHORI. "Parallel functional programming on recursively defined data via data-parallel recursion." Journal of Functional Programming 9, no. 4 (1999): 427–62. http://dx.doi.org/10.1017/s0956796899003457.

Full text
Abstract:
This article proposes a new language mechanism for data-parallel processing of dynamically allocated, recursively defined data. Unlike conventional array-based data-parallelism, it allows parallel processing of general recursively defined data such as lists or trees in a functional way. This is achieved by representing a recursively defined datum as a system of equations and defining new language constructs for parallel transformation of a system of equations. By integrating them with a higher-order functional language, we obtain a functional programming language suitable for describing data-parallel algorithms on recursively defined data in a declarative way. The language has an ML-style polymorphic type system and a type-sound operational semantics that uniformly integrates the parallel evaluation mechanism with the semantics of a typed functional language. We also show the intended parallel execution model behind the formal semantics, assuming an idealized distributed-memory multicomputer.
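The core idea, that once a recursive datum is written as a system of equations each equation can be rewritten independently and therefore in parallel, can be caricatured in plain Python. The encoding and the process pool below are illustrative only; the paper defines typed language constructs, not dictionaries of cons cells.

```python
from concurrent.futures import ProcessPoolExecutor

# The list [1, 2, 3] as a system of equations over cons cells:
#   x0 = cons(1, x1); x1 = cons(2, x2); x2 = cons(3, nil)
equations = {"x0": (1, "x1"), "x1": (2, "x2"), "x2": (3, None)}

def transform(item):
    var, (head, tail) = item
    return var, (head * 10, tail)      # map over heads; structure untouched

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        new_eqs = dict(pool.map(transform, equations.items()))
    print(new_eqs)  # {'x0': (10, 'x1'), 'x1': (20, 'x2'), 'x2': (30, None)}
```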
APA, Harvard, Vancouver, ISO, and other styles
37

Liu, Jun Qiang, and Xiao Ling Guan. "Composite Event Processing for Data Streams and Domain Knowledge." Advanced Materials Research 219-220 (March 2011): 927–31. http://dx.doi.org/10.4028/www.scientific.net/amr.219-220.927.

Full text
Abstract:
In recent years, the processing of composite event queries over data streams has attracted much research attention. Traditional database techniques were not designed for stream processing systems; furthermore, continuous queries are often formulated in declarative query languages without a specified semantics. To overcome these deficiencies, this article presents the design, implementation, and evaluation of a system that executes queries over data streams with semantic information. A set of optimization techniques is then proposed for query handling. Our approach thus not only makes it possible to express queries with a sound semantics, but also provides a solid foundation for query optimization. Experimental results show that our approach is effective and efficient for data streams and domain knowledge.
APA, Harvard, Vancouver, ISO, and other styles
38

Touzani, Samir, and Jessica Granderson. "Open Data and Deep Semantic Segmentation for Automated Extraction of Building Footprints." Remote Sensing 13, no. 13 (2021): 2578. http://dx.doi.org/10.3390/rs13132578.

Full text
Abstract:
Advances in machine learning and computer vision, combined with increased access to unstructured data (e.g., images and text), have created an opportunity for automated extraction of building characteristics, cost-effectively and at scale. These characteristics are relevant to a variety of urban and energy applications, yet are time-consuming and costly to acquire with today's manual methods. Several recent research studies have shown that, in comparison to more traditional methods based on a feature-engineering approach, an end-to-end learning approach based on deep learning algorithms significantly improves the accuracy of automatic building footprint extraction from remote sensing images. However, these studies used limited benchmark datasets that had been carefully curated and labeled. How well the accuracy of these deep learning-based approaches holds when using less curated training data has not received enough attention. The aim of this work is to leverage openly available data to automatically generate a larger training dataset with more variability in terms of regions and types of cities, which can be used to build more accurate deep learning models. In contrast to most benchmark datasets, the gathered data have not been manually curated; thus, the training dataset is not perfectly clean in terms of remote sensing images exactly matching the ground-truth building footprints. A workflow that includes data pre-processing, deep learning semantic segmentation modeling, and results post-processing is introduced and applied to a dataset that includes remote sensing images from 15 cities and five counties from various regions of the USA, comprising 8,607,677 buildings. The accuracy of the proposed approach was measured on an out-of-sample testing dataset corresponding to 364,000 buildings from three USA cities. The results compared favorably to those obtained from Microsoft's recently released US building footprint dataset.
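Scoring predicted footprint masks against ground truth is typically done with an overlap measure; the intersection-over-union sketch below is one standard choice, assumed here for illustration (the paper's exact evaluation protocol may differ).

```python
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection-over-union of two binary building-footprint masks."""
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0

pred = np.zeros((8, 8), bool);  pred[2:6, 2:6] = True
truth = np.zeros((8, 8), bool); truth[3:7, 3:7] = True
print(round(iou(pred, truth), 3))   # 0.391: 9 shared pixels out of 23
```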
APA, Harvard, Vancouver, ISO, and other styles
39

Dell'Aglio, Daniele, Emanuele Della Valle, Jean-Paul Calbimonte, and Oscar Corcho. "RSP-QL Semantics." International Journal on Semantic Web and Information Systems 10, no. 4 (2014): 17–44. http://dx.doi.org/10.4018/ijswis.2014100102.

Full text
Abstract:
RDF and SPARQL are established standards for data interchange and querying on the Web. While they have been shown to be useful and applicable in many scenarios, they are not sufficiently adequate for dealing with streams of data and their intrinsically continuous nature. In recent years, data and query languages have been proposed to extend both RDF and SPARQL for streams and continuous processing, under the name of RDF Stream Processing (RSP). These efforts resulted in several models and implementations that, at first glance, appear to propose alternative syntaxes but equivalent semantics. However, when asked to continuously answer the same queries on the same data streams, they provide different answers at disparate moments, due to the heterogeneity of their operational semantics. These discrepancies render the process of understanding and comparing continuous query results complex and misleading. In this work, the authors propose RSP-QL, a comprehensive model that formally defines the semantics of an RSP system. RSP-QL makes explicit the hidden assumptions of currently available RSP systems, allows defining a formal notion of correctness for RSP query results and, thus, explains why available implementations provide different answers at disparate moments.
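The operational differences RSP-QL makes explicit can be seen even in a toy time-based window model: the window width, the slide, and the starting offset t0 jointly determine which answers are reported when. The sketch below enumerates windows over a timestamped list (illustrative only; real RSP engines operate continuously over RDF streams).

```python
def windows(stream, width, slide, t0=0):
    """Enumerate time-based windows [o, o + width) over (timestamp, item) pairs."""
    t_max = max(t for t, _ in stream)
    o = t0
    while o <= t_max:
        yield (o, o + width), [x for t, x in stream if o <= t < o + width]
        o += slide   # a different t0 or slide yields different window contents

stream = [(1, "a"), (2, "b"), (4, "c"), (5, "d")]
for bounds, content in windows(stream, width=3, slide=2):
    print(bounds, content)
# (0, 3) ['a', 'b']   (2, 5) ['b', 'c']   (4, 7) ['c', 'd']
```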
APA, Harvard, Vancouver, ISO, and other styles
40

Wang, Lin, Xingfu Wang, Ammar Hawbani, Yan Xiong, and Xu Zhang. "Rethinking Separable Convolutional Encoders for End-to-End Semantic Image Segmentation." Mathematical Problems in Engineering 2021 (April 16, 2021): 1–12. http://dx.doi.org/10.1155/2021/5566691.

Full text
Abstract:
Encoder-decoder (codec) neural networks for semantic image segmentation show good development prospects: they can extract richer semantic features, but at a high computational cost. To address this problem, this article introduces a codec based on separable convolutional neural networks for semantic image segmentation. The proposed method converts the layers of a traditional convolutional neural network into separable convolutions, which reduces the cost of image data segmentation and improves processing efficiency. Moreover, the article builds a separable convolutional neural network codec structure and designs a semantic segmentation pipeline, with which experiments on codec-based semantic image segmentation are carried out. The experimental results show that the improved codec raises the average score on the dataset by 0.01, which demonstrates the effectiveness of the improved SegProNet. The smaller the number of training set samples, the more pronounced the performance improvement.
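The building block behind such encoders is the depthwise separable convolution: a per-channel convolution followed by a 1×1 pointwise convolution in place of one full convolution. A minimal PyTorch sketch (SegProNet's actual architecture is not reproduced here) shows the parameter saving:

```python
import torch
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Depthwise convolution (groups=in_ch) followed by a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 64, 64)
full = nn.Conv2d(32, 64, 3, padding=1)
sep = SeparableConv2d(32, 64)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(full), count(sep))   # 18496 vs. 2432 parameters
print(sep(x).shape)              # torch.Size([1, 64, 64, 64])
```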
APA, Harvard, Vancouver, ISO, and other styles
41

Zhou, Jiemin, Jie Shen, Lukáš Herman, and Yixian Du. "Study on multi-source data integrating standard and 3D cartographic visualization of urban flooding based on CityGML." Abstracts of the ICA 1 (July 15, 2019): 1–2. http://dx.doi.org/10.5194/ica-abs-1-435-2019.

Full text
Abstract:
Abstract. Urban flooding refers to the phenomenon of water accumulation in cities when strong or continuous precipitation exceeds urban drainage capacity. At present, urban flooding has become another major urban disease, after problems such as crowding, traffic congestion and environmental pollution (Weiwu W, et al. 2015). In order to cope better with waterlogging, many hydrodynamic models can accurately simulate the process of pipe network drainage and water accumulation. However, two limitations currently restrict the practicality of these models. First, they often require a large amount of input data as support; these data are multi-source, heterogeneous, multi-scale and multi-resolution, which makes data acquisition and processing very difficult (Dongdong Z, et al. 2014). Second, the analysis results of the models contain a large amount of waterlogging-related information, but this information is usually represented by simple texts and tables, which is not conducive to interpretation, transmission and visualization. Especially for location-related information, such as flooding points, flow in the various parts of the drainage pipes, and water depth in flooded areas, non-interactive two-dimensional maps or tables are inconvenient for disaster management and decision-making.

Therefore, a proper data model is needed that respects the needs of integrating multi-source input and output data of waterlogging simulation and analysis, as well as the needs of 3D visualization in the front end. What has proven in recent years to be an emerging and effective approach is the adoption of standard-based, integrated semantic 3D virtual city models, which represent an information hub for most of the above-mentioned needs (Agugiaro et al. 2018). In particular, being based on open standards (e.g. the CityGML standard proposed by the Open Geospatial Consortium), virtual city models firstly reduce the effort in terms of data preparation and provision. Secondly, they offer clear data structures, ontologies and semantics to facilitate data exchange between different domains and applications, and 3D visualization, which is essential for crisis management (Herman L, Řezník T, 2015). However, a standardized and omni-comprehensive urban data model that also covers the waterlogging domain is still missing; even CityGML falls partially short when it comes to defining specific entities and attributes for waterlogging-related applications.

This study aims to propose a Waterlogging Application Domain Extension (ADE) for CityGML 2.0, which is used to integrate the multi-source input data and the rich waterlogging-related information of waterlogging simulation and analysis. In order to achieve the above objectives, the following contents will be investigated:

(1) Analysis of urban hydrodynamic models' functions and their data characteristics. Analyze the function of hydrodynamic models in the city, and explore the source, precision and organizational structure of the input and output data during the simulation and analysis of waterlogging.

(2) Construction of a waterlogging data integration model based on CityGML. Based on CityGML 2.0 and its ADE mechanism, we will construct a conceptual model of a waterlogging data integration standard to describe the geometric and spatial structure of urban buildings, vegetation, land use, underground pipelines, waterlogging-related entities, etc. Further waterlogging information, such as flooding points, drainage network carrying pressure and catchment areas, will be integrated into the corresponding fields of the data integration model in a reasonable way to facilitate Internet transmission and visualization.

(3) Interactive 3D visualization method based on the proposed waterlogging ADE. Based on the rich waterlogging-related information in the waterlogging ADE, combined with 3D visualization technology, a 3D dynamic interactive visualization method is explored, in order to provide more intuitive data support for waterlogging disaster management and decision-making.

Through this study, we expect to improve the efficiency and practicability of current waterlogging simulation and analysis by integrating multi-source waterlogging-related data based on CityGML. As the result of the research, we intend to develop a prototype system based on the multi-source waterlogging-related data integration method proposed in this study. With this standard, we can provide unified and standard data support for the waterlogging model. At the same time, the results of the waterlogging models are automatically integrated into the data set for dynamic 3D visualization.
APA, Harvard, Vancouver, ISO, and other styles
42

Chen, Yan, Lin Huang, Keliang Chen, et al. "White matter basis for the hub-and-spoke semantic representation: evidence from semantic dementia." Brain 143, no. 4 (2020): 1206–19. http://dx.doi.org/10.1093/brain/awaa057.

Full text
Abstract:
The hub-and-spoke semantic representation theory posits that semantic knowledge is processed in a neural network, which contains an amodal hub, the sensorimotor modality-specific regions, and the connections between them. The exact neural basis of the hub, regions and connectivity remains unclear. Semantic dementia could be an ideal lesion model to construct the semantic network as this disease presents both amodal and modality-specific semantic processing (e.g. colour) deficits. The goal of the present study was to identify, using an unbiased data-driven approach, the semantic hub and its general and modality-specific semantic white matter connections by investigating the relationship between the lesion degree of the network and the severity of semantic deficits in 33 patients with semantic dementia. Data of diffusion-weighted imaging and behavioural performance in processing knowledge of general semantic and six sensorimotor modalities (i.e. object form, colour, motion, sound, manipulation and function) were collected from each subject. Specifically, to identify the semantic hub, we mapped the white matter nodal degree value (a graph theoretical index) of the 90 regions in the automated anatomical labelling atlas with the general semantic abilities of the patients. Of the regions, only the left fusiform gyrus was identified as the hub because its structural connectivity strength (i.e. nodal degree value) could significantly predict the general semantic processing of the patients. To identify the general and modality-specific semantic connections of the semantic hub, we separately correlated the white matter integrity values of each tract connected with the left fusiform gyrus with the performance for general semantic processing and for each of the six semantic modalities. The results showed that the hub region worked in concert with nine other regions in the semantic memory network for general semantic processing. Moreover, the connection between the hub and the left calcarine was associated with colour-specific semantic processing. The observed effects could not be accounted for by potential confounding variables (e.g. total grey matter volume, regional grey matter volume and performance on non-semantic control tasks). Our findings refine the neuroanatomical structure of the semantic network and underline the critical role of the left fusiform gyrus and its connectivity in the network.
APA, Harvard, Vancouver, ISO, and other styles
43

Sampaio, Thiago Oliveira da Motta, and Aniela Improta França. "Event-duration semantics in online sentence processing." Letras de Hoje 53, no. 1 (2018): 59. http://dx.doi.org/10.15448/1984-7726.2018.1.28695.

Full text
Abstract:
Several experiments in psycholinguistics have found evidence of Iterative Coercion, an effect related to the reanalysis of punctual events used in durative contexts, triggering an iterative meaning. We argue that this effect is not related to aspectual features and that event-duration semantics is accessed during sentence processing. We ran a self-paced reading experiment in Brazilian Portuguese whose sentences contain events with an average duration of a few minutes. These sentences were inserted in durative contexts that became the experiment's conditions, following a Latin square design: control condition (minutes), subtractive (seconds), iterative (hours) and habitual (days). Higher RTs were measured at the critical segments of all experimental conditions, except for the habitual context. The results corroborated our hypothesis while defying the psychological reality of habitual coercion. To better observe the habitual coercion condition, we present a reanalysis of the data of Sampaio et al. (2014). The present analysis confirms the results of our tests.
APA, Harvard, Vancouver, ISO, and other styles
44

Fisher, Meghan A., Pádraig Ó. Conbhuí, Cathal Ó. Brion, et al. "ExSeisDat: A set of parallel I/O and workflow libraries for petroleum seismology." Oil & Gas Science and Technology – Revue d’IFP Energies nouvelles 73 (2018): 74. http://dx.doi.org/10.2516/ogst/2018048.

Full text
Abstract:
Seismic datasets are extremely large and are broken into data files ranging in size from 100s of GiBs to 10s of TiBs and larger. The parallel I/O for these files is complex due to the amount of data, along with varied and multiple access patterns within individual files. Properties of legacy file formats, such as the de facto standard SEG-Y, also contribute to the decrease in developer productivity while working with these files. SEG-Y files embed their own internal layout, which can conflict with traditional, file-system-level layout optimization schemes. Additionally, as seismic files continue to increase in size, memory bottlenecks will be exacerbated, resulting in the need for smart I/O optimization not only to increase the efficiency of reads/writes, but to manage memory usage as well. The ExSeisDat (Extreme-Scale Seismic Data) set of libraries addresses these problems through the development and implementation of easy-to-use, object-oriented libraries that are portable and open source, with bindings available in multiple languages. The lower-level parallel I/O library, ExSeisPIOL (Extreme-Scale Seismic Parallel I/O Library), targets SEG-Y and other proprietary formats, simplifying I/O by internally interfacing MPI-I/O and other I/O interfaces. The I/O is explicitly handled; end users only need to define the memory limits, the decomposition of I/O across processes, and the data access patterns when reading and writing data. ExSeisPIOL bridges the layout gap between the SEG-Y file structure and file system organization. The higher-level parallel seismic workflow library, ExSeisFlow (Extreme-Scale Seismic workFlow), leverages ExSeisPIOL, further simplifying I/O by implicitly handling all I/O parameters, thus allowing geophysicists to focus on domain-specific development. Operations in ExSeisFlow focus on prestack processing and can be performed on single traces, individual gathers, and across entire surveys, including out-of-core sorting, binning, filtering, and transforming. To optimize memory management, the workflow only reads in data pertinent to the operations being performed instead of an entire file. A smart caching system manages the read data, discarding it when no longer needed in the workflow. As the libraries are optimized to handle spatial and temporal locality, they are a natural fit for burst buffer technologies, particularly DDN's Infinite Memory Engine (IME) system. With appropriate access semantics or through the direct exploitation of the low-level interfaces, the ExSeisDat stack on IME delivers a significant improvement in I/O performance over standalone parallel file systems like Lustre.
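The "decomposition of I/O across processes" that ExSeisPIOL asks callers to define boils down to arithmetic like the following (a Python sketch of one contiguous block split; the actual library exposes this through its C++/MPI-I/O interfaces, so the names here are illustrative):

```python
def block_decompose(n_traces: int, n_ranks: int, rank: int) -> range:
    """Contiguous block of trace indices assigned to one process."""
    base, extra = divmod(n_traces, n_ranks)
    start = rank * base + min(rank, extra)   # earlier ranks absorb remainders
    size = base + (1 if rank < extra else 0)
    return range(start, start + size)

for r in range(3):
    print(r, block_decompose(10, 3, r))
# 0 range(0, 4)   1 range(4, 7)   2 range(7, 10)
```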
APA, Harvard, Vancouver, ISO, and other styles
45

Trinh, Tuan-Dat, Peter Wetz, Ba-Lam Do, Elmar Kiesling, and A. Min Tjoa. "Distributed mashups: a collaborative approach to data integration." International Journal of Web Information Systems 11, no. 3 (2015): 370–96. http://dx.doi.org/10.1108/ijwis-04-2015-0018.

Full text
Abstract:
Purpose – This paper aims to present a collaborative mashup platform for dynamic integration of heterogeneous data sources. The platform encourages sharing and connects data publishers, integrators, developers and end users.

Design/methodology/approach – The approach is based on a visual programming paradigm and follows three fundamental principles: openness, connectedness and reusability. The platform is based on semantic Web technologies and the concept of linked widgets, i.e. semantic modules that allow users to access, integrate and visualize data in a creative and collaborative manner.

Findings – The platform can effectively tackle data integration challenges by allowing users to explore relevant data sources for different contexts, tackling the data heterogeneity problem and facilitating automatic data integration, easing data integration via simple operations and fostering reusability of data processing tasks.

Research limitations/implications – This research has focused exclusively on conceptual and technical aspects so far; a comprehensive user study and extensive performance and scalability testing are left for future work.

Originality/value – A key contribution of this paper is the concept of distributed mashups. These ad hoc data integration applications allow users to perform data processing tasks in a collaborative and distributed manner, simultaneously on multiple devices. The approach requires no server infrastructure to upload data, but rather allows each user to keep control over their data and expose only relevant subsets. Distributed mashups can run persistently in the background and are hence ideal for real-time data monitoring or data streaming use cases. Furthermore, we introduce automatic mashup composition as an innovative approach based on an explicit semantic widget model.
APA, Harvard, Vancouver, ISO, and other styles
46

HULSTIJN, JAN. "Semantic-informational and formal processing principles in Processability Theory." Bilingualism: Language and Cognition 1, no. 1 (1998): 27–28. http://dx.doi.org/10.1017/s1366728998000054.

Full text
Abstract:
Let me begin my comments on Pienemann's keynote paper by expressing my admiration for the scholar who has developed Processability Theory (PT) over a period of some fifteen years with great determination and perseverance. What in earlier publications (e.g. Pienemann, 1985, 1987) appeared to me to be a rather disparate set of principles aiming to account for a limited set of empirical data (the well known sequence of five word orders of the ZISA study), has now evolved into a coherent theory which meets the demands of falsifiability, as PT's claims are formulated in sufficient detail to allow SLA researchers to put them to empirical test. PT comprises a number of principles of great generality, accounting, in principle, for the acquisition of any structure in any language, thereby exceeding the limits of the ZISA data of natural German L2 acquisition. As such, it is to be hoped that PT will have a healthy influence on the field of SLA research, as this field, in the last few years, has perhaps been dominated too much by the issue of whether L2 learners have access to Universal Grammar.
APA, Harvard, Vancouver, ISO, and other styles
47

BROUWER, Susanne, Deniz ÖZKAN, and Aylin C. KÜNTAY. "Verb-based prediction during language processing: the case of Dutch and Turkish." Journal of Child Language 46, no. 1 (2018): 80–97. http://dx.doi.org/10.1017/s0305000918000375.

Full text
Abstract:
This study investigated whether cross-linguistic differences affect semantic prediction. We assessed this by looking at two languages, Dutch and Turkish, that differ in word order and thus vary in how words come together to create sentence meaning. In an eye-tracking task, Dutch and Turkish four-year-olds (N = 40), five-year-olds (N = 58), and adults (N = 40) were presented with a visual display containing two familiar objects (e.g., a cake and a tree). Participants heard semantically constraining (e.g., “The boy eats the big cake”) or neutral sentences (e.g., “The boy sees the big cake”) in their native language. The Dutch data revealed a prediction effect for children and adults; however, it was larger for the adults. The Turkish data revealed a prediction effect only for the adults, not for the children. These findings reveal that experience with word order structures and/or automatization of language processing routines may lead to time-course differences in semantic prediction.
APA, Harvard, Vancouver, ISO, and other styles
48

Vogelsang, David A., Matthias Gruber, Zara M. Bergström, Charan Ranganath, and Jon S. Simons. "Alpha Oscillations during Incidental Encoding Predict Subsequent Memory for New “Foil” Information." Journal of Cognitive Neuroscience 30, no. 5 (2018): 667–79. http://dx.doi.org/10.1162/jocn_a_01234.

Full text
Abstract:
People can employ adaptive strategies to increase the likelihood that previously encoded information will be successfully retrieved. One such strategy is to constrain retrieval toward relevant information by reimplementing the neurocognitive processes that were engaged during encoding. Using EEG, we examined the temporal dynamics with which constraining retrieval toward semantic versus nonsemantic information affects the processing of new “foil” information encountered during a memory test. Time–frequency analysis of EEG data acquired during an initial study phase revealed that semantic compared with nonsemantic processing was associated with alpha decreases in a left frontal electrode cluster from around 600 msec after stimulus onset. Successful encoding of semantic versus nonsemantic foils during a subsequent memory test was related to decreases in alpha oscillatory activity in the same left frontal electrode cluster, which emerged relatively late in the trial at around 1000–1600 msec after stimulus onset. Across participants, left frontal alpha power elicited by semantic processing during the study phase correlated significantly with left frontal alpha power associated with semantic foil encoding during the memory test. Furthermore, larger left frontal alpha power decreases elicited by semantic foil encoding during the memory test predicted better subsequent semantic foil recognition in an additional surprise foil memory test, although this effect did not reach significance. These findings indicate that constraining retrieval toward semantic information involves reimplementing semantic encoding operations that are mediated by alpha oscillations and that such reimplementation occurs at a late stage of memory retrieval, perhaps reflecting additional monitoring processes.
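For readers unfamiliar with the measure, alpha-band power can be estimated from an EEG signal roughly as in the schematic below (synthetic data and an assumed sampling rate; the study itself used trial-wise time-frequency analysis at left frontal electrode clusters rather than this simple spectral estimate):

```python
import numpy as np
from scipy.signal import welch

fs = 250                                   # sampling rate in Hz (assumed)
t = np.arange(0, 2, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)  # fake trial

f, psd = welch(eeg, fs=fs, nperseg=fs)
alpha = psd[(f >= 8) & (f <= 12)].mean()   # mean power in the alpha band
print(f"alpha-band power: {alpha:.3f}")
```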
APA, Harvard, Vancouver, ISO, and other styles
49

Ahmed, Jeelani, and Dr Muqeem Ahmed. "Big data and semantic web, challenges and opportunities a survey." International Journal of Engineering & Technology 7, no. 4.5 (2018): 631. http://dx.doi.org/10.14419/ijet.v7i4.5.21174.

Full text
Abstract:
In recent years, vast and complex amounts of data are being created, making it difficult for traditional data processing applications to manage them. The coming of the Internet prompted a monstrous spike in the volume of information being created and made accessible. The World Wide Web Consortium (W3C), the international standardization body for the web, promotes the Semantic Web: an extended form of the current web that provides easier ways to search, reuse, combine and share information. In the last few years, major business corporations have demonstrated interest in incorporating semantic web technology with big data for added value. Indeed, this incorporation has several benefits: it increases end-users' ability to self-manage data from various sources, it focuses on changing business environments and varying user needs, and it handles concepts and relationships and manages terminology while connecting data from varied sources. For Social Network Analysis (SNA), new methods are needed that combine Big Data and Semantic Web technologies as a way to utilize and add capacity to existing frameworks. Moreover, fast-changing business requirements and the industry's culture of agile development call for a robust yet flexible solution for business intelligence, and data warehousing can be incorporated by using distributed enterprise-level ontologies. This paper attempts to focus on the effects of incorporating Big Data with the Semantic Web and how the Semantic Web makes Big Data smarter; it revisits the Big Data and Semantic Web challenges and opportunities and the relationship between them, and finally summarizes with future directions for this integration.
APA, Harvard, Vancouver, ISO, and other styles
50

Sefton, Peter, Ian Barnes, Ron Ward, and Jim Downing. "Embedding Metadata and Other Semantics in Word Processing Documents." International Journal of Digital Curation 4, no. 2 (2009): 93–106. http://dx.doi.org/10.2218/ijdc.v4i2.96.

Full text
Abstract:
This paper describes a technique for embedding document metadata, and potentially other semantic references, inline in word processing documents, which the authors have implemented with the help of a software development team. Several assumptions underlie the approach: it must be available across computing platforms and work with both Microsoft Word (because of its user base) and OpenOffice.org (because of its free availability). Further, the application needs to be acceptable to and usable by users, so the initial implementation covers only a small number of features, which will be extended only after user testing. Within these constraints, the system provides a mechanism for encoding not only simple metadata, but also for inferring hierarchical relationships between metadata elements from a 'flat' word processing file. The paper includes links to open source code implementing the techniques as part of a broader suite of tools for academic writing. This addresses tools and software, the semantic web and data curation, and integrating curation into research workflows, and it will provide a platform for integrating work on ontologies, vocabularies and folksonomies into word processing tools.
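To make the idea concrete, here is a minimal sketch of two places metadata can live in a word processing file, assuming the python-docx library; the paper's own implementation defines its own inline encoding conventions for Word and OpenOffice.org rather than this exact layout.

```python
from docx import Document   # pip install python-docx

doc = Document()
doc.add_paragraph("A draft paragraph of the article body.")

# Package-level metadata via the standard core properties...
doc.core_properties.title = "Embedding Metadata Example"
doc.core_properties.author = "Jane Doe"

# ...and, closer to the paper's idea, metadata encoded inline in the
# document content itself, here as a visible key-value table.
table = doc.add_table(rows=2, cols=2)
table.cell(0, 0).text, table.cell(0, 1).text = "dc:title", "Embedding Metadata Example"
table.cell(1, 0).text, table.cell(1, 1).text = "dc:creator", "Jane Doe"

doc.save("annotated.docx")
```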
APA, Harvard, Vancouver, ISO, and other styles