Journal articles on the topic 'Data patterns'

Consult the top 50 journal articles for your research on the topic 'Data patterns.'

1

S., Sivaranjani. "Detecting Congestion Patterns in Spatio Temporal Traffic Data Using Frequent Pattern Mining." Bonfring International Journal of Networking Technologies and Applications 5, no. 1 (March 30, 2018): 21–23. http://dx.doi.org/10.9756/bijnta.8372.

2

McGuirl, Melissa R., Alexandria Volkening, and Björn Sandstede. "Topological data analysis of zebrafish patterns." Proceedings of the National Academy of Sciences 117, no. 10 (February 25, 2020): 5113–24. http://dx.doi.org/10.1073/pnas.1917763117.

Abstract:
Self-organized pattern behavior is ubiquitous throughout nature, from fish schooling to collective cell dynamics during organism development. Qualitatively these patterns display impressive consistency, yet variability inevitably exists within pattern-forming systems on both microscopic and macroscopic scales. Quantifying variability and measuring pattern features can inform the underlying agent interactions and allow for predictive analyses. Nevertheless, current methods for analyzing patterns that arise from collective behavior capture only macroscopic features or rely on either manual inspection or smoothing algorithms that lose the underlying agent-based nature of the data. Here we introduce methods based on topological data analysis and interpretable machine learning for quantifying both agent-level features and global pattern attributes on a large scale. Because the zebrafish is a model organism for skin pattern formation, we focus specifically on analyzing its skin patterns as a means of illustrating our approach. Using a recent agent-based model, we simulate thousands of wild-type and mutant zebrafish patterns and apply our methodology to better understand pattern variability in zebrafish. Our methodology is able to quantify the differential impact of stochasticity in cell interactions on wild-type and mutant patterns, and we use our methods to predict stripe and spot statistics as a function of varying cellular communication. Our work provides an approach to automatically quantifying biological patterns and analyzing agent-based dynamics so that we can now answer critical questions in pattern formation at a much larger scale.
3

Singh, Sakshi, Harsh Mittal, and Archana Purwar. "Prediction of Investment Patterns Using Data Mining Techniques." International Journal of Computer and Communication Engineering 3, no. 2 (2014): 145–48. http://dx.doi.org/10.7763/ijcce.2014.v3.309.

4

Zvyagin, L. S. "DATA MINING: BIG DATA AND DATA SCIENCE." SOFT MEASUREMENTS AND COMPUTING 5, no. 54 (2022): 81–90. http://dx.doi.org/10.36871/2618-9976.2022.05.006.

Abstract:
Data mining is the process of discovering usable information in large amounts of data. The method relies on mathematical analysis to identify patterns and trends in the data. Such patterns cannot be noticed during normal data viewing because of the complexity of the relationships that arise in large amounts of data. Together, these techniques form a set of tools and methods that help humanity in the changing world around us. The volume of data keeps growing, and we receive huge aggregates of data on various processes. Big Data and Data Science allow large companies to systematize information about the markets in which they operate, which in turn brings them substantial profit and other benefits.
5

Liu, Shihu, Li Deng, Haiyan Gao, and Xueyu Ma. "Relative Entropy-Based Similarity for Patterns in Graph Data." Wireless Communications and Mobile Computing 2022 (July 26, 2022): 1–20. http://dx.doi.org/10.1155/2022/7490656.

Abstract:
Measuring the similarity between patterns correctly is fundamental to data mining, especially for graph data. Although existing methods can obtain good results, they still have limitations, for instance in measuring the similarity of patterns in directed weighted graph data. Here, we introduce a new approach that takes the so-called second-order neighbors into consideration. The proposed approach is named relative entropy-based similarity for patterns in graph data; the relative entropy provides a new way to measure the difference between patterns in directed weighted graph data. The proposed similarity measure is computed in three phases. First, a strength set is built from the degree and weight of patterns; in this phase, four variables holding the strength of out-degree, in-degree, out-weight, and in-weight are constructed. Then, with the help of the Euclidean metric, each pattern’s probability set is constructed, which captures the influence of the similarity between the pattern and all of its first-order neighbors. Finally, relative entropy is used to measure the difference between patterns. To examine the validity of our approach and its advantage over the state-of-the-art approach, two sorts of experiments are conducted on real-world and synthetic graph data. The experimental outcomes indicate that the proposed method performs well in measuring similarity and yields accurate results.
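
The final phase of the abstract above compares patterns through relative entropy. As a minimal sketch of that one step only, the Python below computes the standard Kullback-Leibler divergence between two discrete probability sets. The function names, the symmetrised sum, and the sample values are assumptions made for illustration; the abstract does not specify how the paper constructs its probability sets or combines the two directions.

```python
import math


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q), in nats, for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a term with p_i = 0 contributes nothing, by convention
        if qi == 0.0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total


def symmetric_difference(p, q):
    """Hypothetical symmetrised score (an assumption): D(p || q) + D(q || p)."""
    return relative_entropy(p, q) + relative_entropy(q, p)


# Hypothetical probability sets for two patterns.
pattern_a = [0.5, 0.5]
pattern_b = [0.9, 0.1]
difference = symmetric_difference(pattern_a, pattern_b)
```

Relative entropy is asymmetric, so a similarity measure built on it has to choose a direction or combine both; the sum used here is only one possible choice.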
6

Batra, Dinesh. "Conceptual Data Modeling Patterns." Journal of Database Management 16, no. 2 (April 2005): 84–106. http://dx.doi.org/10.4018/jdm.2005040105.

7

Wagner, Peter, Ragna Hoffmann, Marek Junghans, Andreas Leich, and Hagen Saul. "Visualizing crash data patterns." Transactions on Transport Sciences 11, no. 2 (September 11, 2020): 77–83. http://dx.doi.org/10.5507/tots.2020.008.

8

Labra Gayo, Jose Emilio, Dimitris Kontokostas, and Sören Auer. "Multilingual linked data patterns." Semantic Web 6, no. 4 (August 7, 2015): 319–37. http://dx.doi.org/10.3233/sw-140136.

9

Muley, Abhinav, and Manish Gudadhe. "Synthesizing High-Utility Patterns from Different Data Sources." Data 3, no. 3 (September 3, 2018): 32. http://dx.doi.org/10.3390/data3030032.

Abstract:
In large organizations, it is often required to collect data from the different geographic branches spread over different locations. Extensive amounts of data may be gathered at the centralized location in order to generate interesting patterns via mono-mining the amassed database. However, it is feasible to mine the useful patterns at the data source itself and forward only these patterns to the centralized company, rather than the entire original database. These patterns also exist in huge numbers, and different sources calculate different utility values for each pattern. This paper proposes a weighted model for aggregating the high-utility patterns from different data sources. The procedure of pattern selection was also proposed to efficiently extract high-utility patterns in our weighted model by discarding low-utility patterns. Meanwhile, the synthesizing model yielded high-utility patterns, unlike association rule mining, in which frequent itemsets are generated by considering each item with equal utility, which is not true in real life applications such as sales transactions. Extensive experiments performed on the datasets with varied characteristics show that the proposed algorithm will be effective for mining very sparse and sparse databases with a huge number of transactions. Our proposed model also outperforms various state-of-the-art distributed models of mining in terms of running time.
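
The weighted aggregation idea in the abstract above can be illustrated with a short sketch. The Python below assumes that each data source reports a utility per pattern, that each source has a single scalar weight, and that patterns are kept when their weighted sum reaches a threshold. The function name, the weights, the threshold, and the sample data are all hypothetical; the abstract does not give the paper's actual weighting or selection procedure.

```python
def synthesize_utilities(local_patterns, source_weights, min_utility):
    """Combine per-source pattern utilities into a weighted global utility.

    local_patterns: {source: {pattern: utility}}
    source_weights: {source: weight} (assumed scalar weight per source)
    min_utility:    patterns below this synthesized utility are discarded
    """
    combined = {}
    for source, patterns in local_patterns.items():
        weight = source_weights[source]
        for pattern, utility in patterns.items():
            combined[pattern] = combined.get(pattern, 0.0) + weight * utility
    # Keep only the high-utility patterns.
    return {p: u for p, u in combined.items() if u >= min_utility}


# Hypothetical branches forwarding only their locally mined patterns.
local_patterns = {
    "branch_a": {("bread", "milk"): 40.0, ("tea",): 5.0},
    "branch_b": {("bread", "milk"): 20.0},
}
source_weights = {"branch_a": 0.5, "branch_b": 0.5}
high_utility = synthesize_utilities(local_patterns, source_weights, min_utility=10.0)
```

Only the forwarded patterns are needed at the central site, which matches the abstract's point about not shipping the entire original database.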
10

Zhou, C., W. D. Xiao, and D. Q. Tang. "MINING CO-LOCATION PATTERNS FROM SPATIAL DATA." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences III-2 (June 2, 2016): 85–90. http://dx.doi.org/10.5194/isprsannals-iii-2-85-2016.

Abstract:
Due to the widespread application of geographic information systems (GIS) and GPS technology and the increasingly mature infrastructure for data collection, sharing, and integration, more and more research domains have gained access to high-quality geographic data and created new ways to incorporate spatial information and analysis in various studies. There is an urgent need for effective and efficient methods to extract unknown and unexpected information, e.g., co-location patterns, from spatial datasets of high dimensionality and complexity. A co-location pattern is defined as a subset of spatial items whose instances are often located together in spatial proximity. Current co-location mining algorithms are unable to quantify the spatial proximity of a co-location pattern. We propose a co-location pattern miner aiming to discover co-location patterns in a multidimensional spatial data by measuring the cohesion of a pattern. We present a model to measure the cohesion in an attempt to improve the efficiency of existing methods. The usefulness of our method is demonstrated by applying them on the publicly available spatial data of the city of Antwerp in Belgium. The experimental results show that our method is more efficient than existing methods.
11

Zhou, C., W. D. Xiao, and D. Q. Tang. "MINING CO-LOCATION PATTERNS FROM SPATIAL DATA." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences III-2 (June 2, 2016): 85–90. http://dx.doi.org/10.5194/isprs-annals-iii-2-85-2016.

12

Kovacic, Ilko, Christoph G. Schuetz, Bernd Neumayr, and Michael Schrefl. "OLAP Patterns: A pattern-based approach to multidimensional data analysis." Data & Knowledge Engineering 138 (March 2022): 101948. http://dx.doi.org/10.1016/j.datak.2021.101948.

13

Zou, Qinghua, Wesley Chu, David Johnson, and Henry Chiu. "A Pattern Decomposition Algorithm for Data Mining of Frequent Patterns." Knowledge and Information Systems 4, no. 4 (September 27, 2002): 466–82. http://dx.doi.org/10.1007/s101150200016.

14

Kamiya, Yohei, and Hirohisa Seki. "Distributed Mining of Closed Patterns from Multi-Relational Data." Journal of Advanced Computational Intelligence and Intelligent Informatics 19, no. 6 (November 20, 2015): 804–9. http://dx.doi.org/10.20965/jaciii.2015.p0804.

Abstract:
In multi-relational data mining (MRDM), many methods have been proposed for searching for patterns that involve multiple tables (relations) from a relational database. In this paper, we consider closed pattern mining from distributed multi-relational databases (MRDBs). Since the computation of MRDM is costly compared with conventional itemset mining, we propose some efficient methods for computing closed patterns using the techniques studied in Inductive Logic Programming (ILP) and Formal Concept Analysis (FCA). Given a set of local databases, we first compute sets of their closed patterns (concepts) using a closed pattern mining algorithm tailored to MRDM, and then generate the set of closed patterns in the global database by utilizing the merge operator. We also present some experimental results, which show the effectiveness of the proposed methods.
15

Liu, Mei Ling. "A Novel Approach to Identifying Global Exceptional Patterns in Distributed Data Mining." Applied Mechanics and Materials 182-183 (June 2012): 1972–77. http://dx.doi.org/10.4028/www.scientific.net/amm.182-183.1972.

Abstract:
With the increasing development and application of distributed databases, distributed data mining has attracted the attention of many data mining researchers. In this paper, a framework for distributed data mining is introduced. Within this framework, many patterns are generated from each database after data mining, so it is necessary to synthesize all the patterns to identify the meaningful global patterns. An approach that synthesizes local patterns to identify global exceptional patterns is developed. In this approach, a pattern’s significance is measured by the deviation of the pattern’s support from the average support. Experimental results show that our approach is reasonable and appropriate for identifying exceptional patterns.
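
The significance measure mentioned in the abstract above (deviation of a pattern's support from the average support) can be sketched as follows. This is a simplified illustration, assuming one support value per pattern, a plain arithmetic mean, and an absolute-deviation threshold. The function names, the threshold, and the sample supports are hypothetical, and the paper's actual synthesis across databases is not described in the abstract.

```python
from statistics import mean


def support_deviations(supports):
    """Return each pattern's support minus the average support of all patterns."""
    average = mean(supports.values())
    return {pattern: support - average for pattern, support in supports.items()}


def exceptional_patterns(supports, min_deviation):
    """Patterns whose absolute deviation from the average support is large enough."""
    deviations = support_deviations(supports)
    return sorted(p for p, d in deviations.items() if abs(d) >= min_deviation)


# Hypothetical supports for three patterns.
supports = {"A": 0.5, "B": 0.25, "C": 0.75}
deviations = support_deviations(supports)
flagged = exceptional_patterns(supports, min_deviation=0.25)
```

With these sample values the average support is 0.5, so pattern A has zero deviation while B and C sit 0.25 below and above the average.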
16

Fang, Dianwu, Lizhen Wang, Jialong Wang, and Meijiao Wang. "High Influencing Pattern Discovery over Time Series Data." ISPRS International Journal of Geo-Information 10, no. 10 (October 14, 2021): 696. http://dx.doi.org/10.3390/ijgi10100696.

Abstract:
A spatial co-location pattern denotes a subset of spatial features whose instances frequently appear nearby. High influence co-location pattern mining is used to find co-location patterns with high influence in specific aspects. Studies of such pattern mining usually rely on spatial distance for measuring nearness between instances, a method that cannot be applied to an influence propagation process concluded from epidemic dispersal scenarios. To discover meaningful patterns by using fruitful results in this field, we extend existing approaches and propose a mining framework. We first defined a new concept of proximity to depict semantic nearness between instances of distinct features, thus applying a star-shaped materialized model to mine influencing patterns. Then, we designed attribute descriptors to perceive attributes of instances and edges from time series data, and we calculated the attribute weights via an analytic hierarchy process, thereby computing the influence between instances and the influence of features in influencing patterns. Next, we constructed influencing metrics and set a threshold to discover high influencing patterns. Since the metrics do not satisfy the downward closure property, we propose two improved algorithms to boost efficiency. Extensive experiments conducted on real and synthetic datasets verified the effectiveness, efficiency, and scalability of our method.
17

Egger, Joseph. "Sensitivity of Principal Oscillation Patterns (POPs) to data distribution in space." Meteorologische Zeitschrift 17, no. 5 (October 27, 2008): 673–77. http://dx.doi.org/10.1127/0941-2948/2008/0328.

18

BURKE, HARRY B. "Discovering Patterns in Microarray Data." Molecular Diagnosis 5, no. 4 (2000): 349–57. http://dx.doi.org/10.2165/00066982-200005040-00013.

19

McKitrick, Ross, and Nicolas Nierenberg. "Socioeconomic patterns in climate data." Journal of Economic and Social Measurement 35, no. 3-4 (December 31, 2010): 149–75. http://dx.doi.org/10.3233/jem-2010-0336.

20

Johnson, Roger W. "Discovering patterns in interarrival data." Teaching Statistics 39, no. 2 (February 21, 2017): 42–46. http://dx.doi.org/10.1111/test.12123.

21

Chernov, G. P., Y. H. Yan, Q. J. Fu, and Ch M. Tan. "Recent data on zebra patterns." Astronomy & Astrophysics 437, no. 3 (June 30, 2005): 1047–54. http://dx.doi.org/10.1051/0004-6361:20042578.

22

Dolores Ugarte, M. "Visualizing Data Patterns with Micromaps." Journal of the Royal Statistical Society: Series A (Statistics in Society) 175, no. 4 (October 2012): 1072–73. http://dx.doi.org/10.1111/j.1467-985x.2012.01069_3.x.

23

Gelfand, Natasha, Michael T. Goodrich, and Roberto Tamassia. "Teaching data structure design patterns." ACM SIGCSE Bulletin 30, no. 1 (March 1998): 331–35. http://dx.doi.org/10.1145/274790.274324.

24

Nguyen, Dung. "Design patterns for data structures." ACM SIGCSE Bulletin 30, no. 1 (March 1998): 336–40. http://dx.doi.org/10.1145/274790.274325.

25

Burke, H. "Discovering patterns in microarray data." Molecular Diagnosis 5, no. 4 (December 2000): 349–57. http://dx.doi.org/10.1054/modi.2000.19562.

26

Schwinn, Alexander, and Joachim Schelp. "Design patterns for data integration." Journal of Enterprise Information Management 18, no. 4 (August 2005): 471–82. http://dx.doi.org/10.1108/17410390510609617.

27

Ferrer, J. L., M. Roth, and A. Antoniadis. "Data Compression for Diffraction Patterns." Acta Crystallographica Section D Biological Crystallography 54, no. 2 (March 1, 1998): 184–99. http://dx.doi.org/10.1107/s0907444997007257.

28

Matthews, Stephen A. "Visualizing Data Patterns with Micromaps." Spatial Demography 1, no. 1 (April 2013): 141–43. http://dx.doi.org/10.1007/bf03354893.

29

Li, Yun, and Saihua Cai. "Detecting Outliers in Data Streams Based on Minimum Rare Pattern Mining and Pattern Matching." Information Technology and Control 51, no. 2 (June 23, 2022): 268–82. http://dx.doi.org/10.5755/j01.itc.51.2.30524.

Abstract:
Outliers are a major factor affecting the accuracy of data-based processing; thus, they must be discovered in collected datasets to guarantee data security. With the wide use of sensors and other monitoring equipment, data streams are becoming the main form of data. However, the huge scale of data streams makes the number of mined rare patterns very large, which makes it hard to detect outliers effectively through pattern-based outlier detection methods. Since minimal rare patterns (MRPs) can represent rare patterns and are far fewer in number, using MRPs for outlier detection can shorten the time consumption. Based on this idea, this paper proposes an outlier detection approach based on minimal rare patterns, called ODMRP, which is composed of a pattern mining phase and a pattern matching phase. Specifically, in the pattern mining phase, an improved minimal rare pattern mining algorithm, namely MRPM, is proposed to mine the MRPs from data streams. It first constructs two matrix structures to store information about transactions and frequent 2-patterns, and then applies “pattern extension” operations to extend frequent 2-patterns to longer patterns; at the same time, rare patterns are removed so that they do not take part in the “pattern extension” operation, which reduces meaningless time cost. In the pattern matching phase, we use the IM-Sunday algorithm to match the mined MRPs with the patterns stored in an outlier pattern library, to find potential outliers. Extensive experimental studies show that the proposed ODMRP method can accurately detect outliers from data streams with less overhead.
30

Gebremeskel, Gebeyehu Belay, Chai Yi, Chengliang Wang, and Zhongshi He. "Critical analysis of smart environment sensor data behavior pattern based on sequential data mining techniques." Industrial Management & Data Systems 115, no. 6 (July 13, 2015): 1151–78. http://dx.doi.org/10.1108/imds-12-2014-0386.

Abstract:
Purpose – Behavioral pattern mining for intelligent systems such as SmEs sensor data is vitally important in many applications and performance optimizations. Sensor pattern mining (SPM) is also a dynamic and hot research issue, given the pervasive and ubiquitous use of smart technologies toward improving human life. However, in large-scale sensor data, exploring and mining patterns that lead to the detection of abnormal behavior is challenging. The paper aims to discuss these issues. Design/methodology/approach – Sensor data are complex and multivariate; for example, which data are captured by the sensors, how precise they are, and what properties are recorded or measured are important research issues. Therefore, the authors propose a Sequential Data Mining (SDM) approach to explore pattern behaviors toward detecting abnormal patterns for smart space fault diagnosis and performance optimization in the intelligent world. Sensor data types, modeling, descriptions and SPM techniques are discussed in depth using real sensor data sets. Findings – The outcome of the paper is a novel idea of how SDM techniques scale up to sensor data pattern mining. In the paper, the approach to and technicalities of sensor data patterns are analyzed, and finally the pattern behaviors are detected or segmented as normal and abnormal patterns. Originality/value – The paper is focussed on sensor data behavioral patterns for fault diagnosis and performance optimizations. It offers another way of extracting knowledge from the anomalies of sensor data (observation records), which is pertinent to adopt in many intelligent systems applications, including safety and security, efficiency, and other advantages in consideration of real-world problems.
31

Eno, Josh, and Craig W. Thompson. "Generating Synthetic Data to Match Data Mining Patterns." IEEE Internet Computing 12, no. 3 (May 2008): 78–82. http://dx.doi.org/10.1109/mic.2008.55.

32

GUSTAFSSON, PER, and KONSTANTINOS SAGONAS. "Efficient manipulation of binary data using pattern matching." Journal of Functional Programming 16, no. 1 (September 12, 2005): 35–74. http://dx.doi.org/10.1017/s0956796805005745.

Abstract:
Pattern matching is an important operation in functional programs. So far, pattern matching has been investigated in the context of structured terms. This article presents an approach to extend pattern matching to terms without (much of a) structure such as binaries which is the kind of data format that network applications typically manipulate. After introducing the binary datatype and a notation for matching binary data against patterns, we present an algorithm that constructs a decision tree automaton from a set of binary patterns. We then show how the pattern matching using this tree automaton can be made adaptive, how redundant tests can be avoided, and how we can further reduce the size of the resulting automaton by taking interferences between patterns into account. Since the size of the tree automaton is exponential in the worst case, we also present an alternative new approach to compiling binary pattern matching which is conservative in space and analyze its complexity properties. The effectiveness of our techniques is evaluated using standard packet filter benchmarks and on implementations of network protocols taken from actual telecom applications.
33

Jamdar, Nikhil, and A. Vijayalakshmi. "BIG DATA MINING FOR INTERESTING PATTERNS WITH MAP REDUCE TECHNIQUE." Asian Journal of Pharmaceutical and Clinical Research 10, no. 13 (April 1, 2017): 191. http://dx.doi.org/10.22159/ajpcr.2017.v10s1.19634.

Abstract:
There are many algorithms available in data mining to search for interesting patterns in transactional databases of precise data. Frequent pattern mining is a technique for finding frequently occurring items in data. Most techniques find all the interesting patterns in a collection of precise data, where the items occurring in each transaction are known with certainty to the system. Moreover, in many real-time applications, users are interested in only a tiny portion of the large set of frequent patterns. The proposed user-constrained mining approach therefore helps to find the frequent patterns in which the user is interested. This approach efficiently finds such patterns by applying user constraints to collections of uncertain data. Users can specify their own interests in the form of constraints, and the approach uses the Map Reduce model to find the uncertain frequent patterns that satisfy the user-specified constraints.
34

Mukhlash, Imam, Desna Yuanda, and Mohammad Iqbal. "Mining Fuzzy Time Interval Periodic Patterns in Smart Home Data." International Journal of Electrical and Computer Engineering (IJECE) 8, no. 5 (October 1, 2018): 3374. http://dx.doi.org/10.11591/ijece.v8i5.pp3374-3385.

Abstract:
A convergence of technologies in data mining, machine learning, and a persuasive computer has led to an interest in the development of smart environments to help humans with functions such as monitoring and remote health interventions, activity recognition, and energy saving. The need for technology development is further confirmed by the aging population and the importance of individuals’ independence in their own homes. Pattern mining on sensor data from smart homes is widely applied in research using data mining. In this paper, we propose periodic pattern mining in smart home data that integrates the FP-Growth PrefixSpan algorithm with a fuzzy approach, called fuzzy-time-interval periodic pattern mining. Our purpose is to obtain the periodic patterns of activity at various time intervals. The simulation results show that resident activities can be recognized by analyzing the triggered sensor patterns, and show the impact of minimum support values on the number of fuzzy-time-interval periodic patterns generated. Moreover, the generated fuzzy-time-interval periodic patterns help to find residents’ daily habits or anomalies.
35

Liu, Tongyu, Ju Fan, Yinqing Luo, Nan Tang, Guoliang Li, and Xiaoyong Du. "Adaptive data augmentation for supervised learning over missing data." Proceedings of the VLDB Endowment 14, no. 7 (March 2021): 1202–14. http://dx.doi.org/10.14778/3450980.3450989.

Abstract:
Real-world data is dirty, which causes serious problems in (supervised) machine learning (ML). The widely used practice in such a scenario is to first repair the labeled source (a.k.a. train) data using rule-, statistical- or ML-based methods and then use the "repaired" source to train an ML model. During production, unlabeled target (a.k.a. test) data will also be repaired, and is then fed into the trained ML model for prediction. However, this process often causes a performance degradation when the source and target datasets are dirty with different noise patterns, which is common in practice. In this paper, we propose an adaptive data augmentation approach for handling missing data in supervised ML. The approach extracts noise patterns from target data, and adapts the source data with the extracted target noise patterns while still preserving supervision signals in the source. Then, it patches the ML model by retraining it on the adapted data, in order to better serve the target. To effectively support adaptive data augmentation, we propose a novel generative adversarial network (GAN) based framework, called DAGAN, which works in an unsupervised fashion. DAGAN consists of two connected GAN networks. The first GAN learns the noise pattern from the target, for target mask generation. The second GAN uses the learned target mask to augment the source data, for source data adaptation. The augmented source data is used to retrain the ML model. Extensive experiments show that our method significantly improves the ML model performance and is more robust than the state-of-the-art missing data imputation solutions for handling datasets with different missing value patterns.
36

Castelló, Adela, Virginia Lope, Jesús Vioque, Carmen Santamariña, Carmen Pedraz-Pingarrón, Soledad Abad, Maria Ederra, et al. "Reproducibility of data-driven dietary patterns in two groups of adult Spanish women from different studies." British Journal of Nutrition 116, no. 4 (July 4, 2016): 734–42. http://dx.doi.org/10.1017/s000711451600252x.

Abstract:
The objective of the present study was to assess the reproducibility of data-driven dietary patterns in different samples extracted from similar populations. Dietary patterns were extracted by applying principal component analyses to the dietary information collected from a sample of 3550 women recruited from seven screening centres belonging to the Spanish breast cancer (BC) screening network (Determinants of Mammographic Density in Spain (DDM-Spain) study). The resulting patterns were compared with three dietary patterns obtained from a previous Spanish case–control study on female BC (Epidemiological study of the Spanish group for breast cancer research (GEICAM: grupo Español de investigación en cáncer de mama)) using the dietary intake data of 973 healthy participants. The level of agreement between patterns was determined using both the congruence coefficient (CC) between the pattern loadings (considering patterns with a CC≥0·85 as fairly similar) and the linear correlation between patterns scores (considering as fairly similar those patterns with a statistically significant correlation). The conclusions reached with both methods were compared. This is the first study exploring the reproducibility of data-driven patterns from two studies and the first using the CC to determine pattern similarity. We were able to reproduce the EpiGEICAM Western pattern in the DDM-Spain sample (CC=0·90). However, the reproducibility of the Prudent (CC=0·76) and Mediterranean (CC=0·77) patterns was not as good. The linear correlation between pattern scores was statistically significant in all cases, highlighting its arbitrariness for determining pattern similarity. We conclude that the reproducibility of widely prevalent dietary patterns is better than the reproducibility of more population-specific patterns. More methodological studies are needed to establish an objective measurement and threshold to determine pattern similarity.
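
The congruence coefficient used in the abstract above is commonly computed with Tucker's formula: the sum of products of two loading vectors divided by the square root of the product of their sums of squares. The Python sketch below shows that formula together with the 0.85 threshold quoted in the abstract. The function names and the loading values are hypothetical, and it is an assumption that the study used exactly this form of the coefficient.

```python
import math


def congruence_coefficient(x, y):
    """Tucker's congruence coefficient between two loading vectors."""
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0.0:
        raise ValueError("loading vectors must not be all zeros")
    return numerator / denominator


def fairly_similar(cc, threshold=0.85):
    """Apply the similarity threshold quoted in the abstract (CC >= 0.85)."""
    return cc >= threshold


# Hypothetical loadings of one dietary pattern on two food groups, from two samples.
loadings_study_1 = [0.6, 0.8]
loadings_study_2 = [0.8, 0.6]
cc = congruence_coefficient(loadings_study_1, loadings_study_2)
```

Unlike a correlation, the coefficient does not centre the loadings, so two patterns with proportional loadings get a value of exactly 1 regardless of scale.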
37

Chakravarty, S., and Y. Shahar. "Acquisition and Analysis of Repeating Patterns in Time-oriented Clinical Data." Methods of Information in Medicine 40, no. 05 (2001): 410–20. http://dx.doi.org/10.1055/s-0038-1634201.

Abstract:
Objectives: (1) Creation of an expressive language for specification of temporal patterns in clinical domains, (2) Development of a graphical knowledge-acquisition tool allowing expert physicians to define meaningful domain-specific patterns, (3) Implementation of an interpreter capable of detecting such patterns in clinical databases, and (4) Evaluation of the tools in the domains of diabetes and oncology. Methods: We describe a constraint-based language, named CAPSUL, for specification of temporal patterns. We implemented a knowledge-acquisition tool and a temporal-pattern interpreter within Résumé, a larger temporal-abstraction architecture. We evaluated the knowledge-acquisition process with the help of domain experts. In collaboration with the Rush Presbyterian/St. Luke’s Medical Center, we analyzed data of bone-marrow transplantation patients. The expert compared the detected patterns to a manual inspection of the data, with the help of an experimental information-visualization tool we are developing in a related project. Results: The CAPSUL language was expressive enough during the knowledge-acquisition process to capture almost all of the patterns that the experts found useful. The patterns detected in the data by the pattern interpreter were all verified as correct. Completeness (whether all correct patterns were found) was difficult to assess, due to the size of the database. Conclusions: The CAPSUL language enables medical experts to express temporal patterns involving multiple levels of abstraction of clinical data. The ability to reuse both domain-patterns and abstract constraints seems highly useful. The Résumé interpreter, augmented by the CAPSUL semantics, finds the complex patterns within a clinical time-oriented database in a sound fashion.
38

Starkey, John. "The analysis of three-dimensional orientation data." Canadian Journal of Earth Sciences 30, no. 7 (July 1, 1993): 1355–62. http://dx.doi.org/10.1139/e93-116.

Abstract:
For many observed orientation patterns of three-dimensional geological data no statistical tests are available that define the probability with which a measured sample represents the parent population. Consequently, there is no accepted definition of an appropriate sample size and patterns can only be compared qualitatively by visual inspection. Empirical solutions to these problems are proposed and applied to quartz c-axis orientation patterns. An estimate of the distribution of orientations in a parent population is provided by an orientation diagram prepared by contouring a measured sample represented on the surface of a sphere using a counting circle 100/n% of the area of the projected hemisphere, where n is the sample size. Contouring the sample as it is accumulated allows identification of the sample size beyond which there is no significant change in the measured pattern. Data from similar-sized samples of computer-simulated random patterns provide outside estimates of the likely differences between the measured sample and the parent population. Pairs of patterns are compared using the mean absolute difference between the matrices used to prepare the contoured diagrams. A measure of their similarity is provided by the magnitude of this difference as a function of the departure of the two patterns from randomness, as indicated by the amount of empty space present. Characterization of the traditionally recognized types of quartz c-axis orientation patterns, using the parameters of the orientation patterns or their normalized eigenvectors, is successful only with point maximum, partial girdle, girdle, and random orientation patterns.
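The pairwise comparison described in this abstract, the mean absolute difference between the matrices behind two contoured diagrams, can be sketched as follows. This is only an illustration under assumptions: the matrices are taken to be equally sized grids of counts or densities, and the function name and example values are hypothetical. The normalisation by departure from randomness that the abstract also mentions is not reproduced here.

```python
def mean_absolute_difference(grid_a, grid_b):
    """Mean of |a_ij - b_ij| over two equally shaped 2-D grids."""
    if len(grid_a) != len(grid_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(grid_a, grid_b)
    ):
        raise ValueError("grids must have the same shape")
    cell_count = sum(len(row) for row in grid_a)
    if cell_count == 0:
        raise ValueError("grids must not be empty")
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(grid_a, grid_b)
        for a, b in zip(row_a, row_b)
    )
    return total / cell_count


# Hypothetical 2 x 2 density grids for two orientation samples.
first = [[1, 2], [3, 4]]
second = [[1, 0], [3, 8]]
print(mean_absolute_difference(first, second))  # (0 + 2 + 0 + 4) / 4 = 1.5
```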
39

Salah, Albert Ali, Eric Pauwels, Romain Tavenard, and Theo Gevers. "T-Patterns Revisited: Mining for Temporal Patterns in Sensor Data." Sensors 10, no. 8 (August 10, 2010): 7496–513. http://dx.doi.org/10.3390/s100807496.

40

Falomir, Zoe, Vicent Castelló, M. Teresa Escrig, and Juan Carlos Peris. "Fuzzy Distance Sensor Data Integration and Interpretation." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 19, no. 03 (June 2011): 499–528. http://dx.doi.org/10.1142/s0218488511007106.

Abstract:
An approach to distance sensor data integration that obtains a robust interpretation of the robot environment is presented in this paper. This approach consists of obtaining patterns of fuzzy distance zones from sensor readings; comparing these patterns in order to detect non-working sensors; and integrating the patterns obtained by each kind of sensor in order to obtain a final pattern that detects obstacles of any sort. A dissimilarity measure between fuzzy sets has been defined and applied to this approach. Moreover, an algorithm to classify orientation reference systems (built from corners detected in the robot world) as open or closed is also presented. The final pattern of fuzzy distances, resulting from the integration process, is used to extract the important reference systems when a glass wall is included in the robot environment. Finally, our approach has been tested on an ActivMedia Pioneer 2 dx mobile robot using Player/Stage as the control interface, and promising results have been obtained.
41

Zhang, Qianzhen, Deke Guo, Xiang Zhao, and Xi Wang. "Continuous matching of evolving patterns over dynamic graph data." World Wide Web 24, no. 3 (April 13, 2021): 721–45. http://dx.doi.org/10.1007/s11280-020-00860-5.

Abstract:
Nowadays, the scale of various graphs is growing rapidly, which poses a serious challenge for developing processing and analytic algorithms. Among them, graph pattern matching is one of the most primitive tasks and finds a wide spectrum of applications, yet its performance is often affected by the size and dynamicity of graphs. In order to handle large dynamic graphs, incremental pattern matching is proposed to avoid re-computing matches of patterns over the entire data graph, hence reducing the matching time and improving the overall execution performance. Due to the complexity of the problem, little work has been reported so far to solve it, and most of that work only solves the graph pattern matching problem under the scenario of the data graph varying alone. In this article, we address a more complicated but very practical graph pattern matching problem, continuous matching of evolving patterns over dynamic graph data, and present a novel algorithm for continuous pattern matching under changes to both the pattern graph and the data graph. Specifically, we propose a concise representation of partial matching solutions, which helps to avoid re-computing matches of the pattern and speeds up the subsequent matching process. In order to enable updates of the data graph and the pattern graph, we propose an incremental maintenance strategy to efficiently maintain the intermediate results. Moreover, we conceive an effective model for estimating the step-wise cost of pattern evaluation to drive the matching process. Extensive experiments verify the superiority of .
42

Lou, Jingfeng, and Aiguo Cheng. "Detecting Pattern Changes in Individual Travel Behavior from Vehicle GPS/GNSS Data." Sensors 20, no. 8 (April 17, 2020): 2295. http://dx.doi.org/10.3390/s20082295.

Abstract:
Although stable in the short term, individual travel behavior generally tends to change over the long term. The ability to detect such changes is important for product and service providers in continuously changing environments. The aim of this paper is to develop a methodology that detects changes in the patterns of individual travel behavior from vehicle global positioning system (GPS)/global navigation satellite system (GNSS) data. For this purpose, we first define individual travel behavior patterns in two dimensions: a spatial pattern and a frequency pattern. Then, we develop a method that can detect such patterns from GPS/GNSS data using a clustering algorithm. Finally, we define three basic pattern-change scenarios for individual travel behavior and introduce a pattern-matching metric for detecting these changes. The proposed methodology is tested using GPS datasets from three randomly selected anonymous users, collected by a Chinese automotive manufacturer. The results show that our methodology can successfully identify significant changes in individual travel behavior patterns.
43

Zhu, Shi Song, Fu Jing Zhu, and Wen Hui Man. "Abnormal Pattern of Sensor Monitoring Data Analysis and Recognition." Advanced Materials Research 452-453 (January 2012): 863–67. http://dx.doi.org/10.4028/www.scientific.net/amr.452-453.863.

Abstract:
To automatically recognize abnormal patterns in sensor monitoring data, this paper uses a set of methods for time series similarity measurement. First, a clustering analysis of abnormal time series patterns based on the DTW distance is proposed, from which the typical time series patterns are obtained. Important shape indexes are then extracted from these patterns and filtered using a piecewise shape measure method, and a shape index table is established. With this table, a pattern recognition system can be designed to recognize these abnormal patterns in real time. As a case study, the method has been applied in a high-gas coal mine, and its value for wider application in the sensor monitoring field has been demonstrated.
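The DTW (dynamic time warping) distance that this abstract uses as its similarity measure can be sketched with the textbook dynamic-programming recurrence. This is an assumption about the general technique, not the authors' implementation: the absolute difference as local cost, the absence of a warping window, and the function name are all choices made for illustration.

```python
import math


def dtw_distance(series_a, series_b):
    """Classic O(n*m) dynamic time warping distance between two sequences."""
    n, m = len(series_a), len(series_b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    # cost[i][j] is the minimal accumulated cost of aligning the first i
    # points of series_a with the first j points of series_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance only in series_a
                cost[i][j - 1],      # advance only in series_b
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# A stretched copy of the same shape aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Because the alignment may stretch or compress the time axis, two readings with the same shape but different durations are treated as close, which is why the measure suits clustering of abnormal sensor patterns.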
44

Cuthbert, Carol E., and Noel J. Pearse. "Strategic Data Pattern Visualisation." Journal of Systemics, Cybernetics and Informatics 20, no. 1 (January 2022): 122–41. http://dx.doi.org/10.54808/jsci.20.01.122.

Abstract:
Data visualisation reveals patterns and provides insights that lead to actions from management, thereby playing a mediating role in the relationship between the internal resources of a firm and its financial performance. In this chapter, contingent resource-based theory is applied to the analysis of big data, treating its visualisation as a mode of interdisciplinary communication. In service industries in general and the legal industry in particular, big data analytics (BDA) is emerging as a decision-making tool for management to achieve competitive advantage. Traditionally, data scientists have delved into data armed with a hypothesis, but increasingly they explore data to discern patterns that lead to hypotheses that are then tested. These big data analytics tools in the hands of data scientists have the potential to unlock firm value and increase revenue and profits, through pattern identification, analysis, and strategic action. This exploratory mode of working can increase complexity and thereby diminish efficient management decision-making and action. However, data pattern visualisation reduces complexity, as it enables interdisciplinary communication between data scientists and managers through the translation of statistical patterns into visualisations that enable actionable management decisions. When data scientists visualise data patterns for managers, this translates uncertainty into reliable conclusions, resulting in effective management decision-making and action. Informed by contingent resource theory and viewing these primary and secondary resources as independent variables and performance outcomes such as revenue and profitability as dependent variables, a conceptual framework is developed. The contingent resource-based theory highlights capabilities emerging from the interrelationship between primary and secondary resources as being central to competitiveness and profitability. 
Data decision-making systems are viewed as a primary resource, while complementary resources are (1) their completeness of vision (i.e., strategy and innovation) and (2) their ability to execute (i.e., operational capabilities). Data visualisation is therefore crucial as a resource facilitating actionable decisions by management, which in turn enhances firm performance. The balance between expert agents' self-reliance and central control, the entity's values, task attributes, and risk appetite all moderate the type of data visualisation produced by data scientists.
45

Munavalli, Jyoti R., and Rashmi R. Deshpande. "Pattern Recognition for Data Retrieval using Artificial Neural Network." Journal of University of Shanghai for Science and Technology 23, no. 06 (June 23, 2021): 1436–47. http://dx.doi.org/10.51201/jusst/21/06457.

Abstract:
Data retrieval is an important aspect of data management. In this paper, we design an ANN to recognize the learned patterns. We use a three-layer feed-forward network for the training of patterns (bitmap data). We implement two kinds of recognition: forced recognition and custom-specific recognition. The ANN model developed recognizes the pattern even if there is variation in the applied test patterns from the learned/trained patterns. In particular, we discuss the fault tolerance offered by neural networks. The characteristic of fault tolerance depends upon the type of distribution taken from random numbers. The conventional network is based on the concept of memorization whereas the neural network is based on the concept of generalization.
46

Nair, Hema. "Summaries of certain spatial patterns retrieved from multidate remote-sensing data." Discrete Dynamics in Nature and Society 2004, no. 2 (2004): 287–300. http://dx.doi.org/10.1155/s1026022604402027.

Abstract:
This paper presents an approach to describe patterns in remote-sensed images utilising fuzzy logic. The truth of a linguistic proposition such as “Y is F” can be determined for each pattern characterised by a tuple in the database, where Y is the pattern and F is a summary that applies to that pattern. This proposition is formulated in terms of primary quantitative measures, such as area, length, perimeter, and so forth, of the pattern. Fuzzy descriptions of linguistic summaries help to evaluate the degree to which a summary describes a pattern or object in the database. Techniques, such as clustering and genetic algorithms, are used to mine images. Image mining is a relatively new area of research. It is used to extract patterns from multidated satellite images of a geographic area.
47

Jenkins, Ron. "Profile Data Acquisition for the JCPDS–ICDD Database." Australian Journal of Physics 41, no. 2 (1988): 145. http://dx.doi.org/10.1071/ph880145.

Abstract:
The principal advantage offered by a fully digitised diffraction pattern is the retention of all features of the experimental pattern, including the line width and shape, the form and distribution of the background, etc. A file containing this type of reference data would in the future allow the use of techniques yet to be developed and of data processing, such as peak location, background subtraction and α2 stripping. The availability of digitised reference patterns would also allow the use of pattern-recognition techniques for qualitative phase analysis, as well as offering interesting possibilities for quantitative work. Until recently most commercially available automated powder diffractometers were limited to 10-20 Mbytes of disc storage and since a single fully digitised pattern requires about 10 kbytes, the provision of a file for thousands of digitised single phase reference patterns has not been possible. The recent advent of compact disc-read only memory (CD-ROM) systems providing in excess of 500 Mbytes now offers a low cost data storage capability. Plans are now in place for a new version of the Powder Diffraction File consisting of fully digitised patterns. Because of the need to maintain the database for years to come, it is most important that the stored data be as accurate and complete as possible.
48

Valarmathi, K., and R. Gowdhami. "A Narrative Approach for Removal Embedded Prototype from Big Tree Data." Journal of Advance Research in Computer Science & Engineering (ISSN: 2456-3552) 6, no. 9 (September 30, 2019): 01–10. http://dx.doi.org/10.53555/nncse.v6i9.790.

Abstract:
Many modern functions and systems represent and exchange data in tree-structured form and process and produce large tree datasets. Discovering informative patterns in large tree datasets is an important research area that has many practical applications. We propose a novel approach that exploits efficient homomorphic pattern matching algorithms to compute pattern support incrementally and avoids the costly enumeration of all patterns matching required by previous approaches. To reduce space consumption, matching information of already computed patterns is materialized as bitmaps. We further optimize our basic support computation method by designing an algorithm which incrementally generates the bitmaps of the embeddings of a new candidate pattern without first explicitly computing the embeddings of this pattern. Our extensive experimental results on real and synthetic large-tree datasets show that our approach displays orders of magnitude performance improvements over a state-of-the-art tree mining algorithm and a recent graph mining algorithm.
49

Masich, Igor S., Vadim S. Tyncheko, Vladimir A. Nelyub, Vladimir V. Bukhtoyarov, Sergei O. Kurashkin, and Aleksey S. Borodulin. "Paired Patterns in Logical Analysis of Data for Decision Support in Recognition." Computation 10, no. 10 (October 12, 2022): 185. http://dx.doi.org/10.3390/computation10100185.

Abstract:
Logical analysis of data (LAD), an approach to data analysis based on Boolean functions, combinatorics, and optimization, can be considered one of the methods of interpretable machine learning. A feature of LAD is that, among many patterns, different types of patterns can be identified, for example, prime, strong, spanned, and maximum. This paper proposes a decision-support approach to recognition by sharing different types of patterns to improve the quality of recognition in terms of accuracy, interpretability, and validity. An algorithm was developed to search for pairs of strong patterns (prime and spanned) with the same coverage as the training sample, having the smallest (for the prime pattern) and the largest (for the spanned pattern) number of conditions. The proposed approach leads to a decrease in the number of unrecognized observations (compared with the use of spanned patterns only) by 1.5–2 times (experimental results), to some reduction in recognition errors (compared with the use of prime patterns only) of approximately 1% (depending on the dataset) and makes it possible to assess in more detail the level of confidence of the recognition result due to a refined decision-making scheme that uses the information about the number and type of patterns covering the observation.
50

Ruzmaikin, Alexander, Hartmut H. Aumann, and Thomas S. Pagano. "Patterns of CO2 Variability from Global Satellite Data." Journal of Climate 25, no. 18 (June 19, 2012): 6383–93. http://dx.doi.org/10.1175/jcli-d-11-00223.1.

Abstract:
The authors present an analysis of the global midtropospheric CO2 retrieved for all-sky (clear and cloudy) conditions from measurements by the Atmospheric Infrared Radiation Sounder on board the Aqua satellite in 2003–09. The global data coverage allows the identification of the set of CO2 spatial patterns and their time variability by applying principal component analysis and empirical mode decomposition. The first, dominant pattern represents 93% of the variability and exhibits a linear trend of 2 ± 0.2 ppm yr⁻¹, as well as annual and interannual dependencies. The single-site record of CO2 at Mauna Loa compares well with variability of this pattern. The first principal component is phase shifted relative to the Southern Oscillation, indicating a causative relationship between the atmospheric CO2 and ENSO. The higher-order patterns show regional details of CO2 distribution and display the semiannual oscillation. The CO2 distributions are compared with the distribution of two major characteristics of air transport: the vertical velocity and potential temperature surfaces at the same height. In agreement with modeling, CO2 concentration closely traces the potential temperature surfaces (isentropes) in middle and high latitudes. However, its vertical transport in the tropics, where these surfaces are mostly horizontal, is suppressed. The results are in agreement with the previous results on annual and interannual CO2 time variability obtained by using the network flask data. This knowledge of the global CO2 spatial patterns can be useful in climate analyses and potentially in the challenging task of connecting CO2 sources and sinks with its distribution in the atmosphere.