Journal articles on the topic 'Datasety' (Datasets)

Author: Grafiati

Published: 28 June 2021

Last updated: 7 February 2022

Consult the top 50 journal articles for your research on the topic 'Datasety.'

1

Almeida, Daniela, Dany Domínguez-Pérez, Ana Matos, et al. "Data Employed in the Construction of a Composite Protein Database for Proteogenomic Analyses of Cephalopods Salivary Apparatus." Data 5, no. 4 (2020): 110. http://dx.doi.org/10.3390/data5040110.

Abstract:

Here we provide all datasets and details applied in the construction of a composite protein database required for the proteogenomic analyses of the article “Putative Antimicrobial Peptides of the Posterior Salivary Glands from the Cephalopod Octopus vulgaris Revealed by Exploring a Composite Protein Database”. All data, subdivided into six datasets, are deposited at the Mendeley Data repository as follows. Dataset_1 provides our composite database “All_Databases_5950827_sequences.fasta” derived from six smaller databases composed of (i) protein sequences retrieved from public databases related

2

Haider, S. A., and N. S. Patil. "Minimization of Datasets: Using a Master Interlinked Dataset." Indian Journal of Computer Science 3, no. 5 (2018): 20. http://dx.doi.org/10.17010/ijcs/2018/v3/i5/138778.

3

Feng, Eric, and Xijin Ge. "DataViz: visualization of high-dimensional data in virtual reality." F1000Research 7 (October 23, 2018): 1687. http://dx.doi.org/10.12688/f1000research.16453.1.

Abstract:

Virtual reality (VR) simulations promote interactivity and immersion, and provide an opportunity that may help researchers gain insights from complex datasets. To explore the utility and potential of VR in graphically rendering large datasets, we have developed an application for immersive, 3-dimensional (3D) scatter plots. Developed using the Unity development environment, DataViz enables the visualization of high-dimensional data with the HTC Vive, a relatively inexpensive and modern virtual reality headset available to the general public. DataViz has the following features: (1) principal co

4

Chang, Nai Chen, Elissa Aminoff, John Pyles, Michael Tarr, and Abhinav Gupta. "Scaling Up Neural Datasets: A public fMRI dataset of 5000 scenes." Journal of Vision 18, no. 10 (2018): 732. http://dx.doi.org/10.1167/18.10.732.

5

Zhang, Yulian, and Shigeyuki Hamori. "Forecasting Crude Oil Market Crashes Using Machine Learning Technologies." Energies 13, no. 10 (2020): 2440. http://dx.doi.org/10.3390/en13102440.

Abstract:

To the best of our knowledge, this study provides new insight into the forecasting of crude oil futures price crashes in America, employing a moving window. One is the fixed-length window and the other is the expanding-length window, which has never been reported in the past. We aimed to investigate if there is any difference when historical data are discarded. As the explanatory variables, we adapted 13 variables to obtain two datasets, 16 explanatory variables for Dataset1 and 121 explanatory variables for Dataset2. We try to observe results from the different-sized sets of explanatory varia

6

Wang, Juan, Zhibin Zhang, and Yanjuan Li. "Constructing Phylogenetic Networks Based on the Isomorphism of Datasets." BioMed Research International 2016 (2016): 1–7. http://dx.doi.org/10.1155/2016/4236858.

Abstract:

Constructing rooted phylogenetic networks from rooted phylogenetic trees has become an important problem in molecular evolution. So far, many methods have been presented in this area, in which most efficient methods are based on the incompatible graph, such as the CASS, the LNETWORK, and the BIMLR. This paper will research the commonness of the methods based on the incompatible graph, the relationship between incompatible graph and the phylogenetic network, and the topologies of incompatible graphs. We can find out all the simplest datasets for a topology G and construct a network for every datas

7

Xie, Yanqing, Zhengqiang Li, Weizhen Hou, et al. "Validation of FY-3D MERSI-2 Precipitable Water Vapor (PWV) Datasets Using Ground-Based PWV Data from AERONET." Remote Sensing 13, no. 16 (2021): 3246. http://dx.doi.org/10.3390/rs13163246.

Abstract:

The medium resolution spectral imager-2 (MERSI-2) is one of the most important sensors onboard China’s latest polar-orbiting meteorological satellite, Fengyun-3D (FY-3D). The National Satellite Meteorological Center of China Meteorological Administration has developed four precipitable water vapor (PWV) datasets using five near-infrared bands of MERSI-2, including the P905 dataset, P936 dataset, P940 dataset and the fusion dataset of the above three datasets. For the convenience of users, we comprehensively evaluate the quality of these PWV datasets with the ground-based PWV data derived from

8

Bahrami, Mostafa, Hossein Javadikia, and Ebrahim Ebrahimi. "Application of Pattern Recognition Techniques for Fault Detection of Clutch Retainer of Tractor." Journal of Mechanical Engineering 47, no. 1 (2018): 31–36. http://dx.doi.org/10.3329/jme.v47i1.35356.

Abstract:

This study develops a technique based on pattern recognition for fault diagnosis of clutch retainer mechanism of MF285 tractor using the neural network. In this technique, time features and frequency domain features consist of Fast Fourier Transform (FFT) phase angle and Power Spectral Density (PSD) proposes to improve diagnosis ability. Three different cases, such as: normal condition, bearing wears and shaft wears were applied for signal processing. The data divides in two parts; in part one 70% data are dataset1 and in part two 30% for dataset2. At first, the artificial neural networks (ANN)

9

Bogaardt, Laurens, Romulo Goncalves, Raul Zurita-Milla, and Emma Izquierdo-Verdiguier. "Dataset Reduction Techniques to Speed Up SVD Analyses on Big Geo-Datasets." ISPRS International Journal of Geo-Information 8, no. 2 (2019): 55. http://dx.doi.org/10.3390/ijgi8020055.

Abstract:

The Singular Value Decomposition (SVD) is a mathematical procedure with multiple applications in the geosciences. For instance, it is used in dimensionality reduction and as a support operator for various analytical tasks applicable to spatio-temporal data. Performing SVD analyses on large datasets, however, can be computationally costly, time consuming, and sometimes practically infeasible. However, techniques exist to arrive at the same output, or at a close approximation, which requires far less effort. This article examines several such techniques in relation to the inherent scale of the s

10

Yu, Ellen, Aparna Bhaskaran, Shang-Lin Chen, Zachary E. Ross, Egill Hauksson, and Robert W. Clayton. "Southern California Earthquake Data Now Available in the AWS Cloud." Seismological Research Letters 92, no. 5 (2021): 3238–47. http://dx.doi.org/10.1785/0220210039.

Abstract:

The Southern California Earthquake Data Center is hosting its earthquake catalog and seismic waveform archive in the Amazon Web Services (AWS) Open Dataset Program (s3://scedc-pds; us-west-2 region). The cloud dataset’s high data availability and scalability facilitate research that uses large volumes of data and computationally intensive processing. We describe the data archive and our rationale for the formats and data organization. We provide two simple examples to show how storing the data in AWS Simple Storage Service can benefit the analysis of large datasets. We share usage sta

11

Waliser, Duane, Peter J. Gleckler, Robert Ferraro, et al. "Observations for Model Intercomparison Project (Obs4MIPs): status for CMIP6." Geoscientific Model Development 13, no. 7 (2020): 2945–58. http://dx.doi.org/10.5194/gmd-13-2945-2020.

Abstract:

The Observations for Model Intercomparison Project (Obs4MIPs) was initiated in 2010 to facilitate the use of observations in climate model evaluation and research, with a particular target being the Coupled Model Intercomparison Project (CMIP), a major initiative of the World Climate Research Programme (WCRP). To this end, Obs4MIPs (1) targets observed variables that can be compared to CMIP model variables; (2) utilizes dataset formatting specifications and metadata requirements closely aligned with CMIP model output; (3) provides brief technical documentation for each dataset, desig

12

Kusetogullari, Huseyin, Amir Yavariabdi, Abbas Cheddad, Håkan Grahn, and Johan Hall. "ARDIS: a Swedish historical handwritten digit dataset." Neural Computing and Applications 32, no. 21 (2019): 16505–18. http://dx.doi.org/10.1007/s00521-019-04163-3.

Abstract:

This paper introduces a new image-based handwritten historical digit dataset named Arkiv Digital Sweden (ARDIS). The images in ARDIS dataset are extracted from 15,000 Swedish church records which were written by different priests with various handwriting styles in the nineteenth and twentieth centuries. The constructed dataset consists of three single-digit datasets and one-digit string dataset. The digit string dataset includes 10,000 samples in red–green–blue color space, whereas the other datasets contain 7600 single-digit images in different color spaces. An extensive analysis of

13

Sawangarreerak, Siriporn, and Putthiporn Thanathamathee. "Random Forest with Sampling Techniques for Handling Imbalanced Prediction of University Student Depression." Information 11, no. 11 (2020): 519. http://dx.doi.org/10.3390/info11110519.

Abstract:

In this work, we propose a combined sampling technique to improve the performance of imbalanced classification of university student depression data. In experimental results, we found that combined random oversampling with the Tomek links under sampling methods allowed generating a relatively balanced depression dataset without losing significant information. In this case, the random oversampling technique was used for sampling the minority class to balance the number of samples between the datasets. Then, the Tomek links technique was used for undersampling the samples by removing the depress

14

Eum, Hyung-Il, and Anil Gupta. "Hybrid climate datasets from a climate data evaluation system and their impacts on hydrologic simulations for the Athabasca River basin in Canada." Hydrology and Earth System Sciences 23, no. 12 (2019): 5151–73. http://dx.doi.org/10.5194/hess-23-5151-2019.

Abstract:

A reliable climate dataset is the backbone for modelling the essential processes of the water cycle and predicting future conditions. Although a number of gridded climate datasets are available for the North American content which provide reasonable estimates of climatic conditions in the region, there are inherent inconsistencies in these available climate datasets (e.g., spatially and temporally varying data accuracies, meteorological parameters, lengths of records, spatial coverage, temporal resolution, etc.). These inconsistencies raise questions as to which datasets are the most

15

Majidifard, Hamed, Peng Jin, Yaw Adu-Gyamfi, and William G. Buttlar. "Pavement Image Datasets: A New Benchmark Dataset to Classify and Densify Pavement Distresses." Transportation Research Record: Journal of the Transportation Research Board 2674, no. 2 (2020): 328–39. http://dx.doi.org/10.1177/0361198120907283.

Abstract:

Automated pavement distresses detection using road images remains a challenging topic in the computer vision research community. Recent developments in deep learning have led to considerable research activity directed towards improving the efficacy of automated pavement distress identification and rating. Deep learning models require a large ground truth data set, which is often not readily available in the case of pavements. In this study, a labeled dataset approach is introduced as a first step towards a more robust, easy-to-deploy pavement condition assessment system. The technique is terme

16

Amjad, Muhammad. "The Value of Manifold Learning Algorithms in Simplifying Complex Datasets for More Efficacious Analysis." Sciential - McMaster Undergraduate Science Journal, no. 5 (December 4, 2020): 13–20. http://dx.doi.org/10.15173/sciential.v1i5.2537.

Abstract:

Advances in manifold learning have proven to be of great benefit in reducing the dimensionality of large complex datasets. Elements in an intricate dataset will typically belong in high-dimensional space as the number of individual features or independent variables will be extensive. However, these elements can be integrated into a low-dimensional manifold with well-defined parameters. By constructing a low-dimensional manifold and embedding it into high-dimensional feature space, the dataset can be simplified for easier interpretation. In spite of this elemental dimensionality reduction, the

17

Morgan, Maria, Carla Blank, and Raed Seetan. "Plant disease prediction using classification algorithms." IAES International Journal of Artificial Intelligence (IJ-AI) 10, no. 1 (2021): 257. http://dx.doi.org/10.11591/ijai.v10.i1.pp257-264.

Abstract:

This paper investigates the capability of six existing classification algorithms (Artificial Neural Network, Naïve Bayes, k-Nearest Neighbor, Support Vector Machine, Decision Tree and Random Forest) in classifying and predicting diseases in soybean and mushroom datasets using datasets with numerical or categorical attributes. While many similar studies have been conducted on datasets of images to predict plant diseases, the main objective of this study is to suggest classification methods that can be used for disease classification and prediction in datasets that contain raw measureme

18

Williamson, Sinead A., and Jette Henderson. "Understanding Collections of Related Datasets Using Dependent MMD Coresets." Information 12, no. 10 (2021): 392. http://dx.doi.org/10.3390/info12100392.

Abstract:

Understanding how two datasets differ can help us determine whether one dataset under-represents certain sub-populations, and provides insights into how well models will generalize across datasets. Representative points selected by a maximum mean discrepancy (MMD) coreset can provide interpretable summaries of a single dataset, but are not easily compared across datasets. In this paper, we introduce dependent MMD coresets, a data summarization method for collections of datasets that facilitates comparison of distributions. We show that dependent MMD coresets are useful for understanding multip

19

Kramberger, Tin, and Božidar Potočnik. "LSUN-Stanford Car Dataset: Enhancing Large-Scale Car Image Datasets Using Deep Learning for Usage in GAN Training." Applied Sciences 10, no. 14 (2020): 4913. http://dx.doi.org/10.3390/app10144913.

Abstract:

Currently there is no publicly available adequate dataset that could be used for training Generative Adversarial Networks (GANs) on car images. All available car datasets differ in noise, pose, and zoom levels. Thus, the objective of this work was to create an improved car image dataset that would be better suited for GAN training. To improve the performance of the GAN, we coupled the LSUN and Stanford car datasets. A new merged dataset was then pruned in order to adjust zoom levels and reduce the noise of images. This process resulted in fewer images that could be used for training, with incr

20

Alade, Oyekale Abel, Ali Selamat, and Roselina Sallehuddin. "The Effects of Missing Data Characteristics on the Choice of Imputation Techniques." Vietnam Journal of Computer Science 07, no. 02 (2020): 161–77. http://dx.doi.org/10.1142/s2196888820500098.

Abstract:

One major characteristic of data is completeness. Missing data is a significant problem in medical datasets. It leads to incorrect classification of patients and is dangerous to the health management of patients. Many factors lead to the missingness of values in databases in medical datasets. In this paper, we propose the need to examine the causes of missing data in a medical dataset to ensure that the right imputation method is used in solving the problem. The mechanism of missingness in datasets was studied to know the missing pattern of datasets and determine a suitable imputation techniqu

21

Wu, Qiaoyan, and Yilei Wang. "Comparison of Oceanic Multisatellite Precipitation Data from Tropical Rainfall Measurement Mission and Global Precipitation Measurement Mission Datasets with Rain Gauge Data from Ocean Buoys." Journal of Atmospheric and Oceanic Technology 36, no. 5 (2019): 903–20. http://dx.doi.org/10.1175/jtech-d-18-0152.1.

Abstract:

Three satellite-derived precipitation datasets [the Tropical Rainfall Measuring Mission Multisatellite Precipitation Analysis (TMPA) dataset, the NOAA Climate Prediction Center morphing technique (CMORPH) dataset, and the newly available Integrated Multisatellite Retrievals for Global Precipitation Measurement (IMERG) dataset] are compared with data obtained from 55 rain gauges mounted on floating buoys in the tropics for the period 1 April 2014–30 April 2017. All three satellite datasets underestimate low rainfall and overestimate high rainfall in the tropical Pacific Ocean, but the T

22

Khakwani, Aamir, Ruth H. Jack, Sally Vernon, et al. "Apples and pears? A comparison of two sources of national lung cancer audit data in England." ERJ Open Research 3, no. 3 (2017): 00003–2017. http://dx.doi.org/10.1183/23120541.00003-2017.

Abstract:

In 2014, the method of data collection from NHS trusts in England for the National Lung Cancer Audit (NLCA) was changed from a bespoke dataset called LUCADA (Lung Cancer Data). Under the new contract, data are submitted via the Cancer Outcome and Service Dataset (COSD) system and linked additional cancer registry datasets. In 2014, trusts were given opportunity to submit LUCADA data as well as registry data. 132 NHS trusts submitted LUCADA data, and all 151 trusts submitted COSD data. This transitional year therefore provided the opportunity to compare both datasets for data completeness and r

23

Ferenc, Rudolf, Zoltán Tóth, Gergely Ladányi, István Siket, and Tibor Gyimóthy. "A public unified bug dataset for java and its assessment regarding metrics and bug prediction." Software Quality Journal 28, no. 4 (2020): 1447–506. http://dx.doi.org/10.1007/s11219-020-09515-0.

Abstract:

Bug datasets have been created and used by many researchers to build and validate novel bug prediction models. In this work, our aim is to collect existing public source code metric-based bug datasets and unify their contents. Furthermore, we wish to assess the plethora of collected metrics and the capabilities of the unified bug dataset in bug prediction. We considered 5 public datasets and we downloaded the corresponding source code for each system in the datasets and performed source code analysis to obtain a common set of source code metrics. This way, we produced a unified bug dat

24

Tarek, Mostafa, François P. Brissette, and Richard Arsenault. "Large-Scale Analysis of Global Gridded Precipitation and Temperature Datasets for Climate Change Impact Studies." Journal of Hydrometeorology 21, no. 11 (2020): 2623–40. http://dx.doi.org/10.1175/jhm-d-20-0100.1.

Abstract:

Currently, there are a large number of diverse climate datasets in existence, which differ, sometimes greatly, in terms of their data sources, quality control schemes, estimation procedures, and spatial and temporal resolutions. Choosing an appropriate dataset for a given application is therefore not a simple task. This study compares nine global/near-global precipitation datasets and three global temperature datasets over 3138 North American catchments. The chosen datasets all meet the minimum requirement of having at least 30 years of available data, so they could all potentially be

25

Chaudhary, Archana, Savita Kolhe, and Raj Kamal. "A hybrid ensemble for classification in multiclass datasets: An application to oilseed disease dataset." Computers and Electronics in Agriculture 124 (June 2016): 65–72. http://dx.doi.org/10.1016/j.compag.2016.03.026.

26

Houskeeper, Henry F., and Raphael M. Kudela. "Ocean Color Quality Control Masks Contain the High Phytoplankton Fraction of Coastal Ocean Observations." Remote Sensing 11, no. 18 (2019): 2167. http://dx.doi.org/10.3390/rs11182167.

Abstract:

Satellite estimation of oceanic chlorophyll-a content has enabled characterization of global phytoplankton stocks, but the quality of retrieval for many ocean color products (including chlorophyll-a) degrades with increasing phytoplankton biomass in eutrophic waters. Quality control of ocean color products is achieved primarily through the application of masks based on standard thresholds designed to identify suspect or low-quality retrievals. This study compares the masked and unmasked fractions of ocean color datasets from two Eastern Boundary Current upwelling ecosystems (the California and

27

Cheng, Shiqiang, Cuiyan Wu, Xin Qi, et al. "A Large-Scale Genetic Correlation Scan Between Intelligence and Brain Imaging Phenotypes." Cerebral Cortex 30, no. 7 (2020): 4197–203. http://dx.doi.org/10.1093/cercor/bhaa043.

Abstract:

Limited efforts have been paid to evaluate the potential relationships between structural and functional brain imaging and intelligence until now. We performed a two-stage analysis to systematically explore the relationships between 3144 brain image-derived phenotypes (IDPs) and intelligence. First, by integrating genome-wide association studies (GWAS) summaries data of brain IDPs and two GWAS summary datasets of intelligence, we systematically scanned the relationship between each of the 3144 brain IDPs and intelligence through linkage disequilibrium score regression (LDSC) analysis.

28

Hayashi, Yoichi. "Does Deep Learning Work Well for Categorical Datasets with Mainly Nominal Attributes?" Electronics 9, no. 11 (2020): 1966. http://dx.doi.org/10.3390/electronics9111966.

Abstract:

Given the complexity of real-world datasets, it is difficult to present data structures using existing deep learning (DL) models. Most research to date has concentrated on datasets with only one type of attribute: categorical or numerical. Categorical data are common in datasets such as the German (-categorical) credit scoring dataset, which contains numerical, ordinal, and nominal attributes. The heterogeneous structure of this dataset makes very high accuracy difficult to achieve. DL-based methods have achieved high accuracy (99.68%) for the Wisconsin Breast Cancer Dataset, whereas DL-inspir

29

Bajamgnigni Gbambie, Abdas Salam, Annie Poulin, Marie-Amélie Boucher, and Richard Arsenault. "Added Value of Alternative Information in Interpolated Precipitation Datasets for Hydrology." Journal of Hydrometeorology 18, no. 1 (2017): 247–64. http://dx.doi.org/10.1175/jhm-d-16-0032.1.

Abstract:

Gridded climate datasets are produced in many parts of the world by applying various interpolation methods to weather observations, to which are sometimes added secondary information (in addition to geographic location) such as topography and radar or atmospheric model outputs. For a region of interest, the choice of a dataset for a given study can be a significant challenge given the lack of information on the similarities and differences that exist between datasets, or about the benefits that one dataset may present relative to another. This study aims to provide information on the

30

Moon, Myungjin, and Kenta Nakai. "Integrative analysis of gene expression and DNA methylation using unsupervised feature extraction for detecting candidate cancer biomarkers." Journal of Bioinformatics and Computational Biology 16, no. 02 (2018): 1850006. http://dx.doi.org/10.1142/s0219720018500063.

Abstract:

Currently, cancer biomarker discovery is one of the important research topics worldwide. In particular, detecting significant genes related to cancer is an important task for early diagnosis and treatment of cancer. Conventional studies mostly focus on genes that are differentially expressed in different states of cancer; however, noise in gene expression datasets and insufficient information in limited datasets impede precise analysis of novel candidate biomarkers. In this study, we propose an integrative analysis of gene expression and DNA methylation using normalization and unsupervised fea

31

Đokić, Nikola, Borislava Blagojević, and Vladislava Mihailović. "Missing data representation by perception thresholds in flood flow frequency assessment." Journal of Applied Engineering Science 19, no. 2 (2021): 432–38. http://dx.doi.org/10.5937/jaes0-28902.

Abstract:

Flood flow frequency analysis (FFA) plays one of the key roles in many fields of hydraulic engineering and water resources management. The reliability of FFA results depends on many factors, an obvious one being the reliability of the input data - datasets of the annual peak flow. In practice, however, engineers often encounter the problem of incomplete datasets (missing data, data gaps and/or broken records) which increases the uncertainty of FFA results. In this paper, we perform at-site focused analysis, and we use a complete dataset of annual peak flows from 1931 to 2016 at the hydrologic

32

Mabuni, D., and S. Aquter Babu. "High Accurate and a Variant of k-fold Cross Validation Technique for Predicting the Decision Tree Classifier Accuracy." International Journal of Innovative Technology and Exploring Engineering 10, no. 2 (2021): 105–10. http://dx.doi.org/10.35940/ijitee.c8403.0110321.

Abstract:

In machine learning data usage is the most important criterion than the logic of the program. With very big and moderate sized datasets it is possible to obtain robust and high classification accuracies but not with small and very small sized datasets. In particular only large training datasets are potential datasets for producing robust decision tree classification results. The classification results obtained by using only one training and one testing dataset pair are not reliable. Cross validation technique uses many random folds of the same dataset for training and validation. In order to o

33

Di, Yanghua, Zhiguo Jiang, and Haopeng Zhang. "A Public Dataset for Fine-Grained Ship Classification in Optical Remote Sensing Images." Remote Sensing 13, no. 4 (2021): 747. http://dx.doi.org/10.3390/rs13040747.

Abstract:

Fine-grained visual categorization (FGVC) is an important and challenging problem due to large intra-class differences and small inter-class differences caused by deformation, illumination, angles, etc. Although major advances have been achieved in natural images in the past few years due to the release of popular datasets such as the CUB-200-2011, Stanford Cars and Aircraft datasets, fine-grained ship classification in remote sensing images has been rarely studied because of relative scarcity of publicly available datasets. In this paper, we investigate a large amount of remote sensing image

34

Dlamini, Nkosikhona, and Terence L. van Zyl. "Comparing Class-Aware and Pairwise Loss Functions for Deep Metric Learning in Wildlife Re-Identification." Sensors 21, no. 18 (2021): 6109. http://dx.doi.org/10.3390/s21186109.

Abstract:

Similarity learning using deep convolutional neural networks has been applied extensively in solving computer vision problems. This attraction is supported by its success in one-shot and zero-shot classification applications. The advances in similarity learning are essential for smaller datasets or datasets in which few class labels exist per class such as wildlife re-identification. Improving the performance of similarity learning models comes with developing new sampling techniques and designing loss functions better suited to training similarity in neural networks. However, the impact of th

35

Sarma, Karthik V., Alex G. Raman, Nikhil J. Dhinagar, et al. "Harnessing clinical annotations to improve deep learning performance in prostate segmentation." PLOS ONE 16, no. 6 (2021): e0253829. http://dx.doi.org/10.1371/journal.pone.0253829.

Abstract:

Purpose Developing large-scale datasets with research-quality annotations is challenging due to the high cost of refining clinically generated markup into high precision annotations. We evaluated the direct use of a large dataset with only clinically generated annotations in development of high-performance segmentation models for small research-quality challenge datasets. Materials and methods We used a large retrospective dataset from our institution comprised of 1,620 clinically generated segmentations, and two challenge datasets (PROMISE12: 50 patients, ProstateX-2: 99 patients). We trained

36

Madden, Frances, Jan Ashton, and Jez Cope. "Building the Picture Behind a Dataset." International Journal of Digital Curation 15, no. 1 (2020): 9. http://dx.doi.org/10.2218/ijdc.v15i1.702.

Abstract:

As part of the European Commission funded FREYA project The British Library wanted to explore the possibility of developing provenance information in datasets derived from the British Library’s collections, the data.bl.uk collection. Provenance information is defined in this context as ‘information relating to the origin, source and curation of the datasets’. Provenance information is also identified within the FAIR principles as an important aspect of being able to reuse and understand research datasets. According to the FAIR principles, the aim is to understand how to cite and acknowledge th

37

Archila Bustos, Maria Francisca, Ola Hall, Thomas Niedomysl, and Ulf Ernstson. "A pixel level evaluation of five multitemporal global gridded population datasets: a case study in Sweden, 1990–2015." Population and Environment 42, no. 2 (2020): 255–77. http://dx.doi.org/10.1007/s11111-020-00360-8.

Abstract:

Human activity is a major driver of change and has contributed to many of the challenges we face today. Detailed information about human population distribution is fundamental and use of freely available, high-resolution, gridded datasets on global population as a source of such information is increasing. However, there is little research to guide users in dataset choice. This study evaluates five of the most commonly used global gridded population datasets against a high-resolution Swedish population dataset on a pixel level. We show that datasets which employ more complex modeling t

38

Shi, Lingfei, and Feng Ling. "Local Climate Zone Mapping Using Multi-Source Free Available Datasets on Google Earth Engine Platform." Land 10, no. 5 (2021): 454. http://dx.doi.org/10.3390/land10050454.

Abstract:

As one of the widely concerned urban climate issues, urban heat island (UHI) has been studied using the local climate zone (LCZ) classification scheme in recent years. More and more effort has been focused on improving LCZ mapping accuracy. It has become a prevalent trend to take advantage of multi-source images in LCZ mapping. To this end, this paper tried to utilize multi-source freely available datasets: Sentinel-2 multispectral instrument (MSI), Sentinel-1 synthetic aperture radar (SAR), Luojia1-01 nighttime light (NTL), and Open Street Map (OSM) datasets to produce the 10 m LCZ classifica

39

Tran, Thi-Dung, Junghee Kim, Ngoc-Huynh Ho, et al. "Stress Analysis with Dimensions of Valence and Arousal in the Wild." Applied Sciences 11, no. 11 (2021): 5194. http://dx.doi.org/10.3390/app11115194.

Abstract:

In the field of stress recognition, the majority of research has conducted experiments on datasets collected from controlled environments with limited stressors. As these datasets cannot represent real-world scenarios, stress identification and analysis are difficult. There is a dire need for reliable, large datasets that are specifically acquired for stress emotion with varying degrees of expression for this task. In this paper, we introduced a dataset for Stress Analysis with Dimensions of Valence and Arousal of Korean Movie in Wild (SADVAW), which includes video clips with diversity in faci

40

Shaon, Arif, Sarah Callaghan, Bryan Lawrence, et al. "Opening Up Climate Research: A Linked Data Approach to Publishing Data Provenance." International Journal of Digital Curation 7, no. 1 (2012): 163–73. http://dx.doi.org/10.2218/ijdc.v7i1.223.

Abstract:

Traditionally, the formal scientific output in most fields of natural science has been limited to peer-reviewed academic journal publications, with less attention paid to the chain of intermediate data results and their associated metadata, including provenance. In effect, this has constrained the representation and verification of the data provenance to the confines of the related publications. Detailed knowledge of a dataset’s provenance is essential to establish the pedigree of the data for its effective re-use, and to avoid redundant re-enactment of the experiment or computation involved.

41

Xie, Ning-Ning, Fang-Fang Wang, Jue Zhou, Chang Liu, and Fan Qu. "Establishment and Analysis of a Combined Diagnostic Model of Polycystic Ovary Syndrome with Random Forest and Artificial Neural Network." BioMed Research International 2020 (August 20, 2020): 1–13. http://dx.doi.org/10.1155/2020/2613091.

Abstract:

Polycystic ovary syndrome (PCOS) is one of the most common metabolic and reproductive endocrinopathies. However, few studies have tried to develop a diagnostic model based on gene biomarkers. In this study, we applied a computational method by combining two machine learning algorithms, including random forest (RF) and artificial neural network (ANN), to identify gene biomarkers and construct diagnostic model. We collected gene expression data from Gene Expression Omnibus (GEO) database containing 76 PCOS samples and 57 normal samples; five datasets were utilized, including one dataset for scre

42

Wang, Xiaoqing, Xiangjun Wang, and Yubo Ni. "Unsupervised Domain Adaptation for Facial Expression Recognition Using Generative Adversarial Networks." Computational Intelligence and Neuroscience 2018 (July 9, 2018): 1–10. http://dx.doi.org/10.1155/2018/7208794.

Abstract:

In the facial expression recognition task, a good-performing convolutional neural network (CNN) model trained on one dataset (source dataset) usually performs poorly on another dataset (target dataset). This is because the feature distribution of the same emotion varies in different datasets. To improve the cross-dataset accuracy of the CNN model, we introduce an unsupervised domain adaptation method, which is especially suitable for unlabelled small target dataset. In order to solve the problem of lack of samples from the target dataset, we train a generative adversarial network (GAN) on the

43

Page, Roderic. "Liberating links between datasets using lightweight data publishing: an example using plant names and the taxonomic literature." Biodiversity Data Journal 6 (July 23, 2018): e27539. http://dx.doi.org/10.3897/bdj.6.e27539.

Abstract:

Constructing a biodiversity knowledge graph will require making millions of cross links between diversity entities in different datasets. Researchers trying to bootstrap the growth of the biodiversity knowledge graph by constructing databases of links between these entities lack obvious ways to publish these sets of links. One appealing and lightweight approach is to create a "datasette", a database that is wrapped together with a simple web server that enables users to query the data. Datasettes can be packaged into Docker containers and hosted online with minimal effort. This approach is ill

44

Vincke, S., and M. Vergauwen. "GEO-REGISTERING CONSECUTIVE DATASETS BY MEANS OF A REFERENCE DATASET, ELIMINATING GROUND CONTROL POINT INDICATION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-5/W2 (September 20, 2019): 85–91. http://dx.doi.org/10.5194/isprs-archives-xlii-5-w2-85-2019.

Abstract:

The architecture, engineering and construction (AEC) industry’s interest in more advanced ways of regular monitoring of construction site activities and the achieved building progress has been rising recently. This requires frequent recordings of the area. This is only feasible if the profound observations only require limited time, both for the actual capturing on-site as well as processing of the recorded data. Moreover, for monitoring purposes, it is vital that all datasets use a single, unique reference system. This allows for an easy compari

45

Hou, Yu-Tai, Kenneth A. Campana, Kenneth E. Mitchell, Shi-Keng Yang, and Larry L. Stowe. "Comparison of an Experimental NOAA AVHRR Cloud Dataset with Other Observed and Forecast Cloud Datasets." Journal of Atmospheric and Oceanic Technology 10, no. 6 (1993): 833–49. http://dx.doi.org/10.1175/1520-0426(1993)010<0833:coaena>2.0.co;2.

46

Bolón-Canedo, V., N. Sánchez-Maroño, and A. Alonso-Betanzos. "Feature selection and classification in multiple class datasets: An application to KDD Cup 99 dataset." Expert Systems with Applications 38, no. 5 (2011): 5947–57. http://dx.doi.org/10.1016/j.eswa.2010.11.028.

47

Jittawiriyanukoon, Chanintorn. "Granularity analysis of classification and estimation for complex datasets with MOA." International Journal of Electrical and Computer Engineering (IJECE) 9, no. 1 (2019): 409. http://dx.doi.org/10.11591/ijece.v9i1.pp409-416.

Abstract:

Dispersed and unstructured datasets are substantial parameters to realize an exact amount of the required space. Depending upon the size and the data distribution, especially, if the classes are significantly associating, the level of granularity to agree a precise classification of the datasets exceeds. The data complexity is one of the major attributes to govern the proper value of the granularity, as it has a direct impact on the performance. Dataset classification exhibits the vital step in complex data analytics and designs to ensure that dataset is prompt to be efficiently sc

48

Huč, Aleks, Jakob Šalej, and Mira Trebar. "Analysis of Machine Learning Algorithms for Anomaly Detection on Edge Devices." Sensors 21, no. 14 (2021): 4946. http://dx.doi.org/10.3390/s21144946.

Abstract:

The Internet of Things (IoT) consists of small devices or a network of sensors, which permanently generate huge amounts of data. Usually, they have limited resources, either computing power or memory, which means that raw data are transferred to central systems or the cloud for analysis. Lately, the idea of moving intelligence to the IoT is becoming feasible, with machine learning (ML) moved to edge devices. The aim of this study is to provide an experimental analysis of processing a large imbalanced dataset (DS2OS), split into a training dataset (80%) and a test dataset (20%). The training da

49

Guo, Rui, Yi-Qin Wang, Jin Xu, et al. "Research on Zheng Classification Fusing Pulse Parameters in Coronary Heart Disease." Evidence-Based Complementary and Alternative Medicine 2013 (2013): 1–8. http://dx.doi.org/10.1155/2013/602672.

Abstract:

This study was conducted to illustrate that nonlinear dynamic variables of Traditional Chinese Medicine (TCM) pulse can improve the performances of TCM Zheng classification models. Pulse recordings of 334 coronary heart disease (CHD) patients and 117 normal subjects were collected in this study. Recurrence quantification analysis (RQA) was employed to acquire nonlinear dynamic variables of pulse. TCM Zheng models in CHD were constructed, and predictions using a novel multilabel learning algorithm based on different datasets were carried out. Datasets were designed as follows: dataset1, TCM inqu

50

Sharma, Vijeta, Manjari Gupta, Ajai Kumar, and Deepti Mishra. "EduNet: A New Video Dataset for Understanding Human Activity in the Classroom Environment." Sensors 21, no. 17 (2021): 5699. http://dx.doi.org/10.3390/s21175699.

Abstract:

Human action recognition in videos has become a popular research area in artificial intelligence (AI) technology. In the past few years, this research has accelerated in areas such as sports, daily activities, kitchen activities, etc., due to developments in the benchmarks proposed for human action recognition datasets in these areas. However, there is little research in the benchmarking datasets for human activity recognition in educational environments. Therefore, we developed a dataset of teacher and student activities to expand the research in the education domain. This paper proposes a ne
