Academic literature on the topic 'High volume big data'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'High volume big data.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "High volume big data"

1

Senthil Kumar, S., and V. Kirthika. "Big Data Analytics Architecture and Challenges, Issues of Big Data Analytics." International Journal of Trend in Scientific Research and Development 1, no. 6 (2017): 669–73. https://doi.org/10.31142/ijtsrd4673.

Full text
Abstract:
Big Data refers to a new generation of technologies and architectures designed so that organizations can extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and analysis. Big data is a massive amount of digital data collected from numerous sources, too large for traditional tools. It raises challenges such as complexity, security, and risks to privacy, and it is redefining data management across extraction, transformation, cleaning, and reduction. The size and variety of these data push us to develop new and faster data-mining methods that exploit the parallel computing capability of modern processors.
2

Adebo, Philip. "Big Data in Business." International Journal of Advanced Research in Computer Science and Software Engineering 8, no. 1 (2018): 160. http://dx.doi.org/10.23956/ijarcsse.v8i1.543.

Full text
Abstract:
Business has always desired to derive insights from big data in order to make better, smarter, data-driven decisions. Big data refers to data generated at high volume, high velocity, high variety, high veracity, and high value. It has fundamentally changed the way companies operate, make decisions, and compete, and it can create value for businesses. This paper provides a brief introduction to how big data is being used in business.
3

Lakshmanasamy, Rameshbabu, and Girish Ganachari. "Data Integrity Problems in High-Volume High-Velocity Data Ingestion." International Journal of Scientific Research in Engineering and Management 8, no. 10 (2024): 1–6. http://dx.doi.org/10.55041/ijsrem14175.

Full text
Abstract:
In the era of big data, with a never-ending data push from IoT devices, IT infrastructures are built to scale to huge batch loads or continuously streaming live data. The big question, however, is how to establish data integrity. How can we make sure no data is lost from origin to destination while passing through numerous touch points en route? How can we ensure quality with a continuous inflow? Should the inflow be suspended to perform data-quality checks, or should checking be a totally independent parallel activity? Let's explore. Keywords: Quality Data Management, Data Pipelines, Auto-Scaling, Large-Scale Streaming Data, Performance, IoT, Data Cleansing
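The reconciliation question the authors raise is often answered with end-to-end checks. As a minimal illustrative sketch (not from the paper; all names are made up), the following compares record counts and an order-insensitive content digest computed independently at the origin and the destination:

```python
import hashlib

def batch_digest(records):
    """Order-insensitive digest: XOR of per-record SHA-256 hashes."""
    acc = 0
    for rec in records:
        h = hashlib.sha256(rec.encode("utf-8")).digest()
        acc ^= int.from_bytes(h, "big")
    return len(records), acc

def verify_transfer(source_records, sink_records):
    """Compare count and content digest computed at both ends of the pipeline."""
    n_src, d_src = batch_digest(source_records)
    n_dst, d_dst = batch_digest(sink_records)
    if n_src != n_dst:
        return False, f"count mismatch: {n_src} sent, {n_dst} received"
    if d_src != d_dst:
        return False, "content mismatch: some records were altered in flight"
    return True, "batch verified"

ok, msg = verify_transfer(["a", "b", "c"], ["c", "a", "b"])
print(ok, msg)  # True batch verified -- order-insensitive by design
```

In a real pipeline the digests would be computed per batch or per time window, so a mismatch pinpoints which slice to replay without suspending the inflow.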
4

Prathibha, U., M. Thillainayaki, and A. Jenneth. "Big Data Analysis with R Programming and RHadoop." International Journal of Trend in Scientific Research and Development 2, no. 4 (2018): 2623–27. https://doi.org/10.31142/ijtsrd15705.

Full text
Abstract:
Big data is a technology for accessing huge data sets that have high velocity, high volume, high variety, and complex structure, with attendant difficulties in management, analysis, storage, and processing. The paper focuses on extracting data efficiently with big data tools using R programming techniques, on managing the data, and on the components useful in handling big data. Data can be classified as public, confidential, and sensitive. This paper proposes big data applications with the Hadoop Distributed Framework for storing huge data in the cloud in a highly efficient manner, and describes the tools and techniques of R integrated with big data tools for parallel processing and statistical methods. Using RHadoop data tools helps organizations resolve scalability issues and perform predictive analysis with high performance via the MapReduce framework.
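The paper works in R with RHadoop; as a language-neutral sketch of the MapReduce pattern it relies on, here is a pure-Python word count with explicit map, shuffle, and reduce phases (illustrative only, not the authors' code):

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    # Shuffle: group by key; Reduce: sum the counts per word.
    for word, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        yield word, sum(count for _, count in group)

lines = ["big data needs big tools", "R and Hadoop handle big data"]
print(dict(reduce_phase(map_phase(lines))))
# {'and': 1, 'big': 3, 'data': 2, 'hadoop': 1, 'handle': 1, 'needs': 1, 'r': 1, 'tools': 1}
```

In RHadoop the same map and reduce functions would be distributed by Hadoop across the cluster; the structure of the computation is identical.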
5

Huda, M. Misbachul, Dian Rahma Latifa Hayun, and Zhin Martun. "Data Modeling for Big Data." Jurnal ULTIMA InfoSys 6, no. 1 (2015): 1–11. http://dx.doi.org/10.31937/si.v6i1.273.

Full text
Abstract:
Today the rapid growth of the internet and the massive use of data have led to increasing CPU requirements, higher velocity for recalling data, schemas for managing more complex data structures, and concerns about the reliability and integrity of the available data. This kind of data is called Large-scale Data or Big Data. Big Data demands high volume, high velocity, high veracity and high variety. Big Data has to deal with two key issues: the growing size of datasets and the increasing complexity of data. To overcome these issues, today's research is devoted to the kinds of database management system that can be used optimally for big data management. There are two kinds of database management system: relational and non-relational. This paper reviews both, including a description, the advantages, the structure and the applications of each DBMS.
 Index Terms - Big Data, DBMS, Large-scale Data, Non-relational Database, Relational Database.
6

Jeong, Yoon-su, and Seung-soo Shin. "A Multidata Connection Scheme for Big Data High-Dimension Using the Data Connection Coefficient." Mathematical Problems in Engineering 2015 (2015): 1–6. http://dx.doi.org/10.1155/2015/931352.

Full text
Abstract:
In the era of big data and cloud computing, sources and types of data vary, and the volume and flow of data are massive and continuous. With the widespread use of mobile devices and the Internet, huge volumes of data distributed over heterogeneous networks move forward and backward across networks. In order to meet the demands of big data service providers and customers, efficient technologies for processing and transferring big data over networks are needed. This paper proposes a multidata connection (MDC) scheme that decreases the amount of power and time necessary for information to be communicated between the content server and the mobile users (i.e., the content consumers who are moving freely across different networks while using their mobile devices). MDC scheme is an approach to data validation that requires the presentation of two or more pieces of data in a heterogeneous environment of big data. The MDC transmits the difference of the consecutive data sequences instead of sending the data itself to the receiver, thus increasing transmission throughput and speed.
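The core idea here, transmitting the differences between consecutive data sequences rather than the data itself, can be sketched as simple delta encoding (a toy version, not the MDC scheme itself):

```python
def encode_deltas(samples):
    """Send the first value, then only the successive differences."""
    if not samples:
        return []
    out = [samples[0]]
    out.extend(b - a for a, b in zip(samples, samples[1:]))
    return out

def decode_deltas(encoded):
    """Rebuild the original sequence by cumulative summation."""
    values, total = [], 0
    for d in encoded:
        total += d
        values.append(total)
    return values

readings = [1000, 1002, 1001, 1005, 1005]
wire = encode_deltas(readings)        # [1000, 2, -1, 4, 0] -- small numbers compress well
assert decode_deltas(wire) == readings
```

When consecutive sequences are similar, the deltas are small and highly compressible, which is what lets such a scheme raise transmission throughput.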
7

Chaudhary, Sunita. "Analysis of Concept of Big Data Process, Strategies, Adoption and Implementation." International Journal on Future Revolution in Computer Science & Communication Engineering 8, no. 1 (2022): 5–8. http://dx.doi.org/10.17762/ijfrcsce.v8i1.2065.

Full text
Abstract:
Big data is a data set containing a variety of data that grows daily within an organisation, with large volume and high velocity. It is a complex form of data, usually drawn from new data sources, and it is widely used in companies because up-to-date data helps solve critical issues in a business or an organisation. Big data is so large that it cannot be handled by simple data-handling software and needs the latest technology to manage it. The data is so complex and voluminous that it requires skilled professionals and efficient effort to maintain. Small businesses often cannot afford to implement big data because of its complexity and a cost higher than that of traditional data software. The paper explains the various categories of big data, its importance, strategies, and the implementation process.
8

Borodo, Salisu Musa, Siti Mariyam Shamsuddin, and Shafaatunnur Hasan. "Big Data Platforms and Techniques." Indonesian Journal of Electrical Engineering and Computer Science 1, no. 1 (2016): 191. http://dx.doi.org/10.11591/ijeecs.v1.i1.pp191-200.

Full text
Abstract:
Data is growing at an unprecedented rate, generating huge volumes from sources that include mobile devices, the internet, and sensors. This voluminous data is generated and updated at high velocity by batch and streaming platforms, and it varies across structured and unstructured types. This volume, velocity, and variety of data led to the term big data. Big data is premised to contain untapped knowledge; its exploration and exploitation is termed big data analytics. This literature review covers platforms such as batch processing, real-time processing, and interactive analytics used in big data environments. Techniques used for big data include machine learning, data mining, neural networks, and deep learning. There are big data architecture offerings from Microsoft, IBM, and the National Institute of Standards and Technology. Big data has the potential to transform economies and reduce institutions' running costs, but it also has challenges such as storage, computation, security, and privacy.
9

Nimankar, Sachchidanand, and Sushant Dagare. "Big Data Analytics: 4A's." International Journal of Engineering Sciences & Research Technology 7, no. 2 (2018): 328–32. https://doi.org/10.5281/zenodo.1173488.

Full text
Abstract:
Basically, data processing is the gathering, processing, and management of data to produce "new" information for end users [2]. Over time, the key challenges have related to mining, storage, transportation, and processing of high-throughput data. Big Data adds its own challenges: Volume, Velocity, Value, Veracity, Variety, Visualization, and Variability [4]. Consequently, these requirements imply an additional step in which data are cleaned, tagged, classified, and formatted. Big Data analysis currently splits into four steps: Acquisition or Access, Assembly or Organization, Analyze, and Action or Decision. These steps are referred to as the "4 A's".
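A minimal sketch of the 4 A's as four chained stage functions; all function names and data below are hypothetical, chosen only to make the sequence concrete:

```python
def acquire():                       # Acquisition / Access
    return [" 42 ", "17", "oops", "8 "]

def assemble(raw):                   # Assembly / Organization: clean, tag, format
    cleaned = []
    for item in raw:
        item = item.strip()
        if item.isdigit():           # drop records that fail validation
            cleaned.append(int(item))
    return cleaned

def analyze(values):                 # Analyze: derive a summary statistic
    return sum(values) / len(values) if values else None

def act(result):                     # Action / Decision: threshold-based decision
    print("alert" if result and result > 20 else "ok", result)

act(analyze(assemble(acquire())))    # alert 22.33...
```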
10

Kadhim Jawad, Wasnaa, and Abbas M. Al-Bakry. "Big Data Analytics: A Survey." Iraqi Journal for Computers and Informatics 49, no. 1 (2022): 41–51. http://dx.doi.org/10.25195/ijci.v49i1.384.

Full text
Abstract:
Internet-based programs and communication techniques have become widely used and respected in the IT industry recently. A persistent source of "big data," or data that is enormous in volume, diverse in type, and has a complicated multidimensional structure, is internet applications and communications. Today, several measures are routinely performed with no assurance that any of them will be helpful in understanding the phenomenon of interest in an era of automatic, large-scale data collection. Online transactions that involve buying, selling, or even investing are all examples of e-commerce. As a result, they generate data that has a complex structure and a high dimension. The usual data storage techniques cannot handle those enormous volumes of data. There is a lot of work being done to find ways to minimize the dimensionality of big data in order to provide analytics reports that are even more accurate and data visualizations that are more interesting. As a result, the purpose of this survey study is to give an overview of big data analytics along with related problems and issues that go beyond technology.
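One standard dimensionality-reduction technique such surveys allude to is principal component analysis; a small scikit-learn sketch (illustrative, not from the paper; the data is synthetic):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# 1,000 "transactions" with 50 features whose variance lies in a few directions.
X = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 50))

pca = PCA(n_components=0.95)          # keep enough components for 95% of variance
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)  # (1000, 50) -> (1000, k) with small k
print(pca.explained_variance_ratio_.round(3))
```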

Dissertations / Theses on the topic "High volume big data"

1

Le, Montagner Roman. "High-Energy Transient Universe in the Era of Large Optical Surveys." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASP089.

Full text
Abstract:
Multi-messenger astronomy involves combining data from various sources like photons, gravitational waves (GW), neutrinos, and cosmic rays. It made significant progress in 1987 with the detection of neutrinos from a nearby supernova and, more recently, in 2017 with the joint detection of gravitational waves, a short gamma-ray burst, and a kilonova from a neutron star merger. This field is expected to grow in importance over the next decade, especially with the launch of new observatories such as SVOM, Einstein Telescope, Icecube, KM3Net, and the Vera C. Rubin Observatory, expanding observational capabilities. The Legacy Survey of Space and Time (LSST) is set to begin observing the sky in 2025, continuing for a decade. This survey will delve into various scientific areas, including solar system science, dark matter, and dark energy. LSST will detect and alert the scientific community about significant changes in luminosity in the sky, resulting in up to 10 million notifications each night. In order to handle the vast amounts of data and associated scientific challenges, astronomical alert brokers like Fink have been established on large computing systems. These brokers automatically process, categorize, and store multiple alert streams in real time, offering various services to the scientific community. In this context, Fink has been using the Zwicky Transient Facility (ZTF) for many years in order to establish robust scientific exploitation of alert streams and scale out to LSST. This research work has been conducted within Fink, aiming to advance multi-messenger astronomy. I developed an automatic system, Fink-MM, to correlate LSST optical alerts with high-energy alerts from GW observatories, neutrinos, or gamma-ray bursts (GRBs) in real time. To minimize the number of false associations resulting from the poor localization of individual high-energy events, I devised a method to select a limited number of promising candidate counterparts. First, I rely on Fink's scientific results generated by machine learning algorithms and cross-referenced with external catalogues to significantly reduce the number of candidates. However, given Fink's limited classification capabilities, too many possible associations remain. By focusing on the optical counterparts of gamma-ray bursts, I further refine the analysis by providing an occurrence probability. This method was tested on known GRBs in the Fink database to establish the optimal threshold for obtaining a minimal number of alert candidates suitable for follow-up observations. During the testing phase, I successfully identified an optical counterpart, which was independently detected and confirmed by another team. Despite the efforts made, the search for fast extragalactic transients remains heavily influenced by the presence of unidentified objects, most of which have only a few detections on the same night, indicating the possible presence of moving objects such as solar system asteroids. Although a substantial number are already catalogued, LSST will detect millions of new ones. In this context, I extended Fink's capabilities by developing the Fink Asteroid Tracker, which identifies the trajectories of uncatalogued solar system objects and provides preliminary estimates of orbits and ephemerides for follow-up observations. I also contributed to a new model explaining more precisely the evolution of asteroid luminosity, allowing for the extraction of their rotation and shape. Finally, to compensate for the sparse cadence of optical surveys, I describe GVOM, a network of telescopes under construction, specifically designed to quickly provide additional scientific data to preliminary candidates identified by Fink.
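The counterpart-association step described here boils down to a positional crossmatch against a poorly localized high-energy event. A toy version using astropy (not the Fink-MM code; the coordinates and error radius below are made up):

```python
import astropy.units as u
from astropy.coordinates import SkyCoord

# Toy optical alerts and one poorly localized high-energy event.
optical = SkyCoord(ra=[10.1, 45.0, 200.2] * u.deg, dec=[-5.0, 12.3, 33.3] * u.deg)
grb = SkyCoord(ra=10.3 * u.deg, dec=-5.2 * u.deg)
error_radius = 0.5 * u.deg           # localization uncertainty of the GRB alert

sep = grb.separation(optical)        # angular distance to every optical alert
candidates = optical[sep < error_radius]
print(f"{len(candidates)} candidate counterpart(s) within {error_radius}")
```

The thesis's contribution lies in what happens after this step: filtering the surviving candidates with machine-learning classifications, catalogue crossmatches, and an occurrence probability.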
2

Danesh, Sabri. "BIG DATA : From hype to reality." Thesis, Örebro universitet, Handelshögskolan vid Örebro Universitet, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-37493.

Full text
Abstract:
Big data is all of a sudden everywhere. It is too big to ignore! It has been six decades since the computer revolution, four decades since the development of the microchip, and two decades since the advent of the modern Internet. More than a decade after the 1990s ".com" fizz, can Big Data be the next Big Bang? Big data reveals part of our daily lives and has the potential to address virtually any problem for a better urbanized globe. Big Data sources are also very interesting from an official-statistics point of view. The purpose of this paper is to explore the conceptions of big data and the opportunities and challenges associated with using big data, especially in official statistics. "A petabyte is the equivalent of 1,000 terabytes, or a quadrillion bytes. One terabyte is a thousand gigabytes. One gigabyte is made up of a thousand megabytes. There are a thousand thousand—i.e., a million—petabytes in a zettabyte" (Shaw 2014). And this is to be continued…
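The quoted unit arithmetic is easy to verify directly:

```python
KB = 10 ** 3
MB = KB * 1000
GB = MB * 1000
TB = GB * 1000
PB = TB * 1000
EB = PB * 1000
ZB = EB * 1000

assert PB == 1000 * TB == 10 ** 15      # a petabyte is a quadrillion bytes
assert ZB // PB == 10 ** 6              # a million petabytes per zettabyte
print(f"1 ZB = {ZB:.3e} bytes = {ZB // PB:,} PB")
```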
3

Tudoran, Radu-Marius. "High-Performance Big Data Management Across Cloud Data Centers." Electronic Thesis or Diss., Rennes, École normale supérieure, 2014. http://www.theses.fr/2014ENSR0004.

Full text
Abstract:
The easily accessible computing power offered by cloud infrastructures, coupled with the "Big Data" revolution, is increasing the scale and speed at which data analysis is performed. Cloud computing resources for compute and storage are spread across multiple data centers around the world. Enabling fast data transfers becomes especially important in scientific applications where moving the processing close to data is expensive or even impossible. The main objectives of this thesis are to analyze how clouds can become "Big Data - friendly", and what the best options are to provide data management services able to meet the needs of applications. In this thesis, we present our contributions to improve the performance of data management for applications running on several geographically distributed data centers. We start with aspects concerning the scale of data processing on a site, and continue with the development of MapReduce-type solutions allowing the distribution of computation between several centers. Then, we present a transfer service architecture that optimizes the cost-performance ratio of transfers. This service is operated in the context of real-time data streaming between cloud data centers. Finally, we study the viability, for a cloud provider, of the solution consisting in integrating this architecture as a service based on a flexible pricing paradigm, qualified as "Transfer-as-a-Service".
4

Tran, Viet-Trung. "Scalable data-management systems for Big Data." Phd thesis, École normale supérieure de Cachan - ENS Cachan, 2013. http://tel.archives-ouvertes.fr/tel-00920432.

Full text
Abstract:
Big Data can be characterized by three V's:
* Big Volume refers to the unprecedented growth in the amount of data.
* Big Velocity refers to the growth in the speed of moving data into and out of management systems.
* Big Variety refers to the growth in the number of different data formats.
Managing Big Data requires fundamental changes in the architecture of data management systems. Data storage must keep innovating in order to adapt to the growth of data: systems need to be scalable while maintaining high performance for data accesses. This thesis focuses on building scalable data management systems for Big Data. Our first and second contributions address the challenge of providing efficient support for Big Volume of data in data-intensive high performance computing (HPC) environments. In particular, we address the shortcoming of existing approaches in handling atomic, non-contiguous I/O operations in a scalable fashion. We propose and implement a versioning-based mechanism that can be leveraged to offer isolation for non-contiguous I/O without the need to perform expensive synchronizations. In the context of parallel array processing in HPC, we introduce Pyramid, a large-scale, array-oriented storage system. It revisits the physical organization of data in distributed storage systems for scalable performance. Pyramid favors multidimensional-aware data chunking that closely matches the access patterns generated by applications. Pyramid also favors distributed metadata management and versioning concurrency control to eliminate synchronization under concurrency. Our third contribution addresses Big Volume at the scale of geographically distributed environments. We consider BlobSeer, a distributed versioning-oriented data management service, and we propose BlobSeer-WAN, an extension of BlobSeer optimized for such geographically distributed environments. BlobSeer-WAN takes the latency hierarchy into account by favoring local metadata accesses. BlobSeer-WAN features asynchronous metadata replication and a vector-clock implementation for collision resolution. To cope with the Big Velocity characteristic of Big Data, our last contribution features DStore, an in-memory document-oriented store that scales vertically by leveraging the large memory capacity of multicore machines. DStore demonstrates fast and atomic complex transaction processing for data writing, while maintaining high-throughput read access. DStore follows a single-threaded execution model to execute update transactions sequentially, while relying on versioning concurrency control to enable a large number of simultaneous readers.
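DStore's combination of a single serialized writer with versioned snapshots for readers can be sketched as follows (a simplification for illustration, not DStore's implementation):

```python
import threading

class VersionedStore:
    """Single writer, many readers: readers pin an immutable snapshot."""
    def __init__(self):
        self._snapshot = {}                 # currently published version
        self._write_lock = threading.Lock()

    def update(self, transaction):
        # Serialize writers; build the next version off to the side,
        # then publish it with one atomic reference swap.
        with self._write_lock:
            nxt = dict(self._snapshot)
            transaction(nxt)
            self._snapshot = nxt

    def read(self):
        # Readers never block: they just grab the current snapshot reference.
        return self._snapshot

store = VersionedStore()
store.update(lambda d: d.update(user="ada", balance=100))
snap = store.read()
store.update(lambda d: d.__setitem__("balance", 90))
print(snap["balance"], store.read()["balance"])   # 100 90 -- old snapshot unchanged
```

The design choice is the one the abstract describes: sequential update transactions keep writes atomic and simple, while versioning gives many readers consistent, lock-free access.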
5

Zhang, Liangwei. "Big Data Analytics for eMaintenance : Modeling of high-dimensional data streams." Licentiate thesis, Luleå tekniska universitet, Drift, underhåll och akustik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-17012.

Full text
Abstract:
Big Data analytics has attracted intense interest from both academia and industry recently for its attempt to extract information, knowledge and wisdom from Big Data. In industry, with the development of sensor technology and Information & Communication Technologies (ICT), reams of high-dimensional data streams are being collected and curated by enterprises to support their decision-making. Fault detection from these data is one of the important applications in eMaintenance solutions, with the aim of supporting maintenance decision-making. Early discovery of system faults may ensure the reliability and safety of industrial systems and reduce the risk of unplanned breakdowns. Both high dimensionality and the properties of data streams impose stringent challenges on fault detection applications. From the data modeling point of view, high dimensionality may cause the notorious "curse of dimensionality" and lead to accuracy deterioration of fault detection algorithms. On the other hand, fast-flowing data streams require fault detection algorithms to have low computing complexity and give real-time or near real-time responses upon the arrival of new samples. Most existing fault detection models work on relatively low-dimensional spaces. Theoretical studies on high-dimensional fault detection mainly focus on detecting anomalies on subspace projections of the original space. However, these models are either arbitrary in selecting subspaces or computationally intensive. In considering the requirements of fast-flowing data streams, several strategies have been proposed to adapt existing fault detection models to online mode so that they are applicable in stream data mining. Nevertheless, few studies have simultaneously tackled the challenges associated with high dimensionality and data streams. In this research, an Angle-based Subspace Anomaly Detection (ABSAD) approach to fault detection from high-dimensional data is developed. Both analytical study and numerical illustration demonstrate the efficacy of the proposed approach. Based on the sliding window strategy, the approach is further extended to an online mode with the aim of detecting faults from high-dimensional data streams. Experiments on synthetic datasets proved that the online ABSAD algorithm can adapt to the time-varying behavior of the monitored system, and is hence applicable to dynamic fault detection.
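ABSAD itself is angle-based and subspace-aware; the sliding-window pattern it uses for streams can be illustrated with a far simpler stand-in detector (a z-score rule, purely illustrative, not the thesis's algorithm):

```python
from collections import deque
from statistics import mean, stdev

class SlidingWindowDetector:
    """Flag a sample whose z-score against the recent window exceeds a threshold."""
    def __init__(self, window=50, threshold=3.0):
        self.buf = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, x):
        is_fault = False
        if len(self.buf) >= 10:                    # wait for a minimal history
            mu, sigma = mean(self.buf), stdev(self.buf)
            if sigma > 0 and abs(x - mu) / sigma > self.threshold:
                is_fault = True
        self.buf.append(x)                         # window slides with the stream
        return is_fault

det = SlidingWindowDetector()
stream = [1.0, 1.1, 0.9, 1.0] * 5 + [9.0]
print([det.observe(x) for x in stream][-1])        # True -- the spike is flagged
```

The key property shared with the online ABSAD algorithm is that the model is updated incrementally from a bounded window, so cost per sample stays constant and the detector tracks time-varying behavior.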
6

Griffin, Alan R., and R. Stephen Wooten. "Automated Data Management in a High-Volume Telemetry Data Processing Environment." International Foundation for Telemetering, 1992. http://hdl.handle.net/10150/608908.

Full text
Abstract:
International Telemetering Conference Proceedings / October 26-29, 1992 / Town and Country Hotel and Convention Center, San Diego, California.
The vast amount of data telemetered from space probe experiments requires careful management and tracking from initial receipt through acquisition, archiving, and distribution. This paper presents the automated system used at the Phillips Laboratory, Geophysics Directorate, for tracking telemetry data from its receipt at the facility to its distribution on various media to the research community. Features of the system include computerized databases, automated generation of media labels, automated generation of reports, and automated archiving.
7

Lu, Feng. "Big data scalability for high throughput processing and analysis of vehicle engineering data." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-207084.

Full text
Abstract:
"Sympathy for Data" is a platform that is utilized for Big Data automation analytics. It is based on visual interface and workflow configurations. The main purpose of the platform is to reuse parts of code for structured analysis of vehicle engineering data. However, there are some performance issues on a single machine for processing a large amount of data in Sympathy for Data. There are also disk and CPU IO intensive issues when the data is oversized and the platform need fits comfortably in memory. In addition, for data over the TB or PB level, the Sympathy for data needs separate functionality for efficient processing simultaneously and scalable for distributed computation functionality. This paper focuses on exploring the possibilities and limitations in using the Sympathy for Data platform in various data analytic scenarios within the Volvo Cars vision and strategy. This project re-writes the CDE workflow for over 300 nodes into pure Python script code and make it executable on the Apache Spark and Dask infrastructure. We explore and compare both distributed computing frameworks implemented on Amazon Web Service EC2 used for 4 machine with a 4x type for distributed cluster measurement. However, the benchmark results show that Spark is superior to Dask from performance perspective. Apache Spark and Dask will combine with Sympathy for Data products for a Big Data processing engine to optimize the system disk and CPU IO utilization. There are several challenges when using Spark and Dask to analyze large-scale scientific data on systems. For instance, parallel file systems are shared among all computing machines, in contrast to shared-nothing architectures. Moreover, accessing data stored in commonly used scientific data formats, such as HDF5 is not tentatively supported in Spark. This report presents research carried out on the next generation of Big Data platforms in the automotive industry called "Sympathy for Data". The research questions focusing on improving the I/O performance and scalable distributed function to promote Big Data analytics. During this project, we used the Dask.Array parallelism features for interpretation the data sources as a raster shows in table format, and Apache Spark used as data processing engine for parallelism to load data sources to memory for improving the big data computation capacity. The experiments chapter will demonstrate 640GB of engineering data benchmark for single node and distributed computation mode to evaluate the Sympathy for Data Disk CPU and memory metrics. Finally, the outcome of this project improved the six times performance of the original Sympathy for data by developing a middleware SparkImporter. It is used in Sympathy for Data for distributed computation and connected to the Apache Spark for data processing through the maximum utilization of the system resources. This improves its throughput, scalability, and performance. It also increases the capacity of the Sympathy for data to process Big Data and avoids big data cluster infrastructures.
8

Tang, Yuzhe. "Secure and high-performance big-data systems in the cloud." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/53995.

Full text
Abstract:
Cloud computing and big data technology continue to revolutionize how computing and data analysis are delivered today and in the future. To store and process the fast-changing big data, various scalable systems (e.g. key-value stores and MapReduce) have recently emerged in industry. However, there is a huge gap between what these open-source software systems can offer and what the real-world applications demand. First, scalable key-value stores are designed for simple data access methods, which limit their use in advanced database applications. Second, existing systems in the cloud need automatic performance optimization for better resource management with minimized operational overhead. Third, the demand continues to grow for privacy-preserving search and information sharing between autonomous data providers, as exemplified by the Healthcare information networks. My Ph.D. research aims at bridging these gaps. First, I proposed HINDEX, for secondary index support on top of write-optimized key-value stores (e.g. HBase and Cassandra). To update the index structure efficiently in the face of an intensive write stream, HINDEX synchronously executes append-only operations and defers the so-called index-repair operations which are expensive. The core contribution of HINDEX is a scheduling framework for deferred and lightweight execution of index repairs. HINDEX has been implemented and is currently being transferred to an IBM big data product. Second, I proposed Auto-pipelining for automatic performance optimization of streaming applications on multi-core machines. The goal is to prevent the bottleneck scenario in which the streaming system is blocked by a single core while all other cores are idling, which wastes resources. To partition the streaming workload evenly to all the cores and to search for the best partitioning among many possibilities, I proposed a heuristic based search strategy that achieves locally optimal partitioning with lightweight search overhead. The key idea is to use a white-box approach to search for the theoretically best partitioning and then use a black-box approach to verify the effectiveness of such partitioning. The proposed technique, called Auto-pipelining, is implemented on IBM Stream S. Third, I proposed ε-PPI, a suite of privacy preserving index algorithms that allow data sharing among unknown parties and yet maintaining a desired level of data privacy. To differentiate privacy concerns of different persons, I proposed a personalized privacy definition and substantiated this new privacy requirement by the injection of false positives in the published ε-PPI data. To construct the ε-PPI securely and efficiently, I proposed to optimize the performance of multi-party computations which are otherwise expensive; the key idea is to use addition-homomorphic secret sharing mechanism which is inexpensive and to do the distributed computation in a scalable P2P overlay.
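HINDEX's deferral of expensive index repairs off the write path can be sketched as an append-only insert plus a bounded background repair queue (heavily simplified, not the HINDEX code):

```python
from collections import deque

class DeferredIndex:
    """Append-only writes now, index repairs later (deferral idea, simplified)."""
    def __init__(self):
        self.index = {}          # value -> key, possibly containing stale entries
        self.repairs = deque()   # stale entries awaiting cleanup

    def put(self, key, value, old_value=None):
        # Fast path on the write stream: one append-only index insert;
        # removing the stale (old_value -> key) entry is deferred.
        self.index[value] = key
        if old_value is not None:
            self.repairs.append((old_value, key))

    def repair_some(self, budget=100):
        # Background path: spend a bounded budget cleaning stale entries.
        for _ in range(min(budget, len(self.repairs))):
            old_value, key = self.repairs.popleft()
            if self.index.get(old_value) == key:
                del self.index[old_value]

idx = DeferredIndex()
idx.put("row1", "red")
idx.put("row1", "blue", old_value="red")   # stale 'red' entry left behind
idx.repair_some()
print(idx.index)                           # {'blue': 'row1'}
```

The trade-off is the one the abstract names: writes stay cheap under an intensive stream, at the cost of temporarily stale index entries that a scheduler repairs later.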
9

Abidi, Faiz Abbas. "Remote High Performance Visualization of Big Data for Immersive Science." Thesis, Virginia Tech, 2017. http://hdl.handle.net/10919/78210.

Full text
Abstract:
Remote visualization has emerged as a necessary tool in the analysis of big data. High-performance computing clusters can provide several benefits in scaling to larger data sizes, from parallel file systems to larger RAM profiles to parallel computation among many CPUs and GPUs. For scalable data visualization, remote visualization tools and infrastructure are critical, where only pixels and interaction events are sent over the network instead of the data. In this paper, we present our pipeline using VirtualGL, TurboVNC, and ParaView to render over 40 million points using remote HPC clusters and project over 26 million pixels in a CAVE-style system. We benchmark the system by varying the video stream compression parameters supported by TurboVNC and establish some best practices for typical usage scenarios. This work will help research scientists and academicians in scaling their big data visualizations for real-time interaction.
10

Mercier, Michael. "Contribution to High Performance Computing and Big Data Infrastructure Convergence." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAM031/document.

Full text
Abstract:
The amount of data produced, in the scientific community and the commercial world alike, is constantly growing. The field of Big Data has emerged to handle large amounts of data on distributed computing infrastructures. High-Performance Computing (HPC) infrastructures are made for intensive parallel computations. The HPC community is also facing more and more data because of new high-definition sensors and large physics apparatus. The convergence of the two fields is currently happening. In fact, the HPC community is already using Big Data tools, but they are not integrated correctly, especially at the level of the file system and the Resources and Job Management System (RJMS). In order to understand how we can leverage HPC clusters for Big Data usage, and what the challenges are for HPC infrastructures, we have studied multiple aspects of the convergence: we have made a survey of software provisioning methods, with a focus on data-intensive applications. We also propose a new RJMS collaboration technique called BeBiDa, which is based on 50 lines of code whereas similar solutions use at least 1000x more. We evaluate this mechanism in real conditions and in simulation with our simulator Batsim.

Books on the topic "High volume big data"

1

Raj, Pethuru, Anupama Raman, Dhivya Nagaraj, and Siddhartha Duggirala. High-Performance Big-Data Analytics. Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-20744-5.

Full text
2

Arora, Ritu, ed. Conquering Big Data with High Performance Computing. Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-33742-5.

Full text
3

Grandinetti, Lucio, Seyedeh Leili Mirtaheri, and Reza Shahbazian, eds. High-Performance Computing and Big Data Analysis. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-33495-6.

Full text
4

Kołodziej, Joanna, and Horacio González-Vélez, eds. High-Performance Modelling and Simulation for Big Data Applications. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-16272-6.

Full text
5

Pirozzoli, Sergio, and Tapan K. Sengupta, eds. High-Performance Computing of Big Data for Turbulence and Combustion. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-17012-7.

Full text
6

Kasim, Adetayo, Ziv Shkedy, Sebastian Kaiser, Sepp Hochreiter, and Willem Talloen, eds. Applied Biclustering Methods for Big and High-Dimensional Data Using R. Chapman and Hall/CRC, 2016. http://dx.doi.org/10.1201/9781315373966.

Full text
7

Kuldova, Tereza Østbø, Helene Oppen Ingebrigtsen Gundhus, and Christin Thea Wathne, eds. Policing and Intelligence in the Global Big Data Era, Volume I. Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-68326-8.

Full text
8

Kuldova, Tereza Østbø, Helene Oppen Ingebrigtsen Gundhus, and Christin Thea Wathne, eds. Policing and Intelligence in the Global Big Data Era, Volume II. Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-68298-8.

Full text
9

Berki, Sylvester E., and National Center for Health Statistics (U.S.), eds. High-Volume and Low-Volume Users of Health Services, United States, 1980. U.S. Dept. of Health and Human Services, Public Health Service, National Center for Health Statistics, 1985.

Find full text
10

Hawaii. Department of Health, ed. Spurious High Birth Defect Rate on the Island of Lanai and an Unexpected Indication of a High Incidence of Adverse Pregnancy Outcomes in Kohala on the Big Island. Research and Statistics Office, Hawaii State Department of Health, 1986.

Find full text

Book chapters on the topic "High volume big data"

1

Chicchón, Miguel, and Ronny Huerta. "Semantic Segmentation Using Convolutional Neural Networks for Volume Estimation of Native Potatoes at High Speed." In Information Management and Big Data. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-76228-5_17.

Full text
2

Fournier, Fabiana, and Inna Skarbovsky. "Real-Time Data Processing." In Big Data in Bioeconomy. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-71069-9_11.

Full text
Abstract:
To remain competitive, organizations are increasingly taking advantage of the high volumes of data produced in real time for actionable insights and operational decision-making. In this chapter, we present basic concepts in real-time analytics, their importance in today's organizations, and their applicability to the bioeconomy domains investigated in the DataBio project. We begin by introducing key terminology for event processing, and motivation for the growing use of event processing systems, followed by a market analysis synopsis. Thereafter, we provide a high-level overview of event processing system architectures, with their main characteristics and components, followed by a survey of some of the most prominent commercial and open source tools. We then describe how we applied this technology in two of the DataBio project domains: agriculture and fishery. The devised generic pipeline for IoT data real-time processing and decision-making was successfully applied to three pilots in the project from the agriculture and fishery domains. This event processing pipeline can be generalized to any use case in which data is collected from IoT sensors and analyzed in real time to provide real-time alerts for operational decision-making.
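The simplest event-processing building block behind such a pipeline, a stateless filter that turns sensor readings into alerts, can be sketched as follows (illustrative; the sensor names and threshold are made up, not from the chapter):

```python
from dataclasses import dataclass

@dataclass
class Reading:
    sensor: str
    value: float

def alert_stream(readings, threshold=30.0):
    """Stateless filter pattern from event processing: emit an alert event
    the moment a reading crosses the threshold."""
    for r in readings:
        if r.value > threshold:
            yield f"ALERT {r.sensor}: {r.value} exceeds {threshold}"

stream = [Reading("soil-7", 21.5), Reading("soil-7", 33.2), Reading("hull-2", 18.0)]
for alert in alert_stream(stream):
    print(alert)   # ALERT soil-7: 33.2 exceeds 30.0
```

Real event processing engines add windowing, pattern matching, and stateful operators on top of this basic filter-and-emit loop.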
3

Ding, Linlin, Yu Liu, Baoyan Song, and Junchang Xin. "An Efficient High-Dimensional Big Data Storage Structure Based on US-ELM." In Proceedings of ELM-2015 Volume 1. Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-28397-5_38.

Full text
4

Azhari, Mourad, Khalid Ahaji, Abdallah Abarda, Badia Ettaki, and Jamal Zerouaoui. "Using Machine Learning with Pyspark for Solving a Big Data Problem: Searching for Particles in High Energy Physics with Big Mass (HEPMASS)." In Innovations in Smart Cities Applications Volume 6. Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-26852-6_53.

Full text
5

Gao, Chengfang, Jinman Luo, Haobo Liang, and Xiaoji Guo. "The Construction of Power Grid Operation Monitoring Platform Driven by High-Tech Information Technology." In Proceedings of the 4th International Conference on Big Data Analytics for Cyber-Physical System in Smart City - Volume 1. Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-0880-6_61.

Full text
6

Bautista, Elizabeth, Cary Whitney, and Thomas Davis. "Big Data Behind Big Data." In Conquering Big Data with High Performance Computing. Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-33742-5_8.

Full text
7

Khosla, P. K., and Amandeep Kaur. "Big Data Technologies." In Advances in Parallel Computing. IOS Press, 2018. https://doi.org/10.3233/978-1-61499-814-3-28.

Full text
Abstract:
The world, nowadays, is engulfed in a deluge of data of different formats, generated from innumerable sources like mobile phones, social media, digital platforms, scientific experiments and enterprise applications. Such huge amounts of unstructured and semi-structured data coming from various sources and in different formats are termed "Big Data". The trillion-sensor world is further going to add to the explosive growth of the Big Data segment. According to Gartner, "Big data is high volume, high velocity, and high variety information assets that require new forms of processing to enable enhanced decision making, insight, discovery, and process optimization". It is quite evident that Big Data is creating opportunities for effective decision making across several domains and organizations.
8

Chen, Min. "A Hierarchical Security Model for Multimedia Big Data." In Big Data. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-9840-6.ch022.

Full text
Abstract:
In this chapter, the author proposes a hierarchical security model (HSM) to enhance security assurance for multimedia big data. It provides role hierarchy management and security roles/rules administration by seamlessly integrating the role-based access control (RBAC) with the object-oriented concept, spatio-temporal constraints, and multimedia standard MPEG-7. As a result, it can deal with challenging and unique security requirements in the multimedia big data environment. First, it supports multilayer access control so different access permission can be conveniently set for various multimedia elements such as visual/audio objects or segments in a multimedia data stream when needed. Second, the spatio-temporal constraints are modeled for access control purpose. Finally, its security processing is efficient to handle high data volume and rapid data arrival rate.
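The chapter's multilayer RBAC with temporal constraints can be illustrated with a toy permission check (the role names, multimedia elements, and hours below are hypothetical, not the HSM model itself):

```python
# Hypothetical role/permission tables for multilayer multimedia access control.
ROLE_PERMS = {
    "admin":  {"stream", "video_object", "audio_segment"},
    "editor": {"stream", "video_object"},
    "viewer": {"stream"},
}

def can_access(role, element, hour, allowed_hours=range(8, 18)):
    """RBAC check plus a simple temporal constraint, in the spirit of the
    chapter's spatio-temporal rules (much simplified)."""
    return element in ROLE_PERMS.get(role, set()) and hour in allowed_hours

print(can_access("editor", "video_object", hour=10))   # True
print(can_access("viewer", "audio_segment", hour=10))  # False -- wrong layer
print(can_access("admin", "stream", hour=23))          # False -- outside time window
```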
9

Sultanow, Eldar, and Alina M. Chircu. "Improving Healthcare with Data-Driven Track-and-Trace Systems." In Big Data. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-9840-6.ch055.

Full text
Abstract:
This chapter illustrates the potential of data-driven track-and-trace technology for improving healthcare through efficient management of internal operations and better delivery of services to patients. Track-and-trace can help healthcare organizations meet government regulations, reduce cost, provide value-added services, and monitor and protect patients, equipment, and materials. Two real-world examples of commercially available track-and-trace systems based on RFID and sensors are discussed: a system for counterfeiting prevention and quality assurance in pharmaceutical supply chains and a monitoring system. The system-generated data (such as location, temperature, movement, etc.) about tracked entities (such as medication, patients, or staff) is “big data” (i.e. data with high volume, variety, velocity, and veracity). The chapter discusses the challenges related to data capture, storage, retrieval, and ultimately analysis in support of organizational objectives (such as lowering costs, increasing security, improving patient outcomes, etc.).
10

Cavalcanti, José Carlos. "The New “ABC” of ICTs (Analytics + Big Data + Cloud Computing)." In Big Data. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-9840-6.ch099.

Full text
Abstract:
Analytics (the discovery and communication of significant patterns in data) of Big Data (basically characterized by large structured and unstructured data volumes, from a variety of sources, at high velocity, i.e., real-time data capture, storage, and analysis), through the use of Cloud Computing (a model of network computing), is becoming the new "ABC" of information and communication technologies (ICTs), with important effects on the generation of new firms and the restructuring of established ones. However, as this chapter argues, successful application of these new ABC technologies and tools depends on two interrelated policy aspects: 1) the use of a proper model which could help one approach the structure and dynamics of the firm, and 2) how the complex trade-off between information technology (IT) and communication technology (CT) costs is handled within, between, and beyond firms, organizations, and institutions.

Conference papers on the topic "High volume big data"

1

Begoli, Edmon, Kris Brown, Sudarshan Srinivas, and Suzanne Tamang. "SynthNotes: A Generator Framework for High-volume, High-fidelity Synthetic Mental Health Notes." In 2018 IEEE International Conference on Big Data (Big Data). IEEE, 2018. http://dx.doi.org/10.1109/bigdata.2018.8621981.

Full text
2

Williams, Jenny Weisenberg, Kareem S. Aggour, John Interrante, Justin McHugh, and Eric Pool. "Bridging high velocity and high volume industrial big data through distributed in-memory storage & analytics." In 2014 IEEE International Conference on Big Data (Big Data). IEEE, 2014. http://dx.doi.org/10.1109/bigdata.2014.7004325.

Full text
3

Yesudas, Michael, Girish Menon S, and Satheesh K. Nair. "High-Volume Performance Test Framework using Big Data." In ICPE'15: ACM/SPEC International Conference on Performance Engineering. ACM, 2015. http://dx.doi.org/10.1145/2693182.2693185.

Full text
4

Zhong, Tao, Kshitij Doshi, Gang Deng, Xiaoming Yang, and Hegao Zhang. "High volume geospatial mapping for internet-of-vehicle solutions with in-memory map-reduce processing." In 2014 IEEE International Conference on Big Data (Big Data). IEEE, 2014. http://dx.doi.org/10.1109/bigdata.2014.7004309.

Full text
5

"Front Matter: Volume 9720." In High-Speed Biomedical Imaging and Spectroscopy: Toward Big Data Instrumentation and Management, edited by Keisuke Goda and Kevin K. Tsia. SPIE, 2016. http://dx.doi.org/10.1117/12.2239496.

Full text
6

"Front Matter: Volume 10505." In High-Speed Biomedical Imaging and Spectroscopy III: Toward Big Data Instrumentation and Management, edited by Keisuke Goda and Kevin K. Tsia. SPIE, 2018. http://dx.doi.org/10.1117/12.2323059.

Full text
7

Gkorou, Dimitra, Alexander Ypma, George Tsirogiannis, et al. "Towards Big Data Visualization for Monitoring and Diagnostics of High Volume Semiconductor Manufacturing." In CF '17: Computing Frontiers Conference. ACM, 2017. http://dx.doi.org/10.1145/3075564.3078883.

Full text
8

Lee, Hyun-Joo, Euihyun Jung, Daein Kim, et al. "Study for big data interrelation using real-time monitoring technology in high cost, high volume manufacturing EUV era." In Photomask Technology, edited by Stephen P. Renwick. SPIE, 2021. http://dx.doi.org/10.1117/12.2599020.

Full text
9

Im, Seokjin, and HeeJoung Hwang. "A High Efficient Encoding Scheme of Big-Volume Bio-informatics Data using a Linear Block Buffering." In Ubiquitous Science and Engineering 2015. Science & Engineering Research Support soCiety, 2015. http://dx.doi.org/10.14257/astl.2015.107.17.

Full text
10

Alblushi, Maryam, Khalid Nasser, Mohammad Readean, and Ahmed Ghamdi. "Big Data Integration Framework for Processing Petrophysical Data." In International Petroleum Technology Conference. IPTC, 2022. http://dx.doi.org/10.2523/iptc-22170-ms.

Full text
Abstract:
Background: Since the introduction of the first electrical resistivity well log by Marcel and Conrad Schlumberger in 1927, the field of petrophysical well logging has experienced significant technological advancements [3]. One of the new technologies is Logging While Drilling (LWD), which allows for real-time data streaming and acquisition from the initial drilling depth to the target depth. The target depth sometimes reaches more than 25,000 feet, resulting in a wealth of captured data [7]. As special logging probes scan given subsurface intervals, a long list of diverse readings is collected as functions of either depth or time [4]. Unfortunately, most of the obtained data cannot be used as is; several processing, calibration, and interpretation activities must be performed on the stored raw data to extract useful insights about the penetrated formations [5]. While these data processing activities are feasible for one particular hydrocarbon reservoir using conventional processing techniques, performing field-wide petrophysical studies can be a real challenge. However, big data technologies can be seen as a potential solution, as petrophysical data satisfies the main characteristics of big data: high volume, high velocity, extreme variety of measurement types and formats, and the uncertain veracity of data attained from several vendors and sensors. In this paper, we first review the major challenges limiting geoscientists, geophysicists, and petroleum engineers from fully exploiting petrophysical data. Then, we propose a big data-based framework which can help overcome some of these challenges by capitalizing on advanced processing techniques. Finally, we discuss the results of applying the framework on a defined business case.
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "High volume big data"

1

van der Sloot, Bart. The Quality of Life: Protecting Non-personal Interests and Non-personal Data in the Age of Big Data. Universitätsbibliothek J. C. Senckenberg, Frankfurt am Main, 2021. http://dx.doi.org/10.21248/gups.64579.

Full text
Abstract:
Under the current legal paradigm, the rights to privacy and data protection provide natural persons with subjective rights to protect their private interests, such as those related to human dignity, individual autonomy, and personal freedom. In principle, when data processing is based on non-personal or aggregated data, or when such data processes have an impact on societal rather than individual interests, citizens cannot rely on these rights. Although this legal paradigm has worked well for decades, it is increasingly put under pressure because Big Data processes are typically based on indiscriminate rather than targeted data collection, because the high volumes of data are processed on an aggregated rather than a personal level, and because the policies and decisions based on the statistical correlations found through algorithmic analytics are mostly addressed to large groups or society as a whole rather than to specific individuals. This means that large parts of the data-driven environment are currently left unregulated and that individuals are often unable to rely on their fundamental rights when addressing the more systemic effects of Big Data processes. This article discusses how this tension might be relieved by turning to the notion of ‘quality of life’, which has the potential of becoming the new standard for the European Court of Human Rights (ECtHR) when dealing with privacy-related cases.
APA, Harvard, Vancouver, ISO, and other styles
2

Marsden, Eric, and Véronique Steyer. Artificial intelligence and safety management: an overview of key challenges. Foundation for an Industrial Safety Culture, 2025. https://doi.org/10.57071/iae290.

Full text
Abstract:
Artificial intelligence based on deep learning, along with big data analysis, has in recent years been the subject of rapid scientific and technological advances. These technologies are increasingly being integrated into various work environments with the aim of enhancing performance and productivity. This dimension of the digital transformation of businesses and regulatory authorities presents both significant opportunities and potential risks for industrial safety management practices. While there are numerous expected benefits, such as the ability to process large volumes of reliability data or unstructured natural language incident reports, the structural opacity of large neural networks, their non-deterministic nature, and their capacity to learn from new data mean that traditional safety assurance techniques used for conventional software are not applicable. Additionally, the expansion of the scope of automatable tasks and the gradual move towards work collectives composed of human operators who collaborate with various intelligent machines and agents introduce new variables that must be considered alongside and integrated with the organizational and human factors of safety. What are the main challenges posed by these new technologies in terms of skills management, worker well-being, privacy protection, and the pursuit of performance that aligns with societal expectations? What changes are required in how we conceptualize the safety of high-stakes activities, how we demonstrate and verify the absence of unacceptable risks, and how we anticipate potential deviations? This document provides a concise overview of the most recent available information, contextualized by decades of research on automation in high-hazard systems. It focuses specifically on the projected impacts for high-hazard industries and infrastructures over the next ten years.
APA, Harvard, Vancouver, ISO, and other styles
3

Author, Not Given. Accelerating High-Level Waste Glass Corrosion Research with Big Data. Office of Scientific and Technical Information (OSTI), 2018. http://dx.doi.org/10.2172/1469277.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Pinar, Ali, Tamara G. Kolda, Kevin Thomas Carlberg, Grey Ballard, and Michael Mahoney. Unsupervised Learning Through Randomized Algorithms for High-Volume High-Velocity Data (ULTRA-HV). Office of Scientific and Technical Information (OSTI), 2018. http://dx.doi.org/10.2172/1417788.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Fang, Chin. Using NVMe Gen3 PCIe SSD Cards in High-density Servers for High-performance Big Data Transfer Over Multiple Network Channels. Office of Scientific and Technical Information (OSTI), 2015. http://dx.doi.org/10.2172/1171472.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Hildstrom, Gregory A. High-Volume Data Analysis Suite (HVDAS) Manual November-17-2003. Defense Technical Information Center, 2003. http://dx.doi.org/10.21236/ada422175.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Mullen, E. G., and M. S. Gussenhoven. SCATHA (Spacecraft Charging at High Altitudes) Atlas Data Base. Volume 2. Defense Technical Information Center, 1989. http://dx.doi.org/10.21236/ada214205.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Vienna, John D., Alexander Fluegel, Dong-Sang Kim, and Pavel R. Hrma. Glass Property Data and Models for Estimating High-Level Waste Glass Volume. Office of Scientific and Technical Information (OSTI), 2009. http://dx.doi.org/10.2172/971447.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Avidan, A., and R. Shinnar. Analysis of test data from IGT high pressure fluidized-bed gasifier PDU: Volume 2. Office of Scientific and Technical Information (OSTI), 1988. http://dx.doi.org/10.2172/5290715.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Kindle, C. H., and M. R. Kreiter. Comprehensive data base of high-level nuclear waste glasses: September 1987 status report: Volume 1, Discussion and glass durability data. Office of Scientific and Technical Information (OSTI), 1987. http://dx.doi.org/10.2172/5631321.

Full text
APA, Harvard, Vancouver, ISO, and other styles