Academic literature on the topic 'BigData'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'BigData.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "BigData"

1

Zhang, Jinson, Mao Huang, and Zhao-Peng Meng. "Visual analytics for BigData variety and its behaviours." Computer Science and Information Systems 12, no. 4 (2015): 1171–91. http://dx.doi.org/10.2298/csis141122050z.

Full text
Abstract:
BigData, defined as structured and unstructured data containing images, videos, texts, audio and other forms of data collected from multiple datasets, is too big, too complex and moves too fast to analyze using traditional methods. This has given rise to a few issues that must be addressed: 1) how to analyze BigData across multiple datasets, 2) how to classify the different data forms, 3) how to identify BigData patterns based on its behaviours, 4) how to visualize BigData attributes in order to gain a better understanding of the data. It is therefore necessary to establish a new framework for BigData analysis and visualization. In this paper, we have extended our previous work on classifying the BigData attributes into the "5Ws" dimensions based on different data behaviours. Our approach not only classifies BigData attributes for different data forms across multiple datasets, but also establishes the "5Ws" densities to represent the characteristics of data flow patterns. We use additional non-dimensional parallel axes in parallel coordinates to display the "5Ws" sending and receiving densities, which provide more analytic features for BigData analysis. The experiment shows that our approach with parallel coordinate visualization can be efficiently used for BigData analysis and visualization.
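
As a rough illustration of the visualization idea in this abstract, the toy Python script below plots hypothetical "5Ws" attribute scores on parallel coordinates with two extra non-dimensional axes for sending and receiving densities. The column names and values are invented for illustration; this is a sketch of the general technique, not the authors' implementation.

```python
# Illustrative sketch only: plots hypothetical "5Ws" attribute scores per data
# source on parallel coordinates, with two extra axes for sending/receiving
# densities, loosely mirroring the idea described in the abstract above.
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

# Hypothetical records: one row per dataset, scored on the 5Ws plus densities.
df = pd.DataFrame({
    "source":        ["social", "sensor", "ecommerce"],
    "when": [0.8, 0.9, 0.4], "where": [0.3, 0.7, 0.5],
    "what": [0.9, 0.2, 0.8], "who":   [0.7, 0.1, 0.9],
    "why":  [0.5, 0.3, 0.6],
    "send_density": [0.6, 0.8, 0.2],   # extra non-dimensional axes
    "recv_density": [0.4, 0.9, 0.3],
})

parallel_coordinates(df, class_column="source", colormap="viridis")
plt.title("5Ws attributes with sending/receiving density axes (toy data)")
plt.tight_layout()
plt.show()
```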
APA, Harvard, Vancouver, ISO, and other styles
2

Fromm, Davida, and Brian MacWhinney. "AphasiaBank as BigData." Seminars in Speech and Language 37, no. 01 (February 16, 2016): 010–22. http://dx.doi.org/10.1055/s-0036-1571357.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Ибрагимов, И. Р., and М. С. У. Халиев. "Большие данные и их структура" [Big data and their structure]. ТЕНДЕНЦИИ РАЗВИТИЯ НАУКИ И ОБРАЗОВАНИЯ [Trends in the Development of Science and Education] 92, no. 10 (2022): 87–89. http://dx.doi.org/10.18411/trnio-12-2022-486.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Govindaraju, G. N., et al. "Big Data Analytics Performance Enhancement For Covid-19 Data Using Machine Learning And Cloud." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 10 (April 28, 2021): 5608–14. http://dx.doi.org/10.17762/turcomat.v12i10.5371.

Full text
Abstract:
The exponential rise in software computing, internet and web services has broadened the horizon for BigData, which demands robust and highly efficient analytics systems to provide timely and accurate distributed data support. Distributed frameworks with parallelized computing have been the key driving force behind contemporary BigData analytics systems; however, the lack of optimal data pre-processing, feature-sensitive computation and, more importantly, feature learning makes most available solutions inferior, especially in terms of time and accuracy. Unlike most existing methods employing machine learning for BigData analytics, this paper places its key emphasis on improved pre-processing, low-dimensional semantic feature extraction and lightweight, improved machine-learning-based feature learning for BigData analytics. Notably, the proposed model hypothesizes that an analytics solution with BigData characteristics must be able to process humongous, heterogeneous, unstructured and multi-dimensional features to yield time-efficient and accurate analytical outputs. To this end, we propose a new and robust state-of-the-art BigData analytics model, specially designed for the Spark distributed framework. To process an analytical task, our proposed model first employs tokenization, followed by Word2Vec-based semantic feature extraction using the CBOW and N-Skip-Gram methods. Our proposed model was found to be more effective with Skip-Gram Word2Vec feature extraction. Simulation results with a publicly available COVID-19 dataset exhibited better performance than existing K-Means-based MapReduce distributed data frameworks.
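
The pipeline outlined in this abstract (tokenization followed by Word2Vec feature extraction on Spark) can be sketched with standard PySpark ML, as below. Note that Spark's bundled Word2Vec estimator implements the skip-gram variant only, and the two-row corpus is a stand-in for the COVID-19 dataset the paper uses.

```python
# A minimal sketch of the kind of pipeline the abstract describes
# (tokenization followed by Word2Vec features), using standard PySpark ML.
# Spark's bundled Word2Vec implements the skip-gram variant; the paper's
# CBOW comparison would need an external library such as gensim.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import Tokenizer, Word2Vec

spark = SparkSession.builder.appName("word2vec-sketch").getOrCreate()

# Hypothetical corpus; the paper uses a public COVID-19 dataset instead.
docs = spark.createDataFrame(
    [("patient reports mild fever",), ("vaccine trial shows efficacy",)],
    ["text"],
)

tokenizer = Tokenizer(inputCol="text", outputCol="tokens")
word2vec = Word2Vec(inputCol="tokens", outputCol="features",
                    vectorSize=50, windowSize=5, minCount=1)

model = Pipeline(stages=[tokenizer, word2vec]).fit(docs)
model.transform(docs).select("features").show(truncate=False)
```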
APA, Harvard, Vancouver, ISO, and other styles
5

Kaur, Pankaj Deep, Amneet Kaur, and Sandeep Kaur. "Performance Analysis in Bigdata." International Journal of Information Technology and Computer Science 7, no. 11 (October 8, 2015): 55–61. http://dx.doi.org/10.5815/ijitcs.2015.11.07.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Zolotov, Oleg, Yulia Romanovskaya, and Varvara Rzhannikova. "On Definition of BigData." EPJ Web of Conferences 224 (2019): 04011. http://dx.doi.org/10.1051/epjconf/201922404011.

Full text
Abstract:
The term Big Data (or BigData) is widely used in scientific, educational, and business literature; however, no single definition exists that can be unreservedly called "canonical". Careless use of the Big Data term to promote commercial software further emphasizes the importance of this issue. In this paper, we review definitions of Big Data and highlight the principal features attributed to Big Data. We compared all these principal features with the features of databases compiled from Edgar F. Codd's publications, and showed that they are not unique and can also be attributed to databases. Having studied C. Lynch's original work, we propose a definition of Big Data based on the so-called conservation institution. The key point of this definition is a shift from a purely technical attitude towards public institutions. Since the current use of the Big Data term may lead to a loss of meaning, there is a need not only to spread best practices but also to eliminate or minimize the use of dubious or misleading ones.
APA, Harvard, Vancouver, ISO, and other styles
7

Ranjan, Rajiv, Saurabh Garg, Ali Reza Khoskbar, Ellis Solaiman, Philip James, and Dimitrios Georgakopoulos. "Orchestrating BigData Analysis Workflows." IEEE Cloud Computing 4, no. 3 (2017): 20–28. http://dx.doi.org/10.1109/mcc.2017.55.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Raikhlin, Vadim A., and Roman K. Klassen. "Clusterix-Like BigData DBMS." Data Science and Engineering 5, no. 1 (February 20, 2020): 80–93. http://dx.doi.org/10.1007/s41019-020-00116-2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Ridho, Farid, and Arya Aji Kusuma. "Deteksi Intrusi Jaringan dengan K-Means Clustering pada Akses Log dengan Teknik Pengolahan Big Data" [Network Intrusion Detection with K-Means Clustering on Access Logs Using Big Data Processing Techniques]. Jurnal Aplikasi Statistika & Komputasi Statistik 10, no. 1 (August 15, 2019): 53. http://dx.doi.org/10.34123/jurnalasks.v10i1.202.

Full text
Abstract:
Network security is one of the important aspects of creating good and secure data communication. However, the continued existence of effective attacks proves that the security systems in place are not yet effective enough to prevent and detect attacks. One method that can be used to detect such attacks is an Intrusion Detection System (IDS). Large data size (volume), rapid data change (velocity), and data variation (variety) are the characteristics of Big Data. Access logs, in theory, fall into this category, so they can be processed using big data technology with Hadoop. This motivated the authors to apply a new processing method that can cope with this growth of data, namely Big Data. This research was conducted by analyzing access logs with K-Means clustering using big data processing methods. The research produced a model that can be used to detect an attack with a detection probability of 99.68%. Moreover, a comparison of the two approaches, big data processing using PySpark and traditional processing using standard Python, showed that the big data method differs significantly in the time required to execute the program.
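
A minimal PySpark sketch of the kind of clustering the study describes: numeric features derived from access-log records grouped into two clusters by KMeans. The feature names and values below are invented for illustration; the paper's feature engineering is not reproduced here.

```python
# A minimal sketch, not the paper's code: clustering numeric features derived
# from access-log records with PySpark's KMeans.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("log-kmeans-sketch").getOrCreate()

# Hypothetical per-connection features (request count, bytes, error rate).
logs = spark.createDataFrame(
    [(120, 50_000, 0.01), (3, 200, 0.0), (900, 1_000_000, 0.4)],
    ["requests", "bytes", "error_rate"],
)

features = VectorAssembler(
    inputCols=["requests", "bytes", "error_rate"], outputCol="features"
).transform(logs)

# Two clusters: roughly "normal" vs. "suspicious" traffic.
model = KMeans(k=2, seed=42, featuresCol="features").fit(features)
model.transform(features).select("requests", "prediction").show()
```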
APA, Harvard, Vancouver, ISO, and other styles
10

Chahal, Ayushi, Preeti Gulia, and Nasib Singh Gill. "Different analytical frameworks and bigdata model for internet of things." Indonesian Journal of Electrical Engineering and Computer Science 25, no. 2 (February 1, 2022): 1159. http://dx.doi.org/10.11591/ijeecs.v25.i2.pp1159-1166.

Full text
Abstract:
Sensor devices used in an internet of things (IoT) enabled environment produce a large amount of data. This data plays a major role in the bigdata landscape. In recent years, the correlation and joint implementation of bigdata and IoT have been extrapolated. Nowadays, predictive analytics is gaining the attention of many researchers for big IoT data analytics. This paper summarizes different sorts of IoT analytical platforms with built-in features for further use in machine learning, MATLAB, and data security. It emphasizes different machine learning algorithms that play an important role in big IoT data analytics. Besides different analytical frameworks, this paper highlights the proposed model for bigdata in the IoT domain and elaborates different forms of data analytical methods. The proposed model comprises different phases, i.e., data storing, data cleaning, data analytics, and data visualization. These phases cover the basic characteristics of the bigdata V's model, and the most important phase is data analytics or big IoT analytics. This model is implemented using an IoT dataset, and the results are presented in graphical and tabular form using different machine learning techniques. This study enhances researchers' knowledge about various IoT analytical platforms and the usability of these platforms in their respective problem domains.
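
The four phases named in the abstract can be illustrated with a toy pandas script; the dataset, cleaning rules and summary below are placeholders, not those of the paper.

```python
# Sketch of the four phases named in the abstract (store, clean, analyze,
# visualize) on a toy IoT reading set; the paper's dataset and models differ.
import pandas as pd
import matplotlib.pyplot as plt

# Phase 1 - data storing: load raw sensor readings (here, an inline sample).
raw = pd.DataFrame({"temp": [21.5, None, 22.1, 95.0],
                    "humidity": [40, 42, None, 41]})

# Phase 2 - data cleaning: drop impossible values, fill small gaps.
clean = raw[raw["temp"].fillna(0) < 60].interpolate()

# Phase 3 - data analytics: a simple summary statistic per column.
print(clean.describe())

# Phase 4 - data visualization.
clean.plot(title="Cleaned IoT readings (toy data)")
plt.show()
```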
APA, Harvard, Vancouver, ISO, and other styles
More sources

Dissertations / Theses on the topic "BigData"

1

Яковець, Р. І., and Ігор Віталійович Пономаренко. "Основні тенденції в BigData" [Main trends in BigData]. Thesis, КНУТД, 2016. https://er.knutd.edu.ua/handle/123456789/4083.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Vitali, Federico. "Map-Matching su Piattaforma BigData" [Map Matching on a BigData Platform]. Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/18089/.

Full text
Abstract:
In the context of movement-data analysis aimed at extracting useful information, map matching has the goal of projecting the GPS points generated by moving objects onto road segments so as to represent the objects' actual position. So far, map matching has been exploited in fields such as traffic analysis, frequent-route extraction and object-position prediction, besides being an important pre-processing step in the overall trajectory-mining process. Unfortunately, the state-of-the-art implementations of map-matching algorithms are all sequential or inefficient. This thesis therefore proposes an algorithm based on a sequential algorithm known for its accuracy and efficiency, completely reformulated in a distributed fashion so as to also achieve high scalability when used with big data. Moreover, the robustness of the algorithm, which is based on a first-order Hidden Markov Model, is improved by introducing a strategy to handle the information gaps that may arise between the assigned road segments. This problem can indeed occur with variable sampling of GPS points in urban areas with highly fragmented road segments. The implementation is based on Apache Spark and tested on a dataset of over 7.8 million GPS points in the city of Milan.
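
The core of such a first-order HMM map matcher can be sketched in a few lines of Python: Viterbi decoding over per-point candidate segments, with a Gaussian emission score for GPS-to-segment distance and a transition score penalizing detour gaps. The candidate structure and parameters below are hypothetical; the thesis implements the approach on Apache Spark.

```python
# A compact, self-contained sketch of first-order HMM map matching in the
# spirit of the approach described above (not the thesis' Spark code).
def hmm_map_match(candidates, sigma=10.0, beta=5.0):
    """candidates[t] is a list of (segment_id, gps_dist_m, gaps) where gaps
    maps each previous segment_id to the detour gap in meters (route distance
    minus straight-line distance). All inputs here are hypothetical."""
    emit = lambda d: -0.5 * (d / sigma) ** 2      # Gaussian emission (log)
    trans = lambda gap: -gap / beta               # exponential transition (log)

    scores = {seg: emit(d) for seg, d, _ in candidates[0]}
    back = [{}]
    for step in candidates[1:]:
        new_scores, back_t = {}, {}
        for seg, d, gaps in step:
            prev, best = max(((p, s + trans(gaps[p])) for p, s in scores.items()),
                             key=lambda kv: kv[1])
            new_scores[seg] = best + emit(d)
            back_t[seg] = prev
        scores = new_scores
        back.append(back_t)

    # Backtrack the most likely segment sequence.
    path = [max(scores, key=scores.get)]
    for back_t in reversed(back[1:]):
        path.append(back_t[path[-1]])
    return path[::-1]

# Toy run: two GPS fixes with two candidate road segments each.
cands = [
    [("A", 5.0, {}), ("B", 12.0, {})],
    [("A", 20.0, {"A": 0.0, "B": 30.0}), ("C", 6.0, {"A": 8.0, "B": 2.0})],
]
print(hmm_map_match(cands))  # -> ['B', 'C']
```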
APA, Harvard, Vancouver, ISO, and other styles
3

Urssi, Nelson José. "Metacidade: projeto, bigdata e urbanidade" [Metacity: design, bigdata and urbanity]. Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/16/16134/tde-01062017-154915/.

Full text
Abstract:
Information and communication technologies in all the instances of our daily life modify the way we live and think. Urban computing (ubiquitous, locative, multimedia and interconnected) generates a large amount of data, resulting in an abundance of information on almost everything in our world. Cities permeated by personal, vehicular and environmental sensors acquire sentient characteristics. A citizen-sensitive city can work with individualized day-to-day strategies. The thesis discusses the role of cities in the complexity of our lives, the interrelationship of physical equipment (hardware), symbolic models (software) and patterns of use (applications), and the design challenges of this global hybrid information ecosystem. It presents netnographic research, through case studies, urban explorations and interviews, in which our contemporary condition can be observed. We conclude with the hypothesis verified in the thesis: the city updated in real time, an urban informational ecosystem of new and infinite possibilities of interfaces and interactions.
APA, Harvard, Vancouver, ISO, and other styles
4

Hashem, Hadi. "Modélisation intégratrice du traitement BigData." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLL005/document.

Full text
Abstract:
Nowadays, multiple actors of digital technology produce very large amounts of data. Sensors, social media or e-commerce all generate information that grows in real time according to Gartner's 3 Vs: Volume, Velocity and Variety. In order to exploit this data efficiently and sustainably, it is important to respect the dynamics of its chronological evolution by means of two approaches: polymorphism, i.e., a dynamic model able to support type changes at any moment without processing failures, and support for data volatility through an intelligent model that considers only the key data interpretable at a given instant "t", instead of processing the whole current and historical data volume. The primary goal of this study is to establish, based on these approaches, an integrative vision of the data life cycle in 3 steps: (1) data synthesis, by selecting the key values of the micro-data acquired by the different operators at the source; (2) data fusion, by sorting the selected key values and duplicating them in a de-normalized fashion in order to obtain faster data processing; and (3) data transformation into a specific map-of-maps-of-maps format, via Hadoop in the standard MapReduce process, in order to obtain a graph defined in the application layer. This work is further supported by a software prototype implementing the modeling operators described above, resulting in a modeling toolbox comparable to a CASE tool that allows assisted set-up of one or more BigData processing chains.
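
The third step, reducing records into a "map of maps of maps", can be illustrated with a toy map/reduce pair in plain Python. The keys and records below are invented, and the thesis performs this transformation inside Hadoop MapReduce rather than in a single process.

```python
# A toy sketch of the thesis' third step as described above: reducing
# (key1, key2, key3, value) records into a "map of maps of maps".
from collections import defaultdict

def map_phase(record):
    # Emit a composite key of the selected key-values, plus the payload.
    source, day, metric, value = record
    return ((source, day, metric), value)

def reduce_phase(pairs):
    nested = defaultdict(lambda: defaultdict(dict))  # map of maps of maps
    for (source, day, metric), value in pairs:
        nested[source][day][metric] = value
    return nested

records = [
    ("sensor", "2016-01-01", "temp", 21.5),
    ("sensor", "2016-01-01", "humidity", 40),
    ("shop",   "2016-01-02", "orders", 17),
]
result = reduce_phase(map(map_phase, records))
print(result["sensor"]["2016-01-01"]["temp"])  # 21.5
```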
APA, Harvard, Vancouver, ISO, and other styles
5

Hashem, Hadi. "Modélisation intégratrice du traitement BigData." Electronic Thesis or Diss., Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLL005.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Оверчук, Олексій Сергійович. "Методи кодування інформаційних потоків BigData фінансового ринку" [Methods for encoding BigData information flows of the financial market]. Master's thesis, КПІ ім. Ігоря Сікорського, 2019. https://ela.kpi.ua/handle/123456789/32122.

Full text
Abstract:
Master's thesis: 100 pages, 17 figures, 14 tables, 3 appendices, 20 sources. The object of study is methods for encoding BigData information flows of financial markets. The purpose of the work is to investigate encoding methods based on modern data-compression algorithms and to improve the reliability of data storage using methods of system diagnostics. The research methods are statistical encoding methods and the use of diagnostic graphs. The novelty of the work lies in the use of multi-compressor data-compression methods and the structural decomposition of Big Data based on the application of diagnostic graphs. The thesis analyzes modern encoding methods based on data-compression algorithms and develops a general approach based on multi-compressor data compression; the basic relations for evaluating regular Big Data structures are obtained using methods of system diagnostics. The results of the master's thesis have been published in two publications and were used in carrying out the research project MMSA-1/2018. Further work is recommended on extending the encoding methods and on investigating other ways of improving the reliability of information flows.
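
A minimal sketch of the multi-compressor idea, under the assumption that it amounts to trying several standard compressors per data block and keeping the smallest output; the abstract does not spell out the thesis' exact scheme.

```python
# Assumption-laden sketch: per-block "multi-compressor" selection using the
# Python standard library, not the thesis' actual encoding methods.
import bz2, lzma, zlib

CODECS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}

def best_compress(block: bytes):
    """Return (codec_name, compressed_bytes) with the smallest output."""
    outputs = {name: fn(block) for name, fn in CODECS.items()}
    name = min(outputs, key=lambda n: len(outputs[n]))
    return name, outputs[name]

tick_data = b"price=101.5;price=101.5;price=101.6;" * 1000  # toy market feed
codec, packed = best_compress(tick_data)
print(codec, len(tick_data), "->", len(packed))
```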
APA, Harvard, Vancouver, ISO, and other styles
7

Прасол, І. Г. "Застосування технологій обробки великих даних (BigData) в маркетингу" [Application of big data (BigData) processing technologies in marketing]. Thesis, Київський національний університет технологій та дизайну, 2017. https://er.knutd.edu.ua/handle/123456789/10404.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Díaz Huiza, César, and César Quezada Balcázar. "Charla sobre aplicaciones de Bigdata en el mercado" [Talk on BigData applications in the market]. Universidad Peruana de Ciencias Aplicadas (UPC), 2019. http://hdl.handle.net/10757/627937.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Gault, Sylvain. "Improving MapReduce Performance on Clusters." Thesis, Lyon, École normale supérieure, 2015. http://www.theses.fr/2015ENSL0985/document.

Full text
Abstract:
Nowadays, more and more scientific fields rely on the analysis and mining of huge amounts of data to produce new results. These raw data are produced at an ever-increasing rate by various types of instruments, such as DNA sequencers in biology, the Large Hadron Collider (LHC), which produced 25 petabytes per year as of 2012, or large telescopes such as the Large Synoptic Survey Telescope (LSST), which should produce 30 petabytes per night. High-resolution scanners in medical imaging and social-network analysis also produce huge amounts of data. This data deluge raises several challenges in terms of storage and computer processing. In 2004 Google proposed using the MapReduce model in order to distribute the computation across several machines. This thesis focuses mainly on improving the performance of a MapReduce environment. In order to easily replace the software building blocks needed to improve performance, a modular and adaptable design of the MapReduce environment is necessary, so a component-based approach is studied for designing such a programming environment. In order to study the performance of a MapReduce application, it is necessary to model the platform, the application and their performance. These models should be precise enough for the algorithms using them to produce meaningful results, but also simple enough to be analyzed. A state of the art of existing models is given and a new model matching the optimization needs is defined. To optimize a MapReduce environment, the first studied approach is a global optimization, which results in a computation time reduced by up to 47%. The second approach focuses on the shuffle phase of MapReduce, where every node may send data to every other node. Several algorithms are defined and studied for the case where the network is the bottleneck of the data transfers. These algorithms are tested on the Grid'5000 experimental platform and usually show behavior close to the lower bound, while the naive approach is far from it.
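
One classic way to organize the all-to-all shuffle so that no node is oversubscribed is to schedule transfers in rounds, as in the toy Python sketch below. This mirrors the kind of bottleneck-aware scheduling the thesis studies, not its exact algorithms.

```python
# Toy illustration: in round r, node i sends only to node (i + r) mod n, so
# every node sends and receives at most one flow per round and the network
# link of each node is never oversubscribed.
def shuffle_rounds(n):
    rounds = []
    for r in range(1, n):            # skip r=0 (a node sending to itself)
        rounds.append([(i, (i + r) % n) for i in range(n)])
    return rounds

for r, transfers in enumerate(shuffle_rounds(4), start=1):
    print(f"round {r}: {transfers}")
# round 1: [(0, 1), (1, 2), (2, 3), (3, 0)] ...
```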
APA, Harvard, Vancouver, ISO, and other styles
10

Melkes, Miloslav. "BigData řešení pro zpracování rozsáhlých dat ze síťových toků" [A BigData solution for processing large-scale network flow data]. Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2014. http://www.nusl.cz/ntk/nusl-236039.

Full text
Abstract:
This master's thesis focuses on distributed processing of big data from network communication. It begins by exploring network communication based on the TCP/IP model, with a focus on the data units at each layer that must be processed during analysis. For the actual processing of big data, the MapReduce programming model and the architecture of the Apache Hadoop technology are described, together with their use for processing network flows on a computer cluster. The second part of the thesis deals with the design and subsequent implementation of an application for processing network flows from network communication. The main and problematic parts of the implementation are discussed. The thesis ends with a comparison with available network-analysis applications and an evaluation of a set of tests that confirmed linear growth of the speedup.
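
A MapReduce-style aggregation over flow records, of the kind the thesis runs on Hadoop, can be sketched in Python with the mrjob library. The "src,dst,bytes" CSV layout is a hypothetical stand-in; the thesis' own implementation differs.

```python
# Sketch only: sums transferred bytes per source address over flow records
# given as "src,dst,bytes" CSV lines (an assumed, simplified format).
from mrjob.job import MRJob

class BytesBySource(MRJob):
    def mapper(self, _, line):
        src, dst, nbytes = line.split(",")
        yield src, int(nbytes)          # emit (source address, bytes)

    def reducer(self, src, byte_counts):
        yield src, sum(byte_counts)     # total bytes sent by each source

if __name__ == "__main__":
    BytesBySource.run()                 # e.g.: python flows.py flows.csv
```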
APA, Harvard, Vancouver, ISO, and other styles
More sources

Books on the topic "BigData"

1

Wei, Jinpeng, and Liang-Jie Zhang, eds. Big Data – BigData 2021. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-96282-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Chin, Francis Y. L., C. L. Philip Chen, Latifur Khan, Kisung Lee, and Liang-Jie Zhang, eds. Big Data – BigData 2018. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-94301-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Nepal, Surya, Wenqi Cao, Aziz Nasridinov, MD Zakirul Alam Bhuiyan, Xuan Guo, and Liang-Jie Zhang, eds. Big Data – BigData 2020. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59612-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Chen, Keke, Sangeetha Seshadri, and Liang-Jie Zhang, eds. Big Data – BigData 2019. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-23551-2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Hu, Bo, Yunni Xia, Yiwen Zhang, and Liang-Jie Zhang, eds. Big Data – BigData 2022. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-23501-6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Zhang, Shunli, Bo Hu, and Liang-Jie Zhang, eds. Big Data – BigData 2023. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-44725-9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Qiu, Daowen, Yusheng Jiao, and William Yeoh, eds. Proceedings of the 2022 International Conference on Bigdata Blockchain and Economy Management (ICBBEM 2022). Dordrecht: Atlantis Press International BV, 2023. http://dx.doi.org/10.2991/978-94-6463-030-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Kozli︠a︡kov, V. V. Polikarp Nikitich Bigdash-Bogdashev: Zhiznʹ, tvorcheskai︠a︡ dei︠a︡telʹnostʹ [Polikarp Nikitich Bigdash-Bogdashev: life, creative activity]. Tambov: Tambovskiĭ gos. universitet, 2001.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
9

Mezhdunarodnai︠a︡ nauchno-prakticheskai︠a︡ konferent︠s︡ii︠a︡, posvi︠a︡shchennai︠a︡ 125-letii︠u︡ so dni︠a︡ rozhdenii︠a︡ P.N. Bigdash-Bogdasheva. Narodno-pevcheskai︠a︡ kulʹtura: Regionalʹnye tradit︠s︡ii, problemy izuchenii︠a︡, puti razvitii︠a︡ [Folk singing culture: regional traditions, problems of study, paths of development]: materialy mezhdunarodnoĭ nauchno-prakticheskoĭ konferent︠s︡ii, posvi︠a︡shchennoĭ 125-letii︠u︡ so dni︠a︡ rozhdenii︠a︡ P.N. Bigdash-Bogdasheva, 12-14 marta 2002 goda, g. Tambov. Tambov: Tambovskiĭ gos. universitet, 2002.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
10

Herand, Deniz. Adım Adım Bigdata ve Uygulamaları [BigData and Its Applications Step by Step]. Pusula Yayıncılık, 2020.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Book chapters on the topic "BigData"

1

Deinum, Marten, Josh Long, Gary Mak, and Daniel Rubio. "NoSQL and BigData." In Spring Recipes, 549–90. Berkeley, CA: Apress, 2014. http://dx.doi.org/10.1007/978-1-4302-5909-1_13.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Chen, Li M. "Images, Videos, and BigData." In Mathematical Problems in Data Science, 75–100. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-25127-1_5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Krithika, D. R., and K. Rohini. "Blockchain with Bigdata Analytics." In Intelligent Computing and Innovation on Data Science, 403–9. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-3284-9_46.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Levner, Eugene, Boris Kriheli, Arriel Benis, Alexander Ptuskin, Amir Elalouf, Sharon Hovav, and Shai Ashkenazi. "Entropy-Based Approach to Efficient Cleaning of Big Data in Hierarchical Databases." In Big Data – BigData 2020, 3–12. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59612-5_1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Carvalho, Andre Luis Costa, Darine Ameyed, and Mohamed Cheriet. "Ensemble Learning for Heterogeneous Missing Data Imputation." In Big Data – BigData 2020, 127–43. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59612-5_10.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Supakkul, Sam, Robert Ahn, Ronaldo Gonçalves Junior, Diana Villarreal, Liping Zhao, Tom Hill, and Lawrence Chung. "Validating Goal-Oriented Hypotheses of Business Problems Using Machine Learning: An Exploratory Study of Customer Churn." In Big Data – BigData 2020, 144–58. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59612-5_11.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Wang, Nan, Yong Liu, Peiyao Han, Xiaokun Li, and Jinbao Li. "The Collaborative Influence of Multiple Interactions on Successive POI Recommendation." In Big Data – BigData 2020, 159–74. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59612-5_12.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Isobe, Takashi, and Yoshihiro Okada. "Chemical XAI to Discover Probable Compounds’ Spaces Based on Mixture of Multiple Mutated Exemplars and Bioassay Existence Ratio." In Big Data – BigData 2020, 177–89. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59612-5_13.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Perez-Arriaga, Martha O., and Krishna Ashok Poddar. "Clinical Trials Data Management in the Big Data Era." In Big Data – BigData 2020, 190–205. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59612-5_14.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Zhou, Jonathan, Baldwin Chen, and Nianjun Zhou. "Cross-Cancer Genome Analysis on Cancer Classification Using Both Unsupervised and Supervised Approaches." In Big Data – BigData 2020, 206–19. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59612-5_15.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "BigData"

1

Malhotra, Shweta, M. N. Doja, Bashir Alam, and Mansaf Alam. "Bigdata analysis and comparison of bigdata analytic approches." In 2017 International Conference on Computing, Communication and Automation (ICCCA). IEEE, 2017. http://dx.doi.org/10.1109/ccaa.2017.8229821.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Sarma, Somina Venkata Surya Brahma Linga. "Scalability and Operational Metrics of various BigData Analytics Engines Bigdata Analytics." In Annual International Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015). Global Science and Technology Forum (GSTF), 2015. http://dx.doi.org/10.5176/2382-5669_ict-bdcs15.27.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

"Bigdata in Mobile Networks." In Sept. 17-19, 2018 Paris (France). Excellence in Research & Innovation, 2018. http://dx.doi.org/10.17758/eirai4.f0918115.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

"BigData 2023 Committee Member." In 2023 IEEE International Conference on Big Data (BigData). IEEE, 2023. http://dx.doi.org/10.1109/bigdata59044.2023.10386716.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

"BigData 2023 Author Index." In 2023 IEEE International Conference on Big Data (BigData). IEEE, 2023. http://dx.doi.org/10.1109/bigdata59044.2023.10386465.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

"BigData 2023 Cover Page." In 2023 IEEE International Conference on Big Data (BigData). IEEE, 2023. http://dx.doi.org/10.1109/bigdata59044.2023.10386506.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

"BigData 2023 Cover Page." In 2023 IEEE International Conference on Big Data (BigData). IEEE, 2023. http://dx.doi.org/10.1109/bigdata59044.2023.10386176.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

"BigData 2023 Program Committee." In 2023 IEEE International Conference on Big Data (BigData). IEEE, 2023. http://dx.doi.org/10.1109/bigdata59044.2023.10386815.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Milutinovic, Veljko. "DataFlow SuperComputing for BigData." In 2016 5th Mediterranean Conference on Embedded Computing (MECO). IEEE, 2016. http://dx.doi.org/10.1109/meco.2016.7525678.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Milutinovic, Veljko. "DataFlow SuperComputing for bigdata." In 2018 7th Mediterranean Conference on Embedded Computing (MECO). IEEE, 2018. http://dx.doi.org/10.1109/meco.2018.8405950.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "BigData"

1

López Cantos, F. Communication research using BigData methodology. Revista Latina de Comunicación Social, December 2015. http://dx.doi.org/10.4185/rlcs-2015-1076en.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

López Cantos, F. La investigación en comunicación con metodología BigData [Communication research using BigData methodology]. Revista Latina de Comunicación Social, December 2015. http://dx.doi.org/10.4185/rlcs-2015-1076.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Wu, Wenji. BigData Express: Toward Predictable, Schedulable, and High-Performance Data Transfer. Office of Scientific and Technical Information (OSTI), May 2018. http://dx.doi.org/10.2172/1460784.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Wu, Wenji, Liang Zhang, Qiming Lu, and Phil DeMar. BigData Express: Toward Predictable, Schedulable, and High-Performance Data Transfer. Office of Scientific and Technical Information (OSTI), April 2020. http://dx.doi.org/10.2172/1623355.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Wu, Wenji. BigData Express: Toward Predictable, Schedulable, and High-Performance Data Transfer. Office of Scientific and Technical Information (OSTI), September 2019. http://dx.doi.org/10.2172/1565933.

Full text
APA, Harvard, Vancouver, ISO, and other styles
