
Journal articles on the topic 'Hadoop framework'


Consult the top 50 journal articles for your research on the topic 'Hadoop framework.'


1

Sudirman, Ahmad, Irawan, and Zawiyah Saharuna. "PENGEMBANGAN SISTEM BIG DATA: RANCANG BANGUN INFRASTRUKTUR DENGAN FRAMEWORK HADOOP." Journal of Informatics and Computer Engineering Research 1, no. 1 (2024): 25–32. https://doi.org/10.31963/jicer.v1i1.4919.

Abstract:
Hadoop is a distributed data storage platform that provides a parallel processing framework. HDFS (Hadoop Distributed File System) and MapReduce are its two core components: HDFS is a distributed storage system built in Java, while MapReduce is a programming model for processing large data in parallel across a distributed cluster. The research focused on testing data transfer speed on HDFS, using three types of data (video, ISO image, and text) at total sizes of 512 MB, 1 GB, and 2 GB. Testing was carried out by loading the data into HDFS using the Hadoop command and changing the size
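The transfer-speed testing described in this abstract reduces to a simple throughput calculation over the three file sizes. A minimal sketch, with purely illustrative timings (the paper's actual measurements are not reproduced here):

```python
# Hypothetical sketch of HDFS write-throughput calculation for the kind of
# test described above (512 MB, 1 GB, and 2 GB files). The elapsed times
# below are illustrative placeholders, not results from the paper.

def throughput_mb_s(size_mb: float, seconds: float) -> float:
    """Throughput in MB/s for one transfer."""
    return size_mb / seconds

# Illustrative measurements: {file size in MB: elapsed seconds}
measurements = {512: 12.8, 1024: 25.6, 2048: 51.2}

for size_mb, secs in sorted(measurements.items()):
    print(f"{size_mb} MB in {secs} s -> {throughput_mb_s(size_mb, secs):.1f} MB/s")
```

In a real test the files would be loaded with a command such as `hdfs dfs -put <local-file> <hdfs-path>` and timed externally.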
2

Li, Xin Liang, and Jian De Zheng. "Improvement of Hadoop Security Mechanism." Applied Mechanics and Materials 484-485 (January 2014): 912–15. http://dx.doi.org/10.4028/www.scientific.net/amm.484-485.912.

Abstract:
Hadoop, as an open-source cloud computing framework, is increasingly applied in many fields, while the weakness of its security mechanism has become one of the main problems hindering its development. This paper first analyzes the current security mechanisms of Hadoop; then, through study of Hadoop's security mechanism and analysis of the security risks in its current version, it proposes a corresponding solution based on secure multicast to resolve those risks. This could provide technical support for enterprises applying Hadoop with new security needs.
3

Husain, Baydaa Hassan, and Subhi R. M. Zeebaree. "Improvised Distributions framework of Hadoop: A review." International Journal of Science and Business 5, no. 2 (2021): 31–41. https://doi.org/10.5281/zenodo.4461761.

Abstract:
HADOOP is an open-source technology that allows the distributed processing of large data sets across clusters of standardized servers. With its two modules, the HADOOP Distributed File System (HDFS) and the MapReduce framework, it is designed to scale from single servers to thousands of machines, each providing local computation and storage. More than a decade has passed since HADOOP emerged at the forefront as an open system for Big Data analysis. Its growth has prompted several improvisations for particular data processing needs, based on the type of processing conditions at various periods of computation. This paper
4

Giri, Pratit Raj, and Gajendra Sharma. "Apache Hadoop Architecture, Applications, and Hadoop Distributed File System." Semiconductor Science and Information Devices 4, no. 1 (2022): 14. http://dx.doi.org/10.30564/ssid.v4i1.4619.

Abstract:
Data and internet usage are growing rapidly, which causes problems in the management of big data. For these kinds of problems, many software frameworks are used to increase the performance of distributed systems and to provide availability of large-scale data storage. One of the most beneficial software frameworks for utilizing data in distributed systems is Hadoop. This paper introduces the Apache Hadoop architecture, the components of Hadoop, and their significance in managing vast volumes of data in a distributed system. The Hadoop Distributed File System enables the storage of enormous ch
5

Sahu, Kapil, Kaveri Bhatt, Amit Saxena, and Kaptan Singh. "Implementation of Big-Data Applications Using Map Reduce Framework." International Journal of Engineering and Computer Science 9, no. 08 (2020): 25125–31. http://dx.doi.org/10.18535/ijecs/v9i08.4504.

Abstract:
As a result of the rapid development of cloud computing, it is fundamental to investigate the performance of different Hadoop MapReduce applications and to understand the performance bottlenecks in a cloud cluster that contribute to higher or lower performance. It is equally important to analyze the underlying hardware in cloud cluster servers to permit the optimization of software and hardware to achieve the highest performance feasible. Hadoop is founded on MapReduce, which is among the most popular programming models for big data analysis in a parallel computing environment
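The MapReduce programming model that this and several other abstracts describe can be sketched in plain Python. This is a single-process illustration of the map/shuffle/reduce pattern (word count, the canonical example), not Hadoop code; in Hadoop the map and reduce phases would run in parallel across cluster nodes.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line: str):
    # Emit (word, 1) pairs, as a Hadoop mapper would.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Group values by key, as the Hadoop shuffle/sort stage does.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the counts for each word, as a Hadoop reducer would.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big cluster", "big data"]
result = reduce_phase(shuffle(chain.from_iterable(map_phase(l) for l in lines)))
print(result)  # {'big': 3, 'data': 2, 'cluster': 1}
```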
6

Tripathi, A. K., S. Agrawal, and R. D. Gupta. "A COMPARATIVE ANALYSIS OF CONVENTIONAL HADOOP WITH PROPOSED CLOUD ENABLED HADOOP FRAMEWORK FOR SPATIAL BIG DATA PROCESSING." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-5 (November 15, 2018): 425–30. http://dx.doi.org/10.5194/isprs-annals-iv-5-425-2018.

Abstract:
The emergence of new tools and technologies to gather information generates the problem of processing spatial big data, whose solution requires new research, techniques, innovation and development. Spatial big data is characterized by the five V's: volume, velocity, veracity, variety and value. Hadoop is the most widely used framework that addresses these problems, but it requires high performance computing resources to store and process such huge data. The emergence of cloud computing has provided, on demand, elastic, scalable and pa
7

Tyagi, Adhishtha, and Sonia Sharma. "A Framework of Security and Performance Enhancement for Hadoop." International Journal of Advanced Research in Computer Science and Software Engineering 7, no. 7 (2017): 437. http://dx.doi.org/10.23956/ijarcsse/v7i6/0171.

Abstract:
The Hadoop framework has emerged as the most effective and widely adopted framework for Big Data processing. The MapReduce programming model is used for processing as well as generating large data sets. Data security has become an important issue as far as storage is concerned. By default there is no security mechanism in Hadoop, yet it is the first choice of business analysts and industrialists for storing and managing data, so there is a need to introduce security solutions to Hadoop in order to secure important data in the Hadoop environment. We implemented and evaluated Dynamic Task Spli
8

Putta, Srinivasa Rao. "COMPARATIVE ANALYSIS OF BIG DATA USING HADOOP FRAMEWORK." GLOBAL JOURNAL OF ENGINEERING SCIENCE AND RESEARCHES 6, no. 4 (2019): 477–80. https://doi.org/10.5281/zenodo.2657666.

Abstract:
Big data is an important concept in the field of information technology, where companies and organizations take advantage of the data they have stored to find meaningful patterns and predictions that help them make informed decisions. Big data analysis involves the use of advanced tools and techniques for processing the large volumes of data produced by an organization (Sammer, 2012, p. 23). The Hadoop framework enables easy and efficient processing of large data sets, making it one of the most popular big data analytics fr
9

Ahmed, Eman, Amin A. Sorrour, Mohamed A. Sobh, and Ayman M. Bahaa-Eldin. "A Cloud-based Malware Detection Framework." International Journal of Interactive Mobile Technologies (iJIM) 11, no. 2 (2017): 113. http://dx.doi.org/10.3991/ijim.v11i2.6577.

Abstract:
Malware is increasing rapidly. The nature of distribution and the effects of malware attacking several applications require a real-time response; therefore, a high performance detection platform is required. In this paper, Hadoop is utilized to perform static binary search and detection of malware and viruses in portable executable files deployed mainly on the cloud. The paper presents an approach used to map portable executable files to Hadoop compatible files. The Boyer–Moore–Horspool search algorithm is modified to benefit from the distribution of Ha
10

Revathy, P., and Rajeswari Mukesh. "HadoopSec 2.0: Prescriptive analytics-based multi-model sensitivity-aware constraints centric block placement strategy for Hadoop." Journal of Intelligent & Fuzzy Systems 39, no. 6 (2020): 8477–86. http://dx.doi.org/10.3233/jifs-189165.

Abstract:
Like many open-source technologies such as UNIX or TCP/IP, Hadoop was not created with security in mind. Hadoop nevertheless evolved from those earlier tools and became widely adopted across large enterprises. Some of Hadoop's architectural features present it with unique security issues. Given this security vulnerability and the potential invasion of confidentiality by malicious attackers or internal users, organizations face challenges in implementing a strong security framework for Hadoop. Furthermore, the method in which data is placed in a Hadoop cluster adds to the only growing
11

Researcher. "OPTIMIZING HADOOP CLUSTER PERFORMANCE: A COMPREHENSIVE FRAMEWORK FOR BIG DATA EFFICIENCY." International Journal of Research In Computer Applications and Information Technology (IJRCAIT) 7, no. 2 (2024): 549–64. https://doi.org/10.5281/zenodo.14009726.

Abstract:
Optimizing Hadoop performance is essential for maximizing the value of Big Data initiatives. This article provides practical tips and best practices for enhancing the efficiency of Hadoop clusters. Covering key areas such as resource allocation, job scheduling, and data processing optimization, the article draws on real-world experience to offer actionable advice. It discusses techniques for fine-tuning system configurations, leveraging advanced features, and addressing common performance bottlenecks. The article concludes with case studies demonstrating successful Hadoop optimization in large
12

Wei, Chih-Chiang, and Tzu-Hao Chou. "Typhoon Quantitative Rainfall Prediction from Big Data Analytics by Using the Apache Hadoop Spark Parallel Computing Framework." Atmosphere 11, no. 8 (2020): 870. http://dx.doi.org/10.3390/atmos11080870.

Abstract:
Situated in the main tracks of typhoons in the Northwestern Pacific Ocean, Taiwan frequently encounters disasters from heavy rainfall during typhoons. Accurate and timely typhoon rainfall prediction is an imperative topic that must be addressed. The purpose of this study was to develop a Hadoop Spark distributed framework based on big-data technology, to accelerate the computation of typhoon rainfall prediction models. This study used deep neural networks (DNNs) and multiple linear regressions (MLRs) in machine learning to establish rainfall prediction models and evaluate rainfall prediction a
13

Deng, Zhong Hua, Bing Fan, Ying Jun Lu, and Zhi Fang Li. "Discussion about Big Data Mining Based on Hadoop." Applied Mechanics and Materials 380-384 (August 2013): 2063–66. http://dx.doi.org/10.4028/www.scientific.net/amm.380-384.2063.

Abstract:
As a Cloud computing platform, Hadoop has huge advantages in Data mining. The main aspects of Hadoop for data mining are discussed. A technical framework for big data mining based on Hadoop is analyzed.
14

Zhang, Wei Feng, and Tin Wang. "Hadoop: Analysis of Cloud Computing Infrastructure." Applied Mechanics and Materials 475-476 (December 2013): 1201–6. http://dx.doi.org/10.4028/www.scientific.net/amm.475-476.1201.

Abstract:
As a kind of cloud computing infrastructure, Hadoop has attracted attention from a group of corporations and has been widely used. The framework and characteristics of Hadoop are introduced and analyzed in detail, and the future application opportunities of Hadoop in the field of communication are surveyed in this paper. An application based on Hadoop was designed, and experimental results demonstrate the efficiency of Hadoop in dealing with big datasets.
15

Abhishek, Kumar, Manish Kumar Verma, Kumar Shivam, Vinit Kumar, and Adarsh Mohan. "Integrated Hadoop Cloud Framework (IHCF)." Indian Journal of Science and Technology 10, no. 10 (2017): 1–8. http://dx.doi.org/10.17485/ijst/2017/v10i10/107943.

16

Shah, Nathar, and Christopher Messom. "An Expressive Hadoop MapReduce Framework." Advanced Science Letters 23, no. 11 (2017): 11197–201. http://dx.doi.org/10.1166/asl.2017.10250.

17

Mpungu, Cephas, Carlisle George, and Glenford Mapp. "Digital Forensics Readiness in Big Data Networks: A Novel Framework and Incident Response Script for Linux–Hadoop Environments." Applied System Innovation 7, no. 5 (2024): 90. http://dx.doi.org/10.3390/asi7050090.

Abstract:
The surge in big data and analytics has catalysed the proliferation of cybercrime, largely driven by organisations’ intensified focus on gathering and processing personal data for profit while often overlooking security considerations. Hadoop and its derivatives are prominent platforms for managing big data; however, investigating security incidents within Hadoop environments poses intricate challenges due to scale, distribution, data diversity, replication, component complexity, and dynamicity. This paper proposes a big data digital forensics readiness framework and an incident response scrip
18

Lee, Kyong-Ha, Woo Lam Kang, and Young-Kyoon Suh. "Improving I/O Efficiency in Hadoop-Based Massive Data Analysis Programs." Scientific Programming 2018 (December 2, 2018): 1–9. http://dx.doi.org/10.1155/2018/2682085.

Abstract:
Apache Hadoop has been a popular parallel processing tool in the era of big data. While practitioners have rewritten many conventional analysis algorithms to make them customized to Hadoop, the issue of inefficient I/O in Hadoop-based programs has been repeatedly reported in the literature. In this article, we address the problem of the I/O inefficiency in Hadoop-based massive data analysis by introducing our efficient modification of Hadoop. We first incorporate a columnar data layout into the conventional Hadoop framework, without any modification of the Hadoop internals. We also provide Had
19

Bhatia, Raj Kumari, and Aakriti Bansal. "Deploying and Improving Hadoop on PseudoDistributed Mode." COMPUSOFT: An International Journal of Advanced Computer Technology 03, no. 10 (2014): 1136–39. https://doi.org/10.5281/zenodo.14759333.

Abstract:
Hadoop is an open source framework comprising MapReduce and HDFS (Hadoop Distributed File System), which allows large-scale data storage and computing. Hadoop is expanding every day, and many cloud computing enterprises have been adopting it; it provides a platform for offering cloud computing services to customers. Hadoop can be run in any of three modes: Standalone, Pseudo-Distributed and Fully-Distributed. In this paper we have improved execution time by configuring different schedulers. We implemented our method over Hadoop 2.2.0 in Pseudo-Distributed mode, which
20

M, Hena, and N. Jeyanthi. "A Survey on Hadoop Security and Comparative Analysis on Authentication Frameworks in Hadoop Clusters." ECS Transactions 107, no. 1 (2022): 259–68. http://dx.doi.org/10.1149/10701.0259ecst.

Abstract:
This paper presents a survey of various security frameworks that aim to secure Hadoop clusters and the files stored in them. Most Hadoop platforms rely on the Kerberos authentication protocol for user authentication, yet Kerberos itself has vulnerabilities such as password guessing attacks, a single point of failure, insider attacks, and time synchronization problems. A new authentication framework that addresses most of the identified issues and is more computationally feasible than other schemes is introduced in this paper. The proposed framework is based on Secure Remote Passwor
21

Wibawa, Condro, Setia Wirawan, Metty Mustikasari, and Dessy Tri Anggraeni. "KOMPARASI KECEPATAN HADOOP MAPREDUCE DAN APACHE SPARK DALAM MENGOLAH DATA TEKS." Jurnal Ilmiah Matrik 24, no. 1 (2022): 10–20. http://dx.doi.org/10.33557/jurnalmatrik.v24i1.1649.

Abstract:
The term Big Data is no longer new. One characteristic of Big Data is its massive volume, which makes the data impossible to process by traditional means. To solve this problem, the MapReduce method was developed. MapReduce is a data processing method that splits data into small parts (mapping) and then combines the results back into one (reducing). The most widely used MapReduce frameworks are Hadoop MapReduce and Apache Spark. The concepts of the two frameworks are the same, but they differ in how they manage data sources. Hadoop MapReduce uses
22

Sirisha, N., et al. "Integrated Security and Privacy Framework for Big Data in Hadoop MapReduce Framework." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 11 (2021): 646–62. http://dx.doi.org/10.17762/turcomat.v12i11.5941.

Abstract:
Public cloud infrastructure is widely used by enterprises to store and process big data. Cloud computing and its distributed nature not only provide a scalable, available and affordable solution for storage and compute services but also raise security concerns. Many existing security solutions encrypt data and allow access to plaintext for data analytics only within the confines of secure hardware. However, the fact remains that large volumes of data are processed in a distributed environment involving hundreds of commodity machines, and numerous communications occur between machi
23

Bhathal, Gurjit Singh, and Amardeep Singh Dhiman. "Big Data Security Challenges and Solution of Distributed Computing in Hadoop Environment: A Security Framework." Recent Advances in Computer Science and Communications 13, no. 4 (2020): 790–97. http://dx.doi.org/10.2174/2213275912666190822095422.

Abstract:
Background: In the current internet scenario, large amounts of data are generated and processed. The Hadoop framework is widely used to store and process big data in a highly distributed manner, yet it is argued that the framework is not mature enough to deal with current cyberattacks on the data. Objective: The main objective of the proposed work is to provide a complete security approach comprising authorisation and authentication for users and Hadoop cluster nodes, and to secure the data at rest as well as in transit. Methods: The proposed algorithm uses Kerberos network authenticati
24

Chen, Feng Ping, Li Miao, and Yue Gao Tang. "Research of Hadoop Parameters Tuning Based on Function Monitoring." Applied Mechanics and Materials 621 (August 2014): 264–70. http://dx.doi.org/10.4028/www.scientific.net/amm.621.264.

Abstract:
Hadoop is a popular software framework that supports the distributed processing of large data sets. However, with Hadoop being a relatively new technology, practitioners and administrators often lack the expertise to tune it for better performance, and parameter configuration is one of the key factors influencing that performance. In this article, we present a novel Hadoop parameter tuning method based on function monitoring. The method monitors function call information during task runs to analyze why Hadoop's performance changes when parameters are tuned, which will be helpful for pract
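Parameter tuning of the kind this abstract studies is typically done through Hadoop's site configuration files. A minimal sketch of a `mapred-site.xml` fragment with a few commonly tuned Hadoop 2.x parameters; the values shown are illustrative examples only, not recommendations from the paper.

```xml
<!-- Illustrative mapred-site.xml fragment. Parameter names are from the
     Hadoop 2.x configuration; values are examples only. -->
<configuration>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>256</value> <!-- map-side sort buffer size, in MB -->
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value> <!-- container memory per map task, in MB -->
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>10</value> <!-- parallel fetch threads in the shuffle phase -->
  </property>
</configuration>
```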
25

Ghoneimy, Samy, and Samir Abou El-Seoud. "A MapReduce Framework for DNA Sequencing Data Processing." International Journal of Recent Contributions from Engineering, Science & IT (iJES) 4, no. 4 (2016): 11. http://dx.doi.org/10.3991/ijes.v4i4.6537.

Abstract:
Genomics and Next Generation Sequencers (NGS) like the Illumina HiSeq produce data in the order of 200 billion base pairs in a single one-week run for 60x human genome coverage. Processing this output requires modern high-throughput experimental technologies that can only be tackled with high performance computing (HPC) and specialized software algorithms called "short read aligners". This paper focuses on the implementation of DNA sequencing as a set of MapReduce programs that accept a DNA data set as a FASTQ file and finally generate a VCF (variant call format)
26

Chaitanya, Shrikant Kulkarni. "A Survey Paper on Satellite Image Using OpenCV Library over Hadoop Framework." Journal of Computer Science Engineering and Software Testing 4, no. 3 (2018): 12–18. https://doi.org/10.5281/zenodo.1476310.

Abstract:
In this survey paper, we study land classification from two-dimensional high-resolution satellite images using the Hadoop framework. Advanced image processing algorithms that need high system power for massive-scale inputs can be processed efficiently using the parallel and distributed processing of the Hadoop MapReduce framework. Hadoop MapReduce is a scalable model that is capable of processing petabytes of data with improved fault tolerance and data locality. In this paper we present a MapReduce framework for performing parallel rem
27

Khan, Mukhtaj, Zhengwen Huang, Maozhen Li, Gareth A. Taylor, Phillip M. Ashton, and Mushtaq Khan. "Optimizing Hadoop Performance for Big Data Analytics in Smart Grid." Mathematical Problems in Engineering 2017 (2017): 1–11. http://dx.doi.org/10.1155/2017/2198262.

Abstract:
The rapid deployment of Phasor Measurement Units (PMUs) in power systems globally is leading to Big Data challenges. New high performance computing techniques are now required to process an ever increasing volume of data from PMUs. To that extent the Hadoop framework, an open source implementation of the MapReduce computing model, is gaining momentum for Big Data analytics in smart grid applications. However, Hadoop has over 190 configuration parameters, which can have a significant impact on the performance of the Hadoop framework. This paper presents an Enhanced Parallel Detrended Fluctuatio
28

Senthilkumar, M., and P. Ilango. "A Survey on Job Scheduling in Big Data." Cybernetics and Information Technologies 16, no. 3 (2016): 35–51. http://dx.doi.org/10.1515/cait-2016-0033.

Abstract:
Big Data applications with scheduling have become an active research area in the last three years. The Hadoop framework has become one of the most popular and widely used frameworks for distributed data processing; it is also open source software that allows users to utilize hardware effectively. The various scheduling algorithms of the MapReduce model in Hadoop vary in design and behavior and are used for handling many issues such as data locality, and awareness of resources, energy and time. This paper gives an outline of job scheduling, a classification of schedulers, and a comparison of different
29

Utami, Firmania Dwi, and Femi Dwi Astuti. "Comparison of Hadoop Mapreduce and Apache Spark in Big Data Processing with Hgrid247-DE." Journal of Applied Informatics and Computing 8, no. 2 (2024): 390–99. https://doi.org/10.30871/jaic.v8i2.8557.

Abstract:
In today's rapidly evolving information technology landscape, managing and analyzing big data has become one of the most significant challenges. This paper explores the implementation of two major frameworks for big data processing: Hadoop MapReduce and Apache Spark. Both frameworks were tested in three scenarios (sorting, summarizing, and grouping) using HGrid247-DE as the primary tool for data processing. A diverse set of datasets sourced from Kaggle, ranging in size from 3 MB to 260 MB, was employed to evaluate the performance of each framework. The findings reveal that Apache Spark generally
30

Deng, Junyi, Yanheng Liu, Jian Wang, and Shujing Li. "SHIYF: A Secured and High-Integrity YARN Framework." Electronics 8, no. 5 (2019): 548. http://dx.doi.org/10.3390/electronics8050548.

Abstract:
Cloud computing is becoming a powerful parallel data processing method, and it can be adopted by many network service providers to build a service framework. Although cloud computing is able to efficiently process a large amount of data, it can be attacked easily due to its massively distributed cluster nodes. In this paper, we propose a secure and high-integrity YARN framework (SHIYF), which establishes a close relationship between speculative execution and the security of Yet Another Resource Negotiator (YARN, MapReduce 2.0). SHIYF computes and compares the MD5 hashes of the intermediate and
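The integrity idea in the SHIYF abstract, comparing MD5 hashes of intermediate results between a main task and its speculative re-execution, can be illustrated with a short sketch. The record format and function names here are hypothetical; only the hash-and-compare pattern follows the abstract.

```python
import hashlib

# Hypothetical sketch: hash a task's intermediate output and compare it with
# the digest from a speculative (duplicate) execution; a mismatch flags a
# possibly tampered or faulty node.

def md5_digest(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

def verify(main_output: bytes, speculative_output: bytes) -> bool:
    # Integrity holds only if both executions produced identical bytes.
    return md5_digest(main_output) == md5_digest(speculative_output)

original = b"key1\t42\nkey2\t7\n"     # intermediate output of the main task
speculative = b"key1\t42\nkey2\t7\n"  # output of the speculative re-execution

print(verify(original, speculative))    # True: the outputs agree
print(verify(original, b"key1\t99\n"))  # False: possible tampering
```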
31

Li, Pengcheng, Haidong Chen, Shipeng Li, Tinggui Yan, and Hang Qian. "Research on Distributed Calculation of Flight Parameters Based on Hadoop." Journal of Physics: Conference Series 2337, no. 1 (2022): 012013. http://dx.doi.org/10.1088/1742-6596/2337/1/012013.

Abstract:
With the improvement of launch vehicle technology and the increase in launch missions, the contradiction between the large-scale computation demands of launch vehicle flight parameters and the traditional standalone computation mode is increasingly prominent under intensive launch tasks, mainly reflected in slow computation speed, low processing efficiency, limited bandwidth, and single points of failure. MapReduce, the distributed computing framework of the big data processing architecture Hadoop, running on a low-cost cluster, is innovatively applied to the large-scale ca
32

Gopa, Raghu, and Sandeep Kumar. "Hadoop Ecosystem and Cloud Integration." Universal Research Reports 12, no. 1 (2025): 455–64. https://doi.org/10.36676/urr.v12.i1.1505.

Abstract:
The integration of the Hadoop ecosystem with cloud computing marks a transformative evolution in the way organizations manage and analyze large-scale data. This study examines how the union of Hadoop’s distributed storage and processing capabilities with the scalable, flexible resources of the cloud enhances data-driven decision making and operational efficiency. Hadoop, an open-source framework, is renowned for its ability to process vast volumes of structured and unstructured data across clusters of commodity hardware using components such as HDFS and MapReduce. When integrated with cloud en
33

Rahman, Md Armanur, J. Hossen, Venkataseshaiah C, et al. "A Survey of Machine Learning Techniques for Self-tuning Hadoop Performance." International Journal of Electrical and Computer Engineering (IJECE) 8, no. 3 (2018): 1854. http://dx.doi.org/10.11591/ijece.v8i3.pp1854-1862.

Abstract:
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing big data. However, getting the best performance from it is a big challenge because of its large number of configuration parameters. In this paper, the critical issues of the Hadoop system, big data and machine learning are highlighted, and an analysis of some machine learning techniques applied so far to improving Hadoop performance is presented. Then, a promising machine learning technique using a deep learning algorithm is proposed for improving Hadoop system performance.
34

Yu Dai, Yu Dai. "3D Interior Design System Model Based on Computer Virtual Reality Technology." Journal of Electrical Systems 19, no. 4 (2024): 84–101. http://dx.doi.org/10.52783/jes.625.

Abstract:
Globally, data volume increases exponentially with the proliferation of cloud computing. MapReduce has emerged as a prominent solution for handling this unprecedented growth efficiently, as it processes both structured and unstructured data. The dynamic landscape of virtual reality has seen a significant shift towards technology-driven approaches, with data analytics and personalized learning becoming increasingly important. This paper introduces an innovative framework that leverages the power of Hadoop and MapReduce to elevate 3D virtual reality experiences within diverse VR
35

Kommu, Gangadhara Rao. "Performance Evaluation of Map Reduce vs. Spark framework on Amazon Machine Image for TeraSort Algorithm." International Journal for Research in Applied Science and Engineering Technology 9, no. VI (2021): 2728–32. http://dx.doi.org/10.22214/ijraset.2021.35540.

Abstract:
TeraSort is one of Hadoop's most widely used benchmarks, and Hadoop's distribution contains both the input generator and the sorting implementation: TeraGen generates the input and TeraSort performs the sorting. We focus on comparing the TeraSort algorithm on different distributed platforms with different resource configurations, using compute time, data read, data write, and speedup as measures of efficiency. We conducted experiments using Hadoop MapReduce and Spark (Java), and we empirically evaluate the performance of the TeraSort algorithm
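The speedup measure used in comparisons like this one is simply the ratio of baseline to candidate runtime. A minimal sketch with made-up timings (the paper's actual numbers are not reproduced here):

```python
# Hypothetical sketch of the speedup metric: speedup = T_baseline / T_candidate.
# The timings below are illustrative, not results from the paper.

def speedup(t_baseline_s: float, t_candidate_s: float) -> float:
    """How many times faster the candidate ran than the baseline."""
    return t_baseline_s / t_candidate_s

# e.g. MapReduce TeraSort vs. Spark TeraSort on the same input (made-up numbers)
t_mapreduce, t_spark = 600.0, 240.0
print(f"Spark speedup over MapReduce: {speedup(t_mapreduce, t_spark):.2f}x")
```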
36

Kumari Bhatia, Raj, and Aakriti Bansal. "A Closer Looks Over Hadoop Framework." International Journal of Engineering Trends and Technology 14, no. 3 (2014): 150–52. http://dx.doi.org/10.14445/22315381/ijett-v14p229.

37

Paul, Gobinda, and S. M. Monzurur Rahman. "A Secure Framework For Hadoop Communication." International Journal of Scientific and Engineering Research 6, no. 9 (2015): 1409–16. http://dx.doi.org/10.14299/ijser.2015.09.009.

38

Hamad, Faten. "An Overview of Hadoop Scheduler Algorithms." Modern Applied Science 12, no. 8 (2018): 69. http://dx.doi.org/10.5539/mas.v12n8p69.

Abstract:
Hadoop is an open source cloud computing system used in large-scale data processing, and it has become the basic computing platform for many internet companies. With the Hadoop platform, users can develop cloud computing applications and then submit tasks to the platform. Hadoop has strong fault tolerance and can easily increase the number of cluster nodes, using linear expansion of the cluster size so that clusters can process larger datasets. However, Hadoop has some shortcomings, especially exposed in actual use by the MapReduce scheduler, which calls for more research
APA, Harvard, Vancouver, ISO, and other styles
39

Sneha, Sneha, and Shoney Sebastian. "Improved fair Scheduling Algorithm for Hadoop Clustering." Oriental journal of computer science and technology 10, no. 1 (2017): 194–200. http://dx.doi.org/10.13005/ojcst/10.01.26.

Full text
Abstract:
The traditional way of storing such a huge amount of data is not convenient, because processing the data in later stages is a very tedious job. So nowadays, Hadoop is used to store and process large amounts of data. Statistics on data generated in recent years show that growth has been especially high in the last two years. Hadoop is a good framework to store and process data efficiently. It processes data in parallel, and its fault tolerance prevents failure and data loss. Job scheduling is an important process in Hadoop MapReduce. Hadoop comes with three types of schedulers, namely
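The entry concerns fair scheduling among Hadoop job pools; a toy sketch of the underlying idea (pool names invented), where each free task slot goes to the pool currently running the fewest tasks so pools converge toward equal shares:

```python
# Toy illustration of fair scheduling: each free slot is granted to the
# most starved pool, levelling the pools. Pool names are invented.

def assign_slots(pools, free_slots):
    """Return pool -> number of running tasks after greedy levelling."""
    running = {p: 0 for p in pools}
    for _ in range(free_slots):
        # Pick the pool with fewest tasks (ties broken by name).
        starved = min(running, key=lambda p: (running[p], p))
        running[starved] += 1
    return running

print(assign_slots(["adhoc", "etl", "reporting"], 7))
# Every pool ends within one task of the others.
```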
APA, Harvard, Vancouver, ISO, and other styles
40

Wahyu Saputro, Rendiyono, Aminuddin Aminuddin, and Yuda Munarko. "Perbandingan Kinerja Komputasi Hadoop dan Spark untuk Memprediksi Cuaca (Studi Kasus : Storm Event Database)." Jurnal Repositor 2, no. 4 (2020): 463. http://dx.doi.org/10.22219/repositor.v2i4.93.

Full text
Abstract:
Technological development has resulted in data growth that becomes faster and larger over time. This is caused by the many sources of data such as search engines, RFID, digital transaction records, video and photo archives, user-generated content, the internet of things, and scientific research in various fields such as genomics, meteorology, astronomy, physics, etc. In addition, these data have characteristics that are unique from one another, which means they cannot be processed by conventional database technology. For this reason, a variety of frame
APA, Harvard, Vancouver, ISO, and other styles
41

Ameena, Anjum, and Shivleela Patil. "HDFS Erasure Coded Information Repository System for Hadoop Clusters." International Journal of Trend in Scientific Research and Development 2, no. 5 (2018): 1957–60. https://doi.org/10.31142/ijtsrd18206.

Full text
Abstract:
Existing disk-based archival storage systems are inadequate for Hadoop clusters because they are oblivious to data replicas and the MapReduce programming model. To handle this issue, an erasure-coded data archival system is developed for Hadoop clusters, where codes are used to archive data replicas in the Hadoop Distributed File System (HDFS). Two archival schemes, HDFS Grouping and HDFS Pipeline, are introduced in HDFS to speed up the data archival process. HDFS Grouping is a MapReduce-based data archiving scheme
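The entry concerns erasure-coded archival of HDFS data replicas; the core idea behind such codes can be sketched with a single XOR parity block, which lets any one lost block be rebuilt (a simplification of the Reed-Solomon codes real archival systems use):

```python
# Minimal single-parity erasure code: XOR all data blocks into one
# parity block; any single lost block equals the XOR of the survivors.
# Real systems (e.g. HDFS erasure coding) use Reed-Solomon codes that
# tolerate multiple losses; this shows only the underlying principle.
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

data = [b"replica1", b"replica2", b"replica3"]   # equal-length blocks
parity = xor_blocks(data)

# Simulate losing block 1 and rebuild it from the survivors + parity.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])  # True
```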
APA, Harvard, Vancouver, ISO, and other styles
42

Lee, Sungchul, Ju-Yeon Jo, and Yoohwan Kim. "Hadoop Performance Analysis Model with Deep Data Locality." Information 10, no. 7 (2019): 222. http://dx.doi.org/10.3390/info10070222.

Full text
Abstract:
Background: Hadoop has become the base framework of big data systems via the simple concept that moving computation is cheaper than moving data. Hadoop increases data locality in the Hadoop Distributed File System (HDFS) to improve the performance of the system. Network traffic among nodes in a big data system is reduced by increasing the number of data-local tasks on each machine. Prior research increased data locality in one of the MapReduce stages to increase Hadoop performance. However, there is currently no mathematical performance model for data locality on Hadoop. Methods: Th
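The entry models performance via data locality; the basic quantity, the fraction of map tasks running on a node that holds their input block, can be sketched as follows (the task/node placements are invented for illustration):

```python
# Data-locality ratio: share of map tasks whose input block resides on
# the node the task was scheduled to. Placements below are invented.

def locality_ratio(assignments, block_locations):
    """assignments: task -> node; block_locations: task -> set of nodes
    holding replicas of that task's HDFS block."""
    local = sum(1 for t, node in assignments.items()
                if node in block_locations[t])
    return local / len(assignments)

assignments = {"t1": "n1", "t2": "n2", "t3": "n3", "t4": "n1"}
blocks = {"t1": {"n1", "n2"}, "t2": {"n1", "n3"},
          "t3": {"n3"},       "t4": {"n2", "n3"}}
print(locality_ratio(assignments, blocks))  # 0.5: only t1 and t3 are data-local
```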
APA, Harvard, Vancouver, ISO, and other styles
43

Lydia, E. Laxmi, and M. Srinivasa Rao. "Applying compression algorithms on hadoop cluster implementing through apache tez and hadoop mapreduce." International Journal of Engineering & Technology 7, no. 2.26 (2018): 80. http://dx.doi.org/10.14419/ijet.v7i2.26.12539.

Full text
Abstract:
Big Data is the latest and most prominent subject across the cloud research area; its main characteristics are volume, velocity, and variety. These characteristics are difficult to manage with traditional software and the various available methodologies. Data arising from the various domains of big data are handled through Hadoop, open-source framework software mainly developed to provide solutions. Big data analytics is handled through the Hadoop MapReduce framework, the key engine of a Hadoop cluster, which is extensively used these days. It uses batch
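The entry evaluates compression codecs on a Hadoop cluster; the trade-off being measured, smaller I/O versus CPU spent, starts from the compression ratio, sketched here with Python's zlib standing in for Hadoop codecs such as gzip (the input text is invented):

```python
# Compression-ratio sketch using zlib (the same DEFLATE algorithm behind
# Hadoop's gzip codec). Repetitive data, like logs, compresses far
# better than random data, which drives codec choice on a cluster.
import zlib

def ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)

repetitive = b"hadoop mapreduce " * 1000
print(ratio(repetitive) < 0.05)  # highly repetitive input shrinks dramatically
```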
APA, Harvard, Vancouver, ISO, and other styles
44

Bhaskar, Archana, and Rajeev Ranjan. "Cost-aware optimal resource provisioning Map-Reduce scheduler for hadoop framework." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 2 (2024): 1262. http://dx.doi.org/10.11591/ijai.v13.i2.pp1262-1271.

Full text
Abstract:
Distributed data processing models have been one of the primary components of data-intensive applications; furthermore, due to advancements in technology, huge volumes of data of diverse nature are being generated. The Hadoop MapReduce framework is responsible for the ease-of-deployment mechanism in an open-source framework. The existing Hadoop MapReduce framework possesses high makespan time and high input/output overhead, which mainly affects the cost of the model. Thus, this research work presents an optimized cost-aware resource prov
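The entry targets makespan reduction in MapReduce scheduling; makespan (the finish time of the busiest node) and a classic greedy heuristic, longest-processing-time-first, can be sketched as follows (task durations invented, and this is a generic heuristic, not the paper's scheduler):

```python
# Greedy LPT makespan sketch: assign each task, longest first, to the
# currently least-loaded node. A generic heuristic for illustration,
# not the paper's cost-aware scheduler. Durations are invented.
import heapq

def makespan_lpt(task_times, nodes):
    """Return the makespan after greedy LPT assignment."""
    loads = [0.0] * nodes
    heapq.heapify(loads)
    for t in sorted(task_times, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + t)
    return max(loads)

print(makespan_lpt([7, 5, 4, 3, 2, 1], nodes=2))  # 11.0
```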
APA, Harvard, Vancouver, ISO, and other styles
45

Babu, M. Mahesh, and A. Madhuri. "An Efficient Parallel String Join Framework using Hadoop framework." International Journal of Computer & Organization Trends 15, no. 1 (2014): 1–6. http://dx.doi.org/10.14445/22492593/ijcot-v15p301.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Ravi Kumar, A., et al. "A Review on Design and Development of Performance Evaluation Model for Bio-Informatics Data Using Hadoop." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 2 (2021): 1546–63. http://dx.doi.org/10.17762/turcomat.v12i2.1432.

Full text
Abstract:
The paper reviews the usage of the Hadoop platform in applications for structural bioinformatics. Hadoop offers an alternative system for structural bioinformatics to analyze large fractions of the Protein Data Bank, which is crucial to high-throughput investigations of (for example) protein-ligand docking, protein-ligand complex clustering, and structural alignment. In particular, we review different applications of high-throughput analyses and their scalability in the literature using Hadoop. Rather than revising the algorithms, we find that these organisations typically use a realized executable
APA, Harvard, Vancouver, ISO, and other styles
47

Deol, G. Joel Sunny, et al. "Hadoop Job Scheduling Using Improvised Ant Colony Optimization." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 2 (2021): 3417–24. http://dx.doi.org/10.17762/turcomat.v12i2.2403.

Full text
Abstract:
The Hadoop Distributed File System is used for storage, along with the MapReduce programming framework for processing large datasets in parallel. Handling such complex and vast data while keeping performance parameters up to a certain level is a difficult task. Hence, an improvised mechanism is proposed here that will enhance the job scheduling capabilities of Hadoop and optimize allocation and utilization of resources. Significantly, an aggregator node is added to the default HDFS framework architecture to improve the performance of the Hadoop Name node. In this paper
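The entry applies ant colony optimization to Hadoop job scheduling; a generic ACO-flavoured sketch of the two core operations, pheromone-weighted node selection and evaporation-plus-reinforcement, follows (this is the textbook idea, not the paper's improvised algorithm; loads and constants are invented):

```python
# Generic ACO-style assignment sketch: desirability of sending a job to
# a node is pheromone * heuristic; pheromone evaporates, then the chosen
# edge is reinforced. Not the paper's algorithm; values are invented.

def choose_node(pheromone, node_load, alpha=1.0, beta=1.0):
    """Pick the node maximizing tau^alpha * eta^beta, eta = 1/(1+load).
    (Real ACO samples probabilistically; argmax keeps this deterministic.)"""
    score = {n: (pheromone[n] ** alpha) * ((1 / (1 + node_load[n])) ** beta)
             for n in pheromone}
    return max(score, key=score.get)

def update_pheromone(pheromone, chosen, rho=0.5, deposit=1.0):
    for n in pheromone:
        pheromone[n] *= (1 - rho)          # evaporation on all edges
    pheromone[chosen] += deposit           # reinforce the chosen edge

tau = {"n1": 1.0, "n2": 1.0, "n3": 1.0}
load = {"n1": 4, "n2": 1, "n3": 2}
best = choose_node(tau, load)              # "n2": lightest load wins
update_pheromone(tau, best)
print(best, tau[best])
```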
APA, Harvard, Vancouver, ISO, and other styles
48

Prabowo, Sidik, and Maman Abdurohman. "Studi Perbandingan Performa Algoritma Penjadwalan untuk Real Time Data Twitter pada Hadoop." Komputika : Jurnal Sistem Komputer 9, no. 1 (2020): 43–50. http://dx.doi.org/10.34010/komputika.v9i1.2848.

Full text
Abstract:
Hadoop is an open-source, Java-based software framework. Hadoop consists of two main components, namely MapReduce and the Hadoop Distributed File System (HDFS). MapReduce consists of Map and Reduce, which are used for data processing, while HDFS is the place or directory where Hadoop data can be stored. To run jobs whose execution characteristics are often diverse, an appropriate job scheduler is needed. There are many job schedulers that can be chosen to match job characteristics. The Fair Scheduler uses one of the schedul
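The entry introduces MapReduce's Map and Reduce phases; the canonical word-count illustration of that split can be simulated in a single process (this is only the data flow, not actual Hadoop code):

```python
# Single-process simulation of the MapReduce data flow: map emits
# (word, 1) pairs, the framework groups pairs by key, and reduce sums
# each group. Not real Hadoop code, just the conceptual split.
from collections import defaultdict

def map_phase(line):
    return [(word, 1) for word in line.split()]

def reduce_phase(pairs):
    counts = defaultdict(int)
    for word, n in pairs:        # group by key and sum in one pass
        counts[word] += n
    return dict(counts)

lines = ["hadoop stores data", "hadoop processes data"]
pairs = [kv for line in lines for kv in map_phase(line)]
print(reduce_phase(pairs))
# {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```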
APA, Harvard, Vancouver, ISO, and other styles
49

Gupta, Manish Kumar, and Rajendra Kumar Dwivedi. "Blockchain Enabled Hadoop Distributed File System Framework for Secure and Reliable Traceability." ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal 12 (December 29, 2023): e31478. http://dx.doi.org/10.14201/adcaij.31478.

Full text
Abstract:
Hadoop Distributed File System (HDFS) is a distributed file system that allows large amounts of data to be stored and processed across multiple servers in a Hadoop cluster. HDFS also provides high throughput for data access. HDFS enables the management of vast amounts of data using commodity hardware. However, security vulnerabilities in HDFS can be manipulated for malicious purposes. This emphasizes the significance of establishing strong security measures to facilitate file sharing within Hadoop and implementing a reliable mechanism for verifying the legitimacy of shared files. The objective
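The entry pairs HDFS with a blockchain for traceable, verifiable file sharing; the traceability core, each record committing to the previous record's hash, can be sketched with a plain hash chain (the record fields are invented; a real blockchain adds consensus and distribution on top):

```python
# Minimal hash chain for file-share traceability: each entry stores the
# SHA-256 of the previous entry, so tampering with any record breaks
# every later link. Record fields are invented for illustration.
import hashlib, json

def _digest(event, prev):
    payload = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def add_record(chain, event):
    prev = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"event": event, "prev": prev, "hash": _digest(event, prev)})

def verify(chain):
    for i, rec in enumerate(chain):
        prev = chain[i - 1]["hash"] if i else "0" * 64
        if rec["prev"] != prev or rec["hash"] != _digest(rec["event"], prev):
            return False
    return True

chain = []
add_record(chain, "user-a shared /data/file1 with user-b")
add_record(chain, "user-b read /data/file1")
print(verify(chain))          # True
chain[0]["event"] = "forged"  # tampering breaks verification
print(verify(chain))          # False
```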
APA, Harvard, Vancouver, ISO, and other styles
50

Azeroual, Otmane, and Renaud Fabre. "Processing Big Data with Apache Hadoop in the Current Challenging Era of COVID-19." Big Data and Cognitive Computing 5, no. 1 (2021): 12. http://dx.doi.org/10.3390/bdcc5010012.

Full text
Abstract:
Big data have become a global strategic issue, as increasingly large amounts of unstructured data challenge the IT infrastructure of global organizations and threaten their capacity for strategic forecasting. As experienced in former massive information issues, big data technologies, such as Hadoop, should efficiently tackle the incoming large amounts of data and provide organizations with relevant processed information that was formerly neither visible nor manageable. After having briefly recalled the strategic advantages of big data solutions in the introductory remarks, in the first part of
APA, Harvard, Vancouver, ISO, and other styles