Journal articles on the topic 'MapReduce programming model'

1

Zhang, Guigang, Chao Li, Yong Zhang, and Chunxiao Xing. "A Semantic++ MapReduce Parallel Programming Model." International Journal of Semantic Computing 08, no. 03 (2014): 279–99. http://dx.doi.org/10.1142/s1793351x14400091.

Abstract:
Big data is playing a more and more important role in every area such as medical health, internet finance, culture and education etc. How to process these big data efficiently is a huge challenge. MapReduce is a good parallel programming language to process big data. However, it has lots of shortcomings. For example, it cannot process complex computing. It cannot suit real-time computing. In order to overcome these shortcomings of MapReduce and its variants, in this paper, we propose a Semantic++ MapReduce parallel programming model. This study includes the following parts. (1) Semantic++ MapR
2

Lämmel, Ralf. "Google’s MapReduce programming model — Revisited." Science of Computer Programming 70, no. 1 (2008): 1–30. http://dx.doi.org/10.1016/j.scico.2007.07.001.

3

Retnowo, Murti. "Syncronize Data Using MapReduceModel Programming." International Journal of Engineering Technology and Natural Sciences 3, no. 2 (2021): 82–88. http://dx.doi.org/10.46923/ijets.v3i2.140.

Abstract:
Research in the processing of the data shows that the larger data increasingly requires a longer time. Processing huge amounts of data on a single computer has limitations that can be overcome by parallel processing. This study utilized the MapReduce programming model data synchronization by duplicating the data from database client to database server. MapReduce is a programming model that was developed to speed up the processing of large data. MapReduce model application on the training process performed on data sharing that is adapted to number of sub-process (thread) and data entry to datab
4

Garg, Uttama. "Data Analytic Models That Redress the Limitations of MapReduce." International Journal of Web-Based Learning and Teaching Technologies 16, no. 6 (2021): 1–15. http://dx.doi.org/10.4018/ijwltt.20211101.oa7.

Abstract:
The amount of data in today’s world is increasing exponentially. Effectively analyzing Big Data is a very complex task. The MapReduce programming model created by Google in 2004 revolutionized the big-data computing market. Nowadays the model is being used by many for scientific and research analysis as well as for commercial purposes. The MapReduce model however is quite a low-level programming model and has many limitations. Active research is being undertaken to make models that overcome/remove these limitations. In this paper we have studied some popular data analytic models that redress s
5

Gao, Tilei, Ming Yang, Rong Jiang, Yu Li, and Yao Yao. "Research on Computing Efficiency of MapReduce in Big Data Environment." ITM Web of Conferences 26 (2019): 03002. http://dx.doi.org/10.1051/itmconf/20192603002.

Abstract:
The emergence of big data has brought a great impact on traditional computing mode, the distributed computing framework represented by MapReduce has become an important solution to this problem. Based on the big data, this paper deeply studies the principle and framework of MapReduce programming. On the basis of mastering the principle and framework of MapReduce programming, the time consumption of distributed computing framework MapReduce and traditional computing model is compared with concrete programming experiments. The experiment shows that MapReduce has great advantages in large data vo
6

Siddesh, G. M., Kavya Suresh, K. Y. Madhuri, Madhushree Nijagal, B. R. Rakshitha, and K. G. Srinivasa. "Optimizing Crawler4j using MapReduce Programming Model." Journal of The Institution of Engineers (India): Series B 98, no. 3 (2016): 329–36. http://dx.doi.org/10.1007/s40031-016-0267-z.

7

Zhang, Weidong, Boxin He, Yifeng Chen, and Qifei Zhang. "GMR: graph-compatible MapReduce programming model." Multimedia Tools and Applications 78, no. 1 (2017): 457–75. http://dx.doi.org/10.1007/s11042-017-5102-2.

8

Durairaj, M., and T. S. Poornappriya. "Importance of MapReduce for Big Data Applications: A Survey." Asian Journal of Computer Science and Technology 7, no. 1 (2018): 112–18. http://dx.doi.org/10.51983/ajcst-2018.7.1.1817.

Abstract:
Significant regard for MapReduce framework has been trapped by a wide range of areas. It is presently a practical model for data-focused applications because of its basic interface of programming, high elasticity, and capacity to withstand the subjection to defects. Additionally, it is fit for preparing a high extent of data in Distributed Computing environments (DCE). MapReduce, on various events, has turned out to be material to a wide scope of areas. MapReduce is a parallel programming model and a related usage presented by Google. In the programming model, a client determines the calculati
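In the programming model this abstract refers to, the user supplies a map function and a reduce function, and the framework groups the intermediate values by key in between. The following is a minimal single-process sketch in Python, not code from the cited survey or from Hadoop: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are illustrative assumptions, and the "shuffle" is just an in-memory dictionary.

```python
from collections import defaultdict


def map_fn(_doc_id, text):
    """Map: emit one (word, 1) pair for every word in a document."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce: sum all counts collected for a single word."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Toy driver: run the mapper, group values by key, run the reducer."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)  # stands in for the shuffle phase
    return dict(reducer(k, vs) for k, vs in groups.items())


docs = [("d1", "map reduce map"), ("d2", "reduce shuffle")]
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
print(word_counts)
```

A real framework would run the mapper and reducer calls on different machines and handle partitioning and fault tolerance; none of that is modelled here.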
9

Charanjeet, Kaur, and Sumanpreet Kaur. "NOVEL IMPROVED CAPACITY SCHEDULING ALGORITHM FOR HETEROGENEOUS HADOOP." INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY 6, no. 6 (2017): 401–10. https://doi.org/10.5281/zenodo.814540.

Abstract:
For large scale parallel applications Mapreduce is a widely used programming model. Mapreduce is an important programming model for parallel applications. Hadoop is a open source which is popular for developing data based applications and hadoop is a open source implementation of Mapreduce. Mapreduce gives programming interfaces to share data based in a cluster or distributed environment. As it works in a distributed environment so it should provide efficient scheduling mechanisms for efficient work capability in distributed environment. locality and synchronization overhead are main issues in
10

Rokhman, Nur, and Amelia Nursanti. "The MapReduce Model on Cascading Platform for Frequent Itemset Mining." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 12, no. 2 (2018): 149. http://dx.doi.org/10.22146/ijccs.34102.

Abstract:
The implementation of parallel algorithms is very interesting research recently. Parallelism is very suitable to handle large-scale data processing. MapReduce is one of the parallel and distributed programming models. The implementation of parallel programming faces many difficulties. The Cascading gives easy scheme of Hadoop system which implements MapReduce model. Frequent itemsets are most often appear objects in a dataset. The Frequent Itemset Mining (FIM) requires complex computation. FIM is a complicated problem when implemented on large-scale data. This paper discusses the implementation
11

Wang, Changjian, Yuxing Peng, Mingxing Tang, Dongsheng Li, Shanshan Li, and Pengfei You. "An Efficient MapReduce Computing Model for Imprecise Applications." International Journal of Web Services Research 13, no. 3 (2016): 46–63. http://dx.doi.org/10.4018/ijwsr.2016070103.

Abstract:
Optimizing the Map process is important for the improvement of the MapReduce performance. Many efforts have been devoted into the problem to design more efficient scheduling strategies. However, there exists a kind of MapReduce applications, named imprecise applications, where the imprecise results based on part of map tasks can satisfy the requirements of imprecise applications and thus the job processes can be completed when enough map tasks are processed. According to the feature of imprecise applications, the authors propose an improved MapReduce model, named MapCheckReduce, which can term
12

Tahsir Ahmed Munna, Md, Shaikh Muhammad Allayear, Mirza Mohtashim Alam, Sheikh Shah Mohammad Motiur Rahman, Md Samadur Rahman, and M. Mesbahuddin Sarker. "Simplified Mapreduce Mechanism for Large Scale Data Processing." International Journal of Engineering & Technology 7, no. 3.8 (2018): 16. http://dx.doi.org/10.14419/ijet.v7i3.8.15211.

Abstract:
MapReduce has become a popular programming model for processing and running large-scale data sets with a parallel, distributed paradigm on a cluster. Hadoop MapReduce is needed especially for large scale data like big data processing. In this paper, we work to modify the Hadoop MapReduce Algorithm and implement it to reduce processing time.
13

Sun, Han Lin. "An Improved MapReduce Model for Computation-Intensive Task." Advanced Materials Research 756-759 (September 2013): 1701–5. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.1701.

Abstract:
MapReduce is a widely adopted parallel programming model. The standard MapReduce model is designed for data-intensive processing. However, some machine learning algorithms are computation-intensive and time-consuming tasks which process the same data set repeatedly. In this paper, we proposed an improved MapReduce model for computation-intensive algorithms. The model is constructed from a service combination perspective. In the model, the whole task is divided into lots of subtasks taking account into the algorithms parameters, and the datagram with acknowledgement mechanism is used as the com
14

Zhang, Weidong, Boxin He, Yifeng Chen, and Qifei Zhang. "Correction to: GMR: graph-compatible MapReduce programming model." Multimedia Tools and Applications 78, no. 1 (2017): 477. http://dx.doi.org/10.1007/s11042-017-5273-x.

15

Yangyuan, Li. "BRAIN. Broad Research in Artificial Intelligence and Neuroscience-Stable Modeling on Resource Usage Parameters of MapReduce Application." BRAIN. Broad Research in Artificial Intelligence and Neuroscience-Stable Modeling on Resource Usage Parameters of MapReduce Application 9, no. 2 (2018): 45–62. https://doi.org/10.5281/zenodo.1245887.

Abstract:
Currently, Hadoop MapReduce framework has been applied to many productive fields to analyze big data. MapReduce applications based on the MapReduce programming model are used to generate and process such huge data. Due to various computational purpose, MapReduce applications have different resource requirements. For specific applications, the resource bottleneck of the cloud computing platform must inevitably impact its executive performance. Therefore, identification of the bottleneck about the allocated resource for MapReduce applications is crucially needed from the viewpoint of either clou
16

Zheng, Feifeng, Zhaojie Wang, Yinfeng Xu, and Ming Liu. "Heuristic Algorithms for MapReduce Scheduling Problem with Open-Map Task and Series-Reduce Tasks." Scientific Programming 2020 (July 15, 2020): 1–10. http://dx.doi.org/10.1155/2020/8810215.

Abstract:
Based on the classical MapReduce concept, we propose an extended MapReduce scheduling model. In the extended MapReduce scheduling problem, we assumed that each job contains an open-map task (the map task can be divided into multiple unparallel operations) and series-reduce tasks (each reduce task consists of only one operation). Different from the classical MapReduce scheduling problem, we also assume that all the operations cannot be processed in parallel, and the machine settings are unrelated machines. For solving the extended MapReduce scheduling problem, we establish a mixed-integer progr
17

Kavitha, C., S. R. Srividhya, Wen-Cheng Lai, and Vinodhini Mani. "IMapC: Inner MAPping Combiner to Enhance the Performance of MapReduce in Hadoop." Electronics 11, no. 10 (2022): 1599. http://dx.doi.org/10.3390/electronics11101599.

Abstract:
Hadoop is a framework for storing and processing huge amounts of data. With HDFS, large data sets can be managed on commodity hardware. MapReduce is a programming model for processing vast amounts of data in parallel. Mapping and reducing can be performed by using the MapReduce programming framework. A very large amount of data is transferred from Mapper to Reducer without any filtering or recursion, resulting in overdrawn bandwidth. In this paper, we introduce an algorithm called Inner MAPping Combiner (IMapC) for the map phase. This algorithm in the Mapper combines the values of recurring ke
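The abstract is cut off before IMapC itself is described, so the following does not reproduce that algorithm. It is only a hedged Python sketch of the general idea of combining values for recurring keys inside the mapper, using word count as the workload; `naive_mapper`, `combining_mapper`, and `reduce_counts` are hypothetical names chosen for illustration.

```python
from collections import Counter, defaultdict


def naive_mapper(lines):
    """Emit one (word, 1) pair per occurrence; nothing is merged locally."""
    for line in lines:
        for word in line.split():
            yield word, 1


def combining_mapper(lines):
    """Aggregate counts per key in local memory, then emit each key once."""
    local = Counter()
    for line in lines:
        local.update(line.split())
    yield from local.items()


def reduce_counts(pairs):
    """Sum the values for each key, as a reducer would after the shuffle."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


split = ["a b a", "b a c"]
naive_pairs = list(naive_mapper(split))
combined_pairs = list(combining_mapper(split))

# Both mappers lead to the same totals, but the combining mapper
# hands fewer intermediate pairs to the shuffle.
print(len(naive_pairs), len(combined_pairs))
print(reduce_counts(combined_pairs))
```

The trade-off in this sketch is memory: the local `Counter` must hold every distinct key of the input split, so a production mapper would typically flush it once it grows past some bound.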
18

Liu, Hanpeng, Wuqi Gao, and Junmin Luo. "Research on Intelligentization of Cloud Computing Programs Based on Self-awareness." International Journal of Advanced Network, Monitoring and Controls 8, no. 2 (2023): 89–98. http://dx.doi.org/10.2478/ijanmc-2023-0060.

Abstract:
Through the research of MapReduce programming framework of cloud computing, the current MapReduce program only solves specific problems, and there is no design experience or design feature summary of MapReduce program, let alone formal description and experience inheritance and application of knowledge base. In order to solve the problem of intelligent cloud computing program, a general MapReduce program generation method is designed. This paper proposes the architecture of intelligent cloud computing by studying AORBCO model and combining cloud computing technology. According to the
19

Vaishali, Sontakke, and R. B. Dayananda. "Memory aware optimized Hadoop MapReduce model in cloud computing environment." International Journal of Artificial Intelligence (IJ-AI) 12, no. 3 (2023): 1270–80. https://doi.org/10.11591/ijai.v12.i3.pp1270-1280.

Abstract:
In the last decade, data analysis has become one of the popular tasks due to enormous growth in data every minute through different applications and instruments. MapReduce is the most popular programming model for data processing. Hadoop constitutes two basic models i.e., Hadoop file system (HDFS) and MapReduce, Hadoop is used for processing a huge amount of data whereas MapReduce is used for data processing. Hadoop MapReduce is one of the best platforms for processing huge data in an efficient manner such as processing web logs data. However, existing model This research work proposes memory
20

Al-Badarneh, Amer, Amr Mohammad, and Salah Harb. "A Survey on MapReduce Implementations." International Journal of Cloud Applications and Computing 6, no. 1 (2016): 59–87. http://dx.doi.org/10.4018/ijcac.2016010104.

Abstract:
A distinguished successful platform for parallel data processing MapReduce is attracting a significant momentum from both academia and industry as the volume of data to capture, transform, and analyse grows rapidly. Although MapReduce is used in many applications to analyse large scale data sets, there is still a lot of debate among scientists and researchers on its efficiency, performance, and usability to support more classes of applications. This survey presents a comprehensive review of various implementations of MapReduce framework. Initially the authors give an overview of MapReduce prog
21

Wei, Fang, Pan Wubin, and Cui Zhiming. "View of MapReduce: Programming model, methods, and its applications." IETE Technical Review 29, no. 5 (2012): 380. http://dx.doi.org/10.4103/0256-4602.103168.

22

Amshakala, K., R. Nedunchezhian, and M. Rajalakshmi. "Extracting Functional Dependencies in Large Datasets Using MapReduce Model." International Journal of Intelligent Information Technologies 10, no. 3 (2014): 19–35. http://dx.doi.org/10.4018/ijiit.2014070102.

Abstract:
Over the last few years, data are generated in large volume at a faster rate and there has been a remarkable growth in the need for large scale data processing systems. As data grows larger in size, data quality is compromised. Functional dependencies representing semantic constraints in data are important for data quality assessment. Executing functional dependency discovery algorithms on a single computer is hard and laborious with large data sets. MapReduce provides an enabling technology for large scale data processing. The open-source Hadoop implementation of MapReduce has provided resear
23

Sontakke, Vaishali, and Dayananda R. B. "Memory aware optimized Hadoop MapReduce model in cloud computing environment." IAES International Journal of Artificial Intelligence (IJ-AI) 12, no. 3 (2023): 1270. http://dx.doi.org/10.11591/ijai.v12.i3.pp1270-1280.

Abstract:
In the last decade, data analysis has become one of the popular tasks due to enormous growth in data every minute through different applications and instruments. MapReduce is the most popular programming model for data processing. Hadoop constitutes two basic models i.e., Hadoop file system (HDFS) and MapReduce, Hadoop is used for processing a huge amount of data whereas MapReduce is used for data processing. Hadoop MapReduce is one of the best platforms for processing huge data in an efficient manner such as processing web logs data. However, existing model This research work propose
24

Muslim, Mohsin Khudhair, Al-Rammahi Adil, and Rabee Furkan. "An innovative fractal architecture model for implementing MapReduce in an open multiprocessing parallel environment." An innovative fractal architecture model for implementing MapReduce in an open multiprocessing parallel environment 30, no. 2 (2023): 1059–67. https://doi.org/10.11591/ijeecs.v30.i2.pp1059-1067.

Abstract:
One of the infrastructure applications that cloud computing offers as a service is parallel data processing. MapReduce is a type of parallel processing used more and more by data-intensive applications in cloud computing environments. MapReduce is based on a strategy called "divide and conquer," which uses regular computers, also called "nodes," to do processing in parallel. This paper looks at how open multiprocessing (OpenMP), the best shared-memory parallel programming model for high-performance computing, can be used with the proposed fractal network model in the MapRed
25

Meng, Jian Liang, and Da Wei Li. "Improve and Optimize Query Recommendation System by MST Algorithm and its MapReduce Implementation." Applied Mechanics and Materials 701-702 (December 2014): 50–53. http://dx.doi.org/10.4028/www.scientific.net/amm.701-702.50.

Abstract:
Query recommendation as an important tool to enhance the user search efficiency has gradually become a hotspot. In the context of big data, using the MapReduce programming model, combined with distributed minimum spanning tree algorithm, a parallel query recommended method based on MapReduce was proposed in this paper. The final results show that the efficiency of query recommendation was greatly improved through parallel computing.
26

Kabiru, D. Ibrahim, M. M. Ibrahim, Idris Yusuf, Bello Adamu, and S. A. Kassim. "MapReduce Model: A Paradigm for Large Data Processing." Global Journal of Research in Humanities & Cultural Studies 3, no. 2 (2023): 1–7. https://doi.org/10.5281/zenodo.7819069.

Abstract:
MapReduce is a programming paradigm that enables massive processing of large amount of data over several machines in a cluster of commodity computers. It is fault tolerant and scalable hence suitable for cloud computing applications. This paper come up with a second order nonlinear model of the MapReduce using experimental data collected from Grid5000 experimental tested accessed from the local machine using Linux Secure Socket Shell protocol (SSH) as a command line interphase. System identification was performed on the collected data using MATLAB toolbox. The nonlinear model obtained was line
27

QIN, Jun, Yanyan SONG, and Ping ZONG. "Study of Task Scheduling Strategy based on Trustworthiness." International Journal of Distributed and Parallel systems 12, no. 05 (2021): 01–09. http://dx.doi.org/10.5121/ijdps.2021.12501.

Abstract:
MapReduce is a distributed computing model for cloud computing to process massive data. It simplifies the writing of distributed parallel programs. For the fault-tolerant technology in the MapReduce programming model, tasks may be allocated to nodes with low reliability. It causes the task to be reexecuted, wasting time and resources. This paper proposes a reliability task scheduling strategy with a failure recovery mechanism, evaluates the trustworthiness of resource nodes in the cloud environment and builds a trustworthiness model. By using the simulation platform CloudSim, the stability of
28

Li, Ren, Haibo Hu, Heng Li, Yunsong Wu, and Jianxi Yang. "MapReduce Parallel Programming Model: A State-of-the-Art Survey." International Journal of Parallel Programming 44, no. 4 (2015): 832–66. http://dx.doi.org/10.1007/s10766-015-0395-0.

29

Jing, Weipeng, Danyu Tong, Yangang Wang, Jingyuan Wang, Yaqiu Liu, and Peng Zhao. "MaMR: High-performance MapReduce programming model for material cloud applications." Computer Physics Communications 211 (February 2017): 79–87. http://dx.doi.org/10.1016/j.cpc.2016.07.015.

30

Park, Jong-Hyuk, Hwa-Young Jeong, Young-Sik Jeong, and Min Choi. "REST-MapReduce: An Integrated Interface but Differentiated Service." Journal of Applied Mathematics 2014 (2014): 1–10. http://dx.doi.org/10.1155/2014/170723.

Abstract:
With the fast deployment of cloud computing, MapReduce architectures are becoming the major technologies for mobile cloud computing. The concept of MapReduce was first introduced as a novel programming model and implementation for a large set of computing devices. In this research, we propose a novel concept of REST-MapReduce, enabling users to use only the REST interface without using the MapReduce architecture. This approach provides a higher level of abstraction by integration of the two types of access interface, REST API and MapReduce. The motivation of this research stems from the slower
31

Esposito, Christian, and Massimo Ficco. "Recent Developments on Security and Reliability in Large-Scale Data Processing with MapReduce." International Journal of Data Warehousing and Mining 12, no. 1 (2016): 49–68. http://dx.doi.org/10.4018/ijdwm.2016010104.

Abstract:
The demand to access to a large volume of data, distributed across hundreds or thousands of machines, has opened new opportunities in commerce, science, and computing applications. MapReduce is a paradigm that offers a programming model and an associated implementation for processing massive datasets in a parallel fashion, by using non-dedicated distributed computing hardware. It has been successfully adopted in several academic and industrial projects for Big Data Analytics. However, since such analytics is increasingly demanded within the context of mission-critical applications, security an
32

Khudhair, Muslim Mohsin, Adil AL-Rammahi, and Furkan Rabee. "An innovative fractal architecture model for implementing MapReduce in an open multiprocessing parallel environment." Indonesian Journal of Electrical Engineering and Computer Science 30, no. 2 (2023): 1059. http://dx.doi.org/10.11591/ijeecs.v30.i2.pp1059-1067.

Abstract:
One of the infrastructure applications that cloud computing offers as a service is parallel data processing. MapReduce is a type of parallel processing used more and more by data-intensive applications in cloud computing environments. MapReduce is based on a strategy called "divide and conquer," which uses regular computers, also called "nodes," to do processing in parallel. This paper looks at how open multiprocessing (OpenMP), the best shared-memory parallel programming model for high-performance computing, can be used with the proposed fractal network model in the MapReduce application. A w
33

Natesan, P., V. E. Sathishkumar, Sandeep Kumar Mathivanan, Maheshwari Venkatasen, Prabhu Jayagopal, and Shaikh Muhammad Allayear. "A Distributed Framework for Predictive Analytics Using Big Data and MapReduce Parallel Programming." Mathematical Problems in Engineering 2023 (February 1, 2023): 1–10. http://dx.doi.org/10.1155/2023/6048891.

Abstract:
With the advancement of Internet technologies and the rapid increase of World Wide Web applications, there has been tremendous growth in the volume of digital data. This takes the digital world into a new era of big data. Various existing data processing technologies are not consistent and scalable in handling the complexity as well as the large-size datasets. Recently, there are many distributed data processing, and programming models have been proposed and implemented to handle big data applications. The open-source-implemented MapReduce programming model in Apache Hadoop is the foremost mod
34

Pavithra.K. "Data Deduplication in Parallel Mining of Frequent Item sets using MapReduce." International Journal Of Engineering And Computer Science 5, no. 11 (2016): 18793–94. https://doi.org/10.5281/zenodo.4299570.

Abstract:
A Parallel Frequent Item sets mining algorithm called FiDoop using MapReduce programming model. FiDoop includes the frequent items ultrametric tree (FIU-tree), in that three MapReduce jobs are applied to complete the mining task. The scalability problem has been addressed by the implementation of a handful of FP-growth-like parallel FIM algorithms. In FiDoop, the mappers independently and concurrently decompose item sets; the reducers perform combination operations by constructing small ultrametric trees as well as mining these trees in parallel. Data Deduplication is one of important data compressi
35

K., Srikanth, P. Venkateswarlu, and Ashok Suragala. "A FUNDAMENTAL CONCEPT OF MAPREDUCE WITH MASSIVE FILES DATASET IN BIG DATA USING HADOOP PSEUDO-DISTRIBUTION MODE." Global Journal of Engineering Science and Research Management 4, no. 5 (2017): 58–62. https://doi.org/10.5281/zenodo.801301.

Abstract:
Hadoop Distributed File System (HDFS) and MapReduce programming model is used for storage and retrieval of the big data. Big data can be any structured collection which results incapability of conventional data management methods. The Tera Bytes size file can be easily stored on the HDFS and can be analyzed with MapReduce. This paper provides introduction to Hadoop HDFS and MapReduce for storing large number of files and retrieve information from these files. In this paper we present our experimental work done on Hadoop by applying a number of files as input to the system and then analyzing th
36

CHEN, Jirong, and Jiajin LE. "Programming model based on MapReduce for importing big table into HDFS." Journal of Computer Applications 33, no. 9 (2013): 2486–89. http://dx.doi.org/10.3724/sp.j.1087.2013.02486.

37

Zhang, Fan, and Qutaibah M. Malluhi. "A flexible and concurrent MapReduce programming model for shared-data applications." Qatar Foundation Annual Research Forum Proceedings, no. 2012 (October 2012): CSO10. http://dx.doi.org/10.5339/qfarf.2012.cso10.

38

Gao, Yufei, Yanjie Zhou, Bing Zhou, Lei Shi, and Jiacai Zhang. "Handling Data Skew in MapReduce Cluster by Using Partition Tuning." Journal of Healthcare Engineering 2017 (2017): 1–12. http://dx.doi.org/10.1155/2017/1425102.

Abstract:
The healthcare industry has generated large amounts of data, and analyzing these has emerged as an important problem in recent years. The MapReduce programming model has been successfully used for big data analytics. However, data skew invariably occurs in big data analytics and seriously affects efficiency. To overcome the data skew problem in MapReduce, we have in the past proposed a data processing algorithm called Partition Tuning-based Skew Handling (PTSH). In comparison with the one-stage partitioning strategy used in the traditional MapReduce model, PTSH uses a two-stage strategy and th
39

Sheela, Gole, and Tidke Bharat. "CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON MAPREDUCE FRAMEWORK." International Journal on Foundations of Computer Science & Technology (IJFCST) 5, no. 3 (2023): 11. https://doi.org/10.5281/zenodo.8386798.

Abstract:
Now a day enormous amount of data is getting explored through Internet of Things (IoT) as technologies are advancing and people uses these technologies in day to day activities, this data is termed as Big Data having its characteristics and challenges. Frequent Itemset Mining algorithms are aimed to disclose frequent itemsets from transactional database but as the dataset size increases, it cannot be handled by traditional frequent itemset mining. MapReduce programming model solves the problem of large datasets but it has large communication cost which reduces execution efficiency. This propos
40

Khudhair, Muslim Mohsin, Furkan Rabee, and Adil AL_Rammahi. "New efficient fractal models for MapReduce in OpenMP parallel environment." Bulletin of Electrical Engineering and Informatics 12, no. 4 (2023): 2313–27. http://dx.doi.org/10.11591/beei.v12i4.4977.

Abstract:
Parallel data processing is one of the specific infrastructure applications categorized as a service provided by cloud computing. In cloud computing environments, data-intensive applications increasingly use the parallel processing paradigm known as MapReduce. MapReduce is based on a strategy called "divide and conquer," which uses ordinary computers, also called "nodes," to do processing in parallel. This paper looks at how open multiprocessing (OpenMP), the best shared-memory parallel programming model for high-performance computing, can be used in the MapReduce application using proposed fr
41

Khudhair, Muslim Mohsin, Furkan Rabee, and Adil AL_Rammahi. "New efficient fractal models for MapReduce in OpenMP parallel environment." Bulletin of Electrical Engineering and Informatics 12, no. 4 (2023): 2313–27. http://dx.doi.org/10.11591/eei.v12i4.4977.

Abstract:
Parallel data processing is one of the specific infrastructure applications categorized as a service provided by cloud computing. In cloud computing environments, data-intensive applications increasingly use the parallel processing paradigm known as MapReduce. MapReduce is based on a strategy called "divide and conquer," which uses ordinary computers, also called "nodes," to do processing in parallel. This paper looks at how open multiprocessing (OpenMP), the best shared-memory parallel programming model for high-performance computing, can be used in the MapReduce application using proposed fr
42

Wang, Xiao Feng. "The Application of Hadoop in the Campus Cloud Computing System." Applied Mechanics and Materials 543-547 (March 2014): 3092–95. http://dx.doi.org/10.4028/www.scientific.net/amm.543-547.3092.

Abstract:
Based on the theory of cloud computing, this paper uses Hadoop distributed computing framework and the MapReduce programming model, designs and implements a campus cloud computing system for processing huge amounts of data. The system uses a three-layer architecture, has the flexibility to expand the scale, low development cost and ease of operation, reduces the difficulty of parallel programming and has the ability to efficiently handle massive data analysis and processing.
43

Malik, Vandana. "An Indication of HDFS and MapReduce Application." International Journal for Research in Applied Science and Engineering Technology 13, no. 5 (2025): 6035–37. https://doi.org/10.22214/ijraset.2025.71586.

Abstract:
In the era of big data, handling and processing large-scale datasets efficiently is paramount. The Hadoop ecosystem, particularly the Hadoop Distributed File System (HDFS) and MapReduce programming model, plays a crucial role in addressing these needs. This paper presents an in-depth analysis of HDFS and MapReduce, highlighting their architecture, functionality, and real-world applications. It explores how these technologies facilitate reliable storage and scalable processing of vast data volumes across distributed computing environments. Additionally, the paper discusses use cases in sectors
44

He, Yuheng, Jin Qian, Juanjie Zhang, and Renzhe Zhang. "Word frequency statistics based on MapReduce on serverless platforms." Applied and Computational Engineering 68, no. 1 (2024): 356–67. http://dx.doi.org/10.54254/2755-2721/68/20241536.

Abstract:
This paper investigates the application of serverless computing in conjunction with the MapReduce framework, particularly in machine learning (ML) tasks. The MapReduce programming model has been widely used to process large-scale datasets by simplifying parallel and distributed data processing. This study explores how the combination of these two technologies can provide more efficient and cost-effective ML solutions. Through a detailed analysis of serverless environments and the MapReduce framework, this paper shows how the combination can advance the fields of cloud computing and machine lea
45

Wang, Peng, Jia Nan Wang, Ji Ci Ba, and Yu Tan. "Treatment and Research of Massive Data Mining Based on Cloud Computing." Advanced Materials Research 765-767 (September 2013): 941–44. http://dx.doi.org/10.4028/www.scientific.net/amr.765-767.941.

Abstract:
This paper introduces SPRINT algorithm optimized in the Hadoop core framework. Combining the data mining process, we will study the cloud computing in the MapReduce programming model, then improve and optimize the SPRINT algorithm in conjunction with the mode, transplant the optimized algorithm to Hadoop platform for distributed data processing.
46

Chandra Sekhar Reddy, L., and Dr D. Murali. "YouTube: big data analytics using Hadoop and map reduce." International Journal of Engineering & Technology 7, no. 3.29 (2018): 12. http://dx.doi.org/10.14419/ijet.v7i3.29.18451.

Abstract:
We live today in a digital world a tremendous amount of data is generated by each digital service we use. This vast amount of data generated is called Big Data. According to Wikipedia, Big Data is a word for large data sets or compositions that the traditional data monitoring application software is pitiful to compress [5]. Extensive data cannot be used to receive data, store data, analyse data, search, share, transfer, view, consult, and update and maintain the confidentiality of information. Google's streaming services, YouTube, are one of the best examples of services that produce a massive
47

Muhammad, Sharafadeen, Ibrahim Kabiru Dahiru, Ahmad Abubakar, and Muhammad Sanusi Ibrahim. "MODELING OF SYSTEMS UNDER CLOUD ENVIRONMENT." ASEAN Engineering Journal 11, no. 3 (2021): 190–98. http://dx.doi.org/10.11113/aej.v11.17054.

Abstract:
The emergence of large amount of data requires an efficient means of processing and storage facilities. Cloud computing provides an effective solution; MapReduce programming paradigm has the ability to handle such data by implementing Hadoop, but came up with some conflicting challenges in terms of Service Level Agreement (SLA) between major stakeholders. This paper focuses on coming up with a MapReduce model through system identification in order to address the requirement of the service time to meet-up the SLA within the limit of defined threshold in the presence of uncertainties in the syst
48

Gévay, Gábor E., Juan Soto, and Volker Markl. "Handling Iterations in Distributed Dataflow Systems." ACM Computing Surveys 54, no. 9 (2022): 1–38. http://dx.doi.org/10.1145/3477602.

Abstract:
Over the past decade, distributed dataflow systems (DDS) have become a standard technology. In these systems, users write programs in restricted dataflow programming models, such as MapReduce, which enable them to scale out program execution to a shared-nothing cluster of machines. Yet, there is no established consensus that prescribes how to extend these programming models to support iterative algorithms. In this survey, we review the research literature and identify how DDS handle control flow, such as iteration, from both the programming model and execution level perspectives. This survey w
49

Wu, Yao, Long Zheng, Brian Heilig, and Guang R. Gao. "HAMR: A dataflow-based real-time in-memory cluster computing engine." International Journal of High Performance Computing Applications 31, no. 5 (2016): 361–74. http://dx.doi.org/10.1177/1094342016672080.

Abstract:
As the attention given to big data grows, cluster computing systems for distributed processing of large data sets become the mainstream and critical requirement in high performance distributed system research. One of the most successful systems is Hadoop, which uses MapReduce as a programming/execution model and takes disks as intermedia to process huge volumes of data. Spark, as an in-memory computing engine, can solve the iterative and interactive problems more efficiently. However, currently it is a consensus that they are not the final solutions to big data due to a MapReduce-like programm
50

González-Vélez, Horacio, and Maryam Kontagora. "Performance evaluation of MapReduce using full virtualisation on a departmental cloud." International Journal of Applied Mathematics and Computer Science 21, no. 2 (2011): 275–84. http://dx.doi.org/10.2478/v10006-011-0020-3.

Abstract:
This work analyses the performance of Hadoop, an implementation of the MapReduce programming model for distributed parallel computing, executing on a virtualisation environment comprised of 1+16 nodes running the VMWare workstation software. A set of experiments using the standard Hadoop benchmarks has been designed in order to determine whether or not significant reductions in the execution time of computations are experienced when using Hadoop on this virtualisation platform on a departmental cloud. Our find