Journal articles on the topic 'Data anonymization'

1

Lingala, Thirupathi, C. Kishor Kumar Reddy, B. V. Ramana Murthy, Rajashekar Shastry, and YVSS Pragathi. "L-Diversity for Data Analysis: Data Swapping with Customized Clustering." Journal of Physics: Conference Series 2089, no. 1 (November 1, 2021): 012050. http://dx.doi.org/10.1088/1742-6596/2089/1/012050.

Abstract:
Data anonymization should support the analysts who intend to use the anonymized data. Releasing datasets that contain personal information requires anonymization that balances privacy concerns against the utility of the data. This work shows how choosing anonymization techniques with the data analyst's requirements in mind improves effectiveness quantitatively, by minimizing the discrepancy between querying the original data and querying the anonymized result, and qualitatively, by simplifying the workflow for querying the data.
2

Tomás, Joana, Deolinda Rasteiro, and Jorge Bernardino. "Data Anonymization: An Experimental Evaluation Using Open-Source Tools." Future Internet 14, no. 6 (May 30, 2022): 167. http://dx.doi.org/10.3390/fi14060167.

Abstract:
In recent years, the use of personal data in marketing, scientific and medical investigation, and forecasting future trends has increased considerably. This information is used by governments, companies, and individuals, and should not contain any sensitive information that allows the identification of an individual. Data anonymization is therefore essential. It changes the original data to make it difficult to identify an individual. ARX Data Anonymization and Amnesia are two popular open-source tools that simplify this process. In this paper, we evaluate these tools in two ways: with the OSSpal methodology, and using a public dataset with the most recent tweets about the Pfizer and BioNTech vaccine. The assessment with the OSSpal methodology determines that ARX Data Anonymization has better results than Amnesia. In the experimental evaluation using the public dataset, Amnesia shows some errors and limitations, but its anonymization process is simpler. With ARX Data Anonymization, it is possible to upload big datasets, and the tool does not show any error in the anonymization process. We conclude that ARX Data Anonymization is the recommended tool for data anonymization.
3

Pejić Bach, Mirjana, Jasmina Pivar, and Ksenija Dumičić. "Data anonymization patent landscape." Croatian Operational Research Review 8, no. 1 (March 31, 2017): 265–81. http://dx.doi.org/10.17535/crorr.2017.0017.

4

Lasko, Thomas A., and Staal A. Vinterbo. "Spectral Anonymization of Data." IEEE Transactions on Knowledge and Data Engineering 22, no. 3 (March 2010): 437–46. http://dx.doi.org/10.1109/tkde.2009.88.

5

K., Sivasankari Krishnakumar, and Dr Uma Maheswari K.M. "A Comprehensive Review on Data Anonymization Techniques for Social Networks." Webology 19, no. 1 (January 20, 2022): 380–405. http://dx.doi.org/10.14704/web/v19i1/web19028.

Abstract:
Many individuals all around the world use social media to communicate information. Numerous companies utilize social data mining to deduce many fascinating facts from social data modeled as a complicated network structure. However, releasing social data has a direct and indirect impact on the privacy of many of its users. Recently, several anonymization techniques have been created to safeguard personal information about users and their relationships in social media. This study presents a comprehensive survey of various data anonymization strategies for social network data and analyzes their advantages and disadvantages. It also addresses the major research concerns surrounding the efficiency of anonymization approaches.
6

Ji, Shouling, Weiqing Li, Mudhakar Srivatsa, Jing Selena He, and Raheem Beyah. "General Graph Data De-Anonymization." ACM Transactions on Information and System Security 18, no. 4 (May 6, 2016): 1–29. http://dx.doi.org/10.1145/2894760.

7

Jang, Sung-Bong, and Young-Woong Ko. "Efficient multimedia big data anonymization." Multimedia Tools and Applications 76, no. 17 (December 1, 2015): 17855–72. http://dx.doi.org/10.1007/s11042-015-3123-2.

8

Kumar, Sindhe Phani, and R. Anandan. "Data Verification of Logical Pk-Anonymization with Big Data Application and Key Generation in Cloud Computing." Journal of Function Spaces 2022 (June 23, 2022): 1–10. http://dx.doi.org/10.1155/2022/8345536.

Abstract:
Background. As more data becomes available about how frequently the cloud can be updated, a more comprehensive picture of its safety is emerging. The proposed work uses a cloud-based gradual clustering device to cluster and refresh a large number of informational indexes in a useful manner. Purpose. Anonymization of data is done at the point of collection in order to safeguard the data. More secure than K-Anonymization, Pk-Anonymization is the area’s first randomization method. A cloud service provider (CSP) is an independent company that provides a cloud-based network and computing resources. Customers’ security and connection protection must be verified by an authority before data may be transferred to cloud servers for storage. Method. Logical Pk-Anonymization and key generation techniques are proposed in this work in order to verify the cloud records, as well as to store sensitive information in the cloud. Cloud-based informational indexes are used in the proposed framework, which is effective at handling large amounts of data through MapReduce; a parallel data preparation form is obtained; to get all information as new facts that joins after a while, information anonymization techniques to carry out each protection and immoderate information utilization while updating take place; information loss and clean time is reduced for substantial amounts of data. As a result, the safety and records software might be in sync.
9

Ead, Waleed M., et al. "A General Framework Information Loss of Utility-Based Anonymization in Data Publishing." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 5 (April 11, 2021): 1450–56. http://dx.doi.org/10.17762/turcomat.v12i5.2102.

Abstract:
To build an anonymization, the data anonymizer must settle three issues: first, which data should be preserved; second, which adversary background knowledge could be used to disclose the anonymized data; and third, how the anonymized data will be used. Different anonymization techniques follow from these three questions, according to different adversary background knowledge and information usage (information utility). In other words, different anonymization techniques lead to different information loss. In this paper, we propose a general framework for utility-based anonymization that minimizes the information loss in published data, with a trade-off guarantee of achieving the required privacy level.
10

Gießler, Fina, Maximilian Thormann, Bernhard Preim, Daniel Behme, and Sylvia Saalfeld. "Facial Feature Removal for Anonymization of Neurological Image Data." Current Directions in Biomedical Engineering 7, no. 1 (August 1, 2021): 130–34. http://dx.doi.org/10.1515/cdbme-2021-1028.

Abstract:
Interdisciplinary exchange of medical datasets between clinicians and engineers is essential for clinical research. Due to the Data Protection Act, which preserves the rights of patients, full anonymization is necessary before any exchange can take place. Due to the continuous improvement of image quality of tomographic datasets, anonymization of patient-specific information is not sufficient. In this work, we present a prototype that reliably obscures the facial features in patient data, thus enabling anonymization of neurological datasets in image space.
11

Ruiz, Nicolas. "A General Cipher for Individual Data Anonymization." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 28, no. 05 (September 30, 2020): 727–56. http://dx.doi.org/10.1142/s0218488520500312.

Abstract:
Over the years, the literature on individual data anonymization has burgeoned in many directions. While such diversity should be praised, it does not come without some difficulties. Currently, the task of selecting the optimal analytical environment is complicated by the multitude of available choices and the fact that the performance of any method generally depends on the data properties. In light of these issues, the contribution of this paper is twofold. First, based on recent insights from the literature and inspired by cryptography, it proposes a new anonymization method that shows that the task of anonymization can ultimately rely only on rank permutations. As a result, the method offers a new way to practice data anonymization by performing it ex-ante and independently of the distributional features of the data, instead of engaging, as is currently the case in the literature, in several ex-post evaluations and iterations to reach the protection and information properties sought. Second, the method establishes a conceptual connection across the field, as it can mimic all the currently existing tools. To make the method operational, this paper also proposes the introduction of permutation menus in data anonymization, where recently developed universal measures of disclosure risk and information loss are used ex-ante for the calibration of permutation keys. To justify the relevance of their use, a theoretical characterization of these measures is also proposed.
12

Ji, Shouling, Prateek Mittal, and Raheem Beyah. "Graph Data Anonymization, De-Anonymization Attacks, and De-Anonymizability Quantification: A Survey." IEEE Communications Surveys & Tutorials 19, no. 2 (2017): 1305–26. http://dx.doi.org/10.1109/comst.2016.2633620.

13

Parameshwarappa, Pooja, Zhiyuan Chen, and Gunes Koru. "An Effective and Computationally Efficient Approach for Anonymizing Large-Scale Physical Activity Data." International Journal of Information Security and Privacy 14, no. 3 (July 2020): 72–94. http://dx.doi.org/10.4018/ijisp.2020070105.

Abstract:
Publishing physical activity data can facilitate reproducible health-care research in several areas such as population health management, behavioral health research, and management of chronic health problems. However, publishing such data also brings high privacy risks related to re-identification, which makes anonymization necessary. One of the challenges in anonymizing physical activity data collected periodically is its sequential nature. The existing anonymization techniques work sufficiently well for cross-sectional data but have high computational costs when applied directly to sequential data. This article presents an effective anonymization approach, multi-level clustering-based anonymization, to anonymize physical activity data. Compared with the conventional methods, the proposed approach improves time complexity by reducing the clustering time drastically. While doing so, it preserves utility as well as the conventional approaches do.
14

Liber, Arkadiusz. "The issues connected with the anonymization of medical data. Part 2. Advanced anonymization and anonymization controlled by owner of protected sensitive data." Medical Science Pulse 8, no. 2 (August 7, 2014): 13–24. http://dx.doi.org/10.5604/01.3001.0003.3161.

Abstract:
Introduction: Medical documentation ought to be accessible with the preservation of its integrity as well as the protection of personal data. One of the manners of its protection against disclosure is anonymization. Contemporary methods ensure anonymity without the possibility of sensitive data access control. It seems that the future of sensitive data processing systems belongs to the personalized method. In the first part of the paper, the k-Anonymity, (X,Y)-Anonymity, (α,k)-Anonymity, and (k,e)-Anonymity methods were discussed. These methods belong to the well-known elementary methods which are the subject of a significant number of publications. As the source papers for this part, the works of Samarati, Sweeney, Wang, Wong and Zhang were accredited. The selection of these publications is justified by their wider research review work led, for instance, by Fung, Wang, Fu and Y. However, it should be noted that the methods of anonymization derive from the methods of statistical database protection from the 1970s. Due to the interrelated content and literature references, the first and the second part of this article constitute an integral whole. Aim of the study: The analysis of the methods of anonymization, the analysis of the methods of protection of anonymized data, and the study of a new security type of privacy-enabling device to control the disclosure of sensitive data by the entity which this data concerns. Material and methods: Analytical methods, algebraic methods. Results: Delivering material supporting the choice and analysis of the ways of anonymization of medical data, and developing a new privacy protection solution enabling the control of sensitive data by the entities which this data concerns. Conclusions: In the paper, the analysis of solutions for data anonymization, to ensure privacy protection in medical data sets, was conducted. The methods of k-Anonymity, (X,Y)-Anonymity, (α,k)-Anonymity, (k,e)-Anonymity, (X,Y)-Privacy, LKC-Privacy, l-Diversity, (X,Y)-Linkability, t-Closeness, Confidence Bounding and Personalized Privacy were described, explained and analyzed. The analysis of solutions for controlling sensitive data by their owner was also conducted. Apart from the existing methods of anonymization, the analysis of methods of protection of anonymized data was included. In particular, the methods of δ-Presence, ε-Differential Privacy, (d,γ)-Privacy, (α,β)-Distributing Privacy and protections against (c,t)-isolation were analyzed. Moreover, the author introduced a new solution for the controlled protection of privacy. The solution is based on marking a protected field and the multi-key encryption of the sensitive value. The suggested way of marking the fields is in accordance with the XML standard. For the encryption, an (n,p) different keys cipher was selected. To decipher the content, p keys of n were used. The proposed solution makes it possible to apply brand new methods to control the privacy of disclosing sensitive data.
15

Xu, Heng, and Nan Zhang. "Implications of Data Anonymization on the Statistical Evidence of Disparity." Management Science 68, no. 4 (April 2022): 2600–2618. http://dx.doi.org/10.1287/mnsc.2021.4028.

Abstract:
Research and practical development of data-anonymization techniques have proliferated in recent years. Yet, limited attention has been paid to examining the potentially disparate impact of privacy protection on underprivileged subpopulations. This study is one of the first attempts to examine the extent to which data anonymization could mask the gross statistical disparities between subpopulations in the data. We first describe two common mechanisms of data anonymization and two prevalent types of statistical evidence for disparity. Then, we develop a conceptual foundation and mathematical formalism demonstrating that the two data-anonymization mechanisms have distinctive impacts on the identifiability of disparity, which also varies based on its statistical operationalization. After validating our findings with empirical evidence, we discuss the business and policy implications, highlighting the need for firms and policy makers to balance between the protection of privacy and the recognition/rectification of disparate impact. This paper was accepted by Chris Forman, information systems.
16

Bazai, Sibghat Ullah, Julian Jang-Jaccard, and Hooman Alavizadeh. "A Novel Hybrid Approach for Multi-Dimensional Data Anonymization for Apache Spark." ACM Transactions on Privacy and Security 25, no. 1 (February 28, 2022): 1–25. http://dx.doi.org/10.1145/3484945.

Abstract:
Multi-dimensional data anonymization approaches (e.g., Mondrian) ensure more fine-grained data privacy by providing a different anonymization strategy for each attribute. Many variations of multi-dimensional anonymization have been implemented on different distributed processing platforms (e.g., MapReduce, Spark) to take advantage of their scalability and parallelism support. According to our critical analysis of overheads, neither the existing iteration-based nor the recursion-based approaches provide effective mechanisms for creating the optimal number and relative size of resilient distributed datasets (RDDs), and they thus suffer heavily from performance overheads. To solve this issue, we propose a novel hybrid approach for effectively implementing a multi-dimensional data anonymization strategy (e.g., Mondrian) that is scalable and provides high performance. Our hybrid approach provides a mechanism to create far fewer RDDs, with smaller partitions attached to each RDD, than existing approaches. This optimal RDD creation and operations approach is critical for many multi-dimensional data anonymization applications that create tremendous execution complexity. The new mechanism in our proposed hybrid approach can dramatically reduce the critical overheads involved in re-computation cost, shuffle operations, message exchange, and cache management.
17

Hemmatazad, Nolan, Robin Gandhi, Qiuming Zhu, and Sanjukta Bhowmick. "The Intelligent Data Brokerage." International Journal of Privacy and Health Information Management 2, no. 1 (January 2014): 22–33. http://dx.doi.org/10.4018/ijphim.2014010102.

Abstract:
The anonymization of widely distributed or open data has been a topic of great interest to privacy advocates in recent years. The goal of anonymization in these cases is to make data available to a larger audience, extending the utility of the data to new environments and evolving use cases without compromising the personal information of individuals whose data are being distributed. The resounding issue with such practices is that, with any anonymity measure, there is a trade-off between privacy and utility, where maximizing one carries a cost to the other. In this paper, the authors propose a framework for the utility-preserving release of anonymized data, based on the idea of intelligent data brokerages. These brokerages act as intermediaries between users requesting access to information resources and an existing database management system (DBMS). Through the use of a formal language for interpreting user information requests, customizable anonymization policies, and optional natural language processing (NLP) capabilities, data brokerages can maximize the utility of data in-context when responding to user inquiries.
18

Hossayni, Hicham, Imran Khan, and Noel Crespi. "Data Anonymization for Maintenance Knowledge Sharing." IT Professional 23, no. 5 (September 1, 2021): 23–30. http://dx.doi.org/10.1109/mitp.2021.3066244.

19

Sopaoglu, Ugur, and Osman Abul. "Classification utility aware data stream anonymization." Applied Soft Computing 110 (October 2021): 107743. http://dx.doi.org/10.1016/j.asoc.2021.107743.

20

GONG, Qi-Yuan, Ming YANG, and Jun-Zhou LUO. "Data Anonymization Approach for Incomplete Microdata." Journal of Software 24, no. 12 (January 17, 2014): 2883–96. http://dx.doi.org/10.3724/sp.j.1001.2013.04411.

21

Sun, X., H. Wang, J. Li, and Y. Zhang. "Satisfying Privacy Requirements Before Data Anonymization." Computer Journal 55, no. 4 (March 17, 2011): 422–37. http://dx.doi.org/10.1093/comjnl/bxr028.

22

Zakerzadeh, Hessam, Charu C. Aggarwal, and Ken Barker. "Managing dimensionality in data privacy anonymization." Knowledge and Information Systems 49, no. 1 (December 11, 2015): 341–73. http://dx.doi.org/10.1007/s10115-015-0906-8.

23

Gambs, Sébastien, Marc-Olivier Killijian, and Miguel Núñez del Prado Cortez. "De-anonymization attack on geolocated data." Journal of Computer and System Sciences 80, no. 8 (December 2014): 1597–614. http://dx.doi.org/10.1016/j.jcss.2014.04.024.

24

Gál, Tamás Zoltán, Gábor Kovács, and Zsolt T. Kardkovács. "Survey on privacy preserving data mining techniques in health care databases." Acta Universitatis Sapientiae, Informatica 6, no. 1 (June 1, 2014): 33–55. http://dx.doi.org/10.2478/ausi-2014-0017.

Abstract:
In health care databases, there are tireless and antagonistic interests between data mining research and privacy preservation: the more you try to hide sensitive private information, the less valuable the data is for analysis. In this paper, we give an outlook on data anonymization problems through case studies. We give a summary of the state-of-the-art health care data anonymization issues, including the legal environment and expectations, the most common attacking strategies on privacy, and the proposed metrics for evaluating usefulness and privacy preservation for anonymization. Finally, we summarize the strengths and shortcomings of different approaches and techniques from the literature based on these evaluations.
25

Vokinger, Kerstin N., Daniel J. Stekhoven, and Michael Krauthammer. "Lost in Anonymization — A Data Anonymization Reference Classification Merging Legal and Technical Considerations." Journal of Law, Medicine & Ethics 48, no. 1 (2020): 228–31. http://dx.doi.org/10.1177/1073110520917025.

26

Arca, Sevgi, and Rattikorn Hewett. "Analytics on Anonymity for Privacy Retention in Smart Health Data." Future Internet 13, no. 11 (October 28, 2021): 274. http://dx.doi.org/10.3390/fi13110274.

Abstract:
Advancements in smart technology, wearable and mobile devices, and the Internet of Things have made smart health an integral part of modern living to better individual healthcare and well-being. By enhancing self-monitoring, data collection and sharing among users and service providers, smart health can increase healthy lifestyles, enable timely treatments, and save lives. However, as health data become larger and more accessible to multiple parties, they become vulnerable to privacy attacks. One way to safeguard privacy is to increase users’ anonymity, as anonymity increases indistinguishability and makes re-identification harder. Still, the challenge is not only to preserve data privacy but also to ensure that the shared data are sufficiently informative to be useful. Our research studies health data analytics focusing on anonymity for privacy protection. This paper presents a multi-faceted analytical approach to (1) identifying attributes susceptible to information leakages by using an entropy-based measure to analyze information loss, (2) anonymizing the data by generalization using attribute hierarchies, and (3) balancing between anonymity and informativeness by our anonymization technique that produces anonymized data satisfying a given anonymity requirement while optimizing data retention. Our anonymization technique is an automated Artificial Intelligence search based on two simple heuristics. The paper describes and illustrates the detailed approach and analytics, including pre- and post-anonymization analytics. Experiments on published data are performed on the anonymization technique. Results, compared with other similar techniques, show that our anonymization technique gives the most effective data sharing solution with respect to computational cost and balancing between anonymity and data retention.
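As an illustration of step (2) in the abstract above, generalization along attribute hierarchies can be sketched as follows. This is a minimal sketch and not the authors' implementation: the two hierarchies (masking trailing ZIP digits, widening age into 10-year bands), the field names, and the helper functions are all assumptions made for the example.

```python
def generalize_zip(zip_code: str, level: int) -> str:
    """Replace the last `level` characters of a ZIP code with '*' (hypothetical hierarchy)."""
    level = max(0, min(level, len(zip_code)))
    return zip_code[: len(zip_code) - level] + "*" * level


def generalize_age(age: int, level: int) -> str:
    """Level 0 keeps the exact age, level 1 gives a 10-year band, higher levels suppress."""
    if level <= 0:
        return str(age)
    if level == 1:
        low = (age // 10) * 10
        return f"{low}-{low + 9}"
    return "*"


def generalize_record(record: dict, levels: dict) -> dict:
    """Return a copy of `record` with 'zip' and 'age' generalized to the requested levels."""
    out = dict(record)
    out["zip"] = generalize_zip(record["zip"], levels.get("zip", 0))
    out["age"] = generalize_age(record["age"], levels.get("age", 0))
    return out


# Illustrative record; the values are made up for the example.
patient = {"age": 34, "zip": "13053", "diagnosis": "flu"}
generalized = generalize_record(patient, {"age": 1, "zip": 2})
print(generalized)
```

Higher levels give more anonymity and less information, which is the trade-off the abstract describes; choosing the levels automatically is the search problem the paper addresses and is not attempted here.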
27

Madan, Suman, and Puneet Goswami. "A Technique for Securing Big Data Using K-Anonymization With a Hybrid Optimization Algorithm." International Journal of Operations Research and Information Systems 12, no. 4 (October 2021): 1–21. http://dx.doi.org/10.4018/ijoris.20211001.oa3.

Abstract:
The recent techniques built on cloud computing for data processing are scalable and secure, which increasingly attracts infrastructure to support big data applications. This paper proposes an effective anonymization-based privacy preservation model using k-anonymization criteria and Grey Wolf-Cat Swarm Optimization (GWCSO) for attaining privacy preservation in big data. The anonymization technique is processed by adapting k-anonymization criteria for duplicating k records from the original database. The proposed GWCSO is developed by integrating Grey Wolf Optimizer (GWO) and Cat Swarm Optimization (CSO) for constructing the k-anonymized database, which reveals only the essential details to the end users by hiding the confidential information. The experimental results of the proposed technique are compared with various existing techniques based on performance metrics such as Classification accuracy (CA) and Information loss (IL). The experimental results show that the proposed technique attains an improved CA value of 0.005 and IL value of 0.798, respectively.
28

Jiang, Lili, and Vicenç Torra. "Data Protection and Multi-Database Data-Driven Models." Future Internet 15, no. 3 (February 27, 2023): 93. http://dx.doi.org/10.3390/fi15030093.

Abstract:
Anonymization and data masking have effects on data-driven models. Different anonymization methods have been developed to provide a good trade-off between privacy guarantees and data utility. Nevertheless, the effects of data protection (e.g., data microaggregation and noise addition) on data integration and on data-driven models (e.g., machine learning models) built from these data are not known. In this paper, we study how data protection affects data integration, and the corresponding effects on the results of machine learning models built from the outcome of the data integration process. The experimental results show that the levels of protection that prevent proper database integration do not affect machine learning models that learn from the integrated database to the same degree. Concretely, our preliminary analysis and experiments show that data protection techniques have a lower level of impact on data integration than on machine learning models.
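Microaggregation, one of the protection methods named in the abstract above, can be illustrated with a short sketch. This is a simple univariate, fixed-size variant chosen for clarity and is not the procedure used in the paper: values are sorted, grouped k at a time, and each value is replaced by its group mean. The function name and the sample values are assumptions.

```python
def microaggregate(values, k):
    """Univariate fixed-size microaggregation.

    Sort the values, split them into consecutive groups of k, and replace each
    value by the mean of its group. A final group smaller than k is merged into
    the previous one so that every group has at least k members.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    result = [0.0] * len(values)
    for group in groups:
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
    return result


# Illustrative values only.
salaries = [10, 50, 12, 48, 11, 52, 49]
masked = microaggregate(salaries, 3)
print(masked)
```

Each masked value is shared by at least k records, while the overall sum (and hence the mean) of the attribute is preserved.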
29

Zouinina, Sarah, Younès Bennani, Nicoleta Rogovschi, and Abdelouahid Lyhyaoui. "Data Anonymization through Collaborative Multi-view Microaggregation." Journal of Intelligent Systems 30, no. 1 (October 2, 2020): 327–45. http://dx.doi.org/10.1515/jisys-2020-0026.

Abstract:
The interest in data anonymization is growing exponentially, motivated by the will of governments to open their data. The main challenge of data anonymization is to find a balance between data utility and the amount of disclosure risk. One of the best-known frameworks of data anonymization is k-anonymity. This method assumes that a dataset is anonymous if and only if, for each element of the dataset, there exist at least k − 1 elements identical to it. In this paper, we propose two techniques to achieve k-anonymity through microaggregation: k-CMVM and Constrained-CMVM. Both use topological collaborative clustering to obtain k-anonymous data. The first one determines the k levels automatically and the second defines them by exploration. We also improved the results of these two approaches by using pLVQ2 as a weighted vector quantization method. The four methods proposed were proven to be efficient using two data utility measures, the separability utility and the structural utility. The experimental results have shown a very promising performance.
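The k-anonymity condition stated in the abstract above can be checked mechanically. The following is a minimal sketch under the usual assumption that "identical" means identical on a chosen set of quasi-identifier attributes; the function name, the field names, and the sample records are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in counts.values())


# Illustrative, already generalized records.
records = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "diabetes"},
]
two_anonymous = is_k_anonymous(records, ["age", "zip"], 2)
three_anonymous = is_k_anonymous(records, ["age", "zip"], 3)
print(two_anonymous, three_anonymous)
```

The check says nothing about how the records were made k-anonymous; techniques such as the microaggregation variants proposed in the paper are one way to reach that state.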
30

Al-Zobbi, Mohammed Essa, Seyed Shahrestani, and Chun Ruan. "Achieving Optimal K-Anonymity Parameters for Big Data." International Journal of Information, Communication Technology and Applications 4, no. 1 (May 15, 2018): 23–33. http://dx.doi.org/10.17972/ijicta20184136.

Abstract:
Datasets containing private and sensitive information are useful for data analytics. Data owners cautiously release such sensitive data using privacy-preserving publishing techniques. The possibility of personal re-identification is much larger than ever before. For instance, social media has dramatically increased the exposure to privacy violation. One well-known technique, k-anonymity, proposes a protection approach against privacy exposure. K-anonymity seeks groups of k equivalent data records. The chosen attributes are known as quasi-identifiers. This approach may reduce personal re-identification. However, it may lessen the usefulness of the information gained. The value of k should be carefully determined to balance security against information gained. Unfortunately, there is no standard procedure to define the value of k. The problem of optimal k-anonymization is NP-hard. In this paper, we propose a greedy-based heuristic approach that provides an optimal value for k. The approach evaluates the empirical risk concerning our Sensitivity-Based Anonymization method. Our approach is derived from the fine-grained access and business role anonymization for big data, which forms our framework.
31

Indhumathi, R., and S. Sathiya Devi. "Anonymization Based on Improved Bucketization (AIB): A Privacy-Preserving Data Publishing Technique for Improving Data Utility in Healthcare Data." Journal of Medical Imaging and Health Informatics 11, no. 12 (December 1, 2021): 3164–73. http://dx.doi.org/10.1166/jmihi.2021.3901.

Abstract:
Data sharing is essential in present biomedical research. A large quantity of medical information is gathered for different objectives of analysis and study. Because of its large collection, anonymity is essential. Thus, it is quite important to preserve privacy and prevent leakage of sensitive information of patients. Most anonymization methods, such as generalisation, suppression and perturbation, are proposed to overcome information leaks, but they degrade the utility of the collected data. During data sanitization, the utility is automatically diminished. Privacy Preserving Data Publishing faces the main drawback of maintaining the tradeoff between privacy and data utility. To address this issue, an efficient algorithm called Anonymization based on Improved Bucketization (AIB) is proposed, which increases the utility of published data while maintaining privacy. The Bucketization technique is used in this paper with the intervention of the clustering method. The proposed work is divided into four stages: (i) vertical and horizontal partitioning, (ii) assigning a sensitive index to attributes in the cluster, (iii) verifying each cluster against the privacy threshold, and (iv) examining for privacy breach in the Quasi Identifier (QI). To increase the utility of published data, the threshold value is determined based on the distribution of elements in each attribute, and the anonymization method is applied only to the specific QI element. As a result, the data utility has been improved. Finally, the evaluation results validated the design of the paper and demonstrated that our design is effective in improving data utility.
32

Abd Razak, Shukor, Nur Hafizah Mohd Nazari, and Arafat Al-Dhaqm. "Data Anonymization Using Pseudonym System to Preserve Data Privacy." IEEE Access 8 (2020): 43256–64. http://dx.doi.org/10.1109/access.2020.2977117.

33

M, Abhishek, and Poornima Kulkarni. "BIG DATA PRIVACY ANONYMIZATION ALGORITHMS: A REVIEW." International Journal of Engineering Applied Sciences and Technology 5, no. 4 (August 1, 2020): 147–50. http://dx.doi.org/10.33564/ijeast.2020.v05i04.019.

34

K, Venkata Ramana, and Valli Kumari V. "Graph Based Local Recoding for Data Anonymization." International Journal of Database Management Systems 5, no. 4 (August 31, 2013): 1–15. http://dx.doi.org/10.5121/ijdms.2013.5401.

35

Brankovic, Ljiljana, Nacho López, Mirka Miller, and Francesc Sebé. "Triangle randomization for social network data anonymization." Ars Mathematica Contemporanea 7, no. 2 (June 27, 2014): 461–77. http://dx.doi.org/10.26493/1855-3974.220.34c.

36

WANG, Zhi-Hui, Jian XU, Wei WANG, and Bai-Le SHI. "A Clustering-Based Approach for Data Anonymization." Journal of Software 21, no. 4 (March 11, 2010): 680–93. http://dx.doi.org/10.3724/sp.j.1001.2010.03508.

37

Ji, Shouling, Weiqing Li, Mudhakar Srivatsa, and Raheem Beyah. "Structural Data De-Anonymization: Theory and Practice." IEEE/ACM Transactions on Networking 24, no. 6 (December 2016): 3523–36. http://dx.doi.org/10.1109/tnet.2016.2536479.

38

Terrovitis, Manolis, Nikos Mamoulis, and Panos Kalnis. "Privacy-preserving anonymization of set-valued data." Proceedings of the VLDB Endowment 1, no. 1 (August 2008): 115–25. http://dx.doi.org/10.14778/1453856.1453874.

39

Kohlmayer, Florian, Fabian Prasser, Claudia Eckert, and Klaus A. Kuhn. "A flexible approach to distributed data anonymization." Journal of Biomedical Informatics 50 (August 2014): 62–76. http://dx.doi.org/10.1016/j.jbi.2013.12.002.

40

Maltz, David A., Jibin Zhan, Gisli Hjalmtysson, Albert Greenberg, Jennifer Rexford, Geoffrey G. Xie, and Hui Zhang. "Structure preserving anonymization of router configuration data." IEEE Journal on Selected Areas in Communications 27, no. 3 (April 2009): 349–58. http://dx.doi.org/10.1109/jsac.2009.090410.

41

Gurav, Mallappa. "ANONYMIZATION OF DATA USING MAPREDUCE ON CLOUD." International Journal of Research in Engineering and Technology 04, no. 07 (July 25, 2015): 142–46. http://dx.doi.org/10.15623/ijret.2015.0407021.

42

Loukides, Grigorios, Aris Gkoulalas-Divanis, and Jianhua Shao. "Efficient and flexible anonymization of transaction data." Knowledge and Information Systems 36, no. 1 (September 9, 2012): 153–210. http://dx.doi.org/10.1007/s10115-012-0544-3.

43

Li, Jiuyong, Jixue Liu, Muzammil Baig, and Raymond Chi-Wing Wong. "Information based data anonymization for classification utility." Data & Knowledge Engineering 70, no. 12 (December 2011): 1030–45. http://dx.doi.org/10.1016/j.datak.2011.07.001.

44

Marrón, David González, Verónica Paola Corona Ramírez, Angélica Enciso González, Alejandro Márquez Callejas, and Iridian Sandivel Pérez Hernández. "Development of an application to anonymize data to be shared in the cloud." South Florida Journal of Development 3, no. 4 (August 16, 2022): 5310–18. http://dx.doi.org/10.46932/sfjdv3n4-097.

Abstract:
An interface design is proposed, to be developed as a lightweight application that allows the use of different anonymization algorithms proposed by various authors. The application focuses on adapting data from distributed applications that must interact using the lightweight JSON data-interchange format. User interaction is minimized during the anonymization process, and an interface is provided to make it easier for users to select anonymization algorithms.
45

Marrón, David González, Verónica Paola Corona Ramírez, Angélica Enciso González, Alejandro Márquez Callejas, and Iridian Sandivel Pérez Hernández. "Development of an application to anonymize data to be shared in the cloud." South Florida Journal of Development 3, no. 4 (July 21, 2022): 4696–703. http://dx.doi.org/10.46932/sfjdv3n4-047.

Abstract:
An interface design is proposed, to be developed as a lightweight application that allows the use of different anonymization algorithms proposed by various authors. The application focuses on adapting data from distributed applications that must interact using the lightweight JSON data-interchange format. User interaction is minimized during the anonymization process, and an interface is provided to make it easier for users to select anonymization algorithms.
46

Bazai, Sibghat Ullah, Julian Jang-Jaccard, and Hooman Alavizadeh. "Scalable, High-Performance, and Generalized Subtree Data Anonymization Approach for Apache Spark." Electronics 10, no. 5 (March 3, 2021): 589. http://dx.doi.org/10.3390/electronics10050589.

Abstract:
Data anonymization strategies such as subtree generalization have been hailed as techniques that provide a more efficient generalization strategy compared to full-tree generalization counterparts. Many subtree-based generalization strategies (e.g., top-down, bottom-up, and hybrid) have been implemented on the MapReduce platform to take advantage of scalability and parallelism. However, MapReduce inherently lacks support for iteration-intensive algorithm implementations such as subtree generalization. This paper proposes a Resilient Distributed Dataset (RDD)-based implementation of a subtree-based data anonymization technique for Apache Spark to address the issues associated with MapReduce-based counterparts. We describe our RDD-based approach, which offers effective partition management, improved memory usage through caching of frequently referenced intermediate values, and enhanced iteration support. Our experimental results show high performance compared to existing state-of-the-art privacy-preserving approaches while ensuring the data utility and privacy levels required of any competitive data anonymization technique.
47

Kathavate, Pravin N., and J. Amudhavel. "Route Map of Privacy Preservation to IOT." International Journal of Engineering & Technology 7, no. 2.7 (March 18, 2018): 825. http://dx.doi.org/10.14419/ijet.v7i2.7.11076.

Abstract:
Data anonymization is a central element of privacy preservation, and it helps remove privacy hazards during data preparation in various applications, including IoT. Pseudonymity and anonymization are two significant security measures adopted when sensitive data are shared. In the medical field, data is usually distributed horizontally, with diverse regions carrying a similar set of characteristics for various anonymization techniques. Accordingly, this paper presents a review of privacy preservation in IoT. The literature review covers diverse techniques associated with data hiding, data preservation and data anonymization, along with data restoration properties. It reviews 60 research papers and reports the significant findings. First, the analysis gives a chronological review of the overall contribution of different types of anonymization protocols in diverse applications. Next, it focuses on features such as applications, measures, key generation and data preservation in healthcare. Furthermore, the paper provides a detailed performance study of the data hiding and restoration process in each contribution. Finally, it outlines research issues that may help researchers pursue further work on data preservation in IoT.
48

Fahad Ahamd. "Preservation of Privacy of Big Data Using Efficient Anonymization Technique." Lahore Garrison University Research Journal of Computer Science and Information Technology 3, no. 4 (December 31, 2019): 14–22. http://dx.doi.org/10.54692/lgurjcsit.2019.030488.

Abstract:
As the amount of data grows, big data needs to be kept private. Data generated from social networks, organizations and various other sources is known as big data. Big data requires large storage as well as high computational power, and the data needs to be protected at every stage. Privacy preservation plays an important role in keeping sensitive information protected and private from any attack. Data anonymization is one technique for keeping data private and protected; it includes suppression, generalization, and bucketization. It keeps personal and private data from being known by others. But when applied to big data, these techniques cause a great loss of information and also fail to defend the privacy of big data. Moreover, in the big data scenario, anonymization should not focus only on hiding but also on other aspects. This paper aims to provide a technique that uses slicing, suppression, and functional encryption together to achieve better privacy of big data with data anonymization.
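As a rough illustration of two of the techniques this abstract names, the sketch below generalizes quasi-identifiers and then suppresses records whose generalized group is smaller than k. It is not the method proposed in the paper (which combines slicing, suppression, and functional encryption); the field names, the bin width, and the function names are assumptions made for the example.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Replace an exact age with a range such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize(records, k, width=10):
    """Generalize age and ZIP code, then suppress groups smaller than k.

    Hypothetical helper: 'age', 'zip' and 'diagnosis' are illustrative
    field names, not taken from any cited paper.
    """
    generalized = [
        {
            "age": generalize_age(r["age"], width),
            "zip": r["zip"][:3] + "**",  # keep only the 3-digit prefix
            "diagnosis": r["diagnosis"],
        }
        for r in records
    ]
    # Count how many records share each generalized quasi-identifier pair.
    group_sizes = Counter((r["age"], r["zip"]) for r in generalized)
    # Suppression: drop every record whose group has fewer than k members.
    return [r for r in generalized if group_sizes[(r["age"], r["zip"])] >= k]


rows = [
    {"age": 23, "zip": "12345", "diagnosis": "flu"},
    {"age": 27, "zip": "12399", "diagnosis": "asthma"},
    {"age": 41, "zip": "54321", "diagnosis": "diabetes"},
]
released = anonymize(rows, k=2)
```

The third record is dropped because no other record shares its generalized age range and ZIP prefix, which is the information loss the abstract refers to.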
49

Dr. Muhammad Rizwan. "Preservation of Privacy of Big Data Using Efficient Anonymization Technique." Lahore Garrison University Research Journal of Computer Science and Information Technology 4, no. 3 (September 25, 2020): 1–11. http://dx.doi.org/10.54692/lgurjcsit.2020.040399.

Abstract:
As the amount of data grows, big data needs to be kept private. Data generated from social networks, organizations and various other sources is known as big data. Big data requires large storage as well as high computational power, and the data needs to be protected at every stage. Privacy preservation plays an important role in keeping sensitive information protected and private from any attack. Data anonymization is one technique for keeping data private and protected; it includes suppression, generalization, and bucketization. It keeps personal and private data from being known by others. But when applied to big data, these techniques cause a great loss of information and also fail to defend the privacy of big data. Moreover, in the big data scenario, anonymization should not focus only on hiding but also on other aspects. This paper aims to provide a technique that uses slicing, suppression, and functional encryption together to achieve better privacy of big data with data anonymization.
50

Liu, Xiangwen, Xia Feng, and Yuquan Zhu. "Transactional Data Anonymization for Privacy and Information Preservation via Disassociation and Local Suppression." Symmetry 14, no. 3 (February 25, 2022): 472. http://dx.doi.org/10.3390/sym14030472.

Abstract:
Ubiquitous devices in IoT-based environments create a large amount of transactional data on daily personal behaviors. Releasing these data across various platforms and applications for data mining can create tremendous opportunities for knowledge-based decision making. However, solid guarantees on the risk of re-identification are required to make these data broadly available. Disassociation is a popular method for transactional data anonymization against re-identification attacks in privacy-preserving data publishing. The disassociation algorithm runs in parallel, which suits asymmetric parallel data processing in IoT, where nodes have limited computation power and storage space. However, it relies on global recoding to achieve k^m-anonymity for transactional data, which leads to a loss of item combinations in transactional datasets and thus decreases the quality of the published transactions. To address this issue, we propose a novel vertical partition strategy in this paper. By employing local suppression and global partition, we first eliminate the itemsets that violate k^m-anonymity to construct the first k^m-anonymous record chunk. Then, through itemset creation and reduction, we recombine the globally partitioned items from the first record chunk to construct the remaining k^m-anonymous record chunks. The experiments illustrate that our scheme retains more associations between items in the dataset, which improves the utility of the published data.
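The k^m-anonymity property mentioned in this abstract can be checked with a short brute-force sketch. It assumes the usual definition, under which every combination of at most m items that occurs in the dataset must occur in at least k transactions. The function names are hypothetical, and the enumeration is exponential in m, so this is only suitable for small examples; it does not implement disassociation or the paper's partition strategy.

```python
from collections import Counter
from itertools import combinations


def km_violations(transactions, k, m):
    """Return the item combinations (size <= m) seen in fewer than k transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates and ordering
        for size in range(1, m + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n < k}


def is_km_anonymous(transactions, k, m):
    """True when no combination of up to m items is rarer than k."""
    return not km_violations(transactions, k, m)


baskets = [["a", "b"], ["a", "b"], ["a", "c"]]
violations = km_violations(baskets, k=2, m=2)

# Suppressing the rare item 'c' removes both violating combinations.
suppressed = [[item for item in t if item != "c"] for t in baskets]
```

In this small example the item `c` and the pair `(a, c)` each appear only once, so they violate 2^2-anonymity until `c` is suppressed.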