



SIT719 Security and Privacy Issues in Analytics
Credit Task 8.2: k-anonymity for Sensitive Data Privacy

Overview
Data owners want a way to transform a dataset containing highly sensitive information into a privacy-preserving, low-risk set of records that can be shared with anyone. This task looks at k-anonymity, a privacy model commonly applied to protect data subjects' privacy in data-sharing scenarios, and at the guarantees that k-anonymity can provide when used to anonymise data. Various open-source and commercial tools use this privacy model to protect sensitive data. Amnesia is a data anonymization tool that removes identifying information from data. It not only removes direct identifiers such as names and SSNs but also transforms secondary identifiers such as birth date and zip code so that individuals cannot be identified in the data. Amnesia supports k-anonymity. Please see the task description for the detailed tasks. This is a Credit task, so please make sure you are already up to date with all Pass tasks before attempting this task.

Task Description
Instructions:
1. Write a 500-word summary addressing the following:
a) Quasi-identifiers
b) k-anonymity
c) How can k-anonymity help prevent privacy attacks?
2. Do some research to identify commercial and open-source tools for data anonymization, then make a list of the tools.
Upload the summary report to the onTrack system.
Answered Same Day | May 27, 2021 | SIT719


Neha answered on Jun 03 2021
Quasi-Identifier
A quasi-identifier is a piece of information that an intruder can use to find out something specific about a target individual. On its own it may not identify anyone, but it can help single a person out from a large group of people (Zhang, X., Liu, C., Nepal, S. and Chen, J). The intruder can obtain this kind of personal information about the target from sources such as the following:
· Publicly available information, when the target person is well known.
· Publicly available registries or the media.
· Information that individuals post about themselves on social media.
· Information that the individual has disclosed to multiple people.
A quasi-identifier can sometimes be predicted from another variable, in which case both variables should be treated as quasi-identifiers. There is no point in protecting variable A but not variable B if an intruder can easily predict A from B (Koot, M.R., Mandjes, M., van’t Noordende, G. and de Laat, C). It is therefore important to look for correlated variables in a data set. Examples of correlated variables are a baby's date of birth and date of discharge from hospital, date of death and date of autopsy, weight at birth and weight at discharge, and age and date of graduation.
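To make the risk concrete, here is a minimal sketch of how one might measure how many records are singled out by a combination of quasi-identifiers. The records, the field names (`birth_year`, `postcode`, `sex`) and the helper `unique_fraction` are all hypothetical and chosen only for illustration; they are not taken from the task or from any tool.

```python
from collections import Counter

# Hypothetical toy records: no names, only quasi-identifiers.
records = [
    {"birth_year": 1985, "postcode": "3125", "sex": "F"},
    {"birth_year": 1985, "postcode": "3125", "sex": "F"},
    {"birth_year": 1990, "postcode": "3125", "sex": "M"},
    {"birth_year": 1972, "postcode": "3220", "sex": "F"},
]


def unique_fraction(rows, quasi_identifiers):
    """Fraction of rows whose quasi-identifier combination occurs exactly once."""
    keys = [tuple(row[q] for q in quasi_identifiers) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


# A single attribute singles out fewer records than the full combination.
sex_only = unique_fraction(records, ["sex"])
combined = unique_fraction(records, ["birth_year", "postcode", "sex"])
print(sex_only, combined)
```

In this toy data, more records become unique once the attributes are combined, which is why each quasi-identifier has to be considered together with the others rather than in isolation.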
K-Anonymity
k-anonymity is a privacy model that is applied to a data set to protect the privacy of data subjects in data-sharing scenarios (LeFevre, K., DeWitt, D.J. and Ramakrishnan, R). It provides privacy guarantees when used to anonymise data, and many privacy-preserving systems aim to provide k-anonymity for their data subjects. The basic idea is to use...
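As a rough illustration of the model, a data set is usually called k-anonymous when every record shares its quasi-identifier values with at least k-1 other records. The sketch below checks that property and applies a simple generalisation step. The records, the field names, and the helpers `k_of` and `generalise` are hypothetical; this is not how Amnesia or any other tool implements it.

```python
from collections import Counter


def k_of(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())


def generalise(row):
    """Hypothetical generalisation: birth year -> decade, postcode -> 2-digit prefix."""
    decade = row["birth_year"] // 10 * 10
    return {
        "birth_decade": f"{decade}s",
        "postcode_prefix": row["postcode"][:2] + "**",
        "diagnosis": row["diagnosis"],  # sensitive attribute, left unchanged
    }


# Hypothetical toy records with two quasi-identifiers and one sensitive attribute.
records = [
    {"birth_year": 1985, "postcode": "3125", "diagnosis": "asthma"},
    {"birth_year": 1987, "postcode": "3127", "diagnosis": "diabetes"},
    {"birth_year": 1991, "postcode": "3220", "diagnosis": "asthma"},
    {"birth_year": 1994, "postcode": "3216", "diagnosis": "flu"},
]

raw_k = k_of(records, ["birth_year", "postcode"])
generalised = [generalise(r) for r in records]
generalised_k = k_of(generalised, ["birth_decade", "postcode_prefix"])
print(raw_k, generalised_k)
```

Before generalisation every record is unique on its quasi-identifiers. After coarsening the values, each record falls into a group of two, so the toy table satisfies k-anonymity for k = 2 at the cost of some detail.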