KR101652328B1 - Method and system for collecting data using anonymization method - Google Patents
- Publication number
- KR101652328B1
- Authority
- KR
- South Korea
- Prior art keywords
- user
- data
- identifier
- network
- quasi
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6254—Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
- G06F21/6263—Protecting personal data, e.g. for financial or medical purposes during internet communication, e.g. revealing personal data from cookies
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
A data collection device and a data collection method in a data collection system using an anonymization technique are disclosed. The data collection method includes: receiving, by a first user device among the user devices included in each network, a quasi-identifier (QI) from each of the user devices other than the first user device; transmitting, by the first user device, the quasi-identifiers (QI) to the data collection device; receiving, by the first user device, a generalized quasi-identifier (GQI) list that the data collection device generated using a k-anonymization technique; transmitting, by the first user device, the generalized quasi-identifier (GQI) list to each of the user devices other than the first user device; receiving, by a second user device among the user devices included in each network, anonymized data from each of the user devices other than the second user device; and transmitting, by the second user device, the anonymized data to the data collection device. The anonymized data is data that combines the sensitive attribute (SA) of each user device with the generalized quasi-identifier (GQI) of that user device.
Description
An embodiment according to the concept of the present invention relates to a data collection method and system using an anonymization technique, and more particularly, to a data collection method and system ensuring privacy by collecting data simultaneously with anonymization.
This patent discloses a data collection method in which personal information is protected using an anonymization technique. Personal information refers to information about an individual that directly or indirectly identifies that individual, and includes information that does not identify the individual on its own but can easily be combined with other information to do so.
Data containing personal information is collected and used by data collectors (e.g., corporations such as smart TV manufacturers, hospitals, governments, and other organizations). Data collectors (especially businesses) collect vast amounts of personal information from customers or users to provide personalized services. A data collector may also aggregate the collected data and ask a third party (e.g., a data analysis agency) to analyze it. Because the collected personal information includes sensitive information about the data subject, it may be misused for various crimes if it is leaked.
In general, statistically collected data is divided into identifiers, quasi-identifiers (QI), and sensitive attributes (SA). A quasi-identifier is an attribute indicating personal characteristics, such as date of birth, sex, or postal code, that does not directly identify the data subject but can identify the subject indirectly through combination with other attributes. Sensitive attributes represent the sensitive information about the individual that the data table is intended to provide.
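The three-way split described above can be sketched in Python as follows. This is only an illustration: the field names (`name`, `birth_year`, `sex`, `postal_code`, `tv_log`) and the `split_record` helper are assumptions introduced here and do not come from the patent, which names only date of birth, sex, and postal code as example quasi-identifiers.

```python
from dataclasses import dataclass

# Hypothetical attribute classification (assumed for illustration only).
IDENTIFIERS = {"name"}
QUASI_IDENTIFIERS = {"birth_year", "sex", "postal_code"}
SENSITIVE_ATTRIBUTES = {"tv_log"}


@dataclass(frozen=True)
class SplitRecord:
    identifier: dict
    quasi_identifier: dict
    sensitive_attribute: dict


def split_record(record: dict) -> SplitRecord:
    """Partition one record into identifier, QI and SA parts."""
    def pick(keys):
        return {k: v for k, v in record.items() if k in keys}

    return SplitRecord(pick(IDENTIFIERS), pick(QUASI_IDENTIFIERS), pick(SENSITIVE_ATTRIBUTES))


record = {"name": "Alice", "birth_year": 1984, "sex": "F",
          "postal_code": "06236", "tv_log": "News"}
parts = split_record(record)
```

Keeping the three parts in separate fields makes it easy to drop the identifier entirely and to handle the quasi-identifier and the sensitive attribute in separate steps.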
Generally, in order to provide information on sensitive attributes, privacy is protected by removing the identifier and anonymizing the quasi-identifier. In the conventional data collection method using an anonymization technique, the anonymization technique is applied after the data has been collected.
A technique that applies anonymization after data has been collected is disclosed in Japanese Patent Application No. 10-1513769. However, if the initially collected data is exposed, there is a risk that sensitive attributes will be exposed together with an identifier or a quasi-identifier.
Accordingly, there is a need for a data collection method and apparatus that can prevent sensitive personal information from being exposed with identification information.
SUMMARY OF THE INVENTION The present invention provides a method and apparatus for ensuring privacy using an anonymization technique.
In particular, by performing data collection at the same time as anonymization, i.e., by allowing the data collection device to receive already anonymized data from the user devices, the invention prevents the user's sensitive information from being exposed in combination with the user's identification information.
Because the sensitive attributes are anonymized using a generalized quasi-identifier (GQI), the anonymized data can also be collected in a form that suits the data collector's purpose.
In a data collection system comprising a data collection device and a plurality of user devices forming at least two networks according to an embodiment of the invention, the data collection method comprises: receiving, by a first user device among at least k_i (k_i is a natural number of 2 or more) user devices included in each network (i), a quasi-identifier (QI) from each of the at least k_i user devices other than the first user device; transmitting, by the first user device, the received quasi-identifiers (QI) to the data collection device; receiving, by the first user device, a generalized quasi-identifier (GQI) list generated using a k-anonymization technique from the data collection device; transmitting, by the first user device, the generalized quasi-identifier (GQI) list to each of the at least k_i user devices other than the first user device; receiving, by a second user device among the at least k_i user devices, anonymized data from each of the at least k_i user devices other than the second user device; and transmitting, by the second user device, the anonymized data to the data collection device.
The data collection method according to the embodiment of the present invention enables anonymization and data collection at the same time. That is, the data collecting apparatus collects the anonymized data, thereby preventing the leakage of personal information that is not anonymized.
In addition, the data collection method according to the embodiment of the present invention selects an arbitrary first leader user device and second leader user device for each network group, and each leader collects only the data needed for its step, which prevents the user's sensitive attribute (SA) information from being exposed in combination with identification information such as the quasi-identifier (QI).
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS In order to more fully understand the drawings recited in the detailed description of the present invention, a detailed description of each drawing is provided.
FIG. 1 is a schematic diagram of a data collection system in accordance with an embodiment of the present invention.
FIG. 2 is a functional block diagram of the data collection device shown in FIG. 1.
FIG. 3 is a functional block diagram of the user device shown in FIG. 1.
FIG. 4 is a flowchart illustrating a data collection method for assuring privacy using a data collection system according to an embodiment of the present invention.
FIG. 5 illustrates an algorithm in which a user device according to an embodiment of the present invention collects data including personal information and transmits the data to a data collection device.
FIG. 6 illustrates a data collection algorithm of a data collection device according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating a method of collecting a user's TV log when the user device is a smart TV.
It is to be understood that the specific structural or functional descriptions of the embodiments of the present invention disclosed herein are provided for illustrative purposes only; the embodiments may be embodied in many different forms and are not limited to the embodiments set forth herein.
The embodiments according to the concept of the present invention can make various changes and can take various forms, so that the embodiments are illustrated in the drawings and described in detail herein. It should be understood, however, that it is not intended to limit the embodiments according to the concepts of the present invention to the particular forms disclosed, but includes all modifications, equivalents, or alternatives falling within the spirit and scope of the invention.
The terms first, second, etc. may be used to describe various elements, but the elements should not be limited by the terms. The terms are used only to distinguish one element from another; for example, without departing from the scope of rights according to the concept of the present invention, a first element may be referred to as a second element, and similarly a second element may be referred to as a first element.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The singular expressions include plural expressions unless the context clearly dictates otherwise. In this specification, the terms "comprises" or "having" and the like specify the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described herein, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.
Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the relevant art and, unless explicitly defined herein, are not to be interpreted in an idealized or overly formal sense.
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings attached hereto.
Referring to FIGS. 1 to 3, a data collection device using an anonymization technique according to an embodiment of the present invention and a data collection system including the same will be described in detail.
Figure 1 illustrates a
Each user device may communicate with another user device or
As shown in Fig. 1, a plurality of user equipments constitute a plurality of network groups i. Some
Although a total of two
2 is a functional block diagram of the
2, the
Under the control of the
The
The
The
3 is a functional block diagram of a
The
Referring to Figure 3, a first group of
The module used in the present specification may mean a functional and structural combination of hardware for carrying out the technical idea of the present invention and software for driving the hardware. For example, the module may mean a logical unit of a predetermined code and a hardware resource for executing the predetermined code, and does not necessarily mean a physically connected code or a kind of hardware.
The first group of
The
The
The
The
In the case of the remaining user devices, the configuration and function of the first
However, each of the second
Hereinafter, with reference to FIG. 4 through FIG. 7, a data collection method for assuring privacy using an anonymization technique using a data collection system according to an embodiment of the present invention will be described in detail.
FIG. 4 is a flowchart illustrating a data collection method for assuring privacy using the data collection system shown in FIG. 1. FIG. 5 is an algorithm in which the user device shown in FIG. 1 collects data and transmits the collected data to a data collection device, and FIG. 6 is a data collection algorithm of the data collection device shown in FIG. 1. FIG. 7 is a flowchart illustrating a method of collecting a user's TV log when the user device is, for example, a smart TV.
Referring to FIG. 4, at least two user devices form m (m is a natural number of 2 or more) network groups (S100). The network may be an AD-HOC network. For example, an AD-HOC network can be configured as follows.
A user device that is not participating in an AD-HOC network broadcasts its presence to notify neighboring user devices. A user device that receives the broadcast then requests the broadcasting user device to form an AD-HOC network with it, provided that the size of the AD-HOC network to which the receiving device belongs is less than or equal to a preset n (n is a natural number of 2 or more).
If the user device that transmitted the broadcast is not yet participating in another AD-HOC network, it accepts the request and becomes a participant of the corresponding AD-HOC network, and this is propagated to all participants in that AD-HOC network.
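The group-formation procedure above can be approximated by a short simulation. The sketch below is a simplification under stated assumptions: every device is taken to be within broadcast range of every other device, a group stops accepting members once it holds `max_size` devices, and the `form_groups` function and its parameters are hypothetical names rather than anything defined in the patent.

```python
def form_groups(device_ids, max_size):
    """Greedy stand-in for the broadcast / request / accept exchange.

    Each device that is not yet in a network 'broadcasts'; the first existing
    group with fewer than max_size members 'requests' it and it accepts.
    If no group has room, the device starts a new group on its own.
    """
    groups = []
    for device in device_ids:
        for group in groups:
            if len(group) < max_size:
                group.append(device)  # membership is now known to the whole group
                break
        else:
            groups.append([device])
    return groups


groups = form_groups(["A", "B", "C", "D", "E"], max_size=3)
print(groups)  # [['A', 'B', 'C'], ['D', 'E']]
```

A real deployment would also have to handle devices that are out of range of each other and groups that end up too small to be anonymized on their own; neither case is modeled here.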
The user device (s) of each of the
At this time, the subject selecting the first leader or the second leader may be all or some of the user devices belonging to the network group or, unlike the present embodiment, may be a third device not belonging to the network group.
User devices other than the
The first
For example, referring to FIG. 7 (a), if the user device is a smart TV and the sensitive attribute (SA) is a TV log (e.g., Adult, Sport, News, Drama), the leader A (or C) of each group receives a quasi-identifier (QI) excluding the sensitive attribute (SA) from the participant B (or D) of the group, generates a quasi-identifier (QI) list of the group, and transmits the list to the data collector.
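A minimal sketch of this QI collection step is given below. It assumes that the two leaders of a group are picked at random and are distinct devices, and that the first leader adds its own quasi-identifier to the list it forwards; the device data, `select_leaders`, and `build_qi_list` are hypothetical and only illustrate that the sensitive attribute is not touched in this step.

```python
import random

# Hypothetical per-device data for one network group (smart TVs A and B).
devices = {
    "A": {"qi": {"age": 34, "postal_code": "06236"}, "sa": "News"},
    "B": {"qi": {"age": 37, "postal_code": "06241"}, "sa": "Sport"},
}


def select_leaders(member_ids, rng):
    """Pick two distinct devices at random as first and second leader."""
    first, second = rng.sample(sorted(member_ids), 2)
    return first, second


def build_qi_list(devices, first_leader):
    """First leader gathers the other members' QIs and appends its own.

    Only the 'qi' part is read; the sensitive attribute stays on each device.
    """
    others = [dev["qi"] for dev_id, dev in devices.items() if dev_id != first_leader]
    return others + [devices[first_leader]["qi"]]


first, second = select_leaders(devices.keys(), random.Random(0))
qi_list = build_qi_list(devices, first)
```

The resulting `qi_list` is what the first leader would forward to the data collector; it contains no TV log entries.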
The
For example, if the user device is a smart TV, referring to FIG. 7 (b), the data collector generates a GQI list using the QI lists transmitted from the first leaders A and C and transmits the GQI list to the first leaders A and C of the network groups.
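The text does not spell out which k-anonymization algorithm the data collector uses, so the following Python sketch substitutes a simple one as an assumption: ages are widened into bands and postal codes are masked digit by digit until every generalized value occurs at least k times. The `generalize` and `k_anonymize` functions, the `age` and `postal_code` fields, and the sample data are all hypothetical.

```python
from collections import Counter


def generalize(qi, level):
    """Coarsen one QI: wider age bands and shorter postal prefixes per level."""
    width = 5 * (2 ** level)                      # 5, 10, 20, ... year bands
    low = (qi["age"] // width) * width
    code = qi["postal_code"]
    keep = max(len(code) - level - 1, 0)          # digits left unmasked
    return (f"{low}-{low + width - 1}", code[:keep] + "*" * (len(code) - keep))


def k_anonymize(qi_list, k, max_level=6):
    """Return one GQI per QI so that every GQI value occurs at least k times."""
    if not qi_list:
        return []
    for level in range(max_level + 1):
        gqis = [generalize(qi, level) for qi in qi_list]
        if min(Counter(gqis).values()) >= k:
            return gqis
    raise ValueError("k-anonymity not reached; more records or coarser levels needed")


# Hypothetical QI lists forwarded by the first leaders of two groups.
qi_lists = [
    [{"age": 34, "postal_code": "06236"}, {"age": 37, "postal_code": "06241"}],
    [{"age": 52, "postal_code": "13480"}, {"age": 55, "postal_code": "13494"}],
]
all_qis = [qi for group in qi_lists for qi in group]
gqi_list = k_anonymize(all_qis, k=2)
```

With this sample input and k = 2, the first generalization level is too fine, and the second level yields two groups of two records each: `("30-39", "062**")` and `("50-59", "134**")`.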
The first
A method for verifying whether or not the user equipment is a generalized quasi-identifier (GQI) list that is properly generated and transmitted from the
First, a first leader user device, e.g., a first group of first
Each generalized quasi-identifier (GQI) contained in the generalized quasi-identifier (GQI) list can be verified sequentially. Alternatively, the entire list can be verified at once.
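One way to picture the whole-list variant of this verification is sketched below. It rests on an assumption that is not stated in the text: a GQI list is treated as valid only if the first leaders of the other networks received the same list from the data collector. The `digest` and `verify_gqi_lists` helpers and the use of a SHA-256 fingerprint are illustrative choices.

```python
import hashlib
import json


def digest(gqi_list):
    """Order-independent fingerprint of a GQI list."""
    canonical = json.dumps(sorted(gqi_list), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_gqi_lists(own_list, lists_from_other_leaders):
    """Assumed validity rule: every other first leader holds the same list."""
    own = digest(own_list)
    return all(digest(other) == own for other in lists_from_other_leaders)


received = [["30-39", "062**"], ["50-59", "134**"]]
same_but_reordered = [["50-59", "134**"], ["30-39", "062**"]]
tampered = [["30-39", "062**"], ["50-54", "134**"]]

print(verify_gqi_lists(received, [same_but_reordered]))  # True
print(verify_gqi_lists(received, [tampered]))            # False
```

Exchanging fingerprints instead of full lists would keep the traffic between leaders small; verifying entry by entry would instead compare each GQI in turn.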
After the verification is completed, the first
Unlike the present embodiment, it is possible to arbitrarily select
Unlike the present embodiment, the first
Next, the second
Preferably, the first
The
The second
For example, referring to FIG. 7 (c), when the user device is a smart TV, the second leaders B and D of the respective network groups access the corresponding smart TVs A and C, so that data (GQI + SA) combining the generalized quasi-identifier (GQI) of the user with the TV log that is the sensitive attribute (SA) of that user, i.e., anonymized data, can be generated and transmitted to the data collector.
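The combination of GQI and SA can be sketched as follows. The representation of a GQI as an age range plus a postal prefix, and the helpers `covers`, `anonymize`, and `collect_anonymized`, are assumptions made for this example; the point it illustrates is that each record leaving a device carries only the generalized quasi-identifier and the sensitive attribute, never the exact quasi-identifier.

```python
def covers(gqi, qi):
    """True if the generalized value gqi is a coarsening of the exact qi."""
    low, high = gqi["age_range"]
    return low <= qi["age"] <= high and qi["postal_code"].startswith(gqi["postal_prefix"])


def anonymize(device, gqi_list):
    """Runs on each device: pair its own SA with the matching GQI, drop the exact QI.

    Raises StopIteration if no entry of the GQI list covers the device's QI.
    """
    gqi = next(g for g in gqi_list if covers(g, device["qi"]))
    return {"gqi": gqi, "sa": device["sa"]}


def collect_anonymized(devices, second_leader, gqi_list):
    """Second leader gathers GQI + SA records from the others and adds its own."""
    records = [anonymize(dev, gqi_list)
               for dev_id, dev in devices.items() if dev_id != second_leader]
    records.append(anonymize(devices[second_leader], gqi_list))
    return records


gqi_list = [{"age_range": (30, 39), "postal_prefix": "062"}]
devices = {
    "A": {"qi": {"age": 34, "postal_code": "06236"}, "sa": "News"},
    "B": {"qi": {"age": 37, "postal_code": "06241"}, "sa": "Sport"},
}
records = collect_anonymized(devices, second_leader="B", gqi_list=gqi_list)
```

Here both devices fall under the same generalized value, so the data collector sees two TV logs attached to an identical GQI and cannot tell which device produced which log from the records alone.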
A
In addition, by separating the quasi-identifier (QI) acquisition step from the sensitive attribute (SA) acquisition step and by using the k-anonymization technique, data collection can be performed simultaneously with anonymization, and the sensitive attribute (SA) can be prevented from being leaked along with identification information such as the quasi-identifier (QI).
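As a rough illustration of what the k-anonymization property means for the collected data, the sketch below checks that every generalized quasi-identifier value in a set of GQI + SA records occurs at least k times. The `satisfies_k_anonymity` function and the sample records are hypothetical and are not part of the described system.

```python
from collections import Counter


def satisfies_k_anonymity(records, k):
    """True if every GQI value in the collected records occurs at least k times."""
    counts = Counter(record["gqi"] for record in records)
    return bool(counts) and min(counts.values()) >= k


collected = [
    {"gqi": ("30-39", "062**"), "sa": "News"},
    {"gqi": ("30-39", "062**"), "sa": "Sport"},
    {"gqi": ("50-59", "134**"), "sa": "Drama"},
]
print(satisfies_k_anonymity(collected[:2], k=2))  # True
print(satisfies_k_anonymity(collected, k=2))      # False: the last group has one record
```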
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, the true scope of the present invention should be determined by the technical idea of the appended claims.
10: Data collection system
100: Data collection device
200: First group
210, 220, 230: a first group of user devices
300: the second group
310, 320, 330: a second group of user devices
Claims (8)
Receiving, by a first user device among at least k_i (k_i is a natural number of 2 or more) user devices included in each network (i), a quasi-identifier (QI) from each of the at least k_i user devices other than the first user device;
Transmitting, by the first user device, the quasi-identifier (QI) to a data collection device;
Receiving, by the first user device, from the data collection device a generalized quasi-identifier (GQI) list generated in the data collection device using a k-anonymization technique;
Transmitting, by the first user device, the generalized quasi-identifier (GQI) list to each of the at least k_i user devices included in each network (i) other than the first user device;
Receiving, by a second user device among the at least k_i user devices included in each network (i), anonymized data from each of the at least k_i user devices other than the second user device; and
Transmitting, by the second user device, the anonymized data to the data collection device,
Wherein the anonymized data is data that combines the sensitive attribute (SA) of each of the user devices with the generalized quasi-identifier (GQI) of that user device,
A method of data collection in a data collection system.
Further comprising: verifying a generalized quasi-identifier (GQI) list received from the data collection device by at least one user device included in each of the networks.
The step of verifying the generalized quasi-identifier (GQI) list comprises:
Transmitting, by the first user device in any one of the plurality of networks, the generalized quasi-identifier (GQI) list received from the data collection device to a first user device in a network other than the network to which the first user device belongs; and
Receiving a generalized quasi-identifier (GQI) list from each first user device in a network other than the network to which the first user device belongs, and determining whether the generalized quasi-identifier (GQI) list is valid,
A method of data collection in a data collection system.
Wherein the plurality of networks is an AD-HOC network,
A method of data collection in a data collection system.
Receiving, by the data collection device, a quasi-identifier (QI) list from each first user device among at least k_i (k_i is a natural number of 2 or more) user devices included in each network (i), wherein the quasi-identifier (QI) list is a list of the quasi-identifiers (QI) that the first user device received from each of the at least k_i user devices included in the network to which the first user device belongs;
Generating, by the data collection device, a generalized quasi-identifier (GQI) list using a k-anonymization technique;
Transmitting, by the data collection device, the generalized quasi-identifier (GQI) list to the first user devices in each network; and
Receiving, by the data collection device, anonymized data from a second user device among the at least k_i user devices included in each network (i),
Wherein the anonymized data is data that the second user device collected from each of the at least k_i user devices other than the second user device, and combines the sensitive attribute (SA) of each user device with the generalized quasi-identifier (GQI) of that user device,
A data collection method using a data collection device.
Wherein the first user device and the second user device are different devices,
A data collection method using a data collection device.
Forming an AD-HOC network with the at least one user equipment adjacent to the user equipment;
If the user device is not a first leader user device of the network, the user device transmits a quasi-identifier (QI) to the first leader user device of the network, and if the user device is the first leader user device of the network, the user device receives a quasi-identifier (QI) from each of the user devices other than the first leader user device among the user devices forming the network, transmits the quasi-identifier (QI) list of the network to the data collection device, receives a generalized quasi-identifier (GQI) list from the data collection device, and transmits the generalized quasi-identifier (GQI) list to the user devices other than the first leader user device among the user devices forming the network;
If the user device is not a second leader user device of the network, the user device transmits anonymized data to the second leader user device of the network, and if the user device is the second leader user device of the network, the user device collects anonymized data from each of the user devices other than the second leader user device among the user devices forming the network and transmits the collected data to the data collection device,
Wherein the anonymized data is data that combines a sensitive attribute (SA) of each of a plurality of user equipments forming the network with a generalized quasi-identifier (GQI) of each of the user equipments,
A method of providing data for a user device.
A GQI generating unit for generating a generalized quasi-identifier (GQI) list using a quasi-identifier (QI) list received from a first user device among at least k_i (k_i is a natural number of 2 or more) user devices included in each network (i);
A communication unit for receiving the quasi-identifier (QI) list from the first user device, transmitting the generalized quasi-identifier (GQI) list to the first user device, and receiving anonymized data from a second user device among the at least k_i user devices; and
A storage for storing the quasi-identifier (QI) list, the generalized quasi-identifier (GQI) list, and the anonymized data,
Wherein the quasi-identifier (QI) list received from the first user device is a list generated by the first user device collecting quasi-identifiers (QI) from each of the at least k_i user devices other than the first user device,
Wherein the anonymized data is data collected by the second user device from each of the at least k_i user devices other than the second user device, and combines the sensitive attribute (SA) of each user device with the generalized quasi-identifier (GQI) of that user device.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150143403A KR101652328B1 (en) | 2015-10-14 | 2015-10-14 | Method and system for collecting data using anonymization method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150143403A KR101652328B1 (en) | 2015-10-14 | 2015-10-14 | Method and system for collecting data using anonymization method |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101652328B1 true KR101652328B1 (en) | 2016-08-31 |
Family
ID=56877492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150143403A KR101652328B1 (en) | 2015-10-14 | 2015-10-14 | Method and system for collecting data using anonymization method |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101652328B1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20200026559A (en) | 2018-09-03 | 2020-03-11 | (주)아이알컴퍼니 | Dataset De-identification Method and Apparatus Using K-anonymity Model |
KR102648905B1 (en) | 2023-02-21 | 2024-03-18 | (주)이지서티 | Method and device for privacy-constrained data perturbation |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20070028942A (en) * | 2005-09-08 | 2007-03-13 | 삼성전자주식회사 | Method and apparatus for collecting data |
KR20120023836A (en) * | 2009-05-29 | 2012-03-13 | 노키아 코포레이션 | Method and apparatus for engaging in a service or activity using an ad-hoc mesh network |
KR20130118959A (en) * | 2011-02-15 | 2013-10-30 | 얀마 가부시키가이샤 | Data collection device and system communicating therewith |
KR20140099539A (en) * | 2011-12-07 | 2014-08-12 | 액세스 비지니스 그룹 인터내셔날 엘엘씨 | Behavior tracking and modification system |
-
2015
- 2015-10-14 KR KR1020150143403A patent/KR101652328B1/en active IP Right Grant
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10764292B2 (en) | System and method for managing electronic interactions based on defined relationships | |
CN101127625B (en) | A system and method for authorizing access request | |
CN103619019B (en) | Network access authentication method for wireless network | |
CN112219383A (en) | Data anonymization for privacy of service subscribers | |
WO2016154603A1 (en) | Channel based communication and transaction system | |
JP2022518435A (en) | Improved handling of station unique identifiers | |
US20040030915A1 (en) | Access restriction control device and method | |
CN104145445A (en) | Methods, apparatuses, and computer-readable storage media for securely accessing social networking data | |
CN101218626A (en) | Capturing contacts via people near me | |
CN107347054A (en) | A kind of auth method and device | |
EP3477561A1 (en) | System for goods delivery | |
CN104185856A (en) | Information processing device, information processing system, information processing method, and program | |
EP3528468A1 (en) | Profile information sharing | |
GB2499281A (en) | Selecting the most appropriate device to satisfy a user request | |
WO2016165505A1 (en) | Connection control method and apparatus | |
US20140058770A1 (en) | Method and device for issuing reservation number through short-range wireless communication | |
US20120089691A1 (en) | Unidentified recipients message exchange service providing method | |
CN108337210A (en) | Equipment configuration method and device, system | |
CN105451298A (en) | Network-sharing method and system, network access method and system, and electronic device | |
CN107710263A (en) | Shop accesses data creation and management | |
KR101652328B1 (en) | Method and system for collecting data using anonymization method | |
JP2005051475A (en) | System and method for managing personal information, and program thereof | |
CN109327455A (en) | A kind of access method of NAS device, device, equipment and readable storage medium storing program for executing | |
WO2016198229A1 (en) | Method and system for protecting and/or anonymizing a user identity and/or user data of a subscriber of a data protection service, program and computer program product | |
CN106537962B (en) | Wireless network configuration, access and access method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant | ||
FPAY | Annual fee payment |
Payment date: 20190808 Year of fee payment: 4 |