CN114676450A - Entity-based privacy policy and data analysis method - Google Patents

Entity-based privacy policy and data analysis method

Info

Publication number
CN114676450A
Authority
CN
China
Prior art keywords
data
policy
disclosure
entity
privacy policy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011550071.0A
Other languages
Chinese (zh)
Inventor
胡建勋
闫伟
刘元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Information Security Common Technology National Engineering Research Center Co ltd
Original Assignee
Zhongke Information Security Common Technology National Engineering Research Center Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Information Security Common Technology National Engineering Research Center Co ltd
Priority to CN202011550071.0A (CN114676450A)
Priority to PCT/CN2021/131793 (WO2022134974A1)
Publication of CN114676450A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An entity-based privacy policy and data analysis method collects entity-sensitive flow-to-policy consistency data to determine whether an application's privacy policy discloses the relevant data flows. The consistency model covers flow-to-policy consistency and flow-to-policy inconsistency. Flow-to-policy consistency includes explicit disclosure and vague (implicit) disclosure; flow-to-policy inconsistency includes omitted disclosure, false disclosure, and ambiguous disclosure. Advantages: whether a data flow amounts to a leak is determined by analyzing the flow against the privacy policy, and the privacy policy is judged using a clear policy consistency model. Applying the technique effectively protects private data and improves the assessment of an application's state.

Description

Entity-based privacy policy and data analysis method
Technical Field
The invention relates to the technical field of security, and in particular to an entity-based privacy policy and data analysis method.
Background
Privacy protection is a long-standing open research challenge for mobile applications. Research has shown that mobile applications often disclose privacy-sensitive information such as device identifiers and geographic location. Broadly speaking, privacy protection spans technical, cultural, and legal aspects. Such disclosures are generally not considered violations if the application's privacy policy states that the program needs to collect and share the data. Although application privacy policies have been analyzed manually to some extent, it is difficult to automate, by computation, the inference of what a privacy policy says and how the application adheres to it.
Researchers have worked to help application developers write accurate privacy policies, help application stores identify privacy violations, and help end users choose more privacy-friendly applications. Conceptually, these studies combine static program analysis with natural language processing to analyze flow-to-policy consistency. In brief, flow-to-policy consistency analysis determines whether the behavior of an application is consistent with what its privacy policy states.
In addition, laws and regulations such as the GDPR and the CCPA require developers to disclose the third-party entities with which they share information. The GDPR also requires vendors to disclose the recipients, or categories of recipients, with which personal data is shared. In this application, the vendor (developer) is regarded as the data controller, and a third party may be either a data controller or a data processor (one that only processes data and does not perform operations such as storage). Most entities involved in the study, however, position themselves as data controllers (e.g., Google, Facebook, and TapJoy).
Although current research has been successful, these techniques share a common weakness: they do not distinguish between the entities receiving the data (e.g., the application itself versus third-party advertisers and analytics providers).
In recent years, a growing body of work has analyzed flow-to-policy inconsistencies in mobile applications. These works differ in how they analyze application behavior flows and privacy policies. Many earlier works used Android Application Programming Interface (API) calls to assess privacy violations. For policy analysis, related work has adopted keyword-based methods, used bigrams and verb modifiers to infer privacy policies, or used a crowd-sourced ontology. Other recent studies have focused on specific application categories, such as those designed for home use, and examine compliance and privacy violations. They use dynamic analysis to identify sensitive flows and the entities that receive data. However, their policy analysis is either manual or semi-automatic, based on keyword searching. While these methods may suit application categories with well-defined requirements, they are severely limited in accuracy and scale.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides an entity-based privacy policy and data analysis method. It provides an entity-sensitive flow-to-policy consistency model to determine whether an application's privacy policy discloses the relevant data flows, and effectively improves on the efficiency of prior work.
An entity-based privacy policy and data analysis method collects entity-sensitive flow-to-policy consistency data to determine whether an application's privacy policy discloses the relevant data flows; the consistency model includes flow-to-policy consistency and flow-to-policy inconsistency.
Preferably, flow-to-policy consistency includes explicit disclosure and vague (implicit) disclosure, wherein:
when the privacy policy directly and explicitly states the type of data in the flow and the entity with which the data is shared, and no other policy statement contradicts it, the data flow is defined as an explicit disclosure;
when the privacy policy describes the data flow using broad terms for the data type or the entity, the data flow is defined as a vague disclosure; as with explicit disclosure, a flow is classified as a vague disclosure only if no policy statement contradicts it.
Preferably, flow-to-policy inconsistency includes omitted disclosure, false disclosure, and ambiguous disclosure, wherein:
if no policy statement addresses the data flow, the data flow is an omitted disclosure;
if the privacy policy states that the data will not be shared, but the application shares it anyway, the data flow is defined as a false disclosure;
if the data flow matches two or more contradictory policy statements, so that it is unclear whether the flow will occur, the data flow is defined as an ambiguous disclosure.
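The five categories above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Flow` and `Statement` types, the `BROADER` term table, and the rule that any mix of affirming and denying statements counts as contradictory are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Disclosure(Enum):
    EXPLICIT = "explicit disclosure"
    VAGUE = "vague disclosure"
    OMITTED = "omitted disclosure"
    FALSE = "false disclosure"
    AMBIGUOUS = "ambiguous disclosure"


@dataclass(frozen=True)
class Flow:
    entity: str      # receiving entity, e.g. "AdMob"
    data_type: str   # data being sent, e.g. "advertising id"


@dataclass(frozen=True)
class Statement:
    entity: str
    data_type: str
    shares: bool     # True: policy says data is shared; False: policy denies it


# Hypothetical table: specific term -> broad terms that cover it.
BROADER = {
    "AdMob": {"advertiser", "third party"},
    "advertising id": {"device identifier", "personal information"},
}


def match(actual: str, stated: str) -> Optional[str]:
    """Return 'exact', 'broad', or None when the stated term does not cover the actual one."""
    if actual == stated:
        return "exact"
    if stated in BROADER.get(actual, set()):
        return "broad"
    return None


def classify(flow: Flow, policy: List[Statement]) -> Disclosure:
    exact_pos = broad_pos = negative = False
    for s in policy:
        e = match(flow.entity, s.entity)
        d = match(flow.data_type, s.data_type)
        if e is None or d is None:
            continue  # statement does not talk about this flow
        if not s.shares:
            negative = True
        elif e == "exact" and d == "exact":
            exact_pos = True
        else:
            broad_pos = True
    positive = exact_pos or broad_pos
    if not positive and not negative:
        return Disclosure.OMITTED
    if positive and negative:
        return Disclosure.AMBIGUOUS  # assumption: any affirm/deny mix is contradictory
    if negative:
        return Disclosure.FALSE
    return Disclosure.EXPLICIT if exact_pos else Disclosure.VAGUE
```

For example, a flow of the advertising id to AdMob is classified as a vague disclosure when the only matching statement says that device identifiers are shared with third parties, and as an omitted disclosure when the policy contains no matching statement at all.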
The technical scheme of the invention has the following beneficial effects: whether a data flow amounts to a leak is determined by analyzing the flow against the privacy policy, and the privacy policy is judged using a clear policy consistency model. Applying the technique effectively protects private data and improves the assessment of an application's state.
Detailed Description
In order to make the technical solutions of the present invention better understood by those skilled in the art, the present invention will be further described in detail with reference to specific examples.
An entity-based privacy policy and data analysis method collects entity-sensitive flow-to-policy consistency data to determine whether an application's privacy policy discloses the relevant data flows; the consistency model includes flow-to-policy consistency and flow-to-policy inconsistency.
Preferably, flow-to-policy consistency includes explicit disclosure and vague (implicit) disclosure, wherein:
when the privacy policy directly and explicitly states the type of data in the flow and the entity with which the data is shared, and no other policy statement contradicts it, the data flow is defined as an explicit disclosure;
when the privacy policy describes the data flow using broad terms for the data type or the entity, the data flow is defined as a vague disclosure; as with explicit disclosure, a flow is classified as a vague disclosure only if no policy statement contradicts it.
Preferably, flow-to-policy inconsistency includes omitted disclosure, false disclosure, and ambiguous disclosure, wherein:
if no policy statement addresses the data flow, the data flow is an omitted disclosure;
if the privacy policy states that the data will not be shared, but the application shares it anyway, the data flow is defined as a false disclosure;
if the data flow matches two or more contradictory policy statements, so that it is unclear whether the flow will occur, the data flow is defined as an ambiguous disclosure.
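Deciding whether a policy statement uses a "broad term" for a data type or entity requires some notion of which terms cover which. One possible way to model this, offered only as a sketch, is a small term hierarchy walked transitively. The `PARENTS` table and the functions `broader_terms` and `covers` are hypothetical and are not taken from the specification.

```python
from typing import Dict, List, Set

# Hypothetical hierarchy: each term maps to its direct broader terms.
PARENTS: Dict[str, List[str]] = {
    "advertising id": ["device identifier"],
    "device identifier": ["personal information"],
    "AdMob": ["advertiser"],
    "advertiser": ["third party"],
}


def broader_terms(term: str) -> Set[str]:
    """Collect every term that is transitively broader than `term`."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in PARENTS.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def covers(stated: str, actual: str) -> bool:
    """True if the term stated in the policy equals or is broader than the actual term."""
    return stated == actual or stated in broader_terms(actual)
```

Under this assumed hierarchy, `covers("personal information", "advertising id")` holds because the broad term is reached through "device identifier", while the reverse direction does not hold.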
Using the technical scheme of the invention, 13,796 applications and their privacy policies were studied, and 42.4% of the applications were found either to disclose privacy-sensitive data flows incorrectly or to omit them. The results confirm the importance of considering the receiving entity: without the receiving entity as a factor, conventional solutions may wrongly classify up to 38.4% of applications as potentially leaking private data when their data flows are in fact consistent with the application's own privacy policy. The technical scheme analyzes entity-sensitive flow-to-policy consistency and provides the most accurate method to determine whether an application correctly discloses its collection of sensitive private data.
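The effect of ignoring the receiving entity can be illustrated with a toy comparison. This is a sketch under stated assumptions, not the evaluation procedure behind the figures above: the `Statement` and `Flow` types, the two check functions, and the example policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Statement:
    entity: str
    data_type: str
    shares: bool  # False means the policy denies this sharing


@dataclass(frozen=True)
class Flow:
    entity: str
    data_type: str


def flagged_entity_insensitive(flow: Flow, policy: List[Statement]) -> bool:
    """Flag the flow if any denial mentions its data type, whoever the recipient is."""
    return any(s.data_type == flow.data_type and not s.shares for s in policy)


def flagged_entity_sensitive(flow: Flow, policy: List[Statement]) -> bool:
    """Flag the flow only if a denial names both its data type and its recipient."""
    return any(
        s.data_type == flow.data_type and s.entity == flow.entity and not s.shares
        for s in policy
    )


# Hypothetical policy: "We collect your location. We do not share it with third parties."
policy = [
    Statement("first party", "location", True),
    Statement("third party", "location", False),
]
flow = Flow("first party", "location")  # location sent to the developer's own server

insensitive_result = flagged_entity_insensitive(flow, policy)
sensitive_result = flagged_entity_sensitive(flow, policy)
```

Here the entity-insensitive check flags the flow because the policy contains a denial about location, whereas the entity-sensitive check does not, since that denial concerns third parties and the flow goes to the first party.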
The entity-based privacy policy and data analysis method provided by the invention has been described in detail above. Specific embodiments are used herein to explain the principles and implementation of the invention, and the description of the embodiments is intended only to aid understanding of the method and its core ideas. A person skilled in the art may, following the idea of the present application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (3)

1. An entity-based privacy policy and data analysis method, characterized in that: the method comprises collecting entity-sensitive flow-to-policy consistency data to determine whether an application's privacy policy discloses the relevant data flows; the consistency model includes flow-to-policy consistency and flow-to-policy inconsistency.
2. The entity-based privacy policy and data analysis method of claim 1, wherein: flow-to-policy consistency includes explicit disclosure and vague (implicit) disclosure, wherein:
when the privacy policy directly and explicitly states the type of data in the flow and the entity with which the data is shared, and no other policy statement contradicts it, the data flow is defined as an explicit disclosure;
when the privacy policy describes the data flow using broad terms for the data type or the entity, the data flow is defined as a vague disclosure; as with explicit disclosure, a flow is classified as a vague disclosure only if no policy statement contradicts it.
3. The entity-based privacy policy and data analysis method of claim 1, wherein: flow-to-policy inconsistency comprises three kinds of disclosure, namely omitted disclosure, false disclosure, and ambiguous disclosure, wherein:
if no policy statement addresses the data flow, the data flow is an omitted disclosure;
if the privacy policy states that the data will not be shared, but the application shares it anyway, the data flow is defined as a false disclosure;
if the data flow matches two or more contradictory policy statements, so that it is unclear whether the flow will occur, the data flow is defined as an ambiguous disclosure.
CN202011550071.0A 2020-12-24 2020-12-24 Entity-based privacy policy and data analysis method Pending CN114676450A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011550071.0A CN114676450A (en) 2020-12-24 2020-12-24 Entity-based privacy policy and data analysis method
PCT/CN2021/131793 WO2022134974A1 (en) 2020-12-24 2021-11-19 Entity-based privacy policy and data analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011550071.0A CN114676450A (en) 2020-12-24 2020-12-24 Entity-based privacy policy and data analysis method

Publications (1)

Publication Number Publication Date
CN114676450A (en) 2022-06-28

Family

ID=82071304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011550071.0A Pending CN114676450A (en) 2020-12-24 2020-12-24 Entity-based privacy policy and data analysis method

Country Status (2)

Country Link
CN (1) CN114676450A (en)
WO (1) WO2022134974A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7797726B2 (en) * 2004-12-16 2010-09-14 International Business Machines Corporation Method and system for implementing privacy policy enforcement with a privacy proxy
CN103533049B (en) * 2013-10-14 2016-09-14 无锡中盛医疗设备有限公司 The electronic privacy information protection system of intelligent medical treatment
CN109495460B (en) * 2018-11-01 2021-04-06 南京邮电大学 Privacy policy dynamic updating method in combined service
CN111898154B (en) * 2020-06-16 2022-08-05 北京大学 Negotiation type mobile application privacy data sharing protocol signing method
CN112068844B (en) * 2020-09-09 2021-09-07 西安交通大学 APP privacy data consistency behavior analysis method facing privacy protection policy

Also Published As

Publication number Publication date
WO2022134974A1 (en) 2022-06-30

Similar Documents

Publication Publication Date Title
CN108304720B (en) Android malicious program detection method based on machine learning
CN110351280B (en) Method, system, equipment and readable storage medium for extracting threat information
CN103136471B (en) A kind of malice Android application program detection method and system
CN102521543B (en) Method for information semantic analysis based on dynamic taint analysis
CN112149124B (en) Android malicious program detection method and system based on heterogeneous information network
CN110674144A (en) User portrait generation method and device, computer equipment and storage medium
CN110414222A (en) A kind of application privacy leakage failure detecting method and device based on component liaison
CN113689292B (en) User aggregation identification method and system based on image background identification
CN113497809A (en) MIPS framework vulnerability mining method based on control flow and data flow analysis
CN109214178A (en) APP application malicious act detection method and device
CN111784301A (en) User portrait construction method and device, storage medium and electronic equipment
WO2022062958A1 (en) Privacy detection method and apparatus, and computer readable storage medium
CN110688245A (en) Information acquisition method, device, storage medium and equipment
CN102523286B (en) Method and device for obtaining credit degree of service
CN114676450A (en) Entity-based privacy policy and data analysis method
CN109657148A (en) For abnormal operation recognition methods, device, server and the medium for reporting POI
Pieterse et al. Reference architecture for android applications to support the detection of manipulated evidence
CN110990834B (en) Static detection method, system and medium for android malicious software
CN115033317B (en) Method and device for processing bullet frame, electronic equipment and readable storage medium
CN115562981A (en) Software quality evaluation method based on machine learning
CN115048645A (en) Detection method, device, equipment and medium for collecting privacy information beyond range
CN113779589A (en) Android smart phone application misconfiguration detection method
CN113807077A (en) Natural language test script parsing processing method and device and electronic equipment
CN112395615A (en) Android malicious application detection method
CN108667685B (en) Mobile application network flow clustering device

Legal Events

Date Code Title Description
PB01 Publication