WO2021218660A1 - Information statistics - Google Patents

Information statistics

Info

Publication number
WO2021218660A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
noise
desensitization
target object
noise information
Prior art date
Application number
PCT/CN2021/087742
Other languages
English (en)
French (fr)
Inventor
范多毅
Original Assignee
支付宝实验室(新加坡)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 支付宝实验室(新加坡)有限公司
Publication of WO2021218660A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Definitions

  • This specification relates to the field of information processing technology, and in particular to an information statistics method and device.
  • Individualized protection of private data or sensitive information is usually required. That is, each piece of private data or sensitive information is desensitized through interaction with the data system (for example, reading and writing an interactive file for the sensitive information and reading the sensitive information from that file), and the desensitized information is written back into the interactive file.
  • The above interaction process needs to be performed multiple times.
  • This makes the desensitization process cumbersome and its efficiency low, and there is a risk associated with de-identifying individual data (such as a single piece of sensitive information), which can eventually lead to leakage of the sensitive information.
  • One or more embodiments of the present specification provide an information statistics method, including: obtaining first noise information corresponding to a target object group to be counted, the target object group including a plurality of target objects with sensitive information to be counted.
  • The first noise information is split to obtain a plurality of pieces of second noise information, and the plurality of pieces of second noise information are respectively delivered to the target objects.
  • One or more embodiments of the present specification provide an information statistics device, including: a first acquisition module, which acquires first noise information corresponding to a target object group to be counted, the target object group including a plurality of target objects with sensitive information to be counted.
  • The splitting and issuing module splits the first noise information to obtain a plurality of pieces of second noise information, and respectively delivers them to the target objects.
  • The second acquiring module acquires the desensitization information reported by the target objects, where the desensitization information is generated by each target object performing desensitization processing on its own sensitive information according to the assigned second noise information.
  • The generating module generates desensitization statistical information according to the desensitization information reported by each target object.
  • The noise removal module performs noise removal processing on the desensitization statistical information based on the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • One or more embodiments of the present specification provide an information statistics device, including: a processor; and a memory arranged to store computer-executable instructions that, when executed, cause the processor to: obtain first noise information corresponding to a target object group to be counted, the target object group including a plurality of target objects with sensitive information to be counted.
  • The first noise information is split to obtain a plurality of pieces of second noise information, and the plurality of pieces of second noise information are respectively delivered to the target objects.
  • One or more embodiments of this specification provide a storage medium for storing computer-executable instructions.
  • When executed, the executable instructions implement the following process: obtaining first noise information corresponding to a target object group to be counted.
  • The target object group includes a plurality of target objects with sensitive information to be counted.
  • The first noise information is split to obtain a plurality of pieces of second noise information, and the plurality of pieces of second noise information are respectively delivered to the target objects.
  • Fig. 1 is a schematic flowchart of an information statistics method according to an embodiment of the present specification
  • Fig. 2 is a schematic flowchart of an information statistics method according to another embodiment of the present specification.
  • Fig. 3 is a schematic flowchart of an information statistics method according to still another embodiment of the specification.
  • Fig. 4 is a schematic block diagram of an information statistics device according to an embodiment of the present specification.
  • Fig. 5 is a schematic block diagram of an information statistics device according to an embodiment of the present specification.
  • One or more embodiments of this specification provide an information statistics method and device to solve the existing problems of low information desensitization efficiency and poor desensitization effect.
  • The information statistics method provided in one or more embodiments of this specification is applicable to situations where information statistics need to be performed on all or part of multiple objects to be counted, each of which contains sensitive information to be counted.
  • The multiple objects to be counted may be grouped into at least one object group to be counted, so that information statistics can be performed on any of the object groups.
  • Multiple objects to be counted can be grouped according to related information of each object and a preset grouping algorithm.
  • The related information of an object to be counted may be any information that uniquely characterizes the object, such as a hash value, an object name, or an object identifier.
  • The preset grouping algorithm may be any algorithm capable of grouping, such as a bucketing method or a random grouping method.
  • For example, the hash value corresponding to each object to be counted may be calculated first, and the hash values are then grouped according to the bucketing method.
  • Since the bucketing method is an existing technology, it is not described further here.
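  • As an illustration of the hash-then-bucket grouping described above, the following is a minimal sketch. The function names, the choice of SHA-256, and the modulo rule are assumptions made for the example; the specification does not prescribe a particular hash function or bucketing rule.

```python
import hashlib


def bucket_index(object_id: str, num_buckets: int) -> int:
    """Map an object identifier to a bucket via a stable hash (assumed: SHA-256)."""
    digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def group_objects(object_ids, num_buckets):
    """Group object identifiers into object groups keyed by bucket index."""
    groups = {}
    for object_id in object_ids:
        groups.setdefault(bucket_index(object_id, num_buckets), []).append(object_id)
    return groups


# Hypothetical identifiers; each resulting list is one object group to be counted.
groups = group_objects(["123456", "123457", "123458", "123459"], num_buckets=2)
```

  • Because the hash is deterministic, the same object always lands in the same group, which is one reason a bucketing method may be preferred over purely random grouping.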
  • the information statistics method provided by the embodiments of this specification will be described in detail below.
  • Fig. 1 is a schematic flowchart of an information statistics method according to an embodiment of the present specification. As shown in Fig. 1, the method is applied to an information statistics device and includes:
  • S102 Acquire first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects with sensitive information to be counted.
  • The target object group may be any object group to be counted that was obtained by grouping in advance.
  • The first noise information may be in numerical form or in another data format.
  • The first noise information may be generated by the information statistics device, or it may be generated by another device connected to the information statistics device and obtained from that device.
  • S104 Split the first noise information to obtain multiple second noise information, and respectively deliver the multiple second noise information to the target object.
  • After the multiple pieces of second noise information are delivered to the target objects, each target object that receives one successfully desensitizes its own sensitive information based on the received second noise information and reports the resulting desensitization information.
  • The splitting method is to split the first noise information according to the magnitude of its value, so that the sum of the values of the multiple pieces of second noise information obtained after splitting equals the first noise information. For example, if the first noise information is 100, it can be split into two pieces of second noise information, 40 and 60.
  • The second noise information delivered to each target object may be left unrecorded, so as to avoid the risk of leaking the second noise information.
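  • The value-based splitting can be sketched as follows. This is one possible way to split a value into pieces that sum back to it, assuming the first noise information is a non-negative integer; the function name `split_noise` and the cut-point technique are illustrative choices, not taken from the specification.

```python
import random
from typing import List


def split_noise(first_noise: int, parts: int, rng: random.Random) -> List[int]:
    """Split first_noise into `parts` non-negative integers that sum to first_noise."""
    if parts < 1 or first_noise < 0:
        raise ValueError("parts must be >= 1 and first_noise must be >= 0")
    # Draw parts-1 cut points in [0, first_noise]; the gaps between
    # consecutive cut points are the second noise values.
    cuts = sorted(rng.randint(0, first_noise) for _ in range(parts - 1))
    bounds = [0] + cuts + [first_noise]
    return [bounds[i + 1] - bounds[i] for i in range(parts)]


# Two pieces whose sum is 100, analogous to the 40 and 60 of the text.
second_noise = split_noise(100, 2, random.Random(0))
```

  • Only the invariant `sum(second_noise) == first_noise` matters for later noise removal, so the individual pieces never need to be stored.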
  • S106 Obtain the desensitization information reported by the target object, where the desensitization information is generated by the target object performing desensitization processing on its own sensitive information according to the assigned second noise information.
  • S108 Generate desensitization statistical information according to the desensitization information reported by each target object.
  • When the desensitization information reported by multiple target objects is received, it can be aggregated to generate the corresponding desensitization statistical information. For example, performing statistics on the acquired desensitization information a and desensitization information b yields the desensitization statistical information a+b.
  • S110 Perform denoising processing on the desensitization statistical information based on the first noise information to obtain statistical information of the sensitive information corresponding to the target object group.
  • By obtaining the first noise information corresponding to the target object group to be counted, splitting it into multiple pieces of second noise information, and delivering those pieces to the target objects, each target object can desensitize its own sensitive information based on the assigned second noise information and report the resulting desensitization information. This avoids the leakage that direct transmission of sensitive information could cause.
  • In addition, the desensitization statistical information is generated from the desensitization information reported by each target object and is denoised based on the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group. Desensitizing the sensitive information and denoising the desensitization statistical information thus keeps the statistics accurate at the group level. Moreover, noise information and sensitive information are neither stored nor transmitted in intermediate form, and the sensitive information of a single target object never needs to be denoised. The final statistical information therefore retains only the statistics of the target object group, while the information features of the original sensitive information all disappear, which protects the sensitive information.
  • In one embodiment, the number of pieces of second noise information is the same as the number of target objects in the target object group. Based on this, S104 can be executed as the following steps:
  • Based on the first number of target objects, the first noise information is split into the first number of pieces of second noise information.
  • According to a preset allocation rule, the pieces of second noise information are respectively delivered to the target objects. The allocation rule includes a random allocation rule.
  • For example, if the target object group includes 10 target objects, the first noise information is split into 10 pieces of second noise information, which are randomly distributed to the 10 target objects, so that each target object desensitizes its own sensitive information based on the assigned second noise information.
  • In this embodiment, the sum of the multiple pieces of second noise information equals the first noise information. That is, the deviation between the first noise information and the sum of the second noise information delivered to the target objects is zero. Therefore, when S110 is performed, the first noise information can be removed from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • the desensitization statistical information can be obtained by calculating the sum of each desensitization information. Moreover, when removing the first noise information from the desensitization statistical information, the statistical information of the sensitive information corresponding to the target object group can be obtained by calculating the difference between the desensitization statistical information and the first noise information.
  • For example, the first noise information is 100 and the target object group includes 10 target objects.
  • The information statistics device splits the first noise information into 10 pieces of second noise information and randomly distributes them to the 10 target objects. Each target object desensitizes its own sensitive information based on the allocated second noise information and then reports the result.
  • The information statistics device performs statistics on the received desensitization information to obtain the desensitization statistical information. Then the first noise information 100 is subtracted from the desensitization statistical information to obtain the statistical information. In this process the second noise information used for desensitization is not passed on, and noise removal does not rely on it. Therefore, accurate statistical information is obtained and the sensitive information is well protected.
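  • The 10-object example can be traced end to end with the sketch below. The concrete split of 100, the sensitive values, and the assumption that desensitization means adding the assigned noise value are all hypothetical; the text only fixes the first noise information (100) and the group size (10).

```python
import random

FIRST_NOISE = 100
# Hypothetical split of 100 into 10 pieces; any split summing to 100 works.
second_noise = [3, 7, 10, 15, 5, 20, 8, 12, 6, 14]
# Hypothetical sensitive values held by the 10 target objects.
sensitive = [32, 67, 5, 18, 40, 9, 73, 21, 56, 11]

rng = random.Random(42)
rng.shuffle(second_noise)  # random allocation rule

# Each target object reports only its value plus the assigned noise (assumed additive).
reports = [value + noise for value, noise in zip(sensitive, second_noise)]
del second_noise  # the device keeps no record of the individual pieces

desensitized_total = sum(reports)               # desensitization statistical information
group_total = desensitized_total - FIRST_NOISE  # noise removal with the first noise only
```

  • Whatever order the shuffle produces, the recovered group total equals the sum of the sensitive values, because the pieces always add up to the first noise information.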
  • Fig. 2 shows a schematic flowchart of an information statistics method according to another embodiment of the present specification.
  • In this embodiment, the first noise information, the sensitive information, and the desensitization information are all in numerical form. The first noise information is therefore referred to as the first noise value, and the second noise information as the second noise value.
  • the method includes:
  • S201 Acquire a first noise value corresponding to a target object group to be counted, where the target object group includes a plurality of target objects with sensitive information to be counted.
  • For example, the target object group includes two target objects, namely the user whose user ID is "123456" (hereinafter user "123456") and the user whose user ID is "123457" (hereinafter user "123457").
  • UserId is the user ID; its content is "123456" and "123457" for the two target objects respectively.
  • AccessCount is the number of visits; its content is "32" and "67" respectively.
  • HeartBeat is the number of heartbeats; its content is "102" and "79" respectively. Both the number of visits and the number of heartbeats are sensitive information. In this embodiment, it is assumed that the number of visits and the number of heartbeats of user "123456" and user "123457" need to be counted.
  • S202 Split the first noise value into second noise values matching the number of target objects, and deliver multiple second noise values to each target object respectively.
  • Each target object performs desensitization processing on its sensitive information based on the assigned second noise value, and reports the desensitization information obtained after the desensitization processing to the information statistics device.
  • The second noise values obtained by splitting can be randomly assigned to the target objects.
  • The information statistics device does not record the second noise values obtained after splitting, nor does it record which second noise value is assigned to each target object. That is, the second noise values are neither stored nor passed through any intermediate transfer.
  • For example, the first noise value can be split into two second noise values. Assuming that the first noise value is 100 and the split second noise values are 40 and 60, the second noise value 40 is delivered to user "123456" and the second noise value 60 is delivered to user "123457".
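  • For user "123456", the desensitization information is obtained by desensitizing the number of visits and the number of heartbeats with the second noise value 40.
  • For user "123457", the desensitization information is obtained by desensitizing the number of visits and the number of heartbeats with the second noise value 60.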
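  • A sketch of the target-side desensitization for these two users follows. It assumes that desensitization adds the assigned second noise value to each sensitive field, which is consistent with the later sum-and-subtract steps but is not spelled out in the surviving text. The `Record` type and the snake_case field names are illustrative stand-ins for UserId, AccessCount, and HeartBeat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    user_id: str       # UserId
    access_count: int  # AccessCount (sensitive)
    heart_beat: int    # HeartBeat (sensitive)


def desensitize(record: Record, noise: int) -> Record:
    """Add the assigned second noise value to every sensitive field (assumed additive)."""
    return Record(record.user_id, record.access_count + noise, record.heart_beat + noise)


user_a = Record("123456", 32, 102)
user_b = Record("123457", 67, 79)

report_a = desensitize(user_a, 40)  # reported by user "123456"
report_b = desensitize(user_b, 60)  # reported by user "123457"
```

  • Under this assumption user "123456" would report 72 visits and 142 heartbeats, and user "123457" would report 127 and 139, so neither original value leaves the target object.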
  • S204 Calculate the sum of the desensitization information reported by each target object to obtain desensitization statistical information.
  • The desensitization statistical information is obtained by summing, for each item of sensitive information, the values reported by user "123456" and user "123457".
  • S205 Calculate the difference between the desensitization statistical information and the first noise value to obtain the statistical information of the sensitive information corresponding to the target object group.
  • The statistical information is obtained by subtracting the first noise value 100 from each sum in the desensitization statistical information.
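  • Steps S204 and S205 can be checked numerically with the sketch below. The reported values are derived under the assumption that each user added its second noise value (40 or 60) to both fields; the dictionary layout is illustrative only.

```python
FIRST_NOISE = 100

# Desensitization information as reported (assumed: original value + second noise).
reports = [
    {"AccessCount": 32 + 40, "HeartBeat": 102 + 40},  # user "123456"
    {"AccessCount": 67 + 60, "HeartBeat": 79 + 60},   # user "123457"
]

# S204: sum the reported values per field.
desensitized = {
    field: sum(report[field] for report in reports)
    for field in ("AccessCount", "HeartBeat")
}

# S205: subtract the first noise value from each sum.
statistics = {field: total - FIRST_NOISE for field, total in desensitized.items()}
```

  • The result matches the true group totals (32 + 67 visits and 102 + 79 heartbeats) even though the device only ever sees the noisy values 72, 127, 142, and 139.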
  • The finally obtained statistical information retains only the statistics of the target object group, and all information features of the original sensitive information disappear.
  • Sensitive information (such as the numbers of visits "67" and "32" and the numbers of heartbeats "79" and "102") and the second noise values used for desensitization are neither stored nor passed through intermediate transmission. The statistical information of the sensitive information corresponding to the target object group can therefore be counted accurately, while the sensitive information of each target object is well protected.
  • In the above example, each target object holds multiple items of sensitive information (the number of visits and the number of heartbeats), so multiple items of sensitive information of multiple target objects are counted at the same time, and the example uses the same first noise information for all of them.
  • Different items of sensitive information are not limited to using the same first noise information; different first noise information may also be used.
  • For example, a first noise value of 100 is used for the sensitive information "number of visits", and a first noise value of 200 is used for the sensitive information "number of heartbeats".
  • The information statistics process is still the same as that shown in Fig. 2. During noise removal, the first noise value 100 is subtracted from the "number of visits" in the desensitization statistical information and the first noise value 200 is subtracted from the "number of heartbeats", which yields the statistical information corresponding to the target object group.
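  • The per-field variant can be sketched as follows. The first noise values 100 and 200 come from the text; the way each is split between the two users (40/60 and 80/120) and the additive desensitization are assumptions for the example.

```python
# First noise value per item of sensitive information (from the text).
first_noise = {"AccessCount": 100, "HeartBeat": 200}

# Hypothetical per-user second noise values; each field's pieces sum to its first noise value.
noise_a = {"AccessCount": 40, "HeartBeat": 80}   # user "123456"
noise_b = {"AccessCount": 60, "HeartBeat": 120}  # user "123457"

values_a = {"AccessCount": 32, "HeartBeat": 102}
values_b = {"AccessCount": 67, "HeartBeat": 79}

# Target side: add the field-specific noise before reporting.
report_a = {field: values_a[field] + noise_a[field] for field in first_noise}
report_b = {field: values_b[field] + noise_b[field] for field in first_noise}

# Device side: sum per field, then subtract that field's own first noise value.
statistics = {
    field: report_a[field] + report_b[field] - first_noise[field]
    for field in first_noise
}
```

  • Using a separate first noise value per field means the noise added to one field reveals nothing about the noise added to another.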
  • In one embodiment, the number of pieces of second noise information is greater than the number of target objects in the target object group. In that case, after second noise information has been delivered to the target objects, some pieces of second noise information remain unallocated.
  • A subset of the pieces of second noise information can be randomly selected and delivered to the target objects, the number of selected pieces being the same as the number of target objects. It is also possible to deliver the pieces to the target objects sequentially, in the order in which they were split off, until every target object has been assigned a piece of second noise information.
  • Removing the first noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group can be performed as follows:
  • The deviation noise information corresponding to the target object group is obtained by removing from the first noise information the second noise information delivered each time.
  • For example, the first noise information is 100.
  • The first noise information is split into three pieces of second noise information, such as 20, 30, and 50.
  • The target object group includes 2 target objects. Two of the pieces of second noise information are selected and delivered to the target objects. Assuming that the delivered pieces are 20 and 30, the remaining piece 50 is the deviation noise information corresponding to the target object group.
  • The deviation noise information is removed from the first noise information to obtain the third noise information.
  • Following the example above, removing the deviation noise information 50 from the first noise information 100 gives the third noise information 50.
  • The third noise information is removed from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • Following the example above, removing the third noise information 50 from the desensitization statistical information gives the statistical information corresponding to the target object group.
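  • The deviation bookkeeping can be sketched as a running remainder, which shows how the device could know the undelivered amount without keeping the individual pieces. The variable names are illustrative, and sequential delivery is chosen here only to keep the example deterministic.

```python
FIRST_NOISE = 100
pieces = [20, 30, 50]      # split of the first noise information, from the text
num_targets = 2

# Start from the first noise information and remove each piece as it is delivered.
deviation_noise = FIRST_NOISE
for piece in pieces[:num_targets]:   # sequential delivery: 20, then 30
    deviation_noise -= piece         # only the running remainder is kept

# The third noise information is what was actually delivered in total.
third_noise = FIRST_NOISE - deviation_noise
```

  • Here the remainder is 50, the undelivered piece, and the third noise information is also 50, the sum of the delivered pieces 20 and 30.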
  • Fig. 3 shows a schematic flowchart of an information statistics method according to still another embodiment of this specification.
  • In this embodiment, the first noise information, the sensitive information, and the desensitization information are all in numerical form. The first noise information is therefore referred to as the first noise value, the second noise information as the second noise value, and the deviation noise information as the deviation noise value.
  • the method includes:
  • S301 Acquire a first noise value corresponding to a target object group to be counted, where the target object group includes a first number of target objects that have sensitive information to be counted.
  • For example, the target object group includes two target objects, namely the user whose user ID is "123456" (hereinafter user "123456") and the user whose user ID is "123457" (hereinafter user "123457").
  • UserId is the user ID; its content is "123456" and "123457" for the two target objects respectively.
  • AccessCount is the number of visits; its content is "32" and "67" respectively.
  • HeartBeat is the number of heartbeats; its content is "102" and "79" respectively. Both the number of visits and the number of heartbeats are sensitive information. In this embodiment, it is assumed that the number of visits and the number of heartbeats of user "123456" and user "123457" need to be counted.
  • S302 Split the first noise value into a second number of second noise values, where the second number is greater than the first number.
  • the number of second noise values obtained after splitting is more than the number of target objects.
  • S303 Select the first number of second noise values from the second number of second noise values, and deliver the selected second noise values to the target objects respectively.
  • The method of selecting the second noise values is not limited, as long as the number of selected second noise values equals the first number.
  • For example, the first number of second noise values can be selected at random, or they can be selected sequentially according to the order in which the second noise values were split off.
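  • The two selection strategies mentioned for S303 can be sketched as below. The seed and variable names are arbitrary choices for the example; `random.Random.sample` is used for selection without replacement.

```python
import random

pieces = [20, 30, 50]  # second noise values in split order
first_number = 2       # number of target objects

# Sequential selection: take the pieces in the order they were split off.
sequential = pieces[:first_number]

# Random selection: pick first_number distinct pieces without replacement.
randomly_chosen = random.Random(7).sample(pieces, first_number)
```

  • Either way exactly `first_number` values are delivered, and the values left over make up the deviation noise value.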
  • Each target object performs desensitization processing on its sensitive information based on the assigned second noise value, and reports the desensitization information obtained after the desensitization processing to the information statistics device.
  • The selected first number of second noise values can be randomly assigned to the target objects.
  • The information statistics device does not record the second noise values obtained after splitting, nor does it record which second noise value is assigned to each target object. That is, the second noise values are neither stored nor passed through any intermediate transfer.
  • For example, the first number is 2 (that is, there are 2 target objects) and the second number is 3, so the first noise value is split into 3 second noise values. Assuming the first noise value is 100, the second noise values after splitting are 20, 30, and 50.
  • Two second noise values, 20 and 30, are randomly selected and delivered to the two target objects respectively. For example, the second noise value 20 is delivered to user "123456" and the second noise value 30 is delivered to user "123457".
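  • For user "123456", the desensitization information is obtained by desensitizing the number of visits and the number of heartbeats with the second noise value 20.
  • For user "123457", the desensitization information is obtained by desensitizing the number of visits and the number of heartbeats with the second noise value 30.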
  • S305 Calculate the sum of the desensitization information reported by each target object to obtain desensitization statistical information.
  • S306 Obtain a deviation noise value corresponding to the target object group, where the deviation noise value is obtained by removing from the first noise value the second noise value delivered each time.
  • Following the example above, removing the delivered second noise values 20 and 30 from the first noise value 100 gives a deviation noise value of 50.
  • The deviation noise value is the total of the second noise values that have not been delivered to the target objects. Since the information statistics device in this embodiment does not record the second noise values obtained after splitting, the deviation noise value can be determined by way of S306.
  • S307 Remove the deviation noise value from the first noise value to obtain a third noise value, and then remove the third noise value from the desensitization statistical information to obtain statistical information of sensitive information corresponding to the target object group.
  • Following the example above, the deviation noise value 50 is subtracted from the first noise value 100 to obtain the third noise value 50. Then, removing the third noise value 50 from the desensitization statistical information gives the statistical information corresponding to the target object group.
  • The third noise value is the sum of the second noise values delivered to the target objects. The information statistics device in this embodiment does not record which second noise value is delivered to each target object, and not all of the second noise values obtained after splitting are delivered, so a deviation exists. S307 therefore determines the third noise value that should be removed from the desensitization statistical information, so that accurate statistical information is obtained.
  • The statistical information is obtained by subtracting the third noise value 50 from each sum in the desensitization statistical information.
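  • The Fig. 3 flow from S305 to S307 can be sketched as a single function. The name `count_group`, the dictionary layout, and the additive desensitization are assumptions for the example; the noise values 100, 50, 20, and 30 come from the text.

```python
def count_group(reports, first_noise, deviation_noise):
    """Sum the reports per field (S305) and remove the third noise value (S307)."""
    third_noise = first_noise - deviation_noise  # noise that was actually delivered
    fields = reports[0].keys()
    return {
        field: sum(report[field] for report in reports) - third_noise
        for field in fields
    }


# Reports under the assumption that each user added its second noise value to both fields.
reports = [
    {"AccessCount": 32 + 20, "HeartBeat": 102 + 20},  # user "123456", noise 20
    {"AccessCount": 67 + 30, "HeartBeat": 79 + 30},   # user "123457", noise 30
]

statistics = count_group(reports, first_noise=100, deviation_noise=50)
```

  • Subtracting the full first noise value 100 here would give a wrong result, because only 50 of it was actually delivered; the deviation noise value is what corrects for the undelivered piece.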
  • The finally obtained statistical information retains only the statistics of the target object group, and all information features of the original sensitive information disappear.
  • Sensitive information (such as the numbers of visits "67" and "32" and the numbers of heartbeats "79" and "102") and the second noise values used for desensitization are neither stored nor passed through intermediate transmission. The statistical information of the sensitive information corresponding to the target object group can therefore be counted accurately, while the sensitive information of each target object is well protected.
  • In the above example, each target object holds multiple items of sensitive information (the number of visits and the number of heartbeats), so multiple items of sensitive information of multiple target objects are counted at the same time, and the example uses the same first noise information for all of them.
  • Different items of sensitive information are not limited to using the same first noise information; different first noise information may also be used.
  • For example, a first noise value of 100 is used for the sensitive information "number of visits", and a first noise value of 200 is used for the sensitive information "number of heartbeats".
  • The information statistics process is still the same as the process shown in Fig. 3.
  • Where there are multiple target object groups, each target object group has its own corresponding group identification information.
  • Different or identical first noise information can be acquired for each target object group, and information statistics can then be performed for each target object group with its corresponding first noise information. The information statistics method for each group is the same as described in the above embodiments.
  • Noise removal processing may be performed on the desensitization statistical information corresponding to each target object group based on the first noise information and the group identification information corresponding to that group.
  • The group identification information corresponding to each target object group is obtained in advance. It may be generated by the information statistics device, or it may be generated by another device connected to the information statistics device and obtained from that device.
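  • Per-group noise removal keyed by group identification information can be sketched as follows. The group identifiers, the first noise value 250, and the report values for the second group are hypothetical; the first group reuses the reported visit counts 72 and 127 from the earlier example with first noise value 100.

```python
from collections import defaultdict

# First noise information per group, keyed by hypothetical group identifiers.
first_noise_by_group = {"group-1": 100, "group-2": 250}

# (group identification information, reported desensitization value)
reports = [("group-1", 72), ("group-1", 127), ("group-2", 160), ("group-2", 140)]

# Accumulate the desensitization statistical information per group.
desensitized_totals = defaultdict(int)
for group_id, value in reports:
    desensitized_totals[group_id] += value

# Remove each group's own first noise information.
statistics = {
    group_id: total - first_noise_by_group[group_id]
    for group_id, total in desensitized_totals.items()
}
```

  • The group identification information is what lets the device pair each desensitized total with the right first noise information when the groups use different noise.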
  • Fig. 4 is a schematic block diagram of an information statistics apparatus according to an embodiment of the present specification. As shown in Fig. 4, the information statistics apparatus includes:
  • the first obtaining module 410 obtains first noise information corresponding to a target object group to be counted; the target object group includes a plurality of target objects with sensitive information to be counted;
  • the splitting and issuing module 420 splits the first noise information to obtain a plurality of second noise information, and respectively delivers the plurality of second noise information to the target object;
  • the second obtaining module 430 obtains the desensitization information reported by the target object; the desensitization information is generated by the target object performing desensitization processing on its own sensitive information according to the assigned second noise information;
  • the generating module 440 generates desensitization statistical information according to the desensitization information reported by each target object;
  • the noise removal module 450 performs noise removal processing on the desensitization statistical information based on the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • the splitting and issuing module 420 includes: a splitting unit that, based on a first number of the target objects, splits the first noise information into the first number of pieces of second noise information; and an issuing unit that delivers the plurality of second noise information to the plurality of target objects according to a preset allocation rule; the allocation rule includes a random allocation rule.
  • the noise removal module 450 includes: a first removal unit that removes the first noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • the first noise information, the sensitive information, and the desensitization information are in numerical form.
  • the generating module 440 includes a calculation unit, which calculates the sum of the desensitization information to obtain the desensitization statistical information.
  • the first removing unit calculates the difference between the desensitization statistical information and the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • the first removing unit obtains deviation noise information corresponding to the target object group, where the deviation noise information is obtained by removing from the first noise information each piece of second noise information that is issued; removes the deviation noise information from the first noise information to obtain third noise information; and removes the third noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • there are a plurality of target object groups.
  • the device further includes: a third obtaining module, which obtains group identification information corresponding to each of the target object groups.
  • the noise removal module 450 includes: a denoising unit that, based on the first noise information and the group identification information respectively corresponding to each target object group, performs noise removal processing on the desensitization statistical information corresponding to each target object group.
  • the device further includes: a calculation module that, before the first noise information corresponding to the target object group to be counted is obtained, calculates, for a plurality of objects to be counted, the hash value corresponding to each object to be counted; and a grouping module that groups the objects to be counted according to their hash values to obtain at least one object group to be counted; each object group to be counted includes a plurality of objects to be counted that have sensitive information to be counted.
  • the noise information is delivered to the target object, so that each target object can desensitize its own sensitive information based on the assigned second noise information and report the resulting desensitization information, which avoids the leakage that would occur if the sensitive information were transmitted directly.
  • the desensitization statistical information is generated according to the desensitization information reported by each target object, and noise removal is performed on the desensitization statistical information based on the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group. Thus, by desensitizing the sensitive information and removing noise from the desensitization statistical information, accurate statistics of the sensitive information are ensured at the group level. Moreover, the noise information and the sensitive information are not stored or passed on at any intermediate stage, and noise never has to be removed from the sensitive information of a single target object. The final result therefore retains only the statistical information of the target object group, while the information characteristics of the original sensitive information disappear entirely, which protects the sensitive information.
  • information statistics devices may differ considerably depending on configuration or performance, and may include one or more processors 501 and a memory 502, where the memory 502 may store one or more application programs or data. The memory 502 may be transient storage or persistent storage.
  • the application program stored in the memory 502 may include one or more modules (not shown in the figure), and each module may include a series of computer-executable instructions for the information statistics device.
  • the processor 501 may be configured to communicate with the memory 502, and execute a series of computer-executable instructions in the memory 502 on the information statistics device.
  • the information statistics device may also include one or more power supplies 503, one or more wired or wireless network interfaces 504, one or more input and output interfaces 505, and one or more keyboards 506.
  • the information statistics device includes a memory and one or more programs; the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the information statistics device, and the one or more programs are configured to be executed by one or more processors.
  • the second noise information is delivered to the target object; the desensitization information reported by the target object is acquired; the desensitization information is generated by the target object performing desensitization processing on its own sensitive information according to the assigned second noise information; desensitization statistical information is generated according to the desensitization information reported by each target object; and noise removal processing is performed on the desensitization statistical information based on the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • when executed, the computer-executable instructions may also cause the processor to: split the first noise information, based on a first number of the target objects, into the first number of pieces of second noise information; and deliver the plurality of second noise information to the plurality of target objects according to a preset allocation rule; the allocation rule includes a random allocation rule.
  • when executed, the computer-executable instructions may also cause the processor to remove the first noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • the first noise information, the sensitive information, and the desensitization information are in numerical form.
  • when executed, the computer-executable instructions may also cause the processor to calculate the sum of the desensitization information to obtain the desensitization statistical information.
  • removing the first noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group includes: calculating the difference between the desensitization statistical information and the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • when executed, the computer-executable instructions may also cause the processor to: obtain the deviation noise information corresponding to the target object group, where the deviation noise information is obtained by removing from the first noise information each piece of second noise information that is issued; remove the deviation noise information from the first noise information to obtain third noise information; and remove the third noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • there are multiple target object groups.
  • when executed, the computer-executable instructions may also cause the processor to: obtain the group identification information corresponding to each of the target object groups; and, based on the first noise information and the group identification information respectively corresponding to each target object group, perform noise removal processing on the desensitization statistical information corresponding to each target object group.
  • when executed, the computer-executable instructions may also cause the processor to: before obtaining the first noise information corresponding to the target object group to be counted, calculate, for a plurality of objects to be counted, the hash value corresponding to each object to be counted; and group the objects to be counted according to their hash values to obtain at least one object group to be counted; each object group to be counted includes a plurality of objects to be counted that have sensitive information to be counted.
  • One or more embodiments of this specification also propose a computer-readable storage medium that stores one or more programs, and the one or more programs include instructions.
  • the electronic device can execute the above-mentioned information statistics method, and is specifically used to: obtain the first noise information corresponding to the target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted; split the first noise information to obtain a plurality of second noise information, and respectively deliver the plurality of second noise information to the target objects; obtain the desensitization information reported by the target objects, where the desensitization information is generated by each target object performing desensitization processing on its own sensitive information according to the assigned second noise information; generate desensitization statistical information according to the desensitization information reported by each target object; and perform noise removal processing on the desensitization statistical information based on the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.
  • a typical implementation device is a computer.
  • the computer may be, for example, a personal computer, a laptop computer, a cell phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or Any combination of these devices.
  • one or more embodiments of this specification can be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a means for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, and the instruction means implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps is executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-permanent memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM).
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology.
  • the information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • This application can also be practiced in distributed computing environments. In these distributed computing environments, tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in local and remote computer storage media including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Complex Calculations (AREA)

Abstract

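One or more embodiments of this specification disclose an information statistics method and apparatus, intended to solve the problems that existing information desensitization is inefficient and its desensitization effect is poor. The method includes: obtaining first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted; splitting the first noise information to obtain a plurality of pieces of second noise information, and delivering the plurality of pieces of second noise information to the target objects respectively; obtaining desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it; generating desensitization statistical information according to the desensitization information reported by the target objects; and performing noise removal on the desensitization statistical information based on the first noise information, to obtain statistical information of the sensitive information corresponding to the target object group.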

Description

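Information statistics
Technical field
This specification relates to the field of information processing technology, and in particular to an information statistics method and apparatus.
Background
As the protection of private data and sensitive information becomes increasingly strict in markets worldwide, an urgent problem is how to compute accurate statistics over the data as a whole while ensuring that each individual's private data is not disclosed to any party other than that individual's own terminal.
In the prior art, private data or sensitive information usually has to be protected individually. That is, for each item of private data or sensitive information, desensitization is achieved through interaction with the data system (for example, reading and writing an interaction file for the sensitive information, and reading the sensitive information out of the interaction file), and the desensitized information is written back into the interaction file after desensitization. When multiple items of private data or sensitive information need to be protected, for example when statistics over multiple items of sensitive information are required, this interaction has to be performed many times. The desensitization process is therefore cumbersome and inefficient, and individual data (such as a single item of sensitive information) may be re-identified, which ultimately leads to leakage of sensitive information.
Summary of the invention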
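In one aspect, one or more embodiments of this specification provide an information statistics method, including: obtaining first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted; splitting the first noise information to obtain a plurality of pieces of second noise information, and delivering the plurality of pieces of second noise information to the target objects respectively; obtaining desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it; generating desensitization statistical information according to the desensitization information reported by the target objects; and performing noise removal on the desensitization statistical information based on the first noise information, to obtain statistical information of the sensitive information corresponding to the target object group.
In another aspect, one or more embodiments of this specification provide an information statistics apparatus, including: a first obtaining module, which obtains first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted; a splitting and issuing module, which splits the first noise information to obtain a plurality of pieces of second noise information and delivers the plurality of pieces of second noise information to the target objects respectively; a second obtaining module, which obtains desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it; a generating module, which generates desensitization statistical information according to the desensitization information reported by the target objects; and a noise removal module, which performs noise removal on the desensitization statistical information based on the first noise information, to obtain statistical information of the sensitive information corresponding to the target object group.
In yet another aspect, one or more embodiments of this specification provide an information statistics device, characterized by including: a processor; and a memory arranged to store computer-executable instructions which, when executed, cause the processor to: obtain first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted; split the first noise information to obtain a plurality of pieces of second noise information, and deliver the plurality of pieces of second noise information to the target objects respectively; obtain desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it; generate desensitization statistical information according to the desensitization information reported by the target objects; and perform noise removal on the desensitization statistical information based on the first noise information, to obtain statistical information of the sensitive information corresponding to the target object group.
In yet another aspect, one or more embodiments of this specification provide a storage medium for storing computer-executable instructions which, when executed, implement the following process: obtaining first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted; splitting the first noise information to obtain a plurality of pieces of second noise information, and delivering the plurality of pieces of second noise information to the target objects respectively; obtaining desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it; generating desensitization statistical information according to the desensitization information reported by the target objects; and performing noise removal on the desensitization statistical information based on the first noise information, to obtain statistical information of the sensitive information corresponding to the target object group.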
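Brief description of the drawings
To describe the technical solutions in one or more embodiments of this specification or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely some of the embodiments recorded in one or more embodiments of this specification, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an information statistics method according to an embodiment of this specification;
Fig. 2 is a schematic flowchart of an information statistics method according to another embodiment of this specification;
Fig. 3 is a schematic flowchart of an information statistics method according to yet another embodiment of this specification;
Fig. 4 is a schematic block diagram of an information statistics apparatus according to an embodiment of this specification;
Fig. 5 is a schematic block diagram of an information statistics device according to an embodiment of this specification.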
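Detailed description
One or more embodiments of this specification provide an information statistics method and apparatus, intended to solve the problems that existing information desensitization is inefficient and its desensitization effect is poor.
To help a person skilled in the art better understand the technical solutions in one or more embodiments of this specification, these technical solutions are described clearly and completely below with reference to the drawings in one or more embodiments of this specification. Obviously, the described embodiments are only some, rather than all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art based on one or more embodiments of this specification without creative effort shall fall within the protection scope of one or more embodiments of this specification.
The information statistics method provided by one or more embodiments of this specification applies where statistics are needed over all or some of a plurality of objects to be counted, each of which contains sensitive information to be counted. Before statistics are computed over the objects to be counted, the objects may first be divided into at least one object group to be counted, so that statistics can be computed for any one of these groups.
How to group the objects to be counted is described first. In one embodiment, the objects to be counted may be grouped according to related information of each object and a preset grouping algorithm. The related information of an object to be counted may be any information that uniquely characterizes the object, such as a hash value, an object name, or an object identifier. The preset grouping algorithm may be any algorithm that achieves grouping, such as bucketing or random grouping.
Take the case where the related information is a hash value and the preset grouping algorithm is bucketing. To group the objects to be counted, the hash value corresponding to each object is calculated first, and the objects are then grouped by bucketing according to their hash values. Because bucketing is existing technology, it is not described further here. The information statistics method provided by the embodiments of this specification is described in detail below.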
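The hash-based bucketing described above can be sketched as follows. This is a minimal illustration only: the specification does not name a hash function or a bucket count, so SHA-256, the modulo rule, and the names `bucket_of` and `group_objects` are all assumptions.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List


def bucket_of(object_id: str, num_buckets: int) -> int:
    """Map an object identifier to a bucket index (hypothetical helper)."""
    # SHA-256 is an assumed choice; any stable hash would serve the same purpose.
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def group_objects(object_ids: Iterable[str], num_buckets: int) -> Dict[int, List[str]]:
    """Group the objects to be counted by the bucket their hash falls into."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for object_id in object_ids:
        groups[bucket_of(object_id, num_buckets)].append(object_id)
    return dict(groups)


groups = group_objects(["123456", "123457", "123458"], num_buckets=2)
```

Each resulting bucket would then play the role of one object group to be counted.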
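Fig. 1 is a schematic flowchart of an information statistics method according to an embodiment of this specification. As shown in Fig. 1, the method is applied in an information statistics device and includes:
S102: obtain first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted.
The target object group may be any one of the object groups to be counted obtained by the prior grouping.
The first noise information may be in numerical form or in another data format. The first noise information may be generated by the information statistics device, or may be generated by another device connected to the information statistics device and obtained by the information statistics device from that other device.
S104: split the first noise information to obtain a plurality of pieces of second noise information, and deliver the plurality of pieces of second noise information to the target objects respectively.
After the pieces of second noise information are delivered and delivery succeeds, each target object desensitizes its own sensitive information based on the second noise information it received, and reports the resulting desensitization information.
In one embodiment, if the first noise information is numerical, it is split by value, and the values of the resulting pieces of second noise information sum to the first noise information. For example, if the first noise information is 100, it may be split by value into two pieces of second noise information, 40 and 60.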
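One possible way to split a numerical first noise value into pieces that sum back to it is sketched below. The specification only requires that the pieces add up to the first noise value; the random cut-point method, the restriction to positive integers, and the name `split_noise` are assumptions made for illustration.

```python
import random
from typing import List, Optional


def split_noise(total: int, parts: int, rng: Optional[random.Random] = None) -> List[int]:
    """Split `total` into `parts` positive integers that sum to `total` (hypothetical helper)."""
    if parts < 1 or total < parts:
        raise ValueError("need 1 <= parts <= total")
    rng = rng or random.Random()
    # Choose parts-1 distinct cut points strictly inside (0, total);
    # the gaps between consecutive cut points are the second noise values.
    cuts = sorted(rng.sample(range(1, total), parts - 1))
    bounds = [0] + cuts + [total]
    return [bounds[i + 1] - bounds[i] for i in range(parts)]


shares = split_noise(100, 2, random.Random(0))
```

A split such as 40 and 60 is one possible outcome for a total of 100 and two parts.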
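In addition, the second noise information delivered to each target object need not be recorded, to avoid the risk of the second noise information being leaked.
S106: obtain the desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it.
S108: generate desensitization statistical information according to the desensitization information reported by the target objects.
When desensitization information reported by multiple target objects is received, statistics may be computed over it to generate the corresponding desensitization statistical information. For example, computing statistics over obtained desensitization information a and desensitization information b yields the desensitization statistical information a+b.
S110: perform noise removal on the desensitization statistical information based on the first noise information, to obtain the statistical information of the sensitive information corresponding to the target object group.
With the technical solution of one or more embodiments of this specification, the first noise information corresponding to the target object group to be counted is obtained and split into a plurality of pieces of second noise information, which are delivered to the target objects respectively, so that each target object can desensitize its own sensitive information based on the second noise information assigned to it and report the resulting desensitization information. This avoids the leakage that would occur if the sensitive information were transmitted directly. When the desensitization information reported by the target objects is obtained, desensitization statistical information is generated from it and noise removal is performed on the desensitization statistical information based on the first noise information, to obtain the statistical information of the sensitive information corresponding to the target object group. Thus, by desensitizing the sensitive information and removing noise from the desensitization statistical information, accurate statistics of the sensitive information are ensured at the group level. Moreover, the noise information and the sensitive information are not stored or passed on at any intermediate stage, and noise never has to be removed from the sensitive information of a single target object. The final result therefore retains only the statistical information of the target object group, while the information characteristics of the original sensitive information disappear entirely, which protects the sensitive information.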
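In one embodiment, the number of pieces of second noise information equals the number of target objects in the target object group. On this basis, S104 may be performed as the following steps:
First, based on a first number of the target objects, split the first noise information into the first number of pieces of second noise information.
Second, deliver the pieces of second noise information to the target objects respectively according to a preset allocation rule, where the allocation rule includes a random allocation rule.
For example, if the target object group includes 10 target objects, the first noise information is split into 10 pieces of second noise information, and these 10 pieces are delivered at random to the 10 target objects, so that each target object desensitizes its own sensitive information based on the second noise information assigned to it.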
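The random allocation rule can be sketched as a shuffle followed by one delivery per target object. The function name `deliver_randomly` and the `send` callback are hypothetical; the specification does not describe the delivery channel. The sketch deliberately keeps no record of which piece went to which target object.

```python
import random
from typing import Callable, Optional, Sequence


def deliver_randomly(
    shares: Sequence[int],
    targets: Sequence[str],
    send: Callable[[str, int], None],
    rng: Optional[random.Random] = None,
) -> None:
    """Deliver one second noise value to each target object in random order."""
    if len(shares) != len(targets):
        raise ValueError("this variant expects one piece per target object")
    rng = rng or random.Random()
    shuffled = list(shares)
    rng.shuffle(shuffled)
    for target, share in zip(targets, shuffled):
        send(target, share)  # nothing about the assignment is stored here


# Stand-in for the network: a dict that plays the role of the target objects.
received = {}
deliver_randomly([10, 20, 30], ["a", "b", "c"], received.__setitem__, random.Random(1))
```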
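When the number of pieces of second noise information equals the number of target objects in the target object group, the pieces of second noise information sum to the first noise information; that is, the deviation between the first noise information and the sum of the second noise information delivered to the target objects is 0. Therefore, when S110 is performed, the first noise information can be removed from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
In one embodiment, if the first noise information, the sensitive information, and the desensitization information are all numerical, the desensitization statistical information can be obtained by calculating the sum of the desensitization information. When the first noise information is removed from the desensitization statistical information, the statistical information of the sensitive information corresponding to the target object group is obtained by calculating the difference between the desensitization statistical information and the first noise information.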
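The numerical case can be written out explicitly. The symbols below are not in the original and are introduced only for illustration: $s_i$ is the sensitive value of target object $i$, $n_i$ the second noise value assigned to it, $N$ the first noise value, $k$ the number of target objects, and $\tilde{S}$ the desensitization statistical information, with $\sum_{i=1}^{k} n_i = N$.

$$
\tilde{S} \;=\; \sum_{i=1}^{k} (s_i + n_i) \;=\; \sum_{i=1}^{k} s_i + N
\quad\Longrightarrow\quad
S \;=\; \sum_{i=1}^{k} s_i \;=\; \tilde{S} - N
$$

Under these assumptions only the total $N$ is needed for noise removal; the individual $n_i$ never have to be known at that step.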
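For example, the first noise information is 100 and the target object group includes 10 target objects. The information statistics device splits the first noise information into 10 pieces of second noise information and delivers them at random to the 10 target objects; each target object then desensitizes its own sensitive information based on the assigned second noise information and reports the result. The information statistics device computes statistics over the received desensitization information to obtain the desensitization statistical information, and then subtracts the first noise information 100 from it to obtain the statistical information. As can be seen, the statistics process involves no onward transfer of the second noise information used for desensitization, nor does it rely on the second noise information for noise removal, so it yields accurate statistical information while protecting the sensitive information well.
Fig. 2 shows a schematic flowchart of an information statistics method according to another embodiment of this specification. In this embodiment, the first noise information, the sensitive information, and the desensitization information are all numerical, so the first noise information may be called the first noise value and the second noise information the second noise value. As shown in Fig. 2, the method includes:
S201: obtain a first noise value corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted.
For example, the target object group includes the following two target objects: the user whose user identifier is "123456" (hereinafter user "123456") and the user whose user identifier is "123457" (hereinafter user "123457"):
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123456 | 32 | 102 |
| 123457 | 67 | 79 |
Here, UserId is the user identifier, with values "123456" and "123457". AccessCount is the access count, with values "32" and "67". HeartBeat is the heartbeat count, with values "102" and "79". Both the access count and the heartbeat count are sensitive information. This embodiment assumes that statistics are needed over the access counts and heartbeat counts of user "123456" and user "123457".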
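S202: split the first noise value into second noise values whose number matches the number of target objects, and deliver the second noise values to the target objects respectively.
Each target object desensitizes its own sensitive information based on the assigned second noise value and reports the resulting desensitization information to the information statistics device.
In this embodiment, the second noise values obtained by splitting may be assigned to the target objects at random. The information statistics device records neither the second noise values obtained by splitting nor which second noise value was assigned to which target object. That is, the second noise values are not stored or passed on at any intermediate stage.
Because the target object group includes 2 target objects, the first noise value can be split into 2 second noise values. Assume that the first noise value is 100, that the second noise values after splitting are 40 and 60, and that the second noise value 40 is delivered to user "123456" and the second noise value 60 to user "123457".
After receiving the second noise value 40, user "123456" desensitizes its sensitive information based on it. Specifically, the second noise value 40 is added to its sensitive information (access count "32" and heartbeat count "102"), giving the following desensitization information:
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123456 | 72 | 142 |
After receiving the second noise value 60, user "123457" desensitizes its sensitive information based on it. Specifically, the second noise value 60 is added to its sensitive information (access count "67" and heartbeat count "79"), giving the following desensitization information:
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123457 | 127 | 139 |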
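S203: obtain the desensitization information reported by each target object.
S204: calculate the sum of the desensitization information reported by the target objects to obtain the desensitization statistical information.
Continuing the example, the desensitization information reported by user "123456" and by user "123457" is summed: the desensitized access counts of the two users are added (72+127=199), and the desensitized heartbeat counts are added (142+139=281), giving the following desensitization statistical information:
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123456 | 199 | 281 |
S205: calculate the difference between the desensitization statistical information and the first noise value to obtain the statistical information of the sensitive information corresponding to the target object group.
Continuing the example, subtracting the first noise value 100 from the desensitization statistical information gives the following statistical information:
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123456 | 99 | 181 |
As the above embodiment shows, the final statistical information retains only the statistics of the target object group, while the information characteristics of the original sensitive information disappear entirely. Moreover, the sensitive information (such as access counts "67" and "32" and heartbeat counts "79" and "102") and the second noise values used for desensitization are not stored or passed on at any intermediate stage. The statistical information of the sensitive information corresponding to the target object group can therefore be computed accurately while the sensitive information of the target objects is well protected.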
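The Fig. 2 walkthrough can be reproduced with a short sketch. The function names `desensitize`, `aggregate`, and `denoise` are hypothetical, and running both sides in one process is a simplification: in the described scheme the masking runs on each target object and only the masked reports reach the information statistics device.

```python
from typing import Dict, List

FIRST_NOISE = 100

# Raw sensitive values; in the described scheme these never leave the target objects.
records = {
    "123456": {"AccessCount": 32, "HeartBeat": 102},
    "123457": {"AccessCount": 67, "HeartBeat": 79},
}
# Second noise values from the example (40 + 60 = 100).
shares = {"123456": 40, "123457": 60}


def desensitize(record: Dict[str, int], share: int) -> Dict[str, int]:
    """Target-object side: add the assigned second noise value to every field."""
    return {field: value + share for field, value in record.items()}


def aggregate(reports: List[Dict[str, int]]) -> Dict[str, int]:
    """Device side: sum the reported desensitization information field by field."""
    totals: Dict[str, int] = {}
    for report in reports:
        for field, value in report.items():
            totals[field] = totals.get(field, 0) + value
    return totals


def denoise(totals: Dict[str, int], noise: int) -> Dict[str, int]:
    """Device side: subtract the first noise value from each masked total."""
    return {field: total - noise for field, total in totals.items()}


reports = [desensitize(records[uid], shares[uid]) for uid in records]
masked_totals = aggregate(reports)
stats = denoise(masked_totals, FIRST_NOISE)
```

Here `stats` equals the plain sums 32+67 and 102+79, even though the device side only ever handles masked values.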
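In the above example, each target object holds multiple items of sensitive information (the access count and the heartbeat count), so multiple items of sensitive information of multiple target objects have to be counted at the same time, and the example uses the same first noise information for all of them. Note that when multiple items of sensitive information of multiple target objects are to be counted, the different items are not limited to the same first noise information; different first noise information may also be used. For example, a first noise value of 100 is used for the sensitive information "access count" and a first noise value of 200 for the sensitive information "heartbeat count". In this case the statistics procedure is still the same as that shown in Fig. 2, except that when S205 is performed, the first noise value 100 is subtracted from the "access count" in the desensitization statistical information and the first noise value 200 is subtracted from the "heartbeat count", giving the statistical information corresponding to the target object group.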
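A per-field variant can be sketched as follows, using a first noise value of 100 for the access count and 200 for the heartbeat count. The specific second noise values (40/60 and 80/120) are hypothetical; the specification gives only the two first noise values.

```python
# One first noise value per sensitive field.
first_noise = {"AccessCount": 100, "HeartBeat": 200}

records = {
    "123456": {"AccessCount": 32, "HeartBeat": 102},
    "123457": {"AccessCount": 67, "HeartBeat": 79},
}
# Assumed split: 40 + 60 = 100 for AccessCount, 80 + 120 = 200 for HeartBeat.
shares = {
    "123456": {"AccessCount": 40, "HeartBeat": 80},
    "123457": {"AccessCount": 60, "HeartBeat": 120},
}

# Sum of the masked values per field (the desensitization statistical information).
masked_totals = {
    field: sum(records[uid][field] + shares[uid][field] for uid in records)
    for field in first_noise
}
# Subtract each field's own first noise value.
stats = {field: masked_totals[field] - first_noise[field] for field in first_noise}
```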
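In one embodiment, the number of pieces of second noise information is greater than the number of target objects in the target object group. In this case, after the second noise information is delivered to the target objects, some pieces of second noise information remain unassigned.
When the number of pieces of second noise information differs from the number of target objects, a subset of the pieces may be selected at random and delivered to the target objects, the number selected being equal to the number of target objects. Alternatively, the pieces may be delivered to the target objects one by one in the order in which they were split, until every target object has been assigned second noise information.
When the number of pieces of second noise information differs from the number of target objects, removing the first noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group may be performed as the following steps:
First, obtain deviation noise information corresponding to the target object group, where the deviation noise information is obtained by removing from the first noise information each piece of second noise information that is delivered.
For example, the first noise information is 100 and is split into 3 pieces of second noise information, such as 20, 30, and 50. The target object group includes 2 target objects. Two of the pieces are selected and delivered to the target objects; assuming the delivered pieces are 20 and 30, the second noise information 50 is the deviation noise information corresponding to the target object group.
Second, remove the deviation noise information from the first noise information to obtain third noise information.
Continuing the example, removing the deviation noise information 50 from the first noise information 100 gives the third noise information 50.
Third, remove the third noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
Continuing the example, removing the third noise information 50 from the desensitization statistical information gives the statistical information corresponding to the target object group.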
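Fig. 3 shows a schematic flowchart of an information statistics method according to yet another embodiment of this specification. In this embodiment, the first noise information, the sensitive information, and the desensitization information are all numerical, so the first noise information may be called the first noise value, the second noise information the second noise value, and the deviation noise information the deviation noise value. As shown in Fig. 3, the method includes:
S301: obtain a first noise value corresponding to a target object group to be counted, where the target object group includes a first number of target objects that have sensitive information to be counted.
For example, the target object group includes the following two target objects: the user whose user identifier is "123456" (hereinafter user "123456") and the user whose user identifier is "123457" (hereinafter user "123457"):
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123456 | 32 | 102 |
| 123457 | 67 | 79 |
Here, UserId is the user identifier, with values "123456" and "123457". AccessCount is the access count, with values "32" and "67". HeartBeat is the heartbeat count, with values "102" and "79". Both the access count and the heartbeat count are sensitive information. This embodiment assumes that statistics are needed over the access counts and heartbeat counts of user "123456" and user "123457".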
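S302: split the first noise value into a second number of second noise values, where the second number is greater than the first number.
That is, the number of second noise values obtained by splitting is greater than the number of target objects.
S303: select the first number of second noise values from the second number of second noise values, and deliver the selected second noise values to the target objects respectively.
The way the second noise values are selected is not limited, provided that the number selected equals the first number. For example, the first number of second noise values may be selected at random, or selected in the order in which the second noise values were split, and so on.
Each target object desensitizes its own sensitive information based on the assigned second noise value and reports the resulting desensitization information to the information statistics device.
In this embodiment, the selected first number of second noise values may be assigned to the target objects at random. The information statistics device records neither the second noise values obtained by splitting nor which second noise value was assigned to which target object. That is, the second noise values are not stored or passed on at any intermediate stage.
Because the first number is 2 (there are 2 target objects), assume the second number is 3, that is, the first noise value is split into 3 second noise values. Assume the first noise value is 100 and the second noise values after splitting are 20, 30, and 50. Two second noise values, 20 and 30, are then selected at random and delivered to the 2 target objects respectively, for example the second noise value 20 to user "123456" and the second noise value 30 to user "123457".
After receiving the second noise value 20, user "123456" desensitizes its sensitive information based on it. Specifically, the second noise value 20 is added to its sensitive information (access count "32" and heartbeat count "102"), giving the following desensitization information:
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123456 | 52 | 122 |
After receiving the second noise value 30, user "123457" desensitizes its sensitive information based on it. Specifically, the second noise value 30 is added to its sensitive information (access count "67" and heartbeat count "79"), giving the following desensitization information:
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123457 | 97 | 109 |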
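S304: obtain the desensitization information reported by each target object.
S305: calculate the sum of the desensitization information reported by the target objects to obtain the desensitization statistical information.
Continuing the example, the desensitization information reported by user "123456" and by user "123457" is summed: the desensitized access counts of the two users are added (52+97=149), and the desensitized heartbeat counts are added (122+109=231), giving the following desensitization statistical information:
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123456 | 149 | 231 |
S306: obtain the deviation noise value corresponding to the target object group, where the deviation noise value is obtained by removing from the first noise value each second noise value that is delivered.
Continuing the example, the second noise values 20 and 30 were delivered out of the first noise value 100, so removing each delivered second noise value (20 and 30) from the first noise value 100 gives a deviation noise value of 50.
In effect, the deviation noise value is the total of the second noise values not delivered to any target object. Because the information statistics device in this embodiment does not record the second noise values obtained by splitting, the deviation noise value can be determined in the manner of S306.
S307: remove the deviation noise value from the first noise value to obtain a third noise value, and then remove the third noise value from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
Continuing the example, removing the deviation noise value 50 from the first noise value 100 gives the third noise value 50. Removing the third noise value 50 from the desensitization statistical information then gives the statistical information corresponding to the target object group.
In effect, the third noise value is the sum of the second noise values delivered to the target objects. Because the information statistics device in this embodiment does not record which second noise value was delivered to which target object, and not all of the second noise values obtained by splitting were delivered (that is, there is a deviation), the third noise value to be removed from the desensitization statistical information can be determined in the manner of S307, which yields accurate statistical information.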
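The relationship between the three noise values can be stated compactly. The symbols are introduced here for illustration and are not in the original: $N$ is the first noise value, $n_1,\dots,n_k$ the second noise values actually delivered to the $k$ target objects, $N_{\mathrm{dev}}$ the deviation noise value, $N_3$ the third noise value, and $\tilde{S}$ the desensitization statistical information.

$$
N_{\mathrm{dev}} = N - \sum_{i=1}^{k} n_i, \qquad
N_3 = N - N_{\mathrm{dev}} = \sum_{i=1}^{k} n_i, \qquad
S = \tilde{S} - N_3
$$

With the example values: $N_{\mathrm{dev}} = 100 - 20 - 30 = 50$ and $N_3 = 100 - 50 = 50$, so the access count is $149 - 50 = 99$ and the heartbeat count is $231 - 50 = 181$.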
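Subtracting the third noise value 50 from the above desensitization statistical information gives the following statistical information:
| UserId | AccessCount | HeartBeat |
|--|--|--|
| 123456 | 99 | 181 |
As the above embodiment shows, the final statistical information retains only the statistics of the target object group, while the information characteristics of the original sensitive information disappear entirely. Moreover, the sensitive information (such as access counts "67" and "32" and heartbeat counts "79" and "102") and the second noise values used for desensitization are not stored or passed on at any intermediate stage. The statistical information of the sensitive information corresponding to the target object group can therefore be computed accurately while the sensitive information of the target objects is well protected.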
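The Fig. 3 variant can be sketched with a running remainder, so that no individual second noise value has to be kept after delivery. The class `NoiseBudget` and its members are hypothetical names, and the masking, which in the described scheme happens on each target object, is simulated in the same loop for brevity.

```python
class NoiseBudget:
    """Tracks only a running remainder of the first noise value (hypothetical helper)."""

    def __init__(self, first_noise: int) -> None:
        self.first_noise = first_noise
        self.remaining = first_noise  # ends up as the deviation noise value

    def issue(self, share: int) -> int:
        """Deduct a delivered second noise value from the remainder and hand it out."""
        self.remaining -= share
        return share

    @property
    def third_noise(self) -> int:
        """First noise value minus deviation noise value, i.e. the total actually delivered."""
        return self.first_noise - self.remaining


budget = NoiseBudget(100)
records = [
    {"AccessCount": 32, "HeartBeat": 102},  # user "123456"
    {"AccessCount": 67, "HeartBeat": 79},   # user "123457"
]
pool = [20, 30, 50]  # three second noise values for two target objects

masked_totals = {"AccessCount": 0, "HeartBeat": 0}
for record, share in zip(records, pool):  # zip stops after two pieces; 50 is never issued
    issued = budget.issue(share)
    for field, value in record.items():
        masked_totals[field] += value + issued  # masking simulated on the device side

stats = {field: total - budget.third_noise for field, total in masked_totals.items()}
```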
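In the above example, each target object holds multiple items of sensitive information (the access count and the heartbeat count), so multiple items of sensitive information of multiple target objects have to be counted at the same time, and the example uses the same first noise information for all of them. Note that when multiple items of sensitive information of multiple target objects are to be counted, the different items are not limited to the same first noise information; different first noise information may also be used. For example, a first noise value of 100 is used for the sensitive information "access count" and a first noise value of 200 for the sensitive information "heartbeat count". In this case the statistics procedure is still the same as that shown in Fig. 3.
In one embodiment, there are multiple target object groups, each with its own group identification information. Different or identical first noise information may be obtained for the groups, and statistics are then computed for each target object group with its corresponding first noise information; the information statistics method performed for each group is the same as that described in the above embodiments. When S110 is performed, noise removal may be performed on the desensitization statistical information of each target object group based on the first noise information and the group identification information corresponding to that group. The group identification information corresponding to each target object group, obtained in advance, may be generated by the information statistics device, or may be generated by another device connected to the information statistics device and obtained by the information statistics device from that other device.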
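For multiple target object groups, noise removal can be sketched as a lookup keyed by the group identifier. The identifiers `group-a` and `group-b`, the numeric values for the second group, and the function name `denoise_groups` are assumptions for illustration.

```python
from typing import Dict


def denoise_groups(
    masked_totals_by_group: Dict[str, int],
    first_noise_by_group: Dict[str, int],
) -> Dict[str, int]:
    """Subtract each group's own first noise value from that group's masked total."""
    stats: Dict[str, int] = {}
    for group_id, masked_total in masked_totals_by_group.items():
        stats[group_id] = masked_total - first_noise_by_group[group_id]
    return stats


# The group identifier ties each masked total to the first noise value used for that group.
first_noise_by_group = {"group-a": 100, "group-b": 250}
masked_totals_by_group = {"group-a": 199, "group-b": 310}
group_stats = denoise_groups(masked_totals_by_group, first_noise_by_group)
```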
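In addition, following the same idea, for data that can be combined, other suitable noise formats may be used for desensitization and noise removal when computing statistics, so that the approach applies to different forms of data and supports statistics in multiple scenarios.
In summary, specific embodiments of the subject matter have been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims may be performed in a different order and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing may be advantageous.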
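The foregoing is the information statistics method provided by one or more embodiments of this specification. Based on the same idea, one or more embodiments of this specification further provide an information statistics apparatus.
Fig. 4 is a schematic block diagram of an information statistics apparatus according to an embodiment of this specification. As shown in Fig. 4, the information statistics apparatus includes:
a first obtaining module 410, which obtains first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted;
a splitting and issuing module 420, which splits the first noise information to obtain a plurality of pieces of second noise information and delivers the plurality of pieces of second noise information to the target objects respectively;
a second obtaining module 430, which obtains desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it;
a generating module 440, which generates desensitization statistical information according to the desensitization information reported by the target objects;
a noise removal module 450, which performs noise removal on the desensitization statistical information based on the first noise information, to obtain statistical information of the sensitive information corresponding to the target object group.
In one embodiment, the splitting and issuing module 420 includes: a splitting unit, which splits the first noise information, based on a first number of the target objects, into the first number of pieces of second noise information; and an issuing unit, which delivers the plurality of pieces of second noise information to the plurality of target objects respectively according to a preset allocation rule, where the allocation rule includes a random allocation rule.
In one embodiment, the noise removal module 450 includes: a first removing unit, which removes the first noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
In one embodiment, the first noise information, the sensitive information, and the desensitization information are in numerical form. The generating module 440 includes: a calculation unit, which calculates the sum of the desensitization information to obtain the desensitization statistical information. The first removing unit calculates the difference between the desensitization statistical information and the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.
In one embodiment, the first removing unit: obtains deviation noise information corresponding to the target object group, where the deviation noise information is obtained by removing from the first noise information each piece of second noise information that is delivered; removes the deviation noise information from the first noise information to obtain third noise information; and removes the third noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
In one embodiment, there are multiple target object groups. The apparatus further includes: a third obtaining module, which obtains the group identification information corresponding to each target object group. The noise removal module 450 includes: a denoising unit, which performs noise removal on the desensitization statistical information corresponding to each target object group based on the first noise information and the group identification information corresponding to that group.
In one embodiment, the apparatus further includes: a calculation module, which, before the first noise information corresponding to the target object group to be counted is obtained, calculates, for a plurality of objects to be counted, the hash value corresponding to each object to be counted; and a grouping module, which groups the objects to be counted according to their hash values to obtain at least one object group to be counted, where each object group to be counted includes a plurality of objects to be counted that have sensitive information to be counted.
With the apparatus of one or more embodiments of this specification, the first noise information corresponding to the target object group to be counted is obtained and split into a plurality of pieces of second noise information, which are delivered to the target objects respectively, so that each target object can desensitize its own sensitive information based on the second noise information assigned to it and report the resulting desensitization information. This avoids the leakage that would occur if the sensitive information were transmitted directly. When the desensitization information reported by the target objects is obtained, desensitization statistical information is generated from it and noise removal is performed on the desensitization statistical information based on the first noise information, to obtain the statistical information of the sensitive information corresponding to the target object group. Thus, by desensitizing the sensitive information and removing noise from the desensitization statistical information, accurate statistics of the sensitive information are ensured at the group level. Moreover, the noise information and the sensitive information are not stored or passed on at any intermediate stage, and noise never has to be removed from the sensitive information of a single target object. The final result therefore retains only the statistical information of the target object group, while the information characteristics of the original sensitive information disappear entirely, which protects the sensitive information.
A person skilled in the art will understand that the above information statistics apparatus can be used to implement the information statistics method described above, and that the details are similar to those of the method described earlier; to avoid repetition, they are not described again here.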
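Based on the same idea, one or more embodiments of this specification further provide an information statistics device, as shown in Fig. 5. Information statistics devices may differ considerably depending on configuration or performance, and may include one or more processors 501 and a memory 502, where the memory 502 may store one or more application programs or data. The memory 502 may be transient storage or persistent storage. An application program stored in the memory 502 may include one or more modules (not shown in the figure), and each module may include a series of computer-executable instructions for the information statistics device. Further, the processor 501 may be configured to communicate with the memory 502 and to execute, on the information statistics device, the series of computer-executable instructions in the memory 502. The information statistics device may also include one or more power supplies 503, one or more wired or wireless network interfaces 504, one or more input/output interfaces 505, and one or more keyboards 506.
Specifically, in this embodiment, the information statistics device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the information statistics device, and the one or more programs are configured to be executed by one or more processors and contain computer-executable instructions for: obtaining first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted; splitting the first noise information to obtain a plurality of pieces of second noise information, and delivering the plurality of pieces of second noise information to the target objects respectively; obtaining desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it; generating desensitization statistical information according to the desensitization information reported by the target objects; and performing noise removal on the desensitization statistical information based on the first noise information, to obtain statistical information of the sensitive information corresponding to the target object group.
Optionally, when executed, the computer-executable instructions may further cause the processor to: split the first noise information, based on a first number of the target objects, into the first number of pieces of second noise information; and deliver the plurality of pieces of second noise information to the plurality of target objects respectively according to a preset allocation rule, where the allocation rule includes a random allocation rule.
Optionally, when executed, the computer-executable instructions may further cause the processor to: remove the first noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
Optionally, the first noise information, the sensitive information, and the desensitization information are in numerical form. When executed, the computer-executable instructions may further cause the processor to: calculate the sum of the desensitization information to obtain the desensitization statistical information. Removing the first noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group includes: calculating the difference between the desensitization statistical information and the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.
Optionally, when executed, the computer-executable instructions may further cause the processor to: obtain deviation noise information corresponding to the target object group, where the deviation noise information is obtained by removing from the first noise information each piece of second noise information that is delivered; remove the deviation noise information from the first noise information to obtain third noise information; and remove the third noise information from the desensitization statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.
Optionally, there are multiple target object groups. When executed, the computer-executable instructions may further cause the processor to: obtain the group identification information corresponding to each target object group; and perform noise removal on the desensitization statistical information corresponding to each target object group based on the first noise information and the group identification information corresponding to that group.
Optionally, when executed, the computer-executable instructions may further cause the processor to: before obtaining the first noise information corresponding to the target object group to be counted, calculate, for a plurality of objects to be counted, the hash value corresponding to each object to be counted; and group the objects to be counted according to their hash values to obtain at least one object group to be counted, where each object group to be counted includes a plurality of objects to be counted that have sensitive information to be counted.
One or more embodiments of this specification further provide a computer-readable storage medium that stores one or more programs. The one or more programs include instructions which, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the above information statistics method, and specifically to: obtain first noise information corresponding to a target object group to be counted, where the target object group includes a plurality of target objects that have sensitive information to be counted; split the first noise information to obtain a plurality of pieces of second noise information, and deliver the plurality of pieces of second noise information to the target objects respectively; obtain desensitization information reported by the target objects, where the desensitization information is generated by each target object desensitizing its own sensitive information according to the second noise information assigned to it; generate desensitization statistical information according to the desensitization information reported by the target objects; and perform noise removal on the desensitization statistical information based on the first noise information, to obtain statistical information of the sensitive information corresponding to the target object group.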
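The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described as divided into units by function. Of course, when one or more embodiments of this specification are implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware.
A person skilled in the art should understand that one or more embodiments of this specification may be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
One or more embodiments of this specification are described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of this application. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a means for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, and the instruction means implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps is executed on the computer or other programmable equipment to produce computer-implemented processing, and the instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-permanent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, product, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, product, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, product, or device that includes the element.
One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and so on that perform specific tasks or implement specific abstract data types. This application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for the parts that are the same or similar across embodiments, reference can be made between them, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is basically similar to the method embodiment, it is described relatively briefly, and the relevant parts can be found in the description of the method embodiment.
The above is merely one or more embodiments of this specification and is not intended to limit this specification. For a person skilled in the art, one or more embodiments of this specification may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of one or more embodiments of this specification shall fall within the scope of the claims of one or more embodiments of this specification.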

Claims (14)

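  1. An information statistics method, comprising:
    obtaining first noise information corresponding to a target object group to be counted, the target object group comprising a plurality of target objects having sensitive information to be counted;
    splitting the first noise information to obtain a plurality of pieces of second noise information, and delivering the plurality of pieces of second noise information to the target objects respectively;
    obtaining desensitized information reported by the target objects, the desensitized information being generated by each target object by desensitizing its own sensitive information according to the second noise information allocated to it;
    generating desensitized statistical information according to the desensitized information reported by each of the target objects; and
    denoising the desensitized statistical information based on the first noise information to obtain statistical information of the sensitive information corresponding to the target object group.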
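  2. The method according to claim 1, wherein splitting the first noise information to obtain a plurality of pieces of second noise information and delivering the plurality of pieces of second noise information to the target objects respectively comprises:
    splitting, based on a first number of the target objects, the first noise information into the first number of pieces of second noise information; and
    delivering the plurality of pieces of second noise information to the plurality of target objects respectively according to a preset allocation rule, the allocation rule comprising a random allocation rule.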
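  3. The method according to claim 2, wherein denoising the desensitized statistical information based on the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group comprises:
    removing the first noise information from the desensitized statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.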
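  4. The method according to claim 3, wherein the first noise information, the sensitive information, and the desensitized information are in numerical form;
    generating the desensitized statistical information according to the desensitized information reported by each of the target objects comprises:
    calculating a sum of the pieces of desensitized information to obtain the desensitized statistical information; and
    removing the first noise information from the desensitized statistical information to obtain the statistical information of the sensitive information corresponding to the target object group comprises:
    calculating a difference between the desensitized statistical information and the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.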
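  5. The method according to claim 3, wherein removing the first noise information from the desensitized statistical information to obtain the statistical information of the sensitive information corresponding to the target object group comprises:
    obtaining deviation noise information corresponding to the target object group, the deviation noise information being obtained by removing from the first noise information, in turn, the second noise information delivered each time;
    removing the deviation noise information from the first noise information to obtain third noise information; and
    removing the third noise information from the desensitized statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.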
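  6. The method according to claim 1, wherein there are a plurality of target object groups, and the method further comprises:
    obtaining group identification information corresponding to each of the target object groups;
    wherein denoising the desensitized statistical information based on the first noise information comprises:
    denoising the desensitized statistical information corresponding to each target object group based on the first noise information and the group identification information corresponding to that target object group.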
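  7. The method according to claim 1, further comprising, before obtaining the first noise information corresponding to the target object group to be counted:
    calculating, for a plurality of objects to be counted, a hash value corresponding to each of the objects to be counted; and
    grouping the objects to be counted according to their corresponding hash values to obtain at least one group of objects to be counted, each group comprising a plurality of the objects to be counted that have sensitive information to be counted.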
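  8. An information statistics apparatus, comprising:
    a first obtaining module, configured to obtain first noise information corresponding to a target object group to be counted, the target object group comprising a plurality of target objects having sensitive information to be counted;
    a splitting and delivering module, configured to split the first noise information to obtain a plurality of pieces of second noise information, and to deliver the plurality of pieces of second noise information to the target objects respectively;
    a second obtaining module, configured to obtain desensitized information reported by the target objects, the desensitized information being generated by each target object by desensitizing its own sensitive information according to the second noise information allocated to it;
    a generating module, configured to generate desensitized statistical information according to the desensitized information reported by each of the target objects; and
    a denoising module, configured to denoise the desensitized statistical information based on the first noise information to obtain statistical information of the sensitive information corresponding to the target object group.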
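  9. The apparatus according to claim 8, wherein the splitting and delivering module comprises:
    a splitting unit, configured to split, based on a first number of the target objects, the first noise information into the first number of pieces of second noise information; and
    a delivering unit, configured to deliver the plurality of pieces of second noise information to the plurality of target objects respectively according to a preset allocation rule, the allocation rule comprising a random allocation rule.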
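  10. The apparatus according to claim 9, wherein the denoising module comprises:
    a first removing unit, configured to remove the first noise information from the desensitized statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.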
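  11. The apparatus according to claim 10, wherein the first noise information, the sensitive information, and the desensitized information are in numerical form;
    the generating module comprises:
    a calculating unit, configured to calculate a sum of the pieces of desensitized information to obtain the desensitized statistical information; and
    the first removing unit is configured to calculate a difference between the desensitized statistical information and the first noise information to obtain the statistical information of the sensitive information corresponding to the target object group.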
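  12. The apparatus according to claim 10, wherein the first removing unit is configured to:
    obtain deviation noise information corresponding to the target object group, the deviation noise information being obtained by removing from the first noise information, in turn, the second noise information delivered each time;
    remove the deviation noise information from the first noise information to obtain third noise information; and
    remove the third noise information from the desensitized statistical information to obtain the statistical information of the sensitive information corresponding to the target object group.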
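  13. An information statistics device, comprising:
    a processor; and
    a memory arranged to store computer-executable instructions that, when executed, cause the processor to:
    obtain first noise information corresponding to a target object group to be counted, the target object group comprising a plurality of target objects having sensitive information to be counted;
    split the first noise information to obtain a plurality of pieces of second noise information, and deliver the plurality of pieces of second noise information to the target objects respectively;
    obtain desensitized information reported by the target objects, the desensitized information being generated by each target object by desensitizing its own sensitive information according to the second noise information allocated to it;
    generate desensitized statistical information according to the desensitized information reported by each of the target objects; and
    denoise the desensitized statistical information based on the first noise information to obtain statistical information of the sensitive information corresponding to the target object group.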
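  14. A storage medium for storing computer-executable instructions that, when executed, implement the following process:
    obtaining first noise information corresponding to a target object group to be counted, the target object group comprising a plurality of target objects having sensitive information to be counted;
    splitting the first noise information to obtain a plurality of pieces of second noise information, and delivering the plurality of pieces of second noise information to the target objects respectively;
    obtaining desensitized information reported by the target objects, the desensitized information being generated by each target object by desensitizing its own sensitive information according to the second noise information allocated to it;
    generating desensitized statistical information according to the desensitized information reported by each of the target objects; and
    denoising the desensitized statistical information based on the first noise information to obtain statistical information of the sensitive information corresponding to the target object group.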
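
The method claims can be illustrated with a minimal Python sketch for the numerical case, in which desensitization is taken to be simple addition of a noise piece. This is one possible reading under stated assumptions, not the applicant's implementation. All function names (`group_by_hash`, `split_noise`, `desensitize`, `denoise`), the SHA-256 grouping, the noise range, and the sample data are hypothetical. The deviation noise is interpreted here as the part of the first noise that was never delivered. For simplicity the sample objects are aggregated as a single group, and the hash grouping is shown separately.

```python
import hashlib
import random


def group_by_hash(object_ids, num_groups):
    """Group objects by a hash of their identifier (assumed grouping rule)."""
    groups = {}
    for object_id in object_ids:
        digest = hashlib.sha256(object_id.encode("utf-8")).digest()
        group_id = int.from_bytes(digest[:8], "big") % num_groups
        groups.setdefault(group_id, []).append(object_id)
    return groups


def split_noise(first_noise, count, rng):
    """Split first_noise into `count` integer pieces that sum to first_noise."""
    pieces = [rng.randint(-10**6, 10**6) for _ in range(count - 1)]
    pieces.append(first_noise - sum(pieces))
    rng.shuffle(pieces)  # stands in for the random allocation rule
    return pieces


def desensitize(sensitive_value, second_noise):
    """Runs on the target object: mask the local value with its noise piece."""
    return sensitive_value + second_noise


def denoise(reports, first_noise, deviation_noise=0):
    """Sum the reports, then subtract the noise that was actually applied."""
    desensitized_total = sum(reports)
    third_noise = first_noise - deviation_noise
    return desensitized_total - third_noise


# Hypothetical sample data: object identifier -> sensitive numeric value.
sensitive = {"user-a": 120, "user-b": 45, "user-c": 300}
groups = group_by_hash(sensitive, num_groups=2)

rng = random.Random(0)  # seeded only to make the sketch repeatable
first_noise = rng.randint(-10**6, 10**6)
pieces = split_noise(first_noise, len(sensitive), rng)

# Case 1: every object received its piece and reported a masked value.
reports = [desensitize(v, p) for v, p in zip(sensitive.values(), pieces)]
total = denoise(reports, first_noise)
print(total)  # 465, the true sum 120 + 45 + 300

# Case 2: the third piece was never delivered, so only two objects report.
delivered = pieces[:2]
deviation_noise = first_noise - sum(delivered)
partial_total = denoise(reports[:2], first_noise, deviation_noise)
print(partial_total)  # 165, the sum over the two reporting objects
```

The server in this sketch only ever sees masked values, yet the noise cancels exactly in the sum because the pieces add up to the first noise. A production system would likely draw noise from a cryptographically secure source such as Python's `secrets` module rather than `random`, and would need a rule for hash groups that end up with fewer than two members.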
PCT/CN2021/087742 2020-04-30 2021-04-16 Information statistics WO2021218660A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010360913.X 2020-04-30
CN202010360913.XA CN111563272B (zh) 2020-04-30 2020-04-30 Information statistics method and apparatus

Publications (1)

Publication Number Publication Date
WO2021218660A1 (zh) 2021-11-04

Family

ID=72071756

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/087742 WO2021218660A1 (zh) 2020-04-30 2021-04-16 Information statistics

Country Status (2)

Country Link
CN (1) CN111563272B (zh)
WO (1) WO2021218660A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111563272B (zh) * 2020-04-30 2021-11-09 支付宝实验室(新加坡)有限公司 Information statistics method and apparatus

Citations (6)

Publication number Priority date Publication date Assignee Title
US8909711B1 (en) * 2011-04-27 2014-12-09 Google Inc. System and method for generating privacy-enhanced aggregate statistics
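CN106295392A (zh) * 2015-06-24 2017-01-04 阿里巴巴集团控股有限公司 Data desensitization processing method and apparatus
CN110087237A (zh) * 2019-04-30 2019-08-02 苏州大学 Privacy protection method and apparatus based on data perturbation, and related components
CN110188571A (zh) * 2019-06-05 2019-08-30 深圳市优网科技有限公司 Desensitization method and system based on sensitive data
CN110866263A (zh) * 2019-11-14 2020-03-06 中国科学院信息工程研究所 User privacy information protection method and system resistant to longitudinal attacks
CN111563272A (zh) * 2020-04-30 2020-08-21 支付宝实验室(新加坡)有限公司 Information statistics method and apparatus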

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
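CN107017985A (zh) * 2017-05-10 2017-08-04 河南工业大学 Trajectory privacy protection method and system for vehicular ad hoc networks
CN109284620A (zh) * 2017-07-19 2019-01-29 中国移动通信集团黑龙江有限公司 Method, apparatus, and server for generating published data
US20190080116A1 (en) * 2017-09-13 2019-03-14 Microsoft Technology Licensing, Llc Random noise based privacy mechanism
CN110022531B (zh) * 2019-03-01 2021-01-19 华南理工大学 Local differential privacy method for urban waste data reporting and privacy computation
CN110334757A (zh) * 2019-06-27 2019-10-15 南京邮电大学 Privacy-preserving clustering method for big data analysis, and computer storage medium
CN110889141B (zh) * 2019-12-11 2022-02-08 百度在线网络技术(北京)有限公司 Data distribution map privacy processing method and apparatus, and electronic device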

Also Published As

Publication number Publication date
CN111563272B (zh) 2021-11-09
CN111563272A (zh) 2020-08-21

Similar Documents

Publication Publication Date Title
US11295381B2 (en) Data auditing method and device
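CN110163006B (zh) Signature verification method, system, apparatus, and device in a blockchain-type ledger
TWI724579B (zh) Blockchain data processing method, apparatus, system, processing device, and storage medium
WO2020211497A1 (zh) Storage method, system, apparatus, and device for personal asset change records
KR20200053613A (ko) Data statistics method and apparatus
CN110020544B (zh) Method and system for processing hash information of records stored in blocks of a blockchain
CN110334153B (zh) Authorization method, system, apparatus, and device in a blockchain-type ledger
CN113726751B (zh) Weight management method, apparatus, and device in a blockchain-type ledger
CN110474775B (zh) User creation method, apparatus, and device in a blockchain-type ledger
WO2019085677A1 (zh) Data statistics method, apparatus, and device based on garbled circuits
CN110837502B (zh) Data storage method, apparatus, and device in a blockchain-type ledger
CN109726563B (zh) Data statistics method, apparatus, and device
CN109146699A (zh) Blockchain-based comprehensive management method and system for matchmaking and dating
WO2021218660A1 (zh) Information statistics
CN106878367A (zh) Method and apparatus for implementing asynchronous invocation of service interfaces
CN110750530A (zh) Business system and data verification method thereof
CN110209582A (zh) Code coverage statistics method and apparatus, electronic device, and storage medium
CN110727679A (zh) Collaborative tracking method, system, apparatus, and device for court case files
CN110059097B (zh) Data processing method and apparatus
CN109446060B (zh) Method for generating a server-side test case set, terminal device, and storage medium
CN111444215A (zh) Block generation method, apparatus, and device in a blockchain-type ledger
CN112015825A (zh) Blockchain-based model registration method and apparatus, and electronic device
CN111506613A (zh) Query method, system, apparatus, and device for association relationships of data records
CN112818367B (zh) File encryption method and apparatus, storage medium, and processor
CN112287023B (zh) Weight allocation method, apparatus, and device in a blockchain-type ledger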

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21795956; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 21795956; Country of ref document: EP; Kind code of ref document: A1)