CN107145539A - A method for handling unreasonable data in negative surveys - Google Patents

A method for handling unreasonable data in negative surveys

Info

Publication number
CN107145539A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710267513.2A
Other languages
Chinese (zh)
Other versions
CN107145539B (en)
Inventor
赵冬冬
方舒
向剑文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT
Priority to CN201710267513.2A
Publication of CN107145539A
Application granted
Publication of CN107145539B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for handling unreasonable data in negative surveys, comprising four steps: computing the positive survey values; adjusting the unreasonable data; for the adjusted data, computing how the difference introduced by the adjustment is distributed across the other options; and recomputing the positive survey values from the adjusted negative survey values. When reconstructing positive survey data from a negative survey, the invention can handle both negative values and values that contradict background knowledge, and it achieves higher reconstruction accuracy.

Description

A method for handling unreasonable data in negative surveys
Technical field
The invention belongs to the technical field of privacy protection. It relates to methods for reconstructing positive survey data from a negative survey, and in particular to a way of handling the unreasonable data that traditional reconstruction algorithms can produce.
Background
In the era of information explosion, people have gradually come to recognize the importance of personal privacy. To meet the growing demand for privacy protection, more and more privacy-preserving methods have been proposed, and the negative survey is one of them. The negative survey is an application of the negative representation of information. Compared with a conventional survey, it effectively protects participants' privacy and is particularly suited to collecting sensitive data. In a traditional positive survey, whether or not sensitive data is involved, each participant must select the option that matches their actual situation. A negative survey does the opposite: each participant is asked to select one of the options that does not match their actual situation.
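The collection rule described above can be sketched in a few lines. This is a minimal illustration, not part of the patent: the function name `simulate_uniform_negative_survey`, the zero-based option indices and the sample population are all assumptions, and it models the uniform case in which every non-matching option is equally likely.

```python
import random
from collections import Counter


def simulate_uniform_negative_survey(true_choices, c, seed=0):
    """Return one negative answer per participant.

    Each participant picks, uniformly at random, one of the c - 1 options
    that does NOT match their true (positive) choice.
    """
    rng = random.Random(seed)
    negative_choices = []
    for true_option in true_choices:
        candidates = [opt for opt in range(c) if opt != true_option]
        negative_choices.append(rng.choice(candidates))
    return negative_choices


# Hypothetical population of 100 participants over c = 4 options.
true_choices = [0] * 50 + [1] * 30 + [2] * 15 + [3] * 5
negative_choices = simulate_uniform_negative_survey(true_choices, c=4)

# Proportion of participants selecting each option in the negative survey.
counts = Counter(negative_choices)
r = [counts[opt] / len(negative_choices) for opt in range(4)]
print(r)
```

The collector only ever sees `negative_choices`, so no individual's true option is revealed directly; only the aggregate proportions `r` are used for reconstruction.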
The negative data collected are not the data we actually need; what we need is the positive survey data reconstructed from the negative survey. Most current reconstruction algorithms are designed for the uniform negative survey, which assumes that each participant selects among their negative options with equal probability. NStoPS is the basic reconstruction algorithm for the uniform negative survey, but its results may contain negative values, which do not match reality and are therefore unreasonable data. To address the negative-value problem, two improved algorithms based on NStoPS were proposed, NStoPS-I and NStoPS-II. Both handle negative values well, but NStoPS-I iterates slowly and is inefficient, and NStoPS-II is not applicable to negative surveys whose selection probabilities follow an arbitrary distribution.
Real questionnaires often come with background knowledge. For example, in a survey about a disease, a hospital often knows the incidence rate of that disease. If background knowledge is brought into a negative survey and a traditional reconstruction algorithm is used, the results may contradict the background knowledge. The NStoPS-BK algorithm was proposed for this kind of unreasonable data; it showed that using background knowledge appropriately can effectively improve the accuracy of data reconstruction.
A negative survey requires each participant to select at random one option that does not match their situation, but human preferences may creep into the survey process, and these can also lead to unreasonable data when positive survey data is reconstructed from the negative survey. Such unreasonable data clearly reduces the accuracy of the reconstructed positive survey data, so handling it properly within the reconstruction algorithm is very important for improving reconstruction accuracy.
Summary of the invention
To solve the above technical problem, the invention provides a method for handling unreasonable data in negative surveys.
The technical solution adopted by the invention is a method for handling unreasonable data in negative surveys, characterized by comprising the following steps:
Step 1: compute the positive survey values;
Step 2: adjust the unreasonable data;
Step 3: for the adjusted data, compute the proportion of the difference caused by the adjustment that is assigned to each of the other options;
Step 4: from the negative survey values obtained after adjustment, compute the positive survey values.
Preferably, in step 1, the positive survey values are computed with the traditional NStoPS reconstruction algorithm as
$$\hat{t}_i = 1 - (c-1)\,r_i$$
where $\hat{t}_i$ is the estimated proportion of people choosing option i in the positive survey, $r_i$ is the proportion of people choosing option i in the negative survey, and i = 1, 2, ..., c;
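As a rough illustration of step 1, the sketch below applies the NStoPS estimate, taken here to be 1 - (c-1) r_i (the form that step 4 reuses with r_i + Δr_i in place of r_i). The function name `nstops` and the sample proportions are assumptions chosen to show how a negative, and therefore unreasonable, estimate can arise.

```python
def nstops(r):
    """NStoPS estimate for a uniform negative survey.

    r: proportions of participants selecting each option in the negative
       survey (should sum to 1).
    Returns the estimated positive survey proportions 1 - (c - 1) * r_i.
    """
    c = len(r)
    return [1 - (c - 1) * r_i for r_i in r]


# Hypothetical negative survey proportions for c = 4 options.
r = [0.10, 0.20, 0.30, 0.40]
t_hat = nstops(r)
print(t_hat)

# Any estimate below zero cannot be a real proportion.
unreasonable = [i for i, t in enumerate(t_hat) if t < 0]
print(unreasonable)
```

Here the last option was selected by 40% of participants, more than the 1/(c-1) ≈ 33.3% that a uniform negative survey allows even when nobody truly holds that option, so its estimate falls below zero although the estimates still sum to 1.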
Preferably, in step 2, the unreasonable data are adjusted; after adjustment, the negative data and the positive data satisfy:
$$\begin{cases}
\sum_{i=1}^{c} x_i\,p_{i1} = r_1 + \Delta r_1 \\
\sum_{i=1}^{c} x_i\,p_{i2} = r_2 + \Delta r_2 \\
\quad\vdots \\
\sum_{i=1}^{c} x_i\,p_{ic} = r_c + \Delta r_c
\end{cases}$$
where $r_i$ is the proportion of people choosing option i in the negative survey, $x_i$ is the proportion for option i in the positive survey after adjustment, with $r_1 + \dots + r_c = 1$ and $x_1 + \dots + x_c = 1$; $\Delta r_i$ is the difference introduced by adjusting the unreasonable data; and $p_{ij}$ is the probability that a person who would select option i in the positive survey selects option j in the negative survey, i = 1, 2, ..., c, j = 1, 2, ..., c;
For the uniform negative survey:
$$p_{ij} = \begin{cases} \dfrac{1}{c-1}, & i \neq j \\ 0, & i = j \end{cases}$$
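To make the role of the selection probabilities concrete, the following sketch builds the matrix for the uniform case, assuming p_ij = 1/(c-1) for i ≠ j and 0 for i = j, and maps positive proportions to the negative proportions they would produce. The helper names and the sample proportions are hypothetical.

```python
def uniform_selection_matrix(c):
    """p[i][j]: probability that a participant whose true option is i
    selects option j in a uniform negative survey."""
    return [[0.0 if i == j else 1.0 / (c - 1) for j in range(c)]
            for i in range(c)]


def negative_from_positive(x, p):
    """Expected negative proportions: r_j = sum_i x_i * p[i][j]."""
    c = len(x)
    return [sum(x[i] * p[i][j] for i in range(c)) for j in range(c)]


P = uniform_selection_matrix(4)
x = [0.5, 0.3, 0.15, 0.05]          # hypothetical positive proportions
r = negative_from_positive(x, P)
print(r)
```

With this matrix each r_j reduces to (1 - x_j) / (c - 1), which is the relation that the NStoPS estimate inverts.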
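Preferably, in step 3, suppose that n options have unreasonable results and that the estimates of these n options, $\hat{t}_c, \hat{t}_{c-1}, \ldots, \hat{t}_{c-n+1}$, are adjusted to the reasonable values $x_c, x_{c-1}, \ldots, x_{c-n+1}$. The proportion of the difference caused by the adjustment that is assigned to each of the other options is then computed. Let $d_{x_i}$ be the share assigned to option i in the positive survey; then:
$$d_{x_i} = \frac{x_i}{\sum_{j=1}^{c-n} x_j}\,(-\Delta r_c) + \dots + \frac{x_i}{\sum_{j=1}^{c-n} x_j}\,(-\Delta r_{c-n+1}), \quad i = 1, \ldots, c-n$$
Let $\Delta r_i$ be the share assigned to option i in the negative survey; according to the rules by which the negative survey is conducted:
$$\Delta r_i = \sum_{j=1,\, j \neq i}^{c-n} d_{x_j}\,\frac{1}{c-n-1}$$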
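The two allocation formulas of step 3 can be sketched as follows. This is an illustrative reading under stated assumptions: the adjusted options are taken to be the last n entries, the adjusted positive proportions x are assumed to be already known (in the method itself they are solved for together with Δr_i), and the function name and sample numbers are hypothetical.

```python
def allocate_differences(x, delta_r_adjusted):
    """Distribute the differences of the n adjusted options over the others.

    x: positive proportions after adjustment for all c options, with the
       n adjusted options stored last.
    delta_r_adjusted: the Δr values of those n adjusted options.
    Returns (d, delta_r) for the first c - n options.
    """
    c, n = len(x), len(delta_r_adjusted)
    m = c - n                      # number of options that were not adjusted
    if m < 2:
        raise ValueError("need at least two unadjusted options")
    total = sum(x[:m])
    removed = -sum(delta_r_adjusted)
    # Positive-side share: proportional to each option's own proportion.
    d = [x[i] / total * removed for i in range(m)]
    # Negative-side share: every other option's d_j is spread evenly
    # over the m - 1 remaining unadjusted options.
    delta_r = [(sum(d) - d[i]) / (m - 1) for i in range(m)]
    return d, delta_r


# Hypothetical case: c = 5 options, the last n = 2 were adjusted.
x = [0.4, 0.3, 0.2, 0.06, 0.04]
delta_r_adjusted = [-0.02, -0.01]
d, delta_r = allocate_differences(x, delta_r_adjusted)
print(d)
print(delta_r)
```

Under this reading the Δr values over all c options cancel out, so the adjusted negative proportions r_i + Δr_i still sum to 1.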
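Preferably, in step 4, from the negative survey values obtained after adjustment, the NStoPS algorithm is applied again to compute the positive survey values $x_i$:
$$x_i = 1 - (c-1)(r_i + \Delta r_i)$$
which further yields the formula for $x_i$:
$$x_i = 1 - x_s + \frac{x_s - (c-1)\,r_i}{1 + \dfrac{(n - x_s) - (c-1)\,r_s}{(1 - x_s)(c-n-1)}}$$
where
$$x_s = x_c + x_{c-1} + \dots + x_{c-n+1}, \qquad r_s = r_c + r_{c-1} + \dots + r_{c-n+1}$$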
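The closed-form expression for x_i can be tried out with a short sketch. The function name `reconstruct_adjusted`, the dictionary of fixed values and the choice to clamp a negative estimate to 0 are assumptions made for illustration; the patent only states that unreasonable estimates are replaced by reasonable values.

```python
def reconstruct_adjusted(r, fixed):
    """Closed-form reconstruction when some options are set to known values.

    r: negative survey proportions for all c options.
    fixed: {option index: reasonable value} for the n adjusted options.
    Returns the positive survey proportions x for all c options.
    """
    c, n = len(r), len(fixed)
    if c - n < 2:
        raise ValueError("need at least two unadjusted options")
    x_s = sum(fixed.values())            # sum of the adjusted positive values
    r_s = sum(r[k] for k in fixed)       # their negative survey proportions
    denom = 1 + ((n - x_s) - (c - 1) * r_s) / ((1 - x_s) * (c - n - 1))
    x = []
    for i in range(c):
        if i in fixed:
            x.append(fixed[i])
        else:
            x.append(1 - x_s + (x_s - (c - 1) * r[i]) / denom)
    return x


# Hypothetical data: plain NStoPS gives -0.2 for the last option,
# so it is set to 0 before the remaining options are recomputed.
r = [0.10, 0.20, 0.30, 0.40]
x = reconstruct_adjusted(r, fixed={3: 0.0})
print(x)
```

The sketch does not guard against a zero denominator or x_s = 1, and it does not check whether the recomputed values are themselves reasonable; a fuller treatment would need to handle those cases.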
The beneficial effects of the invention are as follows: when reconstructing positive survey data from a negative survey, the invention can handle unreasonable data in the form of negative values as well as unreasonable data that contradict background knowledge, and it achieves higher reconstruction accuracy.
Brief description of the drawings
Fig. 1 is a flow chart of the method according to an embodiment of the invention.
Detailed description of the embodiments
To help those of ordinary skill in the art understand and implement the invention, it is described in further detail below with reference to the drawing and an embodiment. The embodiment described here only illustrates and explains the invention and is not intended to limit it.
Referring to Fig. 1, the method for handling unreasonable data in negative surveys provided by the invention comprises the following steps:
Step 1: compute the positive survey values with the traditional NStoPS reconstruction algorithm;
Suppose the total number of participants is N and the question has c options. The proportions of people choosing each option in the negative survey are $R = \{r_1, r_2, \ldots, r_c\}$, and the corresponding proportions for each option in the positive survey are $T = \{t_1, t_2, \ldots, t_c\}$. $p_{ij}$ is the probability that a person who would select option i in the positive survey selects option j in the negative survey, and the $p_{ij}$ form the matrix P, so:
$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1c} \\ p_{21} & p_{22} & \cdots & p_{2c} \\ \vdots & \vdots & \ddots & \vdots \\ p_{c1} & p_{c2} & \cdots & p_{cc} \end{pmatrix}$$
Both the existing methods for reconstructing positive survey data and the algorithm described here are set against the uniform negative survey, for which:
$$P = \begin{pmatrix} 0 & \frac{1}{c-1} & \cdots & \frac{1}{c-1} \\ \frac{1}{c-1} & 0 & \cdots & \frac{1}{c-1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{1}{c-1} & \frac{1}{c-1} & \cdots & 0 \end{pmatrix}$$
In the traditional NStoPS reconstruction method, the matrices R, T and P satisfy:
$$R = TP$$
Therefore the matrix T can be computed as:
$$T = RP^{-1}$$
From this, the estimate of the positive data can be computed as:
$$\hat{t}_i = 1 - (c-1)\,r_i$$
where $\hat{t}_i$ is the estimated proportion of people choosing option i in the positive survey, i = 1, 2, ..., c;
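The step from T = RP^{-1} to a per-option estimate can be spelled out without inverting the matrix. The short derivation below is a sketch that assumes the uniform selection probabilities p_ij = 1/(c-1) for i ≠ j and p_jj = 0, and uses the fact that the positive proportions sum to 1.

```latex
\begin{aligned}
r_j &= \sum_{i=1}^{c} t_i\,p_{ij}
     = \frac{1}{c-1}\sum_{i \neq j} t_i
     = \frac{1 - t_j}{c-1}
     && \text{since } \textstyle\sum_{i=1}^{c} t_i = 1, \\[4pt]
\hat{t}_j &= 1 - (c-1)\,r_j .
\end{aligned}
```

Because the observed r_j is a sample proportion, it can exceed 1/(c-1), in which case the estimate becomes negative.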
Step 2: adjust the unreasonable data;
This embodiment takes a single option as an example, i.e. it assumes that the reconstruction result of exactly one option is unreasonable (n = 1). The traditional NStoPS method gives the positive survey estimate $\hat{t}_i$ of each option. Suppose the result for the c-th option is unreasonable and should be adjusted to the reasonable value $x_c$. Because some participants made irrational selections, option c's positive data estimate $\hat{t}_c$ deviates from the known $x_c$, giving the difference $\Delta r_c$. The algorithm first adjusts option c's unreasonable estimate $\hat{t}_c$ to $x_c$, and then redistributes the resulting difference $\Delta r_c$ proportionally among the other c-1 options. Writing the amounts assigned to those options as $\Delta r_1, \Delta r_2, \ldots, \Delta r_{c-1}$, the relation between $x_i$, $p_{ij}$, $r_i$ and $\Delta r_i$ is:
$$\sum_{i=1}^{c} x_i\,p_{ij} = r_j + \Delta r_j, \quad j = 1, 2, \ldots, c$$
Step 3: for the adjusted data, compute the proportion of the difference caused by the adjustment that is assigned to each of the other options;
To distribute the difference $\Delta r_c$ proportionally to the other options, first compute the share of $\Delta r_c$ that falls on each of the other options in the positive survey. Let $d_{x_i}$ be the share for option i in the positive survey; it is computed as:
$$d_{x_i} = \frac{x_i}{\sum_{j=1}^{c-1} x_j}\,(-\Delta r_c), \quad i = 1, \ldots, c-1$$
According to the generation rule of the uniform negative survey, the share $d_{x_j}$ of each of the other c-1 options in the positive survey is assigned with equal probability to c-2 options of the negative survey, each with probability 1/(c-2), so $\Delta r_i$ is computed as:
$$\Delta r_i = \sum_{j=1,\, j \neq i}^{c-1} d_{x_j}\,\frac{1}{c-2}$$
Step 4: from the negative survey values obtained after adjustment, apply the NStoPS algorithm again to compute the positive survey values.
Using the adjusted negative survey values, the positive survey estimate is computed again with NStoPS:
$$x_i = 1 - (c-1)(r_i + \Delta r_i)$$
In summary, for every option other than option c, the reconstruction result is:
$$x_i = 1 - x_c + \frac{x_c - (c-1)\,r_i}{1 + \dfrac{(1 - x_c) - (c-1)\,r_c}{(1 - x_c)(c-2)}}$$
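How the single-option result follows from steps 2 to 4 can be traced with a short derivation. It is a sketch that takes n = 1 in the general formulas of the summary (so x_s = x_c and r_s = r_c); the shorthand symbols D and K are introduced here only for readability and do not appear in the patent.

```latex
\begin{aligned}
&\text{Let } D = -\Delta r_c . \text{ Applying step 4 to option } c:\quad
  x_c = 1 - (c-1)(r_c + \Delta r_c)
  \;\Rightarrow\; D = r_c - \frac{1 - x_c}{c-1}. \\[4pt]
&\text{Step 3 with } \textstyle\sum_{j=1}^{c-1} x_j = 1 - x_c:\quad
  d_{x_j} = \frac{x_j}{1 - x_c}\,D, \qquad
  \Delta r_i = \frac{1}{c-2}\sum_{j \neq i} d_{x_j}
             = \frac{D\,(1 - x_c - x_i)}{(1 - x_c)(c-2)}. \\[4pt]
&\text{Substituting into } x_i = 1 - (c-1)(r_i + \Delta r_i)
  \text{ with } K = \frac{(c-1)\,D}{(1 - x_c)(c-2)}:\quad
  x_i\,(1 - K) = 1 - (c-1)\,r_i - K\,(1 - x_c). \\[4pt]
&\text{Hence}\quad
  x_i = 1 - x_c + \frac{x_c - (c-1)\,r_i}{1 - K}, \qquad
  1 - K = 1 + \frac{(1 - x_c) - (c-1)\,r_c}{(1 - x_c)(c-2)} .
\end{aligned}
```

The key point is that Δr_i depends on the unknown x_i itself, so the step 4 equation is solved for x_i rather than evaluated directly.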
The embodiment above shows the calculation for the case of a single unreasonable value. The invention is also applicable to handling multiple unreasonable values and can achieve higher accuracy.
It should be understood that the parts of this specification not described in detail belong to the prior art.
It should be understood that the above description of the preferred embodiment is relatively detailed and should therefore not be regarded as limiting the scope of patent protection of the invention. Under the teaching of the invention, those of ordinary skill in the art may make substitutions or variations without departing from the scope protected by the claims, and all of these fall within the protection scope of the invention. The claimed scope of the invention is determined by the appended claims.

Claims (4)

1. A method for handling unreasonable data in negative surveys, characterized by comprising the following steps:
Step 1: compute the positive survey values;
Step 2: adjust the unreasonable data;
Step 3: for the adjusted data, compute the proportion of the difference caused by the adjustment that is assigned to each of the other options;
Step 4: from the negative survey values obtained after adjustment, compute the positive survey values.
2. The method for handling unreasonable data in negative surveys according to claim 1, characterized in that: in step 2, the unreasonable data are adjusted, and after adjustment the negative data and the positive data satisfy:
$$\begin{cases}
\sum_{i=1}^{c} x_i\,p_{i1} = r_1 + \Delta r_1 \\
\sum_{i=1}^{c} x_i\,p_{i2} = r_2 + \Delta r_2 \\
\quad\vdots \\
\sum_{i=1}^{c} x_i\,p_{ic} = r_c + \Delta r_c
\end{cases}$$
where $r_i$ is the proportion of people choosing option i in the negative survey, $x_i$ is the proportion for option i in the positive survey after adjustment, with $r_1 + \dots + r_c = 1$ and $x_1 + \dots + x_c = 1$; $\Delta r_i$ is the difference introduced by adjusting the unreasonable data; and $p_{ij}$ is the probability that a person who would select option i in the positive survey selects option j in the negative survey, i = 1, 2, ..., c, j = 1, 2, ..., c.
3. The method for handling unreasonable data in negative surveys according to claim 2, characterized in that: in step 3, suppose that n options have unreasonable results and that the estimates of these n options, $\hat{t}_c, \hat{t}_{c-1}, \ldots, \hat{t}_{c-n+1}$, are adjusted to the reasonable values $x_c, x_{c-1}, \ldots, x_{c-n+1}$; the proportion of the difference caused by the adjustment that is assigned to each of the other options is computed; letting $d_{x_i}$ be the share assigned to option i in the positive survey, then:
$$d_{x_i} = \frac{x_i}{\sum_{j=1}^{c-n} x_j}\,(-\Delta r_c) + \dots + \frac{x_i}{\sum_{j=1}^{c-n} x_j}\,(-\Delta r_{c-n+1}), \quad i = 1, \ldots, c-n;$$
letting $\Delta r_i$ be the share assigned to option i in the negative survey, according to the rules by which the negative survey is conducted:
$$\Delta r_i = \sum_{j=1,\, j \neq i}^{c-n} d_{x_j}\,\frac{1}{c-n-1}.$$
4. The method for handling unreasonable data in negative surveys according to claim 3, characterized in that: in step 4, from the negative survey values obtained after adjustment, the positive survey values $x_i$ are recomputed as:
$$x_i = 1 - x_s + \frac{x_s - (c-1)\,r_i}{1 + \dfrac{(n - x_s) - (c-1)\,r_s}{(1 - x_s)(c-n-1)}}$$
where
$$x_s = x_c + x_{c-1} + \dots + x_{c-n+1}, \qquad r_s = r_c + r_{c-1} + \dots + r_{c-n+1}$$
CN201710267513.2A 2017-04-21 2017-04-21 A method for handling unreasonable data in negative surveys Active CN107145539B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710267513.2A CN107145539B (en) 2017-04-21 2017-04-21 A method for handling unreasonable data in negative surveys

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710267513.2A CN107145539B (en) 2017-04-21 2017-04-21 A method for handling unreasonable data in negative surveys

Publications (2)

Publication Number Publication Date
CN107145539A (en) 2017-09-08
CN107145539B (en) 2019-10-25

Family

ID=59774312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710267513.2A Active CN107145539B (en) 2017-04-21 2017-04-21 A method for handling unreasonable data in negative surveys

Country Status (1)

Country Link
CN (1) CN107145539B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109409132A (en) * 2018-10-26 2019-03-01 南京航空航天大学 A kind of negative investigation method with personalized privacy protection function

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105487828A (en) * 2015-11-24 2016-04-13 珠海奔图电子有限公司 Printing control system and method
CN106127541A (en) * 2016-06-08 2016-11-16 中国科学技术大学 A kind of credit assessment method based on negative investigation and system

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
RAN LIU et al.: "Multiple-negative survey method for enhancing the accuracy of negative survey-based cloud data privacy", 《2015 INTERNATIONAL WORKSHOP ON ARTIFICIAL IMMUNE SYSTEMS》 *
YAFEI BAO et al.: "Estimating positive surveys from negative surveys", 《STATISTICS & PROBABILITY LETTERS》 *
YIHUI LU et al.: "Fast searching optimal negative surveys", 《2014 INTERNATIONAL CONFERENCE ON INFORMATION AND NETWORK SECURITY》 *
杜学海: "基于信息负表示的数据发布方法研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
罗文坚: "关于负调查的若干问题研究", 《万方》 *
赵冬冬: "信息负表示的若干应用方案研究", 《中国博士学位论文全文数据库 信息科技辑》 *
鲁义辉: "负调查的相关方法及应用研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Also Published As

Publication number Publication date
CN107145539B (en) 2019-10-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant