CN109543442A - Data safety processing method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN109543442A
Authority
CN
China
Prior art keywords
data
training
model
original
user data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811187262.8A
Other languages
Chinese (zh)
Inventor
史光辉
王涵
王建明
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201811187262.8A priority Critical patent/CN109543442A/en
Priority to PCT/CN2018/122734 priority patent/WO2020073492A1/en
Publication of CN109543442A publication Critical patent/CN109543442A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Abstract

The invention discloses a data safety processing method, apparatus, computer device and storage medium. The method includes: obtaining a data processing request that includes a target combination feature; obtaining to-be-tested user data from a user database according to the target combination feature; determining a target noise extraction range according to a data safety processing model, and processing the to-be-tested user data based on the target noise extraction range to obtain valid user data; inputting the valid user data into the data safety processing model to obtain the output value of the target combination feature corresponding to the valid user data; and, when the output value of the target combination feature is within a preset monitoring range, taking the valid user data as secure user data. The method ensures that users of the data cannot easily infer an individual's private information, while still maximizing the value of the target data.

Description

Data safety processing method, device, computer equipment and storage medium
Technical field
The present invention relates to the technical field of data security, and in particular to a data safety processing method, apparatus, computer device and storage medium.
Background technique
In recent years, with the rapid development of information technology, big data has been applied ever more widely, and data sharing has become a trend in social development. Shared data, however, must meet specific conditions: it must not contain personally identifying information, and it must not allow an individual's private information to be easily inferred. For example, when the banking and insurance industries share data, the regulations of the China Banking Regulatory Commission (CBRC) and the China Insurance Regulatory Commission (CIRC) require that information expressing a user's identity, and information from which a user's identity can easily be inferred, be desensitized, so that users of the data cannot easily infer an individual's private information while the value of the data is still maximized. At present, user data is desensitized record by record according to these regulations; the volume of data to be desensitized is large, and the process is inconvenient, time-consuming and laborious.
Summary of the invention
Embodiments of the present invention provide a data safety processing method, apparatus, computer device and storage medium, to solve the problem that desensitizing user data involves a large volume of data and is inconvenient, time-consuming and laborious.
A data safety processing method, comprising:
Obtaining a data processing request, the data processing request including a target combination feature;
Obtaining to-be-tested user data from a user database according to the target combination feature;
Determining a target noise extraction range according to a data safety processing model, and processing the to-be-tested user data based on the target noise extraction range to obtain valid user data;
Inputting the valid user data into the data safety processing model to obtain the output value of the target combination feature corresponding to the valid user data;
When the output value of the target combination feature is within a preset monitoring range, taking the valid user data as secure user data.
A data safety processing apparatus, comprising:
A data processing request obtaining module, configured to obtain a data processing request, the data processing request including a target combination feature;
A to-be-tested user data obtaining module, configured to obtain to-be-tested user data from a user database according to the target combination feature;
A valid user data obtaining module, configured to determine a target noise extraction range according to a data safety processing model, and to process the to-be-tested user data based on the target noise extraction range to obtain valid user data;
A data safety processing module, configured to input the valid user data into the data safety processing model to obtain the output value of the target combination feature corresponding to the valid user data;
A secure user data obtaining module, configured to take the valid user data as secure user data when the output value of the target combination feature is within a preset monitoring range.
A computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above data safety processing method when executing the computer program.
A computer-readable storage medium storing a computer program, wherein the computer program implements the steps of the above data safety processing method when executed by a processor.
With the above data safety processing method, apparatus, computer device and storage medium, to-be-tested user data is obtained from the user database according to the target combination feature. Then, according to the target noise extraction range determined by the data safety processing model, a specific quantity of user data corresponding to non-target combination features is extracted and added to the to-be-tested user data, which reduces the accuracy of the to-be-tested user data and guarantees the security of the user data. Because the target noise extraction range is determined by a trained data safety processing model, the valid user data obtained from that range is more accurate and meets the user's requirements, guaranteeing the security of the user data while still letting the data deliver its value. After the valid user data is obtained, in order to further determine whether it meets the requirements, the valid user data is input into the data safety processing model to obtain the output value of the target combination feature corresponding to the valid user data. When the output value is within the preset monitoring range, the valid user data meets the requirements and is taken as secure user data, so that the secure user data satisfies the desensitization requirements: private personal information cannot easily be derived from it, and the value of the target data is maximized.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is an application scenario diagram of the data safety processing method in an embodiment of the invention;
Fig. 2 is a flow chart of the data safety processing method in an embodiment of the invention;
Fig. 3 is a detailed flow chart of step S30 in Fig. 2;
Fig. 4 is a flow chart of the steps performed before step S30 in Fig. 2;
Fig. 5 is a detailed flow chart of step S301 in Fig. 4;
Fig. 6 is a schematic diagram of the data safety processing apparatus in an embodiment of the invention;
Fig. 7 is a schematic diagram of the computer device in an embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the protection scope of the invention.
The data safety processing method provided by the present application can be applied in the application environment shown in Fig. 1, in which a client communicates with a server over a network. The client may be, but is not limited to, a personal computer, laptop, smartphone, tablet computer or portable wearable device. The server may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, a data safety processing method is provided. Taking its application to the server in Fig. 1 as an example, the method includes the following steps:
S10: obtain a data processing request, the data processing request including a target combination feature.
The data processing request in this embodiment is a request, sent by the client to the server, for processing user data; it includes a target combination feature. The target combination feature is the set of user data features that the user fills in at the client as needed, including but not limited to features stored in the user database such as the user's age, gender, whether the user owns a vehicle, and whether the user has bought insurance (including but not limited to vehicle insurance, property insurance and life insurance). It should be understood that the target combination feature includes at least one user data feature.
S20: obtain to-be-tested user data from the user database according to the target combination feature.
The user database is the database that stores user data. In this embodiment, the user data stored in the user database includes but is not limited to users' basic information, purchasing behavior and transaction data. Basic information includes but is not limited to name, age (date of birth), gender, native place, nationality, education background and work experience. Purchasing behavior refers to a user's purchases of products on the platform, including but not limited to the financial products and insurance products the user has bought. Transaction data refers to records of the user's investments, consumption and transfers.
Specifically, after the user sets the target combination feature at the client, it is sent to the server, and the server obtains from the user database the user data corresponding to the received target combination feature as the to-be-tested user data. The to-be-tested user data is the user data that needs to be tested.
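Step S20 can be pictured as a filter over the user database. The following is a minimal sketch, not the patent's implementation: the record layout, the field names (`age`, `has_property_insurance`) and the idea of modelling a combination feature as a set of per-field predicates are all assumptions made for illustration.

```python
from typing import Any, Callable

# Hypothetical record layout: one dict per user.
Record = dict[str, Any]
# Assumption: a combination feature is a set of per-field predicates that must all hold.
CombinationFeature = dict[str, Callable[[Any], bool]]


def matches(record: Record, feature: CombinationFeature) -> bool:
    """Return True when the record satisfies every predicate of the combination feature."""
    return all(field in record and test(record[field]) for field, test in feature.items())


def select_to_be_tested(db: list[Record], feature: CombinationFeature) -> list[Record]:
    """Step S20: keep only the records that match the target combination feature."""
    return [record for record in db if matches(record, feature)]


user_db = [
    {"id": 1, "age": 25, "has_property_insurance": True},
    {"id": 2, "age": 41, "has_property_insurance": True},
    {"id": 3, "age": 28, "has_property_insurance": False},
]
target_feature = {
    "age": lambda a: 20 <= a <= 30,
    "has_property_insurance": lambda v: v is True,
}
to_be_tested = select_to_be_tested(user_db, target_feature)
```

Only the record with `id` 1 satisfies both predicates, so it alone becomes to-be-tested user data; the other two records are candidates for the noise added in step S30.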
S30: determine a target noise extraction range according to the data safety processing model, and process the to-be-tested user data based on the target noise extraction range to obtain valid user data.
To protect the security of the user data, this embodiment adds user data corresponding to non-target combination features to the to-be-tested user data, so that the resulting valid user data does not consist entirely of user data corresponding to the target combination feature. This reduces the accuracy of the user data and thereby guarantees its security.
The data safety processing model is a model trained in advance for determining the target noise extraction range. The target noise extraction range is the range within which user data corresponding to non-target combination features is extracted. The data safety processing model in this embodiment includes a target gradient boosting decision tree model and a target logistic regression model. The target gradient boosting decision tree model is a gradient boosting decision tree, trained in advance, that meets the user's requirements. Gradient Boosting Decision Tree (GBDT) is an iterative decision tree algorithm composed of multiple decision trees, whose conclusions are summed to give the final prediction. The target logistic regression model is a logistic regression model trained in advance for obtaining the target noise extraction range. Logistic Regression (LR) is a regression model that narrows the prediction range, limiting the predicted value to [0,1].
After the to-be-tested user data is obtained, the target noise extraction range is determined according to the data safety processing model, and a specific quantity of user data corresponding to non-target combination features is extracted from the user database and added to the to-be-tested user data to generate the valid user data. Valid user data is the data obtained after the user data corresponding to non-target combination features has been added to the to-be-tested user data. Note that the specific quantity in this embodiment is a quantity within the target noise extraction range. The server adds the user data corresponding to non-target combination features, extracted according to the target noise extraction range, to the to-be-tested user data, so that the secure user data obtained in subsequent steps both guarantees the security of the user data and meets the demand for its value.
S40: input the valid user data into the data safety processing model to obtain the output value of the target combination feature corresponding to the valid user data.
Specifically, after the valid user data is obtained, it must be verified further to determine whether it meets the user's requirements. The obtained valid user data is therefore input into the pre-trained data safety processing model, and the output value produced by the model for the valid user data is obtained. The output value is a probability value used to check whether the valid user data meets the requirements set by the user.
S50: when the output value of the target combination feature is within the preset monitoring range, take the valid user data as secure user data.
Specifically, the preset monitoring range is a range set in advance for checking whether the valid user data meets the user's requirements. The preset monitoring range in this embodiment is 70%-90%, which both guarantees the security of the user data and meets the demand for its value. When the output value of the target combination feature computed by the data safety processing model is within the preset monitoring range, the valid user data meets the requirements and is taken as secure user data.
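The acceptance test of step S50 reduces to a range check on a probability. A minimal sketch follows; the 70%-90% bounds come from this embodiment, while treating both bounds as inclusive and the function name `in_monitoring_range` are assumptions.

```python
# The 70%-90% preset monitoring range of this embodiment, as probabilities.
PRESET_MONITORING_RANGE = (0.70, 0.90)


def in_monitoring_range(output_value: float, bounds=PRESET_MONITORING_RANGE) -> bool:
    """Step S50: accept valid user data only if the model output lies in the preset range.

    Assumption: both bounds are inclusive; the text does not say.
    """
    low, high = bounds
    return low <= output_value <= high


accepted = in_monitoring_range(0.82)   # inside the range: data is treated as secure
rejected = in_monitoring_range(0.97)   # too close to pure target data: not accepted
```

An output above the upper bound suggests the data still identifies the target group too precisely, while an output below the lower bound suggests too much noise for the data to remain useful.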
In steps S10 to S50, to-be-tested user data is obtained from the user database according to the target combination feature; then, according to the target noise extraction range determined by the data safety processing model, a specific quantity of user data corresponding to non-target combination features is extracted and added to the to-be-tested user data to obtain valid user data, reducing the accuracy of the valid user data in order to guarantee the security of the user data. After the valid user data is obtained, in order to further determine whether it meets the requirements, the valid user data is input into the data safety processing model to obtain the output value of the target combination feature corresponding to the valid user data. When the output value is within the preset monitoring range, the valid user data meets the requirements and is sent, as secure user data, to the client that sent the data processing request. This ensures that users of the data cannot easily infer an individual's private information, while the value of the target data is still maximized.
In one embodiment, as shown in Fig. 3, processing the to-be-tested user data based on the target noise extraction range in step S30 to obtain valid user data specifically includes the following steps:
S31: based on the target noise extraction range, choose user data corresponding to non-target combination features from the user database as target noise data.
Specifically, after the data safety processing model determines the target noise extraction range, the server chooses, according to that range, a specific quantity of user data corresponding to non-target combination features from the user database as target noise data, providing the data source for the subsequent step.
For example, suppose the target noise extraction range determined according to the data safety processing model is 0.05-0.25 and there are 100,000 records of user data corresponding to non-target combination features. The server derives a count range of 5,000-25,000 from the target noise extraction range and chooses from the user database any number of records within 5,000-25,000 of user data corresponding to non-target combination features as the target noise data.
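The arithmetic in this example can be written out directly. The sketch below assumes the target noise extraction range is a pair of fractions applied to the number of non-target records; the function name `noise_count_bounds` is hypothetical.

```python
def noise_count_bounds(fraction_range: tuple[float, float], non_target_total: int) -> tuple[int, int]:
    """Convert a fractional noise extraction range into a range of record counts."""
    low, high = fraction_range
    return round(low * non_target_total), round(high * non_target_total)


# Values from the example: range 0.05-0.25 over 100,000 non-target records.
bounds = noise_count_bounds((0.05, 0.25), 100_000)
```

Here `bounds` is the pair of counts 5,000 and 25,000 quoted in the example; any count between them is an admissible amount of target noise data.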
S32: add the target noise data to the to-be-tested user data to obtain valid user data.
Specifically, after the target noise data is obtained, it is added to the to-be-tested user data to obtain the valid user data. The valid user data therefore does not consist entirely of user data corresponding to the target combination feature, which reduces its accuracy, so that the valid user data both satisfies the user's demand for user data and guarantees the security of the user data.
In steps S31 and S32, user data corresponding to non-target combination features is chosen from the user database according to the target noise extraction range as target noise data, and the target noise data is added to the to-be-tested user data to obtain valid user data, guaranteeing the security of the user data.
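Steps S31 and S32 together amount to sampling a bounded number of non-target records and mixing them into the to-be-tested data. The sketch below is one possible reading: picking the count uniformly at random within the bounds and shuffling the result are assumptions, and `add_target_noise` is a hypothetical name.

```python
import random


def add_target_noise(to_be_tested, non_target, count_bounds, rng):
    """Return (valid_user_data, target_noise_data).

    S31: draw a count inside count_bounds and sample that many non-target records.
    S32: append the noise to the to-be-tested data.
    Assumptions: the count is drawn uniformly, and the result is shuffled so the
    noise records are not identifiable by position.
    """
    low, high = count_bounds
    high = min(high, len(non_target))          # cannot sample more than exists
    noise_count = rng.randint(low, high)
    noise = rng.sample(non_target, noise_count)
    valid = list(to_be_tested) + noise
    rng.shuffle(valid)
    return valid, noise


rng = random.Random(0)                          # fixed seed so the sketch is repeatable
target_records = [("target", i) for i in range(10)]
non_target_records = [("non_target", i) for i in range(100)]
valid_user_data, noise_data = add_target_noise(target_records, non_target_records, (5, 25), rng)
```

The original lists are left untouched, so the same to-be-tested data can be re-mixed with a different noise count if the later check in step S50 fails.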
In one embodiment, as shown in Fig. 4, before step S30, that is, before the step of determining the target noise extraction range according to the data safety processing model, the data safety processing method further includes the following steps:
S301: obtain data to be trained, and divide the data to be trained into a training set and a test set.
Specifically, the data to be trained is obtained from a sample database and divided into a training set and a test set, used for training the model and testing the model respectively. The model to be trained in this embodiment is the original gradient boosting decision tree (GBDT) model. The sample database is the database that stores the data to be trained. The data to be trained is the data used for training the original GBDT model. The training set is the set of data to be trained that is used to train the original GBDT model. The test set is the set of data to be trained that is used to test whether the trained original GBDT model is accurate.
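The split in step S301 can be sketched as a shuffled partition. The text does not state the split ratio, so the 80/20 split, the fixed seed and the name `split_train_test` below are assumptions for illustration only.

```python
import random


def split_train_test(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and split it into (training_set, test_set).

    Assumption: test_fraction=0.2 is illustrative; the source gives no ratio.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


training_set, test_set = split_train_test(range(10))
```

Every sample lands in exactly one of the two sets, which is what lets the test set serve as an over-fitting check in step S306.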
S302: initialize the model parameters of the original GBDT model, the model parameters including the maximum depth and the maximum number of iterations of the gradient boosting decision tree.
Specifically, before the original GBDT model is trained, its model parameters must first be initialized. The model parameters in this embodiment include the maximum depth and the maximum number of iterations of the original gradient boosting decision tree. Experimental data indicates that a maximum depth of 3 and a maximum of 50 iterations give the best results, so this embodiment initializes the maximum depth to 3 and the maximum number of iterations to 50. Initializing the model parameters of the original GBDT model shortens the training time and improves recognition accuracy when the model is subsequently trained.
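The two hyperparameters, and the stopping rule that step S303 attaches to them, can be captured in a small configuration object. This is only a sketch: the class name `GbdtParams` and the helper `should_stop` are hypothetical, and the values 3 and 50 are the ones stated for this embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GbdtParams:
    """Initial model parameters of the original GBDT model (step S302)."""
    max_depth: int = 3         # maximum tree depth used in this embodiment
    max_iterations: int = 50   # maximum number of boosting iterations in this embodiment

    def __post_init__(self):
        if self.max_depth < 1 or self.max_iterations < 1:
            raise ValueError("max_depth and max_iterations must be positive")


def should_stop(depth: int, iteration: int, params: GbdtParams) -> bool:
    """Stopping rule of step S303: stop once both limits have been reached."""
    return depth >= params.max_depth and iteration >= params.max_iterations


params = GbdtParams()
```

In libraries that provide gradient boosting, these two settings typically correspond to a tree-depth limit and a number-of-estimators setting.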
S303: input the data to be trained in the training set into the original GBDT model; when the training depth of the original GBDT model reaches the maximum depth and the number of iterations reaches the maximum number of iterations, stop training the original GBDT model and obtain the original combination feature corresponding to each decision tree path in the original GBDT model.
Specifically, after the original GBDT model has been initialized, the data to be trained in the training set is input into the original GBDT model. The model selects one feature from the target combination feature corresponding to the data to be trained as the first split point and obtains the residual of the training-set data at that split point. It then splits again on the remaining features of the target combination feature, taking the residual corresponding to the first split point as the input of the second decision tree, and iterates continuously. When the training depth of the original GBDT model reaches the maximum depth and the number of iterations reaches the maximum number of iterations, training stops and the original combination feature corresponding to each decision tree path in the original GBDT model is obtained. The original combination feature is the combination feature corresponding to each decision tree path when the data to be trained in the training set is input into the original GBDT model.
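The residual-passing idea in step S303, where each new tree is fitted to what the previous trees left unexplained, can be illustrated with a deliberately small boosting loop. This is a generic sketch of gradient boosting under squared loss, not the patent's training procedure: it uses depth-1 trees (stumps) on a single numeric feature to stay short, whereas this embodiment uses depth 3, and the learning rate of 0.5 is an assumption.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: the single threshold minimising squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:            # exclude max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, max_iterations=50, learning_rate=0.5):
    """Gradient boosting for squared loss: each tree is fitted to the current residuals."""
    base = sum(ys) / len(ys)                          # initial prediction: the mean label
    predictions = [base] * len(xs)
    trees = []
    for _ in range(max_iterations):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)               # the residuals are the next tree's target
        trees.append(tree)
        predictions = [p + learning_rate * tree(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)


# Toy data: label 1 for feature values above 3 (hypothetical, for illustration).
model = boost([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
```

With each iteration the residuals shrink, which is why the summed output of all trees, rather than any single tree, serves as the final prediction.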
S304: input the original combination feature into the original logistic regression model to obtain the output value corresponding to the original combination feature.
Specifically, after the original combination feature is obtained, it is input into the original logistic regression model, and the probability corresponding to the original combination feature, that is, its output value, is calculated with the Sigmoid function. The Sigmoid function in this embodiment may be expressed as: Here, g(x) is the probability that the original combination feature occurs, with g(x) ∈ (0,1); x is the original combination feature; T is a parameter, set by the user according to actual needs, that governs how fast θ changes; and θ is the combination of the weights corresponding to the features of each piece of user data in the original GBDT model. q=1 indicates that the original combination feature is the target combination feature; q=0 indicates that it is not, that is, the original combination feature is a non-target combination feature.
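The expression for the Sigmoid function is not reproduced in this text. The sketch below therefore assumes the standard logistic form, g(x) = 1 / (1 + e^(-θ·x)), which is consistent with the stated properties (an output between 0 and 1, read as the probability that q = 1) but may differ from the patent's exact expression. Encoding the combination feature x as a 0/1 vector over decision tree paths is also an assumption.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function; maps any real z into the unit interval."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)                  # avoids overflow of exp(-z) for very negative z
    return exp_z / (1.0 + exp_z)


def output_value(x, theta):
    """Assumed form of g(x): the probability that combination feature x is the target (q = 1)."""
    return sigmoid(sum(weight * value for weight, value in zip(theta, x)))


# Hypothetical example: three tree paths, two of which are activated by this record.
probability = output_value([1, 0, 1], [0.5, 2.0, 0.5])
```

The weighted sum here is 1.0, so `probability` is the logistic function evaluated at 1; under the 70%-90% monitoring range of this embodiment, such a value would count as acceptable.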
S305: when the output value corresponding to the original combination feature is within the preset monitoring range, take the original GBDT model and the original logistic regression model as the target GBDT model and the target logistic regression model.
Specifically, when the output value corresponding to the original combination feature is within the preset monitoring range, the original GBDT model and the original logistic regression model have been trained successfully, and they are taken as the target GBDT model and the target logistic regression model.
S306: test the target GBDT model and the target logistic regression model with the data to be trained in the test set; if the output value obtained for each piece of data to be trained is within the preset monitoring range, take the target GBDT model and the target logistic regression model as the data safety processing model.
After the original GBDT model and the original logistic regression model have been determined to be the target GBDT model and the target logistic regression model, the two models must, in order to prevent over-fitting, be tested with the data to be trained in the test set to verify their accuracy. If the output value obtained for each piece of data to be trained in the test set is within the preset monitoring range, the training of the target GBDT model and the target logistic regression model is deemed successful and to meet the requirements, and the two models are taken as the data safety processing model.
In steps S301 to S306, the original GBDT model is trained with the data to be trained in the training set, and the original combination feature corresponding to each decision tree path in the original GBDT model is input into the original logistic regression model to obtain the output value corresponding to the original combination feature. If that output value is within the preset monitoring range, the original GBDT model and the original logistic regression model have been trained successfully. To prevent over-fitting, the data to be trained in the test set is also input into the trained target GBDT model and target logistic regression model for testing. If the output value for each piece of data to be trained is within the preset monitoring range, the trained target GBDT model and target logistic regression model meet the requirements and can serve as the final data safety processing model for determining the target noise extraction range, which the subsequent steps use to obtain valid user data.
In one embodiment, as shown in Fig. 5, step S301, obtaining data to be trained and dividing it into a training set and a test set, specifically includes the following steps:
S3011: obtain a model training request, the model training request including a training combination feature.
The model training request is a request, sent by the client, for carrying out model training. The training combination feature is the set of user data features for training the model that the user sets at the client as needed. Specifically, the user sets the training combination feature at the client as required and then clicks send, and the server receives the model training request carrying the training combination feature.
S3012: according to the training combination feature, choose from the user database the training user data that matches the training combination feature and the non-training user data that does not match it.
Specifically, after receiving the model training request, the server chooses from the user database, according to the training combination feature included in the request, the training user data that matches the training combination feature and the non-training user data that does not match it. Training user data is user data that matches the training combination feature; non-training user data is user data that does not. For example, if the training combination feature is "aged 20-30 and has purchased property insurance", the user data of users aged 20-30 who have purchased property insurance is obtained as training user data, and user data that does not satisfy both conditions at once is chosen as non-training user data. Obtaining training user data and non-training user data according to the training combination feature provides the data source for subsequently obtaining the original positive samples and original negative samples.
Further, to make it easier for subsequent steps to process the user data, after the training user data is obtained, the training combination feature is used as the user label of the training user data, and the non-target combination feature is used as the user label of the non-training user data. For example, the training combination feature "aged 20-30 and has purchased property insurance" is used as the user label of the training user data, and "not aged 20-30, has not purchased property insurance" is used as the label of the non-training user data.
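The selection and labelling in step S3012 can be sketched as a single pass that partitions the user database and attaches a label to each record. The field names, the use of 1 and 0 as stand-ins for the training and non-target labels, and the name `partition_and_label` are assumptions for illustration.

```python
def partition_and_label(user_db, is_match):
    """Split records into (training, non_training) and attach a user label to each.

    Assumption: label 1 stands for the training combination feature and label 0
    for any non-target combination feature. Records are copied, not modified.
    """
    training, non_training = [], []
    for record in user_db:
        label = 1 if is_match(record) else 0
        labelled = dict(record, label=label)
        (training if label else non_training).append(labelled)
    return training, non_training


def aged_20_to_30_with_property_insurance(record):
    """The example training combination feature from the text."""
    return 20 <= record["age"] <= 30 and record["property_insurance"]


users = [
    {"id": 1, "age": 24, "property_insurance": True},
    {"id": 2, "age": 24, "property_insurance": False},
    {"id": 3, "age": 52, "property_insurance": True},
]
training_users, non_training_users = partition_and_label(users, aged_20_to_30_with_property_insurance)
```

A record that meets only one of the two conditions, such as user 2 or user 3, falls on the non-training side, matching the rule that both conditions must hold at once.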
S3013: According to a preset positive sample quantity, select the corresponding training user data from the training user data as original positive samples.
In this embodiment, the positive sample quantity is the preset number of original positive samples to obtain. The original positive samples are the training user data extracted from the training user data according to the preset positive sample quantity. Obtaining the original positive samples from the training user data according to the preset positive sample quantity provides the data source from which subsequent steps obtain the effective positive samples and effective negative samples.
S3014: According to the positive-to-negative sample ratio, select the corresponding non-training user data from the non-training user data as original negative samples.
Specifically, after obtaining the original positive samples, the server selects the corresponding non-training user data from the non-training user data as original negative samples according to the preset positive-to-negative sample ratio. The positive-to-negative sample ratio is a preset ratio used to determine the number of original negative samples from the positive sample quantity. In this embodiment, the number of original negative samples is determined from the positive sample quantity, so that the two quantities are in proportion and meet the requirements of subsequent model training.
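Steps S3013 and S3014 amount to two random draws whose sizes are tied together by the ratio. The sketch below assumes the ratio is expressed as positives divided by negatives (so 0.1 means ten negatives per positive, in line with the 100,000 and 1,000,000 figures used in the examples of this embodiment); the function name, parameters and seed are hypothetical.

```python
import random

def sample_originals(training_users, non_training_users,
                     positive_count, pos_neg_ratio, seed=0):
    """Draw original positive and negative samples.

    positive_count: preset positive sample quantity.
    pos_neg_ratio:  assumed to mean positives / negatives (e.g. 0.1 -> 1:10).
    """
    rng = random.Random(seed)
    n_pos = min(positive_count, len(training_users))
    # The negative quantity is derived from the positive quantity via the ratio.
    n_neg = min(round(n_pos / pos_neg_ratio), len(non_training_users))
    original_pos = rng.sample(training_users, n_pos)
    original_neg = rng.sample(non_training_users, n_neg)
    return original_pos, original_neg

# Toy data: integers stand in for user records.
training_users = list(range(1_000))
non_training_users = list(range(1_000, 20_000))
original_pos, original_neg = sample_originals(
    training_users, non_training_users, positive_count=100, pos_neg_ratio=0.1)
```

Capping both draws at the size of the available pool is a design choice of this sketch, not something the text specifies.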
S3015: Extract negative noise data from the original positive samples according to the first noise extraction range, and add the negative noise data to the original negative samples to generate effective negative samples.
The first noise extraction range is a preset range of proportions of the original positive samples that are to be used as negative noise data. Negative noise data is obtained by deliberately modifying the user labels of part of the user data in the original positive samples, so that the labels carried by that part of the data become user labels corresponding to non-target combined features. For example, if the user data in the original positive samples carries the user label "aged 20-30 and has purchased property insurance", some of the user labels are deliberately changed to user labels corresponding to non-target combined features (for example "aged 35-45 and has purchased property insurance", "aged 20-30 and has purchased vehicle insurance", "aged 30-40 and owns a car", or the user label of any other non-target combined feature).
Specifically, after obtaining the original positive samples and original negative samples, the server randomly selects a specific number of user data items from the original positive samples according to the first noise extraction range, deliberately changes the user labels of those items to user labels corresponding to non-target combined features, and uses the relabelled user data as negative noise data. It should be understood that the specific number corresponds to any value within the first noise extraction range. After the negative noise data is obtained, it is added to the original negative samples to generate the effective negative samples. This reduces the accuracy of the original negative samples, so that not all of the user data in them genuinely corresponds to non-target combined features, which reduces the sensitivity of the user data in the original negative samples.
For example, if there are 100,000 original positive samples and the first noise extraction range is set to 0.1-0.3, then 10,000 to 30,000 user data items are randomly selected from the 100,000 original positive samples, the user labels they carry are changed to user labels corresponding to non-target combined features, and these items are used as negative noise data.
S3016: Extract positive noise data from the original negative samples according to the second noise extraction range, and add the positive noise data to the original positive samples to generate effective positive samples.
The second noise extraction range is a preset range of proportions of the original negative samples that are to be used as positive noise data. Positive noise data is obtained by changing the user labels of part of the user data in the original negative samples to the user label corresponding to the target combined feature. It should be noted that, because the numbers of positive and negative samples are chosen according to the preset positive-to-negative sample ratio, the first noise extraction range and the second noise extraction range also differ; for example, if the first noise extraction range is set to 0.1-0.3, the second noise extraction range may be set to 0.05-0.2.
Specifically, according to the second noise extraction range, a specific number of user data items are randomly selected from the original negative samples, their user labels are deliberately changed to the user label corresponding to the target combined feature, and the relabelled original negative samples are used as positive noise data. It should be understood that the specific number corresponds to any value within the second noise extraction range. After the positive noise data is obtained, it is added to the original positive samples to form the effective positive samples. It should be understood that the effective positive samples are the positive samples formed after the positive noise data has been added to the original positive samples.
For example, if there are 1,000,000 original negative samples and the second noise extraction range is set to 0.05-0.2, then a number of user data items anywhere between 50,000 and 200,000 is randomly selected from the 1,000,000 original negative samples, and the user labels carried by the selected items are deliberately changed to the user label corresponding to the target combined feature. After the labels have been changed, the relabelled original negative samples are added, as positive noise data, to the original positive samples to generate the effective positive samples.
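The two noise steps S3015 and S3016 are symmetric, so they can be illustrated with one helper applied in both directions. The sketch below is an interpretation under stated assumptions: the extraction proportion is drawn uniformly from the configured range, the selected records are copied rather than removed from their source set (the text does not say which), and the names `extract_noise`, `build_effective_samples` and the label strings are hypothetical.

```python
import random

def extract_noise(source_samples, noise_range, new_label, rng):
    """Pick a random share of source_samples and return relabelled copies."""
    low, high = noise_range
    # Any proportion inside the configured range is acceptable.
    count = int(len(source_samples) * rng.uniform(low, high))
    picked = rng.sample(source_samples, count)
    return [dict(record, label=new_label) for record in picked]

def build_effective_samples(original_pos, original_neg,
                            first_range, second_range, seed=0):
    rng = random.Random(seed)
    # S3015: positives relabelled as non-target become negative noise.
    negative_noise = extract_noise(original_pos, first_range, "non_target", rng)
    # S3016: negatives relabelled as target become positive noise.
    positive_noise = extract_noise(original_neg, second_range, "target", rng)
    effective_pos = original_pos + positive_noise
    effective_neg = original_neg + negative_noise
    return effective_pos, effective_neg

original_pos = [{"id": i, "label": "target"} for i in range(1_000)]
original_neg = [{"id": i, "label": "non_target"} for i in range(1_000, 11_000)]
effective_pos, effective_neg = build_effective_samples(
    original_pos, original_neg, first_range=(0.1, 0.3), second_range=(0.05, 0.2))
```

After this step every record in the effective positive set carries the target label, yet a share of those records originally belonged to non-target users, which is the property the embodiment relies on.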
Adding positive noise data to the original positive samples means that the positive samples contain not only user data that genuinely corresponds to the target combined feature but also some positive noise data drawn from non-target combined features, so the user data in the positive samples is not 100% genuine training user data. This avoids exporting genuine user data and removes the need to desensitize the sensitive items in the user data one by one, thereby ensuring the security of the user data.
S3017: Use the effective positive samples and effective negative samples as the to-be-trained data, and store them in the sample database.
Specifically, after the effective positive samples and effective negative samples are obtained, they are stored in the sample database as the to-be-trained data, so that the original gradient boosting decision tree model can subsequently be trained and the accuracy of the trained target gradient boosting decision tree model can be verified.
In steps S3011 to S3017, positive noise data is added to the original positive samples to obtain the effective positive samples, and negative noise data is added to the original negative samples to obtain the effective negative samples. This deliberately changes the accuracy of the original positive and negative samples, so that the user data in the resulting effective positive samples does not all genuinely contain the target combined feature and the user data in the effective negative samples does not all genuinely correspond to non-target combined features. This avoids exporting genuine user data and removes the need to desensitize the sensitive items in the user data one by one, thereby ensuring the security of the user data.
In one embodiment, after step S304, in which the original combined feature is input into the original logistic regression model to obtain the output value corresponding to the original combined feature, the data safety processing method further includes: when the output value corresponding to the original combined feature is below the preset monitoring range, reducing the first noise extraction range and the second noise extraction range and raising the positive-to-negative sample ratio; and when the output value corresponding to the original combined feature is above the preset monitoring range, increasing the first noise extraction range and the second noise extraction range and lowering the positive-to-negative sample ratio.
Specifically, if, after the to-be-trained data of the training set has been used to train and evaluate the original gradient boosting decision tree model and the original logistic regression model, the resulting output value corresponding to the original combined feature is not within the preset monitoring range, this indicates that the first noise extraction range, the second noise extraction range and the positive-to-negative sample ratio were set inaccurately and need to be adjusted dynamically.
Further, when the output value corresponding to the original combined feature is below the preset monitoring range, there are too many negative samples, so the positive-to-negative sample ratio should be raised to reduce the number of negative samples. In this case the first noise extraction range and the second noise extraction range are also set too high and should be reduced. When the output value corresponding to the original combined feature is above the preset monitoring range, there are too few negative samples, so the positive-to-negative sample ratio should be lowered to increase the number of negative samples. In this case the first noise extraction range and the second noise extraction range are set too low and should be increased.
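The adjustment rule above only fixes the direction of each change, not its size. The following sketch fills that gap with a simple multiplicative step, which is purely an assumption for illustration, as are the function name and the interpretation of the ratio as positives divided by negatives.

```python
def adjust_settings(output_value, monitor_range,
                    first_range, second_range, pos_neg_ratio, step=0.1):
    """Return updated (first_range, second_range, pos_neg_ratio).

    The direction of each change follows the embodiment; the multiplicative
    step size is a hypothetical choice.
    """
    low, high = monitor_range
    if output_value < low:
        # Too many negatives and too much noise: shrink ranges, raise ratio.
        noise_scale, ratio_scale = 1 - step, 1 + step
    elif output_value > high:
        # Too few negatives and too little noise: widen ranges, lower ratio.
        noise_scale, ratio_scale = 1 + step, 1 - step
    else:
        return first_range, second_range, pos_neg_ratio

    def scale(noise_range):
        # Keep proportions inside [0, 1].
        return tuple(min(1.0, bound * noise_scale) for bound in noise_range)

    return scale(first_range), scale(second_range), pos_neg_ratio * ratio_scale

too_low = adjust_settings(0.5, (0.7, 0.9), (0.1, 0.3), (0.05, 0.2), 0.1)
too_high = adjust_settings(0.95, (0.7, 0.9), (0.1, 0.3), (0.05, 0.2), 0.1)
in_range = adjust_settings(0.8, (0.7, 0.9), (0.1, 0.3), (0.05, 0.2), 0.1)
```

In practice such a rule would be applied repeatedly, retraining after each adjustment until the output value falls inside the monitoring range.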
The data safety processing method provided by the present invention obtains the user data to be measured from the user database according to the target combined feature and then, according to the target noise extraction range determined by the data safety processing model, extracts a specific number of user data items corresponding to non-target combined features and adds them to the user data to be measured, reducing the accuracy of the user data to be measured in order to ensure the security of the user data. Because the target noise extraction range is determined by the trained data safety processing model, the effective user data obtained with it is more accurate and meets the user's requirements, so that the security of the user data is ensured while the data can still be used and its value realized. After the effective user data is obtained, to further determine whether it meets the requirements, the effective user data is input into the data safety processing model to obtain the output value of the target combined feature corresponding to the effective user data; when the output value is within the preset monitoring range, the effective user data meets the requirements and is sent, as safe user data, to the user who needs it. When the output value corresponding to the original combined feature is below the preset monitoring range, the first noise extraction range and the second noise extraction range are reduced and the positive-to-negative sample ratio is raised; when the output value corresponding to the original combined feature is above the preset monitoring range, the first noise extraction range and the second noise extraction range are increased and the positive-to-negative sample ratio is lowered, so that the safe user data obtained comes closer to the user's requirements.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
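The overall flow summarized above (add target noise, score the result, release it only if the score is acceptable) can be sketched as follows. The trained data safety processing model is replaced here by a hypothetical `score_fn` callable, and the noise count is assumed to be a proportion of the user data to be measured; neither detail is specified in this form by the text.

```python
import random

def build_safe_user_data(users_to_measure, non_target_users,
                         target_noise_range, score_fn, monitor_range, seed=0):
    """Return safe user data, or None if the output value is out of range."""
    rng = random.Random(seed)
    low, high = target_noise_range
    # Assumption: the noise count is a share of the user data to be measured.
    count = int(len(users_to_measure) * rng.uniform(low, high))
    count = min(count, len(non_target_users))
    target_noise = rng.sample(non_target_users, count)

    effective = users_to_measure + target_noise
    rng.shuffle(effective)  # avoid leaving the noise grouped at the end

    output_value = score_fn(effective)  # stand-in for the trained model
    lo, hi = monitor_range
    return effective if lo <= output_value <= hi else None

# Toy stand-in for the model: the share of records that are genuine targets.
def share_of_genuine(records):
    return sum(r["genuine"] for r in records) / len(records)

users_to_measure = [{"id": i, "genuine": True} for i in range(100)]
non_target_users = [{"id": i, "genuine": False} for i in range(100, 200)]
safe_data = build_safe_user_data(
    users_to_measure, non_target_users,
    target_noise_range=(0.1, 0.3),
    score_fn=share_of_genuine,
    monitor_range=(0.7, 0.95))
```

Returning `None` when the check fails is a simplification; the embodiment instead describes adjusting the noise ranges and ratio so that a later attempt passes.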
In one embodiment, a data safety processing device is provided, which corresponds one-to-one to the data safety processing method in the above embodiments. As shown in Fig. 6, the data safety processing device includes a data processing request obtaining module 10, a to-be-measured user data obtaining module 20, an effective user data obtaining module 30, a data safety processing module 40 and a safe user data obtaining module 50. The functional modules are described in detail as follows:
The data processing request obtaining module 10 is configured to obtain a data processing request, the data processing request including a target combined feature.
The to-be-measured user data obtaining module 20 is configured to obtain the user data to be measured from the user database according to the target combined feature.
The effective user data obtaining module 30 is configured to determine the target noise extraction range according to the data safety processing model, and to process the user data to be measured based on the target noise extraction range to obtain effective user data.
The data safety processing module 40 is configured to input the effective user data into the data safety processing model and obtain the output value of the target combined feature corresponding to the effective user data.
The safe user data obtaining module 50 is configured to use the effective user data as safe user data when the output value of the target combined feature is within the preset monitoring range.
Further, the effective user data obtaining module 30 includes a target noise data obtaining unit and an effective user data obtaining unit.
The target noise data obtaining unit is configured to select, based on the target noise extraction range, user data corresponding to non-target combined features from the user database as target noise data.
The effective user data obtaining unit is configured to add the target noise data to the user data to be measured to obtain the effective user data.
Further, before the effective user data obtaining module, the data safety processing device further includes:
A to-be-trained data processing unit, configured to obtain the to-be-trained data and divide it into a training set and a test set.
A model parameter initialization unit, configured to initialize the model parameters of the original gradient boosting decision tree model, the model parameters including the maximum depth and the maximum number of iterations of the gradient boosting decision tree.
A model training unit, configured to input the to-be-trained data of the training set into the original gradient boosting decision tree model; when the training depth in the original gradient boosting decision tree model reaches the maximum depth and the number of iterations reaches the maximum number of iterations, training of the original gradient boosting decision tree model stops, and the original combined feature corresponding to each decision tree path in the original gradient boosting decision tree model is obtained.
An output value computing unit, configured to input the original combined feature into the original logistic regression model and obtain the output value corresponding to the original combined feature.
A model determination unit, configured to use the original gradient boosting decision tree model and the original logistic regression model as the target gradient boosting decision tree model and the target logistic regression model when the output value corresponding to the original combined feature is within the preset monitoring range.
A model testing unit, configured to test the target gradient boosting decision tree model and the target logistic regression model using the to-be-trained data of the test set; if the output value obtained for each item of to-be-trained data is within the preset monitoring range, the target gradient boosting decision tree model and the target logistic regression model are used as the data safety processing model.
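The first and last units in this list, the train/test division and the acceptance test, are simple enough to sketch without a modelling library. The 20% test share, the function names, and the `score_fn` stand-in for the combined gradient boosting decision tree and logistic regression models are all assumptions made for the example.

```python
import random

def split_train_test(to_be_trained, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and cut it into a training and a test set."""
    rng = random.Random(seed)
    shuffled = list(to_be_trained)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

def passes_test(score_fn, test_set, monitor_range):
    """Accept the models only if every test record scores inside the range."""
    low, high = monitor_range
    return all(low <= score_fn(record) <= high for record in test_set)

# Toy data: each record already carries a score a model might have produced.
to_be_trained = [{"id": i, "score": 0.75 + (i % 10) / 100} for i in range(100)]
training_set, test_set = split_train_test(to_be_trained)
accepted = passes_test(lambda record: record["score"], test_set, (0.7, 0.9))
rejected = passes_test(lambda record: record["score"], test_set, (0.8, 0.9))
```

Requiring every test record to fall inside the range mirrors the wording of the model testing unit; a real system might prefer a tolerance for a small share of outliers.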
Further, the to-be-trained data processing unit includes a model training request obtaining subunit, a user data selection subunit, an original positive sample obtaining subunit, an original negative sample obtaining subunit, an effective negative sample obtaining subunit, an effective positive sample obtaining subunit and a to-be-trained data generation subunit.
The model training request obtaining subunit is configured to obtain a model training request, the model training request including a training combined feature.
The user data selection subunit is configured to select, from the user database according to the training combined feature, the training user data that matches the training combined feature and the non-training user data that does not match it.
The original positive sample obtaining subunit is configured to select, according to the preset positive sample quantity, the corresponding training user data from the training user data as original positive samples.
The original negative sample obtaining subunit is configured to select, according to the positive-to-negative sample ratio, the corresponding non-training user data from the non-training user data as original negative samples.
The effective negative sample obtaining subunit is configured to extract negative noise data from the original positive samples according to the first noise extraction range and add the negative noise data to the original negative samples to generate effective negative samples.
The effective positive sample obtaining subunit is configured to extract positive noise data from the original negative samples according to the second noise extraction range and add the positive noise data to the original positive samples to generate effective positive samples.
The to-be-trained data generation subunit is configured to store the effective positive samples and effective negative samples in the sample database as the to-be-trained data.
Further, the data safety processing device is also configured to reduce the first noise extraction range and the second noise extraction range and raise the positive-to-negative sample ratio when the output value corresponding to the original combined feature is below the preset monitoring range, and to increase the first noise extraction range and the second noise extraction range and lower the positive-to-negative sample ratio when the output value corresponding to the original combined feature is above the preset monitoring range.
For specific limitations of the data safety processing device, refer to the limitations of the data safety processing method above; they are not repeated here. Each module in the above data safety processing device may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 7. The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores the data involved in the data safety processing method. The network interface of the computer device communicates with external terminals over a network connection. When executed by the processor, the computer program implements a data safety processing method.
In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of the above data safety processing method, such as steps S10 to S50 shown in Fig. 2, or the steps shown in Fig. 3 to Fig. 5. To avoid repetition, details are not repeated here. Alternatively, when executing the computer program, the processor implements the functions of the modules of the above data safety processing device, such as the data processing request obtaining module 10, the to-be-measured user data obtaining module 20, the effective user data obtaining module 30, the data safety processing module 40 and the safe user data obtaining module 50 shown in Fig. 6. To avoid repetition, details are not repeated here.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the above data safety processing method, such as steps S10 to S50 shown in Fig. 2, or the steps shown in Fig. 3 to Fig. 5. To avoid repetition, details are not repeated here. Alternatively, when executed by a processor, the computer program implements the functions of the modules of the above data safety processing device, such as the data processing request obtaining module 10, the to-be-measured user data obtaining module 20, the effective user data obtaining module 30, the data safety processing module 40 and the safe user data obtaining module 50 shown in Fig. 6. To avoid repetition, details are not repeated here.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
It will be clear to those skilled in the art that, for convenience and brevity of description, only the above division into functional units and modules is used as an example. In practical applications, the above functions can be allocated to different functional units and modules as required; that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above.
The embodiments described above are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and they should all fall within the protection scope of the present invention.

Claims (10)

1. A data safety processing method, characterized by comprising:
obtaining a data processing request, the data processing request including a target combined feature;
obtaining user data to be measured from a user database according to the target combined feature;
determining a target noise extraction range according to a data safety processing model, and processing the user data to be measured based on the target noise extraction range to obtain effective user data;
inputting the effective user data into the data safety processing model to obtain an output value of the target combined feature corresponding to the effective user data;
when the output value of the target combined feature is within a preset monitoring range, using the effective user data as safe user data.
2. The data safety processing method according to claim 1, characterized in that processing the user data to be measured based on the target noise extraction range to obtain effective user data comprises:
selecting, based on the target noise extraction range, user data corresponding to non-target combined features from the user database as target noise data;
adding the target noise data to the user data to be measured to obtain the effective user data.
3. The data safety processing method according to claim 1, characterized in that, before the step of determining the target noise extraction range according to the data safety processing model, the data safety processing method further comprises:
obtaining to-be-trained data, and dividing the to-be-trained data into a training set and a test set;
initializing model parameters of an original gradient boosting decision tree model, the model parameters including a maximum depth and a maximum number of iterations of the gradient boosting decision tree;
inputting the to-be-trained data of the training set into the original gradient boosting decision tree model; when the training depth in the original gradient boosting decision tree model reaches the maximum depth and the number of iterations reaches the maximum number of iterations, stopping training of the original gradient boosting decision tree model and obtaining the original combined feature corresponding to each decision tree path in the original gradient boosting decision tree model;
inputting the original combined feature into an original logistic regression model to obtain an output value corresponding to the original combined feature;
when the output value corresponding to the original combined feature is within the preset monitoring range, using the original gradient boosting decision tree model and the original logistic regression model as a target gradient boosting decision tree model and a target logistic regression model;
testing the target gradient boosting decision tree model and the target logistic regression model using the to-be-trained data of the test set; if the output value obtained for each item of the to-be-trained data is within the preset monitoring range, using the target gradient boosting decision tree model and the target logistic regression model as the data safety processing model.
4. The data safety processing method according to claim 3, characterized in that obtaining the to-be-trained data and dividing the to-be-trained data into a training set and a test set comprises:
obtaining a model training request, the model training request including a training combined feature;
selecting, from the user database according to the training combined feature, training user data that matches the training combined feature and non-training user data that does not match the training combined feature;
selecting, according to a preset positive sample quantity, corresponding training user data from the training user data as original positive samples;
selecting, according to a positive-to-negative sample ratio, corresponding non-training user data from the non-training user data as original negative samples;
extracting negative noise data from the original positive samples according to a first noise extraction range, and adding the negative noise data to the original negative samples to generate effective negative samples;
extracting positive noise data from the original negative samples according to a second noise extraction range, and adding the positive noise data to the original positive samples to generate effective positive samples;
storing the effective positive samples and the effective negative samples in a sample database as the to-be-trained data.
5. The data safety processing method according to claim 4, characterized in that, after the step of inputting the original combined feature into the original logistic regression model to obtain the output value corresponding to the original combined feature, the data safety processing method further comprises:
when the output value corresponding to the original combined feature is below the preset monitoring range, reducing the first noise extraction range and the second noise extraction range and raising the positive-to-negative sample ratio; when the output value corresponding to the original combined feature is above the preset monitoring range, increasing the first noise extraction range and the second noise extraction range and lowering the positive-to-negative sample ratio.
6. A data safety processing device, characterized by comprising:
a data processing request obtaining module, configured to obtain a data processing request, the data processing request including a target combined feature;
a to-be-measured user data obtaining module, configured to obtain user data to be measured from a user database according to the target combined feature;
an effective user data obtaining module, configured to determine a target noise extraction range according to a data safety processing model, and to process the user data to be measured based on the target noise extraction range to obtain effective user data;
a data safety processing module, configured to input the effective user data into the data safety processing model and obtain an output value of the target combined feature corresponding to the effective user data;
a safe user data obtaining module, configured to use the effective user data as safe user data when the output value of the target combined feature is within a preset monitoring range.
7. The data security processing device according to claim 6, wherein, before the valid user data obtaining module, the data security processing device further comprises:
A to-be-trained data processing unit, configured to obtain to-be-trained data and divide the to-be-trained data into a training set and a test set;
A model parameter initialization unit, configured to initialize the model parameters of an original gradient boosting decision tree model, the model parameters including the maximum depth and the maximum number of iterations of the gradient boosting decision tree;
A model training unit, configured to input the to-be-trained data corresponding to the training set into the original gradient boosting decision tree model, and, when the training depth in the original gradient boosting decision tree model reaches the maximum depth and the number of iterations reaches the maximum number of iterations, to stop training the original gradient boosting decision tree model and obtain the original combined feature corresponding to each decision tree path in the original gradient boosting decision tree model;
An output value calculation unit, configured to input the original combined features into an original logistic regression model and obtain the output value corresponding to the original combined features;
A model determination unit, configured to take the original gradient boosting decision tree model and the original logistic regression model as a target gradient boosting decision tree model and a target logistic regression model when the output value corresponding to the original combined features is within the preset monitoring range;
A model testing unit, configured to test the target gradient boosting decision tree model and the target logistic regression model using the to-be-trained data corresponding to the test set, and, if the output value obtained for each item of to-be-trained data is within the preset monitoring range, to take the target gradient boosting decision tree model and the target logistic regression model as the data security processing model.
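How a decision tree path becomes a combined feature that feeds the logistic regression model can be sketched as follows. This is only an illustration under stated assumptions: the two trees, the leaf counts, the weights and the monitoring range are hand-written placeholders rather than a trained model, and the helper names (`leaf_id`, `combined_features`, `logistic_output`) are hypothetical. Each tree's reached leaf is one-hot encoded, so every 1 in the vector marks one root-to-leaf path.

```python
import math
from typing import Dict, List, Union

# A tree is either a leaf id (int) or a split node (dict).
Tree = Union[int, Dict]


def leaf_id(tree: Tree, x: Dict[str, float]) -> int:
    """Follow one root-to-leaf path and return the id of the leaf reached."""
    while isinstance(tree, dict):
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


def combined_features(trees: List[Tree], leaves_per_tree: List[int],
                      x: Dict[str, float]) -> List[float]:
    """One-hot encode the leaf reached in each tree and concatenate."""
    encoded: List[float] = []
    for tree, n_leaves in zip(trees, leaves_per_tree):
        one_hot = [0.0] * n_leaves
        one_hot[leaf_id(tree, x)] = 1.0
        encoded.extend(one_hot)
    return encoded


def logistic_output(weights: List[float], bias: float,
                    features: List[float]) -> float:
    """Logistic regression output value for one encoded record."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hand-written trees (maximum depth 2) standing in for a trained ensemble.
trees = [
    {"feature": "age", "threshold": 30,
     "left": 0,
     "right": {"feature": "income", "threshold": 5000, "left": 1, "right": 2}},
    {"feature": "income", "threshold": 3000, "left": 0, "right": 1},
]
leaves_per_tree = [3, 2]
weights = [0.5, -1.0, 1.5, -0.5, 0.8]  # one illustrative weight per leaf
bias = 0.0

x = {"age": 42, "income": 8000}
feats = combined_features(trees, leaves_per_tree, x)
score = logistic_output(weights, bias, feats)
monitor_range = (0.5, 1.0)  # assumed preset monitoring range
accepted = monitor_range[0] <= score <= monitor_range[1]
print(feats, accepted)
```

In a real pipeline the trees would come from gradient boosting trained up to the configured maximum depth and iteration count, and the weights from fitting the logistic regression on the encoded training set. Both are fixed by hand here to keep the sketch self-contained.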
8. The data security processing device according to claim 6, wherein the to-be-trained data processing unit comprises:
A model training request subunit, configured to obtain a model training request, the model training request including a training combined feature;
A user data selection subunit, configured to select from the user database, according to the training combined feature, secure user data that matches the training combined feature and insecure user data that does not match the training combined feature;
An original positive sample obtaining subunit, configured to select the corresponding secure user data from the secure user data as original positive samples according to a preset positive sample quantity;
An original negative sample obtaining subunit, configured to select the corresponding insecure user data from the insecure user data as original negative samples according to a positive-to-negative sample ratio;
An effective negative sample obtaining subunit, configured to extract negative noise data from the original positive samples according to a first noise extraction range and add the negative noise data to the original negative samples to generate effective negative samples;
An effective positive sample obtaining subunit, configured to extract positive noise data from the original negative samples according to a second noise extraction range and add the positive noise data to the original positive samples to generate effective positive samples;
A to-be-trained data generation subunit, configured to store the effective positive samples and the effective negative samples in a sample database as the to-be-trained data.
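The sample construction recited above can be sketched roughly as follows. Several points are assumptions, not taken from the claim: each noise extraction range is read as an interval over a hypothetical per-record `score` field, extracted records are copied rather than removed from their source set, and the function returns the labeled samples instead of writing them to a sample database. The names `extract_noise` and `build_training_data` are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]
Range = Tuple[float, float]


def extract_noise(samples: List[Record], noise_range: Range) -> List[Record]:
    """Return the samples whose score lies inside the noise extraction range."""
    low, high = noise_range
    return [s for s in samples if low <= s["score"] <= high]


def build_training_data(
    secure: List[Record],
    insecure: List[Record],
    n_positive: int,
    neg_per_pos: int,
    first_range: Range,
    second_range: Range,
) -> List[Tuple[Record, int]]:
    # Original positives: a preset number of secure records.
    original_pos = secure[:n_positive]
    # Original negatives: sized by the positive-to-negative sample ratio.
    original_neg = insecure[: n_positive * neg_per_pos]
    # Negative noise is taken from the positives and joins the negatives.
    negative_noise = extract_noise(original_pos, first_range)
    # Positive noise is taken from the negatives and joins the positives.
    positive_noise = extract_noise(original_neg, second_range)
    effective_pos = original_pos + positive_noise
    effective_neg = original_neg + negative_noise
    # Label 1 marks effective positives, label 0 effective negatives.
    return [(s, 1) for s in effective_pos] + [(s, 0) for s in effective_neg]


secure = [{"id": 1, "score": 0.9}, {"id": 2, "score": 0.55}, {"id": 3, "score": 0.8}]
insecure = [
    {"id": 4, "score": 0.1},
    {"id": 5, "score": 0.45},
    {"id": 6, "score": 0.2},
    {"id": 7, "score": 0.3},
]
data = build_training_data(secure, insecure, n_positive=2, neg_per_pos=2,
                           first_range=(0.5, 0.6), second_range=(0.4, 0.5))
pos_ids = [s["id"] for s, label in data if label == 1]
neg_ids = [s["id"] for s, label in data if label == 0]
print(pos_ids, neg_ids)
```

Mixing a small amount of opposite-class data into each side in this way is a form of label noise injection. Whether the extracted records should also be removed from their source set is left open by the claim.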
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the data security processing method according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the data security processing method according to any one of claims 1 to 5.
CN201811187262.8A 2018-10-12 2018-10-12 Data safety processing method, device, computer equipment and storage medium Pending CN109543442A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811187262.8A CN109543442A (en) 2018-10-12 2018-10-12 Data safety processing method, device, computer equipment and storage medium
PCT/CN2018/122734 WO2020073492A1 (en) 2018-10-12 2018-12-21 Data security processing method and apparatus, and computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811187262.8A CN109543442A (en) 2018-10-12 2018-10-12 Data safety processing method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109543442A true CN109543442A (en) 2019-03-29

Family

ID=65843885

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811187262.8A Pending CN109543442A (en) 2018-10-12 2018-10-12 Data safety processing method, device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN109543442A (en)
WO (1) WO2020073492A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070310A (en) * 2020-09-10 2020-12-11 腾讯科技(深圳)有限公司 Loss user prediction method and device based on artificial intelligence and electronic equipment

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753914B (en) * 2020-06-29 2024-04-16 北京百度网讯科技有限公司 Model optimization method and device, electronic equipment and storage medium
CN112347476B (en) * 2020-11-13 2024-02-02 脸萌有限公司 Data protection method, device, medium and equipment
CN112632607B (en) * 2020-12-22 2024-04-26 中国建设银行股份有限公司 Data processing method, device and equipment
CN113780365A (en) * 2021-08-19 2021-12-10 支付宝(杭州)信息技术有限公司 Sample generation method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150095017A1 (en) * 2013-09-27 2015-04-02 Google Inc. System and method for learning word embeddings using neural language models
CN107782442A (en) * 2017-10-24 2018-03-09 华北电力大学(保定) Transformer multiple features parameter selection method based on big data and random forest
CN107895277A (en) * 2017-09-30 2018-04-10 平安科技(深圳)有限公司 Method, electronic installation and the medium of push loan advertisement in the application
US20180115568A1 (en) * 2016-10-21 2018-04-26 Neusoft Corporation Method and device for detecting network intrusion
CN108389125A (en) * 2018-02-27 2018-08-10 挖财网络技术有限公司 The overdue Risk Forecast Method and device of credit applications
CN108520181A (en) * 2018-03-26 2018-09-11 联想(北京)有限公司 data model training method and device
CN108537055A (en) * 2018-03-06 2018-09-14 南京邮电大学 A kind of privacy budget allocation of data query secret protection and data dissemination method and its system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8312273B2 (en) * 2009-10-07 2012-11-13 Microsoft Corporation Privacy vault for maintaining the privacy of user profiles
CN106339714B (en) * 2016-08-10 2020-12-01 上海交通大学 Privacy risk control method for multilayer embedded differential privacy to decision tree model

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150095017A1 (en) * 2013-09-27 2015-04-02 Google Inc. System and method for learning word embeddings using neural language models
US20180115568A1 (en) * 2016-10-21 2018-04-26 Neusoft Corporation Method and device for detecting network intrusion
CN107895277A (en) * 2017-09-30 2018-04-10 平安科技(深圳)有限公司 Method, electronic installation and the medium of push loan advertisement in the application
CN107782442A (en) * 2017-10-24 2018-03-09 华北电力大学(保定) Transformer multiple features parameter selection method based on big data and random forest
CN108389125A (en) * 2018-02-27 2018-08-10 挖财网络技术有限公司 The overdue Risk Forecast Method and device of credit applications
CN108537055A (en) * 2018-03-06 2018-09-14 南京邮电大学 A kind of privacy budget allocation of data query secret protection and data dissemination method and its system
CN108520181A (en) * 2018-03-26 2018-09-11 联想(北京)有限公司 data model training method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WEI LI ET AL.: "Heterogeneous Ensemble for Default Prediction of Peer-to-Peer Lending in China", IEEE ACCESS, no. 6, 1 March 2018 (2018-03-01), pages 54396 - 54406 *
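LI RUIQI (李睿琪): "Churn prediction model for anonymous telecom customer data" (针对匿名电信客户数据的流失预测模型), China Master's Theses Full-text Database, Economics and Management Sciences, 15 January 2018 (2018-01-15) *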

Also Published As

Publication number Publication date
WO2020073492A1 (en) 2020-04-16

Similar Documents

Publication Publication Date Title
CN109543442A (en) Data safety processing method, device, computer equipment and storage medium
CN108876133B (en) Risk assessment processing method, device, server and medium based on business information
US10783457B2 (en) Method for determining risk preference of user, information recommendation method, and apparatus
CN110489520A (en) Event-handling method, device, equipment and the storage medium of knowledge based map
CN109740869A (en) Data checking method, device, computer equipment and storage medium
WO2019218699A1 (en) Fraud transaction determining method and apparatus, computer device, and storage medium
CN109360105A (en) Product risks method for early warning, device, computer equipment and storage medium
US9443002B1 (en) Dynamic data analysis and selection for determining outcomes associated with domain specific probabilistic data sets
CN110992167A (en) Bank client business intention identification method and device
CN109065139A (en) Medical follow up method, apparatus, computer equipment and storage medium
CN108596760A (en) loan risk evaluation method and server
CN109509087A (en) Intelligentized loan checking method, device, equipment and medium
CN109615280A (en) Employee's data processing method, device, computer equipment and storage medium
CN107818491A (en) Electronic installation, Products Show method and storage medium based on user's Internet data
CN109255747A (en) A kind of intelligent checks method that information is declared
CN109118376A (en) Medical insurance premium calculation principle method, apparatus, computer equipment and storage medium
CN109583682A (en) Recognition methods, device and the computer equipment of business finance fraud risk
CN110533526A (en) A kind of recognition methods, device, computer equipment and the storage medium of black mark client
CN109377388A (en) Medical insurance is insured method, apparatus, computer equipment and storage medium
CN109766772A (en) Risk control method, device, computer equipment and storage medium
CN113538070A (en) User life value cycle detection method and device and computer equipment
CN108182633A (en) Loan data processing method, device, computer equipment and storage medium
CN112990989B (en) Value prediction model input data generation method, device, equipment and medium
US20210406930A1 (en) Benefit surrender prediction
CN115630221A (en) Terminal application interface display data processing method and device and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination