WO2019200810A1

WO2019200810A1 - User data authenticity analysis method and apparatus, storage medium and electronic device

Info

Publication number: WO2019200810A1
Application number: PCT/CN2018/103063
Authority: WO
Inventors: 叶俊锋; 龙觉刚; 孙成; 赖云辉; 罗先贤
Original assignee: 平安科技（深圳）有限公司
Priority date: 2018-04-20
Filing date: 2018-08-29
Publication date: 2019-10-24
Also published as: CN108596616B; CN108596616A

Abstract

The present disclosure relates to the technical field of computers, and relates to a user data authenticity analysis method and apparatus, a computer-readable storage medium and an electronic device, the user data authenticity analysis method comprising: building multiple result prediction models according to multiple feature combinations comprising sub-features; obtaining feature data to be analyzed which is of the same type as the sub-features; grouping the feature data to be analyzed according to the feature combinations to form multiple combinations of feature data to be analyzed; inputting the combinations of feature data to be analyzed into the result prediction models, thereby obtaining multiple prediction results; and fusing the prediction results to obtain a final prediction result. The present method improves the prediction precision and accuracy in fraud identification.

Description

Title: User data authenticity analysis method and device, storage medium, electronic device

[0001] The present application claims priority to Chinese Patent Application No. 201101359102.0, filed on Apr. 20, 2018 and entitled "User Data Authenticity Analysis Method and Apparatus, Storage Medium, Electronic Apparatus", the entire contents of which are hereby incorporated by reference.

[0002] The present disclosure relates to the field of computer technologies, and in particular, to a user data authenticity analysis method, a user data authenticity analysis device, a computer readable storage medium, and an electronic device.

Background art

[0003] With the rapid development of communication technologies, the Internet has gradually become an everyday tool, often used as a means of communicating and conducting business transactions with customers, vendors, employees, and shareholders. In theory, trading on the Internet is efficient and cost-effective, but it also has major drawbacks: hacking, identity theft, stolen credit cards, and other fraudulent acts threaten users' financial security and are difficult to manage.

[0004] Effective identification of fraudulent users is a key technical problem. The inventors have realized that in the prior art, a plurality of trained models are generally used to judge the authenticity of user feature data and output prediction results, and the prediction results corresponding to the multiple models are then merged through a data fusion method. However, since similar models adopt similar algorithms, their prediction results for the same user feature data tend to be consistent, so a self-enhancement effect is likely to occur when the prediction results are combined. In addition, if the prediction result of one model is wrong, the prediction results of other similar models are also wrong, and the fused result is then wrong as well, which undermines the effective identification of fraudulent users.

[0005] Therefore, it is desirable to provide a new user data authenticity analysis method and apparatus for identifying fraudulent users.

[0006] It is to be understood that the information disclosed in the Background section above is only intended to enhance understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art known to those of ordinary skill in the art.

Summary of invention

Technical problem

[0007] An object of the present disclosure is to provide a user data authenticity analysis method, a user data authenticity analysis device, a computer readable storage medium, and an electronic device, thereby overcoming, at least to some extent, the problem of economic fraud caused by inaccurate prediction results that arises from the limitations and defects of the related art.

Solution to problem

Technical solution

[0008] According to an aspect of the present disclosure, a user data authenticity analysis method is provided, including:

[0009] constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features;

[0010] acquiring feature data to be analyzed that is of the same type as the sub-features;

[0011] grouping the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;

[0012] inputting the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results; and

[0013] fusing the prediction results to obtain a final prediction result.

[0014] According to an aspect of the present disclosure, a user data authenticity analysis apparatus is provided, including:

[0015] a model building module, configured to construct a plurality of result prediction models according to a plurality of feature combinations including sub-features;

[0016] a data acquisition module, configured to acquire feature data to be analyzed that is of the same type as the sub-features;

[0017] a combination generation module, configured to group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;

[0018] a result prediction module, configured to input the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results; and

[0019] a result fusion module, configured to fuse the prediction results to obtain a final prediction result.

[0020] According to an aspect of the present disclosure, a computer readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, implements the user data authenticity analysis method described above.

[0021] According to an aspect of the disclosure, an electronic device is provided, including:

[0022] a processor;

[0023] a memory configured to store executable instructions of the processor;

[0024] wherein the processor is configured to execute the user data authenticity analysis method described above by executing the executable instructions.

[0025] In the user data authenticity analysis method and apparatus of the present disclosure, a plurality of result prediction models are constructed according to feature combinations including sub-features; feature data to be analyzed is extracted from the user data to be analyzed according to the types of the sub-features and grouped according to the feature combinations to form a plurality of feature data combinations to be analyzed; the feature data combinations to be analyzed are then input into the result prediction models to obtain a plurality of prediction results; and finally the plurality of prediction results are fused to obtain a final prediction result. On the one hand, because the user data samples are formed into a plurality of different feature combinations, a corresponding model is trained for each combination, the feature data combinations are analyzed to obtain multiple prediction results, and the prediction results are finally fused, the sample data can be fully utilized and the prediction accuracy is improved. On the other hand, over-fitting is avoided when the prediction results are fused, and the accuracy of fraud recognition is improved.

[0026] The above general description and the following detailed description are intended to be illustrative and not restrictive.

Advantageous effects of the invention

Brief description of the drawing

[0027] The accompanying drawings are incorporated in and constitute a part of this specification. Obviously, the drawings in the following description show only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings based on these drawings without any creative work.

[0028] FIG. 1 is a flow chart schematically showing a user data authenticity analysis method.

[0029] FIG. 2 is a schematic diagram showing an example of an application scenario of a user data authenticity analysis method.

[0030] FIG. 3 is a schematic flow chart showing a method of constructing a result prediction model.

[0031] FIG. 4 is a block diagram schematically showing a user data authenticity analyzing device.

[0032] FIG. 5 is a block diagram showing an exemplary electronic device for implementing the above-described user data authenticity analysis method.

[0033] FIG. 6 schematically illustrates a computer readable storage medium for implementing the above-described user data authenticity analysis method.

Embodiments of the invention

[0034] Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments can be embodied in a variety of forms and should not be construed as being limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concepts of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are set forth to provide a thorough understanding of the embodiments of the present disclosure. However, one skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced with one or more of the specific details omitted, or with other methods, components, means, steps, etc. In other instances, well-known technical solutions are not shown or described in detail, to avoid obscuring aspects of the present disclosure.

[0035] The drawings are only schematic representations of the present disclosure, and are not necessarily to scale. The same reference numerals in the drawings denote the same or similar parts, and a repeated description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily have to correspond to physically or logically separate entities. These functional entities may be implemented in software, or implemented in one or more hardware modules or integrated circuits, or implemented in different network and/or processor devices and/or microcontroller devices.

[0036] In the related art, fraud detection on a user is usually treated as a binary classification problem: a corresponding model is used to predict the authenticity of each piece of data and output a prediction result, and the prediction results output by the multiple models are then fused. However, similar models use similar algorithms, so their prediction results for the same user's feature data tend to be consistent. If the prediction results output by such models are fused, a self-enhancement effect easily occurs, resulting in inaccurate prediction results and a model that lacks generalization capability. Moreover, when the prediction result output by one model is wrong, the prediction results of other similar models are also wrong, so the fused result is wrong as well, which affects the effective identification of fraudulent users.

[0037] In view of the problems in the related art, the present exemplary embodiment first provides a user data authenticity analysis method. The method may be run on a server, a server cluster, or a cloud server; of course, it may also be run on other platforms as required, which is not specifically limited in this exemplary embodiment. Referring to FIG. 1, the user data authenticity analysis method may include the following steps:

[0038] Step S110: constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features;

[0039] Step S120: acquiring feature data to be analyzed that is of the same type as the sub-features;

[0040] Step S130: grouping the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;

[0041] Step S140: inputting the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results;

[0042] Step S150: fusing the prediction results to obtain a final prediction result.
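
The five steps above can be sketched as a small pipeline. This is a minimal illustration rather than the disclosed implementation: the names `analyze`, `Record`, `Combo`, and `Model`, the dictionary-based record layout, and the idea of passing the fusion step in as a callable are all assumptions made for the example.

```python
from typing import Callable, Dict, Sequence, Tuple

Record = Dict[str, float]            # hypothetical layout: field name -> value
Combo = Tuple[str, ...]              # one feature combination, e.g. ("age", "gender")
Model = Callable[[Record], float]    # assumed to return a fraud probability in [0, 1]


def analyze(record: Record,
            models: Dict[Combo, Model],
            fuse: Callable[[Sequence[float]], float]) -> float:
    """Run steps S120 to S150 for one record; `models` is the result of step S110."""
    # S120: keep only the fields whose type matches a trained sub-feature.
    known = {name for combo in models for name in combo}
    features = {k: v for k, v in record.items() if k in known}
    # S130: group the feature data according to each feature combination.
    groups = {combo: {name: features[name] for name in combo} for combo in models}
    # S140: feed each group to the model trained on the same combination.
    predictions = [models[combo](groups[combo]) for combo in models]
    # S150: fuse the individual predictions into the final prediction.
    return fuse(predictions)


if __name__ == "__main__":
    # Toy stand-ins for trained models (assumption: real models come from S110).
    toy_models: Dict[Combo, Model] = {
        ("age", "gender"): lambda group: 0.2,
        ("age", "page_time"): lambda group: 0.4,
    }
    average = lambda ps: sum(ps) / len(ps)  # placeholder fusion, not D-S theory
    print(analyze({"age": 30, "gender": 1, "page_time": 12.0}, toy_models, average))
```

Keeping one model per feature combination, keyed by the combination itself, makes the pairing in step S140 explicit: each group of feature data only ever reaches the model trained on the same sub-features.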

[0043] In the above user data authenticity analysis method, the data samples are formed into a plurality of different feature combinations, a corresponding model is trained for each combination, a plurality of prediction results are obtained by analyzing the feature data combinations to be analyzed, and the prediction results are finally fused to obtain the final prediction result. On the one hand, the sample data can be fully utilized, and the method remains applicable when samples are insufficient; on the other hand, over-fitting is avoided when the prediction results are fused, and the prediction precision and the accuracy of fraud recognition are improved.

[0044] Hereinafter, each step of the above user data authenticity analysis method in the present exemplary embodiment will be explained in detail with reference to the accompanying drawings.

[0045] In step S110, a plurality of result prediction models are constructed according to a plurality of feature combinations including sub-features.

[0046] FIG. 3 shows a schematic diagram of a method for constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features. As shown in FIG. 3, the method for constructing a result prediction model is as follows:

[0047] Step S301. Acquire user feature information, where the user feature information includes multiple sub-features;

[0048] Step S302: Perform machine training on the sub-features to construct a single feature model;

[0049] Step S303: input the sub-features into the single feature model, and obtain an accuracy rate of the sub-features;

[0050] Step S304: randomly combine the sub-features according to the accuracy rate of the sub-features to form the feature combinations;

[0051] Step S305. Perform machine training on the feature combination to construct the result prediction model.

[0052] The following describes the method of constructing the result prediction model in detail:

[0053] In step S301, user feature information is acquired, where the user feature information includes a plurality of sub-features.

[0054] In the present exemplary embodiment, the server 201 receives the user data sent by the mobile terminal 202, selects some or all of the data as user data samples, and obtains the user feature information, which includes the plurality of sub-features, from the user data samples. The user feature information may include behavior data, attribute data, and spatial data. The behavior data may include one or more parameters such as page browsing time, number of page clicks, and page click frequency; the attribute data may include one or more of age, gender, ID number, driver's license number, contact information, and the like; and the spatial data may include one or more of the device model, IP address, current location, and the like. The sub-features may be selected from the above user feature information, for example by taking age as one sub-feature and gender as another, which is not specifically limited in the present disclosure.

[0055] In step S302, the sub-features are machine trained to construct a single feature model.

[0056] In the present exemplary embodiment, each sub-feature may be machine trained to construct a single feature model, and the sub-feature is then input into the single feature model to generate a prediction result, from which the accuracy of each sub-feature is obtained. For example, the sub-features may include age, gender, current address, page browsing time, and page click frequency. The sample data of each sub-feature may be machine trained repeatedly with a learning algorithm such as a linear regression algorithm, a logistic regression algorithm, a decision tree, a naive Bayes algorithm, or a random forest algorithm until a single feature model with minimal loss is formed.

[0057] In step S303, the sub-feature is input to the single feature model to obtain an accuracy of the sub-feature.

[0058] In the present exemplary embodiment, after the single feature model is constructed according to the plurality of sub-features, the sub-features may be input to the single feature model, and the authenticity of the sub-features is predicted, thereby obtaining the accuracy of the plurality of sub-features.
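
Steps S302 and S303 can be illustrated with a one-feature threshold classifier (a decision stump). This is only a sketch: the disclosure lists several candidate learning algorithms without fixing one, so the stump, the function names `train_stump` and `stump_accuracy`, and the toy data are assumptions chosen to keep the example dependency-free.

```python
from typing import List, Tuple


def _predict(values: List[float], threshold: float, polarity: int) -> List[int]:
    """Label a sample 1 (fraud) when it lies on the `polarity` side of the threshold."""
    return [1 if polarity * (v - threshold) >= 0 else 0 for v in values]


def stump_accuracy(values: List[float], labels: List[int],
                   threshold: float, polarity: int) -> float:
    """S303: share of samples whose predicted label matches the true label."""
    predictions = _predict(values, threshold, polarity)
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def train_stump(values: List[float], labels: List[int]) -> Tuple[float, int]:
    """S302: try every observed value as a threshold and keep the most accurate one."""
    best_threshold, best_polarity, best_accuracy = 0.0, 1, -1.0
    for threshold in sorted(set(values)):
        for polarity in (1, -1):
            accuracy = stump_accuracy(values, labels, threshold, polarity)
            if accuracy > best_accuracy:
                best_threshold, best_polarity, best_accuracy = threshold, polarity, accuracy
    return best_threshold, best_polarity


if __name__ == "__main__":
    # Hypothetical samples of one sub-feature with fraud labels (1 = fraud).
    page_time = [1, 2, 3, 4]
    is_fraud = [0, 0, 1, 1]
    threshold, polarity = train_stump(page_time, is_fraud)
    print(threshold, polarity, stump_accuracy(page_time, is_fraud, threshold, polarity))
```

In this sketch the accuracy is measured on the training samples themselves; scoring on held-out samples would be a reasonable variation when enough data is available.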

[0059] In step S304, the sub-features are randomly combined to form the feature combination according to the accuracy of the sub-features.

[0060] In the present exemplary embodiment, after the accuracy of the sub-features is obtained, all the sub-features may be randomly combined according to their accuracy to form a plurality of feature combinations. For example, the server 201 may randomly draw any three of the sub-features age, gender, current address, page browsing time, and page click frequency to form a feature combination, such as {age, gender, current address} or {age, current address, page browsing time}. It is of course also possible to select any other number of sub-features to form different feature combinations, which will not be described in detail in the present disclosure. Because the feature combinations are formed by random combination, each piece of data can be used to generate a corresponding model and a corresponding predicted value, so the data samples are fully utilized and the prediction accuracy is improved.

[0061] Further, in order to improve efficiency, all sub-features may be randomly combined by the roulette method to form the plurality of feature combinations.
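
One possible reading of the roulette method is roulette-wheel selection, in which each sub-feature is drawn with probability proportional to its accuracy. The sketch below follows that reading; the function names, the draw-without-replacement rule inside a combination, the `max_spins` cap, and the accuracy values are assumptions, not details taken from the disclosure.

```python
import random
from typing import Dict, List, Tuple


def roulette_pick(weights: Dict[str, float], rng: random.Random) -> str:
    """Spin once: each name owns a slice of the wheel proportional to its weight."""
    spin = rng.uniform(0.0, sum(weights.values()))
    cumulative = 0.0
    picked = ""
    for name, weight in weights.items():
        picked = name
        cumulative += weight
        if spin <= cumulative:
            break
    return picked  # the last name is used if rounding pushes the spin past the total


def roulette_combination(accuracy: Dict[str, float], size: int,
                         rng: random.Random) -> Tuple[str, ...]:
    """Draw `size` distinct sub-features, favouring the more accurate ones."""
    if not 0 < size <= len(accuracy):
        raise ValueError("size must be between 1 and the number of sub-features")
    pool = dict(accuracy)
    chosen: List[str] = []
    for _ in range(size):
        name = roulette_pick(pool, rng)
        chosen.append(name)
        del pool[name]  # without replacement: a sub-feature appears once per combination
    return tuple(sorted(chosen))


def build_combinations(accuracy: Dict[str, float], size: int, count: int,
                       rng: random.Random, max_spins: int = 1000) -> List[Tuple[str, ...]]:
    """Collect up to `count` distinct combinations, giving up after `max_spins` draws."""
    combinations: List[Tuple[str, ...]] = []
    for _ in range(max_spins):
        combo = roulette_combination(accuracy, size, rng)
        if combo not in combinations:
            combinations.append(combo)
        if len(combinations) == count:
            break
    return combinations


if __name__ == "__main__":
    # Made-up accuracy rates for the five example sub-features.
    rates = {"age": 0.61, "gender": 0.52, "current_address": 0.70,
             "page_time": 0.66, "click_freq": 0.58}
    print(build_combinations(rates, size=3, count=4, rng=random.Random(0)))
```

Passing in a seeded `random.Random` keeps the draw reproducible, which helps when the same feature combinations must be reused at prediction time.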

[0062] In step S305, machine training is performed on the feature combination to construct the result prediction model.

[0063] In the present exemplary embodiment, machine training is performed on each feature combination formed by random combination, and a plurality of result prediction models are constructed. In the present disclosure, the data in each feature combination may be machine trained with a learning algorithm such as a linear regression algorithm, a logistic regression algorithm, a decision tree, a naive Bayes algorithm, or a random forest algorithm to obtain a plurality of result prediction models, one corresponding to each feature combination, which are used for subsequent data analysis to improve the accuracy of fraud identification. The algorithm used to form the result prediction models may be the same as or different from the algorithm used to form the single feature model, which is not specifically limited in the present disclosure.
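
As an illustration of step S305, the sketch below trains one logistic regression model per feature combination using plain stochastic gradient descent. Logistic regression is one of the algorithms listed above, but the training loop, the hyperparameters, the function names, and the assumption that features are already numeric and scaled to roughly [0, 1] are all choices made for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Combo = Tuple[str, ...]


def _sigmoid(z: float) -> float:
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in math.exp
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights: Sequence[float], x: Sequence[float]) -> float:
    """Fraud probability for one sample; weights[0] is the bias term."""
    return _sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def train_logistic(rows: List[List[float]], labels: List[int],
                   epochs: int = 500, lr: float = 0.1) -> List[float]:
    """Fit logistic regression by stochastic gradient descent on the log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = predict_proba(weights, x) - y
            weights[0] -= lr * error
            for i, value in enumerate(x):
                weights[i + 1] -= lr * error * value
    return weights


def build_models(samples: List[Dict[str, float]], labels: List[int],
                 combinations: List[Combo]) -> Dict[Combo, List[float]]:
    """S305: train one result prediction model for each feature combination."""
    models: Dict[Combo, List[float]] = {}
    for combo in combinations:
        rows = [[sample[name] for name in combo] for sample in samples]
        models[combo] = train_logistic(rows, labels)
    return models


if __name__ == "__main__":
    # Hypothetical scaled samples; label 1 marks a fraudulent user.
    data = [{"a": 0.1, "b": 0.9}, {"a": 0.2, "b": 0.1},
            {"a": 0.8, "b": 0.8}, {"a": 0.9, "b": 0.2}]
    trained = build_models(data, [0, 0, 1, 1], [("a",), ("a", "b")])
    print(predict_proba(trained[("a",)], [0.9]))
```

Because every model sees only the columns named in its own combination, two models trained on the same samples can still disagree, which is what gives the later fusion step something to combine.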

[0064] In step S120, feature data to be analyzed that is the same as the sub-feature type is acquired.

[0065] In the present exemplary embodiment, the server 201 receives the user data to be analyzed sent by the mobile terminal 202, and extracts data of the corresponding type from the user data to be analyzed according to the type of the sub-feature to form the feature data to be analyzed. For example, when the sub-features in step S110 are age, gender, and current address, all age information, gender information, and current address information may be extracted from the user data to be analyzed as the feature data to be analyzed. It is worth noting that as the sub-feature types change, the feature data to be analyzed also changes accordingly.

[0066] In step S130, the feature data to be analyzed is grouped according to the feature combination to form a plurality of feature data combinations to be analyzed.

[0067] In the present exemplary embodiment, the feature data to be analyzed acquired in step S120 is grouped according to the feature combinations in step S110 to form a plurality of feature data combinations to be analyzed. For example, if the feature combination is {age, gender, page browsing time}, the user's age information, gender information, and page browsing time information are extracted from the feature data to be analyzed and combined to form a feature data combination to be analyzed, which may take the specific form {age to be analyzed, gender to be analyzed, page browsing time to be analyzed}.
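
The extraction in step S120 and the grouping in step S130 amount to two dictionary operations, sketched below. The function names, the field names, and the decision to skip a combination when one of its fields is missing are assumptions for illustration; the disclosure does not say how incomplete data is handled.

```python
from typing import Dict, List, Tuple

Combo = Tuple[str, ...]


def extract_features(user_data: Dict[str, object],
                     sub_features: List[str]) -> Dict[str, object]:
    """S120: keep only the fields whose type matches a trained sub-feature."""
    return {name: user_data[name] for name in sub_features if name in user_data}


def group_features(features: Dict[str, object],
                   combinations: List[Combo]) -> Dict[Combo, Tuple[object, ...]]:
    """S130: build one ordered value tuple per feature combination."""
    groups: Dict[Combo, Tuple[object, ...]] = {}
    for combo in combinations:
        # Assumption: a combination with a missing field is skipped, not imputed.
        if all(name in features for name in combo):
            groups[combo] = tuple(features[name] for name in combo)
    return groups


if __name__ == "__main__":
    raw = {"age": 34, "gender": "F", "page_time": 12.5, "device": "X1"}
    wanted = ["age", "gender", "current_address", "page_time"]
    features = extract_features(raw, wanted)
    print(group_features(features, [("age", "gender", "page_time"),
                                    ("age", "gender", "current_address")]))
```

Each resulting tuple keeps the field order of its combination, so it can be passed directly to the model that was trained on the same ordered sub-features.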

[0068] In step S140, the feature data combinations to be analyzed are input to the result prediction models to obtain a plurality of prediction results.

[0069] In the present exemplary embodiment, the plurality of feature data combinations to be analyzed obtained in step S130 are separately input to the result prediction models obtained in step S110, so that a prediction is made for each feature data combination to be analyzed and a plurality of prediction results are obtained. For example, the mobile terminal 202 collects M (M is a positive integer) pieces of related information of the user; correspondingly, the server 201 forms M groups of feature data combinations to be analyzed, and then sequentially inputs the M groups of feature data combinations to be analyzed into the result prediction models for prediction, obtaining M prediction results.

[0070] In step S150, a plurality of the prediction results are fused to obtain a final prediction result.

[0071] In the present exemplary embodiment, after the prediction results of the M groups of feature data combinations to be analyzed are obtained, the M prediction results may be fused to obtain the final prediction result. In the present disclosure, the plurality of prediction results may be fused by a data fusion method such as Bayesian inference, voting, D-S (Dempster-Shafer) evidence theory, or neural network fusion. D-S evidence theory has a strong ability to process uncertain information and does not need prior information; it describes uncertain information by “interval estimation” rather than “point estimation”, which provides a way to represent the “unknown”, and it shows great flexibility in distinguishing between ignorance and uncertainty and in accurately reflecting evidence collection. In addition, the D-S evidence theory fusion framework supports unlimited expansion of models. D-S evidence theory is therefore preferred in this disclosure as the fusion framework for fusing the plurality of prediction results to obtain the final prediction result.
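
To make the fusion step concrete, the sketch below applies Dempster's rule of combination on the two-hypothesis frame {fraud, legit}. The rule itself is standard D-S evidence theory; how each model's output is turned into a mass function is not specified in the disclosure, so the `to_mass` mapping (discounting a fraud probability by a reliability weight such as the model's accuracy) and all names here are assumptions.

```python
from functools import reduce
from typing import NamedTuple, Sequence


class Mass(NamedTuple):
    fraud: float    # mass assigned to {fraud}
    legit: float    # mass assigned to {legit}
    unknown: float  # mass assigned to the whole frame {fraud, legit}


def to_mass(p_fraud: float, reliability: float) -> Mass:
    """Assumed mapping: the unreliable share of a model's output becomes 'unknown'."""
    return Mass(reliability * p_fraud, reliability * (1.0 - p_fraud), 1.0 - reliability)


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses of intersecting sets, renormalise by 1 - K."""
    # K is the mass that falls on empty intersections ({fraud} with {legit}).
    conflict = m1.fraud * m2.legit + m1.legit * m2.fraud
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    fraud = (m1.fraud * m2.fraud + m1.fraud * m2.unknown + m1.unknown * m2.fraud) / norm
    legit = (m1.legit * m2.legit + m1.legit * m2.unknown + m1.unknown * m2.legit) / norm
    unknown = (m1.unknown * m2.unknown) / norm
    return Mass(fraud, legit, unknown)


def fuse(masses: Sequence[Mass]) -> Mass:
    """Fold any number of mass functions together, one pair at a time."""
    return reduce(combine, masses)


if __name__ == "__main__":
    # Two hypothetical model outputs with made-up reliabilities.
    fused = fuse([to_mass(0.8, 0.9), to_mass(0.6, 0.8)])
    belief, plausibility = fused.fraud, fused.fraud + fused.unknown
    print(fused, belief, plausibility)
```

The pair `fused.fraud` and `fused.fraud + fused.unknown` gives the belief and plausibility of fraud, which is the interval-style estimate the paragraph above contrasts with a single point estimate. Because `fuse` folds pairwise, adding another model only adds one more `combine` call.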

[0072] Further, based on the obtained final prediction result, the authenticity of the user data to be analyzed may be determined. For example, the final prediction result may be expressed as a fraud probability. When the final prediction result is higher than 0.05, it may be determined that the user data to be analyzed is untrue and belongs to a fraudulent user; conversely, when the final prediction result is not higher than 0.05, it may be determined that the user data to be analyzed is true and belongs to a non-fraudulent user. That is, the lower the fraud probability, the higher the legitimacy of the user and the authenticity of the user data. Of course, those skilled in the art may also set another fraud probability threshold according to actual conditions to judge the legitimacy of the user.
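
The decision rule in this paragraph reduces to a single comparison, sketched below with the example threshold of 0.05 as a default. The function name `is_fraudulent` and the input validation are assumptions added for the example.

```python
def is_fraudulent(fraud_probability: float, threshold: float = 0.05) -> bool:
    """Flag the user data as untrue when the fused fraud probability exceeds the threshold."""
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must lie in [0, 1]")
    # Strictly higher than the threshold means fraud; equal to it does not.
    return fraud_probability > threshold


if __name__ == "__main__":
    for probability in (0.01, 0.05, 0.30):
        print(probability, is_fraudulent(probability))
```

Exposing the threshold as a parameter reflects the remark that a different fraud probability threshold may be chosen according to actual conditions.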

[0073] On the one hand, the user data authenticity analysis method of the present disclosure makes full use of the sample data and, by using D-S evidence theory as the fusion framework for the prediction results, avoids over-fitting, thereby improving the prediction precision and the accuracy of fraud identification; on the other hand, the method is also applicable when the number of samples is insufficient, making fraud identification easier.

[0074] The user data authenticity analysis method of the present disclosure can be used in scenarios such as policy surrender applications and auto insurance claims to determine the legitimacy of the surrender applicant or auto insurance claimant and of their requests, preventing hackers and other lawbreakers from obtaining benefits by improper means and causing losses to insurance institutions. The method is described below by taking the prediction of the legitimacy of an auto insurance claimant and the claimant's request as an example. First, the raw data of auto insurance cases is taken as the data sample, which contains multiple sub-features such as name, gender, policy start and end dates, insured amount, accident time, road segment, vehicle brand, vehicle value, and claim amount. Next, each sub-feature is trained to establish a single feature model, and the single feature model is used to make predictions on the sub-features to obtain their accuracy. All the sub-features are then randomly combined according to their accuracy to form a plurality of feature combinations, for example {name, gender, accident time, claim amount}, and the data in each feature combination is trained to construct a plurality of result prediction models. After that, the user data to be analyzed is acquired, and feature data to be analyzed that is of the same type as the sub-features is extracted from it; the extracted feature data to be analyzed is grouped according to the feature combinations to form a plurality of feature data combinations to be analyzed, which are input into the trained result prediction models for analysis to obtain the corresponding prediction results. Finally, using D-S evidence theory as the fusion framework, the prediction results are fused to obtain the final prediction result, and whether the auto insurance claimant and the request are legitimate is judged based on the final prediction result.

[0075] The present disclosure also provides a user data authenticity analysis device. Referring to FIG. 4, the user data authenticity analysis device may include a model building module 410, a data acquisition module 420, a combination generation module 430, a result prediction module 440, and a result fusion module 450, wherein:

[0076] The model building module 410 is configured to construct a plurality of result prediction models according to the plurality of feature combinations including the sub features;

[0077] The data acquisition module 420 is configured to acquire feature data to be analyzed that is of the same type as the sub-features;

[0078] The combination generation module 430 is configured to group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;

[0079] The result prediction module 440 is configured to input the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results;

[0080] The result fusion module 450 is configured to fuse the prediction results to obtain a final prediction result.

[0081] The specific details of each module in the user data authenticity analysis apparatus have been described in detail in the corresponding user data authenticity analysis method, and thus are not described herein again.

[0082] It should be noted that although several modules or units of equipment for action execution are mentioned in the above detailed description, such division is not mandatory. In fact, the features and functions of the two or more modules or units described above may be embodied in one module or unit in accordance with the embodiments of the present disclosure. Conversely, the features and functions of one of the modules or units described above may be further divided into multiple modules or units.

[0083] Further, although the various steps of the method of the present disclosure are described in a particular order in the drawings, this does not require or imply that these steps must be performed in that particular order, or that all of the steps shown must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be broken down into multiple steps.

[0084] Through the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented by software or by software in combination with necessary hardware. Therefore, the technical solution according to an embodiment of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a mobile hard disk, etc.) or on a network, and which includes a number of instructions to cause a computing device (which may be a personal computer, server, mobile terminal, or network device, etc.) to perform a method in accordance with an embodiment of the present disclosure.

[0085] In an embodiment, the user data authenticity analysis device further includes:

[0086] a feature information acquiring module, configured to acquire user feature information, where the user feature information includes multiple sub-features;

[0087] a feature combination forming module configured to form a plurality of the feature combinations according to the sub-features.

[0088] In an embodiment, the model building module 410 includes:

[0089] a random combination unit configured to acquire an accuracy rate of the sub-features, and randomly combine the sub-features according to an accuracy rate of the sub-features to form a plurality of the feature combinations;

[0090] A feature training unit configured to perform machine training on the feature combination to construct the result prediction model.

[0091] In an embodiment, the foregoing random combination unit includes:

[0092] a sub-feature trainer configured to perform machine training on the sub-features to construct a single feature model;

[0093] an accuracy rate acquirer configured to input the sub-feature to the single feature model to obtain an accuracy rate of the sub-feature.

[0094] In an embodiment, the foregoing random combination unit includes:

[0095] a roulette combiner, configured to randomly combine the sub-features by the roulette method according to the accuracy rate of the sub-features to form the feature combinations.

[0096] In an embodiment, the foregoing data acquisition module 420 includes:

[0097] a user data obtaining unit, configured to acquire user data to be analyzed;

[0098] a feature data extraction unit, configured to extract the feature data to be analyzed from the user data to be analyzed according to the type of the sub-features.

[0099] In an embodiment, the result fusion module 450 includes:

[0100] a theoretical fusion unit, configured to fuse the prediction results according to D-S evidence theory to obtain the final prediction result.

[0101] In an exemplary embodiment of the present disclosure, there is also provided an electronic device capable of implementing the above method.

[0102] Those skilled in the art will appreciate that aspects of the present application can be implemented as a system, method, or program product. Therefore, various aspects of the present application may be embodied in the following forms: a complete hardware implementation, a complete software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software, which may be collectively referred to herein as a "circuit", "module", or "system".

[0103] An electronic device 500 according to this embodiment of the present application will be described below with reference to FIG. 5. The electronic device 500 shown in FIG. 5 is merely an example and should not impose any limitation on the function and scope of use of the embodiments of the present application.

[0104] As shown in FIG. 5, the electronic device 500 is embodied in the form of a general purpose computing device. The components of the electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one storage unit 520, and a bus 530 connecting different system components (including the storage unit 520 and the processing unit 510).

[0105] The storage unit stores program code that can be executed by the processing unit 510, so that the processing unit 510 performs the steps of the various exemplary embodiments of the present application described in the "Exemplary Method" section of this specification. For example, the processing unit 510 may perform the steps shown in FIG. 1. Step S110: constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features. Step S120: acquiring feature data to be analyzed that is of the same type as the sub-features. Step S130: grouping the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed. Step S140: inputting the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results. Step S150: fusing the prediction results to obtain a final prediction result.
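The data flow of steps S130 to S150 can be sketched as follows, assuming the models of step S110 already exist and the feature data of step S120 has already been extracted. This is a minimal sketch, not the program code of this disclosure: the models are plain callables, the averaging fusion is only a placeholder for whichever fusion rule is used, and all names and values are hypothetical.

```python
from statistics import mean


def group_by_combination(feature_data, combinations):
    """S130: for each feature combination, keep only the matching feature data."""
    return [{name: feature_data[name] for name in combo} for combo in combinations]


def analyze(feature_data, combinations, models, fuse):
    """S130 to S150: group the data, run one model per group, fuse the outputs."""
    groups = group_by_combination(feature_data, combinations)
    predictions = [model(group) for model, group in zip(models, groups)]  # S140
    return fuse(predictions)                                              # S150


def toy_model(group):
    """Placeholder model: scores a group by averaging its feature values."""
    return mean(group.values())


# Hypothetical inputs: one user's feature data and two feature combinations.
feature_data = {"age": 0.2, "income": 0.8, "phone": 0.5}
combinations = [["age", "income"], ["income", "phone"]]
models = [toy_model, toy_model]        # one result prediction model per combination

final_prediction = analyze(feature_data, combinations, models, fuse=mean)
```

Passing `fuse` as a parameter keeps the grouping and prediction steps independent of the fusion rule, so the placeholder can be swapped for another rule without touching the rest of the flow.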

[0106] The storage unit 520 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 5201 and/or a cache storage unit 5202, and may further include a read only storage unit (ROM) 5203.

[0107] The storage unit 520 may further include a program/utility 5204 having a set (at least one) of program modules 5205, such program modules 5205 including but not limited to: an operating system, one or more applications, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.

[0108] The bus 530 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.

[0109] The electronic device 500 can also communicate with one or more external devices 700 (e.g., a keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 500, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 500 to communicate with one or more other computing devices. This communication can take place via an input/output (I/O) interface 550. Also, the electronic device 500 can communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) via the network adapter 560. As shown, the network adapter 560 communicates with other modules of the electronic device 500 via the bus 530. It should be understood that although not shown in the figures, other hardware and/or software modules may be utilized in conjunction with the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

[0110] Through the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to an embodiment of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a mobile hard disk, etc.) or on a network, and which includes a number of instructions to cause a computing device (which may be a personal computer, server, terminal device, or network device, etc.) to perform a method in accordance with an embodiment of the present disclosure.

[0111] In an exemplary embodiment of the present disclosure, there is also provided a computer readable storage medium having stored thereon a program product capable of implementing the above method of the present specification. In some possible implementations, various aspects of the present application can also be implemented in the form of a program product including program code. When the program product is run on a terminal device, the program code causes the terminal device to perform the steps according to various exemplary embodiments of the present application described in the "Exemplary Method" section of the present specification.

[0112] Referring to FIG. 6, a program product 600 for implementing the above method according to an embodiment of the present application is described. It may employ a portable compact disk read only memory (CD-ROM), include program code, and run on a terminal device, such as a personal computer. However, the program product of the present application is not limited thereto, and in this document, the readable storage medium may be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus or device.

[0113] The program product may take any combination of one or more readable media. The readable medium can be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

[0114] A computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing. The readable signal medium can also be any readable medium other than a readable storage medium that can transmit, propagate or transport a program for use by or in connection with the instruction execution system, apparatus or device.

[0115] The program code embodied on the readable medium can be transmitted by any suitable medium, including but not limited to wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.

[0116] Program code for performing the operations of the present application may be written in any combination of one or more programming languages, including object oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device can be connected to the user computing device via any kind of network, including a local area network (LAN) or wide area network (WAN), or can be connected to an external computing device (e.g., via the Internet using an Internet service provider).

Further, the above-described drawings are merely illustrative of the processes included in the method according to the exemplary embodiments of the present application, and are not intended to be limiting. It is easy to understand that the processing shown in the above figures does not indicate or limit the chronological order of these processes. In addition, it is also easy to understand that these processes may be performed synchronously or asynchronously, for example, in a plurality of modules.

Other embodiments of the present disclosure will be apparent to those skilled in the art. The present application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common general knowledge or conventional technical means in the art that are not disclosed in the present disclosure. The specification and examples are to be regarded as illustrative only.

Claims


[Claim 1] A user data authenticity analysis method, comprising:

constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features;

acquiring feature data to be analyzed that is of the same type as the sub-features;

grouping the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;

inputting the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results; and

fusing the prediction results to obtain a final prediction result.

[Claim 2] The user data authenticity analysis method according to claim 1, wherein the user data authenticity analysis method further comprises:

obtaining user feature information, where the user feature information includes a plurality of sub-features; and

forming a plurality of the feature combinations according to the sub-features.

[Claim 3] The user data authenticity analysis method according to any one of claims 1-2, wherein constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features comprises:

obtaining an accuracy rate of the sub-features, and randomly combining the sub-features to form a plurality of the feature combinations according to the accuracy rate of the sub-features; and

performing machine training on the feature combinations to construct the result prediction models.

[Claim 4] The user data authenticity analysis method according to claim 3, wherein obtaining the accuracy of the sub-features comprises:

Performing machine training on the sub-features to construct a single feature model; inputting the sub-features to the single feature model to obtain an accuracy rate of the sub-features.

[Claim 5] The user data authenticity analysis method according to claim 3 or 4, wherein randomly combining the sub-features according to the accuracy rate of the sub-features to form a plurality of the feature combinations comprises: randomly combining the sub-features by the roulette method, according to the accuracy rate of the sub-features, to form the feature combinations.

[Claim 6] The user data authenticity analysis method according to any one of claims 1 to 5, wherein acquiring feature data to be analyzed that is of the same type as the sub-features comprises:

obtaining user data to be analyzed; and

extracting the feature data to be analyzed from the user data to be analyzed according to the type of the sub-features.

[Claim 7] The user data authenticity analysis method according to any one of claims 1 to 6, wherein fusing the prediction results to obtain a final prediction result comprises: fusing the prediction results according to D-S evidence theory to obtain the final prediction result.

[Claim 8] A user data authenticity analyzing device, comprising:

a model building module configured to construct a plurality of result prediction models based on a plurality of feature combinations including sub-features;

a data acquisition module, configured to acquire feature data to be analyzed that is of the same type as the sub-features;

a combination generation module, configured to group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;

a result prediction module, configured to input the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results; and

a result fusion module, configured to fuse the prediction results to obtain a final prediction result.

[Claim 9] The user data authenticity analyzing device according to claim 8, wherein the device further comprises:

a feature information acquiring module, configured to acquire user feature information, where the user feature information includes multiple sub-features;

a feature combination forming module, configured to form a plurality of the feature combinations according to the sub-features.

[Claim 10] The user data authenticity analyzing device according to any one of claims 8-9, wherein the model building module comprises:

a random combining unit, configured to acquire an accuracy rate of the sub-features and to randomly combine the sub-features according to the accuracy rate of the sub-features to form a plurality of the feature combinations; and

a feature training unit, configured to perform machine training on the feature combinations to construct the result prediction models.

[Claim 11] The user data authenticity analyzing apparatus according to claim 10, wherein the random combination unit comprises:

a sub-feature trainer configured to perform machine training on the sub-features to construct a single-feature model;

An accuracy rate acquirer configured to input the sub-feature to the single feature model to obtain an accuracy of the sub-feature.

[Claim 12] The user data authenticity analyzing device according to claim 10 or 11, wherein the random combination unit comprises:

a roulette combiner, configured to randomly combine the sub-features by the roulette method, according to the accuracy rate of the sub-features, to form the feature combinations.

[Claim 13] The user data authenticity analyzing device according to any one of claims 8 to 12, wherein the data acquisition module comprises:

a user data obtaining unit, configured to acquire user data to be analyzed; and

a feature data extracting unit, configured to extract the feature data to be analyzed from the user data to be analyzed according to the type of the sub-features.

[Claim 14] The user data authenticity analyzing device according to any one of claims 8 to 13, wherein the result fusion module comprises:

a theoretical fusion unit, configured to fuse the prediction results according to D-S evidence theory to obtain the final prediction result.

[Claim 15] A computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the following steps:

inputting the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results;

fusing the prediction results to obtain a final prediction result.

[Claim 16] The computer readable storage medium according to claim 15, wherein the steps further comprise: acquiring user feature information, wherein the user feature information includes a plurality of sub-features;

[Claim 17] The computer readable storage medium according to any one of claims 15-16, wherein constructing the plurality of result prediction models according to the plurality of feature combinations including the sub-features comprises: obtaining an accuracy rate of the sub-features, and randomly combining the sub-features to form a plurality of the feature combinations according to the accuracy rate of the sub-features;

[Claim 18] The computer readable storage medium according to claim 17, wherein obtaining the accuracy rate of the sub-features comprises:

[Claim 19] The computer readable storage medium according to claim 17 or 18, wherein randomly combining the sub-features according to the accuracy rate of the sub-features to form a plurality of the feature combinations comprises: randomly combining the sub-features by the roulette method, according to the accuracy rate of the sub-features, to form the feature combinations.

[Claim 20] The computer readable storage medium according to any one of claims 15 to 19, wherein acquiring feature data to be analyzed that is of the same type as the sub-features comprises:

obtaining user data to be analyzed;

[Claim 21] The computer readable storage medium according to any one of claims 15 to 20, wherein fusing the prediction results to obtain a final prediction result comprises: fusing the prediction results according to D-S evidence theory to obtain the final prediction result.

[Claim 22] An electronic device, comprising:

a processor; and

a memory configured to store executable instructions of the processor; wherein the processor is configured to perform the following steps:

fusing the prediction results to obtain a final prediction result.

[Claim 23] The electronic device according to claim 22, wherein the step further comprises:

[Claim 24] The electronic device according to any one of claims 22-23, wherein constructing the plurality of result prediction models according to the plurality of feature combinations including the sub-features comprises: obtaining an accuracy rate of the sub-features, and according to The accuracy of the sub-features randomly combines the sub-features to form a plurality of the feature combinations;

[Claim 25] The electronic device according to claim 24, wherein obtaining the accuracy rate of the sub-features comprises:

[Claim 26] The electronic device according to claim 24 or 25, wherein randomly combining the sub-features according to an accuracy of the sub-feature to form a plurality of the feature combinations comprises:

The sub-features are randomly combined by roulette to form the feature combination based on the accuracy of the sub-features.

[Claim 27] The electronic device according to any one of claims 22 to 26, wherein acquiring feature data to be analyzed that is of the same type as the sub-features comprises: obtaining user data to be analyzed;

[Claim 28] The electronic device according to any one of claims 22-27, wherein fusing the prediction results to obtain a final prediction result comprises:

fusing the prediction results according to D-S evidence theory to obtain the final prediction result.