WO2019200810A1 - User data authenticity analysis method and apparatus, storage medium and electronic device - Google Patents

User data authenticity analysis method and apparatus, storage medium and electronic device

Info

Publication number
WO2019200810A1
WO2019200810A1 PCT/CN2018/103063 CN2018103063W WO2019200810A1 WO 2019200810 A1 WO2019200810 A1 WO 2019200810A1 CN 2018103063 W CN2018103063 W CN 2018103063W WO 2019200810 A1 WO2019200810 A1 WO 2019200810A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
sub
features
analyzed
data
Prior art date
Application number
PCT/CN2018/103063
Other languages
French (fr)
Chinese (zh)
Inventor
叶俊锋
龙觉刚
孙成
赖云辉
罗先贤
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019200810A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/382 Payment protocols; Details thereof insuring higher security of transaction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4014 Identity check for transactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing

Definitions

  • the present disclosure relates to the field of computer technologies, and in particular, to a user data authenticity analysis method, a user data authenticity analysis device, a computer readable storage medium, and an electronic device.
  • An object of the present disclosure is to provide a user data authenticity analysis method, a user data authenticity analysis device, a computer readable storage medium, and an electronic device, thereby at least to some extent overcoming the limitations and defects of the related art.
  • a user data authenticity analysis method including:
  • a user data authenticity analysis apparatus including:
  • a model building module configured to construct a plurality of result prediction models according to a plurality of feature combinations including sub-features
  • a data acquisition module configured to acquire feature data to be analyzed that is of the same type as the sub-features
  • a combination generation module configured to group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;
  • a result prediction module configured to input the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results
  • a result fusion module configured to fuse the prediction results to obtain a final prediction result.
  • a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the user data authenticity analysis method described above.
  • an electronic device including:
  • a memory configured to store executable instructions of the processor
  • The present disclosure provides a user data authenticity analysis method and apparatus that construct a plurality of result prediction models according to feature combinations including sub-features, extract feature data to be analyzed from user data to be analyzed according to the types of the sub-features, and group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed; the feature data combinations to be analyzed are then input into the result prediction models to obtain a plurality of prediction results; finally, the plurality of prediction results are fused to obtain a final prediction result.
  • FIG. 1 is a flow chart schematically showing a user data authenticity analysis method.
  • FIG. 2 is a schematic diagram showing an example of an application scenario of a user data authenticity analysis method.
  • FIG. 3 is a schematic flow chart showing a method of constructing a result prediction model.
  • FIG. 4 is a block diagram schematically showing a user data authenticity analyzing device.
  • FIG. 5 is a block diagram showing an exemplary electronic device for implementing the above-described user data authenticity analysis method.
  • FIG. 6 schematically illustrates a computer readable storage medium for implementing the above-described user data authenticity analysis method.
  • a user data authenticity analysis method is first provided, and the user data authenticity analysis method may be run on a server, or may be run on a server cluster or a cloud server.
  • the method of the present application can also be run on other platforms according to requirements, and is not specifically limited in this exemplary embodiment.
  • The user data authenticity analysis method may include the following steps:
  • Step S110 Construct a plurality of result prediction models according to a plurality of feature combinations including sub-features
  • Step S120 Acquire feature data to be analyzed that is of the same type as the sub-features
  • Step S130 Group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;
  • Step S140 Input the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results
  • Step S150 Fuse the prediction results to obtain a final prediction result.
  • In this method, the data samples are formed into a plurality of different feature combinations, a corresponding model is trained for each combination, a plurality of prediction results are obtained by analyzing the feature data to be analyzed with these models, and finally the prediction results are fused to obtain the final prediction result.
  • On the one hand, the sample data can be fully utilized, and the method remains applicable when samples are insufficient; on the other hand, over-fitting is avoided when the prediction results are fused, and the prediction precision and the accuracy of fraud identification are improved.
  • step S110 a plurality of result prediction models are constructed according to a plurality of feature combinations including sub-features.
  • FIG. 3 shows a schematic diagram of a method for constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features. As shown in FIG. 3, the method for constructing a result prediction model is as follows:
  • Step S301 Acquire user feature information, where the user feature information includes multiple sub-features
  • Step S302 Perform machine training on the sub-features to construct a single feature model
  • Step S303 input the sub-feature into the single feature model, and obtain an accuracy rate of the sub-feature;
  • Step S304 Randomly combine the sub-features according to an accuracy rate of the sub-features to form the feature combination;
  • Step S305 Perform machine training on the feature combination to construct the result prediction model.
  • step S301 user feature information is acquired, where the user feature information includes a plurality of sub-features.
  • The server 201 receives the user data sent by the mobile terminal 202, selects some or all of the data as the user data sample, and obtains the user feature information from the user data sample, the user feature information including the plurality of sub-features.
  • The user feature information may include behavior data, attribute data, and spatial data. The behavior data may include one or more of parameters such as page browsing time, number of page clicks, and page click frequency; the attribute data includes one or more of age, gender, ID number, driver's license number, contact information, etc.; the spatial data includes one or more of the device model, IP address, current location, and the like.
  • the sub-features may be selected from the above-mentioned user characteristic information, such as taking age as a sub-feature and gender as a sub-feature, which is not specifically limited in the present disclosure.
  • step S302 the sub-features are machine trained to construct a single feature model.
  • Each sub-feature may be machine trained to construct a single feature model, and the sub-features are input to the single feature model to generate a prediction result, from which the accuracy of each sub-feature is obtained.
  • For example, the sub-features may include age, gender, current address, page browsing time, and page click frequency.
  • Sample data of the sub-features can be learned by a linear regression algorithm, logistic regression algorithm, decision tree, naive Bayes algorithm, random forest algorithm, etc., and machine training is repeated until a single feature model with minimal loss is formed.
  • step S303 the sub-feature is input to the single feature model to obtain an accuracy of the sub-feature.
  • the sub-features may be input to the single feature model, and the authenticity of the sub-features is predicted, thereby obtaining the accuracy of the plurality of sub-features.
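Steps S302 and S303 can be sketched as follows. This is only an illustrative sketch: the disclosure names algorithms such as logistic regression or decision trees, whereas the toy model here (a majority-label lookup table) is a stand-in chosen so the example needs no external library. The function names, the sample records, the label encoding, and the choice to score each model on its own training samples are all assumptions, not details taken from the disclosure.

```python
from collections import Counter, defaultdict


def train_single_feature_model(values, labels):
    """Toy single feature model: map each observed value to its majority label."""
    by_value = defaultdict(Counter)
    for value, label in zip(values, labels):
        by_value[value][label] += 1
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen values
    table = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
    return lambda value: table.get(value, default)


def sub_feature_accuracy(samples, labels, sub_features):
    """Train one model per sub-feature and return its accuracy on the samples."""
    accuracy = {}
    for name in sub_features:
        values = [sample[name] for sample in samples]
        model = train_single_feature_model(values, labels)
        correct = sum(model(v) == y for v, y in zip(values, labels))
        accuracy[name] = correct / len(labels)
    return accuracy


# Hypothetical samples; label 1 = fraudulent, 0 = genuine (assumed encoding).
samples = [
    {"gender": "F", "age_band": "18-30"},
    {"gender": "M", "age_band": "18-30"},
    {"gender": "F", "age_band": "31-50"},
    {"gender": "M", "age_band": "31-50"},
]
labels = [1, 1, 0, 0]
accuracy = sub_feature_accuracy(samples, labels, ["gender", "age_band"])
```

In this toy data the age band separates the two classes while gender carries no signal, so the per-sub-feature accuracies differ and can later serve as weights when forming feature combinations.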
  • step S304 the sub-features are randomly combined to form the feature combination according to the accuracy of the sub-features.
  • all the sub-features may be randomly combined according to the accuracy of the sub-features to form a plurality of feature combinations.
  • For example, the server 201 can randomly extract any three of the sub-features age, gender, current address, page browsing time, and page click frequency to form a feature combination, such as {age, gender, current address} or {age, current address, page browsing time}. It is of course also possible to select any number of the sub-features to form different feature combinations, which will not be described in detail in the present disclosure.
  • In this way, each piece of data can generate a corresponding model and a corresponding predicted value, so the data samples are fully utilized and the prediction accuracy is improved.
  • All sub-features may be randomly combined by the roulette method to form multiple feature combinations.
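A minimal sketch of roulette-wheel combination is given below. The disclosure does not spell out how the accuracy enters the roulette method, so this sketch assumes that each sub-feature's accuracy is used directly as its selection weight and that sub-features are drawn without replacement within one combination. The function names, the `size`, `count`, and `seed` parameters, and the accuracy values are hypothetical.

```python
import random


def roulette_pick(weights, rng):
    """Pick one key with probability proportional to its weight."""
    total = sum(weights.values())
    threshold = rng.uniform(0, total)
    cumulative = 0.0
    for name, weight in weights.items():
        cumulative += weight
        if threshold <= cumulative:
            return name
    return name  # guard against floating-point round-off


def roulette_combinations(accuracy, size, count, seed=0):
    """Form `count` feature combinations of `size` distinct sub-features each."""
    if size > len(accuracy):
        raise ValueError("size exceeds the number of sub-features")
    rng = random.Random(seed)
    combinations = []
    for _ in range(count):
        remaining = dict(accuracy)  # draw without replacement per combination
        combo = []
        for _ in range(size):
            pick = roulette_pick(remaining, rng)
            combo.append(pick)
            del remaining[pick]
        combinations.append(tuple(sorted(combo)))
    return combinations


# Hypothetical accuracies for the sub-features named in the text.
accuracy = {
    "age": 0.62,
    "gender": 0.55,
    "current_address": 0.71,
    "page_browsing_time": 0.80,
    "page_click_frequency": 0.77,
}
combos = roulette_combinations(accuracy, size=3, count=4, seed=42)
```

With this weighting, sub-features with higher accuracy appear in more combinations on average, while low-accuracy sub-features are still sampled occasionally. The same combination may be drawn more than once; deduplication is left out of the sketch.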
  • step S305 machine training is performed on the feature combination to construct the result prediction model.
  • machine training is performed on each feature combination formed by random combination, and a plurality of result prediction models are constructed.
  • The data in each feature combination can be machine trained by a learning algorithm such as a linear regression algorithm, a logistic regression algorithm, a decision tree, a naive Bayes algorithm, or a random forest algorithm to obtain a plurality of result prediction models, one corresponding to each feature combination.
  • The result prediction models are used for subsequent data analysis to improve the accuracy of fraud identification.
  • the algorithm used to form the result prediction model may be the same as or different from the algorithm used to form the single feature model, which is not specifically limited in the present disclosure.
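Step S305 can be illustrated with one model per feature combination. The sketch below is not one of the learning algorithms listed in the disclosure: it substitutes a simple frequency estimate (the observed fraud rate among training samples sharing the same values) so that it runs without external libraries. All names, the sample data, and the fallback to the overall fraud rate for unseen value tuples are assumptions.

```python
from collections import defaultdict


def train_result_model(samples, labels, combo):
    """Toy result prediction model for one feature combination."""
    counts = defaultdict(lambda: [0, 0])  # key -> [fraud count, total count]
    for sample, label in zip(samples, labels):
        key = tuple(sample[name] for name in combo)
        counts[key][0] += label
        counts[key][1] += 1
    base_rate = sum(labels) / len(labels)

    def predict(record):
        """Return an estimated fraud probability for one record."""
        key = tuple(record[name] for name in combo)
        if key not in counts:
            return base_rate  # unseen value tuple: fall back to overall rate
        fraud, total = counts[key]
        return fraud / total

    return predict


def train_all(samples, labels, combos):
    """Build one result prediction model per feature combination."""
    return {combo: train_result_model(samples, labels, combo) for combo in combos}


# Hypothetical samples; label 1 = fraudulent, 0 = genuine (assumed encoding).
samples = [
    {"gender": "F", "age_band": "18-30"},
    {"gender": "M", "age_band": "18-30"},
    {"gender": "F", "age_band": "31-50"},
    {"gender": "M", "age_band": "31-50"},
]
labels = [1, 1, 0, 0]
models = train_all(samples, labels, [("gender", "age_band"), ("age_band",)])
```

Each entry of `models` is keyed by its feature combination, which makes it straightforward to route a feature data combination to the model trained on the same sub-features.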
  • In step S120, feature data to be analyzed that is of the same type as the sub-features is acquired.
  • the server 201 receives the user data to be analyzed sent by the mobile terminal 202, and extracts data of the corresponding type from the user data to be analyzed according to the type of the sub-feature to form the feature data to be analyzed.
  • For example, if the sub-features in step S110 are age, gender, and current address, all age information, gender information, and current address information may be extracted from the user data to be analyzed as the feature data to be analyzed. It is worth noting that as the sub-feature types change, the feature data to be analyzed also changes accordingly.
  • step S130 the feature data to be analyzed is grouped according to the feature combination to form a plurality of feature data combinations to be analyzed.
  • the feature data to be analyzed acquired in step S120 is grouped according to the feature combination in step S110 to form a plurality of feature data combinations to be analyzed.
  • For example, if the feature combination is {age, gender, page browsing time}, the user's age information, gender information, and page browsing time information are extracted from the feature data to be analyzed and combined to form a feature data combination to be analyzed.
  • The specific form may be {age to be analyzed, gender to be analyzed, page browsing time to be analyzed}.
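The grouping in step S130 amounts to projecting the feature data to be analyzed onto each feature combination. A minimal sketch, assuming the feature data is held as a dictionary keyed by sub-feature name (the function name, field names, and values are hypothetical):

```python
def group_feature_data(feature_data, feature_combinations):
    """Form one feature data combination per feature combination."""
    grouped = {}
    for combo in feature_combinations:
        grouped[tuple(combo)] = {name: feature_data[name] for name in combo}
    return grouped


# Hypothetical feature data to be analyzed for one user.
feature_data = {
    "age": 34,
    "gender": "F",
    "current_address": "Shenzhen",
    "page_browsing_time": 12.5,
}
combos = [("age", "gender", "page_browsing_time"), ("age", "current_address")]
grouped = group_feature_data(feature_data, combos)
```

Keying the result by the combination keeps each feature data combination associated with the feature combination it was built from.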
  • In step S140, the feature data combinations to be analyzed are input to the result prediction models to obtain a plurality of prediction results.
  • The plurality of feature data combinations to be analyzed obtained in step S130 are separately input to the result prediction models obtained in step S110, so as to make a prediction for each feature data combination to be analyzed and obtain multiple prediction results.
  • For example, the mobile terminal 202 collects M (M is a positive integer) pieces of related information of the user; correspondingly, the server 201 forms M groups of feature data combinations to be analyzed, and then sequentially inputs the M groups of feature data combinations to be analyzed to the result prediction model for prediction, obtaining M prediction results.
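Step S140 can be sketched as a loop over the feature data combinations. The sketch assumes one result prediction model per feature combination, each exposed as a callable that returns a fraud probability; the two lambda models and their cut-off values are invented purely for illustration and do not come from the disclosure.

```python
def predict_all(models, grouped_data):
    """Feed each feature data combination to the model for the same combination."""
    results = []
    for combo, data in grouped_data.items():
        results.append(models[combo](data))
    return results


# Hypothetical stand-in models returning a fraud probability.
models = {
    ("age", "gender"): lambda d: 0.02 if d["age"] >= 25 else 0.10,
    ("age", "page_browsing_time"): lambda d: 0.08 if d["page_browsing_time"] < 5 else 0.01,
}
grouped = {
    ("age", "gender"): {"age": 34, "gender": "F"},
    ("age", "page_browsing_time"): {"age": 34, "page_browsing_time": 3},
}
predictions = predict_all(models, grouped)
```

With M feature data combinations the returned list holds M prediction results, which are the inputs to the fusion step.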
  • step S150 a plurality of the prediction results are fused to obtain a final prediction result.
  • the M prediction results may be fused to obtain the final prediction result.
  • A plurality of prediction results may be fused by a data fusion method such as Bayesian inference, voting, D-S (Dempster-Shafer) evidence theory, or a neural network fusion method to obtain a final prediction result.
  • D-S evidence theory has a strong ability to process uncertain information and does not need a priori information. It describes uncertain information by “interval estimation” rather than “point estimation”, which addresses the representation of the “unknown”, gives great flexibility in distinguishing between ignorance and uncertainty, and accurately reflects the collection of evidence. The D-S evidence theory fusion framework also supports extension with an unlimited number of models. D-S evidence theory is therefore preferred in this disclosure as the fusion framework for fusing multiple prediction results to obtain the final prediction result.
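The fusion step can be illustrated with Dempster's rule of combination over the two-hypothesis frame {fraud, genuine}. The disclosure does not say how a model's prediction is converted into a mass function, so the sketch assumes that a fixed share of mass (`ignorance`, a hypothetical parameter) is assigned to the whole frame ("unknown") and the rest is split according to the predicted fraud probability. Function names and the example predictions are likewise assumptions.

```python
from functools import reduce


def to_mass(fraud_probability, ignorance=0.1):
    """Turn one prediction into a mass function (assumed conversion)."""
    committed = 1.0 - ignorance
    return {
        "fraud": fraud_probability * committed,
        "genuine": (1.0 - fraud_probability) * committed,
        "unknown": ignorance,  # mass on the whole frame {fraud, genuine}
    }


def combine(m1, m2):
    """Dempster's rule of combination for the frame {fraud, genuine}."""
    conflict = m1["fraud"] * m2["genuine"] + m1["genuine"] * m2["fraud"]
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    norm = 1.0 - conflict
    fused = {}
    for h in ("fraud", "genuine"):
        fused[h] = (m1[h] * m2[h] + m1[h] * m2["unknown"] + m1["unknown"] * m2[h]) / norm
    fused["unknown"] = m1["unknown"] * m2["unknown"] / norm
    return fused


def fuse(predictions, ignorance=0.1):
    """Fuse a non-empty list of fraud probabilities into one mass function."""
    return reduce(combine, (to_mass(p, ignorance) for p in predictions))


fused = fuse([0.8, 0.7, 0.9])  # three hypothetical prediction results
```

Because Dempster's rule is associative, additional prediction results can be folded in one at a time, which matches the text's remark that the framework can be extended with further models. The fused mass on "fraud" can then be read as the final fraud probability, with the residual "unknown" mass indicating how much evidence remains uncommitted.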
  • the authenticity of the user data to be analyzed may be determined.
  • the final prediction result may be embodied in the form of fraud probability.
  • For example, when the final prediction result is higher than 0.05, it may be determined that the user data to be analyzed is untrue and belongs to a fraudulent user; conversely, when the final prediction result is not higher than 0.05, it can be determined that the user data to be analyzed is true and belongs to a non-fraudulent user. That is, the lower the fraud probability, the higher the legitimacy of the user and the authenticity of the user data.
  • those skilled in the art can also set other fraud probability according to actual conditions to judge the legitimacy of the user.
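The decision rule described above is a simple threshold on the fused fraud probability. A minimal sketch, using the 0.05 example value from the text as a configurable default (the function and constant names are hypothetical):

```python
FRAUD_THRESHOLD = 0.05  # example value from the text; adjustable in practice


def is_fraudulent(fraud_probability, threshold=FRAUD_THRESHOLD):
    """Return True when the final prediction result exceeds the threshold."""
    return fraud_probability > threshold


decisions = [is_fraudulent(p) for p in (0.01, 0.05, 0.06)]
```

A probability exactly equal to the threshold is treated as non-fraudulent, matching the "not higher than 0.05" wording.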
  • On the one hand, the user data authenticity analysis method of the present disclosure makes full use of the sample data, avoids over-fitting by using D-S evidence theory as the fusion framework for the prediction results, and improves the prediction precision and the accuracy of fraud identification; on the other hand, the method is also applicable when the number of samples is insufficient, making fraud identification easier.
  • The user data authenticity analysis method of the present disclosure can be used in scenarios such as policy surrender applications and car insurance claims to determine the legitimacy of the surrender applicant or the auto insurance claimant and of their requests, preventing hackers and other lawbreakers from obtaining benefits by improper means and causing losses to insurance institutions.
  • the following describes the authenticity analysis method of the user data of the present disclosure by taking the prediction of the legality of the auto insurance claimant and its request as an example.
  • The raw data of auto insurance cases is taken as a data sample, which contains multiple sub-features, such as name, gender, policy start and end dates, insured amount, accident time, road segment, vehicle brand, vehicle value, claim amount, etc. Each sub-feature is trained to establish a single feature model, and the single feature model is used to make predictions on the sub-features to obtain the accuracy of the sub-features. All the sub-features are then randomly combined according to their accuracy to form a plurality of feature combinations, for example {name, gender, accident time, claim amount}. The data in each feature combination is trained to construct a plurality of result prediction models. The user data to be analyzed is then acquired, and feature data to be analyzed of the same type as the sub-features is extracted from it. The extracted feature data to be analyzed is grouped according to the feature combinations to form a plurality of feature data combinations to be analyzed, which are then input into the trained result prediction models.
  • the present disclosure also provides a user data authenticity analysis device.
  • The user data authenticity analysis apparatus may include a model construction module 410, a data acquisition module 420, a combination generation module 430, a result prediction module 440, and a result fusion module 450. Specifically:
  • the model construction module 410 is configured to construct a plurality of result prediction models according to the plurality of feature combinations including the sub-features;
  • the data acquisition module 420 is configured to acquire feature data to be analyzed that is of the same type as the sub-features
  • the combination generation module 430 is configured to group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;
  • the result prediction module 440 is configured to input the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results;
  • the result fusion module 450 is configured to fuse the prediction results to obtain a final prediction result.
  • The technical solution according to an embodiment of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a mobile hard disk, etc.) or on a network, and which includes a number of instructions to cause a computing device (which may be a personal computer, server, mobile terminal, or network device, etc.) to perform a method in accordance with an embodiment of the present disclosure.
  • the user data authenticity analysis device further includes:
  • a feature information acquiring module configured to acquire user feature information, where the user feature information includes multiple sub-features
  • a feature combination forming module configured to form a plurality of the feature combinations according to the sub-features.
  • the model building module 410 includes:
  • a random combination unit configured to acquire an accuracy rate of the sub-features, and randomly combine the sub-features according to an accuracy rate of the sub-features to form a plurality of the feature combinations
  • a feature training unit configured to perform machine training on the feature combination to construct the result prediction model.
  • the foregoing random combination unit includes:
  • a sub-feature trainer configured to perform machine training on the sub-features to construct a single feature model
  • an accuracy rate acquirer configured to input the sub-feature to the single feature model to obtain an accuracy rate of the sub-feature.
  • the foregoing random combination unit includes:
  • a roulette combiner is configured to randomly combine the sub-features to form the feature combination by roulette according to an accuracy of the sub-features.
  • the foregoing data acquisition module 420 includes:
  • a user data obtaining unit configured to acquire user data to be analyzed
  • a feature data extraction unit configured to extract the feature data to be analyzed from the user data to be analyzed according to the type of the sub-feature.
  • the result fusion module 450 includes:
  • a theoretical fusion unit configured to fuse the prediction results according to the D-S evidence theory to obtain the final prediction result.
  • An electronic device 500 according to this embodiment of the present application will be described below with reference to FIG. 5.
  • the electronic device 500 shown in Fig. 5 is merely an example and should not impose any limitation on the function and scope of use of the embodiments of the present application.
  • the electronic device 500 is embodied in the form of a general purpose computing device.
  • The components of the electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one storage unit 520, and a bus 530 connecting different system components (including the storage unit 520 and the processing unit 510).
  • The storage unit stores program code that may be executed by the processing unit 510, so that the processing unit 510 performs the steps according to various exemplary embodiments of the present application described in the “Exemplary Method” section of the present specification.
  • For example, the processing unit 510 may perform the steps shown in FIG. 1:
  • Step S110 constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features
  • Step S120 acquiring feature data to be analyzed that is of the same type as the sub-features
  • Step S130 grouping the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed
  • Step S140 inputting the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results
  • Step S150 fusing the prediction results to obtain a final prediction result.
  • The storage unit 520 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 5201 and/or a cache storage unit 5202, and may further include a read only storage unit (ROM) 5203.
  • The storage unit 520 may further include a program/utility 5204 having a set (at least one) of program modules 5205, such program modules 5205 including but not limited to: an operating system, one or more applications, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
  • The bus 530 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
  • The electronic device 500 can also communicate with one or more external devices 700 (e.g., a keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 500, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 500 to communicate with one or more other computing devices. This communication can take place via an input/output (I/O) interface 550. Also, the electronic device 500 can communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) via the network adapter 560.
  • network adapter 560 communicates with other modules of electronic device 500 via bus 530.
  • Other hardware and/or software modules may be utilized in conjunction with the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, etc.
  • a computer readable storage medium having stored thereon a program product capable of implementing the above method of the present specification.
  • Various aspects of the present application can also be implemented in the form of a program product including program code; when the program product is run on a terminal device, the program code is used to cause the terminal device to perform the steps according to various exemplary embodiments of the present application described in the "Exemplary Method" section of the present specification.
  • A program product 600 for implementing the above method, described in accordance with an embodiment of the present application, may employ a portable compact disk read only memory (CD-ROM), include program code, and be run on a terminal device, such as a personal computer.
  • the program product of the present application is not limited thereto, and in this document, the readable storage medium may be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus or device.
  • the program product may take any combination of one or more readable mediums.
  • the readable medium can be a readable signal medium or a readable storage medium.
  • The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium can also be any readable medium other than a readable storage medium that can transmit, propagate or transport a program for use by or in connection with the instruction execution system, apparatus or device.
  • the program code embodied on the readable medium can be transmitted by any suitable medium, including but not limited to wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present application may be written in any combination of one or more programming languages, including object oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code can be executed entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • The remote computing device can be connected to the user computing device via any kind of network, including a local area network (LAN) or wide area network (WAN), or can be connected to an external computing device (e.g., via the Internet using an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Technology Law (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure relates to the technical field of computers, and relates to a user data authenticity analysis method and apparatus, a computer-readable storage medium and an electronic device, the user data authenticity analysis method comprising: building multiple result prediction models according to multiple feature combinations comprising sub-features; obtaining feature data to be analyzed which is of the same type as the sub-features; grouping the feature data to be analyzed according to the feature combinations to form multiple combinations of feature data to be analyzed; inputting the combinations of feature data to be analyzed into the result prediction models, thereby obtaining multiple prediction results; and fusing the prediction results to obtain a final prediction result. The present method improves the prediction precision and accuracy in fraud identification.

Description

Title of Invention: User data authenticity analysis method and apparatus, storage medium, and electronic device
Technical Field
[0001] The present application claims priority to Chinese Patent Application No. 201810359102.0, filed on April 20, 2018 and entitled "User data authenticity analysis method and apparatus, storage medium, and electronic device", the entire contents of which are incorporated herein by reference.
[0002] The present disclosure relates to the field of computer technologies, and in particular to a user data authenticity analysis method, a user data authenticity analysis apparatus, a computer-readable storage medium, and an electronic device.
Background
[0003] With the rapid development of communication technologies, the Internet has gradually become an everyday tool, commonly used as a channel for communicating and conducting business transactions with customers, vendors, employees, and shareholders. In theory, transacting over the Internet is efficient and cost-saving, but it also has serious drawbacks: hackers, identity theft, stolen credit cards, and other fraudulent behavior threaten the security of users' funds and are difficult to manage.
[0004] Effectively identifying fraudulent users is a key technical problem. The inventors have realized that the prior art typically uses multiple trained models to judge the authenticity of user feature data and output prediction results, and then fuses the prediction results of the multiple models by a data fusion method. However, because similar models use similar algorithms, their predictions for the same user feature data may agree, so a self-reinforcing effect readily arises when the prediction results are fused. In addition, if one model's prediction is wrong, the predictions of the other similar models are also wrong, and the fused result is then necessarily wrong as well, which impairs effective identification of fraudulent users.
[0005] Therefore, a new user data authenticity analysis method and apparatus are needed for identifying fraudulent users.
[0006] It should be noted that the information disclosed in the Background section above is intended only to enhance understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of Invention
Technical Problem
[0007] An object of the present disclosure is to provide a user data authenticity analysis method, a user data authenticity analysis apparatus, a computer-readable storage medium, and an electronic device, thereby overcoming, at least to some extent, the economic fraud that results from inaccurate prediction results caused by the limitations and defects of the related art.
Solution to Problem
Technical Solution
[0008] According to one aspect of the present disclosure, a user data authenticity analysis method is provided, including:
[0009] constructing a plurality of result prediction models according to a plurality of feature combinations that contain sub-features;
[0010] acquiring feature data to be analyzed that is of the same type as the sub-features;
[0011] grouping the feature data to be analyzed according to the feature combinations to form a plurality of combinations of feature data to be analyzed;
[0012] inputting the combinations of feature data to be analyzed into the result prediction models to obtain a plurality of prediction results; and
[0013] fusing the prediction results to obtain a final prediction result.
[0014] According to one aspect of the present disclosure, a user data authenticity analysis apparatus is provided, including:
[0015] a model construction module configured to construct a plurality of result prediction models according to a plurality of feature combinations that contain sub-features;
[0016] a data acquisition module configured to acquire feature data to be analyzed that is of the same type as the sub-features;
[0017] a combination generation module configured to group the feature data to be analyzed according to the feature combinations to form a plurality of combinations of feature data to be analyzed;
[0018] a result prediction module configured to input the combinations of feature data to be analyzed into the result prediction models to obtain a plurality of prediction results; and
[0019] a result fusion module configured to fuse the prediction results to obtain a final prediction result.
[0020] According to one aspect of the present disclosure, a computer-readable storage medium is provided, having a computer program stored thereon, where the computer program, when executed by a processor, implements the user data authenticity analysis method described above.
[0021] According to one aspect of the present disclosure, an electronic device is provided, including:
[0022] a processor; and
[0023] a memory configured to store instructions executable by the processor;
[0024] wherein the processor is configured to perform the user data authenticity analysis method described above by executing the executable instructions.
[0025] In the user data authenticity analysis method and apparatus of the present disclosure, a plurality of result prediction models are constructed according to feature combinations that contain sub-features; feature data to be analyzed is extracted from the user data to be analyzed according to the types of the sub-features and is formed into a plurality of combinations of feature data to be analyzed according to the feature combinations; the combinations are then input into the result prediction models to obtain a plurality of prediction results; and finally the prediction results are fused to obtain a final prediction result. On the one hand, forming the user data samples into a plurality of different feature combinations, training a corresponding model for each combination, analyzing the combinations of feature data to be analyzed to obtain a plurality of prediction results, and fusing those results into a final prediction result makes full use of the sample data and improves prediction precision. On the other hand, overfitting during fusion of the prediction results is avoided, which improves the accuracy of fraud identification.
[0026] It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Advantageous Effects of Invention
Brief Description of the Drawings
Description of the Drawings
[0027] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure. Evidently, the drawings described below are merely some embodiments of the present disclosure, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
[0028] FIG. 1 schematically shows a flowchart of a user data authenticity analysis method.
[0029] FIG. 2 schematically shows an example application scenario of a user data authenticity analysis method.
[0030] FIG. 3 schematically shows a flowchart of a method for constructing result prediction models.
[0031] FIG. 4 schematically shows a block diagram of a user data authenticity analysis apparatus.
[0032] FIG. 5 schematically shows an example block diagram of an electronic device for implementing the above user data authenticity analysis method.
[0033] FIG. 6 schematically shows a computer-readable storage medium for implementing the above user data authenticity analysis method.
Embodiments of the Invention
Modes for Carrying Out the Invention
[0034] Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced with one or more of the specific details omitted, or with other methods, components, apparatuses, steps, and so on. In other instances, well-known technical solutions are not shown or described in detail, so as not to distract from and obscure aspects of the present disclosure.
[0035] In addition, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and repeated description of them is therefore omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
[0036] In the related art, fraud detection on users usually requires binary classification: a corresponding model predicts the authenticity of each piece of data and outputs a prediction result, and the prediction results output by multiple models are then fused. However, similar models use similar algorithms, so their predictions for the same user's feature data may agree; when the outputs of multiple models are fused, a self-reinforcing effect readily arises, making the prediction inaccurate and leaving the model produced by the fuser with poor generalization ability. Moreover, when the prediction output by one model is wrong, the predictions of other similar models are also wrong, so the fused result is necessarily wrong as well, which impairs effective identification of fraudulent users.
[0037] In view of the problems in the related art, this example embodiment first provides a user data authenticity analysis method. The method may run on a server, a server cluster, a cloud server, or the like; of course, those skilled in the art may also run the method of the present application on other platforms as needed, which is not specifically limited in this exemplary embodiment. Referring to FIG. 1, the user data authenticity analysis method may include the following steps:
[0038] Step S110. Construct a plurality of result prediction models according to a plurality of feature combinations that contain sub-features;
[0039] Step S120. Acquire feature data to be analyzed that is of the same type as the sub-features;
[0040] Step S130. Group the feature data to be analyzed according to the feature combinations to form a plurality of combinations of feature data to be analyzed;
[0041] Step S140. Input the combinations of feature data to be analyzed into the result prediction models to obtain a plurality of prediction results;
[0042] Step S150. Fuse the prediction results to obtain a final prediction result.
[0043] In the above user data authenticity analysis method, the data samples are formed into a plurality of different feature combinations, a corresponding model is trained for each combination, the combinations of feature data to be analyzed are analyzed to obtain a plurality of prediction results, and the prediction results are finally fused to obtain a final prediction result. On the one hand, this makes full use of the sample data and remains applicable when samples are scarce; on the other hand, it avoids overfitting when the prediction results are fused, improving the precision and accuracy of fraud identification.
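As a rough illustration of how steps S120 to S150 fit together once the models from step S110 exist, the sketch below treats each result prediction model as a plain Python callable keyed by its feature combination. The function name `analyze_user`, the field names, the hand-written toy models, and the averaging fuser are all assumptions made for illustration; the disclosure leaves the learning algorithm open and names D-S evidence theory as its preferred fusion method.

```python
from statistics import mean

def analyze_user(record, models, fuse=mean):
    """Run one user's record through every result prediction model and fuse the outputs.

    record: dict mapping sub-feature name -> value (feature data to be analyzed, S120)
    models: dict mapping a feature combination (tuple of names) -> callable that
            returns a fraud probability for a dict restricted to that combination
    fuse:   callable that merges the list of predictions (plain averaging is a stand-in)
    """
    predictions = []
    for combination, model in models.items():
        # S130: keep only the sub-features that belong to this feature combination.
        grouped = {name: record[name] for name in combination}
        # S140: each model sees only its own combination.
        predictions.append(model(grouped))
    # S150: fuse the per-model predictions into the final prediction result.
    return fuse(predictions)

# Hypothetical hand-written "models" standing in for trained ones.
toy_models = {
    ("age", "gender"): lambda x: 0.10 if x["age"] >= 25 else 0.30,
    ("age", "page_view_seconds"): lambda x: 0.50 if x["page_view_seconds"] < 5 else 0.02,
}
final_score = analyze_user(
    {"age": 31, "gender": "F", "page_view_seconds": 2}, toy_models
)
```

Passing a different `fuse` callable swaps the fusion method without touching the grouping or prediction steps.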
[0044] Each step of the above user data authenticity analysis method in this example embodiment is explained and described in detail below with reference to FIG. 2.
[0045] In step S110, a plurality of result prediction models are constructed according to a plurality of feature combinations that contain sub-features.
[0046] FIG. 3 shows a schematic diagram of a method for constructing a plurality of result prediction models according to a plurality of feature combinations that contain sub-features. As shown in FIG. 3, the method for constructing the result prediction models is as follows:
[0047] Step S301. Acquire user feature information, the user feature information including a plurality of sub-features;
[0048] Step S302. Perform machine training on the sub-features to construct a single-feature model;
[0049] Step S303. Input the sub-features into the single-feature model to obtain the accuracy of the sub-features;
[0050] Step S304. Randomly combine the sub-features into the feature combinations according to the accuracy of the sub-features;
[0051] Step S305. Perform machine training on the feature combinations to construct the result prediction models.
[0052] The method for constructing the result prediction models is described in detail below:
[0053] In step S301, user feature information is acquired, the user feature information including a plurality of sub-features.
[0054] In this example embodiment, the server 201 receives user data sent by the mobile terminal 202, selects part or all of that data as user data samples, and further obtains user feature information from the user data samples, the user feature information including a plurality of sub-features. The user feature information may include behavior data, attribute data, and spatial data. The behavior data may include one or more of page browsing time, number of page clicks, page click frequency, and similar parameters; the attribute data includes one or more of age, gender, identity card number, driver's license number, contact information, and similar parameters; the spatial data includes one or more of device model, IP address, current location, and similar parameters. The sub-features may be selected from the above user feature information, for example taking age as one sub-feature and gender as another, which is not specifically limited in the present disclosure.
[0055] In step S302, machine training is performed on the sub-features to construct a single-feature model.
[0056] In this example embodiment, machine training may be performed on each sub-feature to construct a single-feature model, and the sub-feature may be input into that single-feature model to produce a prediction result, yielding the accuracy of each sub-feature. For example, where the sub-features include age, gender, current address, page browsing time, and page click frequency, the sample data of each sub-feature may be trained on repeatedly using a learning algorithm such as linear regression, logistic regression, a decision tree, naive Bayes, or a random forest, until a single-feature model with minimum loss is formed.
[0057] In step S303, the sub-features are input into the single-feature model to obtain the accuracy of the sub-features.
[0058] In this example embodiment, after the single-feature models are constructed from the plurality of sub-features, the sub-features may be input into the single-feature models to predict the authenticity of the sub-features, thereby obtaining the accuracy of each of the sub-features.
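A minimal sketch of steps S302 and S303 for one numeric sub-feature, assuming a one-level decision tree (a decision stump) as the single-feature model; the disclosure lists decision trees among several admissible algorithms and does not fix one. The function name, the toy data, and the label convention (1 for fraudulent, 0 for genuine) are hypothetical. Accuracy is measured here on the training samples for brevity; a held-out set would give a more trustworthy figure.

```python
def fit_single_feature_stump(values, labels):
    """Fit a decision stump on one numeric sub-feature.

    Returns (threshold, polarity, accuracy). The stump predicts 1 (fraud)
    when polarity * (value - threshold) >= 0 and 0 otherwise.
    """
    if not values or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and of equal length")
    best_threshold, best_polarity, best_accuracy = None, 1, -1.0
    for threshold in sorted(set(values)):
        for polarity in (1, -1):
            predictions = [1 if polarity * (v - threshold) >= 0 else 0 for v in values]
            correct = sum(p == y for p, y in zip(predictions, labels))
            accuracy = correct / len(labels)
            # Keep the first split that strictly improves training accuracy.
            if accuracy > best_accuracy:
                best_threshold, best_polarity, best_accuracy = threshold, polarity, accuracy
    return best_threshold, best_polarity, best_accuracy

# Toy samples: claim amounts (in thousands) with made-up fraud labels.
claim_amounts = [1, 2, 3, 10, 11, 12]
fraud_labels = [0, 0, 0, 1, 1, 1]
threshold, polarity, accuracy = fit_single_feature_stump(claim_amounts, fraud_labels)
```

Running the same function once per sub-feature would give one accuracy value per sub-feature, which is the quantity step S304 consumes.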
[0059] In step S304, the sub-features are randomly combined into the feature combinations according to the accuracy of the sub-features.
[0060] In this example embodiment, once the accuracy of the sub-features is obtained, all the sub-features may be randomly combined according to their accuracy to form a plurality of feature combinations. For example, the server 201 may randomly draw any three sub-features from age, gender, current address, page browsing time, and page click frequency to form a feature combination, such as {age, gender, current address} as one feature combination, {age, current address, page browsing time} as another, and so on. Of course, any number of sub-features may be selected from all the sub-features to form different feature combinations, which is not elaborated further in the present disclosure. Forming feature combinations by random combination allows every piece of data to produce a corresponding model and a corresponding predicted value, making full use of the data samples and improving prediction precision.
[0061] Further, to improve efficiency, roulette-wheel selection may be used to randomly combine all the sub-features into a plurality of feature combinations.
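The disclosure names roulette-wheel selection but does not say how accuracy maps to selection probability. The sketch below assumes the simplest reading: each sub-feature is drawn with probability proportional to its single-feature accuracy, without replacement inside one combination. The function names, the accuracy values, and the fixed combination size are illustrative assumptions, and duplicate combinations across draws are not filtered out.

```python
import random

def roulette_pick(weights, rng):
    """Pick one key with probability proportional to its (positive) weight."""
    total = sum(weights.values())
    spin = rng.uniform(0.0, total)
    cumulative = 0.0
    chosen = None
    for name, weight in weights.items():
        chosen = name
        cumulative += weight
        if spin <= cumulative:
            break
    return chosen  # the last key also absorbs floating-point round-off

def build_feature_combinations(accuracy, n_combinations, size, seed=0):
    """Draw `n_combinations` feature combinations of `size` sub-features each."""
    if not 0 < size <= len(accuracy):
        raise ValueError("size must be between 1 and the number of sub-features")
    rng = random.Random(seed)
    combinations = []
    for _ in range(n_combinations):
        remaining = dict(accuracy)
        picked = []
        for _ in range(size):
            name = roulette_pick(remaining, rng)
            picked.append(name)
            del remaining[name]  # no sub-feature appears twice in one combination
        combinations.append(tuple(sorted(picked)))
    return combinations

# Hypothetical single-feature accuracies from step S303.
sub_feature_accuracy = {
    "age": 0.62,
    "gender": 0.51,
    "current_address": 0.70,
    "page_view_time": 0.66,
    "page_click_frequency": 0.74,
}
combos = build_feature_combinations(sub_feature_accuracy, n_combinations=4, size=3)
```

Sub-features with higher accuracy land in more combinations on average, while weaker ones still get a chance to contribute.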
[0062] In step S305, machine training is performed on the feature combinations to construct the result prediction models.
[0063] In this example embodiment, machine training is performed on each feature combination formed by random combination to construct a plurality of result prediction models. In the present disclosure, the data in each feature combination may be trained on using a learning algorithm such as linear regression, logistic regression, a decision tree, naive Bayes, or a random forest, to obtain a plurality of result prediction models, one corresponding to each feature combination, for use in subsequent data analysis to improve the accuracy of fraud identification. The algorithm used to form the result prediction models may be the same as or different from the algorithm used to form the single-feature models, which is not specifically limited in the present disclosure.
[0064] In step S120, feature data to be analyzed that is of the same type as the sub-features is acquired.
[0065] In this example embodiment, the server 201 receives user data to be analyzed sent by the mobile terminal 202 and, according to the types of the sub-features, extracts data of the corresponding types from the user data to be analyzed to form the feature data to be analyzed. For example, when the sub-features in step S110 are age, gender, and current address, all age information, gender information, and current address information may be extracted from the user data to be analyzed as the feature data to be analyzed. Note that as the sub-feature types change, the feature data to be analyzed changes accordingly.
[0066] In step S130, the feature data to be analyzed is grouped according to the feature combinations to form a plurality of combinations of feature data to be analyzed.
[0067] In this example embodiment, the feature data to be analyzed acquired in step S120 is grouped according to the feature combinations from step S110 to form a plurality of combinations of feature data to be analyzed. For example, if a feature combination is {age, gender, page browsing time}, the user's age information, gender information, and page browsing time information are extracted from the feature data to be analyzed and combined into a combination of feature data to be analyzed, which may take the form {age to be analyzed, gender to be analyzed, page browsing time to be analyzed}.
[0068] In step S140, the combinations of feature data to be analyzed are input into the result prediction models to obtain a plurality of prediction results.
[0069] In this example embodiment, the plurality of combinations of feature data to be analyzed obtained in step S130 are respectively input into the result prediction models obtained in step S110 to make predictions on them and obtain a plurality of prediction results. For example, if the mobile terminal 202 collects the relevant information of M users (M being a positive integer), the server 201 correspondingly forms M groups of combinations of feature data to be analyzed, and then inputs the M groups in turn into the result prediction models for prediction, obtaining M prediction results.
[0070] In step S150, the plurality of prediction results are fused to obtain a final prediction result.
[0071] In this example embodiment, after the prediction results for the M groups of combinations of feature data to be analyzed are obtained, the M prediction results may be fused to obtain the final prediction result. In the present disclosure, the plurality of prediction results may be fused by a data fusion method such as Bayesian inference, voting, D-S (Dempster-Shafer) evidence theory, or neural network fusion to obtain the final prediction result. D-S evidence theory has a strong ability to handle uncertain information: it requires no prior information, describes uncertain information by "interval estimation" rather than "point estimation", provides a way to represent the "unknown" (that is, uncertainty), and offers great flexibility in distinguishing "not known" from "uncertain" and in accurately reflecting the collection of evidence. In addition, the D-S evidence theory fusion framework supports extension to an unlimited number of models. The present disclosure therefore preferably uses D-S evidence theory as the fusion framework to fuse the plurality of prediction results and obtain the final prediction result.
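To make the D-S fusion step more concrete, the following sketch applies Dempster's rule of combination to two mass functions over the frame {fraud, genuine}. Mass assigned to the whole frame expresses "unknown", which is how the theory separates ignorance from a 50/50 belief. The helper `mass_from_probability`, its `reliability` parameter, and the example masses are assumptions added for illustration; the disclosure does not describe how model outputs are converted into masses.

```python
from itertools import product

FRAUD = frozenset({"fraud"})
GENUINE = frozenset({"genuine"})
UNKNOWN = FRAUD | GENUINE  # the whole frame: mass here means "cannot tell"

def mass_from_probability(p_fraud, reliability):
    """Turn a model's fraud probability into a mass function (an assumed mapping)."""
    return {
        FRAUD: p_fraud * reliability,
        GENUINE: (1.0 - p_fraud) * reliability,
        UNKNOWN: 1.0 - reliability,
    }

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        overlap = a & b
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass that lands on the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; Dempster's rule is undefined")
    # Renormalize so the remaining masses sum to one.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}

# Two hypothetical model outputs expressed as masses.
m1 = {FRAUD: 0.6, GENUINE: 0.1, UNKNOWN: 0.3}
m2 = {FRAUD: 0.5, GENUINE: 0.2, UNKNOWN: 0.3}
fused = dempster_combine(m1, m2)
```

Because the rule is associative, more than two models can be fused by folding `dempster_combine` over the list of mass functions, for example with `functools.reduce`.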
[0072] Further, the authenticity of the user data to be analyzed may be determined from the final prediction result obtained. For example, the final prediction result may be expressed as a fraud probability. When the final prediction result is higher than 0.05, the user data to be analyzed may be determined to be untrue and the user to be fraudulent; conversely, when the prediction result is not higher than 0.05, the user data to be analyzed may be determined to be true and the user to be non-fraudulent. In other words, the lower the fraud probability, the higher the legitimacy of the user and the authenticity of the user data. Of course, those skilled in the art may set other fraud probability thresholds according to the actual situation to judge the legitimacy of the user.
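The decision rule in this paragraph is small enough to state directly in code. In the sketch below the function name and the returned labels are hypothetical; the 0.05 default and the strict "higher than" comparison follow the example in the text, so a score of exactly 0.05 counts as non-fraudulent.

```python
def classify_user(fraud_probability, threshold=0.05):
    """Label a user from the fused fraud probability.

    Strictly above the threshold means fraudulent; at or below means non-fraudulent.
    """
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must lie in [0, 1]")
    return "fraudulent" if fraud_probability > threshold else "non-fraudulent"
```

Keeping the threshold as a parameter reflects the remark that other cut-off values may be chosen to suit the deployment.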
[0073] 本公开的用户数据真实性分析方法, 一方面充分利用了样本数据, 同时在对预 测结果采用 D-S证据理论作为融合框架对多个预测结果融合时避免了过拟合, 提 高了预测精度和欺诈识别的准确度; 另一方面, 本公开的用户数据真实性分析 方法对于样本数量不足的情况也同样适用, 使得欺诈识别更容易。  [0073] The user data authenticity analysis method of the present disclosure makes full use of the sample data on the one hand, and avoids over-fitting when using the DS evidence theory as a fusion framework for the prediction result, and improves the prediction accuracy. And the accuracy of fraud identification; on the other hand, the user data authenticity analysis method of the present disclosure is also applicable to the case where the number of samples is insufficient, making fraud identification easier.
[0074] The user data authenticity analysis method of the present disclosure can be used in scenarios such as policy surrender applications and auto insurance claims to determine the legitimacy of surrender applicants, auto insurance claimants, and their requests, preventing hackers and other lawbreakers from profiting by improper means and causing losses to insurance institutions. The method is described below using the prediction of the legitimacy of an auto insurance claimant and the claimant's request as an example. First, raw data from auto insurance cases is obtained as the data sample; the raw data contains multiple sub-features, for example: name, gender, policy start and end dates, insured amount, accident time, road section, vehicle brand, vehicle value, claim amount, and so on. Next, each sub-feature is trained to build a single-feature model, and the single-feature model is used to predict on the sub-feature to obtain the accuracy of the sub-feature. Then, according to the accuracy of the sub-features, all sub-features are randomly combined to form multiple feature combinations, for example the combination {name, gender, accident time, claim amount}. The data in each feature combination is then trained separately to build multiple result prediction models. After that, the user data to be analyzed is obtained, and feature data to be analyzed of the same types as the sub-features is extracted; the extracted feature data is grouped according to the feature combinations to form multiple feature data combinations to be analyzed, which are input to the trained result prediction models to obtain the corresponding prediction results. Finally, using D-S evidence theory as the fusion framework, the prediction results are fused to obtain the final prediction result, and whether the auto insurance claimant and the request are legitimate is judged from the final prediction result.
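The grouping step in the example above (arranging the extracted feature data according to the feature combinations) might look like the following Python sketch. The function `group_features`, the field names, and the sample record are all hypothetical and serve only to illustrate the idea; the disclosure does not prescribe a data layout.

```python
from typing import Dict, List, Mapping, Sequence


def group_features(
    record: Mapping[str, object],
    feature_combinations: Sequence[Sequence[str]],
) -> List[Dict[str, object]]:
    """Build one feature-data group per feature combination.

    `record` holds the feature data extracted from the user data to be
    analyzed, keyed by sub-feature name. Each returned dict contains only
    the sub-features of the corresponding combination.
    """
    groups = []
    for combo in feature_combinations:
        missing = [name for name in combo if name not in record]
        if missing:
            raise KeyError(f"record lacks sub-features: {missing}")
        groups.append({name: record[name] for name in combo})
    return groups


# Hypothetical claim record and feature combinations.
claim = {
    "name": "A. Claimant",
    "gender": "F",
    "accident_time": "2018-03-01T22:40",
    "claim_amount": 12000,
    "vehicle_brand": "ExampleMotors",
}
combinations = [
    ["name", "gender", "accident_time", "claim_amount"],
    ["vehicle_brand", "claim_amount"],
]
groups = group_features(claim, combinations)
```

Each element of `groups` would then be passed to the result prediction model trained on the same combination.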
[0075] The present disclosure also provides a user data authenticity analysis apparatus. Referring to FIG. 4, the apparatus may include a model construction module 410, a data acquisition module 420, a combination generation module 430, a result prediction module 440, and a result fusion module 450, wherein:
[0076] the model construction module 410 is configured to construct a plurality of result prediction models according to a plurality of feature combinations containing sub-features;
[0077] the data acquisition module 420 is configured to acquire feature data to be analyzed of the same types as the sub-features;
[0078] the combination generation module 430 is configured to group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;
[0079] the result prediction module 440 is configured to input the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results;
[0080] the result fusion module 450 is configured to fuse the prediction results to obtain a final prediction result.
[0081] The specific details of each module of the above user data authenticity analysis apparatus have already been described in detail in the corresponding user data authenticity analysis method and are therefore not repeated here.
[0082] It should be noted that although several modules or units of the apparatus for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
[0083] In addition, although the steps of the method of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that particular order, or that all of the illustrated steps must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
[0084] From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solution according to embodiments of the present disclosure may be embodied as a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, a server, a mobile terminal, or a network device) to perform the method according to embodiments of the present disclosure.
[0085] In one embodiment, the above user data authenticity analysis apparatus further includes:
[0086] a feature information acquisition module configured to acquire user feature information, the user feature information including a plurality of sub-features;
[0087] a feature combination forming module configured to form a plurality of the feature combinations according to the sub-features.
[0088] In one embodiment, the model construction module 410 includes:
[0089] a random combination unit configured to obtain the accuracy of the sub-features and, according to that accuracy, randomly combine the sub-features to form a plurality of the feature combinations;
[0090] a feature training unit configured to perform machine training on the feature combinations to construct the result prediction models.
[0091] In one embodiment, the random combination unit includes:
[0092] a sub-feature trainer configured to perform machine training on the sub-features to construct single-feature models;
[0093] an accuracy acquirer configured to input the sub-features into the single-feature models to obtain the accuracy of the sub-features.
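As a rough illustration of the sub-feature trainer and the accuracy acquirer, the Python sketch below trains a deliberately simple single-feature model and measures its accuracy. The learning algorithm is an assumption: the text does not say which one is used, so a majority-vote lookup table stands in for it here. The function name `single_feature_accuracy` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Sequence


def single_feature_accuracy(values: Sequence[Hashable], labels: Sequence[int]) -> float:
    """Train a one-feature model and return its accuracy on the same samples.

    The "model" maps each observed feature value to the label seen most often
    with it (a stand-in for whatever learner is actually used). Accuracy is
    the fraction of samples whose label matches the model's prediction.
    """
    if not values or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and of equal length")

    # Count labels per feature value.
    votes: Dict[Hashable, Counter] = defaultdict(Counter)
    for value, label in zip(values, labels):
        votes[value][label] += 1

    # Single-feature model: most common label for each value.
    model = {value: counts.most_common(1)[0][0] for value, counts in votes.items()}

    correct = sum(model[value] == label for value, label in zip(values, labels))
    return correct / len(labels)
```

In practice the accuracy would more sensibly be measured on held-out samples; scoring on the training samples keeps the sketch short.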
[0094] In one embodiment, the random combination unit includes:
[0095] a roulette combiner configured to randomly combine the sub-features, according to the accuracy of the sub-features, by roulette-wheel selection to form the feature combinations.
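Roulette-wheel selection picks items with probability proportional to a weight, here the accuracy of each sub-feature. The Python sketch below draws one feature combination this way. Drawing without replacement and fixing the combination size in advance are assumptions made for the example, and `roulette_combination` is a hypothetical name.

```python
import random
from typing import Dict, List, Optional


def roulette_combination(
    accuracies: Dict[str, float],
    size: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Draw `size` distinct sub-features, each pick proportional to accuracy."""
    if not 0 < size <= len(accuracies):
        raise ValueError("size must be between 1 and the number of sub-features")
    if any(acc < 0 for acc in accuracies.values()):
        raise ValueError("accuracies must be non-negative")

    rng = rng or random.Random()
    remaining = dict(accuracies)
    chosen: List[str] = []
    for _ in range(size):
        # Spin the wheel: a point on [0, total] falls into one feature's slice.
        total = sum(remaining.values())
        spin = rng.uniform(0.0, total)
        cumulative = 0.0
        for name, acc in remaining.items():
            cumulative += acc
            if spin <= cumulative:
                break
        chosen.append(name)
        del remaining[name]  # without replacement: no duplicates in a combination
    return chosen
```

Calling the function repeatedly yields multiple feature combinations, with higher-accuracy sub-features appearing more often across them.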
[0096] In one embodiment, the data acquisition module 420 includes:
[0097] a user data acquisition unit for acquiring user data to be analyzed;
[0098] a feature data extraction unit for extracting the feature data to be analyzed from the user data to be analyzed according to the types of the sub-features.
[0099] In one embodiment, the result fusion module 450 includes:
[0100] a theoretical fusion unit for fusing the prediction results according to D-S evidence theory to obtain the final prediction result.
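For orientation, the core of D-S (Dempster-Shafer) evidence theory is Dempster's rule of combination, which merges two mass functions and renormalizes by the amount of conflict between them. The Python sketch below applies the rule over a two-hypothesis frame {fraud, legit}. How each prediction result is turned into a mass function is not stated in this passage, so the masses below are made-up illustrative values, and the names `dempster_combine`, `FRAUD`, `LEGIT`, and `EITHER` are hypothetical.

```python
from functools import reduce
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]

FRAUD = frozenset({"fraud"})
LEGIT = frozenset({"legit"})
EITHER = FRAUD | LEGIT  # mass left on the whole frame, i.e. uncertainty


def dempster_combine(m1: Mass, m2: Mass) -> Mass:
    """Combine two mass functions with Dempster's rule."""
    combined: Mass = {}
    conflict = 0.0
    for (a, weight_a), (b, weight_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + weight_a * weight_b
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += weight_a * weight_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Renormalize so the combined masses sum to 1.
    return {focal: weight / (1.0 - conflict) for focal, weight in combined.items()}


# Illustrative masses from two hypothetical result prediction models.
model_a = {FRAUD: 0.6, LEGIT: 0.3, EITHER: 0.1}
model_b = {FRAUD: 0.7, LEGIT: 0.2, EITHER: 0.1}
fused = reduce(dempster_combine, [model_a, model_b])
```

With these inputs the conflict is 0.6 × 0.2 + 0.3 × 0.7 = 0.33, and the fused mass on fraud is 0.55 / 0.67, about 0.82. Because the rule is associative, `reduce` extends the same combination to any number of prediction results.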
[0101] In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
[0102] Those skilled in the art will understand that aspects of the present application may be implemented as a system, a method, or a program product. Accordingly, aspects of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, and the like), or an embodiment combining hardware and software aspects, any of which may be referred to here collectively as a "circuit", "module", or "system".
[0103] An electronic device 500 according to this embodiment of the present application is described below with reference to FIG. 5. The electronic device 500 shown in FIG. 5 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
[0104] As shown in FIG. 5, the electronic device 500 takes the form of a general-purpose computing device. The components of the electronic device 500 may include, but are not limited to: the above at least one processing unit 510, the above at least one storage unit 520, and a bus 530 connecting different system components (including the storage unit 520 and the processing unit 510).
[0105] The storage unit stores program code that can be executed by the processing unit 510, so that the processing unit 510 performs the steps according to the various exemplary embodiments of the present application described in the "Exemplary Method" section of this specification. For example, the processing unit 510 may perform the steps shown in FIG. 1: step S110, constructing a plurality of result prediction models according to a plurality of feature combinations containing sub-features; step S120, acquiring feature data to be analyzed of the same types as the sub-features; step S130, grouping the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed; step S140, inputting the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results; and step S150, fusing the prediction results to obtain a final prediction result.
[0106] The storage unit 520 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 5201 and/or a cache storage unit 5202, and may further include a read-only memory (ROM) unit 5203.
[0107] The storage unit 520 may also include a program/utility 5204 having a set of (at least one) program modules 5205. Such program modules 5205 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
[0108] The bus 530 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
[0109] The electronic device 500 may also communicate with one or more external devices 700 (for example, a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 500, and/or with any device (for example, a router or a modem) that enables the electronic device 500 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 550. The electronic device 500 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 560. As shown, the network adapter 560 communicates with the other modules of the electronic device 500 through the bus 530. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
[0110] From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solution according to embodiments of the present disclosure may be embodied as a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to perform the method according to embodiments of the present disclosure.
[0111] In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which is stored a program product capable of implementing the above method of this specification. In some possible implementations, aspects of the present application may also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present application described in the "Exemplary Method" section of this specification.
[0112] Referring to FIG. 6, a program product 600 for implementing the above method according to an embodiment of the present application is described. It may take the form of a portable compact disc read-only memory (CD-ROM) that includes program code, and may run on a terminal device such as a personal computer. However, the program product of the present application is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device.
[0113] The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
[0114] A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
[0115] The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, and so on, or any suitable combination of the above.
[0116] Program code for performing the operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
[0117] In addition, the above drawings are merely schematic illustrations of the processing included in the method according to exemplary embodiments of the present application and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or restrict the chronological order of that processing. It is also easy to understand that the processing may be performed synchronously or asynchronously, for example across multiple modules.
[0118] Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed here. The present application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art not disclosed in the present disclosure. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure indicated by the claims.

Claims

[Claim 1] A user data authenticity analysis method, comprising:
constructing a plurality of result prediction models according to a plurality of feature combinations containing sub-features;
acquiring feature data to be analyzed of the same types as the sub-features;
grouping the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;
inputting the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results; and
fusing the prediction results to obtain a final prediction result.
[Claim 2] The user data authenticity analysis method according to claim 1, wherein the method further comprises:
acquiring user feature information, the user feature information including a plurality of sub-features; and
forming a plurality of the feature combinations according to the sub-features.
[Claim 3] The user data authenticity analysis method according to any one of claims 1-2, wherein constructing a plurality of result prediction models according to a plurality of feature combinations containing sub-features comprises:
obtaining the accuracy of the sub-features, and randomly combining the sub-features according to the accuracy of the sub-features to form a plurality of the feature combinations; and
performing machine training on the feature combinations to construct the result prediction models.
[Claim 4] The user data authenticity analysis method according to claim 3, wherein obtaining the accuracy of the sub-features comprises:
performing machine training on the sub-features to construct single-feature models; and
inputting the sub-features into the single-feature models to obtain the accuracy of the sub-features.
[Claim 5] The user data authenticity analysis method according to claim 3 or 4, wherein randomly combining the sub-features according to the accuracy of the sub-features to form a plurality of the feature combinations comprises:
randomly combining the sub-features, according to the accuracy of the sub-features, by roulette-wheel selection to form the feature combinations.
[Claim 6] The user data authenticity analysis method according to any one of claims 1-5, wherein acquiring feature data to be analyzed of the same types as the sub-features comprises:
acquiring user data to be analyzed; and
extracting the feature data to be analyzed from the user data to be analyzed according to the types of the sub-features.
[Claim 7] The user data authenticity analysis method according to any one of claims 1-6, wherein fusing the prediction results to obtain a final prediction result comprises:
fusing the prediction results according to D-S evidence theory to obtain the final prediction result.
[Claim 8] A user data authenticity analysis apparatus, comprising:
a model construction module configured to construct a plurality of result prediction models according to a plurality of feature combinations containing sub-features;
a data acquisition module configured to acquire feature data to be analyzed of the same types as the sub-features;
a combination generation module configured to group the feature data to be analyzed according to the feature combinations to form a plurality of feature data combinations to be analyzed;
a result prediction module configured to input the feature data combinations to be analyzed into the result prediction models to obtain a plurality of prediction results; and
a result fusion module configured to fuse the prediction results to obtain a final prediction result.
[Claim 9] The user data authenticity analysis apparatus according to claim 8, wherein the apparatus further comprises:
a feature information acquisition module configured to acquire user feature information, the user feature information including a plurality of sub-features; and
a feature combination forming module configured to form a plurality of the feature combinations according to the sub-features.
[Claim 10] The user data authenticity analysis apparatus according to any one of claims 8-9, wherein the model construction module comprises:
a random combination unit configured to obtain the accuracy of the sub-features and, according to the accuracy of the sub-features, randomly combine the sub-features to form a plurality of the feature combinations; and
a feature training unit configured to perform machine training on the feature combinations to construct the result prediction models.
[Claim 11] The user data authenticity analysis apparatus according to claim 10, wherein the random combination unit comprises:
a sub-feature trainer configured to perform machine training on the sub-features to construct single-feature models; and
an accuracy acquirer configured to input the sub-features into the single-feature models to obtain the accuracy of the sub-features.
[Claim 12] The user data authenticity analysis apparatus according to claim 10 or 11, wherein the random combination unit comprises:
a roulette combiner configured to randomly combine the sub-features, according to the accuracy of the sub-features, by roulette-wheel selection to form the feature combinations.
[Claim 13] The user data authenticity analysis apparatus according to any one of claims 8-12, wherein the data acquisition module comprises:
a user data acquisition unit for acquiring user data to be analyzed; and
a feature data extraction unit for extracting the feature data to be analyzed from the user data to be analyzed according to the types of the sub-features.
[Claim 14] The user data authenticity analysis apparatus according to any one of claims 8-13, wherein the result fusion module comprises:
a theoretical fusion unit for fusing the prediction results according to D-S evidence theory to obtain the final prediction result.
[权利要求 15] 一种计算机可读存储介质, 其上存储有计算机程序, 其中, 所述计算 机程序被处理器执行以下步骤:  [Claim 15] A computer readable storage medium having stored thereon a computer program, wherein the computer program is executed by a processor by the following steps:
根据多个包含子特征的特征组合构建多个结果预测模型;  Constructing a plurality of result prediction models according to a plurality of feature combinations including sub-features;
获取与所述子特征类型相同的待分析特征数据; 根据所述特征组合将所述待分析特征数据进行分组, 形成多个待分析 特征数据组合;  Acquiring feature data to be analyzed that is the same as the sub-feature type; grouping the feature data to be analyzed according to the feature combination to form a plurality of feature data combinations to be analyzed;
将所述待分析特征数据组合输入至所述结果预测模型, 获取多个预测 结果; 将所述预测结果进行融合, 获取最终预测结果。 Inputting the feature data to be analyzed into the result prediction model to obtain a plurality of prediction results; The prediction results are fused to obtain a final prediction result.
[权利要求 16] 根据权利要求 15所述的计算机可读存储介质, 其中, 所述步骤还包括 获取用户特征信息, 所述用户特征信息包括多个子特征;  [Claim 16] The computer readable storage medium according to claim 15, wherein the step further comprises: acquiring user feature information, wherein the user feature information includes a plurality of sub-features;
根据所述子特征形成多个所述特征组合。  A plurality of the feature combinations are formed according to the sub-features.
[权利要求 17] 根据权利要求 15-16任意一项所述的计算机可读存储介质, 其中, 根 据多个包含子特征的特征组合构建多个结果预测模型包括: 获取所述子特征的准确率, 并根据所述子特征的准确率将所述子特征 随机组合形成多个所述特征组合;  The computer readable storage medium according to any one of claims 15-16, wherein constructing the plurality of result prediction models according to the plurality of feature combinations including the sub-features comprises: obtaining an accuracy rate of the sub-features And randomly combining the sub-features to form a plurality of the feature combinations according to an accuracy rate of the sub-features;
对所述特征组合进行机器训练, 以构建所述结果预测模型。  Machine training is performed on the combination of features to construct the result prediction model.
[权利要求 18] 根据权利要求 17所述的计算机可读存储介质, 其中, 获取所述子特征 的准确率包括:  The computer readable storage medium according to claim 17, wherein the obtaining the accuracy of the sub-features comprises:
对所述子特征进行机器训练, 以构建单特征模型; 将所述子特征输入至所述单特征模型, 获取所述子特征的准确率。  Performing machine training on the sub-features to construct a single feature model; inputting the sub-features to the single feature model to obtain an accuracy rate of the sub-features.
[权利要求 19] 根据权利要求 17或 18所述的计算机可读存储介质, 其中, 根据所述子 特征的准确率将所述子特征随机组合形成多个所述特征组合包括: 根据所述子特征的准确率, 通过轮盘赌法将所述子特征随机组合形成 所述特征组合。  [Claim 19] The computer readable storage medium according to claim 17 or 18, wherein randomly combining the sub-features according to an accuracy of the sub-feature to form a plurality of the feature combinations comprises: according to the sub- The accuracy of the features, the sub-features are randomly combined by roulette to form the feature combination.
[Claim 20] The computer-readable storage medium according to any one of claims 15-19, wherein acquiring feature data to be analyzed of the same type as the sub-features comprises:
acquiring user data to be analyzed; and
extracting the feature data to be analyzed from the user data to be analyzed according to the type of the sub-features.
[Claim 21] The computer-readable storage medium according to any one of claims 15-20, wherein fusing the prediction results to obtain a final prediction result comprises:
fusing the prediction results according to D-S evidence theory to obtain the final prediction result.
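Fusion under D-S (Dempster-Shafer) evidence theory is usually carried out with Dempster's rule of combination. The Python sketch below applies that rule to two mass functions over the frame {real, fake}. How each model's prediction result is converted into a mass function is not stated in the claim, so the example masses and the frame labels are assumptions.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise so that the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


REAL = frozenset({"real"})
FAKE = frozenset({"fake"})
EITHER = REAL | FAKE  # mass left on the whole frame expresses uncertainty

# Hypothetical masses derived from two result prediction models.
m1 = {REAL: 0.6, FAKE: 0.1, EITHER: 0.3}
m2 = {REAL: 0.7, FAKE: 0.2, EITHER: 0.1}
fused = dempster_combine(m1, m2)
# The conflict is 0.6*0.2 + 0.1*0.7 = 0.19, so fused[REAL] = 0.69 / 0.81.
```

With more than two models the rule can be applied repeatedly, for example with `functools.reduce(dempster_combine, masses)`, since Dempster's rule is associative and commutative.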
[Claim 22] An electronic device, comprising:
a processor; and
a memory configured to store instructions executable by the processor;
wherein the processor is configured to perform the following steps:
constructing a plurality of result prediction models from a plurality of feature combinations comprising sub-features;
acquiring feature data to be analyzed of the same type as the sub-features;
grouping the feature data to be analyzed according to the feature combinations to form a plurality of combinations of feature data to be analyzed;
inputting the combinations of feature data to be analyzed into the result prediction models to obtain a plurality of prediction results; and
fusing the prediction results to obtain a final prediction result.
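The grouping and prediction steps of this claim can be sketched as follows. This is only an illustration: the sub-feature names, the record layout, and the stand-in "models" (plain Python callables returning a score) are hypothetical, and the final fusion step is deliberately left out.

```python
def group_feature_data(feature_data, combinations):
    """Split one record into one sub-record per feature combination."""
    return [{name: feature_data[name] for name in combo} for combo in combinations]


def predict_all(feature_data, combinations, models):
    """Feed each grouped sub-record to the model built for that combination."""
    if len(models) != len(combinations):
        raise ValueError("expected exactly one model per feature combination")
    groups = group_feature_data(feature_data, combinations)
    return [model(group) for model, group in zip(models, groups)]


# Hypothetical feature data to be analyzed for one user.
feature_data = {"age": 34, "income": 52000, "phone_verified": True}
combinations = [["age", "income"], ["income", "phone_verified"]]

# Stand-in models: each returns a score that the data is authentic.
models = [
    lambda group: 0.9 if group["age"] >= 18 else 0.1,
    lambda group: 0.8 if group["phone_verified"] else 0.3,
]

groups = group_feature_data(feature_data, combinations)
predictions = predict_all(feature_data, combinations, models)
```

Each model sees only the sub-features it was trained on, so the list of prediction results has one entry per feature combination and can then be passed to whatever fusion step is used.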
[Claim 23] The electronic device according to claim 22, wherein the steps further comprise:
acquiring user feature information, the user feature information comprising a plurality of sub-features; and
forming a plurality of the feature combinations from the sub-features.
[Claim 24] The electronic device according to any one of claims 22-23, wherein constructing a plurality of result prediction models from a plurality of feature combinations comprising sub-features comprises:
obtaining an accuracy rate of the sub-features, and randomly combining the sub-features according to the accuracy rate of the sub-features to form a plurality of the feature combinations; and
performing machine training on the feature combinations to construct the result prediction models.
[Claim 25] The electronic device according to claim 24, wherein obtaining the accuracy rate of the sub-features comprises:
performing machine training on the sub-features to construct a single-feature model; and
inputting the sub-features into the single-feature model to obtain the accuracy rate of the sub-features.
[Claim 26] The electronic device according to claim 24 or 25, wherein randomly combining the sub-features according to the accuracy rate of the sub-features to form a plurality of the feature combinations comprises:
randomly combining the sub-features by a roulette wheel method, according to the accuracy rate of the sub-features, to form the feature combinations.
[Claim 27] The electronic device according to any one of claims 22-26, wherein acquiring feature data to be analyzed of the same type as the sub-features comprises:
acquiring user data to be analyzed; and
extracting the feature data to be analyzed from the user data to be analyzed according to the type of the sub-features.
[Claim 28] The electronic device according to any one of claims 22-27, wherein fusing the prediction results to obtain a final prediction result comprises:
fusing the prediction results according to D-S evidence theory to obtain the final prediction result.
PCT/CN2018/103063 2018-04-20 2018-08-29 User data authenticity analysis method and apparatus, storage medium and electronic device WO2019200810A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810359102.0 2018-04-20
CN201810359102.0A CN108596616B (en) 2018-04-20 2018-04-20 User data authenticity analysis method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
WO2019200810A1 true WO2019200810A1 (en) 2019-10-24

Family

ID=63614092

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/103063 WO2019200810A1 (en) 2018-04-20 2018-08-29 User data authenticity analysis method and apparatus, storage medium and electronic device

Country Status (2)

Country Link
CN (1) CN108596616B (en)
WO (1) WO2019200810A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112288457A (en) * 2020-06-23 2021-01-29 北京沃东天骏信息技术有限公司 Data processing method, device, equipment and medium based on multi-model calculation fusion
CN113034157A (en) * 2019-12-24 2021-06-25 中国移动通信集团浙江有限公司 Group member identification method and device and computing equipment
CN113298120A (en) * 2021-04-29 2021-08-24 上海淇玥信息技术有限公司 User risk prediction method and system based on fusion model and computer equipment

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325375B (en) * 2018-12-13 2024-06-18 北京沃东天骏信息技术有限公司 Data correction method and device, computer storage medium and electronic equipment
CN110020868B (en) * 2019-03-11 2021-02-23 同济大学 Anti-fraud module decision fusion method based on online transaction characteristics
CN110189134B (en) * 2019-05-17 2023-01-31 同济大学 Suspected fraud transaction reference ordinal-based network payment anti-fraud system architecture design method
CN110245704B (en) * 2019-06-13 2021-10-08 泰康保险集团股份有限公司 Service processing method and device, storage medium and electronic equipment
CN111143552B (en) * 2019-12-05 2023-06-27 支付宝(杭州)信息技术有限公司 Text information category prediction method and device and server
CN111626898B (en) * 2020-03-20 2022-03-15 贝壳找房(北京)科技有限公司 Method, device, medium and electronic equipment for realizing attribution of events
CN111612366B (en) * 2020-05-27 2023-08-04 中国联合网络通信集团有限公司 Channel quality assessment method, channel quality assessment device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678419A (en) * 2012-09-25 2014-03-26 日电(中国)有限公司 Data recognition method and device
US20150363473A1 (en) * 2014-06-17 2015-12-17 Microsoft Corporation Direct answer triggering in search
CN107515876A (en) * 2016-06-16 2017-12-26 阿里巴巴集团控股有限公司 A kind of generation of characteristic model, application process and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104766080A (en) * 2015-05-06 2015-07-08 苏州搜客信息技术有限公司 Image multi-class feature recognizing and pushing method based on electronic commerce
CN106506454B (en) * 2016-10-10 2019-11-12 江苏通付盾科技有限公司 fraud service identification method and device
CN107330445B (en) * 2017-05-31 2020-06-05 北京京东尚科信息技术有限公司 User attribute prediction method and device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113034157A (en) * 2019-12-24 2021-06-25 中国移动通信集团浙江有限公司 Group member identification method and device and computing equipment
CN113034157B (en) * 2019-12-24 2023-12-26 中国移动通信集团浙江有限公司 Group member identification method and device and computing equipment
CN112288457A (en) * 2020-06-23 2021-01-29 北京沃东天骏信息技术有限公司 Data processing method, device, equipment and medium based on multi-model calculation fusion
CN113298120A (en) * 2021-04-29 2021-08-24 上海淇玥信息技术有限公司 User risk prediction method and system based on fusion model and computer equipment
CN113298120B (en) * 2021-04-29 2023-08-01 上海淇玥信息技术有限公司 Fusion model-based user risk prediction method, system and computer equipment

Also Published As

Publication number Publication date
CN108596616B (en) 2023-04-18
CN108596616A (en) 2018-09-28

Similar Documents

Publication Publication Date Title
WO2019200810A1 (en) User data authenticity analysis method and apparatus, storage medium and electronic device
US11522873B2 (en) Detecting network attacks
US11403643B2 (en) Utilizing a time-dependent graph convolutional neural network for fraudulent transaction identification
US11531987B2 (en) User profiling based on transaction data associated with a user
CN111612037B (en) Abnormal user detection method, device, medium and electronic equipment
US20220084371A1 (en) Systems and methods for unsupervised detection of anomalous customer interactions to secure and authenticate a customer session
WO2021196935A1 (en) Data checking method and apparatus, electronic device, and storage medium
US20230093540A1 (en) System and Method for Detecting Anomalous Activity Based on a Data Distribution
CN112581259A (en) Account risk identification method and device, storage medium and electronic equipment
CN112348321A (en) Risk user identification method and device and electronic equipment
US20190279228A1 (en) Suspicious activity report smart validation
US20220222683A1 (en) Labeling optimization through image clustering
CN114780932B (en) Cross-block chain data interaction verification method, system and equipment for management three-mode platform
CN116245630A (en) Anti-fraud detection method and device, electronic equipment and medium
CN114971642A (en) Knowledge graph-based anomaly identification method, device, equipment and storage medium
CN110808978B (en) Real name authentication method and device
CN113988223A (en) Certificate image recognition method and device, computer equipment and storage medium
US11645372B2 (en) Multifactor handwritten signature verification
CN117093715B (en) Word stock expansion method, system, computer equipment and storage medium
US20240211951A1 (en) Systems and methods for merchant level fraud detection based in part on event timing
CN113836566B (en) Model processing method, device, equipment and medium based on block chain system
US20240211966A1 (en) Systems and methods for merchant level fraud detection using an ensemble of machine learning models
US20240211965A1 (en) Systems and methods for merchant level fraud detection based in part on merchant cohort clustering
US20240005688A1 (en) Document authentication using multi-tier machine learning models
US20240061915A1 (en) Dynamic handwriting authentication

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18915551

Country of ref document: EP

Kind code of ref document: A1