WO2020015089A1 - Identity information risk assessment method and apparatus, and computer device and storage medium - Google Patents

Identity information risk assessment method and apparatus, and computer device and storage medium

Publication number
WO2020015089A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
identity
risk
risk assessment
probability
Prior art date
Application number
PCT/CN2018/104806
Other languages
French (fr)
Chinese (zh)
Inventor
孙静远
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020015089A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/26: Government or public services
    • G06Q50/265: Personal security, identity or safety
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques

Definitions

  • the present application relates to an identity information risk assessment method, device, computer equipment, and storage medium.
  • an identity information risk assessment method is provided.
  • An identity information risk assessment method includes:
  • a risk assessment result is generated according to the identity risk probability.
  • An identity information risk assessment device includes:
  • Identity data acquisition module for receiving identity data
  • An identity parameter extraction module configured to extract identity characteristic parameters from the identity data
  • a time parameter acquisition module configured to find historical verification data corresponding to the identity data, and extract verification time parameters from the historical verification data
  • a risk probability obtaining module configured to input the identity characteristic parameter and the verification time parameter into a preset risk assessment model to obtain an identity risk probability
  • a risk result generating module is configured to generate a risk assessment result according to the identity risk probability.
  • A computer device includes a memory and one or more processors.
  • The memory stores computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
  • a risk assessment result is generated according to the identity risk probability.
  • One or more non-volatile storage media store computer-readable instructions.
  • When executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the following steps:
  • a risk assessment result is generated according to the identity risk probability.
  • FIG. 1 is an application scenario diagram of an identity information risk assessment method according to one or more embodiments.
  • FIG. 2 is a schematic flowchart of an identity information risk assessment method according to one or more embodiments.
  • FIG. 3 is a schematic flowchart of a method for generating a preset risk assessment model according to one or more embodiments.
  • FIG. 4 is a structural block diagram of an identity information risk assessment device according to one or more embodiments.
  • FIG. 5 is an internal structural diagram of a computer device according to one or more embodiments.
  • the identity information risk assessment method provided in this application can be applied to the application environment shown in FIG. 1.
  • the terminal 102 communicates with the server 104 through a network.
  • The server 104 receives the passenger identity data sent by the terminal 102, extracts identity characteristic parameters from the received identity data, finds historical verification data corresponding to the identity data, extracts verification time parameters from the historical verification data, and inputs the identity characteristic parameters and verification time parameters into a preset risk assessment model to obtain an identity risk probability. A risk assessment result is generated according to the identity risk probability, and the server 104 returns the generated risk assessment result to the terminal 102.
  • the terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 may be implemented by an independent server or a server cluster composed of multiple servers.
  • an identity information risk assessment method is provided.
  • Taking the application of the method to the server 104 in FIG. 1 as an example, the method includes the following steps:
  • Step 210: Receive identity data.
  • Identity data is data that can uniquely determine the identity of a passenger, such as the document type and document number of an identity card, visa, student ID, or other document.
  • Staff at the security terminal can collect the identity data of the passengers being checked through identity information collection equipment such as a card reader.
  • The identity information collection equipment transmits the collected passenger identity data to the security terminal.
  • Staff can also enter the passenger's identity data manually.
  • The security terminal sends the acquired passenger identity data to the server, and the server receives the identity data sent by the security terminal.
  • Step 220: Extract identity characteristic parameters from the identity data.
  • the identity characteristic parameter is a parameter for characterizing the passenger.
  • the identity characteristic parameter may include parameters such as passenger age, passenger origin, and passenger gender.
  • the passenger's identity data includes the passenger's characteristic parameters, and the server extracts the identity characteristic parameters from the received identity data.
  • The step of extracting identity characteristic parameters from the identity data may include: extracting a document number from the identity data; identifying the document format of the document number and looking up the document type corresponding to the format recognition result; segmenting the document number to obtain segmented character strings; and looking up the identity characteristic parameter corresponding to each segmented character string.
  • The server extracts the document number from the identity data.
  • The server recognizes the document format of the document number, such as its length and alphanumeric composition.
  • The mapping relationship between document types and document formats is stored in the server in advance, and the server looks up the document type corresponding to the recognized document format.
  • The document type may include types such as an identity card, a pass, a home permit, and a passport.
  • For each document type, the character string at a preset position in the document number corresponds to a certain identity characteristic parameter.
  • The server obtains the preset position and preset length of each character string corresponding to the document type, and segments the document number accordingly to obtain the segmented character strings.
  • The server obtains a data conversion table for each preset character string corresponding to the document type; the data conversion table stores the correspondence between the specific values of each preset character string and the identity characteristic parameters.
  • The server looks up the identity characteristic parameter corresponding to each segmented character string from the data conversion table.
  • For example, the server obtains the first three digits of the document number and the corresponding data conversion table; if the first three digits are "410", the passenger origin parameter corresponding to "410" found in the data conversion table is "Henan".
  • the server looks up the identity characteristic parameters corresponding to all the segmented character strings.
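  • The extraction described in step 220 (format recognition, segmentation at preset positions, table lookup) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the format patterns, segment positions, the gender rule, and the table contents other than the "410" to "Henan" entry are assumptions made up for the example.

```python
import re

# Hypothetical mapping from document type to document number format.
DOCUMENT_FORMATS = {
    "identity_card": re.compile(r"\d{17}[\dXx]"),
    "passport": re.compile(r"[A-Za-z]\d{8}"),
}

# Hypothetical preset (position, length) of each segment per document type.
SEGMENTS = {
    "identity_card": {"origin": (0, 3), "birth_date": (6, 8), "gender": (16, 1)},
}

# Data conversion table: segment value -> identity characteristic parameter.
# Only the "410" entry comes from the text; a real table would be far larger.
ORIGIN_TABLE = {"410": "Henan"}


def identify_document_type(number):
    """Return the document type whose format matches the whole number."""
    for doc_type, pattern in DOCUMENT_FORMATS.items():
        if pattern.fullmatch(number):
            return doc_type
    return None


def extract_identity_parameters(number):
    """Segment the document number and look up each identity parameter."""
    doc_type = identify_document_type(number)
    layout = SEGMENTS.get(doc_type, {})
    segments = {name: number[start:start + length]
                for name, (start, length) in layout.items()}
    params = {"document_type": doc_type}
    if "origin" in segments:
        params["origin"] = ORIGIN_TABLE.get(segments["origin"], "unknown")
    if "birth_date" in segments:
        params["birth_year"] = int(segments["birth_date"][:4])
    if "gender" in segments:
        # Assumed rule for this sketch: odd digit means male, even means female.
        params["gender"] = "male" if int(segments["gender"]) % 2 else "female"
    return params


# Made-up document number used purely for illustration.
print(extract_identity_parameters("410102199001011234"))
```

  • Keeping the formats, segment layouts, and conversion tables as data rather than code mirrors the description, where the server stores these mappings in advance and only looks them up at check time.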
  • Step 230: Find historical verification data corresponding to the identity data, and extract verification time parameters from the historical verification data.
  • the historical verification data is the historical record data of passengers' security inspections.
  • the historical record data may include data such as the security inspection time and security inspection results of passengers during previous security inspections.
  • the server searches the historical verification data corresponding to the passenger according to the document number in the passenger identity data.
  • The verification time parameters may include parameters such as the frequency with which the passenger enters and exits the security inspection place for security inspections, the time period of each security inspection, and the time period to which the current security inspection time belongs.
  • The frequency of security inspections can be set as a daily, weekly, or monthly security inspection frequency, among others, and can be specified by the staff according to actual security inspection requirements.
  • the server performs statistics on the historical verification data, and counts out various verification time parameters from it.
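  • As a rough sketch of how the verification time parameters in step 230 might be counted from a passenger's historical check times: the window lengths (calendar day, trailing 7 days, trailing 30 days) and the four time-of-day buckets are assumptions chosen for the example, since the description leaves them to the staff.

```python
from datetime import datetime, timedelta


def time_period(moment):
    """Map a timestamp to a coarse time-of-day bucket (hypothetical buckets)."""
    if moment.hour < 6:
        return "night"
    if moment.hour < 12:
        return "morning"
    if moment.hour < 18:
        return "afternoon"
    return "evening"


def verification_time_parameters(history, now):
    """Count past checks per window and classify the current check time.

    history: datetimes of the passenger's previous security checks.
    now: datetime of the current security check.
    """
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    week_start = now - timedelta(days=7)
    month_start = now - timedelta(days=30)
    return {
        "daily_frequency": sum(1 for t in history if day_start <= t <= now),
        "weekly_frequency": sum(1 for t in history if week_start < t <= now),
        "monthly_frequency": sum(1 for t in history if month_start < t <= now),
        "current_period": time_period(now),
    }


history = [
    datetime(2018, 9, 10, 8, 0),
    datetime(2018, 9, 8, 9, 0),
    datetime(2018, 8, 20, 9, 0),
    datetime(2018, 7, 1, 9, 0),
]
print(verification_time_parameters(history, datetime(2018, 9, 10, 14, 0)))
```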
  • Step 240: Input the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability.
  • the server obtains a preset risk assessment model, and the preset risk assessment model is a preset model for assessing passenger safety risks.
  • the input of the preset risk assessment model is various identity characteristic parameters and verification time parameters, and the output is the probability that the passenger has a security risk.
  • the server inputs the extracted identity characteristic parameters and verification time parameters into a preset risk assessment model, and the preset risk assessment model calculates and processes the parameters to obtain the identity risk probability.
  • Step 250: Generate a risk assessment result according to the identity risk probability.
  • the server generates a risk assessment result based on the calculated identity risk probability.
  • the risk assessment result may include information such as identity risk probability, passenger historical security information, and security deployment recommendations.
  • The server extracts the identity characteristic parameters from the received passenger identity data and searches for the corresponding historical verification data. With a risk assessment model set in advance, the identity characteristic parameters and the parameters taken from the historical verification data are input into the preset risk assessment model to obtain the passenger's identity risk probability, so that the passenger's security risk is calculated and evaluated from the passenger's characteristics and historical data, thereby improving the accuracy of the security inspection.
  • the method for generating a preset risk assessment model includes:
  • Step 201: Collect sample data, and divide the sample data into training set data and test set data.
  • the sample data is historical data for security inspections in real security places.
  • the server collects historical security inspection data within a preset time range.
  • the preset time range can be set to 1 month, 3 months, half a year, etc.
  • The server randomly divides the collected sample data into training set data and test set data. The number of samples in the training set data and the test set data may be the same or different.
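  • A minimal sketch of the random division in step 201, assuming the samples fit in memory as a list. The `train_fraction` and `seed` parameters are hypothetical; the description only requires that the split be random and allows the two sets to differ in size.

```python
import random


def split_samples(samples, train_fraction=0.5, seed=0):
    """Shuffle the samples and cut them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


training_set, test_set = split_samples(range(20), train_fraction=0.75)
print(len(training_set), len(test_set))
```

  • A plain shuffle does not by itself guarantee that both sets contain positive and negative samples; a stratified split would be one way to enforce that, which this sketch does not attempt.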
  • Step 203: Extract a first feature parameter and a first target category from the training set data.
  • the sample data can be divided into positive sample data and negative sample data.
  • the positive sample data is the historical security data of passengers whose inspection results are normal
  • the negative sample data is the historical security data of passengers whose inspection results are abnormal.
  • Both the training set data and the test set data contain both positive sample data and negative sample data.
  • the server extracts the first feature parameter and the first target category one by one from each sample of the training set data.
  • the first characteristic parameter includes an identity characteristic parameter and a verification time parameter, that is, the first feature parameter corresponds to an identity characteristic parameter extracted from passenger identity data in an actual security check and a verification time parameter extracted from a passenger historical verification data.
  • the first target category is the category of the security inspection results. The first target category is divided into two categories: normal security check and abnormal security check.
  • Step 205: Perform feature gain evaluation according to the first feature parameter and the first target category, perform feature selection according to the feature gain evaluation result, classify according to the selected features to obtain an initial decision tree risk assessment model, and calculate, based on the training set data, the risk probability of each classification node in the initial decision tree risk assessment model.
  • the preset risk assessment model constructed is a decision tree model.
  • a decision tree is a tree structure composed of nodes and directed edges that is used to classify instances.
  • the specific method of using the decision tree model for classification is: starting from the root node, testing a certain feature of the instance, and assigning the instance to its child nodes according to the test results. When it is possible to reach a leaf node or another internal node along this branch, the new test condition is used to recursively execute until a leaf node is reached. When the leaf node is reached, the final classification result is obtained, and the leaf node is used as the classification node.
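  • The root-to-leaf classification procedure described above can be sketched as a short loop. The nested-dictionary tree layout and the feature names (`daily_checks`, `gender`) are assumptions made for illustration only.

```python
# Hypothetical tree: internal nodes test one feature, leaves carry a label.
TREE = {
    "feature": "daily_checks",
    "children": {
        "high": {"label": "abnormal"},
        "low": {
            "feature": "gender",
            "children": {0: {"label": "normal"}, 1: {"label": "normal"}},
        },
    },
}


def classify(node, instance):
    """Walk from the root to a leaf, testing one feature at each internal node."""
    while "feature" in node:
        value = instance[node["feature"]]
        node = node["children"][value]  # follow the branch for this value
    return node["label"]


print(classify(TREE, {"daily_checks": "low", "gender": 1}))
```

  • In this sketch a feature value with no matching branch raises `KeyError`; how unseen values should be handled is not specified in the description.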
  • an ID3 algorithm is used to construct an initial decision tree risk assessment model.
  • the ID3 algorithm evaluates the information gain of each feature, and selects the feature parameter with the largest information gain each time as a judgment module to establish a child node.
  • The server calculates the information gain of each feature corresponding to the first feature parameter, selects the feature with the largest information gain as the judgment module to establish the child node, divides the training set data corresponding to the child node into subset data, and recursively branches on the subset data to establish branch nodes until the samples at every branch node correspond to the same first target category.
  • The server uses the following formula (1) to calculate the information gain of each feature corresponding to the first feature parameter:
  • $$g(D, A) = H(D) - H(D \mid A) \tag{1}$$
  • where g(D, A) is the information gain of feature A on training data set D, H(D) is the empirical entropy of training data set D, and H(D|A) is the empirical conditional entropy of training data set D given feature A.
  • The server uses the following formula (2) to calculate the empirical entropy H(D) of the training data set D:
  • $$H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log_2 \frac{|C_k|}{|D|} \tag{2}$$
  • where |C_k| is the number of samples belonging to the k-th first target category, |D| is the number of samples in training data set D, and K is the number of categories of the first target category.
  • The first target category is divided into two types, normal security check and abnormal security check, so K = 2.
  • The server uses the following formula (3) to calculate the empirical conditional entropy H(D|A):
  • $$H(D \mid A) = \sum_{i \in \mathrm{value}(A)} \frac{|D_i|}{|D|} H(D_i) \tag{3}$$
  • where value(A) is the set of all values of feature A, i is one value of feature A, and D_i is the subset of samples in training data set D for which feature A takes the value i.
  • For example, the values of feature A corresponding to the gender characteristic parameter are male and female; male can be represented by 0 and female by 1, so value(A) is {0, 1}.
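  • The information gain used by ID3, that is, the empirical entropy of the labels minus the empirical conditional entropy given a feature, can be sketched as below. The base-2 logarithm and the label strings are assumptions for the example; the gender encoding (0 for male, 1 for female) follows the text.

```python
from collections import Counter
from math import log2


def empirical_entropy(labels):
    """H(D): entropy of the target categories in a sample set."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())


def conditional_entropy(feature_values, labels):
    """H(D|A): entropy of the labels after grouping samples by feature value."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    total = len(labels)
    return sum(len(group) / total * empirical_entropy(group)
               for group in groups.values())


def information_gain(feature_values, labels):
    """g(D, A) = H(D) - H(D|A)."""
    return empirical_entropy(labels) - conditional_entropy(feature_values, labels)


gender = [0, 0, 1, 1]
results = ["normal", "abnormal", "normal", "abnormal"]
print(information_gain(gender, results))
```

  • In this toy data each gender group is half normal and half abnormal, so knowing the gender removes no uncertainty and the gain is zero; a feature that separated the two categories perfectly would have a gain equal to H(D).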
  • The server uses the Hunt algorithm to build the decision tree recursively. After calculating the information gain of each feature parameter and selecting a feature, it obtains the training set data corresponding to the feature parameter with the largest information gain and applies the same feature selection method to the resulting subsets, gradually dividing the training data set into purer subsets.
  • After building the initial decision tree risk assessment model, the server uses the first feature parameters and the first target category of each sample in the training data set to count the negative sample data in the training data set that match the feature parameter combination of each classification node in the initial decision tree risk assessment model. It then calculates the ratio of the counted negative sample data to the total negative sample data in the training data set and uses this ratio as the risk probability of each classification node.
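  • The recursive construction and the per-node risk probability described above might look like the following sketch. It is an illustrative ID3-style builder under stated assumptions: samples are `(features, label)` pairs, the label strings are `"normal"` and `"abnormal"`, and the tree is a nested dictionary. The leaf risk probability follows the text's definition (negative samples matching the leaf divided by all negative samples in the training set).

```python
from collections import Counter
from math import log2


def entropy(rows):
    """Empirical entropy of the labels in a list of (features, label) rows."""
    counts = Counter(label for _, label in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def partition(rows, feature):
    """Group rows by the value they take for the given feature."""
    groups = {}
    for features, label in rows:
        groups.setdefault(features[feature], []).append((features, label))
    return groups


def information_gain(rows, feature):
    groups = partition(rows, feature).values()
    return entropy(rows) - sum(len(g) / len(rows) * entropy(g) for g in groups)


def build_tree(rows, features, total_negative):
    """Split on the highest-gain feature until a node is pure or features run out."""
    labels = {label for _, label in rows}
    if len(labels) == 1 or not features:
        negatives = sum(1 for _, label in rows if label == "abnormal")
        risk = negatives / total_negative if total_negative else 0.0
        return {"risk_probability": risk}  # classification (leaf) node
    best = max(features, key=lambda f: information_gain(rows, f))
    remaining = [f for f in features if f != best]
    return {
        "feature": best,
        "children": {value: build_tree(group, remaining, total_negative)
                     for value, group in partition(rows, best).items()},
    }


def train(rows, features):
    total_negative = sum(1 for _, label in rows if label == "abnormal")
    return build_tree(rows, features, total_negative)


# Made-up training samples for illustration only.
ROWS = [
    ({"gender": 0, "daily_checks": "high"}, "abnormal"),
    ({"gender": 1, "daily_checks": "high"}, "abnormal"),
    ({"gender": 0, "daily_checks": "low"}, "normal"),
    ({"gender": 1, "daily_checks": "low"}, "normal"),
    ({"gender": 0, "daily_checks": "low"}, "normal"),
]
print(train(ROWS, ["gender", "daily_checks"]))
```

  • Because each leaf's value is a share of all negative samples rather than the fraction of negatives within the leaf, the leaf values of a tree sum to at most 1 under this definition.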
  • Step 207: Extract a second feature parameter and a second target category from the test set data.
  • the server extracts the second feature parameter and the second target category one by one from each sample of the test set data.
  • the second characteristic parameter includes an identity characteristic parameter and a verification time parameter, that is, the second feature parameter corresponds to the identity characteristic parameter extracted from the passenger identity data in the actual security inspection and the verification time parameter extracted from the passenger historical verification data.
  • the second target category is the category of the security inspection results. The second target category is divided into two categories: normal security inspection and abnormal security inspection.
  • Step 209: Verify the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, then adjust the initial decision tree risk assessment model and generate a preset risk assessment model according to the verification result.
  • The server counts the negative sample data in the test data set that match the feature parameter combination of each classification node in the initial decision tree risk assessment model, calculates the proportion of the counted negative sample data in the total negative sample data in the test data set, and verifies the risk probability of each classification node in the decision tree model against the calculated proportion.
  • The server can set a preset tolerance error. When the absolute difference between the calculated proportion and the risk probability is less than the preset tolerance error, the verification passes; when the absolute difference is greater than the preset tolerance error, the verification fails.
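  • The tolerance check in step 209 can be sketched as a comparison per classification node. The node identifiers, counts, and tolerance value are made up for the example, and treating a difference exactly equal to the tolerance as a failure is an assumption, since the text only covers the "less than" and "greater than" cases.

```python
def verify_node_probabilities(trained, test_matches, total_test_negative, tolerance):
    """Compare each node's trained risk probability with the test-set proportion.

    trained: node id -> risk probability computed from the training set.
    test_matches: node id -> number of negative test samples matching the node.
    total_test_negative: number of negative samples in the test set.
    Returns node id -> True if the verification passes.
    """
    if total_test_negative <= 0:
        raise ValueError("the test set must contain negative samples")
    results = {}
    for node_id, probability in trained.items():
        observed = test_matches.get(node_id, 0) / total_test_negative
        results[node_id] = abs(observed - probability) < tolerance
    return results


outcome = verify_node_probabilities(
    trained={"leaf_a": 0.30, "leaf_b": 0.10},
    test_matches={"leaf_a": 14, "leaf_b": 1},
    total_test_negative=50,
    tolerance=0.05,
)
print(outcome)
```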
  • the server can add the sample data in the test data set to the training data set, expand the sample capacity to train the initial decision tree risk assessment model, and adjust the initial decision tree risk assessment model to generate a preset risk assessment model.
  • The identity information risk assessment method may further include: when the verification data update time is reached, loading the updated verification data; extracting from the verification data a third characteristic parameter and a risk target mark corresponding to the preset risk assessment model; and verifying the risk probability of each classification node in the preset risk assessment model according to the third characteristic parameter and the risk target mark, and optimizing the preset risk assessment model according to the verification result.
  • The server presets the verification data update time, which is the time at which the security check data of the security place is updated.
  • The server loads the updated verification data.
  • The verification data includes the passenger's identity data, verification time, and security check results.
  • The security terminal can actively or passively send the updated verification data to the server.
  • The server extracts the third characteristic parameter and the risk target mark from the verification data.
  • The third characteristic parameter corresponds to the feature set in the preset risk assessment model.
  • The risk target mark is a security check result mark, which is divided into two types: a no-security-risk mark and a security-risk mark.
  • According to the third characteristic parameter and the risk target mark of each sample in the verification data, the server counts the negative sample data in the verification data that match the characteristic parameter combination of each classification node in the preset risk assessment model, calculates the proportion of the counted negative sample data in the total negative sample data in the verification data, and verifies the risk probability of each classification node in the preset risk assessment model against the calculated proportion.
  • The server can set a preset deviation. When the absolute difference between the calculated proportion and the risk probability is less than the preset deviation, the verification passes; when the absolute difference is greater than the preset deviation, the verification fails.
  • The server can continue to train and adjust the preset risk assessment model with the verification data, continuously optimizing it so that, through training on large volumes of data, the risk assessment results obtained through the preset risk assessment model become increasingly accurate.
  • The step of generating a risk assessment result according to the identity risk probability may include: finding, from the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value; obtaining node data of the decision path; and generating and outputting a seizure path map according to the node data and the identity risk probability with the largest probability value.
  • The identity characteristic parameters and verification time parameters may match the feature parameters in multiple decision paths in the preset risk assessment model, so identity risk probabilities corresponding to multiple matching classification nodes may be obtained.
  • For example, when the characteristic parameter corresponding to the classification nodes in the risk assessment model is the clearance frequency, the input parameters may satisfy both decision path one, in which the classification node is "the number of security checks of the day" and the node characteristic value is "greater than twice", and decision path two, in which the classification node is "the number of security checks in the last natural week" and the node characteristic value is "between 8-15 days".
  • The identity risk probability corresponding to decision path one is 21%, and the identity risk probability corresponding to decision path two is 25%.
  • the server finds the decision path corresponding to the identity risk probability with the highest probability value from the calculation result of the preset risk assessment model.
  • the server obtains the characteristic parameters corresponding to each node in the found decision path, and the nodes include internal branch nodes and classification leaf nodes.
  • The server connects the characteristic parameters of all nodes in series to generate a seizure path map, adds the final identity risk probability to the seizure path map, and returns the seizure path map to the security terminal. The security terminal displays the seizure path map, that is, it visually displays the output of the preset risk assessment model, so that security personnel can clearly understand the characteristics of the current passenger and the potential security risks, and can determine from the visualized path map whether to inspect the current passenger further.
  • For example, the passenger's identity characteristic parameters and verification time parameters input to the model are “Zhang San, male, 24 years old, Chinese, born in Guangdong, the third verification this month”, which is in line with the “male-20” in the preset risk assessment model.
  • the decision path has a 30% risk probability, which is the decision path with the highest probability value among the decision paths that match all the characteristic parameters.
  • a seizure path map is generated based on the decision path and the corresponding risk probability.
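  • Selecting the matching decision path with the largest risk probability and rendering it as a textual path map can be sketched as follows. The two paths reuse the 21% and 25% figures from the example above; the path data structure, the feature names, and the arrow-joined text format are assumptions, since the description does not say how the map is drawn.

```python
# Hypothetical decision paths: each node is (feature, description, test).
PATHS = [
    {"nodes": [("daily_checks", "greater than twice", lambda v: v > 2)],
     "risk_probability": 0.21},
    {"nodes": [("weekly_checks", "between 8-15", lambda v: 8 <= v <= 15)],
     "risk_probability": 0.25},
]


def highest_risk_path(paths, params):
    """Return the matching path with the largest risk probability, or None."""
    matched = [path for path in paths
               if all(test(params[feature]) for feature, _, test in path["nodes"])]
    if not matched:
        return None
    return max(matched, key=lambda path: path["risk_probability"])


def render_path_map(path):
    """Join the node data in series and append the final risk probability."""
    steps = [f"{feature} ({description})" for feature, description, _ in path["nodes"]]
    steps.append(f"risk probability {path['risk_probability']:.0%}")
    return " -> ".join(steps)


best = highest_risk_path(PATHS, {"daily_checks": 3, "weekly_checks": 9})
print(render_path_map(best))
```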
  • The step of generating a risk assessment result according to the identity risk probability may include: obtaining the identity risk probability with the largest probability value; obtaining current security manpower data and finding the security passenger flow threshold corresponding to the current security manpower data; obtaining preset threshold conversion data; and calculating the risk probability threshold based on the security passenger flow threshold and the preset threshold conversion data.
  • the server obtains the identity risk probability corresponding to each classification node of the preset risk assessment model, and selects the identity risk probability with the largest probability value from it.
  • The security passenger flow threshold is the maximum passenger flow that the current security manpower can screen.
  • the server obtains the current security manpower data.
  • the current security manpower data may include data such as the total security manpower deployed at the current security checkpoint, and the security manpower deployed at the current security checkpoint corresponding to the security terminal.
  • the server obtains the mapping relationship between the pre-stored security manpower data and the security passenger flow threshold, including the mapping relationship between the total security manpower and the total security passenger flow threshold, and the mapping relationship between the security manpower at the current security checkpoint and the corresponding security passenger flow threshold.
  • the server looks up the total security passenger flow threshold corresponding to the current total security manpower, and finds the security checkpoint security passenger flow threshold corresponding to the current security manpower at the security checkpoint.
  • the risk probability threshold is the minimum value of the identity risk probability that can determine that the passenger has a security risk.
  • the risk probability threshold is not fixed, but adjusted according to the security manpower. When the security manpower is sufficient, the risk probability threshold is set to be relatively small. Conversely, the risk probability threshold is set relatively large.
  • the server obtains preset threshold conversion data.
  • The preset threshold conversion data is data for converting between the security passenger flow threshold and the risk probability threshold.
  • The conversion data may be a mapping table between the security passenger flow threshold and the risk probability threshold, or may be a preset conversion formula, etc.
  • The server calculates the risk probability threshold corresponding to each security passenger flow threshold according to the preset threshold conversion data, including a first risk probability threshold corresponding to the total security passenger flow threshold and a second risk probability threshold corresponding to the security passenger flow threshold of the security checkpoint.
  • The minimum of the first risk probability threshold and the second risk probability threshold is used as the risk probability threshold.
  • The server compares the obtained identity risk probability with the largest probability value against the calculated risk probability threshold.
  • When the identity risk probability with the largest probability value is less than or equal to the risk probability threshold, the current passenger passes the security check, and the server can generate a security check notification and return it to the security terminal; when the identity risk probability with the largest probability value exceeds the risk probability threshold, the server generates a risk check alert prompt, which can carry the calculated identity risk probability of the current passenger.
  • The server sends the generated risk check alert prompt to the security terminal to remind the staff at the security checkpoint that the passenger poses a certain security risk and needs a further security check.
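  • The manpower-dependent threshold described above can be sketched end to end: manpower is mapped to a passenger flow threshold, each flow threshold is converted to a risk probability threshold, the smaller of the two is used, and the passenger's largest identity risk probability is compared against it. Every number and table below is a made-up placeholder; the description only states that the mappings are stored in advance and that more manpower yields a smaller threshold.

```python
# Hypothetical pre-stored mappings from manpower to passenger flow thresholds.
TOTAL_FLOW_BY_MANPOWER = {10: 500, 20: 1200}
CHECKPOINT_FLOW_BY_MANPOWER = {2: 100, 4: 250}


def flow_to_risk_threshold(flow_threshold):
    """Hypothetical conversion: more screening capacity, lower risk threshold."""
    if flow_threshold >= 1000:
        return 0.15
    if flow_threshold >= 200:
        return 0.25
    return 0.35


def risk_probability_threshold(total_manpower, checkpoint_manpower):
    """Use the minimum of the first (total) and second (checkpoint) thresholds."""
    first = flow_to_risk_threshold(TOTAL_FLOW_BY_MANPOWER[total_manpower])
    second = flow_to_risk_threshold(CHECKPOINT_FLOW_BY_MANPOWER[checkpoint_manpower])
    return min(first, second)


def assess(max_risk_probability, total_manpower, checkpoint_manpower):
    """Pass when the probability does not exceed the threshold, else alert."""
    threshold = risk_probability_threshold(total_manpower, checkpoint_manpower)
    if max_risk_probability <= threshold:
        return {"result": "pass"}
    return {"result": "alert", "risk_probability": max_risk_probability}


print(assess(0.25, total_manpower=20, checkpoint_manpower=2))
print(assess(0.25, total_manpower=10, checkpoint_manpower=4))
```

  • The same 25% identity risk probability triggers an alert in the first call and passes in the second, which illustrates the point that the threshold is not fixed but follows the available security manpower.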
  • Although the steps in the flowcharts of FIGS. 2-3 are displayed sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated in this document, the execution order of these steps is not strictly limited, and they can be performed in other orders. Moreover, at least some of the steps in FIGS. 2-3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time but may be performed at different times, and their execution order is not necessarily sequential: they may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
  • An identity information risk assessment device is provided, including: an identity data acquisition module 410, an identity parameter extraction module 420, a time parameter acquisition module 430, a risk probability acquisition module 440, and a risk result generation module 450, where:
  • the identity data obtaining module 410 is configured to receive identity data.
  • the identity parameter extraction module 420 is configured to extract identity characteristic parameters from the identity data.
  • the time parameter obtaining module 430 is configured to find historical verification data corresponding to the identity data, and extract verification time parameters from the historical verification data.
  • a risk probability obtaining module 440 is configured to input identity characteristic parameters and verification time parameters into a preset risk assessment model to obtain an identity risk probability.
  • the risk result generating module 450 is configured to generate a risk assessment result according to the identity risk probability.
  • The apparatus may further include:
  • A data acquisition module, configured to collect sample data and divide the sample data into training set data and test set data.
  • A training data extraction module, configured to extract a first feature parameter and a first target category from the training set data.
  • An initial model building module, configured to perform feature gain evaluation according to the first feature parameter and the first target category, perform feature selection according to the feature gain evaluation result, classify according to the selected features to obtain an initial decision tree risk assessment model, and calculate the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data.
  • A test data extraction module, configured to extract a second feature parameter and a second target category from the test set data.
  • An evaluation model generation module, configured to verify the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, and to adjust the initial decision tree risk assessment model according to the verification result to generate the preset risk assessment model.
  • The apparatus may further include:
  • A verification data loading module, configured to load updated verification data when the verification data update time is reached.
  • A verification data extraction module, configured to extract from the verification data a third feature parameter and a risk target label corresponding to the preset risk assessment model.
  • A model optimization module, configured to verify the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target label, and optimize the preset risk assessment model according to the verification result.
  • The risk result generation module 450 may include:
  • A path finding module, configured to find, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value.
  • A path data acquisition module, configured to acquire the node data of the decision path.
  • A path map generation module, configured to generate and output a seizure path map according to the node data and the identity risk probability with the largest probability value.
  • The risk result generation module 450 may include:
  • A probability acquisition module, configured to acquire the identity risk probability with the largest probability value.
  • A security threshold search module, configured to obtain the current security manpower data and find the security passenger flow threshold corresponding to the current security manpower data.
  • A risk threshold calculation module, configured to obtain preset threshold conversion data, and calculate a risk probability threshold according to the security passenger flow threshold and the preset threshold conversion data.
  • An early warning generation module, configured to generate and output a risk check alert when the identity risk probability with the largest probability value exceeds the risk probability threshold.
  • The identity parameter extraction module 420 may include:
  • A number extraction module, configured to extract the document number from the identity data.
  • A document type search module, configured to perform document format recognition on the document number and find the document type corresponding to the format recognition result.
  • A segmentation module, configured to segment the document number according to the document type to obtain segmented strings.
  • A parameter search module, configured to find the identity characteristic parameter corresponding to each segmented string.
  • Each module in the above identity information risk assessment apparatus may be implemented in whole or in part by software, hardware, or a combination thereof.
  • The above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and perform the operations corresponding to each module.
  • A computer device is provided.
  • The computer device may be a server, and its internal structure may be as shown in FIG. 5.
  • The computer device includes a processor, a memory, a network interface, and a database connected through a system bus.
  • The processor of the computer device provides computing and control capabilities.
  • The memory of the computer device includes a non-volatile storage medium and an internal memory.
  • The non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium.
  • The database of the computer device stores data related to identity information risk assessment.
  • The network interface of the computer device communicates with an external terminal through a network connection.
  • The computer-readable instructions, when executed by the processor, implement an identity information risk assessment method.
  • FIG. 5 is only a block diagram of the part of the structure related to the solution of the present application, and does not limit the computer device to which the solution is applied.
  • A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • A computer device includes a memory and one or more processors.
  • Computer-readable instructions are stored in the memory.
  • When executed by the processor, the computer-readable instructions cause the one or more processors to perform the following steps: receiving identity data; extracting identity characteristic parameters from the identity data; finding historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data; inputting the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and generating a risk assessment result according to the identity risk probability.
  • When the processor executes the computer-readable instructions, the following steps are further implemented: collecting sample data and dividing the sample data into training set data and test set data; extracting a first feature parameter and a first target category from the training set data; performing feature gain evaluation according to the first feature parameter and the first target category, performing feature selection according to the feature gain evaluation result, classifying according to the selected features to obtain an initial decision tree risk assessment model, and calculating the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data; extracting a second feature parameter and a second target category from the test set data; and verifying the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, and adjusting the initial decision tree risk assessment model according to the verification result to generate the preset risk assessment model.
  • When the processor executes the computer-readable instructions, the following steps are further implemented: when the verification data update time is reached, loading the updated verification data; extracting from the verification data a third feature parameter and a risk target label corresponding to the preset risk assessment model; and verifying the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target label, and optimizing the preset risk assessment model according to the verification result.
  • The step of generating a risk assessment result according to the identity risk probability further includes: finding, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value; obtaining the node data of the decision path; and generating and outputting a seizure path map according to the node data and the identity risk probability with the largest probability value.
  • The step of generating a risk assessment result according to the identity risk probability further includes: obtaining the identity risk probability with the largest probability value; obtaining the current security manpower data and finding the security passenger flow threshold corresponding to the current security manpower data; obtaining preset threshold conversion data, and calculating a risk probability threshold according to the security passenger flow threshold and the preset threshold conversion data; and, when the identity risk probability with the largest probability value exceeds the risk probability threshold, generating and outputting a risk check alert.
  • When the processor executes the computer-readable instructions to implement the step of extracting identity characteristic parameters from the identity data, the processor is further configured to: extract a document number from the identity data; perform document format recognition on the document number and find the document type corresponding to the format recognition result; segment the document number according to the document type to obtain segmented strings; and find the identity characteristic parameter corresponding to each segmented string.
  • One or more non-volatile storage media storing computer-readable instructions are provided.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps: receiving identity data; extracting identity characteristic parameters from the identity data; finding historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data; inputting the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and generating a risk assessment result according to the identity risk probability.
  • The following steps are further implemented: collecting sample data and dividing the sample data into training set data and test set data; extracting a first feature parameter and a first target category from the training set data; performing feature gain evaluation according to the first feature parameter and the first target category, performing feature selection according to the feature gain evaluation result, classifying according to the selected features to obtain an initial decision tree risk assessment model, and calculating the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data; extracting a second feature parameter and a second target category from the test set data; and verifying the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, and adjusting the initial decision tree risk assessment model according to the verification result to generate the preset risk assessment model.
  • The following steps are further implemented: when the verification data update time is reached, loading the updated verification data; extracting from the verification data a third feature parameter and a risk target label corresponding to the preset risk assessment model; and verifying the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target label, and optimizing the preset risk assessment model according to the verification result.
  • The step of generating a risk assessment result according to the identity risk probability further includes: finding, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value; obtaining the node data of the decision path; and generating and outputting a seizure path map according to the node data and the identity risk probability with the largest probability value.
  • The step of generating a risk assessment result according to the identity risk probability further includes: obtaining the identity risk probability with the largest probability value; obtaining the current security manpower data and finding the security passenger flow threshold corresponding to the current security manpower data; obtaining preset threshold conversion data, and calculating a risk probability threshold according to the security passenger flow threshold and the preset threshold conversion data; and, when the identity risk probability with the largest probability value exceeds the risk probability threshold, generating and outputting a risk check alert.
  • The step of extracting identity characteristic parameters from the identity data further includes: extracting a document number from the identity data; performing document format recognition on the document number and finding the document type corresponding to the format recognition result; segmenting the document number according to the document type to obtain segmented strings; and finding the identity characteristic parameter corresponding to each segmented string.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM random access memory
  • DRAM dynamic RAM
  • SDRAM synchronous DRAM
  • DDRSDRAM double data rate SDRAM
  • ESDRAM enhanced SDRAM
  • SLDRAM Synchlink DRAM
  • RDRAM Rambus direct RAM
  • DRDRAM direct Rambus dynamic RAM
  • RDRAM Rambus dynamic RAM

Abstract

Provided is an identity information risk assessment method, comprising: receiving identity data; extracting an identity feature parameter from the identity data; searching for historical verification data corresponding to the identity data, and extracting a verification time parameter from the historical verification data; inputting the identity feature parameter and the verification time parameter into a preset risk assessment model to obtain an identity risk probability; and generating a risk assessment result according to the identity risk probability.

Description

Identity information risk assessment method, apparatus, computer device, and storage medium
Cross-reference to related applications
This application claims priority to Chinese patent application No. 2018107914492, filed with the Chinese Patent Office on July 18, 2018 and entitled "Identity Information Risk Assessment Method, Apparatus, Computer Device, and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to an identity information risk assessment method, apparatus, computer device, and storage medium.
Background
Large numbers of passengers clear entry and exit points such as airports and ports every day, and some of them are lawbreakers such as smugglers and illegal border crossers.
When security personnel at entry and exit points conduct security checks on passengers, they usually rely on their own work experience, observing passengers' words and expressions to judge whether a passenger poses a security risk. However, because the daily flow of people clearing customs is very large, manual inspection by security personnel alone can identify only a limited number of risky passengers. As a result, the accuracy of security checks at entry and exit points is very low, and many lawbreakers slip through the net.
Summary of the invention
According to various embodiments disclosed in the present application, an identity information risk assessment method, apparatus, computer device, and storage medium are provided.
An identity information risk assessment method includes:
receiving identity data;
extracting identity characteristic parameters from the identity data;
finding historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data;
inputting the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and
generating a risk assessment result according to the identity risk probability.
An identity information risk assessment apparatus includes:
an identity data acquisition module, configured to receive identity data;
an identity parameter extraction module, configured to extract identity characteristic parameters from the identity data;
a time parameter acquisition module, configured to find historical verification data corresponding to the identity data, and extract verification time parameters from the historical verification data;
a risk probability obtaining module, configured to input the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and
a risk result generation module, configured to generate a risk assessment result according to the identity risk probability.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
receiving identity data;
extracting identity characteristic parameters from the identity data;
finding historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data;
inputting the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and
generating a risk assessment result according to the identity risk probability.
One or more non-volatile storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
receiving identity data;
extracting identity characteristic parameters from the identity data;
finding historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data;
inputting the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and
generating a risk assessment result according to the identity risk probability.
Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will become apparent from the description, the drawings, and the claims.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an application scenario diagram of an identity information risk assessment method according to one or more embodiments.
FIG. 2 is a schematic flowchart of an identity information risk assessment method according to one or more embodiments.
FIG. 3 is a schematic flowchart of a method for generating a preset risk assessment model according to one or more embodiments.
FIG. 4 is a structural block diagram of an identity information risk assessment apparatus according to one or more embodiments.
FIG. 5 is an internal structural diagram of a computer device according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application and are not intended to limit it.
The identity information risk assessment method provided in the present application can be applied in the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 through a network. The server 104 receives the passenger identity data sent by the terminal 102, extracts identity characteristic parameters from the received identity data, finds historical verification data corresponding to the identity data, extracts verification time parameters from the historical verification data, inputs the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability, and generates a risk assessment result according to the identity risk probability. The server 104 then returns the generated risk assessment result to the terminal 102. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device. The server 104 may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, an identity information risk assessment method is provided. Taking the application of the method to the server 104 in FIG. 1 as an example, the method includes the following steps:
Step 210: Receive identity data.
Identity data is data that can uniquely determine a passenger's identity, such as the document type and document number of an identity document such as an ID card, visa, or student card.
Staff at the security terminal can collect the identity data of the passenger being checked through an identity information collection device such as a card reader, which transmits the collected identity data to the security terminal; staff can also enter the passenger's identity data at the security terminal directly. The security terminal sends the acquired identity data to the server, and the server receives it.
Step 220: Extract identity characteristic parameters from the identity data.
Identity characteristic parameters characterize the passenger and may include the passenger's age, place of origin, gender, and so on. The passenger's identity data contains these characteristic parameters, and the server extracts them from the received identity data.
In one embodiment, the step of extracting identity characteristic parameters from the identity data may include: extracting a document number from the identity data; performing document format recognition on the document number and finding the document type corresponding to the format recognition result; segmenting the document number according to the document type to obtain segmented strings; and finding the identity characteristic parameter corresponding to each segmented string.
The server extracts the document number from the identity data and performs document format recognition on it, identifying format properties such as the length of the number and its composition of letters and digits. The mapping between document types and document formats is stored on the server in advance, and the server looks up the document type corresponding to the recognized format. In an embodiment, the document types may include ID card, travel permit, home return permit, passport, and so on.
In the document number of each document type, the character string at a preset position corresponds to a particular identity characteristic parameter. The server obtains the preset string positions and preset lengths for the document type and segments the document number accordingly to obtain segmented strings. The server then obtains the data conversion table for each preset string of that document type; the table stores the correspondence between specific string values and identity characteristic parameters. The server looks up, in the data conversion table, the identity characteristic parameter corresponding to each segmented string.
For example, if the document type is ID card, segmenting the ID number yields its first three digits as one segmented string, and the identity characteristic corresponding to those three digits is the passenger's place of origin. The server obtains the data conversion table for the first three digits; if they are "410", the place-of-origin parameter found in the table for "410" is "Henan". In this way, the server looks up the identity characteristic parameters corresponding to all segmented strings.
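The format recognition, segmentation, and table lookup described for step 220 can be sketched as follows. This is only a minimal illustration under stated assumptions: the format rule (an 18-character number made of 17 digits plus a digit or X), the names `FORMAT_RULES`, `SEGMENT_LAYOUT`, and `CONVERSION_TABLES`, and the single table entry are hypothetical and are not specified by the application; only the "410" to "Henan" example comes from the text.

```python
import re
from typing import Dict, Optional

# Hypothetical format rules: compiled pattern -> document type.
FORMAT_RULES = [(re.compile(r"\d{17}[\dXx]"), "id_card")]

# Hypothetical layout: document type -> {feature name: (start, length)}.
SEGMENT_LAYOUT = {"id_card": {"origin": (0, 3)}}

# Hypothetical conversion tables: (document type, feature) -> {segment: value}.
CONVERSION_TABLES = {("id_card", "origin"): {"410": "Henan"}}


def detect_document_type(number: str) -> Optional[str]:
    """Return the document type whose format matches the whole number."""
    for pattern, doc_type in FORMAT_RULES:
        if pattern.fullmatch(number):
            return doc_type
    return None


def extract_identity_features(number: str) -> Dict[str, Optional[str]]:
    """Segment the number by document type and look up each segment."""
    doc_type = detect_document_type(number)
    if doc_type is None:
        return {}  # unknown format: no features can be extracted
    features = {}
    for name, (start, length) in SEGMENT_LAYOUT[doc_type].items():
        segment = number[start:start + length]
        # A segment missing from the table maps to None.
        features[name] = CONVERSION_TABLES[(doc_type, name)].get(segment)
    return features
```

Keeping the layout and the conversion tables as data rather than code means that supporting another document type would only require new table entries.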
Step 230: Find historical verification data corresponding to the identity data, and extract verification time parameters from the historical verification data.
Historical verification data is the record of a passenger's past security checks and may include the time and result of each check. The server finds the historical verification data for the passenger according to the document number in the passenger's identity data.
The verification time parameters may include how often the passenger enters and exits the checkpoint for security checks, the time period of each security check, and the time period to which the current check belongs. Specifically, the frequency can be defined as a daily, weekly, or monthly security check frequency, and staff can set it according to actual security check requirements. The server performs statistics on the historical verification data to derive each verification time parameter.
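One way the statistics of step 230 might be computed is sketched below. It assumes the historical verification data has already been reduced to a list of check timestamps; the window lengths (1, 7, and 30 days) and the four time-of-day buckets are illustrative assumptions, since the application leaves these settings to the staff.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Union


def time_bucket(moment: datetime) -> str:
    """Map a timestamp to an assumed time-of-day bucket."""
    if moment.hour < 6:
        return "night"
    if moment.hour < 12:
        return "morning"
    if moment.hour < 18:
        return "afternoon"
    return "evening"


def verification_time_parameters(
    history: List[datetime], now: datetime
) -> Dict[str, Union[int, str]]:
    """Count past checks in trailing windows ending at `now`."""

    def count_within(days: int) -> int:
        cutoff = now - timedelta(days=days)
        return sum(1 for t in history if cutoff < t <= now)

    return {
        "daily_frequency": count_within(1),
        "weekly_frequency": count_within(7),
        "monthly_frequency": count_within(30),
        "current_time_bucket": time_bucket(now),
    }
```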
Step 240: Input the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability.
The server obtains the preset risk assessment model, a model set up in advance to assess passenger security risk. Its inputs are the identity characteristic parameters and verification time parameters, and its output is the probability that the passenger poses a security risk. The server inputs the extracted parameters into the preset risk assessment model, which processes them to produce the identity risk probability.
Step 250: Generate a risk assessment result according to the identity risk probability.
The server generates a risk assessment result from the calculated identity risk probability. The result may include the identity risk probability, the passenger's historical security check information, security deployment recommendations, and other information.
在本实施例中,服务器从接收的旅客的身份数据中提取出身份特征参数,并查找与提取的身份特征参数对应的历史核查数据,且预先设定风险评估模型,将身份特征参数和对应的历史核查数据输入预设风险评估模型能够得到旅客的身份风险概率,从而能够根据旅客特征和历史数据对旅客的安全风险进行科学地计算评估,提高安防检查准确率。In this embodiment, the server extracts the identity characteristic parameters from the received passenger's identity data, and searches for historical verification data corresponding to the extracted identity characteristic parameters, and sets a risk assessment model in advance to combine the identity characteristic parameters with the corresponding The historical verification data can be used to enter the preset risk assessment model to obtain the passenger's identity risk probability, so that the passenger's security risk can be calculated and evaluated scientifically based on the passenger's characteristics and historical data, thereby improving the accuracy of the security inspection.
In one embodiment, as shown in FIG. 3, the preset risk assessment model is generated as follows:
Step 201: Collect sample data and divide it into training set data and test set data.
The sample data is historical data from security inspections at real security sites. The server collects historical inspection data within a preset time range, which may be set to 1 month, 3 months, half a year, and so on. The server randomly divides the collected sample data into training set data and test set data; the two sets may contain the same or different numbers of samples.
Step 203: Extract a first feature parameter and a first target category from the training set data.
According to the inspection result, the sample data can be divided into positive and negative sample data. Positive sample data is the historical inspection data of passengers whose inspection result was normal; negative sample data is that of passengers whose inspection result was abnormal. Both the training set data and the test set data contain positive and negative samples.
The server extracts the first feature parameter and the first target category from each sample of the training set data, one by one. The first feature parameter comprises identity characteristic parameters and verification time parameters; that is, it corresponds to the identity characteristic parameters extracted from passenger identity data and the verification time parameters extracted from historical verification data during an actual security inspection. The first target category is the category of the inspection result, which is either normal or abnormal.
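A random division of this kind could be sketched as follows. The 30% test share and the fixed seed are assumptions chosen for the example; the application only requires that the split be random and allows the two sets to differ in size.

```python
import random

def split_samples(samples, test_ratio=0.3, seed=0):
    """Randomly divide samples into a training set and a test set."""
    shuffled = list(samples)
    # A seeded generator keeps the example reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

train_set, test_set = split_samples(range(10))
```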
Step 205: Perform feature gain evaluation according to the first feature parameter and the first target category, select features according to the evaluation result, classify according to the selected features to obtain an initial decision tree risk assessment model, and calculate the risk probability of each classification node in the initial decision tree risk assessment model from the training set data.
In this embodiment, the preset risk assessment model is a decision tree model. A decision tree is a tree structure composed of nodes and directed edges that is used to classify instances. There are two types of node: internal nodes and leaf nodes. An internal node represents a test condition on a feature or attribute, and a leaf node represents a class. Classification with a decision tree proceeds as follows: starting from the root node, one feature of the instance is tested, and the instance is assigned to a child node according to the test result. That branch leads either to a leaf node or to another internal node; at an internal node the new test condition is applied and the process recurses until a leaf node is reached. The leaf node reached gives the final classification result and is taken as the classification node.
In this embodiment, the ID3 algorithm is used to build the initial decision tree risk assessment model. ID3 evaluates the information gain of each feature and, at each step, selects the feature parameter with the largest information gain as the test condition for creating child nodes. The server calculates the information gain of each feature in the first feature parameter, selects the feature with the largest information gain as the test condition and creates child nodes, divides the training set data corresponding to each child node into subsets, and recursively branches on the subsets to create branch nodes until every branch node corresponds to a single first target category.
Specifically, the server uses formula (1) below to calculate the information gain of each feature in the first feature parameter:
g(D,A) = H(D) - H(D|A)    (1)
where g(D,A) is the information gain of feature A with respect to training data set D, H(D) is the empirical entropy of training data set D, and H(D|A) is the empirical conditional entropy of D given feature A.
The server uses formula (2) below to calculate the empirical entropy H(D) of training data set D:
H(D) = -\sum_{k=1}^{K} \frac{C_k}{|D|} \log_2 \frac{C_k}{|D|}    (2)
where C_k is the number of samples belonging to the k-th first target category and K is the number of first target categories. In this embodiment there are two first target categories: normal inspection result and abnormal inspection result.
The server uses formula (3) below to calculate the empirical conditional entropy H(D|A) of training data set D given feature A:
H(D|A) = \sum_{i \in value(A)} \frac{|D_i|}{|D|} H(D_i)    (3)
where value(A) is the set of all values of feature A, i is one such value, D_i is the subset of training data set D in which feature A takes the value i, |D_i| is the number of samples in that subset, and |D| is the total number of samples before the split. For example, the feature A corresponding to the gender parameter takes the values male and female; if male is encoded as 0 and female as 1, value(A) is {0, 1}.
The server builds the decision tree recursively in the manner of Hunt's algorithm. After calculating the information gain of each feature parameter and selecting a feature, it obtains the training data subsets corresponding to the feature parameter with the largest information gain and applies the same feature selection to each subset, so that the training data set is progressively divided into purer subsets.
Hunt's algorithm is defined recursively as follows. Let D_t be the subset of training data associated with node t, and let y = {y_1, y_2, …, y_c} be the target category labels. If all samples in D_t belong to the same category, t is a leaf node labeled y_t. If D_t contains samples belonging to more than one category, a feature test condition is selected to divide the samples into smaller subsets: a branch node is created for each outcome of the test condition, and the samples in D_t are distributed to the branch nodes according to the test result. The algorithm is then called recursively on each branch node.
After building the initial decision tree risk assessment model, the server uses the first feature parameter and first target category of each sample in the training data set to count the negative samples that match the feature parameter combination of each classification node in the model. It calculates the proportion of those negative samples among all negative samples in the training data set and uses that proportion as the risk probability of the corresponding classification node.
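The quantities in formulas (1) to (3) can be computed directly from label counts. The sketch below is one way to do so, assuming base-2 logarithms as is conventional for ID3; the function names and the toy gender feature are illustrative only.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Empirical entropy H(D) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def conditional_entropy(values, labels):
    """Empirical conditional entropy H(D|A) for one feature's values."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weight the entropy of each subset D_i by its share |D_i| / |D|.
    return sum(len(g) / n * entropy(g) for g in groups.values())

def information_gain(values, labels):
    """Information gain g(D, A) = H(D) - H(D|A)."""
    return entropy(labels) - conditional_entropy(values, labels)

labels = ["normal", "normal", "abnormal", "abnormal"]
gain_informative = information_gain([0, 0, 1, 1], labels)    # feature separates the classes
gain_uninformative = information_gain([0, 1, 0, 1], labels)  # feature tells nothing
```

In this toy data the first feature splits the samples into two pure subsets, so its gain equals the full entropy of one bit, while the second feature leaves both subsets evenly mixed and has zero gain.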
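The recursion described above, combined with information-gain feature selection, can be sketched as follows. The nested-dictionary tree representation, the "label" key, and the fallback to the majority category when no features remain are assumptions of this example rather than details from the application.

```python
from collections import Counter
from math import log2

def entropy(rows):
    n = len(rows)
    counts = Counter(r["label"] for r in rows)
    return -sum(c / n * log2(c / n) for c in counts.values())

def gain(rows, feature):
    n = len(rows)
    parts = {}
    for r in rows:
        parts.setdefault(r[feature], []).append(r)
    return entropy(rows) - sum(len(p) / n * entropy(p) for p in parts.values())

def build_tree(rows, features):
    labels = Counter(r["label"] for r in rows)
    # Leaf: every sample has the same category (or, by assumption, no feature is left).
    if len(labels) == 1 or not features:
        return {"leaf": labels.most_common(1)[0][0]}
    # Test condition: the feature with the largest information gain.
    best = max(features, key=lambda f: gain(rows, f))
    rest = [f for f in features if f != best]
    children = {}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        children[value] = build_tree(subset, rest)  # recurse on each branch
    return {"feature": best, "children": children}

# Hypothetical training samples.
rows = [
    {"gender": 0, "freq": "high", "label": "abnormal"},
    {"gender": 1, "freq": "high", "label": "abnormal"},
    {"gender": 0, "freq": "low", "label": "normal"},
    {"gender": 1, "freq": "low", "label": "normal"},
]
tree = build_tree(rows, ["gender", "freq"])
```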
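The per-node risk probability described here is a simple ratio. A minimal sketch, assuming each sample carries a "label" field and that a hypothetical function leaf_of maps a sample to the classification node it reaches:

```python
def leaf_risk_probabilities(samples, leaf_of):
    """Share of all negative (abnormal) samples that reach each classification node."""
    negatives = [s for s in samples if s["label"] == "abnormal"]
    if not negatives:
        return {}
    counts = {}
    for s in negatives:
        leaf = leaf_of(s)
        counts[leaf] = counts.get(leaf, 0) + 1
    return {leaf: c / len(negatives) for leaf, c in counts.items()}

# Hypothetical samples already annotated with the node they reach.
samples = [
    {"leaf": "A", "label": "abnormal"},
    {"leaf": "A", "label": "abnormal"},
    {"leaf": "A", "label": "abnormal"},
    {"leaf": "B", "label": "abnormal"},
    {"leaf": "B", "label": "normal"},
]
risk = leaf_risk_probabilities(samples, lambda s: s["leaf"])
```

Because the denominator is the total number of negative samples, the probabilities across all classification nodes sum to 1.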
Step 207: Extract a second feature parameter and a second target category from the test set data.
The server extracts the second feature parameter and the second target category from each sample of the test set data, one by one. The second feature parameter comprises identity characteristic parameters and verification time parameters; that is, it corresponds to the identity characteristic parameters extracted from passenger identity data and the verification time parameters extracted from historical verification data during an actual security inspection. The second target category is the category of the inspection result, which is either normal or abnormal.
Step 209: Verify the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, then adjust the initial decision tree risk assessment model according to the verification result and generate the preset risk assessment model.
Using the second feature parameter and second target category of each sample in the test data set, the server counts the negative samples in the test data set that match the feature parameter combination of each classification node in the initial decision tree risk assessment model, calculates the proportion of those negative samples among all negative samples in the test data set, and verifies the risk probability of each classification node against the calculated proportion. For the verification, the server may set a preset error tolerance: verification passes when the absolute difference between the calculated proportion and the risk probability is less than the tolerance, and fails when the absolute difference is greater than the tolerance. When verification fails, the server may add the samples of the test data set to the training data set, retrain the initial decision tree risk assessment model on the enlarged sample, and generate the preset risk assessment model from the adjusted model.
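The verification step amounts to comparing two sets of proportions node by node. In the sketch below the tolerance value of 0.05 is an arbitrary assumption, and a difference exactly equal to the tolerance is treated as a failure, a case the text leaves open.

```python
def validate_leaves(train_probs, test_probs, tolerance=0.05):
    """Return (passed, failed_leaves) for the node-by-node comparison."""
    failed = [leaf for leaf, p in train_probs.items()
              if abs(test_probs.get(leaf, 0.0) - p) >= tolerance]
    return not failed, failed

train_probs = {"A": 0.75, "B": 0.25}
ok_close, failed_close = validate_leaves(train_probs, {"A": 0.72, "B": 0.28})
ok_far, failed_far = validate_leaves(train_probs, {"A": 0.50, "B": 0.50})
```

When the check fails, the caller would merge the test samples into the training set and rebuild the model, as described above.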
In one embodiment, the identity information risk assessment method may further include: when a verification data update time is reached, loading updated verification data; extracting from the verification data a third feature parameter and a risk target mark corresponding to the preset risk assessment model; verifying the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target mark; and optimizing the preset risk assessment model according to the verification result.
The server sets the verification data update time in advance; it is the time at which the security inspection data of the security site is updated. When the preset update time is reached, the server loads the updated verification data, which includes passengers' identity characteristic data, verification times, and inspection results. The security terminal may send the updated verification data to the server actively or on request.
The server extracts the third feature parameter and the risk target mark from the verification data. The third feature parameter corresponds to the features defined in the preset risk assessment model. The risk target mark is the inspection result mark and is either a no-security-risk mark or a security-risk mark.
Using the third feature parameter and risk target mark of each sample in the verification data, the server counts the negative samples in the verification data that match the feature parameter combination of each classification node in the preset risk assessment model, calculates the proportion of those negative samples among all negative samples in the verification data, and verifies the risk probability of each classification node against the calculated proportion. For the verification, the server may set a preset deviation: verification passes when the absolute difference between the calculated proportion and the risk probability is less than the preset deviation, and fails when the absolute difference is greater than the preset deviation. When verification fails, the server may use the verification data to continue training and adjusting the preset risk assessment model. The model is thereby optimized continuously as verification data accumulates, so that the risk assessment results it produces become increasingly accurate.
In one embodiment, the step of generating a risk assessment result according to the identity risk probability may include: finding, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest value; obtaining the node data of that decision path; and generating and outputting a seizure path map from the node data and the largest identity risk probability.
After the server inputs the extracted identity characteristic parameters and verification time parameters into the preset risk assessment model, the parameters may match the feature parameters of several decision paths in the model, so identity risk probabilities for several matching classification nodes may be obtained. For example, if the feature parameter of the classification nodes is clearance frequency, the input may satisfy both decision path one, whose classification node is "number of inspections today" with the node feature value "more than twice", and decision path two, whose classification node is "number of inspections in the previous calendar week" with the node feature value "between 8 and 15 days". The identity risk probability of decision path one is 21% and that of decision path two is 25%. From the results computed by the preset risk assessment model, the server finds the decision path corresponding to the largest identity risk probability.
The server obtains the feature parameter of each node on the decision path found; the nodes include internal branch nodes and the classification leaf node. The server chains the feature parameters of all nodes together to generate a seizure path map, adds the resulting identity risk probability to the map, and returns the map to the security terminal for display. The output of the preset risk assessment model is thus presented visually, so that security personnel can clearly see the current passenger's characteristics and potential security risks and decide from the seizure path map whether the passenger should be inspected further.
For example, suppose the identity characteristic parameters and verification time parameters input to the model are "Zhang San, male, 24 years old, Chinese nationality, born in Guangdong, third verification this month". They match the decision path "male / age 20-30 / Chinese nationality / born in South China / 3-5 verifications in the current month" in the preset risk assessment model. The risk probability of this decision path is 30%, the largest among all decision paths matching the feature parameters, and the seizure path map is generated from this decision path and its risk probability.
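Selecting the highest-risk path and rendering it for display could look like the sketch below. The data layout (a list of dictionaries with "nodes" and "risk" keys) and the plain-text rendering are assumptions of this example; the application does not fix a format for the seizure path map.

```python
def seizure_path(matched_paths):
    """Pick the matched decision path with the largest risk and render it as text."""
    best = max(matched_paths, key=lambda p: p["risk"])
    return " -> ".join(best["nodes"]) + f" => risk {best['risk']:.0%}"

# The two matching paths from the clearance-frequency example above.
paths = [
    {"nodes": ["inspections today", "more than twice"], "risk": 0.21},
    {"nodes": ["inspections last week", "8-15"], "risk": 0.25},
]
rendered = seizure_path(paths)
```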
In one embodiment, the step of generating a risk assessment result according to the identity risk probability may include: obtaining the identity risk probability with the largest value; obtaining current security manpower data and looking up the security passenger flow threshold corresponding to it; obtaining preset threshold conversion data and calculating a risk probability threshold from the security passenger flow threshold and the preset threshold conversion data; and, when the largest identity risk probability exceeds the risk probability threshold, generating and outputting a risk inspection warning.
The server obtains the identity risk probabilities output for the classification nodes of the preset risk assessment model and selects the largest one.
The security passenger flow threshold is the maximum passenger flow that can be inspected with the current security manpower. The server obtains the current security manpower data, which may include the total security manpower deployed at the current inspection site and the security manpower deployed at the checkpoint corresponding to the security terminal. The server obtains the pre-stored mappings between security manpower data and security passenger flow thresholds, including the mapping between total security manpower and the total security passenger flow threshold and the mapping between the manpower at the current checkpoint and the corresponding checkpoint threshold. The server looks up the total security passenger flow threshold corresponding to the current total manpower and the checkpoint security passenger flow threshold corresponding to the manpower at the current checkpoint.
The risk probability threshold is the smallest identity risk probability at which a passenger is judged to pose a security risk. It is not fixed but is adjusted according to security manpower: when manpower is sufficient the threshold is set relatively low, and otherwise it is set relatively high.
The server obtains the preset threshold conversion data, which converts between the security passenger flow threshold and the risk probability threshold. The conversion data may be a mapping table between the two thresholds or a preset conversion formula. Using it, the server calculates the risk probability thresholds corresponding to the security passenger flow thresholds: a first risk probability threshold corresponding to the total security passenger flow threshold and a second risk probability threshold corresponding to the checkpoint security passenger flow threshold. The smaller of the two is used as the risk probability threshold.
The server compares the largest identity risk probability with the calculated risk probability threshold. When the largest identity risk probability is less than or equal to the threshold, the current passenger passes the inspection, and the server may generate a pass notification and return it to the security terminal. When it exceeds the threshold, the server generates a risk inspection warning, which may carry the calculated identity risk probability of the current passenger. If the passenger's historical verification data contains records of abnormal inspections, that information is also added to the warning. The server sends the warning to the security terminal to alert checkpoint staff that the passenger poses a certain security risk and requires further inspection.
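The threshold lookup and comparison can be sketched with plain lookup tables. All table contents, names, and numeric values below are hypothetical; the application allows either a mapping table or a conversion formula and does not give concrete figures.

```python
def risk_threshold(total_staff, gate_staff, total_flow, gate_flow, flow_to_threshold):
    """Return the smaller of the site-wide and checkpoint-level risk thresholds."""
    first = flow_to_threshold[total_flow[total_staff]]   # from total manpower
    second = flow_to_threshold[gate_flow[gate_staff]]    # from checkpoint manpower
    return min(first, second)

def decide(max_risk, threshold):
    # At or below the threshold the passenger passes; above it a warning is raised.
    return "alert" if max_risk > threshold else "pass"

# Hypothetical mappings: manpower -> passenger flow threshold -> risk threshold.
total_flow = {20: 1000, 40: 2000}
gate_flow = {2: 120, 4: 240}
flow_to_threshold = {1000: 0.30, 2000: 0.20, 120: 0.28, 240: 0.18}

threshold = risk_threshold(20, 2, total_flow, gate_flow, flow_to_threshold)
```

In these example tables more manpower maps to a higher passenger flow threshold and therefore to a lower risk threshold, matching the adjustment rule described above.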
It should be understood that although the steps in the flowcharts of FIGS. 2-3 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, there is no strict restriction on the order of execution, and the steps may be performed in other orders. Moreover, at least some of the steps in FIGS. 2-3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be performed at different times, and they are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 4, an identity information risk assessment apparatus is provided, including an identity data acquisition module 410, an identity parameter extraction module 420, a time parameter acquisition module 430, a risk probability obtaining module 440, and a risk result generation module 450, wherein:
The identity data acquisition module 410 is configured to receive identity data.
The identity parameter extraction module 420 is configured to extract identity characteristic parameters from the identity data.
The time parameter acquisition module 430 is configured to look up the historical verification data corresponding to the identity data and extract verification time parameters from the historical verification data.
The risk probability obtaining module 440 is configured to input the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability.
The risk result generation module 450 is configured to generate a risk assessment result according to the identity risk probability.
In one embodiment, the apparatus may further include:
A data collection module, configured to collect sample data and divide the sample data into training set data and test set data.
A training data extraction module, configured to extract a first feature parameter and a first target category from the training set data.
An initial model construction module, configured to perform feature gain evaluation according to the first feature parameter and the first target category, select features according to the evaluation result, classify according to the selected features to obtain an initial decision tree risk assessment model, and calculate the risk probability of each classification node in the initial decision tree risk assessment model from the training set data.
A test data extraction module, configured to extract a second feature parameter and a second target category from the test set data.
An assessment module generation module, configured to verify the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, adjust the initial decision tree risk assessment model according to the verification result, and generate the preset risk assessment model.
In one embodiment, the apparatus may further include:
A verification data loading module, configured to load updated verification data when the verification data update time is reached.
A verification data extraction module, configured to extract from the verification data a third feature parameter and a risk target mark corresponding to the preset risk assessment model.
A model optimization module, configured to verify the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target mark, and to optimize the preset risk assessment model according to the verification result.
In one embodiment, the risk result generation module 450 may include:
A path finding module, configured to find, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest value.
A path data acquisition module, configured to obtain the node data of the decision path.
A path map generation module, configured to generate and output a seizure path map from the node data and the largest identity risk probability.
In one embodiment, the risk result generation module 450 may include:
A probability acquisition module, configured to acquire the identity risk probability with the largest probability value.
A security threshold lookup module, configured to acquire current security manpower data and look up the security passenger flow threshold corresponding to the current security manpower data.
A risk threshold calculation module, configured to acquire preset threshold conversion data and calculate a risk probability threshold according to the security passenger flow threshold and the preset threshold conversion data.
An early warning prompt generation module, configured to generate and output a risk check early warning prompt when the identity risk probability with the largest probability value exceeds the risk probability threshold.
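The relationship between security manpower, passenger flow, and the risk probability threshold can be sketched as follows. The lookup table, the use of an expected passenger count as the "preset threshold conversion data", and the conversion formula are all assumptions chosen to make the idea concrete; the text does not define them.

```python
# Hypothetical lookup table:
# (minimum staff on duty, passengers that staff can inspect per hour).
FLOW_THRESHOLDS = [(5, 20), (10, 50), (20, 120)]


def security_flow_threshold(staff_on_duty):
    """Return the flow threshold of the highest staffing tier that is met."""
    threshold = 0
    for min_staff, flow in FLOW_THRESHOLDS:
        if staff_on_duty >= min_staff:
            threshold = flow
    return threshold


def risk_probability_threshold(flow_threshold, expected_passengers):
    """Assumed conversion: the smaller the share of passengers that can be
    inspected, the higher the risk probability needed to trigger a check."""
    if expected_passengers <= 0:
        return 0.0
    inspectable_share = min(1.0, flow_threshold / expected_passengers)
    return 1.0 - inspectable_share


def needs_inspection(max_risk_probability, threshold):
    """Raise an early warning only when the threshold is exceeded."""
    return max_risk_probability > threshold


flow = security_flow_threshold(12)
threshold = risk_probability_threshold(flow, expected_passengers=200)
alert = needs_inspection(0.8, threshold)
```

Under these assumptions, adding staff lowers the threshold, so more identities are selected for inspection when more inspectors are available.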
In one embodiment, the identity parameter extraction module 420 may include:
A number extraction module, configured to extract a document number from the identity data.
A document type lookup module, configured to perform document format recognition on the document number and look up the document type corresponding to the format recognition result.
A segmentation module, configured to segment the document number according to the document type to obtain segmented character strings.
A parameter lookup module, configured to look up the identity feature parameter corresponding to each segmented character string.
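The recognition, segmentation, and lookup steps can be sketched for one document type. The example assumes the 18-character resident identity number layout (6-digit region code, 8-digit birth date, 3-digit sequence code whose parity encodes gender, and one check character); the format table and function names are hypothetical, and the check character is not validated.

```python
import re
from datetime import date

# Assumed layout of an 18-character resident identity number:
# region code (6), birth date (8), sequence code (3), check character (1).
RESIDENT_ID = re.compile(r"^(\d{6})(\d{8})(\d{3})([\dXx])$")

# Hypothetical format table mapping a document type to its pattern.
DOCUMENT_FORMATS = {"resident_id": RESIDENT_ID}


def find_document_type(number):
    """Return the first document type whose format matches, else None."""
    for doc_type, pattern in DOCUMENT_FORMATS.items():
        if pattern.match(number):
            return doc_type
    return None


def segment_number(number, doc_type):
    """Split the number into segments according to its document type."""
    region, birth, sequence, check = DOCUMENT_FORMATS[doc_type].match(number).groups()
    return {"region": region, "birth": birth, "sequence": sequence, "check": check}


def lookup_features(segments, today):
    """Map each segment to an identity feature parameter."""
    raw = segments["birth"]
    birth = date(int(raw[:4]), int(raw[4:6]), int(raw[6:]))
    age = today.year - birth.year - ((today.month, today.day) < (birth.month, birth.day))
    gender = "male" if int(segments["sequence"]) % 2 else "female"
    return {"region_code": segments["region"], "age": age, "gender": gender}


number = "110101199003071234"  # made-up number for illustration
doc_type = find_document_type(number)
features = lookup_features(segment_number(number, doc_type), date(2020, 1, 1))
```

A real system would presumably map the region code to a place name through a reference table; that lookup is omitted here.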
For the specific limitations of the identity information risk assessment apparatus, refer to the limitations of the identity information risk assessment method above, which are not repeated here. Each module in the above identity information risk assessment apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 5. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device stores data related to identity information risk assessment. The network interface of the computer device communicates with external terminals through a network connection. The computer-readable instructions, when executed by the processor, implement an identity information risk assessment method.
Those skilled in the art will understand that the structure shown in FIG. 5 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps: receiving identity data; extracting identity feature parameters from the identity data; looking up historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data; inputting the identity feature parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and generating a risk assessment result according to the identity risk probability.
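The five steps above can be strung together in a short sketch. The in-memory history store, the choice of verification time parameters (check count and days since the last check), the stand-in model, and the 0.5 cut-off are all hypothetical; the preset risk assessment model is represented here by any callable that returns a probability.

```python
from datetime import date

# Hypothetical store of past verification dates keyed by document number.
HISTORY = {"110101199003071234": [date(2019, 12, 1), date(2019, 12, 20)]}


def verification_time_params(number, today):
    """Derive assumed verification time parameters from the history."""
    checks = HISTORY.get(number, [])
    if not checks:
        return {"check_count": 0, "days_since_last_check": None}
    return {
        "check_count": len(checks),
        "days_since_last_check": (today - max(checks)).days,
    }


def assess(number, identity_features, model, today, cutoff=0.5):
    """Combine both parameter groups, query the model, and build a result."""
    params = {**identity_features, **verification_time_params(number, today)}
    probability = model(params)
    return {"probability": probability, "high_risk": probability > cutoff}


def toy_model(params):
    """Stand-in for the preset risk assessment model (not the real one)."""
    return 0.8 if params["check_count"] >= 2 else 0.1


today = date(2020, 1, 1)
known = assess("110101199003071234", {"age": 29}, toy_model, today)
unknown = assess("000000000000000000", {"age": 40}, toy_model, today)
```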
In one embodiment, the processor, when executing the computer-readable instructions, further implements the following steps: collecting sample data, and dividing the sample data into training set data and test set data; extracting a first feature parameter and a first target category from the training set data; performing feature gain evaluation according to the first feature parameter and the first target category, performing feature selection according to the feature gain evaluation result, classifying according to the selected features to obtain an initial decision tree risk assessment model, and calculating the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data; extracting a second feature parameter and a second target category from the test set data; and verifying the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, adjusting the initial decision tree risk assessment model according to the verification result, and generating the preset risk assessment model.
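One common way to perform this kind of feature gain evaluation is information gain, as used in ID3-style decision trees. The sketch below assumes that interpretation, categorical features, and binary target categories (1 for risky, 0 otherwise); it selects a single split feature and computes the risk probability of each resulting node from the training data. The feature names and sample rows are made up.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def group_labels(rows, labels, feature):
    """Group the labels by the value each row takes for `feature`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[feature]].append(label)
    return groups


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on `feature`."""
    groups = group_labels(rows, labels, feature)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_feature(rows, labels, features):
    """Pick the feature with the highest information gain."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


def node_risk(rows, labels, feature):
    """Risk probability of each child node: share of risky samples in it."""
    return {value: sum(g) / len(g) for value, g in group_labels(rows, labels, feature).items()}


train_rows = [
    {"region": "A", "gender": "m"}, {"region": "A", "gender": "f"},
    {"region": "B", "gender": "m"}, {"region": "B", "gender": "f"},
]
train_labels = [1, 1, 0, 0]
best = select_feature(train_rows, train_labels, ["region", "gender"])
risks = node_risk(train_rows, train_labels, best)
```

A full implementation would recurse on each child node and then check the node probabilities against the test set; only the first split is shown here.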
In one embodiment, the processor, when executing the computer-readable instructions, further implements the following steps: loading updated verification data when the verification data update time is reached; extracting, from the verification data, a third feature parameter and a risk target label corresponding to the preset risk assessment model; and verifying the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target label, and optimizing the preset risk assessment model according to the verification result.
In one embodiment, when implementing the step of generating a risk assessment result according to the identity risk probability, the processor executing the computer-readable instructions is further configured to: find, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value; acquire the node data of the decision path; and generate and output a seizure path map according to the node data and the identity risk probability with the largest probability value.
In one embodiment, when implementing the step of generating a risk assessment result according to the identity risk probability, the processor executing the computer-readable instructions is further configured to: acquire the identity risk probability with the largest probability value; acquire current security manpower data, and look up the security passenger flow threshold corresponding to the current security manpower data; acquire preset threshold conversion data, and calculate a risk probability threshold according to the security passenger flow threshold and the preset threshold conversion data; and generate and output a risk check early warning prompt when the identity risk probability with the largest probability value exceeds the risk probability threshold.
In one embodiment, when implementing the step of extracting identity feature parameters from the identity data, the processor executing the computer-readable instructions is further configured to: extract a document number from the identity data; perform document format recognition on the document number, and look up the document type corresponding to the format recognition result; segment the document number according to the document type to obtain segmented character strings; and look up the identity feature parameter corresponding to each segmented character string.
One or more non-volatile storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving identity data; extracting identity feature parameters from the identity data; looking up historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data; inputting the identity feature parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and generating a risk assessment result according to the identity risk probability.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps: collecting sample data, and dividing the sample data into training set data and test set data; extracting a first feature parameter and a first target category from the training set data; performing feature gain evaluation according to the first feature parameter and the first target category, performing feature selection according to the feature gain evaluation result, classifying according to the selected features to obtain an initial decision tree risk assessment model, and calculating the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data; extracting a second feature parameter and a second target category from the test set data; and verifying the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, adjusting the initial decision tree risk assessment model according to the verification result, and generating the preset risk assessment model.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps: loading updated verification data when the verification data update time is reached; extracting, from the verification data, a third feature parameter and a risk target label corresponding to the preset risk assessment model; and verifying the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target label, and optimizing the preset risk assessment model according to the verification result.
In one embodiment, when implementing the step of generating a risk assessment result according to the identity risk probability, the computer-readable instructions executed by the processor are further used to: find, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value; acquire the node data of the decision path; and generate and output a seizure path map according to the node data and the identity risk probability with the largest probability value.
In one embodiment, when implementing the step of generating a risk assessment result according to the identity risk probability, the computer-readable instructions executed by the processor are further used to: acquire the identity risk probability with the largest probability value; acquire current security manpower data, and look up the security passenger flow threshold corresponding to the current security manpower data; acquire preset threshold conversion data, and calculate a risk probability threshold according to the security passenger flow threshold and the preset threshold conversion data; and generate and output a risk check early warning prompt when the identity risk probability with the largest probability value exceeds the risk probability threshold.
In one embodiment, when implementing the step of extracting identity feature parameters from the identity data, the computer-readable instructions executed by the processor are further used to: extract a document number from the identity data; perform document format recognition on the document number, and look up the document type corresponding to the format recognition result; segment the document number according to the document type to obtain segmented character strings; and look up the identity feature parameter corresponding to each segmented character string.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments has been described. However, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be understood as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (20)

  1. An identity information risk assessment method, comprising:
    receiving identity data;
    extracting identity feature parameters from the identity data;
    looking up historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data;
    inputting the identity feature parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and
    generating a risk assessment result according to the identity risk probability.
  2. The method according to claim 1, wherein the preset risk assessment model is generated by:
    collecting sample data, and dividing the sample data into training set data and test set data;
    extracting a first feature parameter and a first target category from the training set data;
    performing feature gain evaluation according to the first feature parameter and the first target category, performing feature selection according to the feature gain evaluation result, classifying according to the selected features to obtain an initial decision tree risk assessment model, and calculating the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data;
    extracting a second feature parameter and a second target category from the test set data; and
    verifying the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, adjusting the initial decision tree risk assessment model according to the verification result, and generating the preset risk assessment model.
  3. The method according to claim 2, further comprising:
    loading updated verification data when the verification data update time is reached;
    extracting, from the verification data, a third feature parameter and a risk target label corresponding to the preset risk assessment model; and
    verifying the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target label, and optimizing the preset risk assessment model according to the verification result.
  4. The method according to claim 2, wherein generating the risk assessment result according to the identity risk probability comprises:
    finding, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value;
    acquiring node data of the decision path; and
    generating and outputting a seizure path map according to the node data and the identity risk probability with the largest probability value.
  5. The method according to claim 2, wherein generating the risk assessment result according to the identity risk probability comprises:
    acquiring the identity risk probability with the largest probability value;
    acquiring current security manpower data, and looking up a security passenger flow threshold corresponding to the current security manpower data;
    acquiring preset threshold conversion data, and calculating a risk probability threshold according to the security passenger flow threshold and the preset threshold conversion data; and
    generating and outputting a risk check early warning prompt when the identity risk probability with the largest probability value exceeds the risk probability threshold.
  6. The method according to claim 1, wherein extracting the identity feature parameters from the identity data comprises:
    extracting a document number from the identity data;
    performing document format recognition on the document number, and looking up the document type corresponding to the format recognition result;
    segmenting the document number according to the document type to obtain segmented character strings; and
    looking up the identity feature parameter corresponding to each of the segmented character strings.
  7. An identity information risk assessment apparatus, comprising:
    an identity data acquisition module, configured to receive identity data;
    an identity parameter extraction module, configured to extract identity feature parameters from the identity data;
    a time parameter acquisition module, configured to look up historical verification data corresponding to the identity data, and extract verification time parameters from the historical verification data;
    a risk probability obtaining module, configured to input the identity feature parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and
    a risk result generation module, configured to generate a risk assessment result according to the identity risk probability.
  8. The apparatus according to claim 7, further comprising:
    a data collection module, configured to collect sample data, and divide the sample data into training set data and test set data;
    a training data extraction module, configured to extract a first feature parameter and a first target category from the training set data;
    an initial model building module, configured to perform feature gain evaluation according to the first feature parameter and the first target category, perform feature selection according to the feature gain evaluation result, classify according to the selected features to obtain an initial decision tree risk assessment model, and calculate the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data;
    a test data extraction module, configured to extract a second feature parameter and a second target category from the test set data; and
    an evaluation module generation module, configured to verify the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, adjust the initial decision tree risk assessment model according to the verification result, and generate the preset risk assessment model.
  9. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    receiving identity data;
    extracting identity feature parameters from the identity data;
    looking up historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data;
    inputting the identity feature parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and
    generating a risk assessment result according to the identity risk probability.
  10. The computer device according to claim 9, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    collecting sample data, and dividing the sample data into training set data and test set data;
    extracting a first feature parameter and a first target category from the training set data;
    performing feature gain evaluation according to the first feature parameter and the first target category, performing feature selection according to the feature gain evaluation result, classifying according to the selected features to obtain an initial decision tree risk assessment model, and calculating the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data;
    extracting a second feature parameter and a second target category from the test set data; and
    verifying the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameter and the second target category, adjusting the initial decision tree risk assessment model according to the verification result, and generating the preset risk assessment model.
  11. The computer device according to claim 10, wherein the processor, when executing the computer-readable instructions, further performs the following steps:
    loading updated verification data when the verification data update time is reached;
    extracting, from the verification data, a third feature parameter and a risk target label corresponding to the preset risk assessment model; and
    verifying the risk probability of each classification node in the preset risk assessment model according to the third feature parameter and the risk target label, and optimizing the preset risk assessment model according to the verification result.
  12. The computer device according to claim 10, wherein, when generating the risk assessment result according to the identity risk probability, the processor executing the computer-readable instructions further performs:
    finding, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value;
    acquiring node data of the decision path; and
    generating and outputting a seizure path map according to the node data and the identity risk probability with the largest probability value.
  13. The computer device according to claim 10, wherein, when generating the risk assessment result according to the identity risk probability, the processor executing the computer-readable instructions further performs:
    acquiring the identity risk probability with the largest probability value;
    acquiring current security manpower data, and looking up a security passenger flow threshold corresponding to the current security manpower data;
    acquiring preset threshold conversion data, and calculating a risk probability threshold according to the security passenger flow threshold and the preset threshold conversion data; and
    generating and outputting a risk check early warning prompt when the identity risk probability with the largest probability value exceeds the risk probability threshold.
  14. The computer device according to claim 9, wherein, when extracting the identity feature parameters from the identity data, the processor executing the computer-readable instructions further performs:
    extracting a document number from the identity data;
    performing document format recognition on the document number, and looking up the document type corresponding to the format recognition result;
    segmenting the document number according to the document type to obtain segmented character strings; and
    looking up the identity feature parameter corresponding to each of the segmented character strings.
  15. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    receiving identity data;
    extracting identity characteristic parameters from the identity data;
    looking up historical verification data corresponding to the identity data, and extracting verification time parameters from the historical verification data;
    inputting the identity characteristic parameters and the verification time parameters into a preset risk assessment model to obtain an identity risk probability; and
    generating a risk assessment result according to the identity risk probability.
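The flow of claim 15 can be pictured with the short sketch below. The shape of the historical verification data (a list of past check dates), the two time parameters, and `toy_model` are assumptions made for illustration; the claim only requires that verification time parameters be extracted and fed, together with the identity characteristic parameters, into a preset model.

```python
from datetime import date


def extract_time_params(history, today):
    """Derive verification time parameters from past check dates (assumed shape)."""
    if not history:
        return {"check_count": 0, "days_since_last": None}
    return {
        "check_count": len(history),
        "days_since_last": (today - max(history)).days,
    }


def assess(identity_features, history, model, today):
    """Combine identity and time parameters, then query the risk model."""
    params = dict(identity_features)
    params.update(extract_time_params(history, today))
    return {"risk_probability": model(params), "inputs": params}


def toy_model(params):
    """Stand-in for the preset risk assessment model (hypothetical rule)."""
    return 0.6 if params["check_count"] >= 3 else 0.1


result = assess(
    {"doc_type": "resident_id"},
    [date(2018, 1, 5), date(2018, 3, 1), date(2018, 6, 30)],
    toy_model,
    date(2018, 7, 18),
)
```

Passing the model in as a callable keeps the pipeline independent of how the model was trained.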
  16. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    collecting sample data, and dividing the sample data into training set data and test set data;
    extracting first feature parameters and a first target category from the training set data;
    performing feature gain evaluation according to the first feature parameters and the first target category, performing feature selection according to the feature gain evaluation result, classifying according to the selected features to obtain an initial decision tree risk assessment model, and calculating the risk probability of each classification node in the initial decision tree risk assessment model according to the training set data;
    extracting second feature parameters and a second target category from the test set data; and
    verifying the risk probability of each classification node in the initial decision tree risk assessment model according to the second feature parameters and the second target category, and adjusting the initial decision tree risk assessment model according to the verification result to generate the preset risk assessment model.
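A minimal sketch of the training steps in claim 16 follows, assuming that "feature gain" means information gain on categorical features (an ID3-style tree) and that a node's risk probability is the share of risky training samples reaching it. The feature names and sample data are invented, and the adjustment step (for example pruning) is omitted; only the validation score is computed.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, features):
    risk = sum(labels) / len(labels)  # share of risky samples at this node
    if not features or len(set(labels)) == 1:
        return {"risk": risk}
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    if information_gain(rows, labels, best) <= 0:
        return {"risk": risk}
    node = {"feature": best, "risk": risk, "children": {}}
    rest = [f for f in features if f != best]
    for value in sorted(set(row[best] for row in rows)):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        node["children"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], rest)
    return node


def predict_risk(tree, row):
    node = tree
    while "feature" in node:
        child = node["children"].get(row[node["feature"]])
        if child is None:
            break  # unseen value: fall back to the current node's risk
        node = child
    return node["risk"]


def validate(tree, rows, labels, cutoff=0.5):
    """Fraction of test samples whose thresholded risk matches the label."""
    hits = sum((predict_risk(tree, r) >= cutoff) == bool(y) for r, y in zip(rows, labels))
    return hits / len(rows)


# Invented sample data: (features, label), where label 1 marks a risky identity.
samples = [
    ({"region": "A", "frequent": "yes"}, 1), ({"region": "A", "frequent": "no"}, 0),
    ({"region": "B", "frequent": "yes"}, 1), ({"region": "B", "frequent": "no"}, 0),
    ({"region": "C", "frequent": "yes"}, 1), ({"region": "C", "frequent": "no"}, 0),
]
train, test = samples[:4], samples[4:]
tree = build_tree([r for r, _ in train], [y for _, y in train], ["region", "frequent"])
accuracy = validate(tree, [r for r, _ in test], [y for _, y in test])
```

In this toy data the `frequent` feature carries all the information, so it is selected at the root and `region` is never used.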
  17. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    when a verification data update time is reached, loading the updated verification data;
    extracting, from the verification data, third feature parameters and risk target flags corresponding to the preset risk assessment model; and
    verifying the risk probability of each classification node in the preset risk assessment model according to the third feature parameters and the risk target flags, and optimizing the preset risk assessment model according to the verification result.
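One way to read the periodic update of claim 17 is sketched below. The claim does not say how the model is optimized, so the rule used here (replace a node's stored probability with the observed rate when the two differ by more than a tolerance) is an assumption, as are the flat `node_risk` dictionary and the `route` function that maps a record to its classification node.

```python
def refresh_node_risks(node_risk, records, route, tolerance=0.1):
    """Compare stored node probabilities with rates observed in fresh data.

    node_risk: {node_id: stored risk probability}
    records:   [(features, risk_flag)] taken from the updated verification data
    route:     function mapping features to the node_id the record falls into
    Returns the updated mapping and the sorted list of nodes that drifted.
    """
    observed = {}
    for features, flag in records:
        node_id = route(features)
        hits, total = observed.get(node_id, (0, 0))
        observed[node_id] = (hits + int(flag), total + 1)

    updated = dict(node_risk)
    drifted = []
    for node_id, (hits, total) in observed.items():
        rate = hits / total
        if abs(rate - node_risk.get(node_id, 0.0)) > tolerance:
            drifted.append(node_id)
            updated[node_id] = rate  # hypothetical optimization rule
    return updated, sorted(drifted)
```

Nodes that receive no fresh records keep their stored probability, which avoids overwriting estimates with no new evidence.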
  18. The storage medium according to claim 16, wherein the generating of a risk assessment result according to the identity risk probability, implemented when the computer-readable instructions are executed by the processor, further comprises:
    looking up, in the preset risk assessment model, the decision path corresponding to the identity risk probability with the largest probability value;
    acquiring node data of the decision path; and
    generating and outputting a seizure path diagram according to the node data and the identity risk probability with the largest probability value.
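The path lookup in claim 18 can be sketched as below, assuming the model is stored as a nested dictionary and that "the identity risk probability with the largest probability value" refers to the highest-risk leaf. Both the tree layout and the plain-text rendering of the seizure path diagram are illustrative choices, not taken from the claims.

```python
def max_risk_path(node, path=()):
    """Return (risk, path) for the leaf with the highest risk probability.

    path is a tuple of (feature, value) pairs from the root to that leaf.
    """
    if "children" not in node:
        return node["risk"], path
    best = None
    for value, child in node["children"].items():
        candidate = max_risk_path(child, path + ((node["feature"], value),))
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best


def render_path(risk, path):
    """Render the decision path and its risk probability as one text line."""
    steps = [f"{feature}={value}" for feature, value in path]
    return " -> ".join(steps + [f"risk={risk:.2f}"])


# Hypothetical model with invented features and probabilities.
model = {
    "feature": "frequent",
    "children": {
        "yes": {"feature": "region",
                "children": {"A": {"risk": 0.9}, "B": {"risk": 0.4}}},
        "no": {"risk": 0.05},
    },
}
risk, path = max_risk_path(model)
diagram = render_path(risk, path)
```

Exposing the path alongside the probability lets an inspector see which feature values drove the score.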
  19. The storage medium according to claim 16, wherein the generating of a risk assessment result according to the identity risk probability, implemented when the computer-readable instructions are executed by the processor, further comprises:
    acquiring the identity risk probability with the largest probability value;
    acquiring current security manpower data, and looking up the security passenger flow threshold corresponding to the current security manpower data;
    acquiring preset threshold conversion data, and calculating a risk probability threshold from the security passenger flow threshold and the preset threshold conversion data; and
    when the identity risk probability with the largest probability value exceeds the risk probability threshold, generating and outputting a risk check warning prompt.
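The thresholding in claim 19 is illustrated below under loudly stated assumptions: the staffing table, the reading of "threshold conversion data" as the expected hourly passenger flow, and the conversion formula (one minus the share of passengers that staff can inspect) are all hypothetical. The claims do not disclose the actual conversion.

```python
import bisect

# Hypothetical lookup: minimum staff on duty -> passengers inspectable per hour.
STAFF_LEVELS = [1, 5, 10]
FLOW_THRESHOLDS = [20, 100, 250]


def flow_threshold(staff_on_duty):
    """Security passenger flow threshold for the current manpower level."""
    i = bisect.bisect_right(STAFF_LEVELS, staff_on_duty) - 1
    return FLOW_THRESHOLDS[i] if i >= 0 else 0


def risk_probability_threshold(flow_limit, expected_flow):
    """Assumed conversion: the fewer passengers can be inspected, the higher the bar."""
    share = min(1.0, flow_limit / expected_flow)
    return 1.0 - share


def check_alert(max_risk, staff_on_duty, expected_flow):
    """Return a warning prompt when the top risk exceeds the threshold, else None."""
    threshold = risk_probability_threshold(flow_threshold(staff_on_duty), expected_flow)
    if max_risk > threshold:
        return f"ALERT: risk {max_risk:.2f} exceeds threshold {threshold:.2f}"
    return None
```

With six staff and 500 expected passengers per hour, the sketch allows 100 inspections, so only identities scoring above roughly 0.8 trigger a prompt.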
  20. The storage medium according to claim 15, wherein the extracting of identity characteristic parameters from the identity data, implemented when the computer-readable instructions are executed by the processor, further comprises:
    extracting a document number from the identity data;
    performing document format recognition on the document number, and looking up the document type corresponding to the format recognition result;
    segmenting the document number according to the document type to obtain segmented strings; and
    looking up the identity characteristic parameter corresponding to each of the segmented strings.
PCT/CN2018/104806 2018-07-18 2018-09-10 Identity information risk assessment method and apparatus, and computer device and storage medium WO2020015089A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810791449.2 2018-07-18
CN201810791449.2A CN109242740A (en) 2018-07-18 2018-07-18 Identity information risk assessment method, apparatus, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2020015089A1 (en) 2020-01-23

Family

ID=65072033

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/104806 WO2020015089A1 (en) 2018-07-18 2018-09-10 Identity information risk assessment method and apparatus, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN109242740A (en)
WO (1) WO2020015089A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111404721A (en) * 2020-02-13 2020-07-10 中国平安人寿保险股份有限公司 Web-based model training process data visualization processing method, device and equipment
CN111427883A (en) * 2020-02-18 2020-07-17 深圳壹账通智能科技有限公司 Data processing method and device based on AeroPike, computer equipment and storage medium
CN111445106A (en) * 2020-03-02 2020-07-24 国网辽宁省电力有限公司电力科学研究院 Power utilization acquisition equipment fault processing operation site safety control method and system
CN111738331A (en) * 2020-06-19 2020-10-02 北京同邦卓益科技有限公司 User classification method and device, computer-readable storage medium and electronic device
CN112465626A (en) * 2020-11-24 2021-03-09 平安科技(深圳)有限公司 Joint risk assessment method based on client classification aggregation and related equipment
CN112750031A (en) * 2021-01-20 2021-05-04 北京赛思信安技术股份有限公司 Multidimensional risk verification device applied to financing guarantee mechanism
CN113807858A (en) * 2021-09-23 2021-12-17 未鲲(上海)科技服务有限公司 Data processing method based on decision tree model and related equipment
CN115146725A (en) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 Determination method of object classification mode, object classification method, device and equipment
CN116319083A (en) * 2023-05-17 2023-06-23 南京哲上信息科技有限公司 Data transmission security detection method and system
CN116308762A (en) * 2023-05-19 2023-06-23 杭州钱袋数字科技有限公司 Credibility evaluation and trust processing method based on artificial intelligence
CN117634873A (en) * 2023-11-15 2024-03-01 中国人寿保险股份有限公司江苏省分公司 System and method for evaluating risk of sales personnel in insurance industry
CN117675387A (en) * 2023-12-12 2024-03-08 广州达悦信息科技有限公司 Network security risk prediction method and system based on user behavior analysis

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110992119A (en) * 2019-02-21 2020-04-10 北京嘀嘀无限科技发展有限公司 Method and system for sequencing risk orders
CN113052413A (en) * 2019-12-26 2021-06-29 北京中科闻歌科技股份有限公司 Risk passenger assessment method, device, terminal and computer readable medium
CN112883061B (en) * 2020-12-07 2022-08-16 浙江大华技术股份有限公司 Dangerous article detection method, device and system and computer equipment
CN113642820B (en) * 2020-12-18 2024-05-28 航天信息股份有限公司广州航天软件分公司 Method and system for evaluating and managing personnel data information based on big data
CN113671589A (en) * 2021-09-14 2021-11-19 清华大学 Safety detection match physical system
CN114756716A (en) * 2022-04-18 2022-07-15 马上消费金融股份有限公司 Information processing method, device, equipment and storage medium
CN116797357B (en) * 2023-08-24 2023-11-21 杭银消费金融股份有限公司 Financial terminal-based credit authorization processing method and equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105260628A (en) * 2014-06-03 2016-01-20 腾讯科技(深圳)有限公司 Classifier training method and device and identity verification method and system
CN108009914A (en) * 2017-12-19 2018-05-08 马上消费金融股份有限公司 Credit risk assessment method, system, equipment and computer storage medium
CN108198116A (en) * 2016-12-08 2018-06-22 同方威视技术股份有限公司 For being detected the method and device of staffing levels in safety check
CN108269012A (en) * 2018-01-12 2018-07-10 中国平安人寿保险股份有限公司 Construction method, device, storage medium and the terminal of risk score model

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111404721B (en) * 2020-02-13 2023-07-25 中国平安人寿保险股份有限公司 Visual processing method, device and equipment for model training process data based on web
CN111404721A (en) * 2020-02-13 2020-07-10 中国平安人寿保险股份有限公司 Web-based model training process data visualization processing method, device and equipment
CN111427883A (en) * 2020-02-18 2020-07-17 深圳壹账通智能科技有限公司 Data processing method and device based on AeroPike, computer equipment and storage medium
CN111445106A (en) * 2020-03-02 2020-07-24 国网辽宁省电力有限公司电力科学研究院 Power utilization acquisition equipment fault processing operation site safety control method and system
CN111445106B (en) * 2020-03-02 2023-12-01 国网辽宁省电力有限公司电力科学研究院 Safety control method and system for fault processing operation site of electricity acquisition equipment
CN111738331A (en) * 2020-06-19 2020-10-02 北京同邦卓益科技有限公司 User classification method and device, computer-readable storage medium and electronic device
CN112465626A (en) * 2020-11-24 2021-03-09 平安科技(深圳)有限公司 Joint risk assessment method based on client classification aggregation and related equipment
CN112465626B (en) * 2020-11-24 2023-08-29 平安科技(深圳)有限公司 Combined risk assessment method based on client classification aggregation and related equipment
CN112750031A (en) * 2021-01-20 2021-05-04 北京赛思信安技术股份有限公司 Multidimensional risk verification device applied to financing guarantee mechanism
CN113807858A (en) * 2021-09-23 2021-12-17 未鲲(上海)科技服务有限公司 Data processing method based on decision tree model and related equipment
CN113807858B (en) * 2021-09-23 2024-04-26 中科软科技股份有限公司 Data processing method and related equipment based on decision tree model
CN115146725B (en) * 2022-06-30 2023-05-30 北京百度网讯科技有限公司 Method for determining object classification mode, object classification method, device and equipment
CN115146725A (en) * 2022-06-30 2022-10-04 北京百度网讯科技有限公司 Determination method of object classification mode, object classification method, device and equipment
CN116319083B (en) * 2023-05-17 2023-08-04 南京哲上信息科技有限公司 Data transmission security detection method and system
CN116319083A (en) * 2023-05-17 2023-06-23 南京哲上信息科技有限公司 Data transmission security detection method and system
CN116308762B (en) * 2023-05-19 2023-08-11 杭州钱袋数字科技有限公司 Credibility evaluation and trust processing method based on artificial intelligence
CN116308762A (en) * 2023-05-19 2023-06-23 杭州钱袋数字科技有限公司 Credibility evaluation and trust processing method based on artificial intelligence
CN117634873A (en) * 2023-11-15 2024-03-01 中国人寿保险股份有限公司江苏省分公司 System and method for evaluating risk of sales personnel in insurance industry
CN117675387A (en) * 2023-12-12 2024-03-08 广州达悦信息科技有限公司 Network security risk prediction method and system based on user behavior analysis

Also Published As

Publication number Publication date
CN109242740A (en) 2019-01-18

Similar Documents

Publication Publication Date Title
WO2020015089A1 (en) Identity information risk assessment method and apparatus, and computer device and storage medium
WO2020015104A1 (en) Method, apparatus, computer device, and storage medium for predicting flow rate of passengers presenting security risk
CN109165840B (en) Risk prediction processing method, risk prediction processing device, computer equipment and medium
WO2020253358A1 (en) Service data risk control analysis processing method, apparatus and computer device
US11816078B2 (en) Automatic entity resolution with rules detection and generation system
WO2019218699A1 (en) Fraud transaction determining method and apparatus, computer device, and storage medium
WO2020177377A1 (en) Machine learning-based data prediction processing method and apparatus, and computer device
CN111177714B (en) Abnormal behavior detection method and device, computer equipment and storage medium
CN108170759B (en) Complaint case processing method and device, computer equipment and storage medium
CN110008250B (en) Social security data processing method and device based on data mining and computer equipment
WO2020057021A1 (en) Data table processing method and device, computer device and storage medium
WO2020015139A1 (en) Method and device for identifying high-risk passenger, computer apparatus, and storage medium
CN111145910A (en) Abnormal case identification method and device based on artificial intelligence and computer equipment
CN111046879A (en) Certificate image classification method and device, computer equipment and readable storage medium
WO2020048048A1 (en) Unbalanced sample data preprocessing method and apparatus, and computer device
CN109949154A (en) Customer information classification method, device, computer equipment and storage medium
CN111400126B (en) Network service abnormal data detection method, device, equipment and medium
CN113888299A (en) Wind control decision method and device, computer equipment and storage medium
CN112131277A (en) Medical data anomaly analysis method and device based on big data and computer equipment
CN111767192B (en) Business data detection method, device, equipment and medium based on artificial intelligence
CN110751171A (en) Image data classification method and device, computer equipment and storage medium
CN112990989B (en) Value prediction model input data generation method, device, equipment and medium
CN113283973A (en) Account checking difference data processing method and device, computer equipment and storage medium
CN109493975B (en) Chronic disease recurrence prediction method, device and computer equipment based on xgboost model
CN110727711A (en) Method and device for detecting abnormal data in fund database and computer equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18927054

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18927054

Country of ref document: EP

Kind code of ref document: A1