CN111080012A - Personnel risk degree prediction method and device, electronic equipment and readable storage medium - Google Patents

Publication number
CN111080012A
Authority
CN
China
Prior art keywords
personnel
information
risk
person
personnel information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911301877.3A
Other languages
Chinese (zh)
Inventor
袁杰
陈秀坤
高古明
王欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhizhi Heshu Technology Co.,Ltd.
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd filed Critical Beijing Mininglamp Software System Co ltd
Priority to CN201911301877.3A priority Critical patent/CN111080012A/en
Publication of CN111080012A publication Critical patent/CN111080012A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services
    • G06Q50/265Personal security, identity or safety

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Security & Cryptography (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the application provide a personnel risk degree prediction method and device, electronic equipment and a readable storage medium, and relate to the technical field of artificial intelligence. In the method, information to be predicted of a person to be predicted is first acquired; the risk degree is then predicted from that information, based on a pre-established risk degree prediction model, to obtain the risk degree value of the person to be predicted. This improves the efficiency, and reduces the difficulty, of identifying the risk degree of persons engaged in illegal activity.

Description

Personnel risk degree prediction method and device, electronic equipment and readable storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for predicting personnel risk, electronic equipment and a readable storage medium.
Background
Illegal activity carried out over the internet is increasingly prominent. Because such cases involve many tracking links and are difficult for public security personnel to trace, using the internet has become the latest trend among lawbreakers. At present, face recognition and voiceprint recognition can improve the efficiency of tracking and screening suspicious persons. However, in public security work the degree of danger of a person engaged in illegal activity is usually judged manually, based on the professional knowledge of public security personnel or on clues provided by informants, which is inefficient and difficult. How to improve the efficiency, and reduce the difficulty, of identifying the risk degree of such persons is a problem that urgently needs to be solved.
Disclosure of Invention
In view of the above, embodiments of the present application provide a method and an apparatus for predicting a risk level of a person, an electronic device, and a readable storage medium, so as to solve the above problems.
Embodiments of the invention may be implemented as follows:
in a first aspect, an embodiment provides a method for predicting a risk of a person, where the method includes:
acquiring information to be predicted of a person to be predicted;
and predicting the risk degree according to the information to be predicted based on a pre-established risk degree prediction model to obtain the risk degree value of the personnel to be predicted.
In an alternative embodiment, the risk prediction model is built by:
acquiring personnel information of at least one historical identification personnel and personnel information of at least one common personnel, and preprocessing each personnel information;
dividing the preprocessed personnel information into a personnel information training set and a personnel information testing set according to a preset proportion;
and establishing a risk prediction model based on the personnel information training set and the personnel information testing set.
In an optional implementation manner, the step of acquiring the personnel information of at least one historical identification personnel and the personnel information of at least one common personnel and preprocessing each personnel information includes:
acquiring personnel information of at least one historical identification personnel and personnel information of at least one common personnel, and performing abnormal data detection on the personnel information according to a preset abnormal data detection method;
according to the abnormal data detection result, removing or filling the personnel information;
and based on a preset standardization method, standardizing the personnel information after the personnel information is removed or filled.
In an optional embodiment, the step of performing elimination or padding processing on each piece of personnel information according to an abnormal data detection result includes:
and for each piece of personnel information, if the number of the sample data with the abnormality in the personnel information is detected to be larger than a first preset threshold value, rejecting the sample data with the abnormality, and otherwise, filling the sample data with the abnormality.
In an optional embodiment, the personnel information includes an identification number and a mobile phone number, and after the step of performing padding processing on the sample data with the abnormality, the method further includes:
detecting whether the identity card number included in the filled personnel information meets a first preset standard or not, and if not, rejecting the identity card number;
and detecting whether the mobile phone number included in the filled personnel information meets a second preset standard, and if not, rejecting the mobile phone number.
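The two checks above can be sketched as follows. The text does not say what the first and second preset standards are, so the patterns below (an 18-character resident identity number and an 11-digit mobile number starting with 1) are assumptions chosen only for illustration, as are the field names `id_number` and `phone` and the helper `clean_contact_fields`.

```python
import re

# Assumed stand-ins for the unspecified "first" and "second" preset standards.
ID_PATTERN = re.compile(r"\d{17}[\dXx]")   # 17 digits plus a digit or X
PHONE_PATTERN = re.compile(r"1\d{10}")     # 11 digits, starting with 1


def clean_contact_fields(record: dict) -> dict:
    """Return a copy of record with non-conforming ID or phone values set to None."""
    cleaned = dict(record)
    id_number = cleaned.get("id_number")
    if id_number is None or not ID_PATTERN.fullmatch(str(id_number)):
        cleaned["id_number"] = None  # reject the identity card number
    phone = cleaned.get("phone")
    if phone is None or not PHONE_PATTERN.fullmatch(str(phone)):
        cleaned["phone"] = None  # reject the mobile phone number
    return cleaned
```

Rejected values are set to `None` here rather than deleted, so that later steps can still see which fields were dropped; that choice is also an assumption.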
In an optional embodiment, the step of padding the sample data with the exception includes:
and filling the sample data with the abnormality by using the mean value, the median or the first preset value of the sample data with the abnormality.
In an optional embodiment, the step of building a risk prediction model based on the personnel information training set and the personnel information testing set includes:
establishing a risk prediction model from the personnel information training set by using a logistic regression algorithm;
inputting the personnel information testing set into the risk prediction model to obtain risk prediction values of each historical identification personnel and each common personnel in the personnel information testing set;
counting the number of historical identification personnel and common personnel whose risk prediction value is larger than a second preset value, and calculating the prediction accuracy according to the number;
and comparing the accuracy with a second preset threshold and, if the accuracy is smaller, training the risk prediction model again by using the personnel information training set until the accuracy is larger than or equal to the second preset threshold.
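A minimal sketch of this train, test and retrain loop is given below, under several assumptions that the text does not fix: features are already numeric, training uses plain batch gradient descent, accuracy is the share of test samples whose thresholded prediction matches the label, and `max_rounds` is an added safeguard against looping forever. All function and parameter names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """Risk prediction value in (0, 1) for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, weights, bias, epochs=200, lr=0.5):
    """One training round: batch gradient descent on the log loss."""
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def accuracy(samples, labels, weights, bias, cutoff=0.5):
    """cutoff plays the role of the second preset value (assumed 0.5)."""
    hits = sum((predict(weights, bias, x) > cutoff) == bool(y)
               for x, y in zip(samples, labels))
    return hits / len(samples)


def build_model(train_x, train_y, test_x, test_y, min_accuracy=0.9, max_rounds=20):
    """Retrain until test accuracy reaches min_accuracy (the second preset threshold)."""
    weights, bias = [0.0] * len(train_x[0]), 0.0
    for _ in range(max_rounds):
        weights, bias = train(train_x, train_y, weights, bias)
        if accuracy(test_x, test_y, weights, bias) >= min_accuracy:
            break
    return weights, bias
```

Here label 1 stands for a historical identification person and label 0 for a common person; each retraining round continues from the current weights rather than restarting.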
In a second aspect, an embodiment provides a person risk prediction device, including:
the acquisition module is used for acquiring information to be predicted of a person to be predicted;
and the prediction module is used for predicting the risk degree according to the information to be predicted based on a pre-established risk degree prediction model to obtain the risk degree value of the personnel to be predicted.
In a third aspect, an embodiment provides an electronic device, which includes a processor, a memory and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate with each other through the bus, and the processor executes the machine-readable instructions to perform the steps of the personnel risk prediction method according to any one of the foregoing embodiments.
In a fourth aspect, an embodiment provides a readable storage medium, in which a computer program is stored, and the computer program, when executed, implements the person risk prediction method according to any one of the foregoing embodiments.
The embodiment of the application provides a method and a device for predicting the danger degree of personnel, electronic equipment and a readable storage medium.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Fig. 2 is a flowchart of a method for predicting a risk of a person according to an embodiment of the present disclosure.
Fig. 3 is a second flowchart of a method for predicting a risk of a person according to an embodiment of the present application.
Fig. 4 is a schematic sub-step diagram of step S100 in fig. 3 according to an embodiment of the present disclosure.
Fig. 5 is a schematic sub-step diagram of step S300 in fig. 3 according to an embodiment of the present disclosure.
Fig. 6 is a functional block diagram of a device for predicting a risk of a person according to an embodiment of the present disclosure.
Reference numerals: 100-electronic device; 110-memory; 120-processor; 130-personnel risk prediction device; 131-acquisition module; 132-prediction module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Furthermore, the appearances of the terms "first," "second," and the like, if any, are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
It should be noted that the features of the embodiments of the present invention may be combined with each other without conflict.
In recent years, as the internet has penetrated every field of society, illegal activity on the internet has become frequent, and the share of online drug crime among online crimes keeps rising and is spreading rapidly. Taking advantage of the way the internet crosses time and space, lawbreakers can purchase drugs online through social software such as QQ and WeChat and through instant communication tools such as mobile phones, in order to use the drugs. Illegal activity carried out over the internet is increasingly prominent; because such cases involve many tracking links and are difficult for public security personnel to trace, using the internet has become the latest trend among lawbreakers.
As described in the Background, face recognition and voiceprint recognition can currently improve the efficiency of tracking and screening suspicious persons. However, in public security work the degree of danger of a person engaged in illegal activity is usually judged manually, based on the professional knowledge of public security personnel or on clues provided by informants, which is inefficient and difficult. As drug dealing and other illegal activity over the internet become increasingly prominent, how to improve the efficiency, and reduce the difficulty, of identifying the risk degree of such persons is a problem that urgently needs to be solved.
In view of this, the embodiment of the present application provides a method for predicting a risk of a person, where the method uses a pre-established risk prediction model to predict a risk according to information to be predicted of a person to be predicted. Therefore, the efficiency of recognizing the risk of the illegal personnel is improved, and the difficulty of recognizing the risk of the illegal personnel is reduced. The above method is explained in detail below.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an electronic device 100 according to an embodiment of the present disclosure. The electronic device 100 may include a processor 120, a memory 110, a personnel risk prediction device 130, and a bus. The memory 110 stores machine-readable instructions executable by the processor 120; when the electronic device 100 runs, the processor 120 and the memory 110 communicate with each other through the bus, and the processor 120 executes the machine-readable instructions to perform the steps of the personnel risk prediction method.
The memory 110, the processor 120, and other components are electrically connected to each other directly or indirectly to enable signal transmission or interaction.
For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The personnel risk prediction device 130 comprises at least one software function module, which may be stored in the memory 110 in the form of software or firmware. The processor 120 is configured to execute the executable modules stored in the memory 110, such as the software function modules or computer programs included in the personnel risk prediction device 130.
The memory 110 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
The processor 120 may be an integrated circuit chip having signal processing capabilities. It may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 120 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor.
In this embodiment, the memory 110 is used for storing programs, and the processor 120 is used for executing the programs after receiving the execution instructions. The method defined by the process disclosed in any of the embodiments of the present application can be applied to the processor 120, or implemented by the processor 120.
It will be appreciated that the configuration shown in figure 1 is merely illustrative. Electronic device 100 may also have more or fewer components than shown in FIG. 1, or a different configuration than shown in FIG. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
Referring to fig. 2, fig. 2 is a flowchart of a method for predicting a risk of a person according to an embodiment of the present application, where the method is applied to the electronic device 100. The specific flow shown in fig. 2 is described in detail below.
Step S1, acquiring information to be predicted of the person to be predicted.
Step S2, based on a pre-established risk degree prediction model, predicting the risk degree according to the information to be predicted, and obtaining the risk degree value of the person to be predicted.
The information to be predicted may include identity information, social information, and traffic information of the person to be predicted. For example, the identity information may include the identification number, height, weight, age, occupation, and gender of the person to be predicted. The social information may include the mobile phone number of the person to be predicted, information on that person's contacts on social platforms, records of calls with others, and the like. The traffic information may include the person's public transportation ride records, hotel accommodation records, and the like.
When the danger degree of the person to be predicted violating a certain law needs to be obtained, the obtained information to be predicted of the person to be predicted is input into a pre-established danger degree prediction model, and the danger degree prediction model predicts the danger degree according to the information to be predicted, so that the danger degree value of the person to be predicted is obtained.
As one possible case, when the relevant department investigates a criminal case involving four persons A, B, C and D, the personnel information of A, B, C and D may be input into the risk degree prediction model. Suppose the model yields a risk degree value of 0.9 for A, 0.72 for B, 0.31 for C, and 0.56 for D. The staff of the relevant department can then focus the investigation on A, B and D, whose risk degree values are higher, which assists the investigation and improves its efficiency.
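The selection in this example can be sketched as a ranking followed by a cutoff. The text does not state how "higher" is decided, so the cutoff of 0.5 and the helper name `persons_to_focus` are assumptions; the four risk degree values are the ones given above.

```python
# Risk degree values from the example above
risk_values = {"A": 0.9, "B": 0.72, "C": 0.31, "D": 0.56}

FOCUS_CUTOFF = 0.5  # assumed cutoff; the text only says "higher risk value"


def persons_to_focus(values, cutoff):
    """Rank persons by risk degree value, highest first, and keep those above cutoff."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    return [name for name, value in ranked if value > cutoff]


print(persons_to_focus(risk_values, FOCUS_CUTOFF))  # ['A', 'B', 'D']
```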
As another possible case, when the relevant department wants to know the probability that a third person with a case record, for example a prior record of theft, robbery or fraud, will use drugs or commit other illegal acts, the information of that person can be input into the risk prediction model, and the probability can be obtained from the model.
As yet another possible case, the relevant department may use the risk prediction model to predict the probability that a person serving a sentence will reoffend after release, for example a person with a prior record of theft, robbery or fraud. Persons at risk of reoffending can then be identified from the predicted probability, which improves post-release follow-up work and helps safeguard public security.
According to the method for predicting the danger degree of the personnel, the danger degree is predicted according to the information to be predicted of the personnel to be predicted by using the pre-established danger degree prediction model. Therefore, the efficiency of recognizing the risk of the illegal personnel is improved, and the difficulty of recognizing the risk of the illegal personnel is reduced.
Further, referring to fig. 3, fig. 3 is a second flowchart of the personnel risk prediction method according to an embodiment of the present application. The risk prediction model can be built through the following steps S100 to S300:
Step S100, acquiring personnel information of at least one historical identification personnel and personnel information of at least one common personnel, and preprocessing each personnel information.
Step S200, dividing the preprocessed personnel information into a personnel information training set and a personnel information testing set according to a preset proportion.
Step S300, establishing a risk prediction model based on the personnel information training set and the personnel information testing set.
The historical identification personnel are persons who have carried out illegal activities and have a case record; the ordinary personnel (also referred to as common personnel) are persons without any record of illegal activity.
In the following, persons with a prior record of theft, robbery or fraud are taken as an example of the historical identification personnel. For brevity, "the person" below means the historical identification person or ordinary person concerned, and "persons with a prior record" means other persons with a prior record of theft, robbery or fraud. In the embodiment of the present application, the social information in the personnel information may further include the duration of calls between the person and persons with a prior record, the total number of such calls, the average time interval between such calls, and the number of days since the most recent such call.
It may also include the total number of short messages exchanged between the person and persons with a prior record, the average time interval between such messages, and the number of days since the most recent such message.
It may also include the total number of contacts in the person's address book, the number of those contacts who are persons with a prior record, and the number of persons with a prior record who have stored the person's contact details in their own address books.
It may also include the number of persons with a prior record among the person's first-degree relations (i.e., the person's friends and relatives) and among the person's second-degree relations (i.e., the friends of those friends and relatives).
It may also include whether the person has ever sent an express parcel to a person with a prior record, the number of parcels the person has sent to such persons, the number of parcels such persons have sent to the person, and the number of those parcels the person has received.
It may also include whether words related to "theft, robbery or fraud" appear in the person's chats on social software, and the number of times the person has lodged together with persons with a prior record.
It may also include whether the historical identification person has records of other criminal activity, such as gambling records, drug-related records, sales records or computer object records.
Further, the traffic information in each personnel information may further include the total number of hotel check-ins by the historical identification person or ordinary person, the average number of check-ins per month, the average number of check-ins per week, the number of check-ins at midnight, the number of days since the most recent check-in, the average time interval between changes of hotel, and the average time interval between trips taken in the same vehicle as other persons with a prior record of theft, robbery or fraud.
It should be noted that the above is only an example, and the person information may include other information.
Further, the identity information in each personnel information may also include whether the historical identification person or ordinary person works at an entertainment venue such as a bar, a KTV, or a foot massage parlor.
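Several of the social and traffic features listed above share one shape: a total count, an average time interval, and the number of days since the most recent event. A minimal sketch of deriving them from a list of event timestamps follows; the function name, the dictionary keys and the choice of days as the unit are assumptions made for illustration.

```python
from datetime import datetime


def contact_features(timestamps, now):
    """Total count, average interval in days, and days since the latest event."""
    if not timestamps:
        return {"total": 0, "avg_interval_days": None, "days_since_last": None}
    ordered = sorted(timestamps)
    # Gaps between consecutive events, in days
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(ordered, ordered[1:])]
    return {
        "total": len(ordered),
        "avg_interval_days": sum(gaps) / len(gaps) if gaps else None,
        "days_since_last": (now - ordered[-1]).days,
    }
```

The same helper could be applied to call records, short messages or hotel check-ins, with `timestamps` restricted beforehand to events that involve persons with a prior record.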
As an alternative implementation, in the embodiment of the present application, the preprocessed personnel information may be randomly divided into a personnel information training set and a personnel information testing set at a ratio of 4:1. The training set then accounts for 80% of the preprocessed personnel information, and the testing set for 20%.
As another implementation, the preprocessed personnel information may be randomly divided into a personnel information training set and a personnel information testing set at a ratio of 7:3. The training set then accounts for 70% of the preprocessed personnel information, and the testing set for 30%.
It is understood that the preset ratio may also be 6:4 or 5:5; it is determined according to actual requirements and is not limited in the embodiment of the present application.
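The random division at a preset ratio can be sketched as a shuffle followed by a cut. The function name `split_by_ratio` and the optional `seed` parameter (added only to make a run repeatable) are assumptions.

```python
import random


def split_by_ratio(records, train_parts, test_parts, seed=None):
    """Shuffle records and split them train_parts:test_parts (e.g. 4:1 or 7:3)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]
```

For 100 preprocessed records, `split_by_ratio(records, 4, 1)` gives 80 training and 20 testing records, and `split_by_ratio(records, 7, 3)` gives 70 and 30.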
As an alternative, referring to fig. 4, each personnel information may be preprocessed through the following steps S110 to S130.
Step S110, acquiring personnel information of at least one historical identification personnel and personnel information of at least one common personnel, and performing abnormal data detection on each personnel information according to a preset abnormal data detection method.
Step S120, removing or filling the personnel information according to the abnormal data detection result.
Step S130, based on a preset standardization method, standardizing the personnel information after it has been removed or filled.
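The text leaves the preset standardization method of step S130 open. One common choice is z-score standardization, sketched below as an assumption rather than as the method the application prescribes; the function name `standardize` is likewise hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical; map them to 0 instead of dividing by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

After this step each numeric feature has mean 0 and unit standard deviation, which keeps features on very different scales (for example call durations and check-in counts) comparable.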
The preset abnormal data detection method may be an abnormal data detection method based on a Gaussian distribution, an outlier detection method based on a classification model, an isolation forest method, or the like. In practice, any of these methods may be selected as needed; their specific principles and steps can be found in the prior art and are not described here.
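As one hedged illustration of Gaussian-distribution-based detection, the sketch below flags values that lie more than k standard deviations from the mean. The cutoff k = 3, the function name, and the sample call durations are assumptions chosen for illustration; the application does not fix any of them.

```python
from statistics import mean, pstdev

def gaussian_outliers(values, k=3.0):
    """Return the indices of values more than k standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sigma]

# Hypothetical call durations in minutes; the last entry is an obvious recording error.
call_minutes = [12, 15, 11, 14, 13, 12, 16, 15, 13, 14, 12, 500]
print(gaussian_outliers(call_minutes))  # [11]
```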
The personnel information may come from public security records, from the network, or from information provided by relevant persons. Some data may be unclear or lost because the records are old or because staff were careless when recording them, so the obtained personnel information contains abnormal sample data. The abnormal sample data therefore needs to be filled or removed.
In the embodiments of the application, each piece of personnel information may be removed or filled according to the abnormal data detection result in the following manner.
For each piece of personnel information, if the amount of abnormal sample data detected in that personnel information is greater than a first preset threshold, the abnormal sample data is removed; otherwise, the abnormal sample data is filled.
The first preset threshold may be, for example, 50%, 60%, or 70%. It is understood that a first preset threshold of 50% means that the abnormal sample data accounts for half of all sample data in the personnel information. The larger the first preset threshold, the larger the share of abnormal sample data in the personnel information that it represents. Therefore, so that the processed data remains useful for identifying a person's degree of risk, the first preset threshold should be set within a reasonable range, neither too large nor too small.
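The remove-or-fill decision can be sketched as below, assuming that a missing value (`None`) marks abnormal sample data and that the first preset threshold is expressed as a share of the fields in one record. The function name and the field names are hypothetical.

```python
def reject_or_fill(record, first_preset_threshold=0.5,
                   is_abnormal=lambda value: value is None):
    """Decide whether the abnormal fields of one record are removed or filled."""
    abnormal_fields = [name for name, value in record.items() if is_abnormal(value)]
    share = len(abnormal_fields) / len(record) if record else 0.0
    # Above the threshold the abnormal data is removed, otherwise it is filled.
    action = "reject" if share > first_preset_threshold else "fill"
    return action, abnormal_fields

# Hypothetical records: one abnormal field out of four, then three out of four.
mostly_ok = {"call_minutes": None, "has_prior_record": 1, "express_records": 0, "age": 34}
mostly_bad = {"call_minutes": None, "has_prior_record": None, "express_records": None, "age": 34}
print(reject_or_fill(mostly_ok))      # ('fill', ['call_minutes'])
print(reject_or_fill(mostly_bad)[0])  # reject
```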
As an optional implementation, missing numerical data in the abnormal sample data may be filled with the mean or the median, for example the duration of calls in the last three months between the historically identified person or ordinary person and other persons with prior records of theft, robbery, or fraud. Other, discrete data in the abnormal sample data may be filled with a first preset value, "1" or "0", for example whether the historically identified person or ordinary person has a drug-related prior record, or whether the person to be identified has a record of exchanging express deliveries with a person who has a drug-related prior record.
As another optional implementation, missing numerical data in the abnormal sample data may be filled by hot-deck imputation or cluster-based imputation, for example by filling in the closest available value using a nearest-neighbor approach or a K-nearest-neighbor approach.
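A minimal sketch of the mean, median, and preset-value filling described above follows. It assumes missing values are represented as `None`; the function names are illustrative, and hot-deck or cluster-based filling is not shown.

```python
from statistics import mean, median

def fill_numeric(values, how="mean"):
    """Fill missing numerical values with the mean or median of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known) if how == "mean" else median(known)
    return [fill if v is None else v for v in values]

def fill_discrete(values, first_preset_value=0):
    """Fill missing discrete (0/1) values with a first preset value."""
    return [first_preset_value if v is None else v for v in values]

print(fill_numeric([10, None, 20, 60], "mean"))    # [10, 30, 20, 60]
print(fill_numeric([10, None, 20, 60], "median"))  # [10, 20, 20, 60]
print(fill_discrete([1, None, 0], 0))              # [1, 0, 0]
```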
As another optional implementation, to better obtain valid data, on the basis of the above method it may further be detected whether the identity card number included in each piece of filled personnel information meets a first preset specification; if not, the identity card number is removed.
Likewise, it is detected whether the mobile phone number included in each piece of filled personnel information meets a second preset specification; if not, the mobile phone number is removed.
Because identity card numbers and mobile phone numbers follow fixed rules and are also important data for identifying personnel information, after the above removal or filling, the embodiments of the application further preprocess the mobile phone number and identity card number in the personnel information so that the obtained data is more usable.
The first preset specification may be formulated according to the rules governing the layout of identity card numbers. For example, an identity card number consists of a 17-digit body code and a 1-digit check code, arranged from left to right as: a 6-digit address code, an 8-digit date-of-birth code, a 3-digit sequence code, and a 1-digit check code. Odd sequence codes are assigned to males and even sequence codes to females. If the person to be predicted is known to be male but the 3-digit sequence code of the identity card number is even, the identity card number can be determined not to meet the first preset specification and needs to be removed.
It is understood that the second preset specification can similarly be formulated according to the rules governing the layout of mobile phone numbers, which is not described in detail here.
Therefore, abnormal sample data is handled by filling or removal, which improves the accuracy of the risk prediction model built from the filled or cleaned data.
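The identity card and mobile phone number checks described above might be sketched as follows. The check-code weights and table follow the MOD 11-2 scheme of the national standard GB 11643-1999, which the application does not spell out, and the mobile phone pattern (11 digits, starting with 1, second digit 3 to 9) is an assumed example of a second preset specification. The function names are hypothetical.

```python
import re

# Weights and check-code table of the GB 11643-1999 MOD 11-2 scheme (assumed here).
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CODES = "10X98765432"

def id_number_ok(id_number, gender=None):
    """Check layout, check code, and (optionally) the gender parity of the sequence code."""
    if not re.fullmatch(r"\d{17}[\dX]", id_number):
        return False
    total = sum(int(d) * w for d, w in zip(id_number[:17], WEIGHTS))
    if CHECK_CODES[total % 11] != id_number[17]:
        return False
    if gender is not None:
        # The parity of the 3-digit sequence code equals the parity of its last digit.
        sequence_is_odd = int(id_number[16]) % 2 == 1
        return sequence_is_odd == (gender == "male")
    return True

def phone_ok(phone):
    """Assumed second preset specification for mainland mobile numbers."""
    return re.fullmatch(r"1[3-9]\d{9}", phone) is not None

sample = "11010519491231002X"  # widely used illustrative number, even sequence code
print(id_number_ok(sample), id_number_ok(sample, "male"))  # True False
print(phone_ok("13800138000"), phone_ok("12345"))          # True False
```

A number that fails either check would be removed from the personnel information rather than corrected.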
Further, the preset standardization method may standardize each piece of personnel information according to its mean and standard deviation after abnormal data detection, using the following formula:
$$x^{*} = \frac{x - \mu}{\sigma}$$
where x is a value of the personnel information before standardization, x^{*} is the standardized value, μ is the mean of the personnel information after removal or filling, and σ is the standard deviation of the personnel information after removal or filling.
As another implementation, in this embodiment of the application, the personnel information may be standardized by min-max normalization, using the following formula:
$$x^{*} = \frac{x - \min}{\max - \min}$$
where x is a value of the personnel information before standardization, x^{*} is the standardized value, min is the minimum value of the personnel information after removal or filling, and max is the maximum value of the personnel information after removal or filling.
Therefore, the personnel information after being removed or filled is subjected to standardization processing through a preset standardization method, and the accuracy of a risk degree prediction model established by utilizing the standardized data is improved.
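The two standardization options described above, one based on the mean and standard deviation and one based on the minimum and maximum, can be sketched as follows. The function names are illustrative, and the handling of constant columns (all zeros) is an assumption added to avoid division by zero.

```python
from statistics import mean, pstdev

def z_score(values):
    """Standardize values as (x - mean) / standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # assumption: constant columns map to zero
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale values to the range [0, 1] as (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumption: constant columns map to zero
    return [(v - lo) / (hi - lo) for v in values]

print(min_max([2, 4, 6, 10]))  # [0.0, 0.25, 0.5, 1.0]
print(z_score([2, 4, 4, 4, 5, 5, 7, 9]))
```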
Referring to FIG. 5, as an optional embodiment, the risk prediction model may be established through the following steps S310 to S350.
Step S310: for the personnel information training set, establish a risk prediction model using a logistic regression algorithm.
Step S320: input the personnel information test set into the risk prediction model to obtain risk prediction values for the historically identified persons and ordinary persons in the personnel information test set.
Step S330: count the historically identified persons and ordinary persons whose risk prediction value is greater than a second preset value, and calculate the prediction accuracy from that count.
Step S340: compare whether the accuracy is smaller than a second preset threshold. If the accuracy is smaller than the second preset threshold, train the risk prediction model again using the personnel information training set until the accuracy is greater than or equal to the second preset threshold.
Step S350: obtain the established risk prediction model.
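Steps S310 to S350 can be sketched as a train, evaluate, and retrain loop. This is an illustration under stated assumptions, not the applicant's implementation: the model is a plain logistic regression fitted by stochastic gradient descent, accuracy is computed in the conventional way (share of test records whose thresholded prediction matches the label), and the toy data, learning rate, and round limit are invented for the example.

```python
import math

def predict(weights, x):
    """Logistic regression output for one record; weights[0] is the constant term."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def train_epoch(weights, data, lr=0.5):
    """One pass of stochastic gradient descent on the log-loss."""
    for x, y in data:
        err = predict(weights, x) - y
        weights[0] -= lr * err
        for j, xi in enumerate(x):
            weights[j + 1] -= lr * err * xi

def accuracy(weights, data, second_preset_value=0.5):
    """Share of records whose thresholded prediction matches the label (assumed metric)."""
    hits = sum((predict(weights, x) >= second_preset_value) == bool(y) for x, y in data)
    return hits / len(data)

def build_model(train, test, second_preset_threshold=0.9, max_rounds=1000):
    weights = [0.0] * (len(train[0][0]) + 1)
    rounds = 0
    while rounds < max_rounds:
        train_epoch(weights, train)  # S310, and retraining under S340
        rounds += 1
        if accuracy(weights, test) >= second_preset_threshold:  # S320 to S340
            break
    return weights, rounds  # S350

# Toy standardized data with a single feature; label 1 marks a historically identified person.
train = [([-2.0], 0), ([-1.0], 0), ([-0.5], 0), ([0.5], 1), ([1.0], 1), ([2.0], 1)]
test = [([-1.5], 0), ([1.5], 1), ([-0.8], 0), ([0.8], 1)]
weights, rounds = build_model(train, test)
print(rounds, accuracy(weights, test))
```

The `max_rounds` cap is a design choice added here so the loop terminates even if the threshold is never reached.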
Logistic regression (LR) is a supervised learning classification model used to estimate the probability of an event; for example, it can predict the probability that a person with a prior record of theft, robbery, or fraud commits a crime again after release from prison. The risk prediction model constructed using the logistic regression algorithm is as follows:
$$\rho = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_i x_i)}}$$
where ρ is the estimated conditional probability of the outcome of interest (e.g., the probability of committing a crime again), β_0 is a constant term, and β_1, β_2, …, β_i are the logistic regression coefficients corresponding to the predictor variables x_1, x_2, …, x_i. For the binary classification of whether a person commits a crime again, the default classification probability threshold of logistic regression (i.e., the second preset value) is 0.5; that is, if the probability that a person with a prior record of theft, robbery, or fraud commits a crime again after release from prison is greater than or equal to 0.5, the person is considered likely to commit a crime again.
However, it is understood that in practical applications a different second preset value may be selected for different situations. If the accuracy requirement for the prediction is high, a larger second preset value may be selected; if the accuracy requirement is lower, a smaller second preset value may be selected. For example, to raise the level of safety supervision and minimize the crime rate of offenders after they return to society, a smaller second preset value should be selected when predicting the risk that offenders commit crimes again, so that as many persons with a risk of committing crimes as possible are predicted.
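The effect of the second preset value can be illustrated with a short sketch. The coefficients, feature values, and probabilities below are invented for illustration; only the logistic form and the 0.5 default come from the description.

```python
import math

def risk_probability(beta0, betas, x):
    """Logistic regression probability for constant term beta0 and coefficients betas."""
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

def flag(probabilities, second_preset_value=0.5):
    """Mark each person whose probability reaches the second preset value."""
    return [p >= second_preset_value for p in probabilities]

print(risk_probability(0.0, [1.0, -2.0], [0.0, 0.0]))  # 0.5

probs = [0.2, 0.35, 0.5, 0.8]  # hypothetical model outputs
print(flag(probs, 0.5))  # [False, False, True, True]
print(flag(probs, 0.3))  # [False, True, True, True]
```

Lowering the second preset value from 0.5 to 0.3 flags one more person, which matches the stricter supervision case described above.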
As another embodiment, other methods may be selected to build the risk prediction model, such as a decision tree algorithm or a Neural Network (NN) algorithm.
The decision tree algorithm is a prediction model, and can also be used for predicting whether a person crimes again. The risk prediction model established by the decision tree algorithm is as follows:
Figure BDA0002322023990000191
where x_{pq} denotes the q-th sample data value of the personnel information of the p-th person, and E_{pq} is the mean of all sample data values of the personnel information of the p-th person.
The neural network algorithm is a nonlinear machine learning model formed by using the working principle of a biological neural network as a reference, and the specific principle can refer to the prior art and is not described herein again.
Further, the second preset threshold may be determined according to the personnel information test set. For example, if the personnel information test set includes 1000 pieces of personnel information of historically identified persons and ordinary persons, of which 900 belong to historically identified persons and 100 belong to ordinary persons, the second preset threshold may be set to 90%.
As one possible case, when the counted number of historically identified persons and ordinary persons whose risk prediction value is greater than the second preset value is 700, the calculated prediction accuracy is 70%. Since this accuracy of 70% is below the second preset threshold of 90%, the accuracy of the current risk prediction model is low and the expected prediction effect is not achieved; the risk prediction model is therefore trained again using the personnel information training set until the accuracy is greater than or equal to the second preset threshold.
As another example, when the counted number of historically identified persons and ordinary persons whose risk prediction value is greater than the second preset value is 950, the calculated prediction accuracy is 95%. Since this accuracy of 95% is above the second preset threshold of 90%, the accuracy of the current risk prediction model is high and the expected prediction effect is achieved. The current risk prediction model can then be used to predict a person's degree of risk.
As a further example, when the counted number of historically identified persons and ordinary persons whose risk prediction value is greater than the second preset value is 900, the calculated prediction accuracy is 90%. Since this accuracy of 90% is equal to the second preset threshold of 90%, the accuracy of the current risk prediction model is sufficient and the expected prediction effect is achieved. The current risk prediction model can then be used to predict a person's degree of risk.
It is to be understood that the second preset threshold may also be determined according to other manners, for example, may be an empirical value obtained through a plurality of experiments, and the like, and is not limited herein.
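The arithmetic in the three cases above (700, 900, and 950 persons out of a test set of 1000, against a second preset threshold of 90%) can be reproduced with a small sketch. The function names and the placeholder risk values 0.9 and 0.1 are assumptions; the accuracy is computed as the description states it, from the count of persons whose risk prediction value exceeds the second preset value.

```python
def prediction_accuracy(risk_values, second_preset_value=0.5):
    """Share of persons whose risk prediction value exceeds the second preset value."""
    above = sum(1 for v in risk_values if v > second_preset_value)
    return above / len(risk_values)

def model_accepted(accuracy, second_preset_threshold=0.9):
    """The model is kept once accuracy reaches the second preset threshold."""
    return accuracy >= second_preset_threshold

for count in (700, 900, 950):
    # Placeholder risk values: `count` persons above the cut, the rest below it.
    values = [0.9] * count + [0.1] * (1000 - count)
    acc = prediction_accuracy(values)
    print(count, acc, model_accepted(acc))
# 700 0.7 False
# 900 0.9 True
# 950 0.95 True
```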
In the embodiments of the application, the personnel information is divided into a personnel information training set and a personnel information test set, a risk prediction model is established for the training set using a preset algorithm, and the accuracy of the model is then obtained using the test set, until the accuracy reaches the expected effect. This improves the accuracy with which the risk prediction model predicts a person's degree of risk.
Referring to fig. 6, the present embodiment further provides a device 130 for predicting a risk of a person, the device includes:
an obtaining module 131, configured to obtain information to be predicted of a person to be predicted; and
a prediction module 132, configured to perform risk prediction according to the information to be predicted, based on a pre-established risk prediction model, to obtain a risk value of the person to be predicted.
It is clear to those skilled in the art that, for convenience and brevity of description, the detailed principle of the above-described personnel risk degree prediction apparatus 130 may refer to the corresponding process in the foregoing method, and will not be described in too much detail herein.
The present embodiment also provides a readable storage medium, in which a computer program is stored, and when the computer program is executed, the method for predicting the risk of a person is implemented.
In summary, the embodiments of the application provide a method, a device, an electronic device, and a readable storage medium for predicting a person's degree of risk. The method first obtains information to be predicted of a person to be predicted, and then performs risk prediction according to that information based on a pre-established risk prediction model to obtain a risk value for the person. This improves the efficiency of identifying the degree of risk of persons who have broken the law and reduces the difficulty of doing so.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for predicting a risk of a person, the method comprising:
acquiring information to be predicted of a person to be predicted;
and predicting the risk degree according to the information to be predicted based on a pre-established risk degree prediction model to obtain the risk degree value of the personnel to be predicted.
2. The method for predicting the degree of risk of a person according to claim 1, wherein the degree of risk prediction model is established by:
acquiring personnel information of at least one historical identification personnel and personnel information of at least one common personnel, and preprocessing each personnel information;
dividing the preprocessed personnel information into a personnel information training set and a personnel information testing set according to a preset proportion;
and establishing a risk prediction model based on the personnel information training set and the personnel information testing set.
3. The method for predicting the risk of the person according to claim 2, wherein the step of obtaining the personnel information of at least one history identification personnel and the personnel information of at least one common personnel and preprocessing each of the personnel information comprises:
acquiring personnel information of at least one historical identification personnel and personnel information of at least one common personnel, and performing abnormal data detection on the personnel information according to a preset abnormal data detection method;
according to the abnormal data detection result, removing or filling the personnel information;
and based on a preset standardization method, standardizing the personnel information after the personnel information is removed or filled.
4. The method for predicting the risk of the person according to claim 3, wherein the step of removing or filling the information of each person according to the abnormal data detection result comprises:
and for each piece of personnel information, if the number of the sample data with the abnormality in the personnel information is detected to be larger than a first preset threshold value, rejecting the sample data with the abnormality, and otherwise, filling the sample data with the abnormality.
5. The method for predicting the danger level of the person according to claim 4, wherein the person information includes an identification number and a mobile phone number, and after the step of padding the sample data with the abnormality, the method further includes:
detecting whether the identity card number included in the filled personnel information meets a first preset standard or not, and if not, rejecting the identity card number;
and detecting whether the mobile phone number included in the filled personnel information meets a second preset standard, and if not, rejecting the mobile phone number.
6. The method according to claim 4, wherein the step of padding the sample data having the abnormality comprises:
and filling the sample data with the abnormality by using the mean value, the median or the first preset value of the sample data with the abnormality.
7. The method of predicting human risk according to claim 2, wherein the step of building a risk prediction model based on the training set of human information and the testing set of human information comprises:
aiming at the personnel information training set, establishing a risk prediction model by using a logistic regression algorithm;
inputting the personnel information test set into the risk prediction model to obtain risk prediction values of various historical identification personnel and various common personnel in the personnel information test set;
counting the number of each historical identifier and each ordinary person with the risk degree prediction value larger than a second preset value, and calculating the prediction accuracy according to the number;
and comparing whether the accuracy is smaller than a second preset threshold, if so, training the risk prediction model again by using the personnel information training set until the accuracy is larger than or equal to the second preset threshold.
8. A personal risk prediction device, the device comprising:
the acquisition module is used for acquiring information to be predicted of a person to be predicted;
and the prediction module is used for predicting the risk degree according to the information to be predicted based on a pre-established risk degree prediction model to obtain the risk degree value of the personnel to be predicted.
9. An electronic device, comprising a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, and when the electronic device is running, the processor and the memory communicate via the bus, and the processor executes the machine-readable instructions to perform the steps of the method for predicting human risk according to any one of claims 1-7.
10. A readable storage medium, wherein a computer program is stored in the readable storage medium, and when executed, the computer program implements the person risk prediction method according to any one of claims 1 to 7.
CN201911301877.3A 2019-12-17 2019-12-17 Personnel risk degree prediction method and device, electronic equipment and readable storage medium Pending CN111080012A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911301877.3A CN111080012A (en) 2019-12-17 2019-12-17 Personnel risk degree prediction method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911301877.3A CN111080012A (en) 2019-12-17 2019-12-17 Personnel risk degree prediction method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN111080012A true CN111080012A (en) 2020-04-28

Family

ID=70315041

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911301877.3A Pending CN111080012A (en) 2019-12-17 2019-12-17 Personnel risk degree prediction method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN111080012A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112669188A (en) * 2021-01-12 2021-04-16 徐涛 Critical event early warning model construction method, critical event early warning method and electronic equipment
CN116579448A (en) * 2022-12-26 2023-08-11 北京码牛科技股份有限公司 Personnel contamination risk prediction method, system, intelligent terminal and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070229543A1 (en) * 2006-03-29 2007-10-04 Autodesk Inc. System for controlling deformation
CN103345544A (en) * 2013-06-11 2013-10-09 大连理工大学 Predicting organic chemical biodegradability according to logistic regression method
CN108596409A (en) * 2018-07-16 2018-09-28 江苏智通交通科技有限公司 The method for promoting traffic hazard personnel's accident risk prediction precision
CN110009224A (en) * 2019-04-02 2019-07-12 深圳市华云中盛科技有限公司 Suspect's violation probability prediction technique, device, computer equipment and storage medium
CN110245132A (en) * 2019-06-12 2019-09-17 腾讯科技(深圳)有限公司 Data exception detection method, device, computer readable storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210914

Address after: 100000 room 650, 6th floor, building 11, Huashan Garden Cultural Media Industrial Park, 1376 folk culture street, Gaobeidian village, Gaobeidian Township, Chaoyang District, Beijing

Applicant after: Beijing Zhizhi Heshu Technology Co.,Ltd.

Address before: No.310, building 4, courtyard 8, Dongbei Wangxi Road, Haidian District, Beijing

Applicant before: MININGLAMP SOFTWARE SYSTEMS Co.,Ltd.