CN111768027A - Reinforcement learning-based crime risk prediction method, medium, and computing device - Google Patents


Info

Publication number
CN111768027A
Authority
CN
China
Prior art keywords
attribute
crime
training sample
training
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010463027.XA
Other languages
Chinese (zh)
Inventor
李康顺
王梓铭
刘嘉豪
方鸿铭
雷逸舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Agricultural University
Original Assignee
South China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Agricultural University
Priority to CN202010463027.XA
Publication of CN111768027A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/086 Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Biology (AREA)
  • Tourism & Hospitality (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Educational Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a reinforcement learning-based method, medium and computing device for predicting the risk of re-offending. A training sample set is first constructed and clustered. For the N classes obtained by clustering, N BP neural networks are constructed, one per class. The attributes of each training sample are input into the BP neural network of the class to which the sample belongs, and the network is trained to obtain a re-offending risk prediction model. For a test sample whose re-offending risk is to be predicted, the distance between the test sample and each cluster center is calculated, the cluster center nearest to the test sample is selected, and the trained neural network of the cluster to which that center belongs is used as the re-offending risk prediction model for the test sample. The attributes of the test sample are input into this model, which predicts whether the test sample will re-offend. The invention thereby makes re-offending prediction more realistic, effective and accurate, and speeds up the calculation.

Description

Reinforcement learning-based crime risk prediction method, medium, and computing device
Technical Field
The invention relates to the technical field of crime prediction, and in particular to a reinforcement learning-based crime risk prediction method, medium and computing device.
Background
Crime is a social phenomenon of human society. As society has progressed, and especially with the rapid development of modern science and technology, crime has changed greatly in number, scale, method and degree of social harm, and the threat it poses to society has become more serious. Practice has shown that combating crime after the fact treats only the symptoms and is far from sufficient, so people hope to prevent crime.
Three special groups, namely parolees, offenders temporarily serving their sentences outside prison, and offenders released after prison rehabilitation, are very prone to re-offend because of factors such as poor social adaptability and an unstable psychological state. When such persons re-offend, their methods are harder to guard against, which poses a great threat to society. Correctly predicting the likelihood that such persons will re-offend and issuing accurate risk warnings therefore has important social significance and is one of the problems society currently needs to solve.
Research abroad on early-warning techniques for the risk of re-offending has been carried out for about a century, whereas domestic research is relatively scarce. In China, questionnaires and scales are mainly used; the assessment generally considers only the basic information of the offender, and the data have few dimensions and a small scale. Moreover, much domestic research remains at the theoretical level, and only a small number of scholars predict directly on the data of the persons concerned using neural networks, random forests, classification trees and the like. However, differences in background, living environment and past experience lead to different crime factors and probabilities for different persons. It is therefore difficult to predict re-offending effectively and to give timely warning using these methods alone.
Disclosure of Invention
The first object of the invention is to overcome the defects and shortcomings of the prior art by providing a reinforcement learning-based crime risk prediction method that can predict the re-offending behavior of the persons concerned with high prediction accuracy and high calculation speed.
A second object of the invention is to provide a storage medium.
A third object of the invention is to provide a computing device.
The first object of the invention is achieved by the following technical solution: a reinforcement learning-based crime risk prediction method comprising the following steps:
S1, obtaining training samples to form a training sample set; the training samples comprise persons with a criminal record who re-offended and persons with a criminal record who did not re-offend;
S2, clustering the training samples according to the continuous attributes and the categorical attributes of the training samples in the training sample set, and defining the number of clusters obtained as N, where N is a constant; that is, all training samples in the training sample set are clustered into N classes;
S3, constructing N corresponding neural networks, one for each of the N classes obtained by clustering;
S4, inputting the continuous attributes and the one-hot encoded categorical attributes of each training sample into the neural network corresponding to the class to which the training sample belongs, and training the neural network to obtain a re-offending risk prediction model;
S5, for a test sample whose re-offending risk is to be predicted, calculating the distance between the test sample and each cluster center, selecting the cluster center with the minimum distance from the test sample, and taking the trained neural network corresponding to the cluster to which that cluster center belongs as the re-offending risk prediction model of the test sample;
S6, inputting the continuous attributes and the one-hot encoded categorical attributes of the test sample into the re-offending risk prediction model of the test sample, and predicting the re-offending behavior of the test sample with the model.
Preferably, the continuous attributes include age, height, weight and education level; the categorical attributes include gender, date of offence and type of prior offence.
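As an illustration of the network input described in S4 and S6, the following minimal Python sketch concatenates continuous values with one-hot encoded categorical values. The field names, the value domains and the helper names `one_hot` and `encode` are assumptions made for this example and do not come from the patent text.

```python
def one_hot(value, domain):
    """Return a 0/1 list with a single 1 at the position of `value` in `domain`."""
    return [1.0 if value == v else 0.0 for v in domain]

def encode(sample, continuous_keys, domains):
    """Concatenate continuous values with one-hot encoded categorical values."""
    vector = [float(sample[k]) for k in continuous_keys]
    for key, domain in domains.items():
        vector.extend(one_hot(sample[key], domain))
    return vector

# Hypothetical field names and value domains, chosen for illustration only.
DOMAINS = {"gender": ["male", "female"],
           "crime_type": ["violence", "theft", "robbery"]}
sample = {"age": 0.08, "gender": "male", "crime_type": "violence"}
features = encode(sample, ["age"], DOMAINS)
print(features)  # [0.08, 1.0, 0.0, 1.0, 0.0, 0.0]
```

The resulting vector has one entry per continuous attribute plus one entry per value in each categorical domain, so its length fixes the input width of each network.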
Preferably, the clustering of the training samples in S2 specifically comprises:
S21, selecting N samples from the training sample set as initial cluster centers according to the required number of clusters N, specifically:
first, taking any training sample from the training sample set as the first initial cluster center;
then, selecting from the training sample set the training sample whose sum of distances to the existing cluster centers is largest as a new initial cluster center, until N initial cluster centers have been selected;
S22, calculating the distance between each training sample in the training sample set and each cluster center, and assigning each training sample to the class of its nearest cluster center;
S23, calculating an objective function from the distance between each training sample and the center of the cluster to which it belongs, and judging whether the value of the objective function is unchanged compared with the previously calculated value;
if not, updating the cluster center of each class according to the training samples currently in that class, and returning to step S22;
if yes, clustering is finished.
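The control flow of S21 to S23 can be sketched as follows. This is a minimal Python outline, not the patented implementation: the function name `cluster` and the caller-supplied `dist` and `make_center` callables are assumptions, the first center is simply the first sample, and the toy usage substitutes a one-dimensional squared difference for the mixed-attribute distance.

```python
def cluster(samples, n, dist, make_center, max_iter=100):
    """Cluster `samples` into `n` classes following the S21-S23 loop.

    dist(sample, center) -> float and make_center(members) -> center are
    supplied by the caller; both names are assumptions of this sketch.
    """
    # S21: the first center is an arbitrary sample (here the first one); each
    # further center is the sample with the largest summed distance to the
    # centers chosen so far.
    centers = [make_center([samples[0]])]
    while len(centers) < n:
        far = max(samples, key=lambda s: sum(dist(s, c) for c in centers))
        centers.append(make_center([far]))

    previous = None
    for _ in range(max_iter):
        # S22: assign every sample to its nearest center.
        labels = [min(range(n), key=lambda j: dist(s, centers[j])) for s in samples]
        # S23: objective = total distance of the samples to their own center.
        objective = sum(dist(s, centers[j]) for s, j in zip(samples, labels))
        if previous is not None and objective == previous:
            break
        previous = objective
        # Update each center from its current members (kept if the class is empty).
        for j in range(n):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                centers[j] = make_center(members)
    return labels, centers

# Toy usage with one numeric attribute: squared difference and the mean.
points = [0.0, 0.1, 0.2, 0.9, 1.0]
labels, centers = cluster(points, 2,
                          dist=lambda x, c: (x - c) ** 2,
                          make_center=lambda m: sum(m) / len(m))
print(labels, centers)
```

Passing the distance and the center update as arguments keeps the loop independent of how mixed continuous and categorical attributes are compared.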
Further, the distance between each training sample in the training sample set and a cluster center is:
$$d(x_i, K_j) = \sum_{m=1}^{p} \omega_m \left(x_{im} - \bar{x}_{jm}\right)^2 + \gamma \sum_{l=1}^{q} \omega_l \sum_{w=1}^{t_l} \left(\delta(x_{il}, a_l^{w}) - f_j(a_l^{w})\right)^2$$
where:
$x_i$ is the i-th training sample in the training data set;
$K_j$ is the cluster center of class j;
$p$ is the total number of continuous attributes of the training samples;
$\omega_m$ is the weight of the m-th continuous attribute;
$x_{im}$ is the value of the m-th continuous attribute of the i-th training sample;
$\bar{x}_{jm}$ is the average value of the m-th continuous attribute over all training samples in class j;
$\gamma$ is the weight of the categorical attributes relative to the continuous attributes;
$q$ is the total number of categorical attributes of the training samples;
$\omega_l$ is the weight of the l-th categorical attribute;
$t_l$ is the number of values in the value domain of the l-th categorical attribute;
$a_l^{w}$ is the w-th value in the value domain of the l-th categorical attribute;
$x_{il}$ is the value of the l-th categorical attribute of the i-th training sample;
$f_j(a_l^{w})$ is the frequency with which training samples whose l-th categorical attribute takes the value $a_l^{w}$ occur in class j;
$\delta(u, v)$ equals 1 if $u = v$ and 0 otherwise.
The weight $\omega_m$ of the m-th continuous attribute and the weight $\omega_l$ of the l-th categorical attribute are, respectively:
$$\omega_m = \frac{1 - e_m}{\sum_{x=1}^{p} (1 - e_x) + \sum_{y=1}^{q} (1 - E_y)}$$
$$\omega_l = \frac{1 - E_l}{\sum_{x=1}^{p} (1 - e_x) + \sum_{y=1}^{q} (1 - E_y)}$$
where:
$$e_m = -\frac{1}{\ln I} \sum_{i=1}^{I} p_{im} \ln p_{im}$$
$$p_{im} = \frac{x_{im}}{\sum_{i'=1}^{I} x_{i'm}}$$
$$E_l = -\frac{1}{\ln t_l} \sum_{w=1}^{t_l} P_{wl} \ln P_{wl}$$
$$P_{wl} = \frac{Y_{wl}}{I}$$
where $e_m$ is the information entropy of the m-th continuous attribute and $e_x$ is the information entropy of the x-th continuous attribute, x = 1, 2, 3, ..., p;
$E_l$ is the information entropy of the l-th categorical attribute and $E_y$ is the information entropy of the y-th categorical attribute, y = 1, 2, 3, ..., q;
$Y_{wl}$ is the number of training samples in the training sample set whose l-th categorical attribute takes its w-th value;
$I$ is the total number of training samples in the training sample set.
Further, the distance between the test sample and each cluster center is:
$$d(x_t, K_j) = \sum_{m=1}^{p} \omega_m \left(x_{tm} - \bar{x}_{jm}\right)^2 + \gamma \sum_{l=1}^{q} \omega_l \sum_{w=1}^{t_l} \left(\delta(x_{tl}, a_l^{w}) - f_j(a_l^{w})\right)^2$$
where $x_{tm}$ is the value of the m-th continuous attribute of the test sample $x_t$;
$x_{tl}$ is the value of the l-th categorical attribute of the test sample $x_t$.
Further, the objective function F(X, P) is:
$$F(X, P) = \sum_{j=1}^{N} \sum_{x_i \in J_j} d(x_i, K_j)$$
where $J_j$ is the set of training samples of class j, whose cluster center is $K_j$.
Further, in step S23, the cluster center of each class is updated from the training samples currently in that class as follows:
Step S231, for each class, calculating the average value of each continuous attribute over all training samples in the class:
$$\bar{x}_{jm} = \frac{1}{n_j} \sum_{x_i \in J_j} x_{im}, \quad m = 1, 2, 3, \ldots, p$$
where $J_j$ is the set of training samples in class j, $n_j$ is the total number of training samples in class j, $x_{im}$ is the value of the m-th continuous attribute of the i-th training sample in class j, $\bar{x}_{jm}$ is the average value of the m-th continuous attribute over all training samples in class j, and p is the total number of continuous attributes of the training samples;
for each class, obtaining the value of each categorical attribute of all training samples in the class, and counting the frequency of each value in the value domain of each categorical attribute:
$$f_j(a_l^{w}) = \frac{\left|\{x_i \in J_j : x_{il} = a_l^{w}\}\right|}{n_j}$$
where $a_l^{w}$ is the w-th value in the value domain of the l-th categorical attribute, $f_j(a_l^{w})$ is the frequency with which that value occurs in class j, w = 1, 2, 3, ..., $t_l$, and $t_l$ is the number of values in the value domain of the l-th categorical attribute;
Step S232, taking the obtained $\bar{x}_{jm}$, m = 1, 2, 3, ..., p, and $f_j(a_l^{w})$, l = 1, 2, 3, ..., q, as the attributes of the new cluster center, thereby obtaining the new cluster center.
Preferably, the neural network is a BP neural network;
in step S3, a genetic algorithm is used to set the initial parameters of each of the N BP neural networks, specifically:
Step S31, randomly generating initial parameters of the BP neural network as the initial population, and setting the maximum number of iterations, the stopping error, the crossover probability and the mutation probability of the genetic algorithm;
Step S32, performing selection, crossover and mutation using a tournament selection strategy, a uniform crossover strategy and a uniform mutation strategy, respectively;
Step S33, calculating the fitness of the individuals of each generation; when the fitness is smaller than the stopping error or the number of iterations exceeds the maximum number of iterations, stopping the algorithm and returning the last individual as the initial input parameters of the BP neural network; otherwise, returning to step S32.
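Steps S31 to S33 can be illustrated with the following minimal Python sketch. The patent does not specify the fitness function, population size or parameter range, so the function name `genetic_init`, all default values, the use of elitism and the stand-in error function `toy_error` are assumptions; fitness is treated as an error to be minimized because the algorithm stops when it falls below the stopping error.

```python
import random

def genetic_init(fitness, n_params, pop_size=20, max_iter=200, stop_error=1e-3,
                 p_cross=0.8, p_mut=0.1, low=-1.0, high=1.0, seed=0):
    """Search for initial parameters that make `fitness` (an error, lower is better) small."""
    rng = random.Random(seed)
    # S31: random initial population of parameter vectors.
    pop = [[rng.uniform(low, high) for _ in range(n_params)] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(max_iter):
        # S33: stop once the best error is below the stopping error.
        if fitness(best) < stop_error:
            break
        children = [best[:]]  # elitism: keeping the best individual is an assumption
        while len(children) < pop_size:
            # S32, tournament selection: the better of two random individuals wins.
            a = min(rng.sample(pop, 2), key=fitness)
            b = min(rng.sample(pop, 2), key=fitness)
            child = a[:]
            # Uniform crossover: each gene comes from either parent with equal probability.
            if rng.random() < p_cross:
                child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Uniform mutation: a gene is redrawn uniformly from [low, high].
            child = [rng.uniform(low, high) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = min(pop, key=fitness)
    return best

# Stand-in error function; in the method this would be the BP network's error.
toy_error = lambda params: sum(g * g for g in params)
init = genetic_init(toy_error, n_params=4)
print(toy_error(init))
```

The returned vector would then be used as the starting weights of one BP neural network before ordinary backpropagation training.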
The second object of the invention is achieved by the following technical solution: a storage medium storing a program which, when executed by a processor, implements the reinforcement learning-based crime risk prediction method according to the first object of the invention.
The third object of the invention is achieved by the following technical solution: a computing device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the reinforcement learning-based crime risk prediction method according to the first object of the invention.
Compared with the prior art, the invention has the following advantages and effects:
(1) The reinforcement learning-based crime risk prediction method of the invention first constructs a training sample set and then clusters it; N corresponding BP neural networks are constructed for the N classes obtained by clustering; the attributes of each training sample are input into the BP neural network of the class to which the sample belongs, and the network is trained to obtain a re-offending risk prediction model. For a test sample whose re-offending risk is to be predicted, the distance between the test sample and each cluster center is calculated, the cluster center with the minimum distance is selected, the trained neural network of the corresponding cluster is used as the re-offending risk prediction model of the test sample, and the attributes of the test sample are input into this model to predict its re-offending behavior. Because the training samples are first divided by clustering according to their attributes and a separate neural network is then built for each class, the data input to each network share similar characteristics and the training data are more targeted, so that re-offending prediction is more realistic, effective and accurate and the calculation is faster.
(2) In the method, the initial cluster centers are chosen not at random but as the N points that are farthest apart. The selected initial cluster centers therefore differ more from one another on average, the clustering is less likely to be trapped in a local optimum, the algorithm is more stable, and the clustering result does not fluctuate greatly with the randomness of the initial center selection.
(3) When the distance between a sample and a cluster center is calculated during clustering, the distance between a categorical attribute and the cluster center is described, by analogy with the continuous attributes, through a Euclidean distance. Existing clustering algorithms use a Hamming distance with simple 0-1 matching, which cannot describe well the dissimilarity between an object and a cluster center. The new description makes the dissimilarity measure for categorical attributes consistent with that for continuous attributes and hence more convincing. In addition, the method uses information entropy to describe the amount of information contained in each attribute and assigns each attribute a corresponding weight, so that the importance of each attribute and its influence on the final result are expressed more accurately.
(4) In the method, a genetic algorithm sets the initial parameters of the BP neural network. A BP neural network can in theory approximate any continuous function, and the genetic algorithm mitigates its tendency to fall into a local optimum, so that the network reaches the global optimum more easily; better initial parameters also help accelerate the convergence of the BP neural network.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a flow chart of training sample clustering in the method of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Examples
This embodiment discloses a reinforcement learning-based re-offending risk prediction method that can be used to predict the criminal behavior of persons with a criminal record, so that such persons can be monitored and given intervention in a targeted manner and the impact of re-offending on society is reduced. The steps of the crime risk prediction method of this embodiment are shown in FIG. 1 and include:
S1, obtaining training samples to form a training sample set; the training samples include persons with a criminal record who re-offended and persons with a criminal record who did not re-offend.
Assume that the training sample set includes the training samples shown in Table 1, with the following data for each sample:
TABLE 1
No.   Gender   Age   Crime type   Re-offended
a1    Male     19    Violence     Yes
a2    Male     27    Theft        No
a3    Male     45    Robbery      Yes
a4    Female   33    Theft        No
a5    Male     26    Violence     No
a6    Female   16    Theft        No
a7    Male     52    Theft        No
In Table 1, age is the continuous attribute of the training samples, and gender and crime type are the categorical attributes. In this embodiment, the continuous attributes of a training sample may include age, height, weight and education level; the categorical attributes may include gender, date of offence and type of prior offence.
In this embodiment, the continuous attributes of the training samples are normalized, giving the data shown in Table 2:
TABLE 2
No.   Gender   Age    Crime type   Re-offended
a1    Male     0.08   Violence     Yes
a2    Male     0.31   Theft        No
a3    Male     0.81   Robbery      Yes
a4    Female   0.47   Theft        No
a5    Male     0.28   Violence     No
a6    Female   0      Theft        No
a7    Male     1      Theft        No
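The text does not name the normalization used. The Python sketch below assumes min-max scaling, which reproduces the ages in Table 2 from those in Table 1 when rounded to two decimals; the helper name `min_max` is chosen for this example.

```python
ages = {"a1": 19, "a2": 27, "a3": 45, "a4": 33, "a5": 26, "a6": 16, "a7": 52}

def min_max(values):
    """Scale a dict of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values.values()), max(values.values())
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}

normalized = {k: round(v, 2) for k, v in min_max(ages).items()}
print(normalized)
# {'a1': 0.08, 'a2': 0.31, 'a3': 0.81, 'a4': 0.47, 'a5': 0.28, 'a6': 0.0, 'a7': 1.0}
```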
S2, each training sample acquired in step S1 is first preprocessed to remove redundant items and missing items. Redundant items are attributes of the training samples that have no influence on the re-offending prediction result. A missing item is an attribute whose value is empty for some training sample; attributes with a large number of missing values are removed.
The training samples are then clustered according to the continuous attributes and the categorical attributes of each training sample in the training sample set. The number of clusters obtained is defined as N, where N is a constant; that is, all training samples in the training sample set are clustered into N classes. In this embodiment, N may be 2 to 4.
In this embodiment, as shown in FIG. 2, the training samples are clustered as follows:
S21, selecting N samples from the training sample set as initial cluster centers according to the required number of clusters N, specifically:
first, taking any training sample from the training sample set as the first initial cluster center;
then, selecting from the training sample set the training sample whose sum of distances to the existing cluster centers is largest as a new initial cluster center, i.e. the next initial cluster center, until N initial cluster centers have been selected.
In this embodiment, if N is 2, 2 initial cluster centers are selected in step S21: after the 1st cluster center has been selected, the training sample in the training sample set that is farthest from the 1st cluster center is selected as the 2nd cluster center.
If N is 3, 3 initial cluster centers are selected in step S21: after the 2nd cluster center has been selected, the training sample in the training sample set with the largest sum of distances to the 1st and 2nd cluster centers is selected as the 3rd cluster center.
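The selection rule of S21 can be tried on the normalized samples of Table 2. The Python sketch below is an illustration under stated assumptions: it uses the attribute weights 0.41, 0.31 and 0.28 that the embodiment derives for age, gender and crime type, takes γ = 1, and measures the distance to a center made of a single sample, so a differing categorical value contributes (1-0)^2 + (0-1)^2 = 2. The function names are hypothetical.

```python
# Normalised samples from Table 2: (gender, age, crime type).
SAMPLES = {
    "a1": ("male", 0.08, "violence"), "a2": ("male", 0.31, "theft"),
    "a3": ("male", 0.81, "robbery"),  "a4": ("female", 0.47, "theft"),
    "a5": ("male", 0.28, "violence"), "a6": ("female", 0.00, "theft"),
    "a7": ("male", 1.00, "theft"),
}
# Weights from the embodiment; gamma = 1 is an assumption of this sketch.
W_AGE, W_GENDER, W_CRIME, GAMMA = 0.41, 0.31, 0.28, 1.0

def sample_distance(s, c):
    """Distance from sample s to a center consisting of the single sample c."""
    d = W_AGE * (s[1] - c[1]) ** 2
    d += GAMMA * W_GENDER * (0.0 if s[0] == c[0] else 2.0)
    d += GAMMA * W_CRIME * (0.0 if s[2] == c[2] else 2.0)
    return d

def initial_centers(first, n):
    """Pick n initial centers: `first`, then repeatedly the sample with the largest distance sum."""
    centers = [first]
    while len(centers) < n:
        candidates = [k for k in SAMPLES if k not in centers]
        centers.append(max(candidates, key=lambda k: sum(
            sample_distance(SAMPLES[k], SAMPLES[c]) for c in centers)))
    return centers

print(initial_centers("a2", 2))  # ['a2', 'a3']
```

Starting from a2, the farthest sample is a3, which agrees with the second center (age 0.81) used in the embodiment's worked distance calculation.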
In this embodiment, for any cluster center $K_j$, the m-th continuous attribute of the class to which the cluster center belongs is defined as the average value $\bar{x}_{jm}$ of the m-th continuous attribute over all training samples in the class. The l-th categorical attribute of the class to which the cluster center belongs is defined as:
$$\left(f_j(a_l^{1}), f_j(a_l^{2}), \ldots, f_j(a_l^{t_l})\right), \quad l = 1, 2, 3, \ldots, q$$
where $a_l^{w}$ is the w-th value in the value domain of the l-th categorical attribute, $f_j(a_l^{w})$ is the frequency with which the value $a_l^{w}$ occurs in class j, w = 1, 2, 3, ..., $t_l$, and $t_l$ is the number of values in the value domain of the l-th categorical attribute. For the data shown in Table 1 there are only two categorical attributes, gender and crime type, so l refers to gender or crime type. When l is gender, the value domain of the l-th categorical attribute is (male, female), i.e. it contains 2 values, so $t_l$ is 2 and w is 1 or 2, with $a_l^{1}$ being male and $a_l^{2}$ being female. When l is crime type, the value domain of the l-th categorical attribute is (violence, theft, robbery), i.e. it contains 3 values, so $t_l$ is 3 and w is 1, 2 or 3, with $a_l^{1}$ being violence, $a_l^{2}$ being theft and $a_l^{3}$ being robbery.
In this embodiment, suppose a2 is randomly chosen as the first initial cluster center. Since clustering has not yet started, the class to which this cluster center belongs contains no training sample other than the cluster center itself. From the above, the m-th continuous attribute of the class to which the cluster center belongs, i.e. age, is:
$$\bar{x}_{jm} = 0.31$$
As shown in Table 1, there is only 1 continuous attribute, so m and p are both 1. Likewise, the l-th categorical attribute of the class to which the cluster center belongs is:
$$\left(f_j(a_1^{1}), f_j(a_1^{2})\right) = (1.0, 0)$$
i.e. when l is 1, $f_j(a_1^{1})$ and $f_j(a_1^{2})$ are 1.0 and 0, respectively;
$$\left(f_j(a_2^{1}), f_j(a_2^{2}), f_j(a_2^{3})\right) = (0, 1.0, 0)$$
i.e. when l is 2, $f_j(a_2^{1})$, $f_j(a_2^{2})$ and $f_j(a_2^{3})$ are 0, 1.0 and 0, respectively.
Here the 1st and 2nd categorical attributes are gender and crime type, respectively.
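A cluster center of this form, a mean for the continuous attribute plus a frequency vector per categorical attribute, can be built as in the following minimal Python sketch. The dictionary layout and the name `make_center` are assumptions for illustration; the same function also serves as the center update once a class has several members.

```python
DOMAINS = {"gender": ["male", "female"],
           "crime_type": ["violence", "theft", "robbery"]}

def make_center(members):
    """Build a cluster center: mean of the continuous attribute plus, for each
    categorical attribute, the relative frequency of every value in its domain."""
    n = len(members)
    center = {"age": sum(s["age"] for s in members) / n}
    for key, domain in DOMAINS.items():
        center[key] = [sum(1 for s in members if s[key] == v) / n for v in domain]
    return center

a2 = {"age": 0.31, "gender": "male", "crime_type": "theft"}
a4 = {"age": 0.47, "gender": "female", "crime_type": "theft"}

print(make_center([a2]))
# {'age': 0.31, 'gender': [1.0, 0.0], 'crime_type': [0.0, 1.0, 0.0]}
print(make_center([a2, a4]))
```

For the single member a2 this gives the values stated above: age 0.31, gender frequencies (1.0, 0) and crime-type frequencies (0, 1.0, 0).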
In this embodiment, the formula d (x) is calculated by clustering the training samples with the cluster centersi,Kj) Comprises the following steps:
Figure BDA0002511685460000101
Figure BDA0002511685460000102
wherein:
xiexpressed as the ith training sample in the training dataset;
Kja cluster center denoted as class j;
p represents the total number of classes of the training sample continuum attribute,
ωmrepresenting the weight of the mth continuous type attribute;
xima value representing an mth continuous-type attribute of an ith training sample;
Figure BDA0002511685460000103
representing the average value of the m continuous type attributes of all training samples in the j class;
γ is a weight of the categorical attribute with respect to the continuous attribute, and in the present embodiment, γ takes 0.8 to 1.2.
q represents the total number of categories of the training sample typing attributes;
ωlrepresenting the weight of the first type attribute;
tlthe number of the median values in the first type attribute value domain;
Figure BDA0002511685460000104
representing the w value in the l type attribute value field;
xila value representing the ith type attribute of the ith training sample;
Figure BDA0002511685460000105
the value representing the property of the first type is
Figure BDA0002511685460000106
The frequency of occurrence of the training samples in class j;
weight ω of mth continuous attributemAnd the weight ω of the first type attributelRespectively as follows:
Figure BDA0002511685460000107
Figure BDA0002511685460000108
wherein:
Figure BDA0002511685460000109
Figure BDA0002511685460000111
Figure BDA0002511685460000112
Figure BDA0002511685460000113
wherein e ismEntropy of information as the m-th continuous type attribute, exInformation entropy of the xth continuous type attribute, x is 1, 2,3.
ElEntropy of information for the type I attribute, EyInformation entropy of the y-th subtype attribute, wherein y is 1, 2,3.
xi′mThe value of the m continuous type attribute of the ith training sample is obtained;
Ywlthe times of the appearance of the training sample of the w-th value of the l-th type attribute in the training sample set;
i is the total number of training samples in the set of training samples.
Based on the data as shown in Table 1, emI.e. e1Comprises the following steps:
Figure BDA0002511685460000114
wherein
Figure BDA0002511685460000115
Figure BDA0002511685460000116
To obtain the final e1Is 0.81.
For the categorical attribute gender, i.e., when l is 1, tlIs the number of 2, and the number of the second,
Figure BDA0002511685460000117
where
Figure BDA0002511685460000121
The resulting $E_1$ is 0.86.
$E_2$ is obtained in the same way; when l = 2, the corresponding categorical attribute is crime type.
Based on the information entropies obtained above, the weight $\omega_m$ of the 1st continuous attribute, age, is:
Figure BDA0002511685460000122
The weight $\omega_l$ of the 1st categorical attribute, gender, is:
$\omega_l = 0.31$, l = 1;
The weight $\omega_l$ of the 2nd categorical attribute, crime type, is:
$\omega_l = 0.28$, l = 2.
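The entropy-based weighting can be illustrated with a minimal sketch. The entropy and weight formulas appear in this text only as figures, so the sketch assumes base-2 Shannon entropy over value counts and the common entropy-weight normalisation, in which each weight is proportional to one minus the entropy. The function names and the input numbers are hypothetical and are not taken from Table 1.

```python
import math


def shannon_entropy(counts):
    """Base-2 Shannon entropy of a categorical attribute from its value counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_weights(entropies):
    """Assumed normalisation: each weight is proportional to (1 - entropy)."""
    slack = [1.0 - e for e in entropies]
    total = sum(slack)
    return [s / total for s in slack]


# Hypothetical counts: a two-valued attribute split 2 / 5 over seven samples.
example_entropy = shannon_entropy([2, 5])

# Hypothetical entropies for three attributes; lower entropy gets a larger weight.
example_weights = entropy_weights([0.5, 0.75, 0.75])
```

Under this assumed normalisation the weights always sum to 1, which is consistent with the three weights quoted in the worked example (0.41, 0.31 and 0.28).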
S22, based on the formula for the distance between a training sample and a cluster center, calculating the distance between each training sample in the training sample set and each cluster center, and assigning each training sample to the class of its nearest cluster center.
In this embodiment, the distance between each training sample and each cluster center is calculated from the data in Tables 1 and 2. For example, for training sample a1, the distances to the cluster center $K_1$ of class 1 and the cluster center $K_2$ of class 2 are calculated. If $K_1$ and $K_2$ are the initial cluster centers obtained in step S21 above, then with the values of $\omega_m$ and $\omega_l$ derived in step S21:
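$$d(a_1, K_1) = 0.41 \times (0.08 - 0.31)^2 + 0.31 \times \left((1-1)^2 + 0^2\right) + 0.28 \times \left((1-0)^2 + 1^2 + 0^2\right) = 0.5817;$$
$$d(a_1, K_2) = 0.41 \times (0.08 - 0.81)^2 + 0.31 \times \left((1-1)^2 + 0^2\right) + 0.28 \times \left((1-0)^2 + 0^2 + 1^2\right) = 0.7785;$$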
Thus, training sample a1 is assigned to the class with cluster center $K_1$. Similarly, the distances from a2 through a7 to $K_1$ and $K_2$ are calculated; based on the results, a2 is assigned to the class of $K_1$, a3 to the class of $K_2$, and a4, a5, a6 and a7 to the class of $K_1$.
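The arithmetic of the two worked distances for a1 can be checked with a short sketch. The grouping of the squared terms is copied from the example itself. The helper name `weighted_sum` is hypothetical, and γ is not applied because it is not visible in the worked numbers.

```python
# Weights from the worked example: one continuous and two categorical attributes.
W_CONTINUOUS, W_CATEGORICAL_1, W_CATEGORICAL_2 = 0.41, 0.31, 0.28


def weighted_sum(continuous_diff, categorical_1_terms, categorical_2_terms):
    """Weighted sum of squared terms, grouped per attribute as in the example."""
    return (W_CONTINUOUS * continuous_diff ** 2
            + W_CATEGORICAL_1 * sum(t ** 2 for t in categorical_1_terms)
            + W_CATEGORICAL_2 * sum(t ** 2 for t in categorical_2_terms))


d_a1_k1 = weighted_sum(0.08 - 0.31, [1 - 1, 0], [1 - 0, 1, 0])
d_a1_k2 = weighted_sum(0.08 - 0.81, [1 - 1, 0], [1 - 0, 0, 1])

# The sample goes to the class of the nearer cluster center.
nearest = "K1" if d_a1_k1 < d_a1_k2 else "K2"
```

Rounded to four decimal places, the two sums reproduce the stated 0.5817 and 0.7785, so a1 falls in the class of K1.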
S23, calculating the objective function from the distance between each training sample and the cluster center of the class to which it belongs, and determining whether the value of the objective function is unchanged from the previously calculated value.
If not, updating the cluster center of each class according to the training samples currently in that class, and returning to step S22;
if yes, clustering is finished.
In this embodiment, if the cluster centers in step S22 are the initial cluster centers, that is, if clustering is being performed for the first time, there is no previously calculated value of the objective function in step S23; in that case, the objective function is treated as having changed from the previous value.
In this embodiment, the objective function F (X, P) is:
Figure BDA0002511685460000131
Figure BDA0002511685460000132
$\{J_j\}$ is the set of training samples of class j, whose cluster center is $K_j$.
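The loop formed by steps S22 and S23 can be sketched as follows. This is a minimal illustration: the objective is assumed to be the total distance of the samples to their own cluster centers (the exact F(X, P) is given only as a figure), the demonstration uses one-dimensional numbers with a squared difference in place of the mixed-attribute distance, and `cluster_until_stable` and `max_rounds` are hypothetical names.

```python
def cluster_until_stable(samples, centers, distance, update_center, max_rounds=100):
    """Alternate assignment and center updates until the objective stops changing."""
    previous = None  # first pass: no earlier objective, so treat it as changed
    labels = []
    for _ in range(max_rounds):
        # Assign every sample to its nearest cluster center.
        labels = [min(range(len(centers)), key=lambda j: distance(x, centers[j]))
                  for x in samples]
        # Assumed objective: total distance of samples to their own centers.
        objective = sum(distance(x, centers[j]) for x, j in zip(samples, labels))
        if previous is not None and objective == previous:
            break  # unchanged objective: clustering is finished
        previous = objective
        # Recompute each center from its members; keep it if the class is empty.
        centers = [
            update_center([x for x, j in zip(samples, labels) if j == k])
            if k in labels else centers[k]
            for k in range(len(centers))
        ]
    return centers, labels


# Stand-in 1-D data with squared difference and mean update (not the patent's formula).
samples = [1.0, 2.0, 9.0, 10.0]
final_centers, final_labels = cluster_until_stable(
    samples,
    centers=[1.0, 2.0],
    distance=lambda x, c: (x - c) ** 2,
    update_center=lambda members: sum(members) / len(members),
)
```

Setting `previous` to `None` mirrors the rule that the first pass has no earlier objective value and is therefore always treated as changed.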
In this embodiment, the specific process for updating the cluster center of each class is as follows:
Step S231, for each class, calculating the mean of each continuous attribute over all training samples in the class:
Figure BDA0002511685460000133
where $n_j$ is the total number of training samples in class j, $x_{im}$ represents the value of the m-th continuous attribute of the i-th training sample in class j,
Figure BDA0002511685460000134
represents the mean of the m-th continuous attribute over all training samples in class j, and p represents the total number of continuous attributes of the training samples;
for each class, obtaining the value of each categorical attribute of all training samples in the class, and counting the frequency of each value in the domain of each categorical attribute:
Figure BDA0002511685460000135
where
Figure BDA0002511685460000136
represents the w-th value in the domain of the l-th categorical attribute, and
Figure BDA0002511685460000137
represents the frequency with which the l-th categorical attribute takes the value
Figure BDA0002511685460000138
in class j, that is, the frequency of occurrence in class j of the w-th value in the domain of the l-th categorical attribute; w = 1, 2, 3, ..., $t_l$, where $t_l$ is the number of values in the domain of the l-th categorical attribute.
Step S232, taking the obtained
Figure BDA0002511685460000139
m = 1, 2, 3, ..., p, and
Figure BDA00025116854600001310
l = 1, 2, 3, ..., q, as the attributes of the new cluster center, thereby obtaining the new cluster center.
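The center update described above can be sketched as follows, assuming that the frequency of a value is its count in the class divided by the class size. The dictionary layout, the attribute names `age` and `gender`, and the sample values are hypothetical.

```python
from collections import Counter


def update_center(members, continuous_keys, categorical_domains):
    """Build a new cluster center from the samples currently in one class."""
    n = len(members)
    center = {}
    # Mean of each continuous attribute over the members of the class.
    for key in continuous_keys:
        center[key] = sum(sample[key] for sample in members) / n
    # Frequency of every value in the domain of each categorical attribute.
    for key, domain in categorical_domains.items():
        counts = Counter(sample[key] for sample in members)
        center[key] = {value: counts[value] / n for value in domain}
    return center


# Hypothetical class with four members.
members = [
    {"age": 0.25, "gender": "M"},
    {"age": 0.5, "gender": "M"},
    {"age": 0.75, "gender": "F"},
    {"age": 0.5, "gender": "M"},
]
new_center = update_center(members, ["age"], {"gender": ["M", "F"]})
```

The resulting center holds a number for each continuous attribute and a frequency table for each categorical attribute, which is why the worked distances contain one squared term per value in a categorical domain.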
S3, constructing N corresponding BP neural networks, one for each of the N classes obtained by clustering; in this embodiment, a genetic algorithm is used to set the initial parameters of each of the N BP neural networks, as follows:
Step S31, randomly generating initial parameters of the BP neural network as the initial population, and setting the maximum number of iterations, the stopping error, the crossover probability and the mutation probability of the genetic algorithm;
In this embodiment, 4 individuals are randomly generated as the initial population of the genetic algorithm; the individuals are [1.0,1.1,1.2,1.3,1.3,1.5], [1.6,1.7,1.8,1.9,2.0,2.1], [2.2,2.3,2.4,2.5,2.6,2.7], [2.8,2.9,3.0,3.1,3.2,3.3].
For the genetic algorithm, the maximum number of iterations is set to 50, the stopping error to 0.1, the crossover probability to 0.5, and the mutation probability to 0.05.
Step S32, inputting the training samples, calculating the fitness of each individual according to the cross-entropy cost function of the BP neural network, and performing the selection, crossover and mutation operations using a tournament selection strategy, a uniform crossover strategy and a uniform mutation strategy, respectively. Specifically:
First, a tournament strategy is used for the selection operation. Two individuals are randomly selected from the population and compared, and the one with the lower fitness value is taken as an offspring individual (fitness here is the cross-entropy cost, so lower is better); this step is repeated until the number of offspring equals the number of parents.
Then, a uniform crossover strategy is used for the crossover operation: the selected offspring individuals are paired two by two, and each gene (in this embodiment, a floating-point number) of the two paired individuals is exchanged with the crossover probability to form two new individuals.
Finally, a uniform mutation strategy is used for the mutation operation. For each offspring individual generated by crossover, three genes are randomly selected; for each of them, a random number is drawn uniformly from a specified range (in this embodiment, -5.0 to 5.0) and replaces the gene with the mutation probability.
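The three operators of step S32 can be sketched as below. The population and the probabilities are the ones quoted in this embodiment. The function names, the seeded `random.Random` instance and the sum-of-squares fitness are hypothetical stand-ins; in the method itself, fitness is the cross-entropy cost of the BP neural network.

```python
import random


def tournament_select(population, fitness, rng):
    """Pick pairs at random and keep the individual with the lower fitness."""
    chosen = []
    while len(chosen) < len(population):
        a, b = rng.sample(population, 2)
        chosen.append(list(a if fitness(a) <= fitness(b) else b))
    return chosen


def uniform_crossover(parent_a, parent_b, p_cross, rng):
    """Swap each gene between the two parents with probability p_cross."""
    child_a, child_b = list(parent_a), list(parent_b)
    for g in range(len(child_a)):
        if rng.random() < p_cross:
            child_a[g], child_b[g] = child_b[g], child_a[g]
    return child_a, child_b


def uniform_mutate(individual, p_mut, rng, n_genes=3, low=-5.0, high=5.0):
    """Pick n_genes positions; replace each by a uniform draw with probability p_mut."""
    mutated = list(individual)
    for g in rng.sample(range(len(mutated)), n_genes):
        if rng.random() < p_mut:
            mutated[g] = rng.uniform(low, high)
    return mutated


rng = random.Random(0)
population = [
    [1.0, 1.1, 1.2, 1.3, 1.3, 1.5],
    [1.6, 1.7, 1.8, 1.9, 2.0, 2.1],
    [2.2, 2.3, 2.4, 2.5, 2.6, 2.7],
    [2.8, 2.9, 3.0, 3.1, 3.2, 3.3],
]


def stand_in_fitness(individual):
    """Hypothetical fitness; the method uses the network's cross-entropy cost."""
    return sum(g * g for g in individual)


parents = tournament_select(population, stand_in_fitness, rng)
child_a, child_b = uniform_crossover(parents[0], parents[1], 0.5, rng)
mutant = uniform_mutate(child_a, 0.05, rng)
```

Passing the random generator explicitly keeps a run reproducible, which makes it easier to compare different crossover and mutation probabilities.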
Step S33, calculating the fitness of the individuals in each generation; when the fitness is smaller than the stopping error or the number of iterations exceeds the maximum number of iterations, stopping the algorithm and taking the optimal individual as the initial parameters of the neural network, that is, returning the final individual as the initial input parameters of the BP neural network; otherwise, returning to step S32.
In this embodiment, the BP neural network has a three-layer structure, the cost function is the cross-entropy cost function, and the activation function is the sigmoid function.
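A fitness evaluation of one individual can be sketched as below, using a three-layer network with sigmoid activations and a cross-entropy cost. How the six genes map onto network parameters is not stated in this text; the sketch assumes a hypothetical 2-2-1 layout without biases (four hidden weights followed by two output weights), and the inputs and labels are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic activation."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(genes, x):
    """Assumed 2-2-1 network without biases: 4 hidden weights, then 2 output weights."""
    h1 = sigmoid(genes[0] * x[0] + genes[1] * x[1])
    h2 = sigmoid(genes[2] * x[0] + genes[3] * x[1])
    return sigmoid(genes[4] * h1 + genes[5] * h2)


def cross_entropy_cost(genes, inputs, labels):
    """Mean binary cross-entropy, used as the fitness (lower is better)."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for x, y in zip(inputs, labels):
        p = min(max(forward(genes, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(inputs)


# Hypothetical inputs and labels, evaluated for the first individual of the population.
inputs = [[0.1, 0.2], [0.3, 0.4]]
labels = [1, 0]
fitness_value = cross_entropy_cost([1.0, 1.1, 1.2, 1.3, 1.3, 1.5], inputs, labels)
baseline = cross_entropy_cost([0.0] * 6, inputs, labels)
```

With all weights at zero every prediction is 0.5 and the cost equals ln 2, a convenient baseline against the stopping error of 0.1 used in this embodiment.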
S4, inputting the continuous attributes and the one-hot-encoded categorical attributes of each training sample into the BP neural network corresponding to the class to which the training sample belongs, and training that BP neural network to obtain a crime risk prediction model.
In this embodiment, after clustering, the categorical attributes of each training sample are one-hot encoded; the continuous attributes do not need one-hot encoding. Based on the training sample data shown in Table 2, the one-hot-encoded data are shown in Table 3:
TABLE 3
Figure BDA0002511685460000141
Figure BDA0002511685460000151
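The encoding step can be sketched as follows: continuous attributes pass through unchanged, and each categorical attribute becomes a 0/1 vector over its domain. The attribute names, domains and sample values are hypothetical and are not the contents of Table 3.

```python
def one_hot(value, domain):
    """Return a 0/1 list with a single 1 at the position of value in domain."""
    return [1 if value == candidate else 0 for candidate in domain]


def encode_sample(sample, continuous_keys, categorical_domains):
    """Continuous attributes pass through; categorical ones are one-hot encoded."""
    features = [sample[key] for key in continuous_keys]
    for key, domain in categorical_domains.items():
        features.extend(one_hot(sample[key], domain))
    return features


# Hypothetical sample and domains.
domains = {"gender": ["male", "female"], "crime_type": ["theft", "robbery", "fraud"]}
sample = {"age": 0.08, "gender": "male", "crime_type": "theft"}
encoded = encode_sample(sample, ["age"], domains)
```

The length of the encoded vector, here 1 + 2 + 3 = 6, fixes the number of input nodes of the network that receives it.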
S5, for a test sample whose crime risk is to be predicted, calculating the distance between the test sample and each cluster center obtained when the clustering in step S2 is complete, selecting the cluster center with the smallest distance to the test sample, and taking the trained neural network corresponding to the class of that cluster center as the crime risk prediction model for the test sample.
In this embodiment, with reference to step S2, the distance between the test sample and each cluster center is obtained from the formula for the distance between a training sample and a cluster center:
Figure BDA0002511685460000152
Figure BDA0002511685460000153
$x_{tm}$ represents the value of the m-th continuous attribute of the test sample $x_t$;
$x_{tl}$ represents the value of the l-th categorical attribute of the test sample $x_t$.
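The routing of a test sample to the model of its nearest cluster can be sketched as below. The distance function and the per-class models are passed in as callables; the one-dimensional centers, the squared-difference distance and the constant-output models are hypothetical stand-ins for the mixed-attribute distance and the trained BP neural networks.

```python
def predict_with_nearest_model(sample, centers, models, distance):
    """Route the sample to the model of its nearest cluster center and predict."""
    nearest = min(range(len(centers)), key=lambda j: distance(sample, centers[j]))
    return nearest, models[nearest](sample)


# Stand-in centers, distance and models (not the trained networks of the method).
centers = [0.2, 0.8]
models = [lambda s: 0.1, lambda s: 0.9]
cluster_index, risk = predict_with_nearest_model(
    0.7, centers, models, distance=lambda x, c: (x - c) ** 2
)
```

Because each model only sees samples from its own class during training, the same distance definition should be used at prediction time as during clustering.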
S6, inputting the continuous attributes and the one-hot-encoded categorical attributes of the test sample into the crime risk prediction model for the test sample, and predicting the criminal behavior of the test sample with the model.
It should be noted that Table 1 above shows training sample data for a hypothetical training sample set; when the method of this embodiment is actually executed, the total number of training samples in the training sample set may exceed 60,000, of which training samples with re-offending account for 10%.
Example 2
The storage medium comprises a processor and a memory for storing a program executable by the processor, characterized in that, when the processor executes the program stored in the memory, the reinforcement learning-based crime risk prediction method of embodiment 1 is implemented, as follows:
acquiring training samples to form a training sample set; the training samples include persons with a prior criminal record who have re-offended and persons with a prior criminal record who have not re-offended;
clustering the training samples according to the continuous attributes and the categorical attributes of each training sample in the training sample set, the number of clusters being defined as N, where N is a constant, that is, all training samples in the training sample set are clustered into N classes;
constructing N corresponding neural networks, one for each of the N classes obtained by clustering;
inputting the continuous attributes and the one-hot-encoded categorical attributes of each training sample into the neural network corresponding to the class to which the training sample belongs, and training that neural network to obtain a crime risk prediction model;
for a test sample whose re-offending risk is to be predicted, calculating the distance between the test sample and each cluster center, selecting the cluster center with the smallest distance to the test sample, and taking the trained neural network corresponding to the class of that cluster center as the re-offending risk prediction model for the test sample;
inputting the continuous attributes and the one-hot-encoded categorical attributes of the test sample into the crime risk prediction model for the test sample, and predicting the criminal behavior of the test sample with the model.
In this embodiment, the storage medium may be a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), a USB flash drive, a removable hard disk, or another medium.
Example 3
The present embodiment discloses a computing device, which stores a program that, when executed by a processor, implements the reinforcement learning-based crime risk prediction method according to embodiment 1, as follows:
acquiring training samples to form a training sample set; the training samples include persons with a prior criminal record who have re-offended and persons with a prior criminal record who have not re-offended;
clustering the training samples according to the continuous attributes and the categorical attributes of each training sample in the training sample set, the number of clusters being defined as N, where N is a constant, that is, all training samples in the training sample set are clustered into N classes;
constructing N corresponding neural networks, one for each of the N classes obtained by clustering;
inputting the continuous attributes and the one-hot-encoded categorical attributes of each training sample into the neural network corresponding to the class to which the training sample belongs, and training that neural network to obtain a crime risk prediction model;
for a test sample whose re-offending risk is to be predicted, calculating the distance between the test sample and each cluster center, selecting the cluster center with the smallest distance to the test sample, and taking the trained neural network corresponding to the class of that cluster center as the re-offending risk prediction model for the test sample;
inputting the continuous attributes and the one-hot-encoded categorical attributes of the test sample into the crime risk prediction model for the test sample, and predicting the criminal behavior of the test sample with the model.
In this embodiment, the computing device may be a desktop computer, a notebook computer, a smart phone, a PDA handheld terminal, or a tablet computer.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (10)

1. A reinforcement learning-based crime risk prediction method, characterized by comprising the following steps:
S1, acquiring training samples to form a training sample set; the training samples include persons with a prior criminal record who have re-offended and persons with a prior criminal record who have not re-offended;
S2, clustering the training samples according to the continuous attributes and the categorical attributes of each training sample in the training sample set, the number of clusters being defined as N, where N is a constant, that is, all training samples in the training sample set are clustered into N classes;
S3, constructing N corresponding neural networks, one for each of the N classes obtained by clustering;
S4, inputting the continuous attributes and the one-hot-encoded categorical attributes of each training sample into the neural network corresponding to the class to which the training sample belongs, and training that neural network to obtain a crime risk prediction model;
S5, for a test sample whose re-offending risk is to be predicted, calculating the distance between the test sample and each cluster center, selecting the cluster center with the smallest distance to the test sample, and taking the trained neural network corresponding to the class of that cluster center as the re-offending risk prediction model for the test sample;
S6, inputting the continuous attributes and the one-hot-encoded categorical attributes of the test sample into the crime risk prediction model for the test sample, and predicting the criminal behavior of the test sample with the model.
2. The reinforcement learning-based crime risk prediction method of claim 1, wherein the continuous attributes include age, height, weight and education level; the categorical attributes include gender, and the date and type of the prior offence.
3. The reinforcement learning-based crime risk prediction method of claim 1, wherein the process of clustering the training samples in S2 is specifically as follows:
S21, selecting N samples from the training sample set as initial cluster centers according to the required number of clusters N; specifically:
first, taking any training sample from the training sample set as the first initial cluster center;
then selecting from the training sample set the training sample whose sum of distances to the existing cluster centers is largest as a new initial cluster center, until N initial cluster centers have been selected;
S22, calculating the distance between each training sample in the training sample set and each cluster center, and assigning each training sample to the class of its nearest cluster center;
S23, calculating the objective function from the distance between each training sample and the cluster center of the class to which it belongs, and determining whether the value of the objective function is unchanged from the previously calculated value;
if not, updating the cluster center of each class according to the training samples currently in that class, and returning to step S22;
if yes, clustering is finished.
4. The reinforcement learning-based crime risk prediction method according to claim 3, wherein the distance between each training sample in the training sample set and the cluster center is:
Figure FDA0002511685450000021
Figure FDA0002511685450000022
wherein:
$x_i$ denotes the i-th training sample in the training sample set;
$K_j$ denotes the cluster center of class j;
p represents the total number of continuous attributes of the training samples;
$\omega_m$ represents the weight of the m-th continuous attribute;
$x_{im}$ represents the value of the m-th continuous attribute of the i-th training sample;
Figure FDA0002511685450000025
represents the mean of the m-th continuous attribute over all training samples in class j;
γ is the weight of the categorical attributes relative to the continuous attributes;
q represents the total number of categorical attributes of the training samples;
$\omega_l$ represents the weight of the l-th categorical attribute;
$t_l$ is the number of values in the domain of the l-th categorical attribute;
Figure FDA0002511685450000026
represents the w-th value in the domain of the l-th categorical attribute;
$x_{il}$ represents the value of the l-th categorical attribute of the i-th training sample;
Figure FDA0002511685450000027
represents the frequency with which the l-th categorical attribute takes the value
Figure FDA0002511685450000028
among the training samples in class j;
the weight $\omega_m$ of the m-th continuous attribute and the weight $\omega_l$ of the l-th categorical attribute are, respectively:
Figure FDA0002511685450000023
Figure FDA0002511685450000024
wherein:
Figure FDA0002511685450000031
Figure FDA0002511685450000032
Figure FDA0002511685450000033
Figure FDA0002511685450000034
where $e_m$ is the information entropy of the m-th continuous attribute, and $e_x$ is the information entropy of the x-th continuous attribute, x = 1, 2, 3, ...;
$E_l$ is the information entropy of the l-th categorical attribute, and $E_y$ is the information entropy of the y-th categorical attribute, y = 1, 2, 3, ...;
$Y_{wl}$ is the number of training samples in the training sample set whose l-th categorical attribute takes its w-th value;
i is the total number of training samples in the set of training samples.
5. The reinforcement learning-based crime risk prediction method according to claim 4, wherein the distance between the test sample and each cluster center is:
Figure FDA0002511685450000035
Figure FDA0002511685450000036
$x_{tm}$ represents the value of the m-th continuous attribute of the test sample $x_t$;
$x_{tl}$ represents the value of the l-th categorical attribute of the test sample $x_t$.
6. The reinforcement learning-based crime risk prediction method according to claim 4, wherein the objective function F (X, P) is:
Figure FDA0002511685450000037
Figure FDA0002511685450000041
$\{J_j\}$ is the set of training samples of class j, whose cluster center is $K_j$.
7. The reinforcement learning-based crime risk prediction method of claim 3, wherein in step S23, the specific steps of updating the cluster center of each class according to the training samples currently in that class are as follows:
step S231, for each class, calculating the mean of each continuous attribute over all training samples in the class:
Figure FDA0002511685450000042
where $n_j$ is the total number of training samples in class j, $x_{im}$ represents the value of the m-th continuous attribute of the i-th training sample in class j,
Figure FDA0002511685450000044
represents the mean of the m-th continuous attribute over all training samples in class j, and p represents the total number of continuous attributes of the training samples;
for each class, obtaining the value of each categorical attribute of all training samples in the class, and counting the frequency of each value in the domain of each categorical attribute:
Figure FDA0002511685450000043
where
Figure FDA0002511685450000045
represents the w-th value in the domain of the l-th categorical attribute, and
Figure FDA0002511685450000049
represents the frequency with which the l-th categorical attribute takes the value
Figure FDA0002511685450000046
in class j, that is, the frequency of occurrence in class j of the w-th value in the domain of the l-th categorical attribute; w = 1, 2, 3, ..., $t_l$, where $t_l$ is the number of values in the domain of the l-th categorical attribute;
step S232, taking the obtained
Figure FDA0002511685450000047
and
Figure FDA0002511685450000048
as the attributes of the new cluster center, thereby obtaining the new cluster center.
8. The reinforcement learning-based crime risk prediction method of claim 1, wherein the neural network is a BP neural network;
in step S3, a genetic algorithm is used to set the initial parameters of each of the N BP neural networks, as follows:
step S31, randomly generating initial parameters of the BP neural network as the initial population, and setting the maximum number of iterations, the stopping error, the crossover probability and the mutation probability of the genetic algorithm;
step S32, performing the selection, crossover and mutation operations using a tournament selection strategy, a uniform crossover strategy and a uniform mutation strategy, respectively;
step S33, calculating the fitness of the individuals in each generation; when the fitness is smaller than the stopping error or the number of iterations exceeds the maximum number of iterations, stopping the algorithm and returning the final individual as the initial input parameters of the BP neural network; otherwise, returning to step S32.
9. A storage medium comprising a processor and a memory for storing a processor-executable program, wherein the processor, when executing the program stored in the memory, implements the reinforcement learning-based crime risk prediction method of any one of claims 1-8.
10. A computing device storing a program that, when executed by a processor, implements the reinforcement learning-based crime risk prediction method of any one of claims 1-8.
CN202010463027.XA 2020-05-27 2020-05-27 Reinforcement learning-based crime risk prediction method, medium, and computing device Pending CN111768027A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010463027.XA CN111768027A (en) 2020-05-27 2020-05-27 Reinforcement learning-based crime risk prediction method, medium, and computing device

Publications (1)

Publication Number Publication Date
CN111768027A true CN111768027A (en) 2020-10-13

Family

ID=72719818

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010463027.XA Pending CN111768027A (en) 2020-05-27 2020-05-27 Reinforcement learning-based crime risk prediction method, medium, and computing device

Country Status (1)

Country Link
CN (1) CN111768027A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508413A (en) * 2020-12-08 2021-03-16 天津大学 Multi-mode learning and LSTM risk studying and judging method
CN112651442A (en) * 2020-12-25 2021-04-13 南京中兴力维软件有限公司 Crime prediction method, crime prediction device, crime prediction equipment and computer readable storage medium
CN113159155A (en) * 2021-04-15 2021-07-23 华南农业大学 Crime risk early warning mixed attribute data processing method, medium and equipment
CN113159155B (en) * 2021-04-15 2024-01-23 华南农业大学 Mixed attribute data processing method, medium and equipment for crime risk early warning
CN112989799A (en) * 2021-04-26 2021-06-18 扆亮海 Microblog data stream evolution topic modeling document clustering analysis method
CN114358441A (en) * 2022-01-19 2022-04-15 西南石油大学 Intelligent segmented prediction method for yield of dense gas
CN114358441B (en) * 2022-01-19 2024-05-31 西南石油大学 Intelligent sectional prediction method for dense gas yield
CN115936431A (en) * 2022-11-28 2023-04-07 四川大学华西医院 Crime risk assessment method, crime risk assessment device, computer equipment and readable storage medium
CN115936431B (en) * 2022-11-28 2023-10-20 四川大学华西医院 Re-crime risk assessment method, device, computer equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201013
