CN117708725A - Distributed personnel relationship mining and evaluating method and device - Google Patents

Distributed personnel relationship mining and evaluating method and device

Info

Publication number
CN117708725A
Authority
CN
China
Prior art keywords
module
personnel
relationship
category
mining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311733465.3A
Other languages
Chinese (zh)
Inventor
任传伦
杨天长
张先国
刘策越
徐明烨
赵杰民
李宝静
唐然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 15 Research Institute
Original Assignee
CETC 15 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 15 Research Institute
Priority to CN202311733465.3A
Publication of CN117708725A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/20Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/30Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
    • H04L63/302Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information gathering intelligence information for situation awareness or reconnaissance
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40Network security protocols
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5017Task decomposition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Signal Processing (AREA)
  • Evolutionary Biology (AREA)
  • Computer Hardware Design (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a distributed personnel relationship mining and evaluating method and device, wherein the method comprises the following steps: acquiring a personnel relationship training data set; training the personnel relationship mining model by using the personnel relationship training data set; acquiring personnel relationship detection data; preprocessing the personnel relationship detection data to obtain preprocessed personnel relationship detection data; processing the preprocessed personnel relationship detection data by using the personnel relationship mining model to obtain personnel category mining result information; performing evaluation processing on the personnel category mining result information to obtain a personnel relationship evaluation result value; the personnel relationship evaluation result value is used for representing the threat degree of the personnel relationship. The invention solves the problems of unclear analysis purpose of personnel relationship, unclear required data field, inaccurate mining analysis algorithm, incapability of accurately evaluating personnel threat and the like in the network personnel tracking technology.

Description

Distributed personnel relationship mining and evaluating method and device
Technical Field
The invention relates to the technical field of information security, in particular to a distributed personnel relationship mining and evaluating method and device.
Background
In the field of information security, mining the identity information and association relationships of network personnel can effectively support applications such as recommending key strike targets, analyzing and breaking through key targets, and intercepting and acquiring important information. However, existing network personnel tracking technology suffers from an unclear purpose for personnel relationship analysis, poorly defined required data fields, inaccurate mining and analysis algorithms, and an inability to accurately evaluate personnel threat, so its analysis results are difficult to apply in practice.
Disclosure of Invention
To address the problems that existing network personnel tracking technology has an unclear purpose for personnel relationship analysis, poorly defined required data fields, inaccurate mining and analysis algorithms, and no ability to accurately evaluate personnel threat, so that its analysis results are difficult to apply in practice, the invention discloses a distributed personnel relationship mining and evaluating method and device.
The invention discloses a distributed personnel relationship mining and evaluating method, which comprises the following steps:
S1, acquiring a personnel relationship training data set; the personnel relationship training data set comprises personnel relationship training data and corresponding personnel category label data; the personnel category label data comprises real identity category label information, virtual identity category label information, trust relationship category label information and activity track category label information;
S2, training the personnel relationship mining model by using the personnel relationship training data set;
S3, acquiring personnel relationship detection data;
S4, preprocessing the personnel relationship detection data to obtain preprocessed personnel relationship detection data;
S5, processing the preprocessed personnel relationship detection data by using the personnel relationship mining model to obtain personnel category mining result information; the personnel category mining result information comprises real identity category information, virtual identity category information, trust relationship category information and activity track category information;
S6, evaluating the personnel category mining result information to obtain a personnel relationship evaluation result value; the personnel relationship evaluation result value represents the threat degree of the personnel relationship.
The personnel relationship mining model comprises a feature extraction network and a classification network;
the characteristic extraction network is connected with the classification network and comprises a first processing module, a second processing module, a third processing module and a fourth processing module, and is used for carrying out characteristic extraction processing on input data to obtain characteristic data;
the first processing module comprises an input module, a first convolution module, a first activation module, a first residual error module and a first normalization module; the first port of the first processing module is an input end of the input module; the second port of the first processing module is an output end of the first normalization module; the first port of the first processing module is used as an input end of the feature extraction network and is used for receiving and obtaining input data; the second port of the first processing module is connected with the first port of the second processing module; in the first processing module, the input module, the first convolution module, the first activation module, the first residual error module and the first normalization module are sequentially connected;
the second processing module comprises a second convolution module, a second activation module and a second normalization module; the first port of the second processing module is an input end of the second convolution module; the second port of the second processing module is an output end of the second normalization module; the second port of the second processing module is connected with the first port of the third processing module; in the second processing module, the second convolution module, the second activation module and the second normalization module are sequentially connected;
the third processing module comprises a third convolution module, a third activation module and a third residual error module; the first port of the third processing module is an input end of the third convolution module; the second port of the third processing module is an output end of the third residual error module; the second port of the third processing module is connected with the first port of the fourth processing module; in the third processing module, the third convolution module, the third activation module and the third residual error module are sequentially connected;
the fourth processing module comprises a fourth convolution module, a fourth activation module and a fourth residual error module; the first port of the fourth processing module is an input end of the fourth convolution module; the second port of the fourth processing module is an output end of the fourth residual error module; the second port of the fourth processing module is used as an output end of the feature extraction network and is used for outputting feature data; in the fourth processing module, the fourth convolution module, the fourth activation module and the fourth residual error module are sequentially connected;
the classification network is used for carrying out relation classification and identification processing on the characteristic data to obtain personnel class mining result information or personnel class prediction result information, and comprises a first relation characteristic extraction module, a first coding module, a first decoding module, a first characteristic reconstruction module, a second relation characteristic extraction module, a second coding module, a second decoding module, a second characteristic reconstruction module and a full connection module; the first coding module comprises a first personnel category coding module and a first personnel track coding module; the second coding module comprises a second personnel category coding module and a second personnel track coding module;
the input end of the first relation feature extraction module and the input end of the second relation feature extraction module are connected with the second port of the fourth processing module;
the output end of the first relation feature extraction module is connected with the input end of the first coding module; the output ends of the first personnel category encoding module and the second personnel track encoding module are connected with the input end of the first decoding module; the output ends of the second personnel category encoding module and the first personnel track encoding module are connected with the input end of the second decoding module; the output end of the first decoding module is connected with the input end of the first characteristic reconstruction module; the output end of the second decoding module is connected with the input end of the second characteristic reconstruction module; the output end of the first characteristic reconstruction module and the output end of the second characteristic reconstruction module are connected with the input end of the full-connection module; and the output end of the full-connection module is used for outputting personnel category mining result information or personnel category prediction result information obtained by the personnel relationship mining model.
By arranging multi-stage convolution processing modules, the feature extraction network of the personnel relationship mining model effectively extracts useful information from the personnel relationship data and suppresses irrelevant information; by using multi-level relationship feature extraction and encoding the personnel category and the personnel track separately, the classification network ensures the accuracy of the relationship classification result.
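The module connections described above can be written down as a directed graph and checked mechanically. The following is a minimal sketch, not the patent's implementation: the node names are illustrative shorthand, and two wirings are assumptions, namely that the first relation feature extraction module feeds both sub-encoders of the first coding module, and that the second relation feature extraction module feeds the second coding module in the same way (the text does not state the latter connection).

```python
from collections import deque

# Directed connections taken from the description. Node names are
# illustrative shorthand, not identifiers from the patent.
EDGES = {
    "input": ["proc1"],
    "proc1": ["proc2"],
    "proc2": ["proc3"],
    "proc3": ["proc4"],
    "proc4": ["rel_feat1", "rel_feat2"],
    # Assumed: the coding module input fans out to both of its sub-encoders.
    "rel_feat1": ["enc1_category", "enc1_track"],
    # Assumed by symmetry: the text does not say where rel_feat2 connects.
    "rel_feat2": ["enc2_category", "enc2_track"],
    # Cross-wiring stated in the text: category and track codes are swapped.
    "enc1_category": ["dec1"],
    "enc2_track": ["dec1"],
    "enc2_category": ["dec2"],
    "enc1_track": ["dec2"],
    "dec1": ["recon1"],
    "dec2": ["recon2"],
    "recon1": ["fc"],
    "recon2": ["fc"],
    "fc": [],
}


def topological_order(edges):
    """Kahn's algorithm: list nodes so that each one follows all its inputs."""
    indegree = {node: 0 for node in edges}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1
    queue = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for target in edges[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    if len(order) != len(edges):
        raise ValueError("connection graph contains a cycle")
    return order


def inputs_of(edges, node):
    """Return the sorted list of modules that feed `node`."""
    return sorted(src for src, targets in edges.items() if node in targets)
```

Calling `inputs_of(EDGES, "dec1")` makes the cross-connection visible: the first decoder receives the first category code and the second track code.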
Training the personnel relationship mining model by using the personnel relationship training data set comprises the following steps:
S21, initializing a training iteration number value;
S22, taking each piece of personnel relationship training data in the personnel relationship training data set as input data, and inputting it into the personnel relationship mining model to obtain personnel category prediction result information;
S23, performing difference calculation on the personnel category label data corresponding to the input data and the personnel category prediction result information to obtain a difference value;
S24, judging whether the difference value meets a convergence condition to obtain a first judgment result;
when the first judgment result is negative, judging whether the training iteration number value is equal to a training number threshold value to obtain a second judgment result;
when the second judgment result is negative, determining that the model training state does not meet the training termination condition;
when the second judgment result is yes, determining that the model training state meets the training termination condition;
when the first judgment result is yes, determining that the model training state meets the training termination condition;
when the model training state does not meet the training termination condition, updating the parameters of the classification network by using a parameter updating model, increasing the training iteration number value by 1, and executing S22;
and when the model training state meets the training termination condition, finishing the training of the personnel relationship mining model to obtain a trained personnel relationship mining model.
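The termination logic of these training steps can be sketched as a small loop. This is only an illustration under stated assumptions: `model_step` and `update_params` are hypothetical callables standing in for the forward pass with difference calculation and for the parameter update, and the convergence condition is assumed to be "difference value not larger than a tolerance", which the text does not specify.

```python
def train(model_step, update_params, convergence_tol, max_iterations):
    """Run the control flow of the training steps.

    model_step()    -> difference value between labels and predictions
    update_params() -> applies one parameter update to the classification network
    Both callables are hypothetical stand-ins for the real model.
    Returns (iteration count, reason for stopping).
    """
    iteration = 0  # initialize the training iteration number value
    while True:
        difference = model_step()  # forward pass and difference calculation
        # First judgment: assumed convergence test against a tolerance.
        if difference <= convergence_tol:
            return iteration, "converged"
        # Second judgment: has the iteration count reached the threshold?
        if iteration == max_iterations:
            return iteration, "iteration_limit"
        # Termination condition not met: update, count, and repeat.
        update_params()
        iteration += 1
```

Either judgment ends training; the parameter update runs only when both judgments are negative, which matches the order of the checks in the text.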
The parameter updating model is as follows:
θ ← θ + v;
wherein x^(i) is the i-th data in the personnel relationship training data set, L_j is the loss function of the personnel category label data of the j-th category, v is the parameter update value, θ is the parameter of the classification network, η is the initial parameter learning rate, α is the momentum parameter with a value between 0 and 1, ∇_θ denotes the partial derivative with respect to the variable θ, f(x^(i); θ) denotes the personnel category label data obtained when the personnel relationship mining model computes on the i-th data of the personnel relationship training data set, and f(·) is the calculation function corresponding to the personnel relationship mining model.
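The listed symbols (parameter update value v, learning rate η, momentum parameter α, gradient with respect to θ) match gradient descent with momentum. The text gives only the step θ ← θ + v, so the sketch below assumes the classical momentum rule v ← αv − η·∇θ(loss) for the update value; the quadratic toy loss and the function names are illustrative only.

```python
def momentum_step(theta, v, grad, eta=0.1, alpha=0.9):
    """One update of parameters `theta` with update value `v`.

    Assumed rule (classical momentum): v <- alpha * v - eta * grad,
    followed by the step given in the text: theta <- theta + v.
    """
    new_v = [alpha * vi - eta * gi for vi, gi in zip(v, grad)]
    new_theta = [ti + vi for ti, vi in zip(theta, new_v)]
    return new_theta, new_v


def grad_quadratic(theta):
    """Gradient of the toy loss 0.5 * sum(theta_k ** 2), used only for the demo."""
    return list(theta)


def minimize(theta, steps, eta=0.1, alpha=0.9):
    """Apply `steps` momentum updates to the toy loss and return the parameters."""
    v = [0.0] * len(theta)
    for _ in range(steps):
        theta, v = momentum_step(theta, v, grad_quadratic(theta), eta, alpha)
    return theta
```

With α = 0 the rule reduces to plain gradient descent; a larger α carries more of the previous update into the current one.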
Preprocessing the personnel relationship detection data to obtain preprocessed personnel relationship detection data comprises the following steps:
S41, performing data cleaning on the personnel relationship detection data to obtain cleaned data;
S42, performing data range limiting on the cleaned data to obtain limited data;
S43, normalizing the limited data to obtain the preprocessed personnel relationship detection data.
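The three preprocessing steps can be sketched on a list of numeric values. The text does not define the operations, so the meanings below are assumptions: cleaning drops missing or non-finite values, range limiting clips to hypothetical bounds `low` and `high`, and normalization is min-max scaling to [0, 1].

```python
import math


def clean(records):
    """Cleaning step (assumed meaning): drop missing or non-finite values."""
    return [x for x in records
            if isinstance(x, (int, float)) and math.isfinite(x)]


def limit_range(values, low, high):
    """Range limiting step (assumed meaning): clip each value into [low, high]."""
    return [min(max(x, low), high) for x in values]


def normalize(values):
    """Normalization step (assumed meaning): min-max scaling to [0, 1]."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input carries no spread
    return [(x - lo) / (hi - lo) for x in values]


def preprocess(records, low, high):
    """Chain cleaning, range limiting and normalization in that order."""
    return normalize(limit_range(clean(records), low, high))
```

For example, `preprocess([3.0, None, float("nan"), 12.0, -4.0, 7.0], 0.0, 10.0)` keeps four values, clips them to `[3, 10, 0, 7]`, and scales them to the unit interval.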
Evaluating the personnel category mining result information to obtain the personnel relationship evaluation result value comprises the following steps:
S61, obtaining personnel category mining result information in N time periods;
S62, quantizing the personnel category mining result information in the N time periods to obtain a personnel category mining result matrix A;
each row vector in the personnel category mining result matrix A is the vector obtained by quantizing the personnel category mining result information of one time period;
S63, performing feature vector calculation on the personnel category mining result matrix A to obtain a personnel category feature vector V;
S64, performing weighted summation on the personnel category mining result matrix A by using the personnel category feature vector V to obtain the personnel relationship evaluation result value.
Performing feature vector calculation on the personnel category mining result matrix A to obtain the personnel category feature vector V comprises the following steps:
S631, performing QR decomposition on the personnel category mining result matrix A to obtain
A = U_s V_s^T
wherein U_s and V_s respectively represent the column orthogonal matrix and the upper triangular matrix of the matrix A;
S632, constructing the personnel category feature vector V by using U_s.
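The evaluation can be sketched end to end on a small numeric matrix. The text does not say how the feature vector V is built from the column orthogonal factor or how the weighted sum is formed, so everything after the decomposition is an assumption made for illustration: V is taken as the normalized absolute values of the first column of the orthogonal factor (one weight per time period), and each time period contributes the mean of its row. The decomposition itself uses modified Gram-Schmidt and assumes at least as many time periods as categories and full column rank.

```python
import math


def qr_decompose(a):
    """Modified Gram-Schmidt QR of an N x M matrix (N >= M, full column rank).

    Returns (q, r) with orthonormal columns in q, upper triangular r, a = q r.
    """
    n, m = len(a), len(a[0])
    cols = [[a[i][j] for i in range(n)] for j in range(m)]  # work column-wise
    q_cols = []
    r = [[0.0] * m for _ in range(m)]
    for j in range(m):
        v = cols[j][:]
        for k in range(j):
            # Remove the component along each earlier orthonormal column.
            r[k][j] = sum(q_cols[k][i] * v[i] for i in range(n))
            v = [v[i] - r[k][j] * q_cols[k][i] for i in range(n)]
        r[j][j] = math.sqrt(sum(x * x for x in v))
        if r[j][j] == 0.0:
            raise ValueError("matrix is rank deficient")
        q_cols.append([x / r[j][j] for x in v])
    q = [[q_cols[j][i] for j in range(m)] for i in range(n)]
    return q, r


def evaluate(a):
    """Assumed scoring: weight each time period by the first column of q."""
    q, _ = qr_decompose(a)
    raw = [abs(row[0]) for row in q]
    total = sum(raw)
    weights = [w / total for w in raw]  # feature vector V, sums to 1
    period_scores = [sum(row) / len(row) for row in a]
    return sum(w * s for w, s in zip(weights, period_scores))
```

Because the assumed weights are non-negative and sum to 1, the resulting value always lies between the smallest and largest per-period mean.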
In a second aspect of the embodiment of the present invention, a distributed personnel relationship mining and evaluating apparatus is disclosed, the apparatus comprising:
a memory storing executable program code;
a processor coupled to the memory;
and the processor calls the executable program codes stored in the memory to execute the distributed personnel relationship mining and evaluating method.
In a third aspect of the embodiments of the present invention, a computer-readable medium storing computer instructions for performing the distributed personnel relationship mining evaluation method when called is disclosed.
In a fourth aspect of the embodiment of the present invention, an information data processing terminal is disclosed, where the information data processing terminal is configured to implement the distributed personnel relationship mining and evaluating method.
The beneficial effects of the invention are as follows:
The invention provides a personnel tracking technology based on distributed data mining. It addresses the problem that analysis results are difficult to apply in practice because the purpose of personnel relationship analysis is unclear, the required data fields are poorly defined, the mining and analysis algorithms are inaccurate, and personnel threat cannot be evaluated. The personnel information fields are precisely defined; during analysis, data and tasks are decomposed and distributed to the computing nodes, which perform data mining in parallel, greatly reducing the time consumed by data mining.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
For a better understanding of the present disclosure, an embodiment is presented herein.
In a first aspect of the embodiment of the present invention, a distributed personnel relationship mining and evaluating method is disclosed, as shown in fig. 1, including:
S1, acquiring a personnel relationship training data set; the personnel relationship training data set comprises personnel relationship training data and corresponding personnel category label data; the personnel category label data comprises real identity category label information, virtual identity category label information, trust relationship category label information and activity track category label information;
S2, training the personnel relationship mining model by using the personnel relationship training data set;
S3, acquiring personnel relationship detection data;
S4, preprocessing the personnel relationship detection data to obtain preprocessed personnel relationship detection data;
S5, processing the preprocessed personnel relationship detection data by using the personnel relationship mining model to obtain personnel category mining result information; the personnel category mining result information comprises real identity category information, virtual identity category information, trust relationship category information and activity track category information;
S6, evaluating the personnel category mining result information to obtain a personnel relationship evaluation result value. The personnel relationship evaluation result value represents the threat degree of the personnel relationship.
The personnel relationship training data set can be constructed from known internet security news, government public information and information security knowledge graphs. Specifically, text, pictures, audio and the like from known internet security news, government public information and information security knowledge graphs can be used as personnel relationship training data, and the personnel information contained in them is used as the corresponding personnel category label data. The personnel information comprises real identity category information, virtual identity category information, trust relationship category information and activity track category information. The real identity category information may take values such as network security management personnel, network security technical personnel and general personnel; the virtual identity category information may take values such as threat personnel, detection personnel and tourists; the trust relationship category information may be trusted, generally trusted, untrusted or dangerous; the activity track category information may be dangerous, generally safe or safe.
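The four label fields and their example values can be captured as a small data structure. This is a sketch only: the class and member names are illustrative, the value lists are the examples given in the text and may not be exhaustive, and the ordinal encoding in `to_vector` is an assumed way of turning labels into numbers, not one stated by the patent.

```python
from dataclasses import dataclass
from enum import Enum


class RealIdentity(Enum):
    SECURITY_MANAGER = "network security management personnel"
    SECURITY_TECHNICIAN = "network security technical personnel"
    GENERAL = "general personnel"


class VirtualIdentity(Enum):
    THREAT = "threat personnel"
    DETECTION = "detection personnel"
    TOURIST = "tourist"


class Trust(Enum):
    TRUSTED = "trusted"
    GENERALLY_TRUSTED = "generally trusted"
    UNTRUSTED = "untrusted"
    DANGEROUS = "dangerous"


class ActivityTrack(Enum):
    DANGEROUS = "dangerous"
    GENERALLY_SAFE = "generally safe"
    SAFE = "safe"


@dataclass(frozen=True)
class PersonLabel:
    """One piece of personnel category label data with its four fields."""
    real_identity: RealIdentity
    virtual_identity: VirtualIdentity
    trust: Trust
    activity_track: ActivityTrack

    def to_vector(self):
        """Assumed quantization: position of each value in its enum."""
        fields = (self.real_identity, self.virtual_identity,
                  self.trust, self.activity_track)
        return [list(type(value)).index(value) for value in fields]
```

A label such as `PersonLabel(RealIdentity.GENERAL, VirtualIdentity.THREAT, Trust.DANGEROUS, ActivityTrack.SAFE)` then maps to one numeric vector per person.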
The obtaining of the personnel relationship detection data includes: acquiring personnel relationship detection data related to personnel relationships from an information security professional website, a government database and an information security knowledge graph;
the personnel relationship mining model comprises a feature extraction network and a classification network;
the characteristic extraction network is connected with the classification network and comprises a first processing module, a second processing module, a third processing module and a fourth processing module, and is used for carrying out characteristic extraction processing on input data to obtain characteristic data;
the first processing module comprises an input module, a first convolution module, a first activation module, a first residual error module and a first normalization module; the first port of the first processing module is an input end of the input module; the second port of the first processing module is an output end of the first normalization module; the first port of the first processing module is used as an input end of the feature extraction network and is used for receiving and obtaining input data; the second port of the first processing module is connected with the first port of the second processing module; in the first processing module, the input module, the first convolution module, the first activation module, the first residual error module and the first normalization module are sequentially connected;
the second processing module comprises a second convolution module, a second activation module and a second normalization module; the first port of the second processing module is an input end of the second convolution module; the second port of the second processing module is an output end of the second normalization module; the second port of the second processing module is connected with the first port of the third processing module; in the second processing module, the second convolution module, the second activation module and the second normalization module are sequentially connected;
the third processing module comprises a third convolution module, a third activation module and a third residual error module; the first port of the third processing module is an input end of the third convolution module; the second port of the third processing module is an output end of the third residual error module; the second port of the third processing module is connected with the first port of the fourth processing module; in the third processing module, the third convolution module, the third activation module and the third residual error module are sequentially connected;
the fourth processing module comprises a fourth convolution module, a fourth activation module and a fourth residual error module; the first port of the fourth processing module is an input end of the fourth convolution module; the second port of the fourth processing module is an output end of the fourth residual error module; the second port of the fourth processing module is used as an output end of the feature extraction network and is used for outputting feature data; in the fourth processing module, the fourth convolution module, the fourth activation module and the fourth residual error module are sequentially connected;
the classification network is used for carrying out relation classification and identification processing on the characteristic data to obtain personnel class mining result information or personnel class prediction result information, and comprises a first relation characteristic extraction module, a first coding module, a first decoding module, a first characteristic reconstruction module, a second relation characteristic extraction module, a second coding module, a second decoding module, a second characteristic reconstruction module and a full connection module; the first coding module comprises a first personnel category coding module and a first personnel track coding module; the second coding module comprises a second personnel category coding module and a second personnel track coding module;
the input end of the first relation feature extraction module and the input end of the second relation feature extraction module are connected with the second port of the fourth processing module and are used for receiving and obtaining feature data;
the output end of the first relation feature extraction module is connected with the input end of the first coding module; the output ends of the first personnel category encoding module and the second personnel track encoding module are connected with the input end of the first decoding module; the output ends of the second personnel category encoding module and the first personnel track encoding module are connected with the input end of the second decoding module; the output end of the first decoding module is connected with the input end of the first characteristic reconstruction module; the output end of the second decoding module is connected with the input end of the second characteristic reconstruction module; the output end of the first characteristic reconstruction module and the output end of the second characteristic reconstruction module are connected with the input end of the full-connection module; the output end of the full-connection module is used for outputting personnel category mining result information or personnel category prediction result information obtained by the personnel relationship mining model;
the activation module implements an activation function, which may be a sigmoid function;
the residual error module may be implemented using a ResNet network;
the normalization module may be implemented using a quantile normalization function in Python;
the convolution module may be implemented using a convolutional network (conv);
the relation feature extraction module may be implemented using a recurrent neural network, a Transformer or a feature pyramid;
the first or second coding module may be implemented using an autoencoder neural network or a DAE network;
the first or second decoding module may be implemented using a decoder neural network;
the first or second feature reconstruction module may be implemented using a graph convolutional neural network;
the fully connected module may be implemented using a fully connected layer;
the personnel category coding module may be implemented using an LSTM autoencoder in TensorFlow;
the personnel track coding module may be implemented using a graph convolutional network (GCN), a graph attention network (GAT), or the like.
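The cross-wiring of the classification network described above can be checked with a small connection table. This is only an illustrative sketch: the node names are hypothetical labels for the modules, and the edge from the second relation feature extraction module to the second coding module is assumed by symmetry with the first branch, since the text states only the first explicitly.

```python
# Hypothetical node names; edges follow the connections listed in the text.
EDGES = {
    "feature_extraction_out": ["rel_feat_1", "rel_feat_2"],
    "rel_feat_1": ["cat_enc_1", "traj_enc_1"],  # first coding module
    "rel_feat_2": ["cat_enc_2", "traj_enc_2"],  # assumed by symmetry
    "cat_enc_1": ["decoder_1"],
    "traj_enc_2": ["decoder_1"],                # cross connection
    "cat_enc_2": ["decoder_2"],
    "traj_enc_1": ["decoder_2"],                # cross connection
    "decoder_1": ["recon_1"],
    "decoder_2": ["recon_2"],
    "recon_1": ["fully_connected"],
    "recon_2": ["fully_connected"],
    "fully_connected": [],
}


def inputs_of(node):
    """Return the sorted list of modules feeding the given module."""
    return sorted(src for src, dsts in EDGES.items() if node in dsts)


def topological_order(edges):
    """Order modules so each one comes after all of its inputs (Kahn's algorithm)."""
    indegree = {name: 0 for name in edges}
    for dsts in edges.values():
        for dst in dsts:
            indegree[dst] += 1
    ready = sorted(name for name, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for dst in edges[node]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    if len(order) != len(edges):
        raise ValueError("connection table contains a cycle")
    return order


ORDER = topological_order(EDGES)
```

Here `inputs_of("decoder_1")` returns the first category encoder and the second track encoder, which is the crossed pairing the text describes.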
The training process for the personnel relationship mining model by using the personnel relationship training data set comprises the following steps:
S21, initializing a training iteration number value;
S22, taking each item of personnel relationship training data in the personnel relationship training data set as input data, and inputting it into the personnel relationship mining model to obtain personnel category prediction result information;
S23, calculating the difference between the personnel category label data corresponding to the input data and the personnel category prediction result information to obtain a difference value;
s24, judging whether the difference value meets a convergence condition or not to obtain a first judgment result;
when the first judgment result is negative, judging whether the training iteration number value is equal to a training number threshold value or not, and obtaining a second judgment result;
when the second judgment result is negative, determining that the model training state does not meet the training termination condition;
when the second judgment result is yes, determining that the model training state meets the training termination condition;
when the first judgment result is yes, determining that the model training state meets the training termination condition;
when the model training state does not meet the training termination condition, carrying out parameter updating on the classification network by using a parameter updating model, increasing the training iteration number by 1, and executing S22;
and when the model training state meets the training termination condition, finishing the training processing process of the personnel relationship mining model to obtain a trained personnel relationship mining model.
The convergence condition is satisfied when the difference value is smaller than a set threshold value, and is not satisfied otherwise.
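The control flow of steps S21 to S24 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the mean squared error used as the difference value, the one-parameter toy model, and the plain gradient step standing in for the parameter updating model are all assumptions made for the example.

```python
def train(data, labels, predict, update, theta, threshold, max_iters):
    """Repeat S22-S24 until the difference converges or the iteration cap is hit."""
    iteration = 0  # S21: initialize the training iteration number value
    while True:
        preds = [predict(x, theta) for x in data]  # S22: forward pass
        # S23: difference value (mean squared error is an assumption)
        diff = sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(data)
        if diff < threshold:  # S24: first judgment (convergence condition)
            return theta, iteration, "converged"
        if iteration == max_iters:  # second judgment (training number threshold)
            return theta, iteration, "iteration_limit"
        theta = update(theta, data, labels)  # parameter update, then back to S22
        iteration += 1


def predict(x, theta):
    """Toy one-parameter model used only for illustration."""
    return theta * x


def gradient_update(theta, data, labels, lr=0.05):
    """Plain gradient step on the mean squared error (placeholder update rule)."""
    grad = sum(2 * (theta * x - y) * x for x, y in zip(data, labels)) / len(data)
    return theta - lr * grad


theta, iters, status = train(
    [1.0, 2.0, 3.0], [2.0, 4.0, 6.0], predict, gradient_update, 0.0, 1e-6, 100
)
```

Both termination paths return the current parameters, matching the text: training ends either when the difference value falls below the threshold or when the iteration count reaches the training number threshold.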
The parameter updating model is as follows:
θ←θ+v;
wherein $x^{(i)}$ is the i-th data item in the personnel relationship training data set; the loss function is that of the personnel category label data of the j-th category; $v$ is the parameter update value; $\theta$ denotes the parameters of the classification network; $\eta$ is the initial learning rate; $\alpha$ is the momentum parameter, with a value between 0 and 1; the differential operator denotes the partial derivative with respect to $\theta$; $f(x^{(i)}; \theta)$ denotes the personnel category label data computed by the personnel relationship mining model for the i-th data item of the personnel relationship training data set, where $f(\cdot)$ is the calculation function corresponding to the personnel relationship mining model; exp denotes exponentiation with base e; $\eta$ and $\alpha$ are preset values.
The expression for $v$, which is not reproduced in this text, denotes updating the parameter $v$; $\theta \leftarrow \theta + v$ denotes updating $\theta$ with $\theta + v$.
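The symbols defined above (a parameter update value v, a learning rate η, a momentum parameter α between 0 and 1, and a gradient with respect to θ) match the usual ingredients of gradient descent with momentum. The sketch below assumes the classical form v ← αv − η·∇θ(loss); only the step θ ← θ + v is taken from the text, and the gradient is supplied by the caller because the exact loss is not given.

```python
def momentum_step(theta, v, grad, eta, alpha):
    """One momentum update: v <- alpha*v - eta*grad (assumed form), theta <- theta + v."""
    v_new = [alpha * vi - eta * gi for vi, gi in zip(v, grad)]
    theta_new = [ti + vi for ti, vi in zip(theta, v_new)]
    return theta_new, v_new


def quadratic_grad(theta):
    """Gradient of the toy loss sum(t**2); stands in for the model's loss gradient."""
    return [2.0 * t for t in theta]


theta, v = [1.0], [0.0]
history = []
for _ in range(2):
    theta, v = momentum_step(theta, v, quadratic_grad(theta), eta=0.1, alpha=0.9)
    history.append((theta[0], v[0]))
```

With these illustrative values the first step moves θ from 1.0 to about 0.8, and the second step carries part of the previous velocity forward, which is the effect the momentum parameter α is meant to provide.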
Preprocessing the personnel relationship detection data to obtain preprocessed personnel relationship detection data, wherein the preprocessing comprises the following steps:
s41, performing data cleaning processing on the personnel relationship detection data to obtain cleaning data;
s42, performing data range limiting processing on the cleaning data to obtain limiting data;
s43, carrying out normalization processing on the limiting data to obtain preprocessed personnel relationship detection data.
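Steps S41 to S43 can be sketched as a three-stage pipeline. The text does not specify how each stage works, so the choices below are assumptions for illustration: cleaning drops records with missing or NaN values and exact duplicates, range limiting clips every value to a caller-supplied interval, and normalization is per-column min-max scaling.

```python
def clean(records):
    """S41: drop rows containing None/NaN and exact duplicate rows (assumed policy)."""
    seen = set()
    kept = []
    for row in records:
        if any(v is None or v != v for v in row):  # v != v is true only for NaN
            continue
        key = tuple(row)
        if key in seen:
            continue
        seen.add(key)
        kept.append(list(row))
    return kept


def clip(records, low, high):
    """S42: limit every value to the interval [low, high]."""
    return [[min(max(v, low), high) for v in row] for row in records]


def min_max_normalize(records):
    """S43: scale each column to [0, 1]; constant columns map to 0.0."""
    if not records:
        return []
    columns = list(zip(*records))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in records
    ]


def preprocess(records, low, high):
    """Run S41, S42 and S43 in order."""
    return min_max_normalize(clip(clean(records), low, high))


raw = [[1.0, 200.0], [None, 5.0], [3.0, -50.0], [1.0, 200.0]]
result = preprocess(raw, 0.0, 100.0)  # → [[0.0, 1.0], [1.0, 0.0]]
```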
The step of evaluating the personnel category mining result information to obtain a personnel relationship evaluation result comprises the following steps:
s61, obtaining personnel category mining result information in N time periods;
s62, carrying out quantization processing on the personnel category mining result information in the N time periods to obtain a personnel category mining result matrix A;
the row vectors in the personnel category mining result matrix A are vectors obtained by carrying out quantization processing on personnel category mining result information in a time period;
the quantization processing assigns a quantitative score to the category information of each category according to its value; for example, the real identity category information may take the values network security manager, network security technician, general person, etc., with corresponding quantitative scores of 4, 3, 2 and 1 respectively; the virtual identity category information may take the values threat personnel, detection personnel, visitor, etc., with corresponding quantitative scores of 4, 3, 2 and 1 respectively; the trust relationship category information may take the values trust, general trust, untrustworthy and dangerous, with corresponding quantitative scores of 4, 3, 2 and 1 respectively; the activity track category information may take the values danger, general safety and safety, with corresponding quantitative scores of 4, 3, 2 and 1 respectively.
S63, carrying out feature vector calculation processing on the personnel category mining result matrix A to obtain a personnel category feature vector V;
s64, carrying out weighted summation processing on the personnel category mining result matrix by utilizing the personnel category feature vector to obtain a personnel relationship evaluation result value;
The personnel relationship evaluation result value q obtained by this weighted summation is calculated as:
q=sum(VA);
wherein VA denotes the result vector obtained by multiplying the personnel category feature vector V by the personnel category mining result matrix A, and sum denotes the summation of all elements of that vector.
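The quantization and weighted summation can be worked through on a small example. The trust relationship scores below are the ones listed in the text; the matrix A (three time periods by four categories) and the weight vector V are made-up illustrative values, not data from the source.

```python
# Scores for the trust relationship category, as listed in the text.
TRUST_SCORES = {"trust": 4, "general trust": 3, "untrustworthy": 2, "dangerous": 1}


def quantize(values, score_table):
    """Map category values to their quantitative scores."""
    return [score_table[v] for v in values]


def vec_mat(v, a):
    """Row vector v (length N) times matrix a (N x M) gives a length-M vector."""
    return [sum(v[i] * a[i][j] for i in range(len(a))) for j in range(len(a[0]))]


def evaluate(v, a):
    """q = sum(VA): sum of all elements of the product vector."""
    return sum(vec_mat(v, a))


# Each row holds the quantized scores of one time period (illustrative values).
A = [
    [4, 3, 2, 1],
    [3, 3, 2, 2],
    [1, 2, 4, 4],
]
V = [0.5, 0.25, 0.25]  # hypothetical feature vector, one weight per time period

trust_column = quantize(["trust", "untrustworthy", "dangerous"], TRUST_SCORES)
q = evaluate(V, A)  # VA = [3.0, 2.75, 2.5, 2.0], so q = 10.25
```

Because V multiplies A from the left, V needs one entry per row of A, that is, one weight per time period.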
The step of carrying out feature vector calculation processing on the personnel category mining result matrix A to obtain a personnel category feature vector V comprises the following steps:
S631, performing QR decomposition on the personnel category mining result matrix A to obtain
$A = U_s V_s^{T}$,
where $U_s$ and $V_s$ denote, respectively, the column-orthogonal matrix and the upper triangular matrix obtained from the matrix A;
S632, constructing the personnel category feature vector V from $U_s$.
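A QR decomposition of a small matrix can be computed with the modified Gram-Schmidt procedure, as sketched below in plain Python. The text does not say how the feature vector V is built from the column-orthogonal factor, so `weights_from_q` (absolute values of the first column, rescaled to sum to 1) is purely a hypothetical choice made to produce one weight per row of A.

```python
import math


def qr_gram_schmidt(a):
    """Thin QR of an N x M matrix with full column rank: a = q r."""
    n, m = len(a), len(a[0])
    cols = [[a[i][j] for i in range(n)] for j in range(m)]
    q_cols = []
    r = [[0.0] * m for _ in range(m)]
    for j in range(m):
        w = cols[j][:]
        for k in range(j):
            # Remove the component of w along the k-th orthonormal column.
            r[k][j] = sum(q_cols[k][i] * w[i] for i in range(n))
            w = [w[i] - r[k][j] * q_cols[k][i] for i in range(n)]
        r[j][j] = math.sqrt(sum(x * x for x in w))
        if r[j][j] < 1e-12:
            raise ValueError("columns of A are linearly dependent")
        q_cols.append([x / r[j][j] for x in w])
    q = [[q_cols[j][i] for j in range(m)] for i in range(n)]
    return q, r


def weights_from_q(q):
    """Hypothetical construction of V: normalized magnitudes of the first column."""
    first = [abs(row[0]) for row in q]
    total = sum(first)
    return [x / total for x in first]


A = [[4.0, 3.0], [3.0, 3.0], [1.0, 2.0]]
Q, R = qr_gram_schmidt(A)
V = weights_from_q(Q)
```

The returned `Q` has orthonormal columns and `R` is upper triangular, so multiplying them reproduces `A` up to rounding.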
Alternatively, the step of carrying out feature vector calculation processing on the personnel category mining result matrix A to obtain a personnel category feature vector V comprises the following steps:
taking each row vector of the matrix A as a point in a high-dimensional space;
performing kernel mapping processing on the matrix A to obtain a corresponding kernel matrix;
performing eigendecomposition on the kernel matrix to obtain an eigenvector;
determining the eigenvector as the personnel category feature vector V;
the kernel mapping processing may employ a reproducing kernel mapping or a Mercer kernel mapping.
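The kernel-based variant can be sketched in plain Python as follows. The Gaussian (RBF) kernel is used here as one example of a Mercer kernel, and power iteration is used to extract the leading eigenvector; both choices, along with the value of `gamma` and the sample rows, are assumptions for illustration, since the text does not fix the kernel or say which eigenvector is selected.

```python
import math


def rbf_kernel_matrix(rows, gamma):
    """Kernel matrix K[i][j] = exp(-gamma * ||row_i - row_j||^2)."""
    def k(x, y):
        return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
    return [[k(x, y) for y in rows] for x in rows]


def mat_vec(mat, v):
    """Multiply a square matrix by a vector."""
    return [sum(mat[i][j] * v[j] for j in range(len(v))) for i in range(len(mat))]


def leading_eigenvector(mat, iters=500):
    """Power iteration; returns a unit-norm eigenvector and its eigenvalue."""
    n = len(mat)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = mat_vec(mat, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(vi * wi for vi, wi in zip(v, mat_vec(mat, v)))
    return v, eigenvalue


# Rows are quantized results per time period (illustrative values).
A = [[4, 3, 2, 1], [3, 3, 2, 2], [1, 2, 4, 4]]
K = rbf_kernel_matrix(A, gamma=0.1)
V, lam = leading_eigenvector(K)
```

The kernel matrix has one row and column per time period, so its eigenvectors have the length needed to weight the rows of A in the later summation.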
In a second aspect of the embodiment of the present invention, a distributed personnel relationship mining and evaluating apparatus is disclosed, the apparatus comprising:
a memory storing executable program code; a processor coupled to the memory; and the processor calls the executable program codes stored in the memory to execute the distributed personnel relationship mining and evaluating method.
In a third aspect of the embodiments of the present invention, a computer-readable medium storing computer instructions for performing the distributed personnel relationship mining evaluation method when called is disclosed.
In a fourth aspect of the embodiment of the present invention, an information data processing terminal is disclosed, where the information data processing terminal is configured to implement the distributed personnel relationship mining and evaluating method.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. The distributed personnel relationship mining and evaluating method is characterized by comprising the following steps of:
s1, acquiring a personnel relationship training data set; the personnel relationship training data set comprises personnel relationship training data and corresponding personnel category label data; the personnel category label data comprises real identity category label information, virtual identity category label information, trust relationship category label information and activity track category label information;
s2, training the personnel relationship mining model by using the personnel relationship training data set;
s3, acquiring personnel relationship detection data;
s4, preprocessing the personnel relationship detection data to obtain preprocessed personnel relationship detection data;
s5, processing the preprocessed personnel relationship detection data by using the personnel relationship mining model to obtain personnel category mining result information; the personnel category mining result information comprises real identity category information, virtual identity category information, trust relationship category information and activity track category information;
s6, carrying out evaluation processing on the personnel category mining result information to obtain a personnel relationship evaluation result value; the personnel relationship evaluation result value is used for representing the threat degree of the personnel relationship.
2. The distributed personal relationship mining assessment method according to claim 1, wherein the personal relationship mining model includes a feature extraction network and a classification network;
the characteristic extraction network is connected with the classification network and comprises a first processing module, a second processing module, a third processing module and a fourth processing module, and is used for carrying out characteristic extraction processing on input data to obtain characteristic data;
the first processing module comprises an input module, a first convolution module, a first activation module, a first residual error module and a first normalization module; the first port of the first processing module is an input end of the input module; the second port of the first processing module is an output end of the first normalization module; the first port of the first processing module is used as an input end of the feature extraction network and is used for receiving and obtaining input data; the second port of the first processing module is connected with the first port of the second processing module; in the first processing module, the input module, the first convolution module, the first activation module, the first residual error module and the first normalization module are sequentially connected;
the second processing module comprises a second convolution module, a second activation module and a second normalization module; the first port of the second processing module is an input end of the second convolution module; the second port of the second processing module is an output end of the second normalization module; the second port of the second processing module is connected with the first port of the third processing module; in the second processing module, the second convolution module, the second activation module and the second normalization module are sequentially connected;
the third processing module comprises a third convolution module, a third activation module and a third residual error module; the first port of the third processing module is an input end of the third convolution module; the second port of the third processing module is an output end of the third residual error module; the second port of the third processing module is connected with the first port of the fourth processing module; in the third processing module, the third convolution module, the third activation module and the third residual error module are sequentially connected;
the fourth processing module comprises a fourth convolution module, a fourth activation module and a fourth residual error module; the first port of the fourth processing module is an input end of the fourth convolution module; the second port of the fourth processing module is an output end of the fourth residual error module; the second port of the fourth processing module is used as an output end of the feature extraction network and is used for outputting feature data; in the fourth processing module, the fourth convolution module, the fourth activation module and the fourth residual error module are sequentially connected;
the classification network is used for carrying out relation classification and identification processing on the characteristic data to obtain personnel class mining result information or personnel class prediction result information, and comprises a first relation characteristic extraction module, a first coding module, a first decoding module, a first characteristic reconstruction module, a second relation characteristic extraction module, a second coding module, a second decoding module, a second characteristic reconstruction module and a full connection module; the first coding module comprises a first personnel category coding module and a first personnel track coding module; the second coding module comprises a second personnel category coding module and a second personnel track coding module;
the input end of the first relation feature extraction module and the input end of the second relation feature extraction module are connected with the second port of the fourth processing module;
the output end of the first relation feature extraction module is connected with the input end of the first coding module; the output ends of the first personnel category encoding module and the second personnel track encoding module are connected with the input end of the first decoding module; the output ends of the second personnel category encoding module and the first personnel track encoding module are connected with the input end of the second decoding module; the output end of the first decoding module is connected with the input end of the first characteristic reconstruction module; the output end of the second decoding module is connected with the input end of the second characteristic reconstruction module; the output end of the first characteristic reconstruction module and the output end of the second characteristic reconstruction module are connected with the input end of the full-connection module; and the output end of the full-connection module is used for outputting personnel category mining result information or personnel category prediction result information obtained by the personnel relationship mining model.
3. The distributed personnel relationship mining assessment method according to claim 1, wherein the training the personnel relationship mining model using the personnel relationship training data set comprises:
s21, initializing a training iteration number value;
s22, taking each personnel relationship training data in the personnel relationship training data set as input data, and inputting a personnel relationship mining model to obtain personnel category prediction result information;
s23, performing difference calculation processing on personnel category label data and personnel category prediction result information corresponding to the input data to obtain a difference value;
s24, judging whether the difference value meets a convergence condition or not to obtain a first judgment result;
when the first judgment result is negative, judging whether the training iteration number value is equal to a training number threshold value or not, and obtaining a second judgment result;
when the second judgment result is negative, determining that the model training state does not meet the training termination condition;
when the second judgment result is yes, determining that the model training state meets the training termination condition;
when the first judgment result is yes, determining that the model training state meets the training termination condition;
when the model training state does not meet the training termination condition, carrying out parameter updating on the classification network by using a parameter updating model, increasing the training iteration number by 1, and executing S22;
and when the model training state meets the training termination condition, finishing the training processing process of the personnel relationship mining model to obtain a trained personnel relationship mining model.
4. The distributed personal relationship mining assessment method according to claim 3, wherein the parameter update model is:
θ←θ+v;
wherein $x^{(i)}$ is the i-th data item in the personnel relationship training data set; the loss function is that of the personnel category label data of the j-th category; $v$ is the parameter update value; $\theta$ denotes the parameters of the classification network; $\eta$ is the initial learning rate; $\alpha$ is the momentum parameter, with a value between 0 and 1; the differential operator denotes the partial derivative with respect to $\theta$; $f(x^{(i)}; \theta)$ denotes the personnel category label data computed by the personnel relationship mining model for the i-th data item of the personnel relationship training data set, and $f(\cdot)$ is the calculation function corresponding to the personnel relationship mining model.
5. The distributed personnel relationship mining evaluation method according to claim 1, wherein the preprocessing the personnel relationship detection data to obtain preprocessed personnel relationship detection data includes:
s41, performing data cleaning processing on the personnel relationship detection data to obtain cleaning data;
s42, performing data range limiting processing on the cleaning data to obtain limiting data;
s43, carrying out normalization processing on the limiting data to obtain preprocessed personnel relationship detection data.
6. The distributed personnel relationship mining evaluation method according to claim 1, wherein the evaluation processing of the personnel category mining result information to obtain a personnel relationship evaluation result comprises:
s61, obtaining personnel category mining result information in N time periods;
s62, carrying out quantization processing on the personnel category mining result information in the N time periods to obtain a personnel category mining result matrix A;
the row vectors in the personnel category mining result matrix A are vectors obtained by carrying out quantization processing on personnel category mining result information in a time period;
s63, carrying out feature vector calculation processing on the personnel category mining result matrix A to obtain a personnel category feature vector V;
s64, carrying out weighted summation processing on the personnel category mining result matrix by utilizing the personnel category feature vector to obtain a personnel relationship evaluation result value.
7. The distributed personnel relationship mining evaluation method according to claim 6, wherein the performing feature vector calculation processing on the personnel category mining result matrix a to obtain a personnel category feature vector V includes:
S631, performing QR decomposition on the personnel category mining result matrix A to obtain
$A = U_s V_s^{T}$,
where $U_s$ and $V_s$ denote, respectively, the column-orthogonal matrix and the upper triangular matrix obtained from the matrix A;
S632, constructing the personnel category feature vector V from $U_s$.
8. A distributed personnel relationship mining and assessment apparatus, the apparatus comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the distributed personnel relationship mining assessment method of any one of claims 1 to 7.
9. A computer-readable storage medium storing computer instructions that, when invoked, are operable to perform the distributed personnel relationship mining assessment method of any one of claims 1 to 7.
10. An information data processing terminal, characterized in that the information data processing terminal is configured to implement the distributed personnel relationship mining evaluation method according to any one of claims 1 to 7.
CN202311733465.3A 2023-12-15 2023-12-15 Distributed personnel relationship mining and evaluating method and device Pending CN117708725A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311733465.3A CN117708725A (en) 2023-12-15 2023-12-15 Distributed personnel relationship mining and evaluating method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311733465.3A CN117708725A (en) 2023-12-15 2023-12-15 Distributed personnel relationship mining and evaluating method and device

Publications (1)

Publication Number Publication Date
CN117708725A true CN117708725A (en) 2024-03-15

Family

ID=90143939

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311733465.3A Pending CN117708725A (en) 2023-12-15 2023-12-15 Distributed personnel relationship mining and evaluating method and device

Country Status (1)

Country Link
CN (1) CN117708725A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120078918A1 (en) * 2010-09-28 2012-03-29 Siemens Corporation Information Relation Generation
CN109522342A (en) * 2018-11-30 2019-03-26 北京百度网讯科技有限公司 Police affairs management method, device, equipment and storage medium
CN110427406A (en) * 2019-08-10 2019-11-08 吴诚诚 The method for digging and device of organization's related personnel's relationship
CN112417176A (en) * 2020-12-09 2021-02-26 交通银行股份有限公司 Graph feature-based method, device and medium for mining implicit association relation between enterprises
CN112765287A (en) * 2021-02-05 2021-05-07 中国人民解放军国防科技大学 Method, device and medium for mining character relation based on knowledge graph embedding
CN113642482A (en) * 2021-08-18 2021-11-12 西北工业大学 Video character relation analysis method based on video space-time context
CN114168691A (en) * 2022-02-11 2022-03-11 南京拓界信息技术有限公司 Visualized character relation mining management system and method based on big data
CN114722212A (en) * 2022-02-28 2022-07-08 中国人民解放军国防科技大学 Automatic meta-path mining method oriented to character relation network
CN116450730A (en) * 2023-05-18 2023-07-18 成都合盛智联科技有限公司 Knowledge graph-based personnel relationship mining method and system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MOHAMMAD NUR 等: ""An Overview of Identity Relationship Management in the Internet of Things"", 2021 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 13 May 2021 (2021-05-13) *
郭佳: ""基于事理图谱的辅助判案技术的研究与实现"", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》, 15 January 2022 (2022-01-15) *
陶露;: "基于双向LSTM的人物关系抽取", 池州学院学报, no. 03, 28 June 2020 (2020-06-28) *
黄杨琛;贾焰;甘亮;徐菁;黄九鸣;赫中翮;: "基于远程监督的多因子人物关系抽取模型", 通信学报, no. 07, 25 July 2018 (2018-07-25) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination