CN114090752A - Problem thread mining method, device, computer equipment and medium - Google Patents

Problem thread mining method, device, computer equipment and medium

Info

Publication number
CN114090752A
Authority
CN
China
Prior art keywords
user
knowledge graph
node
clues
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111364231.7A
Other languages
Chinese (zh)
Inventor
王怡冰
朱克鹏
申中华
李新
贾飞
金亮
陆亦敏
万晓龙
刘水泉
魏聪惠
薛飞
鄞玮强
王俐
张振强
余华颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202111364231.7A
Publication of CN114090752A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a problem clue mining method, a problem clue mining device, computer equipment and a medium, and relates to the field of artificial intelligence. The method comprises: acquiring a user relationship knowledge graph; calculating characteristic values of the corresponding nodes based on the association relationships among the nodes in the user relationship knowledge graph; and determining hidden problem clues according to the characteristic values and displaying the hidden problem clues. By mining problem clues through the user relationship knowledge graph, the embodiment raises the intelligence level of problem clue mining and discovers potential problem clues comprehensively and accurately. By providing visual interaction, the mined clues to abnormal personnel behavior problems are displayed and explored, meeting the actual business requirements of the relevant investigation work.

Description

Problem thread mining method, device, computer equipment and medium
Technical Field
The embodiment of the invention relates to the field of artificial intelligence, in particular to a problem thread mining method, a problem thread mining device, computer equipment and a medium.
Background
In user investigation work, problem clues are generally found through subjective human impressions, petitions, reports and the like. The overall work is inefficient, and because problems are handled only after they are discovered, the resulting risk losses are relatively difficult to control. Because existing problem clue mining schemes rely on the subjective judgment of personnel, the information sources are narrow, information with complex relationships is difficult to mine, and potential problem clues cannot be mined comprehensively and effectively.
Disclosure of Invention
Embodiments of the present invention provide a problem thread mining method, apparatus, computer device, and medium, which can improve the intelligence level of problem thread mining.
In a first aspect, an embodiment of the present invention provides a problem thread mining method, including:
acquiring a user relation knowledge graph;
calculating characteristic values of corresponding nodes based on the incidence relation among the nodes in the user relation knowledge graph;
and determining hidden problem clues according to the characteristic values and displaying the hidden problem clues.
In a second aspect, an embodiment of the present invention further provides a problem thread mining apparatus, including:
the knowledge graph acquisition module is used for acquiring a user relation knowledge graph;
the characteristic value calculating module is used for calculating the characteristic values of the corresponding nodes based on the incidence relation among the nodes in the user relation knowledge graph;
and the problem clue display module is used for determining hidden problem clues according to the characteristic values and displaying the hidden problem clues.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the problem clue mining method described in any embodiment of the invention.
In a fourth aspect, the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, perform the problem thread mining method according to any of the embodiments of the present invention.
In a fifth aspect, the embodiment of the present invention further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the problem thread mining method according to any embodiment of the present invention.
The embodiment of the invention provides a problem clue mining scheme in which the characteristic values of the corresponding nodes are calculated based on the association relationships among the nodes in a user relationship knowledge graph, and hidden problem clues are then determined according to the characteristic values and displayed. By mining problem clues through the user relationship knowledge graph, the embodiment raises the intelligence level of problem clue mining and discovers potential problem clues comprehensively and accurately. By providing visual interaction, the mined clues to abnormal personnel behavior problems are displayed and explored, meeting the actual business requirements of the relevant investigation work.
Drawings
FIG. 1 is a flowchart of a problem thread mining method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating the definition of abnormal relationships and abnormal problem clues;
FIG. 3 is a flowchart of another problem thread mining method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an entity of interpersonal relationships and a structure of a relationship wide table;
FIG. 5 is a schematic diagram of the structure of an entity and relationship wide table representing the relationship of a person to other types of entities;
FIG. 6 is a schematic diagram of a user relationship knowledge graph;
FIG. 7 is a flowchart of another problem thread mining method according to an embodiment of the present invention;
FIG. 8 is a diagram of an example of a closeness centrality algorithm;
FIG. 9 is a diagram of an example of a betweenness centrality algorithm;
FIG. 10 is a flowchart of problem thread mining based on a user relationship knowledge graph according to an embodiment of the present invention;
FIG. 11 is a block diagram of a problem thread mining device according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
To facilitate understanding, terms that may appear below are explained.
A data lake is a repository or system that stores data in its raw format, as it is, without structuring it in advance. A data lake may store structured data, semi-structured data, unstructured data, binary data, and the like.
A data warehouse is a subject-oriented, integrated, relatively stable data set that reflects historical changes and is used to support management decisions.
A data mart, also called a data market, serves the needs of a specific department or user. It is stored multidimensionally, which includes defining the dimensions, the indicators to be calculated, the dimension hierarchies and so on, and it produces data cubes oriented to decision analysis needs.
Data lineage, i.e. the origin and history of data, mainly covers the source of the data, how the data is processed, its mapping relationships and its outlets. In practice, raw data is processed in multiple steps as needed until new data is finally produced. Many data tables are generated in the process, and the link relationships among these tables are called data lineage.
A knowledge graph is a semantic network that reveals relationships between entities. Logically, a knowledge graph can be divided into a schema layer and a data layer. The data layer consists mainly of a series of facts, and knowledge is stored with the fact as the unit. If facts are expressed as triples of the form (entity 1, relationship, entity 2) or (entity, attribute, attribute value), a graph database may be selected as the storage medium. The schema layer is built on top of the data layer and constrains the fact expressions of the data layer mainly through an ontology library. An ontology is a conceptual template of a structured knowledge base that lists the types of entities, the relationships linking them, and the definitions of the combinations of entities and relationships. To some extent, an ontology can also be understood as defining the rules for how entities in the world connect.
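As a rough illustration of the data layer described above, facts could be held as (entity 1, relationship, entity 2) triples and (entity, attribute, attribute value) triples and indexed into a simple in-memory graph. This is only a sketch: the entity identifiers, attribute names and the `build_graph` helper are made up for illustration, and a real deployment would use a graph database as the text suggests.

```python
from collections import defaultdict

# Hypothetical relationship triples: (entity 1, relationship, entity 2).
relation_triples = [
    ("person:P001", "COLLEAGUE", "person:P002"),
    ("account:A100", "ACCOUNT_BELONG_TO", "person:P001"),
]

# Hypothetical attribute triples: (entity, attribute, attribute value).
attribute_triples = [
    ("person:P001", "name", "Zhang San"),
    ("account:A100", "card_type", "debit"),
]


def build_graph(relations, attributes):
    """Index triples into an adjacency map and a per-entity attribute map."""
    edges = defaultdict(list)
    props = defaultdict(dict)
    for head, relation, tail in relations:
        edges[head].append((relation, tail))
    for entity, attribute, value in attributes:
        props[entity][attribute] = value
    return dict(edges), dict(props)


edges, props = build_graph(relation_triples, attribute_triples)
```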
The general process of problem clue discovery is as follows:
1) Staff involved in the investigation check and verify petitions and reports.
2) Problem clues are screened according to the subjective impressions that the investigation staff have of the subject.
3) Data is acquired, ETL (extract, transform, load) processing is performed on it, and the processed data is uniformly loaded and stored in an enterprise data lake and data warehouse.
4) A personnel problem clue analysis library is established through predefined rules and then applied to evaluate whether a problem clue is valid.
This problem clue mining approach screens problem clues based on the subjective judgment of personnel. The information sources are narrow, hidden problem clues are difficult to mine from information with complex relationships, and both the degree of intelligence and the accuracy are low.
For situations where hidden problem clues are difficult to find using predefined rules, embodiments of the present invention build and use a user relationship knowledge graph to assist in mining hidden problem clues.
The user relationship knowledge graph is a knowledge graph extracted from actual business processes and various rules and regulations and built around clues to abnormal personnel behavior problems. Because user investigation management is a management process centered on people, and people are often involved in various complex relationships, the relationship-expressing capability of a knowledge graph makes it a good fit for user investigation management. According to supervision requirements, existing data is summarized and abstracted, the entities and relationships capable of expressing problem clues are extracted, and a user relationship knowledge graph for mining and managing personnel problem clues is formed on that basis.
Specifically, a relatively intelligent user relationship knowledge graph is constructed by acquiring, processing, analyzing and mining multidimensional key data in the data mart, such as financial reimbursement, human resources, travel, credit, deposits, transaction flows, procurement, real estate, non-performing assets, risk scoring, voyage, industrial and commercial enterprises, internal control compliance, anti-money laundering, anti-fraud and patrol data, thereby promoting digital, scientific and intelligent management of problem clue mining work.
It should be noted that, in the technical solution of the present invention, operations such as collection, storage, use, and processing of the personal information of the user are subject to relevant national laws and regulations regarding personal information security.
Fig. 1 is a flowchart of a problem clue mining method according to an embodiment of the present invention, which may be applied to the case of monitoring the abnormal behavior problem of people, and the method may be executed by a problem clue mining apparatus, which may be implemented by software and/or hardware and configured in a computer device. The method comprises the following steps:
Step 110: acquire a user relationship knowledge graph.
Illustratively, when clues to abnormal personnel problems need to be investigated, a user relationship knowledge graph constructed in advance from data in the data mart is acquired.
Step 120: calculate the characteristic values of the corresponding nodes based on the association relationships among the nodes in the user relationship knowledge graph.
The association relationship may be the degree of influence of a node on other nodes. For example, the association relationship includes how easily a node reaches other nodes, and the proportion of shortest paths between two other nodes for which the node acts as a bridge.
Illustratively, the reciprocal of the average distance between each node and the other nodes in the user relationship knowledge graph is calculated and taken as the closeness centrality value of the corresponding node. For any node in the user relationship knowledge graph, the number of times the node acts as a bridge on a shortest path between two other nodes and the number of shortest paths between those two nodes are determined, and the ratio of the former to the latter is taken as the betweenness centrality value of the corresponding node. The node identifier, the closeness centrality value and the betweenness centrality value of each node are stored in association.
Optionally, the node identifier, closeness centrality value and betweenness centrality value of the corresponding node may be stored in association in a data table. The closeness centrality and betweenness centrality values may also be added to the triples as attributes, so that the user relationship knowledge graph is generated from the updated triples and the corresponding values are displayed in the nodes of the user relationship knowledge graph.
Step 130: determine hidden problem clues according to the characteristic values and display the hidden problem clues.
Illustratively, the characteristic value of each node is compared with a preset threshold, and target nodes whose characteristic value is greater than the preset threshold are determined. Each target node and the other nodes connected to it by edges are taken as hidden problem clues and displayed. Because there is more than one characteristic value, the closeness centrality value of each node may be compared with a preset closeness centrality threshold to obtain a first comparison result, and the betweenness centrality value of each node may be compared with a preset betweenness centrality threshold to obtain a second comparison result. If at least one of the two comparison results shows that the value is greater than the corresponding threshold, the node is determined to be a target node whose characteristic value is greater than the preset threshold. Taking the target node and the nodes connected to it as hidden problem clues means that the nodes connected to the target node need to be examined to determine whether abnormal behavior problems exist between the entities.
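The threshold comparison in this step can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `find_hidden_clues`, the example graph, the centrality values and both thresholds are hypothetical, and the centrality values are assumed to have been computed beforehand.

```python
def find_hidden_clues(adjacency, closeness, betweenness,
                      closeness_threshold, betweenness_threshold):
    """Return {target node: neighbours} for nodes exceeding either threshold."""
    clues = {}
    for node in adjacency:
        # First comparison result: closeness value against its threshold.
        exceeds_closeness = closeness.get(node, 0.0) > closeness_threshold
        # Second comparison result: betweenness value against its threshold.
        exceeds_betweenness = betweenness.get(node, 0.0) > betweenness_threshold
        if exceeds_closeness or exceeds_betweenness:
            # The target node plus every node joined to it by an edge
            # forms one hidden problem clue to be examined.
            clues[node] = set(adjacency[node])
    return clues


# Hypothetical chain A - B - C - D with precomputed characteristic values.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
closeness = {"A": 0.50, "B": 0.75, "C": 0.75, "D": 0.50}
betweenness = {"A": 0.0, "B": 2.0, "C": 2.0, "D": 0.0}

clues = find_hidden_clues(adjacency, closeness, betweenness,
                          closeness_threshold=0.7, betweenness_threshold=3.0)
```

Here nodes B and C are selected because their closeness values exceed 0.7, even though no node exceeds the betweenness threshold, which matches the "at least one" condition in the text.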
Optionally, when the user relationship knowledge graph is displayed, the user may be alerted by displaying the edges between the target node and the other nodes in a preset color. Alternatively, the user may be alerted by displaying the borders of the target node and the other nodes in a preset color, or by filling the target node and the other nodes with a preset color.
Optionally, before the characteristic value of each node is compared with the preset threshold, the method further includes: querying a preset data table by node identifier to obtain the characteristic values of the corresponding node, where node identifiers and characteristic values are stored in association in the preset data table. For example, the preset data table is queried by node identifier to obtain the closeness centrality value and betweenness centrality value of the corresponding node.
Optionally, before the characteristic value of each node is compared with the preset threshold, the method further includes: acquiring the characteristic values corresponding to each node in the user relationship knowledge graph, where the characteristic values are displayed in the nodes of the user relationship knowledge graph. For example, the triple data of the corresponding node is queried by node identifier to obtain its closeness centrality value and betweenness centrality value.
According to the technical solution of this embodiment, the characteristic values of the corresponding nodes are calculated based on the association relationships among the nodes in the user relationship knowledge graph, and hidden problem clues are determined according to the characteristic values and displayed. By mining problem clues through the user relationship knowledge graph, the embodiment raises the intelligence level of problem clue mining and discovers potential problem clues comprehensively and accurately. By providing visual interaction, the mined clues to abnormal personnel behavior problems are displayed and explored, meeting the actual business requirements of the relevant investigation work.
On the basis of the above technical solution, after the hidden problem clues are determined according to the characteristic values, the following additional technical feature may also be included: verifying the validity of the hidden problem clues using abnormal relationship rules. An abnormal relationship rule is a rule for judging abnormal behavior problems, summarized from existing data according to supervision requirements. Abnormal relationship rules include, but are not limited to: using one's own account or a controlled account to exchange funds with corporate credit customers and their affiliated enterprises, or with the personal accounts of senior executives of corporate credit customers and their affiliated enterprises; assisting enterprises or individuals in obtaining bank loans and then re-lending the money to others at high interest; financing repayment funds for credit customers for "reclaiming and repayment"; and whether a financial transaction has occurred with a customer.
FIG. 2 is a diagram illustrating the definition of abnormal relationships and abnormal problem clues. FIG. 2 shows, in the form of a mind map, the definitions of abnormal personnel relationships and abnormal problem clues extended from the abnormal relationship rules. For employee-associated violation events, the employee is the associated object of the abnormal event. For example, employee-associated violation events include collusion events, abnormal penalties, bad loans and suspension of work. For violation associations, the person has an ordinary relationship with the core object of the abnormal event. For example, the person is a relative, colleague or alumnus of the core object of a bid-collusion event. For abnormal shared mobile phone numbers, a relationship exists that would not be expected under normal behavior. For example, a person's own mobile phone number is retained in another person's account information, such as a credit card. For abnormal fund transactions, large transactions occur frequently with the same counterparty within a short period, and abnormal fund-transaction behavior is suspected.
Because some of the hidden clues to abnormal behavior problems discovered through the user relationship knowledge graph may be invalid, the embodiment of the invention uses predefined abnormal relationship rules to verify the validity of the hidden problem clues after they are determined according to the characteristic values. This eliminates invalid hidden problem clues, reduces the volume of hidden problem clues that the user needs to verify, and improves verification efficiency.
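One way to picture rule-based validation is to treat each abnormal relationship rule as a predicate over a candidate clue and keep only the clues that satisfy at least one rule. The sketch below is illustrative only: the `Clue` fields, the two rule functions and their numeric cut-offs are assumptions made for this example and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Clue:
    target: str            # target node found via the characteristic values
    neighbor: str          # node connected to the target by an edge
    relation: str          # relationship type on the connecting edge
    amount: float = 0.0    # total transaction amount (hypothetical field)
    count: int = 0         # number of transactions in the window (hypothetical)


Rule = Callable[[Clue], bool]


def frequent_large_transactions(clue: Clue) -> bool:
    """Hypothetical rule: many large transactions with one counterparty."""
    return (clue.relation == "TRANSACTION"
            and clue.count >= 5 and clue.amount >= 100_000)


def linked_to_violation_event(clue: Clue) -> bool:
    """Hypothetical rule: the edge ties the node to a recorded abnormal event."""
    return clue.relation == "RELATE_TO_EVENT"


def validate(clues: List[Clue], rules: List[Rule]) -> List[Clue]:
    """Keep only the clues that match at least one abnormal relationship rule."""
    return [clue for clue in clues if any(rule(clue) for rule in rules)]


candidates = [
    Clue("P001", "A200", "TRANSACTION", amount=250_000, count=8),
    Clue("P001", "P002", "COLLEAGUE"),
    Clue("P003", "E900", "RELATE_TO_EVENT"),
]
valid = validate(candidates, [frequent_large_transactions, linked_to_violation_event])
```

In this example the plain colleague relationship is discarded, so the reviewer only needs to examine the two remaining clues.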
On the basis of the above technical solution, before the user relationship knowledge graph is acquired, the method further includes the additional technical feature of constructing the user relationship knowledge graph.
FIG. 3 is a flowchart of another problem clue mining method according to an embodiment of the present invention. As shown in FIG. 3, the method includes:
Step 310: acquire a model construction data table and a relationship detail table.
The model construction data table stores the entity information used to construct the knowledge graph, and the relationship detail table stores the relationships between entities.
Illustratively, business data usable for constructing the knowledge graph may be obtained and processed in advance from the existing data lake and data warehouse, based on an analysis of the data sources and data lineage. Optionally, the business data may be screened, people-centered data that is integrated into model construction data tables representing entity information, as shown in Table 1.
Table 1 is a model construction data table.
Serial number | Table name
1             | Tag table of dealings between personnel's relatives and suppliers
2             | Personnel relative-avoidance tag table
3             | Personnel debit card information table
4             | Personnel basic information table
5             | Suspected bid-collusion tag traceability table
……            | ……
Centered on the representation of people, the entity types include person, event, institution, mobile phone number, account, device and the like. More specifically, persons include staff, customers, relatives and the like. Events mainly comprise the violation events, recorded in the information, in which various persons participated, for example suspension of work, bad loans, wading and so on; persons can become related through an event. Institutions are mainly those through which different persons can become related, such as workplaces and schools: persons in the same workplace may be colleagues, and persons at the same school may be classmates. A mobile phone number is a personal item of the user. In the investigation business, mobile phone numbers are retained as required information in various events, and a private mobile phone number may be registered under another person's name as an auxiliary tool, producing an abnormal association, so it can be represented as an entity. Accounts may include unique identifiers, card numbers, account numbers, customer numbers and the like. It should be noted that the entity types are not limited to those listed above; if a new type of entity arises during actual business operation, its entity information may be added to the model construction data table.
By mining the relationships among the various entities, a relationship detail table representing the relationships between entities is constructed, as shown in Table 2.
[Table 2, the relationship detail table: provided as images BDA0003360297960000101 and BDA0003360297960000111 in the source]
The relationships between entities include relationships between persons, between persons and accounts, between persons and mobile phone numbers, between persons and events, between persons and institutions, between accounts and mobile phone numbers, between accounts and events, between accounts and institutions, between mobile phone numbers and institutions, and between events and institutions. It should be noted that the relationships between entities are not limited to those listed above; if a new relationship type arises during actual business operation, it may be added to the relationship detail table.
Step 320: construct the user relationship knowledge graph from the model construction data table and the relationship detail table using a knowledge graph generation tool.
Illustratively, the entities contained in the model construction data table are determined; the relationships among the entities are determined from the relationship detail table; the triples for constructing the knowledge graph are determined from the entities and relationships; and the user relationship knowledge graph is constructed from the triples.
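The three sub-steps above (determine the entities, determine the relationships, assemble triples) might look like the following in miniature. The column names `start`, `type` and `end`, the sample rows and the `build_triples` helper are hypothetical; the patent does not specify the table layouts at this level of detail.

```python
# Hypothetical rows standing in for the model construction data table:
# entity identifier -> entity type.
entities = {"P001": "Person", "P002": "Person", "A100": "Account"}

# Hypothetical rows standing in for the relationship detail table.
relation_rows = [
    {"start": "P001", "type": "COLLEAGUE", "end": "P002"},
    {"start": "A100", "type": "ACCOUNT_BELONG_TO", "end": "P001"},
    {"start": "A100", "type": "TRANSACTION", "end": "A999"},  # A999 is not a known entity
]


def build_triples(entities, relation_rows):
    """Turn relationship rows into (entity 1, relationship, entity 2) triples.

    Rows that mention an entity missing from the entity table are set aside
    rather than silently creating dangling nodes.
    """
    triples, skipped = [], []
    for row in relation_rows:
        if row["start"] in entities and row["end"] in entities:
            triples.append((row["start"], row["type"], row["end"]))
        else:
            skipped.append(row)
    return triples, skipped


triples, skipped = build_triples(entities, relation_rows)
```

Setting aside rows with unknown entities is a design choice for this sketch; a real pipeline might instead add the missing entity to the model construction data table first.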
Specifically, before the user relationship knowledge graph is constructed from the model construction data table and the relationship detail table, entity and relationship wide tables are designed on the basis of these two kinds of data tables in order to build the triples that form the knowledge graph. FIG. 4 is a schematic diagram of the structure of the entity and relationship wide table for interpersonal relationships. FIG. 5 is a schematic diagram of the structure of the entity and relationship wide table representing the relationships between persons and other types of entities. Optionally, before the user relationship knowledge graph is generated, the method further includes defining the entity types and relationship types.
Illustratively, a plurality of entity and relationship wide tables are generated from the model construction data table and the relationship detail table, so that the data in those tables meets the processing requirements of the graph database. The entity and relationship wide tables may contain entity-relationship triples. The entity and relationship wide tables are input into a graph database, and the user relationship knowledge graph is constructed through the graph database. FIG. 6 is a schematic diagram of a user relationship knowledge graph, showing the relationships among persons, institutions, accounts, contact addresses, debit card account numbers and violation events. The entity type definitions and relationship type definitions are shown in the following tables.
Table 3 is an entity type definition table.
[Table 3, the entity type definition table: provided as image BDA0003360297960000131 in the source]
Table 4 is a relationship type definition table.
Relationship name          | Relationship type      | Start node  | End node
Relative                   | RELATIVES              | Person      | Person
Fellow townsman            | COUNTRYMEN             | Person      | Person
Alumnus                    | ALUMNUS                | Person      | Person
Colleague                  | COLLEAGUE              | Person      | Person
Guarantee for others       | GUARANTEE              | Person      | Person
Account belongs to         | ACCOUNT_BELONG_TO      | Account     | Person
Phone number belongs to    | PHONENUMBER_BELONG_TO  | PhoneNumber | Person
Phone number belongs to    | PHONENUMBER_BELONG_TO  | PhoneNumber | Organization
Related to abnormal event  | RELATE_TO_EVENT        | Organization | Event
Related to abnormal event  | RELATE_TO_EVENT        | Person      | Event
Director                   | DIRECTOR               | Person      | ExternalInstitutions
Supervisor                 | SUPERVISOR             | Person      | ExternalInstitutions
Senior management          | MANAGEMENT             | Person      | ExternalInstitutions
Belongs to organization    | BELONG_TO_ORG          | Person      | InternalOrganization
Transaction                | TRANSACTION            | Account     | Account
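A relationship type definition of this kind can double as a schema check before edges are loaded into the graph. The sketch below encodes a subset of the rows of Table 4 as a lookup from relationship type to allowed (start node type, end node type) pairs; the `ALLOWED_ENDPOINTS` name and the `is_valid_edge` helper are hypothetical and only illustrate the idea.

```python
# Subset of Table 4: relationship type -> allowed (start type, end type) pairs.
ALLOWED_ENDPOINTS = {
    "RELATIVES": {("Person", "Person")},
    "ACCOUNT_BELONG_TO": {("Account", "Person")},
    "PHONENUMBER_BELONG_TO": {
        ("PhoneNumber", "Person"),
        ("PhoneNumber", "Organization"),
    },
    "RELATE_TO_EVENT": {("Organization", "Event"), ("Person", "Event")},
    "TRANSACTION": {("Account", "Account")},
}


def is_valid_edge(relation_type, start_type, end_type):
    """Check an edge against the relationship type definitions.

    Unknown relationship types are rejected, as are known types whose
    start and end node types do not match any defined combination.
    """
    return (start_type, end_type) in ALLOWED_ENDPOINTS.get(relation_type, set())


ok = is_valid_edge("PHONENUMBER_BELONG_TO", "PhoneNumber", "Organization")
reversed_direction = is_valid_edge("ACCOUNT_BELONG_TO", "Person", "Account")
```

The second call returns False because the definition places the account at the start of the edge and the person at the end.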
Step 330: acquire the user relationship knowledge graph.
Step 340: calculate the characteristic values of the corresponding nodes based on the association relationships among the nodes in the user relationship knowledge graph.
Step 350: determine hidden problem clues according to the characteristic values and display the hidden problem clues.
According to the technical solution of this embodiment, the user relationship knowledge graph is constructed by extracting relevant data and management requirements from actual business processes and the various rules and regulations, which raises the intelligence level of problem clue discovery.
On the basis of the above technical solution, after the user relationship knowledge graph is acquired, the method further includes the following additional technical feature: cutting the user relationship knowledge graph in response to the user's query information, and displaying the cut user relationship knowledge graph.
Fig. 7 is a flowchart of another problem thread mining method according to an embodiment of the present invention, as shown in fig. 7, the method includes:
and step 710, acquiring a user relation knowledge graph.
And 720, calculating characteristic values of the corresponding nodes based on the incidence relation among the nodes in the user relation knowledge graph.
Illustratively, the characteristic values include a tight centrality value and a medium centrality value. And calculating the difficulty of a certain node to reach other nodes, namely calculating the reciprocal of the average value of the distances from the certain node to all other nodes. Can be expressed as the following equation: raw close together centre (node) 1/sum (distance from node to all other nodes)
Fig. 8 is an example diagram of the closeness (tight) centrality algorithm. From the relationships between the nodes in Fig. 8, it can be inferred that node C has the shortest average distance to the other nodes, followed by nodes B and D, while nodes A and E have the longest average distance to the other nodes. Evaluating the above equation for each node confirms this conclusion.
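As a minimal sketch of the reciprocal-of-summed-distances formula above, the Python below runs a breadth-first search from each node of an unweighted, undirected graph. The five-node chain A-B-C-D-E is an assumption chosen to match the ordering described for Fig. 8 (the figure itself is not reproduced here), and the names `raw_closeness` and `chain` are illustrative only.

```python
from collections import deque

def raw_closeness(graph, node):
    """Return 1 / (sum of shortest-path distances from node to every reachable node)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:            # first visit gives the shortest distance
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return 1.0 / total if total else 0.0  # isolated node: defined as 0 in this sketch

# Assumed topology: a simple chain, stored as an adjacency list.
chain = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

scores = {n: raw_closeness(chain, n) for n in chain}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under this assumed chain, the middle node C receives the largest value and the end nodes A and E the smallest, which agrees with the ordering stated for Fig. 8.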
Betweenness (mediation) centrality measures the degree of influence a node has on the flow of information in a graph. All shortest paths between every pair of nodes in the network are calculated; if many of those paths pass through a certain node, that node is considered to have high betweenness centrality. The specific calculation formula is as follows:
$$B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where $\sigma_{st}$ represents the number of shortest paths from node $s$ to node $t$, and $\sigma_{st}(v)$ represents the number of those shortest paths that pass through node $v$.
Fig. 9 is an example diagram of the betweenness (intermediary) centrality algorithm. From the nodes and relationships in Fig. 9, it can be seen that node A has the most associations, so it can be inferred that node A has the greatest influence on the flow of information, that is, the highest betweenness centrality, followed by node C. Evaluating the above formula confirms this conclusion.
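The shortest-path counting described above can be sketched as follows. This is a simple pairwise version for small unweighted, undirected graphs, not necessarily the algorithm used by the embodiment; it sums over unordered node pairs, and the graph `hub` is a hypothetical topology in which node A has the most associations, since Fig. 9 is not reproduced here.

```python
from collections import deque
from itertools import combinations

def bfs_counts(graph, source):
    """Return (distance, number of shortest paths) from source to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u lies on a shortest path to w
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(graph):
    """Sum sigma_st(v) / sigma_st over all unordered pairs (s, t) with s != v != t."""
    info = {s: bfs_counts(graph, s) for s in graph}
    score = {v: 0.0 for v in graph}
    for s, t in combinations(graph, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue                     # s and t are not connected
        for v in graph:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            # v is on a shortest s-t path iff d(s, v) + d(v, t) == d(s, t)
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score

# Hypothetical graph: A is linked to B, C and D; C is also linked to E.
hub = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A", "E"],
    "D": ["A"],
    "E": ["C"],
}
score = betweenness(hub)
```

In this assumed graph node A obtains the highest value and node C the second highest, mirroring the ordering described for Fig. 9. For large graphs, a dedicated algorithm such as Brandes' method would normally be preferred over this pairwise loop.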
Step 730: determine hidden problem clues according to the feature values.
Optionally, after the hidden problem clues are determined according to the feature values, a preset abnormal relationship rule may further be used to verify whether the hidden problem clues are valid. Alternatively, abnormal behavior features learned by a machine learning model may be used to verify whether the hidden problem clues are valid.
Step 740: cut the user relationship knowledge graph in response to query information from the user, and display the hidden problem clues in the cut user relationship knowledge graph.
Specifically, an employee identifier input by the user is acquired; a corresponding first node is selected from the user relationship knowledge graph according to the employee identifier, second nodes that are not connected to the first node are deleted from the user relationship knowledge graph, and the edges between the second nodes are deleted; and the hidden problem clues are displayed based on the user relationship knowledge graph after deletion.
Optionally, cutting the user relationship knowledge graph in response to the query information of the user and displaying the hidden problem clues in the cut user relationship knowledge graph includes: acquiring employee identifiers, input by the user, of at least two persons to be queried; determining at least two corresponding fourth nodes from the user relationship knowledge graph according to the employee identifiers, deleting the fifth nodes of the user relationship knowledge graph, namely all nodes other than the fourth nodes, and deleting the edges between the fifth nodes; and displaying the hidden problem clues based on the user relationship knowledge graph after deletion.
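The two cutting variants above can be sketched on an adjacency-list graph as follows. The sketch assumes that "not connected" means "not joined to the queried node by an edge"; if reachability over several hops is intended, a breadth-first search would replace the neighbour lookup. The function names and the sample identifiers (`E001`, `ACC1`, and so on) are hypothetical.

```python
def keep_neighbourhood(graph, employee_id):
    """Single-person query: keep the queried node and the nodes joined to it by an edge."""
    keep = {employee_id} | set(graph[employee_id])
    # Dropping a node also drops every edge that touches it.
    return {n: [m for m in nbrs if m in keep]
            for n, nbrs in graph.items() if n in keep}

def keep_only(graph, employee_ids):
    """Multi-person query: keep only the queried nodes and the edges among them."""
    keep = set(employee_ids)
    return {n: [m for m in nbrs if m in keep]
            for n, nbrs in graph.items() if n in keep}

# Hypothetical user relationship graph.
graph = {
    "E001": ["E002", "ACC1"],
    "E002": ["E001", "E003"],
    "E003": ["E002"],
    "ACC1": ["E001"],
}

single = keep_neighbourhood(graph, "E001")
pair = keep_only(graph, ["E001", "E002"])
```

Both helpers return a new graph and leave the full graph untouched, so a later query can start again from the complete user relationship knowledge graph.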
It should be noted that there are many ways to display hidden problem clues in the user relationship knowledge graph, and the embodiment of the present invention is not specifically limited in this respect. For example, a hidden problem clue may be displayed in the user relationship knowledge graph by adjusting the display attributes of the edges corresponding to the clue, by adjusting the display attributes of the nodes corresponding to the clue, or by adjusting the display attributes of both the edges and the nodes at the same time.
Step 750: in response to a clue pushing request input by the user, push the hidden problem clue to a superior user of the user, so as to instruct the superior user to judge whether the hidden problem clue is valid.
Step 760: in response to a clue cancellation operation performed by the superior user on an invalid hidden problem clue, modify the display attribute of the edge or node corresponding to the invalid hidden problem clue.
The technical solution of this embodiment provides a visual interaction mode that strongly supports the business and meets the actual business requirements of the related investigation work.
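The push and cancel steps above can be illustrated with a small state sketch. Everything in it is hypothetical: the `Clue` class, the status names, and the display attributes (colour and highlight flag) are assumptions made for illustration, since the text does not specify how display attributes are represented.

```python
from dataclasses import dataclass, field

@dataclass
class Clue:
    node_ids: list
    status: str = "pending"   # assumed life cycle: pending -> pushed -> cancelled
    display: dict = field(
        default_factory=lambda: {"color": "red", "highlight": True}
    )

def push_to_superior(clue):
    """Mark the clue as pushed to the superior user for a validity judgment."""
    clue.status = "pushed"
    return clue

def cancel_clue(clue):
    """Superior user judged the clue invalid: change how its nodes or edges are drawn."""
    clue.status = "cancelled"
    clue.display = {"color": "grey", "highlight": False}
    return clue

clue = Clue(node_ids=["E001", "ACC1"])
push_to_superior(clue)
status_after_push = clue.status
cancel_clue(clue)
```

After cancellation the clue is kept but drawn without emphasis, which is one possible reading of modifying the display attribute of the corresponding edge or node.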
Fig. 10 is a flowchart of problem clue discovery based on a user relationship knowledge graph according to an embodiment of the present invention. The steps illustrated in Fig. 10 have been covered in the description of Fig. 1; the key steps are labeled in Fig. 10 for greater clarity. As shown in Fig. 10, the main flow of problem thread mining is as follows. First, the data sources are accessed, and the data from the data sources are processed in a unified manner. The resulting standardized data are loaded into a data warehouse and processed there. A user relationship knowledge graph is then constructed from the processed data. The closeness (tight) centrality value and the betweenness (medium) centrality value of each node in the user relationship knowledge graph are calculated using a closeness centrality algorithm and a betweenness centrality algorithm, and hidden problem clues are found according to these two values. After the current user completes a validity judgment on the discovered problem clues, the problem clues are pushed to a user with higher authority, who is requested to judge whether the hidden problem clues are valid. Optionally, after the hidden problem clues are discovered, their validity may also be verified according to abnormal relationship rules and/or abnormal behavior features learned by a machine learning algorithm.
The method can adapt to continuously changing business requirements, and uses the user relationship knowledge graph to make problem clue discovery more digital, intelligent, and scientific. Implementing the embodiment of the present invention can improve the effectiveness and accuracy of problem thread mining and can, to a certain extent, meet the related requirements of problem thread mining in various businesses. A friendly application and display interface is also provided, which is simple and convenient for the user to operate.
Fig. 11 is a block diagram of a problem thread mining apparatus according to an embodiment of the present invention, which can perform the problem thread mining method according to any embodiment of the present invention, and implement hidden problem thread mining by performing the method. The apparatus may be implemented by software and/or hardware and configured in a computer device. As shown in fig. 11, the apparatus includes:
a knowledge graph obtaining module 1110, configured to obtain a user relationship knowledge graph;
a feature value calculating module 1120, configured to calculate feature values of corresponding nodes based on an association relationship between nodes in the user relationship knowledge graph;
a problem clue display module 1130, configured to determine hidden problem clues according to the feature values, and to display the hidden problem clues.
The embodiment of the present invention provides a problem thread mining apparatus that mines problem clues through a user relationship knowledge graph, which raises the level of intelligence of problem thread mining and enables comprehensive and accurate discovery of potential problem clues. By providing a visual interaction mode, clues to abnormal personnel behavior can be mined, displayed, and explored, which meets the actual business requirements of the related investigation work.
Optionally, the apparatus further comprises:
a clue verification module, configured to verify the validity of the hidden problem clues by using an abnormal relationship rule after the hidden problem clues are determined according to the feature values.
Optionally, the apparatus further comprises:
a graph construction module, configured to, before the user relationship knowledge graph is acquired, acquire a model construction data table and a relationship detail table, wherein the model construction data table is used for storing entity information for constructing the knowledge graph, and the relationship detail table is used for storing the relationships between entities; and to construct the user relationship knowledge graph based on the model construction data table and the relationship detail table by using a knowledge graph generation tool.
Optionally, the graph construction module is further specifically configured to:
determining entities included in the model building data table;
determining the relationship among the entities according to a preset relationship list;
and determining a triple for constructing the knowledge graph according to the entity and the relation, and constructing the user relation knowledge graph based on the triple.
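The three operations above can be sketched as follows. The relationship names `BELONG_TO_ORG`, `TRANSACTION`, and `DIRECTOR` come from the relationship table in this description; the column names (`id`, `type`, `source`, `relation`, `target`), the sample rows, and the helper names are assumptions, and a real deployment would hand the triples to a knowledge graph generation tool rather than to a Python dictionary.

```python
# Assumed shape of the model construction data table (one row per entity).
entities = [
    {"id": "P1", "type": "Person"},
    {"id": "O1", "type": "InternalOrganization"},
    {"id": "A1", "type": "Account"},
    {"id": "A2", "type": "Account"},
]

# Assumed shape of the relationship detail table (one row per relationship).
relations = [
    {"source": "P1", "relation": "BELONG_TO_ORG", "target": "O1"},
    {"source": "A1", "relation": "TRANSACTION", "target": "A2"},
    {"source": "P1", "relation": "DIRECTOR", "target": "X9"},  # X9 is not a known entity
]

def build_triples(entities, relations):
    """Return (head, relation, tail) triples whose endpoints are both known entities."""
    known = {e["id"] for e in entities}
    return [
        (r["source"], r["relation"], r["target"])
        for r in relations
        if r["source"] in known and r["target"] in known
    ]

def to_graph(entities, triples):
    """Build an undirected adjacency list with one entry per entity."""
    graph = {e["id"]: [] for e in entities}
    for head, _, tail in triples:
        graph[head].append(tail)
        graph[tail].append(head)
    return graph

triples = build_triples(entities, relations)
graph = to_graph(entities, triples)
```

Rows that point at an unknown entity are skipped here; whether such rows should instead be reported as data errors is a design choice the text leaves open.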
Optionally, the feature value calculating module 1120 is specifically configured to:
calculating the reciprocal of the distance average value of each node and other nodes in the user relationship knowledge graph, and taking the reciprocal as the compact centrality value of the corresponding node;
calculating the intermediary centrality value of the corresponding node according to the frequency of any node in the user relationship knowledge graph as a bridge of the shortest path between other two nodes and the number of the shortest paths between the corresponding two nodes;
storing the node identifier, the closeness (tight) centrality value, and the betweenness (intermediary) centrality value of each node in association with one another.
Optionally, the problem clue display module 1130 is specifically configured to:
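Storing the two centrality values in association with the node identifier, and reading them back by identifier, can be sketched with Python's built-in `sqlite3` module. The table name `node_features`, its column names, and the sample values are assumptions; the text does not say which storage technology the embodiment uses.

```python
import sqlite3

def store_features(conn, features):
    """Persist {node_id: (closeness, betweenness)} keyed by node identifier."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node_features ("
        "node_id TEXT PRIMARY KEY, closeness REAL, betweenness REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO node_features VALUES (?, ?, ?)",
        [(node, c, b) for node, (c, b) in features.items()],
    )
    conn.commit()

def load_features(conn, node_id):
    """Return (closeness, betweenness) for node_id, or None if it is not stored."""
    return conn.execute(
        "SELECT closeness, betweenness FROM node_features WHERE node_id = ?",
        (node_id,),
    ).fetchone()

conn = sqlite3.connect(":memory:")   # in-memory database, for illustration only
store_features(conn, {"A": (0.1, 5.0), "C": (0.125, 3.0)})
row = load_features(conn, "A")
```

Using the node identifier as the primary key means that recomputing the centrality values simply overwrites the earlier row for that node.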
comparing the characteristic value of each node with a preset threshold value, and determining a target node with the characteristic value larger than the preset threshold value;
and taking the target node and other nodes connected with the target node through edges as hidden problem clues, and displaying the hidden problem clues.
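The threshold comparison above can be sketched as follows. The graph, the feature values, and the threshold of 2 are illustrative assumptions, as is the function name `find_hidden_clues`; the text does not state how the preset threshold is chosen.

```python
def find_hidden_clues(graph, feature, threshold):
    """Map each target node (feature value > threshold) to the nodes joined to it by an edge."""
    clues = {}
    for node, value in feature.items():
        if value > threshold:
            clues[node] = sorted(graph[node])
    return clues

# Hypothetical graph and per-node feature values (for example, betweenness centrality).
graph = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A", "E"],
    "D": ["A"],
    "E": ["C"],
}
feature = {"A": 5.0, "B": 0.0, "C": 3.0, "D": 0.0, "E": 0.0}

clues = find_hidden_clues(graph, feature, threshold=2.0)
```

Each entry of `clues` pairs a target node with its directly connected nodes, which together form one hidden problem clue to be highlighted in the displayed graph.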
Optionally, the apparatus further comprises:
a feature value determining module, configured to query a preset data table according to the node identifier to obtain the feature value of the corresponding node before the feature value of each node is compared with the preset threshold, wherein the node identifier and the feature value are stored in association in the preset data table.
Optionally, the feature value determining module is further configured to, before the feature value of each node is compared with the preset threshold, obtain the feature value corresponding to each node in the user relationship knowledge graph, wherein the feature value is displayed in the node of the user relationship knowledge graph.
Optionally, the problem clue display module is further specifically configured to:
and cutting the user relation knowledge graph in response to the query information of the user, and displaying hidden problem clues in the cut user relation knowledge graph.
Optionally, the problem clue display module is further configured to:
acquiring employee identification input by a user;
selecting a corresponding first node from the user relation knowledge graph according to the employee identification, deleting a second node which is not connected with the first node in the user relation knowledge graph, and deleting edges between the second nodes;
and displaying hidden problem clues based on the deleted user relation knowledge graph.
Optionally, the problem clue display module is further configured to:
acquiring employee identifications of at least two persons to be inquired input by a user;
determining at least two corresponding fourth nodes from the user relationship knowledge graph according to the employee identification, deleting fifth nodes of the user relationship knowledge graph except the fourth nodes, and deleting edges between the fifth nodes;
and displaying hidden problem clues based on the deleted user relation knowledge graph.
Optionally, the apparatus further comprises:
a clue judging module, configured to, after the hidden problem clues are determined according to the feature values and displayed, push a hidden problem clue to a superior user of the user in response to a clue pushing request input by the user, so as to instruct the superior user to judge whether the hidden problem clue is valid; and, in response to a clue cancellation operation performed by the superior user on an invalid hidden problem clue, modify the display attribute of the edge or node corresponding to the invalid hidden problem clue.
The problem thread mining device provided by the embodiment of the invention can execute the problem thread mining method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Fig. 12 is a schematic structural diagram of a computer apparatus according to an embodiment of the present invention, as shown in fig. 12, the computer apparatus includes a processor 1210, a memory 1220, an input device 1230, and an output device 1240; the number of the processors 1210 in the computer device may be one or more, and one processor 1210 is taken as an example in fig. 12; the processor 1210, memory 1220, input device 1230 and output device 1240 in the computer apparatus may be connected by a bus or other means, as exemplified by a bus in fig. 12.
The memory 1220 is used as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the problem thread mining method in the embodiment of the present invention (for example, the knowledge map obtaining module 1110, the feature value calculating module 1120, and the problem thread displaying module 1130 in the problem thread mining apparatus). The processor 1210 executes software programs, instructions and modules stored in the memory 1220 to execute various functional applications of the computer device and data processing, i.e., to implement the problem cue mining method.
The memory 1220 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 1220 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 1220 can further include memory located remotely from the processor 1210, which can be connected to a computer device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 1230 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the computer apparatus. The output device 1240 may include a display device such as a display screen. The knowledge graph, the hidden problem clues, and the like are displayed through the output device 1240.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a problem thread mining method, the method comprising:
acquiring a user relation knowledge graph;
calculating characteristic values of corresponding nodes based on the incidence relation among the nodes in the user relation knowledge graph;
and determining hidden problem clues according to the characteristic values and displaying the hidden problem clues.
Of course, the storage medium provided by the embodiments of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also execute the relevant operations in the problem thread discovery method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the problem thread mining apparatus, the included units and modules are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (15)

1. A method for problem thread mining, comprising:
acquiring a user relation knowledge graph;
calculating characteristic values of corresponding nodes based on the incidence relation among the nodes in the user relation knowledge graph;
and determining hidden problem clues according to the characteristic values, and displaying the hidden problem clues.
2. The method of claim 1, further comprising, after determining hidden problem cues from the feature values:
and verifying the effectiveness of the hidden problem clues by adopting an abnormal relation rule.
3. The method of claim 1, prior to obtaining the user relationship knowledge-graph, further comprising:
acquiring a model construction data table and a relation detail table, wherein the model construction data table is used for storing entity information for constructing a knowledge graph, and the relation detail table is used for storing the relation between entities;
and constructing a user relation knowledge graph based on the model construction data table and the relation detail table by adopting a knowledge graph generation tool.
4. The method of claim 3, wherein the building a user relationship knowledge graph based on the model building data table and the relationship detail table using a knowledge graph generation tool comprises:
determining entities included in the model building data table;
determining the relationship between the entities according to the relationship detail table;
and determining a triple for constructing the knowledge graph according to the entity and the relation, and constructing the user relation knowledge graph based on the triple.
5. The method of claim 1, wherein calculating feature values of respective nodes based on incidence relations between nodes in the user relationship knowledge-graph comprises:
calculating the reciprocal of the distance average value of each node and other nodes in the user relationship knowledge graph, and taking the reciprocal as the compact centrality value of the corresponding node;
calculating the intermediary centrality value of the corresponding node according to the frequency of any node in the user relation knowledge graph serving as a bridge of the shortest path between the other two nodes and the number of the shortest paths between the corresponding two nodes;
and associating and storing the node identification, the tight centrality value and the intermediate centrality value of the node.
6. The method of claim 1, wherein determining hidden problem cues according to the feature values, and displaying the hidden problem cues comprises:
comparing the characteristic value of each node with a preset threshold value, and determining a target node of which the characteristic value is greater than the preset threshold value;
and taking the target node and other nodes connected with the target node through edges as hidden problem clues, and displaying the hidden problem clues.
7. The method according to claim 6, before comparing the characteristic value of each node with the preset threshold, further comprising:
and inquiring a preset data table according to the node identification to obtain the characteristic value of the corresponding node, wherein the node identification and the characteristic value are stored in the preset data table in an associated mode.
8. The method of claim 6, further comprising, before comparing the eigenvalue of each of the nodes with a preset threshold:
and acquiring a characteristic value corresponding to each node in the user relationship knowledge graph, wherein the characteristic value is displayed in the node of the user relationship knowledge graph.
9. The method of claim 1, wherein displaying the hidden problem clue comprises:
and cutting the user relation knowledge graph in response to the query information of the user, and displaying hidden problem clues in the cut user relation knowledge graph.
10. The method of claim 9, wherein the tailoring of the user relationship knowledge graph in response to the query information of the user, displaying hidden problem clues in the tailored user relationship knowledge graph, comprises:
acquiring employee identification input by a user;
selecting a corresponding first node from the user relation knowledge graph according to the employee identification, deleting a second node which is not connected with the first node in the user relation knowledge graph, and deleting edges between the second nodes;
and displaying hidden problem clues based on the deleted user relation knowledge graph.
11. The method of claim 9, wherein the tailoring of the user relationship knowledge graph in response to the query information of the user, displaying hidden problem clues in the tailored user relationship knowledge graph, comprises:
acquiring employee identifications of at least two persons to be inquired input by a user;
determining at least two corresponding fourth nodes from the user relationship knowledge graph according to the employee identification, deleting fifth nodes of the user relationship knowledge graph except the fourth nodes, and deleting edges between the fifth nodes;
and displaying hidden problem clues based on the deleted user relation knowledge graph.
12. The method of claim 1, wherein after determining hidden problem cues based on the feature values and displaying the hidden problem cues, further comprising:
responding to a clue pushing request input by a user, pushing the hidden problem clue to an upper-level user of the user so as to indicate the upper-level user to judge whether the hidden problem clue is effective or not;
and in response to the cancel clue operation of the upper-level user on the invalid hidden problem clue, modifying the display attribute of the edge or the node corresponding to the invalid hidden problem clue.
13. A problem thread mining device, comprising:
the knowledge graph acquisition module is used for acquiring a user relation knowledge graph;
the characteristic value calculation module is used for calculating the characteristic values of the corresponding nodes based on the incidence relation among the nodes in the user relation knowledge graph;
and the problem clue display module is used for determining hidden problem clues according to the characteristic values and displaying the hidden problem clues.
14. A computer device, characterized in that the computer device comprises:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the problem thread mining method of any of claims 1-12.
15. A storage medium containing computer-executable instructions for performing the problem thread mining method of any one of claims 1-12 when executed by a computer processor.
CN202111364231.7A 2021-11-17 2021-11-17 Problem thread mining method, device, computer equipment and medium Pending CN114090752A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111364231.7A CN114090752A (en) 2021-11-17 2021-11-17 Problem thread mining method, device, computer equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111364231.7A CN114090752A (en) 2021-11-17 2021-11-17 Problem thread mining method, device, computer equipment and medium

Publications (1)

Publication Number Publication Date
CN114090752A true CN114090752A (en) 2022-02-25

Family

ID=80301660

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111364231.7A Pending CN114090752A (en) 2021-11-17 2021-11-17 Problem thread mining method, device, computer equipment and medium

Country Status (1)

Country Link
CN (1) CN114090752A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106878174A (en) * 2017-03-21 2017-06-20 哈尔滨工程大学 Internet communication node influence power based on Betweenness Centrality finds method
CN107909178A (en) * 2017-08-31 2018-04-13 上海壹账通金融科技有限公司 Electronic device, lost contact repair rate Forecasting Methodology and computer-readable recording medium
CN109710701A (en) * 2018-12-14 2019-05-03 浪潮软件股份有限公司 A kind of automated construction method for public safety field big data knowledge mapping
CN110717824A (en) * 2019-10-17 2020-01-21 北京明略软件系统有限公司 Method and device for conducting and calculating risk of public and guest groups by bank based on knowledge graph
CN111309824A (en) * 2020-02-18 2020-06-19 中国工商银行股份有限公司 Entity relationship map display method and system
CN112182174A (en) * 2020-09-24 2021-01-05 南方电网数字电网研究院有限公司 Business question-answer knowledge query method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination