WO2018228259A1

WO2018228259A1 - Relationship diagram processing method and apparatus

Info

Publication number: WO2018228259A1
Application number: PCT/CN2018/090178
Authority: WO
Inventors: 许凌志; 钱伟红; 张洪
Original assignee: 阿里巴巴集团控股有限公司
Priority date: 2017-06-16
Filing date: 2018-06-07
Publication date: 2018-12-20
Also published as: CN109145178A

Abstract

A relationship diagram processing method and apparatus. The method comprises: determining a plurality of core nodes in a relationship diagram to be simplified, the core nodes being nodes in the relationship diagram to be simplified or virtual nodes constituted by clusters in the relationship diagram (100); obtaining a plurality of association relationships between the core nodes (101); and calculating the similarity of the plurality of association relationships between the core nodes, and performing aggregation to obtain virtual association relationships between the core nodes, so as to use the obtained virtual association relationships as the relationships between the core nodes in the relationship diagram to be simplified (102). In the method, by using the similarity between indirect relationships of core nodes or clusters in a relationship diagram, similar relationships are combined, so as to abstract new virtual corresponding relationships, thereby simplifying the complex relationship diagram, and highlighting the backbone.

Description

Method and device for processing relationship diagram

The present application claims priority to a Chinese patent application filed on June 16, 2017.

Technical field

The present application relates to data processing technology, and more particularly to a relationship diagram processing method and apparatus.

Background technique

With the rapid expansion of Internet data, large and complex graph data, such as social networks, have emerged in many fields. To simplify the complex relationship diagrams of such graph data, the prior art starts from the nodes. Common methods are: aggregating nodes according to the similarity between them, thereby simplifying the complex relationship diagram; or pruning unrelated leaf nodes from the complex diagram.

Among them, the node-based aggregation method mainly merges relationships of the same data type to simplify the complex relationship diagram. On the one hand, because the simplification starts from the nodes and a complex diagram contains a large number of nodes, it is time-consuming and laborious to implement. On the other hand, merging only according to data type cannot reflect how two nodes are actually related.

Both of the above related technologies can make the core nodes or clusters in a complex relationship diagram more prominent, and they are particularly effective for graph data in which the indirect relationships between the core backbones are relatively simple. However, for relationship diagrams in which the relationships between the core backbones are complex, the simplification achieved by these methods is unsatisfactory.

Summary of the invention

To solve the above technical problem, the present application provides a method and a device for processing a relationship diagram, which can simplify a complex relationship diagram and highlight the associations between its backbone elements.

In order to achieve the purpose of the present application, the present application provides a relationship diagram processing method, including:

Determining a plurality of core nodes in the relationship diagram to be simplified, each core node being a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram;

Obtaining multiple association relationships between the core nodes;

Performing similarity calculation on the multiple association relationships between the core nodes and aggregating them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified.

Optionally, the method further includes: storing the correspondence between each aggregated virtual association relationship and its pre-aggregation association relationships.

Optionally, the method further includes: when an aggregated virtual association relationship is triggered, expanding the selected aggregated virtual association relationship according to the pre-aggregation association relationships corresponding to it.

Optionally, the expanding the selected aggregated virtual association relationship includes:

Reading all the pre-aggregation association relationships corresponding to the aggregated virtual association relationship, and displaying the read association relationship.

Optionally, performing similarity calculation on the multiple association relationships between the core nodes and aggregating them to obtain the virtual association relationships between the core nodes includes:

Calculating the similarity of the association relationships along different dimensions and performing aggregation to obtain the virtual association relationships.

Optionally, the different dimensions include any combination of the following: a time dimension, a relationship attribute dimension, and a behavior mode dimension.

The application also provides a relationship diagram processing device, including: a dividing module, an obtaining module, and an aggregation module; wherein

The dividing module is configured to determine a plurality of core nodes in the relationship diagram to be simplified, each core node being a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram;

The obtaining module is configured to acquire multiple association relationships between the core nodes;

The aggregation module is configured to perform similarity calculation on the multiple association relationships between the core nodes and aggregate them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified.

Optionally, the device further includes:

A storage module, configured to store the correspondence between each aggregated virtual association relationship and its pre-aggregation association relationships;

An expansion module, configured to, when an aggregated virtual association relationship is triggered, expand the selected aggregated virtual association relationship according to the pre-aggregation association relationships corresponding to it.

Optionally, the expansion module is specifically configured to: read all the pre-aggregation association relationships corresponding to the aggregated virtual association relationship, and display the read association relationships.

The present application further provides a relationship diagram processing apparatus, including a memory and a processor, wherein the memory stores executable instructions for: determining a plurality of core nodes in a relationship diagram to be simplified, each core node being a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram; acquiring multiple association relationships between the core nodes; and performing similarity calculation on the multiple association relationships between the core nodes and aggregating them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified.

The solution provided by the present application includes: determining a plurality of core nodes in a relationship diagram to be simplified, each core node being a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram; acquiring multiple association relationships between the core nodes; and performing similarity calculation on the multiple association relationships between the core nodes and aggregating them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified. The present application uses the similarity between the indirect relationships of the core nodes or clusters in the relationship diagram to merge similar relationships and abstract new virtual association relationships, thereby simplifying the complex relationship diagram and highlighting its backbone.

Further, the present application provides a method for expanding an aggregated independent track, which realizes local information expansion in the relationship diagram, so that the relationship between two nodes or between two clusters can be separated from the complex relationship diagram and analyzed more clearly and conveniently.

Other features and advantages of the present application will be set forth in the description that follows. The objectives and other advantages of the present application can be realized and obtained through the structures described herein.

Drawings

The drawings are used to provide a further understanding of the technical solutions of the present application, and constitute a part of the specification, which is used together with the embodiments of the present application to explain the technical solutions of the present application, and does not constitute a limitation of the technical solutions of the present application.

FIG. 1 is a flow chart of the relationship diagram processing method of the present application;

FIG. 2(a) is a schematic diagram of an embodiment in which a complex relationship diagram to be simplified is divided into a plurality of clusters in the present application;

FIG. 2(b) is a schematic diagram of an embodiment in which a complex relationship diagram to be simplified is divided into a plurality of core nodes in the present application;

FIG. 3 is a schematic diagram of an embodiment of aggregating similar independent tracks in the present application;

FIG. 4 is a schematic diagram of an embodiment of a simplified relationship diagram of the present application;

FIG. 5 is a schematic diagram of an embodiment in which the independent tracks in FIG. 4 are aggregated in the present application;

FIG. 6 is a schematic diagram of an embodiment in which an aggregated independent track in FIG. 5 is expanded in the present application;

FIG. 7 is a schematic structural diagram of a device for simplifying a relationship diagram in the present application.

Detailed description

In order to make the objects, technical solutions and advantages of the present application more clear, the embodiments of the present application will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments in the present application and the features in the embodiments may be arbitrarily combined with each other without conflict.

In a typical configuration of the present application, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include non-permanent memory in a computer readable medium, random access memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash memory. Memory is an example of a computer readable medium.

Computer readable media include both permanent and non-permanent, removable and non-removable media, and information storage can be implemented by any method or technology. The information can be computer readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic tape cartridges, magnetic tape storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory computer readable media, such as modulated data signals and carrier waves.

The steps illustrated in the flowchart of the figures may be executed in a computer system such as a set of computer executable instructions. Also, although logical sequences are shown in the flowcharts, in some cases the steps shown or described may be performed in a different order than the ones described herein.

When the graph data is relatively large, the core nodes or clusters shown in the complex relationship diagram are connected through multiple indirect relationships. To simplify the complex relationship diagram and highlight the associations between its backbone elements, the present application provides a relationship diagram processing method, as shown in FIG. 1, including the following steps:

Step 100: Determine a plurality of core nodes in the relationship diagram to be simplified.

Each core node is a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram.

In other words, this step divides the relationship diagram to be simplified into several feature areas. A feature area may be a small area formed around a core node, or may be one of several clusters. Specifically, either some core nodes are found in the relationship diagram to be simplified, or the relationship diagram to be simplified is divided into several clusters.

There are many methods for dividing a relationship diagram into several clusters, for example: community discovery methods such as LPA (Label Propagation Algorithm) and SLPA (Speaker-listener Label Propagation Algorithm); overlapping community discovery algorithms based on balanced multi-label propagation, such as BMLPA (Balanced Multi-Label Propagation Algorithm); and community partitioning algorithms such as Fast Unfolding.
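As an illustration of the cluster division mentioned above, the following is a minimal sketch of label propagation in Python. It is not the implementation of the present application: the function name `label_propagation`, the adjacency-dictionary representation, and the toy graph are assumptions made for the example, and ties are broken deterministically (largest label wins) instead of at random so that the result is reproducible.

```python
from collections import Counter


def label_propagation(adjacency, max_rounds=20):
    """Cluster an undirected graph by asynchronous label propagation."""
    labels = {node: node for node in adjacency}  # every node starts in its own cluster
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adjacency):
            counts = Counter(labels[n] for n in adjacency[node])
            if not counts:
                continue  # an isolated node keeps its own label
            best = max(counts.values())
            # LPA usually breaks ties at random; taking the largest label is an
            # assumption made here so that the sketch is reproducible.
            new_label = max(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    clusters = {}
    for node, label in labels.items():
        clusters.setdefault(label, set()).add(node)
    return list(clusters.values())


# Hypothetical toy graph: two triangles joined by the single edge 3-4.
graph = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
clusters = label_propagation(graph)
print(sorted(sorted(c) for c in clusters))
```

On this toy graph the two triangles end up as two clusters, each of which could then be treated as one virtual core node.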

Methods for calculating the core nodes include, but are not limited to, PageRank and k-core. PageRank is an algorithm invented by Google to evaluate the importance of web pages; the same principle can be used to judge the centrality of nodes in a relationship network. k-core is another algorithm for evaluating the centrality of nodes in a network.
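To make the core node calculation concrete, the following sketch ranks the nodes of a small undirected graph with a plain power-iteration PageRank and keeps the highest-ranked nodes as core nodes. The damping factor of 0.85, the iteration count, the helper name `top_core_nodes`, and the toy graph are assumptions chosen for illustration, not values taken from the present application.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Power-iteration PageRank on an adjacency dictionary."""
    nodes = list(adjacency)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                # A node without neighbors spreads its rank evenly over all nodes.
                for m in nodes:
                    new_rank[m] += damping * rank[n] / len(nodes)
                continue
            share = damping * rank[n] / len(neighbors)
            for m in neighbors:
                new_rank[m] += share
        rank = new_rank
    return rank


def top_core_nodes(adjacency, k):
    """Return the k nodes with the highest PageRank (ties broken by name)."""
    rank = pagerank(adjacency)
    return sorted(rank, key=lambda n: (-rank[n], n))[:k]


# Hypothetical graph: two hubs "A" and "B", each with a few leaf nodes.
graph = {
    "A": {"a1", "a2", "a3", "B"},
    "B": {"b1", "b2", "A"},
    "a1": {"A"}, "a2": {"A"}, "a3": {"A"},
    "b1": {"B"}, "b2": {"B"},
}
core_nodes = top_core_nodes(graph, 2)
print(core_nodes)
```

A k-core decomposition could be substituted for `pagerank` without changing the rest of the flow, since only the ranking of nodes is used.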

As shown in FIG. 2(a), the relationship diagram to be simplified is divided into three areas: cluster A, cluster B, and cluster C. FIG. 2(b) shows core node A, core node B, and core node C calculated from the relationship diagram to be simplified.

Step 101: Acquire multiple association relationships between core nodes.

As shown in FIG. 2(a) or FIG. 2(b), taking A and B as an example, paths between core node A and core node B, or between cluster A and cluster B, that have no common internal vertices are called independent tracks. An independent track expresses the relationship between two core nodes or clusters and describes how the two core backbones are related, that is, the relationship information. In FIG. 2(a) and FIG. 2(b), the line segments drawn as double lines are independent tracks between clusters or core nodes. The number of nodes included in an independent track is called the degree of the independent track, denoted by N. In practical applications, attention is usually paid to independent tracks that express closer relationships, that is, independent tracks of low degree, such as N<=2. The value of N depends on the application scenario. For example, in a traveling-together scenario, experience suggests that N=2 is appropriate: this covers the case where two people share a car (N=1) and also the case where they arrive at the same place at the same time on different trains (N=2). As another example, in a living-together scenario, a smaller value such as N=1 is more appropriate, because living together usually means staying at one place, such as a hotel.

Graph theory offers many algorithms for obtaining independent tracks. The specific algorithm does not limit the protection scope of the present application, and details are not described herein.
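Purely as an illustration of one possible approach, the sketch below enumerates the simple paths between two core nodes whose number of intermediate nodes does not exceed N, and then greedily keeps paths that share no internal vertex. It assumes that the degree N counts only the intermediate nodes (which matches the shared-car example with N=1), that direct edges are not treated as independent tracks, and that a greedy selection is acceptable; it does not guarantee the maximum number of disjoint paths. All names and the toy graph are hypothetical.

```python
def simple_paths(adjacency, source, target, max_inner):
    """Yield simple paths from source to target with at most max_inner intermediate nodes."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        for nxt in sorted(adjacency[node], reverse=True):
            if nxt == target:
                yield path + [target]
            elif nxt not in path and len(path) - 1 < max_inner:
                stack.append((nxt, path + [nxt]))


def independent_tracks(adjacency, source, target, max_inner=2):
    """Greedily select paths that have no common internal vertices."""
    used = set()
    tracks = []
    candidates = sorted(simple_paths(adjacency, source, target, max_inner),
                        key=lambda p: (len(p), p))  # shortest paths first
    for path in candidates:
        inner = set(path[1:-1])
        # Skip direct edges (no intermediate node) and paths reusing a vertex.
        if inner and not (inner & used):
            tracks.append(path)
            used |= inner
    return tracks


# Hypothetical graph: A and B share a car, and are also linked through a station.
graph = {
    "A": {"car", "t1", "t2"},
    "B": {"car", "station"},
    "car": {"A", "B"},
    "t1": {"A", "station"},
    "t2": {"A", "station"},
    "station": {"t1", "t2", "B"},
}
tracks_n2 = independent_tracks(graph, "A", "B", max_inner=2)
tracks_n1 = independent_tracks(graph, "A", "B", max_inner=1)
print(tracks_n2)
print(tracks_n1)
```

With N=2 the sketch keeps the shared-car path and one path through the station; the second path through the station is dropped because it reuses the same internal vertex.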

Step 102: Perform similarity calculation on the multiple association relationships between the core nodes and aggregate them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified.

In this step, the similarity of the association relationships can be calculated, and the relationships aggregated, along different dimensions: for example, the time dimension, where the similarity is occurring in the same time period; and/or the relationship attribute dimension, where the similarity is having the same attribute; and/or the behavior mode dimension, where the similarity is traveling together.
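The three dimensions can be pictured with the following sketch, which reports the dimensions along which two independent tracks are similar. The `Track` fields, the 30-minute time window, and the rule that "traveling together" means the same flight or train number are assumptions made for illustration; as noted in this description, the actual strategy depends on the business scenario.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Track:
    source: str
    target: str
    attribute: Optional[str] = None  # e.g. a device identifier (hypothetical field)
    vehicle: Optional[str] = None    # e.g. a flight or train number (hypothetical field)
    when: Optional[datetime] = None  # time of the event (hypothetical field)


def shared_dimensions(a, b, window=timedelta(minutes=30)):
    """Return the set of dimensions along which two tracks are similar."""
    dims = set()
    if a.when and b.when and abs(a.when - b.when) <= window:
        dims.add("time")       # same time period
    if a.attribute and a.attribute == b.attribute:
        dims.add("attribute")  # same attribute
    if a.vehicle and a.vehicle == b.vehicle:
        dims.add("behavior")   # traveling together
    return dims


# Example values loosely follow FIG. 3; the year is an assumption.
g124 = Track("cluster A", "cluster B", vehicle="G124", when=datetime(2016, 3, 15, 14, 45))
mu1122 = Track("cluster A", "cluster B", vehicle="MU1122", when=datetime(2016, 3, 15, 14, 45))
ca_15 = Track("cluster A", "cluster B", vehicle="CA1232", when=datetime(2016, 3, 15, 14, 45))
ca_18 = Track("cluster A", "cluster B", vehicle="CA1232", when=datetime(2016, 3, 18, 14, 45))

print(shared_dimensions(g124, mu1122))
print(shared_dimensions(ca_15, ca_18))
```

Returning a set of dimensions, instead of a single score, leaves the choice of which dimensions to aggregate on to the caller.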

The similarity of the association relationships may be determined according to the relationship types of the independent tracks that express them. For example, two independent tracks between core node A and core node B may both be travel related. It should be noted that similarity is judged differently in different business scenarios, depending on the strategy adopted. The examples here are intended only to make the technical solutions of the present application easy to understand; the specific strategy does not limit the protection scope of the present application, and details are not described herein.

In the present application, aggregation may also be performed according to a combination of the similarities calculated in the three dimensions, for example indirect relationships in the same time period with the same attribute, or indirect relationships in the same time period with traveling together.

FIG. 3 is a schematic diagram of an embodiment of aggregating similar independent tracks. As shown in FIG. 3, from top to bottom, the similarities illustrated are the same attribute, traveling together, and the same time period.

For example, as shown in the first merge mode in FIG. 3, assume that there are two independent tracks between core node A and core node B. One independent track indicates that the device with IMEI #3eedf3ed was used to log in with QQ number 3443223 between core node A and core node B; the other indicates that the same device was used to log in with QQ number 2222222 between core node A and core node B. The two independent tracks share the same-attribute similarity of logging in on the same device, so they are aggregated into a single same-IMEI independent track.

As another example, as shown in the second merge mode in FIG. 3, assume that there are two independent tracks between cluster A and cluster B. One independent track indicates arrival on flight CA1232 at around 14:45 on March 15, 2016; the other indicates arrival on flight CA1232 at around 14:45 on March 18, 2016. The two independent tracks share the traveling-together similarity of taking the same flight, so they are aggregated into a single same-flight independent track.

As another example, as shown in the third merge mode in FIG. 3, assume that there are two independent tracks between cluster A and cluster B. One independent track indicates arrival on flight G124 at around 14:45 on March 15, 2016; the other indicates arrival on flight MU1122 at around 14:45 on March 15, 2016. The two independent tracks share the same-time-period similarity of arriving at the same time, so they are aggregated into a single same-arrival-time independent track.

When there are many similar relationships between core nodes or clusters, as in FIG. 3, the aggregation provided by the present application greatly simplifies the indirect relationships between them, so that the connections between the core nodes or clusters become clearer.
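The merging of similar independent tracks can be sketched as a grouping operation: tracks that connect the same pair of core nodes and have equal values on all chosen dimensions fall into one group, and each group becomes one virtual association relationship. The dictionary keys (`imei`, `qq`), the function name `aggregate_tracks`, and the third track are hypothetical and used only for illustration.

```python
from collections import defaultdict


def aggregate_tracks(tracks, dimensions):
    """Group tracks that share endpoints and agree on every chosen dimension."""
    groups = defaultdict(list)
    for track in tracks:
        endpoints = tuple(sorted((track["source"], track["target"])))
        signature = tuple(track.get(d) for d in dimensions)
        groups[(endpoints, signature)].append(track)
    return [
        {"endpoints": e, "similarity": dict(zip(dimensions, s)), "members": m}
        for (e, s), m in groups.items()
    ]


tracks = [
    # The first two tracks follow the first merge mode of FIG. 3.
    {"source": "A", "target": "B", "imei": "#3eedf3ed", "qq": "3443223"},
    {"source": "A", "target": "B", "imei": "#3eedf3ed", "qq": "2222222"},
    # Hypothetical third track on a different device.
    {"source": "A", "target": "B", "imei": "#00000000", "qq": "1111111"},
]
virtual = aggregate_tracks(tracks, ["imei"])
for edge in virtual:
    print(edge["endpoints"], edge["similarity"], len(edge["members"]))
```

Passing several names in `dimensions` corresponds to aggregating on a combination of similarities; three tracks are reduced here to two virtual relationships, the first of which groups the two logins on the same device.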

Optionally, the method of the present application further includes:

Storing the correspondence between each aggregated virtual association relationship (for example, an aggregated independent track) and the similar pre-aggregation association relationships (for example, the connection relationships represented by the original independent tracks). Each edge generated by aggregation, that is, each aggregated independent track, contains a drilldown field in which the similar independent tracks it contains are stored.
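One way to picture the drilldown field is as a list carried by each aggregated edge record. The sketch below stores such a record as JSON and reads it back. Only the field name `drilldown` comes from the description above; the class `AggregatedEdge`, the other field names, and the JSON encoding are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AggregatedEdge:
    source: str
    target: str
    label: str
    drilldown: list = field(default_factory=list)  # pre-aggregation independent tracks


edge = AggregatedEdge("cluster A", "cluster B", "same flight CA1232")
edge.drilldown.append({"flight": "CA1232", "arrival": "March 15, about 14:45"})
edge.drilldown.append({"flight": "CA1232", "arrival": "March 18, about 14:45"})

record = json.dumps(asdict(edge))                 # what would be written to storage
restored = AggregatedEdge(**json.loads(record))   # what would be read back later
print(restored.label, len(restored.drilldown))
```

Keeping the original tracks inside the aggregated edge means the correspondence travels with the edge and no separate lookup table is required.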

After the relationship diagram has been simplified according to the similarity between independent tracks, the relationships between the core nodes are displayed clearly. However, during graph analysis it may still be necessary to view the detailed associations between two core nodes. Optionally, the method of the present application further includes:

According to the stored correspondence, the selected aggregated virtual association relationship, for example an aggregated independent track, is expanded. For example, when an aggregated edge is triggered, that is, when the user clicks an aggregated independent track to see which independent tracks it contains, all the sub-independent tracks under the drilldown field of that aggregated independent track are read and then displayed.
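The expansion step can be sketched as follows, under the assumption that the triggered aggregated edge is replaced in the displayed edge list by the sub-independent tracks read from its drilldown field, while all other edges stay aggregated. The `id` and `label` keys, the function name `expand`, and the sample edges are hypothetical.

```python
def expand(visible_edges, edge_id):
    """Return the edges to display after the aggregated edge edge_id is triggered."""
    result = []
    for edge in visible_edges:
        if edge["id"] == edge_id and edge.get("drilldown"):
            result.extend(edge["drilldown"])  # show the sub-independent tracks
        else:
            result.append(edge)               # other edges stay as they are
    return result


visible = [
    {"id": "A-B", "label": "same time period", "drilldown": [
        {"id": "A-B/1", "label": "first sub-independent track"},
        {"id": "A-B/2", "label": "second sub-independent track"},
    ]},
    {"id": "B-C", "label": "same time period", "drilldown": [
        {"id": "B-C/1", "label": "only sub-independent track"},
    ]},
]
after_click = expand(visible, "A-B")
print([edge["id"] for edge in after_click])
```

Because `expand` returns a new list and leaves `visible` unchanged, the aggregated view remains available for collapsing the edge again.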

The present application uses the similarity between the indirect relationships of the core nodes or clusters in the relationship diagram to merge similar relationships and abstract new virtual association relationships, thereby simplifying the complex relationship diagram and highlighting its backbone.

Further, the present application also provides a method for expanding an aggregated independent track, which realizes local information expansion in the relationship diagram, so that the relationship between two core nodes, such as A City and B City, can be separated from the complex relationship diagram and analyzed more clearly and conveniently.

The aggregation and expansion of independent tracks in the present application are described below in conjunction with an embodiment.

FIG. 4 is a schematic diagram of an embodiment of a simplified relationship diagram of the present application. As shown in FIG. 4, assume that the feature areas into which the complex relationship diagram is divided include three core nodes: A City, B City, and C City. Complex indirect relationships are formed between the three core nodes by train. Taking the relationship indicated by the thick solid line in FIG. 4 as an example, A City and B City are related through train HB4540 (A) and train HB1590 (B) in the same time period. Between A City and B City there are also other similar independent tracks involving different trains in the same time period.

FIG. 5 is a schematic diagram of an embodiment in which the independent tracks in FIG. 4 are aggregated in the present application. In this embodiment, the complex indirect relationships are abstracted into a same-time-period travel relationship; as shown in FIG. 5, the relationships between the core nodes become simple and clear. Here, the correspondence between each aggregated independent track and the similar pre-aggregation independent tracks is stored. As shown in FIG. 5, the dotted line between A City and B City is the indirect relationship edge obtained by merging the similar independent tracks. The data field corresponding to this edge describes all the sub-independent tracks it contains through the drilldown field. The stored correspondences include a first correspondence, between A City and B City, and a second correspondence, between another pair of cities.

FIG. 6 is a schematic diagram of an embodiment in which an aggregated independent track in FIG. 5 is expanded in the present application. Assume that the association relationship between A City and B City needs to be expanded. As shown in FIG. 6, the independent tracks that were aggregated are displayed in the diagram according to the first correspondence. In this way, local information expansion in the relationship diagram is realized, so that the relationship between the two core nodes A City and B City can be separated from the complex relationship diagram and analyzed more clearly and conveniently.

The present application further provides a relationship diagram processing apparatus, including at least a memory and a processor, wherein the memory stores executable instructions for: determining a plurality of core nodes in a relationship diagram to be simplified, each core node being a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram; acquiring multiple association relationships between the core nodes; and performing similarity calculation on the multiple association relationships between the core nodes and aggregating them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified.

FIG. 7 is a schematic structural diagram of a device for simplifying a relationship diagram in the present application. As shown in FIG. 7, the device includes at least a dividing module, an obtaining module, and an aggregation module.

Optionally, the aggregation module is specifically configured to: calculate the similarity of the association relationships along different dimensions, and perform aggregation to obtain the virtual association relationships.

Further, the device of the present application further includes: a storage module, configured to store the correspondence between each aggregated virtual association relationship and its pre-aggregation association relationships.
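The cooperation of the dividing, obtaining, aggregation, and storage modules can be pictured with the following sketch, in which each module is a replaceable callable and the storage is a dictionary keyed by virtual edge. The class `RelationshipDiagramDevice`, its method names, the toy strategies, and train HB0000 are assumptions for illustration only; the `expand` method corresponds to the expansion module described in the summary.

```python
class RelationshipDiagramDevice:
    def __init__(self, divide, obtain, aggregate):
        self.divide = divide        # dividing module: graph -> core nodes
        self.obtain = obtain        # obtaining module: (graph, a, b) -> relationships
        self.aggregate = aggregate  # aggregation module: relationships -> {label: members}
        self.store = {}             # storage module: virtual edge -> pre-aggregation relationships

    def simplify(self, graph):
        core_nodes = self.divide(graph)
        virtual_edges = []
        for i, a in enumerate(core_nodes):
            for b in core_nodes[i + 1:]:
                relations = self.obtain(graph, a, b)
                for label, members in self.aggregate(relations).items():
                    edge_id = (a, b, label)
                    self.store[edge_id] = members
                    virtual_edges.append(edge_id)
        return virtual_edges

    def expand(self, edge_id):
        """Expansion module: read back the relationships behind a virtual edge."""
        return self.store[edge_id]


def divide(graph):
    return graph["core"]


def obtain(graph, a, b):
    return [r for r in graph["relations"] if {r[0], r[1]} == {a, b}]


def group_by_type(relations):
    groups = {}
    for relation in relations:
        groups.setdefault(relation[2], []).append(relation)
    return groups


graph = {
    "core": ["A City", "B City", "C City"],
    "relations": [
        ("A City", "B City", "travel", "train HB4540"),
        ("A City", "B City", "travel", "train HB1590"),
        ("B City", "C City", "travel", "train HB0000"),  # hypothetical train
    ],
}
device = RelationshipDiagramDevice(divide, obtain, group_by_type)
edges = device.simplify(graph)
print(edges)
print(device.expand(("A City", "B City", "travel")))
```

Injecting the three strategies as callables reflects the statement that the specific clustering, path-finding, and similarity strategies are interchangeable.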

Further, the device of the present application further includes: an expansion module, configured to, when an aggregated virtual association relationship is triggered, expand the selected aggregated virtual association relationship according to the pre-aggregation association relationships corresponding to it.

The present application uses the similarity between the indirect relationships of the core nodes or clusters in the relationship diagram to merge similar relationships and abstract new virtual association relationships, thereby simplifying the complex relationship diagram and highlighting its backbone.

Further, the present application also provides a solution for expanding the aggregated relationship information, which realizes local information expansion in the relationship diagram, so that the relationship between two core nodes, such as A City and B City, can be separated from the complex relationship diagram and analyzed more clearly and conveniently.

The embodiments disclosed in the present application are as described above, but the description is provided only to facilitate understanding of the present application and is not intended to limit it. Those skilled in the art may make any modifications and changes in the form and details of the embodiments without departing from the spirit and scope disclosed in the present application, but the scope of protection of the present application shall still be subject to the scope defined by the appended claims.

Claims

1. A method for processing a relationship diagram, comprising:

Determining a plurality of core nodes in the relationship diagram to be simplified, each core node being a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram;

Obtaining multiple association relationships between the core nodes;

Performing similarity calculation on the multiple association relationships between the core nodes and aggregating them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified.

2. The method for processing a relationship diagram according to claim 1, wherein the method further comprises: storing the correspondence between each aggregated virtual association relationship and its pre-aggregation association relationships.

3. The method for processing a relationship diagram according to claim 2, wherein the method further comprises: when an aggregated virtual association relationship is triggered, expanding the selected aggregated virtual association relationship according to the pre-aggregation association relationships corresponding to it.

4. The method for processing a relationship diagram according to claim 3, wherein expanding the selected aggregated virtual association relationship comprises:

Reading all the pre-aggregation association relationships corresponding to the aggregated virtual association relationship, and displaying the read association relationships.

5. The method for processing a relationship diagram according to claim 1, 2 or 3, wherein performing similarity calculation on the multiple association relationships between the core nodes and aggregating them to obtain the virtual association relationships between the core nodes comprises:

Calculating the similarity of the association relationships along different dimensions and performing aggregation to obtain the virtual association relationships.

6. The method for processing a relationship diagram according to claim 5, wherein the different dimensions comprise any combination of the following: a time dimension, a relationship attribute dimension, and a behavior mode dimension.

7. A relationship diagram processing device, comprising: a dividing module, an obtaining module, and an aggregation module; wherein

The dividing module is configured to determine a plurality of core nodes in the relationship diagram to be simplified, each core node being a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram;

The obtaining module is configured to acquire multiple association relationships between the core nodes;

The aggregation module is configured to perform similarity calculation on the multiple association relationships between the core nodes and aggregate them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified.

8. The relationship diagram processing device according to claim 7, wherein the device further comprises:

A storage module, configured to store the correspondence between each aggregated virtual association relationship and its pre-aggregation association relationships;

An expansion module, configured to, when an aggregated virtual association relationship is triggered, expand the selected aggregated virtual association relationship according to the pre-aggregation association relationships corresponding to it.

9. The relationship diagram processing device according to claim 8, wherein the expansion module is specifically configured to: read all the pre-aggregation association relationships corresponding to the aggregated virtual association relationship, and display the read association relationships.

10. A relationship diagram processing apparatus, comprising a memory and a processor, wherein the memory stores executable instructions for: determining a plurality of core nodes in a relationship diagram to be simplified, each core node being a node in the relationship diagram to be simplified or a virtual node formed by a cluster in the relationship diagram; acquiring multiple association relationships between the core nodes; and performing similarity calculation on the multiple association relationships between the core nodes and aggregating them to obtain virtual association relationships between the core nodes, so that the obtained virtual association relationships are used as the relationships between the core nodes in the relationship diagram to be simplified.