CN111506613A - Method, system, device and equipment for querying incidence relation of data record - Google Patents


Info

Publication number
CN111506613A
Authority
CN
China
Prior art keywords
node
record
query
subsystem
associated path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010321078.9A
Other languages
Chinese (zh)
Inventor
李启睿
杨程远
楼景华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010321078.9A priority Critical patent/CN111506613A/en
Publication of CN111506613A publication Critical patent/CN111506613A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/248Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Computing Systems (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method, system, device and equipment for querying the association relationship of data records are disclosed. The online storage subsystem obtains in advance the node records, edge records and associated path records that the offline computing subsystem has pre-computed. When association relationships need to be queried in batches, no real-time computation is required; the query is served directly from the online storage subsystem.

Description

Method, system, device and equipment for querying incidence relation of data record
Technical Field
The embodiment of the specification relates to the technical field of information, in particular to a method, a system, a device and equipment for querying an association relation of data records.
Background
In enterprise risk analysis and early-warning scenarios, there is often a need to identify the parties associated with an enterprise and then analyze the associated risks. For example, banks audit the credit qualifications of enterprises, the financial sector mines enterprise risk clues, and brokerages and sponsoring institutions perform due diligence and document review on companies preparing to list.
In such cases it may be necessary to query the association details between any M enterprises and N other enterprises, or among any N enterprises. As the number of node pairs in a batch query grows, the query efficiency of conventional approaches becomes lower and lower.
Based on this, the embodiments of the present specification provide a more efficient query scheme for association relationship of data records.
Disclosure of Invention
The embodiment of the application aims to provide a more efficient method for querying the association relation of the data records.
In order to solve the above technical problem, the embodiment of the present application is implemented as follows:
A method for querying the association relationship of data records comprises the following steps:
the offline computing subsystem computes node records each containing a node identifier, computes edge records each containing two node identifiers and a common feature of the two node identifiers, and computes associated path records each corresponding to a node pair, wherein an associated path record contains a plurality of node identifiers, at least one edge record exists between every two adjacent node identifiers, and a node pair contains two node identifiers;
the online storage subsystem acquires the computed node records, edge records and associated path records from the offline computing subsystem and stores them;
the service query subsystem determines a plurality of node pairs to be queried and sends query requests for the node pairs to the online storage subsystem;
the online storage subsystem receives the query request sent by the service query subsystem, queries, for each node pair, the associated path record corresponding to that node pair, generates a query result about the associated path record, and returns the query result to the service query subsystem;
and the service query subsystem receives the query result returned by the online storage subsystem and displays the query result.
Correspondingly, an embodiment of the present specification further provides a query system for association relationship of data records, including an offline computing subsystem, an online storage subsystem, and a service query subsystem, in which:
the offline computing subsystem computes node records each containing a node identifier, computes edge records each containing two node identifiers and a common feature of the two node identifiers, and computes associated path records each corresponding to a node pair, wherein an associated path record contains a plurality of node identifiers, at least one edge record exists between every two adjacent node identifiers, and a node pair contains two node identifiers;
the online storage subsystem acquires the computed node records, edge records and associated path records from the offline computing subsystem and stores them;
the service query subsystem determines a plurality of node pairs to be queried and sends query requests for the node pairs to the online storage subsystem;
the online storage subsystem receives the query request sent by the service query subsystem, queries, for each node pair, the associated path record corresponding to that node pair, generates a query result about the associated path record, and returns the query result to the service query subsystem;
and the service query subsystem receives the query result returned by the online storage subsystem and displays the query result.
Correspondingly, an embodiment of the present specification further provides a method for querying an association relationship of a data record, which is applied to an online storage subsystem, and the method includes:
acquiring the computed node records, edge records and associated path records from an offline computing subsystem, and storing the node records, the edge records and the associated path records;
receiving a query request sent by a service query subsystem, and querying, for each node pair in the request, the associated path record corresponding to that node pair;
generating a query result regarding the associated path record;
and returning the query result to the service query subsystem.
Correspondingly, an embodiment of the present specification further provides an apparatus for querying an association relationship of data records, which is applied to an online storage subsystem, and the apparatus includes:
a storage module, which acquires the computed node records, edge records and associated path records from the offline computing subsystem and stores the node records, the edge records and the associated path records;
a receiving module, which receives the query request sent by the service query subsystem and queries, for each node pair in the request, the associated path record corresponding to that node pair;
a generating module, which generates a query result about the associated path record;
and a return module, which returns the query result to the service query subsystem.
With the scheme provided by the embodiments of the present specification, the online storage subsystem obtains in advance the node records, edge records and associated path records that the offline computing subsystem has pre-computed. When association relationships need to be queried in batches, no real-time computation is required; the query is served directly from the online storage subsystem, which reduces query time and makes efficient batch querying of association relationships possible.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of embodiments of the invention.
In addition, any one of the embodiments in the present specification is not required to achieve all of the effects described above.
Drawings
In order to illustrate the embodiments of the present specification or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below show only some of the embodiments of the present specification, and those skilled in the art can obtain other drawings from them.
Fig. 1 is a schematic architecture diagram of a system according to an embodiment of the present specification;
Fig. 2 is a schematic flowchart of a method for querying an association relationship of data records according to an embodiment of the present specification;
Fig. 3 is a schematic flowchart of a method for querying an association relationship of data records, applied to the online storage subsystem, according to an embodiment of the present specification;
Fig. 4 is a schematic structural diagram of an apparatus for querying an association relationship of data records provided in an embodiment of the present specification;
Fig. 5 is a schematic structural diagram of a device for configuring the method according to an embodiment of the present specification.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present specification, the technical solutions in the embodiments of the present specification will be described in detail below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of protection.
In enterprise risk analysis and early-warning scenarios, there is often a need to identify the parties associated with an enterprise and then analyze the associated risks. For example, banks audit the credit qualifications of enterprises, the financial sector mines enterprise risk clues, and brokerages and sponsoring institutions perform due diligence and document review on companies preparing to list.
Querying the association between two individual enterprises (e.g., enterprise A and enterprise B) is simple, and many products on the market already do this. However, these query methods generally require real-time computation. For batch queries in particular, the server has to run the queries on multiple threads, which places heavy demands on the server. In high-concurrency scenarios the server usually limits the number of query threads, so both the performance and the functionality of online graph queries are limited in batch query scenarios.
Based on this, an embodiment of the present specification provides a query system for the association relationship of data records. Fig. 1 is a schematic architecture diagram of the system, which includes an offline computing subsystem, an online storage subsystem, and a service query subsystem. The functions of the subsystems are outlined as follows:
(1) The offline computing subsystem generates and stores the node records, the edge records and the associated path records through offline pre-computation. The data volume is very large and may exceed 10 billion.
(2) The online storage subsystem imports the node records, the edge records and the associated path records from the offline computing subsystem and stores them, so that it can serve queries from the service query subsystem.
(3) The service query subsystem receives the user's query request and forwards the request to the online storage subsystem.
The scheme provided in an embodiment of the present specification is described below based on the system shown in fig. 1. Fig. 2 is a schematic flowchart of a method for querying an association relationship of data records provided in an embodiment of the present specification, and the method includes:
S201, the offline computing subsystem computes node records each containing a node identifier, computes edge records each containing two node identifiers and a common feature of the two node identifiers, and computes associated path records corresponding to node pairs.
The contents of the node records and node identifiers differ from scenario to scenario. Taking association queries between enterprises as an example, a node record may include enterprise information such as the legal representative, shareholders, shareholding ratios, enterprise managers, and the names of other enterprises under the enterprise's control, and the node identifier uniquely corresponds to the enterprise.
For example, one form of node record may be as follows: {"id": "A", "name": "Company A", "information of Company A", ……}, so that the node record corresponding to the enterprise can be found directly from the node identifier "A". The "information of Company A" may include the aforementioned enterprise information, such as shareholding individuals and shareholding organizations.
Furthermore, based on the node records, the common features shared by the node records of any two node identifiers can be computed, so as to generate edge records each containing the two node identifiers and a common feature of the two node identifiers, wherein an edge record has a start point and an end point.
For example, assuming that company A and company B have the same legal representative, user "xxxx", an edge record of the following form can be generated: {"start_id": "A", "edge_name": "legal representative", "end_id": "B", "ext_info": "xxxx"}.
In an edge record, only one common feature can be included. If there are N common features in the node records corresponding to the two node identifiers, N edge records different from each other should be generated between the two node identifiers.
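The rule of one edge record per common feature can be sketched as follows. This is a minimal illustration, not the offline computing subsystem's actual implementation: the function name `build_edge_records`, the feature field names `legal_representative` and `shareholder`, and the pairwise comparison are assumptions made for the example. Comparing every pair of nodes is quadratic and would not scale to the data volumes mentioned above.

```python
from itertools import combinations


def build_edge_records(node_records, feature_keys):
    """Emit one edge record per feature value shared by two node records.

    Hypothetical helper: field names follow the edge record example in the
    text ("start_id", "edge_name", "end_id", "ext_info").
    """
    edges = []
    for a, b in combinations(node_records, 2):
        for key in feature_keys:
            value_a, value_b = a.get(key), b.get(key)
            # A feature counts as common only if both records carry the same value.
            if value_a is not None and value_a == value_b:
                edges.append({
                    "start_id": a["id"],
                    "edge_name": key,
                    "end_id": b["id"],
                    "ext_info": value_a,
                })
    return edges


nodes = [
    {"id": "A", "name": "Company A", "legal_representative": "xxxx", "shareholder": "yyyy"},
    {"id": "B", "name": "Company B", "legal_representative": "xxxx", "shareholder": "yyyy"},
    {"id": "C", "name": "Company C", "legal_representative": "zzzz"},
]
edges = build_edge_records(nodes, ["legal_representative", "shareholder"])
```

Here A and B share two features, so two distinct edge records are produced between them, while C shares nothing with either and gets no edge.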
Further, based on the computed edge records, an associated path record can be obtained for any node pair consisting of two node identifiers. The associated path record contains a plurality of node identifiers, and at least one edge record exists between every two adjacent node identifiers. An association path indicates that the two node identifiers of the pair have an association relationship, and that relationship is represented by the path.
For example, assuming that an edge record exists between node identifiers A and B, and an edge record exists between B and D, the associated path record between A and D may contain the following information: ["A,B,D"]. That is, each element of the associated path record only needs to contain a plurality of node identifiers, and does not need to contain other information.
In practical applications, a plurality of association paths may exist between two node identifiers. In this case the same associated path record may contain a plurality of elements, each corresponding to one association path. For example, the associated path record between A and D may take the following form: ["A,B,D", "A,C,D", "A,E,F,G,D"], i.e., there are three association paths between A and D. Obviously, within the same associated path record, every element has the same start node identifier and the same end node identifier, while the intermediate node identifiers differ, so as to avoid duplication.
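One way to derive such a record from edge records is a depth-limited search for simple paths. The sketch below is an assumption-laden illustration rather than the method the specification prescribes: the helper names, the treatment of edges as undirected, and the `max_edges` cut-off are all hypothetical.

```python
from collections import defaultdict


def build_adjacency(edge_records):
    """Index edge records by node identifier (edges treated as undirected here)."""
    adjacency = defaultdict(set)
    for edge in edge_records:
        adjacency[edge["start_id"]].add(edge["end_id"])
        adjacency[edge["end_id"]].add(edge["start_id"])
    return adjacency


def find_paths(adjacency, start, end, max_edges=4):
    """Return every simple path from start to end with at most max_edges edges."""
    paths = []

    def walk(current, visited):
        if current == end:
            paths.append(",".join(visited))
            return
        if len(visited) - 1 >= max_edges:
            return
        for neighbour in sorted(adjacency.get(current, ())):
            if neighbour not in visited:  # never revisit a node on the same path
                walk(neighbour, visited + [neighbour])

    walk(start, [start])
    return paths


pairs = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"),
         ("A", "E"), ("E", "F"), ("F", "G"), ("G", "D")]
edge_records = [{"start_id": s, "end_id": e} for s, e in pairs]
record_a_d = find_paths(build_adjacency(edge_records), "A", "D")
```

With these edges, `record_a_d` contains the three paths used in the example above, each stored as a comma-separated string of node identifiers.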
In practical applications, if an associated path record contains too many elements, a weight may be computed for each association path (for example, by assigning a weight to each node), and only the top N association paths by weight are kept.
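The specification does not say how node weights combine into a path weight. As one possible reading, the sketch below scores a path by the average weight of its nodes and keeps the N highest-scoring paths; the scoring rule, the function name `top_n_paths` and the sample weights are assumptions.

```python
import heapq


def top_n_paths(paths, node_weights, n):
    """Keep the n paths with the highest average node weight (assumed scoring rule)."""

    def path_weight(path):
        ids = path.split(",")
        return sum(node_weights.get(node_id, 0.0) for node_id in ids) / len(ids)

    return heapq.nlargest(n, paths, key=path_weight)


weights = {"A": 1.0, "B": 0.9, "C": 0.2, "D": 1.0, "E": 0.5, "F": 0.5, "G": 0.5}
kept = top_n_paths(["A,B,D", "A,C,D", "A,E,F,G,D"], weights, n=2)
```

Any other monotone scoring rule (sum, minimum, product) could be swapped into `path_weight` without changing the selection step.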
S203, the online storage subsystem acquires the computed node records, edge records and associated path records from the offline computing subsystem, and stores them.
In practical applications, a non-relational database such as HBase may be used to store the node records, edge records, and associated path records for the node pairs.
The records can be obtained by, for example, incremental updating. The full data set is imported first; thereafter the offline computing subsystem computes changes to the current node records, edge records and associated path records in real time, and only the changed records are updated. In scenarios such as enterprise information query, enterprise information is relatively stable, so the computed node records, edge records and associated path records change little over a short period, and incremental updating is suitable. In scenarios where the data records change rapidly, the volume of changed data is large and incremental updating is not suitable.
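The full import followed by incremental updates can be sketched with a plain dictionary standing in for the online store. The function names and the snapshot-diff approach are assumptions made for illustration; a real deployment would write to the non-relational database instead.

```python
def full_import(store, records):
    """Replace the store contents with a complete snapshot."""
    store.clear()
    store.update(records)


def apply_increment(store, previous, current):
    """Write only the rows that differ between two snapshots; return how many changed."""
    changed = 0
    for key, value in current.items():
        if previous.get(key) != value:  # new or modified row
            store[key] = value
            changed += 1
    for key in previous.keys() - current.keys():  # rows that disappeared
        store.pop(key, None)
        changed += 1
    return changed


store = {}
previous = {"A_B": ["A,C,B"], "A_D": ["A,B,D"]}
full_import(store, previous)
current = {"A_B": ["A,C,B", "A,E,B"], "A_D": ["A,B,D"]}
changed = apply_increment(store, previous, current)
```

Only the "A_B" row is rewritten here, which is the point of the incremental approach when most records are stable.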
A non-relational database is used because a single response from a relational database may take hundreds of milliseconds, whereas a single response from a non-relational database takes only a few milliseconds. In batch queries, a relational database cannot respond quickly enough.
Specifically, the node records are stored in a non-relational point table, in which the node identifier serves as the primary key. Table 1 is a point table provided in the embodiments of the present specification.
TABLE 1
Primary key | Data record
A | {"id": "A", "name": "Company A"}
B | {"id": "B", "name": "Company B"}
The edge records are stored in a non-relational edge table, in which the two node identifiers contained in an edge record serve as the primary key. Table 2 is an edge table provided in the embodiments of the present specification.
TABLE 2
[Table 2 appears as an image (Figure BDA0002461425270000071) in the original publication; its contents are not reproduced in this text.]
The associated path records are stored in a non-relational path table, in which the two node identifiers of the node pair of an associated path record serve as the primary key. Table 3 is a path table provided in the embodiments of the present specification.
TABLE 3
Primary key | Data record | Minimum degree of association
A_B | ["A,C,B","A,E,B","A,E,F,B"] | 2
A_D | ["A,B,D","A,C,D","A,E,F,G,D"] | 2
In practical applications, a degree of association may also be determined for each associated path record and stored in the path table. The degree of association represents the minimum number of node identifiers that must be traversed between two enterprises, and it is positively correlated with the number of node identifiers on the shortest association path in the associated path record.
For example, for the association path "A,C,B", the degree of association may be the number of edges it contains (i.e., the number of node identifiers minus 1). Obviously, the association paths within one associated path record may have different degrees of association. The minimum of these degrees may be used as the degree of association of the associated path record between the two node identifiers; the correspondence between this degree and the primary key of the associated path record is established and written into the path table, as shown in table 3.
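The calculation described above is small enough to state directly. In this sketch the function names and the row layout are assumptions; the rule itself (edges on a path, minimum over the record) follows the text.

```python
def path_degree(path):
    """Degree of one association path: its number of edges (node identifiers - 1)."""
    return len(path.split(",")) - 1


def record_degree(paths):
    """Degree of an associated path record: the minimum over its paths, 0 if empty."""
    return min((path_degree(path) for path in paths), default=0)


def build_path_row(node_a, node_b, paths):
    """Assemble a hypothetical path-table row keyed by the two node identifiers."""
    return {
        "key": f"{node_a}_{node_b}",
        "paths": paths,
        "min_degree": record_degree(paths),
    }


row = build_path_row("A", "B", ["A,C,B", "A,E,B", "A,E,F,B"])
```

The resulting row matches the A_B entry of table 3, with a minimum degree of association of 2.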
S205, the service query subsystem determines a plurality of node pairs to be queried and sends query requests for the node pairs to the online storage subsystem.
Although the scheme of the present application can also be used to query a single node pair, that is not its main purpose; it mainly addresses the efficiency of batch queries. For example, with conventional graph queries, the number of node pairs that can be queried simultaneously cannot exceed the number of threads the server can provide (e.g., 10), which is inefficient. The batch query in the present application has no such limitation.
Specifically, taking association queries between enterprises as an example, the node pairs in a batch query can be determined in several ways:
First, the associations between a single enterprise and N other enterprises are queried. For example, a user enters one origin enterprise A and N destination enterprises, asking for the associations between enterprise A and the other N enterprises. In this case N node pairs are generated, i.e., [<A, N1>, <A, N2>, ……, <A, Nn>].
Second, the associations among N enterprises are queried. In this case N × (N-1)/2 node pairs are generated. For example, a user who enters four enterprises A, B, C and D and asks for the associations among them produces 4 × (4-1)/2 = 6 node pairs: [<A, B>, <A, C>, <A, D>, <B, C>, <B, D>, <C, D>].
Third, the associations between N enterprises and M other enterprises are queried. Obviously, a total of N × M node pairs are generated in this case.
In summary, the service query subsystem may generate a plurality of node pairs for a batch query from the start node identifiers and end node identifiers entered by the user, where at least one of the two sides contains more than one identifier, and may then send query requests for the node pairs to the online storage subsystem. Alternatively, each of a plurality of node identifiers entered by the user can be taken in turn as a start node identifier, so that the corresponding groups of node pairs are determined.
For example, a plurality of node pairs may be sent in a single query request, but when the number of node pairs is large, the node pairs may also be sent in batches, so that the online storage subsystem queries in batches.
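The three ways of forming node pairs, and the splitting of a large request into batches, can be sketched with the standard library. The function names and the batch size are assumptions chosen for the example.

```python
from itertools import combinations, product


def one_to_many(origin, destinations):
    """Mode 1: one origin enterprise against N destinations gives N pairs."""
    return [(origin, destination) for destination in destinations]


def among(nodes):
    """Mode 2: all unordered pairs among N enterprises gives N * (N - 1) / 2 pairs."""
    return list(combinations(nodes, 2))


def many_to_many(origins, destinations):
    """Mode 3: N origins against M destinations gives N * M pairs."""
    return list(product(origins, destinations))


def in_batches(pairs, batch_size):
    """Split a long pair list into consecutive chunks, one per query request."""
    for start in range(0, len(pairs), batch_size):
        yield pairs[start:start + batch_size]


pairs = among(["A", "B", "C", "D"])
requests = list(in_batches(pairs, batch_size=4))
```

For the four enterprises A, B, C and D this yields the six pairs listed above, sent here as one request of four pairs and one of two.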
S207, the online storage subsystem receives the query request sent by the service query subsystem, queries, for each node pair, the associated path record corresponding to that node pair, generates a query result about the associated path record, and returns the query result to the service query subsystem.
The online storage subsystem can query the association paths from the non-relational database using the primary key formed from the two node identifiers contained in the node pair.
For example, for the node pair <A, B>, the primary key "A_B" containing node identifiers A and B may be looked up in table 3 to obtain the corresponding association paths "A,C,B", "A,E,B" and "A,E,F,B". A query result containing these three association paths is then generated and returned to the service query subsystem.
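Because each lookup is a single primary-key read, a batch query reduces to one key lookup per node pair. The sketch below uses a dictionary in place of the path table; the `row_key` helper and its rule of ordering the two identifiers alphabetically are assumptions, since the specification only shows keys of the form "A_B".

```python
PATH_TABLE = {
    "A_B": {"paths": ["A,C,B", "A,E,B", "A,E,F,B"], "min_degree": 2},
    "A_D": {"paths": ["A,B,D", "A,C,D", "A,E,F,G,D"], "min_degree": 2},
}


def row_key(node_a, node_b):
    """Build the primary key; identifiers are sorted so <B, A> also maps to "A_B" (assumed)."""
    first, second = sorted((node_a, node_b))
    return f"{first}_{second}"


def query_paths(path_table, pairs):
    """Look up the associated path record for every node pair in a batch."""
    results = {}
    for node_a, node_b in pairs:
        row = path_table.get(row_key(node_a, node_b))
        results[(node_a, node_b)] = row["paths"] if row else []
    return results


results = query_paths(PATH_TABLE, [("A", "B"), ("B", "C")])
```

A pair with no row in the path table simply yields an empty list, meaning no association path was pre-computed for it.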
In an embodiment, for the associated path record obtained by the query, the online storage subsystem queries the edge table to obtain the edge records corresponding to each pair of adjacent node identifiers on each path, queries the point table to obtain the node record corresponding to each node identifier in the associated path record, and generates a query result containing the node records, the edge records and the associated path record.
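A minimal sketch of this enrichment step is given below, with dictionaries standing in for the point table and edge table. The function name `enrich` and the edge-table key format "left_right" are assumptions, since the layout of the edge table is not reproduced in this text.

```python
def enrich(paths, point_table, edge_table):
    """Attach node records and edge records to an associated path record."""
    node_ids, edge_keys = [], []
    for path in paths:
        ids = path.split(",")
        for node_id in ids:
            if node_id not in node_ids:
                node_ids.append(node_id)
        # Each pair of adjacent identifiers on a path corresponds to one edge-table key.
        for left, right in zip(ids, ids[1:]):
            key = f"{left}_{right}"
            if key not in edge_keys:
                edge_keys.append(key)
    return {
        "paths": paths,
        "nodes": {node_id: point_table.get(node_id) for node_id in node_ids},
        "edges": {key: edge_table.get(key, []) for key in edge_keys},
    }


point_table = {
    "A": {"id": "A", "name": "Company A"},
    "B": {"id": "B", "name": "Company B"},
    "C": {"id": "C", "name": "Company C"},
}
edge_table = {
    "A_C": [{"start_id": "A", "edge_name": "legal representative",
             "end_id": "C", "ext_info": "xxxx"}],
}
result = enrich(["A,C,B"], point_table, edge_table)
```

Nodes and edge keys are deduplicated across paths, so a node shared by several association paths is read from the point table only once.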
Further, when the path table further includes the degree of association, a query result including the degree of association corresponding to the node pair may be generated, that is, for the node pair < a, B >, the minimum degree of association is 2, and the value is written into the query result and returned.
In one embodiment, the association paths may be omitted and only the degree of association returned to the service query subsystem. For example, when a batch query contains many node pairs (for example, the associations between a specific enterprise and 200 other enterprises are queried), there may be many association paths and each may be deep (i.e., contain many node identifiers), so the paths occupy a large amount of space, which increases storage and transmission time.
In this case, only the degree of association of each node pair may be returned to the service query subsystem (if there is no association relationship, the degree of association may be 0), so that the service query subsystem can present an overview based on the returned degrees. When details are needed, the user clicks a node pair that has an association relationship; the path table is then queried for that node pair to obtain all the nodes and edges on its association paths, and the edge table and point table are queried to obtain their detailed information.
In this embodiment, the online storage subsystem may set a threshold in advance; the threshold may relate to the number of node pairs in the batch query or to the size of the query result. When the number of node pairs in the batch query, or the size of the result to be returned, exceeds the specified threshold, the association paths are not returned and only the degree of association is returned to the service query subsystem. This avoids repeated queries of the edge table and the point table for such overview queries and further reduces their time consumption.
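The threshold rule can be sketched as a switch on the number of node pairs in the request. The function name, the default threshold of 100 pairs and the response layout are assumptions; a threshold on result size would follow the same pattern.

```python
def build_response(path_table, pairs, max_pairs=100):
    """Return full paths for small batches and only degrees of association for large ones."""
    degree_only = len(pairs) > max_pairs
    results = []
    for node_a, node_b in pairs:
        row = path_table.get(f"{node_a}_{node_b}")
        # A pair without a row has no association, reported here as degree 0.
        item = {"pair": (node_a, node_b), "degree": row["min_degree"] if row else 0}
        if not degree_only:
            item["paths"] = row["paths"] if row else []
        results.append(item)
    return {"degree_only": degree_only, "results": results}


table = {"A_B": {"paths": ["A,C,B", "A,E,B"], "min_degree": 2}}
small = build_response(table, [("A", "B")], max_pairs=1)
large = build_response(table, [("A", "B"), ("A", "X")], max_pairs=1)
```

In the degree-only branch the edge table and point table are never touched, which is where the time saving described above comes from.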
In an embodiment, the user may also specify a preset degree of association in the service query subsystem in advance. This tells the online storage subsystem that association paths whose degree of association exceeds the preset degree need not be returned. The online storage subsystem then determines the degree of association of the associated path record obtained by the query and generates a query result containing only the associated path records that do not exceed the preset degree of association.
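Applied per path, this filter is a one-line comparison. The sketch assumes the degree of a path is its number of edges, as in the earlier example, and the function name is hypothetical.

```python
def filter_by_degree(paths, preset_degree):
    """Keep only the association paths whose edge count does not exceed the preset degree."""
    return [path for path in paths if len(path.split(",")) - 1 <= preset_degree]


kept_paths = filter_by_degree(["A,B,D", "A,C,D", "A,E,F,G,D"], preset_degree=2)
```

With a preset degree of 2, the four-edge path through E, F and G is dropped from the result.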
S209, the service query subsystem receives the query result returned by the online storage subsystem and displays the query result.
Specifically, the service query subsystem may perform corresponding presentation based on the received query result.
When the returned query result contains specific associated path records, the service query subsystem can display the query result as a graph of nodes and edges. In the graph, the node identifiers in the associated path records correspond one-to-one to the nodes, and the edge records between adjacent node identifiers in the associated path records correspond one-to-one to the edges.
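Turning an associated path record into a graph for display amounts to collecting the distinct nodes and the distinct adjacent pairs. The sketch below is one assumed way of preparing that structure; how the graph is rendered is outside its scope, and the function name is hypothetical.

```python
def to_graph(paths):
    """Collect the distinct nodes and adjacent-node edges found in a list of paths."""
    nodes, edges = [], []
    for path in paths:
        ids = path.split(",")
        for node_id in ids:
            if node_id not in nodes:
                nodes.append(node_id)
        for pair in zip(ids, ids[1:]):
            if pair not in edges:
                edges.append(pair)
    return {"nodes": nodes, "edges": edges}


graph = to_graph(["A,C,B", "A,E,B", "A,E,F,B"])
```

The edge from A to E appears in two paths but is drawn once, so the displayed graph stays free of duplicates.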
The user can then select a node or an edge in the graph, which generates a confirmation instruction for the target object (edge or node) and a query request containing the node identifier corresponding to the target object. The online storage subsystem receives this query request, queries the node record and edge record corresponding to the node identifier, and returns the query result to the service query subsystem, so that the user can browse the information of each edge record and node record in detail.
Of course, if the returned result already includes the node records and the edge records, the service query subsystem can obtain the corresponding information locally and display it to the user directly.
When a batch query involves many node pairs, the returned query result may contain only the degree of association and not the associated path record. In that case, the service query subsystem may show the degree of association of the associated path corresponding to each node pair, so that the user gets a rough picture of how closely each pair is associated.
Furthermore, the user can confirm the node pairs of interest that have an association (for example, a degree of association greater than 0). The path table is then queried for those pairs to obtain all the points and edges on the associated path, and the edge table and the point table are queried to obtain the detailed information of those points and edges. This avoids a large number of useless queries and further increases the response speed.
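The two-stage batch query (degrees first, paths only for associated pairs) can be illustrated with the sketch below. The in-memory `path_table` dictionary, the helper names, and the convention that a missing pair has degree 0 are assumptions for illustration and are not taken from the specification.

```python
from typing import Dict, List, Tuple

NodePair = Tuple[str, str]

# Hypothetical path table keyed by node pair.
path_table: Dict[NodePair, dict] = {
    ("A", "C"): {"degree": 3, "node_ids": ["A", "B", "C"]},
    ("A", "Z"): {"degree": 0, "node_ids": []},
}


def batch_query_degrees(pairs: List[NodePair]) -> Dict[NodePair, int]:
    """Stage 1: return only the degree of association for each node pair."""
    # Assumption: a pair that is absent from the path table has degree 0.
    return {p: path_table.get(p, {}).get("degree", 0) for p in pairs}


def fetch_paths(pairs: List[NodePair]) -> Dict[NodePair, List[str]]:
    """Stage 2: fetch the full path only for the pairs the user is interested in."""
    return {p: path_table[p]["node_ids"] for p in pairs}


degrees = batch_query_degrees([("A", "C"), ("A", "Z"), ("B", "Q")])
associated = [p for p, d in degrees.items() if d > 0]
paths = fetch_paths(associated)
```

Only one of the three pairs has a degree greater than 0, so only one path lookup is made in the second stage.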
With the scheme provided by the embodiments of this specification, the online storage subsystem obtains in advance, from the offline calculation subsystem, the node records, edge records and associated path records that were pre-computed. When association relationships need to be queried in batches over a large-scale graph, no real-time computation is needed; the query can be served directly from the online storage subsystem, which reduces query latency and enables efficient batch queries of association relationships.
Correspondingly, an embodiment of the present specification further provides a query system for association relationship of data records, including an offline computing subsystem, an online storage subsystem, and a service query subsystem, in which:
the off-line calculation subsystem is used for calculating and acquiring a node record containing node identifications, calculating and acquiring an edge record containing two node identifications and common characteristics of the two node identifications, calculating and acquiring an associated path record corresponding to a node pair, wherein the associated path record contains a plurality of node identifications, at least one edge record exists between two adjacent node identifications, and the node pair contains two node identifications;
the online storage subsystem acquires the node record, the edge record and the associated path record obtained by calculation from the offline calculation subsystem, and stores the node record, the edge record and the associated path record;
the service query subsystem determines a plurality of node pairs to be queried and sends query requests aiming at the node pairs to the online storage subsystem;
the online storage subsystem receives the query request sent by the service query subsystem, queries and acquires the associated path record corresponding to any node pair aiming at the node pair, generates a query result about the associated path record, and returns the query result to the service query subsystem;
and the service inquiry subsystem receives the inquiry result returned by the online storage subsystem and displays the inquiry result.
Correspondingly, an embodiment of the present specification further provides a method for querying an association relationship of data records, which is applied to an online storage subsystem. As shown in fig. 3, which is a schematic flow diagram of the method for querying an association relationship of data records provided by the embodiment of the present specification, the method includes:
S301, acquiring node records, edge records and associated path records obtained by calculation from an offline calculation subsystem, and storing the node records, the edge records and the associated path records;
S303, receiving a query request sent by the service query subsystem, and querying and acquiring an associated path record corresponding to any node pair aiming at the node pair;
S305, generating a query result about the associated path record;
S307, returning the query result to the service query subsystem.
For a more specific application of the online storage subsystem, the detailed description has been given above, and will not be repeated here.
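Steps S301 to S307 on the online storage side might be sketched as below. The class `OnlineStorageSketch`, its in-memory dictionaries and its method names are hypothetical; the specification describes storage in a non-relational database, which is replaced here by plain dictionaries to keep the example self-contained.

```python
from typing import Dict, List, Optional, Tuple

NodePair = Tuple[str, str]


class OnlineStorageSketch:
    """In-memory stand-in for the online storage subsystem (illustrative only)."""

    def __init__(self) -> None:
        self.point_table: Dict[str, dict] = {}
        self.edge_table: Dict[NodePair, List[str]] = {}
        self.path_table: Dict[NodePair, List[str]] = {}

    def load(self, node_records, edge_records, path_records) -> None:
        # S301: take over the pre-computed records and store them.
        self.point_table.update(node_records)
        self.edge_table.update(edge_records)
        self.path_table.update(path_records)

    def query_paths(self, pairs: List[NodePair]) -> Dict[NodePair, Optional[List[str]]]:
        # S303: look up the associated path record for each requested node pair.
        return {pair: self.path_table.get(pair) for pair in pairs}

    def build_result(self, found: Dict[NodePair, Optional[List[str]]]) -> List[dict]:
        # S305: generate a query result about the associated path records.
        return [{"node_pair": pair, "path": path} for pair, path in found.items()]

    def handle_request(self, pairs: List[NodePair]) -> List[dict]:
        # S303 to S307: query, build the result, and return it to the caller.
        return self.build_result(self.query_paths(pairs))


store = OnlineStorageSketch()
store.load(
    node_records={"A": {}, "B": {}, "C": {}},
    edge_records={("A", "B"): ["same_device"], ("B", "C"): ["same_address"]},
    path_records={("A", "C"): ["A", "B", "C"]},
)
result = store.handle_request([("A", "C"), ("A", "Z")])
```

No path computation happens at request time in this sketch; every answer is a key lookup against records loaded beforehand, which is the point of the pre-computation described above.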
Correspondingly, an embodiment of the present specification further provides a query device for association relationship of data records, which is applied to an online storage subsystem, as shown in fig. 4, where fig. 4 is a schematic structural diagram of the query device for association relationship of data records provided in the embodiment of the present specification, and includes:
s401, a storage module acquires the node record, the edge record and the associated path record obtained by calculation from the off-line calculation subsystem and stores the node record, the edge record and the associated path record;
s403, the receiving module receives the query request sent by the service query subsystem, and queries and acquires the associated path record corresponding to any node pair aiming at the node pair;
s405, a generating module generates a query result about the associated path record;
s407, a return module returns the query result to the service query subsystem.
The embodiment of the present specification further provides a computer device, which at least includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the association relationship query method shown in fig. 3 when executing the program.
Fig. 5 is a schematic diagram illustrating a more specific hardware structure of a computing device according to an embodiment of the present disclosure, where the computing device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
Embodiments of the present specification further provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the association relationship query method shown in fig. 3.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present disclosure can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments of the present specification.
The systems, methods, modules or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in the present specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the device embodiment is substantially similar to the method embodiment, it is described relatively briefly, and the relevant points may be found in the corresponding description of the method embodiment. The above-described device embodiments are merely illustrative: the modules described as separate components may or may not be physically separate, and the functions of the modules may be implemented in one or more pieces of software and/or hardware when implementing the embodiments of the present specification. Some or all of the modules may also be selected according to actual needs to achieve the purpose of the scheme of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing describes only specific implementations of the embodiments of the present disclosure. It should be noted that those skilled in the art can make a number of improvements and refinements without departing from the principle of the embodiments of the present disclosure, and these improvements and refinements should also be regarded as falling within the protection scope of the embodiments of the present disclosure.

Claims (15)

1. A query method for incidence relation of data records comprises the following steps:
the off-line calculation subsystem is used for calculating and acquiring a node record containing node identifications, calculating and acquiring an edge record containing two node identifications and common characteristics of the two node identifications, calculating and acquiring an associated path record corresponding to a node pair, wherein the associated path record contains a plurality of node identifications, at least one edge record exists between two adjacent node identifications, and the node pair contains two node identifications;
the online storage subsystem acquires the node record, the edge record and the associated path record obtained by calculation from the offline calculation subsystem, and stores the node record, the edge record and the associated path record;
the service query subsystem determines a plurality of node pairs to be queried and sends query requests aiming at the node pairs to the online storage subsystem;
the online storage subsystem receives the query request sent by the service query subsystem, queries and acquires the associated path record corresponding to any node pair aiming at the node pair, generates a query result about the associated path record, and returns the query result to the service query subsystem;
and the service inquiry subsystem receives the inquiry result returned by the online storage subsystem and displays the inquiry result.
2. The method of claim 1, wherein the online storage subsystem stores the node record, edge record, and associated path record, comprising:
and the online storage subsystem stores the node records, the edge records and the associated path records of the node pairs by adopting a non-relational database.
3. The method of claim 2, wherein the online storage subsystem stores the node record, the edge record, and the associated path record for the node pair using a non-relational database, comprising:
the online storage subsystem stores the node records by adopting a non-relational point table, wherein the node identifiers are used as primary keys in the point table;
storing the edge records by adopting a non-relational edge table, wherein two node identifications contained in the edge records are used as primary keys in the edge table;
and storing the associated path record by adopting a non-relational path table, wherein two node identifications contained in the node pairs of the associated path record are used as primary keys in the path table.
4. The method of claim 3, wherein the online storage subsystem stores the associated path record using a non-relational path table, comprising:
the online storage subsystem determines the degree of association of any associated path record, wherein the degree of association is positively correlated with the number of node identifiers in the associated path;
establishing a correspondence between the degree of association and the primary key of the associated path record, and writing the correspondence into the path table;
correspondingly, the querying and obtaining the associated path record corresponding to the node pair includes: inquiring and acquiring the association degree of the node pair for judging whether the node pair has an association relation, if so, inquiring and acquiring an association path record corresponding to the node pair;
accordingly, generating query results for the associated path records includes: and generating a query result which contains the association degree and does not contain the association path record, or generating a query result which contains the association degree and the association path record.
5. The method of claim 4, generating query results that only contain the degree of association, comprising:
and judging whether the number of the inquired node pairs exceeds a threshold value or not, or judging whether the occupied space of the generated inquiry result exceeds the threshold value or not, and if so, generating the inquiry result which contains the association degree and does not contain the association path record.
6. The method of claim 4, the business query subsystem to present the query results, comprising: and displaying the query result containing the association degrees aiming at the plurality of node pairs to be queried.
7. The method of claim 4, wherein when the query request further includes a preset number of degrees of association, the online storage subsystem generates a query result about the associated path record, and the query result includes:
and determining the degree of association of the association path record obtained by query, and generating a query result containing the association path record not exceeding the preset degree of association.
8. The method of claim 3, wherein the online storage subsystem, generating query results containing the associated path records, comprises:
the online storage subsystem determines two adjacent node identifications of each group in the associated path record;
inquiring and acquiring edge records corresponding to each group of two adjacent node identifications from the edge table;
inquiring and acquiring a node record corresponding to each node identifier in the associated path record from the point table;
and generating a query result containing the node record, the edge record and the associated path record.
9. The method of claim 1, the business query subsystem to present the query results, comprising:
and displaying the query result in the form of a graph containing nodes and edges, wherein in the graph, node identifiers in the associated path records correspond to nodes in the graph one by one, and the edge records of two adjacent node identifiers in the associated path records correspond to edges in the graph one by one.
10. The method of claim 9, the service query subsystem, further comprising: receiving a confirmation instruction for a target object in a graph, and generating a query request containing a node identifier corresponding to the target object, wherein the target object contains an edge or a node;
correspondingly, the online storage subsystem receives the query request containing the node identifier corresponding to the target object, queries and acquires the node record and the edge record corresponding to the node identifier, and returns the query result to the service query subsystem.
11. The method of claim 1, wherein the on-line storage subsystem obtains the computed node record, edge record, and associated path record from the off-line computation subsystem, and comprises:
and the online storage subsystem acquires the node record, the edge record and the associated path record obtained by calculation from the offline calculation subsystem in an incremental updating mode.
12. A query system of incidence relation of data records comprises an offline computing subsystem, an online storage subsystem and a service query subsystem, wherein in the system:
the off-line calculation subsystem is used for calculating and acquiring a node record containing node identifications, calculating and acquiring an edge record containing two node identifications and common characteristics of the two node identifications, calculating and acquiring an associated path record corresponding to a node pair, wherein the associated path record contains a plurality of node identifications, at least one edge record exists between two adjacent node identifications, and the node pair contains two node identifications;
the online storage subsystem acquires the node record, the edge record and the associated path record obtained by calculation from the offline calculation subsystem, and stores the node record, the edge record and the associated path record;
the service query subsystem determines a plurality of node pairs to be queried and sends query requests aiming at the node pairs to the online storage subsystem;
the online storage subsystem receives the query request sent by the service query subsystem, queries and acquires the associated path record corresponding to any node pair aiming at the node pair, generates a query result about the associated path record, and returns the query result to the service query subsystem;
and the service inquiry subsystem receives the inquiry result returned by the online storage subsystem and displays the inquiry result.
13. A query method for incidence relation of data records is applied to an online storage subsystem, and comprises the following steps:
acquiring node records, edge records and associated path records obtained by calculation from an offline calculation subsystem, and storing the node records, the edge records and the associated path records;
receiving a query request sent by a service query subsystem, and querying and acquiring an associated path record corresponding to any node pair aiming at the node pair;
generating a query result regarding the associated path record;
and returning the query result to the service query subsystem.
14. An inquiry device for incidence relation of data records, which is applied to an online storage subsystem, and comprises:
the storage module is used for acquiring the node record, the edge record and the associated path record obtained by calculation from the off-line calculation subsystem and storing the node record, the edge record and the associated path record;
the receiving module is used for receiving the query request sent by the service query subsystem and querying and acquiring the associated path record corresponding to any node pair aiming at the node pair;
the generating module generates a query result about the associated path record;
and the return module returns the query result to the service query subsystem.
15. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of claim 13 when executing the program.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010321078.9A CN111506613A (en) 2020-04-22 2020-04-22 Method, system, device and equipment for querying incidence relation of data record

Publications (1)

Publication Number Publication Date
CN111506613A true CN111506613A (en) 2020-08-07

Family

ID=71871208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010321078.9A Pending CN111506613A (en) 2020-04-22 2020-04-22 Method, system, device and equipment for querying incidence relation of data record

Country Status (1)

Country Link
CN (1) CN111506613A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101458716A (en) * 2008-12-31 2009-06-17 北京大学 Shortcut searching method between nodes in chart
US20140006542A1 (en) * 2012-06-29 2014-01-02 William M Pitts Recursive ascent network link failure notifications
CN105893572A (en) * 2016-03-31 2016-08-24 北京奇艺世纪科技有限公司 Method, device and system for outputting target data
CN110688541A (en) * 2019-10-08 2020-01-14 中国建设银行股份有限公司 Report data query method and device, storage medium and electronic equipment
CN110765215A (en) * 2019-09-30 2020-02-07 深圳云天励飞技术有限公司 Query method and device for personnel common relationship, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117076465A (en) * 2023-10-16 2023-11-17 支付宝(杭州)信息技术有限公司 Data association query method and related equipment
CN117076465B (en) * 2023-10-16 2024-04-05 支付宝(杭州)信息技术有限公司 Data association query method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40034586; Country of ref document: HK)