CN116932846A

CN116932846A - Method and device for determining data asset, storage medium and electronic device

Info

Publication number: CN116932846A
Application number: CN202310794220.5A
Authority: CN
Inventors: 王龙龙; 孙能林
Original assignee: Qingdao Haier Technology Co Ltd; Haier Smart Home Co Ltd; Haier Uplus Intelligent Technology Beijing Co Ltd
Current assignee: Qingdao Haier Technology Co Ltd; Haier Smart Home Co Ltd; Haier Uplus Intelligent Technology Beijing Co Ltd
Priority date: 2023-06-29
Filing date: 2023-06-29
Publication date: 2023-10-24

Abstract

The application discloses a method and a device for determining data assets, a storage medium and an electronic device, and relates to the technical field of smart homes. The method for determining data assets comprises the following steps: determining a first graph corresponding to different graph data stored in a graph database, wherein the first graph indicates the association relationships between nodes corresponding to the different graph data; preprocessing the first graph to generate a second graph when it is determined that a cyclic subgraph exists in the first graph; when data to be screened input by a first object is obtained, marking first graph data in the second graph that is the same as the data to be screened; and extracting the data assets from the graph database according to the marking result. This technical scheme solves the technical problem that invalid data assets in the database cannot be comprehensively identified.

Description

Method and device for determining data asset, storage medium and electronic device

Technical Field

The application relates to the technical field of smart homes, in particular to a method and a device for determining data assets, a storage medium and an electronic device.

Background

At present, big data projects contain a large amount of invalid data left over for historical reasons. This data occupies storage resources and even computing resources but brings no value. The resulting waste of data resources increases the maintenance cost of the corresponding resources and reduces the processing efficiency of related data services. The related art therefore has the technical problem that invalid data assets existing in the database of a big data system cannot be comprehensively identified.

No effective solution has yet been proposed in the related art for the technical problem that invalid data assets existing in the database cannot be comprehensively identified.

Disclosure of Invention

The embodiment of the application provides a method and a device for determining data assets, a storage medium and an electronic device, which at least solve the technical problem that invalid data assets in a database cannot be comprehensively identified in the related art.

According to an embodiment of the present application, there is provided a method of determining a data asset, including: determining a first graph corresponding to different graph data stored in a graph database, wherein the first graph is used for indicating the association relationship between nodes corresponding to the different graph data; preprocessing the first graph to generate a second graph under the condition that the first graph is determined to have a ring sub-graph; under the condition that data to be screened input by a first object is obtained, marking first graph data which are the same as the data to be screened in the second graph; and extracting the data asset from the graph database according to the marking result.

In an exemplary embodiment, preprocessing the first graph to generate a second graph includes: determining a first adjacency list and/or a first inverse adjacency list corresponding to different nodes in the first graph; performing a first traversal on the first adjacency list and/or the first inverse adjacency list, and marking the vertices of the cyclic subgraphs passed through in the first traversal, wherein at least one cyclic subgraph exists in the first graph; determining a second adjacency list and/or a second inverse adjacency list corresponding to the cyclic subgraph according to the vertices, and replacing all member points in the cyclic subgraph with a target point corresponding to the vertices based on the second adjacency list and/or the second inverse adjacency list, so that the whole cyclic subgraph is treated as a new vertex; and when it is determined that no cyclic subgraph remains in the first graph, determining that the second graph corresponding to the first graph has been generated.

In an exemplary embodiment, marking the first graph data in the second graph that is the same as the data to be screened, so as to extract data assets from the graph database according to a marking result, includes: adding a first mark to the first graph data, and identifying a transfer link containing the first graph data in the second graph, wherein the transfer link comprises at least two nodes; determining parent-child relationships between the other nodes in the transfer link and the target node corresponding to the first graph data; adding a second mark to the other nodes based on the parent-child relationships and the propagation mode of the data to be screened; and determining the marking result based on the second mark and the first mark, so as to extract the data assets from the graph database based on the marking result.

In an exemplary embodiment, adding a second mark to the other nodes based on the parent-child relationship and the propagation mode of the data to be screened includes: adding a second mark to the other nodes when it is determined that the target node is a parent node of the other nodes and the propagation mode of the data to be screened is upstream propagation; ending the mark adding operation when it is determined that the target node is a parent node of the other nodes and the propagation mode of the data to be screened is downstream propagation; ending the mark adding operation when it is determined that the target node is a child node of the other nodes and the propagation mode of the data to be screened is upstream propagation; and adding a second mark to the other nodes when it is determined that the target node is a child node of the other nodes and the propagation mode of the data to be screened is downstream propagation.

In an exemplary embodiment, after adding a second mark to the other nodes based on the parent-child relationship and the propagation mode of the data to be screened, the method further includes: adjusting the marking result according to the sufficient condition corresponding to the propagation mode of the data to be screened, wherein the sufficient condition includes at least one of the following: when the propagation mode of the data to be screened is upstream propagation and all child nodes of a parent node are determined to have marks, adding a second mark to that parent node if it has no mark; when the propagation mode of the data to be screened is downstream propagation and all parent nodes of a child node are determined to have marks, adding a second mark to that child node if it has no mark; and determining the adjusted target marking result as the marking result used for extracting the data asset.

In one exemplary embodiment, after extracting the data asset from the graph database according to the marking result, the method further comprises: identifying a data service that applies the data asset; and sending a management hint to the data service, wherein the management hint is used for indicating a second object using the data service to dereference the data asset.

In an exemplary embodiment, before determining the first graph corresponding to the different graph data stored in the graph database, the method further includes: acquiring a target storage engine preconfigured for the graph database; using the target storage engine to transfer the data to be processed in a target database to the graph database; and after the transfer is completed, determining the data lineage graph recorded in the target storage engine, so as to assist in determining the first graph according to the lineage graph, wherein the lineage graph indicates the transfer relationships, among the various services, of the different data stored in the graph database.

According to another embodiment of the present application, there is also provided a data asset determining apparatus including: the determining module is used for determining a first graph corresponding to different graph data stored in the graph database, wherein the first graph is used for indicating the association relationship between nodes corresponding to the different graph data; the processing module is used for preprocessing the first graph to generate a second graph under the condition that the fact that the ring sub-graph exists in the first graph is determined; the extraction module is used for marking the first graph data which are the same as the data to be screened in the second graph under the condition that the data to be screened input by the first object are acquired; and extracting the data asset from the graph database according to the marking result.

According to a further embodiment of the present application, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above-described method of determining data assets when run.

According to yet another embodiment of the present application, there is provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor performs the method for determining a data asset described above through the computer program.

In the embodiment of the application, a first graph corresponding to different graph data stored in a graph database is determined, wherein the first graph is used for indicating the association relationship between nodes corresponding to the different graph data; under the condition that the fact that the ring sub-graph exists in the first graph is determined, preprocessing the first graph to generate a second graph; under the condition that data to be screened input by a first object is obtained, marking first graph data which are the same as the data to be screened in a second graph; extracting data assets from the graph database according to the marking result; by adopting the technical scheme, the technical problem that the invalid data assets existing in the database cannot be comprehensively identified is solved, and the identification efficiency of the invalid assets existing in the database can be improved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.

In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.

FIG. 1 is a schematic diagram of a hardware environment for a method of determining data assets in accordance with an embodiment of the application;

FIG. 2 is a flow chart of a method of determining a data asset according to an embodiment of the application;

FIG. 3 is a schematic diagram of a cyclic subgraph in accordance with an embodiment of the present application;

FIG. 4 is a flow diagram of a method of determining a data asset according to an embodiment of the application;

FIG. 5 is a schematic diagram of a determination of a cyclic subgraph in accordance with an embodiment of the present application;

FIG. 6 is a block diagram of a data asset determination device according to an embodiment of the present application.

Detailed Description

In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

According to one aspect of an embodiment of the present application, a method of determining a data asset is provided. The method is widely applicable to whole-house intelligent digital control scenarios such as Smart Home, smart home device ecosystems and Intelligence House ecosystems. Optionally, in the present embodiment, the method may be applied to a hardware environment constituted by the terminal device 102 and the server 104 as shown in FIG. 1. FIG. 1 is a schematic diagram of the hardware environment of a method for determining a data asset according to an embodiment of the present application. As shown in FIG. 1, the server 104 is connected to the terminal device 102 through a network and may be used to provide services (such as application services) for the terminal or for a client installed on the terminal. A database may be set on the server or independently of the server to provide a data storage service for the server 104, and a cloud computing and/or edge computing service may be configured on the server or independently of the server to provide a data computing service for the server 104.

The network may include, but is not limited to, at least one of: a wired network and a wireless network. The wired network may include, but is not limited to, at least one of: a wide area network, a metropolitan area network and a local area network. The wireless network may include, but is not limited to, at least one of: WIFI (Wireless Fidelity) and Bluetooth. The terminal device 102 may be, but is not limited to, a PC, a mobile phone, a tablet computer, an intelligent air conditioner, an intelligent range hood, an intelligent refrigerator, an intelligent oven, an intelligent cooking range, an intelligent washing machine, an intelligent water heater, an intelligent washing device, an intelligent dish washer, an intelligent projection device, an intelligent television, an intelligent clothes hanger, an intelligent curtain, an intelligent video, an intelligent socket, an intelligent sound box, an intelligent fresh air device, an intelligent kitchen and toilet device, an intelligent bathroom device, an intelligent sweeping robot, an intelligent window cleaning robot, an intelligent mopping robot, an intelligent air purifying device, an intelligent steam box, an intelligent microwave oven, an intelligent kitchen appliance, an intelligent purifier, an intelligent water dispenser, an intelligent door lock, and the like.

In this embodiment, a method for determining a data asset is provided, which is applied to the terminal device or the server, and fig. 2 is a flowchart of a method for determining a data asset according to an embodiment of the present application, where the flowchart includes the following steps:

Step S202, determining a first graph corresponding to different graph data stored in a graph database, wherein the first graph is used for indicating association relations between nodes corresponding to the different graph data;

Step S204, preprocessing the first graph to generate a second graph when it is determined that a cyclic subgraph exists in the first graph;

Step S206, when data to be screened input by the first object is obtained, marking the first graph data in the second graph that is the same as the data to be screened, and extracting the data asset from the graph database according to the marking result.

Through the steps, determining a first graph corresponding to different graph data stored in the graph database, wherein the first graph is used for indicating the association relationship between nodes corresponding to the different graph data; under the condition that the fact that the ring sub-graph exists in the first graph is determined, preprocessing the first graph to generate a second graph; under the condition that data to be screened input by a first object is obtained, marking first graph data which are the same as the data to be screened in a second graph; extracting data assets from the graph database according to the marking result; by adopting the technical scheme, the technical problem that the invalid data assets existing in the database cannot be comprehensively identified is solved, and the identification efficiency of the invalid assets existing in the database can be improved.

In an exemplary embodiment, for the step S204, the preprocessing may be performed on the first graph to generate a second graph, where the specific steps include:

Step S204-01, determining a first adjacency list and/or a first inverse adjacency list corresponding to different nodes in the first graph;

Step S204-02, performing a first traversal on the first adjacency list and/or the first inverse adjacency list, and marking the vertices of the cyclic subgraphs passed through in the first traversal, wherein at least one cyclic subgraph exists in the first graph;

Step S204-03, determining a second adjacency list and/or a second inverse adjacency list corresponding to the cyclic subgraph according to the vertices, and replacing all member points in the cyclic subgraph with a target point corresponding to the vertices based on the second adjacency list and/or the second inverse adjacency list, so that the cyclic subgraph is treated as a new vertex;

It will be appreciated that a cyclic subgraph may cause the traversal to enter an endless loop, so the cyclic subgraph as a whole may be regarded as a single node, and that node may be represented by a sub-ring that carries the information of all ring members. For example, FIG. 3 is a schematic diagram of a cyclic subgraph according to an embodiment of the application. If the traversal starts from G, H or F and a depth-first traversal is performed with the adjacency list, then F belongs to the [G, H, F] subgraph; if the traversal starts from B, D or E, then F belongs to the [B, D, E, F, H, G] subgraph. The [G, H, F] or [B, D, E, F, H, G] subgraph may then be regarded as one node. Optionally, the sub-ring may be represented by a combinedNode (corresponding to the target point of the above embodiment), which can be pictured as a large sphere enclosing several small spheres: each small sphere is a ring member, and the large sphere is the ring. During marking, the combinedNode is treated as an ordinary vertex; the only special handling is that when the combinedNode is marked as a zombie asset, all member vertices inside it must be marked as well.

Step S204-04, when it is determined that no cyclic subgraph remains in the first graph, determining that the second graph corresponding to the first graph has been generated.

Through the above processing, the cyclic subgraphs in the first graph that would interfere with traversal can be merged and thereby simplified: each cyclic subgraph is regarded as one vertex, the combinedNode, and marking that vertex marks the cyclic subgraph, that is, all member points in the subgraph. This prevents cyclic subgraphs in the graph from affecting the subsequent marking.
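The replacement of ring members by a combinedNode can be illustrated with a minimal Python sketch. The graph, the vertex names and the helper `merge_ring` are hypothetical and chosen only for illustration; the sketch assumes the ring members are already known and that edges internal to the ring are simply dropped.

```python
from collections import defaultdict


def merge_ring(adjacency, ring_members, combined_name):
    """Return a new adjacency list in which one ring is a single vertex.

    adjacency: dict mapping each vertex to a list of its child vertices.
    ring_members: vertices that belong to the same ring (assumed known).
    combined_name: label of the merged vertex (hypothetical name).
    """
    members = set(ring_members)

    def rename(vertex):
        return combined_name if vertex in members else vertex

    merged = defaultdict(list)
    for src, targets in adjacency.items():
        new_src = rename(src)
        merged[new_src]  # keep the vertex even if it has no outgoing edge
        for dst in targets:
            new_dst = rename(dst)
            merged[new_dst]
            # Edges inside the ring disappear; duplicate edges are skipped.
            if new_src != new_dst and new_dst not in merged[new_src]:
                merged[new_src].append(new_dst)
    return dict(merged)


# Hypothetical graph: E feeds the ring F -> G -> H -> F, and F also feeds K.
adjacency = {"E": ["F"], "F": ["G", "K"], "G": ["H"], "H": ["F"], "K": []}
second_graph = merge_ring(adjacency, ["G", "H", "F"], "combinedNode")
# The member list is kept so that marking combinedNode can mark every member.
members_of = {"combinedNode": ["G", "H", "F"]}
```

Applying the same function to the inverse adjacency list would keep both tables consistent; that step is omitted here for brevity.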

Optionally, at marking time, the target object inputs an initial set of zombie assets (corresponding to the data to be screened described above), a zombie-mark propagation algorithm is executed based on this initial set, and the full set of zombie assets (including newly identified zombie assets) is returned. For upstream propagation, a depth-first traversal (DFS) is performed using the inverse adjacency list corresponding to the second graph, and the adjacency list is used to check the sufficient condition. For downstream propagation, a DFS is performed using the adjacency list corresponding to the second graph, and the inverse adjacency list is used to check the sufficient condition.

In order to ensure the accuracy of marking, the marked target data can be judged, whether the current mark is effective or not is identified, and adjustment is performed.

Optionally, during upstream propagation, a DFS (depth-first traversal) is performed using the inverse adjacency list, and the sufficient condition is checked using the adjacency list. The upstream propagation condition is: if currentNode is a zombie asset, then a sufficient condition for a given parentNode to be a zombie asset is that all childNodes of that parentNode have zombie marks.

During downstream propagation, a DFS is performed using the adjacency list, and the sufficient condition is checked using the inverse adjacency list. The downstream propagation condition is: if currentNode is a zombie asset, then a sufficient condition for a given childNode to be a zombie asset is that all parentNodes of that childNode have zombie marks.
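The two sufficient conditions can be illustrated with a small Python sketch. The asset names and the helper functions `parent_may_be_marked` and `child_may_be_marked` are hypothetical; the sketch only shows the condition check, not the traversal.

```python
def parent_may_be_marked(parent, adjacency, marked):
    """Upstream rule: every child of the parent already carries a zombie mark."""
    children = adjacency.get(parent, [])
    return bool(children) and all(child in marked for child in children)


def child_may_be_marked(child, inverse_adjacency, marked):
    """Downstream rule: every parent of the child already carries a zombie mark."""
    parents = inverse_adjacency.get(child, [])
    return bool(parents) and all(parent in marked for parent in parents)


# Hypothetical lineage: table_a feeds report_1 and report_2,
# while table_b feeds report_1 only.
adjacency = {"table_a": ["report_1", "report_2"], "table_b": ["report_1"]}
inverse_adjacency = {"report_1": ["table_a", "table_b"], "report_2": ["table_a"]}

marked = {"report_1"}  # the user declares report_1 a zombie asset
table_a_ok = parent_may_be_marked("table_a", adjacency, marked)
table_b_ok = parent_may_be_marked("table_b", adjacency, marked)
report_2_ok = child_may_be_marked("report_2", inverse_adjacency, marked)
```

In this example table_b serves only the zombie report and can be marked, whereas table_a still serves report_2 and must stay unmarked.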

In one exemplary embodiment, after extracting the data asset from the graph database based on the marking result, the method further comprises: identifying a data service that applies the data asset; and sending a management hint to the data service, wherein the management hint is used for indicating a second object using the data service to dereference the data asset.

That is, in big data processing, the sources and processing links of data close to the user may be quite complex and usually form a complex directed graph (which may contain cycles). All the data, scripts and the like on such a processing link may serve only the current zombie asset, so they should also be marked as zombie assets. After the data assets are determined, in order to recover them more accurately, a corresponding management prompt may be fed back to the data service that uses the data assets, so that the data service holds a recovery record at the time the data assets are recovered; this avoids abnormal recovery caused by the absence of such a record.

It should be noted that, since the database storing data in actual use does not have a graph attribute, when it is determined that a certain data set needs to be processed, the data set needs to be saved into a graph database capable of generating a graph, and the transfer relationship between different data and each service is recorded in the process of saving, so that the source of the different data and the use direction of the data can be clearly known, and the marking can be more comprehensively performed in the subsequent marking.

In order to better understand the process of the method for determining the data asset, the following description is given with reference to the implementation method flow of the determination of the data asset in the alternative embodiment, but the implementation method flow is not limited to the technical solution of the embodiment of the present application.

In this embodiment, a method for determining a data asset is provided, and FIG. 4 is a schematic flow chart of the method according to an embodiment of the present application. All graph data is first read from Neo4j; this data contains cyclic subgraphs and is denoted GD_1. An adjacency list and an inverse adjacency list are constructed for GD_1, and the cyclic subgraphs in GD_1 are merged and replaced. The user then inputs an initial set of zombie assets (ZombineCaltions), a zombie-mark propagation algorithm is executed, and the full set of zombie assets (including newly identified zombie assets) is returned. In summary, based on the big data lineage system combined with graph algorithms, and taking the user-confirmed zombie assets as input, the zombie assets on all relevant links are found, so that the various zombie assets in the big data system are identified.

As shown in fig. 4, the implementation steps are as follows:

Step S1: storing the data in a Neo4j graph database;

Step S2: determining the full lineage graph data (corresponding to the first graph in the above embodiment) corresponding to the graph database;

Step S3: constructing an adjacency list and an inverse adjacency list;

Step S4: analyzing the adjacency list and the inverse adjacency list to determine whether a ring exists in the graph. It should be noted that propagating zombie marks is essentially a traversal of the graph, and a cyclic subgraph causes the traversal to enter an endless loop. The entry point of a cycle can be identified by recording how many times each vertex has been visited, and the traversal of the current path can be ended when the entry point is encountered again. However, when traversals from different starting points reach the same cyclic subgraph, they determine different entry points, which makes the zombie-mark propagation algorithm inaccurate. To solve this problem, in the embodiment of the present application the cyclic subgraph is simplified and regarded as a combinedNode; marking the combinedNode marks all member points in the subgraph.

Step S5: if a ring exists, the maximum cyclic subgraph is identified and its members are merged into a combined vertex, and the adjacency list and the inverse adjacency list are processed so that the vertex relations belonging to the cyclic subgraph are transferred to the combined vertex.

FIG. 5 is a schematic diagram of the determination of a cyclic subgraph according to an embodiment of the present application.

As an alternative implementation, the cyclic subgraph in the graph is identified by determining the out-degree and in-degree of every node in the graph and deleting each point whose out-degree or in-degree is 0 together with its related edges, until every remaining point has a non-zero out-degree and in-degree; in this way the maximum cyclic subgraph in the graph can be located quickly. For example, the original graph in FIG. 5 contains nodes A, B, C, D, E and F; after nodes A and F, whose out-degree or in-degree is 0, are deleted, the remaining nodes are B, C, D and E.

Further, the adjacency list of the cyclic subgraph in FIG. 5 can be constructed as follows:

B->[C]；

C->[D,E]；

D->[E]；

E->[C,B]；

Accordingly, the inverse adjacency list in FIG. 5 is as follows:

B->[E]；

C->[B,E]；

D->[C]；

E->[C,D]；
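The relation between the two tables above can be checked with a minimal Python sketch that derives the inverse adjacency list from the adjacency list. The function name `build_inverse_adjacency` is a hypothetical helper; the vertex data is taken from the FIG. 5 tables.

```python
def build_inverse_adjacency(adjacency):
    """Derive the parent table (inverse adjacency list) from the child table."""
    inverse = {vertex: [] for vertex in adjacency}
    for src in sorted(adjacency):
        for dst in adjacency[src]:
            # src points to dst, so src is recorded as a parent of dst.
            inverse.setdefault(dst, []).append(src)
    return inverse


# Adjacency list of the cyclic subgraph in FIG. 5.
adjacency = {"B": ["C"], "C": ["D", "E"], "D": ["E"], "E": ["C", "B"]}
inverse_adjacency = build_inverse_adjacency(adjacency)
```

With this input the derived table matches the inverse adjacency list given above.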

Optionally, a depth-first traversal is performed on each row of the adjacency list and a mark is set on each traversed vertex; no further operation is performed on a vertex that has already been traversed. The traversal result of each row of the adjacency list is as follows:

The traversal results for node B are as follows:

B：start->B(unmark)->C(unmark)->D(unmark)->E(unmark)->C(mark)->B(mark)->E(mark)->end；

the traversal results for node C are as follows:

C：start->C(mark)->end；

the traversal results for node D are as follows:

D：start->D(mark)->end；

the traversal results for node E are as follows:

E：start->E(mark)->end；

It can be determined from the above traversal that B, C, D and E belong to the same ring.
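The degree-based pruning described for FIG. 5 can be sketched in Python as follows. The edges A->B and E->F are assumptions made only so that A and F have an in-degree or out-degree of 0, since the text does not list them; `prune_to_cyclic_core` is a hypothetical helper name.

```python
def prune_to_cyclic_core(adjacency):
    """Repeatedly delete vertices whose in-degree or out-degree is 0."""
    succ = {vertex: set(targets) for vertex, targets in adjacency.items()}
    for targets in adjacency.values():
        for target in targets:
            succ.setdefault(target, set())
    pred = {vertex: set() for vertex in succ}
    for vertex, targets in succ.items():
        for target in targets:
            pred[target].add(vertex)

    removable = [v for v in succ if not succ[v] or not pred[v]]
    while removable:
        vertex = removable.pop()
        if vertex not in succ:
            continue  # already deleted through an earlier entry
        for target in succ[vertex]:
            pred[target].discard(vertex)
            if not pred[target]:
                removable.append(target)
        for parent in pred[vertex]:
            succ[parent].discard(vertex)
            if not succ[parent]:
                removable.append(parent)
        del succ[vertex], pred[vertex]
    return set(succ)


# FIG. 5 vertices; the edges touching A and F are assumed for illustration.
full_graph = {
    "A": ["B"], "B": ["C"], "C": ["D", "E"],
    "D": ["E"], "E": ["C", "B", "F"], "F": [],
}
core = prune_to_cyclic_core(full_graph)
```

Note that pruning alone yields every vertex that lies on a cycle or on a path between two cycles, so a further grouping step is still needed when the graph contains several separate rings.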

Step S6: if no ring exists, the adjacency list and the inverse adjacency list are used directly as the final ones; otherwise the final adjacency list and inverse adjacency list are those in which the cyclic subgraphs have been merged.

Optionally, the traversal applied in step S5 determines which sub-ring each vertex belongs to. All ring members in the adjacency list and the inverse adjacency list of the full lineage graph are then replaced by their sub-ring, which is denoted by a combinedNode. This can be pictured as a large sphere enclosing several small spheres: each small sphere is a ring member, and the large sphere is the ring. In the zombie-mark propagation algorithm, the combinedNode is treated as an ordinary vertex; the only special handling is that when the combinedNode is marked as a zombie asset, all member vertices inside it must be marked as well.

It can be understood that merging means identifying the cyclic subgraph, finding its member points, and replacing all of them with a combinedNode; that is, the cyclic subgraph is given an abstract representation that can be regarded as an ordinary point in the graph. The ring members of the adjacency list and the inverse adjacency list that belong to the same ring are then replaced by the corresponding combinedNode.
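One way to decide which ring each vertex belongs to, independent of the starting vertex, is to compute strongly connected components. The Python sketch below uses Kosaraju's algorithm, a standard technique that is substituted here for illustration and is not named in the text; it uses both the adjacency list and the inverse adjacency list. The edges touching A and F are assumed, and `ring_groups` is a hypothetical helper name.

```python
def ring_groups(adjacency, inverse_adjacency):
    """Group vertices into rings using Kosaraju's two-pass DFS."""
    visited, finish_order = set(), []

    def dfs_forward(vertex):
        visited.add(vertex)
        for nxt in adjacency.get(vertex, []):
            if nxt not in visited:
                dfs_forward(nxt)
        finish_order.append(vertex)

    for vertex in adjacency:
        if vertex not in visited:
            dfs_forward(vertex)

    assigned, groups = set(), []

    def dfs_backward(vertex, group):
        assigned.add(vertex)
        group.append(vertex)
        for prev in inverse_adjacency.get(vertex, []):
            if prev not in assigned:
                dfs_backward(prev, group)

    # Second pass on the inverse table, in reverse finishing order.
    for vertex in reversed(finish_order):
        if vertex not in assigned:
            group = []
            dfs_backward(vertex, group)
            groups.append(sorted(group))
    # Components with one vertex are not rings (self-loops are ignored here).
    return [group for group in groups if len(group) > 1]


adjacency = {
    "A": ["B"], "B": ["C"], "C": ["D", "E"],
    "D": ["E"], "E": ["C", "B", "F"], "F": [],
}
inverse_adjacency = {
    "A": [], "B": ["A", "E"], "C": ["B", "E"],
    "D": ["C"], "E": ["C", "D"], "F": ["E"],
}
rings = ring_groups(adjacency, inverse_adjacency)
```

Each returned group can then be replaced by one combinedNode in both tables. The recursive DFS is adequate for small graphs; a large lineage graph would need an iterative variant.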

Step S7: acquiring the set of zombie assets entered by the user (corresponding to the data to be screened in the above embodiment);

Step S8: marking the zombie assets in the final adjacency list and inverse adjacency list;

Step S9: propagating the zombie mark upstream and propagating the zombie mark downstream;

Optionally, during upstream propagation, a DFS (depth-first traversal) is performed using the inverse adjacency list, and the sufficient condition is checked using the adjacency list. The upstream propagation condition is: if currentNode is a zombie asset, then a sufficient condition for a given parentNode to be a zombie asset is that all childNodes of that parentNode have zombie marks.

During downstream propagation, DFS is performed using an adjacency list, and sufficient condition judgment is realized using an inverse adjacency list. The downstream propagation condition is that if currentNode is a zombie asset, then a sufficient condition that a certain child is a zombie asset is that all parentNodes of child have zombie tags.

Step S10: traversing the final adjacency list and the inverse adjacency list, and returning all assets with zombie marks.

Specifically, the tag propagation algorithm is shown below in pseudo code.

Zombie asset identification algorithm pseudocode:
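As an illustration of the propagation described in steps S7 to S10, a minimal Python sketch follows. It is an assumption-based reading of the prose above rather than the original listing: the function names, the asset names and the `members_of` table are hypothetical, and the initial zombie set is assumed to be expressed in vertices of the merged (cycle-free) graph.

```python
def propagate(start_marks, walk_table, check_table):
    """Spread zombie marks along walk_table under the sufficient condition.

    Upstream:   walk_table = inverse adjacency, check_table = adjacency.
    Downstream: walk_table = adjacency, check_table = inverse adjacency.
    """
    marked = set(start_marks)
    stack = list(start_marks)
    while stack:  # iterative depth-first traversal
        current = stack.pop()
        for neighbour in walk_table.get(current, []):
            if neighbour in marked:
                continue
            required = check_table.get(neighbour, [])
            # Mark only if every vertex on the other side is already marked.
            if all(vertex in marked for vertex in required):
                marked.add(neighbour)
                stack.append(neighbour)
    return marked


def identify_zombie_assets(adjacency, inverse_adjacency, initial, members_of=None):
    members_of = members_of or {}
    upstream = propagate(initial, inverse_adjacency, adjacency)
    marked = propagate(upstream, adjacency, inverse_adjacency)
    result = set()
    for vertex in marked:
        # A marked combinedNode marks all of its ring members.
        result.update(members_of.get(vertex, [vertex]))
    return result


# Hypothetical merged graph: src -> etl -> combinedNode -> report,
# and a shared table feeding both report and other_report.
adjacency = {
    "src": ["etl"], "etl": ["combinedNode"], "combinedNode": ["report"],
    "shared": ["report", "other_report"], "report": [], "other_report": [],
}
inverse_adjacency = {
    "src": [], "etl": ["src"], "combinedNode": ["etl"], "shared": [],
    "report": ["combinedNode", "shared"], "other_report": ["shared"],
}
members_of = {"combinedNode": ["t1", "t2"]}
zombies = identify_zombie_assets(adjacency, inverse_adjacency, {"report"}, members_of)
```

In this example the shared table stays unmarked because it still feeds other_report, while the whole chain that serves only the offline report is returned, including the ring members t1 and t2.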

Through the above steps, the data in Neo4j is loaded into memory and a conditional propagation algorithm over the graph data is implemented using graph algorithms. Invalid assets in big data applications can therefore be identified quickly and effectively. In practical application, starting from a BI report that is to be taken offline (or another asset close to the user plane), a user can identify all asset information from the source to that BI report; in the field of data governance, the identified result can then be used for resource recovery and similar work, ultimately reducing cost and improving efficiency.

From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or by means of hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) and comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods of the various embodiments of the present application.

FIG. 6 is a block diagram of a data asset determination device according to an embodiment of the application. As shown in FIG. 6, the device includes:

a determining module 62, configured to determine a first graph corresponding to different graph data stored in a graph database, where the first graph is used to indicate an association relationship between nodes corresponding to the different graph data;

a processing module 64, configured to, in a case where it is determined that there is a ring sub-graph in the first graph, pre-process the first graph to generate a second graph;

an extraction module 66, configured to, in a case where data to be screened input by a first object is obtained, mark first graph data that is the same as the data to be screened in the second graph, and extract the data asset from the graph database according to the marking result.

Through the above device, a first graph corresponding to different graph data stored in the graph database is determined, wherein the first graph is used for indicating the association relationship between nodes corresponding to the different graph data; in a case where it is determined that a ring sub-graph exists in the first graph, the first graph is preprocessed to generate a second graph; in a case where data to be screened input by a first object is obtained, first graph data that is the same as the data to be screened is marked in the second graph; and the data asset is extracted from the graph database according to the marking result. This technical solution solves the technical problem that invalid data assets in the database cannot be comprehensively identified, and improves the efficiency of identifying invalid assets in the database.

An embodiment of the present application also provides a storage medium including a stored program, wherein the program, when run, executes any one of the methods described above.

Optionally, in the present embodiment, the above-described storage medium may be configured to store program code for performing the following steps:

S1, determining a first graph corresponding to different graph data stored in a graph database, wherein the first graph is used for indicating the association relationship between nodes corresponding to the different graph data;

S2, preprocessing the first graph to generate a second graph in a case where it is determined that a ring sub-graph exists in the first graph;

S3, under the condition that data to be screened input by a first object is obtained, marking first graph data which are the same as the data to be screened in the second graph; and extracting the data asset from the graph database according to the marking result.

An embodiment of the application also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.

Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.

Optionally, in the present embodiment, the above-described processor may be configured to execute the following steps by means of a computer program:

Optionally, in the present embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing program code.

Optionally, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations, and details are not repeated here.

It will be appreciated by those skilled in the art that the modules or steps of the application described above may be implemented on a general-purpose computing device, and may be concentrated on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented in program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described here; alternatively, the modules or steps may each be fabricated as a separate integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.

The foregoing is merely a preferred embodiment of the present application. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are also intended to fall within the scope of protection of the present application.

Claims

1. A method of determining a data asset, comprising:

determining a first graph corresponding to different graph data stored in a graph database, wherein the first graph is used for indicating the association relationship between nodes corresponding to the different graph data;

preprocessing the first graph to generate a second graph under the condition that the first graph is determined to have a ring sub-graph;

and under the condition that the data to be screened input by the first object is obtained, marking the first graph data which are the same as the data to be screened in the second graph, so as to extract the data asset from the graph database according to the marking result.

2. The method of claim 1, wherein preprocessing the first graph to generate a second graph comprises:

determining a first adjacency list and/or a first inverse adjacency list corresponding to different nodes in the first graph;

performing a first traversal on the first adjacency list and/or the first inverse adjacency list, and marking the vertexes of the ring sub-graphs passed through in the first traversal, wherein at least one ring sub-graph exists in the first graph;

determining, according to the vertex, a second adjacency list and/or a second inverse adjacency list corresponding to the ring sub-graph, and replacing all member points in the ring sub-graph with a target point corresponding to the vertex based on the second adjacency list and/or the second inverse adjacency list, so as to determine the whole ring sub-graph as a new vertex;

in a case where no ring sub-graph remains in the first graph, determining that the second graph corresponding to the first graph has been generated.

3. The method of claim 1, wherein marking the first graph data in the second graph that is the same as the data to be screened, so as to extract data assets from the graph database according to the marking result, comprises:

adding a first mark for the first graph data, and identifying a transfer link containing the first graph data in the second graph, wherein the transfer link comprises at least two nodes;

determining parent-child relationships between other nodes in the transfer link and a target node corresponding to the first graph data;

adding a second mark to the other nodes based on the parent-child relationship and the propagation mode of the data to be screened;

determining a marking result based on the second mark and the first mark, so as to extract a data asset from the graph database based on the marking result.

4. A method of determining a data asset according to claim 3, wherein adding a second mark to the other nodes based on the parent-child relationship and the propagation mode of the data to be screened comprises:

determining to add a second mark for the other nodes in a case where the target node is determined to be a parent node of the other nodes and the propagation mode of the data to be screened is upstream propagation;

ending the mark adding operation in a case where the target node is determined to be a parent node of the other nodes and the propagation mode of the data to be screened is downstream propagation;

ending the mark adding operation in a case where the target node is determined to be a child node of the other nodes and the propagation mode of the data to be screened is upstream propagation;

adding a second mark for the other nodes in a case where the target node is determined to be a child node of the other nodes and the propagation mode of the data to be screened is downstream propagation.

5. A method of determining a data asset according to claim 3, wherein after adding a second marker to the other nodes based on the parent-child relationship and the way of propagation of the data to be screened, the method further comprises:

adjusting the marking result according to the sufficient condition corresponding to the propagation mode of the data to be screened; wherein the sufficient conditions include at least one of:

in a case where the propagation mode of the data to be screened is determined to be upstream propagation and all child nodes are determined to have marks, adding a second mark to a parent node that does not yet have a mark;

in a case where the propagation mode of the data to be screened is determined to be downstream propagation and all parent nodes are determined to have marks, adding a second mark to a child node that does not yet have a mark;

the adjusted target marking result is determined as a marking result for extracting the data asset.

6. The method of claim 1, wherein after extracting data assets from the graph database based on the marking results, the method further comprises:

identifying a data service that applies the data asset;

and sending a management hint to the data service, wherein the management hint is used for indicating a second object using the data service to dereference the data asset.

7. The method of determining a data asset according to claim 1, wherein prior to determining a first graph corresponding to different graph data stored in the graph database, the method further comprises:

acquiring a target storage engine preconfigured for the graph database;

transferring, by using the target storage engine, data to be processed in a target database to the graph database;

and determining a lineage graph recorded in the target storage engine after the transfer is completed, so as to assist in determining the first graph according to the lineage graph, wherein the lineage graph is used for indicating the flow relationships, among various services, of the different data stored in the graph database.

8. A data asset determination apparatus, comprising:

a determining module, configured to determine a first graph corresponding to different graph data stored in a graph database, wherein the first graph is used for indicating the association relationship between nodes corresponding to the different graph data;

a processing module, configured to preprocess the first graph to generate a second graph in a case where it is determined that a ring sub-graph exists in the first graph;

an extraction module, configured to mark, in the second graph, first graph data that is the same as data to be screened in a case where the data to be screened input by a first object is obtained, and to extract the data asset from the graph database according to the marking result.

9. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored program, wherein the program when run performs the method of any of the preceding claims 1 to 7.

10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 7 by means of the computer program.