CN117312317A - Tag set generation method, device and equipment of data warehouse table and storage medium


Publication number
CN117312317A
Authority
CN
China
Prior art keywords
data warehouse
tables
marked
warehouse table
blood
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311265535.7A
Other languages
Chinese (zh)
Inventor
尚亚涛
徐悦
刘旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Avatr Technology Chongqing Co Ltd
Original Assignee
Avatr Technology Chongqing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Avatr Technology Chongqing Co Ltd
Priority to CN202311265535.7A
Publication of CN117312317A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof

Abstract

The application provides a tag set generation method, device, equipment and storage medium for a data warehouse table. The method comprises the following steps: acquiring the data warehouse where a data warehouse table to be marked is located, wherein the data warehouse comprises a plurality of data warehouse tables; determining, among the data warehouse tables of the data warehouse and according to a data warehouse table blood-edge (lineage) relation map, a plurality of associated data warehouse tables related to the data warehouse table to be marked, wherein the blood-edge relation map is a data warehouse table relation map comprising the data circulation paths in the data warehouse; and generating a tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables. The method improves the accuracy of tag set generation for data warehouse tables.

Description

Tag set generation method, device and equipment of data warehouse table and storage medium
Technical Field
The present disclosure relates to the field of big data technologies, and in particular, to a method, an apparatus, a device, and a storage medium for generating a tag set of a data warehouse table.
Background
In the big data era, as data volumes continue to grow, the data warehouse has become a widely used solution for data management and processing, supporting efficient management, data analysis and the like. In building a data warehouse, data tagging is a very important step: generating appropriate tags for the data warehouse models can effectively improve the readability, understandability, usability and query efficiency of the data, and thus better support work such as data analysis and data mining.
In the prior art, tags are typically added manually to each data warehouse model.
However, for large-scale, complex and changeable data warehouses, manual tagging is inefficient and prone to low accuracy.
Disclosure of Invention
The application provides a tag set generation method, device and equipment of a data warehouse table and a storage medium, which are used for solving the problem of low accuracy of tag set generation of the data warehouse table.
In a first aspect, the present application provides a method for generating a tag set of a data warehouse table, including:
acquiring a data warehouse where a data warehouse table to be marked is located, wherein the data warehouse comprises a plurality of data warehouse tables;
determining, among the data warehouse tables of the data warehouse and according to a data warehouse table blood-edge (lineage) relation map, a plurality of associated data warehouse tables related to the data warehouse table to be marked, wherein the blood-edge relation map is a data warehouse table relation map comprising the data circulation paths in the data warehouse;
and generating the tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables.
In a second aspect, the present application provides a tag set generating apparatus of a data warehouse table, including:
an acquisition module, configured to acquire the data warehouse where the data warehouse table to be marked is located, wherein the data warehouse comprises a plurality of data warehouse tables;
a determining module, configured to determine, among the data warehouse tables of the data warehouse, a plurality of associated data warehouse tables related to the data warehouse table to be marked according to a data warehouse table blood-edge (lineage) relation map, wherein the blood-edge relation map is a data warehouse table relation map comprising the data circulation paths in the data warehouse;
and a generating module, configured to generate the tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables.
In a third aspect, the present application provides a tag set generating apparatus of a data warehouse table, including:
a processor, a memory, a communication interface;
the memory is used for storing executable instructions of the processor;
wherein the processor is configured to perform the tag set generation method of the data warehouse table of the first aspect above via execution of the executable instructions.
In a fourth aspect, the present application provides a readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the tag set generation method of the data warehouse table described in the first aspect above.
According to the tag set generation method, device, equipment and storage medium of the data warehouse table provided by the present application, the data warehouse where the data warehouse table to be marked is located is acquired, the data warehouse comprising a plurality of data warehouse tables; a plurality of associated data warehouse tables related to the data warehouse table to be marked are determined among the data warehouse tables of the data warehouse according to the data warehouse table blood-edge (lineage) relation map; and the tag set of the data warehouse table to be marked is generated according to the tag sets of the plurality of associated data warehouse tables. Because the associated data warehouse tables are determined through a pre-built lineage map, the accuracy of determining them is improved; and because the tag set of the data warehouse table to be marked is generated from the tag sets of the associated data warehouse tables so determined, the accuracy of tag set generation for the data warehouse table is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a flowchart of a tag set generating method of a data warehouse table according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of constructing a data warehouse table blood relationship map according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of determining a plurality of associated data warehouse tables related to a data warehouse table to be marked in the data warehouse tables of the data warehouse according to the data warehouse table blood relationship map provided in the embodiment of the present application;
FIG. 4 is a flowchart of generating a tag set of a data warehouse table to be marked according to a tag set of a plurality of associated data warehouse tables provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a tag set generating device of a data warehouse table according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a tag set generating device of a data warehouse table according to an embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
In the prior art, tags are typically added manually to each data warehouse model. However, for large-scale, complex and changeable data warehouses, manual tagging is inefficient and prone to low accuracy.
According to the method of the present application, the data warehouse where the data warehouse table to be marked is located is acquired, the data warehouse comprising a plurality of data warehouse tables; a plurality of associated data warehouse tables related to the data warehouse table to be marked are determined among the data warehouse tables of the data warehouse according to the data warehouse table blood-edge (lineage) relation map; and the tag set of the data warehouse table to be marked is generated according to the tag sets of the plurality of associated data warehouse tables. Determining the associated data warehouse tables through a pre-built lineage map improves the accuracy of that determination, and generating the tag set of the data warehouse table to be marked from the tag sets of the associated data warehouse tables so determined improves the accuracy of tag set generation for the data warehouse table.
The following describes the technical solutions of the present application and how the technical solutions of the present application solve the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a tag set generating method of a data warehouse table according to a first embodiment of the present application, where an execution body is a tag set generating module of the data warehouse table, and the tag set generating module may be implemented by software, or implemented by hardware, or implemented by a combination of software and hardware.
As shown in fig. 1, the tag set generating method of the data warehouse table of the present embodiment may include the steps of:
Step S101, acquiring the data warehouse where the data warehouse table to be marked is located, wherein the data warehouse comprises a plurality of data warehouse tables.
The data warehouse comprises a plurality of data warehouse tables, including the data warehouse table to be marked; a data warehouse table is a unit of the data warehouse that stores data in table form. Each data warehouse table carries a plurality of tags that help users understand, query and use it, and these tags constitute the tag set of that data warehouse table. Optionally, after a data warehouse table is created, a tag set needs to be generated for the created table for use by users. Optionally, after a data warehouse table is updated, a tag set likewise needs to be generated for the updated table.
Specifically, the data warehouse where the data warehouse table to be marked is located may first be acquired, that is, all the data warehouse tables included in that data warehouse may be acquired.
Step S102, determining a plurality of associated data warehouse tables related to the data warehouse table to be marked in the data warehouse tables of the data warehouse according to the data warehouse table blood-edge relation map, wherein the blood-edge relation map is a data warehouse table relation map comprising data circulation paths in the data warehouse.
An associated data warehouse table is a data warehouse table in the data warehouse whose degree of association with the data warehouse table to be marked is greater than a preset association threshold. The blood-edge (lineage) relation map is a data warehouse table relation map that includes the data circulation paths in the data warehouse.
Specifically, a plurality of associated data warehouse tables related to the data warehouse table to be marked may be determined from among the data warehouse tables of the data warehouse acquired in step S101 according to the data warehouse table blood-edge relationship map constructed in advance. The construction process of the data warehouse table blood-edge relation map is not limited, and the existing construction method of the blood-edge relation map can be used for pre-constructing the data warehouse table blood-edge relation map. Alternatively, a data warehouse table blood-edge relationship map may be constructed by table tasks corresponding to each data warehouse table in the data warehouse, and by the correspondence between the table tasks and the data warehouse table.
Step S103, generating a tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables.
Specifically, the tag set of the data warehouse table to be marked may be generated from the tag sets of the plurality of associated data warehouse tables determined in step S102. Optionally, the tags in the tag sets of the associated data warehouse tables may be filtered according to a tag characteristic value threshold to generate the tag set of the data warehouse table to be marked.
According to the tag set generating method of the data warehouse table provided by this embodiment, the data warehouse where the data warehouse table to be marked is located is acquired, the data warehouse comprising a plurality of data warehouse tables; a plurality of associated data warehouse tables related to the data warehouse table to be marked are determined among the data warehouse tables of the data warehouse according to the data warehouse table blood-edge (lineage) relation map; and the tag set of the data warehouse table to be marked is generated according to the tag sets of the plurality of associated data warehouse tables. Determining the associated data warehouse tables through a pre-built lineage map improves the accuracy of that determination, and generating the tag set from the tag sets of the associated data warehouse tables so determined improves the accuracy of tag set generation for the data warehouse table.
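By way of non-limiting illustration only, steps S101 to S103 can be sketched in Python as follows. The function name `generate_tag_set`, the `find_associated` callback and the example table names are hypothetical and do not appear in the application. Combining the associated tag sets by plain set union is an assumption, since the application leaves the exact combination open.

```python
from typing import Callable, Dict, Iterable, List, Set


def generate_tag_set(
    target: str,
    warehouse_tables: Iterable[str],
    find_associated: Callable[[str, Set[str]], List[str]],
    tag_sets: Dict[str, Set[str]],
) -> Set[str]:
    """Chain steps S101 to S103 for one table to be marked."""
    tables = set(warehouse_tables)                # S101: tables of the warehouse
    associated = find_associated(target, tables)  # S102: lineage-based lookup
    generated: Set[str] = set()
    for table in associated:                      # S103: combine associated tag sets
        generated |= tag_sets.get(table, set())   # assumption: plain union
    return generated


# Hypothetical example data: two tables are associated with "dws_sales".
example_tags = {
    "ods_orders": {"orders", "raw"},
    "dwd_orders": {"orders", "cleaned"},
}


def lookup(target: str, tables: Set[str]) -> List[str]:
    """Stand-in for the lineage-based determination of step S102."""
    return [t for t in ("ods_orders", "dwd_orders") if t in tables]


result = generate_tag_set(
    "dws_sales", ["ods_orders", "dwd_orders", "dws_sales"], lookup, example_tags
)
print(sorted(result))
```

Passing the step S102 logic in as a callback keeps the sketch independent of how the lineage map is actually stored.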
Fig. 2 is a schematic diagram of a process for constructing a data warehouse table blood-edge relationship map according to a second embodiment of the present application, and the process for constructing a data warehouse table blood-edge relationship map in advance is described in this embodiment on the basis of the embodiment shown in fig. 1.
As shown in fig. 2, the construction of the data warehouse table blood relationship map of the present embodiment may include the steps of:
Step S201, determining a plurality of table tasks of the data warehouse, where one data warehouse table corresponds to one table task, and a table task is the task information for building the current data warehouse table from other data warehouse tables in the data warehouse.
As described in step S101, the data warehouse where the data warehouse table to be marked is located includes a plurality of data warehouse tables. Specifically, a plurality of table tasks of the data warehouse may be determined, one for each data warehouse table. Optionally, the table tasks include, but are not limited to, parent tasks and/or child tasks.
Step S202, extracting dependency relationship information among the table tasks, and generating a blood relationship map of the table tasks according to the dependency relationship information among the table tasks.
Specifically, after determining the plurality of table tasks of the data warehouse in step S201, dependency information between the table tasks may be extracted, and further, a blood relationship map of the table tasks may be generated according to the dependency information between the table tasks.
Step S203, extracting the correspondence between each data warehouse table and its table task, and generating the data warehouse table blood-edge (lineage) relation map according to the blood-edge relation map of the table tasks and the correspondence.
Specifically, in the process of determining the plurality of table tasks of the data warehouse described in step S201, the correspondence between each data warehouse table and the table task may be further extracted, and the data warehouse table blood edge relationship map may be further generated according to the blood edge relationship map of the table task generated in step S202 and the correspondence between each data warehouse table and the table task.
In the process for constructing the data warehouse table blood-edge (lineage) relation map provided by this embodiment, a plurality of table tasks of the data warehouse are determined, where one data warehouse table corresponds to one table task and a table task is the task information for building the current data warehouse table from other data warehouse tables in the data warehouse; dependency relationship information among the table tasks is extracted, and a lineage map of the table tasks is generated from it; and the correspondence between each data warehouse table and its table task is extracted, and the data warehouse table lineage map is generated from the lineage map of the table tasks and that correspondence. Constructing the map from the table tasks and their correspondence with the data warehouse tables can improve the accuracy of the data warehouse table lineage map, which further improves the accuracy of tag set generation for the data warehouse table.
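As a minimal sketch of steps S201 to S203, a table-level lineage map can be derived from task dependencies and a task-to-table correspondence. It assumes the task dependencies are available as (parent task, child task) pairs and that the lineage map is stored as a dictionary from each table to its direct source tables. The names `build_table_lineage`, `task_dependencies` and `task_to_table`, and all example identifiers, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_table_lineage(
    task_dependencies: Iterable[Tuple[str, str]],
    task_to_table: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Map each table to the set of tables it is directly built from."""
    upstream: Dict[str, Set[str]] = defaultdict(set)
    for parent_task, child_task in task_dependencies:  # S202: task-level edges
        parent_table = task_to_table[parent_task]      # S203: task -> table
        child_table = task_to_table[child_task]
        if parent_table != child_table:                # skip self-references
            upstream[child_table].add(parent_table)
    return dict(upstream)


# Hypothetical example: three tasks, one per table (S201).
task_to_table = {
    "task_ods": "ods_orders",
    "task_dwd": "dwd_orders",
    "task_dws": "dws_sales",
}
task_dependencies = [("task_ods", "task_dwd"), ("task_dwd", "task_dws")]

table_lineage = build_table_lineage(task_dependencies, task_to_table)
print(table_lineage)
```

A task that is missing from `task_to_table` raises `KeyError` in this sketch. A real system would need to decide how to treat such tasks.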
FIG. 3 is a schematic flow chart, provided in a third embodiment of the present application, of determining a plurality of associated data warehouse tables related to the data warehouse table to be marked among the data warehouse tables of the data warehouse according to the data warehouse table blood-edge (lineage) relation map. On the basis of the embodiment shown in FIG. 1 or FIG. 2, this embodiment describes that determination process.
As shown in fig. 3, according to the data warehouse table blood relationship map of the present embodiment, determining a plurality of associated data warehouse tables related to the data warehouse table to be marked in the data warehouse tables of the data warehouse may include the steps of:
Step S301, determining a plurality of upstream data warehouse tables of the data warehouse table to be marked according to the data warehouse table blood-edge (lineage) relation map.
Specifically, a plurality of upstream data warehouse tables of the data warehouse table to be marked may be determined according to the data warehouse table lineage map described in step S102 or generated in step S203. An upstream data warehouse table is a data warehouse table that, according to the lineage map, is a source of the data in the data warehouse table to be marked.
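As an illustration of step S301, the upstream tables can be collected by a breadth-first traversal of the lineage map. The sketch assumes the map is a dictionary from each table to its direct source tables, and it collects indirect sources as well as direct ones. The application does not say whether indirect sources count as upstream, so that choice is an assumption. The name `upstream_tables` and the example data are hypothetical.

```python
from collections import deque
from typing import Dict, Set


def upstream_tables(lineage: Dict[str, Set[str]], target: str) -> Set[str]:
    """Collect every table that feeds `target`, directly or indirectly."""
    seen: Set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in lineage.get(current, set()):
            # The `seen` check also guards against cycles in the map.
            if parent not in seen and parent != target:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical lineage: ods_orders -> dwd_orders -> dws_sales <- dim_store
example_lineage = {
    "dwd_orders": {"ods_orders"},
    "dws_sales": {"dwd_orders", "dim_store"},
}
print(sorted(upstream_tables(example_lineage, "dws_sales")))
```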
Step S302, calculating the characteristic correlation strength between each upstream data warehouse table and the data warehouse table to be marked.
Specifically, the feature correlation strength between each upstream data warehouse table determined in step S301 and the data warehouse table to be marked may be calculated.
Optionally, table features of each upstream data warehouse table and of the data warehouse table to be marked may be extracted first. A table feature is a feature that characterizes a data warehouse table.
Optionally, the correlation strength between the table features may be calculated next. Specifically, the correlation strength between the table features of each upstream data warehouse table and the table features of the data warehouse table to be marked may be calculated separately. Optionally, each correlation strength may be calculated by a graph neural network (GNN) technique: each table feature is first converted into a table feature vector, a table feature graph structure is then constructed from the table feature vectors, and finally the table feature graph structure is traversed to calculate the correlation strength between the table features.
Optionally, the feature correlation strength between each upstream data warehouse table and the data warehouse table to be marked may finally be determined according to the correlation strengths between the table features.
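The following sketch illustrates the data flow of step S302 only. It is not the graph neural network technique mentioned above: cosine similarity between table feature vectors is swapped in as a simple stand-in, and averaging the pairwise values into one table-level strength is likewise an assumption, since the application does not fix the aggregation. The function names and example vectors are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 for a zero vector)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def feature_correlation_strength(
    upstream_features: List[Sequence[float]],
    target_features: List[Sequence[float]],
) -> float:
    """Average pairwise similarity between two tables' feature vectors."""
    pairs = [
        cosine_similarity(u, t)
        for u in upstream_features
        for t in target_features
    ]
    return sum(pairs) / len(pairs) if pairs else 0.0


# Hypothetical feature vectors for one upstream table and the table to be marked.
strength = feature_correlation_strength([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]])
print(strength)
```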
Step S303, determining the plurality of associated data warehouse tables according to the feature correlation strength and a preset feature correlation strength threshold.
Specifically, the associated data warehouse tables may be determined according to the feature correlation strength between each upstream data warehouse table and the data warehouse table to be marked calculated in step S302 and a preset feature correlation strength threshold. If the feature correlation strength between an upstream data warehouse table and the data warehouse table to be marked is greater than or equal to the preset threshold, that upstream data warehouse table is taken as an associated data warehouse table of the data warehouse table to be marked; if it is less than the threshold, the upstream data warehouse table is not taken as an associated data warehouse table. The feature correlation strength threshold can be set according to user requirements.
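The threshold rule of step S303 can be sketched as a simple filter, assuming the feature correlation strengths have already been computed and are held in a dictionary keyed by upstream table name. The name `select_associated`, the example values and the threshold of 0.6 are hypothetical.

```python
from typing import Dict, List


def select_associated(strengths: Dict[str, float], threshold: float) -> List[str]:
    """Keep upstream tables whose strength is greater than or equal to the threshold."""
    return sorted(
        table for table, strength in strengths.items() if strength >= threshold
    )


# Hypothetical strengths between three upstream tables and the table to be marked.
example_strengths = {"ods_orders": 0.35, "dwd_orders": 0.82, "dim_store": 0.60}
associated = select_associated(example_strengths, threshold=0.6)
print(associated)
```

The comparison is inclusive, matching the "greater than or equal to" condition in the description. The result is sorted by name only to make the output deterministic.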
In the process provided by this embodiment, a plurality of upstream data warehouse tables of the data warehouse table to be marked are determined according to the data warehouse table blood-edge (lineage) relation map, the feature correlation strength between each upstream data warehouse table and the data warehouse table to be marked is calculated, and the plurality of associated data warehouse tables are determined according to the feature correlation strength and the preset feature correlation strength threshold. Because the upstream data warehouse tables are first determined from the lineage map and the associated data warehouse tables are then selected by feature correlation strength, the accuracy of determining the associated data warehouse tables is improved, which further improves the accuracy of tag set generation for the data warehouse table.
Fig. 4 is a schematic flow chart of a process for generating a tag set of a data warehouse table to be marked according to a tag set of a plurality of associated data warehouse tables provided in a fourth embodiment of the present application, and the process of generating a tag set of a data warehouse table to be marked according to a tag set of a plurality of associated data warehouse tables is described in this embodiment on the basis of the embodiment shown in fig. 1, 2 or 3.
As shown in fig. 4, generating the tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables of the present embodiment may include the steps of:
Step S401, acquiring the tag sets of the plurality of associated data warehouse tables.
As described in step S101, each data warehouse table contains a plurality of tags that help users understand, query and use it. Specifically, the tag sets of the plurality of associated data warehouse tables determined in step S102 or in step S303 may be acquired.
Step S402, extracting all the tags of the associated data warehouse tables from their tag sets, and sorting all the tags according to their tag characteristic values to obtain a tag sequence of the data warehouse table to be marked.
Specifically, after the tag sets of the plurality of associated data warehouse tables are acquired, all the tags of the associated data warehouse tables can be extracted from those tag sets, that is, the collection of all tags of all associated data warehouse tables is obtained. All the tags can then be sorted according to their tag characteristic values to obtain the tag sequence of the data warehouse table to be marked. A tag characteristic value is a value representing the degree of correlation between a tag and a data warehouse table; the higher the value, the higher the degree of correlation.
Step S403, screening the tag sequence according to a preset tag characteristic value threshold or a preset tag quantity threshold of the tag set to obtain the tag set of the data warehouse table to be marked.
Specifically, the tag sequence of the data warehouse table to be marked obtained in step S402 may be screened according to a preset tag characteristic value threshold or a preset tag number threshold of the tag set, to obtain the tag set of the data warehouse table to be marked. The tag characteristic value threshold and the tag quantity threshold can be set according to user requirements.
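Steps S401 to S403 can be sketched as a merge, a sort and a filter. The sketch assumes each associated table's tag set is a dictionary from tag to tag characteristic value. When a tag occurs in several associated tables, its highest value is kept, which is an assumption because the application does not say how duplicates are resolved. The names `rank_tags` and `screen_tags` and the example data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def rank_tags(tag_sets: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """S401/S402: merge all tags and sort by characteristic value, highest first."""
    best: Dict[str, float] = {}
    for tags in tag_sets.values():
        for tag, value in tags.items():
            if value > best.get(tag, float("-inf")):
                best[tag] = value  # assumption: keep the highest value per tag
    # Ties are broken alphabetically so the sequence is deterministic.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


def screen_tags(
    ranked: List[Tuple[str, float]],
    value_threshold: Optional[float] = None,
    max_tags: Optional[int] = None,
) -> List[str]:
    """S403: screen by a value threshold and/or a tag quantity threshold."""
    selected = [
        tag
        for tag, value in ranked
        if value_threshold is None or value >= value_threshold
    ]
    if max_tags is not None:
        selected = selected[:max_tags]
    return selected


# Hypothetical tag sets of two associated tables.
example_tag_sets = {
    "dwd_orders": {"sales": 0.9, "orders": 0.4},
    "dim_store": {"orders": 0.7, "tmp": 0.1},
}
ranked = rank_tags(example_tag_sets)
print(screen_tags(ranked, value_threshold=0.5))
print(screen_tags(ranked, max_tags=1))
```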
In the process provided by this embodiment, the tag sets of the plurality of associated data warehouse tables are acquired; all the tags of the associated data warehouse tables are extracted from those tag sets and sorted according to their tag characteristic values to obtain the tag sequence of the data warehouse table to be marked; and the tag sequence is screened according to a preset tag characteristic value threshold or a preset tag quantity threshold of the tag set to obtain the tag set of the data warehouse table to be marked. Because the tags of the associated data warehouse tables are screened according to their tag characteristic values, the accuracy of tag set generation for the data warehouse table is improved.
Fig. 5 is a schematic structural diagram of a tag set generating device of a data warehouse table according to a fifth embodiment of the present application.
As shown in fig. 5, the tag set generating apparatus 50 of the data warehouse table of the present embodiment includes an acquisition module 51, a determination module 52, and a generation module 53.
The acquisition module 51 is configured to acquire the data warehouse where the data warehouse table to be marked is located, where the data warehouse includes a plurality of data warehouse tables.
The determining module 52 is configured to determine, among the data warehouse tables of the data warehouse, a plurality of associated data warehouse tables related to the data warehouse table to be marked according to a data warehouse table blood-edge (lineage) relation map, where the blood-edge relation map is a data warehouse table relation map including the data circulation paths in the data warehouse.
The generating module 53 is configured to generate the tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables.
The apparatus provided in this embodiment may be used to execute the technical solutions of fig. 1 to 4 in the above method embodiment, and the implementation principle and technical effects are similar, which are not repeated here.
Fig. 6 is a schematic structural diagram of a tag set generating device of a data warehouse table according to a sixth embodiment of the present application.
As shown in fig. 6, the tag set generating apparatus 60 of the data warehouse table of the present embodiment includes: processor 61, memory 62, communication interface 63.
The memory 62 is used to store executable instructions of the processor;
wherein the processor 61 is configured to perform the tag set generation method of the data warehouse table of any of the above method embodiments fig. 1 to 4 via execution of executable instructions.
In the embodiment shown in FIG. 6, it should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the present application may be executed directly by a hardware processor, or by a combination of hardware and software modules in a processor.
The memory may comprise high-speed random access memory (RAM) and may further comprise non-volatile memory (NVM), such as at least one disk memory.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or one type of bus.
The embodiments of the present application also provide a readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the tag set generation method of the data warehouse table of any one of the method embodiments shown in Figs. 1 to 4.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A method for generating a tag set of a data warehouse table, comprising:
acquiring a data warehouse where a data warehouse table to be marked is located, wherein the data warehouse comprises a plurality of data warehouse tables;
determining, according to a data warehouse table blood-edge relationship map, a plurality of associated data warehouse tables related to the data warehouse table to be marked among the data warehouse tables of the data warehouse, wherein the blood-edge relationship map is a data warehouse table relationship map comprising data circulation paths in the data warehouse;
and generating the tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables.
2. The method according to claim 1, wherein the method further comprises:
determining a plurality of table tasks of the data warehouse, wherein one data warehouse table corresponds to one table task, and the table task is task information for constructing the current data warehouse table from other data warehouse tables in the data warehouse;
extracting dependency relationship information among the table tasks;
and generating a blood-edge relationship map of the table tasks according to the dependency relationship information among the table tasks.
3. The method according to claim 2, wherein the method further comprises:
extracting the corresponding relation between each data warehouse table and the table task;
and generating the data warehouse table blood-edge relationship map according to the blood-edge relationship map of the table tasks and the corresponding relation.
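The two-stage construction in claims 2 and 3 can be sketched as follows. This is a minimal sketch that assumes task dependencies arrive as (upstream task, downstream task) pairs and that the table-to-task correspondence is a one-to-one dictionary; the function names and the task and table names are hypothetical.

```python
def build_task_lineage(task_dependencies):
    """Map each table task to the set of tasks that depend on it."""
    lineage = {}
    for upstream, downstream in task_dependencies:
        lineage.setdefault(upstream, set()).add(downstream)
        lineage.setdefault(downstream, set())  # keep sink tasks as nodes
    return lineage


def build_table_lineage(task_lineage, task_to_table):
    """Relabel the task-level map with the table each task builds."""
    table_lineage = {}
    for up_task, down_tasks in task_lineage.items():
        edges = table_lineage.setdefault(task_to_table[up_task], set())
        for down_task in down_tasks:
            edges.add(task_to_table[down_task])
    return table_lineage


# Hypothetical dependency information extracted from a scheduler.
deps = [
    ("t_ods_orders", "t_dwd_orders"),
    ("t_ods_users", "t_dwd_orders"),
    ("t_dwd_orders", "t_ads_sales"),
]
# Hypothetical one-to-one correspondence between tasks and tables.
task_to_table = {
    "t_ods_orders": "ods_orders",
    "t_ods_users": "ods_users",
    "t_dwd_orders": "dwd_orders",
    "t_ads_sales": "ads_sales",
}

task_lineage = build_task_lineage(deps)
table_lineage = build_table_lineage(task_lineage, task_to_table)
```

Because each table corresponds to exactly one task, the table-level map is simply the task-level map with its nodes renamed.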
4. The method according to claim 3, wherein said determining, according to a data warehouse table blood-edge relationship map, a plurality of associated data warehouse tables related to said data warehouse table to be marked among the data warehouse tables of said data warehouse comprises:
determining a plurality of upstream data warehouse tables of the data warehouse table to be marked according to the data warehouse table blood-edge relationship map;
calculating the feature correlation strength between each upstream data warehouse table and the data warehouse table to be marked;
and determining the plurality of associated data warehouse tables according to the feature correlation strength and a preset feature correlation strength threshold.
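A minimal sketch of the selection in claim 4 is given below. It assumes that "upstream" covers every table reachable by walking the map backwards from the table to be marked, and that a table is kept when its strength is greater than or equal to the threshold; the claim fixes neither point. The strengths are supplied as a precomputed dictionary, and all names and values are hypothetical.

```python
from collections import deque


def upstream_tables(lineage, target):
    """Collect every table from which `target` can be reached."""
    parents = {}
    for src, dsts in lineage.items():
        for dst in dsts:
            parents.setdefault(dst, set()).add(src)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen and parent != target:
                seen.add(parent)
                queue.append(parent)
    return seen


def select_associated(upstream, strength, threshold):
    """Keep upstream tables whose strength reaches the preset threshold."""
    return sorted(t for t in upstream if strength.get(t, 0.0) >= threshold)


lineage = {
    "src_a": {"ods_orders"},
    "ods_orders": {"dwd_orders"},
    "ods_users": {"dwd_orders"},
    "dwd_orders": {"ads_sales"},
}
# Hypothetical feature correlation strengths against "dwd_orders".
strength = {"ods_orders": 0.9, "ods_users": 0.4, "src_a": 0.7}

upstream = upstream_tables(lineage, "dwd_orders")
associated = select_associated(upstream, strength, threshold=0.5)
```

Tables downstream of the table to be marked, such as `ads_sales` here, never enter the candidate set.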
5. The method of claim 4, wherein said calculating a feature correlation strength between each of said upstream data warehouse tables and said data warehouse table to be marked comprises:
extracting table features of the upstream data warehouse table and the data warehouse table to be marked;
calculating the correlation strength between the table features;
and determining the feature correlation strength between each upstream data warehouse table and the data warehouse table to be marked according to the correlation strength between the table features.
6. The method of claim 5, wherein said calculating a correlation strength between each of said table features comprises:
converting each table feature into a table feature vector;
constructing a table feature graph structure according to the table feature vector;
traversing the table feature graph structure to calculate the correlation strength between the table features.
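The claims do not specify how features are vectorized, what the graph looks like, or how it is traversed. The sketch below is one possible reading under stated assumptions: each table feature (here, a column name) already has a numeric vector, the graph is a weighted bipartite graph whose edge weights are cosine similarities, and the traversal keeps the best match for each feature of the table to be marked and averages those values. All names and vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_feature_graph(upstream_vectors, target_vectors):
    """Edge (upstream_feature, target_feature) -> cosine similarity."""
    return {
        (uf, tf): cosine(uv, tv)
        for uf, uv in upstream_vectors.items()
        for tf, tv in target_vectors.items()
    }


def table_correlation(graph, target_features):
    """Traverse all edges, keep the best match per target feature, average."""
    best = {tf: 0.0 for tf in target_features}
    for (_, tf), weight in graph.items():
        best[tf] = max(best[tf], weight)
    return sum(best.values()) / len(best) if best else 0.0


# Hypothetical feature vectors for one upstream table and the target table.
upstream_vectors = {"order_id": [1.0, 0.0], "amount": [0.0, 1.0]}
target_vectors = {"order_id": [1.0, 0.0], "user_id": [1.0, 1.0]}

graph = build_feature_graph(upstream_vectors, target_vectors)
score = table_correlation(graph, target_vectors)
```

The resulting `score` is the kind of table-level value that claim 4 compares against the preset threshold.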
7. The method of claim 6, wherein generating the tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables comprises:
acquiring the tag sets of the plurality of associated data warehouse tables;
extracting all tags of each associated data warehouse table from the tag set of that associated data warehouse table;
sorting all the tags according to the tag feature values of the tags to obtain a tag sequence of the data warehouse table to be marked;
and screening the tag sequence according to a preset tag feature value threshold or a preset tag quantity threshold of the tag set to obtain the tag set of the data warehouse table to be marked.
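The sorting and screening steps of claim 7 can be sketched as below. The claim does not define the tag feature value, so this sketch treats it as a score supplied per tag; the descending order, the tie-break on tag name, and the function and parameter names are assumptions.

```python
def rank_and_filter_tags(associated_tag_sets, tag_value,
                         value_threshold=None, max_tags=None):
    """Pool the associated tables' tags, rank them, then screen the ranking."""
    all_tags = set()
    for tags in associated_tag_sets.values():
        all_tags |= set(tags)
    # Highest tag feature value first; ties broken by name for determinism.
    ranked = sorted(all_tags, key=lambda t: (-tag_value.get(t, 0.0), t))
    if value_threshold is not None:
        ranked = [t for t in ranked if tag_value.get(t, 0.0) >= value_threshold]
    if max_tags is not None:
        ranked = ranked[:max_tags]
    return ranked


# Hypothetical tag sets of the associated tables and per-tag scores.
associated_tag_sets = {
    "ods_orders": {"order", "finance"},
    "ods_users": {"user", "finance"},
}
tag_value = {"finance": 0.9, "order": 0.6, "user": 0.3}

by_threshold = rank_and_filter_tags(associated_tag_sets, tag_value,
                                    value_threshold=0.5)
by_count = rank_and_filter_tags(associated_tag_sets, tag_value, max_tags=1)
```

Passing only `value_threshold` or only `max_tags` corresponds to the two alternatives named in the claim.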
8. A tag set generating apparatus of a data warehouse table, comprising:
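an acquiring module, configured to acquire a data warehouse where a data warehouse table to be marked is located, wherein the data warehouse comprises a plurality of data warehouse tables;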
a determining module, configured to determine, from data warehouse tables of the data warehouse, a plurality of associated data warehouse tables related to the data warehouse table to be marked according to a data warehouse table blood-edge relationship map, where the blood-edge relationship map is a data warehouse table relationship map that includes a data circulation path in the data warehouse;
and a generating module, configured to generate the tag set of the data warehouse table to be marked according to the tag sets of the plurality of associated data warehouse tables.
9. A tag set generating apparatus of a data warehouse table, comprising:
a processor, a memory, a communication interface;
the memory is used for storing executable instructions of the processor;
wherein the processor is configured to perform the tag set generation method of the data warehouse table of any one of claims 1 to 7 via execution of the executable instructions.
10. A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the tag set generation method of the data warehouse table of any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311265535.7A CN117312317A (en) 2023-09-27 2023-09-27 Tag set generation method, device and equipment of data warehouse table and storage medium

Publications (1)

Publication Number Publication Date
CN117312317A true CN117312317A (en) 2023-12-29

Family

ID=89242071

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311265535.7A Pending CN117312317A (en) 2023-09-27 2023-09-27 Tag set generation method, device and equipment of data warehouse table and storage medium

Country Status (1)

Country Link
CN (1) CN117312317A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination