CN117852777B

CN117852777B - Linking method and system for multi-source heterogeneous data asset

Info

Publication number: CN117852777B
Application number: CN202410262827.3A
Authority: CN
Inventors: 霍绥力; 张春红; 张尧
Original assignee: Beijing Huaban Zhiyuan Technology Co ltd
Current assignee: Beijing Huaban Zhiyuan Technology Co ltd
Priority date: 2024-03-07
Filing date: 2024-03-07
Publication date: 2024-05-24
Anticipated expiration: 2044-03-07
Also published as: CN117852777A

Abstract

The invention discloses a linking method and system for multi-source heterogeneous data assets, applied to the technical field of data processing. The method comprises the following steps: obtaining basic information of the multi-source heterogeneous data assets and performing tertiary clustering to generate a data asset clustering result; calling the data asset timestamp through a data distribution mapping algorithm and traversing the data asset clustering result to perform dimension reduction mapping, generating data asset low-dimensional mapping information; receiving a data access request from a user side for the data asset clustering result, wherein the data access request comprises a target timestamp; performing mapping information matching on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp to obtain target low-dimensional mapping information; and performing up-dimensional restoration on the target low-dimensional mapping information to generate a target data asset, which is sent to the user side. The invention thereby solves the technical problems of large storage load and poor data asset security of multi-source heterogeneous data assets in the prior art.

Description

Linking method and system for multi-source heterogeneous data asset

Technical Field

The invention relates to the field of data processing, in particular to a method and a system for linking multi-source heterogeneous data assets.

Background

A data asset is a data resource capable of bringing economic benefits to an enterprise, and is typically recorded in electronic storage. In the prior art, however, data assets come from a wide range of sources and have complex structures, so their storage load is large; moreover, once data is leaked the data asset can be obtained directly, so the security of the data asset is poor.

Therefore, in the prior art, the multi-source heterogeneous data asset has the technical problems of large storage load and poor data asset security.

Disclosure of Invention

By providing a linking method and system for multi-source heterogeneous data assets, the application solves the technical problems of large storage load and poor data asset security of multi-source heterogeneous data assets in the prior art.

The present application provides a method of linking together multi-source heterogeneous data assets, the method comprising: obtaining multi-source heterogeneous data asset base information, wherein the multi-source heterogeneous data asset base information at least comprises a data asset source, a data asset structure, a data asset type and a data asset timestamp; performing tertiary clustering on multi-source heterogeneous data assets according to the data asset sources, the data asset types and the data asset structures to generate data asset clustering results;

The data asset time stamp is called through a data distribution mapping algorithm, the data asset clustering result is traversed to carry out dimension reduction mapping, and data asset low-dimension mapping information is generated, wherein the data asset low-dimension mapping information is provided with an index time stamp; receiving a data access request of a user side to the data asset clustering result, wherein the data access request comprises a target timestamp; performing mapping information matching on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp to obtain target low-dimensional mapping information; and carrying out up-dimensional restoration on the target low-dimensional mapping information, generating a target data asset and sending the target data asset to a user side.

The present application also provides a linking system for multi-source heterogeneous data assets, the system comprising: a data acquisition module, configured to obtain multi-source heterogeneous data asset basic information, wherein the multi-source heterogeneous data asset basic information at least comprises a data asset source, a data asset structure, a data asset type and a data asset timestamp; a clustering module, configured to perform tertiary clustering on the multi-source heterogeneous data assets according to the data asset sources, the data asset types and the data asset structures to generate a data asset clustering result; a dimension reduction mapping module, configured to call the data asset timestamp through a data distribution mapping algorithm and traverse the data asset clustering result to perform dimension reduction mapping, generating data asset low-dimensional mapping information, wherein the data asset low-dimensional mapping information has an index timestamp; an access request receiving module, configured to receive a data access request from a user side for the data asset clustering result, wherein the data access request comprises a target timestamp; a mapping matching module, configured to perform mapping information matching on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp to obtain target low-dimensional mapping information; and a data restoration module, configured to perform up-dimensional restoration on the target low-dimensional mapping information, generate a target data asset and send the target data asset to the user side.

The application also provides an electronic device, comprising:

A memory for storing executable instructions;

A processor, configured to implement the linking method for multi-source heterogeneous data assets when executing the executable instructions stored in the memory.

The present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the linking method for multi-source heterogeneous data assets provided by the present application.

In the linking method and system for multi-source heterogeneous data assets provided by the application, basic information of the multi-source heterogeneous data assets is obtained and tertiary clustering is performed to generate a data asset clustering result. The data asset timestamp is called through a data distribution mapping algorithm, and the data asset clustering result is traversed to perform dimension reduction mapping, generating data asset low-dimensional mapping information. A data access request from a user side for the data asset clustering result is received, wherein the data access request comprises a target timestamp. Mapping information matching is performed on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp to obtain target low-dimensional mapping information. Up-dimensional restoration is then performed on the target low-dimensional mapping information to generate a target data asset, which is sent to the user side. Dimension-reduced storage of the multi-source heterogeneous data assets is thereby realized: the storage load is reduced, the data content cannot be extracted directly even if the stored data is leaked, and data security is improved. The technical problems of large storage load and poor data asset security of multi-source heterogeneous data assets in the prior art are thus solved.

The foregoing is only an overview of the technical solution of the present application. So that the technical means of the application can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features and advantages of the application become more readily apparent, specific embodiments of the application are set forth below.

Drawings

In order to more clearly illustrate the technical solution of the embodiments of the present invention, the following description will briefly explain the drawings of the embodiments of the present invention. It is apparent that the figures in the following description relate only to some embodiments of the invention and are not limiting of the invention.

FIG. 1 is a flow diagram of a method for linking multi-source heterogeneous data assets provided by an embodiment of the present application;

FIG. 2 is a schematic flow chart of a method for linking multi-source heterogeneous data assets to obtain asset clustering results according to an embodiment of the present application;

FIG. 3 is a schematic flow chart of a method for linking multi-source heterogeneous data assets to match target low-dimensional mapping information according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of a linking system for multi-source heterogeneous data assets according to an embodiment of the present application;

FIG. 5 is a schematic structural diagram of an electronic device of the linking system for multi-source heterogeneous data assets according to an embodiment of the present invention.

Reference numerals: data acquisition module 11, clustering module 12, dimension reduction mapping module 13, access request receiving module 14, mapping matching module 15, data restoration module 16, processor 31, memory 32, input device 33 and output device 34.

Detailed Description

Examples

The present application will be further described in detail with reference to the accompanying drawings, for the purpose of making the objects, technical solutions and advantages of the present application more apparent, and the described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.

In the following description, the terms "first", "second", "third" and the like are merely used to distinguish similar objects and do not represent a particular ordering of the objects, it being understood that the "first", "second", "third" may be interchanged with a particular order or sequence, as permitted, to enable embodiments of the application described herein to be practiced otherwise than as illustrated or described herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only.

While the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server, the modules are merely illustrative, and different aspects of the system and method may use different modules.

A flowchart is used in the present application to describe the operations performed by a system according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed in order precisely. Rather, the various steps may be processed in reverse order or simultaneously, as desired. Also, other operations may be added to or removed from these processes.

As shown in fig. 1, an embodiment of the present application provides a method for linking multi-source heterogeneous data assets, the method comprising:

Obtaining multi-source heterogeneous data asset base information, wherein the multi-source heterogeneous data asset base information at least comprises a data asset source, a data asset structure, a data asset type and a data asset timestamp;

performing tertiary clustering on multi-source heterogeneous data assets according to the data asset sources, the data asset types and the data asset structures to generate data asset clustering results;

the data asset time stamp is called through a data distribution mapping algorithm, the data asset clustering result is traversed to carry out dimension reduction mapping, and data asset low-dimension mapping information is generated, wherein the data asset low-dimension mapping information is provided with an index time stamp;

A data asset is a data resource capable of bringing economic benefits to an enterprise, and is typically recorded in electronic storage. Multi-source heterogeneous data asset basic information is obtained, wherein the multi-source heterogeneous data asset basic information at least comprises a data asset source, a data asset structure, a data asset type and a data asset timestamp, together with the specific data content. The data asset source is the specific origin of the data, such as an enterprise management system or an enterprise website; the data asset structure is the data storage structure; the data asset type is the specific category of the data, such as the numeric category, the text category and other non-graphic data; and the data asset timestamp is the acquisition time of the data. Tertiary clustering is performed on the multi-source heterogeneous data assets according to the data asset sources, the data asset types and the data asset structures, with clustering carried out in the order of data asset source, data asset type and data asset structure, thereby generating a data asset clustering result. Further, the data asset timestamp is called through a data distribution mapping algorithm, and the data asset clustering result is traversed to perform dimension reduction mapping, generating data asset low-dimensional mapping information, wherein the data asset low-dimensional mapping information has an index timestamp.

As shown in fig. 2, the method provided by the embodiment of the present application further includes:

Performing primary clustering on the multi-source heterogeneous data assets according to the data asset sources to generate a data asset primary clustering result;

Traversing the primary clustering result of the data asset to perform secondary clustering according to the type of the data asset, and generating a secondary clustering result of the data asset;

And traversing the data asset secondary clustering result to perform tertiary clustering according to the data asset structure, and generating the data asset clustering result.

Performing tertiary clustering on the multi-source heterogeneous data assets according to the data asset sources, the data asset types and the data asset structures to generate a data asset clustering result comprises the following. Primary clustering is performed on the multi-source heterogeneous data assets according to the data asset sources to generate a data asset primary clustering result. The data asset primary clustering result is then traversed and secondary clustering is performed according to the data asset types to generate a data asset secondary clustering result. Further, the data asset secondary clustering result is traversed and tertiary clustering is performed according to the data asset structures to generate the data asset clustering result. In other words, the data asset clustering result is generated by clustering in the order of primary, secondary and tertiary clustering. Once the data asset clustering result has been generated, the data asset source, data asset type and data asset structure are consistent within each cluster, which makes it convenient to process the data in each cluster and to classify and store the data by cluster.
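The three clustering passes described above can be sketched as nested grouping. This is a minimal illustration only: the description does not name a clustering algorithm, so exact-match grouping on each attribute is an assumption, and the record fields (`source`, `type`, `structure`, `timestamp`, `value`) and the sample data are hypothetical.

```python
from collections import defaultdict


def group_by(items, key):
    """Split asset records into buckets that share the same value of `key`."""
    buckets = defaultdict(list)
    for item in items:
        buckets[item[key]].append(item)
    return dict(buckets)


def tertiary_cluster(assets):
    """Cluster by source, then by type within each source, then by structure."""
    result = {}
    for source, primary in group_by(assets, "source").items():        # primary clustering
        for a_type, secondary in group_by(primary, "type").items():   # secondary clustering
            for structure, members in group_by(secondary, "structure").items():  # tertiary clustering
                result[(source, a_type, structure)] = members
    return result


# Hypothetical sample records (not taken from the patent).
assets = [
    {"source": "management_system", "type": "text", "structure": "table",
     "timestamp": "20240301", "value": "A1024"},
    {"source": "management_system", "type": "text", "structure": "table",
     "timestamp": "20240302", "value": "A1099"},
    {"source": "management_system", "type": "numeric", "structure": "table",
     "timestamp": "20240303", "value": "5120"},
    {"source": "website", "type": "text", "structure": "document",
     "timestamp": "20240304", "value": "B7"},
]

clusters = tertiary_cluster(assets)
for key, members in clusters.items():
    print(key, [m["value"] for m in members])
```

Every cluster key is a (source, type, structure) triple, so all members of a cluster agree on those three attributes, which is the property the later mapping steps rely on.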

The method provided by the embodiment of the application further comprises the following steps:

Obtaining a first cluster data asset characteristic value set of the data asset clustering result;

Randomly selecting a first data asset characteristic value from the first cluster data asset characteristic value set, and setting the first data asset characteristic value as a reference characteristic value, wherein the reference characteristic value has a reference timestamp at the data asset timestamp;

Randomly selecting a second data asset characteristic value which is different from the first data asset characteristic value from the first cluster data asset characteristic value set, and setting the second data asset characteristic value as a comparison characteristic value, wherein the comparison characteristic value has a comparison timestamp at the data asset timestamp;

Calculating a semantic mapping vector from the reference characteristic value to the comparison characteristic value;

Calculating a time mapping vector from the reference timestamp to the comparison timestamp;

And constructing low-dimensional mapping information of the comparison characteristic value based on the reference characteristic value, the reference timestamp, the semantic mapping vector and the time mapping vector, and adding the low-dimensional mapping information into the data asset low-dimensional mapping information, wherein the reference timestamp is the index timestamp.

Calling the data asset timestamp through a data distribution mapping algorithm and traversing the data asset clustering result to perform dimension reduction mapping, generating data asset low-dimensional mapping information having an index timestamp, comprises the following. A first cluster data asset characteristic value set of the data asset clustering result is obtained, wherein the first cluster data asset characteristic value set is the specific content of the data in that cluster. A first data asset characteristic value is randomly selected from the first cluster data asset characteristic value set, that is, one item of data in the set is obtained at random, and is set as a reference characteristic value, wherein the reference characteristic value has a reference timestamp at the data asset timestamp. A second data asset characteristic value different from the first data asset characteristic value is then randomly selected from the first cluster data asset characteristic value set and set as a comparison characteristic value, wherein the comparison characteristic value has a comparison timestamp at the data asset timestamp. A semantic mapping vector from the reference characteristic value to the comparison characteristic value is calculated. Because the characteristic values within a cluster are all data of the same category, such as Chinese characters, digits or letters, a semantic distance vector exists between them, so the comparison characteristic value can be represented by the semantic mapping vector. The time mapping vector from the reference timestamp to the comparison timestamp is calculated in the same way. Any comparison characteristic value can thus be stored, by means of the semantic mapping vector and the time mapping vector, relative to the reference characteristic value and the reference timestamp, which reduces the data storage load and improves data security. Low-dimensional mapping information of the comparison characteristic value is constructed based on the reference characteristic value, the reference timestamp, the semantic mapping vector and the time mapping vector, and is added into the data asset low-dimensional mapping information, wherein the reference timestamp is the index timestamp. The same procedure is repeated until the semantic mapping vectors and time mapping vectors of the remaining data asset characteristic values in the first cluster data asset characteristic value set have all been obtained.
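The record layout implied by this step can be sketched as follows. This is an illustrative sketch, not the patented implementation: the field names, the fixed-width string timestamps and the stand-in `unit_vector` distance (per-character code point difference for equal-length strings) are all assumptions made for the example.

```python
import random


def unit_vector(reference, comparison):
    """Stand-in per-unit distance for equal-length strings (an assumption)."""
    return [ord(c) - ord(r) for r, c in zip(reference, comparison)]


def build_low_dim_mapping(cluster, rng=random):
    """Pick a random reference asset and express every other asset relative to it."""
    reference = rng.choice(cluster)
    records = []
    for asset in cluster:
        if asset is reference:
            continue  # the reference itself is stored once, not as a record
        records.append({
            "index_timestamp": reference["timestamp"],  # reference timestamp doubles as index
            "semantic_vector": unit_vector(reference["value"], asset["value"]),
            "time_vector": unit_vector(reference["timestamp"], asset["timestamp"]),
        })
    return {
        "reference_value": reference["value"],
        "reference_timestamp": reference["timestamp"],
        "records": records,
    }


# Hypothetical cluster: same source, type and structure, equal-length values.
cluster = [
    {"value": "A1024", "timestamp": "20240301"},
    {"value": "A1099", "timestamp": "20240302"},
    {"value": "A2048", "timestamp": "20240305"},
]

mapping = build_low_dim_mapping(cluster, random.Random(0))
print(mapping["reference_value"], len(mapping["records"]))
```

Only the reference value is kept in clear form; every other asset in the cluster is held as a pair of difference vectors plus the shared index timestamp.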

performing semantic unit decomposition on the reference characteristic value to obtain a reference unit sequence;

carrying out semantic unit decomposition on the comparison characteristic values to obtain a comparison unit sequence;

When the number of first semantic units in the reference unit sequence is greater than or equal to the number of second semantic units in the comparison unit sequence, aligning the comparison unit sequence with the reference unit sequence from the head and performing semantic distance vector analysis to generate the semantic mapping vector;

When the number of first semantic units in the reference unit sequence is smaller than the number of second semantic units in the comparison unit sequence, aligning the comparison unit sequence with the reference unit sequence from the head, appending preset reference characters to the reference unit sequence until its tail aligns with the tail of the comparison unit sequence, and performing semantic distance vector analysis to generate the semantic mapping vector.

Calculating the semantic mapping vector from the reference characteristic value to the comparison characteristic value comprises the following. Semantic unit decomposition is performed on the reference characteristic value, decomposing it into a plurality of single units, each of which is the smallest unit into which the reference characteristic value can be split; for example, a numeric characteristic value is split into single digits. The sequence formed by these reference units is the reference unit sequence. Semantic unit decomposition is performed on the comparison characteristic value in the same way to obtain a comparison unit sequence. When the number of first semantic units in the reference unit sequence is greater than or equal to the number of second semantic units in the comparison unit sequence, the elements of the reference unit sequence are sufficient to calculate a semantic mapping vector for every unit of the comparison unit sequence, so the comparison unit sequence and the reference unit sequence are aligned from the head and semantic distance vector analysis is performed to generate the semantic mapping vector. When the number of first semantic units in the reference unit sequence is smaller than the number of second semantic units in the comparison unit sequence, the elements of the reference unit sequence are not sufficient to cover every unit of the comparison unit sequence. In that case the comparison unit sequence is aligned with the reference unit sequence from the head, preset reference characters are appended after the last unit of the reference unit sequence until its tail aligns with the tail of the comparison unit sequence, and semantic distance vector analysis is performed to generate the semantic mapping vector, wherein the preset reference character is a character set in advance to facilitate calculation of the semantic mapping vector. The time mapping vector from the reference timestamp to the comparison timestamp is calculated using the same scheme as the semantic mapping vector.
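The two alignment cases can be sketched as below. The description does not define how the distance between two semantic units is measured, nor which preset reference character is used; the code point difference and the padding character `"0"` are assumptions chosen only to make the example concrete.

```python
PRESET_REFERENCE_CHAR = "0"  # assumption: the description does not name this character


def decompose(value):
    """Split a characteristic value into its smallest units (here: single characters)."""
    return list(value)


def semantic_mapping_vector(reference, comparison, pad=PRESET_REFERENCE_CHAR):
    """Per-unit distance from the reference sequence to the comparison sequence."""
    ref_units = decompose(reference)
    cmp_units = decompose(comparison)
    if len(ref_units) < len(cmp_units):
        # Reference is shorter: append preset characters until the tails align.
        ref_units = ref_units + [pad] * (len(cmp_units) - len(ref_units))
    # zip stops at the end of the comparison sequence, so surplus reference
    # units (reference longer than comparison) are simply not used.
    return [ord(c) - ord(r) for r, c in zip(ref_units, cmp_units)]


print(semantic_mapping_vector("A1024", "A1099"))  # [0, 0, 0, 7, 5]
print(semantic_mapping_vector("AB", "ABCD"))      # [0, 0, 19, 20]
```

The vector always has the length of the comparison sequence, so its length alone tells the restoration step how many units to rebuild. Under the same assumptions, the function can be applied unchanged to timestamps written as digit strings.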

Receiving a data access request of a user side to the data asset clustering result, wherein the data access request comprises a target timestamp;

Performing mapping information matching on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp to obtain target low-dimensional mapping information;

And carrying out up-dimensional restoration on the target low-dimensional mapping information, generating a target data asset and sending the target data asset to a user side.

A data access request from the user side for the data asset clustering result is received, wherein the data access request comprises a target timestamp. Mapping information matching is then performed on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp; that is, once the target timestamp has been obtained, it is combined with the index timestamp to locate the data to be accessed, yielding the target low-dimensional mapping information. Finally, up-dimensional restoration is performed on the target low-dimensional mapping information: during restoration, the semantic mapping vector contained in the target low-dimensional mapping information is combined with the reference characteristic value and the inverse operation is applied, generating a target data asset that is sent to the user side. Dimension-reduced storage of the multi-source heterogeneous data assets is thereby realized: the storage load is reduced, the data content cannot be extracted directly even if the stored data is leaked, and data security is improved.
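The inverse operation used in up-dimensional restoration can be illustrated as follows. The sketch assumes the mapping vector holds per-character code point differences and that `"0"` is the preset reference character; neither choice is specified in the description, and the field names are hypothetical.

```python
def restore_value(reference, vector, pad="0"):
    """Invert a per-unit difference vector against the reference value."""
    ref_units = list(reference)
    if len(ref_units) < len(vector):
        # Re-create the padding that was appended when the vector was built.
        ref_units += [pad] * (len(vector) - len(ref_units))
    # The vector length equals the length of the original comparison value.
    return "".join(chr(ord(r) + d) for r, d in zip(ref_units, vector))


def restore_asset(reference_value, record):
    """Rebuild the target data asset from one low-dimensional record."""
    return {
        "value": restore_value(reference_value, record["semantic_vector"]),
        "timestamp": restore_value(record["index_timestamp"], record["time_vector"]),
    }


# Hypothetical stored record for an asset whose reference value is "A1024".
record = {
    "index_timestamp": "20240301",
    "semantic_vector": [0, 0, 0, 7, 5],
    "time_vector": [0, 0, 0, 0, 0, 0, 0, 1],
}
print(restore_asset("A1024", record))
```

Without the reference value the difference vectors alone do not reveal the original content, which is the property the description relies on for its security claim.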

As shown in fig. 3, the method provided by the embodiment of the present application further includes:

calculating an index time mapping vector from the index time stamp to the target time stamp;

And matching the target low-dimensional mapping information from the data asset low-dimensional mapping information according to the index time mapping vector.

Performing mapping information matching on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp to obtain target low-dimensional mapping information comprises the following. An index time mapping vector from the index timestamp to the target timestamp is calculated; the index time mapping vector obtained in this way equals the time mapping vector of the comparison timestamp corresponding to the data the user wishes to retrieve. The target low-dimensional mapping information is then matched from the data asset low-dimensional mapping information according to the index time mapping vector.
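A minimal sketch of this matching step, assuming timestamps are fixed-width digit strings and that the time mapping vector is a per-character difference (both are assumptions; the record fields are hypothetical):

```python
def time_mapping_vector(index_timestamp, target_timestamp):
    """Per-unit difference from the index timestamp to the target timestamp."""
    return [ord(t) - ord(i) for i, t in zip(index_timestamp, target_timestamp)]


def match_records(records, target_timestamp):
    """Return the records whose stored time vector equals the index time mapping vector."""
    hits = []
    for record in records:
        wanted = time_mapping_vector(record["index_timestamp"], target_timestamp)
        if record["time_vector"] == wanted:
            hits.append(record)
    return hits


# Hypothetical low-dimensional records sharing the index timestamp "20240301".
records = [
    {"index_timestamp": "20240301", "time_vector": [0, 0, 0, 0, 0, 0, 0, 1],
     "semantic_vector": [0, 0, 0, 7, 5]},
    {"index_timestamp": "20240301", "time_vector": [0, 0, 0, 0, 0, 0, 0, 4],
     "semantic_vector": [0, 1, 0, 2, 4]},
]

print(match_records(records, "20240305"))
```

The lookup never needs the original comparison timestamps: the vector computed from the request is compared directly with the stored time mapping vectors.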

Configuring a data asset handling task set, wherein the data asset handling task set comprises a to-be-called data asset basic information tag;

Traversing the data asset handling task set and configuring a uniquely associated task number tag set;

And calling the to-be-called data asset basic information tag, the task number tag set and the data asset timestamp through a data distribution mapping algorithm, traversing the data asset clustering result to perform dimension reduction mapping, and generating data asset low-dimensional mapping information, wherein the data asset low-dimensional mapping information has an index timestamp.

A data asset handling task set is configured, wherein the data asset handling task set comprises a plurality of execution tasks that need to call data from the data set, and comprises a to-be-called data asset basic information tag. The data asset handling task set is traversed and a uniquely associated task number tag set is configured; that is, a unique task number is assigned to each task in the data asset handling task set. The to-be-called data asset basic information tag, the task number tag set and the data asset timestamp are then called through a data distribution mapping algorithm, and the data asset clustering result is traversed to perform dimension reduction mapping, generating data asset low-dimensional mapping information having an index timestamp. With the data asset handling task set configured, the task number can be looked up directly when the corresponding task is processed later, so that the corresponding data is obtained efficiently.

Randomly selecting a first data asset characteristic value from the first cluster of data asset characteristic values, and setting the first data asset characteristic value as a reference characteristic value, wherein the reference characteristic value has a reference timestamp at the data asset timestamp;

Randomly selecting a second data asset characteristic value which is different from the first data asset characteristic value from the first cluster data asset characteristic value set, and setting the second data asset characteristic value as a comparison characteristic value, wherein the comparison characteristic value has a comparison time stamp at the data asset time stamp;

Matching an associated task number set for the comparison characteristic value according to the to-be-called data asset basic information tag and the task number tag set;

And constructing low-dimensional mapping information of the comparison characteristic value based on the reference characteristic value, the reference timestamp, the semantic mapping vector, the time mapping vector and the associated task number set, and adding the low-dimensional mapping information into the low-dimensional mapping information of the data asset, wherein the reference timestamp is the index timestamp.

Calling the to-be-called data asset basic information tag, the task number tag set and the data asset timestamp through a data distribution mapping algorithm and traversing the data asset clustering result to perform dimension reduction mapping, generating data asset low-dimensional mapping information, comprises the following. A first cluster data asset characteristic value set of the data asset clustering result is obtained. A first data asset characteristic value is randomly selected from the first cluster data asset characteristic value set and set as a reference characteristic value, wherein the reference characteristic value has a reference timestamp at the data asset timestamp. A second data asset characteristic value different from the first data asset characteristic value is randomly selected from the first cluster data asset characteristic value set and set as a comparison characteristic value, wherein the comparison characteristic value has a comparison timestamp at the data asset timestamp. A semantic mapping vector from the reference characteristic value to the comparison characteristic value is calculated, and a time mapping vector from the reference timestamp to the comparison timestamp is calculated. An associated task number set is matched for the comparison characteristic value according to the to-be-called data asset basic information tag and the task number tag set; that is, the comparison characteristic value is put in correspondence with the task numbers. Low-dimensional mapping information of the comparison characteristic value is constructed based on the reference characteristic value, the reference timestamp, the semantic mapping vector, the time mapping vector and the associated task number set, and is added into the data asset low-dimensional mapping information, wherein the reference timestamp is the index timestamp.
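The task numbering and matching described above might look like the following sketch. The task structure, the `T001`-style number format and the rule that a task is associated with an asset when every field of its to-be-called tag matches the asset's basic information are all assumptions made for illustration.

```python
def number_tasks(tasks):
    """Assign a unique task number to every task in the handling task set."""
    return {f"T{i:03d}": task for i, task in enumerate(tasks, start=1)}


def associated_task_numbers(asset_info, numbered_tasks):
    """Task numbers whose to-be-called tag is satisfied by the asset's basic information."""
    return [
        number
        for number, task in numbered_tasks.items()
        if all(asset_info.get(key) == wanted for key, wanted in task["to_be_called"].items())
    ]


# Hypothetical task set; "to_be_called" plays the role of the basic information tag.
tasks = [
    {"name": "monthly_report", "to_be_called": {"source": "management_system", "type": "text"}},
    {"name": "site_audit", "to_be_called": {"source": "website"}},
    {"name": "table_export", "to_be_called": {"structure": "table"}},
]
numbered = number_tasks(tasks)

asset_info = {"source": "management_system", "type": "text", "structure": "table"}
task_numbers = associated_task_numbers(asset_info, numbered)
print(task_numbers)  # ['T001', 'T003']

# The associated task numbers are stored alongside the mapping vectors.
record = {
    "index_timestamp": "20240301",
    "semantic_vector": [0, 0, 0, 7, 5],
    "time_vector": [0, 0, 0, 0, 0, 0, 0, 1],
    "task_numbers": task_numbers,
}
```

Storing the task numbers inside each record lets a later task fetch its data by number rather than by scanning every cluster.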

According to the technical scheme provided by the embodiment of the invention, multi-source heterogeneous data asset basic information is obtained, which at least comprises a data asset source, a data asset structure, a data asset type and a data asset timestamp. Tertiary clustering is performed on the multi-source heterogeneous data assets according to the data asset source, the data asset type and the data asset structure to generate a data asset clustering result. The data asset timestamp is retrieved through a data distribution mapping algorithm, and the data asset clustering result is traversed to perform dimension-reduction mapping, generating data asset low-dimensional mapping information that has an index timestamp. A data access request from the user side for the data asset clustering result is received, the data access request comprising a target timestamp. Mapping information matching is performed on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp to obtain target low-dimensional mapping information. Up-dimensional restoration is performed on the target low-dimensional mapping information to generate a target data asset, which is sent to the user side. Dimension-reduced storage of the multi-source heterogeneous data assets is thereby realized, which reduces the storage load; even if the stored data is leaked, its content cannot be directly extracted, which improves data security. The technical problems of large storage load and poor data asset security of multi-source heterogeneous data assets in the prior art are thus solved.
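The tertiary clustering step can be pictured as three nested levels of grouping. The following Python sketch is only an illustration under a strong assumption: each level groups assets by exact equality of the source, type and structure fields, whereas the clustering technique actually used is not specified in this passage. The function name `tertiary_cluster` and the dictionary keys are hypothetical.

```python
def tertiary_cluster(assets):
    """Group assets by source, then type, then structure.

    assets: iterable of dicts with the (assumed) keys
            "id", "source", "type" and "structure".
    Returns a nested dict: source -> type -> structure -> list of asset ids.
    """
    clusters = {}
    for asset in assets:
        by_type = clusters.setdefault(asset["source"], {})      # level 1
        by_structure = by_type.setdefault(asset["type"], {})    # level 2
        by_structure.setdefault(asset["structure"], []).append(asset["id"])  # level 3
    return clusters


assets = [
    {"id": "a1", "source": "crm", "type": "table", "structure": "structured"},
    {"id": "a2", "source": "crm", "type": "table", "structure": "structured"},
    {"id": "a3", "source": "crm", "type": "log", "structure": "semi-structured"},
    {"id": "a4", "source": "sensor", "type": "log", "structure": "semi-structured"},
]
clusters = tertiary_cluster(assets)
```

Each innermost list is one cluster of the data asset clustering result, over which the dimension-reduction mapping would then be traversed.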

Example Two

Based on the same inventive concept as the method for linking multi-source heterogeneous data assets in the foregoing embodiments, the present invention also provides a linking system for multi-source heterogeneous data assets, which can be implemented in hardware and/or software, and can be generally integrated in an electronic device, for performing the method provided by any of the embodiments of the present invention. As shown in fig. 4, the system includes:

A data acquisition module 11, configured to obtain multi-source heterogeneous data asset base information, where the multi-source heterogeneous data asset base information includes at least a data asset source, a data asset structure, a data asset type, and a data asset timestamp;

A clustering module 12, configured to perform tertiary clustering on the multi-source heterogeneous data assets according to the data asset source, the data asset type and the data asset structure, and generate a data asset clustering result;

A dimension-reduction mapping module 13, configured to retrieve the data asset timestamp through a data distribution mapping algorithm, traverse the data asset clustering result to perform dimension-reduction mapping, and generate data asset low-dimensional mapping information, where the data asset low-dimensional mapping information has an index timestamp;

An access request receiving module 14, configured to receive a data access request from a user side for the data asset clustering result, where the data access request includes a target timestamp;

A mapping matching module 15, configured to perform mapping information matching on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp, so as to obtain target low-dimensional mapping information;

A data restoration module 16, configured to perform up-dimensional restoration on the target low-dimensional mapping information, generate a target data asset, and send the target data asset to the user side.
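The behaviour of the mapping matching module 15 and the data restoration module 16 can be sketched in Python as follows. This is a hedged illustration, not the patented procedure: it assumes that matching means exact equality between the target timestamp and the index timestamp, that the semantic mapping vector is an element-wise difference from the reference characteristic value, and that the time mapping vector is a scalar offset. The functions `match_by_timestamp` and `restore`, and the dictionary keys, are hypothetical.

```python
def match_by_timestamp(mappings, target_timestamp):
    """Return the mappings whose index timestamp equals the target timestamp.

    Exact equality is an assumption; a tolerance window would also be plausible.
    """
    return [m for m in mappings if m["index_timestamp"] == target_timestamp]


def restore(mapping):
    """Up-dimensional restoration under the element-wise difference assumption.

    Returns the recovered characteristic value and its original timestamp.
    """
    value = tuple(
        r + d
        for r, d in zip(mapping["reference_value"], mapping["semantic_vector"])
    )
    timestamp = mapping["index_timestamp"] + mapping["time_vector"]
    return value, timestamp


stored = [
    {"reference_value": (1.0, 2.0), "index_timestamp": 100.0,
     "semantic_vector": (3.0, 4.0), "time_vector": 30.0},
    {"reference_value": (5.0, 5.0), "index_timestamp": 200.0,
     "semantic_vector": (-1.0, 0.0), "time_vector": -10.0},
]
targets = match_by_timestamp(stored, 100.0)
restored = [restore(m) for m in targets]
```

In this sketch a request with target timestamp 100.0 selects only the first stored record, from which the value (4.0, 6.0) with timestamp 130.0 is recovered.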

Further, the clustering module 12 is further configured to:

Further, the dimension-reduction mapping module 13 is further configured to:

Further, the mapping matching module 15 is further configured to:

Further, the data restoration module 16 is further configured to:

The units and modules included are divided only according to functional logic; the division is not limited to the above, so long as the corresponding functions can be realized. In addition, the specific names of the functional units serve only to distinguish them from one another and are not used to limit the protection scope of the present invention.

Example Three

Fig. 5 is a schematic structural diagram of an electronic device provided in a third embodiment of the present invention, and shows a block diagram of an exemplary electronic device suitable for implementing an embodiment of the present invention. The electronic device shown in fig. 5 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present invention. As shown in fig. 5, the electronic device includes a processor 31, a memory 32, an input device 33, and an output device 34. The number of processors 31 in the electronic device may be one or more; one processor 31 is taken as an example in fig. 5. The processor 31, the memory 32, the input device 33 and the output device 34 in the electronic device may be connected by a bus or by other means; a bus connection is taken as an example in fig. 5.

The memory 32 serves as a computer readable storage medium for storing software programs, computer executable programs and modules, such as the program instructions/modules corresponding to the linking method for multi-source heterogeneous data assets in embodiments of the present invention. The processor 31 executes the various functional applications and data processing of the computer device by running the software programs, instructions and modules stored in the memory 32, that is, it implements the linking method for multi-source heterogeneous data assets described above.

Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made by those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to those embodiments and may be embodied in many other equivalent forms without departing from its spirit; its scope is set forth in the following claims.

Claims

1. A method for linking multi-source heterogeneous data assets, comprising:

performing up-dimensional restoration on the target low-dimensional mapping information, generating a target data asset and sending the target data asset to a user side;

wherein retrieving the data asset timestamp through a data distribution mapping algorithm, traversing the data asset clustering result to perform dimension-reduction mapping, and generating data asset low-dimensional mapping information, the data asset low-dimensional mapping information having an index timestamp, comprises:

constructing low-dimensional mapping information of the comparison characteristic value based on the reference characteristic value, the reference timestamp, the semantic mapping vector and the time mapping vector, and adding the low-dimensional mapping information to the data asset low-dimensional mapping information, wherein the reference timestamp is the index timestamp;

wherein calculating a semantic mapping vector from the reference characteristic value to the comparison characteristic value comprises:

2. The method of claim 1, wherein tertiary clustering of multi-source heterogeneous data assets according to the data asset sources, the data asset types, and the data asset structures, generating data asset cluster results comprises:

3. The method of claim 1, wherein performing mapping information matching on the data asset low-dimensional mapping information based on the target timestamp and the index timestamp to obtain target low-dimensional mapping information comprises:

4. The method as recited in claim 1, further comprising:

5. The method of claim 4, wherein retrieving the to-be-adjusted data asset basic information tag, the task number tag set, and the data asset timestamp through a data distribution mapping algorithm, traversing the data asset clustering result to perform dimension-reduction mapping, and generating data asset low-dimensional mapping information comprises:

6. A linking system for multi-source heterogeneous data assets, comprising:

a data acquisition module, used for acquiring multi-source heterogeneous data asset basic information, wherein the multi-source heterogeneous data asset basic information at least comprises a data asset source, a data asset structure, a data asset type and a data asset timestamp;

a clustering module, used for performing tertiary clustering on the multi-source heterogeneous data assets according to the data asset source, the data asset type and the data asset structure to generate a data asset clustering result;

a dimension-reduction mapping module, used for retrieving the data asset timestamp through a data distribution mapping algorithm, traversing the data asset clustering result to perform dimension-reduction mapping, and generating data asset low-dimensional mapping information, wherein the data asset low-dimensional mapping information has an index timestamp;

an access request receiving module, used for receiving a data access request from the user side for the data asset clustering result, wherein the data access request comprises a target timestamp;

a mapping matching module, used for performing mapping information matching on the data asset low-dimensional mapping information according to the target timestamp and the index timestamp to obtain target low-dimensional mapping information;

a data restoration module, used for performing up-dimensional restoration on the target low-dimensional mapping information, generating a target data asset and sending the target data asset to the user side;

The dimension-reduction mapping module is further used for:

7. An electronic device, the electronic device comprising:

A memory for storing executable instructions;

a processor for implementing the method of linking multi-source heterogeneous data assets of any one of claims 1 to 5 when executing executable instructions stored in the memory.

8. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements a linking method for multi-source heterogeneous data assets as claimed in any of claims 1-5.