CN114841481A - Data management method, device and storage medium - Google Patents

Data management method, device and storage medium

Info

Publication number
CN114841481A
Authority
CN
China
Prior art keywords
data management
target
data
grading
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110138137.3A
Other languages
Chinese (zh)
Inventor
刘妍
陈守志
林岳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110138137.3A
Publication of CN114841481A
Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24573 Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling

Abstract

Embodiments of the application disclose a data management method, a device and a storage medium. The method comprises: acquiring asset state information of a target data asset; determining a plurality of data management indexes of the target data asset; acquiring, from the asset state information, target state information matched with each data management index; grading the target data asset based on the target state information to obtain a grading result of the target data asset for each data management index; and determining a target data management strategy according to the grading result of the target data asset for each data management index, and managing the target data asset based on the target data management strategy. The method achieves refined matching of management to data and effectively improves the reliability of data management.

Description

Data management method, device and storage medium
Technical Field
The present application relates to the field of data management technologies, and in particular, to a data management method, apparatus, and storage medium.
Background
Internet services are increasingly rich and diverse, and the number of users and their time online keep growing, which leads to the accumulation of data on a massive scale. Data is a core asset of an enterprise and has irreplaceable value in product function verification, driving business growth, refined operation, personalized service and the like, so reliable data management is very important.
At present, data is generally managed under a data management strategy with a single unified standard. Under this approach, loss or freezing of core data and data security events occur easily, management resources are wasted, and it is difficult to match management finely to the data, so the reliability of data management is low.
Disclosure of Invention
The embodiment of the application provides a data management method and a related device, aiming at realizing the refined matching of management data and effectively improving the reliability of data management.
In order to solve the above technical problem, an embodiment of the present application provides the following technical solutions:
According to an embodiment of the present application, a data management method includes: acquiring asset state information of a target data asset; determining a plurality of data management indexes of the target data asset; acquiring, from the asset state information, target state information matched with each data management index; grading the target data asset based on the target state information to obtain a grading result of the target data asset for each data management index; and determining a target data management strategy according to the grading result of the target data asset for each data management index, and managing the target data asset based on the target data management strategy.
According to an embodiment of the present application, a data management apparatus includes: the acquisition module is used for acquiring asset state information of the target data asset; a determining module, configured to determine a plurality of data management metrics of the target data asset; the matching module is used for acquiring target state information matched with each data management index from the asset state information; the grading module is used for grading the target data assets based on the target state information to obtain a grading result of each data management index corresponding to the target data assets; and the management module is used for determining a target data management strategy according to the grading result of each data management index corresponding to the target data asset, and managing the target data asset based on the target data management strategy.
In some embodiments of the present application, the asset state information comprises attribute information corresponding to a plurality of attributes of the target data asset; the matching module comprises: a tag establishing unit, configured to establish an attribute tag of the target data asset for each attribute based on the attribute information corresponding to that attribute; and a tag association unit, configured to acquire the attribute tag corresponding to the attribute matched with each data management index as the target state information matched with that data management index.
In some embodiments of the present application, the tag establishing unit includes: the characteristic obtaining subunit is used for obtaining the information characteristics of the attribute information corresponding to each attribute; the strategy matching subunit is used for determining a label establishing strategy corresponding to each attribute according to the information characteristics of the attribute information corresponding to each attribute; and the label establishing subunit is used for establishing an attribute label of the target data asset corresponding to each attribute by using the attribute information corresponding to each attribute according to the label establishing strategy corresponding to each attribute.
In some embodiments of the present application, the tag associating unit includes: the table acquisition subunit is used for acquiring an attribute query table, wherein the attribute query table comprises each data management index and an attribute matched with each data management index; the query subunit is used for determining a target attribute matched with each data management index according to the attribute query table; and the attribute matching subunit is used for acquiring an attribute tag corresponding to the target attribute matched with each data management index, and the attribute tag is used as the target state information matched with each data management index.
In some embodiments of the present application, the grading module comprises: a grading strategy determining unit, configured to determine a grading strategy corresponding to each data management index; and a strategy grading unit, configured to grade the target data asset, according to the grading strategy corresponding to each data management index and using the target state information matched with that index, to obtain the grading result of the target data asset for each data management index.
In some embodiments of the present application, the grading strategy determining unit comprises: an asset analysis subunit, configured to determine a service scenario feature and a data supervision requirement corresponding to the target data asset, where the service scenario feature is a feature of the service scenario in which the target data asset is used, and the data supervision requirement is a target requirement for managing the target data asset; and a grading strategy determining subunit, configured to determine the grading strategy corresponding to each data management index according to the service scenario feature and the data supervision requirement.
In some embodiments of the present application, the grading strategy corresponding to a first data management index comprises a plurality of grading constraint conditions, each corresponding to a grade; the strategy grading unit comprises: a constraint condition matching subunit, configured to determine the grading constraint condition met by the target state information matched with the first data management index, to obtain a target grading constraint condition; a grade matching subunit, configured to acquire the grade corresponding to the target grading constraint condition; and a result determining subunit, configured to determine the grade corresponding to the target grading constraint condition as the grading result of the target data asset for the first data management index.
In some embodiments of the present application, the first data management index comprises an importance index; the plurality of grading constraint conditions comprise a first, a second and a third grading constraint condition, whose corresponding grades are effective asset, disabled asset and asset to be observed, respectively; the constraint condition matching subunit is configured to: determine in sequence whether the target state information matched with the importance index meets the first, the second and the third grading constraint condition; and determine the grading constraint condition met by the target state information matched with the importance index as the target grading constraint condition.
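The sequential constraint check for the importance index can be illustrated with a minimal Python sketch. The application does not state what the three constraint conditions contain, so the field names (`days_since_access`, `downstream_tasks`), the thresholds, and the catch-all third condition are assumptions made purely for illustration.

```python
from typing import Callable, Optional

# Hypothetical target state information matched with the importance index.
State = dict

# Each constraint is a (predicate, grade) pair, checked in list order.
# Predicates and thresholds are illustrative assumptions, not from the application.
CONSTRAINTS: list[tuple[Callable[[State], bool], str]] = [
    (lambda s: s["days_since_access"] <= 30 or s["downstream_tasks"] > 0,
     "effective asset"),
    (lambda s: s["days_since_access"] > 180, "disabled asset"),
    (lambda s: True, "asset to be observed"),  # assumed catch-all condition
]


def grade_importance(state: State) -> Optional[str]:
    """Return the grade of the first constraint the state satisfies."""
    for predicate, grade in CONSTRAINTS:
        if predicate(state):
            return grade
    return None
```

Because the conditions are evaluated in order, an asset that would satisfy more than one condition receives the grade of the earliest one, which is one plausible reading of "sequentially determining".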
In some embodiments of the present application, the grading strategy corresponding to a second data management index is a grading model; the strategy grading unit is configured to: input the target state information matched with the second data management index into the grading model, to obtain the grading result of the target data asset for the second data management index output by the grading model.
In some embodiments of the present application, the second data management index comprises a sensitivity index and a security index; the strategy grading unit is configured to: input the target state information matched with the sensitivity index into a sensitivity grading model, to obtain the grading result of the target data asset for the sensitivity index output by the sensitivity grading model; and input the target state information matched with the security index into a security grading model, to obtain the grading result of the target data asset for the security index output by the security grading model.
In some embodiments of the present application, the management module includes: a template acquisition unit, configured to acquire a strategy template set, where each strategy template in the set is marked with a plurality of grade labels; a template matching unit, configured to determine, according to the grading result of the target data asset for each data management index, the grade label matched with each grading result, to obtain target grade labels; and a template determining unit, configured to take the strategy template marked with the target grade labels as the target data management strategy.
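The template matching described above can be sketched as a lookup over labelled templates. The template names, the `index:grade` label format, and the exact-match rule are all hypothetical; the application only states that each template carries several grade labels and that the template marked with the target grade labels is selected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PolicyTemplate:
    name: str
    grade_labels: frozenset  # grade labels the template is marked with


# Hypothetical template set; labels use an assumed "index:grade" format.
TEMPLATES = [
    PolicyTemplate("strict", frozenset(
        {"importance:effective", "sensitivity:high", "security:high"})),
    PolicyTemplate("archive", frozenset(
        {"importance:disabled", "sensitivity:low", "security:low"})),
]


def match_template(grading_results: dict,
                   templates=TEMPLATES) -> Optional[PolicyTemplate]:
    """Turn grading results into target grade labels and find the template
    marked with exactly those labels (exact matching is an assumption)."""
    target_labels = frozenset(
        f"{index}:{grade}" for index, grade in grading_results.items())
    for template in templates:
        if template.grade_labels == target_labels:
            return template
    return None
```

A real system would likely need a fallback for grade combinations that no template covers; here the function simply returns `None`.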
In some embodiments of the present application, the management module includes: the analysis unit is used for inputting the grading result of the target data asset corresponding to each data management index into a strategy decision model to obtain strategy information output by the strategy decision model; and the strategy generating unit is used for generating the target data management strategy according to the strategy information.
In some embodiments of the present application, the data management apparatus further comprises: the metadata acquisition module is used for acquiring metadata corresponding to the target data assets; and the metadata association unit is used for associating the target data management strategy with metadata corresponding to the target data assets.
According to another embodiment of the present application, an electronic device may include: a memory storing computer readable instructions; and a processor for reading the computer readable instructions stored in the memory to perform the methods of the embodiments.
According to another embodiment of the present application, a storage medium has stored thereon computer-readable instructions which, when executed by a processor of a computer, cause the computer to perform the method of the embodiments of the present application.
According to another embodiment of the present application, a computer program product or computer program comprises computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations described in the embodiments of this application.
In the embodiments of the application, asset state information of a target data asset is acquired; a plurality of data management indexes of the target data asset are determined; target state information matched with each data management index is acquired from the asset state information; and the target data asset is graded based on the target state information to obtain a grading result of the target data asset for each data management index. The target data asset is thus graded in multiple dimensions according to its own state and multiple indexes. A target data management strategy is then determined according to the grading result for each data management index, so that a strategy adapted to the state of the target data asset is determined in a refined way from the multi-dimensional grading results. Managing the target data asset based on the target data management strategy can effectively avoid loss or freezing of core data and data security events, avoids wasting management resources, achieves refined matching of management to data, effectively improves the reliability of data management, and keeps management cost under control.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 shows a schematic diagram of a system to which embodiments of the present application may be applied.
FIG. 2 shows a flow diagram of a data management method according to an embodiment of the present application.
FIG. 3 shows a flow diagram of a method of obtaining target state information according to one embodiment of the present application.
FIG. 4 shows a flow diagram of a ranking method according to an embodiment of the present application.
FIG. 5 illustrates a flow diagram of a method of determining a target data management policy according to one embodiment of the present application.
FIG. 6 illustrates a flow diagram of a method of determining a target data management policy according to one embodiment of the present application.
FIG. 7 shows a flow chart of data management in one scenario to which embodiments of the present application are applied.
FIG. 8 shows a block diagram of a data management device according to an embodiment of the present application.
FIG. 9 shows a block diagram of an electronic device according to an embodiment of the application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description that follows, specific embodiments of the present application are described with reference to steps and symbolic representations of operations performed by one or more computers, unless otherwise indicated. These steps and operations, which are at times referred to as being computer-executed, include the manipulation by the computer's processing unit of electronic signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well known to those skilled in the art. The data structures in which the data is maintained are physical locations of the memory that have particular properties defined by the data format. However, while the principles of the application are described in the foregoing terms, this is not meant to be limiting, and those of ordinary skill in the art will recognize that the steps and operations described below may also be implemented in hardware.
FIG. 1 shows a schematic diagram of a system 100 to which embodiments of the present application may be applied. As shown in fig. 1, the system 100 may include a server 101 and a terminal 102, where the server 101 may store target data assets and the terminal 102 may manage the target data assets.
The server 101 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like. The server 101 may perform background tasks and the data assets may be stored in the server 101.
In one embodiment, the server 101 may provide an artificial intelligence cloud service, for example one serving Massively Multiplayer Online Role-Playing Games (MMORPGs). An artificial intelligence cloud service is also generally called AIaaS (AI as a Service). It is a service model of an artificial intelligence platform: the AIaaS platform splits several types of common AI services and provides them in the cloud, independently or as packages. This service model is similar to an AI-themed store: all developers can access one or more artificial intelligence services provided by the platform through an API, and some experienced developers can also use the AI framework and AI infrastructure provided by the platform to deploy and operate their own dedicated cloud artificial intelligence services. For example, the server 101 can provide artificial intelligence-based data management.
The terminal 102 may be an edge device such as a smart phone, a computer, etc. The user can update, modify and delete the target data assets through the operation page on the terminal 102.
The terminal 102 and the server 101 may be directly or indirectly connected through wireless communication, and the application is not limited herein.
In one embodiment of this example, the terminal 102 may acquire asset state information of a target data asset; determine a plurality of data management indexes of the target data asset; acquire, from the asset state information, target state information matched with each data management index; grade the target data asset based on the target state information to obtain a grading result of the target data asset for each data management index; and determine a target data management strategy according to the grading result of the target data asset for each data management index, and manage the target data asset based on the target data management strategy.
FIG. 2 schematically shows a flow diagram of a data management method according to an embodiment of the application. The execution subject of the data management method may be an electronic device having a calculation processing function, such as the server 101 or the terminal 102 shown in fig. 1.
As shown in fig. 2, the data management method may include steps S210 to S250.
Step S210, acquiring asset state information of a target data asset;
Step S220, determining a plurality of data management indexes of the target data asset;
Step S230, acquiring target state information matched with each data management index from the asset state information;
Step S240, grading the target data asset based on the target state information to obtain a grading result of the target data asset for each data management index;
Step S250, determining a target data management strategy according to the grading result of the target data asset for each data management index, and managing the target data asset based on the target data management strategy.
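As a rough orientation, steps S210 to S250 can be sketched as a small Python pipeline. Everything concrete here is an assumption for illustration: the attribute names, the two example indexes, the grader functions and the strategy names are hypothetical and are not specified by the application.

```python
from typing import Callable, Mapping


def run_pipeline(
    asset_state: Mapping[str, object],             # S210: acquired state information
    index_attributes: Mapping[str, list],          # S220: index -> matched attributes
    graders: Mapping[str, Callable[[dict], str]],  # one grading strategy per index
    choose_strategy: Callable[[dict], str],
):
    results = {}
    for index, attributes in index_attributes.items():
        # S230: keep only the state information matched with this index.
        target_state = {name: asset_state[name] for name in attributes}
        # S240: grade the asset along this index.
        results[index] = graders[index](target_state)
    # S250: derive the management strategy from all grading results.
    return results, choose_strategy(results)


# Hypothetical inputs for one data table.
asset_state = {"days_since_access": 3, "sensitivity_mark": "high"}
index_attributes = {
    "importance": ["days_since_access"],
    "sensitivity": ["sensitivity_mark"],
}
graders = {
    "importance": lambda s: "effective" if s["days_since_access"] <= 30 else "to be observed",
    "sensitivity": lambda s: s["sensitivity_mark"],
}
choose_strategy = lambda r: "encrypt-and-backup" if r["sensitivity"] == "high" else "default"

results, strategy = run_pipeline(asset_state, index_attributes, graders, choose_strategy)
```

Keeping one grader per index mirrors the idea that each index is a separate dimension, so a rule-based grader and a model-based grader can coexist behind the same call signature.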
The following describes a specific process of each step performed when data is managed.
In step S210, asset status information of the target data asset is acquired.
In the exemplary embodiment, a target data asset refers to a physically or electronically recorded data resource that is owned or controlled by an individual or business and can bring future economic benefits to the business. For example, the target data asset may be a data table in a data store such as MySQL, a key-value (kv) store, Hive or Tbase, or a resource such as a data text file.
Asset status information is attribute information describing a plurality of status attributes of a data asset, which may include an accessed status attribute, an updated status attribute, an affiliated business attribute, an associated status attribute with other data assets, a sensitive status attribute, and an asset security impact status attribute of a target data asset.
The asset state information of the target data asset may be acquired by actively pulling it or by receiving it when pushed, for example by periodically pulling the asset state information from the storage system of the data asset, or by receiving in real time the asset state information uploaded by that storage system.
In one embodiment, the target data asset is a data table, and acquiring asset state information of the data table includes: establishing a data lineage relationship for the data table and acquiring fan-in data and fan-out data of the data table based on the lineage relationship, to obtain attribute information for the attribute describing the table's association with other data assets; acquiring data operation records of the data storage system, to obtain attribute information for the update state attribute of the data table; acquiring access records of the data table in the data operation platform, to obtain attribute information for the accessed state attribute of the data table; and acquiring user marking data of the data table, to obtain attribute information for the business attribute, the sensitive state attribute and the asset security impact state attribute.
The lineage relationship is a channel through which the generation process and usage of data can be obtained; the fan-in data and fan-out data of the data table can be acquired by acquiring the number of upstream tasks and the number of downstream tasks of the data table. The data storage system is the system that stores the data table, and its data operation records contain information such as the update time of the data table. The data operation platform is, for example, a big data management platform, and its access records for the data table contain information such as the access time of the data table. The user marking data is information with which a manager of the data table annotates it, such as table description information, table field descriptions, the table sensitivity level and the table security impact level.
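Counting upstream and downstream tasks to derive fan-in and fan-out can be sketched as follows. The `Task` structure and the table and task names are hypothetical; the application only says that fan-in and fan-out are obtained from the numbers of upstream and downstream tasks of the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    inputs: tuple   # tables the task reads
    outputs: tuple  # tables the task writes


def fan_in_out(table: str, tasks) -> tuple:
    """Return (fan_in, fan_out) for a table from a list of lineage tasks."""
    fan_in = sum(1 for t in tasks if table in t.outputs)  # upstream tasks producing the table
    fan_out = sum(1 for t in tasks if table in t.inputs)  # downstream tasks consuming the table
    return fan_in, fan_out


# Hypothetical lineage: one loader writes dw.orders, two jobs read it.
TASKS = [
    Task("load_orders", ("raw.orders",), ("dw.orders",)),
    Task("daily_report", ("dw.orders",), ("rpt.daily",)),
    Task("user_stats", ("dw.orders", "dw.users"), ("rpt.users",)),
]
```

With this lineage, `fan_in_out("dw.orders", TASKS)` yields a fan-in of 1 and a fan-out of 2, which could then feed the attribute describing the table's association with other data assets.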
In step S220, a plurality of data management indexes of the target data asset are determined.
In this exemplary embodiment, a data management index is a defined standard index for data management. There are a plurality of, that is at least two, data management indexes, for example at least two of a sensitivity index, an importance index and a security index. Each data management index corresponds to a data management standard in one dimension, so as to achieve multi-dimensional refined management of the target data asset.
In one embodiment, the data management indexes of the target data asset are determined as follows: a plurality of data management indexes defined by a user for the target data asset are determined. The user can select the data management indexes required by the current data management need from a preset data management index table (which stores a predefined set of data management indexes), so that data management can be carried out flexibly based on the selected indexes.
In one embodiment, the data management indexes of the target data asset are determined as follows: metadata of the target data asset is determined and input into an index decision model, which outputs a plurality of data management indexes. The metadata is data describing the target data asset, such as the table name, table description, table creation statement, table fields, field types and field descriptions. The index decision model is a pre-trained machine learning model and may be deployed on a cloud server; the metadata of data asset samples collected by the cloud server in real time, together with their corresponding data management indexes, is used as training data to train the index decision model in real time. The scores of the data management indexes corresponding to each data asset sample are calibrated based on the management effect achieved for that sample, so that the trained model can decide on a plurality of data management indexes that fit the state of the data asset and bring a good management effect.
In step S230, from the asset status information, target status information matching each data management index is acquired.
In the embodiment of the present example, each data management index corresponds to a data management standard of one dimension, and target status information matched with each data management index, that is, status information under the dimension corresponding to each data management index is obtained, so that in the subsequent step, target data assets can be analyzed under each dimension respectively.
The target state information matched with each data management index may be obtained by directly acquiring the attribute information corresponding to the attributes matched with that index and using it as the target state information. For example, for the security index, the information corresponding to the business attribute of the target data asset and the information corresponding to the user-marked security impact state attribute are acquired directly.
Alternatively, the target state information matched with each data management index may be obtained by first acquiring the attribute information corresponding to the attributes matched with that index and then establishing an attribute tag based on that attribute information. For example, for the security index, after the information corresponding to the business attribute of the target data asset and the information corresponding to the original user-marked security impact state attribute are acquired, a security impact degree tag is established from the acquired information. The security impact degree tag may be a high, middle or low level tag, or a personalized tag such as one indicating an effect on the storage location or on the number of backups.
In one embodiment, referring to FIG. 3, the asset status information includes attribute information corresponding to a plurality of attributes of the target data asset; in step S230, obtaining target status information matched with each data management index from the asset status information includes:
step S310, establishing an attribute tag corresponding to each attribute of the target data asset based on the attribute information corresponding to each attribute;
step S320, obtaining an attribute tag corresponding to the attribute matched with each data management index as the target status information matched with each data management index.
The plurality of attributes may include an accessed status attribute, an updated status attribute, an affiliated business attribute, an associated status attribute with other data assets, a sensitive status attribute, and an asset security-affected status attribute of the target data asset.
Each attribute corresponds to corresponding attribute information, for example, the attribute information corresponding to the accessed state attribute may include access user information, access time information, access location information, and the like in the access record of the target data asset.
Establishing an attribute tag corresponding to each attribute of the target data asset means condensing a large amount of attribute information into salient attribute tags. For example, key tags such as a latest access time tag and an access frequency tag can be established for the accessed state attribute, and a latest update time tag and an update frequency tag can be established for the update state attribute.
The attribute tags corresponding to the attributes matched with each data management index are then obtained as the target state information matched with that index. Grading can be performed directly on these salient attribute tags in the subsequent steps, which ensures grading efficiency and reliability.
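As a minimal sketch of how raw access records might be condensed into the latest access time tag and the access frequency tag, the following Python example counts accesses inside a time window. The function name `build_access_tags`, the tag keys and the 30-day default window are illustrative assumptions, not details given in the text.

```python
from datetime import datetime, timedelta


def build_access_tags(access_times, now, window_days=30):
    """Condense raw access timestamps into two attribute tags.

    window_days is an assumed parameter; the text does not fix a window here.
    """
    window_start = now - timedelta(days=window_days)
    recent = [t for t in access_times if window_start <= t <= now]
    return {
        # Most recent access, or None when the asset was never accessed.
        "latest_access_time": max(access_times) if access_times else None,
        # Number of accesses that fall inside the window.
        "access_frequency": len(recent),
    }


now = datetime(2021, 1, 31)
records = [datetime(2021, 1, 30), datetime(2021, 1, 10), datetime(2020, 11, 1)]
tags = build_access_tags(records, now)
```

The same pattern would apply to the update state attribute by feeding update timestamps instead of access timestamps.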
In one embodiment, step S310, establishing an attribute tag corresponding to each attribute of the target data asset based on the attribute information corresponding to each attribute includes:
acquiring information characteristics of attribute information corresponding to each attribute;
determining a label establishment strategy corresponding to each attribute according to the information characteristics of the attribute information corresponding to each attribute;
and establishing an attribute tag corresponding to each attribute of the target data asset by using the attribute information corresponding to each attribute according to the tag establishing strategy corresponding to each attribute.
The information characteristic of the attribute information is a data characteristic of the attribute information, and for example, the information characteristic may include a data type (e.g., text type or numeric type), a generation type (e.g., user-mark type data or operation record type data), and an information amount.
The label establishing policy is a manner of establishing a label, and the label establishing policy may include an establishing policy based on a label establishing model (i.e., a machine learning model for establishing a label), an establishing policy based on a calculation function, and the like.
A mapping relationship between information characteristics and tag establishment strategies can be established in advance. Based on this mapping relationship, the tag establishment strategy corresponding to each attribute is determined according to the information characteristics of the attribute information corresponding to that attribute, and the attribute tag corresponding to each attribute of the target data asset is then established using the corresponding attribute information. In this way, attribute tags can be established accurately according to the characteristics of the attribute information of each attribute.
Each tag establishing policy corresponds to one tag establishing module, and determining the tag establishing policy corresponding to each attribute may be embodied as determining the tag establishing module corresponding to each attribute.
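The pre-established mapping between information characteristics and tag establishment strategies can be pictured as a dispatch table. The sketch below is only an illustration: the characteristic keys, the averaging function and the keyword rule that stands in for a trained tag establishment model are all assumptions made for the example.

```python
def tag_by_function(values):
    """Calculation-function strategy: summarise numeric records by their mean."""
    return sum(values) / len(values)


def tag_by_model(values):
    """Stand-in for a trained tag establishment model (hypothetical keyword rule)."""
    text = " ".join(values).lower()
    return "sensitive" if "id card" in text else "non-sensitive"


# Hypothetical mapping: (data type, generation type) -> tag establishment module.
STRATEGY_TABLE = {
    ("numeric", "operation_record"): tag_by_function,
    ("text", "user_mark"): tag_by_model,
}


def establish_tag(values, data_type, generation_type):
    """Pick the strategy from the information characteristics, then apply it."""
    strategy = STRATEGY_TABLE[(data_type, generation_type)]
    return strategy(values)


avg_daily_access = establish_tag([2, 4, 6], "numeric", "operation_record")
sensitivity_tag = establish_tag(["Contains ID card numbers"], "text", "user_mark")
```

Keeping each strategy behind a plain callable mirrors the idea that every tag establishment strategy corresponds to one tag establishment module.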
In one embodiment, in step S320, acquiring an attribute tag corresponding to an attribute matched with each data management index as the target status information matched with each data management index includes:
acquiring an attribute query table, wherein the attribute query table comprises each data management index and an attribute matched with each data management index;
determining a target attribute matched with each data management index according to the attribute query table;
and acquiring an attribute tag corresponding to the target attribute matched with each data management index as target state information matched with each data management index.
The attribute lookup table contains each data management index and an attribute matched with each data management index, namely, a mapping relation between the data management index and the attribute is established in the attribute lookup table. Further, a target attribute matching each data management index may be searched based on the attribute lookup table.
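A minimal sketch of such an attribute lookup table is shown below. The index names follow the importance, sensitivity and security indexes mentioned in this description, while the attribute keys and the helper `target_state_info` are hypothetical names chosen for the example.

```python
# Hypothetical attribute lookup table: data management index -> matched attributes.
ATTRIBUTE_LOOKUP = {
    "importance": ["accessed_state", "update_state", "association_state"],
    "sensitivity": ["sensitive_state", "business"],
    "security": ["security_impact_state", "business"],
}


def target_state_info(index, attribute_tags):
    """Return the attribute tags of the attributes matched with one index."""
    return {
        attr: attribute_tags[attr]
        for attr in ATTRIBUTE_LOOKUP[index]
        if attr in attribute_tags
    }


all_tags = {
    "accessed_state": {"access_frequency": 12},
    "business": "payment",
    "sensitive_state": "high sensitivity",
}
sensitivity_info = target_state_info("sensitivity", all_tags)
```

An attribute such as the business attribute can be matched with more than one index, which is why the table maps each index to a list.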
In step S240, the target data assets are ranked based on the target status information, resulting in a ranking result of the target data assets corresponding to each data management index.
In this example embodiment, each data management index has its own matched target state information. The target data asset is graded in the dimension of each data management index using the target state information matched with that index, so that the target data asset is graded finely in multiple dimensions based on its own state, and a grading result of the target data asset corresponding to each data management index is obtained.
For example, the grading result for the importance index may be one of 5 levels: core asset, backbone asset, common asset, deactivated asset and zombie asset (i.e. asset to be observed), or one of 3 levels: effective asset, deactivated asset and zombie asset (i.e. asset to be observed). The grading result for the sensitivity index may be one of 5 levels: top secret, confidential, high sensitivity, medium sensitivity and low sensitivity. The grading result for the security index may be one of 5 levels: level five, level four, level three, level two and level one.
In one embodiment, referring to fig. 4, step S240, grading the target data assets based on the target status information to obtain a grading result of the target data assets corresponding to each data management index, includes:
step S410, determining a grading strategy corresponding to each data management index;
step S420, according to the grading strategy corresponding to each data management index, the target data assets are graded respectively by using the target state information matched with each data management index, and the grading result of each data management index corresponding to each target data asset is obtained.
The grading strategy is the manner in which grading is performed. Grading strategies may include a strategy based on a grading model (i.e. a machine learning model for grading), a strategy based on matching grading constraint conditions, and the like. Each grading strategy corresponds to one grading strategy module, and determining the grading strategy corresponding to each data management index may be embodied as determining the grading strategy module corresponding to each data management index.
The grading strategy corresponding to each data management index may be determined by obtaining a pre-established mapping relationship between data management indexes and grading strategies and reading the grading strategy corresponding to each data management index directly from that mapping relationship.
In one embodiment, step S410, determining a ranking policy corresponding to each data management index includes:
determining service scene characteristics and data supervision requirements corresponding to the target data assets, wherein the service scene characteristics are relevant characteristics of service scenes applied by the target data assets, and the data supervision requirements are target requirements for managing the target data assets;
and determining a grading strategy corresponding to each data management index according to the service scene characteristics and the data supervision requirements.
The business scenario characteristics may include relevant characteristics of the business scenario in which the target data asset is applied, such as the name of the business scenario (for example, payment) and characteristics of the user group of the scenario. The data supervision requirements are target requirements for managing the target data asset, such as temporary supervision or regular supervision.
To determine the grading strategy corresponding to each data management index according to the business scenario characteristics and the data supervision requirements, a mapping relationship from business scenario characteristics and data supervision requirements to grading strategies can be obtained (the mapping relationship may come from a data supervision organization or a data management enterprise), and the grading strategy corresponding to each data management index is determined according to the mapping relationship (the grading strategy may be issued by a cloud server of the data supervision organization or the data management enterprise).
Each grading strategy corresponds to one grading strategy module, and determining the grading strategy corresponding to each data management index may be embodied as determining the grading strategy module corresponding to each data management index.
In one embodiment, the ranking policy corresponding to the first data management index includes a plurality of ranking constraints, each ranking constraint corresponding to a rank; step S420, according to the classification strategy corresponding to each data management index, using the target status information matched with each data management index to classify the target data assets, respectively, so as to obtain a classification result of the target data assets corresponding to each data management index, including:
determining a grading constraint condition which is met by the target state information matched with the first data management index to obtain a target grading constraint condition;
acquiring a grade corresponding to a target grading constraint condition;
and determining the grade corresponding to the target grading constraint condition as a grading result of the target data asset corresponding to the first data management index.
A grading constraint condition is a condition composed of target state information. For example, one grading constraint condition is: "low fan-in number" and ("updated within 30 days" or "low access amount"). The grading constraint condition met by the target state information matched with the first data management index can thus be determined. The first data management index may be specified according to actual conditions; for example, the first data management index is the importance index, and it is understood that it may also be another index such as the sensitivity index or the security index.
Each grading constraint condition corresponds to one level. For example, the level corresponding to the grading constraint condition "low fan-in number" and ("updated within 30 days" or "low access amount") is "common asset". The level corresponding to the target grading constraint condition can therefore be obtained and determined as the grading result of the target data asset corresponding to the first data management index.
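The example constraint quoted above can be written as a boolean predicate over attribute tags and paired with its level. The following is a minimal sketch; the tag keys (`low_fan_in`, `updated_within_30_days`, `low_access`) and the function names are hypothetical and assume the tags have already been reduced to true/false values.

```python
def is_common_asset(tags):
    # "low fan-in number" and ("updated within 30 days" or "low access amount")
    return tags["low_fan_in"] and (
        tags["updated_within_30_days"] or tags["low_access"]
    )


# Each grading constraint condition is paired with the level it corresponds to.
CONSTRAINT_LEVELS = [(is_common_asset, "common asset")]


def grade(tags):
    """Return the level of the constraint the tags satisfy, or None if none match."""
    for constraint, level in CONSTRAINT_LEVELS:
        if constraint(tags):
            return level
    return None


result = grade(
    {"low_fan_in": True, "updated_within_30_days": False, "low_access": True}
)
```

Because the constraint is an ordinary readable expression, a reviewer can see exactly why an asset received its level and adjust the condition when supervision requirements change.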
In one embodiment, the first data management indicator comprises an importance indicator; the plurality of hierarchical constraint conditions comprise a first hierarchical constraint condition, a second hierarchical constraint condition and a third hierarchical constraint condition, wherein the grade corresponding to the first hierarchical constraint condition is an effective asset, the grade corresponding to the second hierarchical constraint condition is a non-use asset, and the grade corresponding to the third hierarchical constraint condition is an asset to be observed;
determining a hierarchical constraint condition met by the target state information matched with the first data management index to obtain a target hierarchical constraint condition, wherein the target hierarchical constraint condition comprises the following steps:
sequentially determining whether the target state information matched with the importance indexes meets a first grading constraint condition, a second grading constraint condition and a third grading constraint condition;
and determining the grading constraint condition which is met by the target state information matched with the importance index as a target grading constraint condition.
Grading based on grading constraint conditions is interpretable and easy to adjust and supervise. The target state information matched with the importance index of a data asset is information generated by the asset itself, such as the fan-in number, update frequency and access amount, so a strategy based on grading constraint conditions effectively ensures that the grading result corresponding to the importance index is interpretable and easy to adjust and supervise.
Sequentially determining whether the target state information matched with the importance index meets the first, second and third grading constraint conditions means checking the conditions in the order of importance of effective assets, deactivated assets and assets to be observed, and taking the first grading constraint condition that is met as the target grading constraint condition. This prevents the target state information matched with the importance index from being treated as meeting several grading constraint conditions at the same time.
Effective assets are, for example, core tables, backbone tables and common tables; deactivated assets are, for example, deactivated tables; assets to be observed are, for example, zombie tables.
In one embodiment, the grading strategy corresponding to the second data management index is a grading model; step S420, according to the classification strategy corresponding to each data management index, using the target status information matched with each data management index to classify the target data assets, respectively, so as to obtain a classification result of the target data assets corresponding to each data management index, including:
and inputting the target state information matched with the second data management index into the hierarchical model to obtain a hierarchical result of the target data asset output by the hierarchical model corresponding to the second data management index.
The second data management index may be an importance index, or may be other indexes such as a sensitivity index and a security index.
The grading model is a machine learning model obtained by training with target state information samples as input data and the grading results corresponding to the target state information samples as expected output. The target state information matched with the second data management index is analyzed by the grading model, which outputs the grading result of the target data asset corresponding to the second data management index, so that a grading result can be obtained reliably even when the target state information is relatively complex.
In one embodiment, the second data management indicator includes a sensitivity indicator and a security indicator; inputting the target state information matched with the second data management index into the hierarchical model to obtain a hierarchical result of the target data asset output by the hierarchical model corresponding to the second data management index, wherein the hierarchical result comprises the following steps:
inputting the target state information matched with the sensitivity indexes into a sensitivity degree grading model to obtain a grading result of the target data assets output by the sensitivity degree grading model corresponding to the sensitivity indexes;
and inputting the target state information matched with the safety indexes into the safety degree grading model to obtain a grading result of the target data assets output by the safety degree grading model, wherein the grading result corresponds to the safety indexes.
The sensitivity degree grading model is a machine learning model obtained by training with target state information samples matched with the sensitivity index as input data and the grading results corresponding to those samples as expected output. The security degree grading model is likewise a machine learning model obtained by training with target state information samples matched with the security index as input data and the grading results corresponding to those samples as expected output.
The target state information matched with the sensitivity index and the safety index usually comprises more complex information, and the reliability of the grading result can be ensured by a machine learning model-based grading mode.
It is to be understood that when the second data management index includes other indexes, a grading model corresponding to the other indexes may be provided, and grading under the other indexes may be performed.
In step S250, a target data management policy is determined according to the result of ranking of the target data assets corresponding to each data management index, and the target data assets are managed based on the target data management policy.
In this example embodiment, the target data management policy is a policy for managing the target data asset. For example, the storage, computation and security of the target data asset are considered as three main aspects, and guarantee management policies of different levels are designed for each aspect.
A target data management policy that fuses multi-dimensional data management standards is determined by combining the grading results of the target data asset corresponding to each data management index. The target data asset can be managed reliably based on the target data management policy; for example, the security, integrity and update frequency of high-level data assets can be guaranteed preferentially.
In this way, based on steps S210-S250, asset status information of the target data asset is acquired and a plurality of data management indexes of the target data asset are determined. Target state information matched with each data management index is then acquired from the asset status information, and the target data asset is graded based on the target state information to obtain a grading result corresponding to each data management index, so that the target data asset is graded in multiple dimensions according to its own state and multiple indexes. A target data management policy is then determined according to the grading result corresponding to each data management index, so that a policy adapted to the state of the target data asset itself is determined in a refined manner from the multi-dimensional grading results. Finally, the target data asset is managed based on the target data management policy. This effectively avoids core data loss, freezing and data security events during data management, avoids wasting management resources, achieves refined matching between management and data, and effectively improves the reliability of data management.
In one embodiment, referring to fig. 5, step S250, determining a target data management policy according to the ranking result of the target data assets corresponding to each data management index includes:
step S510, a strategy template set is obtained, and each strategy template in the strategy template set is calibrated with a plurality of grade labels;
step S520, determining a grade label matched with each grading result according to the grading result of each data management index corresponding to the target data asset to obtain a target grade label;
step S530, using the policy template marked with the target level label as the target data management policy.
The policy template set is a set of templates of data management policies. Each policy template in the set is calibrated with a plurality of level tags, which indicate the level at which the template is applicable from different management angles. For example, the tags calibrated on a certain policy template include level tags for several angles such as data storage, data computing resources, data permissions and data auditing, so that multiple data management standards are fused.
Meanwhile, the target data asset corresponds to a plurality of grading results (i.e. the grading results corresponding to the plurality of data management indexes). By obtaining the mapping relationship between grading results and level tags, the level tag matched with each grading result can be determined, yielding the target level tags (a plurality of level tags) and forming a many-to-many matching relationship. The policy template that is calibrated with all the level tags included in the target level tags is then used as the target data management policy, so that a matched target data management policy fusing multi-dimensional data management standards is obtained reliably.
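The template matching in steps S510-S530 can be sketched with set operations: grading results are mapped to level tags, and the template carrying all of those tags is selected. The template names, tag strings and the mapping `RESULT_TO_LABELS` below are hypothetical values invented for illustration.

```python
# Hypothetical mapping from (index, grading result) to the level tags it requires.
RESULT_TO_LABELS = {
    ("importance", "core asset"): {"storage:high", "compute:high"},
    ("importance", "common asset"): {"storage:low", "compute:low"},
    ("security", "level two"): {"security:medium"},
    ("security", "level one"): {"security:low"},
}

# Hypothetical policy template set; each template is calibrated with level tags.
POLICY_TEMPLATES = [
    {"name": "high-guarantee",
     "labels": {"storage:high", "compute:high", "security:medium"}},
    {"name": "baseline",
     "labels": {"storage:low", "compute:low", "security:low"}},
]


def select_policy(grading_results):
    """Return the first template calibrated with every target level tag."""
    target_labels = set()
    for index, result in grading_results.items():
        target_labels |= RESULT_TO_LABELS[(index, result)]
    for template in POLICY_TEMPLATES:
        if target_labels <= template["labels"]:  # subset test
            return template["name"]
    return None


chosen = select_policy({"importance": "core asset", "security": "level two"})
```

When no template carries every required tag the sketch returns `None`; how such a case would be handled in practice is not specified in the text.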
In one embodiment, referring to fig. 6, step S250, determining a target data management policy according to the ranking result of the target data assets corresponding to each data management index includes:
step S610, inputting the grading result of the target data asset corresponding to each data management index into a strategy decision model to obtain strategy information output by the strategy decision model;
step S620 generates a target data management policy according to the policy information.
The strategy decision model is a machine learning model obtained by training and taking the grading result samples as input data and strategy information labels corresponding to the grading result samples as expected outputs. The policy information may include policy information for multiple angles of data storage, data computing resources, data permissions, data audits, etc., for example, information for data storage may include backup number and storage location level, etc. And intelligently deciding the strategy information matched with the target data asset based on the strategy decision model, and intelligently generating a target data management strategy according to the strategy information.
In one embodiment, the data management method of the present application further includes:
acquiring metadata corresponding to target data assets;
and associating the target data management policy with metadata corresponding to the target data asset.
The metadata corresponding to the target data assets are data describing the target data assets, such as table names, table descriptions, table building statements, table fields, field types, field descriptions, responsible persons, creation time, modification records, partition quantity and other information, and unified management can be achieved on data of different systems by acquiring the metadata to form an asset management directory. Meanwhile, the target data management strategy is related to the metadata corresponding to the target data asset, so that the data management strategy can be reliably managed, and the metadata-based retrieval is facilitated.
The metadata may be obtained by active pulling. For example, the metadata information of each upstream data type (including data storage types such as mysql, kv, hive and Tbase) is obtained at regular intervals, the obtained metadata information is compared with the data in the current table in which metadata information is stored, and the current table is updated by adding, modifying or deleting metadata according to the comparison result, so as to ensure eventual consistency between the metadata information and the information of the original table (the target data asset). The metadata may also be used as part of the acquired asset status information of the target data asset.
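The compare-then-update step can be sketched as a dictionary diff. In this illustrative example both the stored metadata and the freshly pulled metadata are assumed to be dictionaries keyed by table name; the function names `diff_metadata` and `sync` are hypothetical.

```python
def diff_metadata(current, fetched):
    """Compare stored metadata with freshly pulled metadata."""
    added = {k: fetched[k] for k in fetched.keys() - current.keys()}
    deleted = sorted(current.keys() - fetched.keys())
    modified = {
        k: fetched[k]
        for k in fetched.keys() & current.keys()
        if fetched[k] != current[k]
    }
    return added, modified, deleted


def sync(current, fetched):
    """Apply additions, modifications and deletions to the stored metadata."""
    added, modified, deleted = diff_metadata(current, fetched)
    current.update(added)
    current.update(modified)
    for name in deleted:
        del current[name]
    return current


stored = {"orders": {"fields": ["id"]}, "tmp": {"fields": ["x"]}}
pulled = {"orders": {"fields": ["id", "amount"]}, "users": {"fields": ["uid"]}}
added, modified, deleted = diff_metadata(stored, pulled)
synced = sync(stored, pulled)
```

Running the pull and the diff on a timer is what gives eventual rather than immediate consistency with the original tables.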
The method described in the above embodiments is further illustrated in detail by way of example.
Fig. 7 shows a flow chart of data management in one scenario to which embodiments of the present application are applied. Target data assets in the scene are data tables; data management in this scenario may be based on steps S710-S740 as shown in fig. 7.
Step S710, obtaining metadata corresponding to the data table.
The metadata includes information of table name, table description, table building statement, table field, field type, field description, responsible person, creation time, modification record, partition number and the like of the data table.
Step S720, building a metadata tag system.
First, the asset status information of the data table is obtained, which includes: establishing the data lineage of the data table and obtaining fan-in data and fan-out data of the data table based on the lineage, to obtain the attribute information of the attribute describing the association state between the data table and other data assets; obtaining data operation records of the data storage system, to obtain the attribute information of the update state attribute of the data table; obtaining access records of the data table in the data operation platform, to obtain the attribute information of the accessed state attribute of the data table; and obtaining user-labeled data of the data table, to obtain the attribute information of the business attribute, the sensitive state attribute and the asset security impact state attribute.
Secondly, based on the attribute information corresponding to a plurality of target attributes in the asset status information, an attribute tag corresponding to each attribute of the data table is established, and the established attribute tags are also associated with the metadata corresponding to the data table.
The plurality of target attributes may include an accessed status attribute, an updated status attribute, an affiliated business attribute, an associated status attribute with other data assets, a sensitive status attribute, and an asset security impact status attribute of the target data asset.
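As a rough sketch of deriving association information from table lineage, the example below counts, for each table, how many tables depend on it and how many tables it depends on. The edge format `(upstream, downstream)`, the table names and the result keys are assumptions. The text later treats a fan-in number greater than 0 as meaning that downstream tasks depend on the table, so the sketch reports both counts under neutral names rather than deciding which one is labelled fan-in or fan-out.

```python
from collections import Counter


def lineage_counts(edges):
    """Count downstream dependents and upstream sources per table.

    edges: iterable of (upstream_table, downstream_table) pairs (assumed format).
    """
    downstream = Counter(src for src, _ in edges)
    upstream = Counter(dst for _, dst in edges)
    tables = {table for edge in edges for table in edge}
    return {
        table: {
            "downstream_dependents": downstream[table],  # Counter gives 0 if absent
            "upstream_sources": upstream[table],
        }
        for table in tables
    }


edges = [
    ("ods_orders", "dwd_orders"),
    ("dwd_orders", "ads_report"),
    ("dwd_orders", "ads_dashboard"),
]
counts = lineage_counts(edges)
```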
Step S730, performing multi-dimensional grading on the metadata.
Firstly, determining a plurality of data management indexes of a data table, wherein the data management indexes at least comprise an importance index, a safety index and a sensitivity index.
And secondly, acquiring target state information matched with each data management index from the asset state information, wherein the target state information specifically comprises an attribute tag corresponding to an attribute matched with each data management index, and the attribute tag is used as the target state information matched with each data management index.
The attribute labels corresponding to the importance indexes comprise a fan-in number, a fan-out number, a latest access time label, an access frequency label, a latest update time label and an update frequency label; the attribute labels corresponding to the sensitivity indexes comprise sensitivity labels and business labels thereof; the attribute label corresponding to the security index comprises a security influence degree label and a business label.
Then, the target data assets are classified based on the target state information, and a classification result of the data table corresponding to each data management index is obtained, wherein the classification result simultaneously corresponds to metadata of the data table.
Specifically, it is determined in sequence whether the target state information matched with the importance index meets a first grading constraint condition, a second grading constraint condition and a third grading constraint condition. The first grading constraint condition is: the fan-in number is greater than 0 (the table is depended on by downstream tasks) and (data update within the last 30 days or no data access within the last 30 days). The second grading constraint condition is: the fan-in number is 0 (the table data is not depended on downstream), and the data has not been updated within the last 90 days and has not been accessed within the last 90 days. The third grading constraint condition is: the fan-in number is 0 (the table data is not depended on downstream) and (the data has not been updated within the last 90 days or has not been accessed within the last 90 days).
The level corresponding to the first grading constraint condition is an effective table, the level corresponding to the second grading constraint condition is a deactivated table, and the level corresponding to the third grading constraint condition is a table to be observed.
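The value of checking the conditions in order can be seen in a short sketch. The second and third predicates below follow the 90-day conditions of this scenario; the first predicate is deliberately reduced to the fan-in check alone, which is a simplifying assumption. The field names and the reading of "not within the last 90 days" as more than 90 days are also assumptions.

```python
def first_constraint(t):
    # Simplified assumption: only the fan-in part of the first condition.
    return t["fan_in"] > 0


def second_constraint(t):
    # Fan-in is 0, and neither updated nor accessed within the last 90 days.
    return (t["fan_in"] == 0
            and t["days_since_update"] > 90
            and t["days_since_access"] > 90)


def third_constraint(t):
    # Fan-in is 0, and not updated or not accessed within the last 90 days.
    return t["fan_in"] == 0 and (
        t["days_since_update"] > 90 or t["days_since_access"] > 90
    )


# Checked in order; the first condition that is met decides the level.
ORDERED_CONSTRAINTS = [
    (first_constraint, "effective table"),
    (second_constraint, "deactivated table"),
    (third_constraint, "table to be observed"),
]


def grade_importance(t):
    for constraint, level in ORDERED_CONSTRAINTS:
        if constraint(t):
            return level
    return None


idle = {"fan_in": 0, "days_since_update": 200, "days_since_access": 150}
level = grade_importance(idle)
```

The `idle` table satisfies both the second and the third predicate, and the ordered check is what resolves it to a single level.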
Further, an effective table can be graded further by sub-grading constraint conditions of the first grading constraint condition. For example, the first sub-grading constraint condition is a high fan-in number or a high access amount, and the corresponding level is a core table; the second sub-grading constraint condition is a medium fan-in number or a medium access amount, and the corresponding level is a backbone table; the third sub-grading constraint condition is a low fan-in number and (updated within 30 days or low access amount), and the corresponding level is a common table.
And inputting the target state information corresponding to the safety index and the sensitivity index into the corresponding grading model to obtain grading results corresponding to the safety index and the sensitivity index.
The grading result of the importance index is one of 5 levels: core table, backbone table and common table (together, the effective tables), deactivated table, and zombie table (i.e. table to be observed). Core tables and backbone tables are core assets and require high-specification guarantees for data storage, computation and security; a deactivated table is a recoverable resource, and a zombie table is a resource to be observed. The grading result of the sensitivity index is one of 5 levels: top secret, confidential, high sensitivity, medium sensitivity and low sensitivity. The grading result of the security index is one of 5 levels: level five, level four, level three, level two and level one.
Step S740, establishing and matching a data management policy.
A target data management strategy is determined according to the grading result of the data table corresponding to each data management index, so that the data table is managed based on the target data management strategy.
The target data management strategy is determined comprehensively based on the grading results corresponding to the importance index, the security index, and the sensitivity index. The target data management strategy integrates strategies from three angles, namely storage, security, and computation; the strategy for each angle is matched against the grading results of all three indexes at once, and during establishment the target data management strategy is also associated with the metadata of the data table.
For example, when the grading result of the data table is a core table and the sensitivity and security grades are level two, the target data management strategy includes a storage strategy (intranet, storage with 4 backups, physically isolated cluster), a computation strategy (high scheduling priority on the shared cluster, with a relatively large number of task failure retries, namely 10), and a security strategy (table-level permission management, monthly validity, general supervisor approval, and the like).
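For illustration, the example strategy above could be represented as a structured record covering the three angles of storage, computation, and security. The class and field names below are hypothetical; only the values are taken from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    network: str                       # where the data is stored
    backup_copies: int                 # number of backups kept
    physically_isolated_cluster: bool  # whether the cluster is physically isolated


@dataclass(frozen=True)
class ComputePolicy:
    scheduling_priority: str           # priority on the shared cluster
    task_failure_retries: int          # number of retries after a task fails


@dataclass(frozen=True)
class SecurityPolicy:
    permission_granularity: str        # level at which permissions are managed
    permission_validity: str           # how long a granted permission stays valid
    approval: str                      # who approves access requests


@dataclass(frozen=True)
class DataManagementPolicy:
    storage: StoragePolicy
    compute: ComputePolicy
    security: SecurityPolicy


# The example from the text: core table, sensitivity and security grades of level two.
core_table_policy = DataManagementPolicy(
    storage=StoragePolicy(
        network="intranet", backup_copies=4, physically_isolated_cluster=True
    ),
    compute=ComputePolicy(scheduling_priority="high", task_failure_retries=10),
    security=SecurityPolicy(
        permission_granularity="table",
        permission_validity="monthly",
        approval="general supervisor",
    ),
)
```

Keeping the three angles as separate records allows each one to be matched against the grading results independently before the combined strategy is assembled.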
In this way, compared with managing data assets under a unified standard or grading data assets in a single dimension, the target data management strategy adapted to the state of the data table itself is determined in a refined manner based on the multi-dimensional grading results. Managing the data table based on the target data management strategy then effectively avoids core data loss, freezing, and data security events during data management, avoids wasting management resources, achieves refined matching of management to data, and effectively improves the reliability of data management. The cost of data management is greatly reduced while data value output and security are guaranteed.
By contrast, when data assets are managed under a unified standard, for example when all data assets related to user account information are uniformly treated as top secret, a fixed uniform grade is assigned, so that the storage, computation, and security strategies of all data assets follow the same standard. Either the standard is too "conservative", for example all data is uniformly stored as three backups across two locations (3 copies of the data stored in machine rooms in 2 cities), which makes the scheme costly to manage; or the standard is too "loose", for example core data is stored as 3 backups in a single machine room, which makes the scheme risky. The "conservative" and "loose" problems are difficult to resolve, and cost and reliability cannot both be taken into account.
When assets are graded in a single dimension, a data management strategy cannot be matched in a refined manner, and cost and risk are difficult to control. For example, if the storage scheme and the access authentication scheme for a high-sensitivity table A are set by referring only to the sensitivity grade, table A may in fact have had no access or update for a long time and could be frozen or even deleted.
In order to better implement the data management method provided by the embodiments of the present application, embodiments of the present application further provide a data management device based on the data management method. The terms are the same as those in the data management method, and details of implementation can be referred to the description in the method embodiment. FIG. 8 shows a block diagram of a data management device according to an embodiment of the present application.
As shown in fig. 8, the data management apparatus 800 may include an obtaining module 810, a determining module 820, a matching module 830, a ranking module 840, and a management module 850.
The obtaining module 810 may be configured to obtain asset status information of a target data asset; the determining module 820 may be configured to determine a data management indicator of the target data asset, the data management indicator comprising a plurality of indicators; the matching module 830 may be configured to obtain target status information matched with each of the data management indicators from the asset status information; the ranking module 840 may be configured to rank the target data assets based on the target status information to obtain a ranking result of the target data assets corresponding to each of the data management indicators; management module 850 may be configured to determine a target data management policy based on the ranking of the target data assets corresponding to each of the data management metrics, the target data assets being managed based on the target data management policy.
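The cooperation of the five modules can be sketched as a simple pipeline. This is only an illustrative sketch: each module is modeled as a caller-supplied function, and the function name `manage_asset` and all parameter names are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


def manage_asset(
    asset_id: str,
    obtain: Callable[[str], State],           # role of the obtaining module 810
    determine: Callable[[str], List[str]],    # role of the determining module 820
    match: Callable[[State, str], State],     # role of the matching module 830
    grade: Callable[[str, State], str],       # role of the ranking module 840
    decide: Callable[[Dict[str, str]], str],  # role of the management module 850
) -> str:
    """Run the five steps in order and return the chosen management policy."""
    asset_state = obtain(asset_id)     # asset status information
    indicators = determine(asset_id)   # the plurality of data management indicators
    results: Dict[str, str] = {}
    for indicator in indicators:
        target_state = match(asset_state, indicator)  # matched target status information
        results[indicator] = grade(indicator, target_state)
    return decide(results)             # target data management policy
```

Passing the modules in as functions keeps the sketch independent of how each module is actually realized, which the text leaves open.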
In some embodiments of the present application, the asset status information comprises attribute information corresponding to a plurality of attributes of the target data asset; the matching module comprises: the label establishing unit is used for establishing an attribute label of the target data asset corresponding to each attribute based on the attribute information corresponding to each attribute; and the tag association unit is used for acquiring an attribute tag corresponding to the attribute matched with each data management index as the target state information matched with each data management index.
In some embodiments of the present application, the tag establishing unit includes: the characteristic obtaining subunit is used for obtaining the information characteristics of the attribute information corresponding to each attribute; the strategy matching subunit is used for determining a label establishing strategy corresponding to each attribute according to the information characteristics of the attribute information corresponding to each attribute; and the label establishing subunit is used for establishing an attribute label of the target data asset corresponding to each attribute by using the attribute information corresponding to each attribute according to the label establishing strategy corresponding to each attribute.
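One possible reading of the three subunits is sketched below: the information feature is taken to be the kind of value an attribute holds, and each kind selects its own label establishing strategy. The feature names, the label formats, and the functions `information_feature` and `build_labels` are assumptions made for illustration only.

```python
from datetime import date
from typing import Callable, Dict


def information_feature(value: object) -> str:
    """Classify attribute information by a simple, assumed information feature."""
    if isinstance(value, bool):          # checked first because bool is a subclass of int
        return "flag"
    if isinstance(value, (int, float)):
        return "numeric"
    if type(value) is date:              # plain dates only, so date arithmetic below is safe
        return "date"
    return "text"


def build_labels(attributes: Dict[str, object], today: date) -> Dict[str, str]:
    """Establish one attribute label per attribute using the matching strategy."""
    strategies: Dict[str, Callable[[str, object], str]] = {
        "flag": lambda name, v: f"{name}:{'yes' if v else 'no'}",
        "numeric": lambda name, v: f"{name}:{'zero' if v == 0 else 'nonzero'}",
        "date": lambda name, v: f"{name}:{(today - v).days}d_ago",
        "text": lambda name, v: f"{name}:{v}",
    }
    labels: Dict[str, str] = {}
    for name, value in attributes.items():
        strategy = strategies[information_feature(value)]  # strategy matching
        labels[name] = strategy(name, value)                # label establishing
    return labels
```

Separating feature detection from the strategies means a new kind of attribute information only requires a new entry in the strategy table.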
In some embodiments of the present application, the tag associating unit includes: the table acquisition subunit is used for acquiring an attribute query table, wherein the attribute query table comprises each data management index and an attribute matched with each data management index; the query subunit is used for determining a target attribute matched with each data management index according to the attribute query table; and the attribute matching subunit is used for acquiring an attribute tag corresponding to the target attribute matched with each data management index, and the attribute tag is used as the target state information matched with each data management index.
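The attribute query table can be pictured as a mapping from each data management index to the attributes matched with it. In the sketch below the table contents and attribute names are hypothetical placeholders, and skipping attributes that have no label is an assumption.

```python
from typing import Dict, List

# Hypothetical attribute query table: data management index -> matched attributes.
ATTRIBUTE_QUERY_TABLE: Dict[str, List[str]] = {
    "importance": ["fan_in", "last_update", "last_access"],
    "sensitivity": ["field_types"],
    "security": ["storage_location"],
}


def match_target_state(
    indicator: str,
    attribute_labels: Dict[str, str],
    query_table: Dict[str, List[str]] = ATTRIBUTE_QUERY_TABLE,
) -> Dict[str, str]:
    """Return the attribute labels of the target attributes for one index."""
    target_attributes = query_table[indicator]  # query subunit
    # Attribute matching subunit: keep only the labels of the target attributes.
    return {a: attribute_labels[a] for a in target_attributes if a in attribute_labels}
```

With a table of this shape, changing which attributes feed an index is a data change rather than a code change.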
In some embodiments of the present application, the ranking module comprises: the grading strategy determining unit is used for determining a grading strategy corresponding to each data management index; and the strategy grading unit is used for grading the target data assets respectively by utilizing the target state information matched with each data management index according to the grading strategy corresponding to each data management index to obtain the grading result of the target data assets corresponding to each data management index.
In some embodiments of the present application, the ranking policy determining unit includes: the asset analysis subunit is configured to determine a service scene feature and a data supervision requirement corresponding to the target data asset, where the service scene feature is a relevant feature of a service scene applied to the target data asset, and the data supervision requirement is a target requirement for managing the target data asset; and the grading strategy determining subunit is used for determining a grading strategy corresponding to each data management index according to the service scene characteristics and the data supervision requirements.
In some embodiments of the present application, the ranking policy corresponding to the first data management indicator comprises a plurality of ranking constraints, each of the ranking constraints corresponding to a rank; the policy classification unit comprises: the constraint condition matching subunit is used for determining a hierarchical constraint condition which is met by the target state information matched with the first data management index to obtain a target hierarchical constraint condition; the grade matching subunit is used for acquiring the grade corresponding to the target grading constraint condition; and the result determining subunit is used for determining the grade corresponding to the target grading constraint condition as the grading result of the target data asset corresponding to the first data management index.
In some embodiments of the present application, the first data management indicator comprises an importance indicator; the plurality of hierarchical constraint conditions comprise a first hierarchical constraint condition, a second hierarchical constraint condition and a third hierarchical constraint condition, wherein the level corresponding to the first hierarchical constraint condition is an effective asset, the level corresponding to the second hierarchical constraint condition is a disabled asset, and the level corresponding to the third hierarchical constraint condition is an asset to be observed; the constraint matching subunit is configured to: sequentially determining whether the target state information matched with the importance indexes meets the first grading constraint condition, the second grading constraint condition and the third grading constraint condition; and determining the grading constraint condition met by the target state information matched with the importance index as the target grading constraint condition.
In some embodiments of the present application, the ranking policy corresponding to the second data management indicator is a ranking model; the policy classification unit is configured to: and inputting the target state information matched with the second data management index into the hierarchical model to obtain a hierarchical result of the target data asset corresponding to the second data management index, which is output by the hierarchical model.
In some embodiments of the present application, the second data management indicator comprises a sensitivity indicator and a security indicator; the policy ranking unit is configured to: inputting the target state information matched with the sensitivity indexes into a sensitivity degree grading model to obtain a grading result of the target data assets corresponding to the sensitivity indexes and output by the sensitivity degree grading model; and inputting the target state information matched with the safety indexes into a safety degree grading model to obtain a grading result of the target data assets corresponding to the safety indexes and output by the safety degree grading model.
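The text does not specify how the sensitivity degree grading model or the safety degree grading model is built, so the sketch below only illustrates the calling pattern: each index has its own model object, and the matched target state information is passed to it. The `GradingModel` protocol, the `ConstantModel` stand-in, and the `grade_with_models` function are hypothetical.

```python
from typing import Dict, Mapping, Protocol


class GradingModel(Protocol):
    """Assumed interface of a grading model: target state information in, grade out."""

    def predict(self, target_state: Mapping[str, object]) -> str: ...


class ConstantModel:
    """Stand-in for a trained model; it always returns the same grade."""

    def __init__(self, grade: str) -> None:
        self.grade = grade

    def predict(self, target_state: Mapping[str, object]) -> str:
        return self.grade


def grade_with_models(
    target_states: Mapping[str, Mapping[str, object]],
    models: Mapping[str, GradingModel],
) -> Dict[str, str]:
    """Feed each index's target state information to that index's own model."""
    return {
        indicator: model.predict(target_states[indicator])
        for indicator, model in models.items()
    }
```

In practice `ConstantModel` would be replaced by whatever trained model is used for each index; the calling code would not need to change.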
In some embodiments of the present application, the management module includes: the template acquisition unit is used for acquiring a strategy template set, wherein each strategy template in the strategy template set is marked with a plurality of grade labels; the template matching unit is used for determining a grade label matched with each grading result according to the grading result of the target data asset corresponding to each data management index to obtain a target grade label; and the template determining unit is used for taking the strategy template marked with the target grade label as the target data management strategy.
In some embodiments of the present application, the management module includes: the analysis unit is used for inputting the grading result of the target data asset corresponding to each data management index into a strategy decision model to obtain strategy information output by the strategy decision model; and the strategy generating unit is used for generating the target data management strategy according to the strategy information.
In some embodiments of the present application, the data management apparatus further comprises: the metadata acquisition module is used for acquiring metadata corresponding to the target data assets; and the metadata association unit is used for associating the target data management strategy with metadata corresponding to the target data assets.
In this way, the data management apparatus 800 acquires the asset state information of the target data asset; determines a plurality of data management indexes of the target data asset; acquires, from the asset state information, the target state information matched with each data management index; and grades the target data asset based on the target state information to obtain a grading result of the target data asset corresponding to each data management index. The target data asset is thereby graded in multiple dimensions according to its own state and multiple indexes. A target data management strategy is then determined according to the grading result of the target data asset corresponding to each data management index, so that the target data management strategy adapted to the state of the target data asset itself is determined in a refined manner based on the multi-dimensional grading results. Managing the target data asset based on the target data management strategy effectively avoids core data loss, freezing, and data security events during data management, avoids wasting management resources, achieves refined matching of management to data, and effectively improves the reliability of data management while keeping management cost under control.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to embodiments of the application, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
In addition, an embodiment of the present application further provides an electronic device, where the electronic device may be a terminal or a server, as shown in fig. 9, which shows a schematic structural diagram of the electronic device according to the embodiment of the present application, and specifically:
the electronic device may include components such as a processor 901 of one or more processing cores, memory 902 of one or more computer-readable storage media, a power supply 903, and an input unit 904. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 9 does not constitute a limitation of the electronic device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components. Wherein:
The processor 901 is the control center of the electronic device. It connects the various parts of the entire electronic device by using various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 902 and calling data stored in the memory 902, thereby monitoring the electronic device as a whole. Optionally, the processor 901 may include one or more processing cores; preferably, the processor 901 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It will be appreciated that the modem processor may also not be integrated into the processor 901.
The memory 902 may be used to store software programs and modules, and the processor 901 executes various functional applications and performs data processing by running the software programs and modules stored in the memory 902. The memory 902 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to use of the electronic device, and the like. Further, the memory 902 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. Accordingly, the memory 902 may also include a memory controller to provide the processor 901 with access to the memory 902.
The electronic device further comprises a power supply 903 for supplying power to each component. Preferably, the power supply 903 may be logically connected to the processor 901 through a power management system, so that functions such as charging management, discharging management, and power consumption management are realized through the power management system. The power supply 903 may also include any components such as one or more DC or AC power sources, recharging systems, power failure detection circuitry, power converters or inverters, and power status indicators.
The electronic device may further include an input unit 904, and the input unit 904 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the electronic device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 901 in the electronic device loads an executable file corresponding to a process of one or more application programs into the memory 902 according to the following instructions, and the processor 901 runs the application programs stored in the memory 902, so as to implement various functions as follows:
acquiring asset state information of a target data asset;
determining a data management index of the target data asset, the data management index comprising a plurality of indexes;
acquiring target state information matched with each data management index from the asset state information;
grading the target data assets based on the target state information to obtain a grading result of each data management index corresponding to the target data assets;
and determining a target data management strategy according to the grading result of the target data assets corresponding to each data management index, and managing the target data assets based on the target data management strategy.
In one embodiment, the asset status information includes attribute information corresponding to a plurality of attributes of the target data asset; the obtaining target status information matched with each data management index from the asset status information includes:
establishing attribute tags corresponding to each attribute of the target data assets based on the attribute information corresponding to each attribute;
and acquiring an attribute tag corresponding to the attribute matched with each data management index as target state information matched with each data management index.
In one embodiment, the creating an attribute tag corresponding to each of the attributes of the target data asset based on the attribute information corresponding to each of the attributes includes:
acquiring information characteristics of attribute information corresponding to each attribute;
determining a label establishing strategy corresponding to each attribute according to the information characteristics of the attribute information corresponding to each attribute;
and establishing an attribute tag corresponding to each attribute of the target data asset by using attribute information corresponding to each attribute according to a tag establishing strategy corresponding to each attribute.
In one embodiment, the obtaining an attribute tag corresponding to an attribute matched with each data management index as target status information matched with each data management index includes:
acquiring an attribute query table, wherein the attribute query table comprises each data management index and an attribute matched with each data management index;
determining a target attribute matched with each data management index according to the attribute query table;
and acquiring an attribute label corresponding to the target attribute matched with each data management index as target state information matched with each data management index.
In one embodiment, the ranking the target data assets based on the target status information to obtain a ranking result of the target data assets corresponding to each of the data management metrics includes:
determining a grading strategy corresponding to each data management index;
and according to the grading strategy corresponding to each data management index, grading the target data assets by using the target state information matched with each data management index to obtain a grading result of each data management index corresponding to the target data assets.
In one embodiment, the determining a ranking policy corresponding to each of the data management metrics includes:
determining a service scene characteristic and a data supervision requirement corresponding to the target data asset, wherein the service scene characteristic is a relevant characteristic of a service scene applied by the target data asset, and the data supervision requirement is a target requirement for managing the target data asset;
and determining a grading strategy corresponding to each data management index according to the service scene characteristics and the data supervision requirements.
In one embodiment, the hierarchical policy corresponding to the first data management index includes a plurality of hierarchical constraints, each of the hierarchical constraints corresponding to a level;
the step of respectively grading the target data assets by using the target state information matched with each data management index according to the grading strategy corresponding to each data management index to obtain the grading result of the target data assets corresponding to each data management index comprises the following steps:
determining a grading constraint condition which is met by the target state information matched with the first data management index to obtain a target grading constraint condition;
acquiring the grade corresponding to the target grading constraint condition;
and determining the grade corresponding to the target grading constraint condition as a grading result of the target data asset corresponding to the first data management index.
In one embodiment, the first data management indicator comprises an importance indicator; the plurality of hierarchical constraint conditions comprise a first hierarchical constraint condition, a second hierarchical constraint condition and a third hierarchical constraint condition, wherein the level corresponding to the first hierarchical constraint condition is an effective asset, the level corresponding to the second hierarchical constraint condition is a disabled asset, and the level corresponding to the third hierarchical constraint condition is an asset to be observed;
the step of determining the hierarchical constraint condition met by the target state information matched with the first data management index to obtain a target hierarchical constraint condition comprises:
sequentially determining whether the target state information matched with the importance indexes meets the first grading constraint condition, the second grading constraint condition and the third grading constraint condition;
and determining the grading constraint condition met by the target state information matched with the importance index as the target grading constraint condition.
In one embodiment, the grading strategy corresponding to the second data management index is a grading model;
the step of respectively grading the target data assets by using the target state information matched with each data management index according to the grading strategy corresponding to each data management index to obtain the grading result of the target data assets corresponding to each data management index comprises the following steps:
and inputting the target state information matched with the second data management index into the hierarchical model to obtain a hierarchical result of the target data asset corresponding to the second data management index, which is output by the hierarchical model.
In one embodiment, the second data management indicator comprises a sensitivity indicator and a security indicator;
inputting the target state information matched with the second data management index into the hierarchical model to obtain a hierarchical result of the target data asset output by the hierarchical model, wherein the hierarchical result corresponds to the second data management index, and the method comprises the following steps:
inputting the target state information matched with the sensitivity indexes into a sensitivity degree grading model to obtain a grading result of the target data assets corresponding to the sensitivity indexes and output by the sensitivity degree grading model;
and inputting the target state information matched with the safety indexes into a safety degree grading model to obtain a grading result of the target data assets corresponding to the safety indexes and output by the safety degree grading model.
In one embodiment, the determining a target data management policy based on the ranking result of the target data asset corresponding to each of the data management metrics comprises:
acquiring a strategy template set, wherein each strategy template in the strategy template set is marked with a plurality of grade labels;
determining a grade label matched with each grading result according to the grading result of the target data asset corresponding to each data management index to obtain a target grade label;
and taking the strategy template marked with the target grade label as the target data management strategy.
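The template-based selection described in the three steps above might be sketched as follows. The label format `index:grade`, the template contents, and the rule that a template matches when it carries every target grade label are assumptions; the text only states that each template is marked with a plurality of grade labels.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# A hypothetical strategy template: (grade labels it is marked with, policy name).
PolicyTemplate = Tuple[FrozenSet[str], str]


def select_template(
    grading_results: Dict[str, str],
    templates: List[PolicyTemplate],
) -> Optional[str]:
    """Return the first template marked with every target grade label."""
    # One target grade label per grading result, e.g. "importance:core table".
    target_labels = frozenset(
        f"{indicator}:{grade}" for indicator, grade in grading_results.items()
    )
    for labels, policy in templates:
        if target_labels <= labels:  # the template carries all target grade labels
            return policy
    return None  # assumption: no strategy is selected if nothing matches


TEMPLATES: List[PolicyTemplate] = [
    (
        frozenset({"importance:core table", "sensitivity:level two", "security:level two"}),
        "core-asset-policy",
    ),
    (frozenset({"importance:stop table"}), "reclaim-policy"),
]
```

A subset test is used here so that one template can serve several combinations of grades; an exact-match rule would be an equally plausible reading.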
In one embodiment, the determining a target data management policy based on the ranking result of the target data asset corresponding to each of the data management metrics comprises:
inputting the grading result of the target data asset corresponding to each data management index into a strategy decision model to obtain strategy information output by the strategy decision model;
and generating the target data management strategy according to the strategy information.
In one embodiment, the method further comprises:
acquiring metadata corresponding to the target data assets;
and associating the target data management policy with metadata corresponding to the target data asset.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by a computer program, which may be stored in a computer-readable storage medium and loaded and executed by a processor, or by related hardware controlled by the computer program.
To this end, the present application further provides a storage medium, in which a computer program is stored, where the computer program can be loaded by a processor to execute the steps in any one of the methods provided in the present application.
The storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disks, optical disks, and the like.
Since the computer program stored in the storage medium can execute the steps in any method provided in the embodiments of the present application, the beneficial effects that can be achieved by the methods provided in the embodiments of the present application can be achieved. For details, see the foregoing embodiments, which are not repeated here.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the method provided in the various alternative implementations of the above embodiments of the present application.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
It will be understood that the present application is not limited to the embodiments that have been described above and shown in the drawings, but that various modifications and changes can be made without departing from the scope thereof.

Claims (15)

1. A method for managing data, comprising:
acquiring asset state information of a target data asset;
determining a data management index of the target data asset, the data management index comprising a plurality of indexes;
acquiring target state information matched with each data management index from the asset state information;
grading the target data assets based on the target state information to obtain a grading result of the target data assets corresponding to each data management index;
and determining a target data management strategy according to the grading result of the target data assets corresponding to each data management index, and managing the target data assets based on the target data management strategy.
2. The data management method of claim 1, wherein the asset status information comprises attribute information corresponding to a plurality of attributes of the target data asset; the obtaining target status information matched with each data management index from the asset status information includes:
establishing attribute tags corresponding to each attribute of the target data assets based on the attribute information corresponding to each attribute;
and acquiring an attribute tag corresponding to the attribute matched with each data management index as target state information matched with each data management index.
3. The data management method of claim 2, wherein the establishing of an attribute tag corresponding to each attribute of the target data asset based on the attribute information corresponding to that attribute comprises:
acquiring information characteristics of the attribute information corresponding to each attribute;
determining a tag establishing strategy corresponding to each attribute according to the information characteristics of the attribute information corresponding to that attribute;
and establishing, according to the tag establishing strategy corresponding to each attribute, the attribute tag of the target data asset corresponding to that attribute by using the attribute information corresponding to that attribute.
4. The data management method of claim 2, wherein the acquiring of the attribute tag corresponding to the attribute matched with each data management index as the target state information matched with that data management index comprises:
acquiring an attribute query table, wherein the attribute query table comprises each data management index and the attribute matched with each data management index;
determining a target attribute matched with each data management index according to the attribute query table;
and acquiring the attribute tag corresponding to the target attribute matched with each data management index as the target state information matched with that data management index.
5. The data management method of claim 1, wherein the grading of the target data asset based on the target state information to obtain a grading result of the target data asset corresponding to each data management index comprises:
determining a grading strategy corresponding to each data management index;
and grading the target data asset, according to the grading strategy corresponding to each data management index, by using the target state information matched with that data management index, to obtain the grading result of the target data asset corresponding to each data management index.
6. The data management method of claim 5, wherein the determining of a grading strategy corresponding to each data management index comprises:
determining a service scene characteristic and a data supervision requirement corresponding to the target data asset, wherein the service scene characteristic is a characteristic of the service scene to which the target data asset is applied, and the data supervision requirement is a target requirement for managing the target data asset;
and determining the grading strategy corresponding to each data management index according to the service scene characteristic and the data supervision requirement.
7. The data management method of claim 5, wherein the grading strategy corresponding to a first data management index comprises a plurality of grading constraint conditions, each grading constraint condition corresponding to a level;
the grading of the target data asset, according to the grading strategy corresponding to each data management index, by using the target state information matched with that data management index, to obtain the grading result of the target data asset corresponding to each data management index comprises:
determining the grading constraint condition met by the target state information matched with the first data management index to obtain a target grading constraint condition;
acquiring the level corresponding to the target grading constraint condition;
and determining the level corresponding to the target grading constraint condition as the grading result of the target data asset corresponding to the first data management index.
8. The data management method of claim 7, wherein the first data management index comprises an importance index; the plurality of grading constraint conditions comprise a first grading constraint condition, a second grading constraint condition and a third grading constraint condition, wherein the level corresponding to the first grading constraint condition is an effective asset, the level corresponding to the second grading constraint condition is a disabled asset, and the level corresponding to the third grading constraint condition is an asset to be observed;
the determining of the grading constraint condition met by the target state information matched with the first data management index to obtain a target grading constraint condition comprises:
sequentially determining whether the target state information matched with the importance index meets the first grading constraint condition, the second grading constraint condition and the third grading constraint condition;
and determining the grading constraint condition met by the target state information matched with the importance index as the target grading constraint condition.
9. The data management method of claim 5, wherein the grading strategy corresponding to a second data management index is a grading model;
the grading of the target data asset, according to the grading strategy corresponding to each data management index, by using the target state information matched with that data management index, to obtain the grading result of the target data asset corresponding to each data management index comprises:
inputting the target state information matched with the second data management index into the grading model to obtain the grading result, output by the grading model, of the target data asset corresponding to the second data management index.
10. The data management method of claim 9, wherein the second data management index comprises a sensitivity index and a security index;
the inputting of the target state information matched with the second data management index into the grading model to obtain the grading result, output by the grading model, of the target data asset corresponding to the second data management index comprises:
inputting the target state information matched with the sensitivity index into a sensitivity grading model to obtain the grading result, output by the sensitivity grading model, of the target data asset corresponding to the sensitivity index;
and inputting the target state information matched with the security index into a security grading model to obtain the grading result, output by the security grading model, of the target data asset corresponding to the security index.
11. The data management method of claim 1, wherein the determining of a target data management strategy according to the grading result of the target data asset corresponding to each data management index comprises:
acquiring a strategy template set, wherein each strategy template in the strategy template set is calibrated with a plurality of grade labels;
determining a grade label matched with each grading result according to the grading result of the target data asset corresponding to each data management index to obtain a target grade label;
and taking the strategy template marked with the target grade label as the target data management strategy.
12. The data management method of claim 1, wherein the determining of a target data management strategy according to the grading result of the target data asset corresponding to each data management index comprises:
inputting the grading result of the target data asset corresponding to each data management index into a strategy decision model to obtain strategy information output by the strategy decision model;
and generating the target data management strategy according to the strategy information.
13. The data management method according to any one of claims 1 to 12, wherein the data management method further comprises:
acquiring metadata corresponding to the target data asset;
and associating the target data management strategy with the metadata corresponding to the target data asset.
14. A data management apparatus, comprising:
an acquisition module, configured to acquire asset state information of a target data asset;
a determining module, configured to determine a plurality of data management indexes of the target data asset;
a matching module, configured to acquire, from the asset state information, target state information matched with each data management index;
a grading module, configured to grade the target data asset based on the target state information to obtain a grading result of the target data asset corresponding to each data management index;
and a management module, configured to determine a target data management strategy according to the grading result of the target data asset corresponding to each data management index, and to manage the target data asset based on the target data management strategy.
15. A storage medium having stored thereon computer readable instructions which, when executed by a processor of a computer, cause the computer to perform the method of any one of claims 1-13.
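For illustration only, the method of claims 1, 4, 7, 8 and 11 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the attribute names, thresholds, level strings, strategy names and all function names are hypothetical, and raw attribute values stand in for the attribute tags of claim 2.

```python
from typing import Callable, Dict, List, Tuple

Tags = Dict[str, object]
Rule = Tuple[Callable[[Tags], bool], str]

# Hypothetical attribute query table (claim 4): index -> matched attributes.
ATTRIBUTE_QUERY_TABLE: Dict[str, List[str]] = {
    "importance": ["access_count_30d", "downstream_jobs"],
    "sensitivity": ["contains_personal_data"],
}

# Hypothetical grading strategies (claim 7): ordered (constraint, level) pairs.
# All thresholds are invented for illustration.
GRADING_STRATEGIES: Dict[str, List[Rule]] = {
    "importance": [
        (lambda t: t["access_count_30d"] >= 100 or t["downstream_jobs"] > 0,
         "effective"),
        (lambda t: t["access_count_30d"] == 0, "disabled"),
        (lambda t: True, "to_be_observed"),
    ],
    "sensitivity": [
        (lambda t: bool(t["contains_personal_data"]), "high"),
        (lambda t: True, "low"),
    ],
}

# Hypothetical strategy templates (claim 11), each calibrated with grade labels.
STRATEGY_TEMPLATES: List[Tuple[Dict[str, str], str]] = [
    ({"importance": "effective", "sensitivity": "high"}, "retain_and_encrypt"),
    ({"importance": "effective", "sensitivity": "low"}, "retain"),
    ({"importance": "disabled"}, "archive_then_delete"),
    ({"importance": "to_be_observed"}, "monitor"),
]


def match_state(asset_state: Tags, index: str) -> Tags:
    """Pick the target state information matched with one index."""
    return {attr: asset_state[attr] for attr in ATTRIBUTE_QUERY_TABLE[index]}


def grade(tags: Tags, index: str) -> str:
    """Check the constraints in order and return the first satisfied level."""
    for constraint, level in GRADING_STRATEGIES[index]:
        if constraint(tags):
            return level
    raise ValueError(f"no grading constraint satisfied for {index}")


def select_strategy(grades: Dict[str, str]) -> str:
    """Return the first template whose grade labels all match the results."""
    for labels, strategy in STRATEGY_TEMPLATES:
        if all(grades.get(k) == v for k, v in labels.items()):
            return strategy
    raise LookupError("no strategy template matches the grading results")


def manage(asset_state: Tags) -> Tuple[Dict[str, str], str]:
    """Grade the asset under every index, then choose a management strategy."""
    grades = {
        index: grade(match_state(asset_state, index), index)
        for index in ATTRIBUTE_QUERY_TABLE
    }
    return grades, select_strategy(grades)


asset = {"access_count_30d": 250, "downstream_jobs": 3,
         "contains_personal_data": True}
grades, strategy = manage(asset)
print(grades, strategy)
```

The rule list is ordered so that the sequential check of claim 8 falls out of a plain loop. A grading model as in claims 9 and 10 could replace any rule list with a callable that maps the matched state information to a level.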

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110138137.3A CN114841481A (en) 2021-02-01 2021-02-01 Data management method, device and storage medium

Publications (1)

Publication Number Publication Date
CN114841481A true CN114841481A (en) 2022-08-02

Family

ID=82561417

Family Applications (1)

Application Number Priority Date Filing Date Title
CN202110138137.3A Pending CN114841481A (en) 2021-02-01 2021-02-01 Data management method, device and storage medium

Country Status (1)

Country Link
CN (1) CN114841481A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089987A (en) * 2023-04-07 2023-05-09 北京元数智联技术有限公司 Data leakage protection method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40072643; Country of ref document: HK)
SE01 Entry into force of request for substantive examination