CN118278962A - Evaluation method and device for data asset value - Google Patents

Evaluation method and device for data asset value

Info

Publication number
CN118278962A
CN118278962A (application number CN202410359786.XA)
Authority
CN
China
Prior art keywords
data
asset
data asset
cost
evaluated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410359786.XA
Other languages
Chinese (zh)
Inventor
干从勇
张彬彬
李峰
李晓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yusys Technologies Group Co ltd
Original Assignee
Beijing Yusys Technologies Group Co ltd
Filing date
Publication date
Application filed by Beijing Yusys Technologies Group Co ltd filed Critical Beijing Yusys Technologies Group Co ltd
Publication of CN118278962A

Abstract

An embodiment of the invention provides a method and a device for evaluating the value of a data asset. The method comprises: collecting a data asset to be evaluated according to a preset evaluation target, and processing the data asset to be evaluated through a plurality of data processing steps to obtain at least one data asset table and the data production cost corresponding to the data asset to be evaluated; for each data asset table, apportioning the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm to obtain the cost information of the data asset table; for each data asset table, evaluating the data management elements corresponding to the data asset table using the analytic hierarchy process to obtain the data management score of the data asset table; and evaluating the asset return rate of the data asset to be evaluated according to the data management scores and cost information of all the data asset tables.

Description

Evaluation method and device for data asset value
Technical Field
The invention relates to the field of data asset value evaluation, in particular to a data asset value evaluation method and device.
Background
The value of data assets is difficult to evaluate: although that value is widely recognized, there is still no cross-regional, cross-industry system or method for pricing and evaluating data. The evaluation process is highly subjective, lacks quantitative indicators, and makes little use of technology, so evaluation results for data assets are inaccurate.
In carrying out the present invention, the applicant has found that at least the following problems exist in the prior art:
The prior art lacks asset value assessment based on the full lifecycle of data assets.
Disclosure of Invention
The embodiment of the invention provides a method and a device for evaluating the value of a data asset, which solve the problem that the prior art lacks asset value evaluation based on the full life cycle of the data asset.
To achieve the above object, in one aspect, an embodiment of the present invention provides a method for evaluating a value of a data asset, including:
collecting a data asset to be evaluated according to a preset evaluation target, and processing the data asset to be evaluated through a plurality of data processing steps to obtain at least one data asset table and the data production cost corresponding to the data asset to be evaluated;
for each data asset table, apportioning the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm to obtain the cost information of the data asset table;
evaluating the data management elements corresponding to each data asset table using the analytic hierarchy process to obtain the data management score of each data asset table;
and evaluating the asset return rate of the data asset to be evaluated according to the data management scores and cost information of all the data asset tables.
Further, according to a preset evaluation target, collecting a data asset to be evaluated, and processing the data asset to be evaluated through a plurality of data processing steps for the data asset to be evaluated to obtain at least one data asset table and data production cost corresponding to the data asset to be evaluated, including:
determining the range of a corresponding data asset to be evaluated according to a preset evaluation target, and collecting the data asset to be evaluated according to the range of the data asset to be evaluated;
performing data cleaning on the data asset to be evaluated to obtain cleaned data, wherein the data cleaning comprises: removing repeated data and repairing data errors;
Integrating the cleaned data of different sources, formats and structures, and storing it according to a preset unified data storage requirement to obtain stored data;
Classifying the stored data according to its nature, source and purpose, and applying labels;
Replacing sensitive information in the stored data with appointed replacement information to obtain desensitized data corresponding to the stored data, and encrypting the desensitized data;
Storing the desensitized and encrypted stored data as at least one data asset table by category;
Collecting data production costs corresponding to the data assets to be evaluated according to a data processing step from the data assets to be evaluated to the at least one data asset table;
Wherein, the data production cost includes: hardware construction cost, software construction cost, operation and maintenance cost, resource cost, data purchase cost, data consultation cost, manpower expenditure cost and site cost.
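The preprocessing pipeline described above (deduplication, desensitization, and grouping into per-category asset tables) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the field names ("id", "category", "phone"), the masking rule, and the grouping key are all hypothetical.

```python
def deduplicate(records):
    """Remove exact duplicate records, keeping the first occurrence."""
    seen, out = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

def mask_sensitive(record, fields=("phone",)):
    """Replace sensitive values with a designated substitute (desensitization)."""
    masked = dict(record)
    for f in fields:
        if f in masked:
            masked[f] = "***"
    return masked

def build_asset_tables(records):
    """Group cleaned, desensitized records into tables keyed by category."""
    tables = {}
    for r in deduplicate(records):
        tables.setdefault(r["category"], []).append(mask_sensitive(r))
    return tables

raw = [
    {"id": 1, "category": "sales", "phone": "1380000"},
    {"id": 1, "category": "sales", "phone": "1380000"},  # exact duplicate
    {"id": 2, "category": "risk",  "phone": "1390000"},
]
tables = build_asset_tables(raw)
print(sorted(tables))               # ['risk', 'sales']
print(tables["sales"][0]["phone"])  # ***
```

Encryption of the desensitized tables, as in the claim, would follow as a separate step and is omitted here.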
Further, for each data asset table, apportioning the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm to obtain the cost information of the data asset table comprises:
generating a processing-procedure lineage relationship corresponding to each data asset table according to the lineage propagation algorithm, wherein the processing-procedure lineage relationship represents the parent-child relationships between the data processing steps;
apportioning the data production cost corresponding to the data asset to be evaluated, according to the processing-procedure lineage relationship corresponding to the data asset table, into the data production cost corresponding to each data processing step used to obtain the data asset table;
and accumulating, according to the processing-procedure lineage relationship, the data production cost corresponding to each data processing step used to obtain the data asset table, to obtain the cost information of the data asset table.
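The lineage-based apportionment above can be sketched as a walk over the parent-child graph of processing steps. This is a minimal sketch under an assumed split rule (a step's accumulated cost is divided equally among its child steps; the claim does not fix a particular split); the step names and cost figures are hypothetical.

```python
def leaf_costs(children, step_cost, root):
    """Propagate production costs downstream along lineage edges.

    children: parent -> list of child processing steps (the lineage graph)
    step_cost: cost incurred at each step
    Returns the fully apportioned cost of each leaf (data asset table).
    """
    alloc = {}

    def walk(node, inherited):
        total = inherited + step_cost.get(node, 0.0)
        kids = children.get(node, [])
        if not kids:                      # leaf = a data asset table
            alloc[node] = total
        else:                             # split shared cost equally
            share = total / len(kids)
            for k in kids:
                walk(k, share)

    walk(root, 0.0)
    return alloc

# Hypothetical lineage: raw -> clean -> {table_a, table_b}
children = {"raw": ["clean"], "clean": ["table_a", "table_b"]}
step_cost = {"raw": 100.0, "clean": 40.0, "table_a": 10.0, "table_b": 5.0}
costs = leaf_costs(children, step_cost, "raw")
print(costs)  # {'table_a': 80.0, 'table_b': 75.0}
```

The shared upstream cost (100 + 40 = 140) is split 70/70, then each table adds its own step cost, matching the claim's accumulate-along-lineage description.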
Further, for each data asset table, evaluating the data management elements corresponding to the data asset table using the analytic hierarchy process to obtain the data management score of the data asset table comprises:
Constructing a judgment matrix of an analytic hierarchy process according to the data management elements corresponding to the data asset table;
Comparing the relative importance among different data management elements corresponding to the data asset table according to the judgment matrix, and determining the weight of each data management element corresponding to the data asset table;
carrying out normalization processing on each column of the judgment matrix, and calculating weight vectors of all data management elements corresponding to the data asset table;
calculating the comprehensive weight of each data management element corresponding to the data asset table according to the weight vector of each data management element corresponding to the data asset table;
Determining the data management score of the data asset table according to each data management element corresponding to the data asset table and the comprehensive weight of each data management element;
the data management elements comprise basic elements, quality elements and usage elements;
the basic elements include data size;
the quality elements comprise: accuracy, consistency, integrity, normalization and timeliness;
the usage elements comprise: accessibility and usage frequency.
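The analytic hierarchy process steps above (judgment matrix, column normalization, weight vector, composite score) can be sketched as follows; this is the standard column-normalization approximation of AHP, and the pairwise intensities and element scores shown are illustrative assumptions, not values from the patent.

```python
def ahp_weights(matrix):
    """Column-normalize a pairwise judgment matrix, then average each row
    to obtain the weight vector (the standard AHP approximation)."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    norm = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(row) / n for row in norm]

# Hypothetical 3x3 judgment matrix comparing the three element groups
# (basic, quality, usage); entry [i][j] = importance of i relative to j.
judgment = [
    [1.0, 1 / 3, 1 / 2],
    [3.0, 1.0,   2.0],
    [2.0, 1 / 2, 1.0],
]
w = ahp_weights(judgment)

# Hypothetical element scores in [0, 1]; the data management score is the
# weighted sum of element scores using the AHP composite weights.
scores = [0.8, 0.9, 0.6]
management_score = sum(wi * si for wi, si in zip(w, scores))
print(round(sum(w), 6))  # 1.0
```

With this matrix the quality group receives the largest weight, consistent with it being rated most important in the pairwise comparisons.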
Further, evaluating the asset return rate of the data asset to be evaluated according to the data management scores and cost information of all the data asset tables comprises:
adding the data management score and the cost information of each data asset table to obtain the total value of the data asset table;
determining the weight corresponding to the total value of each data asset table according to the application scenario and contribution degree of the data asset table;
and computing a weighted average of the total values of all the data asset tables according to their corresponding weights to obtain the asset return rate of the data asset to be evaluated.
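The final weighted combination can be sketched in a few lines. Per the claim, a table's total value is its management score plus its cost information; the weights and numeric values below are hypothetical, and the mapping of the weighted average onto a return rate is left abstract, as the claim does not fix a formula.

```python
def weighted_total_value(tables):
    """Weighted average of per-table total values.

    tables: list of (total_value, weight) pairs, where total_value is the
    sum of a table's data management score and its apportioned cost.
    """
    total_w = sum(w for _, w in tables)
    return sum(v * w for v, w in tables) / total_w

# Hypothetical figures: (score + cost, scenario/contribution weight)
tables = [(0.8 + 80.0, 0.6), (0.7 + 75.0, 0.4)]
print(round(weighted_total_value(tables), 2))  # 78.76
```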
In another aspect, an embodiment of the present invention provides an apparatus for evaluating a value of a data asset, including:
The data asset acquisition unit is used for collecting data assets to be evaluated according to a preset evaluation target, and processing the data assets to be evaluated through various data processing steps aiming at the data assets to be evaluated to obtain at least one data asset table and data production cost corresponding to the data assets to be evaluated;
The data asset cost apportionment unit is used for apportioning, for each data asset table, the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm, to obtain the cost information of the data asset table;
The data asset management unit is used for evaluating the data management elements corresponding to each data asset table using the analytic hierarchy process, to obtain the data management score of each data asset table;
And the data asset evaluation unit is used for evaluating the asset return rate of the data asset to be evaluated according to the data management scores and the cost information of all the data asset tables.
Further, the data asset collection unit comprises:
The data asset collection module is used for determining the range of the corresponding data asset to be evaluated according to a preset evaluation target and collecting the data asset to be evaluated according to the range of the data asset to be evaluated;
The data asset cleaning module is used for cleaning the data of the data asset to be evaluated to obtain cleaned data, and the data cleaning comprises: removing repeated data and repairing data errors;
The data asset storage module is used for integrating the cleaned data of different sources, formats and structures, and storing it according to a preset unified data storage requirement to obtain stored data;
the data asset classification module is used for classifying the stored data according to its nature, source and purpose, and applying labels;
The data asset desensitization encryption module is used for replacing sensitive information in the stored data with appointed replacement information to obtain desensitization data corresponding to the stored data, and encrypting the desensitization data;
A data asset table generation module for storing the desensitized and encrypted stored data as at least one data asset table by category;
The data production cost collecting module is used for collecting the data production cost corresponding to the data asset to be evaluated according to the data processing steps from the data asset to be evaluated to the at least one data asset table;
Wherein, the data production cost includes: hardware construction cost, software construction cost, operation and maintenance cost, resource cost, data purchase cost, data consultation cost, manpower expenditure cost and site cost.
Further, the data asset cost apportionment unit includes:
The processing-procedure lineage relationship determining module, used for generating a processing-procedure lineage relationship corresponding to each data asset table according to a lineage propagation algorithm, wherein the processing-procedure lineage relationship represents the parent-child relationships between the data processing steps;
The cost apportionment module, used for apportioning the data production cost corresponding to the data asset to be evaluated, according to the processing-procedure lineage relationship corresponding to the data asset table, into the data production cost corresponding to each data processing step used to obtain the data asset table;
And the cost information determining module, used for accumulating, according to the processing-procedure lineage relationship, the data production cost corresponding to each data processing step used to obtain the data asset table, to obtain the cost information of the data asset table.
Further, the data asset management unit includes:
The judging matrix construction module is used for constructing a judging matrix of the analytic hierarchy process according to the data management elements corresponding to the data asset table;
The management element weight determining module is used for comparing the relative importance among different data management elements corresponding to the data asset table according to the judging matrix and determining the weight of each data management element corresponding to the data asset table;
The weight vector determining module is used for carrying out normalization processing on each column of the judging matrix and calculating the weight vector of each data management element corresponding to the data asset table;
the comprehensive weight determining module is used for calculating the comprehensive weight of each data management element corresponding to the data asset table according to the weight vector of each data management element corresponding to the data asset table;
The data management score determining module, configured to determine the data management score of a data asset table according to each data management element corresponding to the data asset table and its comprehensive weight;
the data management elements comprise basic elements, quality elements and usage elements;
the basic elements include data size;
the quality elements comprise: accuracy, consistency, integrity, normalization and timeliness;
the usage elements comprise: accessibility and usage frequency.
Further, the data asset assessment unit comprises:
The total value determining module is used for adding the data management score and the cost information of each data asset table to obtain the total value of the data asset table;
the total value weight determining module is used for determining the weight corresponding to the total value of each data asset table according to the application scenario and contribution degree of the data asset table;
And the asset return rate determining module is used for computing a weighted average of the total values of all the data asset tables according to their corresponding weights, to obtain the asset return rate of the data asset to be evaluated.
The technical scheme has the following beneficial effects. The collected data asset is processed through a plurality of data processing steps; the data production cost corresponding to the data asset to be evaluated is then apportioned to determine the cost information of each data asset table; the data management elements are analyzed using the analytic hierarchy process to obtain the data management score of each data asset table; and the value of the data asset is determined by combining the cost information with the data management scores. This realizes asset value evaluation across the full life cycle of the data asset, from its data sources onward, simplifying otherwise complex evaluation steps and evaluating the value of the data asset more comprehensively and accurately. A data asset value evaluation index system is established, and the data asset value evaluation framework is refined based on the factors influencing the data at different stages. The method can be widely applied in industries and enterprise-level applications that hold large volumes of data and expect to conduct data transactions.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method of evaluating the value of a data asset according to one embodiment of the invention;
FIG. 2 is a block diagram of an apparatus for evaluating the value of a data asset according to one embodiment of the invention;
FIG. 3 is a schematic diagram of a retrieval process of the graph retrieval algorithm according to one embodiment of the present invention;
FIG. 4 is a schematic diagram of a data asset assessment model according to one embodiment of the invention;
FIG. 5 is another flow chart of a method of evaluating the value of a data asset in one of the embodiments of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In one aspect, as shown in fig. 1, an embodiment of the present invention provides a method for evaluating a value of a data asset, including:
Step S10, collecting data assets to be evaluated according to a preset evaluation target, and processing the data assets to be evaluated through various data processing steps aiming at the data assets to be evaluated to obtain at least one data asset table and data production cost corresponding to the data assets to be evaluated;
Step S11, for each data asset table, apportioning the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm, to obtain the cost information of the data asset table;
Step S12, evaluating the data management elements corresponding to each data asset table using the analytic hierarchy process, to obtain the data management score of each data asset table;
And step S13, evaluating the asset return rate of the data asset to be evaluated according to the data management scores and the cost information of all the data asset tables.
In some embodiments, the full data life cycle may be divided into four phases according to the links from production to consumption: "mining", "building", "managing" and "using". "Mining" represents the whole process of aggregating heterogeneous data; "building" represents the completeness and coherence of data governance; "managing" refers to the governance and management of the full volume of data assets; and "using" emphasizes the application and return of data assets. An asset value evaluation method based on the full data life cycle is established, and the data asset value evaluation framework is refined from the factors influencing the data at each stage. All factors significant to the value-realization path of the data asset are fully considered, analyzed and classified, and different technical means are adopted for different types of factors, yielding a complete data asset assessment model consistent with the real attributes of the data asset, as shown in FIG. 4. The overall assessment begins with the collection of the data asset to be evaluated and of its data production cost (e.g., data cost identification and entry); a construction stage follows, in which cost apportionment is driven by a cost distribution calculator based on the lineage propagation algorithm (a lineage graph algorithm); a management stage then produces data management scores according to the basic, quality and usage elements of data management; and in the usage stage the data management scores and data production costs are used together, the value of the data asset is evaluated with both non-economic and economic factors in mind, and the data asset element evaluation system is combined with an analysis of the data's application market value to determine the return rate of the data asset.
For different types of data assets, different technical means may be adopted to calculate and evaluate value; some common means and their corresponding factors are as follows. Quantitative analysis: calculating and evaluating the quantity and attributes of the data; for example, for sales data assets, data mining and statistical analysis can be used to calculate sales volume, growth rate and similar metrics. Financial analysis: calculation and evaluation through financial data; for example, for financial data assets, financial ratio analysis, financial models and the like may be used to evaluate value and potential returns. Risk assessment: calculating the value of a data asset by assessing its risk and uncertainty; for example, for a market data asset, its market value and risk can be assessed using a risk model and probability analysis. Brand value assessment: calculating the value of a brand data asset by evaluating its impact on enterprise value; for example, brand value may be calculated using a brand evaluation model and market research data. Technical evaluation: calculating the value of a technical data asset by evaluating its innovativeness, patent value and the like; for example, a patent evaluation model and competitive technology analysis may be used.
The lineage propagation algorithm comprises the following steps. Data cost identification: first, cost items associated with the data assets are identified and recorded, including costs of data collection, storage, processing, maintenance and the like; these can be determined by examining financial records, discussing with the relevant teams, analyzing the data life cycle, and so on. Data cost entry: the identified cost items are associated with the corresponding data assets and the cost data is entered into the calculator, either manually or through integration with a financial system. Lineage graph algorithm: the source, flow and usage of each data asset are tracked using a lineage graph algorithm based on the lineage relationships, which can be identified and captured by means of data labels, metadata, log records and the like. Cost distribution calculation: according to the lineage graph algorithm and the cost data, the calculator distributes and calculates the related costs based on factors such as the usage of the data assets and their flow paths, to determine the cost of each data asset. Value evaluation: based on the calculated cost of a data asset, its value can be assessed by comparison with the asset's expected revenue, market value and the like. The data production costs are comprehensively organized and summarized, and the various data costs are grouped into hardware construction cost, software construction cost, operation and maintenance cost, resource cost, data purchase cost, data consultation cost, manpower expenditure, site cost and the like. Lineage propagation is used for data cost assessment: data costs are tracked, and all collectable costs are apportioned to the root nodes of the data asset tables.
Data costs propagate along the data processing lineage through the lineage transfer relationships of the data. Based on a summary of existing data management scoring dimensions, a data management scoring system is established that includes basic attribute elements (basic elements), data quality elements (quality elements) and usage elements. The data management factors mainly describe operation and maintenance conditions after data development; basic evaluation indicators (corresponding to the basic elements), data quality evaluation indicators (corresponding to the quality elements) and usage evaluation indicators (corresponding to the usage elements) are adopted, and the non-economic factors of the data (the data management elements) are evaluated through the analytic hierarchy process, obtaining the data management score of each data asset table. The overall industry return rate is obtained by analyzing the business conditions of enterprises, and the data management scores and cost information are fitted into the data asset return rate of each table. The overall asset return of an enterprise is related to its industry and to the digitalization capability of that industry. Cost entry for data asset assessment refers to the process of recording and managing the cost information associated with the assessment; its purpose is to ensure an accurate, comprehensive and traceable record of assessment costs for auditing and analysis when needed.
Specifically, the primary tasks of cost entry include the following aspects. Identifying and collecting cost information: cost information related to data asset assessment is identified and collected in coordination with the relevant departments and teams, and may cover human resources, hardware devices, software tools, external services and the like. Recording and sorting costs: the collected cost information is recorded and classified, for instance by establishing a cost classification system or using an existing financial system; common cost classifications include personnel cost, equipment cost, software cost, training cost and the like. Distributing costs: costs are assigned to the related data asset assessment projects or activities according to cost attribution principles, for example by project workload, resource consumption or time allocation. Calculating and summarizing costs: costs are calculated and summarized according to their classification and assignment, which can be automated with a spreadsheet or financial system. Auditing and validating costs: cost audits and verification are carried out regularly to ensure accuracy and reliability, for example through financial audits in cooperation with the finance department or through internal audit methods. Analyzing and reporting costs: cost analysis and reporting is performed on demand, including generating cost reports, formulating cost budgets and performing cost-benefit analysis, to support management decisions and optimize resource allocation.
Cost maintenance for data asset assessment refers to updating and managing the cost information involved in the assessment process. Specifically, it includes the following work. Cost data collection: collecting all cost data related to the data asset assessment, including cost information on hardware devices, software licenses, human resources, training and the like. Cost classification and archiving: classifying and archiving the collected cost data so that every cost item has a clear identification and classification, facilitating subsequent cost analysis and management. Cost updating and adjustment: updating and adjusting the cost data regularly according to actual circumstances; for example, when the price of hardware equipment changes, the cost data must be updated accordingly. Cost analysis and reporting: analyzing the collected cost data to evaluate the cost benefit and return on investment of the data assets, and generating reports based on the cost data to give decision makers information about assessment costs. Cost control and optimization: based on the results of cost data analysis, adopting corresponding control measures to optimize the cost benefit of data asset assessment, for example reducing cost by saving software licensing fees or optimizing human resource allocation.
In other embodiments, as shown in FIG. 5, the four phases of "mining", "building", "managing" and "using" are embodied as four stages: data asset inventory, data asset management, data asset operation and data asset assessment. Data asset inventory: an inventory of data assets is taken, with priority given to important data of business value; the data is classified and graded, and data asset rights are confirmed. Data asset management: a data asset management system with supporting policies and processes is established, and activities such as data standard checking, data quality management and data security are carried out in a standardized manner. Data asset operation: existing data assets are organized into an enterprise-level data asset catalog; users apply to consume the assets, and the element information required for daily management, accounting treatment and information disclosure is recorded. Data asset assessment: data element influencing factors are selected according to a value evaluation scheme, a data asset value evaluation framework is constructed, and the data assets are priced and evaluated. Data asset inventory, data asset management and data asset operation are all very useful for data asset assessment.
Data asset inventory: before evaluating the data assets, a data asset inventory is required, which includes collecting and recording all data assets owned by the enterprise, both structured data (e.g., data in databases) and unstructured data (e.g., files, documents, etc.); this establishes a clear data asset inventory and provides a basis for subsequent evaluation work. Data asset management: data asset management means formulating and executing a data governance strategy to ensure the quality, security and compliance of the data; it plays an important role in data asset evaluation, since data quality assessment, data security risk assessment and compliance assessment can establish the reliability and value of the data assets and provide relevant indicators and guidance for the evaluation. Data asset operation: data asset operation refers to the process of achieving business objectives using the data assets; in data asset evaluation it provides information on data usage and benefits, and by analyzing the usage and business value of the data assets, their return rate and potential value can be evaluated and a basis for decision-making provided. In summary, data asset inventory, management and operation are all useful for data asset assessment: they provide basic information, quality and security assessments for the evaluation and help determine the return rate and potential value of the data assets.
In some embodiments, in the data asset assessment, the four phases of "mining", "building", "managing", and "using" are fused with a data warehouse layered processing architecture specific to big data, which specifically includes the following layers. Data acquisition layer: this layer is mainly responsible for collecting data from various data sources; the data may include structured, unstructured, and semi-structured data, and the sources may include various business systems, log files, third-party data, and the like. Data preprocessing layer: the collected raw data is preprocessed by cleaning, conversion, merging, and similar operations to prepare it for the next step; an ETL (Extract, Transform, Load) tool is generally used to complete this step. Data warehouse layer: the preprocessed data is stored and managed; this layer typically contains one or more data warehouses for storing historical data and supporting complex analytical queries. Data analysis layer: the data stored in the data warehouse is analyzed and mined to extract valuable information and knowledge; this layer usually uses methods such as statistical analysis, data mining, and machine learning. Data application layer: the analysis results are applied to the business, for example by providing support for decision making through report display, early-warning prompts, intelligent recommendation, and the like. Throughout the process, the quality, integrity, consistency, and other properties of the data are evaluated to ensure the value of the data.
After the data warehouse layered processing architecture specific to big data is fused in, the data asset assessment shows the following notable effects. Improved data processing efficiency: the layered architecture decomposes large-scale data into smaller, more manageable parts, which increases processing speed and reduces errors during processing. Enhanced availability and accessibility of data: through the layered architecture, data can be classified by importance and purpose, so users can obtain the data they need more conveniently. Improved data quality and accuracy: during processing, the layered architecture effectively prevents data duplication, and data cleaning and verification at each layer further improve quality and accuracy. Better protection of data security: different access rights can be set for data at different layers, effectively protecting data security. Decision support: for enterprises, the data processed by this architecture helps them better understand their own business conditions and thus make more scientific and reasonable decisions.
The embodiment of the invention has the following technical effects. The collected data assets are processed through a plurality of data processing steps to obtain the data asset tables, and the data production cost corresponding to the data assets to be evaluated is apportioned to determine the cost information of each data asset table; the data management elements are analyzed by the analytic hierarchy process to obtain the data management score of each data asset table; and the value of the data assets is determined by combining the cost information with the data management scores. Asset value evaluation covering the full life cycle of the data assets, starting from the data source, is thereby realized, simplifying otherwise complex evaluation steps and evaluating the value of the data assets more comprehensively and accurately. Through deep analysis of the data assets and exploration of different technical means, a core approach of "blood-edge (lineage) apportionment as the core, with management and transaction elements as adjustments" is defined for the data tables in the "mining, building, managing, using" stages within the enterprise. Through the lineage analysis view, the detailed process by which each item of data accumulates cost value from its source and releases it in application is completely and clearly disclosed to the enterprise, helping the enterprise to identify and analyze high-value and high-cost processes or nodes in a targeted manner and to further optimize the allocation of its internal data resources. By establishing a value evaluation index system based on the full life cycle of the asset, the complex value evaluation flow is simplified and the technical threshold for the enterprise to evaluate asset value is reduced.
Through the data asset value evaluation model, the detailed process by which each item of data accumulates cost value from its source and releases it over its whole life is completely and clearly recorded for the enterprise. The data warehouse layered processing architecture specific to big data is fused with the data blood-edge (lineage) characteristics, and the apportionment of the overall data construction cost to the construction cost of each single data table is computed.
Further, according to a preset evaluation target, collecting a data asset to be evaluated, and processing the data asset to be evaluated through a plurality of data processing steps to obtain at least one data asset table and the data production cost corresponding to the data asset to be evaluated, includes:
determining the range of the corresponding data asset to be evaluated according to the preset evaluation target, and collecting the data asset to be evaluated according to that range;
performing data cleaning on the data asset to be evaluated to obtain cleaned data, wherein the data cleaning includes removing duplicate data and repairing data errors;
integrating the cleaned data from different sources, formats, and structures, and storing it according to preset unified data storage requirements to obtain stored data;
classifying the stored data according to its nature, source, and purpose, and marking it with tags;
replacing sensitive information in the stored data with designated replacement information to obtain desensitized data corresponding to the stored data, and encrypting the desensitized data;
storing the desensitized and encrypted data as at least one data asset table by category;
collecting the data production cost corresponding to the data asset to be evaluated according to the data processing steps from the data asset to be evaluated to the at least one data asset table;
wherein the data production cost includes: hardware construction cost, software construction cost, operation and maintenance cost, resource cost, data purchase cost, data consultation cost, manpower expenditure cost, and site cost.
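The "building" steps above can be sketched in code. The following is a minimal illustration, with invented example data and function names (none of it is the claimed implementation): records are cleaned (duplicate removal, with a simple missing-field drop standing in for error repair), desensitized, grouped by category into asset tables, and the per-step production cost is accumulated under the eight cost categories listed above.

```python
# Illustrative sketch of the "building" pipeline; all names and figures are assumptions.
COST_CATEGORIES = [
    "hardware", "software", "operation_maintenance", "resource",
    "data_purchase", "data_consultation", "manpower", "site",
]

def clean(records):
    # Remove duplicate records and drop records with missing fields
    # (a simple stand-in for "repairing data errors").
    seen, out = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen and all(v is not None for v in r.values()):
            seen.add(key)
            out.append(r)
    return out

def desensitize(records, sensitive_fields, placeholder="***"):
    # Replace sensitive values with the designated replacement information.
    return [{k: (placeholder if k in sensitive_fields else v) for k, v in r.items()}
            for r in records]

raw = [
    {"name": "Alice", "category": "customer", "id_card": "ID-001"},
    {"name": "Alice", "category": "customer", "id_card": "ID-001"},  # duplicate
    {"name": "Bob",   "category": "supplier", "id_card": None},      # data error
]
masked = desensitize(clean(raw), {"id_card"})

# Store by category as asset tables, and accumulate per-step production cost.
asset_tables = {}
for r in masked:
    asset_tables.setdefault(r["category"], []).append(r)

production_cost = {c: 0.0 for c in COST_CATEGORIES}
production_cost["manpower"] += 120.0   # e.g. cleaning/desensitization effort
production_cost["hardware"] += 300.0   # e.g. storage for the asset tables
```

In a real system the cleaning, encryption, and cost-collection logic would be far richer; the sketch only fixes the shape of the data flow from raw records to per-category tables plus a cost ledger.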
In some embodiments, determining the range of the corresponding data asset to be evaluated according to a preset evaluation target and collecting the data asset to be evaluated according to that range constitutes the "mining" stage, which specifically includes one or more of the following steps: determining an evaluation target, collecting data asset information, determining evaluation indexes, evaluating data value, identifying data associations, designing a data convergence scheme, implementing data convergence, verifying the data convergence results, and/or updating the data asset value evaluation. Determining the evaluation target makes the purpose and scope of the evaluation explicit; for example, it determines whether the evaluation covers only a particular data asset or the entire data asset library. Collecting data asset information means collecting and collating detailed information about the data assets, including data source, data type, data volume, data quality, and the like. Determining evaluation indexes means selecting indexes based on the evaluation target; these may be the commercial value, potential risk, market demand, and the like of the data. Evaluating the data value means assessing the value of the data assets according to the selected indexes, which may involve statistical analysis methods, market research, expert judgment, and so on. Identifying data associations means analyzing the associations and interactions between data assets to determine the necessity and value of aggregating the data. Designing the data convergence scheme means designing the scheme according to the evaluation results and the data associations.
This may include determining the manner in which the data is aggregated (e.g., ETL, API integration), the data cleansing and transformation policies, and so on. Implementing data convergence means carrying out the aggregation according to the designed scheme, which may involve data extraction, conversion, and loading steps. Verifying the data convergence results means checking the accuracy and integrity of the aggregated data to ensure it meets expectations. Updating the data asset value evaluation means revising the value assessment of the data assets based on the actual convergence results to reflect the impact of the data convergence. The "building" phase includes one or more of the following steps: data cleansing, data integration, data classification and labeling, data desensitization and encryption, data quality monitoring, and/or data governance policy formulation. Data cleansing preprocesses the collected raw data, including removing duplicate data, repairing erroneous data, unifying data formats, and the like; its purpose is to ensure the quality and accuracy of the data and lay a foundation for subsequent data analysis and application. Data integration combines the cleaned data from different sources, formats, and structures to form a unified data store; this requires data integration techniques such as ETL (Extract, Transform, Load) to meet the requirements of subsequent data analysis and application. Data classification and labeling divides the integrated data into different categories according to its nature, source, purpose, and so on, and adds corresponding labels to the data to facilitate subsequent data management and application.
Data desensitization and encryption protect data privacy: the data is desensitized, that is, sensitive information is replaced by a specified replacement value, and in addition the data is encrypted to prevent illegal access during transmission and storage. Data quality monitoring establishes a monitoring mechanism that continuously tracks the quality indexes of the data assets to ensure their accuracy, completeness, timeliness, and reliability. Data governance policy formulation produces corresponding governance policies according to the business requirements and data characteristics of the enterprise, covering data management, data security, data compliance, and other specifications. In the "building" process, the data asset value assessment is mainly reflected in the following aspects. Data quality: when assessing the value of a data asset, attention is paid to the quality of the data, including its accuracy, integrity, timeliness, and so on; high-quality data assets are more valuable. Data value mining: during data integration and classification, the potential value in the data is mined and the associations and rules between data are discovered, providing support for the enterprise's business decisions. Data security and compliance: the value of a data asset is evaluated taking its security and compliance into account, ensuring that the asset brings value to the enterprise on the premise of compliance. Data application scenarios: the application potential of the data asset in different business scenarios is analyzed, and its value contribution in each scenario is evaluated.
Data lifecycle: the overall process of the data from creation to retirement is considered, and the value changes of the data asset at different stages are analyzed to evaluate its value comprehensively.
Starting from the whole life cycle of the data asset, the total input cost (the data production cost) of the data asset comprises hardware construction cost, software construction cost, operation and maintenance cost, resource cost, data purchase cost, data consultation cost, manpower expenditure, site cost, and other items, covering the acquisition, purchase, storage, computation, management, application, and other links of the data asset. The specific cost sub-items and descriptions of the various costs are shown in Table 1.
TABLE 1 Description of the total input cost elements of the data asset
Further, for each data asset table, apportioning the data production cost corresponding to the data asset to be evaluated according to the blood-edge (lineage) propagation algorithm to obtain the cost information of the data asset table includes:
generating the processing-procedure lineage relationship corresponding to each data asset table according to the lineage propagation algorithm, wherein the processing-procedure lineage relationship represents the parent-child relationships between the data processing steps;
apportioning, according to the processing-procedure lineage relationship corresponding to the data asset table, the data production cost corresponding to the data asset to be evaluated onto the data production cost of each data processing step used to obtain the data asset table;
and accumulating, along the processing-procedure lineage relationship, the data production cost of each data processing step used to obtain the data asset table, thereby obtaining the cost information of the data asset table.
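The apportion-then-accumulate rule above can be sketched numerically. The following is a minimal illustration under assumed data (the step names, costs, and the even split among a step's children are all assumptions, not the claimed algorithm): each processing step carries its own apportioned cost, and the cost of an asset table accumulates the costs of every ancestor step along its lineage back to the root, with a shared ancestor's accumulated cost divided among its children so that the total is conserved.

```python
# Hypothetical processing-step lineage DAG: child -> list of parent steps.
lineage_parents = {
    "clean":     ["source"],
    "integrate": ["clean"],
    "table_a":   ["integrate"],
    "table_b":   ["integrate"],
}
# Cost apportioned directly to each step (assumed figures).
step_cost = {"source": 100.0, "clean": 40.0, "integrate": 60.0,
             "table_a": 10.0, "table_b": 20.0}

# How many children each step feeds (used to split shared upstream cost).
children_count = {}
for child, parents in lineage_parents.items():
    for p in parents:
        children_count[p] = children_count.get(p, 0) + 1

def accumulated_cost(node, memo=None):
    # Own step cost plus each parent's accumulated cost, split evenly
    # among that parent's children so no cost is double-counted.
    memo = {} if memo is None else memo
    if node in memo:
        return memo[node]
    inherited = sum(accumulated_cost(p, memo) / children_count[p]
                    for p in lineage_parents.get(node, []))
    memo[node] = step_cost[node] + inherited
    return memo[node]

# table_a: 10 + (60 + 40 + 100) / 2 = 110; table_b: 20 + 200 / 2 = 120
```

Note the conservation property of this scheme: the accumulated costs of all leaf tables sum exactly to the total production cost, which matches the idea of "apportioning" rather than duplicating upstream cost.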
In some embodiments, the "building" stage further includes, for each data asset table, apportioning the data production cost corresponding to the data asset to be evaluated according to the blood-edge (lineage) propagation algorithm to obtain the cost information of the data asset table. The data cost is tracked: the collectable data production cost is assigned to the root node of the data asset table and propagated along the data processing lineage through the lineage transfer relationships of the data. The root node is the topmost node in the data asset lineage propagation graph, that is, the origin of the entire lineage relationship. In data asset lineage propagation, the root node represents the original data source or the earliest point of data generation; it is the starting point of the whole data flow, and no other node can be traced back beyond it. Typically, the root node is a point where original data is generated, such as a table in a database or a file in a file system. When evaluating and apportioning data costs, the root node is important because it represents the initial investment cost of the entire data flow. By progressively apportioning costs from the root node along the lineage propagation paths, the cost of each data asset can be assessed more accurately and assigned appropriately. This gives a better understanding of the value of each data asset and its contribution to the business, helping to make more informed decisions and resource allocations.
In some embodiments, propagating the data production cost along the data processing lineage through the lineage transfer relationships of the data may proceed as follows. Determine the root node of the data asset: find the starting node in the data processing flow, that is, the source or input point of the data, which may be a source system, an external data source, or another data set. Understand the data processing flow: study in detail each step and transformation the data undergoes during processing, including data cleaning, conversion, integration, and so on. Identify the factors affecting data cost from the lineage propagation relationships: in each data processing step, determine the factors that affect the data cost, which may include computing resources, storage space, labor costs, and the like. Track the data lineage propagation paths: establish the lineage propagation paths by recording the input and output relationships of each step in the data processing flow, which can be realized with a metadata management system or a data lineage analysis tool. Calculate the data cost: based on the cost factors determined in each data processing step, calculate the data cost along the lineage propagation paths; this may be a simple sum or a weighted calculation over the different factors. Analyze the data cost: evaluate and analyze the calculated data cost, so as to identify high-cost data processing steps, find opportunities for optimization, and propose corresponding improvement measures.
The blood-edge (lineage) propagation algorithm is based on a graph search algorithm. Graph search is a field within graph algorithms that searches a graph using tree-based traversal: a branch is extended until an end point is found, the search then backtracks along the branch, and finally a path from the starting point to the end point is obtained. As shown in FIG. 3, the left side presents a graph data structure and the right side a search path based on that structure.
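The search-with-backtracking described above corresponds to a standard depth-first path search; the following sketch (with an invented example graph) shows how one start-to-end path is obtained by extending a branch and backtracking when it dead-ends.

```python
def dfs_path(graph, start, goal, path=None):
    # Depth-first search with backtracking: extend the current branch; if the
    # goal is reached, return the path; otherwise fall through (backtrack) and
    # try the next neighbor.
    path = [start] if path is None else path + [start]
    if start == goal:
        return path
    for nxt in graph.get(start, []):
        if nxt not in path:          # avoid revisiting nodes on this branch
            found = dfs_path(graph, nxt, goal, path)
            if found:
                return found
    return None                      # no path from start to goal

g = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
# dfs_path(g, "A", "E") -> ["A", "B", "D", "E"]
```

A breadth-first variant would instead return the shortest path; which traversal the embodiment of FIG. 3 uses is not specified here, so the depth-first form is only one possible reading.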
The collected costs are distributed to the data asset tables, which may be done as follows. Determine the data asset tables: first, determine the data asset tables whose costs need to be evaluated; these may be database tables, data files, data sets, and so on. Collect the related costs: gather the cost information associated with each data asset table, including costs for data acquisition, storage, processing, maintenance, and the like. Determine the cost apportionment factors: decide the factors used to apportion the costs, such as the number of rows in the data asset table, the data volume, or the frequency of use. Calculate the cost apportionment proportions: according to the chosen factors, calculate the apportionment proportion for each data asset table; a simple proportional method can be used to scale the total cost to each table. Apportion the costs to the data asset tables: allocate the corresponding cost to each data asset table according to the calculated proportions, and record the apportioned cost information for later management and analysis. Update the cost apportionment periodically: costs may change over time, so the apportionment needs to be updated periodically to reflect the latest cost situation.
Tracking data costs and evaluating the lineage propagation of the data assets may specifically include the following. Determine the lineage relationships of the data assets: first, determine the lineage relationships between the data assets, that is, the source, transfer, and consumption relationships of the data; this may be accomplished by analyzing the data flows, data conversions, and data storage processes. Tag the cost information of the data assets: for each data asset and its associated lineage relationships, tag the corresponding cost information, which may include cost data for links such as data acquisition, transmission, storage, and processing. Record and update the cost information: build a cost management system or database to record and update the cost information of the data assets, ensuring it stays synchronized with the actual situation and is updated in time. Track the propagation of the data costs: by analyzing the lineage relationships and cost information between data assets, the propagation paths of the data costs can be tracked; starting from the source of the data, the consumption and associated costs are followed along the lineage chain. Evaluate the data cost: assess the cost of each data asset based on the propagation paths and the associated cost information; the cost at each link may be calculated, or an overall cost assessment of the data asset may be made based on the required metrics.
In other embodiments, graph retrieval algorithms may be applied in the data asset assessment process to help process specific asset data. The specific processing steps are as follows. Construct a data asset graph: represent all data assets as nodes and the relationships between them as edges; the graph may include data tables, databases, files, and so on. Define the evaluation indexes: according to the assessment requirements, determine indexes such as data value, data reliability, and data usage; these indexes can serve as node attributes. Apply a graph retrieval algorithm: select a suitable algorithm to search and traverse the data asset graph according to the required indexes; common choices include depth-first search (DFS), breadth-first search (BFS), Dijkstra's algorithm, and minimum spanning tree algorithms. Process the asset data: during graph retrieval, specific asset data may be processed according to the results of the algorithm, such as calculating data value, evaluating data reliability, or determining data usage. Output the evaluation results: generate an evaluation report from the processed asset data, including the index values of each asset and an overall evaluation conclusion. By applying a graph retrieval algorithm, the data assets related to the evaluation indexes can be found in the data asset graph, then processed and evaluated to obtain the assessment results; this helps in understanding the value and availability of the data assets and thus in making better decisions. In graph retrieval algorithms, a "table" generally refers to a stored form of the data, which can also be understood as a table in a database; there are relationships between the tables and the various items in the data cost.
Hardware construction cost: the cost of purchasing and configuring hardware devices, including servers and storage devices, is related to the storage of the tables. Software construction cost: related to the database system holding the tables, including the purchase and development of the database software. Operation and maintenance cost: related to the maintenance and management of the tables, including database administrators' wages, server maintenance costs, and so on. Resource cost: related to the access and use of the tables, including server resource costs, network bandwidth costs, and the like. Data purchase cost: related to the data sources of the tables, including the cost of obtaining data from outside. Data consultation cost: related to the data analysis and decision making around the tables, including the cost of advisory services. Manpower expenditure: related to the maintenance, development, and analysis of the tables, including personnel wages, training, and so on. A specific "table" refers to a data table in a database that holds a particular type of data. Relationships can be established between multiple tables through association keys (such as primary keys and foreign keys), enabling cross-table data association and queries. The storage organization of the tables is typically managed automatically by the database management system, and different storage engines may implement different storage mechanisms (e.g., B-trees or hashes). By managing the storage organization and data content of the tables, data can be retrieved from them and analyzed using a graph retrieval algorithm.
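Treating evaluation indexes as node attributes and relationships as edges, as described above, can be illustrated with a small traversal. The sketch below (all asset names and index values are invented) uses breadth-first search to gather every asset reachable from a starting table and aggregate one index, here a "value" score.

```python
from collections import deque

# Hypothetical data asset graph: node -> evaluation-index attribute and edges.
asset_graph = {
    "orders":    {"value": 5, "edges": ["customers", "payments"]},
    "customers": {"value": 3, "edges": []},
    "payments":  {"value": 4, "edges": ["customers"]},
}

def related_value(start):
    # Breadth-first traversal from `start`, summing the "value" index over
    # every asset reachable through the relationship edges.
    seen, queue, total = {start}, deque([start]), 0
    while queue:
        node = queue.popleft()
        total += asset_graph[node]["value"]
        for nxt in asset_graph[node]["edges"]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return total

# related_value("orders") aggregates orders (5) + customers (3) + payments (4)
```

The same traversal skeleton works for any node attribute (reliability, usage heat, cost), so one pass over the asset graph can feed several of the evaluation indexes at once.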
In some embodiments, the lineage transfer relationships of the data can be obtained as follows. Metadata management system: a specialized metadata management system can track and record the lineage relationships of the data assets; such systems follow the source, conversion, and use of the data and construct a complete lineage transfer graph. Data acquisition tools: during data acquisition, the lineage relationships can be recorded automatically by the acquisition tool, which captures the source, destination, and conversion process of the data and correlates them into lineage transfer relationships. Database logs: some database systems record an operation log of the data, including reads, writes, and modifications; by analyzing these logs, the lineage transfer relationships can be reconstructed. Manual recording: in the absence of automated tools, manually recording the lineage relationships is also viable; through communication and investigation with the relevant personnel, the source and usage of the data can be learned and a lineage transfer graph drawn by hand.
In some embodiments, the apportionment of the overall data construction cost to the construction cost of a single data table may be computed as follows. Determine the total cost of data construction (the data production cost corresponding to the data asset to be evaluated): this total generally includes hardware facility investment, software development and purchase, labor costs (covering data acquisition, cleaning, arrangement, maintenance, etc.), and so on. Calculate the weight of each data table: the weight can be defined according to factors such as the table's size (e.g., number of records), complexity (e.g., number and types of fields), and frequency of use. Apportion the cost: the total cost is then apportioned over the data tables according to their weights, that is, data table cost = total data construction cost × data table weight. This is an approximation, and its accuracy depends on how the weights are defined and calculated.
Further, for each data asset table, evaluating the data management elements corresponding to the data asset table by the analytic hierarchy process to obtain the data management score of the data asset table includes:
constructing a judgment matrix of the analytic hierarchy process according to the data management elements corresponding to the data asset table;
comparing the relative importance of the different data management elements corresponding to the data asset table according to the judgment matrix, and determining the weight of each data management element;
normalizing each column of the judgment matrix, and calculating the weight vector of the data management elements corresponding to the data asset table;
calculating the composite weight of each data management element according to the weight vector of the data management elements corresponding to the data asset table;
determining the data management score of the data asset table according to the data management elements and their composite weights;
wherein the data management elements comprise basic elements, quality elements, and usage elements;
the basic elements include data size;
the quality elements include: accuracy, consistency, integrity, normalization, and timeliness;
the usage elements include: accessibility and usage heat.
The Analytic Hierarchy Process (AHP) is a multi-criteria decision-making method that ranks different factors by weight. Its specific steps are as follows. Determine the hierarchy: first determine the hierarchical structure of the decision problem, that is, decompose the problem into layers of factors; in data asset evaluation, the data quality evaluation indexes can be divided into accuracy, consistency, integrity, normalization, and timeliness. Construct the judgment matrix: next, construct a judgment matrix to compare the relative importance of the different factors; the judgment matrix is a square matrix in which each element represents a pairwise comparison of relative importance between two factors, each pair being given a relative weight according to importance, usually scored on a comparison scale of 1 to 9. Calculate the weight vector: normalize each column of the judgment matrix by dividing each element by its column sum, then compute the average of each row to obtain the weight vector of the factors. Consistency test: to ensure the consistency of the judgment matrix, calculate the consistency index (Consistency Index, CI) and the consistency ratio (Consistency Ratio, CR); if CR is less than 0.1, the judgment matrix is considered consistent, otherwise the comparison matrix needs to be adjusted. Composite weight ranking: according to the weight vector of each factor, a composite weight can be calculated by weighted summation and used to rank the different factors.
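The AHP steps above can be worked through numerically. The sketch below uses an illustrative 3×3 judgment matrix for three quality elements (accuracy, integrity, timeliness; the pairwise judgments are invented), normalizes each column, averages the rows into a weight vector, and checks the consistency ratio against the 0.1 threshold.

```python
# Illustrative pairwise judgments on a 1-9 scale (accuracy, integrity, timeliness).
matrix = [
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 3.0],
    [1/5, 1/3, 1.0],
]

def ahp_weights(m):
    # Normalize each column by its sum, then average each row.
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    norm = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(row) / n for row in norm]

def consistency_ratio(m, w):
    # lambda_max is estimated from (M.w)_i / w_i averaged over rows;
    # CI = (lambda_max - n) / (n - 1); CR = CI / RI (Saaty's random index).
    n = len(m)
    mw = [sum(m[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam = sum(mw[i] / w[i] for i in range(n)) / n
    ci = (lam - n) / (n - 1)
    ri = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}[n]
    return ci / ri if ri else 0.0

w = ahp_weights(matrix)
cr = consistency_ratio(matrix, w)
# cr < 0.1 indicates an acceptably consistent judgment matrix
```

For this matrix the weights come out roughly (0.63, 0.26, 0.11) with CR ≈ 0.03, so the judgments pass the consistency test; a CR at or above 0.1 would call for revisiting the pairwise comparisons.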
Through the above steps, the data quality evaluation indexes can be ranked with the analytic hierarchy process and their relative importance determined, allowing accurate decisions and prioritization in data asset assessment. In data asset assessment, non-economic factors are factors related to data management that do not involve cost or benefit in the economic sense. They typically include the following aspects. Data quality: factors such as the accuracy, integrity, consistency, and reliability of the data. Data security: the ability to protect the data from unauthorized access, modification, deletion, or leakage. Data availability: the ability to ensure the data can be accessed and used whenever needed. Data governance: the rules, procedures, and mechanisms of data management that ensure the data is properly managed and used within the organization. Data compliance: ensuring that the data management meets regulatory, legal, and industry-standard requirements.
In some embodiments, the relationships among software construction cost, data consultation cost, resource cost, hardware construction cost, and operation and maintenance cost with respect to the investment in the data assets may also be assessed by the analytic hierarchy process, a quantitative analysis method for comparing and evaluating the relative importance of multiple factors. In such an evaluation system, these costs can be regarded as economic factors, and the analytic hierarchy process can determine their relative weights or importance. By comparing and evaluating the different cost factors, it can be determined which factors carry higher weight in the data asset evaluation, guiding the decision maker toward wiser choices in resource allocation and investment decisions. It should be noted that economic and non-economic factors are interrelated in data asset assessment; for example, improving data quality and data security may require investing more resources and technology, which involves economic costs. Therefore, when performing data asset assessment, economic and non-economic factors should be considered together to achieve a comprehensive optimization of the data management.
In some embodiments, data element value assessment is largely divided into two steps: first, explicitly defining the data elements whose value is to be evaluated; second, performing the evaluation calculation with an appropriate method. When an enterprise selects the index set of data management elements, it is mainly influenced by two factors: on the one hand, the maturity level of the enterprise's current data infrastructure, which determines whether the selected element indexes can actually be calculated; on the other hand, evaluation indexes highly correlated with the value of the data asset should be selected in combination with the enterprise's current business conditions and industry attributes.
The basis for selecting the element indexes used to calculate the total input cost of the data asset (the data production cost corresponding to the data asset to be evaluated) depends on the specific environment and requirements. The following are some common index selection elements. Cost analysis purpose: determining the purpose of the cost analysis is critical, such as determining return on investment, cost effectiveness or cost control; Data asset type: different types of data assets may have different cost element metrics; for example, purchase costs and maintenance fees may be considered for hardware assets, while licensing costs and update and upgrade costs may be considered for software assets; Organization requirements: select appropriate indexes based on the needs and priorities of the organization; for example, if the organization is particularly concerned with cost-effectiveness, benefit-related indexes such as cost-savings ratio or return on investment may be selected; Industry standard: select indexes with reference to industry standards and best practices, which provide many commonly used index selection references and ensure that the selected indexes are consistent with those of peer organizations for comparison and evaluation; Data availability: select indexes with reliable data source support, taking into account the availability and reliability of the data, to ensure the accuracy and reliability of the cost analysis.
One possible method for selecting evaluation indexes highly associated with the value of a data asset, which needs to be customized according to the business conditions and industry characteristics of the enterprise, is as follows. Business value: evaluated according to the degree to which the data is applied in business processes and its influence on business decisions; Rarity: evaluated according to the acquisition difficulty and uniqueness of the data; Integrity: evaluated according to the completeness of the data content; Accuracy: evaluated according to the quality of the data, including its accuracy, consistency and timeliness; Compliance: evaluated according to the compliance of the data, including its security, privacy and regulatory conformance. The following factors can be considered in combination with the business conditions of the enterprise: for sales-oriented enterprises, sales data and customer data may have higher value; for research-oriented enterprises, research and development data and patent data may have higher value. Industry attributes can also affect the value of data: in the financial industry, transaction data and credit data may have high value, while in the medical industry, patient data and drug data may have high value. All the above factors can serve as references for evaluation indexes, and specific indexes need to be customized according to the specific conditions of the enterprise.
Table 2 shows a data management element evaluation system table
Table 2 Data management element evaluation system table
The records of the data management element evaluation system of Table 2 are used for evaluating and comparing the performance and value of different data management elements; they contain the indexes and related data of each data management element to help a decision maker make reasonable decisions. When these records are used, the data management elements may be scored and ranked according to different criteria. Each index may have different weights and scoring criteria depending on the particular evaluation system. The evaluator can adjust the weights of the different indexes according to their own needs and preferences, so as to reflect the relative importance attached to each index. Then, a composite score is calculated from the scores of each data management element on the respective indexes for comparison and decision making. The contents of these records can also be used for tracking and monitoring the data management elements: by periodically updating the data and re-evaluating the indexes, changes and improvements in the data management elements can be tracked and the evaluation results and decisions adjusted in time.
In some embodiments, for each data asset table, the data management elements corresponding to the data asset table are evaluated by the analytic hierarchy process to obtain a data management score of the data asset table; this is the management stage. In the management stage, the main objective of data asset value evaluation is to ensure that the whole data asset is effectively governed and managed so as to realize its maximum value. The management stage may include one or more of the following steps. Data consolidation: integrating and organizing data sources to ensure data quality, accuracy, timeliness and integrity, which helps to increase the value of the data asset; Data classification and labeling: classifying and labeling the data according to its characteristics and business requirements, which helps to improve the discoverability and usability of the data, thereby increasing the value of the data asset; Data governance: implementing data governance policies, including data quality management, data security management, data privacy protection and data compliance checking, which help to ensure the reliability and compliance of the data asset and further enhance its value; Data value mining: using technologies such as data analysis and data mining to discover potential value and correlations in the data assets, which helps to find new business opportunities and improve the practical value of the data assets; Data asset catalog: establishing and perfecting the data asset catalog and recording the basic information, usage modes and access control of the data assets, thereby improving their searchability and reusability; Data value evaluation: periodically evaluating the value of the data asset according to the influencing factors at each stage, which helps to track changes in the value of the data asset in real time and provides a basis for optimizing and adjusting it; Data asset application and return: promoting the application of the data asset in business scenarios so that its value is realized, while tracking the application effect and continuously optimizing the data asset to improve its value. In the management stage, data asset value assessment is embodied mainly in the following aspects. Improving data quality: the accuracy and integrity of the data are ensured by means such as data consolidation and governance, thereby improving the value of the data asset; Enhancing data availability: the discoverability and usability of the data are improved through data classification, labels and other means, so that the data asset is easier for business departments to use; Ensuring data security and compliance: data governance measures ensure the security and compliance of the data asset and reduce potential risk; Discovering potential value: potential value and correlations in the data assets are discovered through data mining and analysis, providing support for business innovation; Improving reusability of data assets: establishing a data asset catalog that records the basic information and usage modes of the data assets enables business departments to reuse them; Optimizing data assets: the data assets are optimized and adjusted according to the data asset value evaluation results to better meet business requirements; Realizing data asset value: the application of the data assets in actual business scenarios is promoted, converting them into business achievements and maximizing their value.
In some embodiments, a data management scoring system based on data base attribute elements and data quality elements includes the following dimensions. Data availability: evaluating whether the data is easy to acquire, share and utilize, and whether it has sufficient reliability and stability; Data integrity: evaluating whether the data is complete, contains all required fields and records, and whether data loss or errors exist; Data accuracy: evaluating the accuracy and correctness of the data, whether it conforms to expected standards and specifications, and whether data redundancy or inconsistency exists; Data consistency: evaluating the consistency of data between different systems or data sources and whether differences or mismatches exist; Data security: evaluating the security and confidentiality of the data, whether appropriate protection measures are in place, and whether potential security risks exist; Data traceability: evaluating whether the source and change history of the data are traceable, and whether sufficient audit and log records exist; Data documentation: evaluating the degree and quality of documentation of the data, and whether clear data definitions, field specifications and data dictionaries exist. The purpose of these scoring dimensions is to evaluate and measure the quality and effectiveness of data management. The dimensions of the data management score may be obtained in quantitative and qualitative ways: quantitative scores may be based on measured values of data indicators, such as the data loss rate and data error rate, while qualitative scores may be obtained through expert evaluation, user feedback or questionnaires. These scores may be calculated and measured according to specific evaluation methods and criteria, ultimately yielding a score for each dimension.
In some embodiments, the base rating indexes and data quality rating indexes in the data management scoring system for data asset assessment may be used as follows. Data size and data volume: these indexes measure the size and number of data sets; by evaluating them, the storage and processing resources required for data management can be determined and benchmarks provided for data management plans; Growth rate: the growth rate index evaluates how fast the data grows; by knowing the growth rate, future data demand can be predicted and the data management strategy planned and adjusted accordingly; Accuracy: the accuracy index measures how accurate and correct the data is, and can be evaluated by comparing how well the data matches the real situation; improving data accuracy enhances the reliability and credibility of decisions; Consistency: the consistency index evaluates the consistency of the data across different sources and time points, which can be assessed by comparing whether the values, formats and definitions of the data agree; ensuring data consistency reduces errors and conflicts and improves the reliability and credibility of the data; Integrity: the integrity index evaluates whether the data is complete and contains all required attributes and records; ensuring data integrity avoids problems caused by missing information and incomplete data; Normalization: the normalization index evaluates whether the data meets predetermined criteria, specifications and conventions; following the specification improves the consistency, comparability and operability of the data; Timeliness: the timeliness index evaluates the update speed and freshness of the data; ensuring that the data is updated in time ensures that decisions and analyses are based on the latest data.
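Several of the quantitative quality indexes above can be measured directly from the records themselves. The following is a hedged sketch, where the record layout and field names are illustrative assumptions, not part of the original disclosure:

```python
# Illustrative sketch: quantitative measurement of three quality indexes
# (accuracy would need a ground-truth predicate; integrity and timeliness
# are computed here on a toy record set).
from datetime import date

def integrity_score(records, required_fields):
    """Integrity: fraction of records containing every required field."""
    ok = sum(1 for r in records
             if all(r.get(f) is not None for f in required_fields))
    return ok / len(records)

def timeliness_score(records, as_of, max_age_days):
    """Timeliness: fraction of records updated within the staleness window."""
    fresh = sum(1 for r in records
                if (as_of - r["updated"]).days <= max_age_days)
    return fresh / len(records)

records = [
    {"id": 1, "amount": 10, "updated": date(2024, 3, 1)},
    {"id": 2, "amount": None, "updated": date(2023, 1, 1)},  # field missing, stale
]
integrity = integrity_score(records, ["id", "amount"])          # 0.5
timeliness = timeliness_score(records, date(2024, 3, 10), 30)   # 0.5
```

Scores of this kind can then feed the quantitative side of the scoring system, alongside qualitative inputs such as expert evaluation.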
Further, the evaluating the asset return rate of the data asset to be evaluated according to the data management scores and the cost information of all the data asset tables comprises:
adding the data management score and the cost information of each data asset table to obtain the total value of the data asset table;
determining the weight corresponding to the overall value of the data asset table according to the application scene and the contribution degree of the data asset table;
and performing a weighted average of the total values of all the data asset tables according to the weights corresponding to the total values of all the data asset tables, to obtain the asset return rate of the data asset to be evaluated.
In some embodiments, the asset return rate is derived by adding the data management score to the cost information and then performing a weighted average based on the data application value. The asset return rate is calculated as follows: collect the data assets to be evaluated and preprocess them to obtain at least one data asset table and the data production cost corresponding to the data assets to be evaluated; calculate the cost information of each data asset table through the lineage propagation algorithm; obtain the data management score of each data asset table using the analytic hierarchy process; add the data management score to the cost information of each data asset table to obtain the total value of that table; assign a weight to each data asset table according to its application scenario and degree of contribution; and perform a weighted average of the total values of the data asset tables according to the weights to obtain the asset return rate.
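The final weighted-average step above reduces to a short function. This is a minimal sketch; the field names and figures are illustrative assumptions:

```python
# Minimal sketch of the asset return rate computation described above.

def asset_return_rate(tables):
    """Total value of a table = data management score + cost information;
    the asset return rate is the weighted average of the total values."""
    total_weight = sum(t["weight"] for t in tables)
    return sum((t["score"] + t["cost"]) * t["weight"]
               for t in tables) / total_weight

tables = [
    {"score": 80.0, "cost": 20.0, "weight": 2.0},  # high-contribution table
    {"score": 55.0, "cost": 15.0, "weight": 1.0},  # lower-contribution table
]
rate = asset_return_rate(tables)  # (100*2 + 70*1) / 3 = 90.0
```

Dividing by the weight sum means the weights need not be pre-normalized; only their relative proportions matter.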
In some embodiments, the specific steps of the "use" phase are as follows. Data application planning: planning the application scenarios and modes of the data asset according to business requirements and data characteristics; the aim of this step is to ensure that the data asset can generate value in suitable scenarios; Data processing and analysis: performing the necessary processing and analysis on the data according to the application plan to meet service requirements, including operations such as data cleaning, data conversion and data mining; Data assetization: converting the processed and analyzed data into reusable assets that provide continuous value to the business, including building data models, data warehouses, data services, etc.; Data value realization: realizing business targets, improving business efficiency or creating new business value through data application. In the "use" process, data asset value assessment is embodied mainly in the following points. Data application effect assessment: evaluating the effect of the data asset in practical application, such as improvement of business indexes and working efficiency; Data value mining degree assessment: evaluating the degree to which the data asset is mined in the business scenario, including the depth and breadth of data processing and analysis; Data asset reuse efficiency assessment: evaluating the utilization rate of the data asset across multiple applications, reflecting its sustainable value; Contribution of the data asset to business innovation: evaluating the role of the data asset in business innovation, including development of new business, product improvement, etc.; Data security and compliance assessment: evaluating the security and compliance of the data asset during application to ensure its stable value.
By combining the above factors, the value of the data asset in the "use" process is comprehensively evaluated, providing a basis for the management and optimization of the data asset.
In some embodiments, to analyze the business conditions of the enterprise and derive the industry return rate for the overall data asset, business data may be collected as follows. First, collect the financial statements and operating data of the enterprise, including the income statement, balance sheet and cash flow statement, which provide key information about the enterprise's operating status and financial health; Determine the data assets: identify and quantify the data assets of the enterprise, which may include customer databases, market research data, sales data, supply chain data, etc., ensuring a clear understanding of the value and importance of the data assets; Analyze the operating conditions: analyze the financial status, profitability and operating efficiency of the enterprise using the collected business data, which may include calculating financial ratios and metrics such as net profit margin, return on assets and current ratio to assess the business status of the enterprise; Industry comparison: perform industry comparison analysis, including comparing the financial indicators of competitors and peer enterprises, which can help determine the relative competitive advantages of the enterprise and the industry average level; Calculate the industry return rate: by comparing the data asset value of the enterprise with the industry average, the industry return rate of the data asset can be calculated, for example by comparing the data asset revenue of the enterprise with its total revenue. The particular analysis method and index selection may vary from industry to industry and enterprise to enterprise. When performing the data asset assessment, it is recommended to consult a professional financial analyst or data analyst to ensure accuracy and reliability.
In some embodiments, to fit the data management scores to the data asset return rate of each table, the following steps may be performed. First, determine the calculation method of the data management score, which may include indexes such as data accuracy, data integrity and data reliability; the weights and scoring rules of these indexes can be determined according to the actual situation. Then, determine the calculation method of the data asset return rate of each table, which may include indexes such as data usage frequency, data value and data quality; likewise, the weights and scoring rules of these indexes are determined according to the actual situation. Next, evaluate the data management score and data asset return rate of each table, scoring each table according to the defined indexes and scoring rules; tools such as a data quality management tool or a data asset management system may be used to automate the evaluation process. Then, match the data management score with the data asset return rate of each table; the relationship between the two indexes can be fitted using linear regression or other relevant statistical analysis methods. Finally, according to the fitting result, the data asset return rates of other tables can be predicted: by applying the fitted model, the data asset return rates of other tables can be estimated, providing decision support and optimizing data management strategies. This process may involve some data collection and data analysis effort. Meanwhile, the fitting result is only a predicted value, and the actual data asset return rate may be affected by other factors; therefore, in practical applications, the model needs to be continuously monitored and adjusted to ensure its accuracy and effectiveness.
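The linear regression step above can be sketched with ordinary least squares. The score and return-rate figures below are invented for illustration; real data would come from the evaluation process:

```python
# Hedged sketch: fitting data management scores (x) to observed data
# asset return rates (y) with ordinary least squares, one instance of
# the linear regression mentioned above.

def fit_linear(xs, ys):
    """Return slope a and intercept b of the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    return a, mean_y - a * mean_x

def predict(a, b, score):
    """Estimate the return rate of another table from its management score."""
    return a * score + b

scores = [60.0, 70.0, 80.0, 90.0]    # data management scores per table
returns = [0.05, 0.08, 0.11, 0.14]   # observed return rates per table
a, b = fit_linear(scores, returns)
estimate = predict(a, b, 75.0)       # return rate of an unscored table
```

As the text notes, such a fit yields only a predicted value; residuals should be monitored and the model refitted as new observations arrive.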
In another aspect, as shown in fig. 2, an embodiment of the present invention provides an apparatus for evaluating a value of a data asset, including:
The data asset acquisition unit 200 is configured to collect a data asset to be evaluated according to a preset evaluation target, and process the data asset to be evaluated through multiple data processing steps for the data asset to be evaluated to obtain at least one data asset table and data production cost corresponding to the data asset to be evaluated;
A data asset cost propagation unit 201, configured to, for each data asset table, apportion the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm, to obtain the cost information of the data asset table;
A data asset management unit 202, configured to evaluate, for each data asset table, the data management elements corresponding to the data asset table by the analytic hierarchy process, to obtain a data management score of the data asset table;
And the data asset evaluation unit 203 is configured to evaluate an asset return rate of the data asset to be evaluated according to the data management scores and the cost information of all the data asset tables.
Further, the data asset acquisition unit 200 comprises:
The data asset collection module is used for determining the range of the corresponding data asset to be evaluated according to a preset evaluation target and collecting the data asset to be evaluated according to the range of the data asset to be evaluated;
The data asset cleaning module is used for cleaning the data of the data asset to be evaluated to obtain cleaned data, and the data cleaning comprises: removing repeated data and repairing data errors;
The data asset storage module is used for integrating data of cleaned data with different sources, formats and structures, and storing the cleaned data according to preset unified data storage requirements to obtain storage data;
the data asset classification module is used for classifying the stored data according to the properties, sources and purposes of the stored data and marking tags;
The data asset desensitization encryption module is used for replacing sensitive information in the stored data with appointed replacement information to obtain desensitization data corresponding to the stored data, and encrypting the desensitization data;
A data asset table generation module for storing the desensitized and encrypted stored data as at least one data asset table by category;
The data production cost collecting module is used for collecting the data production cost corresponding to the data asset to be evaluated according to the data processing steps from the data asset to be evaluated to the at least one data asset table;
Wherein, the data production cost includes: hardware construction cost, software construction cost, operation and maintenance cost, resource cost, data purchase cost, data consultation cost, manpower expenditure cost and site cost.
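The pipeline formed by the modules above (cleaning, desensitization, classification, table generation) could be sketched as follows. The helper names and record layout are hypothetical, not from the disclosure; encryption is omitted for brevity:

```python
# Hedged sketch of the acquisition-unit pipeline described above.
import hashlib

def clean(records):
    """Data cleaning: drop duplicate records, identified by a content hash."""
    seen, out = set(), []
    for r in records:
        key = hashlib.sha256(repr(sorted(r.items())).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

def desensitize(record, sensitive_fields, replacement="***"):
    """Replace sensitive field values with a designated replacement string."""
    return {k: (replacement if k in sensitive_fields else v)
            for k, v in record.items()}

def to_asset_tables(records, classify):
    """Group records into data asset tables by a category label."""
    tables = {}
    for r in records:
        tables.setdefault(classify(r), []).append(r)
    return tables

raw = [
    {"name": "Alice", "phone": "123", "kind": "customer"},
    {"name": "Alice", "phone": "123", "kind": "customer"},  # duplicate
    {"name": "ACME", "phone": "456", "kind": "supplier"},
]
cleaned = clean(raw)
masked = [desensitize(r, {"phone"}) for r in cleaned]
tables = to_asset_tables(masked, classify=lambda r: r["kind"])
```

Each value of `tables` corresponds to one data asset table stored by category, ready for the cost-propagation and scoring stages.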
Further, the data asset cost propagation unit 201 includes:
The processing procedure lineage relationship determining module is used for generating a processing procedure lineage relationship corresponding to each data asset table according to the lineage propagation algorithm; the processing procedure lineage relationship represents the parent-child relationship between the data processing steps;
The cost allocation module is used for apportioning the data production cost corresponding to the data asset to be evaluated into the data production cost corresponding to each data processing step that produces the data asset table, according to the processing procedure lineage relationship corresponding to the data asset table;
And the cost information determining module is used for accumulating, according to the processing procedure lineage relationship, the data production cost corresponding to each data processing step that produces the data asset table, to obtain the cost information of the data asset table.
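The apportion-then-accumulate scheme of these two modules can be sketched as below. The step names are hypothetical, and the equal split is an illustrative assumption; real deployments might weight by resource usage:

```python
# Illustrative sketch: apportioning the total data production cost across
# processing steps, then accumulating it along lineage parent-child
# relationships to obtain the cost information of one data asset table.

def apportion(total_cost, steps):
    """Equal-split apportionment of the production cost across steps."""
    share = total_cost / len(steps)
    return {s: share for s in steps}

def table_cost(table, step_cost, parents):
    """Accumulate the cost of every processing step on the lineage
    path(s) leading to the given data asset table."""
    total, visited, stack = 0.0, set(), [table]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        total += step_cost.get(node, 0.0)
        stack.extend(parents.get(node, []))
    return total

# parent-child lineage: clean -> aggregate -> asset_table
parents = {"asset_table": ["aggregate"], "aggregate": ["clean"], "clean": []}
step_cost = apportion(30.0, ["clean", "aggregate", "asset_table"])  # 10 each
cost_info = table_cost("asset_table", step_cost, parents)           # 30.0
```

The `visited` set makes the accumulation safe for lineage graphs where two tables share upstream steps, counting each step's cost once per table.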
Further, the data asset management unit 202 includes:
The judgment matrix construction module is used for constructing a judgment matrix of the analytic hierarchy process according to the data management elements corresponding to the data asset table;
The management element weight determining module is used for comparing the relative importance of the different data management elements corresponding to the data asset table according to the judgment matrix, and determining the weight of each data management element corresponding to the data asset table;
The weight vector determining module is used for normalizing each column of the judgment matrix and calculating the weight vector of each data management element corresponding to the data asset table;
The comprehensive weight determining module is used for calculating the comprehensive weight of each data management element corresponding to the data asset table according to the weight vector of each data management element corresponding to the data asset table;
And the data management score determining module is used for determining the data management score of the data asset table according to each data management element corresponding to the data asset table and its comprehensive weight;
the data management element comprises a basic element, a quality element and a use element;
The base elements include data size;
The quality elements comprise: accuracy, consistency, integrity, normalization and timeliness;
The use elements comprise: accessibility and usage heat.
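Given comprehensive weights for the elements listed above, the score determination reduces to a weighted sum. This is a minimal sketch; the element scores and weights below are illustrative values, not derived from an actual judgment matrix:

```python
# Minimal sketch: combining per-element scores with AHP-derived
# comprehensive weights into a table's data management score.

def management_score(element_scores, weights):
    """Weighted sum of element scores, with weights normalized to sum to 1."""
    total_w = sum(weights.values())
    return sum(element_scores[e] * w for e, w in weights.items()) / total_w

scores = {"accuracy": 90.0, "consistency": 85.0, "integrity": 80.0,
          "timeliness": 70.0, "accessibility": 60.0}
weights = {"accuracy": 0.30, "consistency": 0.20, "integrity": 0.20,
           "timeliness": 0.15, "accessibility": 0.15}
score = management_score(scores, weights)
```

The resulting score is what the data asset evaluation unit later adds to the table's cost information to form its total value.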
Further, the data asset evaluation unit 203 includes:
The total value determining module is used for adding the data management score and the cost information of each data asset table to obtain the total value of the data asset table;
the total value weight determining module is used for determining the weight corresponding to the total value of the data asset table according to the application scene and the contribution degree of the data asset table;
And the asset return rate determining module is used for performing a weighted average of the total values of all the data asset tables according to the weights corresponding to the total values of all the data asset tables, to obtain the asset return rate of the data asset to be evaluated.
The embodiments of the present invention are device embodiments corresponding to the foregoing method embodiments one by one, and may be understood according to the foregoing method embodiments, which are not described herein again.
It should be understood that the specific order or hierarchy of steps in the processes disclosed are examples of exemplary approaches. Based on design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of the present disclosure. The accompanying method claims present elements of the various steps in a sample order, and are not meant to be limited to the specific order or hierarchy presented.
In the foregoing detailed description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate preferred embodiment of this invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, as used in the specification or claims, the term "includes" is intended to be inclusive in a manner similar to the term "comprising", as "comprising" is interpreted as a transitional word in a claim. Furthermore, any use of the term "or" in the specification or claims is intended to mean a non-exclusive "or".
Those of skill in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments of the invention may be implemented by electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design requirements of the overall system. Those skilled in the art may implement the described functionality in varying ways for each particular application, but such implementation is not to be understood as beyond the scope of the embodiments of the present invention.
The various illustrative logical blocks or units described in the embodiments of the invention may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described. A general purpose processor may be a microprocessor, but in the alternative, the general purpose processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. In an example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may reside in a user terminal. In the alternative, the processor and the storage medium may reside as distinct components in a user terminal.
In one or more exemplary designs, the above-described functions of embodiments of the present invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on the computer-readable medium. Computer-readable media include both computer storage media and communication media that facilitate transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer. For example, such computer-readable media may include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to carry or store program code in the form of instructions or data structures readable by a general purpose or special purpose computer, or a general purpose or special purpose processor. Further, any connection is properly termed a computer-readable medium: for example, if the software is transmitted from a website, server, or other remote source via a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, these are also included in the definition of computer-readable medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above may also be included within the scope of computer-readable media.
The foregoing description of the embodiments is provided to illustrate the general principles of the invention and is not meant to limit the scope of the invention to the particular embodiments; any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. A method of evaluating the value of a data asset, comprising:
According to a preset evaluation target, collecting a data asset to be evaluated, and processing the data asset to be evaluated through a plurality of data processing steps to obtain at least one data asset table and a data production cost corresponding to the data asset to be evaluated;
for each data asset table, allocating the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm, to obtain cost information of the data asset table;
for each data asset table, evaluating data management elements corresponding to the data asset table through an analytic hierarchy process, to obtain a data management score of the data asset table;
and evaluating the asset return rate of the data asset to be evaluated according to the data management scores and the cost information of all the data asset tables.
2. The method for evaluating the value of a data asset according to claim 1, wherein the collecting the data asset to be evaluated according to the preset evaluation target, and processing the data asset to be evaluated through a plurality of data processing steps to obtain at least one data asset table and data production cost corresponding to the data asset to be evaluated, includes:
determining the range of a corresponding data asset to be evaluated according to a preset evaluation target, and collecting the data asset to be evaluated according to the range of the data asset to be evaluated;
performing data cleaning on the data asset to be evaluated to obtain cleaned data, wherein the data cleaning comprises: removing duplicate data and repairing data errors;
Carrying out data integration on the cleaned data of different sources, formats and structures, and storing the integrated data according to a preset unified data storage requirement, to obtain stored data;
Classifying the stored data according to the nature, source and purpose of the stored data, and labeling the stored data with tags;
Replacing sensitive information in the stored data with designated replacement information to obtain desensitized data corresponding to the stored data, and encrypting the desensitized data;
Storing the desensitized and encrypted stored data as at least one data asset table by category;
Collecting data production costs corresponding to the data assets to be evaluated according to a data processing step from the data assets to be evaluated to the at least one data asset table;
Wherein the data production cost includes: hardware construction cost, software construction cost, operation and maintenance cost, resource cost, data purchase cost, data consultation cost, labor cost and site cost.
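The processing steps of claim 2 (cleaning, integration, classification, desensitization, storage by category) can be sketched as a minimal pipeline. This is an illustrative sketch only: all function names, field names, and the hash-based replacement rule are assumptions, not part of the claimed method.

```python
import hashlib

def clean(records):
    """Remove duplicate records and drop records with obvious errors."""
    seen, cleaned = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key in seen:
            continue  # remove duplicate data
        seen.add(key)
        if r.get("value") is not None:  # crude stand-in for error repair
            cleaned.append(r)
    return cleaned

def desensitize(record, sensitive_fields=("name", "phone")):
    """Replace sensitive values with a designated replacement (a short hash here)."""
    out = dict(record)
    for f in sensitive_fields:
        if f in out:
            out[f] = hashlib.sha256(str(out[f]).encode()).hexdigest()[:8]
    return out

def build_asset_tables(records):
    """Group cleaned, desensitized records into per-category data asset tables."""
    tables = {}
    for r in clean(records):
        tables.setdefault(r.get("category", "default"), []).append(desensitize(r))
    return tables

raw = [
    {"category": "customer", "name": "Alice", "value": 1},
    {"category": "customer", "name": "Alice", "value": 1},  # duplicate
    {"category": "order", "name": "Bob", "value": 2},
]
tables = build_asset_tables(raw)
print(sorted(tables))           # ['customer', 'order']
print(len(tables["customer"]))  # 1 (duplicate removed)
```

Encryption of the desensitized tables and collection of the production cost per processing step would follow as further stages of the same pipeline.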
3. The method for evaluating the value of a data asset according to claim 1, wherein for each data asset table, allocating the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm to obtain the cost information of the data asset table comprises:
Generating a processing-procedure lineage relationship corresponding to each data asset table according to the lineage propagation algorithm, wherein the processing-procedure lineage relationship represents the parent-child relationships between the data processing steps;
Allocating, according to the processing-procedure lineage relationship corresponding to the data asset table, the data production cost corresponding to the data asset to be evaluated to the data production cost corresponding to each data processing step by which the data asset table is obtained;
And accumulating, according to the processing-procedure lineage relationship, the data production cost corresponding to each data processing step by which the data asset table is obtained, to obtain the cost information of the data asset table.
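The lineage-based cost accumulation of claim 3 can be sketched as a walk over a parent-child step graph. The graph, the per-step costs, and the even-split `share` rule for a step feeding several tables are all illustrative assumptions; the claim does not specify the apportionment rule.

```python
# Hypothetical lineage graph: each processing step maps to its parent steps.
parents = {
    "collect": [],
    "clean": ["collect"],
    "integrate": ["clean"],
    "table_A": ["integrate"],
    "table_B": ["integrate"],
}

# Data production cost already allocated to each processing step.
step_cost = {"collect": 40.0, "clean": 30.0, "integrate": 30.0,
             "table_A": 0.0, "table_B": 0.0}

def table_cost(step, share=1.0):
    """Accumulate cost along the lineage chain leading to `step`.
    When upstream steps feed several tables, each table carries `share`
    of the upstream cost."""
    total = step_cost[step] * share
    for p in parents[step]:
        total += table_cost(p, share)
    return total

# 'integrate' feeds two tables, so each table carries half the upstream cost.
print(table_cost("table_A", share=0.5))  # 50.0
```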
4. The method for evaluating the value of a data asset according to claim 1, wherein the evaluating, for each data asset table, the data management elements corresponding to the data asset table through an analytic hierarchy process to obtain the data management score of the data asset table comprises:
Constructing a judgment matrix of an analytic hierarchy process according to the data management elements corresponding to the data asset table;
Comparing the relative importance among different data management elements corresponding to the data asset table according to the judgment matrix, and determining the weight of each data management element corresponding to the data asset table;
carrying out normalization processing on each column of the judgment matrix, and calculating weight vectors of all data management elements corresponding to the data asset table;
calculating the comprehensive weight of each data management element corresponding to the data asset table according to the weight vector of each data management element corresponding to the data asset table;
Determining the data management score of the data asset table according to each data management element corresponding to the data asset table and the comprehensive weight of each data management element;
wherein the data management elements comprise basic elements, quality elements and usage elements;
the basic elements include data size;
the quality elements include: accuracy, consistency, integrity, normalization and timeliness;
the usage elements include: accessibility and usage popularity.
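The analytic hierarchy process steps of claim 4 (column-normalize the judgment matrix, average rows into a weight vector, combine element scores) can be sketched as follows. The judgment-matrix values and per-element scores are illustrative assumptions.

```python
# Minimal analytic hierarchy process sketch: normalize each column of a
# pairwise judgment matrix, average each row to get the weight vector,
# then combine per-element scores into a data management score.
def ahp_weights(matrix):
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)]
                  for i in range(n)]
    return [sum(row) / n for row in normalized]  # row averages = weights

# Pairwise comparisons of basic / quality / usage elements (Saaty scale);
# here quality is judged most important.
judgment = [
    [1,   1/3, 1/2],
    [3,   1,   2  ],
    [2,   1/2, 1  ],
]
weights = ahp_weights(judgment)

scores = [0.8, 0.9, 0.7]  # hypothetical element scores for one asset table
management_score = sum(w * s for w, s in zip(weights, scores))
print(round(sum(weights), 6))  # 1.0
```

A full implementation would also check the consistency ratio of the judgment matrix before accepting the weights.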
5. The method of claim 1, wherein the evaluating the asset return of the data asset to be evaluated based on the data management scores and the cost information of all data asset tables comprises:
adding the data management score and the cost information of each data asset table to obtain the total value of the data asset table;
determining the weight corresponding to the overall value of the data asset table according to the application scene and the contribution degree of the data asset table;
and performing a weighted summation of the total values of all the data asset tables according to the weights corresponding to the total values of the data asset tables, to obtain the asset return rate of the data asset to be evaluated.
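The arithmetic of claim 5 can be sketched directly: each table's total value combines its management score and cost information, and the tables' total values are then weighted and summed. All numbers, names, and the additive score-plus-cost combination are illustrative; the claim leaves the exact weighting scheme to the application scenario and contribution degree.

```python
# Sketch of claim 5: total value per table, then a weighted summation
# over all tables to obtain the asset return rate.
tables = {
    "table_A": {"score": 0.85, "cost": 120.0, "weight": 0.6},
    "table_B": {"score": 0.70, "cost": 80.0,  "weight": 0.4},
}

# Total value = data management score + cost information (per the claim).
total_values = {name: t["score"] + t["cost"] for name, t in tables.items()}

# Weighted summation by scenario/contribution weights.
asset_return_rate = sum(t["weight"] * total_values[name]
                        for name, t in tables.items())
print(round(asset_return_rate, 2))  # 104.79
```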
6. An apparatus for evaluating the value of a data asset, comprising:
The data asset acquisition unit is used for collecting a data asset to be evaluated according to a preset evaluation target, and processing the data asset to be evaluated through a plurality of data processing steps to obtain at least one data asset table and a data production cost corresponding to the data asset to be evaluated;
The data asset cost propagation unit is used for allocating, for each data asset table, the data production cost corresponding to the data asset to be evaluated according to a lineage propagation algorithm, to obtain cost information of the data asset table;
The data asset management unit is used for evaluating, for each data asset table, the data management elements corresponding to the data asset table through an analytic hierarchy process, to obtain the data management score of the data asset table;
And the data asset evaluation unit is used for evaluating the asset return rate of the data asset to be evaluated according to the data management scores and the cost information of all the data asset tables.
7. The apparatus for evaluating the value of a data asset of claim 6, wherein the data asset acquisition unit comprises:
The data asset collection module is used for determining the range of the corresponding data asset to be evaluated according to a preset evaluation target and collecting the data asset to be evaluated according to the range of the data asset to be evaluated;
The data asset cleaning module is used for performing data cleaning on the data asset to be evaluated to obtain cleaned data, wherein the data cleaning comprises: removing duplicate data and repairing data errors;
The data asset storage module is used for carrying out data integration on the cleaned data of different sources, formats and structures, and storing the integrated data according to a preset unified data storage requirement, to obtain stored data;
The data asset classification module is used for classifying the stored data according to the nature, source and purpose of the stored data, and labeling the stored data with tags;
The data asset desensitization and encryption module is used for replacing sensitive information in the stored data with designated replacement information to obtain desensitized data corresponding to the stored data, and encrypting the desensitized data;
A data asset table generation module for storing the desensitized and encrypted stored data as at least one data asset table by category;
The data production cost collecting module is used for collecting the data production cost corresponding to the data asset to be evaluated according to the data processing steps from the data asset to be evaluated to the at least one data asset table;
Wherein the data production cost includes: hardware construction cost, software construction cost, operation and maintenance cost, resource cost, data purchase cost, data consultation cost, labor cost and site cost.
8. The apparatus for evaluating the value of a data asset according to claim 6, wherein the data asset cost propagation unit comprises:
The processing-procedure lineage relationship determining module is used for generating, according to a lineage propagation algorithm, the processing-procedure lineage relationship corresponding to each data asset table, wherein the processing-procedure lineage relationship represents the parent-child relationships between the data processing steps;
The cost allocation module is used for allocating, according to the processing-procedure lineage relationship corresponding to the data asset table, the data production cost corresponding to the data asset to be evaluated to the data production cost corresponding to each data processing step by which the data asset table is obtained;
And the cost information determining module is used for accumulating, according to the processing-procedure lineage relationship, the data production cost corresponding to each data processing step by which the data asset table is obtained, to obtain the cost information of the data asset table.
9. The apparatus for evaluating the value of a data asset according to claim 6, wherein the data asset management unit comprises:
The judging matrix construction module is used for constructing a judging matrix of the analytic hierarchy process according to the data management elements corresponding to the data asset table;
The management element weight determining module is used for comparing the relative importance among different data management elements corresponding to the data asset table according to the judging matrix and determining the weight of each data management element corresponding to the data asset table;
The weight vector determining module is used for carrying out normalization processing on each column of the judging matrix and calculating the weight vector of each data management element corresponding to the data asset table;
the comprehensive weight determining module is used for calculating the comprehensive weight of each data management element corresponding to the data asset table according to the weight vector of each data management element corresponding to the data asset table;
The data management score determining module is used for determining the data management score of a data asset table according to each data management element corresponding to the data asset table and the comprehensive weight thereof;
wherein the data management elements comprise basic elements, quality elements and usage elements;
the basic elements include data size;
the quality elements include: accuracy, consistency, integrity, normalization and timeliness;
the usage elements include: accessibility and usage popularity.
10. The apparatus for evaluating the value of a data asset according to claim 6, wherein the data asset evaluation unit comprises:
The total value determining module is used for adding the data management score and the cost information of each data asset table to obtain the total value of the data asset table;
the total value weight determining module is used for determining the weight corresponding to the total value of the data asset table according to the application scene and the contribution degree of the data asset table;
And the asset return rate determining module is used for performing a weighted summation of the total values of all the data asset tables according to the weights corresponding to the total values of the data asset tables, to obtain the asset return rate of the data asset to be evaluated.
CN202410359786.XA, filed 2024-03-27: Evaluation method and device for data asset value (Pending)

Publications (1)

Publication Number: CN118278962A; Publication Date: 2024-07-02



Legal Events

PB01: Publication