CN112948362B - Data quality evaluation method, device, computer equipment and storage medium - Google Patents


Info

Publication number
CN112948362B
Authority
CN
China
Prior art keywords
data
quality
distributed application
quality evaluation
shared
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110148907.2A
Other languages
Chinese (zh)
Other versions
CN112948362A (en)
Inventor
韩鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202110148907.2A
Publication of CN112948362A
Application granted
Publication of CN112948362B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application relate to the technical field of blockchain and provide a data quality evaluation method, a device, computer equipment and a storage medium. Blockchain and distributed application technology are used to enable a data using end to evaluate the quality of shared data provided by a data providing end; meanwhile, the quality evaluation result of the shared data is made public by exploiting the tamper resistance and traceability of the blockchain network, so that quality supervision of the shared data is achieved.

Description

Data quality evaluation method, device, computer equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of blockchain, in particular to a data quality evaluation method, a device, computer equipment and a storage medium.
Background
In the data market, driven by profit, some data providers supply data that contains large amounts of false and duplicate records, which inflates the data volume and raises the overall price of the data. In other cases, part of the data supplied by a data provider is in a non-standard format, which complicates the processing and use of the data by the data consumer.
However, most current data sharing transactions are one-to-one, and there is no public platform for displaying data quality, so the quality of the data supplied by a data provider cannot be supervised.
Disclosure of Invention
An object of the embodiments of the present application is to provide a data quality evaluation method, apparatus, computer device, and storage medium that solve the problem that the quality of data provided by a data provider cannot be supervised in the existing data sharing process.
In order to achieve the above purpose, the technical solution adopted in the embodiment of the present application is as follows:
In a first aspect, an embodiment of the present application provides a data quality evaluation method applied to a data using end in a blockchain network, where a distributed application runs in the blockchain network;
the blockchain network is provided with a data description file, and the data description file is used for representing basic information of open data content provided by a data providing end in the blockchain network;
the method comprises the following steps:
acquiring the data description file through the distributed application;
generating a data sharing request based on the data description file;
sending the data sharing request to the data providing end through the distributed application;
receiving shared data sent by the data providing end, wherein the shared data is obtained from the open data content by the data providing end according to the data sharing request;
performing quality evaluation on the shared data, and generating a quality evaluation result when a quality problem exists in the shared data;
and adding the quality evaluation result to the data description file through the distributed application.
In a second aspect, an embodiment of the present application provides a data quality evaluation method applied to a data providing end in a blockchain network, where a distributed application runs in the blockchain network;
the blockchain network is provided with a data description file, and the data description file is used for representing basic information of open data content provided by the data providing end;
the method comprises the following steps:
receiving a data sharing request sent by a data using end through the distributed application, wherein the data sharing request is generated by the data using end based on the data description file after the data using end acquires the data description file through the distributed application;
acquiring shared data from the open data content according to the data sharing request;
and sending the shared data to the data using end, so that the data using end performs quality evaluation on the shared data, generates a quality evaluation result when a quality problem exists in the shared data, and adds the quality evaluation result to the data description file through the distributed application.
In a third aspect, an embodiment of the present application provides a data quality evaluation device applied to a data using end in a blockchain network, where a distributed application runs in the blockchain network;
the blockchain network is provided with a data description file, and the data description file is used for representing basic information of open data content provided by a data providing end in the blockchain network;
the device comprises:
a first acquisition module, configured to acquire the data description file through the distributed application;
a request generation module, configured to generate a data sharing request based on the data description file;
a request sending module, configured to send the data sharing request to the data providing end through the distributed application;
a first receiving module, configured to receive shared data sent by the data providing end, wherein the shared data is obtained from the open data content by the data providing end according to the data sharing request;
a first execution module, configured to perform quality evaluation on the shared data and generate a quality evaluation result when a quality problem exists in the shared data;
and a second execution module, configured to add the quality evaluation result to the data description file through the distributed application.
In a fourth aspect, an embodiment of the present application provides a data quality evaluation device applied to a data providing end in a blockchain network, where a distributed application runs in the blockchain network;
the blockchain network is provided with a data description file, and the data description file is used for representing basic information of open data content provided by the data providing end;
the device comprises:
a second receiving module, configured to receive a data sharing request sent by a data using end through the distributed application, wherein the data sharing request is generated by the data using end based on the data description file after the data using end acquires the data description file through the distributed application;
a second acquisition module, configured to acquire shared data from the open data content according to the data sharing request;
and a data sending module, configured to send the shared data to the data using end, so that the data using end performs quality evaluation on the shared data, generates a quality evaluation result when a quality problem exists in the shared data, and adds the quality evaluation result to the data description file through the distributed application.
In a fifth aspect, embodiments of the present application further provide a computer device, the computer device including: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the above data quality evaluation method applied to the data using end or the above data quality evaluation method applied to the data providing end.
In a sixth aspect, embodiments of the present application further provide a computer readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements the above data quality evaluation method applied to the data using end or the above data quality evaluation method applied to the data providing end.
Compared with the prior art, in the data quality evaluation method, device, computer equipment and storage medium provided by the embodiments of the present application, the data providing end publishes the data description file of the open data content to the blockchain network in advance. The data using end accesses the blockchain network through the distributed application to obtain a data description file that meets its requirements, generates a data sharing request based on the data description file, and sends the request to the data providing end. The data providing end obtains shared data satisfying the data sharing request from the open data content and sends the shared data to the data using end. The data using end performs quality evaluation on the shared data, generates a quality evaluation result when a quality problem exists in the shared data, and adds the quality evaluation result to the data description file through the distributed application. In this way, the data using end can evaluate the quality of the shared data provided by the data providing end, and the quality evaluation result is made public by exploiting the tamper resistance and traceability of the blockchain network, so that quality supervision of the shared data is achieved.
Drawings
Fig. 1 shows a schematic architecture diagram of a blockchain network provided in an embodiment of the present application.
Fig. 2 is a schematic flow chart of a data quality evaluation method applied to a data consumer according to an embodiment of the present application.
Fig. 3 is another flow chart of a data quality evaluation method applied to a data consumer according to an embodiment of the present application.
Fig. 4 is a schematic flow chart of a data quality evaluation method applied to a data consumer according to an embodiment of the present application.
Fig. 5 is a schematic flow chart of a data quality evaluation method applied to a data provider according to an embodiment of the present application.
Fig. 6 is a schematic flow chart of another data quality evaluation method applied to a data provider according to an embodiment of the present application.
Fig. 7 is a schematic flow chart of a data quality evaluation method applied to a data provider according to an embodiment of the present application.
Fig. 8 is a schematic block diagram of a data quality evaluation device applied to a data consumer according to an embodiment of the present application.
Fig. 9 is a block diagram of a data quality evaluation device applied to a data providing end according to an embodiment of the present application.
Fig. 10 shows a block schematic diagram of a computer device provided in an embodiment of the present application.
Reference numerals: 10 - computer device; 11 - processor; 12 - memory; 13 - bus; 100, 200 - data quality evaluation device; 110 - first acquisition module; 120 - request generation module; 130 - request sending module; 140 - first receiving module; 150 - first execution module; 160 - second execution module; 170 - third execution module; 210 - second receiving module; 220 - second acquisition module; 230 - data sending module; 240 - processing module.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating an architecture of a blockchain network according to an embodiment of the present application. The blockchain network includes a plurality of nodes, e.g., node a, node b, node c, etc., and is running a distributed application. Each node may have and communicate with a respective database.
The blockchain is a distributed ledger and is also a novel application mode. The blockchain technology covers computer technologies such as distributed storage, network, information security, data fault tolerance and the like, and the blockchain application has the characteristics of decentralization, openness, autonomy and the like. A blockchain network is a structure of compute nodes that is used to manage, update, and maintain one or more blockchain structures. The blockchain network may be a public blockchain network, a private blockchain network, or a federated blockchain network.
The distributed application (Decentralized Application, DAPP), also known as a decentralized application, is an internet application. A DAPP works on the same principle as an ordinary application (APP); the biggest difference is that a DAPP runs in a decentralized network, namely a blockchain network, in which no centralized node can fully control the DAPP, whereas a conventional application is centralized and depends on a server to obtain data, process data, and so on.
If a user of a node needs to access the DAPP, the user must download a browser for accessing the DAPP, e.g., augur, and access the DAPP through that browser. That is, a node must have such a browser installed in order to access the DAPP.
In FIG. 1, the DAPP runs on the blockchain network, and every node in the blockchain network, e.g., node a, node b, node c, etc., has a browser for accessing the DAPP, e.g., augur, installed.
The node a, the node b, the node c, and the like may be a computer device accessing a blockchain network, including but not limited to a smart phone, a tablet computer, a personal computer, a server, a private cloud, a public cloud, and the like. The node a, the node b, the node c, etc. may also be a user accessing the blockchain network through the above-mentioned computer device, and may specifically be determined according to the actual application scenario, which is not limited herein.
The data provider may be any node in the blockchain network, e.g., node a; the data consumer may be any node in the blockchain network other than the data provider, e.g., node b. This can be determined according to the actual application scenario and is not limited herein.
Referring to fig. 2, fig. 2 is a flow chart illustrating a data quality evaluation method applied to a data consumer according to an embodiment of the present application, where the data quality evaluation method may include the following steps:
S101, acquiring a data description file through a distributed application.
The data providing end may be a data management authority, for example a telecommunications carrier. In the blockchain network, the data providing end may be any node, such as node a. Assuming that the data providing end is an internet book listening platform, node a is the device through which that platform accesses the blockchain network.
The database in communication with the data providing end is used for storing data owned by the data providing end; e.g., the database may be an audio content library, a user information library, a user listening behavior library, a blacklist library, etc. The following description uses the user listening behavior library as an example.
The data providing end can determine in advance which data content in the database can be provided openly (the open data content), establish a data description file for the open data content, and publish the data description file to the blockchain network through the DAPP. For example, the database is a user listening behavior library that stores the various behavior tables generated while users use the internet book listening platform, such as a browsed content table, a collected audio table, a searched audio table, a listening list table, a purchased album table, and the like. The open data content may be any behavior table in the user listening behavior library that can be provided openly, e.g., the listening list table.
Because the blockchain network includes a plurality of nodes, each node may be a data providing end and may publish to the blockchain network a data description file of the open data content in its own database. Thus, a plurality of data description files are published in the blockchain network, and they are open to all nodes in the blockchain network.
One item of open data content may have at least one data description file. Each data description file published in the blockchain network has its own identification information, which is information that uniquely refers to the open data content, such as a table name, a storage address, and the like.
When the data using end has a data demand, it can search all data description files published in the blockchain network through the DAPP and select a data description file that meets the demand according to the identification information. For example, if the data using end needs to analyze the listening behavior of users, it may select and acquire the data description file whose identification information refers to "listening list".
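The lookup described above can be illustrated with a minimal Python sketch. It is only an in-memory stand-in for searching through the DAPP: the `DescriptionFile` class, the `find_description_files` helper and the sample entries are hypothetical names introduced for illustration and are not part of the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DescriptionFile:
    identifier: str  # identification information, e.g. a table name or storage address
    provider: str    # node that published the file (hypothetical attribute)


def find_description_files(published, keyword):
    """Return the published description files whose identifier contains the keyword."""
    needle = keyword.lower()
    return [f for f in published if needle in f.identifier.lower()]


# Hypothetical set of description files visible to every node.
published = [
    DescriptionFile("listening list", "node a"),
    DescriptionFile("purchased albums", "node a"),
    DescriptionFile("blacklist", "node c"),
]

matches = find_description_files(published, "listening list")
```

In a real deployment the list of published files would be read from the blockchain network rather than held in a local list.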
S102, generating a data sharing request based on the data description file.
Smart contracts can be deployed in the blockchain network. When the data providing end publishes the data description file of the open data content to the blockchain network, it can also publish the parsing strategy for the data description file to the blockchain network in the form of a smart contract.
A smart contract is a computer protocol intended to propagate, verify or execute a contract in digital form. It allows trusted transactions to be carried out without a third party, and these transactions are traceable and irreversible. A smart contract is a set of commitments defined in digital form, including the protocol under which the contract participants fulfil those commitments; in practice it is a piece of code written to the blockchain.
The data description file is used for representing basic information of open data content provided by a data providing end in the blockchain network. The basic information may include the acquisition mode of the open data content, all fields, filtering definition, data format, data privacy protection policy, etc.
After the data using end accesses the blockchain network through the DAPP and obtains a data description file that meets its requirements, it can call the smart contract to parse the data description file and obtain each item of content in it, such as the acquisition mode, all fields, screening definition, data format and data privacy protection policy of the open data content.
The acquisition mode refers to a data acquisition mode supported by open data content, for example, an API interface, a download link, and the like. All fields refer to which fields are contained in the open data content and can be represented by z1, z2, z3 … … zn, wherein z1, z2, z3 … … zn refer to information capable of uniquely referring to each field in the open data content, such as field names, field numbers, and the like. The screening definition of each field refers to whether each field supports screening, e.g., z1 supports screening, z3 does not support screening, etc. The data format refers to a format of the open data content, and a format of each field in the open data content, for example, numerals, text, and the like. The data privacy protection policy refers to a policy of performing desensitization processing in the open data content, for example, a field identifier to be desensitized, a desensitization mode, etc., where the desensitization mode may be to add a mask to the sensitive data, or delete the sensitive data, etc.
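As a rough illustration of the items listed above, the parsed content of a data description file could be modeled as follows. This is a sketch under assumptions: the class names, attribute names and sample values are hypothetical, and the application does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field


@dataclass
class FieldSpec:
    name: str         # unique field identifier (z1, z2, ... in the text)
    data_format: str  # e.g. "number" or "text"
    filterable: bool  # screening definition: whether the field supports screening


@dataclass
class PrivacyRule:
    field_name: str   # identifier of the field to desensitize
    mode: str         # desensitization mode: "mask" or "delete"


@dataclass
class DataDescriptionFile:
    identifier: str        # identification information, e.g. a table name
    acquisition_mode: str  # e.g. "API interface" or "download link"
    fields: list
    privacy_policy: list = field(default_factory=list)

    def filterable_fields(self):
        """Names of the fields that may carry a screening condition."""
        return [f.name for f in self.fields if f.filterable]


# Hypothetical description file for a listening list table.
description = DataDescriptionFile(
    identifier="listening list",
    acquisition_mode="API interface",
    fields=[
        FieldSpec("user name", "text", filterable=False),
        FieldSpec("ticket", "text", filterable=True),
        FieldSpec("age", "number", filterable=True),
        FieldSpec("mobile phone number", "text", filterable=False),
    ],
    privacy_policy=[PrivacyRule("mobile phone number", "delete")],
)
```

Keeping the screening definition next to each field makes it straightforward to check later whether a requested screening condition is allowed.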
For example, the data description file is shown in Table 1 below:
TABLE 1
After the data using end accesses the blockchain network through the DAPP and acquires a data description file that meets its requirements, it can generate a data sharing request based on the data description file.
The data sharing request includes at least one selected field and the screening condition of each selected field. For example, the request may be: user name, ticket, age, with age > 18 and ticket = "novel". Here the selected fields are user name, ticket and age; the screening condition of the age field is all data with age greater than 18, and the screening condition of the ticket field is all data labeled "novel".
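A data sharing request of this kind might be assembled as in the sketch below. The `Condition` class, the `build_request` helper and the dictionary layout of the request are assumptions made for illustration; only the example fields and conditions come from the text.

```python
import operator
from dataclasses import dataclass

# Only the two comparison operators used in the example are supported here.
OPERATORS = {">": operator.gt, "=": operator.eq}


@dataclass(frozen=True)
class Condition:
    field: str
    op: str
    value: object


def build_request(selected_fields, conditions, filterable):
    """Validate the screening conditions and return the request as a plain dict."""
    for cond in conditions:
        if cond.field not in selected_fields:
            raise ValueError(f"condition on unselected field: {cond.field}")
        if cond.field not in filterable:
            raise ValueError(f"field does not support screening: {cond.field}")
        if cond.op not in OPERATORS:
            raise ValueError(f"unsupported operator: {cond.op}")
    return {"fields": list(selected_fields), "conditions": list(conditions)}


request = build_request(
    ["user name", "ticket", "age"],
    [Condition("age", ">", 18), Condition("ticket", "=", "novel")],
    filterable={"ticket", "age"},
)
```

Validating against the screening definition before sending avoids requests that the data providing end would have to reject.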
S103, sending a data sharing request to a data providing end through the distributed application.
After the data using end generates a data sharing request based on the data description file, it can send the data sharing request to the data providing end through the DAPP. The data providing end screens the open data content according to the data sharing request, i.e., it screens each selected field in the open data content according to that field's screening condition to obtain the shared data.
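The screening performed by the data providing end can be sketched as a simple row filter. The function name, the tuple form of the conditions and the sample rows are hypothetical; a real data providing end would typically run the equivalent query against its own database.

```python
import operator

# Comparison operators appearing in the example request (age > 18, ticket = "novel").
OPS = {">": operator.gt, "=": operator.eq}


def select_shared_data(rows, fields, conditions):
    """Keep rows that satisfy every (field, op, value) condition, projected to the selected fields."""
    shared = []
    for row in rows:
        if all(OPS[op](row[name], value) for name, op, value in conditions):
            shared.append({name: row[name] for name in fields})
    return shared


# Hypothetical open data content.
open_data = [
    {"user name": "u1", "ticket": "novel", "age": 25},
    {"user name": "u2", "ticket": "novel", "age": 16},
    {"user name": "u3", "ticket": "history", "age": 40},
]

shared = select_shared_data(
    open_data,
    fields=["user name", "ticket", "age"],
    conditions=[("age", ">", 18), ("ticket", "=", "novel")],
)
```

Only the first row satisfies both screening conditions, so it is the only row in the shared data.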
S104, receiving the shared data sent by the data providing end, wherein the shared data is obtained from the open data content by the data providing end according to the data sharing request.
After the data providing end obtains the shared data from the open data content according to the data sharing request, it can desensitize the shared data according to its own data privacy protection policy, for example by deleting the data under the "mobile phone number" field. After desensitization, the shared data can be encrypted through the DAPP and the encrypted data sent to the data using end. After receiving the data, the data using end first decrypts it to obtain the shared data.
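The two desensitization modes mentioned in the description (adding a mask, or deleting the sensitive data) can be sketched as follows. The `desensitize` function, the policy dictionary and the choice to keep the first three characters when masking are assumptions for illustration; encryption and decryption are left out because the application does not specify a scheme.

```python
def desensitize(rows, policy):
    """Apply a policy mapping field name -> "mask" or "delete" to every row."""
    result = []
    for row in rows:
        clean = {}
        for name, value in row.items():
            mode = policy.get(name)
            if mode == "delete":
                continue  # drop the sensitive field entirely
            if mode == "mask":
                text = str(value)
                # Hypothetical masking rule: keep the first three characters.
                value = text[:3] + "*" * max(len(text) - 3, 0)
            clean[name] = value
        result.append(clean)
    return result


rows = [{"user name": "u1", "mobile phone number": "13800000001"}]

deleted = desensitize(rows, {"mobile phone number": "delete"})
masked = desensitize(rows, {"mobile phone number": "mask"})
```

The input rows are not modified; each call returns new dictionaries, so the data providing end's original records stay intact.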
S105, carrying out quality evaluation on the shared data, and generating a quality evaluation result when the quality problem exists in the shared data.
The data using end uses the DAPP as the program for data acquisition, use and quality evaluation. After the data using end obtains the shared data through the DAPP, it can evaluate the quality of the shared data and generate a quality evaluation result when a quality problem exists in the shared data.
The quality evaluation result may include a quality problem record and a quality score. The quality problem record may be a description of the quality problems present in the shared data, and the quality score may be a score for the quality of the shared data. That is, after the data using end obtains the shared data, it processes and uses the shared data; if a quality problem is found during use, it can call the smart contract through the DAPP to add a quality problem description and a score under the data description file.
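One possible shape of such an evaluation is sketched below, checking for the duplicate data and non-standard formats mentioned in the background. The checks, the 1-to-5 score scale and the scoring rule are all assumptions chosen for illustration; the application does not define how the score is computed.

```python
def evaluate_quality(rows, expected_types):
    """Return a quality evaluation result, or None when no quality problem is found."""
    issues = []

    # Duplicate check: count rows identical to an earlier row.
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
    if duplicates:
        issues.append(f"{duplicates} duplicate row(s)")

    # Format check: count values whose type differs from the declared data format.
    bad_format = sum(
        1
        for row in rows
        for name, typ in expected_types.items()
        if not isinstance(row.get(name), typ)
    )
    if bad_format:
        issues.append(f"{bad_format} value(s) with unexpected format")

    if not issues:
        return None
    # Hypothetical scoring rule on an assumed 1-to-5 scale.
    score = max(1, 5 - len(issues))
    records = [{"description": text, "status": "unrepaired"} for text in issues]
    return {"records": records, "score": score}


result = evaluate_quality([{"age": 25}, {"age": 25}, {"age": "x"}], {"age": int})
clean = evaluate_quality([{"age": 25}, {"age": 30}], {"age": int})
```

Returning `None` for clean data mirrors the text, in which a quality evaluation result is generated only when a quality problem exists.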
S106, adding the quality evaluation result to the data description file through the distributed application.
When a quality problem exists in the shared data, the data using end can add the quality evaluation result to the data description file after generating it. The quality evaluation result of the shared data is thereby made public by exploiting the tamper resistance and traceability of the blockchain network, and quality supervision of the shared data is achieved.
For example, the data description file after the quality evaluation result is added is shown in Table 2 below:
TABLE 2
Because the blockchain network includes a plurality of nodes, each node may be a data using end and may add quality problem descriptions and scores to the data description file of the open data content. The quality average score therefore refers to the average of the scores given to the open data content by the individual data using ends in the blockchain network.
Optionally, step S106 may include the sub-steps of:
S1061, calling the smart contract through the distributed application, and adding the quality problem record and the quality score to the data description file.
S1062, triggering the smart contract to update the quality average score based on the quality score.
The data using end can add the quality evaluation result to the data description file by calling the smart contract through the DAPP, i.e., a quality problem record and a score are added under the data description file. The quality problem record may include a problem state identifier, e.g., unrepaired, repaired to be confirmed, repaired, etc.
Meanwhile, after the data using end adds the quality evaluation result to the data description file, the new score triggers the smart contract to recalculate the quality average score in the data description file.
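Sub-steps S1061 and S1062 can be mimicked with a small in-memory class. This is not a smart contract and does not run on a blockchain; the class name, method names and record layout are hypothetical and only illustrate the append-then-recalculate behavior described above.

```python
class DescriptionFileContract:
    """In-memory stand-in for the contract logic attached to one data description file."""

    def __init__(self):
        self.evaluations = []      # append-only list of quality evaluation results
        self.average_score = None  # quality average score, unset until the first score

    def add_evaluation(self, consumer, records, score):
        """S1061: append the quality problem records and the quality score."""
        self.evaluations.append(
            {"consumer": consumer, "records": list(records), "score": score}
        )
        self._update_average()  # S1062: the new score triggers a recalculation
        return len(self.evaluations) - 1

    def _update_average(self):
        scores = [entry["score"] for entry in self.evaluations]
        self.average_score = sum(scores) / len(scores)


contract = DescriptionFileContract()
contract.add_evaluation("node b", ["duplicate rows"], 1)
contract.add_evaluation("node c", ["non-standard date format"], 4)
```

After the two calls the average is (1 + 4) / 2 = 2.5, which is the value other nodes would see alongside the individual records.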
Optionally, after the data using end adds the quality evaluation result to the data description file by calling the smart contract through the DAPP, the data providing end may receive the quality evaluation result through the event mechanism of the blockchain. Therefore, referring to fig. 3 on the basis of fig. 2, after step S106, the data quality evaluation method may further include steps S107 to S110.
S107, sending the quality evaluation result to the data providing end through the distributed application, so that the data providing end repairs the shared data according to the quality evaluation result to obtain repair data.
After receiving, through the event mechanism of the blockchain, the quality evaluation result that the data using end produced for the shared data, the data providing end can choose either to repair the shared data according to the quality problem record or not to repair it.
S108, receiving a data repair notification sent by the data providing end, wherein the data repair notification indicates that repair of the shared data is complete.
If the data providing end chooses to repair the shared data according to the quality problem record, then after the repair is completed and the repair data is obtained, it can modify the problem state identifier of the quality problem record through the DAPP, i.e., change the problem state identifier from "unrepaired" to "repaired to be confirmed".
After the repair is completed, the data providing end can also send a data repair notification indicating that repair of the shared data is complete to the data using end through the event mechanism of the blockchain, so as to notify the data using end to confirm the repair.
S109, acquiring repair data through the distributed application.
S110, quality evaluation is carried out on the repair data, and quality evaluation results are updated when the quality problems of the repair data do not exist.
After receiving the data repair notification, the data using end can acquire the repair data through the DAPP and perform quality evaluation on it. If the data using end confirms that the problem has been repaired, it modifies the problem state identifier corresponding to the quality problem record through the DAPP, that is, changes it from 'repaired, to be confirmed' to 'repaired', and scores the data again, that is, modifies the quality score in the quality evaluation result, for example from 1 point to 5 points.
After the data consumer modifies the quality score by DAPP, the intelligent contract is triggered to recalculate the quality average score in the data description file because a new score is generated.
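The life cycle of the problem state identifier described above can be sketched as a small state machine. The three states come from the description; the enum `IssueStatus`, the role strings `provider` and `consumer`, and the rule that only the indicated role may perform each transition are assumptions made for illustration, since the application only states which end performs each change.

```python
from enum import Enum


class IssueStatus(Enum):
    UNREPAIRED = "unrepaired"
    PENDING_CONFIRMATION = "repaired, to be confirmed"
    REPAIRED = "repaired"


# (current status, acting role) -> next status.
# Assumption: each transition is reserved for the end that performs it in the text.
ALLOWED_TRANSITIONS = {
    (IssueStatus.UNREPAIRED, "provider"): IssueStatus.PENDING_CONFIRMATION,
    (IssueStatus.PENDING_CONFIRMATION, "consumer"): IssueStatus.REPAIRED,
}


def advance_status(current: IssueStatus, role: str) -> IssueStatus:
    """Return the next status, or raise if this role may not change it."""
    key = (current, role)
    if key not in ALLOWED_TRANSITIONS:
        raise ValueError(f"{role} may not change status {current.value!r}")
    return ALLOWED_TRANSITIONS[key]
```

Keeping the final confirmation with the data using end means the data providing end cannot mark its own repair as accepted, which matches the confirm-after-repair flow of steps S108 to S110.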
Optionally, step S107 may include the sub-steps of:
S1071, sending the quality evaluation result to the data providing end through the distributed application, so that the data providing end repairs the shared data according to the quality problem record to obtain repair data and, after the repair is completed, calls the smart contract through the distributed application to update the problem state identifier from 'unrepaired' to 'repaired, to be confirmed'.
Optionally, when obtaining the repair data through the DAPP, the data using end may obtain all of the repaired shared data, or only the portion of the shared data that was repaired because of the quality problem.
Thus, as an embodiment, step S109 may comprise the sub-steps of:
S1091, sending a full data acquisition request to the data providing end through the distributed application.
S1092, receiving the repair data sent by the data providing end according to the full data acquisition request, wherein the repair data is the repaired shared data.
As another embodiment, step S109 may further include the sub-steps of:
S109-1, sending an incremental data acquisition request to the data providing end through the distributed application.
S109-2, receiving the repair data sent by the data providing end according to the incremental data acquisition request, wherein the repair data is the incremental part of the repaired shared data.
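The difference between the two acquisition modes can be sketched as follows, from the point of view of the data providing end answering the request. The function name `handle_acquisition_request`, the mode strings `full` and `incremental`, and the representation of shared data as a dictionary keyed by record id are assumptions for illustration only.

```python
from typing import Dict, Set

Record = Dict[str, str]


def handle_acquisition_request(
    repaired_data: Dict[int, Record],
    repaired_ids: Set[int],
    mode: str,
) -> Dict[int, Record]:
    """Return repair data for a full or an incremental acquisition request.

    repaired_data: the whole shared data set after repair, keyed by record id.
    repaired_ids:  ids of the records that were changed during the repair.
    """
    if mode == "full":
        # Full request: the whole repaired shared data set is returned.
        return dict(repaired_data)
    if mode == "incremental":
        # Incremental request: only the records touched by the repair are returned.
        return {rid: rec for rid, rec in repaired_data.items() if rid in repaired_ids}
    raise ValueError(f"unknown acquisition mode: {mode!r}")
```

The incremental mode transfers less data, while the full mode lets the data using end re-evaluate the complete data set; the application leaves the choice to the data using end.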
Optionally, step S110 may include the sub-steps of:
S1101, performing quality evaluation on the repair data.
S1102, when the repair data has no quality problem, calling the smart contract through the distributed application to update the problem state identifier from 'repaired, to be confirmed' to 'repaired'.
S1103, modifying the quality score to trigger the intelligent contract to update the quality average score based on the modified quality score.
It should be noted that for a data description profile, each time there is a new score or a modified score, the smart contract is triggered to recalculate the quality average in the data description profile.
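The recalculation rule can be sketched in plain Python as a stand-in for the smart contract logic. Several details are assumptions that the application does not fix: the average is taken as an arithmetic mean, each evaluator holds one current score, and scores range from 1 to 5 (the text only gives 1 and 5 as example values). The names `DataDescriptionProfile` and `submit_score` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DataDescriptionProfile:
    """Simplified stand-in for the on-chain data description file."""

    scores: Dict[str, float] = field(default_factory=dict)  # evaluator id -> score
    quality_average: float = 0.0

    def _recalculate(self) -> None:
        # Assumption: the quality average score is a plain arithmetic mean.
        if self.scores:
            self.quality_average = sum(self.scores.values()) / len(self.scores)
        else:
            self.quality_average = 0.0

    def submit_score(self, evaluator: str, score: float) -> None:
        """Add a new score or modify an existing one, then recalculate."""
        if not 1 <= score <= 5:
            raise ValueError("score must be between 1 and 5 (assumed range)")
        self.scores[evaluator] = score
        self._recalculate()


profile = DataDescriptionProfile()
profile.submit_score("consumer-A", 1)  # new score
profile.submit_score("consumer-B", 4)  # new score, average is now 2.5
profile.submit_score("consumer-A", 5)  # modified score, average is now 4.5
```

Both a new score and a modified score go through the same entry point, so the average can never be stale relative to the stored scores.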
Optionally, since the blockchain network includes a plurality of nodes, each node may be a data consumer, and a quality problem description may be added and scored in the data description file of the open data content. Therefore, after the data user end obtains the data description file meeting the requirement of the user, the data user end can perform quality evaluation on the open data content according to the original quality average score and at least one quality evaluation result in the data description file so as to confirm whether to continue using the open data content.
Therefore, referring to fig. 4 on the basis of fig. 2, before step S102, the data quality evaluation method may further include step S111.
S111, evaluating the open data content according to the quality average score and at least one quality evaluation result to determine whether to use the open data content.
When it is determined to use the open data content, step S102 is performed, that is, a data sharing request is generated based on the data description file; when it is determined not to use the open data content, the flow ends.
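The application does not specify how the data using end weighs the quality average score against the recorded quality evaluation results in step S111. The following sketch shows one possible decision rule under stated assumptions: a minimum average score and a cap on the number of problems not yet marked as repaired. The function name, both thresholds and their default values are hypothetical.

```python
from typing import Iterable


def should_use_open_data(
    quality_average: float,
    issue_statuses: Iterable[str],
    min_average: float = 3.0,   # assumed threshold, not taken from the application
    max_unrepaired: int = 0,    # assumed threshold, not taken from the application
) -> bool:
    """Decide whether to go on to step S102 (generate a data sharing request)."""
    # Count every recorded problem whose state is anything other than 'repaired'.
    unrepaired = sum(1 for status in issue_statuses if status != "repaired")
    return quality_average >= min_average and unrepaired <= max_unrepaired
```

A data using end with stricter needs could raise `min_average`, while one that tolerates known defects could raise `max_unrepaired`.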
Referring to fig. 5, fig. 5 is a flowchart illustrating a data quality evaluation method applied to a data providing end according to an embodiment of the present application, where the data quality evaluation method may include the following steps:
S201, receiving a data sharing request sent by a data using end through a distributed application, wherein the data sharing request is generated by the data using end based on a data description file that the data using end acquired through the distributed application.
S202, obtaining shared data from open data content according to a data sharing request.
S203, the shared data is sent to the data using end, so that the data using end carries out quality evaluation on the shared data, a quality evaluation result is generated when the quality problem exists in the shared data, and the quality evaluation result is added to the data description file through the distributed application.
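Steps S201 to S203 can be sketched from the side of the data providing end. The structure of the data sharing request is not given in this part of the description, so the `DataSharingRequest` type below, with a profile identifier and a list of requested field names, is an assumption, as is the representation of the open data content as a list of records.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Record = Dict[str, object]


@dataclass(frozen=True)
class DataSharingRequest:
    """Hypothetical request shape: which profile, and which fields are wanted."""

    profile_id: str
    fields: Tuple[str, ...]


def get_shared_data(open_data: List[Record], request: DataSharingRequest) -> List[Record]:
    """S202: select the requested fields from every record of the open data content."""
    return [
        {name: record[name] for name in request.fields if name in record}
        for record in open_data
    ]


open_data_content = [
    {"id": 1, "city": "Beijing", "temperature": 21.5},
    {"id": 2, "city": "Shanghai", "temperature": None},
]
request = DataSharingRequest(profile_id="profile-1", fields=("id", "temperature"))
shared_data = get_shared_data(open_data_content, request)  # S203 would send this on
```

The missing temperature in the second record is left in place on purpose: detecting it is the job of the data using end's quality evaluation, not of the providing end's selection step.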
Optionally, after the data providing end receives the quality evaluation result of the shared data from the data using end, the shared data may be repaired according to the quality problem record. Therefore, referring to fig. 6 on the basis of fig. 5, after step S203, the data quality evaluation method may further include steps S204 to S206.
S204, receiving a quality evaluation result sent by the data use terminal through the distributed application.
S205, repairing the shared data according to the quality evaluation result to obtain repair data.
S206, sending a data repair notice to the data using terminal through the distributed application, so that the data using terminal obtains repair data through the distributed application, performs quality evaluation on the repair data, and updates a quality evaluation result when the repair data has no quality problem.
Optionally, the quality assessment result comprises a quality issue record, the quality issue record comprising an issue status identification. Therefore, referring to fig. 7 on the basis of fig. 5, after step S205, the data quality evaluation method may further include step S210.
S210, calling the intelligent contract through the distributed application, and updating the problem state identification from unrepaired to repaired to be confirmed.
Compared with the prior art, the embodiment of the application has the following beneficial effects:
First, blockchain and DAPP technology enable the data using end to perform quality evaluation on the shared data provided by the data providing end.
Second, the quality evaluation result of the shared data is made public by exploiting the tamper-resistant and traceable nature of the blockchain network, thereby realizing quality supervision of the shared data.
In order to perform the above-mentioned data quality assessment method embodiments and the corresponding steps in each possible implementation, an implementation of the data quality assessment device applied to the data consumer and an implementation of the data quality assessment device applied to the data provider are given below.
Referring to fig. 8, fig. 8 is a block diagram illustrating a data quality evaluation apparatus 100 according to an embodiment of the present application. The data quality evaluation apparatus 100 is applied to a data using end and includes: a first acquisition module 110, a request generation module 120, a request sending module 130, a first receiving module 140, a first execution module 150 and a second execution module 160.
The first acquisition module 110 is configured to acquire the data description file through the distributed application.
The request generation module 120 is configured to generate a data sharing request based on the data description file.
The request sending module 130 is configured to send a data sharing request to the data provider through the distributed application.
The first receiving module 140 is configured to receive the shared data sent by the data providing end, where the shared data is obtained by the data providing end from the open data content according to the data sharing request.
The first execution module 150 is configured to perform quality evaluation on the shared data, and generate a quality evaluation result when the quality problem exists in the shared data.
The second execution module 160 is configured to add the quality evaluation result to the data description file through the distributed application.
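The work of the first and second execution modules can be sketched as two small functions. The concrete quality rule (a value that is `None` or an empty string counts as a quality problem), the fixed score of 1 for data with problems, and the dictionary layout of the result and of the data description file are all assumptions for illustration; the application does not prescribe them in this passage.

```python
from typing import Dict, List, Optional

Record = Dict[str, object]


def assess_quality(shared_data: List[Record]) -> Optional[dict]:
    """First execution module: return an evaluation result, or None if no problem."""
    issues = [
        {"row": row, "field": name, "description": "missing value", "status": "unrepaired"}
        for row, record in enumerate(shared_data)
        for name, value in record.items()
        if value is None or value == ""  # assumed quality rule
    ]
    if not issues:
        return None
    # Assumed scoring: any detected problem yields the lowest score.
    return {"issues": issues, "score": 1}


def add_to_profile(profile: dict, result: Optional[dict]) -> None:
    """Second execution module: append the result to the data description file."""
    if result is not None:
        profile.setdefault("assessments", []).append(result)


profile = {"profile_id": "profile-1"}
received = [{"id": 1, "name": "a"}, {"id": 2, "name": None}]
add_to_profile(profile, assess_quality(received))
```

In the described system the append would be a smart contract call made through the distributed application rather than a local dictionary update.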
Optionally, a smart contract is deployed on the blockchain network; the quality evaluation result comprises a quality problem record and a quality score; the data description file comprises the quality average score of the open data content;
the second execution module 160 is specifically configured to: call the smart contract through the distributed application to add the quality problem record and the quality score to the data description file; and trigger the smart contract to update the quality average score based on the quality score.
Optionally, the data quality assessment apparatus 100 may further comprise a third execution module 170.
The third execution module 170 is configured to: sending the quality evaluation result to the data providing end through the distributed application, so that the data providing end repairs the shared data according to the quality evaluation result to obtain repair data; receiving a data restoration notification representing completion of restoration of shared data sent by a data providing end; obtaining repair data through distributed application; and carrying out quality assessment on the repair data, and updating the quality assessment result when the quality problem of the repair data does not exist.
Optionally, the manner in which the third execution module 170 sends the quality evaluation result to the data providing end through the distributed application, so that the data providing end repairs the shared data according to the quality evaluation result to obtain repair data, includes:
sending the quality evaluation result to the data providing end through the distributed application, so that the data providing end repairs the shared data according to the quality problem record to obtain repair data and, after the repair is completed, calls the smart contract through the distributed application to update the problem state identifier from 'unrepaired' to 'repaired, to be confirmed'.
As one embodiment, the manner in which the third execution module 170 acquires the repair data through the distributed application includes: sending a full data acquisition request to the data providing end through the distributed application; and receiving the repair data sent by the data providing end according to the full data acquisition request, wherein the repair data is the repaired shared data.
As another embodiment, the manner in which the third execution module 170 acquires the repair data through the distributed application includes: sending an incremental data acquisition request to the data providing end through the distributed application; and receiving the repair data sent by the data providing end according to the incremental data acquisition request, wherein the repair data is the incremental part of the repaired shared data.
Optionally, the manner in which the third execution module 170 performs quality evaluation on the repair data and updates the quality evaluation result when the repair data has no quality problem includes: performing quality evaluation on the repair data; when the repair data has no quality problem, calling the smart contract through the distributed application to update the problem state identifier from 'repaired, to be confirmed' to 'repaired'; and modifying the quality score to trigger the smart contract to update the quality average score based on the modified quality score.
Optionally, the data description archive includes a quality average of the open data content and at least one quality assessment result;
the third execution module 170 is further configured to: the open data content is evaluated based on the quality average score and the at least one quality evaluation result to determine whether to use the open data content.
Referring to fig. 9, fig. 9 is a block diagram illustrating a data quality evaluation apparatus 200 according to an embodiment of the present application. The data quality evaluation apparatus 200 is applied to a data providing end and includes: a second receiving module 210, a second acquisition module 220 and a data sending module 230.
The second receiving module 210 is configured to receive a data sharing request sent by the data using end through the distributed application, where the data sharing request is generated by the data using end based on the data description file that the data using end acquired through the distributed application.
The second obtaining module 220 is configured to obtain the shared data from the open data content according to the data sharing request.
The data sending module 230 is configured to send the shared data to the data consumer, so that the data consumer performs quality assessment on the shared data, generates a quality assessment result when the shared data has a quality problem, and adds the quality assessment result to the data description file through the distributed application.
Optionally, the data quality assessment apparatus 200 may further comprise a processing module 240.
The processing module 240 is configured to: receiving a quality evaluation result sent by a data using end through a distributed application; repairing the shared data according to the quality evaluation result to obtain repair data; and sending a data repair notice to the data using terminal through the distributed application, so that the data using terminal obtains repair data through the distributed application, performs quality evaluation on the repair data, and updates a quality evaluation result when the repair data has no quality problem.
Optionally, the blockchain network is deployed with a smart contract; the quality evaluation result comprises a quality problem record, and the quality problem record comprises a problem state identifier;
the processing module 240 is further configured to: call the smart contract through the distributed application to update the problem state identifier from 'unrepaired' to 'repaired, to be confirmed'.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working procedures of the data quality assessment apparatus 100 and the data quality assessment apparatus 200 described above may refer to the corresponding procedures in the foregoing method embodiments, and will not be described herein.
Referring to fig. 10, fig. 10 is a block diagram of a computer device 10 according to an embodiment of the present application. The computer device 10 may be a data use terminal or a data providing terminal. The computer device 10 includes a processor 11, a memory 12, and a bus 13, the processor 11 being connected to the memory 12 via the bus 13.
The memory 12 is used to store a program, for example, the data quality evaluation device 100 shown in fig. 8 or the data quality evaluation device 200 shown in fig. 9. Taking the data quality evaluation device 100 as an example, the data quality evaluation device 100 includes at least one software functional module that may be stored in the memory 12 in the form of software or firmware, and the processor 11 executes the program after receiving an execution instruction, so as to implement the data quality evaluation method applied to the data using end disclosed in the above embodiment.
The memory 12 may include high-speed random access memory (Random Access Memory, RAM) and may also include non-volatile memory (NVM).
The processor 11 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 11 or by instructions in the form of software. The processor 11 may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a micro control unit (Microcontroller Unit, MCU), a complex programmable logic device (Complex Programmable Logic Device, CPLD), a Field-programmable gate array (Field-Programmable Gate Array, FPGA), an embedded ARM, or the like.
The embodiment of the present application further provides a computer readable storage medium, on which a computer program is stored, which when executed by the processor 11 implements the data quality evaluation method applied to the data consumer or the data quality evaluation method applied to the data provider disclosed in the above embodiment.
In summary, in the data quality evaluation method, device, computer equipment and storage medium provided by the application, the data providing end publishes the data description file of the open data content to the blockchain network in advance. The data using end accesses the blockchain network through the distributed application to obtain a data description file that meets its requirements, generates a data sharing request based on the data description file, and sends the data sharing request to the data providing end. The data providing end obtains the shared data that satisfies the data sharing request from the open data content and sends the shared data to the data using end. The data using end performs quality evaluation on the shared data, generates a quality evaluation result when the shared data has a quality problem, and adds the quality evaluation result to the data description file through the distributed application. In this way, the data using end can evaluate the quality of the shared data provided by the data providing end, and the quality evaluation result of the shared data is made public by exploiting the tamper-resistant and traceable nature of the blockchain network, thereby realizing quality supervision of the shared data.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the same, but rather, various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application.

Claims (15)

1. The data quality evaluation method is characterized by being applied to a data use end in a block chain network, wherein the block chain network is operated with a distributed application;
the block chain network is provided with a data description file, and the data description file is used for representing basic information of open data content provided by a data providing end in the block chain network;
the method comprises the following steps:
acquiring the data description file through the distributed application;
generating a data sharing request based on the data description archive;
sending the data sharing request to the data providing end through the distributed application;
receiving shared data sent by the data providing end, wherein the shared data is obtained from the open data content by the data providing end according to the data sharing request;
Performing quality evaluation on the shared data, and generating a quality evaluation result when the quality problem exists in the shared data;
and adding the quality assessment result to the data description file through the distributed application.
2. The method of claim 1, wherein the method further comprises:
the distributed application is used for sending the quality evaluation result to the data providing end, so that the data providing end repairs the shared data according to the quality evaluation result to obtain repair data;
receiving a data restoration notification which characterizes the completion of the restoration of the shared data and is sent by the data providing terminal;
acquiring the repair data through the distributed application;
and carrying out quality evaluation on the repair data, and updating the quality evaluation result when the quality problem does not exist in the repair data.
3. The method of claim 2, wherein the blockchain network is deployed with a smart contract; the quality evaluation result comprises quality problem records and quality scores; the data description file comprises the quality average of the open data content;
the step of adding the quality assessment results to the data profile by the distributed application comprises:
Invoking the intelligent contract by the distributed application, adding the quality issue record and the quality score to the data description archive;
triggering the intelligent contract to update the quality average score based on the quality score.
4. The method of claim 3, wherein the quality problem record includes a problem status identification;
the step of sending the quality evaluation result to the data providing end through the distributed application so that the data providing end repairs the shared data according to the quality evaluation result to obtain repair data includes:
and sending the quality evaluation result to the data providing end through the distributed application, so that the data providing end carries out data repair on the shared data according to the quality problem record to obtain repair data, and after repair is completed, the distributed application calls the intelligent contract to update the problem state identifier from unrepaired to repaired to be confirmed.
5. The method of claim 4, wherein the step of performing quality assessment on the repair data and updating the quality assessment result when the repair data has no quality problem comprises:
Performing quality assessment on the repair data;
when the repair data has no quality problem, calling the intelligent contract through the distributed application, and updating the problem state identification from repaired to-be-confirmed to repaired;
modifying the quality score to trigger the smart contract to update the quality average score based on the modified quality score.
6. The method of claim 2, wherein the step of obtaining the repair data by the distributed application comprises:
transmitting a full-volume data acquisition request to the data providing end through the distributed application;
and receiving the repair data sent by the data providing terminal according to the full data acquisition request, wherein the repair data is the repaired shared data.
7. The method of claim 2, wherein the step of obtaining repair data by the distributed application comprises:
sending an incremental data acquisition request to the data providing end through the distributed application;
and receiving the repair data sent by the data providing end according to the incremental data acquisition request, wherein the repair data is an incremental part in the repaired shared data.
8. The method of claim 1, wherein the data description file includes a quality average score of the open data content and at least one quality evaluation result;
before the step of generating a data sharing request based on the data description file, the method further comprises:
evaluating the open data content according to the quality average score and the at least one quality evaluation result to determine whether to use the open data content;
when it is determined to use the open data content, the step of generating a data sharing request based on the data description file is performed.
9. The data quality evaluation method is characterized by being applied to a data providing end in a block chain network, wherein the block chain network is operated with a distributed application;
the block chain network is provided with a data description file, and the data description file is used for representing basic information of open data content provided by the data providing end;
the method comprises the following steps:
receiving a data sharing request sent by a data using end through the distributed application, wherein the data sharing request is generated by the data using end based on the data description file after the data using end acquires the data description file through the distributed application;
Acquiring shared data from the open data content according to the data sharing request;
and sending the shared data to the data using end so that the data using end carries out quality assessment on the shared data, generating a quality assessment result when the quality problem exists in the shared data, and adding the quality assessment result to the data description file through the distributed application.
10. The method of claim 9, wherein the method further comprises:
receiving the quality evaluation result sent by the data using end through the distributed application;
repairing the shared data according to the quality evaluation result to obtain repair data;
and sending a data restoration notification to the data using end through the distributed application so that the data using end obtains the restoration data through the distributed application, carries out quality assessment on the restoration data, and updates the quality assessment result when the restoration data has no quality problem.
11. The method of claim 10, wherein the blockchain network is deployed with a smart contract; the quality evaluation result comprises a quality problem record, wherein the quality problem record comprises a problem state identifier;
After the step of repairing the shared data according to the quality evaluation result to obtain repaired data, the method comprises the following steps:
and calling the intelligent contract through the distributed application, and updating the problem state identification from unrepaired to repaired to be confirmed.
12. The data quality evaluation device is characterized by being applied to a data use end in a block chain network, wherein the block chain network is operated with a distributed application;
the block chain network is provided with a data description file, and the data description file is used for representing basic information of open data content provided by a data providing end in the block chain network;
the device comprises:
the first acquisition module is used for acquiring the data description file through the distributed application;
the request generation module is used for generating a data sharing request based on the data description file;
a request sending module, configured to send, by using the distributed application, the data sharing request to the data providing end;
the first receiving module is used for receiving the shared data sent by the data providing end, wherein the shared data is obtained from the open data content by the data providing end according to the data sharing request;
The first execution module is used for carrying out quality evaluation on the shared data and generating a quality evaluation result when the quality problem exists in the shared data;
and the second execution module is used for adding the quality evaluation result to the data description file through the distributed application.
13. The data quality assessment device is characterized by being applied to a data providing end in a block chain network, wherein the block chain network is operated with a distributed application;
the block chain network is provided with a data description file, and the data description file is used for representing basic information of open data content provided by the data providing end;
the device comprises:
the second receiving module is used for receiving a data sharing request sent by a data using end through the distributed application, wherein the data sharing request is generated by the data using end based on the data description file after the data using end acquires the data description file through the distributed application;
the second acquisition module is used for acquiring shared data from the open data content according to the data sharing request;
and the data sending module is used for sending the shared data to the data using end so that the data using end carries out quality evaluation on the shared data, generates a quality evaluation result when the quality problem exists in the shared data, and adds the quality evaluation result to the data description file through the distributed application.
14. A computer device, the computer device comprising:
one or more processors;
a memory for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the data quality assessment method of any of claims 1-8, or the data quality assessment method of any of claims 9-11.
15. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements a data quality assessment method according to any one of claims 1-8 or a data quality assessment method according to any one of claims 9-11.
CN202110148907.2A 2021-02-03 2021-02-03 Data quality evaluation method, device, computer equipment and storage medium Active CN112948362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110148907.2A CN112948362B (en) 2021-02-03 2021-02-03 Data quality evaluation method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112948362A CN112948362A (en) 2021-06-11
CN112948362B true CN112948362B (en) 2023-12-22

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117671A (en) * 2018-08-22 2019-01-01 平安科技(深圳)有限公司 A kind of encryption data sharing method, server and computer readable storage medium
CN111062807A (en) * 2019-12-17 2020-04-24 北京工业大学 Internet of things data service credit assessment method based on block chain
KR20200044363A (en) * 2018-10-19 2020-04-29 빅픽처랩 주식회사 Method for managing trust information based on block-chain
CN111858611A (en) * 2020-07-28 2020-10-30 北京金山云网络技术有限公司 Data access method and device, computer equipment and storage medium
CN111858769A (en) * 2020-07-28 2020-10-30 北京金山云网络技术有限公司 Data using method, device, node equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117671A (en) * 2018-08-22 2019-01-01 平安科技(深圳)有限公司 A kind of encryption data sharing method, server and computer readable storage medium
KR20200044363A (en) * 2018-10-19 2020-04-29 빅픽처랩 주식회사 Method for managing trust information based on block-chain
CN111062807A (en) * 2019-12-17 2020-04-24 北京工业大学 Internet of things data service credit assessment method based on block chain
CN111858611A (en) * 2020-07-28 2020-10-30 北京金山云网络技术有限公司 Data access method and device, computer equipment and storage medium
CN111858769A (en) * 2020-07-28 2020-10-30 北京金山云网络技术有限公司 Data using method, device, node equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design of a blockchain-based smart contract algorithm for shared cars; 刘永相; 李彦斌; 林亮; 江冰; 刘期烈; 谢冬菊; Journal of Computer Applications (Issue S1); full text *

Also Published As

Publication number Publication date
CN112948362A (en) 2021-06-11

Similar Documents

Publication Publication Date Title
CN109241358A (en) Metadata management method, device, computer equipment and storage medium
WO2021040994A1 (en) Systems, method, and media for determining security compliance of continuous build software
CN111414407A (en) Data query method and device of database, computer equipment and storage medium
CN110096496A (en) Form validation method, related apparatus and device
CN111460458B (en) Data processing method, related device and computer storage medium
CN111124917B (en) Method, device, equipment and storage medium for managing and controlling public test cases
CN113010378B (en) Log processing method and device of microservice module, storage medium and electronic device
CN113595788B (en) API gateway management method and device based on plug-in
CN110597918A (en) Account management method and device and computer readable storage medium
CN111125175A (en) Service data query method and device, storage medium and electronic device
CN108667660B (en) Method and device for route management and service routing and routing system
CN111984735A (en) Data archiving method and device, electronic equipment and storage medium
CN111813418A (en) Distributed link tracking method, device, computer equipment and storage medium
CN113821307B (en) Method, device and equipment for quickly importing virtual machine images
CN111324389A (en) Cloud platform network management method, device, equipment and storage medium
CN112948362B (en) Data quality evaluation method, device, computer equipment and storage medium
CN116208676A (en) Data back-source method, device, computer equipment, storage medium and program product
CN111045928A (en) Interface data testing method, device, terminal and storage medium
US9621424B2 (en) Providing a common interface for accessing and presenting component configuration settings
CN115617781A (en) Digital object creating and data management method and device
US11573808B2 (en) Methods of providing an integrated interface that includes a virtual mobile device
CN108400901A (en) Application test method, terminal device and computer-readable storage medium
CN114143187B (en) Intelligent platform interface network address management method, system, terminal and storage medium
CN113852516B (en) Method, system, terminal and storage medium for generating switch diagnostic program
CN115481932B (en) ERP system database trigger generation method, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant