CN115600476A - Method and device for evaluating data comprehensive value and electronic equipment - Google Patents

Publication number
CN115600476A
Authority
CN
China
Prior art keywords
data
value
evaluated
evaluation
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110719131.5A
Other languages
Chinese (zh)
Inventor
蔡国庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Communications Ltd Research Institute
Priority to CN202110719131.5A
Publication of CN115600476A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Software Systems (AREA)
  • Development Economics (AREA)
  • Evolutionary Computation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Public Health (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Geometry (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Computer Hardware Design (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method and an apparatus for evaluating the comprehensive value of data, and an electronic device. The method comprises: in a federated learning process, obtaining the relative value, the training value and the application value of data to be evaluated that is provided by a data provider; and determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value. In the embodiments of the invention, value is quantified along multiple dimensions: the relative value of the data in the data evaluation stage, the training value of the data for model training in the model training stage, and the application value of the data in the model application stage. This avoids treating data value as absolute and static, avoids relying on a single evaluation mode, and makes the data value evaluation result more accurate. In addition, no third party needs to be introduced in the data evaluation process, which protects the data privacy of the data provider.

Description

Method and device for evaluating data comprehensive value and electronic equipment
Technical Field
The present invention relates to the field of data processing, and in particular, to a method and an apparatus for evaluating a comprehensive value of data, and an electronic device.
Background
In the big data era, data has huge economic value and is often likened to a new petroleum resource. Deep mining of cross-domain data resources reveals the economic patterns behind the data and can strongly promote industrial upgrading and leapfrog development. Breakthroughs in machine learning algorithms and the large-scale deployment of artificial intelligence both depend on a massive supply of high-quality data. However, current rules and technologies for data sharing and circulation cannot meet the strong demand of applications for data resources. A large number of isolated data islands have formed, and data resources are heavily wasted. An open platform and related technologies that support data sharing are therefore urgently needed to break down data barriers, promote the circulation of data over the internet, mine the economic value of big data, and release the application potential of all kinds of data.
Federated learning technology emerged in this context. It is still developing rapidly and is currently at an early stage of pilot technical verification, particularly with respect to data quality evaluation. At present there are two main approaches to evaluating data in federated learning: quality evaluation and value evaluation. Quality evaluation focuses on multi-dimensional characteristics of the data content, such as integrity and timeliness. Value evaluation, in addition to evaluating data quality, also takes into account the cost of the data in the production process and the benefit the data yields in different tasks. These data evaluation schemes treat data value as absolute and static, so the data evaluation result is inaccurate.
Disclosure of Invention
The invention aims to provide a method and a device for evaluating the comprehensive value of data and electronic equipment, which are used for solving the problem that the evaluation result of the conventional data evaluation method is inaccurate.
In order to achieve the above object, an embodiment of the present invention provides a method for evaluating the comprehensive value of data, including:
in a federated learning process, obtaining the relative value, the training value and the application value of data to be evaluated provided by a data provider;
and determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
Optionally, obtaining the relative value of the data to be evaluated provided by the data provider includes:
in the data evaluation process, carrying out data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node;
and calculating the comprehensive relative value of the data to be evaluated according to the relative value evaluation results of all data evaluators on the data to be evaluated.
Optionally, before performing data quality evaluation on the data to be evaluated, the method further includes:
performing data evaluation negotiation with the data provider, where the content of the negotiation includes at least one of: evaluation content, evaluation fields, and a security algorithm;
obtaining a shared data security evaluation protocol according to the result of the data evaluation negotiation;
in this case, performing data quality evaluation on the data to be evaluated in the data evaluation process, to obtain a relative value evaluation result of the data to be evaluated relative to the data evaluation node, includes:
performing data quality evaluation on the data to be evaluated according to the shared data security evaluation protocol, to obtain the relative value evaluation result of the data to be evaluated relative to the data evaluation node.
Optionally, the obtaining of the training value of the data to be evaluated provided by the data provider includes:
calculating the contribution degree of data to be evaluated provided by a data provider to the model training in the process of performing data collaborative model training between a data modeling initiator and the data provider;
and calculating the training value of the data to be evaluated for model training according to the contribution degree.
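The text does not specify how the contribution degree is computed. As a minimal sketch, assuming a leave-one-out scheme (a swapped-in technique, not one stated in the source), the contribution of each provider can be taken as the drop in a model score when that provider's data is removed. The function names, the `score` callback and the normalisation step are all hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def leave_one_out_contributions(
    providers: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Contribution of each provider = score(all) - score(all without it).

    `score` is a hypothetical callback that trains and evaluates the
    collaborative model on the data of the given set of providers.
    """
    everyone = frozenset(providers)
    full_score = score(everyone)
    return {p: full_score - score(everyone - {p}) for p in everyone}


def training_values(contributions: Dict[str, float]) -> Dict[str, float]:
    """Map contribution degrees to training values (assumed rule).

    Negative contributions are clipped to zero and the rest are
    normalised so that the training values sum to 1.
    """
    clipped = {p: max(c, 0.0) for p, c in contributions.items()}
    total = sum(clipped.values())
    if total == 0.0:
        return {p: 0.0 for p in clipped}
    return {p: c / total for p, c in clipped.items()}


# Toy example: the score is the sum of fixed per-provider gains.
gains = {"A": 0.3, "B": 0.1, "C": 0.0}
toy_score = lambda coalition: sum(gains[p] for p in coalition)
contribs = leave_one_out_contributions(gains, toy_score)
values = training_values(contribs)
```

Leave-one-out needs one extra training run per provider; other contribution measures could be substituted without changing the second step.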
Optionally, before calculating the contribution degree of the data to be evaluated provided by the data provider to model training, during data collaborative model training between the data modeling initiator and the data provider, the method further includes:
receiving a data cooperation request sent by a data modeling initiator;
sending the data to be evaluated allowed to be shared to the data modeling initiator according to the data cooperation request;
receiving a notification message that the data modeling initiator selects the data to be evaluated;
and establishing a data cooperative connection channel between the data modeling initiator and the data provider according to the notification message.
Optionally, obtaining an application value of data to be evaluated provided by a data provider includes:
in the model prediction process, calculating the application value of the data to be evaluated provided by the data provider according to the number of model invocations and the number of collaborative data computations performed by the data provider in each invocation.
Optionally, determining a comprehensive value of the data to be evaluated according to the relative value, the training value, and the application value includes:
determining weights of the relative value, the training value and the application value respectively;
and according to the weight, carrying out weighted summation processing on the relative value, the training value and the application value to obtain the comprehensive value of the data to be evaluated.
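The weighted summation described above can be sketched as follows. This is a minimal illustration: the class and function names and the example weights are assumptions, and the text does not say how the weights are chosen or whether they must sum to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueWeights:
    """Weights for the three value dimensions (hypothetical container)."""
    relative: float
    training: float
    application: float


def comprehensive_value(
    relative_value: float,
    training_value: float,
    application_value: float,
    weights: ValueWeights,
) -> float:
    """Weighted sum of the relative, training and application values."""
    return (
        weights.relative * relative_value
        + weights.training * training_value
        + weights.application * application_value
    )


# Example with illustrative numbers only.
weights = ValueWeights(relative=0.5, training=0.25, application=0.25)
total = comprehensive_value(0.8, 0.6, 0.4, weights)
```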
To achieve the above object, an embodiment of the present invention provides an apparatus for evaluating the comprehensive value of data, including:
a first obtaining module, configured to obtain, in a federated learning process, the relative value, the training value and the application value of data to be evaluated provided by a data provider;
and a value evaluation module, configured to determine the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
Optionally, the first obtaining module includes a data relative value evaluation module, which is specifically configured to:
in the data evaluation process, carrying out data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node;
and calculating the comprehensive relative value of the data to be evaluated according to the relative value evaluation results of all the data evaluators on the data to be evaluated.
Optionally, the apparatus further comprises:
a data negotiation module, configured to perform data evaluation negotiation with the data provider, where the content of the negotiation includes at least one of: evaluation content, evaluation fields, and a security algorithm;
a second obtaining module, configured to obtain a shared data security evaluation protocol according to the result of the data evaluation negotiation;
the data relative value evaluation module is specifically configured to:
perform data quality evaluation on the data to be evaluated according to the shared data security evaluation protocol, to obtain a relative value evaluation result of the data to be evaluated relative to the data evaluation node.
Optionally, the first obtaining module includes: a data model training value evaluation module, the data model training value evaluation module specifically configured to:
calculating the contribution degree of data to be evaluated provided by a data provider to the model training in the process of performing data collaborative model training between a data modeling initiator and the data provider;
and calculating the training value of the data to be evaluated for model training according to the contribution degree.
Optionally, the apparatus further comprises a multi-party collaborative computing management module, which is specifically configured to:
receiving a data cooperation request sent by a data modeling initiator;
sending the data to be evaluated allowed to be shared to the data modeling initiator according to the data cooperation request;
receiving a notification message that the data modeling initiator selects the data to be evaluated;
and establishing a data cooperative connection channel between the data modeling initiator and the data provider according to the notification message.
Optionally, the first obtaining module includes: a data application value evaluation module, the data application value evaluation module specifically configured to:
in the model prediction process, calculating the application value of the data to be evaluated provided by the data provider according to the number of model invocations and the number of collaborative data computations performed by the data provider in each invocation.
Optionally, the value assessment module comprises:
a determining unit, configured to determine weights of the relative value, the training value and the application value respectively;
and a comprehensive value evaluation unit, configured to perform a weighted summation of the relative value, the training value and the application value according to the weights, to obtain the comprehensive value of the data to be evaluated.
To achieve the above object, an embodiment of the present invention provides an electronic device, which includes a processor and a transceiver, wherein
the processor is configured to: in a federated learning process, obtain the relative value, the training value and the application value of data to be evaluated provided by a data provider;
and determine the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
Optionally, the processor obtaining the relative value of the data to be evaluated provided by the data provider includes:
in the data evaluation process, carrying out data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node;
and calculating the comprehensive relative value of the data to be evaluated according to the relative value evaluation results of all data evaluators on the data to be evaluated.
Optionally, before performing data quality evaluation on the data to be evaluated, the processor is further configured to:
perform data evaluation negotiation with the data provider, where the content of the negotiation includes at least one of: evaluation content, evaluation fields, and a security algorithm;
obtain a shared data security evaluation protocol according to the result of the data evaluation negotiation;
the processor is further configured to: perform data quality evaluation on the data to be evaluated according to the shared data security evaluation protocol, to obtain a relative value evaluation result of the data to be evaluated relative to the data evaluation node.
Optionally, the processor obtaining the training value of the data to be evaluated provided by the data provider includes:
calculating the contribution degree of data to be evaluated provided by a data provider to the model training in the process of performing data collaborative model training between the data modeling initiator and the data provider;
and calculating the training value of the data to be evaluated for model training according to the contribution degree.
Optionally, before the contribution degree of the data to be evaluated provided by the data provider to model training is calculated, during data collaborative model training between the data modeling initiator and the data provider, the transceiver is configured to:
receive a data cooperation request sent by the data modeling initiator;
send the data to be evaluated that is allowed to be shared to the data modeling initiator according to the data cooperation request;
receive a notification message indicating that the data modeling initiator has selected the data to be evaluated;
the processor is configured to: establish a data collaborative connection channel between the data modeling initiator and the data provider according to the notification message.
Optionally, the processor obtaining the application value of the data to be evaluated provided by the data provider includes:
in the model prediction process, calculating the application value of the data to be evaluated provided by the data provider according to the number of model invocations and the number of collaborative data computations performed by the data provider in each invocation.
Optionally, the processor determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value specifically includes:
determining weights for the relative value, the training value, and the application value, respectively;
and according to the weight, carrying out weighted summation processing on the relative value, the training value and the application value to obtain the comprehensive value of the data to be evaluated.
To achieve the above object, an embodiment of the present invention provides an electronic device, which includes a transceiver, a processor, a memory, and a program or instructions stored in the memory and executable on the processor; the processor, when executing the program or instructions, implements the method for evaluating the comprehensive value of data as described above.
To achieve the above object, an embodiment of the present invention provides a readable storage medium on which a program or instructions are stored, which, when executed by a processor, implement the steps of the method for evaluating the comprehensive value of data as described above.
The technical scheme of the invention has the following beneficial effects:
In the embodiments of the invention, a data evaluator in federated learning quantifies value along multiple dimensions: the relative value of the data in the data evaluation stage, the training value of the data for model training in the model training stage, and the application value of the data in the model application stage. Because the evaluation uses the value of the data to be evaluated relative to the data evaluation node, data value is not treated as absolute. The training value is the dynamic contribution of the data to be evaluated to model training; evaluating data value on the basis of this dynamic contribution avoids treating data value as static and makes the evaluation result more accurate. Finally, combining the relative value, the training value and the application value gives a multi-dimensional comprehensive evaluation of the data, which avoids relying on a single evaluation mode and further ensures the accuracy of the data value evaluation result. In addition, no third party needs to be introduced in the data evaluation process, which protects the data privacy of the data provider.
Drawings
FIG. 1 is a schematic flowchart of a method for evaluating the comprehensive value of data according to an embodiment of the present invention;
FIG. 2 is a second schematic flowchart of a method for evaluating the comprehensive value of data according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the architecture of a data collaboration network according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for evaluating the comprehensive value of data according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;
FIG. 6 is a second schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In various embodiments of the present invention, it should be understood that the sequence numbers of the following processes do not mean the execution sequence, and the execution sequence of each process should be determined by the function and the inherent logic of the process, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In addition, the terms "system" and "network" are often used interchangeably herein.
In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
Before describing the embodiments of the present invention, a description will be given of concepts used in the embodiments of the present invention.
Federated learning:
Federated learning relies on a data collaboration network formed by multiple nodes, such as data providers and data evaluators. Data can take part in multi-party collaborative modeling without leaving its local environment, and information security and personal privacy are protected during data exchange. Under the condition of ensuring compliance and security, the data of multiple different nodes is connected to uncover the collaborative economic value of big data. The nodes participating in federated learning form a trusted computing network in the form of a consortium blockchain (alliance chain), and each data provider and each model owner acts as an independent node of the data collaboration network.
For example, in the medical field, the sample data of a single hospital is not enough to build a diagnosis model with sufficient accuracy. Collaborative modeling over the samples of multiple hospitals through federated learning can solve the accuracy problem, while ensuring that patient privacy is not exposed. The federated learning process is as follows: a cloud server first issues a unified model to the server nodes of multiple medical institutions; each medical institution node trains the model on its local data and returns the resulting model parameters to the cloud server, in encrypted form, through the trusted computing network; the cloud server integrates and updates the model parameters obtained from the medical institution nodes and then starts the next round of model training on each node. Training iterates in this way until the final model converges, which completes one full federated learning process.
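The training loop in this example can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the model is a single-parameter linear model y = w * x, the server combines parameters by a sample-count-weighted average, and the encrypted transport over the trusted computing network is omitted. The function names, learning rate and stopping tolerance are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held locally by one node


def local_train(w: float, data: Sequence[Sample],
                lr: float = 0.05, epochs: int = 5) -> float:
    """Train on one node's local data with plain gradient descent on MSE."""
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_training(node_datasets: List[Sequence[Sample]],
                       w0: float = 0.0, max_rounds: int = 100,
                       tol: float = 1e-9) -> float:
    """Server loop: issue model, collect local parameters, aggregate, repeat."""
    w = w0
    for _ in range(max_rounds):
        # Each node trains locally; only parameters leave the node.
        updates = [(local_train(w, data), len(data)) for data in node_datasets]
        total = sum(n for _, n in updates)
        new_w = sum(w_i * n for w_i, n in updates) / total
        if abs(new_w - w) < tol:  # treated here as convergence
            return new_w
        w = new_w
    return w


# Three nodes whose local samples all follow y = 3 * x.
nodes = [[(1.0, 3.0), (2.0, 6.0)], [(1.0, 3.0), (3.0, 9.0)], [(2.0, 6.0)]]
w_final = federated_training(nodes)
```

Raw samples never leave a node in this sketch; only the scalar parameter is exchanged, which mirrors the parameter return step described above.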
As shown in fig. 1, the method for evaluating the comprehensive value of data provided by the embodiment of the present invention includes:
Step 11: in a federated learning process, obtaining the relative value, the training value and the application value of data to be evaluated provided by a data provider.
Step 12: determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
Federated learning covers the complete life cycle of the data shared by a data provider, from data release, data retrieval and data evaluation through model training to model prediction (model inference). The specific process of evaluating the comprehensive value of data based on federated learning is shown in FIG. 2. In the data evaluation stage, the relative value of the data to be evaluated is obtained, where the relative value is the value of the data provided by the data provider relative to an evaluation node. In the model training stage, the training value of the data to be evaluated for model training is obtained. In the model prediction stage, the application value of the data to be evaluated for model prediction is obtained.
The relative value, the training value and the application value of the data to be evaluated are then combined to determine the comprehensive value of the data provided by the data provider. An incentive is calculated from the comprehensive value and provided to the data provider.
In the embodiments of the invention, a data evaluator in federated learning quantifies value along multiple dimensions: the relative value of the data in the data evaluation stage, the training value of the data for model training in the model training stage, and the application value of the data in the model application stage. Because the evaluation uses the value of the data to be evaluated relative to the data evaluation node, data value is not treated as absolute. The training value is the dynamic contribution of the data to be evaluated to model training; evaluating data value on the basis of this dynamic contribution avoids treating data value as static and makes the evaluation result more accurate. Finally, combining the relative value, the training value and the application value gives a multi-dimensional comprehensive evaluation of the data, which avoids relying on a single evaluation mode and further ensures the accuracy of the data value evaluation result. In addition, no third party needs to be introduced in the data evaluation process, which protects the data privacy of the data provider.
Specifically, in the federated learning process, obtaining the relative value, the training value and the application value of the data to be evaluated provided by the data provider may include:
(1) In the data evaluation process of federated learning, obtaining the relative value of the data to be evaluated relative to the data evaluation node. A comprehensive evaluation may be made based on the relative values that multiple data evaluators assign to the data provider's data.
(2) In the model training process of federated learning, obtaining the training value of the data to be evaluated for model training. A comprehensive evaluation may be made based on the degree of participation and the degree of contribution of the data to model training in the collaborative task.
(3) In the model prediction process of federated learning, obtaining the application value of the data to be evaluated. A comprehensive evaluation may be made based on the number of model invocations in the model application stage and the number of collaborative computations on the data.
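For item (3), the text names the inputs (model invocation count and collaborative computation count) but not the formula. The sketch below assumes the simplest possible rule, a fixed unit value per collaborative computation summed over all invocations; the function name and `value_per_computation` are hypothetical.

```python
from typing import Sequence


def application_value(collab_counts_per_call: Sequence[int],
                      value_per_computation: float = 1.0) -> float:
    """Application value of one provider's data (assumed linear rule).

    `collab_counts_per_call` has one entry per model invocation, giving
    the number of collaborative computations the provider performed in
    that invocation. Invocations with a count of 0 add no value.
    """
    if any(count < 0 for count in collab_counts_per_call):
        raise ValueError("computation counts must be non-negative")
    return value_per_computation * sum(collab_counts_per_call)


# Three invocations; the provider took part in the first and third.
value = application_value([2, 0, 3], value_per_computation=0.5)
```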
Specifically, the federated learning process includes the following steps: performing data retrieval to acquire the data to be evaluated provided by the data provider; performing data evaluation on the data to be evaluated; when the evaluation result of the data to be evaluated meets a preset condition, performing data collaborative model training with the data to be evaluated to obtain a model; and invoking the model to perform model prediction.
It should be noted that, because the data evaluation process is based on federated learning, the multi-dimensional evaluation can follow the order of the federated learning stages. For example, the data provider prepares the data to be published in the local data management module. Once the data is ready, it is published through the publishing module in the agreed standard protocol format. After receiving the published information, the multi-party collaborative computing management module in the data collaboration network forwards it to all other interested nodes according to the registration information of each node. A data evaluator performs data retrieval and, after retrieving the data, enters the data evaluation stage and evaluates the relative value of the retrieved data provided by the data provider. Based on the resulting relative value, the evaluator determines whether the data meets the data requirements of model training. If it does, model training is performed to obtain a model, and the training value of the data to be evaluated for model training is obtained during training. When the model is later invoked in an application, the number of model invocations and the number of collaborative computations on the data can be counted during the model application process for a comprehensive evaluation, which determines the application value of the data provided by the data provider.
The predetermined condition may be set according to requirements of model training on training data, for example, when the requirement on data quality is high, a high relative value threshold may be set.
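As an illustration only, the gating described above (train only when the evaluation result meets the preset condition) can be sketched as follows. The function name `run_federated_flow`, its parameters, and the use of a simple "relative value >= threshold" test as the preset condition are assumptions made for this sketch and are not defined in this application.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

D = TypeVar("D")  # a retrieved data set (hypothetical representation)
M = TypeVar("M")  # a trained model (hypothetical representation)


def run_federated_flow(
    retrieved: Sequence[D],
    evaluate: Callable[[D], float],
    train: Callable[[List[D]], M],
    relative_value_threshold: float,
) -> Optional[M]:
    """Evaluate retrieved data and train only on data meeting the preset condition."""
    # Data evaluation stage: keep data whose relative value reaches the threshold
    # (assumed form of the preset condition).
    accepted = [d for d in retrieved if evaluate(d) >= relative_value_threshold]
    if not accepted:
        return None  # no data met the preset condition, so no model is trained
    # Model training stage: the caller supplies the collaborative training routine.
    return train(accepted)
```

A higher `relative_value_threshold` corresponds to the case in which model training places a higher requirement on data quality.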
The process of obtaining the value of each data is described below by specific embodiments.
As an alternative embodiment, the obtaining the relative value of the data to be evaluated provided by the data provider may include:
in the data evaluation process, carrying out data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node;
and calculating the comprehensive relative value of the data to be evaluated according to the relative value evaluation results of all the data evaluators on the data to be evaluated.
In this embodiment, the data collaboration network may include a plurality of data evaluation nodes. When the relative value of data provided by a data provider is evaluated, each data evaluation node may share its relative value evaluation result after performing the evaluation, so that each evaluation node can obtain the evaluation results of a plurality of other evaluation nodes. For example, each evaluation node uploads its relative value evaluation result to the data relative value evaluation module, and the data relative value evaluation module performs a comprehensive evaluation (for example, weighted summation) according to the relative value evaluation results of all the evaluation nodes to determine the comprehensive relative value of the data provided by the data provider.
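A minimal sketch of the weighted summation mentioned above is given below. The function name `comprehensive_relative_value`, the per-node weights, and the normalization by the total weight are assumptions of this sketch; the application does not fix a particular formula.

```python
from typing import Dict, Optional


def comprehensive_relative_value(
    results: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Combine per-node relative value results into one comprehensive relative value."""
    if not results:
        raise ValueError("at least one evaluation result is required")
    if weights is None:
        # Assumption: all evaluation nodes count equally unless weights are given.
        weights = {node: 1.0 for node in results}
    total_weight = sum(weights[node] for node in results)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted summation, normalized so the result stays on the input scale.
    return sum(results[node] * weights[node] for node in results) / total_weight
```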
Optionally, before performing data quality evaluation on the data to be evaluated, the method further includes:
performing data evaluation negotiation with the data provider; the contents of the data evaluation negotiation include: at least one of evaluation content, an evaluation field, and a security algorithm; obtaining a shared data security evaluation protocol according to the result of the data evaluation negotiation; and according to the shared data security evaluation protocol, performing data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node.
In this embodiment, the nodes in the data collaboration network other than the data provider are data evaluators. After receiving the publishing information from the data provider, the evaluators negotiate and agree with the data provider on an evaluation method; the negotiation content may include (but is not limited to) evaluation content, evaluation fields, security algorithms, and so on, and the result is a shared data security evaluation protocol that the participants jointly adhere to.
Data quality evaluation is performed according to the shared data security evaluation protocol negotiated between the data evaluator and the data provider, including (but not limited to) privacy-preserving sample alignment between the two parties through a secure multi-party computing protocol, and data feature engineering evaluation, such as calculating statistical characteristics of the feature values of the data provider's samples and the importance coefficient of each feature relative to a label (such as a user service label), for example Weight of Evidence (WOE) and Information Value (IV).
The data relative value evaluation module of the data collaboration network comprehensively evaluates, according to this information, the relative value of the data provided by the data provider with respect to the evaluation node. The formula for calculating the relative value score is not limited, but the calculation should meet the following requirements: the higher the sample matching degree, the higher the score; and the more important the provided feature values and the higher those feature values, the higher the score.
There may be multiple data evaluation nodes. After the relative value evaluation module has received relative value evaluation results exceeding a preset threshold, it performs a comprehensive evaluation according to the relative value evaluation results of all the evaluation nodes to obtain the comprehensive relative value of the data, and notifies the data management module of the data collaboration network to rank the data and publish the ranking to the whole data collaboration network. Meanwhile, the overall evaluation value that each data evaluator assigns to the data provided by the data provider, together with the detailed value of each field, is sent to the trusted data traceability audit module of the data collaboration network for recording.
In this embodiment, the evaluators assess the value of the data provider's data relative to themselves, which avoids the tendency to treat data value as absolute; combining multiple evaluators' assessments of the same data avoids unreasonable scoring by a single evaluator and ensures the credibility of the evaluated data quality as far as possible.
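For reference, the WOE and IV indicators named above are commonly computed per bin of a feature from the counts of positive and negative labels. The sketch below computes them in plaintext for a single feature; in the scheme described here the counts would be obtained under the negotiated secure multi-party computing protocol, which is not shown. The function name `woe_iv`, the additive smoothing term `eps`, and the sign convention (positive over negative) are assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple


def woe_iv(bins: Sequence[Tuple[int, int]], eps: float = 0.5) -> Tuple[List[float], float]:
    """Return per-bin WOE values and the total IV for one feature.

    bins: one (positive_count, negative_count) pair per bin of the feature.
    eps:  additive smoothing to avoid log(0) for empty bins (an assumption).
    """
    total_pos = sum(pos for pos, _ in bins)
    total_neg = sum(neg for _, neg in bins)
    k = len(bins)
    woes: List[float] = []
    iv = 0.0
    for pos, neg in bins:
        # Smoothed share of all positives / negatives that fall in this bin.
        dist_pos = (pos + eps) / (total_pos + eps * k)
        dist_neg = (neg + eps) / (total_neg + eps * k)
        woe = math.log(dist_pos / dist_neg)
        woes.append(woe)
        # Each term is non-negative, so IV grows with the separation of the label.
        iv += (dist_pos - dist_neg) * woe
    return woes, iv
```

A feature whose bins separate the label well yields a larger IV, which is one possible importance coefficient for the relative value score.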
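Since the scoring formula is left open, the following is only one hypothetical scoring function that satisfies the stated monotonicity requirements: the score rises with the sample matching degree and with the importance of the provided features. The name `relative_value_score`, the mixing parameter `alpha`, and the squashing of total importance into [0, 1) are all assumptions of this sketch.

```python
from typing import Dict


def relative_value_score(
    match_rate: float,
    feature_importance: Dict[str, float],
    alpha: float = 0.5,
) -> float:
    """Score in [0, 1] that increases with match rate and with feature importance.

    match_rate:         fraction of the evaluator's samples matched by the provider.
    feature_importance: non-negative importance per provided feature (for example IV).
    alpha:              hypothetical trade-off between the two terms.
    """
    if not 0.0 <= match_rate <= 1.0:
        raise ValueError("match_rate must lie in [0, 1]")
    if any(v < 0 for v in feature_importance.values()):
        raise ValueError("feature importance values must be non-negative")
    total_importance = sum(feature_importance.values())
    # Squash the unbounded importance sum into [0, 1); the map is strictly increasing.
    importance_term = total_importance / (1.0 + total_importance)
    return alpha * match_rate + (1.0 - alpha) * importance_term
```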
As an alternative embodiment, the obtaining of the training value of the data to be evaluated provided by the data provider may include:
calculating the contribution degree of data to be evaluated provided by a data provider to the model training in the process of performing data collaborative model training between the data modeling initiator and the data provider; and calculating the training value of the data to be evaluated for model training according to the contribution degree.
In this embodiment, a data modeling initiator of the data collaboration network initiates a modeling task and performs data collaborative modeling with the data providers. During model training, the data evaluators respectively calculate the contribution degree of each data provider to the model training, and calculate the training value of the data provided by each data provider based on that contribution degree.
Optionally, before calculating a contribution degree of data to be evaluated provided by a data provider to training of a data collaborative model during training of the data collaborative model between a data modeling initiator and the data provider, the method further includes: receiving a data cooperation request sent by a data modeling initiator; sending the data to be evaluated allowed to be shared to the data modeling initiator according to the data cooperation request; receiving a notification message that the data modeling initiator selects the data to be evaluated; and establishing a data collaborative connection channel between the data modeling initiator and the data provider according to the notification message.
In this embodiment, the data modeling initiator sends a data sharing and collaboration request to the data management module of the data collaboration network (the data management module being a management module of the data evaluator), and the data management module returns to the data modeling initiator, in order of score ranking, the data in the current network that is allowed to be shared. The data modeling initiator selects data (the data of a plurality of data providers can be selected) and notifies the data management module, and the data management module assists the data modeling initiator and the data providers in establishing a connection channel.
The data modeling initiator and the data providers negotiate the data collaboration task; the negotiation content can include (but is not limited to) the data collaboration interaction mode of the two parties, the security encryption algorithm, the interaction frequency, the interaction termination conditions, and the settlement price and strategy of the two parties; the model training mode is agreed upon after the negotiation. The data modeling initiator and the data providers then start the data collaborative modeling (namely model training) or computing task according to the agreement reached. Throughout the interaction process of the task, the contribution degree of each data provider to the change in collaborative task efficiency (namely the dynamic training value for the model) is calculated comprehensively (including but not limited to cumulatively) over the update cycles in which that data provider effectively participated. Meanwhile, the contribution degrees of the data providers are fed back to the data management module, which applies a secondary correction to the comprehensive data value of each data provider according to the contribution degree of its data in the current collaborative task, thereby balancing the static relative value and the dynamic training value of the data.
In this embodiment, the value of the data is evaluated based on its contribution degree during model training, which avoids the tendency to treat data value as static and ensures the reliability of the data evaluation result.
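The "return shareable data by score ranking" step can be sketched as below. The record type `PublishedData`, its fields, and the function `list_shareable` are hypothetical names introduced for illustration; the application does not define a data structure for this step.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PublishedData:
    provider_id: str
    comprehensive_value: float     # current score held by the data management module
    allowed_nodes: FrozenSet[str]  # nodes permitted to see this data (assumed form)


def list_shareable(catalog: List[PublishedData], initiator_id: str) -> List[PublishedData]:
    """Return the data the initiator may use, highest score first."""
    visible = [d for d in catalog if initiator_id in d.allowed_nodes]
    return sorted(visible, key=lambda d: d.comprehensive_value, reverse=True)
```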
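One possible reading of the cumulative calculation and the secondary correction described above is sketched below. The equal split of each cycle's efficiency change among the providers that effectively participated, the blending parameter `beta`, and the function names are assumptions of this sketch; the application leaves the exact calculation open.

```python
from typing import Dict, Iterable, Sequence, Tuple


def accumulate_contributions(
    cycles: Iterable[Tuple[float, Sequence[str]]],
) -> Dict[str, float]:
    """Accumulate per-provider contribution over update cycles.

    cycles: one (efficiency_change, effective_participants) pair per update cycle.
    """
    totals: Dict[str, float] = {}
    for efficiency_change, participants in cycles:
        if not participants:
            continue  # nobody participated effectively in this cycle
        # Assumption: the cycle's change is shared equally among its participants.
        share = efficiency_change / len(participants)
        for provider in participants:
            totals[provider] = totals.get(provider, 0.0) + share
    return totals


def corrected_value(static_value: float, training_value: float, beta: float = 0.5) -> float:
    """Secondary correction: blend the static relative value with the dynamic training value."""
    return (1.0 - beta) * static_value + beta * training_value
```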
As an alternative embodiment, the obtaining of the application value of the data to be evaluated provided by the data provider may include:
and in the model prediction process, calculating the application value of the data to be evaluated provided by the data provider according to the number of times of model calling and the number of times of data collaborative calculation of the data provider in each calling.
In this embodiment, after the model collaboration task is finished, the model is deployed in the data collaboration network in a secret-sharing manner. The data collaboration network is a trusted computing network; that is, the application value of the data is evaluated in a privacy-preserving manner. In the model prediction application stage, a comprehensive evaluation can be performed according to the number of model calls and the number of collaborative computations performed with each data provider's data in each call, to obtain the data application value of each data provider. Meanwhile, the contribution degrees of the data providers are fed back to the data management module, which corrects the comprehensive data value of each data provider again according to the data value in the current model application, thereby balancing the model training value and the model application value of the data.
In this embodiment, the value of the data provider's data relative to the data application party is evaluated in a privacy-preserving manner, which prevents the data privacy of the data provider from being revealed during the evaluation. After the application value of the data is obtained, the relative value, the training value, and the application value of the data can be comprehensively calculated to determine the comprehensive value of the data provided by the data provider. A multidimensional quantitative evaluation of data value is adopted, no third party needs to be introduced in the data evaluation process, and a secure multi-party computing protocol is used to protect the data privacy of the data provider during the data evaluation.
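A simple way to turn the call counts described above into an application value is sketched below. The log format (one mapping per model call from provider to its number of collaborative computations) and the use of each provider's share of all computations as its application value are assumptions of this sketch; secret sharing and other privacy protections are not shown.

```python
from typing import Dict, List


def application_values(call_log: List[Dict[str, int]]) -> Dict[str, float]:
    """Compute each provider's share of collaborative computations over all model calls.

    call_log: one dict per model call, mapping provider id to the number of
              collaborative computations that provider's data took part in.
    """
    totals: Dict[str, int] = {}
    for call in call_log:
        for provider, count in call.items():
            totals[provider] = totals.get(provider, 0) + count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {provider: 0.0 for provider in totals}
    return {provider: count / grand_total for provider, count in totals.items()}
```

The number of model calls is `len(call_log)`; a deployment could additionally scale the shares by that count if absolute usage should matter.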
Optionally, determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value may include:
determining weights of the relative value, the training value and the application value respectively; and according to the weight, carrying out weighted summation processing on the relative value, the training value and the application value to obtain the comprehensive value of the data to be evaluated.
In this embodiment, the weights of the relative value, the training value, and the application value may be determined according to the requirements placed on the data to be evaluated in the different aspects of the data evaluation, and the comprehensive value of the data to be evaluated may be calculated by weighted summation. For example, in the data value evaluation process, if the data provided by the provider is required to meet a high model training requirement, a larger training value weight can be set; if the requirement on the data in the application stage is low, a smaller application value weight can be set. The comprehensive value of the data provided by the data provider is then calculated according to the weight of each value.
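The weighted summation of the three values can be sketched as follows. The class `ValueWeights`, the function `comprehensive_value`, and the normalization by the sum of the weights are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueWeights:
    relative: float
    training: float
    application: float


def comprehensive_value(
    relative: float,
    training: float,
    application: float,
    weights: ValueWeights,
) -> float:
    """Weighted sum of the relative, training, and application values."""
    total = weights.relative + weights.training + weights.application
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = (
        weights.relative * relative
        + weights.training * training
        + weights.application * application
    )
    # Normalizing keeps the result on the same scale as the three inputs.
    return weighted / total
```

For instance, a task with a demanding training requirement and a low application-stage requirement might use `ValueWeights(relative=0.3, training=0.5, application=0.2)`.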
Optionally, after determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value, the method further includes: calculating the incentive of the data provider in the data collaborative task according to the comprehensive value of the data to be evaluated; sending the incentive to the data provider.
After the final data value is obtained by comprehensive calculation from the relative value obtained in the data evaluation stage, the dynamic training value obtained in the model training stage, and the application value obtained in the model prediction application stage, the value incentive module can calculate the incentive of each data provider in the current collaborative task according to the agreed protocol, and the collaborative modeling initiator pays and settles with each data provider.
Optionally, after obtaining the relative value, the training value and the application value of the data to be evaluated provided by the data provider, the method further includes: sending the relative value, the training value and the application value of the data to be evaluated to a trusted data traceability audit module, and recording them by the trusted data traceability audit module.
In each interaction update cycle of the data evaluation, training, and prediction application stages, the trusted data traceability audit module records the data information of the effective participants, the interactions among the parties, the evaluation of the collaborative task efficiency, and the final model data calls, so that these records can be used for later auditing and for ensuring data authenticity.
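The incentive rule itself is left to the agreed protocol. As one hypothetical example, a fixed budget could be divided in proportion to each provider's comprehensive value; the function name `allocate_incentives` and the proportional rule are assumptions of this sketch.

```python
from typing import Dict


def allocate_incentives(budget: float, values: Dict[str, float]) -> Dict[str, float]:
    """Split a settlement budget among providers in proportion to comprehensive value."""
    if budget < 0:
        raise ValueError("budget must be non-negative")
    total = sum(values.values())
    if total <= 0:
        # No measurable value in this task: nothing is distributed.
        return {provider: 0.0 for provider in values}
    return {provider: budget * value / total for provider, value in values.items()}
```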
The following describes, through a specific embodiment, a data collaboration network formed in a consortium chain manner by the nodes participating in federated learning, wherein each data provider and each model owner serves as an independent node of the data collaboration network; the functional architecture of each node is shown in fig. 3.
The interface layer corresponds to the local node and the agent layer corresponds to the agent node. The interface layer is separated from the enterprise entity's internal middle platform layer by an intranet firewall, and the interface layer (namely, the data provider) is separated from the agent layer (namely, the data evaluator) by an extranet firewall. The middle platform layer comprises a service middle platform and a data middle platform. The service middle platform comprises a plurality of modules such as a user center, an entertainment center, a charging center, and communication open; the data middle platform comprises an asset directory, data algorithms, AI models, data queries, and the like.
The interface layer includes a federated learning shared data management module and a federated learning shared data release module. The federated learning shared data management module performs the preparation work before publication: extracting data from the local middle platform according to a unified format and field dimensions, and describing, evaluating, updating, and maintaining the overall situation of the data, including but not limited to: a general introduction to the data, application field, data labels, total data size, data integrity, data acquisition mode and sampling method, data timeliness, basic statistical characteristics of the data, key field definitions, and the like.
The federated learning shared data release module issues a signaling message to the data collaboration network according to the protocol requirements and a specific format. The signaling message comprises data information and corresponding authentication information. The data information may include, but is not limited to, data tags, a general introduction, application domains, field definitions, sample data, CRC information, etc., but does not include any actual information of the data or statistical information of any field (such as data volume, variance, average value, etc.). The authentication information controls which nodes are allowed to access which data in what form, including (but not limited to) a list or level of users allowed to access the information, an interaction protocol for secure multi-party computing, the fields allowed to be accessed, etc.
The agent layer includes a multi-party collaborative computing data management module, a data relative value evaluation module, a trusted data traceability audit module, a data model training value evaluation module, and a data application value and incentive module, wherein:
The data relative value evaluation module:
is used for judging the value of the data provided by the data provider relative to the data evaluator, and may comprise sub-functional modules such as an evaluation method library and an evaluation calculation sub-module;
the evaluation method library sub-module can be used for defining the methods or calculation formulas for evaluating, comparing, and calculating different types of data and information; the evaluation calculation sub-module can be used for the specific quantification of evaluations in different dimensions (including but not limited to subjective scoring or objective calculation) and for the aggregation of multi-dimensional evaluation results (such as weighted summation).
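The two parts of the signaling message and the access check they imply can be sketched as follows. All class names, field names, and the function `accessible_fields` are hypothetical; the actual message format is defined by the protocol of the data collaboration network and is not given in this application.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass
class DataInfo:
    tags: List[str]
    overview: str
    application_domain: str
    field_definitions: Dict[str, str]  # field name -> description only, no statistics


@dataclass
class AuthInfo:
    allowed_nodes: Set[str]   # nodes permitted to access the published data
    allowed_fields: Set[str]  # fields those nodes may use
    mpc_protocol: str         # identifier of the agreed secure multi-party protocol


@dataclass
class PublishMessage:
    provider_id: str
    data: DataInfo
    auth: AuthInfo


def accessible_fields(msg: PublishMessage, node_id: str, requested: Iterable[str]) -> Set[str]:
    """Return the requested fields that the given node is allowed to access."""
    if node_id not in msg.auth.allowed_nodes:
        return set()
    return set(requested) & msg.auth.allowed_fields
```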
The multi-party collaborative computing data management module:
is used for accepting and initiating collaborative tasks, matching and screening data providers, reaching and maintaining collaboration agreements, and the like; the sub-modules it may include (but is not limited to) are: collaborative task management, data provider management, protocol management, and the like.
The collaborative task management sub-module is used for receiving collaborative task requests and for issuing and receiving a collaborative task abstract (Co-task Profile), which can include (but is not limited to) the application field, total data, statistical characteristics, key fields, collaborative task life cycle management, and the like;
the data provider management sub-module is used for sending data collaboration requests to data providers, receiving the collaborative task matching degree data and the willingness to participate fed back by each data owner, comprehensively selecting participants, recording the process data of the participants' participation in collaborative tasks, and performing life cycle management of the participants;
the protocol management sub-module is used for generating and managing the protocol scheme negotiated for the collaborative task by the collaborative task initiator and each data provider; the negotiation content can include (but is not limited to) the data collaboration interaction mode of the two parties, the security encryption mechanism, the interaction frequency, the interaction termination conditions, and the settlement prices and strategies of the two parties, so that a data sharing and collaboration protocol jointly observed by the participants is formed; the sub-module is responsible for protocol generation, maintenance management, and life cycle management, and can generate a typical protocol library.
The data model training value evaluation module:
is used for dynamically tracking and evaluating, during model training, the execution of the collaborative task with the participation of all parties; it estimates the change that each party (which may include the data providers and the collaborative task initiator) brings to the collaborative task efficiency and that party's contribution degree, and feeds the estimates back to the multi-party collaborative computing data management module, where they are combined with the data values obtained in the other two stages to obtain the comprehensive data value.
The data application value and incentive module:
is used for performing incentive distribution and for collecting and managing the related data records, based on the comprehensive data value obtained from the relative value of the data, the dynamic contribution value in the model training process, and the application value of the data in the model, in combination with data such as the participation process records and life cycles of the collaborative task participants, and with reference to the agreement between the initiator and each participant; it may include, but is not limited to, sub-modules for value data aggregation, incentive generation and execution, payment settlement, data management, and the like.
The trusted data traceability audit module:
establishes a trusted data storage network in a consortium chain manner and stores on chain the key information on the relative value of the data in the data evaluation stage, the dynamic contribution value in the model training stage, and the long-term application value in the model application stage, for use in later auditing and in ensuring data authenticity.
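The tamper-evidence property that makes such records useful for auditing can be illustrated with a hash-chained, append-only log. This single-process sketch is only a stand-in: it is not a consortium chain, has no consensus or multi-node replication, and the class name `AuditLog` and the record layout are assumptions.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder hash preceding the first record


class AuditLog:
    """Append-only log in which each record commits to the previous record's hash."""

    def __init__(self) -> None:
        self.records: List[Dict[str, Any]] = []

    @staticmethod
    def _digest(prev_hash: str, payload: Dict[str, Any]) -> str:
        body = json.dumps(payload, sort_keys=True)
        return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

    def append(self, payload: Dict[str, Any]) -> str:
        prev_hash = self.records[-1]["hash"] if self.records else GENESIS
        record_hash = self._digest(prev_hash, payload)
        self.records.append({"prev": prev_hash, "payload": payload, "hash": record_hash})
        return record_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited payload or broken link fails the check."""
        prev_hash = GENESIS
        for record in self.records:
            if record["prev"] != prev_hash:
                return False
            if self._digest(prev_hash, record["payload"]) != record["hash"]:
                return False
            prev_hash = record["hash"]
        return True
```

Each stage could append one record, for example `{"stage": "evaluation", "provider": "A", "relative_value": 0.7}`, and an auditor would later call `verify()` before relying on the recorded values.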
According to the embodiment of the invention, multi-dimensional quantitative value evaluation is performed based on the relative value of the data in the data evaluation stage, the dynamic contribution value of the data to model training in the model training stage, and the application value of the data in the model application stage, so that data value is not treated as absolute or static and the data value evaluation result is more accurate. In addition, no third party needs to be introduced in the data evaluation process, and the data privacy of the data provider is protected.
In addition, the evaluators assess the value of the data provider's data relative to themselves, which avoids the tendency to treat data value as absolute; combining multiple evaluators' assessments of the same data avoids unreasonable scoring by a single evaluator; evaluating the value of the data provider's data relative to the data application party in a privacy-preserving manner prevents the data privacy of the data provider from being leaked during the evaluation; and evaluating the value of the data based on its contribution degree during model training avoids the tendency to treat data value as static, further ensuring the accuracy of the data value evaluation result.
As shown in fig. 4, an embodiment of the present invention provides an apparatus 400 for evaluating a data composite value, including:
the first obtaining module 410 is configured to obtain a relative value, a training value, and an application value of data to be evaluated provided by a data provider in a federated learning process;
and the value evaluation module 420 is configured to determine the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
Optionally, the first obtaining module 410 includes:
the data relative value evaluation module is used for acquiring the relative value of the data to be evaluated relative to the data evaluation node in the data evaluation process of the federated learning;
the data model training value evaluation module is used for acquiring the training value of the data to be evaluated for model training in the model training process of the federated learning;
and the data application value evaluation module is used for acquiring the application value of the data to be evaluated in the model prediction process of the federated learning.
Optionally, the apparatus further comprises:
the retrieval module is used for retrieving data and acquiring data to be evaluated provided by the data provider;
the value evaluation module is used for performing data evaluation on the data to be evaluated;
the training module is used for performing data collaborative model training by using the data to be evaluated to obtain a model under the condition that the evaluation result of the data to be evaluated meets a preset condition;
and the model prediction module is used for calling the model to perform model prediction.
Optionally, the data relative value evaluation module is specifically configured to:
in the data evaluation process, carrying out data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node;
and calculating the comprehensive relative value of the data to be evaluated according to the relative value evaluation results of all the data evaluators on the data to be evaluated.
Optionally, the apparatus further comprises:
the data negotiation module is used for carrying out data evaluation negotiation with the data provider; the contents of the data evaluation negotiation include: at least one of evaluation content, an evaluation field, and a security algorithm;
the second acquisition module is used for acquiring a shared data security assessment protocol according to the data assessment negotiation result;
the data relative value evaluation module is specifically used for:
and according to the shared data security assessment protocol, performing data quality assessment on the data to be assessed to obtain a relative value assessment result of the data to be assessed relative to a data assessment node.
Optionally, the data model training value evaluation module is specifically configured to:
calculating the contribution degree of data to be evaluated provided by a data provider to the model training in the process of performing data collaborative model training between the data modeling initiator and the data provider;
and calculating the training value of the data to be evaluated for model training according to the contribution degree.
Optionally, the apparatus further comprises: the multi-party collaborative computing management module is specifically used for:
receiving a data cooperation request sent by a data modeling initiator;
sending the data to be evaluated allowed to be shared to the data modeling initiator according to the data cooperation request;
receiving a notification message that the data modeling initiator selects the data to be evaluated;
and establishing a data cooperative connection channel between the data modeling initiator and the data provider according to the notification message.
Optionally, the data application value evaluation module is specifically configured to:
and in the model prediction process, calculating the application value of the data to be evaluated provided by the data provider according to the number of times of model calling and the number of times of data collaborative calculation of the data provider in each calling.
Optionally, the value assessment module comprises:
a determining unit for determining respective weights of the relative value, the training value and the application value;
and the comprehensive value evaluation unit is used for carrying out weighting summation processing on the relative value, the training value and the application value according to the weight to obtain the comprehensive value of the data to be evaluated.
Optionally, the apparatus further comprises:
the incentive module is used for calculating the incentive of the data provider in the data collaborative task according to the comprehensive value of the data to be evaluated;
and sending the incentive to the data provider.
Optionally, the apparatus further comprises:
and the first sending module is used for sending the relative value, the training value and the application value of the data to be evaluated to the trusted data traceability audit module, and the trusted data traceability audit module records the relative value, the training value and the application value of the data to be evaluated.
It should be noted that the apparatus for evaluating a comprehensive value of data provided in this embodiment of the present invention can implement all the method steps of the method embodiment described above and can achieve the same technical effects; detailed descriptions of the parts and beneficial effects that are the same as in the method embodiment are omitted here.
As shown in fig. 5, an electronic device 500 of an embodiment of the invention includes a processor 510 and a transceiver 520, wherein,
the processor 510 is configured to: in the federated learning process, obtain the relative value, the training value and the application value of data to be evaluated provided by a data provider;
and determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
Optionally, the processor 510 obtains the relative value, the training value, and the application value of the data to be evaluated provided by the data provider in the federated learning process, including:
in the data evaluation process of the federated learning, the relative value of the data to be evaluated relative to a data evaluation node is obtained;
in the model training process of the federated learning, the training value of the data to be evaluated for model training is obtained;
and acquiring the application value of the data to be evaluated in the model prediction process of the federated learning.
Optionally, the processor 510 performs the following steps in a federated learning procedure:
performing data retrieval to acquire data to be evaluated provided by the data provider;
performing data evaluation on the data to be evaluated;
under the condition that the evaluation result of the data to be evaluated meets a preset condition, performing data collaborative model training by using the data to be evaluated to obtain a model;
and calling the model to perform model prediction.
Optionally, the processor 510 obtains the relative value of the data to be evaluated provided by the data provider, including:
in the data evaluation process, carrying out data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node;
and calculating the comprehensive relative value of the data to be evaluated according to the relative value evaluation results of all data evaluators on the data to be evaluated.
Optionally, before performing data quality evaluation on the data to be evaluated, the processor 510 is further configured to:
performing data evaluation negotiation with the data provider; the contents of the data evaluation negotiation include: at least one of evaluation content, an evaluation field, and a security algorithm;
obtaining a shared data security evaluation protocol according to the result of the data evaluation negotiation;
the processor is further configured to: and according to the shared data security assessment protocol, performing data quality assessment on the data to be assessed to obtain a relative value assessment result of the data to be assessed relative to a data assessment node.
Optionally, the processor 510 obtains a training value of data to be evaluated provided by a data provider, including:
calculating the contribution degree of data to be evaluated provided by a data provider to the model training in the process of performing data collaborative model training between the data modeling initiator and the data provider;
and calculating the training value of the data to be evaluated for model training according to the contribution degree.
Optionally, in the process of performing data collaborative model training between the data modeling initiator and the data provider, before calculating a degree of contribution of data to be evaluated provided by the data provider to the model training, the transceiver 520 is configured to:
receiving a data cooperation request sent by a data modeling initiator;
sending the data to be evaluated allowed to be shared to the data modeling initiator according to the data cooperation request;
receiving a notification message that the data modeling initiator selects the data to be evaluated;
the processor 510 is configured to: and establishing a data cooperative connection channel between the data modeling initiator and the data provider according to the notification message.
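The handshake described above (cooperation request, reply with shareable data, selection notice, channel setup) can be sketched from the provider's side as below. All class, method, and field names are hypothetical, and the shareable data is modelled simply as lists of field names, which is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ProviderEndpoint:
    """Provider-side handling of the data cooperation handshake (illustrative only)."""

    shareable: Dict[str, List[str]]  # dataset id -> fields allowed to be shared
    channels: Set[Tuple[str, str]] = field(default_factory=set)

    def on_cooperation_request(self, initiator_id: str) -> Dict[str, List[str]]:
        # Receive the cooperation request and reply with the data allowed to be shared.
        return dict(self.shareable)

    def on_selection_notice(self, initiator_id: str, dataset_id: str) -> Tuple[str, str]:
        # The initiator has selected a dataset; establish a channel for it.
        if dataset_id not in self.shareable:
            raise KeyError(f"dataset {dataset_id!r} was never offered")
        channel = (initiator_id, dataset_id)
        self.channels.add(channel)
        return channel
```

In a real deployment the channel would be an authenticated network connection rather than a tuple; the tuple only marks which initiator and dataset are paired.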
Optionally, the processor 510 obtains an application value of data to be evaluated provided by a data provider, including:
in the model prediction process, calculating the application value of the data to be evaluated provided by the data provider according to the number of calls to the model and the number of data collaborative computations performed by the data provider in each call.
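A minimal sketch of this calculation, assuming the application value grows linearly with the provider's collaborative computations summed over all model calls. The linear form and the `unit_value` parameter are assumptions, not taken from this document.

```python
from typing import Sequence


def application_value(computations_per_call: Sequence[int], unit_value: float = 1.0) -> float:
    """Application value of one provider's data during model prediction.

    computations_per_call[i] is the number of collaborative computations the
    provider performed in the i-th model call (0 if it was not involved), so
    the length of the sequence is the number of model calls.
    """
    if any(count < 0 for count in computations_per_call):
        raise ValueError("computation counts must be non-negative")
    # Assumption: every collaborative computation is worth the same unit_value.
    return unit_value * sum(computations_per_call)
```

For example, a provider that took part in two of three calls, with 2 and 3 computations respectively, would be passed as `[2, 0, 3]`.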
Optionally, the processor 510 determines the comprehensive value of the data to be evaluated according to the relative value, the training value, and the application value, which specifically includes:
determining weights for the relative value, the training value, and the application value, respectively;
and according to the weight, carrying out weighted summation processing on the relative value, the training value and the application value to obtain the comprehensive value of the data to be evaluated.
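The weighted summation can be written as V = w_r * V_relative + w_t * V_training + w_a * V_application. The sketch below assumes non-negative weights that sum to 1 and equal default weights; neither constraint is stated in this document.

```python
from typing import Tuple


def comprehensive_value(
    relative: float,
    training: float,
    application: float,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted sum of the relative, training, and application values."""
    w_relative, w_training, w_application = weights
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        # Assumption: weights form a convex combination.
        raise ValueError("weights must be non-negative and sum to 1")
    return w_relative * relative + w_training * training + w_application * application
```

With weights (0.2, 0.5, 0.3) and values 0.7, 0.5, and 0.9, the result is 0.14 + 0.25 + 0.27 = 0.66, up to floating-point rounding.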
Optionally, the processor 510 is further configured to: calculating the incentive of the data provider in the data collaborative task according to the comprehensive value of the data to be evaluated;
the transceiver 520 is configured to: send the incentive to the data provider.
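As an illustration of how an incentive might follow from the comprehensive value, the sketch below splits a fixed reward pool among providers in proportion to their comprehensive values. The proportional rule and the `reward_pool` parameter are assumptions; the document does not define the incentive formula here.

```python
from typing import Dict


def allocate_incentives(comprehensive_values: Dict[str, float], reward_pool: float) -> Dict[str, float]:
    """Split reward_pool among providers in proportion to their comprehensive value."""
    if any(value < 0 for value in comprehensive_values.values()):
        raise ValueError("comprehensive values must be non-negative")
    total = sum(comprehensive_values.values())
    if total == 0:
        return {provider: 0.0 for provider in comprehensive_values}
    return {
        provider: reward_pool * value / total
        for provider, value in comprehensive_values.items()
    }
```

Each entry of the returned mapping is the amount that would then be sent to the corresponding data provider.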
Optionally, the transceiver 520 is further configured to: and sending the relative value, the training value and the application value of the data to be evaluated to a credible data traceability auditing module, and recording by the credible data traceability auditing module.
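How the auditing module records the values is not detailed in this passage. One possible sketch, assuming an append-only log in which each record carries a SHA-256 hash of its content and of the previous record so that later tampering is detectable, is shown below. The class and field names are hypothetical.

```python
import hashlib
import json
from typing import Dict, List


def _digest(payload: Dict) -> str:
    """SHA-256 over a canonical JSON encoding of the payload."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only, hash-chained record of value evaluations (illustrative only)."""

    def __init__(self) -> None:
        self.entries: List[Dict] = []

    def record(self, provider_id: str, relative: float, training: float, application: float) -> Dict:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = {
            "provider": provider_id,
            "relative": relative,
            "training": training,
            "application": application,
            "prev": prev,  # links this record to the one before it
        }
        entry = {**payload, "hash": _digest(payload)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check the chain links."""
        prev = "0" * 64
        for entry in self.entries:
            payload = {k: v for k, v in entry.items() if k != "hash"}
            if payload["prev"] != prev or _digest(payload) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

Changing any stored value after the fact makes `verify()` return `False`, which is the traceability property the sketch is meant to show.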
It should be noted that the electronic device provided in this embodiment of the present invention can implement all the method steps of the above method embodiment and achieve the same technical effect. The parts and beneficial effects that are the same as those of the method embodiment are not described again here.
An electronic device according to another embodiment of the present invention, as shown in fig. 6, includes a transceiver 610, a processor 600, a memory 620, and a program or instructions stored in the memory 620 and executable on the processor 600; the processor 600 implements the above-described method for evaluating the comprehensive value of data when executing the program or instructions.
The transceiver 610 is used for receiving and transmitting data under the control of the processor 600.
In fig. 6, the bus architecture may include any number of interconnected buses and bridges that link together various circuits, in particular one or more processors represented by the processor 600 and memory represented by the memory 620. The bus architecture may also link together various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. The bus interface provides an interface. The transceiver 610 may comprise a number of elements, including a transmitter and a receiver, that provide a means for communicating with various other apparatus over a transmission medium. The processor 600 is responsible for managing the bus architecture and general processing, and the memory 620 may store data used by the processor 600 in performing operations.
The readable storage medium of the embodiment of the present invention stores a program or an instruction thereon, and the program or the instruction, when executed by the processor, implements the steps in the above-described method for evaluating a comprehensive value of data, and can achieve the same technical effect, and is not described herein again to avoid repetition.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
It is further noted that the electronic devices described in this specification include, but are not limited to, smart phones, tablets, etc., and that many of the functional components described are referred to as modules in order to more particularly emphasize their implementation independence.
In embodiments of the present invention, modules may be implemented in software for execution by various types of processors. An identified module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be constructed as an object, procedure, or function. Nevertheless, the executables of an identified module need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the module and achieve the stated purpose for the module.
Indeed, a module of executable code may be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Likewise, operational data may be identified within the modules and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices, and may exist, at least partially, merely as electronic signals on a system or network.
Where a module can be implemented in software, then, given the level of existing hardware technology and leaving cost aside, a person skilled in the art could also build a corresponding hardware circuit to implement the same function. Such a hardware circuit may include conventional Very Large Scale Integration (VLSI) circuits or gate arrays, and existing semiconductors such as logic chips, transistors, or other discrete components. A module may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices, or the like.
The exemplary embodiments above are described with reference to the drawings. Many different forms and embodiments of the invention are possible without departing from its spirit and teaching; therefore, the invention is not to be construed as limited to the exemplary embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. In the drawings, the size and relative sizes of elements may be exaggerated for clarity. The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Unless otherwise indicated, a range of values, when stated, includes the upper and lower limits of the range, and any subranges therebetween.
While the foregoing is directed to the preferred embodiment of the present invention, it will be appreciated by those skilled in the art that various changes and modifications may be made therein without departing from the principles of the invention as set forth in the appended claims.

Claims (11)

1. A method for evaluating the comprehensive value of data is characterized by comprising the following steps:
in a federated learning process, obtaining the relative value, the training value, and the application value of data to be evaluated provided by a data provider;
and determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
2. The method of claim 1, wherein obtaining the relative value of the data to be evaluated provided by the data provider comprises:
in the data evaluation process, carrying out data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node;
and calculating the comprehensive relative value of the data to be evaluated according to the relative value evaluation results of all the data evaluators on the data to be evaluated.
3. The method of claim 2, wherein before the evaluating data quality of the data to be evaluated, the method further comprises:
performing data evaluation negotiation with the data provider; the contents of the data evaluation negotiation include at least one of: evaluation content, evaluation fields, and a security algorithm;
obtaining a shared data security evaluation protocol according to the result of the data evaluation negotiation;
in the data evaluation process, the data quality evaluation is performed on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to the data evaluation node, and the method comprises the following steps:
and according to the shared data security evaluation protocol, performing data quality evaluation on the data to be evaluated to obtain a relative value evaluation result of the data to be evaluated relative to a data evaluation node.
4. The method of claim 1, wherein obtaining a training value for data to be evaluated provided by a data provider comprises:
in the process of training a data collaborative model between a data modeling initiator and a data provider, calculating the contribution degree of data to be evaluated provided by the data provider to the model training;
and calculating the training value of the data to be evaluated for model training according to the contribution degree.
5. The method according to claim 4, wherein during the data collaborative model training between the data modeling initiator and the data provider, before calculating the contribution degree of the data to be evaluated provided by the data provider to the model training, the method further comprises:
receiving a data cooperation request sent by a data modeling initiator;
sending the data to be evaluated allowed to be shared to the data modeling initiator according to the data cooperation request;
receiving a notification message that the data modeling initiator selects the data to be evaluated;
and establishing a data collaborative connection channel between the data modeling initiator and the data provider according to the notification message.
6. The method of claim 1, wherein obtaining the application value of the data to be evaluated provided by the data provider comprises:
in the model prediction process, calculating the application value of the data to be evaluated provided by the data provider according to the number of calls to the model and the number of data collaborative computations performed by the data provider in each call.
7. The method of claim 1, wherein determining the comprehensive value of the data to be evaluated based on the relative value, the training value, and the application value comprises:
determining weights for the relative value, the training value, and the application value, respectively;
and according to the weight, carrying out weighted summation processing on the relative value, the training value and the application value to obtain the comprehensive value of the data to be evaluated.
8. An apparatus for evaluating the comprehensive value of data, comprising:
a first acquisition module, configured to obtain, in a federated learning process, the relative value, the training value, and the application value of data to be evaluated provided by a data provider;
and the value evaluation module is used for determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
9. An electronic device, comprising: a transceiver and a processor;
the processor is configured to: in a federated learning process, obtain the relative value, the training value, and the application value of data to be evaluated provided by a data provider;
and determining the comprehensive value of the data to be evaluated according to the relative value, the training value and the application value.
10. An electronic device, comprising: a transceiver, a processor, a memory, and a program or instructions stored on the memory and executable on the processor; wherein the processor, when executing the program or instructions, implements the method for evaluating the comprehensive value of data according to any one of claims 1 to 7.
11. A readable storage medium on which a program or instructions are stored, the program or instructions, when executed by a processor, implementing the steps in the method for evaluating the comprehensive value of data according to any one of claims 1 to 7.
CN202110719131.5A 2021-06-28 2021-06-28 Method and device for evaluating data comprehensive value and electronic equipment Pending CN115600476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110719131.5A CN115600476A (en) 2021-06-28 2021-06-28 Method and device for evaluating data comprehensive value and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110719131.5A CN115600476A (en) 2021-06-28 2021-06-28 Method and device for evaluating data comprehensive value and electronic equipment

Publications (1)

Publication Number Publication Date
CN115600476A true CN115600476A (en) 2023-01-13

Family

ID=84840356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110719131.5A Pending CN115600476A (en) 2021-06-28 2021-06-28 Method and device for evaluating data comprehensive value and electronic equipment

Country Status (1)

Country Link
CN (1) CN115600476A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination