CN114970670A - Model fairness assessment method and device - Google Patents

Model fairness assessment method and device

Info

Publication number
CN114970670A
CN114970670A
Authority
CN
China
Prior art keywords
sample
evaluation
model
evaluated
fairness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210379396.XA
Other languages
Chinese (zh)
Inventor
李进锋
刘翔宇
张�荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202210379396.XA priority Critical patent/CN114970670A/en
Publication of CN114970670A publication Critical patent/CN114970670A/en
Priority to PCT/CN2023/086570 priority patent/WO2023197927A1/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of this specification provide a model fairness evaluation method and apparatus. The model fairness evaluation method includes: determining a true data probability distribution of picture and/or text training samples according to a picture and/or text training model; determining a credibility detection result of a sample to be evaluated according to the true data probability distribution and a generative adversarial network model; when the credibility detection result meets an untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample; and performing fairness evaluation on a model to be evaluated according to the sample to be evaluated and the updated evaluation sample. The method can be applied to algorithm governance.

Description

Model fairness assessment method and device
Technical Field
The embodiments of this specification relate to the field of computer technology, and in particular to two model fairness evaluation methods.
Background
With continuous breakthroughs in the basic theory and technology of artificial intelligence, image-text algorithms built on artificial intelligence have been widely applied in public fields such as finance, education, healthcare, and security, giving rise to a series of intelligent applications such as intelligent security, intelligent customer service, medical consultation, and personalized recommendation. These applications have greatly enriched and facilitated people's daily lives and have driven social, economic, and technological development.
However, unfairness and even discrimination in the automated decision-making of artificial intelligence has become a subject of social dispute, raising concerns and doubts about algorithmic decision-making and gradually drawing wide attention from society and the public. Some jurisdictions have successively issued laws and regulations on algorithm fairness, explicitly requiring that the development and application of artificial intelligence algorithms satisfy fairness constraints. Therefore, whether to meet regulatory compliance or to improve user experience, performing algorithm fairness evaluation and eliminating algorithmic bias is an essential step in the whole life cycle of an algorithm.
Disclosure of Invention
In view of this, the embodiments of this specification provide two model fairness evaluation methods. One or more embodiments of this specification also relate to two model fairness evaluation apparatuses, a computing device, a computer-readable storage medium, and a computer program, so as to address technical deficiencies in the prior art.
According to a first aspect of the embodiments of this specification, there is provided a model fairness evaluation method, including:
determining a true data probability distribution of picture and/or text training samples according to a picture and/or text training model;
determining a credibility detection result of a sample to be evaluated according to the true data probability distribution and a generative adversarial network model;
when the credibility detection result meets an untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
and performing fairness evaluation on a model to be evaluated according to the sample to be evaluated and the updated evaluation sample.
According to a second aspect of the embodiments of this specification, there is provided a model fairness evaluation apparatus, including:
a probability distribution determination module configured to determine a true data probability distribution of picture and/or text training samples according to a picture and/or text training model;
a detection result determination module configured to determine a credibility detection result of a sample to be evaluated according to the true data probability distribution and a generative adversarial network model;
a sample processing module configured to, when the credibility detection result meets an untrusted condition, perform sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
and an evaluation module configured to perform fairness evaluation on a model to be evaluated according to the sample to be evaluated and the updated evaluation sample.
According to a third aspect of the embodiments of this specification, there is provided a model fairness evaluation method applied to a model fairness evaluation platform, including:
determining a true data probability distribution of picture and/or text training samples according to a picture and/or text training model;
receiving a sample to be evaluated and a model to be evaluated sent by a user;
determining a credibility detection result of the sample to be evaluated according to the true data probability distribution and a generative adversarial network model;
when the credibility detection result meets an untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
performing fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample;
and obtaining a fairness evaluation result of the model to be evaluated and returning the fairness evaluation result to the user.
According to a fourth aspect of the embodiments of this specification, there is provided a model fairness evaluation apparatus applied to a model fairness evaluation platform, including:
a first determination module configured to determine a true data probability distribution of picture and/or text training samples according to a picture and/or text training model;
a data receiving module configured to receive a sample to be evaluated and a model to be evaluated sent by a user;
a second determination module configured to determine a credibility detection result of the sample to be evaluated according to the true data probability distribution and a generative adversarial network model;
a sample updating module configured to, when the credibility detection result meets an untrusted condition, perform sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
a fairness evaluation module configured to perform fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample;
and a result display module configured to obtain a fairness evaluation result of the model to be evaluated and return the fairness evaluation result to the user.
According to a fifth aspect of embodiments herein, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions which, when executed by the processor, implement the steps of the model fairness evaluation method described above.
According to a sixth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the model fairness assessment method described above.
According to a seventh aspect of the embodiments of this specification, there is provided a computer program which, when executed in a computer, causes the computer to perform the steps of the model fairness evaluation method described above.
One embodiment of this specification implements two model fairness evaluation methods and apparatuses. One model fairness evaluation method includes: determining a true data probability distribution of picture and/or text training samples according to a picture and/or text training model; determining a credibility detection result of a sample to be evaluated according to the true data probability distribution and a generative adversarial network model; when the credibility detection result meets an untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample; and performing fairness evaluation on a model to be evaluated according to the sample to be evaluated and the updated evaluation sample.
Specifically, the model fairness evaluation method models the true data probability distribution of the image-text training samples through the image-text training model and performs credibility detection on the sample to be evaluated according to that distribution; untrusted samples are then processed to obtain updated evaluation samples. This improves the reliability and completeness of the evaluation samples in an untrusted environment, ensures the robustness of the fairness evaluation method in both trusted and untrusted environments and the availability of its evaluation results, and thereby ensures the accuracy of the fairness evaluation of the model to be evaluated, so that the model performs well in subsequent practical applications, whether from the perspective of meeting regulatory compliance or of improving user experience, and can be applied to algorithm governance.
Drawings
Fig. 1 is a schematic diagram illustrating a specific scenario of a model fairness assessment method according to an embodiment of the present disclosure;
FIG. 2 is a flow chart of a model fairness evaluation method according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a process of a model fairness evaluation method according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a model fairness evaluation apparatus according to an embodiment of the present disclosure;
FIG. 5 is a flowchart of another model fairness evaluation method provided by an embodiment of this specification;
FIG. 6 is a schematic structural diagram of another model fairness evaluation apparatus provided in an embodiment of this specification;
fig. 7 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of this specification. This specification may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; those skilled in the art may make similar variations without departing from the spirit and scope of this specification.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of one or more embodiments of this specification, "first" may also be referred to as "second", and similarly, "second" may also be referred to as "first". Depending on the context, the word "if" as used herein may be interpreted as "upon", "when", or "in response to determining".
First, the terms used in one or more embodiments of this specification are explained.
Image-text algorithm: a general term for image algorithms and text algorithms, specifically including algorithms that take images as input, such as image classification, face recognition, object detection, and image retrieval, and algorithms that take text as input, such as text classification, sentiment analysis, machine translation, and dialogue generation.
Algorithm fairness: the automated decision-making of an artificial intelligence algorithm is independent of protected sensitive attributes (natural and social attributes) such as ethnicity, belief, and region; that is, with respect to the protected sensitive attributes, the algorithm's decisions show no bias or preference toward individuals or groups on account of innate or acquired attributes.
OOD data: out-of-distribution data, i.e., sample data drawn from a distribution different from that of the algorithm model's training data. If the sample data distribution is the same as the training data distribution, the data is called ID data, i.e., in-distribution data.
Robustness: the ability of a computer system to keep operating normally and maintain stable performance when certain parameters (structure, size) change, errors occur during execution, or the algorithm encounters anomalies in input or operation.
With continuous breakthroughs in the basic theory and technology of artificial intelligence, image-text algorithms built on artificial intelligence have been widely applied in public fields such as finance, education, healthcare, and security, giving rise to a series of intelligent applications such as intelligent security, intelligent customer service, medical consultation, and personalized recommendation. These applications have greatly enriched and facilitated people's daily lives and have driven social, economic, and technological development.
However, as application scenarios multiply, the legal and ethical problems and risks faced by artificial intelligence algorithms are becoming increasingly prominent. For example, a crime risk assessment algorithm may systematically discriminate against people of a certain ethnicity. Unfairness and even discrimination in the automated decision-making of artificial intelligence has become a subject of social dispute, raising concerns and doubts about algorithmic decision-making and gradually drawing wide attention from society and the public. Some jurisdictions have successively issued laws and regulations on algorithm fairness, explicitly requiring that the development and application of artificial intelligence algorithms satisfy fairness constraints. Therefore, whether to meet regulatory compliance or to improve user experience, performing algorithm fairness evaluation and eliminating algorithmic bias is an essential step in the whole life cycle of an algorithm.
To address the above problems, fairness evaluation systems have been provided that offer fairness evaluation capability for some artificial intelligence algorithm tasks (such as text classification and image classification). However, such a system considers only fairness evaluation for natural inputs in a trusted environment, and its quantification of fairness depends entirely on statistical indexes such as accuracy, recall, and F1-score. In an uncontrolled (untrusted) environment, these statistical indexes are strongly affected by factors such as adversarial perturbation and data selection, so the fairness evaluation result produced by the system cannot accurately reflect the true fairness level of the algorithm (model), and the validity and availability of the evaluation cannot be guaranteed.
On this basis, this specification provides two model fairness evaluation methods. One or more embodiments of this specification also relate to two model fairness evaluation apparatuses, a computing device, a computer-readable storage medium, and a computer program, which are described in detail one by one in the following embodiments.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating a specific scenario of a model fairness assessment method according to an embodiment of the present disclosure, which specifically includes the following steps.
Specifically, the model fairness assessment method provided in the embodiments of the present description is applied to a model fairness assessment platform.
Step 102: training a large-scale pre-training model (the picture and/or text training model) with self-supervised learning on large-scale image-text training samples collected in a database, and initially modeling the data probability distribution of the image-text training samples through the pre-training model; then, on the basis of the pre-training model, constructing a generative adversarial network model using deep generative techniques, and further optimizing the data probability distribution of the image-text training samples through the generative adversarial network model.
An image-text training sample can be understood as a picture and text training sample.
Step 104: receiving a fairness evaluation request sent by a user, where the fairness evaluation request carries a sample to be evaluated and a model to be evaluated. First, credibility detection is performed on the sample to be evaluated according to the data probability distribution of the image-text training samples and the generative adversarial network model. Second, based on the credibility detection result, if the sample to be evaluated includes adversarial samples, the adversarial samples are denoised and adversarially reconstructed using adversarial defense techniques; if the distribution diversity of the sample to be evaluated is determined to be weak, diversity generation is performed on the sample to be evaluated using the generative adversarial network model. Finally, the original sample to be evaluated, the adversarially reconstructed evaluation samples, and the diversity-generated samples are mixed and input, together with the model to be evaluated, into a fairness evaluation module for evaluation, yielding the fairness evaluation result of the model to be evaluated.
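By way of a non-limiting illustration, the three stages of step 104 can be chained as in the following Python sketch, in which every function name is an assumed placeholder for a component described later, not part of the claimed method:

```python
def build_evaluation_set(samples, check, reconstruct, generate_extra):
    """Return the mixed evaluation set: original samples, adversarially
    reconstructed replacements, and diversity-generated additions."""
    reports = [check(s) for s in samples]                       # credibility detection
    fixed = [reconstruct(s)                                     # adversarial reconstruction
             for s, r in zip(samples, reports) if r["untrusted"]]
    extra = generate_extra(reports)                             # empty when diversity is adequate
    return list(samples) + fixed + list(extra)
```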
Specifically, fairness evaluation of the model to be evaluated in the embodiments of this specification can be understood as computing model performance difference indexes of the model to be evaluated, such as false positive rate, statistical parity, equal opportunity, and disparate impact, across groups divided by sensitive/protected attributes (such as ethnicity, belief, and income). The fairness of the model to be evaluated can then be assessed according to these indexes.
Step 106: returning the fairness evaluation result of the model to be evaluated to the user.
The model fairness evaluation method provided by the embodiments of this specification offers a robust image-text algorithm fairness evaluation system: the data probability distribution is modeled by combining large-scale pre-training and deep generative techniques; credibility detection is performed on the sample to be evaluated according to that probability distribution; and untrusted samples, such as adversarial samples or distribution-biased samples, are respectively handled by denoising, adversarial reconstruction, and diversity generation. This improves the reliability and completeness of the evaluation samples in an untrusted environment and guarantees the robustness of the fairness evaluation system in an uncontrolled environment and the availability of its evaluation results.
Referring to fig. 2, fig. 2 shows a flowchart of a model fairness evaluation method provided in an embodiment of the present specification, which specifically includes the following steps.
Step 202: determining the true data probability distribution of the picture and/or text training samples according to the picture and/or text training model.
The pictures include, but are not limited to, pictures of any type, any size, and any content, such as pictures containing animals or people; the text includes, but is not limited to, text of any type, any length, and any content, such as academic discussions and literary articles.
The picture and/or text training model can be understood as a picture training model, a text training model, or a training model combining pictures and text; in practical application, the specific type of the picture and/or text training model may be determined according to actual requirements, which is not limited in the embodiments of this specification.
In specific implementation, to ensure the accuracy of the true data probability distribution of the picture and/or text training samples, after the picture and/or text training model is trained with large-scale picture and/or text training samples, the true data probability distribution of the training samples is modeled according to the trained model; the true data probability distribution is then optimized through the generative adversarial network model to obtain the optimized true data probability distribution. A specific implementation is as follows:
the determining of the true data probability distribution of the picture and/or text training samples according to the picture and/or text training model includes:
acquiring picture and/or text training samples;
training to obtain the picture and/or text training model using self-supervised learning according to the training samples;
obtaining the true data probability distribution of the training samples according to the picture and/or text training model;
and adjusting the true data probability distribution of the training samples according to the generative adversarial network model to obtain an adjusted true data probability distribution of the training samples.
To ensure the accuracy of the picture and/or text training model, in the embodiments of this specification large-scale picture and/or text training samples are obtained for model training. When the training samples are pictures, the picture and/or text training model can be understood as a vision Transformer model or the like; when the training samples are text, it can be understood as a BERT language model or the like; when the training samples are image-text samples, it can be understood as a multi-modal fusion model combining a vision Transformer model and a BERT language model.
Specifically, after large-scale picture and/or text training samples are obtained, the picture and/or text training model is trained on them using self-supervised learning, and the true data probability distribution of the training samples is initially modeled through that model. To further optimize this distribution, it can be adjusted according to the generative adversarial network model, yielding the adjusted true data probability distribution of the picture and/or text training samples.
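The embodiments do not fix a concrete likelihood estimator. As one hedged sketch, assuming a BERT-style masked language model for text samples, a pseudo-log-likelihood can serve as the initial estimate of a sample's probability under the training-data distribution; the model name and scoring scheme below are assumptions, not requirements of the method:

```python
# Illustrative only: pseudo-log-likelihood of a text sample under a
# BERT-style masked language model. The model choice is an assumption.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForMaskedLM.from_pretrained("bert-base-chinese").eval()

def pseudo_log_likelihood(text: str) -> float:
    """Sum of log-probabilities of each token when it is masked in turn."""
    ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):                 # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total
```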
In the model fairness evaluation method provided by the embodiments of this specification, the picture and/or text training model is first trained on large-scale picture and/or text training samples, the true data probability distribution of the training samples is initially modeled according to that model, and a generative adversarial network model is then constructed using deep generative techniques to optimize the initially modeled data probability distribution, thereby ensuring the accuracy and availability of the probability distribution.
Before the true data probability distribution of the picture and/or text training samples is adjusted according to the generative adversarial network model, the generative adversarial network model is constructed using deep generative techniques according to the picture and/or text training model, so as to ensure its subsequent availability. A specific implementation is as follows:
the adjusting of the true data probability distribution of the training samples according to the generative adversarial network model to obtain the adjusted true data probability distribution of the training samples includes:
constructing the generative adversarial network model according to the picture and/or text training model;
training the generative adversarial network model according to the training samples to obtain a discriminator module and a generator module of the trained generative adversarial network model;
and adjusting the true data probability distribution of the training samples according to the discriminator module to obtain the adjusted true data probability distribution of the training samples.
Specifically, the generative adversarial network model is constructed according to the picture and/or text training model and trained according to the picture and/or text training samples, yielding the trained discriminator module and generator module; finally, the initial probability distribution of the picture and/or text training samples is fine-tuned according to the discriminator module, yielding the fine-tuned true data probability distribution of the training samples.
In specific implementation, the construction stage of the generative adversarial network model comprises two parts: first constructing the model, then training it. Since a generative adversarial network model consists of a discriminator module and a generator module, the picture and/or text training model obtained in the above embodiment can serve as the discriminator module when constructing the model. As for the generator module, if image data is to be generated, it may be built from multiple upsampling deconvolution networks; if text data is to be generated, a Transformer may be used as the generator module. A specific implementation is as follows:
the method for constructing and generating the confrontation network model according to the picture and/or text training model comprises the following steps:
initializing module parameters of a discrimination module for generating the countermeasure network model according to the model parameters of the picture and/or text training model, and constructing the discrimination module for generating the countermeasure network model;
generating a network according to the deconvolution network and/or the text, and constructing a generation module for generating a confrontation network model;
and constructing the generation confrontation network model according to the discrimination module and the generation module.
The method comprises the steps that a picture and/or text training sample model obtained by training in the embodiment is used as a judging module for generating an confrontation network model, and the module parameters of the judging module for generating the confrontation network model are initialized according to the model parameters of the picture and/or text training model to construct the judging module for generating the confrontation network model; the generation module can select a deconvolution network or a text generation network to construct based on the type of the data to be generated; and the finally generated discrimination module and the generation module construct and generate a confrontation network model.
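A minimal PyTorch sketch of this construction step follows; the hidden size, layer sizes, and output resolution are illustrative assumptions, and only the image branch of the generator is shown:

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """Discriminator whose encoder weights are copied from the pre-trained
    picture and/or text training model (assumed to emit a 768-dim vector)."""
    def __init__(self, pretrained_encoder: nn.Module, hidden_dim: int = 768):
        super().__init__()
        self.encoder = pretrained_encoder      # pre-training-plus-fine-tuning transfer
        self.head = nn.Linear(hidden_dim, 1)   # real-vs-generated logit

    def forward(self, x):
        return self.head(self.encoder(x))

class ImageGenerator(nn.Module):
    """Generator built from multiple upsampling deconvolution layers."""
    def __init__(self, z_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.ReLU(),  # 1x1 -> 4x4
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),    # 4x4 -> 8x8
            nn.ConvTranspose2d(128, 3, 4, 2, 1), nn.Tanh(),      # 8x8 -> 16x16
        )

    def forward(self, z):                      # z: (batch, z_dim, 1, 1)
        return self.net(z)
```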
After the generative adversarial network model is constructed, it can be trained. Specifically, the generator module and the discriminator module are trained alternately under a zero-sum-game adversarial loss function, so that the data produced by the generator module comes closer to the real data distribution while the discriminator module becomes better at distinguishing real data from generated data.
Specifically, in the embodiments of this specification, the parameters of the pre-training model (i.e., the model parameters of the picture and/or text training model) are used to initialize the parameters of the discriminator (i.e., the discriminator module). The advantage is that, since the pre-training model is obtained from large-scale image-text training samples, initializing the discriminator with it transfers the knowledge the pre-training model has learned from those samples to the discriminator; this is the pre-training-plus-fine-tuning technique in deep learning.
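A hedged sketch of one alternating training step under this zero-sum objective, using the standard binary cross-entropy formulation as an assumed concrete loss:

```python
import torch
import torch.nn.functional as F

def adversarial_train_step(D, G, real, z, opt_d, opt_g):
    """One alternating step of the zero-sum adversarial game."""
    # Discriminator step: score real samples toward 1, generated toward 0.
    fake = G(z).detach()
    loss_d = (F.binary_cross_entropy_with_logits(D(real), torch.ones(real.size(0), 1))
              + F.binary_cross_entropy_with_logits(D(fake), torch.zeros(fake.size(0), 1)))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: produce samples the discriminator scores as real.
    loss_g = F.binary_cross_entropy_with_logits(D(G(z)), torch.ones(z.size(0), 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```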
The model fairness evaluation method provided by the embodiments of this specification constructs the generative adversarial network model according to the picture and/or text training model, trains it on the picture and/or text training samples, and then adjusts and optimizes the true data probability distribution of the training samples according to the trained generative adversarial network model, obtaining the adjusted true data probability distribution and thereby improving how faithfully it reflects the real data.
Step 204: determining the credibility detection result of the sample to be evaluated according to the true data probability distribution and the generative adversarial network model.
After the adjusted true data probability distribution of the picture and/or text training samples is obtained, it can be combined with the generative adversarial network model to detect the credibility of the sample to be evaluated.
Specifically, the determining of the credibility detection result of the sample to be evaluated according to the true data probability distribution and the generative adversarial network model includes:
obtaining a sample data probability distribution of the sample to be evaluated according to the picture and/or text training model;
determining, according to the sample data probability distribution, the similarity of the sample to be evaluated to the true data probability distribution of the training samples;
obtaining a sample prediction result of the sample to be evaluated according to the discriminator module of the generative adversarial network model;
and determining the credibility detection result of the sample to be evaluated according to the similarity and the sample prediction result.
For ease of understanding, the true data probability distribution in the following embodiments can be understood as the adjusted true data probability distribution of the picture and/or text training samples, and the sample prediction result of the sample to be evaluated can be understood as whether the sample to be evaluated is a generated sample or a real sample.
In specific implementation, the sample data probability distribution of the sample to be evaluated is first obtained according to the picture and/or text training model, and the similarity (i.e., log-likelihood) of the sample to be evaluated under the real data (picture and/or text training sample) distribution is calculated according to that sample data probability distribution; meanwhile, the sample prediction result of the sample to be evaluated is obtained according to the discriminator module of the generative adversarial network model; the credibility detection result of the sample to be evaluated is then determined according to the similarity and the sample prediction result.
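A per-sample sketch of this check; both thresholds are illustrative assumptions, since the embodiments leave them to be set per sample set and model task:

```python
def credibility_check(sample, log_likelihood, discriminator,
                      ll_threshold: float = -50.0, d_threshold: float = 0.5):
    """Flag a sample as untrusted on a low log-likelihood or a
    'generated' verdict from the discriminator module."""
    ll = log_likelihood(sample)
    realness = discriminator(sample)   # assumed to return P(sample is real)
    return {"log_likelihood": ll,
            "realness": realness,
            "untrusted": ll < ll_threshold or realness < d_threshold}
```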
In the model fairness evaluation method provided by the embodiments of this specification, the credibility of the sample to be evaluated is detected according to the true data probability distribution of the picture and/or text training samples and the generative adversarial network model, so as to judge whether the sample to be evaluated includes adversarial samples or whether its diversity is weak; when a credibility problem is found, the sample to be evaluated can be processed subsequently, improving the reliability and completeness of the evaluation samples in an untrusted environment.
In practical application, the credibility of the sample to be evaluated can be understood as whether the sample to be evaluated is an adversarial sample, together with the distribution diversity of the samples to be evaluated. A specific implementation is as follows:
the determining of the credibility detection result of the sample to be evaluated according to the similarity and the sample prediction result includes:
determining, according to the similarity and the sample prediction result, whether the sample to be evaluated is an adversarial sample and the distribution diversity of the samples to be evaluated.
In practical application, after the log-likelihood and the sample prediction result are obtained, whether the sample to be evaluated is an OOD sample, such as an adversarial sample, can be determined from them. In theory, the smaller the log-likelihood, the higher the probability that the sample to be evaluated is an adversarial sample (in practice, a log-likelihood threshold can be set according to the sample set and the model task, and a log-likelihood below that threshold can be considered small).
As for the distribution diversity detection of the samples to be evaluated, the distribution of their log-likelihoods under the real data distribution is computed: the more divergent the distribution, the stronger the diversity of the samples; the more concentrated the distribution, the weaker the diversity. In practical application, distribution diversity detection targets the set of samples to be evaluated as a whole, not a single sample, and the diversity can be detected through dispersion indexes such as variance, standard deviation, median, and central tendency, which is not limited in this specification.
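For instance, a whole-set diversity check based on the standard deviation of the log-likelihoods might look as follows; the cut-off value is an assumption, and any of the dispersion indexes named above could be substituted:

```python
import statistics

def diversity_is_weak(log_likelihoods, min_stdev: float = 5.0) -> bool:
    """A concentrated log-likelihood spread over the whole evaluation set
    suggests weak distribution diversity."""
    return statistics.stdev(log_likelihoods) < min_stdev
```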
In the model fairness evaluation method provided by the embodiments of this specification, after the log-likelihood of the sample to be evaluated under the real data distribution and the sample prediction result are obtained, the credibility of the sample to be evaluated can be detected according to them: whether the sample to be evaluated is an adversarial sample, and the distribution diversity of the samples to be evaluated. When the sample to be evaluated is determined to be an adversarial sample or the distribution diversity is determined to be weak, the samples can be judged untrusted, and the untrusted samples can subsequently be processed, improving the reliability and completeness of the evaluation samples in an untrusted environment.
Step 206: when the credibility detection result meets the untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample.
Specifically, when the sample to be evaluated is determined to be an adversarial sample, it can be reconstructed by denoising and adversarial reconstruction; when the distribution diversity of the samples to be evaluated is determined to be weak, diversity generation can be performed. A specific implementation is as follows:
when the credibility detection result meets the untrusted condition, the performing of sample processing on the sample to be evaluated according to the credibility detection result to obtain the updated evaluation sample includes:
when the sample to be evaluated is an adversarial sample, performing sample processing on the sample to be evaluated according to a first preset processing mode to obtain a first updated evaluation sample; and/or
when the distribution diversity of the samples to be evaluated meets a preset distribution condition, performing sample processing on the samples to be evaluated according to a second preset processing mode to obtain a second updated evaluation sample.
The first preset processing mode and the second preset processing mode may be set according to practical applications, which is not limited in the embodiments of this specification.
In practical application, credibility detection of the samples to be evaluated can detect whether a sample is an adversarial sample and whether the distribution diversity of the samples is weak; when a sample to be evaluated is an adversarial sample, or the distribution diversity of the samples is weak, the samples can be regarded as untrusted and must then be processed.
In specific implementation, when the credibility detection result indicates that the sample to be evaluated is an adversarial sample, the credibility detection result can be determined to meet the untrusted condition, and the sample to be evaluated can be processed according to the first preset processing mode to obtain the first updated evaluation sample; when the credibility detection result indicates that the distribution diversity of the samples to be evaluated is weak, the credibility detection result can likewise be determined to meet the untrusted condition, and the samples to be evaluated can be processed according to the second preset processing mode to obtain the second updated evaluation sample.
When the first preset processing mode includes a denoising method of data compression, data randomization, or adversarial error correction, the specific manner of processing the sample to be evaluated according to the first preset processing mode to obtain the first updated evaluation sample is as follows:
when the sample to be evaluated is an adversarial sample, the performing of sample processing on the sample to be evaluated according to the first preset processing mode to obtain the first updated evaluation sample includes:
when the sample to be evaluated is an adversarial sample, reconstructing the sample to be evaluated through a denoising method of data compression, data randomization, or adversarial error correction to obtain the first updated evaluation sample.
When the second preset processing mode is to generate additional evaluation samples, the specific manner of processing the samples to be evaluated according to the second preset processing mode to obtain the second updated evaluation sample is as follows:
when the distribution diversity of the samples to be evaluated meets the preset distribution condition, the performing of sample processing on the samples to be evaluated according to the second preset processing mode to obtain the second updated evaluation sample includes:
when the distribution diversity of the samples to be evaluated meets the preset distribution condition, generating additional evaluation samples through the generator module of the generative adversarial network model according to the sample data probability distribution of the samples to be evaluated;
and obtaining the second updated evaluation sample according to the additional evaluation samples.
Specifically, when the samples to be evaluated are untrusted because they include adversarial samples or because their distribution diversity is weak, adversarial reconstruction and diversity generation can be performed on them. Adversarial reconstruction can be understood as reconstructing the adversarial samples detected among the samples to be evaluated using denoising methods including, but not limited to, data compression, data randomization, and adversarial error correction, so as to eliminate the interference of adversarial noise. Diversity generation can be understood as using the generator module of the generative adversarial network model, based on the sample data probability distribution of the samples to be evaluated, to generate additional evaluation samples and so expand the diversity of the samples to be evaluated.
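A hedged sketch of the two branches: JPEG re-compression stands in here for the data-compression denoising defense (data randomization or adversarial error correction would be alternatives), and the diversity branch simply samples from the trained generator module:

```python
import io
import torch
from PIL import Image

def reconstruct_adversarial(img: Image.Image) -> Image.Image:
    """Denoise by re-compression, which strips high-frequency adversarial noise."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=75)
    return Image.open(io.BytesIO(buf.getvalue()))

def generate_diverse_samples(generator, n: int, z_dim: int = 128) -> torch.Tensor:
    """Draw n additional evaluation samples from the trained generator module."""
    with torch.no_grad():
        return generator(torch.randn(n, z_dim, 1, 1))
```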
In the model fairness evaluation method provided by the embodiments of this specification, after the log-likelihood of the sample to be evaluated under the real data distribution and the sample prediction result are obtained, the credibility of the sample to be evaluated is detected according to them, i.e., whether the sample to be evaluated is an adversarial sample and the distribution diversity of the samples to be evaluated. When the sample to be evaluated is determined to be an adversarial sample, or the distribution diversity is determined to be weak, the samples can be judged untrusted and then processed, improving the reliability and completeness of the evaluation samples in an untrusted environment and thereby guaranteeing the robustness of the fairness evaluation method and the availability of its evaluation results in an untrusted environment.
In addition, to ensure the accuracy of the additional evaluation samples, after they are generated by the generator module of the generative adversarial network model, their quality is also checked according to the discriminator module of the generative adversarial network model. A specific implementation is as follows:
the step of obtaining a second updated evaluation sample according to the newly added evaluation sample comprises the following steps:
inputting the newly added evaluation sample into the discrimination module for generating the confrontation network model to obtain a prediction result of the newly added evaluation sample;
and deleting the newly added evaluation sample according to the prediction result of the newly added evaluation sample to obtain a second updated evaluation sample.
In the model fairness evaluation method provided by the embodiments of this specification, to ensure the accuracy of the additional evaluation samples, after they are generated by the generator module of the generative adversarial network model, they are further filtered based on their generation quality. In addition, the additional evaluation samples may also be quality-filtered according to the quality evaluation indexes of the original samples to be evaluated: for example, when the original samples are images, they may be filtered according to the Inception Score of the images; when the original samples are text, they may be filtered according to the fluency of the text, and the like.
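A minimal sketch of this filtering step; the 0.8 cut-off is an assumed value, and an Inception-Score filter (for images) or a fluency filter (for text) could be applied in addition, as noted above:

```python
def filter_generated(samples, discriminator, keep_threshold: float = 0.8):
    """Keep only the generated samples the discriminator rates as realistic."""
    return [s for s in samples if discriminator(s) >= keep_threshold]
```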
Step 208: performing fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample.
The updated evaluation sample comprises the first updated evaluation sample and/or the second updated evaluation sample, and the model to be evaluated can be understood as an image-text recognition model of the same type as the training model.
When the updated evaluation sample comprises the first updated evaluation sample, the sample to be evaluated and the first updated evaluation sample are mixed to generate a mixed sample, and fairness evaluation is then performed on the model to be evaluated; when the updated evaluation sample comprises the second updated evaluation sample, the sample to be evaluated and the second updated evaluation sample are mixed to generate a mixed sample, and fairness evaluation is then performed on the model to be evaluated; when the updated evaluation sample comprises both the first and the second updated evaluation samples, the sample to be evaluated and both updated evaluation samples are mixed to generate a mixed sample, and fairness evaluation is then performed on the model to be evaluated. A specific implementation is as follows:
and performing fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample, wherein the fairness evaluation comprises the following steps:
mixing the to-be-evaluated sample and the updated evaluation sample to obtain a mixed evaluation sample;
inputting the mixed evaluation sample and the evaluation model to be evaluated into a fairness evaluation module to obtain a fairness evaluation index of the evaluation model to be evaluated;
and carrying out fairness evaluation on the model to be evaluated according to fairness evaluation indexes of the model to be evaluated.
The fairness evaluation index includes, but is not limited to, false positive rate, statistical equalization, chance equalization, inconsistency influence, and the like.
In specific implementation, the inputting of the mixed evaluation samples and the model to be evaluated into the fairness evaluation module to obtain the fairness evaluation indexes of the model to be evaluated includes:
inputting the mixed evaluation samples and the model to be evaluated into the fairness evaluation module;
and receiving the fairness evaluation indexes of the model to be evaluated output by the fairness evaluation module, the indexes being determined according to the comparison between the real values and the predicted values of the mixed evaluation samples, where the predicted values are output by the model to be evaluated according to the mixed evaluation samples.
Specifically, the comparison between the real values and the predicted values of the mixed evaluation samples can be understood as the prediction accuracy of the model to be evaluated.
In practical application, the mixed evaluation samples and the model to be evaluated are input into the fairness evaluation module, and within the fairness evaluation module the mixed evaluation samples are input into the model to be evaluated to obtain the predicted values it outputs; the prediction accuracy of the model to be evaluated is calculated according to the predicted values and the real values of the mixed evaluation samples; after determining the prediction accuracy, the fairness evaluation module calculates indexes of the model to be evaluated such as false positive rate, statistical parity, equal opportunity, and disparate impact, according to which users or the system can subsequently evaluate the fairness of the model to be evaluated.
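As a hedged illustration of how such group indexes can be computed from grouped predictions (the exact index definitions used by the evaluation module are not fixed by the embodiments), consider the following sketch:

```python
def group_fairness_indexes(preds, labels, groups):
    """Per-group selection and false-positive rates, plus statistical-parity
    difference and disparate impact, from parallel lists of 0/1 predictions,
    0/1 ground truth, and one protected-attribute value per sample."""
    stats = {}
    for p, y, g in zip(preds, labels, groups):
        s = stats.setdefault(g, {"pos": 0, "n": 0, "fp": 0, "neg": 0})
        s["pos"] += p
        s["n"] += 1
        if y == 0:                        # negative ground truth
            s["neg"] += 1
            s["fp"] += p                  # predicted positive -> false positive
    sel = {g: s["pos"] / s["n"] for g, s in stats.items()}
    fpr = {g: s["fp"] / s["neg"] for g, s in stats.items() if s["neg"]}
    lo, hi = min(sel.values()), max(sel.values())
    return {"selection_rate": sel,
            "false_positive_rate": fpr,
            "statistical_parity_diff": hi - lo,
            "disparate_impact": lo / hi if hi else float("nan")}
```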
In the model fairness evaluation method provided by the embodiments of this specification, the true data probability distribution of the image-text training samples is modeled through the image-text training model, and credibility detection is performed on the sample to be evaluated according to that distribution; untrusted samples are processed to obtain updated evaluation samples. This improves the reliability and completeness of the evaluation samples in an untrusted environment, ensures the robustness of the fairness evaluation method in both trusted and untrusted environments and the availability of its evaluation results, and guarantees the accuracy of the fairness evaluation of the model to be evaluated, so that the model performs well in subsequent practical applications, whether from the perspective of meeting regulatory compliance or of improving user experience.
The following further describes the model fairness evaluation method with reference to FIG. 3, taking the application of the method to fairness evaluation of a recommendation model as an example. FIG. 3 shows a flowchart of the processing procedure of a model fairness evaluation method provided in an embodiment of this specification, which specifically includes the following steps.
Step 302: training a pre-training model with self-supervised learning on the collected large-scale unsupervised image-text training samples, and initially modeling the true data probability distribution of the image-text training samples through the pre-training model.
Step 304: constructing a generative adversarial network model using deep generative techniques according to the pre-training model, training the generative adversarial network model on the image-text training samples, and optimizing the true data probability distribution of the image-text training samples according to the generative adversarial network model to obtain the optimized true data probability distribution.
Step 306: performing credibility detection on the sample to be evaluated according to the optimized true data probability distribution of the image-text training samples and the generative adversarial network model.
For a specific implementation of performing credibility detection on the sample to be evaluated according to the optimized true data probability distribution of the image-text training samples and the generative adversarial network model, reference may be made to the detailed description of the above embodiments, which is not repeated here.
Step 308: according to the credibility detection result of the sample to be evaluated, adversarial reconstruction is performed on the sample to be evaluated using adversarial defense techniques, or diversity generation is performed on the sample to be evaluated using the generator of the GAN model.
For a specific implementation of the adversarial reconstruction and diversity generation of the sample to be evaluated, reference may be made to the detailed description of the above embodiments, which is not repeated here.
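For orientation, the sketch below pairs one simple instance of each processing mode: bit-depth reduction as a data-compression style denoising defense for adversarial samples, and sampling from the trained generator for diversity generation. The bit depth, sample count, and the assumption that inputs lie in [0, 1] are illustrative choices.

```python
import torch

def adversarial_reconstruct(x, bit_depth=4):
    """Denoise a suspected adversarial image by reducing its bit depth,
    squeezing out small adversarial perturbations. Assumes x in [0, 1]."""
    scale = 2 ** bit_depth - 1
    return torch.round(x * scale) / scale

def diversity_generate(generator, n_samples=64, z_dim=128):
    """Draw newly added evaluation samples from the generation module to
    widen the coverage of a distribution-deviated evaluation set."""
    z = torch.randn(n_samples, z_dim)
    with torch.no_grad():
        return generator(z)
```

In line with step 310, the outputs of either function would be mixed back with the original sample to be evaluated before fairness evaluation.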
Step 310: the sample to be evaluated is mixed with the adversarially reconstructed samples and/or the diversity-generated samples to obtain a mixed sample, and the mixed sample and the recommendation model are input into the fairness evaluation module for evaluation, obtaining a fairness evaluation result of the recommendation model.
The model fairness evaluation method provided by the embodiments of the specification provides an algorithm fairness evaluation technique for uncontrolled (untrusted) environments. Based on large-scale, easily obtained unsupervised image-text training data, the true data probability distribution is modeled by combining large-scale pre-training and deep generative techniques; untrusted samples in the sample to be evaluated are detected based on the modeled probability distribution of the real data (the unsupervised image-text training data), and adversarial denoising and reconstruction are performed through adversarial defense techniques, so that the influence of adversarial perturbations and noise on the fairness evaluation can be effectively eliminated.
For the problem of distribution deviation in the sample to be evaluated (a single, narrow sample distribution that cannot cover the whole data distribution), the method performs diversity generation based on the data distribution of the sample to be evaluated in combination with deep generative techniques, improving the reliability and completeness of the evaluation sample in an untrusted environment. In addition, throughout the fairness evaluation process the method requires neither additional evaluation data nor manual intervention, which greatly improves the degree of automation of the evaluation and reduces its cost. In summary, the model fairness evaluation method provided in the embodiments of the present specification not only has fairness evaluation capability in a trusted environment, but also ensures the robustness of fairness evaluation and the availability of evaluation results in an uncontrolled environment. It is therefore applicable to fairness evaluation on platforms such as e-commerce platforms, online social platforms, and online media, covering algorithms including but not limited to intelligent customer service, personalized recommendation, and intelligent risk control, and serves to eliminate algorithmic bias and to help the algorithms meet supervision compliance and improve user experience.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a model fairness evaluation device, and fig. 4 shows a schematic structural diagram of a model fairness evaluation device provided in an embodiment of the present specification. As shown in fig. 4, the apparatus includes:
a probability distribution determination module 402 configured to determine a real data probability distribution of the picture and/or text training samples according to the picture and/or text training model;
a detection result determining module 404 configured to determine a credibility detection result of the sample to be evaluated according to the real data probability distribution and the generative adversarial network model;
a sample processing module 406 configured to, under the condition that the credibility detection result meets an untrusted condition, perform sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
and an evaluation module 408 configured to perform fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample.
Optionally, the probability distribution determining module 402 is further configured to:
acquiring a picture and/or text training sample;
training to obtain the picture and/or text training model by using a self-supervised learning technique according to the training samples;
obtaining the real data probability distribution of the training samples according to the picture and/or text training model;
and adjusting the real data probability distribution of the training samples according to the generative adversarial network model to obtain the adjusted real data probability distribution of the training samples.
Optionally, the probability distribution determining module 402 is further configured to:
constructing the generative adversarial network model according to the picture and/or text training model;
training the generative adversarial network model according to the training samples to obtain the discrimination module and the generation module of the trained generative adversarial network model;
and adjusting the real data probability distribution of the training samples according to the discrimination module to obtain the adjusted real data probability distribution of the training samples.
Optionally, the probability distribution determining module 402 is further configured to:
initializing module parameters of the discrimination module of the generative adversarial network model according to the model parameters of the picture and/or text training model, thereby constructing the discrimination module of the generative adversarial network model;
constructing the generation module of the generative adversarial network model according to a deconvolution network and/or a text generation network;
and constructing the generative adversarial network model according to the discrimination module and the generation module.
Optionally, the detection result determining module 404 is further configured to:
obtaining the sample data probability distribution of the sample to be evaluated according to the picture and/or text training model;
determining, according to the sample data probability distribution, the similarity of the sample to be evaluated to the real data probability distribution of the training samples;
obtaining a sample prediction result of the sample to be evaluated according to the discrimination module of the generative adversarial network model;
and determining the credibility detection result of the sample to be evaluated according to the similarity and the sample prediction result.
Optionally, the detection result determining module 404 is further configured to:
determining, according to the similarity and the sample prediction result, whether the sample to be evaluated is an adversarial sample and the distribution diversity of the sample to be evaluated.
Optionally, the sample processing module 406 is further configured to:
in the case that the sample to be evaluated is an adversarial sample, performing sample processing on the sample to be evaluated according to a first preset processing mode to obtain a first updated evaluation sample; and/or
in the case that the distribution diversity of the sample to be evaluated meets a preset distribution condition, performing sample processing on the sample to be evaluated according to a second preset processing mode to obtain a second updated evaluation sample.
Optionally, the sample processing module 406 is further configured to:
in the case that the sample to be evaluated is an adversarial sample, reconstructing the sample to be evaluated by a denoising method of data compression, data randomization, or adversarial error correction to obtain the first updated evaluation sample.
Optionally, the sample processing module 406 is further configured to:
in the case that the distribution diversity of the sample to be evaluated meets the preset distribution condition, generating newly added evaluation samples through the generation module of the generative adversarial network model according to the sample data probability distribution of the sample to be evaluated;
and obtaining the second updated evaluation sample according to the newly added evaluation samples.
Optionally, the sample processing module 406 is further configured to:
inputting the newly added evaluation samples into the discrimination module of the generative adversarial network model to obtain prediction results of the newly added evaluation samples;
and deleting, according to the prediction results, the newly added evaluation samples that fail the discrimination, so as to obtain the second updated evaluation sample.
Optionally, the evaluation module 408 is further configured to:
mixing the sample to be evaluated and the updated evaluation sample to obtain a mixed evaluation sample;
inputting the mixed evaluation sample and the model to be evaluated into a fairness evaluation module to obtain fairness evaluation indexes of the model to be evaluated;
and performing fairness evaluation on the model to be evaluated according to the fairness evaluation indexes of the model to be evaluated.
Optionally, the evaluation module 408 is further configured to:
inputting the mixed evaluation sample and the model to be evaluated into the fairness evaluation module;
and receiving the fairness evaluation indexes of the model to be evaluated output by the fairness evaluation module, the indexes being determined according to a comparison of the real values and the predicted values of the mixed evaluation sample,
wherein the predicted values are output by the model to be evaluated according to the mixed evaluation sample.
The model fairness evaluation apparatus provided by this embodiment of the specification models the true data probability distribution of the image-text training samples through the image-text training model, and performs credibility detection on the sample to be evaluated according to that distribution; untrusted samples are then processed to obtain an updated evaluation sample. This improves the reliability and completeness of the evaluation sample in an untrusted environment, ensures the robustness of the fairness evaluation method in both trusted and untrusted environments as well as the availability of its evaluation results, and thereby ensures the accuracy of the fairness evaluation of the model to be evaluated, so that the model to be evaluated subsequently performs better in practical application, whether viewed from the perspective of meeting supervision compliance or of improving user experience.
The above is an exemplary scheme of the model fairness evaluation apparatus of this embodiment. It should be noted that the technical solution of the model fairness assessment apparatus and the technical solution of the model fairness assessment method described above belong to the same concept, and details that are not described in detail in the technical solution of the model fairness assessment apparatus can be referred to the description of the technical solution of the model fairness assessment method described above.
Referring to fig. 5, fig. 5 is a flowchart illustrating another model fairness evaluation method provided in an embodiment of the present disclosure, which specifically includes the following steps.
Step 502: determining the real data probability distribution of the picture and/or text training samples according to the picture and/or text training model.
Step 504: receiving the sample to be evaluated and the model to be evaluated sent by the user.
Step 506: determining the credibility detection result of the sample to be evaluated according to the real data probability distribution and the generative adversarial network model.
Step 508: under the condition that the credibility detection result meets the untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample.
Step 510: performing fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample.
Step 512: obtaining a fairness evaluation result of the model to be evaluated, and returning the fairness evaluation result to the user.
The model fairness evaluation method provided by this embodiment of the specification is applied to a model fairness evaluation platform.

A practical application scenario may be as follows: a user wants to perform fairness evaluation on a project model through the model fairness evaluation platform. The user sends the sample to be evaluated and the model to be evaluated to the platform; after receiving them, the platform can perform adversarial reconstruction or diversity generation on the sample to be evaluated in the manner of the above embodiments, thereby ensuring that the evaluation result given by the platform is closer to the true behavior of the model to be evaluated.

In the model fairness evaluation method provided in the embodiments of the present description, the training data probability distribution is modeled as the true data probability distribution by combining large-scale pre-training and deep generative techniques on large-scale, easily obtained unsupervised image-text training data. Before the sample to be evaluated enters the fairness evaluation module, credibility detection is performed on it according to the true data probability distribution, which effectively mitigates the interference of untrusted samples with the evaluation result. For detected untrusted samples, such as adversarial samples or distribution-deviated samples, the scheme denoises and reconstructs the evaluation samples based on adversarial defense techniques and performs diversity generation based on the deep generative model. This improves the reliability and completeness of the evaluation samples in an untrusted environment, ensures the robustness of the fairness evaluation system and the availability of its evaluation results in an uncontrolled environment, and effectively overcomes the sensitivity of existing systems to adversarial perturbations and distribution deviations of evaluation samples in uncontrolled environments.
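Putting the pieces together, a platform-side flow consistent with steps 502 to 512 might be wired up as in the sketch below; mix_samples and run_model are hypothetical helpers standing in for the mixing and inference steps, and the remaining functions refer to the earlier illustrative sketches.

```python
def platform_evaluate(sample, model, pretrained_model, gan):
    """End-to-end platform flow: detect untrusted samples, repair or
    augment them, then run fairness evaluation on the mixture."""
    result = credibility_detect(sample, pretrained_model, gan.discriminator)
    updated = []
    if result["adversarial"]:
        updated.append(adversarial_reconstruct(sample))    # first updated sample
    if result["distribution_deviation"]:
        updated.append(diversity_generate(gan.generator))  # second updated sample
    mixed = mix_samples(sample, updated)                   # hypothetical helper
    y_true, y_pred, group = run_model(model, mixed)        # hypothetical helper
    return fairness_indexes(y_true, y_pred, group)         # returned to the user
```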
Corresponding to the above method embodiment, the present specification further provides another embodiment of a model fairness assessment apparatus, and fig. 6 shows a schematic structural diagram of another model fairness assessment apparatus provided in an embodiment of the present specification. As shown in fig. 6, the apparatus is applied to a model fairness evaluation platform, and includes:
a first determining module 602 configured to determine a real data probability distribution of the picture and/or text training sample according to the picture and/or text training model;
a data receiving module 604 configured to receive the sample to be evaluated and the model to be evaluated sent by the user;
a second determining module 606 configured to determine a credibility detection result of the sample to be evaluated according to the real data probability distribution and the generative adversarial network model;
a sample updating module 608 configured to, under the condition that the credibility detection result meets an untrusted condition, perform sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
a fairness evaluation module 610 configured to perform fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample;
and a result showing module 612 configured to obtain a fairness evaluation result of the model to be evaluated and return the fairness evaluation result to the user.
The model fairness evaluation apparatus provided by this embodiment of the specification models the training data probability distribution as the true data probability distribution by combining large-scale pre-training and deep generative techniques on large-scale, easily obtained unsupervised image-text training data. Before the sample to be evaluated enters the fairness evaluation module, credibility detection is performed on it according to the true data probability distribution, which effectively mitigates the interference of untrusted samples with the evaluation result. For detected untrusted samples, such as adversarial samples or distribution-deviated samples, the scheme denoises and reconstructs the evaluation samples based on adversarial defense techniques and performs diversity generation based on the deep generative model. This improves the reliability and completeness of the evaluation samples in an untrusted environment, ensures the robustness of the fairness evaluation system and the availability of its evaluation results in an uncontrolled environment, and effectively overcomes the sensitivity of existing systems to adversarial perturbations and distribution deviations of evaluation samples in uncontrolled environments.
The foregoing is an exemplary scheme of the model fairness assessment apparatus of this embodiment. It should be noted that the technical solution of the model fairness assessment apparatus and the technical solution of the model fairness assessment method described above belong to the same concept, and details that are not described in detail in the technical solution of the model fairness assessment apparatus can be referred to the description of the technical solution of the model fairness assessment method described above.
FIG. 7 illustrates a block diagram of a computing device 700, provided in accordance with one embodiment of the present description. Components of the computing device 700 include, but are not limited to, a memory 710 and a processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes an access device 740 that enables computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 740 may include one or more of any type of wired or wireless network interface (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 700, as well as other components not shown in FIG. 7, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 7 is for purposes of example only and is not limiting as to the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 700 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.
The processor 720 is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the model fairness evaluation method described above.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the model fairness assessment method described above belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the model fairness assessment method described above.
An embodiment of the present specification also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the model fairness assessment method described above.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the model fairness evaluation method described above belong to the same concept, and details that are not described in detail in the technical solution of the storage medium can be referred to the description of the technical solution of the model fairness evaluation method described above.
An embodiment of the present specification further provides a computer program, wherein when the computer program is executed in a computer, the computer program causes the computer to execute the steps of the model fairness assessment method.
The above is an illustrative scheme of a computer program of the present embodiment. It should be noted that the technical solution of the computer program and the technical solution of the model fairness assessment method described above belong to the same concept, and details that are not described in detail in the technical solution of the computer program can be referred to the description of the technical solution of the model fairness assessment method described above.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code which may be in the form of source code, object code, an executable file or some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present embodiment is not limited by the described acts, because some steps may be performed in other sequences or simultaneously according to the present embodiment. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims (14)

1. A model fairness evaluation method, comprising:
determining the real data probability distribution of picture and/or text training samples according to a picture and/or text training model;
determining a credibility detection result of a sample to be evaluated according to the real data probability distribution and a generative adversarial network model;
under the condition that the credibility detection result meets an untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
and performing fairness evaluation on a model to be evaluated according to the sample to be evaluated and the updated evaluation sample.
2. The model fairness evaluation method of claim 1, wherein determining the real data probability distribution of the picture and/or text training samples according to the picture and/or text training model comprises:
acquiring the picture and/or text training samples;
training to obtain the picture and/or text training model by using a self-supervised learning technique according to the training samples;
obtaining the real data probability distribution of the training samples according to the picture and/or text training model;
and adjusting the real data probability distribution of the training samples according to the generative adversarial network model to obtain the adjusted real data probability distribution of the training samples.
3. The model fairness evaluation method of claim 2, wherein adjusting the real data probability distribution of the training samples according to the generative adversarial network model to obtain the adjusted real data probability distribution of the training samples comprises:
constructing the generative adversarial network model according to the picture and/or text training model;
training the generative adversarial network model according to the training samples to obtain a discrimination module and a generation module of the trained generative adversarial network model;
and adjusting the real data probability distribution of the training samples according to the discrimination module to obtain the adjusted real data probability distribution of the training samples.
4. The model fairness evaluation method of claim 3, wherein constructing the generative adversarial network model according to the picture and/or text training model comprises:
initializing module parameters of the discrimination module of the generative adversarial network model according to model parameters of the picture and/or text training model, thereby constructing the discrimination module of the generative adversarial network model;
constructing the generation module of the generative adversarial network model according to a deconvolution network and/or a text generation network;
and constructing the generative adversarial network model according to the discrimination module and the generation module.
5. The model fairness evaluation method of claim 1, wherein determining the credibility detection result of the sample to be evaluated according to the real data probability distribution and the generative adversarial network model comprises:
obtaining a sample data probability distribution of the sample to be evaluated according to the picture and/or text training model;
determining, according to the sample data probability distribution, the similarity of the sample to be evaluated to the real data probability distribution of the training samples;
obtaining a sample prediction result of the sample to be evaluated according to the discrimination module of the generative adversarial network model;
and determining the credibility detection result of the sample to be evaluated according to the similarity and the sample prediction result.
6. The model fairness evaluation method of claim 5, wherein determining the credibility detection result of the sample to be evaluated according to the similarity and the sample prediction result comprises:
determining, according to the similarity and the sample prediction result, whether the sample to be evaluated is an adversarial sample and the distribution diversity of the sample to be evaluated.
7. The model fairness evaluation method of claim 6, wherein, under the condition that the credibility detection result meets the untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain the updated evaluation sample comprises:
under the condition that the sample to be evaluated is an adversarial sample, performing sample processing on the sample to be evaluated according to a first preset processing mode to obtain a first updated evaluation sample; and/or
under the condition that the distribution diversity of the sample to be evaluated meets a preset distribution condition, performing sample processing on the sample to be evaluated according to a second preset processing mode to obtain a second updated evaluation sample.
8. The model fairness evaluation method of claim 7, wherein, under the condition that the sample to be evaluated is an adversarial sample, performing sample processing on the sample to be evaluated according to the first preset processing mode to obtain the first updated evaluation sample comprises:
under the condition that the sample to be evaluated is an adversarial sample, reconstructing the sample to be evaluated by a denoising method of data compression, data randomization, or adversarial error correction to obtain the first updated evaluation sample.
9. The model fairness evaluation method of claim 7, wherein, under the condition that the distribution diversity of the sample to be evaluated meets the preset distribution condition, performing sample processing on the sample to be evaluated according to the second preset processing mode to obtain the second updated evaluation sample comprises:
under the condition that the distribution diversity of the sample to be evaluated meets the preset distribution condition, generating newly added evaluation samples through the generation module of the generative adversarial network model according to the sample data probability distribution of the sample to be evaluated;
and obtaining the second updated evaluation sample according to the newly added evaluation samples.
10. The model fairness evaluation method of claim 9, wherein obtaining the second updated evaluation sample according to the newly added evaluation samples comprises:
inputting the newly added evaluation samples into the discrimination module of the generative adversarial network model to obtain prediction results of the newly added evaluation samples;
and deleting, according to the prediction results, the newly added evaluation samples that fail the discrimination, so as to obtain the second updated evaluation sample.
11. The model fairness evaluation method of claim 1, wherein performing fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample comprises:
mixing the sample to be evaluated and the updated evaluation sample to obtain a mixed evaluation sample;
inputting the mixed evaluation sample and the model to be evaluated into a fairness evaluation module to obtain fairness evaluation indexes of the model to be evaluated;
and performing fairness evaluation on the model to be evaluated according to the fairness evaluation indexes of the model to be evaluated.
12. The model fairness evaluation method of claim 11, wherein inputting the mixed evaluation sample and the model to be evaluated into the fairness evaluation module to obtain the fairness evaluation indexes of the model to be evaluated comprises:
inputting the mixed evaluation sample and the model to be evaluated into the fairness evaluation module;
and receiving the fairness evaluation indexes of the model to be evaluated output by the fairness evaluation module, the indexes being determined according to a comparison of the real values and the predicted values of the mixed evaluation sample,
wherein the predicted values are output by the model to be evaluated according to the mixed evaluation sample.
13. A model fairness evaluation apparatus, comprising:
a probability distribution determination module configured to determine the real data probability distribution of picture and/or text training samples according to a picture and/or text training model;
a detection result determination module configured to determine a credibility detection result of a sample to be evaluated according to the real data probability distribution and a generative adversarial network model;
a sample processing module configured to, under the condition that the credibility detection result meets an untrusted condition, perform sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
and an evaluation module configured to perform fairness evaluation on a model to be evaluated according to the sample to be evaluated and the updated evaluation sample.
14. A model fairness evaluation method, applied to a model fairness evaluation platform, comprising:
determining the real data probability distribution of picture and/or text training samples according to a picture and/or text training model;
receiving a sample to be evaluated and a model to be evaluated sent by a user;
determining a credibility detection result of the sample to be evaluated according to the real data probability distribution and a generative adversarial network model;
under the condition that the credibility detection result meets an untrusted condition, performing sample processing on the sample to be evaluated according to the credibility detection result to obtain an updated evaluation sample;
performing fairness evaluation on the model to be evaluated according to the sample to be evaluated and the updated evaluation sample;
and obtaining a fairness evaluation result of the model to be evaluated, and returning the fairness evaluation result to the user.