GB2597406A

GB2597406A - Fairness improvement through reinforcement learning

Info

Publication number: GB2597406A
Application number: GB2115858.9A
Authority: GB
Inventors: Chaloulos Georgios; Floether Frederik; Graf Florian; Lustenberger Patrick; Ravizza Stefan; Slottke Eric
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2019-04-08
Filing date: 2020-03-18
Publication date: 2022-01-26
Also published as: DE112020000537T5; US20200320428A1; CN113692594A; JP2022527536A; WO2020208444A1

Abstract

A computer-implemented method for improving fairness in a supervised machine-learning model may be provided. The method comprises linking the supervised machine-learning model to a reinforcement learning meta model, selecting a list of hyper-parameters and parameters of the supervised machine-learning model, and controlling at least one aspect of the supervised machine-learning model by adjusting hyper-parameter values and parameter values from the list of hyper-parameters and parameters of the supervised machine-learning model. The adjusting is performed by a reinforcement learning engine relating to the reinforcement learning meta model, which calculates a reward function based on multiple conflicting objective functions. The method further comprises iteratively repeating the steps of selecting and controlling to improve a fairness value of the supervised machine-learning model.
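The abstract refers to a reward function based on multiple conflicting objective functions without fixing its form. As a minimal sketch only, the following assumes two objectives (prediction accuracy and a demographic-parity gap between sub-groups) combined by a weighted sum. The function names, the choice of metric and the `fairness_weight` parameter are illustrative assumptions and are not taken from the publication.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def demographic_parity_gap(y_pred: Sequence[int], groups: Sequence[Hashable]) -> float:
    """Largest difference in positive-prediction rate between any two sub-groups.

    Assumption: demographic parity is used here only as one possible
    fairness objective; the publication does not prescribe a metric.
    """
    totals: dict = {}
    positives: dict = {}
    for pred, group in zip(y_pred, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def reward(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    groups: Sequence[Hashable],
    fairness_weight: float = 0.5,  # hypothetical trade-off knob
) -> float:
    """Weighted sum of two potentially conflicting objectives.

    Higher is better: accuracy is rewarded, and a large gap between
    sub-groups is penalized.
    """
    acc = accuracy(y_true, y_pred)
    gap = demographic_parity_gap(y_pred, groups)
    return (1.0 - fairness_weight) * acc + fairness_weight * (1.0 - gap)
```

A weighted sum is only one way to scalarize conflicting objectives; other combinations would fit the wording of the abstract equally well.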

Claims

1. A computer-implemented method, the method comprising: receiving an original version of a machine learning model (MLM) including a plurality of parameter values, a plurality of hyperparameter values and an original fairness value that reflects fairness with respect to segmented relevant sub-groups; adjusting at least some of the parameter values and/or at least some of the hyperparameter values of the original version of the MLM to create a provisional version of the MLM; determining a fairness value for the provisional version of the MLM by operations including the following: receiving a reinforcement learning meta model (RLMM) that defines a plurality of fairness related objectives and a reward function reflecting the plurality of fairness related objectives; operating the provisional version of the MLM; during the operation of the provisional version of the MLM, calculating, by the RLMM, reward values based on the reward function; and determining a provisional fairness value for the provisional version of the MLM based upon the reward values; determining that the provisional fairness value is greater than the original fairness value; and responsive to the determination that the provisional fairness value is greater than the original fairness value, replacing the original version of the MLM with the provisional version of the MLM and replacing the original fairness value with the provisional fairness value.

2. The computer-implemented method of claim 1, further comprising: iteratively repeating the operations until the original fairness value exceeds a predetermined threshold.

3. The computer-implemented method of claim 1 or 2, wherein the original MLM is a supervised MLM.

4. The computer-implemented method of one of claims 1 to 3, wherein the fairness related objectives include at least one of the following: gender, age, nationality, religious beliefs, ethnicity and orientation.

5. The computer-implemented method of one of claims 1 to 4, further comprising: linking the original MLM to the reinforcement learning meta model based on a configuration and a read out.

6. The computer-implemented method of one of claims 1 to 5, wherein the plurality of parameter values includes a value for at least one of the following parameter types: weighting factors and activation function variables.

7. The computer-implemented method of one of claims 1 to 6, wherein the plurality of hyperparameter values includes a value for at least one of the following hyperparameter types: type of activation function, number of nodes per layer, number of layers of a neural network and machine-learning model.

8. A computer program product, the computer program product comprising: one or more non-transitory computer readable storage media and program instructions stored on the one or more non-transitory computer readable storage media, the program instructions comprising: program instructions to receive an original version of a machine learning model (MLM) including a plurality of parameter values, a plurality of hyperparameter values and an original fairness value that reflects fairness with respect to segmented relevant sub-groups; program instructions to adjust at least some of the parameter values and/or at least some of the hyperparameter values of the original version of the MLM to create a provisional version of the MLM; program instructions to determine a fairness value for the provisional version of the MLM by operations including the following: program instructions to receive a reinforcement learning meta model (RLMM) that defines a plurality of fairness related objectives and a reward function reflecting the plurality of fairness related objectives; program instructions to operate the provisional version of the MLM; during the operation of the provisional version of the MLM, program instructions to calculate, by the RLMM, reward values based on the reward function; and program instructions to determine a provisional fairness value for the provisional version of the MLM based upon the reward values; program instructions to determine that the provisional fairness value is greater than the original fairness value; and responsive to the determination that the provisional fairness value is greater than the original fairness value, program instructions to replace the original version of the MLM with the provisional version of the MLM and to replace the original fairness value with the provisional fairness value.

9. The computer program product of claim 8, further comprising: program instructions to iteratively repeat the operations until the original fairness value exceeds a predetermined threshold.

10. The computer program product of claim 8 or 9, wherein the original MLM is a supervised MLM.

11. The computer program product of one of claims 8 to 10, wherein the fairness related objectives include at least one of the following: gender, age, nationality, religious beliefs, ethnicity and orientation.

12. The computer program product of one of claims 8 to 11, further comprising: program instructions to link the original MLM to the reinforcement learning meta model based on a configuration and a read out.

13. The computer program product of one of claims 8 to 12, wherein the plurality of parameter values includes a value for at least one of the following parameter types: weighting factors and activation function variables.

14. The computer program product of one of claims 8 to 13, wherein the plurality of hyperparameter values includes a value for at least one of the following hyperparameter types: type of activation function, number of nodes per layer, number of layers of a neural network and machine-learning model.

15. A computer system, comprising: one or more computer processors; one or more computer readable storage media; program instructions stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the program instructions comprising: program instructions to receive an original version of a machine learning model (MLM) including a plurality of parameter values, a plurality of hyperparameter values and an original fairness value that reflects fairness with respect to segmented relevant sub-groups; program instructions to adjust at least some of the parameter values and/or at least some of the hyperparameter values of the original version of the MLM to create a provisional version of the MLM; program instructions to determine a fairness value for the provisional version of the MLM by operations including the following: program instructions to receive a reinforcement learning meta model (RLMM) that defines a plurality of fairness related objectives and a reward function reflecting the plurality of fairness related objectives; program instructions to operate the provisional version of the MLM; during the operation of the provisional version of the MLM, program instructions to calculate, by the RLMM, reward values based on the reward function; and program instructions to determine a provisional fairness value for the provisional version of the MLM based upon the reward values; program instructions to determine that the provisional fairness value is greater than the original fairness value; and responsive to the determination that the provisional fairness value is greater than the original fairness value, program instructions to replace the original version of the MLM with the provisional version of the MLM and to replace the original fairness value with the provisional fairness value.

16. The computer system of claim 15, further comprising: program instructions to iteratively repeat the operations until the original fairness value exceeds a predetermined threshold.

17. The computer system of claim 15 or 16, wherein the fairness related objectives include at least one of the following: gender, age, nationality, religious beliefs, ethnicity and orientation.

18. The computer system of one of claims 15 to 17, further comprising: program instructions to link the original MLM to the reinforcement learning meta model based on a configuration and a read out.

19. The computer system of one of claims 15 to 18, wherein the plurality of parameter values includes a value for at least one of the following parameter types: weighting factors and activation function variables.

20. The computer system of one of claims 15 to 19, wherein the plurality of hyperparameter values includes a value for at least one of the following hyperparameter types: type of activation function, number of nodes per layer, number of layers of a neural network and machine-learning model.
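The operations recited in claims 1 and 2 (adjust values to create a provisional version, determine its fairness value, replace the original only if the provisional value is greater, and repeat until a threshold is exceeded) can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: a random perturbation of one parameter stands in for the reinforcement learning engine, and `ModelVersion`, `propose_params`, `improve_fairness`, `evaluate_fairness` and `max_iters` are hypothetical names introduced here.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ModelVersion:
    """A model version with its parameter values, hyperparameter values and fairness value."""
    params: Dict[str, float]
    hyperparams: Dict[str, int]
    fairness: float


def propose_params(model: ModelVersion, rng: random.Random) -> Dict[str, float]:
    """Adjust one parameter value to create provisional parameter values.

    Assumption: a random perturbation replaces the action that a
    reinforcement learning engine would choose.
    """
    name = rng.choice(sorted(model.params))
    new_params = dict(model.params)
    new_params[name] += rng.uniform(-0.1, 0.1)
    return new_params


def improve_fairness(
    original: ModelVersion,
    evaluate_fairness: Callable[[Dict[str, float], Dict[str, int]], float],
    threshold: float,
    max_iters: int = 1000,  # hypothetical safety bound, not in the claims
    seed: int = 0,
) -> ModelVersion:
    """Repeat adjust, evaluate and replace-if-greater until the threshold is exceeded."""
    rng = random.Random(seed)
    current = original
    for _ in range(max_iters):
        if current.fairness > threshold:
            break
        new_params = propose_params(current, rng)
        provisional_fairness = evaluate_fairness(new_params, current.hyperparams)
        # Replace the current version only if the provisional one is fairer.
        if provisional_fairness > current.fairness:
            current = ModelVersion(new_params, current.hyperparams, provisional_fairness)
    return current
```

In this sketch `evaluate_fairness` is supplied by the caller and plays the role of operating the provisional version and deriving a fairness value from reward values. Because a provisional version is accepted only when its fairness value is strictly greater, the stored fairness value never decreases across iterations.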