WO2024042627A1

WO2024042627A1 - Information processing device, determination method, and determination program

Info

Publication number: WO2024042627A1
Application number: PCT/JP2022/031794
Authority: WO
Inventors: 大心伊藤
Original assignee: 三菱電機株式会社
Priority date: 2022-08-24
Filing date: 2022-08-24
Publication date: 2024-02-29

Abstract

An information processing device (100) comprises: an acquisition unit (130) that acquires a trained model (115), a behavioral characteristics table (112) indicating behavioral characteristics of each user, an attribute table (113) indicating attributes of each user, a measures candidate table (116) indicating measures candidates, measures scores, which are values for reducing the inequality between measures, and budget information indicating a budget; a generation unit (140) that generates data on the basis of the behavioral characteristics table (112), the attribute table (113), and the measures candidate table (116); a prediction unit (150) that predicts the sales or increase in the number of store visits when measures are implemented, on the basis of the generated data and the trained model (115); and a determination unit (160) that determines measures for each user by performing optimization calculations within the budget using the prediction results and the measures scores.

Description

Information processing device, determination method, and determination program

The present disclosure relates to an information processing device, a determination method, and a determination program.

Measures are taken to improve profits. For example, a coupon may be provided to a user, who then visits the store and purchases many items while using the coupon, which improves profits. However, if a coupon is provided to a user who already plans to visit the store, the price is simply reduced, and the measure has little effect. A technique for providing incentives to users has therefore been proposed (see Patent Document 1). The information processing device of Patent Document 1 provides incentives to users based on the behavior history and background information of a plurality of users.

Patent Document 1: Japanese Patent No. 6899350

Incidentally, depending on the measure, inequality may arise. Inequality in measures is thought to give rise to distrust in services.

The purpose of this disclosure is to eliminate inequality in measures.

An information processing device according to one aspect of the present disclosure is provided. The information processing device includes: an acquisition unit that acquires a trained model, behavioral feature information indicating behavioral features of each user, attribute information indicating attributes of each user, measure candidate information indicating measure candidates, a measure score that is a value for alleviating inequality in measures, and budget information indicating a budget; a generation unit that generates data based on the behavioral feature information, the attribute information, and the measure candidate information; a prediction unit that predicts the increase in sales or the number of store visits when a measure is implemented, based on the generated data and the trained model; and a determination unit that performs an optimization calculation within the budget using the prediction result and the measure score, and determines a measure for each user.

According to the present disclosure, inequality in measures can be resolved.

FIG. 1 is a diagram showing a communication system.
FIG. 2 is a diagram showing hardware included in an information processing device.
FIG. 3 is a block diagram showing the functions of the information processing device.
FIG. 4 is a diagram showing an example of a behavior history table.
FIG. 5 is a diagram showing an example of a behavioral feature table.
FIG. 6 is a diagram showing an example of an attribute table.
FIG. 7 is a diagram showing an example of a measure result table.
FIG. 8 is a diagram showing an example of learning data.
FIG. 9 is a flowchart illustrating an example of processing executed by a learning unit.
FIG. 10 is a diagram showing an example of a measure candidate table.
FIG. 11 is a diagram showing an example of generated data.
FIG. 12 is a diagram showing an example of a prediction result.
FIGS. 13(A) and 13(B) are diagrams (part 1) illustrating an example of a method for calculating a measure score.
FIG. 14 is a diagram (part 2) illustrating an example of a method for calculating a measure score.
FIG. 15 is a diagram illustrating a specific example of processing executed by a determination unit.
FIG. 16 is a diagram showing an example of measures for each user.
FIG. 17 is a flowchart illustrating an example of processing executed by the information processing device.

Hereinafter, embodiments will be described with reference to the drawings. The following embodiments are merely examples, and various modifications can be made within the scope of the present disclosure.

Embodiment.
FIG. 1 is a diagram showing a communication system. The communication system includes an information processing device 100 and a terminal device 200. The information processing device 100 and the terminal device 200 communicate via a network.
The information processing device 100 is a device that executes a determination method. For example, the information processing device 100 is a server. Further, the information processing device 100 may be a personal computer, a smartphone, a tablet terminal, or the like.
Terminal device 200 is a device used by a user. For example, the terminal device 200 is a smartphone, a tablet terminal, or the like. In FIG. 1, one terminal device is depicted. The number of terminal devices may be two or more.

Next, hardware included in the information processing device 100 will be explained.
FIG. 2 is a diagram showing hardware included in the information processing device. The information processing device 100 includes a processor 101, a volatile storage device 102, a nonvolatile storage device 103, and a communication interface 104.

The processor 101 controls the entire information processing device 100. For example, the processor 101 is a CPU (Central Processing Unit), an FPGA (Field Programmable Gate Array), a DSP (Digital Signal Processor), or the like. Processor 101 may be a multiprocessor. Further, the information processing device 100 may include a processing circuit. Further, the information processing device 100 may include a microcomputer or a System on Chip (SoC).

The volatile storage device 102 is the main storage device of the information processing device 100. For example, the volatile storage device 102 is a RAM (Random Access Memory). The nonvolatile storage device 103 is an auxiliary storage device of the information processing device 100. For example, the nonvolatile storage device 103 is a ROM (Read Only Memory), an HDD (Hard Disk Drive), or an SSD (Solid State Drive).
The communication interface 104 communicates with the terminal device 200.

Next, the functions of the information processing device 100 will be explained.
FIG. 3 is a block diagram showing the functions of the information processing device. The information processing device 100 includes a storage unit 110, a learning unit 120, an acquisition unit 130, a generation unit 140, a prediction unit 150, a determination unit 160, and an output unit 170.

The storage unit 110 may be realized as a storage area secured in the volatile storage device 102 or the nonvolatile storage device 103.
Part or all of the learning unit 120, the acquisition unit 130, the generation unit 140, the prediction unit 150, the determination unit 160, and the output unit 170 may be realized by a processing circuit. Alternatively, part or all of these units may be realized as modules of a program executed by the processor 101. The program executed by the processor 101 is also referred to as a determination program. The determination program is recorded on, for example, a recording medium.

The storage unit 110 may store a behavior history table 111, a behavioral feature table 112, an attribute table 113, a measure result table 114, a trained model 115, and a measure candidate table 116. Each of these will be explained later.

<Learning phase>
The learning unit 120 generates the trained model 115. The functions of the learning unit 120 will be explained in detail.
The learning unit 120 acquires the behavior history table 111. An example of the behavior history table 111 is shown next.

FIG. 4 is a diagram showing an example of a behavior history table. The behavior history table 111 shows the behavior history of users. The behavior history table 111 has items such as user ID (identifier), date and time, and stay area. The behavior history table 111 may also include GPS (Global Positioning System) data and ticket gate entry/exit history.

The learning unit 120 extracts the behavioral features of each user based on the behavior history table 111 and registers them in the behavioral feature table 112. An example of the behavioral feature table 112 is shown next.

FIG. 5 is a diagram showing an example of a behavioral feature table. The behavior feature table 112 shows behavior characteristics for each user. The behavioral feature table 112 has items such as user ID, average store visit frequency, and average store visit time.
The learning unit 120 may extract facilities that the user often uses and times when the user often travels as behavioral features. Further, the learning unit 120 may extract behavior patterns on holidays and weekdays as behavior features.
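The feature extraction described above can be sketched as follows. This is only an illustration: the row layout, the function name `extract_features`, the observation period `period_weeks`, and the reading of "average store visit time" as the mean time of day of a visit are all assumptions, not details taken from the figures.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical behavior history rows: (user ID, date and time, stay area).
history = [
    ("00001", "2022-08-01 10:00", "A store"),
    ("00001", "2022-08-03 12:00", "A store"),
    ("00001", "2022-08-10 14:00", "B store"),
    ("00002", "2022-08-02 18:00", "A store"),
]

def extract_features(rows, period_weeks):
    """Return {user_id: (visits per week, mean visit hour)} (assumed feature definitions)."""
    hours = defaultdict(list)
    for user_id, stamp, _area in rows:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        hours[user_id].append(t.hour + t.minute / 60)
    return {
        user_id: (len(h) / period_weeks, sum(h) / len(h))
        for user_id, h in hours.items()
    }

features = extract_features(history, period_weeks=2)
```

Each entry of `features` corresponds to one row of a behavioral feature table keyed by user ID.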

The learning unit 120 acquires the attribute table 113. An example of the attribute table 113 is shown next.

FIG. 6 is a diagram showing an example of an attribute table. The attribute table 113 shows attributes for each user. The attribute table 113 has items such as user ID, age, gender, and address.

The learning unit 120 acquires the measure result table 114. An example of the measure result table 114 is shown next.

FIG. 7 is a diagram showing an example of a measure result table. The measure result table 114 shows the results of measures taken in the past. For example, the measure result table 114 shows that the user with the user ID "00001" used a 100 yen coupon at A store and paid 1,500 yen to A store. The 1,500 yen may also be expressed as sales of A store. Further, the measure result table 114 may include the increase in the number of store visits.

The learning unit 120 generates learning data based on the behavioral feature table 112, the attribute table 113, and the measure result table 114. An example of the learning data is shown next.

FIG. 8 is a diagram showing an example of learning data. The learning data 300 is data generated by the learning unit 120.
The learning unit 120 uses the learning data 300 to generate the trained model 115. Multiple regression analysis may be used to generate the trained model 115. For example, the learning unit 120 performs multiple regression analysis using equation (1).

Note that y is the objective variable. The values based on the behavioral feature table 112 and the attribute table 113 are x1 to xi. The values based on the measure result table 114 are z1 to zj. T is measure presence information. α1 to αi and β1 to βj are correction coefficients. γ is a constant term. Moreover, the measure effect may be calculated using equation (2).

Note that the case with a measure is the case where "1" is input to T. The case of no measures is the case where "0" is input to T.
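Equations (1) and (2) are not reproduced in this text. One form that is consistent with the symbol definitions above is given here purely as an assumption; in particular, the index names k and l and the placement of T as a multiplier on the measure terms are not taken from the original equations.

```latex
% Assumed form of equation (1): multiple regression with measure presence T
y = \sum_{k=1}^{i} \alpha_k x_k + T \sum_{l=1}^{j} \beta_l z_l + \gamma \tag{1}

% Assumed form of equation (2): measure effect as the difference between
% the prediction with a measure (T = 1) and without a measure (T = 0)
\mathrm{effect} = y\big|_{T=1} - y\big|_{T=0} \tag{2}
```

Under this assumed form, the measure effect reduces to the sum of the terms β_l z_l, because the terms that do not involve T cancel.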

Additionally, a known method may be used to generate the trained model 115. For example, a support vector machine (SVM), a gradient boosting decision tree (GBDT), Meta-Learner methods such as S-Learner and T-Learner, or tree-based causal inference methods such as Causal Tree and Causal Forest may be used.

In this way, the learned model 115 is generated. The trained model 115 is a model that predicts sales or an increase in the number of store visits. The learning unit 120 stores the learned model 115 in the storage unit 110 or an external device connectable to the information processing device 100. Note that the illustration of the external device is omitted.
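The multiple regression approach mentioned above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the model form (one behavioral feature x1, one measure value z1 multiplied by T, and a constant term), the synthetic data, and the helper names `solve`, `fit`, `design`, `predict`, and `effect` are all hypothetical and are not taken from the publication.

```python
def solve(a, b):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[r][n] / m[r][r] for r in range(n)]

def fit(rows, y):
    """Least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * t for r, t in zip(rows, y)) for a in range(k)]
    return solve(xtx, xty)

def design(x1, z1, t):
    # Assumed columns: behavioral feature, measure value gated by T, constant term.
    return [x1, t * z1, 1.0]

# Synthetic learning data: (visit frequency, coupon value in yen, T).
samples = [(1, 100, 1), (1, 100, 0), (2, 200, 1), (2, 200, 0), (3, 100, 1), (3, 100, 0)]
sales = [500.0 * x1 + 2.0 * t * z1 + 100.0 for x1, z1, t in samples]

alpha, beta, gamma = fit([design(*s) for s in samples], sales)

def predict(x1, z1, t):
    return alpha * x1 + beta * t * z1 + gamma

def effect(x1, z1):
    """Measure effect: prediction with the measure (T=1) minus without it (T=0)."""
    return predict(x1, z1, 1) - predict(x1, z1, 0)
```

Because the synthetic sales are generated from known coefficients, the fit recovers them up to floating-point error, which makes the sketch easy to check. Real learning data would contain more features and noise.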

Next, the process executed by the learning unit 120 will be explained using a flowchart.
FIG. 9 is a flowchart illustrating an example of processing executed by the learning section.
(Step S11) The learning unit 120 extracts the behavioral features of each user based on the behavior history table 111.
(Step S12) The learning unit 120 acquires the behavioral feature table 112, the attribute table 113, and the measure result table 114 from the storage unit 110.
(Step S13) The learning unit 120 generates learning data based on the behavioral feature table 112, the attribute table 113, and the measure result table 114.
(Step S14) The learning unit 120 generates the trained model 115 using the learning data.
(Step S15) The learning unit 120 stores the trained model 115 in the storage unit 110.

In the above description, the case where the information processing device 100 generates the trained model 115 has been described. The learned model 115 may be generated by a learning device other than the information processing device 100.

<Utilization phase>
Returning to FIG. 3, the functions of the acquisition unit 130 and the like will be explained.
The acquisition unit 130 acquires the trained model 115. For example, the acquisition unit 130 acquires the trained model 115 from the storage unit 110 or an external device.

The acquisition unit 130 acquires the behavioral feature table 112 and the attribute table 113. For example, the acquisition unit 130 acquires the behavioral feature table 112 and the attribute table 113 from the storage unit 110 or an external device. Here, the behavior feature table 112 is also referred to as behavior feature information. The attribute table 113 is also referred to as attribute information.
The acquisition unit 130 also acquires the measure candidate table 116. For example, the acquisition unit 130 acquires the measure candidate table 116 from the storage unit 110 or an external device. An example of the measure candidate table 116 is shown next.

FIG. 10 is a diagram showing an example of a measure candidate table. The measure candidate table 116 shows measure candidates. The measure candidate table 116 may be expressed as information indicating candidates of measures to be implemented for a plurality of users. The measure candidate table 116 is also referred to as measure candidate information.
The measure candidate table 116 in FIG. 10 shows nine measures. For example, measure No. 1 indicates providing a 100 yen coupon for A store, and measure No. 2 indicates providing a 100 yen coupon for B store.

The generation unit 140 generates data based on the behavioral feature table 112, the attribute table 113, and the measure candidate table 116. The data is input to the trained model 115. An example of the generated data is shown next.

FIG. 11 is a diagram showing an example of generated data. The generation unit 140 generates nine pieces of data based on the behavioral characteristics, attributes, and nine measures of the user ID "00001." The generation unit 140 similarly generates data corresponding to all users, such as user ID "00002".
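The generation step can be pictured as pairing every user with every measure candidate. The following is a minimal sketch, assuming hypothetical column names (`visit_freq`, `age`, `measure_no`, `store`, `coupon_yen`) and hypothetical user values. Only measures No. 1 and No. 2 follow the description of FIG. 10; the ordering of the remaining seven is assumed.

```python
from itertools import product

# Hypothetical per-user values merged from the behavioral feature and attribute tables.
users = {
    "00001": {"visit_freq": 1.5, "age": 34},
    "00002": {"visit_freq": 0.5, "age": 52},
}

# Nine measure candidates. No. 1 (A store, 100 yen) and No. 2 (B store, 100 yen)
# match the text; the ordering of the rest is an assumption.
measures = [
    {"measure_no": n + 1, "store": store, "coupon_yen": yen}
    for n, (yen, store) in enumerate(product((100, 200, 300), "ABC"))
]

def generate_rows(users, measures):
    """Pair every user with every measure candidate (one model input row each)."""
    return [
        {"user_id": user_id, **features, **measure}
        for (user_id, features), measure in product(users.items(), measures)
    ]

rows = generate_rows(users, measures)
```

With two users and nine measures, `rows` holds eighteen entries, nine per user, each of which would be one input to the trained model.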

The prediction unit 150 predicts the increase in sales or the number of store visits when the measures are implemented, based on the data generated by the generation unit 140 and the trained model 115. Specifically, the prediction unit 150 inputs the data to the trained model 115, and the trained model 115 outputs the increase in sales or the number of store visits when the measure is implemented. An example of the prediction results is shown next.

FIG. 12 is a diagram showing an example of prediction results. The prediction result in FIG. 12 shows the increase in the number of store visits. For example, a prediction result of "1.2" indicates that if a 100 yen coupon for A store is provided to the user with user ID "00001", the number of store visits by that user will increase by "1.2".

The acquisition unit 130 acquires a measure score, which is a value for alleviating inequality in measures. For example, the acquisition unit 130 acquires the measure score from the storage unit 110 or an external device. Here, the method for calculating the measure score will be explained.

FIGS. 13(A) and 13(B) are diagrams (part 1) illustrating an example of a method for calculating a measure score. FIG. 13(A) shows a measure score that evaluates whether different users receive a measure on the previous day and today. "Coef₁" in the equation of FIG. 13(A) may be set by the user or may be set automatically. "i" in the equation of FIG. 13(A) is set to "0" or "1": "0" indicates that no measure is taken, and "1" indicates that a measure is taken. When this measure score is calculated, the previous day and plan 2 in the table of FIG. 13(A) are referred to. The measure score is then calculated using "Coef₁" and the inner product of the previous day and plan 2.

FIG. 13(B) shows a measure score that evaluates whether the assignment is free of bias toward a specific measure among a plurality of measures. "Coef₂" in the equation of FIG. 13(B) may be set by the user or may be set automatically. The case where there is no bias toward a specific measure is plan 2 in the table of FIG. 13(B). For example, plan 2 indicates that coupon A is provided to 21 people, coupon B to 20 people, and coupon C to 19 people. In this way, plan 2 shows a case where coupons A to C are each provided to many people. In other words, plan 2 is not a case where a single coupon (for example, coupon A) is provided to many people; such a case is, for example, plan 1. The measure score is calculated using "Coef₂" and the standard deviation based on plan 2.

FIG. 14 is a diagram (part 2) illustrating an example of a method for calculating a measure score. FIG. 14 shows a measure score that evaluates whether measures are implemented for different users. "Coef₃" in the equation of FIG. 14 may be set by the user or may be set automatically. The case where measures are taken for different users is plan 2 in the table of FIG. 14. The measure score is calculated using "Coef₃" and the standard deviation based on plan 2.
In this way, the measure score is calculated. The calculation methods shown in FIGS. 13 and 14 are examples. Therefore, the policy score may be calculated by a method other than the above.
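The two kinds of calculation named above, an inner product and a standard deviation, can be sketched as follows. The exact equations appear only in the figures, so the function names, the use of the population standard deviation, and the simple multiplication by the coefficient are assumptions made for illustration.

```python
from statistics import pstdev

def overlap_score(previous_day, today, coef1):
    """coef1 times the inner product of two 0/1 vectors (1 = the user receives a measure)."""
    return coef1 * sum(p * t for p, t in zip(previous_day, today))

def spread_score(counts, coef2):
    """coef2 times the population standard deviation of per-coupon recipient counts."""
    return coef2 * pstdev(counts)

# Hypothetical recipients on the previous day and under two plans for today.
previous_day = [1, 1, 0, 0]
same_users = [1, 1, 0, 0]
different_users = [0, 0, 1, 1]

# Recipient counts per coupon: the 21/20/19 split is from the text,
# the concentrated split is a hypothetical contrast.
even_counts = [21, 20, 19]
concentrated_counts = [58, 1, 1]
```

In this sketch a smaller value means a more even assignment: the inner product is zero when today's recipients differ entirely from the previous day's, and the standard deviation shrinks as the coupons are spread more evenly.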

Additionally, the acquisition unit 130 acquires budget information indicating the budget. For example, the acquisition unit 130 acquires budget information from the storage unit 110 or an external device.

The determining unit 160 uses the prediction results and the measure score to perform an optimization calculation within the budget and determines a measure for each user. Specifically, the determining unit 160 performs the optimization calculation using equation (3), with equation (4) as a constraint condition.

Here, equation (3) is also called an objective function. The determining unit 160 performs optimization calculation using equation (3) in order to maximize the objective function. Further, in the optimization calculation, an optimization method such as the greedy method, a parameter search method such as gradient descent method, Bayesian optimization, etc. may be used.
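As one illustration of the greedy method mentioned above, the following sketch selects at most one measure per user within a budget. Equations (3) and (4) are not reproduced in this text, so the objective used here (predicted increase minus a penalty of `coef1` for users who also received a measure on the previous day), the ranking by value per yen, and all candidate values other than the 1.2 prediction for user "00001" are assumptions.

```python
def greedy_assign(candidates, budget, received_yesterday, coef1):
    """Greedy selection of (user_id, measure, predicted_increase, cost_yen) tuples.

    Returns the chosen measure per user and the amount of budget spent.
    """
    def value(c):
        user_id, _measure, increase, _cost = c
        # Assumed penalty: subtract coef1 if the user already received a measure yesterday.
        return increase - coef1 * (1 if user_id in received_yesterday else 0)

    ranked = sorted(candidates, key=lambda c: value(c) / c[3], reverse=True)
    chosen, remaining = {}, budget
    for c in ranked:
        user_id, measure, _increase, cost = c
        if user_id in chosen or cost > remaining or value(c) <= 0:
            continue
        chosen[user_id] = measure
        remaining -= cost
    return chosen, budget - remaining

# Hypothetical prediction results and costs.
candidates = [
    ("00001", "A store 100 yen", 1.2, 100),
    ("00001", "C store 300 yen", 1.5, 300),
    ("00002", "A store 100 yen", 0.8, 100),
    ("00003", "C store 300 yen", 1.4, 300),
    ("00004", "A store 200 yen", 0.9, 200),
]

chosen, spent = greedy_assign(candidates, 600, {"00001"}, coef1=1.0)
```

A greedy pass like this is fast but does not guarantee the optimum; it is shown only because the text names it as one usable method. Setting `coef1` to 0 removes the penalty, in which case user "00001" is selected again despite having received a measure on the previous day.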

The processing executed by the determining unit 160 will be explained using a specific example.

FIG. 15 is a diagram illustrating a specific example of processing executed by the determining unit. First, assume that the budget is 600 yen. The determining unit 160 considers measure A within the budget. Measure A provides a 300 yen coupon for C store to the user with user ID "00001", a 100 yen coupon for A store to the user with user ID "00002", and a 200 yen coupon for A store to the user with user ID "00004". The determining unit 160 calculates the total value of the prediction results corresponding to measure A. The determining unit 160 calculates an evaluation value for measure A using the total value of the prediction results and the measure score.

The determining unit 160 also considers measure B within the budget. Measure B provides a 200 yen coupon for B store to the user with user ID "00001", a 100 yen coupon for C store to the user with user ID "00002", and a 300 yen coupon for C store to the user with user ID "00003". The determining unit 160 calculates the total value of the prediction results corresponding to measure B. The determining unit 160 calculates an evaluation value for measure B using the total value of the prediction results and the measure score.

The determining unit 160 compares the evaluation value for measure A with the evaluation value for measure B. In this example, the determining unit 160 determines from the comparison that measure A is the better of the two. The determining unit 160 repeats the same process to search for the optimal measure. Then, the determining unit 160 determines the optimal measure as the measure for each user. Here, an example of measures for each user will be shown.

FIG. 16 is a diagram showing an example of measures for each user. The table in FIG. 16 shows the determined measures for each user. For example, the user with user ID "00001" is provided with a 100 yen coupon from A store as a measure.
Further, the table in FIG. 16 shows the prediction results and costs corresponding to the determined measures.

Note that the above describes a case where the measure score acquired by the acquisition unit 130 is used. Alternatively, the determining unit 160 may calculate a measure score based on the measure selected within the budget and use the calculated measure score. For example, the determining unit 160 may calculate a measure score based on measure A and use the calculated measure score. Furthermore, if the calculated measure score is greater than the measure score acquired by the acquisition unit 130, the determining unit 160 may re-determine the measure.

Additionally, the determining unit 160 may perform optimization calculation using equation (5). Further, equation (4) is used as a constraint condition.

The output unit 170 outputs the measure for each user as measure information.

Next, the processing executed by the information processing apparatus 100 will be explained using a flowchart.
FIG. 17 is a flowchart illustrating an example of processing executed by the information processing device.
(Step S21) The acquisition unit 130 acquires the trained model 115.
(Step S22) The acquisition unit 130 acquires the behavioral feature table 112, the attribute table 113, and the measure candidate table 116.
(Step S23) The generation unit 140 generates data based on the behavioral feature table 112, the attribute table 113, and the measure candidate table 116.
(Step S24) Based on the generated data and the trained model 115, the prediction unit 150 predicts the increase in sales or the number of store visits when the measure is implemented.
(Step S25) The acquisition unit 130 acquires the measure score.
(Step S26) The acquisition unit 130 acquires budget information.
(Step S27) The determining unit 160 uses the prediction result and the measure score to perform optimization calculations within the budget, and determines a measure for each user.
(Step S28) The output unit 170 outputs the measure for each user as measure information.

If the budget were unlimited, the best coupon could be provided to every user. In practice, however, the budget is limited, so the information processing device 100 performs optimization calculations to maximize the effect of the measures within the budget. If the optimization calculation were performed without further consideration, the result might assign the best coupon to the same specific user every time. For example, the result might always be that the user with user ID "00001" is provided with a 300 yen coupon for A store. To prevent such biased results from being output, the information processing device 100 uses the measure score in the optimization calculation. Preventing biased results eliminates inequality in measures. Therefore, according to the embodiment, the information processing device 100 can eliminate inequality in measures.

The above example describes a case where coupons are provided as a measure. Other items, such as novelty items, may also be provided as a measure.

100 Information processing device, 101 Processor, 102 Volatile storage device, 103 Nonvolatile storage device, 104 Communication interface, 110 Storage unit, 111 Behavior history table, 112 Behavioral feature table, 113 Attribute table, 114 Measure result table, 115 Trained model, 116 Measure candidate table, 120 Learning unit, 130 Acquisition unit, 140 Generation unit, 150 Prediction unit, 160 Determination unit, 170 Output unit, 200 Terminal device, 300 Learning data.

Claims

An information processing device comprising:
an acquisition unit that acquires a trained model, behavioral feature information indicating behavioral features of each user, attribute information indicating attributes of each user, measure candidate information indicating measure candidates, a measure score that is a value for alleviating inequality in measures, and budget information indicating a budget;
a generation unit that generates data based on the behavioral feature information, the attribute information, and the measure candidate information;
a prediction unit that predicts the increase in sales or the number of store visits when a measure is implemented, based on the generated data and the trained model; and
a determination unit that performs an optimization calculation within the budget using the prediction result and the measure score, and determines a measure for each user.
A determination method in which an information processing device:
acquires a trained model, behavioral feature information indicating behavioral features of each user, attribute information indicating attributes of each user, measure candidate information indicating measure candidates, a measure score that is a value for alleviating inequality in measures, and budget information indicating a budget;
generates data based on the behavioral feature information, the attribute information, and the measure candidate information;
predicts the increase in sales or the number of store visits when a measure is implemented, based on the generated data and the trained model; and
performs an optimization calculation within the budget using the prediction result and the measure score, and determines a measure for each user.
A determination program that causes an information processing device to execute a process comprising:
acquiring a trained model, behavioral feature information indicating behavioral features of each user, attribute information indicating attributes of each user, measure candidate information indicating measure candidates, a measure score that is a value for alleviating inequality in measures, and budget information indicating a budget;
generating data based on the behavioral feature information, the attribute information, and the measure candidate information;
predicting the increase in sales or the number of store visits when a measure is implemented, based on the generated data and the trained model; and
performing an optimization calculation within the budget using the prediction result and the measure score, and determining a measure for each user.