WO2022062449A1

WO2022062449A1 - User grouping method and apparatus, and electronic device and storage medium

Info

Publication number: WO2022062449A1
Application number: PCT/CN2021/096532
Authority: WO
Inventors: 徐卓扬; 赵惟; 孙行智; 胡岗; 左磊; 赵婷婷
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-09-25
Filing date: 2021-05-27
Publication date: 2022-03-31
Also published as: CN112115322A; CN112115322B

Abstract

A user grouping method and apparatus, and an electronic device and a computer-readable storage medium. The method comprises: acquiring, from a database, return visit data of a user, and organizing the return visit data to obtain sample data (S1); training a pre-built grouping prediction model, so as to obtain an output result of the sample data (S2); adjusting a loss function of a pre-built user grouping model on the basis of the output result, so as to obtain an optimized loss function (S3); training the user grouping model according to the optimized loss function, so as to obtain an optimized user grouping model (S4); and by using the optimized user grouping model, grouping user data to be grouped, so as to obtain a grouping result, and outputting the grouping result by means of a display screen (S5). The efficiency and scalability of user grouping are improved. The return visit data can also be stored in a blockchain.

Description

User grouping method, device, electronic device and storage medium

This application claims priority to the Chinese patent application filed on September 25, 2020 with the application number CN202011021840.8 and the invention title "User Grouping Method, Device, Electronic Device and Storage Medium", the entire content of which is incorporated into this application by reference.

Technical field

The present application relates to the technical field of artificial intelligence, and in particular, to a user grouping method, apparatus, electronic device, and computer-readable storage medium.

Background art

Users differ in age, gender and other characteristics, so the service methods or strategies suited to different users also differ. For example, different patients may be treated differently even if they have the same disease. Therefore, it is necessary to divide patients into several subgroups and formulate a different treatment method for each subgroup to achieve the best treatment effect.

The inventor realized that current user grouping methods are either knowledge-based or based on a combination of knowledge and data. Both require professional guideline knowledge, such as professional medical knowledge, to be sorted out manually. This sorting takes a great deal of human time, is costly, and is inefficient. In addition, both grouping methods rely on guideline knowledge rather than on purely data-driven models, so they lack scalability.

Summary of the invention

A user grouping method, the method is applied to an electronic device, and includes:

Obtain the user's return visit data from the database communicatively connected with the electronic device, and organize the return visit data to obtain sample data;

Use the sample data to train a pre-built grouping prediction model, and use the trained grouping prediction model to obtain an output result of the sample data;

Adjust the loss function of the pre-built user grouping model based on the output result to obtain an optimized loss function;

According to the optimization loss function, use the sample data to train the user grouping model to obtain an optimized user grouping model;

The user data to be grouped is grouped by using the optimized user grouping model, a grouping result is obtained, and the grouping result is output through the display screen of the electronic device.

A user grouping device, the device comprising:

The sample data acquisition module is used to acquire the user's return visit data from the database communicatively connected with the electronic device, and organize the return visit data to obtain the sample data;

a grouping prediction model training module, configured to use the sample data to train a pre-built grouping prediction model, and use the trained grouping prediction model to obtain an output result of the sample data;

a loss function improvement module, configured to adjust the loss function of the pre-built user grouping model based on the output result to obtain an optimized loss function;

The user grouping model training module, according to the optimization loss function, uses the sample data to train the user grouping model to obtain an optimized user grouping model;

The grouping module is configured to use the optimized user grouping model to group the user data to be grouped, obtain a grouping result, and output the grouping result through the display screen of the electronic device.

An electronic device comprising:

a memory that stores at least one computer program instruction; and

A processor that executes the computer program instructions stored in the memory to implement the steps of the user grouping method described above.

A computer-readable storage medium storing a computer program, the computer program being executed by a processor to implement the steps of the user grouping method described above, beginning with obtaining the user's return visit data from a database communicatively connected to the electronic device and organizing the return visit data to obtain sample data.

The present application can achieve the purpose of more efficient, scalable, and purely data-driven user grouping.

Description of drawings

FIG. 1 is a schematic flowchart of a user grouping method provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of a method for generating sample data according to an embodiment of the present application;

FIG. 3 is a schematic flowchart of a model training method provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of a method for improving a loss function provided by an embodiment of the present application;

FIG. 5 is a schematic flowchart of a method for generating an optimized user grouping model according to an embodiment of the present application;

FIG. 6 is a schematic flowchart of a grouping method provided by an embodiment of the present application;

FIG. 7 is a schematic block diagram of a user grouping device according to an embodiment of the present application;

FIG. 8 is a schematic diagram of an internal structure of an electronic device for implementing a user grouping method provided by an embodiment of the present application;

The realization, functional characteristics and advantages of the purpose of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.

Detailed description

It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application.

The execution subject of the user grouping method provided by the embodiment of the present application includes, but is not limited to, at least one of the electronic devices that can be configured to execute the method provided by the embodiment of the present application, such as a server and a terminal. In other words, the user grouping method can be executed by software or hardware installed in a terminal device or a server device, and the software can be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like.

FIG. 1 is a schematic flowchart of a user grouping method provided by an embodiment of the present application. In this embodiment, the user grouping method includes:

S1. Acquire the user's return visit data from a database, and organize the return visit data to obtain sample data.

In the embodiment of the present application, the database is connected in communication with the electronic device that executes the user grouping method described in this solution.

Preferably, in one embodiment of the present application, the user is a patient suffering from a disease. The user's return visit data therefore includes long-term follow-up records of multiple patients, including but not limited to indicator data such as demographic information, examination and test indicators, medication history, and the medication prescribed by experts. The medication prescribed by an expert can be regarded as the expert grouping and serves as the standard grouping result for user grouping.

In the embodiment of the present application, the return visit data can be obtained from the database of the medical platform. In order to ensure the privacy and security of the return visit data, the return visit data can also be obtained from a preset blockchain node.

In detail, referring to FIG. 2, organizing the return visit data to obtain sample data includes:

S10, sorting the return visit data in chronological order to obtain initial sample data;

S11. Convert the index data in the initial sample data into a multi-dimensional feature vector to obtain sample data.
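Steps S10 and S11 can be sketched as follows. This is a minimal illustration only: the record fields (`patient_id`, `visit_date`, `age`, `sex`, `hba1c`, `on_insulin`), the encoding, and the choice to sort within each patient are all assumptions made for the example and are not specified in the application.

```python
from datetime import date

# Hypothetical return visit records; every field name here is illustrative.
visits = [
    {"patient_id": "p1", "visit_date": date(2020, 6, 1),
     "age": 54, "sex": "F", "hba1c": 7.9, "on_insulin": True},
    {"patient_id": "p1", "visit_date": date(2020, 1, 15),
     "age": 53, "sex": "F", "hba1c": 8.4, "on_insulin": False},
]


def to_feature_vector(visit):
    """Encode one visit's indicator data as a fixed-length numeric vector."""
    return [
        float(visit["age"]),
        1.0 if visit["sex"] == "F" else 0.0,
        float(visit["hba1c"]),
        1.0 if visit["on_insulin"] else 0.0,
    ]


def build_samples(records):
    # S10: order the records chronologically (here, within each patient).
    ordered = sorted(records, key=lambda v: (v["patient_id"], v["visit_date"]))
    # S11: convert the indicator data of each record into a feature vector.
    return [to_feature_vector(v) for v in ordered]


samples = build_samples(visits)
```

Sorting before vectorizing keeps consecutive vectors of one patient adjacent, which is convenient later when a sample and its "next sample" are needed together.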

S2. Use the sample data to train a pre-built grouping prediction model, and use the trained grouping prediction model to obtain an output result of the sample data.

Preferably, the grouping prediction model described in this application is a deep neural network (Deep Neural Networks, DNN) model for multi-class prediction. The DNN model includes an input layer, a hidden layer, an output layer and a softmax function. The input layer receives the data; the hidden layer performs computation on the data and enhances the classification capability of the model; the output layer includes a plurality of output nodes, each of which outputs the score of the category corresponding to that node; and the softmax function converts the output scores into probability values.
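The role of the softmax function can be illustrated with a short sketch. The scores below are made-up values standing in for the raw outputs of three output nodes; the function itself is the standard softmax, not code taken from the application.

```python
import math


def softmax(scores):
    """Convert raw output-node scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative scores for three grouping schemes.
probs = softmax([2.0, 1.0, 0.1])
```

The ordering of the scores is preserved, so the scheme with the highest score also receives the highest probability.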

Further, for the pre-built grouping prediction model, the grouping prediction model needs to be trained to improve the accuracy of the grouping prediction model.

In detail, referring to FIG. 3 , the use of the sample data to train the pre-built grouping prediction model includes:

S20, using the grouping prediction model to perform a grouping operation on the sample data to obtain prediction probability values of multiple grouping schemes;

S21, calculating the cross-entropy loss function of the predicted probability value and the standard grouping result to obtain a loss value;

S22. Modify the parameters of the grouping prediction model according to the loss value, and re-perform the grouping operation on the sample data by using the modified grouping prediction model until a preset stop condition is reached.

Wherein, the preset stop condition means that the loss value no longer decreases.

The cross-entropy loss function described in the embodiments of the present application includes:

$$H(p,q) = -\sum_{i=1}^{n} p(x_i)\,\log q(x_i)$$

Among them, H(p,q) is the loss function value, n is the total number of grouping schemes, p(x_i) is the true probability value of the i-th grouping scheme, and q(x_i) is the predicted probability value of the i-th grouping scheme.
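As a worked example of the loss value in S21, the sketch below evaluates the cross-entropy in its standard form, H(p,q) = -Σ p(x_i) log q(x_i), for one sample. The probabilities are invented for illustration, and the small `eps` guard against log(0) is an addition of this example.

```python
import math


def cross_entropy(p_true, q_pred, eps=1e-12):
    """H(p, q) = -sum_i p(x_i) * log q(x_i) over the n grouping schemes."""
    return -sum(p * math.log(max(q, eps)) for p, q in zip(p_true, q_pred))


# The expert chose the second of three schemes (one-hot true distribution).
p = [0.0, 1.0, 0.0]
# Predicted probabilities from the grouping prediction model (illustrative).
q = [0.2, 0.7, 0.1]
loss = cross_entropy(p, q)
```

With a one-hot true distribution the loss reduces to -log of the probability assigned to the expert's scheme, so it shrinks toward 0 as that probability approaches 1.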

Further, in the embodiment of the present application, the sample data is input into the trained grouping prediction model, and the output result of the sample data is obtained.

S3. Adjust the loss function of the pre-built user grouping model based on the output result to obtain an optimized loss function.

Preferably, the pre-built user grouping model is a DQN (Deep Q-Network, deep Q-value learning) model based on a deep reinforcement learning algorithm, which can optimize the long-term goal of a sequential decision problem.

Preferably, the input of the DQN model is the state, the output is the Q value (expected reward) corresponding to each action, and the reward participates in training to optimize the model's selection of actions. In a preferred embodiment of the present application, the input state of the user grouping model is the sample data, the action is the unique code of a grouping scheme, and the reward varies according to the type of disease. Taking diabetes as an example, reward = -(whether a complication occurs at the user's next return visit) - (whether a hypoglycemia event occurs at the user's next return visit) + (whether the user's glycated hemoglobin meets the standard at the next return visit), where each term is taken as 1 if the condition holds and 0 otherwise.
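The diabetes reward can be written out as a small function. This sketch assumes each of the three conditions is a yes/no observation from the next return visit, counted as 1 or 0; the function and parameter names are hypothetical.

```python
def diabetes_reward(has_complication, has_hypoglycemia, hba1c_on_target):
    """reward = -(complication) - (hypoglycemia event) + (HbA1c meets the standard)."""
    return -int(has_complication) - int(has_hypoglycemia) + int(hba1c_on_target)


best_case = diabetes_reward(False, False, True)
worst_case = diabetes_reward(True, True, False)
```

Under this reading the reward ranges from -2 (both adverse events, target missed) to +1 (no adverse events, target met).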

The loss function described in the embodiments of the present application is as follows:

L=R+Q(s′,a″)-Q(s,a)

Among them, s is the current sample data; a is the current grouping scheme; s′ is the next sample data after the current sample data; a″ is the grouping scheme corresponding to the maximum Q value output after the sample data s′ is input into the user grouping model; Q(s,a) is the Q value of grouping scheme a output by the user grouping model when the input is sample data s; Q(s′,a″) is the Q value of grouping scheme a″ output by the user grouping model when the input is sample data s′; and R is the reward of the sample data s.
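The loss above can be evaluated for a single transition as follows. The Q values are invented numbers standing in for the user grouping model's outputs, and the sketch keeps the formula exactly as written. DQN implementations commonly also apply a discount factor and square the difference, but neither appears in the formula here, so neither is added.

```python
def dqn_loss(reward, q_current, action, q_next):
    """L = R + Q(s', a'') - Q(s, a), where a'' maximizes Q(s', .)."""
    best_next = max(q_next)  # Q(s', a'')
    return reward + best_next - q_current[action]


# Illustrative Q values for three grouping schemes.
q_s = [0.2, 0.5, 0.1]       # model output for the current sample s
q_s_next = [0.4, 0.3, 0.9]  # model output for the next sample s'
loss_value = dqn_loss(reward=1, q_current=q_s, action=1, q_next=q_s_next)
```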

Preferably, in order to make the grouping result of the user grouping model as close to the expert grouping result as possible and improve the reliability of the grouping result, the loss function needs to be improved.

In detail, referring to Fig. 4, the loss function of the pre-built user grouping model is improved based on the output result, including:

S30, modifying the selection method of the grouping scheme in the loss function;

S31. Add a preset penalty item to the loss function.

Further, the method for modifying the selection of the grouping scheme in the loss function includes:

The selection method is modified to the following function:

$$a''' = \underset{a' \in A'_{DNN}}{\arg\max}\; Q(s', a')$$

Among them, a″′ is the grouping scheme in A′_DNN with the maximum Q value output after the sample data s′ is input into the user grouping model; Q(s′,a′) is the Q value of grouping scheme a′ output by the user grouping model when the input is sample data s′; A′_DNN is the set of n grouping schemes with the highest predicted probability output by the grouping prediction model when its input is sample data s′; and n is a preset constant, which can be 1/3 of the total number of grouping schemes.
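A minimal sketch of the modified selection, assuming it takes the maximum Q value only over the n schemes that the grouping prediction model ranks highest for s′ (the set A′_DNN). The function names and all numbers are illustrative.

```python
def top_n_schemes(pred_probs, n):
    """Indices of the n schemes with the highest predicted probability (A'_DNN)."""
    ranked = sorted(range(len(pred_probs)), key=lambda a: pred_probs[a], reverse=True)
    return ranked[:n]


def restricted_best_scheme(q_next, pred_probs, n):
    """Scheme with the largest Q(s', a') among the schemes in A'_DNN."""
    candidates = top_n_schemes(pred_probs, n)
    return max(candidates, key=lambda a: q_next[a])


# Six grouping schemes; n is one third of the total, as suggested in the text.
q_next = [0.9, 0.2, 0.6, 0.1, 0.3, 0.4]              # Q values for s'
pred_probs = [0.05, 0.30, 0.25, 0.10, 0.20, 0.10]    # prediction model output for s'
n = len(q_next) // 3
a_best = restricted_best_scheme(q_next, pred_probs, n)
```

In this example the unrestricted maximum is scheme 0, but the prediction model considers it unlikely to be an expert choice, so the restricted selection picks scheme 2 instead.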

Further, the preset penalty item penalizes the current grouping scheme for scoring higher than the expert grouping scheme, and includes:

Among them, P(s) is the penalty value; Q(s,a) is the Q value of grouping scheme a output by the user grouping model when the input is sample data s; A_DNN is the set of n grouping schemes with the highest predicted probability output by the grouping prediction model when its input is sample data s, where n is a preset constant that can be 1/3 of the total number of grouping schemes; and the averaged term is the mean of the Q values output by the user grouping model, when the input is sample data s, over all grouping schemes belonging to A_DNN.

In detail, the embodiment of the present application improves the loss function through the above steps to obtain an optimized loss function. Further, the optimized loss function includes:

L=R+Q(s′,a″′)-Q(s,a)+P(s)

Among them, s is the current sample data; a is the current grouping scheme; s′ is the next sample data after the current sample data; a″′ is the grouping scheme corresponding to the maximum Q value output after the sample data s′ is input into the user grouping model; Q(s,a) is the Q value of grouping scheme a output by the user grouping model when the input is sample data s; Q(s′,a″′) is the Q value of grouping scheme a″′ output by the user grouping model when the input is sample data s′; R is the reward of the sample data s; and P(s) is the penalty value.
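The optimized loss can be sketched for one transition as below. Two assumptions are made for the example: the candidate schemes for s′ (`allowed_next`) are supplied by the caller, and the penalty P(s) is passed in as an already computed number, since its expression is not reproduced here. All values are illustrative.

```python
def optimized_loss(reward, q_current, action, q_next, allowed_next, penalty):
    """L = R + Q(s', a''') - Q(s, a) + P(s)."""
    # Q(s', a'''): the best Q value among the allowed schemes for s'.
    q_next_best = max(q_next[a] for a in allowed_next)
    return reward + q_next_best - q_current[action] + penalty


opt_loss = optimized_loss(
    reward=1,
    q_current=[0.2, 0.5, 0.1],  # Q(s, .)
    action=1,                   # current grouping scheme a
    q_next=[0.9, 0.2, 0.6],     # Q(s', .)
    allowed_next=[1, 2],        # hypothetical candidate set for s'
    penalty=0.25,               # hypothetical precomputed P(s)
)
```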

Preferably, only purely data-driven models are used in the present application. During model training, however, the improved loss function constrains the model so that it tends to adopt the grouping scheme that experts are most likely to choose, thereby improving the credibility of the grouping scheme.

S4. According to the optimization loss function, use the sample data to train the user grouping model to obtain an optimized user grouping model.

In detail, referring to Fig. 5, the S4 includes:

S40, inputting the sample data into the user grouping model to obtain a training result;

S41, using the optimized loss function to calculate the loss value of the training result;

S42, comparing the loss value with a preset loss threshold;

S43, when the loss value is greater than or equal to the loss threshold, adjust the parameters of the user grouping model, and return to S40 to retrain to obtain a training result;

S44. When the loss value is less than the loss threshold, obtain the optimized user grouping model.
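The control flow of S40 to S44 can be sketched with a toy stand-in for the model. Everything here is hypothetical: the "model" is a single number `w`, the loss is a squared distance to a target, and the `max_rounds` safety cap is an addition of this example rather than part of the described method.

```python
def train_until_threshold(params, compute_loss, adjust, loss_threshold, max_rounds=10_000):
    """Repeat S40-S43 until the loss falls below the threshold (S44)."""
    for _ in range(max_rounds):
        loss = compute_loss(params)   # S40 + S41: run the model, compute the loss
        if loss < loss_threshold:     # S42 + S44: below threshold, training is done
            return params, loss
        params = adjust(params)       # S43: adjust parameters and try again
    return params, compute_loss(params)


TARGET = 3.0


def toy_loss(w):
    return (w - TARGET) ** 2


def toy_adjust(w):
    # One gradient-descent step on the toy loss with step size 0.1.
    return w - 0.1 * 2 * (w - TARGET)


w, final_loss = train_until_threshold(0.0, toy_loss, toy_adjust, loss_threshold=1e-4)
```

In a real setting `compute_loss` would evaluate the optimized loss function on the sample data and `adjust` would be an optimizer step on the user grouping model's parameters.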

Preferably, the present application utilizes a large amount of user return visit data collected for training and learning, and the data utilization rate is relatively high.

S5. Use the optimized user grouping model to group the user data to be grouped, obtain a grouping result, and output the grouping result.

In detail, referring to FIG. 6 , the user data to be grouped is grouped by using the optimized user grouping model to obtain a grouping scheme, including:

S50, input the user data to be grouped into the optimized user grouping model;

S51, using the optimized user grouping model to output each grouping scheme of the user data to be grouped and the expected reward value (Q value) corresponding to each grouping scheme;

S52. Select the grouping scheme with the largest expected reward value (Q value) as the grouping result of the user data to be grouped.
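Steps S51 and S52 amount to choosing the scheme with the largest Q value. In the sketch below the scheme codes and Q values are invented; in practice the mapping would come from the optimized user grouping model.

```python
def select_grouping(q_values):
    """Return the grouping scheme code with the largest expected reward (Q value)."""
    return max(q_values, key=q_values.get)


# Hypothetical model output for one user: scheme code -> Q value.
q_values = {"scheme_A": 0.42, "scheme_B": 0.77, "scheme_C": 0.15}
result = select_grouping(q_values)
```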

Preferably, in the preferred embodiment of the present application, the optimized user grouping model is used to group patients, and the obtained grouping results can help doctors to quickly understand the treatment conditions of the patients, so as to carry out the next treatment plan.

In this embodiment of the present application, a large amount of return visit data is collected as sample data, which facilitates the subsequent optimization of the grouping model. The sample data is used to train a pre-built grouping prediction model, and the trained grouping prediction model is used to obtain the output result of the sample data; performing grouping prediction with a model improves work efficiency. The loss function of the pre-built user grouping model is adjusted based on the output result to obtain an optimized loss function; the improved loss function constrains the user grouping model to favor the grouping scheme that experts are most likely to choose, which improves the accuracy of the grouping scheme. The user grouping model is trained with the sample data according to the optimized loss function to obtain an optimized user grouping model; because the collected sample data is used for training, the collected information is not wasted and data utilization is improved. The optimized user grouping model is then used to group the user data to be grouped and obtain a grouping result, which saves a great deal of manual labor, and the optimized user grouping model is highly scalable and easy to extend later. Therefore, the user grouping method, device and computer-readable storage medium proposed in this application can achieve the purpose of more efficient, scalable, and purely data-driven user grouping.

FIG. 7 is a functional block diagram of the user grouping device of the present application.

The user grouping apparatus 100 described in this application may be installed in an electronic device. According to the implemented functions, the user grouping apparatus 100 may include a sample data acquisition module 101 , a grouping prediction model training module 102 , a loss function improvement module 103 , a user grouping model training module 104 and a grouping module 105 . The modules described in the present invention can also be called units, which refer to a series of computer program segments that can be executed by the electronic device processor and can perform fixed functions, and are stored in the memory of the electronic device.

In this embodiment, the functions of each module/unit are as follows:

The sample data acquisition module 101 is configured to acquire the user's return visit data from a database, and organize the return visit data to obtain sample data.

In detail, when sorting the return visit data to obtain sample data, the sample data acquisition module 101 specifically performs the following operations:

Sorting the return visit data in chronological order to obtain initial sample data;

Convert the index data in the initial sample data into a multi-dimensional feature vector to obtain sample data.

The grouping prediction model training module 102 is configured to use the sample data to train a pre-built grouping prediction model, and use the trained grouping prediction model to obtain an output result of the sample data.

In detail, when using the sample data to train the pre-built grouping prediction model, the grouping prediction model training module 102 specifically performs the following operations:

Perform a grouping operation on the sample data by using the grouping prediction model to obtain prediction probability values of multiple grouping schemes;

Calculate the cross-entropy loss function of the predicted probability value and the standard grouping result to obtain a loss value;

The parameters of the grouping prediction model are modified according to the loss function, and the grouping operation is performed again on the sample data by using the modified grouping prediction model until a preset stopping condition is reached.

The loss function improvement module 103 is configured to adjust the loss function of the pre-built user grouping model based on the output result to obtain an optimized loss function.

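In detail, the loss function improvement module 103 modifies the selection method of the grouping scheme in the loss function and adds a preset penalty item to the loss function.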

In detail, the optimized loss function includes:

L=R+Q(s′,a″′)-Q(s,a)+P(s)

The user grouping model training module 104 is configured to use the sample data to train the user grouping model according to the optimization loss function to obtain an optimized user grouping model.

In detail, the user grouping model training module 104 is specifically used for:

Inputting the sample data into the user grouping model to obtain a training result;

Calculate the loss value of the training result by using the optimized loss function;

comparing the loss value with a preset loss threshold;

When the loss value is greater than or equal to the loss threshold, adjust the parameters of the user grouping model, and re-train to obtain a training result;

When the loss value is less than the loss threshold, the optimized user grouping model is obtained.

The grouping module 105 is configured to use the optimized user grouping model to group the user data to be grouped, obtain a grouping result, and output the grouping result.

In detail, when using the optimized user grouping model to group the user data to be grouped to obtain a grouping scheme, the grouping module 105 specifically performs the following operations:

inputting the user data to be grouped into the optimized user grouping model;

Use the optimized user grouping model to output each grouping scheme of the user data to be grouped and the expected reward value (Q value) corresponding to each grouping scheme;

The grouping scheme with the largest expected reward value (Q value) is selected as the grouping result of the user data to be grouped.

FIG. 8 is a schematic structural diagram of an electronic device implementing the user grouping method of the present application.

The electronic device 1 may include a processor 10, a memory 11 and a bus, and may also include a computer program stored in the memory 11 and executable on the processor 10, such as a user grouping program 12.

The memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, a removable hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a pluggable removable hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash memory card (Flash Card) equipped on the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 can be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of the user grouping program 12, but also to temporarily store data that has been output or will be output.

In some embodiments, the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control core (Control Unit) of the electronic device. It connects the components of the entire electronic device through various interfaces and lines, and executes the various functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 11 (for example, the user grouping program) and calling the data stored in the memory 11.

The bus may be a peripheral component interconnect (PCI for short) bus or an extended industry standard architecture (Extended industry standard architecture, EISA for short) bus or the like. The bus can be divided into address bus, data bus, control bus and so on. The bus is configured to implement connection communication between the memory 11 and at least one processor 10 and the like.

FIG. 8 shows only an electronic device with some components. Those skilled in the art can understand that the structure shown in FIG. 8 does not constitute a limitation on the electronic device 1, which may include fewer or more components than those shown, a combination of certain components, or a different arrangement of components.

For example, although not shown, the electronic device 1 may also include a power supply (such as a battery) for powering the various components. Preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that the power management device implements functions such as charge management, discharge management, and power consumption management. The power supply may also include one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and any other such components. The electronic device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.

Further, the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface, a Bluetooth interface, etc.), which is usually used to establish a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may further include a user interface, and the user interface may be a display (Display) or an input unit (for example, a keyboard (Keyboard)). Optionally, the user interface may also be a standard wired interface or a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch device, and the like. The display may also be appropriately called a display screen or a display unit, which is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.

It should be understood that the embodiments are only used for illustration, and are not limited by this structure in the scope of the patent application.

The user grouping program 12 stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run by the processor 10, can implement the steps of the user grouping method described above.

Further, if the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium, and the computer-readable storage medium can be volatile or non-volatile. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, or read-only memory (ROM, Read-Only Memory). The computer-readable medium stores a computer program, and the computer program is executed by the processor to implement the steps of the user grouping method described above.

In the several embodiments provided in this application, it should be understood that the disclosed device, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are only illustrative: the division of the modules is only a logical function division, and there may be other division manners in actual implementation.

The modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional module in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of hardware plus software function modules.

It will be apparent to those skilled in the art that the present application is not limited to the details of the above-described exemplary embodiments, but that the present application can be implemented in other specific forms without departing from the spirit or essential characteristics of the present application.

Accordingly, the embodiments are to be regarded in all respects as illustrative and not restrictive, and the scope of the application is defined by the appended claims rather than by the foregoing description. All changes that fall within the meaning and range of equivalents of the claims are therefore intended to be included in this application. Any accompanying reference signs in the claims should not be construed as limiting the claims concerned.

Furthermore, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Several units or means recited in the system claims can also be realized by one unit or means through software or hardware. Terms such as "first" and "second" are used to denote names and do not denote any particular order.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present application without departing from the spirit and scope of those technical solutions.

Claims

A user grouping method, wherein the method is applied to an electronic device and comprises:

obtaining return visit data of a user from a database communicatively connected to the electronic device, and organizing the return visit data to obtain sample data;

training a pre-built grouping prediction model by using the sample data, and obtaining an output result of the sample data by using the trained grouping prediction model;

adjusting a loss function of a pre-built user grouping model based on the output result to obtain an optimized loss function;

training the user grouping model by using the sample data according to the optimized loss function to obtain an optimized user grouping model; and

grouping user data to be grouped by using the optimized user grouping model to obtain a grouping result, and outputting the grouping result through a display screen of the electronic device.
The user grouping method according to claim 1, wherein said organizing the return visit data to obtain sample data comprises:

sorting the return visit data in chronological order to obtain initial sample data; and

converting the index data in the initial sample data into a multi-dimensional feature vector to obtain the sample data.
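By way of non-limiting illustration only, the organizing step may be sketched in Python as follows. The record layout (`visit_date`, `indicators`), the indicator names in `FEATURE_KEYS`, and the zero default for a missing indicator are assumptions of this sketch and are not specified by the application.

```python
from datetime import date

# Hypothetical indicator names; the application does not fix a particular set.
FEATURE_KEYS = ("age", "systolic_bp", "fasting_glucose")

def organize_return_visits(records):
    """Sort return visit records by date, then map each to a feature vector."""
    ordered = sorted(records, key=lambda r: r["visit_date"])
    # A missing indicator defaults to 0.0 here; this is an assumption of the sketch.
    return [[float(r["indicators"].get(k, 0.0)) for k in FEATURE_KEYS]
            for r in ordered]

records = [
    {"visit_date": date(2020, 6, 1),
     "indicators": {"age": 61, "systolic_bp": 135, "fasting_glucose": 7.2}},
    {"visit_date": date(2020, 3, 1),
     "indicators": {"age": 61, "systolic_bp": 142}},
]
samples = organize_return_visits(records)
# The March visit now precedes the June visit, each as a 3-dimensional vector.
```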
The user grouping method according to claim 1, wherein said training a pre-built grouping prediction model by using the sample data comprises:

performing a grouping operation on the sample data by using the grouping prediction model to obtain prediction probability values of multiple grouping schemes;

calculating a cross-entropy loss function of the prediction probability values and a standard grouping result to obtain a loss value; and

modifying the parameters of the grouping prediction model according to the loss function, and performing the grouping operation on the sample data again by using the modified grouping prediction model, until a preset stopping condition is reached.
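By way of non-limiting illustration only, the training loop recited above (predict probabilities, compute a cross-entropy loss, modify parameters, repeat until a stopping condition) may be sketched with a small softmax classifier standing in for the grouping prediction model. The linear model, the learning rate `lr`, the loss threshold `tol` used as the stopping condition, and the toy data are all assumptions of this sketch; the application does not prescribe them.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(v - top) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_probs(W, b, x):
    """Prediction probability of each grouping scheme for feature vector x."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) + bk
                    for row, bk in zip(W, b)])

def train_grouping_predictor(samples, labels, n_schemes,
                             lr=0.2, max_epochs=5000, tol=0.05):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    n_features = len(samples[0])
    W = [[0.0] * n_features for _ in range(n_schemes)]
    b = [0.0] * n_schemes
    loss = float("inf")
    for _ in range(max_epochs):
        loss = 0.0
        grad_W = [[0.0] * n_features for _ in range(n_schemes)]
        grad_b = [0.0] * n_schemes
        for x, y in zip(samples, labels):
            p = predict_probs(W, b, x)
            loss -= math.log(p[y])  # cross-entropy against standard grouping y
            for k in range(n_schemes):
                d = p[k] - (1.0 if k == y else 0.0)
                grad_b[k] += d
                for j in range(n_features):
                    grad_W[k][j] += d * x[j]
        loss /= len(samples)
        if loss < tol:  # assumed form of the preset stopping condition
            break
        # Gradients are summed over samples; lr absorbs the 1/N factor.
        for k in range(n_schemes):
            b[k] -= lr * grad_b[k]
            for j in range(n_features):
                W[k][j] -= lr * grad_W[k][j]
    return W, b, loss

# Toy data: the first feature alone determines the standard grouping.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
labels = [0, 0, 1, 1]
W, b, final_loss = train_grouping_predictor(samples, labels, n_schemes=2)
predicted = [max(range(2), key=lambda k: predict_probs(W, b, x)[k])
             for x in samples]
```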
The user grouping method according to claim 1, wherein said adjusting a loss function of a pre-built user grouping model based on the output result comprises:

modifying the selection method of the grouping scheme in the loss function; and

adding a preset penalty term to the loss function.
The method for user grouping according to claim 4, wherein said modifying the method for selecting a grouping scheme in said loss function comprises:

modifying the selection method to the following function:

a″′ = argmax_{a′ ∈ A′_DNN} Q(s′, a′)

wherein a″′ is the grouping scheme corresponding to the maximum Q value output after the sample data s′ is input into the user grouping model; Q(s′, a′) is the Q value of the corresponding grouping scheme a′ output by the user grouping model when the input is the sample data s′; A′_DNN is the set of n grouping schemes with the highest prediction probability values output by the grouping prediction model when the input is the sample data s′; and n is a preset constant.
The user grouping method according to claim 5, wherein the optimized loss function comprises:

L=R+Q(s′,a″′)-Q(s,a)+P(s)

wherein s is the current sample data; a is the current grouping scheme; s′ is the next sample data after the current sample data; a″′ is the grouping scheme corresponding to the maximum Q value output after the sample data s′ is input into the user grouping model; Q(s,a) is the Q value of the corresponding grouping scheme a output by the user grouping model when the input is the sample data s; Q(s′,a″′) is the Q value of the corresponding grouping scheme a″′ output by the user grouping model when the input is the sample data s′; R is the reward of the sample data s; and P(s) is the penalty value.
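By way of non-limiting illustration only, the restricted selection of a″′ and the loss expression above may be sketched as follows. Grouping schemes are represented by integer indices, Q values and prediction probabilities by plain lists, and the penalty P(s) is passed in as a number because its form is not given in this claim; these representations and all function names are assumptions of the sketch.

```python
def top_n_schemes(pred_probs, n):
    """Indices of the n grouping schemes with the highest prediction
    probability values, i.e. the candidate set A'_DNN."""
    ranked = sorted(range(len(pred_probs)),
                    key=lambda a: pred_probs[a], reverse=True)
    return ranked[:n]

def select_scheme(q_next, pred_probs_next, n):
    """Scheme with the maximum Q value among the candidates in A'_DNN."""
    candidates = top_n_schemes(pred_probs_next, n)
    return max(candidates, key=lambda a: q_next[a])

def optimized_loss(reward, q_current, a, q_next, pred_probs_next, n, penalty):
    """L = R + Q(s', a''') - Q(s, a) + P(s), term by term as written above."""
    a3 = select_scheme(q_next, pred_probs_next, n)
    return reward + q_next[a3] - q_current[a] + penalty

# Hypothetical values for one transition (s, a, s').
q_current = [0.2, 0.5, 0.1]          # Q(s, .) from the user grouping model
q_next = [0.9, 0.4, 0.7]             # Q(s', .) from the user grouping model
pred_probs_next = [0.05, 0.60, 0.35] # grouping prediction model output for s'

chosen = select_scheme(q_next, pred_probs_next, n=2)  # → 2
loss = optimized_loss(reward=1.0, q_current=q_current, a=1, q_next=q_next,
                      pred_probs_next=pred_probs_next, n=2, penalty=0.1)
```

In this example scheme 0 has the largest Q value for s′ but is excluded because the grouping prediction model ranks it outside the top n = 2, so scheme 2 is selected instead.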
The user grouping method according to any one of claims 1 to 6, wherein said grouping user data to be grouped by using the optimized user grouping model to obtain a grouping result comprises:

inputting the user data to be grouped into the optimized user grouping model;

outputting, by using the optimized user grouping model, each grouping scheme of the user data to be grouped and the expected reward value corresponding to each grouping scheme; and

selecting the grouping scheme with the largest expected reward value as the grouping result of the user data to be grouped.
A user grouping device, wherein the device comprises:

a sample data acquisition module, configured to obtain return visit data of a user from a database communicatively connected to an electronic device, and organize the return visit data to obtain sample data;

a grouping prediction model training module, configured to train a pre-built grouping prediction model by using the sample data, and obtain an output result of the sample data by using the trained grouping prediction model;

a loss function improvement module, configured to adjust a loss function of a pre-built user grouping model based on the output result to obtain an optimized loss function;

a user grouping model training module, configured to train the user grouping model by using the sample data according to the optimized loss function to obtain an optimized user grouping model; and

a grouping module, configured to group user data to be grouped by using the optimized user grouping model to obtain a grouping result, and output the grouping result through a display screen of the electronic device.
An electronic device, wherein the electronic device comprises:

a memory that stores at least one computer program instruction; and

a processor that executes the computer program instructions stored in the memory to perform the following steps:

obtaining return visit data of a user from a database communicatively connected to the electronic device, and organizing the return visit data to obtain sample data;

training a pre-built grouping prediction model by using the sample data, and obtaining an output result of the sample data by using the trained grouping prediction model;

adjusting a loss function of a pre-built user grouping model based on the output result to obtain an optimized loss function;

training the user grouping model by using the sample data according to the optimized loss function to obtain an optimized user grouping model; and

grouping user data to be grouped by using the optimized user grouping model to obtain a grouping result, and outputting the grouping result through a display screen of the electronic device.
The electronic device according to claim 9, wherein said organizing the return visit data to obtain sample data comprises:

sorting the return visit data in chronological order to obtain initial sample data; and

converting the index data in the initial sample data into a multi-dimensional feature vector to obtain the sample data.
The electronic device according to claim 9, wherein said training a pre-built grouping prediction model by using the sample data comprises:

performing a grouping operation on the sample data by using the grouping prediction model to obtain prediction probability values of multiple grouping schemes;

calculating a cross-entropy loss function of the prediction probability values and a standard grouping result to obtain a loss value; and

modifying the parameters of the grouping prediction model according to the loss function, and performing the grouping operation on the sample data again by using the modified grouping prediction model, until a preset stopping condition is reached.
The electronic device according to claim 9, wherein said adjusting a loss function of a pre-built user grouping model based on the output result comprises:

modifying the selection method of the grouping scheme in the loss function; and

adding a preset penalty term to the loss function.
The electronic device according to claim 12, wherein said modifying the selection method of the grouping scheme in the loss function comprises:

modifying the selection method to the following function:

a″′ = argmax_{a′ ∈ A′_DNN} Q(s′, a′)

wherein a″′ is the grouping scheme corresponding to the maximum Q value output after the sample data s′ is input into the user grouping model; Q(s′, a′) is the Q value of the corresponding grouping scheme a′ output by the user grouping model when the input is the sample data s′; A′_DNN is the set of n grouping schemes with the highest prediction probability values output by the grouping prediction model when the input is the sample data s′; and n is a preset constant.
The electronic device according to claim 13, wherein the optimized loss function comprises:

L=R+Q(s′,a″′)-Q(s,a)+P(s)

wherein s is the current sample data; a is the current grouping scheme; s′ is the next sample data after the current sample data; a″′ is the grouping scheme corresponding to the maximum Q value output after the sample data s′ is input into the user grouping model; Q(s,a) is the Q value of the corresponding grouping scheme a output by the user grouping model when the input is the sample data s; Q(s′,a″′) is the Q value of the corresponding grouping scheme a″′ output by the user grouping model when the input is the sample data s′; R is the reward of the sample data s; and P(s) is the penalty value.
The electronic device according to any one of claims 9 to 14, wherein said grouping user data to be grouped by using the optimized user grouping model to obtain a grouping result comprises:

inputting the user data to be grouped into the optimized user grouping model;

outputting, by using the optimized user grouping model, each grouping scheme of the user data to be grouped and the expected reward value corresponding to each grouping scheme; and

selecting the grouping scheme with the largest expected reward value as the grouping result of the user data to be grouped.
A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the following steps:

obtaining return visit data of a user from a database communicatively connected to an electronic device, and organizing the return visit data to obtain sample data;

training a pre-built grouping prediction model by using the sample data, and obtaining an output result of the sample data by using the trained grouping prediction model;

adjusting a loss function of a pre-built user grouping model based on the output result to obtain an optimized loss function;

training the user grouping model by using the sample data according to the optimized loss function to obtain an optimized user grouping model; and

grouping user data to be grouped by using the optimized user grouping model to obtain a grouping result, and outputting the grouping result through a display screen of the electronic device.
The computer-readable storage medium according to claim 16, wherein said organizing the return visit data to obtain sample data comprises:

sorting the return visit data in chronological order to obtain initial sample data; and

converting the index data in the initial sample data into a multi-dimensional feature vector to obtain the sample data.
The computer-readable storage medium according to claim 16, wherein said training a pre-built grouping prediction model by using the sample data comprises:

performing a grouping operation on the sample data by using the grouping prediction model to obtain prediction probability values of multiple grouping schemes;

calculating a cross-entropy loss function of the prediction probability values and a standard grouping result to obtain a loss value; and

modifying the parameters of the grouping prediction model according to the loss function, and performing the grouping operation on the sample data again by using the modified grouping prediction model, until a preset stopping condition is reached.
The computer-readable storage medium according to claim 16, wherein said adjusting a loss function of a pre-built user grouping model based on the output result comprises:

modifying the selection method of the grouping scheme in the loss function; and

adding a preset penalty term to the loss function.
The computer-readable storage medium according to claim 19, wherein said modifying the selection method of the grouping scheme in the loss function comprises:

modifying the selection method to the following function:

a″′ = argmax_{a′ ∈ A′_DNN} Q(s′, a′)

wherein a″′ is the grouping scheme corresponding to the maximum Q value output after the sample data s′ is input into the user grouping model; Q(s′, a′) is the Q value of the corresponding grouping scheme a′ output by the user grouping model when the input is the sample data s′; A′_DNN is the set of n grouping schemes with the highest prediction probability values output by the grouping prediction model when the input is the sample data s′; and n is a preset constant.