CN114860912B

CN114860912B - Data processing method, device, electronic equipment and storage medium

Info

Publication number: CN114860912B
Application number: CN202210552425.8A
Authority: CN
Inventors: 汪自立; 蒋宁; 王洪斌; 吴海英
Original assignee: Mashang Xiaofei Finance Co Ltd
Current assignee: Mashang Xiaofei Finance Co Ltd
Priority date: 2022-05-20
Filing date: 2022-05-20
Publication date: 2023-08-29
Anticipated expiration: 2042-05-20
Also published as: CN114860912A

Abstract

The application provides a data processing method, a device, an electronic device and a storage medium. The data processing method comprises the following steps: acquiring user data of a target user, wherein the user data comprises related data generated in the process of providing automated service to the target user; inputting the user data into a first classification model for normalization analysis to obtain a first score, wherein the first score represents a first probability that the target user is successfully served after the automated service without receiving non-automated service, a larger first score indicates a larger first probability, and being successfully served means that the target user has a preset type intention toward a target object; if the first score is greater than or equal to a first score threshold, outputting a service scheme corresponding to the preset type intention; and if the first score is less than the first score threshold, marking the target user as a user who needs to be provided with non-automated service. The application can improve service efficiency and service quality.

Description

Data processing method, device, electronic equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a data processing method, a data processing device, an electronic device, and a storage medium.

Background

With the development of information technology, traditional resource-driven offline services are gradually being phased out, while online intelligent services based on new technologies are maturing. Currently, in intelligent services, for example when serving users online, the service process generally comprises two distinct phases: an automated service phase carried out by intelligent robots and a non-automated service phase carried out by professional service personnel.

However, after the automated service, some users typically already have a clear preset intention and can be provided with other services directly, without subsequent non-automated service, whereas other users do not have a clear preset intention and may still need to enter the subsequent non-automated service phase. If the service content that a user needs next cannot be accurately determined after the automated service ends, service cost is high and efficiency is low.

Disclosure of Invention

The embodiment of the application provides a data processing method, a data processing device, electronic equipment and a storage medium, which are used for solving the problems of high service cost and low efficiency of the current service scheme.

In a first aspect, an embodiment of the present application provides a data processing method, including: acquiring user data of a target user, wherein the user data comprises related data generated in the process of providing automated service to the target user; inputting the user data into a first classification model for normalization analysis to obtain a first score, wherein the first score represents a first probability that the target user is successfully served after the automated service without receiving non-automated service, a larger first score indicates a larger first probability, and being successfully served means that the target user has a preset type intention toward a target object; if the first score is greater than or equal to a first score threshold, outputting a service scheme corresponding to the preset type intention; and if the first score is less than the first score threshold, marking the target user as a user who needs to be provided with non-automated service.

In a second aspect, an embodiment of the present application provides a data processing apparatus, including:

an acquisition unit, configured to acquire user data of a target user, wherein the user data comprises related data generated in the process of providing automated service to the target user;

an analysis unit, configured to input the user data into a first classification model for normalization analysis to obtain a first score, wherein the first score represents a first probability that the target user is successfully served after the automated service without receiving non-automated service, a larger first score indicates a larger first probability, and being successfully served means that the target user has a preset type intention toward a target object;

an output unit, configured to output a service scheme corresponding to the preset type intention if the first score is greater than or equal to a first score threshold;

and a marking unit, configured to mark the target user as a user who needs to be provided with non-automated service if the first score is less than the first score threshold.

In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor implementing the data processing method according to any one of the first aspects when executing the computer program.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, in which a computer program is stored which, when run on an electronic device, causes the electronic device to perform the data processing method according to any one of the first aspects.

In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when run on an electronic device, causes the electronic device to perform the data processing method according to any of the first aspects.

The embodiment of the application provides a data processing method, a device, electronic equipment and a storage medium. The user data generated in the process of providing automated service to a target user is taken as a basis and is analyzed by a first classification model, and whether non-automated service is needed is determined according to the analysis result. If non-automated service is not needed, a service scheme corresponding to the preset type intention is output directly, which saves cost and improves service efficiency. If non-automated service is needed, the target user is marked as a user who needs to be provided with non-automated service, which improves service quality.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.

FIG. 1 is a schematic diagram of an application scenario of a data processing method provided by the present application;

FIG. 2 is a flow chart of a data processing method according to an embodiment of the application;

FIG. 3 is a schematic diagram of a data processing method according to an embodiment of the present application;

FIG. 4 is a flow chart of a model training method according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the application.

Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.

At present, a service scheme for a target user is determined by distributing a preset number of test flows to different service schemes according to a test distribution strategy, and then determining the service scheme assigned to the target user according to the correspondence between the label of the target user of a target flow and the service scheme, where the target flow is a test flow that hits the corresponding service scheme. This approach aims to maximize the service conversion rate and therefore results in excessive service. It cannot reduce the time wasted by service personnel while maintaining the conversion rate, which in turn increases service cost and reduces service efficiency.

Based on the above problems, an embodiment of the present application provides a data processing method, including: acquiring user data of a target user, wherein the user data comprises related data generated in the process of providing automated service to the target user; inputting the user data into a first classification model for normalization analysis to obtain a first score, wherein the first score represents a first probability that the target user is successfully served after the automated service without receiving non-automated service, a larger first score indicates a larger first probability, and being successfully served means that the target user has a preset type intention toward a target object; if the first score is greater than or equal to a first score threshold, outputting a service scheme corresponding to the preset type intention; and if the first score is less than the first score threshold, marking the target user as a user who needs to be provided with non-automated service. According to the embodiment of the application, the user data generated in the automated service process is taken as a basis and is analyzed by the first classification model, and whether non-automated service is needed is determined according to the analysis result. If non-automated service is not needed, the service scheme corresponding to the preset type intention is output directly, which saves cost and improves service efficiency. If non-automated service is needed, the target user is marked as a user who needs to be provided with non-automated service, which improves service quality.

The data processing method provided by the embodiment of the application can be executed by an electronic device. The electronic device may be a terminal device such as a smart phone, a tablet computer, a notebook computer, a desktop computer, an intelligent voice interaction device, an intelligent household appliance, a vehicle-mounted terminal or an aircraft. Alternatively, the electronic device may be a server, such as an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides cloud computing services.

For example, assuming that the electronic device is a server, FIG. 1 is a schematic diagram of a data processing system provided in the present application. As shown in FIG. 1, the data processing system includes: a server 11, a user terminal 12 and a manual service terminal 13. The server 11 and the user terminal 12 can be connected in a wired or wireless manner to realize information transmission between them; similarly, the server 11 and the manual service terminal 13 can be connected in a wired or wireless manner to realize information transmission between them. The user terminal 12 may be any terminal device used by a target user. The user terminal 12 may send service request information to the server 11 according to a service request operation performed by the target user on the user terminal 12, and the server 11 provides automated service for the user terminal 12 according to the service request information. After the automated service, if the target user corresponding to the user terminal 12 is not satisfied with the automated service, or does not have a preset type intention toward the target object described by the automated service, the target user is transferred to the manual service terminal 13 so that non-automated service is provided to the target user corresponding to the user terminal 12. The server 11 may forward the information sent by the user terminal 12 to the manual service terminal 13, and then forward the information replied by the manual service terminal 13 to the user terminal 12.

According to the embodiment of the application, the user data generated in the automated service process is taken as a basis and is analyzed by the first classification model, and whether non-automated service is needed is determined according to the analysis result. If non-automated service is not needed, the service scheme corresponding to the preset type intention is output directly, which saves cost and improves service efficiency. If non-automated service is needed, the target user is marked accordingly, so that non-automated service can subsequently be provided to that user. Therefore, with the embodiment of the application, whether a target user still needs non-automated service can be accurately distinguished after the automated service, and non-automated service is provided only to the target users who need it instead of to all target users. This saves the resources consumed by non-automated service and allows targeted service to be provided to different target users, thereby improving service quality.

In addition, the embodiment of the application can be applied to various service scenes such as online shopping, online electronic resource business handling and the like. The embodiment of the application is not limited to specific application scenes.

The technical scheme of the application is described in detail through specific embodiments. It should be noted that the following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. It should be noted that, the user data related to the present application are all acquired under the consent of the target user.

FIG. 2 is a flow chart of a data processing method according to an embodiment of the present application. As shown in FIG. 2, the data processing method includes the following steps:

S201, acquiring user data of a target user.

The target user refers to any one of users receiving the automation service, and the user data comprises related data generated in the process of carrying out the automation service on the target user. Specifically, the automated service may refer to performing a service on the target user by using a robot customer service, for example, performing interaction with the target user by using text or voice call, and providing service handling guidance for the target user by using the robot, for example, the robot customer service outputs guidance information according to a service operation input by the target user, where the guidance information is used to indicate an operation required to be performed by the target user in order to enjoy the service or information required to be provided, and so on.

In the embodiment of the present application, before S201, the robot customer service has performed an automation service on the target user. Then in this step the user data of the target user is obtained.

Further, the relevant data generated in the automated service process includes: user attribute feature data and/or user interaction data during an automated service.

The user attribute feature data comprises user statistical features and user tag features. The user statistical features include at least one of the age, gender, education level or native place of the target user. The user tag features include tags of the target user for the corresponding service, such as at least one of the income class or occupation of the target user.

The user interaction data includes dialogue text data and dialogue behavior data. The dialogue text data refers to the dialogue text exchanged between the target user and the robot customer service during the automated service, and comprises text sent by the target user to the robot customer service and text sent by the robot customer service to the target user. The dialogue behavior data refers to the behaviors that occur during the dialogue between the target user and the robot customer service in the automated service, specifically the actions taken on the dialogue text data generated in the dialogue. For example, if the dialogue text data includes a link sent by the robot customer service and the target user clicks the link, the dialogue behavior data includes a link click behavior. As another example, if the target user collects (saves) some content in the dialogue text data, the dialogue behavior data includes an information collection behavior. The dialogue behavior data may also include behaviors such as querying the robot, browsing related information sent by the robot, and participating in a live broadcast.

In the embodiment of the present application, the dialogue behavior data may be obtained through event tracking (buried points) or in other ways, which is not limited herein.

S202, inputting the user data into a first classification model for normalization analysis to obtain a first score.

The first classification model is used to perform normalization analysis on the user data and thereby analyze how the automated service went for the target user. After the electronic device inputs the user data into the first classification model, the model performs normalization analysis on the user data and outputs a first score. The first score represents a first probability that the target user is successfully served after the automated service without receiving non-automated service; a larger first score indicates a larger first probability, and being successfully served means that the target user has a preset type intention toward the target object.

The preset type intention depends on the service business. For example, if the service business is to recommend a target object, the preset type intention is an intention toward the target object; if the service business is to collect information through a target object (a form), the preset type intention is the intention to fill in the form.

In the embodiment of the present application, step S202 inputs the user data into the first classification model to perform normalization analysis to obtain a first score, specifically, inputs the user attribute feature data and the user interaction data into the first classification model to perform normalization analysis to obtain the first score.

In an alternative embodiment, step S202 may include: inputting the user data into a coding layer of the first classification model for coding treatment to obtain a target feature vector; and inputting the target feature vector into a classification layer of the first classification model for normalization analysis to obtain a first score.

Referring to FIG. 3, the first classification model 31 includes an encoding layer 311 and a classification layer 312, and the encoding layer 311 is connected to the classification layer 312. The encoding layer 311 may use one-hot encoding, word embedding and/or quantization, and the classification layer 312 uses XGBoost (a classification algorithm). One-hot encoding, also known as one-bit-effective encoding, uses an N-bit status register to encode N states: each state has its own register bit, and only one bit is valid at any time. In other words, one-hot encoding represents a categorical variable as a binary vector: each category value is first mapped to an integer value, and each integer value is then represented as a binary vector in which the position at that integer index is 1 and all other positions are 0. XGBoost is an algorithm toolkit based on the boosting framework that performs well in parallel computing efficiency, missing-value handling and prediction performance. XGBoost is an additive model, and its supported base learners include decision trees and linear models.
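The additive structure of such a classification layer can be illustrated with a toy sketch. This is not XGBoost itself: the stumps, their thresholds and their leaf values are hypothetical hand-picked numbers, and the logistic function is assumed here as the step that maps the summed output to a score between 0 and 1.

```python
import math
from typing import Callable, List, Sequence

# A base learner maps a feature vector to a real-valued contribution.
BaseLearner = Callable[[Sequence[float]], float]


def make_stump(index: int, threshold: float, left: float, right: float) -> BaseLearner:
    """Depth-1 decision tree: return `left` if x[index] < threshold, else `right`."""
    def stump(x: Sequence[float]) -> float:
        return left if x[index] < threshold else right
    return stump


def additive_score(x: Sequence[float], learners: List[BaseLearner]) -> float:
    """Sum the base learners' outputs and squash the sum into (0, 1)."""
    margin = sum(learner(x) for learner in learners)
    return 1.0 / (1.0 + math.exp(-margin))


# Hypothetical, hand-picked stumps (a trained model would learn these).
learners = [
    make_stump(index=0, threshold=0.5, left=-0.8, right=0.6),
    make_stump(index=1, threshold=10.0, left=0.4, right=-0.3),
]

first_score = additive_score([1.0, 3.0], learners)
```

In practice the base learners and their weights would be fitted by a boosting library rather than written by hand; the sketch only shows how per-learner outputs add up to a single score.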

In an alternative embodiment, inputting the user data into the coding layer of the first classification model to perform coding processing to obtain the target feature vector, including: labeling the user attribute feature data through the coding layer to obtain label data, and coding the label data to obtain an attribute feature vector; and extracting interactive features of the user interactive data through the coding layer to obtain interactive feature vectors, and splicing the attribute feature vectors and the interactive feature vectors to obtain target feature vectors.

In the embodiment of the application, the user attribute characteristic data and the user interaction data are used as the input of the first classification model, and the obtained first score can better represent the first probability that the target user is successfully served without non-automatic service after passing through the automatic service.

Labeling the user attribute feature data through the encoding layer means converting the continuous features in the user attribute feature data into tags. In a specific implementation, continuous feature values that fall within different ranges, or that belong to different types, may each be quantized into a corresponding tag. For example, the user attribute feature data includes age, gender, education level and native place. The tags for age are "unknown", "0 to 18 years", "18 to 30 years", "30 to 50 years" and "50 years and above". When the age of the target user is 25, the tag data obtained after labeling is "18 to 30 years". The tags for gender may be "unknown", "male" and "female"; if the gender of the target user is unknown, the corresponding tag data is "unknown".
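A minimal sketch of the age labeling step follows. The tag names come from the example above; the function name `age_to_tag` is hypothetical, and treating each range as half-open (lower bound included, upper bound excluded) is an assumption, since the text does not say which tag a boundary age such as 18 receives.

```python
from typing import Optional

# Tag names follow the age example in the text; half-open boundaries are an assumption.
AGE_BUCKETS = [
    (0, 18, "0 to 18 years"),
    (18, 30, "18 to 30 years"),
    (30, 50, "30 to 50 years"),
]


def age_to_tag(age: Optional[int]) -> str:
    """Map a raw age to one of the tags; missing or negative ages become "unknown"."""
    if age is None or age < 0:
        return "unknown"
    for low, high, tag in AGE_BUCKETS:
        if low <= age < high:
            return tag
    return "50 years and above"
```

For instance, `age_to_tag(25)` yields the "18 to 30 years" tag used in the example, and `age_to_tag(None)` yields "unknown".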

Encoding the tag data to obtain the attribute feature vector may be implemented by encoding each piece of tag data using one-hot encoding to obtain an encoded vector corresponding to that tag data, and then splicing the encoded vectors to obtain the attribute feature vector corresponding to the user attribute feature data.
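The one-hot encoding and splicing of tag data can be sketched as follows. The vocabulary `VOCAB`, the field names and the helper names are hypothetical; only the age and gender tags are taken from the example in the text.

```python
from typing import Dict, List

# Hypothetical tag vocabularies; the age and gender tags follow the example in the text.
VOCAB: Dict[str, List[str]] = {
    "age": ["unknown", "0 to 18 years", "18 to 30 years", "30 to 50 years", "50 years and above"],
    "gender": ["unknown", "male", "female"],
}


def one_hot(tag: str, categories: List[str]) -> List[int]:
    """Return a binary vector with a single 1 at the index of `tag`."""
    index = categories.index(tag)  # raises ValueError for an unseen tag
    return [1 if i == index else 0 for i in range(len(categories))]


def attribute_vector(tags: Dict[str, str]) -> List[int]:
    """One-hot encode each field in a fixed order and concatenate the results."""
    vector: List[int] = []
    for field, categories in VOCAB.items():
        vector.extend(one_hot(tags[field], categories))
    return vector


vec = attribute_vector({"age": "18 to 30 years", "gender": "unknown"})
```

Iterating over `VOCAB` in a fixed order keeps every position of the spliced vector tied to the same tag across users, which a downstream classifier relies on.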

In an alternative embodiment, the user interaction data includes dialogue text data and dialogue behavioral data; extracting interactive features of the user interactive data through the coding layer to obtain interactive feature vectors, wherein the method comprises the following steps: obtaining reply text data of a target user from dialogue text data, and performing word segmentation processing on the reply text data to obtain a plurality of word vectors; performing preset operation on a plurality of word vectors to determine dialogue feature vectors; vectorizing dialogue behavior data to obtain behavior feature vectors; and splicing the dialogue feature vector and the behavior feature vector to obtain the interaction feature vector.

The dialogue text data includes text sent by the target user to the robot customer service and text sent by the robot customer service to the target user. In the embodiment of the application, the content output by the target user is extracted as the reply text data. In addition, before word segmentation is performed on the reply text data, the reply text data is preprocessed, for example by removing links, emoticons, pictures and other such content from the text, converting full-width characters in the reply text data into half-width characters, and correcting typos. Word segmentation is then performed on the preprocessed reply text data to obtain a plurality of segmented words, and word embedding is performed on each segmented word to obtain its word vector. The preset operation is then applied to the plurality of word vectors to obtain the dialogue feature vector.
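Part of this preprocessing can be sketched with the standard library alone. The sketch below covers only link removal and full-width to half-width conversion; emoticon and picture removal, typo correction and word segmentation are omitted. The URL pattern and the function names are illustrative assumptions.

```python
import re

# Simplified URL pattern (an assumption; real links may need a stricter pattern).
URL_PATTERN = re.compile(r"https?://\S+")


def to_half_width(text: str) -> str:
    """Convert full-width ASCII variants (U+FF01..U+FF5E) and U+3000 to half-width."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:              # ideographic (full-width) space
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:  # full-width '!' through '~'
            out.append(chr(code - 0xFEE0))
        else:
            out.append(ch)
    return "".join(out)


def preprocess_reply(text: str) -> str:
    """Strip links, normalize character width, and collapse whitespace."""
    text = URL_PATTERN.sub(" ", text)
    text = to_half_width(text)
    return " ".join(text.split())
```

Links are removed before width conversion so that the pattern sees the URL exactly as it was sent.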

Specifically, the preset operation may be an average value operation, which may be expressed by the following formula:

$$V_d = \frac{1}{n} \sum_{w \in d} V_w$$

In the above formula, $V_d$ denotes the dialogue feature vector, $n$ denotes the number of segmented words in the reply text data and is an integer greater than or equal to 1, $V_w$ denotes the word vector of the segmented word $w$, and $d$ denotes the reply text data. The formula averages the $n$ word vectors to obtain the dialogue feature vector.
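The averaging of word vectors into a dialogue feature vector can be sketched as below. The function name `mean_vector` and the 3-dimensional example embeddings are hypothetical; a real pre-trained embedding model would supply vectors of much higher dimension.

```python
from typing import List, Sequence


def mean_vector(word_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise average of n word vectors of equal dimension (n >= 1)."""
    if not word_vectors:
        raise ValueError("at least one word vector is required")
    dim = len(word_vectors[0])
    if any(len(v) != dim for v in word_vectors):
        raise ValueError("all word vectors must have the same dimension")
    n = len(word_vectors)
    return [sum(v[i] for v in word_vectors) / n for i in range(dim)]


# Hypothetical 3-dimensional embeddings for a reply with two segmented words.
v_d = mean_vector([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
```

Averaging gives a vector of fixed dimension regardless of how many words the reply contains, which makes it straightforward to splice with other feature vectors.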

In an alternative embodiment, word segmentation is performed on the reply text data to obtain a plurality of word vectors, including: and carrying out vector extraction processing on the segmentation processing result through a pre-training word embedding model to obtain a plurality of word vectors.

In an embodiment of the present application, the encoding layer includes a word embedding model. The word embedding model is pre-trained and embeds a high-dimensional space, whose dimension equals the number of all words, into a continuous vector space of much lower dimension, so that each word or phrase is mapped to a vector over the real numbers, thereby obtaining the word vector corresponding to each segmented word.

It should be understood that the dialogue action data may include multiple actions, each action corresponds to a feature value, and the implementation manner of performing vectorization processing on the dialogue action data to obtain the action feature vector may be: extracting characteristic values corresponding to the target behaviors from dialogue behavior data, and sequentially splicing the extracted characteristic values to obtain behavior characteristic vectors; target behavior refers to any one or more of a variety of behaviors.

In an alternative embodiment, the feature value corresponding to each behavior may be obtained in advance by statistically characterizing the multiple behaviors within a historical time period. Specifically, the multiple behaviors within the historical time period are counted according to a preset statistical method, and each behavior is then converted into a feature value according to a preset characterization rule.

Specifically, the preset statistical method specifies in advance the multiple behaviors to be counted, which include: reply behavior, link click behavior, image-text external link stay behavior, image-text external link collection behavior, image-text external link forwarding behavior, live broadcast viewing behavior and live broadcast interaction behavior.

Further, the historical period of time for which statistics are determined may be within a week or a day.

Further, the preset characterization rule means that, if a behavior exists, its feature value is the value corresponding to that behavior. For example, the feature values corresponding to the reply behavior include: the average reply duration, the latest reply duration, the average reply word length, the latest reply word length and the number of replies within a preset time. The feature value corresponding to the link click behavior is the link click rate. The feature value corresponding to the image-text external link stay behavior is the average stay duration on image-text external links. The feature value corresponding to the image-text external link collection behavior is the average collection rate of image-text external links. The feature value corresponding to the image-text external link forwarding behavior is the average forwarding rate of image-text external links. The feature value corresponding to the live broadcast viewing behavior is the live viewing rate. The feature value corresponding to the live broadcast interaction behavior is the live interaction rate.

The average reply duration refers to the average time interval between a message sent by the robot customer service and the target user's reply to it. The latest reply duration is the time interval between the time when the robot customer service last sent a message to the target user and the time when the target user last replied to the robot customer service. The average reply word length refers to the average character length of the target user's replies. The latest reply word length refers to the character length of the target user's last reply to the robot customer service. The number of replies within the preset time refers to the number of reply messages sent by the target user within the preset time, such as within thirty minutes. The link click rate refers to the ratio of the number of links clicked by the target user to the number of all links sent to the target user by the robot customer service. The average stay duration on image-text external links refers to the average time the target user spends browsing a link when the link is an image-text resource. The average collection rate of image-text external links refers to the ratio of the number of image-text external links collected by the target user to the total number of image-text external links, where an image-text external link is a link with images and text sent by the robot customer service to the target user. The average forwarding rate of image-text external links refers to the ratio of the number of times image-text external links are forwarded by the target user to the total number of image-text external links. The live viewing rate refers to the ratio of the number of live links watched by the target user to the total number of live links, where the live links are pushed to the target user by the robot customer service. The live interaction rate refers to the ratio of the number of live links in which the target user participates in interaction while watching to the total number of live links. Interaction refers to leaving a message, giving a like, or other interactive actions by the target user on the corresponding live link.
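Two of these feature values can be computed as in the sketch below. The input shapes (pairs of timestamps in seconds, plain counts) and the function names are assumptions made for illustration, as is returning `None` when there is nothing to measure.

```python
from typing import List, Optional, Tuple


def average_reply_duration(exchanges: List[Tuple[float, float]]) -> Optional[float]:
    """Mean gap between a robot message and the user's reply to it.

    Each item is (robot_sent_at, user_replied_at) in seconds; None if no replies.
    """
    if not exchanges:
        return None
    gaps = [replied - sent for sent, replied in exchanges]
    return sum(gaps) / len(gaps)


def link_click_rate(links_clicked: int, links_sent: int) -> Optional[float]:
    """Ratio of links the user clicked to all links the robot sent; None if none were sent."""
    if links_sent == 0:
        return None
    return links_clicked / links_sent
```

The other rates (collection, forwarding, live viewing, live interaction) follow the same count-over-total pattern as `link_click_rate`.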

Further, if a behavior does not exist, the feature value corresponding to that behavior is a set value, such as -1. For example, if the target user has no reply behavior, the feature values corresponding to the average reply duration, the latest reply duration, the average reply word length, the latest reply word length and the number of replies within the preset time are all -1. If the target user does not click any link, that is, has no link click behavior, the feature value corresponding to the link click rate is -1. If the target user does not click any image-text external link, the feature value corresponding to the average stay duration on image-text external links is -1. If the target user does not collect any image-text external link, the feature value corresponding to the average collection rate of image-text external links is -1. If the target user does not forward any image-text external link, the feature value corresponding to the average forwarding rate of image-text external links is -1. If the target user does not watch any live broadcast, the feature value corresponding to the live viewing rate is -1. If the target user does not interact with any live broadcast, the feature value corresponding to the live interaction rate is -1.

When the feature value corresponding to a target behavior is extracted from the dialogue behavior data, there may be no feature value for that target behavior; in that case the feature value of the target behavior may be a default value, such as -1.

For example, referring to Table 1, the feature values of all target behaviors are spliced in sequence, and the resulting behavior feature vector is {8, -1, 11, -1, 3, -1, 4, -1, 0.3, -1}. The behavior feature vector can comprehensively represent the behavior features of the target user.

TABLE 1
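The splicing of feature values in a fixed sequence, with a default value for absent behaviors, can be sketched as follows. The feature names and their order are hypothetical labels chosen for this illustration; the application only requires that the values be spliced in a preset sequence.

```python
# Hypothetical names for the behavior features described in the text,
# listed in an assumed preset order.
FEATURE_ORDER = [
    "avg_reply_duration",
    "latest_reply_duration",
    "avg_reply_length",
    "latest_reply_length",
    "replies_in_window",
    "link_click_rate",
    "avg_stay_duration",
    "avg_collection_rate",
    "avg_forwarding_rate",
    "live_view_rate",
    "live_interaction_rate",
]


def build_behavior_vector(stats, order=FEATURE_ORDER, default=-1):
    """Splice feature values in a fixed order, using `default` when absent."""
    vector = []
    for name in order:
        value = stats.get(name)
        vector.append(default if value is None else value)
    return vector
```

Keeping the order in a single list means that every user's vector has the same length and the same meaning per position, which is what a downstream classifier needs.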

In the embodiment of the application, the first score may be a number between 0 and 1 (including 1). The first score is positively correlated with the first probability: the larger the first score, the larger the probability that the target user is successfully served when no non-automated service is performed after the automated service. The non-automated service includes manual service.

Illustratively, the preset type of intent of the automated and non-automated services of the present application is to require the target user to fill out a form. The automated and non-automated service processes are to introduce the relevant contents of the form to the target user. The larger the first score, the greater the probability of filling out the form when the target user is not being serviced by the non-automated service after being serviced by the automated service.

S203, outputting a service scheme corresponding to the preset type intention if the first score is greater than or equal to the first score threshold.

In the embodiment of the application, the first score threshold is preset. For example, it may be set to 0.5. The first score threshold may be different for different services, so that the data processing method of the present application may be better applied to different services.

A first score greater than or equal to the first score threshold indicates that, after the automated service, the target user has the preset type intention with relatively high probability, so the service scheme corresponding to the preset type intention is output to the target user. The service scheme can be understood as, for example, a form for the target user to fill in. The service scheme may also be other content for the target user to select, which is not limited herein.

In the embodiment of the application, when the first score is greater than or equal to the first score threshold, the service scheme corresponding to the preset type intention is output, so that the labor cost can be reduced, and the service efficiency can be improved.

In an alternative embodiment, the preset type of intention includes a provisioning intention, and outputting a service scheme corresponding to the preset type of intention includes: outputting a service scheme corresponding to the provisioning intention, wherein the service scheme is used for indicating information required to be provided by a target user when the target object is provisioned; and providing a target object for the target user according to the information input by the target user for the service scheme.

The provisioning intention may be the target user's intention to provision a certain target object, such as an article or a service. Provisioning an article may be, for example, provisioning daily goods or electronic products; provisioning a service may be, for example, provisioning a financial service. The service scheme is set according to the specific preset type intention. If an article is to be provisioned, the service scheme may include: please fill in the item name, item number, etc. If a financial service is to be provisioned, the service scheme may include: please fill in the service amount, service time, etc. In addition, if the target user performs a further input operation on the service scheme, for example by filling in the corresponding form, the target user is regarded as having successfully provisioned the corresponding target object, and the target object is provided to the target user.

In the embodiment of the present application, the preset type intention may also be other intentions, which are not limited herein.

S204, if the first score is smaller than the first score threshold, marking the target user as a user who needs to be provided with non-automated service.

In the embodiment of the present application, a first score smaller than the first score threshold indicates that, after the automated service alone, the target user does not have the corresponding preset type intention, or has it only with small probability. For example, the target user has no provisioning intention toward the target object, or the probability of provisioning the target object is small. The target user is therefore marked as a user who needs to be provided with non-automated service, and non-automated service, such as manual service, can subsequently be performed on the target user to raise the probability that the target user has the preset type intention, for example the likelihood that the target user provisions the target object.
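Steps S203 and S204, together with the note that the first score threshold may differ between services, can be sketched as below. The service names, the threshold values other than 0.5, and the returned strings are hypothetical and serve only to illustrate the branching.

```python
DEFAULT_THRESHOLD = 0.5  # example threshold value given in the text

# Hypothetical per-service thresholds; the text only says they may differ.
SERVICE_THRESHOLDS = {"form_filling": 0.5, "financial_service": 0.7}


def route_after_automated_service(first_score, service,
                                  thresholds=SERVICE_THRESHOLDS):
    """Return the action for S203 (output) or S204 (mark)."""
    threshold = thresholds.get(service, DEFAULT_THRESHOLD)
    if first_score >= threshold:
        return "output_service_scheme"           # S203
    return "mark_for_non_automated_service"      # S204
```

With these assumed values, the same first score of 0.6 leads to the service scheme being output for one service and to the user being marked for non-automated service for the other.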

Further, after non-automation service is performed, a service scheme corresponding to the preset type intention can be output to the target user.

Further, the method also includes: inputting the user data into a second classification model for normalization analysis to obtain a second score, wherein the second score represents a second probability that the target user is successfully served if non-automated service is performed after the automated service, and the larger the second score, the larger the second probability.

Fig. 3 further shows a second classification model 32, which comprises an encoding layer 321 and a classification layer 322. The user data is input simultaneously into the first classification model 31, which outputs the first score, and into the second classification model 32, which outputs the second score. The first score represents the first probability that the target user is successfully served when the target user receives the automated service and no non-automated service. The second score represents the second probability that the target user is successfully served when the target user receives the automated service followed by non-automated service.

In the embodiment of the present application, the specific encoding process of the encoding layer 321 follows the encoding manner of the encoding layer 311 and is not described again here. The classification layer 322 also employs XGBoost.

S203 includes: if the first score is greater than or equal to the first score threshold and the second score is less than the second score threshold, outputting a service scheme corresponding to the preset type intention.

The second score threshold is also preset, for example, the second score threshold is 0.5.

Further, a first score greater than or equal to the first score threshold together with a second score less than the second score threshold indicates that the first probability of successful service is relatively large when the target user receives only the automated service, and the second probability of successful service is relatively small when the target user receives non-automated service after the automated service. Therefore, the service scheme corresponding to the preset type intention is output directly without non-automated service, which can improve the probability of successful service, reduce cost, and improve service efficiency.

S204 includes: if the first score is less than the first score threshold and the second score is greater than or equal to the second score threshold, marking the target user as a user who needs to provide non-automated services.

A first score smaller than the first score threshold together with a second score greater than or equal to the second score threshold indicates that the first probability of successful service is relatively small when the target user receives only the automated service, and the second probability of successful service is relatively large when the target user receives non-automated service after the automated service. Therefore, the target user is marked as a user who needs to be provided with non-automated service, and the service scheme corresponding to the preset type intention is output after the non-automated service is subsequently performed, which can improve the probability of successful service.

In an alternative embodiment, the method further includes: if the first score is greater than or equal to the first score threshold and the second score is greater than or equal to the second score threshold, outputting a service scheme corresponding to the preset type intention. This case indicates that the first probability of successful service is relatively large when the target user receives only the automated service, and the second probability of successful service is also relatively large when the target user receives non-automated service after the automated service. Because the probability of successful service is large with or without non-automated service, the service scheme corresponding to the preset type intention is output directly without non-automated service, which can improve the probability of successful service, reduce cost, and improve service efficiency.

In an alternative embodiment, the method further includes: if the first score is smaller than the first score threshold and the second score is smaller than the second score threshold, outputting a service scheme corresponding to the preset type intention. This case indicates that the first probability of successful service is relatively small when the target user receives only the automated service, and the second probability of successful service is also relatively small when the target user receives non-automated service after the automated service. Because the probability of successful service is small with or without non-automated service, the service scheme corresponding to the preset type intention is output directly without non-automated service, so that labor cost is not wasted and the target user can choose autonomously.

In the embodiment of the application, the probability of successful service of the target user under various conditions can be determined by combining the first classification model and the second classification model, so that the cost is reduced, and the service efficiency and the probability of successful service are improved.
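The four combinations of the two scores described above reduce to a small decision function. The sketch below is one possible reading of those embodiments; the enum and function names are hypothetical, and the default thresholds of 0.5 are the example values given in the text.

```python
from enum import Enum


class Action(Enum):
    OUTPUT_SERVICE_SCHEME = "output_service_scheme"
    MARK_FOR_NON_AUTOMATED = "mark_for_non_automated_service"


def decide(first_score, second_score,
           first_threshold=0.5, second_threshold=0.5):
    """Combine both scores as described in the embodiments above.

    Only the case "first score low, second score high" routes the user
    to non-automated service; the other three cases output the service
    scheme directly.
    """
    first_high = first_score >= first_threshold
    second_high = second_score >= second_threshold
    if not first_high and second_high:
        return Action.MARK_FOR_NON_AUTOMATED
    return Action.OUTPUT_SERVICE_SCHEME
```

Under this reading, non-automated service is reserved for users for whom it is expected to change the outcome, which is where the labor cost is justified.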

Further, referring to fig. 4, a flowchart showing steps of a training method of the first classification model and the second classification model specifically includes the following steps:

S401, acquiring first user data and second user data.

The first user data includes related data generated in the process of performing automated service on a first user, the first user being a user who does not receive non-automated service after the automated service. The second user data includes related data generated in the process of performing automated service on a second user, the second user being a user who enters non-automated service after the automated service.

In the embodiment of the present application, training data is prepared in advance, and the training data includes the first user data and the second user data. To collect the training data, a stage of the service system may, after completing the automated service for a user, either output the service scheme corresponding to the preset type intention or route the user to non-automated service, in a set proportion (e.g., 1:1). It is then recorded whether each user is successfully served. A user is regarded as successfully served if, for example, the user fills in the content corresponding to the service scheme or provisions a target object recommended by the service scheme.

The data collected during the automated service is used as training data: the first user data of the first user serves as the training data corresponding to the first user, and the second user data of the second user serves as the training data corresponding to the second user. The training data for each user includes user attribute feature data and user interaction data. For the acquisition and processing of the user attribute feature data and the user interaction data, refer to the description in the above method, which is not repeated here.

S402, a first label corresponding to the first user data and a second label corresponding to the second user data are obtained.

Wherein the first label indicates whether the first user is successfully served and the second label indicates whether the second user is successfully served.

Specifically, the first label indicates whether the first user is successfully served when no non-automated service is performed after the automated service: if the first user is successfully served, the first label may be set to 1, and otherwise it may be set to 0. The second label indicates whether the second user is successfully served when non-automated service is performed after the automated service: if the second user is successfully served, the second label may be set to 1, and otherwise it may be set to 0.
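Steps S401 and S402 amount to partitioning historical records by whether non-automated service followed the automated service, and attaching a 0/1 label to each. A minimal sketch follows; the `ServiceRecord` type and its field names are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ServiceRecord:
    """Hypothetical record of one user's service history."""
    features: List[float]            # encoded user data from the automated service
    had_non_automated_service: bool  # True for a "second user"
    served_successfully: bool


def split_and_label(records: List[ServiceRecord]
                    ) -> Tuple[list, list]:
    """Return (first_set, second_set) as lists of (features, label) pairs."""
    first_set, second_set = [], []
    for record in records:
        label = 1 if record.served_successfully else 0
        sample = (record.features, label)
        if record.had_non_automated_service:
            second_set.append(sample)   # trains the second classification model
        else:
            first_set.append(sample)    # trains the first classification model
    return first_set, second_set
```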

S403, training a first classification model by using the first user data and the first label.

Specifically, the first user data is input into the first classification model, which outputs a first prediction score. A first loss value is calculated from the first prediction score and the first label. If the first loss value is greater than or equal to a first threshold, the first classification model is adjusted using the first loss value; if the first loss value is smaller than the first threshold, training of the first classification model is complete.

S404, training a second classification model by using the second user data and the second label.

Specifically, the second user data is input into the second classification model, which outputs a second prediction score. A second loss value is calculated from the second prediction score and the second label. If the second loss value is greater than or equal to a second threshold, the second classification model is adjusted using the second loss value; if the second loss value is smaller than the second threshold, training of the second classification model is complete.
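The stopping rule in S403 and S404 (keep adjusting the model while the loss is at or above a threshold) can be illustrated with a deliberately simple stand-in model. The sketch below uses logistic regression trained by gradient descent in place of the XGBoost classification layer mentioned earlier, purely so that the example stays self-contained; the learning rate, step limit, and function names are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Prediction score in (0, 1) for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def log_loss(weights, bias, samples):
    """Mean binary cross-entropy between prediction scores and labels."""
    eps = 1e-12
    total = 0.0
    for x, y in samples:
        p = predict(weights, bias, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(samples)


def train_until_threshold(samples, loss_threshold, lr=0.5, max_steps=10000):
    """Adjust the model while loss >= threshold; stop once it drops below."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(max_steps):
        if log_loss(weights, bias, samples) < loss_threshold:
            break  # training is complete
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in samples:
            err = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias, log_loss(weights, bias, samples)
```

The same routine would be run twice, once on the first user data with the first labels and once on the second user data with the second labels, each with its own loss threshold.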

In the embodiment of the application, the trained first classification model and second classification model can be used for predicting the first score and the second score, so as to determine whether to perform non-automatic service on the target user, thereby improving the success probability of service on the target user, improving the service efficiency and reducing the service cost.

The following are examples of the apparatus of the present application that may be used to perform the method embodiments of the present application. For details not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the method of the present application.

Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The embodiment of the application provides a data processing device which can be integrated on electronic equipment such as a server. As shown in fig. 5, the data processing apparatus 50 includes: an acquisition unit 51, an analysis unit 52, an output unit 53, a marking unit 54. Wherein:

an acquisition unit 51, configured to acquire user data of a target user, where the user data includes related data generated in the process of performing automated service on the target user;

an analysis unit 52, configured to input the user data into the first classification model for normalization analysis to obtain a first score, where the first score represents a first probability that the target user is successfully served without non-automated service being performed after the automated service, the larger the first score, the larger the first probability, and being successfully served means that the target user has a preset type intention toward the target object;

an output unit 53, configured to output a service scheme corresponding to a preset type intention if the first score is greater than or equal to a first score threshold;

and a marking unit 54, configured to mark the target user as a user who needs to provide the non-automation service if the first score is smaller than the first score threshold.

In a possible embodiment, the analysis unit 52 is further configured to: input the user data into a second classification model for normalization analysis to obtain a second score, wherein the second score represents a second probability that the target user is successfully served if non-automated service is performed after the automated service, and the larger the second score, the larger the second probability.

In a possible embodiment, the output unit 53 is specifically configured to: if the first score is greater than or equal to the first score threshold and the second score is less than the second score threshold, output a service scheme corresponding to the preset type intention.

In a possible embodiment, the marking unit 54 is specifically configured to: if the first score is less than the first score threshold and the second score is greater than or equal to the second score threshold, mark the target user as a user who needs to be provided with non-automated service.

In a possible embodiment, the analysis unit 52 is specifically configured to: inputting the user data into a coding layer of the first classification model for coding treatment to obtain a target feature vector; and inputting the target feature vector into a classification layer of the first classification model for normalization analysis to obtain a first score.

In one possible embodiment, the related data includes user attribute feature data and user interaction data in the automated service process. When inputting the user data into the coding layer of the first classification model for coding processing to obtain the target feature vector, the analysis unit 52 is specifically configured to: label the user attribute feature data through the coding layer to obtain label data, and encode the label data to obtain an attribute feature vector; and extract interaction features from the user interaction data through the coding layer to obtain an interaction feature vector, and splice the attribute feature vector and the interaction feature vector to obtain the target feature vector.

In one possible implementation, the user interaction data includes dialog text data and dialog behavior data; the analysis unit 52 is specifically configured to, when extracting the interaction feature of the user interaction data through the coding layer to obtain the interaction feature vector: obtaining reply text data of a target user from dialogue text data, and performing word segmentation processing on the reply text data to obtain a plurality of word vectors; performing preset operation on a plurality of word vectors to determine dialogue feature vectors; vectorizing dialogue behavior data to obtain behavior feature vectors; and splicing the dialogue feature vector and the behavior feature vector to obtain the interaction feature vector.

In a possible implementation manner, when performing word segmentation processing on the reply text data to obtain a plurality of word vectors, the analysis unit 52 is specifically configured to: perform vector extraction on the word segmentation result through a pre-trained word embedding model to obtain the plurality of word vectors.

In a possible implementation manner, when performing vectorization processing on the dialogue behavior data to obtain a behavior feature vector, the analysis unit 52 is specifically configured to: count, according to a preset statistical rule, a plurality of related behavior features of the target user in the automated service process to obtain statistical results; if no statistical result exists for a related behavior feature, assign a value to that feature according to a preset assignment rule to obtain the corresponding feature value; and splice the statistical results and the feature values in a preset sequence to obtain the behavior feature vector.

In a possible implementation, the preset type of intention includes a preparation intention, and the output unit 53 is specifically configured to: outputting a service scheme corresponding to the provisioning intention, wherein the service scheme is used for indicating a target user to provision a target object.

In a possible implementation, the first classification model and the second classification model are trained by:

acquiring first user data and second user data, the first user data comprising: related data generated in the process of performing an automated service on a first user, wherein the first user is a user who does not perform a non-automated service after the automated service, and the second user data comprises: related data generated in the process of carrying out automation service on a second user, wherein the second user is a user who enters non-automation service after the automation service; acquiring a first label corresponding to the first user data and a second label corresponding to the second user data, wherein the first label indicates whether the first user is successfully served or not, and the second label indicates whether the second user is successfully served or not; training a first classification model using the first user data and the first tag; a second classification model is trained using the second user data and the second tag.

The device provided by the embodiment of the application can be used for executing the method in the embodiment, and the implementation principle and the technical effect are similar, and are not repeated here.

It should be understood that the division of the above apparatus into units is merely a division of logical functions; in actual implementation, the units may be fully or partially integrated into one physical entity or may be physically separate. These units may all be implemented as software called by a processing element, or may all be implemented in hardware; alternatively, some units may be implemented as software called by a processing element while the other units are implemented in hardware. For example, the processing unit may be a separately provided processing element, may be integrated in a chip of the above apparatus, or may be stored in a memory of the above apparatus in the form of program code that is called by a processing element of the above apparatus to execute the functions of the processing unit. The implementation of the other units is similar. Furthermore, all or part of these units may be integrated together or may be implemented independently. The processing element here may be an integrated circuit with signal processing capabilities. In implementation, each step of the above method or each of the above units may be completed by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.

For example, the above units may be one or more integrated circuits configured to implement the above methods, such as one or more application specific integrated circuits (Application Specific Integrated Circuit, abbreviated as ASIC), one or more digital signal processors (Digital Signal Processor, abbreviated as DSP), or one or more field programmable gate arrays (Field Programmable Gate Array, abbreviated as FPGA). For another example, when a unit is implemented in the form of program code scheduled by a processing element, the processing element may be a general purpose processor, such as a central processing unit (Central Processing Unit, CPU) or another processor that can invoke the program code. For another example, the units may be integrated together and implemented in the form of a system-on-a-chip (System-On-a-Chip, SOC).

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions in accordance with the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave) means. A computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the application. As shown in fig. 6, the electronic device 60 may include: a processor 61, a memory 62, a communication interface 63, and a system bus 64. The memory 62 and the communication interface 63 are connected to the processor 61 through the system bus 64 and communicate with each other through it. The memory 62 is used for storing a computer program, the communication interface 63 is used for communicating with other devices, and the processor 61 is used for calling the computer program in the memory to execute the scheme of the data processing method embodiments.

The system bus 64 mentioned in fig. 6 may be a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The system bus 64 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or only one type of bus.

The communication interface 63 is used to enable communication between the database access apparatus and other devices (e.g., target user side, read-write library, and read-only library).

The memory 62 may include a random access memory (Random Access Memory, simply referred to as RAM) and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory.

The processor 61 may be a general-purpose processor including a central processing unit, a network processor (Network Processor, NP) and the like; but may also be a digital signal processor DSP, an application specific integrated circuit ASIC, a field programmable gate array FPGA or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component.

The embodiment of the application also provides a computer readable storage medium, in which a computer program is stored, which when run on an electronic device causes the electronic device to execute the data processing method according to any of the method embodiments above.

The embodiment of the application also provides a chip running a computer program, and the chip is used for executing the data processing method of any method embodiment.

Embodiments of the present application also provide a computer program product comprising a computer program stored in a computer readable storage medium, from which at least one processor can read the computer program, the at least one processor implementing a data processing method according to any of the method embodiments above when executing the computer program.

In the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A alone, both A and B, and B alone, where A and B may each be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship; in a formula, the character "/" indicates that the associated objects before and after it are in a "division" relationship. "At least one of the following" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or plural.

It will be appreciated that the various numerals referred to in the embodiments of the present application are merely for ease of description and are not intended to limit the scope of the embodiments of the present application. In the embodiments of the present application, the sequence number of each process does not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present application in any way.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims

1. A method of data processing, comprising:

acquiring user data of a target user, wherein the user data comprises related data generated in the process of carrying out automatic service on the target user;

inputting the user data into a first classification model for normalization analysis to obtain a first score, wherein the first score represents a first probability that the target user is successfully served without non-automated service being performed after the automated service, the larger the first score, the larger the first probability, and being successfully served means that the target user has a preset type intention toward a target object;

If the first score is greater than or equal to a first score threshold and the second score is less than a second score threshold, outputting a service scheme corresponding to the preset type intention; the second score represents a second probability of being successfully serviced if non-automated service is performed after the target user passes through the automated service;

and if the first score is smaller than the first score threshold and the second score is larger than or equal to the second score threshold, marking the target user as a user needing to provide non-automation service.
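For illustration only, the two threshold branches recited in claim 1 can be sketched as follows. The names `Routing` and `route_user` are hypothetical, and the `UNSPECIFIED` outcome is an assumption covering score combinations that claim 1 does not address.

```python
from enum import Enum


class Routing(Enum):
    OUTPUT_SERVICE_SCHEME = "output_service_scheme"
    MARK_NON_AUTOMATED = "mark_non_automated"
    UNSPECIFIED = "unspecified"  # assumption: claim 1 is silent on these cases


def route_user(first_score: float, second_score: float,
               first_threshold: float, second_threshold: float) -> Routing:
    """Apply the two conditions of claim 1 to a pair of scores."""
    # High chance of success with automated service alone, low chance of
    # success with non-automated service: output the service scheme.
    if first_score >= first_threshold and second_score < second_threshold:
        return Routing.OUTPUT_SERVICE_SCHEME
    # Low chance with automated service alone, high chance with
    # non-automated service: mark the user for non-automated service.
    if first_score < first_threshold and second_score >= second_threshold:
        return Routing.MARK_NON_AUTOMATED
    return Routing.UNSPECIFIED
```

Called as `route_user(0.8, 0.2, 0.5, 0.5)`, the sketch selects the service-scheme branch; with the two scores swapped it selects the marking branch.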

2. The data processing method of claim 1, wherein the method further comprises:

inputting the user data into a second classification model for normalization analysis to obtain the second score, wherein the second score represents the second probability that the target user is successfully served if non-automated service is performed after the automated service, and the larger the second score, the larger the second probability.

3. The data processing method of claim 2, wherein inputting the user data into the first classification model for normalization analysis to obtain the first score comprises:

inputting the user data to a coding layer of the first classification model for coding processing to obtain a target feature vector;

and inputting the target feature vector into a classification layer of the first classification model for normalization analysis to obtain the first score.
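A minimal sketch of the two-stage structure in claim 3 is given below. It assumes, purely for illustration, that the coding layer is an identity mapping over numeric inputs and that the "normalization analysis" is a linear layer followed by a sigmoid; the claim does not fix either choice, and `encode` and `classify` are hypothetical names.

```python
import math
from typing import List, Sequence


def encode(user_data: Sequence[float]) -> List[float]:
    """Stand-in coding layer: returns the inputs as the target feature vector."""
    return [float(x) for x in user_data]


def classify(features: Sequence[float], weights: Sequence[float],
             bias: float) -> float:
    """Stand-in classification layer: linear score squashed into (0, 1)."""
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid as the assumed normalization


def first_score(user_data: Sequence[float], weights: Sequence[float],
                bias: float) -> float:
    """Coding layer followed by classification layer."""
    return classify(encode(user_data), weights, bias)
```

The sigmoid keeps the score between 0 and 1, so it can be compared directly with a score threshold.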

4. The data processing method according to claim 3, wherein the related data comprises user attribute feature data and user interaction data from the automated service process, and wherein inputting the user data to the coding layer of the first classification model for coding processing to obtain the target feature vector comprises:

performing labeling processing on the user attribute feature data through the coding layer to obtain label data, and encoding the label data to obtain an attribute feature vector;

and extracting interaction features from the user interaction data through the coding layer to obtain an interaction feature vector, and concatenating the attribute feature vector and the interaction feature vector to obtain the target feature vector.
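The labeling, encoding, and concatenation steps of claim 4 might look like the sketch below. The age attribute, its three bucket labels, and the one-hot encoding are assumptions chosen for illustration; the claim does not name specific attributes or an encoding scheme.

```python
from typing import List, Sequence

# Hypothetical label vocabulary for a single attribute (age).
AGE_LABELS = ["young", "middle", "senior"]


def label_age(age: int) -> str:
    """Labeling step: map a raw attribute value to a label (assumed buckets)."""
    if age < 30:
        return "young"
    if age < 50:
        return "middle"
    return "senior"


def one_hot(label: str, vocabulary: Sequence[str]) -> List[float]:
    """Encoding step: turn a label into an attribute feature vector."""
    return [1.0 if label == entry else 0.0 for entry in vocabulary]


def build_target_vector(age: int,
                        interaction_vector: Sequence[float]) -> List[float]:
    """Concatenate the attribute vector with the interaction vector."""
    attribute_vector = one_hot(label_age(age), AGE_LABELS)
    return attribute_vector + list(interaction_vector)
```

Under these assumptions, `build_target_vector(35, [0.2, 0.7])` places the one-hot attribute block first and the interaction features after it.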

5. The data processing method according to claim 4, wherein the user interaction data includes dialogue text data and dialogue behavior data, and wherein extracting interaction features from the user interaction data through the coding layer to obtain the interaction feature vector comprises:

obtaining reply text data of the target user from the dialogue text data, and performing word segmentation on the reply text data to obtain a plurality of word vectors;

performing a preset operation on the plurality of word vectors to determine a dialogue feature vector;

vectorizing the dialogue behavior data to obtain a behavior feature vector;

and concatenating the dialogue feature vector and the behavior feature vector to obtain the interaction feature vector.
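The pipeline of claim 5 can be sketched as follows. Three things here are assumptions rather than anything the claims specify: whitespace splitting stands in for word segmentation, a two-dimensional toy lookup table stands in for a word embedding model, and element-wise averaging is used as the "preset operation".

```python
from typing import List, Sequence

# Toy embedding table; a pre-trained word embedding model would supply these.
TOY_EMBEDDINGS = {"yes": [1.0, 0.0], "interested": [0.0, 1.0]}
UNKNOWN = [0.0, 0.0]  # assumed vector for out-of-vocabulary tokens


def word_vectors(reply_text: str) -> List[List[float]]:
    """Segment the reply text and look up one vector per token."""
    tokens = reply_text.lower().split()
    return [TOY_EMBEDDINGS.get(token, UNKNOWN) for token in tokens]


def mean_pool(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Assumed preset operation: element-wise mean of the word vectors."""
    if not vectors:
        return list(UNKNOWN)
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def interaction_vector(reply_text: str,
                       behavior_vector: Sequence[float]) -> List[float]:
    """Concatenate the dialogue feature vector and the behavior feature vector."""
    dialogue_vector = mean_pool(word_vectors(reply_text))
    return dialogue_vector + list(behavior_vector)
```

Averaging yields a fixed-length dialogue feature vector regardless of reply length, which is what makes the later concatenation well defined.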

6. The data processing method according to claim 5, wherein performing word segmentation on the reply text data to obtain the plurality of word vectors comprises:

performing vector extraction on the word segmentation result through a pre-trained word embedding model to obtain the plurality of word vectors.

7. The data processing method according to claim 5, wherein the dialogue behavior data includes a plurality of behaviors, each behavior corresponding to a feature value, and wherein vectorizing the dialogue behavior data to obtain the behavior feature vector comprises:

extracting the feature values corresponding to target behaviors from the dialogue behavior data, and concatenating the extracted feature values in order to obtain the behavior feature vector, wherein a target behavior is any one or more of the plurality of behaviors.
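A possible reading of the vectorization in claim 7 is sketched below. The behavior names in the usage note are hypothetical, and the default of 0.0 for a behavior missing from the data is an assumption the claim does not address.

```python
from typing import Dict, List, Sequence


def behavior_vector(dialogue_behavior: Dict[str, float],
                    target_behaviors: Sequence[str]) -> List[float]:
    """Pick the feature value of each target behavior and concatenate in order."""
    # Assumption: a target behavior absent from the data contributes 0.0.
    return [float(dialogue_behavior.get(name, 0.0)) for name in target_behaviors]
```

For example, with hypothetical data `{"rounds": 4, "hung_up_early": 0, "reply_latency_s": 2.5}` and target behaviors `["rounds", "reply_latency_s"]`, the result follows the order of the target list rather than the order of the dictionary.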

8. The data processing method of claim 7, wherein the method further comprises:

counting the plurality of behaviors over a historical time period according to a preset statistical method;

and performing feature value processing on each behavior according to a preset feature rule to obtain the feature value corresponding to each behavior.
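As one hedged illustration of claim 8, the sketch below assumes that the preset statistical method is a simple occurrence count over a historical log and that the preset feature rule maps each behavior to its relative frequency. Both choices are assumptions; the claim leaves the method and the rule open.

```python
from collections import Counter
from typing import Dict, Iterable


def feature_values(historical_behaviors: Iterable[str]) -> Dict[str, float]:
    """Count behaviors in a historical log and convert counts to feature values."""
    counts = Counter(historical_behaviors)  # assumed statistical method
    total = sum(counts.values())
    if total == 0:
        return {}
    # Assumed feature rule: relative frequency of each behavior.
    return {behavior: count / total for behavior, count in counts.items()}
```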

9. The data processing method according to any one of claims 1 to 7, wherein the preset type of intention includes a provisioning intention, and outputting the service scheme corresponding to the preset type of intention comprises:

outputting a service scheme corresponding to the provisioning intention, wherein the service scheme indicates the information that the target user is required to provide when the target object is provisioned;

and providing the target object to the target user according to the information input by the target user in response to the service scheme.

10. The data processing method according to any one of claims 2 to 7, characterized in that the first classification model and the second classification model are trained by:

acquiring first user data and second user data, wherein the first user data comprises related data generated while providing automated service to a first user, the first user being a user who did not receive non-automated service after the automated service, and the second user data comprises related data generated while providing automated service to a second user, the second user being a user who entered non-automated service after the automated service;

acquiring a first label corresponding to the first user data and a second label corresponding to the second user data, wherein the first label indicates whether the first user was successfully served, and the second label indicates whether the second user was successfully served;

training the first classification model using the first user data and the first label;

and training the second classification model using the second user data and the second label.
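The training arrangement of claim 10, in which each model is fitted on its own cohort and its own labels, can be illustrated with the sketch below. A small logistic regression trained by stochastic gradient ascent is swapped in as the classifier purely as an assumption; the claim does not prescribe a model family, and `train_classifier`, `score`, and the hyperparameters are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def score(model: Model, features: Sequence[float]) -> float:
    """Probability-like score in (0, 1) for one feature vector."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_classifier(samples: Sequence[Sequence[float]],
                     labels: Sequence[int],
                     epochs: int = 200, lr: float = 0.5) -> Model:
    """Fit a logistic regression on one cohort (1 = successfully served)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            error = label - score((weights, bias), features)
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias


def train_both(first_user_data, first_labels,
               second_user_data, second_labels) -> Tuple[Model, Model]:
    """Train the first model on users who stayed in automated service and
    the second model on users who entered non-automated service."""
    first_model = train_classifier(first_user_data, first_labels)
    second_model = train_classifier(second_user_data, second_labels)
    return first_model, second_model
```

Keeping the two cohorts separate means each score estimates success under a different service path, which is what allows the two scores to be compared against separate thresholds.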

11. A data processing apparatus, comprising:

an acquisition unit, configured to acquire user data of a target user, wherein the user data comprises related data generated while providing automated service to the target user;

an analysis unit, configured to input the user data into a first classification model for normalization analysis to obtain a first score, wherein the first score represents a first probability that the target user is successfully served without non-automated service being performed after the automated service, the larger the first score, the larger the first probability, and being successfully served means that the target user has a preset type of intention toward a target object;

an output unit, configured to output a service scheme corresponding to the preset type of intention if the first score is greater than or equal to a first score threshold and a second score is less than a second score threshold, wherein the second score represents a second probability that the target user is successfully served if non-automated service is performed after the automated service;

and a marking unit, configured to mark the target user as a user to whom non-automated service needs to be provided if the first score is less than the first score threshold and the second score is greater than or equal to the second score threshold.

12. An electronic device, comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the data processing method of any one of claims 1 to 10.

13. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when run on an electronic device, causes the electronic device to perform the data processing method according to any of claims 1 to 10.