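WO2023165111A1 - Method and system for identifying user intent trajectories in a customer service hotline - Google Patents
Method and system for identifying user intent trajectories in a customer service hotline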
- Publication number
- WO2023165111A1 (PCT/CN2022/118511)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- dialogue
- intention
- model
- text
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Definitions
- This application relates to the field of artificial intelligence, for example, it relates to a method and system for identifying user intention traces in customer service hotlines based on user behavior traces and context.
- One context-based intent recognition method takes the one-hot result of the preceding intent classification as a feature and combines it with the current sentence to jointly predict the current intent.
- This method depends heavily on whether the preceding intent classification is correct: an error in the preceding intent can propagate into a chain of errors in subsequent intent results.
- Relying on text alone for intent recognition also has limitations.
- The user's behavior trajectory, such as browsing a product page while on the call or purchasing a product during the call, also affects the user's behavior after that point in time and provides valuable information for user intent recognition.
- Difficulty 2: different user behavior trajectories corresponding to the same sentence may lead to different real user intentions.
- During a customer service dialogue, users also generate behavioral actions in real time, such as browsing product detail pages or completing a product purchase. These actions may also imply the user's next intention while the expression in the sentence remains ambiguous, which increases the difficulty of analyzing the user's true intention.
- the present application provides a method and system for identifying user intention tracks in a customer service hotline.
- the method and system of the present application can effectively reduce the propagation of preceding-intent errors into intent recognition for the current sentence, and the modeling takes into account the user's behavior trajectory during the preceding dialogue.
- This application provides a method for identifying a user's intent trajectory in a customer service hotline based on the user behavior trajectory and context, including:
- Feature processing: using a model pre-trained on a corpus to extract features from the dialogue text, and using the output vector of the model as the text feature representation; applying normalization and one-hot processing to the user behavior trajectory data, wherein the continuous numerical features are normalized so that the processed features conform to the standard normal distribution, and the discrete numerical features are first one-hot encoded and then normalized, to obtain the user behavior feature representation; one-hot encoding the preceding user intent and then normalizing the encoded result to obtain the preceding user intent feature representation; and concatenating the text feature representation, the user behavior feature representation and the preceding user intent feature representation, which is output as the sample feature representation;
- Intent classification: using a multilayer perceptron (MLP) neural network as the intent classification model, with the sample feature representation as the model input and the one-hot vector of the user intent as the target; during training, the cross-entropy loss function and the backpropagation mechanism are used to update the network parameters, and the trained model parameters are saved; in the prediction stage, an MLP model with the same structure as the MLP neural network is built and loads the trained model parameters, the sample feature representation is input into the loaded MLP model, and the vector of the last layer of the MLP model is used as the output result;
- the application also provides a system for identifying user intent trajectories in customer service hotlines, which is configured to identify user intent trajectories in customer service hotlines based on user behavior trajectories and context, including a data slicing module, a feature processing module, an intent classification module, and a Beam Search strategy module;
- the data slicing module is configured to receive user behavior trajectory data and dialogue text, cut the dialogue into N dialogue fragments of 4 sentences each so that a complete dialogue text is converted into N sequential dialogue fragments, associate the N dialogue fragments with the user behavior trajectory data according to the time nodes at which each dialogue fragment and the user behavior occur, have each dialogue fragment in the training corpus manually labeled with the correct user intent category, and output the data to the feature processing module;
- the feature processing module is configured to use a Bidirectional Encoder Representations from Transformers (BERT) model containing 12 Transformer layers to extract features from the dialogue text and obtain the text feature vector representation; process the user behavior trajectory data with normalization and one-hot encoding to obtain the user behavior feature representation; one-hot encode the preceding user intent and apply Z-score normalization after encoding to obtain the preceding user intent feature representation; and concatenate the text feature representation, the user behavior feature representation and the preceding user intent feature representation, outputting the result to the intent classification module as the sample feature representation;
- the intent classification module is configured to use a multilayer perceptron neural network as the intent classification model, with the sample feature representation as the model input and the one-hot vector of the user intent as the target; during training, the network parameters are updated using the cross-entropy loss function and the backpropagation mechanism, and the trained model parameters are saved; in the prediction stage, an MLP model with the same structure as the MLP neural network is built and loads the trained model parameters, the sample feature representation output by the feature processing module is input into the loaded MLP model, and the vector of the last layer of the MLP model is output to the Beam Search strategy module;
- the Beam Search strategy module is configured to generate, in the prediction stage, the optimal user intent trajectory from the one-hot vector of the user intent and the Beam Search strategy, as the user intent trajectory output for the entire dialogue text, wherein the optimal user intent trajectory is the finally selected intent trajectory with the highest probability.
- the present application also provides an electronic device, including:
- at least one processor; and a storage device configured to store at least one program;
- when the at least one program is executed by the at least one processor, the at least one processor is enabled to implement the above-mentioned method for identifying user intent trajectories in a customer service hotline.
- the present application also provides a computer storage medium storing a computer program, and when the program is executed by a processor, the above-mentioned method for identifying a user's intention track in a customer service hotline is realized.
- Fig. 1 is a schematic diagram of the overall operation process of a method for user intention trajectory identification in a customer service hotline provided in the embodiment of the present application;
- FIG. 2 is a schematic representation of text features in a system for identifying user intention traces in a customer service hotline provided in an embodiment of the present application;
- FIG. 3 is a schematic diagram of an intention classification model in a system for identifying user intention traces in a customer service hotline provided in an embodiment of the present application;
- FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
- the present application provides a method for identifying user intention traces in a customer service hotline.
- the method identifies user intent traces in a customer service hotline based on user behavior traces and context.
- the method includes:
- the first step is data acquisition: obtaining user behavior trajectory data and dialogue text.
- the second step is data slicing and data association: the dialogue text is sliced with a sliding window so that a complete dialogue text is converted into N sequential dialogue fragments, and the N dialogue fragments are associated with the user behavior trajectory data according to the time nodes at which each dialogue fragment and the user behavior occur.
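The time-based association in the second step can be sketched as follows. This is a minimal illustration rather than the application's implementation: the `Fragment` and `BehaviorEvent` types, the use of seconds as the time unit, and the rule that an event belongs to every fragment whose time span contains it are all assumptions, since the text only states that the association is based on time nodes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BehaviorEvent:
    timestamp: float  # assumed unit: seconds since the start of the call
    action: str       # hypothetical label, e.g. "browse_product_page"


@dataclass
class Fragment:
    index: int
    start: float          # time of the first sentence in the fragment
    end: float            # time of the last sentence in the fragment
    sentences: List[str]
    events: List[BehaviorEvent] = field(default_factory=list)


def associate(fragments: List[Fragment], events: List[BehaviorEvent]) -> List[Fragment]:
    """Attach each event to every fragment whose [start, end] span contains it.

    Because the sliding windows overlap, one event may be attached to
    more than one fragment under this assumed rule.
    """
    for frag in fragments:
        frag.events = [e for e in events if frag.start <= e.timestamp <= frag.end]
    return fragments
```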
- the third step is feature processing.
- For the dialogue text, a model pre-trained on a corpus is used to extract features, and the output vector of the model is used as the text feature representation. The user behavior trajectory data is processed with normalization and one-hot encoding: continuous numerical features are normalized so that the processed features conform to the standard normal distribution, and discrete numerical features are first one-hot encoded and then normalized, which yields the user behavior feature representation. The preceding user intent is one-hot encoded and then normalized to obtain the preceding user intent feature representation. Finally, the text feature representation, the user behavior feature representation and the preceding user intent feature representation are concatenated and output as the sample feature representation.
- the fourth step is intent classification, using a multilayer perceptron neural network as the intent classification model, with the sample feature representation as the model input and the one-hot vector of the user intent as the target. During training, the network parameters are updated with the cross-entropy loss function and the backpropagation mechanism, and the model parameters are saved after training.
- in the prediction stage, an MLP model with the same structure as the MLP neural network is built and loads the trained model parameters, the sample feature representation is input into the model, and the vector of the last layer is taken as the output result.
- the fifth step is to generate the optimal user intent trajectory, and use the Beam Search strategy to generate the optimal user intent trajectory in the prediction stage.
- a window of size 4 and step size 2 is used to slide over and slice the dialogue text, cutting the original text into N dialogue fragments of 4 sentences each.
- Each dialogue fragment follows the order customer service sentence, user sentence, customer service sentence, user sentence; if the last dialogue fragment ends with a customer service sentence, a blank user sentence is appended to fill the end of that fragment.
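The slicing rule above can be sketched as a short function. It assumes the dialogue is given as a list of strings that strictly alternate between customer service and user sentences, starting with customer service. The function name `slice_dialogue` and the handling of dialogues shorter than one window are illustrative assumptions.

```python
def slice_dialogue(sentences, window=4, step=2, pad=""):
    """Cut an alternating agent/user dialogue into overlapping fragments.

    sentences: utterances in the order agent, user, agent, user, ...
    Returns a list of fragments, each a list of `window` sentences.
    """
    sents = list(sentences)
    if len(sents) % 2 == 1:
        # The dialogue ends on a customer service sentence:
        # append a blank user sentence so the last fragment is complete.
        sents.append(pad)
    if len(sents) <= window:
        # Assumption: a dialogue no longer than one window is returned
        # as a single (possibly short) fragment.
        return [sents]
    return [sents[i:i + window] for i in range(0, len(sents) - window + 1, step)]


fragments = slice_dialogue(["a1", "u1", "a2", "u2", "a3", "u3", "a4"])
# fragments[-1] is ["a3", "u3", "a4", ""]: the blank user sentence pads the end.
```

With a step of 2, consecutive fragments share one customer service and user exchange, so each fragment carries the immediately preceding turn as context.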
- a BERT model containing 12 Transformer layers, pre-trained on a corpus containing a large amount of prior knowledge, is used to extract the features of the dialogue text.
- a basic 12-layer BERT model is first built and connected to a fully connected classification layer.
- the input of this model is tokenized dialogue text data, and the target is the one-hot vector of user intent.
- This model is first trained for a small number of rounds, with the first 8 Transformer layers frozen during training.
- in the prediction stage, a BERT model with the same structure is built and loads the trained model parameters; the tokenized dialogue text data is input into the model, and the vector corresponding to the [CLS] token of the last layer is taken as the output, which is the text feature representation.
- the third step uses normalization and one-hot encoding to process the user behavior trajectory data; Z-score normalization is used for continuous numerical features.
- the processed features conform to the standard normal distribution, that is, the mean is 0 and the standard deviation is 1; the conversion function is:
- $x^{*} = \frac{x - \mu}{\sigma}$
- where $\mu$ is the mean of all sample data
- and $\sigma$ is the standard deviation of all sample data
- for discrete numerical features, one-hot encoding is applied first, followed by Z-score normalization.
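The two encoding paths can be sketched in plain Python. This is an illustration under stated assumptions: the helper names (`z_score`, `one_hot`, `encode_discrete`) are hypothetical, the population standard deviation is used because the text does not say which variant applies, and constant columns are mapped to zeros to avoid division by zero.

```python
import math


def z_score(values):
    """Standardize one feature column: x* = (x - mu) / sigma."""
    n = len(values)
    mu = sum(values) / n
    # Assumption: population standard deviation over all samples.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    if sigma == 0:
        # Assumption: a constant column carries no information.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(value, categories):
    """Encode a discrete value as a 0/1 vector over the known categories."""
    return [1.0 if value == c else 0.0 for c in categories]


def encode_discrete(column, categories):
    """One-hot encode a discrete column, then Z-score each resulting 0/1 column."""
    rows = [one_hot(v, categories) for v in column]
    cols = [z_score([r[j] for r in rows]) for j in range(len(categories))]
    # Transpose back so that each sample gets one encoded row.
    return [[cols[j][i] for j in range(len(categories))] for i in range(len(column))]
```

The same `encode_discrete` path would apply to the preceding user intent, which the text also one-hot encodes before Z-score normalization.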
- the MLP neural network in the fourth step consists of 2 hidden layers and 1 output layer; the 2 hidden layers have 128 neurons and 64 neurons respectively,
- and use the rectified linear unit (ReLU) as the activation function
- the number of neurons in the output layer equals the dimension of the user intent one-hot vector
- and the output layer uses the softmax function as the activation function
- during training, the cross-entropy loss function and the backpropagation mechanism update the network parameters; inverted dropout is applied to the input layer and the first hidden layer to reduce overfitting, and an Early Stopping mechanism monitors the loss on the validation set and stops training when the validation loss has not decreased within a certain number of rounds, so as to avoid overfitting; the model parameters are saved after training. In the prediction stage, an MLP model with the same structure as the MLP neural network is built and loads the trained model parameters, the sample feature representation is input into the model, and the vector of the last layer is taken as the output result.
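For readers unfamiliar with inverted dropout, the idea is to zero out units at random during training and rescale the surviving units by 1 / (1 - p), so that no rescaling is needed at prediction time. A minimal sketch follows; the drop probability is an assumption, since the text does not state a rate, and `inverted_dropout` is a hypothetical helper name.

```python
import random


def inverted_dropout(activations, drop_prob, rng, training=True):
    """Apply inverted dropout to a list of activations.

    During training each unit is kept with probability (1 - drop_prob)
    and scaled by 1 / (1 - drop_prob); at prediction time the input is
    returned unchanged.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed only to make the illustration repeatable
hidden = inverted_dropout([1.0, 2.0, 3.0, 4.0], drop_prob=0.5, rng=rng)
```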
- when processing the output of the intent classification model, the k intent categories with the highest probability are kept at each step (k is the beam size), and k is chosen as 2 to 3.
- when predicting the user intent of dialogue fragment T+1, each of the k intent categories retained for dialogue fragment T is input in turn as the preceding user intent feature, and so on until the user intent of the last dialogue fragment has been predicted; the intent trajectory with the highest probability is then selected as the user intent trajectory of the entire dialogue text.
- this application provides a system for identifying user intent tracks in customer service hotlines.
- the system identifies user intent trajectories in customer service hotlines based on user behavior trajectories and context, and includes a data slicing module, a feature processing module, an intent classification module and a Beam Search strategy module.
- the data slicing module receives user behavior trajectory data and dialogue text, cuts the original text into N dialogue fragments of four sentences each so that a complete dialogue text is converted into N sequential dialogue fragments, and then associates the N dialogue fragments with the user behavior trajectory data according to the time nodes at which each dialogue fragment and the user behavior occur. In the training corpus, each dialogue fragment is manually labeled with the correct user intent category, and the data is output to the feature processing module.
- the feature processing module uses a BERT model containing 12 Transformer layers to extract features from the dialogue text and obtain the text feature vector representation; processes the user behavior trajectory data with normalization and one-hot encoding to obtain the user behavior trajectory feature representation; one-hot encodes the preceding user intent and applies Z-score normalization after encoding to obtain the preceding user intent feature representation; and concatenates the text feature representation, the user behavior feature representation and the preceding user intent feature representation, which is output to the intent classification module as the sample feature representation.
- the intent classification module uses a multilayer perceptron neural network as the intent classification model, takes the sample feature representation as input and the one-hot vector of the user intent as the target, and during training updates the network parameters with the cross-entropy loss function and the backpropagation mechanism; the model parameters are saved after training.
- in the prediction stage, an MLP model with the same structure as the MLP neural network is built and loads the trained model parameters, and the sample feature representation output by the feature processing module is input into the model.
- the vector of the last layer is sent to the Beam Search strategy module as the output result.
- the Beam Search strategy module generates the optimal user intention trajectory in the prediction stage, and finally selects an intention trajectory with the highest probability as the output of the user intention trajectory of the entire dialogue text.
- this module uses a window of size 4 and step size 2 to slide over and slice the dialogue text, cutting the original text into N dialogue fragments of 4 sentences each.
- Each dialogue fragment follows the order customer service sentence, user sentence, customer service sentence, user sentence; if the last dialogue fragment ends with a customer service sentence, a blank user sentence is appended to fill the end of that fragment. In this way a complete dialogue text is converted into N sequential dialogue fragments.
- this module associates text fragments with user behavior trajectory data, and the association basis is the time node when the dialogue fragment and user behavior trajectory occur.
- An example is shown in Table 1 below:
- this module uses the BERT model with 12 layers of Transformer pre-trained on the corpus containing a large amount of prior knowledge to extract the features of the dialogue text.
- a basic version of the 12-layer BERT model is first built and connected to a fully connected layer classification model.
- the input of this model is tokenized dialogue text data, and the target is the one-hot vector of user intent.
- This model is first trained for a small number of rounds. The first 8 Transformer layers are frozen during training so that their parameters are not updated, and the cross-entropy loss function and the backpropagation mechanism are used to update the parameters of the last 4 Transformer layers and the fully connected layer. The BERT model parameters are saved after training.
- this module uses normalization and one-hot to process the user behavior trajectory data to form a user behavior trajectory feature representation.
- for continuous numerical features, Z-score normalization is used, and the processed features conform to the standard normal distribution, that is, the mean is 0 and the standard deviation is 1.
- the conversion function is:
- $x^{*} = \frac{x - \mu}{\sigma}$
- where $\mu$ is the mean of all sample data
- and $\sigma$ is the standard deviation of all sample data
- for discrete numerical features, one-hot encoding is applied first, followed by Z-score normalization.
- this module one-hot encodes the preceding user intent and applies Z-score normalization after encoding to form the preceding user intent feature representation.
- feature concatenation is performed last: this module concatenates the text feature representation, the user behavior trajectory feature representation and the preceding user intent feature representation, and outputs the result as the sample feature representation.
- this module uses the MLP neural network as the intent classification algorithm model.
- the input of this model is the output of the feature processing module, that is, the sample feature representation, and the target is the one-hot vector of user intent.
- the first 2 hidden layers have 128 neurons and 64 neurons respectively.
- ReLU is used as the activation function.
- the number of neurons in the output layer is the same as the dimension of the user intent one-hot vector, and the softmax function is used as the activation function.
- During training, this model uses the cross-entropy loss function and the backpropagation mechanism to update the network parameters, applies inverted dropout to the input layer and the first hidden layer to reduce overfitting, and uses an Early Stopping mechanism to monitor the loss on the validation set, stopping training when the validation loss has not decreased within a certain number of rounds so as to avoid overfitting. The model parameters are saved after training.
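The Early Stopping rule described above can be sketched as a small helper that is consulted once per training round. The class name, the `patience` default of 3 and the `min_delta` threshold are assumptions; the text only says training stops when the validation loss has not decreased "within a certain number of rounds".

```python
class EarlyStopping:
    """Track validation loss and signal when training should stop."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience    # assumed number of rounds to wait
        self.min_delta = min_delta  # minimum decrease that counts as improvement
        self.best = float("inf")
        self.bad_rounds = 0

    def step(self, val_loss):
        """Record one round's validation loss; return True to stop training."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
        return self.bad_rounds >= self.patience


stopper = EarlyStopping(patience=3)
for round_index, loss in enumerate([1.0, 0.8, 0.85, 0.9, 0.81, 0.7]):
    if stopper.step(loss):
        break  # three rounds without improvement over the best loss
```

In practice the parameters saved would usually be those from the round with the best validation loss, although the text does not specify this.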
- in the prediction stage, an MLP model with the same structure as the MLP neural network is built and loads the trained model parameters, the sample feature representation output by the feature processing module is input into the model, and the vector of the last layer is taken as the output result. Each element of this vector is a floating-point number between 0 and 1 indicating the probability of the corresponding user intent, and the elements of this vector sum to 1.
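The forward pass of such a network (two ReLU hidden layers of 128 and 64 units, then a softmax output) can be sketched in plain Python to show why the output is a probability vector. The weights below are random placeholders rather than trained parameters, and the input dimension and number of intents are arbitrary illustrative values.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """One fully connected layer: weights has one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    # Placeholder initialization; real parameters would be loaded from training.
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_mlp(input_dim, num_intents, rng):
    sizes = [input_dim, 128, 64, num_intents]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict(layers, x):
    for weights, bias in layers[:-1]:
        x = relu(dense(x, weights, bias))
    weights, bias = layers[-1]
    return softmax(dense(x, weights, bias))


rng = random.Random(0)
layers = build_mlp(input_dim=10, num_intents=3, rng=rng)
probs = predict(layers, [0.1] * 10)  # three values in [0, 1] that sum to 1
```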
- this module uses the Beam Search strategy to generate the optimal user intent trajectory in the prediction stage. That is, when processing the output of the intent classification model, the k intent categories with the highest probability are retained at each step (k is the beam size), and k can be 2 to 3. When predicting the user intent of dialogue fragment T+1, each of the k intent categories retained for dialogue fragment T is input in turn as the preceding user intent feature, and so on.
- the intent probabilities output by the model for the first dialogue fragment are [0.4, 0.5, 0.1].
- the two intent category candidates with the highest probability, namely A and B, are retained.
- with A input as the preceding intent feature, the resulting intent probabilities are [0.1, 0.7, 0.2]; with B input as the preceding intent feature, the resulting intent probabilities are [0.5, 0.2, 0.3].
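The worked numbers above can be run through a small beam search sketch. Two points are assumptions rather than statements from the text: a trajectory's score is taken to be the product of its per-fragment probabilities, and the k best trajectories overall are kept after each fragment. The third intent is labeled C for illustration, and `toy_predict` is a hypothetical stand-in for the trained intent classification model.

```python
LABELS = ["A", "B", "C"]


def beam_search(num_fragments, predict, k=2):
    """Return (best_path, score) over intent indices.

    predict(t, prev_intent) returns the intent probabilities for
    fragment t given the preceding intent (None for the first fragment).
    """
    beams = [((), 1.0)]
    for t in range(num_fragments):
        candidates = []
        for path, score in beams:
            prev = path[-1] if path else None
            for intent, p in enumerate(predict(t, prev)):
                candidates.append((path + (intent,), score * p))
        # Keep only the k highest-scoring trajectories.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:k]
    return beams[0]


def toy_predict(t, prev):
    # Probabilities taken from the example in the text.
    if t == 0:
        return [0.4, 0.5, 0.1]
    return {0: [0.1, 0.7, 0.2], 1: [0.5, 0.2, 0.3]}[prev]


path, score = beam_search(2, toy_predict, k=2)
trajectory = [LABELS[i] for i in path]
```

Under these assumptions the selected trajectory is A then B with a score of about 0.28, whereas committing greedily to B at the first fragment could reach at most 0.5 × 0.5 = 0.25. This illustrates how keeping more than one candidate limits the effect of an early misclassification.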
- the Beam Search strategy module widens the search range through the Beam Search strategy to ensure a higher accuracy rate, which effectively reduces the propagation of preceding-intent errors into intent recognition for the current sentence.
- the method and system provided by this application for identifying the user intent trajectory in a customer service hotline based on the user behavior trajectory and context widen the search range through the Beam Search strategy to ensure a higher accuracy rate: when processing the output of the intent classification model, the k intent categories with the highest probability are kept at each step, and when predicting the user intent of dialogue fragment T+1, each of the k intent categories retained for dialogue fragment T is input in turn as the preceding user intent feature, and so on.
- the Beam Search strategy effectively reduces the propagation of preceding-intent errors into intent recognition for the current sentence, so that the optimal user intent trajectory is generated in the prediction stage; in addition, the user behavior trajectory data generated during the dialogue is associated with the dialogue, and its features are processed and concatenated with the text features to take part in the training and prediction of user intent classification, which improves recognition accuracy for ambiguous text expressions.
- FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
- FIG. 4 shows a block diagram of an electronic device 412 suitable for implementing embodiments of the present application.
- the electronic device 412 shown in FIG. 4 is only an example, and should not limit the functions and scope of use of this embodiment of the present application.
- electronic device 412 takes the form of a general-purpose computing device.
- Components of the electronic device 412 may include, but are not limited to: one or more processors 416, a storage device 428, and a bus 418 connecting different system components (including the storage device 428 and the processor 416).
- Bus 418 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus structures.
- these architectures include but are not limited to the Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA bus, Video Electronics Standards Association (VESA) local bus and Peripheral Component Interconnect (PCI) bus.
- Electronic device 412 includes a variety of computer system readable media. These media can be any available media that can be accessed by electronic device 412 and include both volatile and nonvolatile media, removable and non-removable media.
- Storage 428 may include computer system readable media in the form of volatile memory, such as random access memory (Random Access Memory, RAM) 430 and/or cache memory 432 .
- Electronic device 412 may include other removable/non-removable, volatile/nonvolatile computer system storage media.
- storage system 434 may be configured to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard drive").
- a disk drive configured to read from and write to a removable non-volatile magnetic disk (such as a "floppy disk") may be provided, as well as an optical disc drive configured to read from and write to a removable non-volatile optical disc (such as a Compact Disc Read-Only Memory (CD-ROM), Digital Video Disc Read-Only Memory (DVD-ROM) or other optical media).
- each drive may be connected to bus 418 through one or more data media interfaces.
- the storage device 428 may include at least one program product, which has a set of (for example, at least one) program modules configured to execute the functions of the embodiments of the present application.
- a program 436 having a set (at least one) of program modules 426 may be stored, for example, in storage device 428, such program modules 426 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which, or some combination of which, may include an implementation of a network environment.
- the program modules 426 generally perform the functions and/or methods of the embodiments described herein.
- the electronic device 412 may also communicate with one or more external devices 414 (such as a keyboard, pointing device, camera, display 424, etc.), with one or more devices that enable a user to interact with the electronic device 412, and/or with any device (e.g., network card, modem, etc.) that enables the electronic device 412 to communicate with one or more other computing devices. Such communication may be performed through an Input/Output (I/O) interface 422.
- the electronic device 412 can also communicate with one or more networks (such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network such as the Internet) through the network adapter 420.
- network adapter 420 communicates with other modules of electronic device 412 via bus 418.
- other hardware and/or software modules may be used in conjunction with electronic device 412, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
- the processor 416 executes a variety of functional applications and data processing by running the programs stored in the storage device 428 , such as implementing the methods provided in the above-mentioned embodiments of the present application.
- the embodiment of the present application also provides a computer storage medium storing a computer program, and the computer program is used to perform the method for identifying user intention trajectory in a customer service hotline according to any one of the above-mentioned embodiments of the present application when executed by a computer processor.
- the computer storage medium in the embodiments of the present application may use any combination of one or more computer-readable media.
- the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
- a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. Examples (a non-exhaustive list) of computer-readable storage media include: electrical connections with one or more conductors, portable computer disks, hard disks, RAM, ROM, Erasable Programmable Read-Only Memory (EPROM) or flash memory, optical fiber, CD-ROM, optical storage devices, magnetic storage devices, or any suitable combination of the above.
- a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- a computer readable signal medium may include a data signal carrying computer readable program code in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. .
- the program code contained on the computer readable medium can be transmitted by any appropriate medium, including but not limited to wireless, electric wire, optical cable, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
- Computer program code for performing the operations of the present application may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
- the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
- the remote computer can be connected to the user computer through any kind of network, including a LAN or WAN, or it can be connected to an external computer (e.g., via the Internet using an Internet service provider).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
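A method for identifying user intention trajectories in a customer service hotline, and a system that performs such identification based on user behavior trajectories and context. The method includes: obtaining user behavior trajectory data and dialogue text; converting the complete dialogue text into dialogue segments and associating the dialogue segments with the user behavior trajectory data according to time nodes; obtaining a text feature representation, a user behavior feature representation, and a preceding-user-intention feature representation, and concatenating them for output as a sample feature representation; using a multilayer perceptron neural network as the intention classification model, with the sample feature representation as input and the vector of the last layer as the output result; and using a Beam Search strategy to generate the optimal user intention trajectory.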
Description
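This application claims priority to Chinese patent application No. 202210199654.6, filed with the China Patent Office on March 1, 2022, the entire contents of which are incorporated herein by reference.
This application relates to the field of artificial intelligence, for example, to a method and system for identifying user intention trajectories in a customer service hotline based on user behavior trajectories and context.
Service industries such as banks, insurance companies, e-commerce platforms, and mobile operators have all built customer service hotline systems, and the daily call volume is already very large. As artificial intelligence technology advances rapidly, enterprises use speech recognition to convert hotline recordings into semi-structured text data and apply natural language processing to mine that text, improving the efficiency of hotline analysis. Within the broad field of hotline analysis, user intention recognition is a very common requirement that is considered to bring high business value; analyzing and mining user intentions helps guide an enterprise's marketing and product operations.
Chinese patent application CN104951433A discloses a context-based intention recognition method that takes the one-hot encoding of the preceding intention classification result as a feature and combines it with the current sentence to jointly predict the current intention. However, this method depends heavily on whether the preceding intention classification result is correct; an error there may cause a run of consecutive errors in the subsequent intention results. In addition, relying on text alone for intention recognition has limitations. The user's behavior trajectory during the hotline dialogue, such as browsing a product page while on the call or purchasing a product during the call, also provides valuable information for recognizing the user's intention after that point in time.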
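Careful analysis shows that the related art faces the following difficulties in recognizing intention trajectories from text:
Difficulty 1) An error in the preceding intention classification result causes consecutive errors in subsequent intention recognition. In customer service dialogues, even identical sentences may express different user intentions, because the current sentence carries limited information; the adjacent preceding user intention, acting as a hidden state, can contribute greatly to recognizing the current intention. For the same reason, if the preceding user intention is misrecognized, the current intention is more likely to be misrecognized as well, and the error is propagated and amplified as the dialogue proceeds until the whole user intention trajectory deviates substantially.
Difficulty 2) Different user behavior trajectories accompanying the same sentence may correspond to different true user intentions. During a customer service dialogue the user is also performing actions in real time, such as browsing a product detail page or purchasing a product. These actions may imply the user's next intention, yet the wording of the sentence itself is ambiguous, which makes the user's true intention harder to analyze.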
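Summary
This application provides a method and system for identifying user intention trajectories in a customer service hotline. The method and system are intended to effectively reduce the propagation of preceding-intention errors into the intention recognition of the current sentence, and the modeling takes into account the user's behavior trajectory during the preceding dialogue.
This application provides a method for identifying user intention trajectories in a customer service hotline, which performs the identification based on user behavior trajectories and context and includes:
Data acquisition: obtaining user behavior trajectory data and dialogue text;
Data slicing and data association: slicing the dialogue text with a sliding window to convert one complete dialogue text into N ordered dialogue segments, and associating the N dialogue segments with the user behavior trajectory data according to the time nodes at which each dialogue segment and the user behavior trajectory occur;
Feature processing: extracting features from the dialogue text using a model pre-trained on a corpus, and taking the output vector of the model as the text feature representation; applying normalization and one-hot processing to the user behavior trajectory data, wherein continuous numerical features in the user behavior trajectory data are normalized so that the processed features conform to a standard normal distribution, and discrete numerical features in the user behavior trajectory data are first one-hot encoded and the encoded features are then normalized, yielding the user behavior feature representation; one-hot encoding the preceding user intention and then normalizing the encoded preceding user intention, yielding the preceding-user-intention feature representation; and concatenating the text feature representation, the user behavior feature representation, and the preceding-user-intention feature representation for output as the sample feature representation;
Intention classification: using a multilayer perceptron (MLP) neural network as the intention classification model, taking the sample feature representation as the input of the model and the one-hot vector of the user intention as the target; in the training phase of the model, updating the network parameters using a cross-entropy loss function and backpropagation, and saving the trained model parameters; in the prediction phase, building an MLP model with the same structure as the MLP neural network and loading the trained model parameters, feeding the sample feature representation into the MLP model with the loaded parameters, and taking the vector of the last layer of the MLP model as the output result;
Generating the optimal user intention trajectory: generating the optimal user intention trajectory in the prediction phase according to the one-hot vector of the user intention and a Beam Search strategy.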
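This application also provides a system for identifying user intention trajectories in a customer service hotline, configured to perform the identification based on user behavior trajectories and context and including a data slicing module, a feature processing module, an intention classification module, and a Beam Search strategy module;
The data slicing module is configured to receive user behavior trajectory data and dialogue text, cut the dialogue text into N dialogue segments of four sentences each so as to convert one complete dialogue text into N ordered dialogue segments, and associate the N dialogue segments with the user behavior trajectory data, the association being based on the time nodes at which each dialogue segment and the user behavior trajectory occur; in the training corpus, each dialogue segment is manually labeled with the correct user intention category; and the data is output to the feature processing module;
The feature processing module is configured to: extract features from the dialogue text using a Bidirectional Encoder Representations from Transformers (BERT) model containing 12 Transformer layers, obtaining the text feature vector representation; process the user behavior trajectory data using normalization and one-hot encoding, obtaining the user behavior trajectory feature representation; one-hot encode the preceding user intention and apply Z-score normalization after the one-hot encoding, obtaining the preceding-user-intention feature representation; and concatenate the text feature representation, the user behavior feature representation, and the preceding-user-intention feature representation, and output the result as the sample feature representation to the intention classification module;
The intention classification module is configured to use a multilayer perceptron neural network as the intention classification model, taking the sample feature representation as the input of the model and the one-hot vector of the user intention as the target; in the training phase of the model, the network parameters are updated using a cross-entropy loss function and backpropagation, and the trained model parameters are saved; in the prediction phase, an MLP model with the same structure as the MLP neural network is built and the trained model parameters are loaded, the sample feature representation output by the feature processing module is fed into the MLP model with the loaded parameters, and the vector of the last layer of the MLP model is output as the result to the Beam Search strategy module;
The Beam Search strategy module is configured to generate, in the prediction phase, the optimal user intention trajectory according to the one-hot vector of the user intention and the Beam Search strategy, and to output it as the user intention trajectory of the whole dialogue text, wherein the optimal user intention trajectory is the finally selected intention trajectory with the highest probability.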
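This application also provides an electronic device, including:
at least one processor; and
a storage apparatus configured to store at least one program;
wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the above method for identifying user intention trajectories in a customer service hotline.
This application also provides a computer storage medium storing a computer program, wherein the program, when executed by a processor, implements the above method for identifying user intention trajectories in a customer service hotline.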
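FIG. 1 is a schematic diagram of the overall operation flow of a method for identifying user intention trajectories in a customer service hotline according to an embodiment of this application;
FIG. 2 is a schematic diagram of the text feature representation in a system for identifying user intention trajectories in a customer service hotline according to an embodiment of this application;
FIG. 3 is a schematic diagram of the intention classification model in a system for identifying user intention trajectories in a customer service hotline according to an embodiment of this application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of this application.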
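The method and system for identifying user intention trajectories in a customer service hotline provided by this application are described below with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, this application provides a method for identifying user intention trajectories in a customer service hotline, which performs the identification based on user behavior trajectories and context. The method includes:
Step 1, data acquisition: obtain user behavior trajectory data and dialogue text.
Step 2, data slicing and data association: slice the dialogue text with a sliding window to convert one complete dialogue text into N ordered dialogue segments, and associate the N dialogue segments with the user behavior trajectory data according to the time nodes at which each dialogue segment and the user behavior trajectory occur.
Step 3, feature processing: for the dialogue text content, extract features using a model pre-trained on a corpus and take the output vector of the model as the text feature representation; for the user behavior trajectory data, apply normalization and one-hot processing, normalizing continuous numerical features so that the processed features conform to a standard normal distribution, and first one-hot encoding discrete numerical features and then normalizing them, to obtain the user behavior feature representation; one-hot encode the preceding user intention and then normalize it to obtain the preceding-user-intention feature representation; finally, concatenate the text feature representation, the user behavior feature representation, and the preceding-user-intention feature representation for output as the sample feature representation.
Step 4, intention classification: use a multilayer perceptron neural network as the intention classification model, with the sample feature representation as the model input and the one-hot vector of the user intention as the target; during training, update the network parameters using a cross-entropy loss function and backpropagation, and save the model parameters after training; in the prediction phase, build an MLP model with the same structure as the MLP neural network, load the trained model parameters, feed the sample feature representation into the model, and take the vector of the last layer as the output result.
Step 5, generating the optimal user intention trajectory: use the Beam Search strategy to generate the optimal user intention trajectory in the prediction phase.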
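In step 2 above, when slicing the dialogue text data, a window of size 4 and stride 2 slides over the dialogue text, cutting the original text into N dialogue segments of four sentences each. Every dialogue segment follows the order agent sentence, user sentence, agent sentence, user sentence. If the last dialogue segment ends with an agent sentence, a blank user sentence is appended to the end of that segment.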
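The sliding-window slicing described here can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: it assumes the dialogue is already a strictly alternating list of sentences that starts with an agent sentence, and the names `slice_dialogue` and `Sentence` are hypothetical. Dialogues shorter than four sentences are not covered by the description, so the sketch simply returns one shorter segment for them.

```python
from typing import List, Tuple

Sentence = Tuple[str, str]  # (speaker, text); speaker is "agent" or "user"


def slice_dialogue(sentences: List[Sentence], window: int = 4, stride: int = 2) -> List[List[Sentence]]:
    """Cut an alternating agent/user dialogue into overlapping segments."""
    turns = list(sentences)
    # An odd sentence count means the dialogue ends on an agent sentence,
    # so a blank user sentence is appended to complete the last segment.
    if len(turns) % 2 == 1:
        turns.append(("user", ""))
    last_start = max(len(turns) - window, 0)
    return [turns[i:i + window] for i in range(0, last_start + 1, stride)]


# Placeholder texts are numbers, mirroring the anonymized sample corpus.
dialogue = [
    ("agent", "1"), ("user", "2"), ("agent", "3"), ("user", "4"),
    ("agent", "5"), ("user", "6"), ("agent", "7"),
]
segments = slice_dialogue(dialogue)
for segment in segments:
    print([text for _, text in segment])
```

With a stride of 2, consecutive segments share one agent/user exchange, so every exchange except the first and last appears in two segments.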
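In the training corpus, each dialogue segment produced by slicing must be manually labeled with the correct user intention category.
During feature processing in step 3, as shown in FIG. 2, a BERT model containing 12 Transformer layers, pre-trained on a corpus containing a large amount of prior knowledge, is used to extract features from the dialogue text. In the training phase, a classification model is first built by connecting a base 12-layer BERT model to one fully connected layer. The input of this model is the tokenized dialogue text data, and the target is the one-hot vector of the user intention. The model is first trained for a small number of epochs; during training the first 8 Transformer layers are frozen so that their parameters are not updated, and the parameters of the last 4 Transformer layers and the fully connected layer are updated using a cross-entropy loss function and backpropagation. After training, the BERT model parameters are saved. In the prediction phase, a BERT model with the same structure as the 12-layer BERT model is built and the trained model parameters are loaded; the tokenized dialogue text data is fed into the model, and the vector corresponding to the [CLS] token in the last layer is taken as the output. This vector is the text feature representation.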
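The layer-freezing rule can be made concrete with a small helper that decides, from a parameter's name, whether it is updated during fine-tuning. This is only a sketch under assumptions: it presumes parameter names contain the pattern `encoder.layer.<i>.` (as in common BERT implementations such as Hugging Face Transformers), the function name `is_trainable` is hypothetical, and keeping the embedding layer frozen is an assumption, since the description only mentions the Transformer layers and the fully connected layer.

```python
import re

# Matches the index of a Transformer layer inside a parameter name,
# e.g. "bert.encoder.layer.7.output.dense.weight" -> 7.
_LAYER_INDEX = re.compile(r"encoder\.layer\.(\d+)\.")


def is_trainable(param_name: str, frozen_layers: int = 8) -> bool:
    """Return True if the parameter should be updated during fine-tuning."""
    match = _LAYER_INDEX.search(param_name)
    if match:
        # Layers are indexed from 0, so indices 0..7 are the first 8 layers.
        return int(match.group(1)) >= frozen_layers
    # Assumption: embeddings stay frozen along with the lower layers;
    # everything else (e.g. the classification head) is trained.
    return not param_name.startswith(("embeddings.", "bert.embeddings."))


for name in [
    "bert.embeddings.word_embeddings.weight",
    "bert.encoder.layer.7.output.dense.weight",
    "bert.encoder.layer.8.attention.self.query.weight",
    "classifier.weight",
]:
    print(name, is_trainable(name))
```

In a PyTorch training script, a rule like this would typically be applied by looping over `model.named_parameters()` and setting each parameter's `requires_grad` flag accordingly.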
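In the method for identifying user intention trajectories in a customer service hotline, step 3 processes the user behavior trajectory data using normalization and one-hot encoding. For continuous numerical features, Z-score normalization is applied; the processed features conform to a standard normal distribution, that is, a mean of 0 and a standard deviation of 1. The transformation function is:
$$x' = \frac{x - \mu}{\sigma}$$
where x is the original feature value, x' is the normalized value, μ is the mean of all sample data, and σ is the standard deviation of all sample data.
For discrete categorical features, one-hot encoding is applied first, followed by Z-score normalization.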
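Both kinds of behavior features can be processed with the standard library alone. The sketch below is illustrative rather than the applicant's code: the feature values (seconds spent on a page, an action type) are invented, the helper names are hypothetical, the population standard deviation is assumed, and a constant column is mapped to zeros to avoid dividing by zero.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def z_score(values: Sequence[float]) -> List[float]:
    """Standardize one feature column to mean 0 and standard deviation 1."""
    mu = fmean(values)
    sigma = pstdev(values, mu)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mu) / sigma for v in values]


def one_hot(values: Sequence[str], categories: Sequence[str]) -> List[List[float]]:
    """Encode each value as a 0/1 vector with one position per category."""
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


def one_hot_then_z_score(values: Sequence[str], categories: Sequence[str]) -> List[List[float]]:
    """One-hot encode, then standardize each resulting column."""
    columns = [z_score(column) for column in zip(*one_hot(values, categories))]
    return [list(row) for row in zip(*columns)]


seconds_on_page = [10.0, 20.0, 30.0]          # continuous feature (made up)
actions = ["browse", "purchase", "browse"]    # discrete feature (made up)

continuous = z_score(seconds_on_page)
discrete = one_hot_then_z_score(actions, ["browse", "purchase"])
print(continuous)
print(discrete)
```

At prediction time the mean and standard deviation would normally be the ones computed on the training samples; the sketch recomputes them from the data it is given only to stay short.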
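In the method for identifying user intention trajectories in a customer service hotline, the MLP neural network in step 4 consists of 2 hidden layers and 1 output layer. The 2 hidden layers have 128 and 64 neurons respectively and use the randomized leaky rectified linear unit (Randomized Leaky ReLU, ReLU) as the activation function. The number of neurons in the output layer equals the dimension of the user intention one-hot vector, and the output layer uses the softmax function as its activation function. During training, the network parameters are updated using a cross-entropy loss function and backpropagation; inverted dropout is applied at the input layer and the first hidden layer to reduce overfitting; and an Early Stopping mechanism monitors the loss on the validation set and stops training when the validation loss no longer decreases within a certain number of epochs, thereby avoiding overfitting. After training, the model parameters are saved. In the prediction phase, an MLP model with the same structure as the MLP neural network is built and the trained model parameters are loaded; the sample feature representation is fed into the model, and the vector of the last layer is taken as the output result. Each element of this vector is a floating-point number between 0 and 1 representing the probability of the corresponding user intention, and the elements of the vector sum to 1.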
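The forward pass of such a network, including where inverted dropout is applied, can be sketched in plain Python. Several things here are assumptions made for illustration: the weights are random rather than trained, the input dimension of 16 and the keep probability of 0.8 are invented, plain ReLU stands in for the randomized leaky variant named above, and all function names are hypothetical.

```python
import math
import random
from typing import List, Optional, Tuple

Vector = List[float]
Layer = Tuple[List[Vector], Vector]  # (weights with one row per neuron, biases)


def dense(x: Vector, layer: Layer) -> Vector:
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def softmax(x: Vector) -> Vector:
    peak = max(x)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def inverted_dropout(x: Vector, keep_prob: float, rng: random.Random) -> Vector:
    """Zero each unit with probability 1 - keep_prob and rescale the survivors."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in x]


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x: Vector, layers: List[Layer], training: bool = False,
            keep_prob: float = 0.8, rng: Optional[random.Random] = None) -> Vector:
    hidden1, hidden2, output = layers
    if training:
        x = inverted_dropout(x, keep_prob, rng)    # dropout on the input layer
    h1 = relu(dense(x, hidden1))                   # 128 neurons
    if training:
        h1 = inverted_dropout(h1, keep_prob, rng)  # dropout on hidden layer 1
    h2 = relu(dense(h1, hidden2))                  # 64 neurons
    return softmax(dense(h2, output))              # one probability per intention


rng = random.Random(0)
n_features, n_intents = 16, 3  # illustrative sizes only
layers = [init_layer(n_features, 128, rng), init_layer(128, 64, rng), init_layer(64, n_intents, rng)]
sample = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
probs = forward(sample, layers)
print(probs, sum(probs))
```

Because the surviving activations are divided by the keep probability during training, no rescaling is needed at prediction time, which is why `forward` skips dropout entirely when `training` is false.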
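In the method for identifying user intention trajectories in a customer service hotline, when step 5 uses the Beam Search strategy to process the output of the intention classification model, the k intention categories with the highest probability are retained each time (k is the beam size), with k set to 2 to 3. When predicting the user intention of dialogue segment T+1, the k intention categories retained at dialogue segment T are each fed in as the preceding-user-intention feature, and so on, until the user intention of the last dialogue segment has been predicted; the intention trajectory with the highest probability is then selected as the user intention trajectory of the whole dialogue text.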
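The quantity being ranked can be written out explicitly. The formula below is a sketch that assumes, consistent with the arithmetic of the worked example in this description, that a trajectory's probability is the product of the per-segment probabilities; the symbols $y_t$ (intention of segment $t$), $x_t$ (text and behavior features of segment $t$), and $P$ are introduced here for illustration and do not appear in the original text.

```latex
P(y_1, \dots, y_N) = p(y_1 \mid x_1) \prod_{t=2}^{N} p(y_t \mid x_t, y_{t-1})
```

At each segment only the k partial trajectories with the largest value of this product are kept, and the final output is the complete trajectory with the largest value.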
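As shown in FIG. 1, this application provides a system for identifying user intention trajectories in a customer service hotline, which performs the identification based on user behavior trajectories and context and includes a data slicing module, a feature processing module, an intention classification module, and a Beam Search strategy module.
The data slicing module receives the user behavior trajectory data and the dialogue text, cuts the original text into N dialogue segments of four sentences each, thereby converting one complete dialogue text into N ordered dialogue segments, and then associates the N dialogue segments with the user behavior trajectory data, based on the time nodes at which each dialogue segment and the user behavior trajectory occur. In the training corpus, each dialogue segment is manually labeled with the correct user intention category. The module outputs its data to the feature processing module.
The feature processing module extracts features from the dialogue text using a BERT model containing 12 Transformer layers, obtaining the text feature vector representation; processes the user behavior trajectory data using normalization and one-hot encoding, obtaining the user behavior trajectory feature representation; one-hot encodes the preceding user intention and applies Z-score normalization after the one-hot encoding, obtaining the preceding-user-intention feature representation; and concatenates the text feature representation, the user behavior feature representation, and the preceding-user-intention feature representation, outputting the result as the sample feature representation to the intention classification module.
The intention classification module uses a multilayer perceptron neural network as the intention classification model, with the sample feature representation as input and the one-hot vector of the user intention as the target. During training, the network parameters are updated using a cross-entropy loss function and backpropagation, and the model parameters are saved after training. In the prediction phase, an MLP model with the same structure as the MLP neural network is built and the trained model parameters are loaded; the sample feature representation output by the feature processing module is fed into the model, and the vector of the last layer is output as the result to the Beam Search strategy module.
The Beam Search strategy module generates the optimal user intention trajectory in the prediction phase, finally selecting the intention trajectory with the highest probability and outputting it as the user intention trajectory of the whole dialogue text.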
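The data slicing module slides a window of size 4 and stride 2 over the dialogue text, cutting the original text into N dialogue segments of four sentences each. Every dialogue segment follows the order agent sentence, user sentence, agent sentence, user sentence; if the last dialogue segment ends with an agent sentence, a blank user sentence is appended to the end of that segment. In this way one complete dialogue text is converted into N ordered dialogue segments. The module also associates the text segments with the user behavior trajectory data, based on the time nodes at which the dialogue segments and the user behavior trajectory occur. In the training corpus, each dialogue segment is manually labeled with the correct user intention category. An example is shown in Table 1 below:
Table 1
Because the dialogue text content is sensitive, the training corpus sample uses meaningless numbers in place of the real sentences.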
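The time-based association performed by the data slicing module can be illustrated with a short sketch. The matching rule is an assumption, since the description only says that segments and behavior records are linked by the time nodes at which they occur: here a behavior event is attached to every segment whose time span contains the event's timestamp. The type aliases, the function name, and the sample events are all hypothetical.

```python
from typing import Dict, List, Tuple

Segment = Tuple[float, float]  # (start_time, end_time) in seconds from call start
Event = Tuple[float, str]      # (timestamp, action name)


def associate_events(segments: List[Segment], events: List[Event]) -> Dict[int, List[str]]:
    """Map each segment index to the actions whose timestamp falls inside it."""
    linked: Dict[int, List[str]] = {index: [] for index in range(len(segments))}
    for timestamp, action in sorted(events):
        for index, (start, end) in enumerate(segments):
            if start <= timestamp <= end:
                linked[index].append(action)
    return linked


# Overlapping spans, as produced by a window of size 4 and stride 2.
segments = [(0.0, 40.0), (20.0, 60.0)]
events = [(30.0, "purchase_product"), (10.0, "browse_product_page"), (75.0, "close_page")]

linked = associate_events(segments, events)
print(linked)
```

Because consecutive segments overlap, an event that falls in the shared part of two windows is attached to both, and an event outside every window is left unassigned.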
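In the feature processing module, for the dialogue text content, the module extracts features using a BERT model containing 12 Transformer layers that was pre-trained on a corpus containing a large amount of prior knowledge. In the training phase, a classification model is first built by connecting a base 12-layer BERT model to one fully connected layer. The input of this model is the tokenized dialogue text data, and the target is the one-hot vector of the user intention. The model is first trained for a small number of epochs; during training the first 8 Transformer layers are frozen so that their parameters are not updated, and the parameters of the last 4 Transformer layers and the fully connected layer are updated using a cross-entropy loss function and backpropagation. After training, the BERT model parameters are saved. In the prediction phase, a BERT model with the same structure as the 12-layer BERT model is built and the trained model parameters are loaded; the tokenized dialogue text data is fed into the model, and the vector corresponding to the [CLS] token in the last layer is taken as the output. This vector is the text feature representation.
In the feature processing module, for the user behavior trajectory data, the module applies normalization and one-hot encoding to form the user behavior trajectory feature representation. For continuous numerical features, Z-score normalization is applied; the processed features conform to a standard normal distribution, that is, a mean of 0 and a standard deviation of 1. The transformation function is:
$$x' = \frac{x - \mu}{\sigma}$$
where x is the original feature value, x' is the normalized value, μ is the mean of all sample data, and σ is the standard deviation of all sample data.
For discrete categorical features, one-hot encoding is applied first, followed by Z-score normalization.
In the feature processing module, for the preceding user intention, the module one-hot encodes the preceding user intention and applies Z-score normalization after the one-hot encoding to form the preceding-user-intention feature representation.
Finally, the feature processing module performs feature concatenation: it concatenates the text feature representation, the user behavior trajectory feature representation, and the preceding-user-intention feature representation and outputs the result as the sample feature representation.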
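Concatenation itself is a one-line operation, and a sketch makes the resulting dimension explicit. The sizes are assumptions for illustration: 768 is the hidden size of the standard base BERT configuration, while the 5 behavior features and 3 intention categories are invented. The function name is hypothetical.

```python
from typing import List, Sequence


def build_sample_features(text_vec: Sequence[float],
                          behavior_vec: Sequence[float],
                          prev_intent_vec: Sequence[float]) -> List[float]:
    """Join the three feature representations into one input vector."""
    return [*text_vec, *behavior_vec, *prev_intent_vec]


text_vec = [0.0] * 768               # [CLS] vector from a base-size BERT (assumed size)
behavior_vec = [0.1] * 5             # processed behavior features (illustrative)
prev_intent_vec = [1.0, -0.5, -0.5]  # normalized one-hot of the preceding intention (illustrative)

sample = build_sample_features(text_vec, behavior_vec, prev_intent_vec)
print(len(sample))
```

The order of the three parts does not matter to the classifier as long as it is the same during training and prediction.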
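As shown in FIG. 3, the intention classification module uses an MLP neural network as the intention classification model. The input of this model is the output of the feature processing module, that is, the sample feature representation, and the target is the one-hot vector of the user intention. Structurally there are 2 hidden layers and 1 output layer. The 2 hidden layers have 128 and 64 neurons respectively and use ReLU as the activation function; the number of neurons in the output layer equals the dimension of the user intention one-hot vector, and the output layer uses the softmax function as its activation function. During training, the network parameters are updated using a cross-entropy loss function and backpropagation; inverted dropout is applied at the input layer and the first hidden layer to reduce overfitting; and an EarlyStopping mechanism monitors the loss on the validation set and stops training when the validation loss no longer decreases within a certain number of epochs, thereby avoiding overfitting. After training, the model parameters are saved. In the prediction phase, an MLP model with the same structure as the MLP neural network is built and the trained model parameters are loaded; the sample feature representation output by the feature processing module is fed into the model, and the vector of the last layer is taken as the output result. Each element of this vector is a floating-point number between 0 and 1 representing the probability of the corresponding user intention, and the elements of the vector sum to 1.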
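The early-stopping rule can be expressed independently of any training framework. In this sketch the validation losses are supplied as a plain sequence, the patience of 3 epochs is an invented value (the description only says "a certain number of epochs"), and the function name is hypothetical.

```python
from typing import Iterable, Tuple


def early_stopping(val_losses: Iterable[float], patience: int = 3) -> Tuple[int, float]:
    """Return (best_epoch, best_loss), stopping after `patience` epochs without improvement."""
    best_epoch, best_loss, waited = -1, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss has stopped decreasing
    return best_epoch, best_loss


# Made-up validation losses: improvement stalls after epoch 2.
history = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
best = early_stopping(history)
print(best)
```

Training stops at epoch 5, so the later value of 0.50 is never reached; in practice the parameters saved at the best epoch are the ones kept.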
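The Beam Search strategy module uses the Beam Search strategy to generate the optimal user intention trajectory in the prediction phase. That is, when processing the output of the intention classification model, the k intention categories with the highest probability are retained each time (k is the beam size); a value of 2 to 3 for k is sufficient. When predicting the user intention of dialogue segment T+1, the k intention categories retained at dialogue segment T are each fed in as the preceding-user-intention feature, and so on.
For example, suppose there are 3 user intention categories [A, B, C] and the beam size is 2. The model's output intention probabilities for the first dialogue segment are [0.4, 0.5, 0.1], so the 2 most probable candidates, A and B, are retained. When predicting the user intention of the second dialogue segment, feeding in A as the preceding-intention feature gives intention probabilities [0.1, 0.7, 0.2], and feeding in B gives [0.5, 0.2, 0.3]. The 2 most probable outputs are then computed again: AA = 0.4 × 0.1 = 0.04; AB = 0.4 × 0.7 = 0.28; AC = 0.4 × 0.2 = 0.08; BA = 0.5 × 0.5 = 0.25; BB = 0.5 × 0.2 = 0.1; BC = 0.5 × 0.3 = 0.15. The two most probable intention trajectories are therefore AB and BA. Prediction then continues with the third dialogue segment, and so on until the last dialogue segment, at which point the intention trajectory with the highest probability is selected as the user intention trajectory of the whole dialogue text. By expanding the search range through the Beam Search strategy, the Beam Search strategy module ensures a higher accuracy and effectively reduces the propagation of preceding-intention errors into the intention recognition of the current sentence.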
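The worked example above can be reproduced with a short beam search. This is a sketch, not the applicant's implementation: the classifier is replaced by a lookup table holding the probabilities from the example, and the names `beam_search`, `predict`, and `EXAMPLE_OUTPUTS` are hypothetical. In a real system `predict` would build the sample feature representation for the segment, with the given preceding intention encoded, and run the MLP.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Beam = Tuple[Tuple[str, ...], float]  # (intention trajectory so far, its probability)


def beam_search(n_segments: int,
                intents: Sequence[str],
                predict: Callable[[int, Optional[str]], Sequence[float]],
                beam_size: int = 2) -> List[Beam]:
    """Keep the beam_size most probable trajectories after every segment."""
    beams: List[Beam] = [((), 1.0)]
    for t in range(n_segments):
        candidates: List[Beam] = []
        for trajectory, prob in beams:
            prev = trajectory[-1] if trajectory else None
            # Expand this trajectory with every intention for segment t.
            for intent, p in zip(intents, predict(t, prev)):
                candidates.append((trajectory + (intent,), prob * p))
        candidates.sort(key=lambda beam: beam[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


# Classifier outputs taken from the worked example:
# key = (segment index, preceding intention), value = probabilities for [A, B, C].
EXAMPLE_OUTPUTS = {
    (0, None): [0.4, 0.5, 0.1],
    (1, "A"): [0.1, 0.7, 0.2],
    (1, "B"): [0.5, 0.2, 0.3],
}


def predict(segment_index: int, prev_intent: Optional[str]) -> Sequence[float]:
    return EXAMPLE_OUTPUTS[(segment_index, prev_intent)]


beams = beam_search(2, ["A", "B", "C"], predict, beam_size=2)
for trajectory, prob in beams:
    print("".join(trajectory), round(prob, 2))
```

A greedy strategy corresponds to `beam_size=1`: it would commit to B after the first segment and end with BA at 0.25, missing the more probable trajectory AB at 0.28.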
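Based on the above technical solution, the method and system provided by this application for identifying user intention trajectories in a customer service hotline based on user behavior trajectories and context expand the search range through the Beam Search strategy and thereby ensure a higher accuracy: when processing the output of the intention classification model, the k intention categories with the highest probability are retained each time. When predicting the user intention of dialogue segment T+1, the k intention categories retained at dialogue segment T are each fed in as the preceding-user-intention feature, and so on. Compared with a greedy strategy that keeps only the single most probable intention category each time, the Beam Search strategy effectively reduces the propagation of preceding-intention errors into the intention recognition of the current sentence, and thereby generates the optimal user intention trajectory in the prediction phase. In addition, the user behavior trajectory data from the dialogue is associated with the segments, processed into features, and concatenated with the text features, so that it takes part in both training and prediction of user intention classification; this improves recognition accuracy when the wording of the text is ambiguous.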
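FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of this application. FIG. 4 shows a block diagram of an electronic device 412 suitable for implementing embodiments of this application. The electronic device 412 shown in FIG. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of this application.
As shown in FIG. 4, the electronic device 412 takes the form of a general-purpose computing device. The components of the electronic device 412 may include, but are not limited to: one or more processors 416, a storage apparatus 428, and a bus 418 connecting the different system components (including the storage apparatus 428 and the processors 416).
The bus 418 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The electronic device 412 includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the electronic device 412, including volatile and non-volatile media, and removable and non-removable media.
The storage apparatus 428 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 430 and/or cache memory 432. The electronic device 412 may include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 434 may be configured to read from and write to a non-removable, non-volatile magnetic medium (not shown in FIG. 4, commonly called a "hard disk drive"). Although not shown in FIG. 4, a magnetic disk drive configured to read from and write to a removable non-volatile magnetic disk (for example, a "floppy disk") may be provided, as may an optical disc drive configured to read from and write to a removable non-volatile optical disc (for example, a Compact Disc Read-Only Memory (CD-ROM), a Digital Video Disc Read-Only Memory (DVD-ROM), or other optical media). In these cases, each drive may be connected to the bus 418 through one or more data media interfaces. The storage apparatus 428 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of this application.
A program 436 having a set of (at least one) program modules 426 may be stored in, for example, the storage apparatus 428. Such program modules 426 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 426 generally perform the functions and/or methods of the embodiments described in this application.
The electronic device 412 may also communicate with one or more external devices 414 (for example, a keyboard, a pointing device, a camera, a display 424, and so on), with one or more devices that enable a user to interact with the electronic device 412, and/or with any device (for example, a network card, a modem, and so on) that enables the electronic device 412 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 422. In addition, the electronic device 412 may communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 420. As shown in the figure, the network adapter 420 communicates with the other modules of the electronic device 412 through the bus 418. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 412, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
The processor 416 executes various functional applications and data processing by running programs stored in the storage apparatus 428, for example implementing the method provided by the above embodiments of this application.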
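An embodiment of this application also provides a computer storage medium storing a computer program, wherein the computer program, when executed by a computer processor, is used to perform the method for identifying user intention trajectories in a customer service hotline according to any of the above embodiments of this application.
The computer storage medium of the embodiments of this application may use any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, RAM, ROM, erasable programmable read-only memory (EPROM) or flash memory, optical fiber, CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, electric wire, optical cable, radio frequency (RF), and so on, or any suitable combination of the above.
Computer program code for performing the operations of this application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (for example, through the Internet using an Internet service provider).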
Claims (10)
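- A method for identifying user intention trajectories in a customer service hotline, used to perform the identification based on user behavior trajectories and context, comprising: data acquisition: obtaining user behavior trajectory data and dialogue text; data slicing and data association: slicing the dialogue text with a sliding window to convert one complete dialogue text into N ordered dialogue segments, and associating the N dialogue segments with the user behavior trajectory data according to the time nodes at which each dialogue segment and the user behavior trajectory occur; feature processing: extracting features from the dialogue text using a model pre-trained on a corpus, and taking the output vector of the model as the text feature representation; applying normalization and one-hot processing to the user behavior trajectory data, wherein continuous numerical features in the user behavior trajectory data are normalized so that the processed features conform to a standard normal distribution, and discrete numerical features in the user behavior trajectory data are first one-hot encoded and the encoded features are then normalized, yielding the user behavior feature representation; one-hot encoding the preceding user intention and then normalizing the encoded preceding user intention, yielding the preceding-user-intention feature representation; and concatenating the text feature representation, the user behavior feature representation, and the preceding-user-intention feature representation as the sample feature representation; intention classification: using a multilayer perceptron (MLP) neural network as the intention classification model, taking the sample feature representation as the input of the model and the one-hot vector of the user intention as the target; in the training phase of the model, updating the network parameters using a cross-entropy loss function and backpropagation, and saving the trained model parameters; in the prediction phase, building an MLP model with the same structure as the MLP neural network and loading the trained model parameters, feeding the sample feature representation into the MLP model with the loaded parameters, and taking the vector of the last layer of the MLP model as the output result; and generating the optimal user intention trajectory: generating the optimal user intention trajectory in the prediction phase according to the one-hot vector of the user intention and a Beam Search strategy.
- The method for identifying user intention trajectories in a customer service hotline according to claim 1, wherein slicing the dialogue text with a sliding window to convert one complete dialogue text into N ordered dialogue segments comprises: sliding a window of size 4 and stride 2 over the dialogue text to cut the dialogue text into N dialogue segments of four sentences each, wherein each dialogue segment has the structure agent sentence, user sentence, agent sentence, user sentence, and, where the last dialogue segment ends with an agent sentence, a blank user sentence is appended to the end of the last dialogue segment.
- The method for identifying user intention trajectories in a customer service hotline according to claim 1, further comprising, after slicing the dialogue text with a sliding window to convert one complete dialogue text into N ordered dialogue segments: in the training corpus, manually labeling each dialogue segment with the correct user intention category.
- The method for identifying user intention trajectories in a customer service hotline according to claim 1, wherein the model pre-trained on a corpus is a Bidirectional Encoder Representations from Transformers (BERT) model containing 12 Transformer layers, pre-trained on a corpus containing a preset amount of prior knowledge; the method further comprises: in the training phase of the model pre-trained on a corpus, using a 12-layer BERT model connected to a classification model of one fully connected layer as a preset model, wherein the input of the preset model is tokenized dialogue text data and the target is the one-hot vector of the user intention; and training the preset model for a preset number of epochs, wherein during training the first 8 Transformer layers are frozen so that their parameters are not updated, the parameters of the last 4 Transformer layers and the fully connected layer are updated using a cross-entropy loss function and backpropagation, and the BERT model parameters are saved after training; and extracting features from the dialogue text using the model pre-trained on a corpus and taking the output vector of the model as the text feature representation comprises: in the prediction phase, building a BERT model with the same structure as the 12-layer BERT model and loading the trained model parameters, feeding the tokenized dialogue text into the BERT model with the loaded parameters, and taking the vector corresponding to the [CLS] token in the last layer of the BERT model with the loaded parameters as the text feature representation.
- The method for identifying user intention trajectories in a customer service hotline according to claim 1, wherein the multilayer perceptron neural network consists of 2 hidden layers and 1 output layer, the 2 hidden layers have 128 and 64 neurons respectively and use the randomized leaky rectified linear unit (ReLU) as the activation function, and the output layer has a number of neurons equal to the dimension of the user intention one-hot vector and uses the softmax function as the activation function; each element of the vector of the last layer of the MLP model is a floating-point number between 0 and 1 representing the probability of the corresponding user intention, and the elements of the vector sum to 1; and updating the network parameters using a cross-entropy loss function and backpropagation comprises: applying inverted dropout at the input layer and the first hidden layer to reduce overfitting, using an EarlyStopping mechanism to monitor the loss on the validation set, and stopping training when the validation loss no longer decreases within a preset number of epochs, so as to avoid overfitting.
- The method for identifying user intention trajectories in a customer service hotline according to claim 1, wherein generating the optimal user intention trajectory in the prediction phase according to the one-hot vector of the user intention and the Beam Search strategy comprises: processing the output of the intention classification model using the Beam Search strategy, retaining each time the k intention categories with the highest probability, wherein k is 2 or 3; when predicting the user intention of dialogue segment T+1, feeding each of the k intention categories retained at dialogue segment T into the Beam Search strategy as the preceding-user-intention feature, until the user intention of the last dialogue segment has been predicted; and selecting the intention trajectory with the highest probability as the user intention trajectory of the whole dialogue text.
- A system for identifying user intention trajectories in a customer service hotline, configured to perform the identification based on user behavior trajectories and context, comprising a data slicing module, a feature processing module, an intention classification module, and a Beam Search strategy module; wherein the data slicing module is configured to receive user behavior trajectory data and dialogue text, cut the dialogue text into N dialogue segments of four sentences each so as to convert one complete dialogue text into N ordered dialogue segments, and associate the N dialogue segments with the user behavior trajectory data, the association being based on the time nodes at which each dialogue segment and the user behavior trajectory occur; in the training corpus, each dialogue segment is manually labeled with the correct user intention category; and the data is output to the feature processing module; the feature processing module is configured to: extract features from the dialogue text using a Bidirectional Encoder Representations from Transformers (BERT) model containing 12 Transformer layers, obtaining the text feature vector representation; process the user behavior trajectory data using normalization and one-hot encoding, obtaining the user behavior trajectory feature representation; one-hot encode the preceding user intention and apply Z-score normalization after the one-hot encoding, obtaining the preceding-user-intention feature representation; and concatenate the text feature representation, the user behavior feature representation, and the preceding-user-intention feature representation, and output the result as the sample feature representation to the intention classification module; the intention classification module is configured to use a multilayer perceptron (MLP) neural network as the intention classification model, taking the sample feature representation as the input of the model and the one-hot vector of the user intention as the target; in the training phase of the model, the network parameters are updated using a cross-entropy loss function and backpropagation, and the trained model parameters are saved; in the prediction phase, an MLP model with the same structure as the MLP neural network is built and the trained model parameters are loaded, the sample feature representation output by the feature processing module is fed into the MLP model with the loaded parameters, and the vector of the last layer of the MLP model is output as the result to the Beam Search strategy module; and the Beam Search strategy module is configured to generate, in the prediction phase, the optimal user intention trajectory according to the one-hot vector of the user intention and the Beam Search strategy, and to output it as the user intention trajectory of the whole dialogue text, wherein the optimal user intention trajectory is the finally selected intention trajectory with the highest probability.
- An electronic device, comprising: at least one processor; and a storage apparatus configured to store at least one program; wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the method for identifying user intention trajectories in a customer service hotline according to any one of claims 1-7.
- A computer storage medium storing a computer program, wherein the program, when executed by a processor, implements the method for identifying user intention trajectories in a customer service hotline according to any one of claims 1-7.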
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210199654.6A CN114818738B (zh) | 2022-03-01 | 2022-03-01 | Method and system for identifying user intention trajectories in a customer service hotline |
CN202210199654.6 | 2022-03-01 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023165111A1 (zh) | 2023-09-07 |
Family
ID=82528478
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/118511 WO2023165111A1 (zh) | 2022-03-01 | 2022-09-13 | Method and system for identifying user intention trajectories in a customer service hotline |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114818738B (zh) |
WO (1) | WO2023165111A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
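CN118377884A (zh) * | 2024-06-24 | 2024-07-23 | 四川博德蚁穴建设工程有限公司 | Service data processing method and system based on remote natural gas services |
CN118568241A (zh) * | 2024-07-31 | 2024-08-30 | 浙江大学 | Intention prediction method for user dialogue and user profile based on a pre-trained model |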
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818738B (zh) * | 2022-03-01 | 2024-08-02 | 达观数据有限公司 | Method and system for identifying user intention trajectories in a customer service hotline |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
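WO2018036555A1 (zh) * | 2016-08-25 | 2018-03-01 | 腾讯科技(深圳)有限公司 | Session processing method and apparatus |
CN110232114A (zh) * | 2019-05-06 | 2019-09-13 | 平安科技(深圳)有限公司 | Sentence intention recognition method, apparatus, and computer-readable storage medium |
CN112597301A (zh) * | 2020-12-16 | 2021-04-02 | 北京三快在线科技有限公司 | Speech intention recognition method and apparatus |
CN113240436A (zh) * | 2021-04-22 | 2021-08-10 | 北京沃东天骏信息技术有限公司 | Method and apparatus for quality inspection of online customer service scripts |
WO2021169745A1 (zh) * | 2020-02-25 | 2021-09-02 | 升智信息科技(南京)有限公司 | User intention recognition method and apparatus based on prediction of the relationship between preceding and following sentences |
WO2022017245A1 (zh) * | 2020-07-24 | 2022-01-27 | 华为技术有限公司 | Text recognition network, neural network training method, and related device |
CN114818738A (zh) * | 2022-03-01 | 2022-07-29 | 达而观信息科技(上海)有限公司 | Method and system for identifying user intention trajectories in a customer service hotline |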
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110059178A (zh) * | 2019-02-12 | 2019-07-26 | 阿里巴巴集团控股有限公司 | Question dispatching method and apparatus |
CN113874935A (zh) * | 2019-05-10 | 2021-12-31 | 谷歌有限责任公司 | Using contextual information with an end-to-end model for speech recognition |
CN110543554A (zh) * | 2019-08-12 | 2019-12-06 | 阿里巴巴集团控股有限公司 | Classification method and apparatus for multi-turn dialogue |
CN111145728B (zh) * | 2019-12-05 | 2022-10-28 | 厦门快商通科技股份有限公司 | Speech recognition model training method, system, mobile terminal, and storage medium |
US20210201144A1 (en) * | 2019-12-30 | 2021-07-01 | Conversica, Inc. | Systems and methods for artificial intelligence enhancements in automated conversations |
CN111177324B (zh) * | 2019-12-31 | 2023-08-11 | 支付宝(杭州)信息技术有限公司 | Method and apparatus for intention classification based on speech recognition results |
WO2021081562A2 (en) * | 2021-01-20 | 2021-04-29 | Innopeak Technology, Inc. | Multi-head text recognition model for multi-lingual optical character recognition |
CN113094475B (zh) * | 2021-06-08 | 2021-09-21 | 成都晓多科技有限公司 | Dialogue intention recognition system and method based on contextual attention flow |
CN113887643B (zh) * | 2021-10-12 | 2023-07-18 | 西安交通大学 | New dialogue intention recognition method based on pseudo-label self-training and source-domain retraining |
2022
- 2022-03-01 CN CN202210199654.6A patent/CN114818738B/zh active Active
- 2022-09-13 WO PCT/CN2022/118511 patent/WO2023165111A1/zh unknown
Also Published As
Publication number | Publication date |
---|---|
CN114818738B (zh) | 2024-08-02 |
CN114818738A (zh) | 2022-07-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023165111A1 (zh) | Method and system for identifying user intention trajectories in a customer service hotline | |
US20210141799A1 (en) | Dialogue system, a method of obtaining a response from a dialogue system, and a method of training a dialogue system | |
US10789415B2 (en) | Information processing method and related device | |
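JP2021089705A (ja) | Method and apparatus for evaluating translation quality | |
WO2022121251A1 (zh) | Text processing model training method, apparatus, computer device, and storage medium | |
CN112084334B (zh) | Corpus label classification method, apparatus, computer device, and storage medium | |
CN111079432B (zh) | Text detection method, apparatus, electronic device, and storage medium | |
CN112420024A (zh) | Fully end-to-end mixed Chinese-English air traffic control speech recognition method and apparatus | |
WO2024146328A1 (zh) | Translation model training method, translation method, and device | |
CN116050425A (zh) | Method for building a pre-trained language model, text prediction method, and apparatus | |
CN115803806A (zh) | Systems and methods for training dual-mode machine-learned speech recognition models | |
US20230252225A1 (en) | Automatic Text Summarisation Post-processing for Removal of Erroneous Sentences | |
CN115688937A (zh) | Model training method and apparatus | |
KR20210125449A (ko) | Method for incrementing industry text, related apparatus, and computer program stored in a medium | |
CN117574879A (zh) | Data augmentation method, system, device, and medium based on a pre-trained model | |
CN110717316B (zh) | Topic segmentation method and apparatus for subtitle dialogue streams | |
CN114218356B (zh) | Artificial-intelligence-based semantic recognition method, apparatus, device, and storage medium | |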
Choi et al. | Joint streaming model for backchannel prediction and automatic speech recognition | |
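CN115827865A (zh) | Objectionable text classification method and system incorporating a multi-feature graph attention mechanism | |
CN113627197B (zh) | Text intention recognition method, apparatus, device, and storage medium | |
CN113689866A (zh) | Voice conversion model training method, apparatus, electronic device, and medium | |
WO2024055707A1 (zh) | Translation method and related device | |
CN114281996B (zh) | Long text classification method, apparatus, device, and storage medium | |
US20240005082A1 (en) | Embedding Texts into High Dimensional Vectors in Natural Language Processing | |
CN115114433B (zh) | Language model training method, apparatus, device, and storage medium |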
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22929541 Country of ref document: EP Kind code of ref document: A1 |