WO2020159395A1 - Method for creating a dialogue analysis model on the basis of artificial intelligence

Info

Publication number: WO2020159395A1
Application number: PCT/RU2019/000051
Authority: WO
Inventors: Денис Олегович АНТЮХОВ; Леонид Петрович ПУГАЧЕВ
Original assignee: Публичное Акционерное Общество "Сбербанк России"
Priority date: 2019-01-29
Filing date: 2019-01-29
Publication date: 2020-08-06
Also published as: EA038264B1; RU2730449C2; RU2019102403A3; RU2019102403A; EA201990216A1

Abstract

The present technical solution relates in general to the field of electronic data processing, and more particularly to machine learning methods for constructing natural language dialogue analysis models. A computer implemented method for creating a dialogue analysis model on the basis of artificial intelligence for processing user queries is carried out with the aid of at least one processor and comprises the steps of: obtaining a set of primary data containing at least text data of dialogues between users and operators including user queries and operator responses; processing the set of data obtained and simultaneously forming a training set for an artificial neural network containing positive and negative examples of user queries based on an analysis of the context of the dialogues, wherein said positive examples contain a semantically related set of operator replies in response to a user query; identifying and coding a vector representation of each reply from the positive and negative examples of the training set referred to in the previous step; and using the resulting training set to train models for determining relevant replies from the context of user queries in dialogues.

Description

METHOD FOR CREATING A MODEL FOR ANALYSIS OF DIALOGUES BASED ON ARTIFICIAL INTELLIGENCE

TECHNICAL FIELD

[0001] The present technical solution relates generally to the field of computer data processing and, in particular, to machine learning methods for constructing models that analyze natural-language dialogues.

BACKGROUND

[0002] Automated natural language recognition systems are now widespread in various fields of technology. They are used most widely in consumer software applications such as search engines, navigators, and product selection applications, for example in the form of intelligent assistants. A key capability of such intelligent assistants is accurate recognition of the speech commands issued by users.

[0003] The existing difficulty lies in building models for analyzing speech messages that can form and deliver a response to a user request with a given accuracy and speed, especially in a specialized field of application, which requires careful tuning and training of such models.

[0004] The prior art includes many approaches to creating and training models for natural language processing (NLP). One known approach creates models with a machine learning algorithm that filters sentences using a recurrent neural network and the "bag of words" algorithm (patent application US20180268298, applicant: Salesforce.com Inc., published on 20.09.2018). That approach performs sentiment analysis using two types of models, simple and complex, which classify the received natural-language message. Its disadvantages are low accuracy and low speed, which result from using several models selected depending on the type and complexity of the received request.

ESSENCE OF THE TECHNICAL SOLUTION

[0005] The claimed technical solution proposes a new approach in the field of artificial intelligence (AI) by creating machine learning models for processing user requests in natural language.

[0006] The technical problem to be solved is to create a new method for building a model that analyzes natural-language requests and that offers high accuracy in recognizing the context of a request and high speed in processing incoming requests.

[0007] The main technical result achieved when solving the above technical problem is the creation of a model for analyzing user requests in natural language, which has high accuracy in recognizing the context of requests, due to the ability to rank responses to incoming requests from users.

[0008] The claimed result is achieved by a computer-implemented method for creating a dialogue analysis model based on artificial intelligence for processing user requests, performed using at least one processor and comprising the steps of:

• obtaining a set of primary data, the set including at least text data of dialogs between users and operators that contain user requests and operator responses;

• processing the obtained data set, during which a training sample for an artificial neural network is formed, containing positive and negative examples of user requests based on an analysis of the context of the dialogs, wherein the positive examples contain a semantically related set of operator replicas in response to the user's request;

• extracting and encoding a vector representation of each replica from the positive and negative examples of the training sample mentioned in the previous step;

• using the generated training sample to train the model for determining the relevant replicas from the context of user requests in dialogs.

[0009] In one particular embodiment of the method, the model is at least one artificial neural network.

[0010] In another particular embodiment of the method, positive examples are formed on the basis of complete chains of dialogues between the operator and the client, and such a chain contains at least one interrogative sentence.

[0011] In another particular embodiment of the method, when relevant replicas are selected for responding to the customer's request at the model training stage, a score is calculated for each response.

[0012] In another particular embodiment of the method, in the step of encoding replicas into replica vectors, sentences are represented as a matrix of semantic vectors.

[0013] The specified technical result is also achieved by a system for processing user requests in an information channel using artificial intelligence, which contains at least one processor and at least one memory connected to the processor. The memory contains computer-readable instructions which, when executed by the at least one processor, provide: receiving a user request via the information channel; processing the user request using a machine learning model for automated processing of user requests created by the method described above; and forming and transmitting a response message to the user's request in the information channel.

[0014] In a particular implementation, the system is a server, mainframe, or supercomputer.

[0015] In another particular implementation of the system, the information channel is a chat session, VoIP communication, or a telephone channel.

[0016] In another particular implementation of the system, the chat session is a chat using a mobile application or a chat on a website.

DESCRIPTION OF DRAWINGS

[0017] Features and advantages of the present invention will become apparent from the following detailed description of the invention and the accompanying drawings, in which:

[0018] FIG. 1 illustrates a block diagram of the implementation of the claimed method.

[0019] FIG. 2 illustrates an example of data processing to form a training sample.

[0020] FIG. 3 illustrates the architecture of the interrogative sentence definition model.

[0021] FIG. 4 illustrates a method for training a model to identify relevant replicas.

[0022] FIG. 5 illustrates the architecture of the Relevant Replica Definition Model.

[0023] FIG. 6 illustrates an example of using a trained model for determining relevant replicas.

[0024] FIG. 7 illustrates a general view of the claimed system.

CARRYING OUT THE INVENTION

[0025] In this technical solution, terms such as "operator", "client", "bank employee" can be used for clarity of understanding of the work, which in general should be understood as a "user" of the system.

[0026] The claimed method (100) for creating an AI-based dialogue analysis model for processing user requests, as shown in FIG. 1, consists of a series of sequential steps performed by a processor.

[0027] The initial step (101) in generating a dialog analysis model is to obtain a primary ("raw") data set on which a training sample for an artificial neural network (ANN) will be built. The set of primary data can be an array of unlabeled text logs (records) of dialogs between operators and clients recorded while processing incoming requests. The subject matter of the text logs can vary depending on the industry in which the resulting analysis model will be applied.

[0028] Customer requests are understood as any requests arriving through information channels of interaction with an operator of a contact center or support service, for example, that of a financial institution. As a rule, the primary data are records of conversations between customers and operators, converted into text. Information channels for receiving conversation data can be, but are not limited to, a telephone channel, a VoIP channel, a chat session on a website or in a mobile banking application, a messenger chat bot, etc. Any type of channel through which a client can conduct a dialog with an operator can be used to obtain these dialogs for subsequent conversion into text form in order to train a machine learning model.

[0029] As mentioned above, the operator in the present solution can be either a person who processes customer requests or a software algorithm, for example, a chat bot or an intelligent answering machine, that is also capable of providing information for processing customer requests.

[0030] Based on the received array of primary data with logs at step (101), further processing is performed to form the ANN training sample (102). From the resulting array of logs, question-answer pairs are formed. This procedure includes an algorithm for collecting data, using a model for determining an interrogative sentence, and an algorithm for forming question-answer pairs. Using the model for determining interrogative sentences, the client's questions are searched in text logs. The operator's replica following the found client's question (provided that it satisfies a number of requirements: not an interrogative, long enough, does not contain stop words) is considered the answer to this request.

[0031] The processing of the array of dialogs between operators and clients consists in forming positive and negative examples on the basis of question-answer pairs. As a rule, such examples are formed as follows: (the context of the conversation (2-5 replicas), the client's request, the operator's response to that request) is a positive example; (the context of the conversation (2-5 replicas), the client's request, the operator's response to some other request) is a negative example.

[0032] Below are examples of the implementation of question-answer pairs.

[0033] Example 1 (positive): ctx: ['Hello, Ivan Ivanovich!', 'How can I help?', 'Hello, can I get a credit card'] rsp: 'You can clarify the conditions for credit cards and apply at the link: http://www.sberbank.ru/moscow/ru/person/bank_cards/credit/'

[0034] Example 2 (negative): ctx: ['Hello, Kirill!', 'How can I help you?', 'Hello, how to activate the mobile banking service?'] rsp: 'Only Visa Gold and MasterCard Gold cards can be ordered online'.

[0035] Question-answer pairs for the ANN training sample are formed using a model for determining interrogative sentences, which is needed to split the log context correctly and to form the training data set. Here, context refers to a time-ordered set of operator and client replicas. The last replica in the context is always the customer's question.

[0036] FIG. 2 shows an example of using the model for determining interrogative sentences to form the ANN training sample at step (102). The interrogative sentence determination model (220) is a machine learning model, for example, an artificial neural network. To train the model (220), the OpenSubtitles (OPUS) dataset (http://opus.nlpl.eu/) (221) can be used, as well as data from chats with an operator (222). The OPUS dataset (221) is an open dataset of film subtitles in various languages that serves as a source of the colloquial vocabulary commonly found in feature films.

[0037] All sentences containing a question mark were selected from the datasets (221)-(222) as positive examples, and all other sentences were selected as negative examples. In OPUS (221), all sentences ending with a question mark are interrogative, since the punctuation in the subtitles is always correct. The same applies to interrogative sentences from the chats (222). The interrogative sentences extracted in this way also undergo additional filtering: short sentences and sentences containing stop words are discarded. Similarly, most of the sentences from OPUS (221) that do not contain a question mark are not interrogative and are chosen as negative examples. From these, sentences that contain interrogative words or that are shorter than a predefined number of words are discarded.

[0038] Negative examples can be generated in any number, which makes it possible to achieve any ratio of positive to negative examples in the ANN training set. Experiments have shown that the best quality is achieved when the ratio of positive to negative examples is 1:1.

[0039] In an exemplary embodiment, the training sample is balanced and all punctuation is removed so that the model (220) makes its predictions based solely on the semantics of the words in the sentence. All obtained raw data were used, but the number of positive and negative examples was the same in each batch (data chunk) when training the model (220). For semantic analysis, the fastText semantic word model and a recurrent neural network based on LSTM (long short-term memory) are used to model the semantics of a sentence. fastText is a library for learning word embeddings and text classification created by Facebook's AI Research lab. The accuracy of this data set processing procedure is about 95%.
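The labeling rule described in paragraphs [0037] and [0039] can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used by the authors: the function names, the stop-word list, the list of interrogative words, and the minimum length of three words are all hypothetical.

```python
import string

# Hypothetical values chosen for illustration only.
STOP_WORDS = {"spam"}
INTERROGATIVE_WORDS = {"who", "what", "where", "when", "why", "how"}
MIN_WORDS = 3


def strip_punctuation(sentence):
    """Remove all punctuation so that only the words remain."""
    return sentence.translate(str.maketrans("", "", string.punctuation)).strip()


def label_sentences(sentences):
    """Return (text, label) pairs: label 1 if the sentence contains '?', else 0.

    Short sentences and sentences with stop words are discarded; negatives
    that contain an interrogative word are discarded as well.
    """
    labeled = []
    for sentence in sentences:
        label = 1 if "?" in sentence else 0
        words = strip_punctuation(sentence).lower().split()
        if len(words) < MIN_WORDS or STOP_WORDS & set(words):
            continue
        if label == 0 and INTERROGATIVE_WORDS & set(words):
            continue
        labeled.append((" ".join(words), label))
    return labeled
```

Because the punctuation is stripped after the label is assigned, a classifier trained on these pairs cannot rely on the question mark itself.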

[0040] LSTM is a type of recurrent neural network with feedback connections that is widely used in industry for modeling time series and other sequences. This architecture is used most widely in computational linguistics, where it models the semantics of sentences or entire paragraphs of text.

[0041] FIG. 3 shows the architecture of the model for determining interrogative sentences (220). The architecture is presented, by way of example, as a neural network model in the form of an acyclic computational graph.

[0042] The diagram of the model (220) architecture indicates example dimensions of the input and output tensors for each block.

Notation example:

Input: (None, 20)

Output: (None, 20, 300)

[0043] This example means that the block accepts a tensor of dimension (batch size, 20) as input and outputs a tensor of dimension (batch size, 20, 300). The batch (data packet) size for the trained model can be arbitrary (it only affects performance and depends on the runtime environment), which is why the notation shows None in its place.

[0044] Model (220) contains an input node for textual data (inp ctx O) (2201) and one output node of model prediction (relevance) (2211). The FastText model containing texts in Russian is used as pre-trained embeddings (vector representations of words). A bi-directional LSTM module is used as an encoder.

[0045] The word vectorization module (2202) contains a pre-trained word embedding model for vectorization at the word level. Each sentence submitted to the input of the model is already tokenized, that is, presented as a list of words. All sentences are represented as sequences of equal length, which is necessary for efficient batch processing: short sentences are padded to this fixed length with a zero token, and sentences that are too long are truncated. Hereinafter, the length of the sequences is denoted MAX_LEN. The experiments used MAX_LEN = 24, although other values are possible.
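The padding and truncation step can be illustrated with a short sketch. The name of the padding token and the choice to pad on the right are assumptions; the text only states that a zero token is used and that MAX_LEN = 24 in the experiments.

```python
MAX_LEN = 24          # value reported for the experiments
PAD_TOKEN = "<pad>"   # hypothetical name for the zero (padding) token


def pad_or_truncate(tokens, max_len=MAX_LEN, pad_token=PAD_TOKEN):
    """Return a token list of exactly max_len items.

    Longer inputs are cut off at max_len; shorter inputs are padded on the
    right (the padding side is an assumption of this sketch).
    """
    clipped = tokens[:max_len]
    return clipped + [pad_token] * (max_len - len(clipped))
```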

[0046] A word embedding is a vector representation of a word obtained using a distributional language model (usually the word2vec, fastText, or GloVe software tools for analyzing natural-language semantics). It is a vector with a dimension on the order of several hundred (100-1000). Its characteristic feature is that words similar in meaning are represented by similar vectors (according to the Euclidean L2 metric).

[0047] Each word is assigned a semantic vector, the so-called word embedding (see https://ru.wikipedia.org/wiki/Word2vec). For this, a fastText model trained on a thematic dataset, for example a banking one, is used (see https://arxiv.org/abs/1607.04606). The advantage of the fastText model over basic word2vec is its ability to process (vectorize) words that were absent from the training set. fastText semantic vectors have a dimension on the order of several hundred, denoted EMB_DIM. The experiments with the architecture of the presented model (220) used EMB_DIM = 300.

[0048] As a result of these procedures, each sentence at the input of the model (220) is associated with a matrix of dimensions (MAX_LEN, EMB_DIM). This functionality is encapsulated in the word vectorization module (2202), which is used to vectorize both the context and the response sentences. The word vectorization module (2202) is not trained while the model (220) is being fitted, since its word vectors are fixed and no longer change.

[0049] The sentence vectorization module (2203) contains a model for vectorizing an entire sentence. Each sentence, represented as a (MAX_LEN, EMB_DIM) matrix from the module (2202), is encoded into a vector of fixed dimension. A recurrent neural network of the LSTM type is used for this. The LSTM module processes the matrix obtained from the module (2202) from left to right, and the last internal state of the LSTM (that is, the state corresponding to the last word in the sentence) serves as the vector representation of the sentence.

[0050] As a result of the operation of the module (2203), each sentence is represented as a vector of fixed dimension LSTM_DIM. In the example, the cell dimension LSTM_DIM = 340 was used. Thus, the query context, consisting of CTX_LEN replicas with a maximum of SEQ_LEN words each (input inp ctx), is assigned a matrix (CTX_LEN, LSTM_DIM). A single-sentence candidate is assigned a vector of dimension LSTM_DIM. A candidate represents one of the possible responses for a given context. This functionality is encapsulated in the sentence vectorization module (2203). The module (2203) contains most of the trainable parameters of the model (2-5 million, depending on the configuration) and is the most computationally "heavy".

[0051] The subsampling (pooling) modules (2204, 2205) receive as input a vector of fixed dimension from the internal RNN state of the sentence vectorization module (2203). In a particular embodiment, the modules (2204, 2205) may be part of the sentence vectorization module (2203).

[0052] The concatenation module (2206) is designed to concatenate vectors received from the modules (2204, 2205) into one, for their subsequent transfer to the multilayer perceptron (2207) as a single vector.

[0053] The multilayer perceptron (MLP) (2207) is, in particular, a two-layer one in which fully connected ("Dense") layers alternate with regularization ("Dropout") layers. The dropout value for this MLP example is 0.3. Dropout is a method of regularizing neural networks that serves to combat overfitting (see, for example, http://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf).

[0054] The module (2208) is the output neuron of the model for determining interrogative sentences (220), with a sigmoid activation function; it holds the prediction that the model (220) makes from the input text data. In the presented example, the architecture of the model (220) reached 0.945 AUC (area under the ROC curve) and 0.875 ACC (accuracy) on the validation set. The AUC is an aggregate measure of classification quality that does not depend on the relative cost of the two types of error. The higher the AUC value, the better the classification model. This indicator is often used for comparing several classification models.
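The two validation metrics named above can be illustrated with a small sketch. It relies on the standard interpretation of AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function names, the 0.5 decision threshold, and the toy data are assumptions and are not taken from the description.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples whose thresholded score matches the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


# Toy data: two interrogative (1) and two non-interrogative (0) sentences.
labels = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.1]
print(roc_auc(labels, scores))   # 0.75
print(accuracy(labels, scores))  # 0.75
```

Unlike accuracy, the AUC value does not change when the decision threshold is moved, which is why it is convenient for comparing models.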

[0055] Next, consider the step of generating a training sample (102) for the reference analysis model. The training sample is generated at step (102) using the model (220) to extract interrogative sentences from the input dataset (210), which consists of unlabeled chat dialogs between clients and operators. Each customer replica in each chat is processed using the model (220).

[0056] At step (103), the replicas are tokenized and represented as sequences of word vectors, after which they are fed to the input of the interrogative sentence extraction model (220). When processing a replica, the model (220) estimates the probability that the replica is an interrogative sentence. If operator messages follow the request replica, a positive training example (231) is formed. If there are several client replicas in a row, the one with the highest predicted probability is chosen as the interrogative one. If several operator response replicas follow the client's request in a row, a training example is formed with each of them.

[0057] In the positive training example (231), all replicas up to and including the client request are included as context. The following operator's replica is used as a response. The context includes the last n replicas (of both the client and the operator) preceding the client's request, where n is a model parameter, for example, from 1 to 6. In an exemplary implementation, during processing of the dataset (210) the model (220) formed a training set, which contained about 1,000,000 positive examples (231).

[0058] Negative examples (232) were formed by replacing the correct answer in the positive example with an arbitrary one from the set of all possible operator responses (which at the time of training were about 1,000,000).
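The procedure of paragraphs [0056]-[0058] can be sketched roughly as follows. This is a simplified illustration, not the authors' code: the data layout, the function names, the probability threshold `min_prob`, and the stand-in `question_prob` callable (which plays the role of the model (220)) are assumptions. The additional filters on operator responses mentioned earlier (length, stop words) are omitted.

```python
import random


def build_positives(dialog, question_prob, n_context=3, min_prob=0.5):
    """Form (context, response) positive examples from one dialog.

    dialog: list of (speaker, text) pairs, speaker is "client" or "operator".
    question_prob: callable returning the probability that a text is a question.
    """
    positives = []
    i = 0
    while i < len(dialog):
        if dialog[i][0] != "client":
            i += 1
            continue
        # dialog[i:j] is a run of consecutive client replicas.
        j = i
        while j < len(dialog) and dialog[j][0] == "client":
            j += 1
        # Choose the replica with the highest predicted question probability.
        q = max(range(i, j), key=lambda k: question_prob(dialog[k][1]))
        if question_prob(dialog[q][1]) >= min_prob:
            # Context: up to n_context preceding replicas plus the question.
            context = [text for _, text in dialog[max(0, q - n_context):q + 1]]
            # One example per operator replica directly following the run.
            k = j
            while k < len(dialog) and dialog[k][0] == "operator":
                positives.append((context, dialog[k][1]))
                k += 1
        i = j
    return positives


def make_negatives(positives, all_responses, rng):
    """Replace each correct response with a different, randomly chosen one."""
    negatives = []
    for context, response in positives:
        alternatives = [r for r in all_responses if r != response]
        if alternatives:
            negatives.append((context, rng.choice(alternatives)))
    return negatives
```

Passing a seeded `random.Random` instance as `rng` keeps the generated negatives reproducible between runs.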

[0059] Based on the generated training sample, at step (104), the model for determining the relevant replicas (240) is trained. FIG. 4 shows an example of training a model for determining relevant replicas (240). The input of the model (240) receives data from the training sample (230), formed on the basis of the obtained positive (231) and negative (232) examples of processing customer requests.

[0060] Training batches are formed from the training part of the training sample (230). The ratio of positive to negative examples in a batch is chosen to be approximately equal. Typical batch sizes are 256 and 512. The model (240) is trained for 32 epochs, and at the end of each epoch it is validated on a held-out (deferred) sample. The held-out sample is a part of the dataset that is not used for training the model but is used to validate it (calculate metrics). As an example, the held-out sample can be 10% of the originally generated training sample.
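A minimal sketch of the data preparation described in this paragraph is given below. The function names, the fixed random seed, and the choice to use exactly half positives and half negatives per batch (the text says "approximately equal") are assumptions made for illustration.

```python
import random


def split_holdout(examples, holdout_fraction=0.1, seed=0):
    """Shuffle and split examples into (training part, held-out part)."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def balanced_batches(positives, negatives, batch_size=256, seed=0):
    """Yield batches of (example, label) with equal numbers of each class.

    batch_size is expected to be an even number of at least 2.
    """
    rng = random.Random(seed)
    half = batch_size // 2
    pos, neg = positives[:], negatives[:]
    rng.shuffle(pos)
    rng.shuffle(neg)
    usable = min(len(pos), len(neg))
    for start in range(0, usable - half + 1, half):
        batch = [(x, 1) for x in pos[start:start + half]]
        batch += [(x, 0) for x in neg[start:start + half]]
        rng.shuffle(batch)
        yield batch
```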

[0061] Optionally, in the validation process, in addition to the training set (230) obtained automatically from raw (in other words, unlabeled) data, a manually labeled question and answer dataset can be used. If the manually labeled dataset is large enough (thousands of question-answer pairs), then in this way you can additionally train the model (240) on these pairs. In this case, the training sample (230) is replaced with a manually labeled dataset, with the help of which further training of the model (240) is continued. This leads to a significant increase in quality metrics on questions from an additional dataset.

[0062] If there is little data, the model (240) is validated on it by calculating the appropriate quality metrics. As a rule, the recall@k and precision@k metrics are calculated for a question-answer system that includes the model (240); the model with the maximum values of these metrics can be selected for subsequent serialization with pickle (the pickle module implements an algorithm for serializing and deserializing Python objects). The value of recall@k is determined by how often the correct answer to a question appears among the top k of the model's responses ranked by relevance. It is calculated by the formula: (number of relevant responses up to the k-th position in the ranked list of responses) / (total number of relevant responses).

For example, the model was asked 10 questions; 5 times the correct answer was first in the list of sorted answers, and 8 times the correct answer was among the top 3 answers sorted by relevance. In this case, recall@1 = 5/10 = 0.5 and recall@3 = 8/10 = 0.8 (assuming that there is only one relevant answer for each question).
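The worked example can be reproduced with a few lines of code. The sketch assumes, as the example does, exactly one relevant answer per question, so each question is described by the 1-based rank of its correct answer. The function name and the two ranks outside the top 3 (7 and 9) are hypothetical.

```python
def recall_at_k(ranks, k):
    """Share of questions whose single correct answer is ranked within top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# 10 questions: 5 answered at rank 1, 3 more within the top 3, 2 outside it.
ranks = [1, 1, 1, 1, 1, 2, 3, 3, 7, 9]
print(recall_at_k(ranks, 1))  # 0.5
print(recall_at_k(ranks, 3))  # 0.8
```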

[0063] Training the model (240) takes 2-3 hours on average when using an NVIDIA 1080 Ti GPU. The model (240) with the maximum accuracy on the held-out (deferred) sample is serialized into a binary format (pickle) for further use.

[0064] FIG. 5 shows the architecture of the model (240) for determining the relevant replicas. The relevance model (240) is designed to assess the relevance of a given context-response pair. The model has two input nodes - for the dialogue context (2401) and for the candidate replica (2402). The model has one output node (2407), which is a module for determining a relevance score, which can take values from 0 to 1.

[0065] The word vectorization module (2403) is similar in functionality to the module (2202) and likewise performs vectorization at the word level. The notation "Iterative" means that the module (2404) can perform the prescribed data processing sequentially several times. Here, the "Iterative" parameter for the word vectorization module (2404) indicates that the module is applied in turn to each replica of the context (of which there are 3 in this example). This is not required for the candidate replica, since it consists of one sentence (accordingly, the vectorization is performed once).

[0066] The module (2405) is designed for vectorizing sentences and repeats the functionality of the module (2203). In the diagram of FIG. 5, the subsampling and concatenation nodes are encapsulated inside the module (2405) and are not shown explicitly. Here, the "Iterative" parameter indicates that the sentence vectorization module (2406) is applied in turn to each replica of the context (of which there are 3 in this example).

[0067] The unit (2407) is a relevance computation unit. This module (2407) accepts the vector representations of the context and the candidate as input and returns a single number in the range [0, 1], the relevance score. The score is calculated from a number of factors, including: the dot product between the candidate vector and each of the context vectors; the dot product between the candidate vector and the sum of the context vectors; the concatenation of the context vectors and the candidate vector; and the calculation of the dot product with the candidate vector.

[0068] The result of concatenating the context vectors and the candidate vector is fed to the input of a two-layer perceptron. The dimension of its output layer is equal to LSTM_DIM, so the output forms a matrix (CTX_LEN, LSTM_DIM). The dot product of this matrix with the candidate vector is then calculated, which yields CTX_LEN factors for the context. The length of the context is denoted CTX_LEN and can range from 1 (the context is just a question) to infinity (the context is the whole dialogue). Typical values are [1:5]. In the example implementation, with CTX_LEN = 3, 7 factors are obtained for calculating relevance.
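One reading of the factor list that agrees with the stated count is CTX_LEN per-replica dot products, one dot product with the summed context, and CTX_LEN dot products with the perceptron output, that is, 2 * CTX_LEN + 1 = 7 for CTX_LEN = 3. The sketch below follows this reading; it is an interpretation, not a confirmed description of the model (240). The function names are hypothetical, and `toy_project` is only a stand-in for the trained two-layer perceptron.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def relevance_factors(context_vectors, candidate, project):
    """Return 2 * CTX_LEN + 1 factors for one context-candidate pair.

    project: stand-in for the two-layer perceptron; it maps the concatenated
    context and candidate vectors to CTX_LEN rows of the candidate's dimension.
    """
    per_replica = [dot(c, candidate) for c in context_vectors]
    summed_context = [sum(column) for column in zip(*context_vectors)]
    with_sum = [dot(summed_context, candidate)]
    concatenated = [x for vec in context_vectors for x in vec] + list(candidate)
    from_perceptron = [dot(row, candidate) for row in project(concatenated)]
    return per_replica + with_sum + from_perceptron


# Toy 2-dimensional vectors in place of LSTM_DIM = 340.
context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # CTX_LEN = 3
candidate = [2.0, 3.0]


def toy_project(flat):
    """Stand-in for the perceptron: returns the context rows unchanged."""
    d = len(candidate)
    return [flat[i * d:(i + 1) * d] for i in range(len(context))]


factors = relevance_factors(context, candidate, toy_project)
print(len(factors))  # 7
```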

[0069] These factors are fed to the input of another two-layer perceptron with a sigmoid activation function on the last layer. The output of the module (2407) is a single number, the relevance score, which is the final output of the entire model (240). The model (240) is trained as a binary classifier that predicts whether a given candidate is relevant to a given context; that is, the model (240) makes it possible to determine the answer to the question received in a request.

[0070] The trained model (240) can be used to build a question-answering system as follows. From all the possible operator answers, a limited set of candidates is selected. The selection process excludes answers that are too short, too long, meaningless, or duplicated. This process can be either fully automated or semi-automated, in which case a specialist additionally checks the final list of candidates manually, which improves the quality of the entire system. As a result, a set of candidates is obtained (usually from hundreds to thousands), with any of which the model (240) can respond to a query.

[0071] To respond to a query, the model (240) evaluates the relevance of each loaded candidate to the query context. The top-k candidates (typically k = 3) are returned as the most likely response options for the customer request.
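The ranking step can be sketched as follows. The `score` argument stands in for the trained model (240); the word-overlap scorer used here is a hypothetical placeholder that only makes the example runnable and has nothing to do with the actual relevance model.

```python
import heapq


def top_k(context, candidates, score, k=3):
    """Return the k candidates with the highest relevance score, best first."""
    return heapq.nlargest(k, candidates, key=lambda c: score(context, c))


def overlap_score(context, candidate):
    """Placeholder scorer: share of candidate words that occur in the context."""
    context_words = set(" ".join(context).lower().split())
    candidate_words = set(candidate.lower().split())
    return len(context_words & candidate_words) / max(len(candidate_words), 1)


context = ["how to activate mobile banking"]
candidates = [
    "credit cards are issued online",
    "mobile banking is free of charge",
    "activate mobile banking in the app",
    "visit a branch",
]
print(top_k(context, candidates, overlap_score, k=2))
```

With hundreds to thousands of candidates, scoring every candidate for each query is feasible, and `heapq.nlargest` avoids sorting the full list when k is small.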

[0072] After receiving the trained model (240) for determining the relevance of replicas, this model (240) can be used in the future in automated systems for analyzing dialogs coming from the client. For example, such systems can be chat bots, intelligent assistants placed on websites, widgets, telephone robots, etc.

[0073] FIG. 6 shows an example of using the trained model (240) for determining relevant replicas to process requests from a client (10), which can arrive at the resource (20). The resource (20) can be a web resource (website, portal, etc.), a call center, a mobile application, etc. The client (10) can submit a request in the form of a phone call, through a chat session, a VoIP call, using a specialized widget or software, etc. Upon receiving the request information from the client (10), the resource (20) transfers the context of the request to the trained model (240), which determines the question-answer pair and transfers the data to the module (250) for generating the response to the client's request, which generates the answer to the question of the client (10). The response to the request of the client (10) is transmitted from the module (250), as a rule, over the same information channel through which the request arrived. The response can be a chatbot response, a telephone robot response, interactive information, a hyperlink, a combination of response options, etc.

[0074] FIG. 7 shows an example of a general view of a computing system (300) that implements the claimed method (100) or is part of a computer system, for example, a server, a personal computer, part of a computing cluster that processes the necessary data to implement the claimed technical solution.

[0075] In the General case, the system (300) contains one or more processors (301), united by a common bus of information exchange, memory means such as RAM (302) and ROM (303), input / output interfaces (304), input devices / output (305), and a device for networking (306).

[0076] The processor (301) (or multiple processors, a multi-core processor, etc.) can be selected from a range of devices currently in wide use, for example, from manufacturers such as Intel™, AMD™, Apple™, Samsung Exynos™, MediaTek™, Qualcomm Snapdragon™, etc. The processor, or one of the processors, used in the system (300) should also be understood to include a graphics processor, for example, an NVIDIA GPU or a Graphcore processor, which is likewise suitable for full or partial execution of the method (100) and can also be used for training and applying machine learning models in various information systems.

[0077] RAM (302) is a random access memory and is intended for storing machine-readable instructions executed by the processor (301) for performing the necessary operations for logical data processing. RAM (302), as a rule, contains executable instructions of the operating system and corresponding software components (applications, software modules, etc.). In this case, the available memory of the graphics card or the graphics processor can act as RAM (302).

[0078] ROM (303) is one or more persistent storage devices, for example, a hard disk drive (HDD), a solid state drive (SSD), flash memory (EEPROM, NAND, etc.), optical storage media (CD-R/RW, DVD-R/RW, Blu-ray Disc, MD), etc.

[0079] Various types of I/O interfaces (304) are used to organize the operation of the components of the system (300) and of externally connected devices. The choice of interfaces depends on the specific design of the computing device; they can include, but are not limited to: PCI, AGP, PS/2, IrDA, FireWire, LPT, COM, SATA, IDE, Lightning, USB (2.0, 3.0, 3.1, micro, mini, Type-C), TRS/audio jack (2.5, 3.5, 6.35), HDMI, DVI, VGA, DisplayPort, RJ45, RS232, etc.

[0080] To enable user interaction with the computing system (300), various I/O means (305) are used, for example, a keyboard, display (monitor), touch display, touchpad, joystick, mouse, light pen, stylus, touch panel, trackball, speakers, microphone, augmented reality means, optical sensors, tablet, light indicators, projector, camera, biometric identification means (retina scanner, fingerprint scanner, voice recognition module), etc.

[0081] The networking means (306) provide data transmission over an internal or external computer network, for example, an intranet, the Internet, a LAN, and the like. One or more means (306) can be used, including but not limited to: an Ethernet card, GSM modem, GPRS modem, LTE modem, 5G modem, satellite communication module, NFC module, Bluetooth and/or BLE module, Wi-Fi module, etc.

[0082] Additionally, satellite navigation means can also be used as part of the system (300), for example, GPS, GLONASS, BeiDou, Galileo.

[0083] As shown in FIG. 6, the resource (20) accessed by the client (10) can be organized using the system (300), which can be a server that provides the functionality required for processing incoming requests, determining response replies using the trained model (240), and generating response messages using the module (250), which are transmitted over various wired and/or wireless data channels.

[0084] Requests from the client (10) can also be generated using a device that contains part of the functionality of the system (300). In particular, the device of the client (10) can be a smartphone, computer, tablet, terminal, or any other device that provides a communication channel with the resource (20) for generating and transmitting a request and receiving the required response, which can also include various types of digital information.

[0085] Thus, applying the model (240) for determining relevant answers, created using the claimed method (100), achieves a more accurate automated selection of response pairs for the incoming context of user requests, which makes it possible to create a new, improved way of training and applying machine learning models in AI-based systems.

[0086] The presented application materials disclose preferred examples of the implementation of the technical solution and should not be construed as limiting other, particular examples of its implementation, not going beyond the scope of the claimed legal protection, which are obvious to specialists in the relevant field of technology.

Claims

FORMULA

1. A computer-implemented method for creating a dialogue analysis model based on artificial intelligence for processing user requests, performed using at least one processor and comprising the steps of:

• processing the received data set, during which a training sample for an artificial neural network is formed, containing positive and negative examples of user requests based on an analysis of the context of dialogues, wherein the positive examples contain a semantically related set of operator replies in response to the user request;

• selecting and encoding a vector representation of each reply from the positive and negative examples of the training sample formed at the previous step;

• using the generated training sample to train the model for determining relevant replies from the context of user requests in dialogues.

2. The method according to claim 1, characterized in that the model is at least one artificial neural network.

3. The method according to claim 1, characterized in that positive examples are formed on the basis of complete chains of dialogues between the operator and the user, and such a chain contains at least one interrogative sentence.

4. The method according to claim 1, characterized in that when selecting relevant replies for responding to the user's request phrase at the stage of model training, a score is calculated for each response reply.

5. The method according to claim 1, characterized in that at the stage of encoding the replies into a vector representation, the sentences representing a reply are encoded as a matrix of semantic vectors.

6. A system for processing user requests in an information channel using artificial intelligence, containing

- at least one processor;

- at least one memory coupled to the processor, which contains machine-readable instructions that, when executed by at least one processor, provide:

- receiving user requests using the information channel;

- processing of user requests using a machine learning model for automated processing of user requests, created using the method according to any one of claims 1-5;

- generating a response message to the user's request and transmitting it over the information channel.

7. The system according to claim 6, characterized in that it is a server, mainframe, or supercomputer.

8. The system according to claim 6, characterized in that the information channel is a chat session, VoIP communication, or a telephone channel.

9. The system according to claim 6, characterized in that the chat session is a chat using a mobile application or a chat on a website.