WO2021012382A1

WO2021012382A1 - Method and apparatus for configuring chat robot, computer device and storage medium

Info

Publication number: WO2021012382A1
Application number: PCT/CN2019/107693
Authority: WO
Inventors: 黄海杰
Original assignee: 深圳壹账通智能科技有限公司; 壹帐通金融科技有限公司（新加坡）
Priority date: 2019-07-25
Filing date: 2019-09-25
Publication date: 2021-01-28
Also published as: CN110569341B; CN110569341A; SG11202004541WA

Abstract

A method for configuring a chat robot, comprising: acquiring a scanned image of a service table; extracting table feature information in the scanned image of the service table; determining field types of text blocks in the table feature information, and identifying target text blocks that need to perform field filling according to the field types; according to said target text blocks, establishing an association relationship between the text blocks; querying a preset data type configuration table according to the target text blocks; determining a data type of a field to be filled corresponding to each target text block; according to the data type of the field corresponding to each target text block, the association relationship between the text blocks, and a preset sentence template, generating a service segment sentence of each target text block; and according to position information of each target text block in the table feature information and the service segment sentence of each target text block, configuring a chat robot.

Description

Method, device, computer equipment and storage medium for configuring chat robot

Cross references to related applications

This application claims priority to Chinese patent application No. 2019106768243, filed with the Chinese Patent Office on July 25, 2019 and entitled "Methods, devices, computer equipment and storage media for configuring chat robots", the entire contents of which are incorporated herein by reference.

Technical field

This application relates to a method, device, computer equipment and storage medium for configuring a chat robot.

Background technique

With the development of computer technology, business application methods based on OCR (Optical Character Recognition) have emerged. OCR refers to the process in which an electronic device (such as a scanner or digital camera) examines the characters printed on paper, determines their shapes by detecting dark and bright patterns, and then translates the shapes into computer text using a character recognition method. An OCR-based business application method captures, through OCR, the content of paper forms that users have filled in advance and automatically enters it into the system.

However, the inventor realizes that the current OCR-based business application method still requires users to fill out paper forms first and only automates the later entry process, which results in low business processing efficiency.

Summary of the invention

According to various embodiments disclosed in the present application, a method, apparatus, computer device, and storage medium for configuring a chat robot are provided.

One method of configuring a chatbot includes:

Obtain a scanned image of a business form, extract the form feature information from the scanned image, determine the field type of each text block in the form feature information, identify the target text blocks that require field filling according to the field types, and establish the association relationship between the text blocks according to the target text blocks that require field filling;

Query the preset data type configuration table according to the target text block that needs to be field filled, and determine the data type of the field to be filled corresponding to each target text block;

According to the data type of the fields to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template, generate business segment sentences for each target text block; and

According to the location information of each target text block in the table feature information and the business segment sentence of each target text block, the chat robot is configured.

A device for configuring a chat robot includes:

The acquisition module is used to obtain a scanned image of a business form, extract the form feature information from the scanned image, determine the field type of each text block in the form feature information, identify the target text blocks that require field filling according to the field types, and establish the association relationship between the text blocks according to the target text blocks that require field filling;

The first processing module is configured to query the preset data type configuration table according to the target text block that needs to be field filled, and determine the data type of the field to be filled corresponding to each target text block;

The second processing module is used to generate business segment sentences for each target text block according to the data type of the field to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template; and

The configuration module is used to configure the chat robot according to the position information of each target text block in the table feature information and the business segment statement of each target text block.

A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to execute the following steps:

One or more non-volatile computer-readable storage media storing computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps:

The details of one or more embodiments of the application are set forth in the following drawings and description. Other features and advantages of this application will become apparent from the description, drawings and claims.

Description of the drawings

In order to more clearly describe the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings needed in the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative work.

Fig. 1 is an application scenario diagram of a method for configuring a chat robot according to one or more embodiments.

Fig. 2 is a schematic flowchart of a method for configuring a chat robot according to one or more embodiments.

Fig. 3 is a schematic diagram of a sub-flow of step S202 in Fig. 2 according to one or more embodiments.

Fig. 4 is a schematic diagram of a sub-flow of step S202 in Fig. 2 according to one or more embodiments.

Fig. 5 is a schematic diagram of a sub-flow of step S202 in Fig. 2 according to one or more embodiments.

Fig. 6 is a schematic diagram of a sub-flow of step S206 in Fig. 2 according to one or more embodiments.

Fig. 7 is a schematic diagram of a sub-flow of step S608 in Fig. 6 according to one or more embodiments.

Fig. 8 is a block diagram of an apparatus for configuring a chat robot according to one or more embodiments.

Figure 9 is a block diagram of a computer device according to one or more embodiments.

Detailed description

In order to make the technical solutions and advantages of the present application clearer, the following further describes the present application in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the application, and not used to limit the application.

The method for configuring a chat robot provided in this application can be applied to the application environment shown in FIG. 1, in which the terminal 102 and the server 104 communicate through a network. The server 104 obtains a scanned image of a business form from the terminal 102, extracts the form feature information from the scanned image, determines the field type of each text block in the form feature information, identifies the target text blocks that require field filling according to the field types, and establishes the association relationship between the text blocks according to those target text blocks. The server 104 then queries a preset data type configuration table according to the target text blocks to determine the data type of the field to be filled corresponding to each target text block, and generates a business segment sentence for each target text block according to that data type, the association relationship between the text blocks, and a preset sentence template. Finally, the server 104 configures the chat robot according to the position information of each target text block in the form feature information and the business segment sentence of each target text block. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server 104 may be implemented by an independent server or a server cluster composed of multiple servers.

In one of the embodiments, as shown in FIG. 2, a method for configuring a chat robot is provided. Taking the method applied to the server in FIG. 1 as an example, the method includes the following steps:

Step S202: Obtain a scanned image of the business form, extract the form feature information from the scanned image, determine the field type of each text block in the form feature information, identify the target text blocks that require field filling according to the field types, and establish the association relationship between the text blocks according to the target text blocks that require field filling.

The scanned image of the business form may be a scanned copy of a paper business application form or a picture of an electronic business application form. A character recognition algorithm can be used to extract the form feature information from the scanned image through feature extraction, text localization, and optical recognition. Feature extraction means taking the scanned image of the business form as the input of the trained target detection model in the character recognition algorithm and extracting features through the convolutional neural network in that model. Text localization means obtaining, from the features extracted by the trained target detection model, the position information of each detected text block picture and each text symbol picture. Optical recognition means using the trained picture classification model in the character recognition algorithm to recognize the text and symbols in each text block picture and each text symbol picture.

The trained target detection model and picture classification model in the character recognition algorithm are trained based on a large number of sample pictures containing text and text symbols. The text captured by the character recognition algorithm will be captured in blocks. For example, the field "Full Name" will be captured as a block. The table feature information includes each character block, position information of each character block, character symbols, and position information of each character symbol. Position information refers to the coordinates of each text block and each text symbol relative to the entire picture, in pixels. The coordinates of the top-left vertex of the grab box are (top,left), and the bottom-right vertex is (bottom,right). The coordinates of these two points determine the position and size of the grab box.
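
As an illustration of the position information described above, the following is a minimal Python sketch. The class names `Box` and `TextBlock` and the sample coordinates are assumptions made for this example and do not appear in the application.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Grab box in pixel coordinates relative to the whole picture (hypothetical type)."""
    top: int      # y of the top-left vertex
    left: int     # x of the top-left vertex
    bottom: int   # y of the bottom-right vertex
    right: int    # x of the bottom-right vertex

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top

    @property
    def center(self) -> Tuple[float, float]:
        # Centre point as (x, y), convenient for distance calculations.
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)


@dataclass(frozen=True)
class TextBlock:
    """A captured text block together with its position information."""
    text: str
    box: Box


# Illustrative values only.
full_name = TextBlock("Full Name", Box(top=40, left=30, bottom=64, right=150))
```

The two vertices are enough to recover the size and centre of the grab box, which later steps can use when comparing positions.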

The field types of the text blocks include fields to be filled, option fields, and comment fields, where option fields and comment fields correspond to fields to be filled. According to the field type of each text block, the server takes each text block whose field type is a field to be filled as a target text block, each text block whose field type is an option field as an option field text block, and each text block whose field type is a comment field as a comment field text block. The server then determines the correspondence among them according to the distances between each target text block, each option field text block, and each comment field text block, and establishes the association relationship between the text blocks according to that correspondence.

Step S204: Query a preset data type configuration table according to the target text block for which field filling is required, and determine the data type of the field to be filled corresponding to each target text block.

The server queries the preset data type configuration table according to the target text block for field filling, and can determine the data type of the field to be filled corresponding to each target text block. In the data type configuration table, the data type of the field to be filled corresponding to each target text block is preset. For example, when the field to be filled is a phone number or age, the corresponding data type is a number.
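
The lookup in step S204 can be sketched as a simple dictionary query. This is only an illustration: the table contents other than the phone number and age entries, the fallback type, and the function name `lookup_data_types` are assumptions, not part of the application.

```python
# Hypothetical preset data type configuration table.
# Only "Phone Number" and "Age" mapping to "number" come from the description;
# the remaining entry and the default are assumptions for illustration.
DATA_TYPE_CONFIG = {
    "Phone Number": "number",
    "Age": "number",
    "Full Name": "text",
}


def lookup_data_types(target_blocks, config=DATA_TYPE_CONFIG, default="text"):
    """Return the data type of the field to be filled for each target text block."""
    return {label: config.get(label, default) for label in target_blocks}


data_types = lookup_data_types(["Full Name", "Age", "Phone Number"])
```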

Step S206, according to the data type of the fields to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template, generate business segment sentences of each target text block.

According to the association relationship between the text blocks, the server determines the option field text block and the comment field text block corresponding to each target text block. It then generates the business segment sentence of each target text block according to the data type of the field to be filled corresponding to that target text block, the preset sentence template, and the corresponding option field text block and comment field text block. A business segment sentence is a sample dialogue segment for the field to be filled corresponding to a target text block, and includes a machine reply sentence, a comment prompt sentence, and a customer intention sentence.

The machine reply sentence is obtained from each target text block and is the sentence with which the chat robot asks the customer for the information corresponding to the field to be filled. The comment prompt sentence is obtained from the association relationship between the text blocks: when a comment field text block corresponds to the target text block, the comment prompt sentence is obtained from that comment field text block and is used to prompt the customer on how to input the information corresponding to the field to be filled. The customer intention sentence is obtained from the data type of each target text block and the association relationship between the text blocks, and is the sentence in which the customer provides the information corresponding to the field to be filled. For example, when the data type is a number, the customer intention sentence should be a string of digits. The server obtains the business segment sentence of each target text block in the order of machine reply sentence, comment prompt sentence, and customer intention sentence.
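
A minimal sketch of how a business segment sentence might be assembled from a sentence template is given below. The template wording, the sample customer intention values, and the function name `build_segment` are assumptions for illustration; the application does not specify them.

```python
# Hypothetical preset sentence templates.
REPLY_TEMPLATE = "Please provide your {field}."
NOTE_TEMPLATE = "Note: {note}"


def build_segment(field, data_type, options=None, note=None):
    """Build one business segment in the order:
    machine reply sentence, comment prompt sentence, customer intention sentence."""
    segment = {"machine_reply": REPLY_TEMPLATE.format(field=field.lower())}
    # A comment prompt exists only when a comment field block is associated.
    segment["comment_prompt"] = NOTE_TEMPLATE.format(note=note) if note else None
    if options:
        # Associated option field blocks are the possible customer answers.
        intents = list(options)
    elif data_type == "number":
        # For a numeric field the sample answer is a string of digits.
        intents = ["12345"]
    else:
        intents = ["<free text>"]
    segment["customer_intent"] = intents
    return segment


age_segment = build_segment("Age", "number", note="in full years")
gender_segment = build_segment("Gender", "text", options=["Male", "Female"])
```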

In step S208, the chat robot is configured according to the position information of each target text block in the table feature information and the business segment sentence of each target text block.

The filling order of the fields to be filled corresponding to the target text blocks can be determined from the position information of each target text block. The business segment sentences of the target text blocks are then sorted according to this filling order to generate complete scenario process information for the business application, and the chat robot is configured according to the scenario process information.
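
The ordering step can be sketched as follows. The application does not state how position maps to filling order; this example assumes a top-to-bottom, then left-to-right reading order, and the names `filling_order` and `scenario_flow` are hypothetical.

```python
def filling_order(blocks):
    """Order target text blocks by position.
    Assumption: forms are filled top-to-bottom, then left-to-right."""
    ordered = sorted(blocks, key=lambda b: (b["top"], b["left"]))
    return [b["label"] for b in ordered]


def scenario_flow(blocks, segments):
    """Arrange the business segment sentences in filling order."""
    return [segments[label] for label in filling_order(blocks)]


# Illustrative data only.
blocks = [
    {"label": "Age", "top": 80, "left": 30},
    {"label": "Phone Number", "top": 40, "left": 200},
    {"label": "Full Name", "top": 40, "left": 30},
]
segments = {
    "Full Name": "Please provide your full name.",
    "Phone Number": "Please provide your phone number.",
    "Age": "Please provide your age.",
}
flow = scenario_flow(blocks, segments)
```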

The above method of configuring a chat robot extracts the form feature information from the scanned image of the business form, determines the field type of each text block in the form feature information, identifies the target text blocks that require field filling according to the field types, and establishes the association relationship between the text blocks according to those target text blocks. It determines the data type of the field to be filled corresponding to each target text block by querying the preset data type configuration table, generates the business segment sentence of each target text block according to that data type, the association relationship between the text blocks, and the preset sentence template, and configures the chat robot according to the position information of each target text block in the form feature information and the business segment sentence of each target text block. Business processing can then be performed through the configured chat robot, so that users provide the information required by the original paper form through online chat to complete the business application, which improves business processing efficiency.

In one of the embodiments, as shown in FIG. 3, step S202 includes:

Step S302, obtaining a scanned image of the business form, and preprocessing the scanned image of the business form;

Step S304: According to the trained target detection model, obtain the position information of each text block and the position information of each text symbol in the preprocessed scanned image of the business form, where the target detection model is trained on sample images including text blocks and text symbols;

Step S306, according to the position information of each character block and the position information of each character symbol, segment the scanned image of the business table to obtain multiple character block images and character symbol images;

Step S308: According to the trained picture classification model, recognize the text blocks and text symbols in each text block image and each text symbol image to obtain the form feature information in the scanned image of the business form, where the picture classification model is trained on sample images including text blocks and text symbols.

Preprocessing includes denoising and tilt correction. After the scanned image of the business form is input into the trained target detection model in the character recognition algorithm, the convolutional neural network in the model extracts the features of the scanned image, and the position information of each text block and each text symbol in the scanned image is obtained from the extracted features and the fully connected layer in the model. The scanned image is then segmented according to the position information of each text block and each text symbol to obtain multiple text block images and text symbol images. Finally, the trained picture classification model is used to recognize the text in these images. Both the target detection model and the picture classification model are trained on sample pictures including text blocks and text symbols. The target detection model may be a common model such as YOLO, Faster R-CNN, or SSD, and the picture classification model may be ResNet. Common text symbols include long underlines, check boxes, and the like. These text symbols can be used to help classify each text block.
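
The segmentation of step S306 amounts to cropping the scanned image at each detected grab box. The sketch below uses a nested list as a stand-in for the image so that it needs no external library; it assumes boxes are given as (top, left, bottom, right) with the bottom and right edges exclusive, which the application does not specify, and `crop_blocks` is a hypothetical name.

```python
def crop_blocks(image, boxes):
    """Cut one sub-image per grab box out of the scanned image.

    image: list of pixel rows (a stand-in for a real image array).
    boxes: iterable of (top, left, bottom, right) in pixels;
           bottom/right are treated as exclusive (an assumption).
    """
    crops = []
    for top, left, bottom, right in boxes:
        crops.append([row[left:right] for row in image[top:bottom]])
    return crops


# A tiny 4x5 "image" whose pixel value encodes its row and column.
image = [[r * 10 + c for c in range(5)] for r in range(4)]
crops = crop_blocks(image, [(1, 2, 3, 4), (0, 0, 1, 5)])
```

Each cropped image would then be passed to the picture classification model for recognition.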

In the above-mentioned embodiment, the pre-processed business table scan map is processed by using the trained target detection model and the picture classification model, and the table feature information in the business table scan map is extracted, so as to realize the extraction of the table feature information.

In one of the embodiments, step S304 includes:

Input the preprocessed scanned image of the business form into the trained target detection model;

Extract the features of the preprocessed scanned image of the business form according to the convolutional neural network in the target detection model; and

According to the fully connected layer in the target detection model and the features of the scanned image of the business form, obtain the position information of each text block and the position information of each text symbol in the preprocessed scanned image of the business form.

In one of the embodiments, as shown in FIG. 4, step S202 includes:

Step S402: Input each text block in the table feature information into the trained classification model to obtain the confidence that each text block belongs to each preset field type;

Step S404: Determine the coordinate distance between each character block and each character symbol according to the position information of each character block in the table feature information and the position information of each character symbol;

Step S406: Use each character symbol whose coordinate distance from each character block is within a preset distance threshold range as a character symbol associated with each character block;

Step S408, according to the association between each text block and each text symbol, correct the confidence that each text block belongs to each preset field type;

Step S410: Sort the confidence of each text block belonging to each preset field type, and use the field type with the highest confidence as the field type of each text block.

The server inputs each text block in the form feature information into the trained classification model to obtain the confidence that each text block belongs to each preset field type, where the confidence indicates the probability that the text block belongs to that field type. The server then determines the coordinate distance between each text block and each text symbol according to the position information of each text block and each text symbol in the form feature information, and takes each text symbol whose coordinate distance from a text block is within the preset distance threshold range as a text symbol associated with that text block. According to the association between each text block and each text symbol, the server corrects the confidence that each text block belongs to each preset field type. Finally, the server sorts the confidences of each text block and takes the field type with the highest confidence as the field type of that text block.

Correcting the confidence that each text block belongs to each preset field type according to the association between each text block and each text symbol means that, when a text block is associated with a text symbol, the confidence of the text block is adjusted according to the type of the associated text symbol. For example, if a field is followed by a check box, the confidence that the field is an option field is increased; if a field is followed by a long underline, the confidence that the field is a field to be filled is increased. Furthermore, fields to be filled include mandatory fields and optional fields, and can be further classified by detecting whether a required-field marker appears before or after the text block.
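
The confidence correction of steps S404 to S410 can be sketched as follows. The boost value, the distance threshold, the use of centre-point Euclidean distance, and all names are assumptions for illustration; the application only states that the confidence is increased for the field type matching the associated symbol.

```python
import math

# Which field type each symbol type supports (from the examples in the text).
SYMBOL_BOOST = {"underline": "to_fill", "checkbox": "option"}


def classify(block_center, confidences, symbols, max_dist=100.0, boost=0.2):
    """Correct the classifier confidences using nearby text symbols,
    then pick the field type with the highest corrected confidence.

    block_center: (x, y) centre of the text block.
    confidences:  dict mapping field type to classifier confidence.
    symbols:      list of (symbol_type, (x, y)) for detected text symbols.
    max_dist, boost: hypothetical threshold and adjustment values.
    """
    adjusted = dict(confidences)
    for kind, center in symbols:
        dist = math.hypot(block_center[0] - center[0], block_center[1] - center[1])
        # Only symbols within the distance threshold count as associated.
        if dist <= max_dist and kind in SYMBOL_BOOST:
            adjusted[SYMBOL_BOOST[kind]] += boost
    best = max(adjusted, key=adjusted.get)
    return best, adjusted


conf = {"to_fill": 0.40, "option": 0.45, "comment": 0.15}
symbols = [("underline", (120, 50)), ("checkbox", (900, 50))]
field_type, corrected = classify((60, 50), conf, symbols)
uncorrected_type, _ = classify((60, 50), conf, [])
```

In this example the classifier alone slightly prefers the option field, but the nearby long underline shifts the decision to a field to be filled, while the distant check box is ignored.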

In the above-mentioned embodiment, the confidence that each text block belongs to each preset field type is obtained from the trained classification model, the confidence is corrected according to the association between each text block and each text symbol, and the field type with the highest confidence is taken as the field type of each text block, thereby determining the field type of each text block.

In one of the embodiments, as shown in FIG. 5, the field types include fields to be filled, option fields, and comment fields. Step S202 includes:

Step S502, according to the field type of each text block, determine that the text block whose field type is a field to be filled is the target text block that needs to be filled in.

Step S504: Determine the distance between each target text block and each option field text block and each comment field text block according to the position information of each text block in the table feature information;

Step S506, according to the distance between each target text block and each option field text block and each comment field text block, determine the option field text block and the comment field text block corresponding to each target text block;

Step S508: Establish an association relationship between each target text block and the corresponding option field text block and comment field text block.

The field types of the text blocks include fields to be filled, option fields, and comment fields, where option fields and comment fields correspond to fields to be filled. According to the field type of each text block, the server determines that each text block whose field type is a field to be filled is a target text block that requires field filling, each text block whose field type is an option field is an option field text block, and each text block whose field type is a comment field is a comment field text block. The server then determines the distance between each target text block and each option field text block and each comment field text block according to the position information of each text block in the form feature information. According to these distances, the server determines the option field text block and the comment field text block corresponding to each target text block, and establishes the association relationship between each target text block and the corresponding option field text block and comment field text block.

In the above embodiment, the text blocks whose field type is a field to be filled are determined as target text blocks according to the field type of each text block, and the option field text block and comment field text block corresponding to each target text block are determined according to the position information of each text block in the form feature information, thereby establishing the association relationship between the text blocks.
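
Steps S504 to S508 can be sketched as a nearest-neighbour assignment. The application says only that the correspondence is determined according to distance; assigning each option or comment block to its nearest target block, and measuring distance between centre points, are assumptions of this example, as is the name `associate`.

```python
import math


def associate(targets, options, comments):
    """Link each option/comment field block to the nearest target text block.

    Each argument maps a block label to its (x, y) centre point.
    Returns a dict: target label -> {"options": [...], "comments": [...]}.
    """
    links = {t: {"options": [], "comments": []} for t in targets}
    for kind, blocks in (("options", options), ("comments", comments)):
        for label, (x, y) in blocks.items():
            nearest = min(
                targets,
                key=lambda t: math.hypot(targets[t][0] - x, targets[t][1] - y),
            )
            links[nearest][kind].append(label)
    return links


# Illustrative layout only.
targets = {"Gender": (50, 100), "Date of Birth": (50, 200)}
options = {"Male": (150, 100), "Female": (220, 100)}
comments = {"(YYYY-MM-DD)": (260, 200)}
links = associate(targets, options, comments)
```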

In one of the embodiments, as shown in FIG. 6, step S206 includes:

Step S602, according to the position information of each target text block in the form feature information, determine the filling order of the fields to be filled corresponding to each target text block;

Step S604: Determine the business process sequence of the business segment statements of each target text block according to the filling order of the fields to be filled corresponding to each target text block;

Step S606: Generate scenario process information of the business application according to the business process sequence;

Step S608: Perform model training according to the scene process information and the business segment sentences of each target text block, and configure a chat robot.

Integrating the business segment sentences of the target text blocks according to the filling order of the corresponding fields to be filled yields the business process sequence. Scenario process information for the business application can be generated according to the business process sequence and the business segment sentences of each target text block. Model training is then performed according to the scenario process information and the business segment sentences of each target text block to obtain a natural language understanding model and a dialogue management model, and the chat robot is configured according to these two models. The natural language understanding model is used to determine the user's intention based on the user's sentence and to capture entity information, and the dialogue management model is used to determine the reply sentence based on the user's sentence and the user's intention.

In the above embodiment, the filling order of the fields to be filled corresponding to each target text block is determined according to the position information of each target text block, the business process sequence of the business segment sentences is determined according to that filling order, the scenario process information of the business application is generated, and model training is performed according to the scenario process information and the business segment sentences of each target text block, thereby configuring the chat robot.

In one of the embodiments, as shown in FIG. 7, step S608 includes:

Step S702, input the business segment sentences of each target text block as the first training set into the initial natural language understanding model for model training to obtain a natural language understanding model, which is used to judge user intentions and capture entity information based on user sentences;

Step S704, input the scene process information as the second training set into the initial dialogue management model for model training, to obtain the dialogue management model, the dialogue management model is used to determine the reply sentence according to the user sentence and the user's intention;

Step S706: Configure a chat robot according to the natural language understanding model and the dialogue management model.

The business segment sentences of each target text block include machine reply sentences, comment prompt sentences, and user intention sentences. Inputting the business segment sentences of each target text block as the first training set into the initial natural language understanding model for training enables the natural language understanding model to judge the user's intention from a user intention sentence and to capture the information to be filled in that sentence as entity information. Inputting the scenario process information as the second training set into the initial dialogue management model for training enables the dialogue management model to determine the corresponding machine reply sentence and comment prompt sentence from the user sentence and the user's intention. The chat robot can then be configured according to the natural language understanding model and the dialogue management model. After configuration, when the customer inputs a first user intention sentence in a chat task, the natural language understanding model determines the user intention from the first user intention sentence and inputs it into the dialogue management model. The dialogue management model determines the corresponding machine reply sentence and comment prompt sentence according to the user intention and pushes them. The customer then replies with a second user intention sentence according to the pushed machine reply sentence and comment prompt sentence, and the natural language understanding model captures the information to be filled in the second user intention sentence as entity information.
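
The interaction between the two models at chat time can be illustrated with rule-based stand-ins. This sketch does not train any model: the function `understand` replaces the natural language understanding model with a simple regular expression, and the loop in `run_dialogue` replaces the dialogue management model with a fixed walk through the scenario. All names and sample sentences are assumptions for illustration.

```python
import re

# Hypothetical scenario process information: one step per field to be filled.
SCENARIO = [
    {"field": "full_name", "data_type": "text",
     "machine_reply": "Please provide your full name.", "comment_prompt": None},
    {"field": "age", "data_type": "number",
     "machine_reply": "Please provide your age.",
     "comment_prompt": "Note: in full years"},
]


def understand(sentence, data_type):
    """Stand-in for the natural language understanding model:
    capture the entity to be filled from the user's sentence."""
    if data_type == "number":
        match = re.search(r"\d+", sentence)
        return match.group() if match else None
    text = sentence.strip()
    return text or None


def run_dialogue(scenario, user_sentences):
    """Stand-in for the dialogue management model: push each machine reply
    (and comment prompt, if any) in scenario order and fill the fields."""
    replies = iter(user_sentences)
    filled, transcript = {}, []
    for step in scenario:
        prompt = step["machine_reply"]
        if step["comment_prompt"]:
            prompt += " " + step["comment_prompt"]
        transcript.append(prompt)
        filled[step["field"]] = understand(next(replies), step["data_type"])
    return filled, transcript


filled, transcript = run_dialogue(SCENARIO, ["Alice Wong", "I am 30 years old"])
```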

In the above embodiment, the natural language understanding model is obtained according to the business segment sentences of each target text block, the dialogue management model is obtained according to the scene process information, and the chat robot is configured according to the natural language understanding model and the dialogue management model, thereby realizing the configuration of the chat robot.
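The interaction between the two models can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the trained natural language understanding model is replaced by a word-overlap matcher plus a regular expression, the trained dialogue management model is replaced by a lookup table, and all names (`INTENT_EXAMPLES`, `ENTITY_PATTERNS`, `SCENE_FLOW`, `understand`, `reply`) and sample sentences are hypothetical and do not come from this application.

```python
import re

# Hypothetical first training set: user intention sentences labelled with an intent.
INTENT_EXAMPLES = {
    "open_account": ["i want to open an account", "open account please"],
    "provide_name": ["my name is alice", "i am called bob"],
}

# Hypothetical entity patterns standing in for learned entity capture.
ENTITY_PATTERNS = {
    "provide_name": re.compile(r"(?:my name is|i am called)\s+(\w+)", re.IGNORECASE),
}

# Hypothetical scene process information: intent -> (machine reply, comment prompt).
SCENE_FLOW = {
    "open_account": ("What is your name?", "Use the name on your ID card."),
    "provide_name": ("Thank you, your name has been recorded.", ""),
}


def understand(sentence):
    """Toy NLU: pick the intent whose examples share the most words with the sentence,
    then try to capture an entity for that intent."""
    words = set(sentence.lower().split())
    best_intent, best_overlap = None, 0
    for intent, examples in INTENT_EXAMPLES.items():
        overlap = max(len(words & set(example.split())) for example in examples)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    entity = None
    pattern = ENTITY_PATTERNS.get(best_intent)
    if pattern is not None:
        match = pattern.search(sentence)
        if match:
            entity = match.group(1)
    return best_intent, entity


def reply(intent):
    """Toy dialogue management: map the intent to a machine reply and comment prompt."""
    return SCENE_FLOW.get(intent, ("Sorry, I did not understand.", ""))


intent, _ = understand("I want to open an account")
print(reply(intent))
print(understand("My name is Carol"))
```

In this sketch the first user sentence only yields an intent, while the second sentence yields both an intent and the captured entity, mirroring the two-turn exchange described above.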

It should be understood that although the steps in the flowcharts of FIGS. 2-7 are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless specifically stated herein, the execution of these steps is not strictly limited in order, and these steps can be executed in other orders. Moreover, at least part of the steps in FIGS. 2-7 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential; they may be performed in turn or alternately with other steps, or with at least a part of the sub-steps or stages of other steps.

In one embodiment, as shown in FIG. 8, a device for configuring a chat robot is provided, including: an acquisition module 802, a first processing module 804, a second processing module 806, and a configuration module 808, wherein:

The acquisition module 802 is used to obtain the scanned image of the business form, extract the table feature information in the scanned image of the business form, determine the field type of each text block in the table feature information, identify the target text blocks that need field filling according to the field type, and establish the association relationship between the text blocks according to the target text blocks that need field filling;

The first processing module 804 is configured to query a preset data type configuration table according to the target text block that needs to be field filled, and determine the data type of the field to be filled corresponding to each target text block;

The second processing module 806 is configured to generate business segment sentences of each target text block according to the data type of the field to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template;

The configuration module 808 is used to configure the chat robot according to the position information of each target text block in the table feature information and the business segment sentence of each target text block.

The above device for configuring a chat robot extracts the table feature information in the scanned image of the business form, determines the field type of each text block in the table feature information, identifies the target text blocks that need field filling according to the field type, and establishes the association relationship between the text blocks according to the target text blocks that need field filling. It determines the data type of the field to be filled corresponding to each target text block by querying the preset data type configuration table, generates the business segment sentence of each target text block according to the data type of the field to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template, and configures the chat robot according to the position information of each target text block in the table feature information and the business segment sentence of each target text block. Business processing can thus be performed with the configured chat robot, so that users can provide the information required by the original paper form through online chat and complete business applications, which improves the efficiency of business processing.
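The roles of the first and second processing modules can be sketched as a table lookup followed by template filling. The Python sketch below is only an illustration: the contents of `DATA_TYPE_TABLE` and `TEMPLATES`, the data type names, and the function `build_segment` are hypothetical, since this application does not specify the format of the preset data type configuration table or of the preset sentence template.

```python
# Hypothetical preset data type configuration table: field label -> data type.
DATA_TYPE_TABLE = {
    "Name": "text",
    "Date of birth": "date",
    "Gender": "choice",
}

# Hypothetical preset sentence templates, keyed by data type.
TEMPLATES = {
    "text": "Please tell me your {label}.",
    "date": "Please tell me your {label} (format: YYYY-MM-DD).",
    "choice": "Please choose your {label}: {options}.",
}


def build_segment(label, options=(), comment=""):
    """Look up the data type of a target text block and fill the matching template.

    `options` and `comment` stand for the associated option field and comment
    field text blocks, if any."""
    data_type = DATA_TYPE_TABLE.get(label, "text")  # assumed fallback: free text
    machine_reply = TEMPLATES[data_type].format(
        label=label.lower(), options=", ".join(options)
    )
    return {
        "field": label,
        "data_type": data_type,
        "machine_reply": machine_reply,
        "comment_prompt": comment,
    }


print(build_segment("Gender", options=["male", "female"]))
print(build_segment("Name", comment="Use the name on your ID card."))
```

Keeping the data types and templates in plain tables means a new form field only needs a new table entry, not new code; whether the described device is organized this way is an assumption.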

In one of the embodiments, the acquisition module is also used to acquire a scanned image of the business form, preprocess the scanned image of the business form, and obtain the position information of each text block and the position information of each text symbol in the preprocessed scanned image of the business form according to the trained target detection model, where the target detection model is trained on sample pictures including text blocks and text symbols. According to the position information of each text block and the position information of each text symbol, the scanned image of the business form is segmented to obtain multiple text block images and text symbol images. According to the trained picture classification model, the text blocks and text symbols in each text block image and each text symbol image are extracted to obtain the table feature information in the scanned image of the business form, where the picture classification model is trained on sample pictures including text blocks and text symbols.
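The segmentation step alone can be sketched in Python as cropping the scanned image by the detected positions. This is only an illustration under stated assumptions: the image is modeled as a list of pixel rows, each position is a hypothetical `(left, top, right, bottom)` box with exclusive right and bottom edges, and the boxes are supplied by hand instead of by a trained target detection model.

```python
def crop(image, box):
    """Return the sub-image covered by box = (left, top, right, bottom).

    Right and bottom edges are exclusive (an assumption of this sketch)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def segment(image, block_boxes, symbol_boxes):
    """Split the scanned image into text block images and text symbol images."""
    block_images = [crop(image, box) for box in block_boxes]
    symbol_images = [crop(image, box) for box in symbol_boxes]
    return block_images, symbol_images


# A tiny 4x6 "image" whose pixel values encode their own coordinates.
image = [[row * 10 + col for col in range(6)] for row in range(4)]
blocks, symbols = segment(image, block_boxes=[(1, 0, 3, 2)], symbol_boxes=[(4, 2, 5, 4)])
print(blocks[0])
print(symbols[0])
```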

In one of the embodiments, the acquisition module is also used to input the preprocessed scanned image of the business form into the trained target detection model, extract the features of the preprocessed scanned image of the business form according to the convolutional neural network in the target detection model, and obtain the position information of each text block and the position information of each text symbol in the preprocessed scanned image of the business form according to the fully connected layer in the target detection model and the features of the scanned image of the business form.

In one of the embodiments, the acquisition module is also used to input each text block in the table feature information into the trained classification model to obtain the confidence that each text block belongs to each preset field type; determine the coordinate distance between each text block and each text symbol according to the position information of each text block and the position information of each text symbol in the table feature information; regard each text symbol whose coordinate distance from a text block is within a preset distance threshold as a text symbol associated with that text block; correct the confidence that each text block belongs to each preset field type according to the association between each text block and each text symbol; and sort the confidences that each text block belongs to each preset field type and use the field type with the highest confidence as the field type of that text block.
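The distance-based association and the confidence correction can be sketched as follows in Python. This application does not state how the confidence is corrected, so the additive bonus in `SYMBOL_BONUS` is purely an assumption, as are the field type names, the use of box centers for the coordinate distance, and the function names.

```python
import math


def center(box):
    """Center point of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def associated_symbols(block_box, symbols, threshold):
    """Symbols whose center lies within `threshold` of the block center."""
    block_center = center(block_box)
    return [
        s["char"]
        for s in symbols
        if math.dist(block_center, center(s["box"])) <= threshold
    ]


# Hypothetical correction rule: nearby symbol -> (field type, confidence bonus).
SYMBOL_BONUS = {":": ("to_fill", 0.2), "/": ("option", 0.2), "*": ("comment", 0.2)}


def classify(confidences, block_box, symbols, threshold):
    """Correct the classifier confidences using nearby symbols, then take the top type."""
    corrected = dict(confidences)
    for char in associated_symbols(block_box, symbols, threshold):
        if char in SYMBOL_BONUS:
            field_type, bonus = SYMBOL_BONUS[char]
            corrected[field_type] = corrected.get(field_type, 0.0) + bonus
    ranked = sorted(corrected.items(), key=lambda item: item[1], reverse=True)
    return ranked[0][0], corrected


confidences = {"to_fill": 0.40, "option": 0.45, "comment": 0.15}
symbols = [
    {"char": ":", "box": (100, 0, 110, 20)},  # close to the block
    {"char": "/", "box": (500, 0, 510, 20)},  # too far away to be associated
]
field_type, corrected = classify(confidences, (0, 0, 100, 20), symbols, threshold=60.0)
print(field_type, corrected)
```

In the sample data the raw classifier slightly prefers the option type, but the adjacent colon tips the corrected result toward a field to be filled.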

In one of the embodiments, the field types include fields to be filled, option fields, and comment fields. The acquisition module is also used to determine, according to the field type of each text block, that the text blocks whose field type is a field to be filled are the target text blocks that need field filling; determine the distance between each target text block and each option field text block and each comment field text block according to the position information of each text block in the table feature information; determine the option field text block and the comment field text block corresponding to each target text block according to those distances; and establish the association relationship between each target text block and the corresponding option field text block and comment field text block.
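One way to realize this association is to link each target text block to its nearest option field text block and nearest comment field text block. The Python sketch below assumes exactly that: the nearest-neighbor rule, the `max_distance` cut-off, the box-center distance, and the sample blocks are illustrative assumptions and are not taken from this application.

```python
import math


def center(box):
    """Center point of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


def associate(blocks, max_distance):
    """Link each to-be-filled block to its nearest option and comment blocks.

    A candidate farther away than `max_distance` is not linked."""
    result = {}
    for target in (b for b in blocks if b["type"] == "to_fill"):
        links = {}
        for kind in ("option", "comment"):
            candidates = [b for b in blocks if b["type"] == kind]
            if not candidates:
                continue
            nearest = min(
                candidates,
                key=lambda b: math.dist(center(target["box"]), center(b["box"])),
            )
            if math.dist(center(target["box"]), center(nearest["box"])) <= max_distance:
                links[kind] = nearest["text"]
        result[target["text"]] = links
    return result


blocks = [
    {"text": "Gender", "type": "to_fill", "box": (0, 0, 100, 20)},
    {"text": "Male / Female", "type": "option", "box": (110, 0, 250, 20)},
    {"text": "Name", "type": "to_fill", "box": (0, 100, 100, 120)},
    {"text": "As shown on ID", "type": "comment", "box": (110, 100, 250, 120)},
]
print(associate(blocks, max_distance=150))
```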

In one of the embodiments, the configuration module is also used to determine the filling order of the fields to be filled corresponding to each target text block according to the position information of each target text block in the table feature information, determine the business process sequence of the business segment sentences of each target text block according to that filling order, generate the scene process information of the business application according to the business process sequence, and perform model training according to the scene process information and the business segment sentences of each target text block to configure the chat robot.
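Deriving the filling order from position information can be sketched as a sort in reading order. The Python sketch below assumes the form is filled top to bottom and, within a row, left to right; this ordering rule, the `(left, top, right, bottom)` box layout, and the function names are assumptions, since this application only states that the order is determined from the position information.

```python
def filling_order(targets):
    """Order target text blocks top to bottom, then left to right (assumed rule)."""
    return sorted(targets, key=lambda t: (t["box"][1], t["box"][0]))


def scene_flow(targets, segments):
    """Arrange the business segment sentences in the filling order."""
    return [segments[t["text"]] for t in filling_order(targets)]


targets = [
    {"text": "Address", "box": (0, 40, 80, 60)},
    {"text": "Phone", "box": (300, 0, 380, 20)},
    {"text": "Name", "box": (0, 0, 80, 20)},
]
segments = {
    "Name": "Please tell me your name.",
    "Phone": "Please tell me your phone number.",
    "Address": "Please tell me your address.",
}
for sentence in scene_flow(targets, segments):
    print(sentence)
```

Sorting on the exact top coordinate treats blocks as being on the same row only when their tops match exactly; a real scan would likely need a row tolerance, which is omitted here for brevity.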

In one of the embodiments, the configuration module is also used to input the business segment sentences of each target text block as the first training set into the initial natural language understanding model for model training to obtain the natural language understanding model, which is used to judge the user intention and capture entity information according to the user sentence; input the scene process information as the second training set into the initial dialogue management model for model training to obtain the dialogue management model, which is used to determine the reply sentence according to the user sentence and the user intention; and configure the chat robot according to the natural language understanding model and the dialogue management model.

For the specific limitations of the device for configuring a chat robot, please refer to the above limitations on the method of configuring a chat robot, which will not be repeated here. The modules in the above device for configuring a chat robot can be implemented in whole or in part by software, hardware, or a combination thereof. The foregoing modules may be embedded in, or independent of, the processor in the computer device in the form of hardware, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.

In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in FIG. 9. The computer equipment includes a processor, a memory, and a network interface connected through a system bus. The processor of the computer device is used to provide calculation and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer readable instructions, and a database. The internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions are executed by the processor to implement a method of configuring a chat robot.

Those skilled in the art can understand that the structure shown in FIG. 9 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied. A specific computer device may include more or fewer parts than shown in the figure, combine some parts, or have a different arrangement of parts.

A computer device includes a memory and one or more processors, and computer-readable instructions are stored in the memory. When the computer-readable instructions are executed by the one or more processors, the steps of the method for configuring a chat robot provided in any embodiment of the present application are implemented.

One or more non-volatile storage media store computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the one or more processors implement the steps of the method for configuring a chat robot provided in any embodiment of the present application.

A person of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiment methods can be implemented by computer-readable instructions instructing the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above-mentioned method embodiments. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. As an illustration and not a limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

The technical features of the above embodiments can be combined arbitrarily. In order to make the description concise, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in a combination of these technical features, the combination should be considered to be within the scope described in this specification.

The above-mentioned embodiments only express several implementation manners of the present application, and the description is relatively specific and detailed, but it should not be understood as a limitation on the scope of the invention patent. It should be pointed out that for those of ordinary skill in the art, without departing from the concept of this application, several modifications and improvements can be made, and these all fall within the protection scope of this application. Therefore, the scope of protection of the patent of this application shall be subject to the appended claims.

Claims

A method of configuring a chat robot, including:

Obtain the scan map of the business form, extract the table feature information in the scan map of the business form, determine the field type of the text blocks in the table feature information, identify the target text blocks that need field filling according to the field type, and establish an association relationship between the text blocks according to the target text blocks that need field filling;

Query a preset data type configuration table according to the target text block requiring field filling, and determine the data type of the field to be filled corresponding to each target text block;

According to the data type of the field to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template, generate the business segment sentence of each target text block; and

According to the location information of each target text block in the table feature information and the business segment sentence of each target text block, a chat robot is configured.
The method according to claim 1, wherein said obtaining a scan map of a business table and extracting table feature information in the scan map of a business table comprises:

Obtain a scanned image of the business form, and preprocess the scanned image of the business form;

According to the trained target detection model, the position information of each text block and the position information of each text symbol in the preprocessed business table scan image are obtained, and the target detection model is obtained by training on sample pictures including text blocks and text symbols;

According to the position information of each of the character blocks and the position information of each of the character symbols, segment the scan map of the business form to obtain a plurality of character block images and character symbol images; and

According to the trained picture classification model, extract the text blocks and text symbols in each text block image and each text symbol image to obtain the table feature information in the business table scan image, the picture classification model being obtained by training on sample pictures including text blocks and text symbols.
The method according to claim 2, wherein the obtaining the position information of each character block and the position information of each character symbol in the preprocessed business table scanning image according to the trained target detection model comprises:

Input the preprocessed scanned image of the business table into the trained target detection model;

Extracting the features of the preprocessed scan map of the business table according to the convolutional neural network in the target detection model; and

According to the fully connected layer in the target detection model and the features of the business table scan map, the position information of each character block and the position information of each word symbol in the preprocessed business table scan map are obtained.
The method according to claim 1, wherein the determining the field type of the text block in the table characteristic information comprises:

Input each text block in the table feature information into the trained classification model to obtain the confidence that each text block belongs to each preset field type;

Determine the coordinate distance between each character block and each character symbol according to the position information of each character block and the position information of each character symbol in the table feature information;

Use each character symbol whose coordinate distance to each character block is within a preset distance threshold range as a character symbol associated with each character block;

According to the association between each text block and each text symbol, modify the confidence that each text block belongs to each preset field type; and

Sort the confidence of each text block belonging to each preset field type, and use the field type with the highest confidence as the field type of each text block.
The method according to claim 1, wherein the field types include a field to be filled, an option field, and a comment field, and the identifying the target text blocks that need field filling according to the field type and establishing the association relationship between the text blocks according to the target text blocks that need field filling comprises:

According to the field type of each text block, determine that the text block whose field type is a field to be filled is the target text block that needs to be filled in;

Determine the distance between each target text block and each option field text block and each comment field text block according to the position information of each text block in the table feature information;

According to the distance between each target text block and each option field text block and each comment field text block, determine the option field text block and the comment field text block corresponding to each target text block; and

Establish an association relationship between each target text block and the corresponding option field text block and note field text block.
The method according to claim 1, wherein the configuring a chat robot according to the position information of each target text block in the table feature information and the business segment sentence of each target text block comprises:

According to the position information of each target text block in the table feature information, determine the filling order of the fields to be filled corresponding to each target text block;

Determine the business process sequence of the business segment statements of each target text block according to the filling order of the fields to be filled corresponding to each target text block;

According to the business process sequence, generate the scenario process information of the business application; and

Perform model training according to the scene process information and the business segment sentences of each target text block, and configure a chat robot.
The method according to claim 6, characterized in that, performing model training based on the scene process information and business segment sentences of each target text block and configuring a chat robot comprises:

The business segment sentences of each target text block are input as the first training set into the initial natural language understanding model for model training to obtain a natural language understanding model, which is used to judge user intentions and capture entity information according to user sentences;

Input the scene process information as the second training set into the initial dialogue management model for model training to obtain the dialogue management model, the dialogue management model being used to determine the reply sentence according to the user sentence and the user intention; and

Configure a chat robot according to the natural language understanding model and the dialogue management model.
A device for configuring chat robots, including:

The acquisition module is used to obtain the scan map of the business form, extract the table feature information in the scan map of the business form, determine the field type of the text blocks in the table feature information, identify the target text blocks that need field filling according to the field type, and establish the association relationship between the text blocks according to the target text blocks that need field filling;

The first processing module is configured to query a preset data type configuration table according to the target text block for which field filling is required, and determine the data type of the field to be filled corresponding to each target text block;

The second processing module is configured to generate business segment sentences for each target text block according to the data type of the field to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template; and

The configuration module is used to configure the chat robot according to the position information of each target text block in the table feature information and the business segment sentence of each target text block.
The device according to claim 8, wherein the acquisition module is further configured to acquire a scan map of the business table, preprocess the scan map of the business table, and obtain, according to the trained target detection model, the position information of each text block and the position information of each text symbol in the preprocessed business table scan map, the target detection model being obtained by training on sample pictures including text blocks and text symbols; segment the scan map of the business table according to the position information of each text block and the position information of each text symbol to obtain multiple text block images and text symbol images; and extract, according to the trained picture classification model, the text blocks and text symbols in each text block image and each text symbol image to obtain the table feature information in the business table scan map, the picture classification model being obtained by training on sample pictures including text blocks and text symbols.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions. When the computer-readable instructions are executed by the one or more processors, the one or more processors perform the following steps:

Obtain the scan map of the business form, extract the table feature information in the scan map of the business form, determine the field type of the text blocks in the table feature information, identify the target text blocks that need field filling according to the field type, and establish an association relationship between the text blocks according to the target text blocks that need field filling;

Query a preset data type configuration table according to the target text block requiring field filling, and determine the data type of the field to be filled corresponding to each target text block;

According to the data type of the field to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template, generate the business segment sentence of each target text block; and

According to the location information of each target text block in the table feature information and the business segment sentence of each target text block, a chat robot is configured.
The computer device according to claim 10, wherein the processor further executes the following steps when executing the computer-readable instruction:

Obtain a scanned image of the business form, and preprocess the scanned image of the business form;

According to the trained target detection model, the position information of each text block and the position information of each text symbol in the preprocessed business table scan image are obtained, and the target detection model is obtained by training on sample pictures including text blocks and text symbols;

According to the position information of each of the character blocks and the position information of each of the character symbols, segment the scan map of the business form to obtain a plurality of character block images and character symbol images; and

According to the trained picture classification model, extract the text blocks and text symbols in each text block image and each text symbol image to obtain the table feature information in the business table scan image, the picture classification model being obtained by training on sample pictures including text blocks and text symbols.
The computer device according to claim 10, wherein the processor further executes the following steps when executing the computer-readable instruction:

Input the preprocessed scanned image of the business table into the trained target detection model;

Extracting the features of the preprocessed scan map of the business table according to the convolutional neural network in the target detection model; and

According to the fully connected layer in the target detection model and the features of the business table scan map, the position information of each character block and the position information of each word symbol in the preprocessed business table scan map are obtained.
The computer device according to claim 10, wherein the processor further executes the following steps when executing the computer-readable instruction:

Input each text block in the table feature information into the trained classification model to obtain the confidence that each text block belongs to each preset field type;

Determine the coordinate distance between each character block and each character symbol according to the position information of each character block and the position information of each character symbol in the table feature information;

Use each character symbol whose coordinate distance to each character block is within a preset distance threshold range as a character symbol associated with each character block;

According to the association between each text block and each text symbol, modify the confidence that each text block belongs to each preset field type; and

Sort the confidence of each text block belonging to each preset field type, and use the field type with the highest confidence as the field type of each text block.
The computer device according to claim 10, wherein the processor further executes the following steps when executing the computer-readable instruction:

According to the field type of each text block, determine that the text block whose field type is a field to be filled is the target text block that needs to be filled in;

Determine the distance between each target text block and each option field text block and each comment field text block according to the position information of each text block in the table feature information;

According to the distance between each target text block and each option field text block and each comment field text block, determine the option field text block and the comment field text block corresponding to each target text block; and

Establish an association relationship between each target text block and the corresponding option field text block and note field text block.
The computer device according to claim 10, wherein the processor further executes the following steps when executing the computer-readable instruction:

According to the position information of each target text block in the table feature information, determine the filling order of the fields to be filled corresponding to each target text block;

Determine the business process sequence of the business segment statements of each target text block according to the filling order of the fields to be filled corresponding to each target text block;

According to the business process sequence, generate the scenario process information of the business application; and

Perform model training according to the scene process information and the business segment sentences of each target text block, and configure a chat robot.
One or more non-volatile computer-readable storage media storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the following steps:

Obtain the scan map of the business form, extract the table feature information in the scan map of the business form, determine the field type of the text blocks in the table feature information, identify the target text blocks that need field filling according to the field type, and establish an association relationship between the text blocks according to the target text blocks that need field filling;

Query a preset data type configuration table according to the target text block requiring field filling, and determine the data type of the field to be filled corresponding to each target text block;

According to the data type of the field to be filled corresponding to each target text block, the association relationship between the text blocks, and the preset sentence template, generate the business segment sentence of each target text block; and

According to the location information of each target text block in the table feature information and the business segment sentence of each target text block, a chat robot is configured.
The storage medium according to claim 16, wherein the following steps are further executed when the computer-readable instructions are executed by the processor:

Obtain a scanned image of the business form, and preprocess the scanned image of the business form;

According to the trained target detection model, the position information of each text block and the position information of each text symbol in the preprocessed business table scan image are obtained, and the target detection model is obtained by training on sample pictures including text blocks and text symbols;

According to the position information of each of the character blocks and the position information of each of the character symbols, segment the scan map of the business form to obtain a plurality of character block images and character symbol images; and

According to the trained picture classification model, extract the text blocks and text symbols in each text block image and each text symbol image to obtain the table feature information in the business table scan image, the picture classification model being obtained by training on sample pictures including text blocks and text symbols.
The storage medium according to claim 16, wherein the following steps are further executed when the computer-readable instructions are executed by the processor:

Input the preprocessed scanned image of the business form into the trained target detection model;

Extract features of the preprocessed scanned image of the business form according to the convolutional neural network in the target detection model; and

According to the fully connected layer in the target detection model and the features of the scanned image of the business form, obtain the position information of each text block and the position information of each text symbol in the preprocessed scanned image of the business form.
The storage medium according to claim 16, wherein the following steps are further executed when the computer-readable instructions are executed by the processor:

Input each text block in the table feature information into the trained classification model to obtain the confidence that each text block belongs to each preset field type;

Determine the coordinate distance between each text block and each text symbol according to the position information of each text block and the position information of each text symbol in the table feature information;

Take each text symbol whose coordinate distance to a text block is within a preset distance threshold range as a text symbol associated with that text block;

According to the association between each text block and each text symbol, modify the confidence that each text block belongs to each preset field type; and

Sort the confidence of each text block belonging to each preset field type, and use the field type with the highest confidence as the field type of each text block.
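The confidence correction and sorting steps above can be sketched as follows. The claims do not say how an associated text symbol changes the confidence, so the additive `SYMBOL_HINTS` rule, the field type names, and the default threshold of 20 are hypothetical choices made purely for illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Hypothetical rule: a nearby symbol boosts the confidence of one field type.
SYMBOL_HINTS: Dict[str, Tuple[str, float]] = {
    ":": ("to_fill", 0.2),
    "□": ("option", 0.3),
    "*": ("comment", 0.2),
}


def coordinate_distance(a: Point, b: Point) -> float:
    """Euclidean distance between two positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def associated_symbols(
    block_pos: Point, symbols: List[Tuple[str, Point]], threshold: float
) -> List[str]:
    """Keep the symbols whose distance to the block is within the threshold."""
    return [
        symbol
        for symbol, pos in symbols
        if coordinate_distance(block_pos, pos) <= threshold
    ]


def determine_field_type(
    confidences: Dict[str, float],
    block_pos: Point,
    symbols: List[Tuple[str, Point]],
    threshold: float = 20.0,
) -> str:
    """Adjust the classifier confidences, sort them, and return the top type."""
    adjusted = dict(confidences)
    for symbol in associated_symbols(block_pos, symbols, threshold):
        hint = SYMBOL_HINTS.get(symbol)
        if hint is not None and hint[0] in adjusted:
            adjusted[hint[0]] += hint[1]
    ranked = sorted(adjusted.items(), key=lambda item: item[1], reverse=True)
    return ranked[0][0]
```

In this sketch, a block that the classifier alone would label as an option field can be relabeled as a field to be filled when a colon sits directly beside it, while a distant checkbox symbol has no effect.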
The storage medium according to claim 16, wherein the following steps are further executed when the computer-readable instructions are executed by the processor:

According to the field type of each text block, determine each text block whose field type is a field to be filled as a target text block that requires field filling;

Determine the distance between each target text block and each option field text block and each comment field text block according to the position information of each text block in the table feature information;

According to the distance between each target text block and each option field text block and each comment field text block, determine the option field text block and the comment field text block corresponding to each target text block; and

Establish an association relationship between each target text block and the corresponding option field text block and comment field text block.
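The distance-based association above can be sketched as follows. The claims only state that the correspondence is determined from the distances; assigning each option or comment field text block to its nearest target text block is an assumption of this sketch, as are the function name and the dictionary layout.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def build_associations(
    targets: Dict[str, Point],
    options: List[Tuple[str, Point]],
    comments: List[Tuple[str, Point]],
) -> Dict[str, Dict[str, List[str]]]:
    """Attach each option/comment block to its nearest target block (assumed rule)."""
    associations: Dict[str, Dict[str, List[str]]] = {
        name: {"options": [], "comments": []} for name in targets
    }
    if not targets:
        return associations

    def distance_to(name: str, pos: Point) -> float:
        tx, ty = targets[name]
        return math.hypot(tx - pos[0], ty - pos[1])

    for kind, blocks in (("options", options), ("comments", comments)):
        for text, pos in blocks:
            # Pick the target text block with the smallest distance to this block.
            owner = min(targets, key=lambda name: distance_to(name, pos))
            associations[owner][kind].append(text)
    return associations
```

A stricter variant could also discard blocks farther away than a maximum distance, which would avoid attaching stray text to an unrelated field.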