CN113051384B - User portrait extraction method based on dialogue and related device - Google Patents

User portrait extraction method based on dialogue and related device

Info

Publication number
CN113051384B
CN113051384B
Authority
CN
China
Prior art keywords
dialogue
target
round
user
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110458709.6A
Other languages
Chinese (zh)
Other versions
CN113051384A (en)
Inventor
孙梓淇
张智
白祚
莫洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority claimed from CN202110458709.6A
Publication of CN113051384A
Application granted
Publication of CN113051384B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

An embodiment of the application provides a dialogue-based user portrait extraction method and a related device, wherein the method comprises the following steps: acquiring a first dialogue sentence of a user and a second dialogue sentence of a service person in any round of a multi-round dialogue; performing entity recognition on the first dialogue sentence and the second dialogue sentence; identifying pronouns in the first dialogue sentence and the second dialogue sentence, and performing reference resolution on the identified pronouns based on entities recorded in a preset data table to obtain a target first dialogue sentence and a target second dialogue sentence; then extracting user portraits from the target first dialogue sentence and the target second dialogue sentence; filtering out the extracted user portraits that belong to the service person, to obtain the user portraits that belong to the user in that round of dialogue; and merging the target user portraits belonging to the user extracted from each round of the multi-round dialogue. The embodiment of the application helps improve the accuracy of user portrait extraction.

Description

User portrait extraction method based on dialogue and related device
Technical Field
The application relates to the technical field of data analysis, in particular to a user portrait extraction method based on dialogue and a related device.
Background
Business processing involves a large number of scenarios in which staff communicate with customers, such as understanding a customer's situation, product and service consultation, and after-sales handling. The dialogue information generated during such communication is of great significance for user mining and business expansion: for example, extracting user portraits from the dialogue information facilitates subsequent personalized recommendation and tracking of the user's service status, and the direction of a conversation can be guided toward further user portrait mining. Traditional user portrait extraction relies mainly on manual effort and hand-written rules to mine dialogue information, extracting labels that reflect certain personal information of a user from the user's answers; however, the user portraits extracted in this way are often incomplete and describe the user with low accuracy.
Disclosure of Invention
To solve the above problems, the application provides a dialogue-based user portrait extraction method and a related device, which help improve the accuracy of user portrait extraction.
To achieve the above object, a first aspect of an embodiment of the present application provides a user portrait extraction method based on a dialogue, including:
acquiring a first dialogue sentence of a user and a second dialogue sentence of a service person in any one dialogue of multiple rounds of dialogue;
Performing entity identification on the first dialogue statement and the second dialogue statement, and recording the identified entities to a preset data table;
identifying pronouns in the first dialogue sentence and the second dialogue sentence, and performing reference resolution on the identified pronouns based on the entity recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence;
user portrayal extraction is carried out on the target first dialogue sentence based on a first preset rule, and user portrayal extraction is carried out on the target second dialogue sentence based on a second preset rule;
filtering the user portraits belonging to the service personnel extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the user in any one round of dialogue, and merging the target user portraits belonging to the user extracted from each round of dialogue of the multiple rounds of dialogue.
With reference to the first aspect, in a possible implementation manner, the performing reference resolution on the identified pronouns based on the entities recorded in the preset data table includes:
acquiring the entity identified in the first-round dialogue from the preset data table under the condition that any one-round dialogue is the first-round dialogue of the multi-round dialogue, and performing reference resolution on the identified pronoun based on the entity identified in the first-round dialogue;
Acquiring the entity identified in the target round of dialogue and the entity identified in the history round of dialogue from the preset data table under the condition that any round of dialogue is the target round of dialogue except the first round of dialogue, and performing reference resolution on the identified pronouns based on the entity identified in the target round of dialogue and the entity identified in the history round of dialogue; wherein the historical round of dialogue is the dialogue before the target round of dialogue in the multi-round dialogue.
With reference to the first aspect, in a possible implementation manner, the extracting, based on a first preset rule, a user portrait of the target first dialogue sentence includes:
detecting sensitive words and business scripts in the target first dialogue sentence to obtain a first candidate rule set;
performing rule matching on the target first dialogue statement by adopting a regular expression to obtain a second candidate rule set;
acquiring an intersection of the first candidate rule set and the second candidate rule set to obtain a third candidate rule set;
and extracting the user portrait in the target first dialogue sentence under the condition that the rule in the third candidate rule set is the first preset rule.
With reference to the first aspect, in a possible implementation manner, the obtaining a first candidate rule set includes:
and, if no sensitive word is detected in the target first dialogue sentence and the target first dialogue sentence does not match any business script, performing rule matching on the target first dialogue sentence with a rule engine based on a multi-slot Huffman Trie, so as to obtain the first candidate rule set.
With reference to the first aspect, in a possible implementation manner, the performing, based on a second preset rule, user portrait extraction on the target second dialogue sentence includes:
detecting sensitive words and business scripts in the target second dialogue sentence to obtain a fourth candidate rule set;
performing rule matching on the target second dialogue statement by adopting a regular expression to obtain a fifth candidate rule set;
acquiring an intersection of the fourth candidate rule set and the fifth candidate rule set to obtain a sixth candidate rule set;
and extracting the user portrait in the target second dialogue sentence under the condition that the rule in the sixth candidate rule set is the second preset rule.
With reference to the first aspect, in a possible implementation manner, after obtaining a user portrait belonging to the user in the arbitrary round of dialogue, the method further includes:
And carrying out conflict detection on user portraits belonging to users in any round of conversations, and determining the target user portraits in any round of conversations by adopting a voting strategy.
A second aspect of an embodiment of the present application provides a dialog-based user portrait extraction apparatus, including:
the dialogue acquisition module is used for acquiring a first dialogue sentence of a user and a second dialogue sentence of a salesman in any one round of dialogue of multiple rounds of dialogue;
the entity identification module is used for carrying out entity identification on the first dialogue statement and the second dialogue statement and recording the identified entity into a preset data table;
the reference resolution module is used for identifying pronouns in the first dialogue sentence and the second dialogue sentence, and performing reference resolution on the identified pronouns based on the entities recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence;
the portrait extraction module is used for extracting user portraits from the target first dialogue statement based on a first preset rule and extracting user portraits from the target second dialogue statement based on a second preset rule;
and the portrait merging module is used for filtering the user portraits belonging to the service staff extracted from the target first dialogue statement and the target second dialogue statement to obtain the user portraits belonging to the users in any round of dialogue, and merging the target user portraits belonging to the users extracted from each round of dialogue of the multiple rounds of dialogue.
A third aspect of the embodiments of the present application provides an electronic device, including an input device and an output device, and further including a processor adapted to implement one or more instructions; and a computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the steps of:
acquiring a first dialogue sentence of a user and a second dialogue sentence of a service person in any one dialogue of multiple rounds of dialogue;
performing entity identification on the first dialogue statement and the second dialogue statement, and recording the identified entities to a preset data table;
identifying pronouns in the first dialogue sentence and the second dialogue sentence, and performing reference resolution on the identified pronouns based on the entity recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence;
user portrayal extraction is carried out on the target first dialogue sentence based on a first preset rule, and user portrayal extraction is carried out on the target second dialogue sentence based on a second preset rule;
filtering the user portraits belonging to the service personnel extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the user in any one round of dialogue, and merging the target user portraits belonging to the user extracted from each round of dialogue of the multiple rounds of dialogue.
A fourth aspect of the embodiments of the present application provides a computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform the steps of:
acquiring a first dialogue sentence of a user and a second dialogue sentence of a service person in any one dialogue of multiple rounds of dialogue;
performing entity identification on the first dialogue statement and the second dialogue statement, and recording the identified entities to a preset data table;
identifying pronouns in the first dialogue sentence and the second dialogue sentence, and performing reference resolution on the identified pronouns based on the entity recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence;
user portrayal extraction is carried out on the target first dialogue sentence based on a first preset rule, and user portrayal extraction is carried out on the target second dialogue sentence based on a second preset rule;
filtering the user portraits belonging to the service personnel extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the user in any one round of dialogue, and merging the target user portraits belonging to the user extracted from each round of dialogue of the multiple rounds of dialogue.
Compared with the prior art, in the embodiments of the application, a first dialogue sentence of a user and a second dialogue sentence of a service person in any round of a multi-round dialogue are acquired; entity recognition is performed on the first dialogue sentence and the second dialogue sentence, and the recognized entities are recorded in a preset data table; pronouns in the first dialogue sentence and the second dialogue sentence are identified, and reference resolution is performed on the identified pronouns based on the entities recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence; user portrait extraction is performed on the target first dialogue sentence based on a first preset rule, and on the target second dialogue sentence based on a second preset rule; the user portraits belonging to the service person extracted from the target first dialogue sentence and the target second dialogue sentence are filtered out to obtain the user portraits belonging to the user in that round of dialogue, and the target user portraits belonging to the user extracted from each round of the multi-round dialogue are merged. In this way, the dialogue sentences of the user and of the service person in the multi-round dialogue are processed through entity recognition and reference resolution, so that pronouns in the dialogue are mapped to their entities. This solves the difficulty, in user portrait extraction based on single-sentence dialogue, of accurately recognizing what entity a pronoun refers to; it makes the dialogue sentences in the multi-round dialogue more complete, raises the chance of extracting a user portrait from a single sentence, extracts more user portraits, and thus depicts a user more accurately. It also facilitates identity judgment on the dialogue sentences obtained after reference resolution, so that the user portraits describing the service person can be filtered out.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a network system architecture according to an embodiment of the present application;
FIG. 2 is a flow chart of a user portrait extraction method based on dialogue according to an embodiment of the present application;
FIG. 3 is a diagram illustrating user portrayal extraction of a target first dialogue sentence according to an embodiment of the present application;
FIG. 4 is a diagram illustrating user portrayal extraction of a target second dialogue sentence according to an embodiment of the present application;
FIG. 5 is a flow chart of another dialog-based user portrait extraction method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a user portrait extraction device based on dialogue according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.
The terms "comprising" and "having" and any variations thereof, as used in the description, claims and drawings, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus. Furthermore, the terms "first," "second," and "third," etc. are used for distinguishing between different objects and not for describing a particular sequential order.
The embodiment of the application provides a dialogue-based user portrait extraction method, which can be implemented on the network system architecture shown in fig. 1. Referring to fig. 1, the network system architecture comprises a terminal and an electronic device connected through wired or wireless network communication. The terminal is a terminal device used by the user and the service person, and may be a mobile phone, a tablet, a computer, a personal digital assistant (PDA), or the like; it provides the electronic device with the dialogue sentences between the user and the service person, which may be real-time dialogue sentences between them, or historical dialogue sentences from log records extracted from a database by a developer. The electronic device comprises at least a communication module and a processing module; the communication module integrates a digital protocol interface through which it acquires the dialogue sentences submitted by the terminal and forwards them to the processing module, and the processing module performs operations on the dialogue sentences such as entity recognition, reference resolution, user portrait extraction, user portrait filtering, and user portrait merging. The electronic device may be an independent physical server, a server cluster or distributed system, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, big data, and artificial intelligence platforms.
Based on the network system architecture shown in fig. 1, the following describes in detail the user portrait extraction method based on dialogue provided by the embodiment of the present application in combination with other drawings.
Referring to fig. 2, fig. 2 is a flowchart of a dialog-based user portrait extraction method according to an embodiment of the present application, where the method is applied to an electronic device, as shown in fig. 2, and includes steps S21-S25:
s21, acquiring a first dialogue sentence of a user and a second dialogue sentence of a service person in any dialogue of multiple rounds.
In the embodiment of the disclosure, the multiple rounds of dialogue may be real-time dialogue generated in business communication between the user and the service person, or dialogue records of the user and the service person extracted from an offline log, such as dialogue records generated in a customer service system or during telephone communication. The service personnel include, but are not limited to, business staff, intelligent dialogue systems, and dialogue robots.
S22, performing entity recognition on the first dialogue sentence and the second dialogue sentence, and recording the recognized entity to a preset data table.
In the embodiment of the disclosure, entity recognition may be performed on the first dialogue sentence and the second dialogue sentence using a keyword dictionary or using a named entity recognition model, and each recognized entity is recorded for use in subsequent entity inheritance and reference resolution.
In a possible implementation manner, the entity identifying the first dialogue sentence and the second dialogue sentence includes:
performing the following operations on any one of the first dialogue sentence and the second dialogue sentence:
word segmentation is carried out on any dialogue sentence to obtain a word sequence; embedding an embedded matrix by using a pre-trained or randomly initialized word to map the word sequence into a word vector sequence; inputting the word vector sequence into a bidirectional LSTM for feature extraction to obtain a feature sequence corresponding to any dialogue sentence; inputting the feature sequence into a CRF (conditional random field ) layer for sentence-level sequence labeling of the word sequence to obtain a tag sequence corresponding to the word sequence, and obtaining an entity in any dialogue sentence based on the tag sequence. For example: for a second dialogue sentence of a salesman, namely 'you have children', mapping word sequences after word segmentation into word vector sequences (x 1, x2, x3, …, x 6), taking the word vector sequences (x 1, x2, x3, …, x 6) as the input of a bidirectional LSTM, splicing the output result of the forward last layer and the output result of the reverse last layer according to the position by the bidirectional LSTM, obtaining corresponding feature sequences (h 1, h2, h3, … h 6), taking the feature sequences (h 1, h2, h3, … h 6) as the input of a CRF layer, carrying out sentence-level sequence labeling on the CRF layer by adopting BIO rules, wherein B represents the beginning of entity words, I represents the interior of the entity words, O represents the entity words not being entity words, the entity words can be predefined words such as children, the old words, and the tag sequences (y 1, y2, y3, …, y 6) output by the CRF layer, and calculating the probability values of each of the feature sequences (x 1, x2, x3, x 6) in the CRF layer are equal to or greater than the preset probability values, and determining that each of the tag words belongs to the second dialogue sentence.
S23, identifying pronouns in the first dialogue sentence and the second dialogue sentence, and performing reference resolution on the identified pronouns based on the entity recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence.
In the embodiment of the disclosure, the target first dialogue sentence refers to the dialogue sentence obtained by performing reference resolution on the pronouns in the first dialogue sentence; similarly, the target second dialogue sentence refers to the dialogue sentence obtained by performing reference resolution on the pronouns in the second dialogue sentence. The preset data table is used for storing the entities identified in each round of the multi-round dialogue, and a keyword dictionary, regular expressions, and the like can be used to identify the pronouns.
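As an illustrative sketch of pronoun identification with a keyword dictionary plus regular expressions (the pronoun list below is an assumption; a real system would use the dictionary configured for its dialogue domain):

```python
# Sketch: spot pronouns with a dictionary-driven regular expression.
import re

PRONOUNS = ["he", "she", "they", "him", "her", "them", "it"]  # assumed dictionary
PRONOUN_RE = re.compile(r"\b(" + "|".join(PRONOUNS) + r")\b", re.IGNORECASE)

def find_pronouns(sentence: str) -> list:
    """Return (pronoun, character offset) pairs found in the sentence."""
    return [(m.group(0), m.start()) for m in PRONOUN_RE.finditer(sentence)]

print(find_pronouns("He is in senior high school."))  # [('He', 0)]
```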
In one possible implementation manner, the performing reference resolution on the identified pronouns based on the entities recorded in the preset data table includes:
acquiring the entity identified in the first-round dialogue from the preset data table under the condition that any one-round dialogue is the first-round dialogue of the multi-round dialogue, and performing reference resolution on the identified pronoun based on the entity identified in the first-round dialogue;
acquiring the entity identified in the target round of dialogue and the entity identified in the history round of dialogue from the preset data table under the condition that any round of dialogue is the target round of dialogue except the first round of dialogue, and performing reference resolution on the identified pronouns based on the entity identified in the target round of dialogue and the entity identified in the history round of dialogue; wherein, the historical round dialog refers to a dialog before the target round dialog in the multiple rounds of dialogs.
It will be appreciated that, when the round of dialogue is the first round, only the entities identified in the first round are recorded in the preset data table; when the round of dialogue is a non-first round (i.e., a target round of dialogue), the entities identified in the target round and in the history rounds before it are recorded in the preset data table. Reference resolution of the pronouns may be performed using a reference resolution model, which pairs each identified pronoun with the entities recorded in the preset data table, computes a score for each pair, and takes the entity in the highest-scoring pair as the antecedent of the pronoun. For example, A: "Do you have children?" B: "He is in senior high school." By pairing the pronoun "he" with the entities in the preset data table, it is found that "he" actually refers to the "child" in A's dialogue sentence, and the target first dialogue sentence obtained after reference resolution may be "The child is in senior high school."
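The following is a minimal sketch of the pairwise scoring scheme just described: every identified pronoun is paired with the entities accumulated in the preset data table (current round plus history rounds), each pair is scored, and the highest-scoring entity is taken as the antecedent. The scoring function here (recency plus a small type prior) is an assumed stand-in for the trained reference resolution model:

```python
# Sketch: pair each pronoun with recorded entities, score, take the max.
from dataclasses import dataclass

@dataclass
class Entity:
    text: str
    round_no: int  # dialogue round in which the entity was recorded

def score(pronoun: str, entity: Entity, current_round: int) -> float:
    recency = 1.0 / (1 + current_round - entity.round_no)  # prefer recent mentions
    person_like = entity.text in {"child", "son", "daughter"}
    type_prior = 0.5 if person_like and pronoun.lower() in {"he", "she"} else 0.0
    return recency + type_prior

def resolve(pronoun: str, table: list, current_round: int) -> Entity:
    return max(table, key=lambda e: score(pronoun, e, current_round))

table = [Entity("Beijing", 1), Entity("child", 2)]  # history + current rounds
print(resolve("he", table, current_round=2).text)   # -> child
```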
S24, user portrait extraction is conducted on the target first dialogue sentence based on a first preset rule, and user portrait extraction is conducted on the target second dialogue sentence based on a second preset rule.
In the embodiment of the disclosure, the first preset rule is a matching rule designed for the identity of the user, and the second preset rule is a matching rule designed for the identity of the salesman.
In a possible implementation manner, as shown in fig. 3, the user portrait extraction on the target first dialogue sentence based on the first preset rule includes steps S31 to S34:
s31, detecting sensitive words and business scripts in the target first dialogue sentence to obtain a first candidate rule set;
s32, carrying out rule matching on the target first dialogue statement by adopting a regular expression to obtain a second candidate rule set;
s33, acquiring an intersection of the first candidate rule set and the second candidate rule set to obtain a third candidate rule set;
s34, extracting the user portrait in the target first dialogue sentence under the condition that the rule in the third candidate rule set is the first preset rule.
As for obtaining the first candidate rule set: if no sensitive word is detected in the target first dialogue sentence and the target first dialogue sentence does not match any business script, rule matching is performed on the target first dialogue sentence with a rule engine based on a multi-slot Huffman Trie, so as to obtain the first candidate rule set.
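A minimal sketch of the candidate-rule-set intersection in steps S31-S34 follows; the rule names and the first-preset-rule inventory are illustrative assumptions:

```python
# Sketch of steps S31-S34: intersect the two candidate rule sets, then
# extract only for rules that are among the (assumed) first preset rules.
trie_rules = {"rule_age", "rule_children", "rule_city"}      # first candidate set
regex_rules = {"rule_children", "rule_city", "rule_income"}  # second candidate set
third = trie_rules & regex_rules                             # third candidate set

FIRST_PRESET_RULES = {"rule_children"}                       # assumed rule inventory
hit = third & FIRST_PRESET_RULES
if hit:
    print("extract user portrait with:", sorted(hit))        # ['rule_children']
```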
In a possible implementation manner, as shown in fig. 4, the user portrait extraction on the target second dialogue sentence based on the second preset rule includes steps S41 to S44:
s41, detecting sensitive words and business scripts in the target second dialogue sentence to obtain a fourth candidate rule set;
s42, carrying out rule matching on the target second dialogue statement by adopting a regular expression to obtain a fifth candidate rule set;
s43, taking an intersection of the fourth candidate rule set and the fifth candidate rule set to obtain a sixth candidate rule set;
s44, extracting the user portrait in the target second dialogue sentence under the condition that the rule in the sixth candidate rule set is the second preset rule.
As for obtaining the fourth candidate rule set: if no sensitive word is detected in the target second dialogue sentence and the target second dialogue sentence does not match any business script, rule matching is performed on the target second dialogue sentence with a rule engine based on a multi-slot Huffman Trie, so as to obtain the fourth candidate rule set.
Specifically, different rules are used to match the user's dialogue sentences and the service person's dialogue sentences. Before rule matching, sensitive-word and business-script detection is performed on the dialogue sentences, because certain preset sensitive words or business scripts are not allowed to contribute to extraction: for example, dialogue sentences beginning with phrases such as "I heard that / I have a friend who / my relative…" would interfere with the extracted user portraits, so they are classified as business scripts and removed. Optionally, regular expressions may be used for the sensitive-word and business-script detection. If the target first dialogue sentence and the target second dialogue sentence contain no sensitive word and do not match any business script, rule matching is first performed with a rule engine based on a multi-slot Huffman Trie. The rule engine predefines rule templates, and the target first dialogue sentence and the target second dialogue sentence are first matched to the corresponding slots; for example, in "winter in Beijing", "winter" hits the time slot and "Beijing" hits the place slot. For each hit slot, the leaf nodes of the corresponding Huffman Trie are searched recursively within the slot to obtain the rule set it contains, and the intersection of the rule sets of all the slots is taken to obtain a candidate rule set. Although the rule engine based on the multi-slot Huffman Trie optimizes rule-matching performance, there is business logic it does not cover, and the rule templates supported by regular expressions have larger coverage; therefore, regular expressions are used to perform rule matching on the dialogue sentence again as supplementary logic. The same rule may appear in the candidate rule sets obtained by the two rounds of rule matching (i.e., the first and second candidate rule sets, or the fourth and fifth candidate rule sets), so their intersection is taken to filter out duplicate rules. Finally, the third candidate rule set is determined to be the rules hit by the target first dialogue sentence, and the sixth candidate rule set the rules hit by the target second dialogue sentence; it is judged whether a rule in the third candidate rule set belongs to the first preset rules, and if so, user portrait extraction is performed on the target first dialogue sentence; likewise, it is judged whether a rule in the sixth candidate rule set belongs to the second preset rules, and if so, user portrait extraction is performed on the target second dialogue sentence.
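The slot-matching idea can be sketched as follows. For brevity, the Huffman-coded Trie used for fast slot lookup is replaced with plain dictionaries, and the slot names, trigger words, and rules are illustrative assumptions:

```python
# Sketch: words hit slots, each hit slot contributes the rule set found
# under it, and the candidate rules are the intersection across slots.
SLOTS = {
    "time":  {"winter": {"rule_season_trip", "rule_winter_clothes"}},
    "place": {"Beijing": {"rule_season_trip", "rule_city_resident"}},
}

def candidate_rules(words: list) -> set:
    per_slot = []
    for triggers in SLOTS.values():
        hit = set()
        for w in words:
            hit |= triggers.get(w, set())  # rules reachable from this slot
        if hit:
            per_slot.append(hit)
    if not per_slot:
        return set()
    out = per_slot[0]
    for s in per_slot[1:]:
        out &= s                           # intersection across hit slots
    return out

print(candidate_rules(["winter", "in", "Beijing"]))  # {'rule_season_trip'}
```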
The target first dialogue sentence and the target second dialogue sentence can also have user portraits extracted using a trained natural language processing model. Taking the target first dialogue sentence as an example, it is preprocessed first, where preprocessing includes, but is not limited to, error correction, simplified/traditional Chinese conversion, and special-symbol handling; the preprocessed target first dialogue sentence is input into the trained natural language processing model for label classification, labels of at least one type of user portrait are obtained, and the corresponding user portraits are obtained from the labels. The user portrait may be classified by gender, marital status, whether a family has been started, and so on. A label is represented by a one-hot vector of at least two dimensions. Taking gender as an example, it is spread over two dimensions: if the user is male, the first dimension is 1 and the second dimension is 0; if the user is female, the second dimension is 1 and the first dimension is 0. As another example, if the business is particularly concerned with whether the user has started a family, the corresponding user portrait can be divided into dimensions such as "unmarried without children" and "married with children". Similar to the target first dialogue sentence, user portraits can also be extracted from the target second dialogue sentence with the natural language processing model, which is not repeated here. In this embodiment, the labels and label dimensions of the user portrait can be formulated independently according to business requirements, making user portrait extraction more flexible.
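A minimal sketch of the one-hot label layout described above (label names and portrait types are illustrative assumptions; real label sets would be formulated from business requirements):

```python
# Sketch: one-hot labels over business-defined dimensions.
GENDER = ["male", "female"]
FAMILY = ["unmarried without children", "married with children"]

def one_hot(value: str, dims: list) -> list:
    return [1 if d == value else 0 for d in dims]

print(one_hot("male", GENDER))                   # [1, 0]
print(one_hot("married with children", FAMILY))  # [0, 1]
```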
S25, filtering the user portraits belonging to the service staff extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the users in any one round of dialogue, and merging the target user portraits belonging to the users extracted from each round of dialogue of the multiple rounds of dialogue.
In the embodiment of the disclosure, the rules for the user's dialogue sentences are relatively loose: as long as the target first dialogue sentence hits a first preset rule, the user portrait extracted from it is considered the user's own portrait. The rules for the service person's dialogue sentences are relatively strict: besides hitting a second preset rule, the target second dialogue sentence is further constrained; if no preset second-person word such as "you" or "your" appears among its subject words, the target second dialogue sentence is considered to describe the service person, and the user portrait extracted from it is the service person's portrait. The service person's user portraits are filtered out so that each single sentence retains only the user's own portraits, and finally the target user portraits belonging to the user extracted from the multiple rounds of dialogue are merged to obtain the user's complete user portrait.
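A minimal sketch of this identity filter follows: a portrait extracted from the service person's sentence is kept only when a second-person word appears among the sentence's subject words. Subject detection is reduced to a supplied word list here, an assumed stand-in for real syntactic analysis:

```python
# Sketch: keep a portrait from the service person's sentence only when the
# subject contains a second-person word (the sentence is about the user).
SECOND_PERSON = {"you", "your"}

def is_about_user(subject_words: list) -> bool:
    return any(w.lower() in SECOND_PERSON for w in subject_words)

portraits = [("has_children", ["you"]), ("agent_is_veteran", ["I"])]
kept = [p for p, subject in portraits if is_about_user(subject)]
print(kept)  # ['has_children']; the portrait about the service person is dropped
```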
In a possible implementation manner, after obtaining the user portraits belonging to the user in the any one round of dialogue, the method further comprises:
and carrying out conflict detection on user portraits belonging to users in any round of conversations, and determining the target user portraits in any round of conversations by adopting a voting strategy.
The target user portrait is the user portrait belonging to the user obtained after conflict detection. For example, if 5 sentences in a certain round of dialogue yield user portraits belonging to the user, of which 3 sentences indicate the user is female and 1 sentence indicates the user is male, then by majority vote the target user portrait in that round of dialogue is female.
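A minimal sketch of the majority-vote conflict resolution (the attribute values are illustrative):

```python
# Sketch: resolve conflicting attribute values within one round by majority.
from collections import Counter

def vote(values: list) -> str:
    return Counter(values).most_common(1)[0][0]

print(vote(["female", "female", "female", "male"]))  # female
```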
It can be seen that, in the embodiment of the application, a first dialogue sentence of a user and a second dialogue sentence of a service person in any round of a multi-round dialogue are acquired; entity recognition is performed on the first dialogue sentence and the second dialogue sentence, and the recognized entities are recorded in a preset data table; pronouns in the first dialogue sentence and the second dialogue sentence are identified, and reference resolution is performed on the identified pronouns based on the entities recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence; user portrait extraction is performed on the target first dialogue sentence based on a first preset rule, and on the target second dialogue sentence based on a second preset rule; the user portraits belonging to the service person extracted from the target first dialogue sentence and the target second dialogue sentence are filtered out to obtain the user portraits belonging to the user in that round of dialogue, and the target user portraits belonging to the user extracted from each round of the multi-round dialogue are merged. In this way, the dialogue sentences of the user and of the service person in the multi-round dialogue are processed through entity recognition and reference resolution, so that pronouns in the dialogue are mapped to their entities. This solves the difficulty, in user portrait extraction based on single-sentence dialogue, of accurately recognizing what entity a pronoun refers to; it makes the dialogue sentences in the multi-round dialogue more complete, raises the chance of extracting a user portrait from a single sentence, extracts more user portraits, and thus depicts a user more accurately. It also facilitates identity judgment on the dialogue sentences obtained after reference resolution, so that the user portraits describing the service person can be filtered out.
Referring to fig. 5, a flowchart of another dialog-based user portrait extraction method according to an embodiment of the present application is shown in fig. 5, which may be implemented based on the network system architecture shown in fig. 1, and includes steps S51-S57:
s51, acquiring a first dialogue sentence of a user and a second dialogue sentence of a salesman in any one round of dialogue of multiple rounds of dialogue;
s52, performing entity recognition on the first dialogue sentence and the second dialogue sentence, and recording the recognized entity to a preset data table;
s53, identifying pronouns in the first dialogue sentence and the second dialogue sentence;
if the round of dialogue is the first round of the multi-round dialogue, step S54 is executed; if the round of dialogue is a target round other than the first round of the multi-round dialogue, step S55 is executed;
s54, acquiring the entity identified in the first-round dialogue from the preset data table, and performing reference resolution on the identified pronouns based on the entity identified in the first-round dialogue to obtain a target first dialogue sentence and a target second dialogue sentence;
s55, acquiring the entity identified in the target round dialogue and the entity identified in the history round dialogue from the preset data table, and performing reference resolution on the identified pronouns based on the entity identified in the target round dialogue and the entity identified in the history round dialogue to obtain a target first dialogue sentence and a target second dialogue sentence;
Wherein, the history round dialogue is the dialogue before the target round dialogue in the multi-round dialogue;
s56, user portrait extraction is carried out on the target first dialogue sentence based on a first preset rule, and user portrait extraction is carried out on the target second dialogue sentence based on a second preset rule;
s57, filtering the user portraits belonging to the business personnel extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the user in any one round of dialogue, and merging the target user portraits belonging to the user extracted from each round of dialogue of the multiple rounds of dialogue.
The specific implementation of steps S51-S57 is described in the embodiment shown in fig. 2, and the same or similar advantages can be achieved, and for avoiding repetition, the description is omitted here.
Based on the above description of the embodiments of the dialogue-based user portrait extraction method, please refer to fig. 6. Fig. 6 is a schematic structural diagram of a dialogue-based user portrait extraction device according to an embodiment of the present application; as shown in fig. 6, the device includes:
a dialogue acquisition module 61, configured to acquire a first dialogue sentence of a user and a second dialogue sentence of a salesman in any one of multiple dialogues;
The entity recognition module 62 is configured to perform entity recognition on the first dialogue sentence and the second dialogue sentence, and record the recognized entity to a preset data table;
a reference resolution module 63, configured to identify pronouns in the first dialogue sentence and the second dialogue sentence, and perform reference resolution on the identified pronouns based on the entities recorded in the preset data table, so as to obtain a target first dialogue sentence and a target second dialogue sentence;
a representation extraction module 64 for user representation extraction of the target first dialogue sentence based on a first preset rule and user representation extraction of the target second dialogue sentence based on a second preset rule;
and the portrait merging module 65 is configured to filter the user portraits belonging to the business person extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the user in any one round of dialogue, and merge the target user portraits belonging to the user extracted from each round of dialogue.
In one possible implementation manner, in terms of reference resolution of the identified pronouns based on the entities recorded in the preset data table, the reference resolution module 63 is specifically configured to:
Acquiring the entity identified in the first-round dialogue from the preset data table under the condition that any one-round dialogue is the first-round dialogue of the multi-round dialogue, and performing reference resolution on the identified pronoun based on the entity identified in the first-round dialogue;
acquiring the entity identified in the target round of dialogue and the entity identified in the history round of dialogue from the preset data table under the condition that any round of dialogue is the target round of dialogue except the first round of dialogue, and performing reference resolution on the identified pronouns based on the entity identified in the target round of dialogue and the entity identified in the history round of dialogue; wherein the historical round of dialogue is the dialogue before the target round of dialogue in the multi-round dialogue.
In one possible implementation, in terms of user portrayal extraction of the target first dialogue sentence based on a first preset rule, the portrayal extraction module 64 is specifically configured to:
detecting sensitive words and business scripts in the target first dialogue sentence to obtain a first candidate rule set;
performing rule matching on the target first dialogue statement by adopting a regular expression to obtain a second candidate rule set;
Acquiring an intersection of the first candidate rule set and the second candidate rule set to obtain a third candidate rule set;
and extracting the user portrait in the target first dialogue sentence under the condition that the rule in the third candidate rule set is the first preset rule.
In one possible implementation, in obtaining the first candidate rule set, the representation extraction module 64 is specifically configured to:
and, if no sensitive word is detected in the target first dialogue sentence and the target first dialogue sentence does not match any business script, performing rule matching on the target first dialogue sentence with a rule engine based on a multi-slot Huffman Trie, so as to obtain the first candidate rule set.
In one possible implementation, in terms of user portrayal extraction of the target second dialogue sentence based on a second preset rule, the portrayal extraction module 64 is specifically configured to:
detecting sensitive words and business scripts in the target second dialogue sentence to obtain a fourth candidate rule set;
performing rule matching on the target second dialogue statement by adopting a regular expression to obtain a fifth candidate rule set;
acquiring an intersection of the fourth candidate rule set and the fifth candidate rule set to obtain a sixth candidate rule set;
And extracting the user portrait in the target second dialogue sentence under the condition that the rule in the sixth candidate rule set is the second preset rule.
In one possible implementation, the image merging module 65 is further configured to:
and carrying out conflict detection on user portraits belonging to users in any round of conversations, and determining the target user portraits in any round of conversations by adopting a voting strategy.
According to one embodiment of the present application, each unit of the dialog-based user portrait extraction device shown in fig. 6 may be separately or completely combined into one or several additional units, or some unit(s) thereof may be further split into a plurality of units with smaller functions, which may achieve the same operation without affecting the implementation of the technical effects of the embodiments of the present application. The above units are divided based on logic functions, and in practical applications, the functions of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the application, the dialog-based user representation extraction device may also include other elements, and in actual practice, these functions may be facilitated by other elements and may be cooperatively implemented by a plurality of elements.
According to another embodiment of the present application, the dialogue-based user portrait extraction device shown in fig. 6 may be constructed by running a computer program (including program code) capable of executing the steps of the methods shown in fig. 2 or fig. 5 on a general-purpose computing device, such as a computer, that includes processing elements and storage elements such as a Central Processing Unit (CPU), a random access memory (RAM), and a read-only memory (ROM), thereby implementing the dialogue-based user portrait extraction method of the embodiments of the present application. The computer program may be recorded on, for example, a computer-readable recording medium, loaded into the above computing device via the computer-readable recording medium, and executed therein.
Based on the description of the method embodiment and the device embodiment, the embodiment of the application also provides electronic equipment. Referring to fig. 7, the electronic device includes at least a processor 71, an input device 72, an output device 73, and a computer storage medium 74. Wherein the processor 71, input device 72, output device 73, and computer storage medium 74 within the electronic device may be coupled by a bus or other means.
The computer storage medium 74 may be stored in a memory of the electronic device; the computer storage medium 74 is configured to store a computer program comprising program instructions, and the processor 71 is configured to execute the program instructions stored by the computer storage medium 74. The processor 71, i.e., the CPU (Central Processing Unit), is the computing core and control core of the electronic device, adapted to implement one or more instructions, and in particular to load and execute one or more instructions to implement a corresponding method flow or corresponding function.
In one embodiment, the processor 71 of the electronic device provided by embodiments of the present application may be configured to perform a series of dialogue-based user portrait extraction steps:
acquiring a first dialogue sentence of a user and a second dialogue sentence of a service person in any one dialogue of multiple rounds of dialogue;
performing entity identification on the first dialogue statement and the second dialogue statement, and recording the identified entities to a preset data table;
identifying pronouns in the first dialogue sentence and the second dialogue sentence, and performing reference resolution on the identified pronouns based on the entity recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence;
user portrayal extraction is carried out on the target first dialogue sentence based on a first preset rule, and user portrayal extraction is carried out on the target second dialogue sentence based on a second preset rule;
filtering the user portraits belonging to the service personnel extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the user in any one round of dialogue, and merging the target user portraits belonging to the user extracted from each round of dialogue of the multiple rounds of dialogue.
In yet another embodiment, the processor 71 performs the reference resolution of the identified pronouns based on the entities recorded in the preset data table, including:
acquiring the entity identified in the first-round dialogue from the preset data table under the condition that any one-round dialogue is the first-round dialogue of the multi-round dialogue, and performing reference resolution on the identified pronoun based on the entity identified in the first-round dialogue;
acquiring the entity identified in the target round of dialogue and the entity identified in the history round of dialogue from the preset data table under the condition that any round of dialogue is the target round of dialogue except the first round of dialogue, and performing reference resolution on the identified pronouns based on the entity identified in the target round of dialogue and the entity identified in the history round of dialogue; wherein the historical round of dialogue is the dialogue before the target round of dialogue in the multi-round dialogue.
In yet another embodiment, the processor 71 performs the user portrayal extraction on the target first dialogue sentence based on a first preset rule, including:
detecting sensitive words and business scripts in the target first dialogue sentence to obtain a first candidate rule set;
Performing rule matching on the target first dialogue statement by adopting a regular expression to obtain a second candidate rule set;
acquiring an intersection of the first candidate rule set and the second candidate rule set to obtain a third candidate rule set;
and extracting the user portrait in the target first dialogue sentence under the condition that the rule in the third candidate rule set is the first preset rule.
In yet another embodiment, processor 71 executes the deriving the first candidate rule set, comprising:
and, if no sensitive word is detected in the target first dialogue sentence and the target first dialogue sentence does not match any business script, performing rule matching on the target first dialogue sentence with a rule engine based on a multi-slot Huffman Trie, so as to obtain the first candidate rule set.
In yet another embodiment, the processor 71 performs the user portrayal extraction on the target second dialogue sentence based on the second preset rule, including:
detecting sensitive words and business scripts in the target second dialogue sentence to obtain a fourth candidate rule set;
performing rule matching on the target second dialogue statement by adopting a regular expression to obtain a fifth candidate rule set;
Acquiring an intersection of the fourth candidate rule set and the fifth candidate rule set to obtain a sixth candidate rule set;
and extracting the user portrait in the target second dialogue sentence under the condition that the rule in the sixth candidate rule set is the second preset rule.
In a further embodiment, after obtaining a user portrait belonging to the user in the any round of dialogue, the processor 71 is further configured to:
and carrying out conflict detection on user portraits belonging to users in any round of conversations, and determining the target user portraits in any round of conversations by adopting a voting strategy.
By way of example, electronic devices include, but are not limited to, a processor 71, an input device 72, an output device 73, and a computer storage medium 74. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of an electronic device and is not limiting of an electronic device, and may include more or fewer components than shown, or certain components may be combined, or different components.
It should be noted that, since the steps of the above-described dialog-based user portrait extraction method are implemented when the processor 71 of the electronic device executes the computer program, all embodiments of the above-described dialog-based user portrait extraction method are applicable to the electronic device and achieve the same or similar advantages.
The embodiment of the application also provides a computer storage medium (memory), which is a storage device in the electronic device used to store programs and data. It will be appreciated that the computer storage medium here may include both a storage medium built into the terminal and an extended storage medium supported by the terminal. The computer storage medium provides a storage space that stores the operating system of the terminal. One or more instructions suited to be loaded and executed by the processor 71 are also stored in this storage space; these may be one or more computer programs (including program code). The computer storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk memory; optionally, it may also be at least one computer storage medium located remotely from the aforementioned processor 71. In one embodiment, one or more instructions stored in the computer storage medium may be loaded and executed by the processor 71 to implement the corresponding steps of the dialog-based user portrait extraction method described above.
The computer program of the computer storage medium may illustratively include computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on.
It should be noted that, since the steps in the dialog-based user portrait extraction method described above are implemented when the computer program of the computer storage medium is executed by the processor, all embodiments of the dialog-based user portrait extraction method described above are applicable to the computer storage medium, and achieve the same or similar advantages.
The foregoing describes the embodiments of the present application in detail. Specific examples are used herein to explain the principles and implementations of the application, and the above description is intended only to help in understanding the method and its core ideas. Meanwhile, those skilled in the art may, following the ideas of the present application, make changes to the specific implementations and the scope of application; in view of the above, the contents of this description should not be construed as limiting the present application.

Claims (8)

1. A dialog-based user portrait extraction method, the method comprising:
acquiring a first dialogue sentence of a user and a second dialogue sentence of a service person in any round of dialogue of a multi-round dialogue;
performing entity identification on the first dialogue sentence and the second dialogue sentence, and recording the identified entities into a preset data table;
identifying pronouns in the first dialogue sentence and the second dialogue sentence, and performing reference resolution on the identified pronouns based on the entities recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence; wherein the reference resolution is performed with a reference resolution model that scores each pair formed by matching an identified pronoun with an entity recorded in the preset data table and takes the entity in the highest-scoring pair as the antecedent of the pronoun;
performing user portrait extraction on the target first dialogue sentence based on a first preset rule, and performing user portrait extraction on the target second dialogue sentence based on a second preset rule;
filtering out the user portraits belonging to the service person from those extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the user in the any round of dialogue, and merging the target user portraits belonging to the user extracted from each round of the multi-round dialogue;
wherein the user portrait extraction on the target first dialogue sentence based on the first preset rule comprises:
detecting sensitive words and business-script terms in the target first dialogue sentence to obtain a first candidate rule set; performing rule matching on the target first dialogue sentence with regular expressions to obtain a second candidate rule set; taking the intersection of the first candidate rule set and the second candidate rule set to obtain a third candidate rule set; and extracting the user portrait from the target first dialogue sentence in the case that a rule in the third candidate rule set is the first preset rule;
and wherein the user portrait extraction on the target second dialogue sentence based on the second preset rule comprises:
detecting sensitive words and business-script terms in the target second dialogue sentence to obtain a fourth candidate rule set; performing rule matching on the target second dialogue sentence with regular expressions to obtain a fifth candidate rule set; taking the intersection of the fourth candidate rule set and the fifth candidate rule set to obtain a sixth candidate rule set; and extracting the user portrait from the target second dialogue sentence in the case that a rule in the sixth candidate rule set is the second preset rule.
2. The method of claim 1, wherein the performing reference resolution on the identified pronouns based on the entities recorded in the preset data table comprises:
in the case that the any round of dialogue is the first round of the multi-round dialogue, acquiring the entities identified in the first round of dialogue from the preset data table, and performing reference resolution on the identified pronouns based on the entities identified in the first round of dialogue;
in the case that the any round of dialogue is a target round of dialogue other than the first round, acquiring the entities identified in the target round of dialogue and the entities identified in the historical rounds of dialogue from the preset data table, and performing reference resolution on the identified pronouns based on both; wherein the historical rounds of dialogue are the rounds preceding the target round in the multi-round dialogue.
3. The method of claim 1, wherein the obtaining of the first candidate rule set comprises:
in the case that no sensitive word is detected in the target first dialogue sentence and the target first dialogue sentence does not match the business script, performing rule matching on the target first dialogue sentence with a rule engine based on a multi-slot Huffman Trie to obtain the first candidate rule set.
4. The method according to any one of claims 1-3, wherein, after obtaining the user portraits belonging to the user in the any round of dialogue, the method further comprises:
performing conflict detection on the user portraits belonging to the user in the any round of dialogue, and determining the target user portrait in the any round of dialogue by a voting strategy.
5. A dialog-based user portrait extraction device, the device comprising:
a dialogue acquisition module, configured to acquire a first dialogue sentence of a user and a second dialogue sentence of a service person in any round of dialogue of a multi-round dialogue;
an entity identification module, configured to perform entity identification on the first dialogue sentence and the second dialogue sentence, and to record the identified entities into a preset data table;
a reference resolution module, configured to identify pronouns in the first dialogue sentence and the second dialogue sentence, and to perform reference resolution on the identified pronouns based on the entities recorded in the preset data table to obtain a target first dialogue sentence and a target second dialogue sentence; wherein the reference resolution is performed with a reference resolution model that scores each pair formed by matching an identified pronoun with an entity recorded in the preset data table and takes the entity in the highest-scoring pair as the antecedent of the pronoun;
a portrait extraction module, configured to perform user portrait extraction on the target first dialogue sentence based on a first preset rule and to perform user portrait extraction on the target second dialogue sentence based on a second preset rule;
a portrait merging module, configured to filter out the user portraits belonging to the service person from those extracted from the target first dialogue sentence and the target second dialogue sentence to obtain the user portraits belonging to the user in the any round of dialogue, and to merge the target user portraits belonging to the user extracted from each round of the multi-round dialogue;
wherein, in terms of the user portrait extraction on the target first dialogue sentence based on the first preset rule, the portrait extraction module is specifically configured to:
detect sensitive words and business-script terms in the target first dialogue sentence to obtain a first candidate rule set; perform rule matching on the target first dialogue sentence with regular expressions to obtain a second candidate rule set; take the intersection of the first candidate rule set and the second candidate rule set to obtain a third candidate rule set; and extract the user portrait from the target first dialogue sentence in the case that a rule in the third candidate rule set is the first preset rule;
and wherein, in terms of the user portrait extraction on the target second dialogue sentence based on the second preset rule, the portrait extraction module is specifically configured to:
detect sensitive words and business-script terms in the target second dialogue sentence to obtain a fourth candidate rule set; perform rule matching on the target second dialogue sentence with regular expressions to obtain a fifth candidate rule set; take the intersection of the fourth candidate rule set and the fifth candidate rule set to obtain a sixth candidate rule set; and extract the user portrait from the target second dialogue sentence in the case that a rule in the sixth candidate rule set is the second preset rule.
6. The apparatus of claim 5, wherein, in terms of performing reference resolution on the identified pronouns based on the entities recorded in the preset data table, the reference resolution module is specifically configured to:
acquire, in the case that the any round of dialogue is the first round of the multi-round dialogue, the entities identified in the first round of dialogue from the preset data table, and perform reference resolution on the identified pronouns based on the entities identified in the first round of dialogue;
acquire, in the case that the any round of dialogue is a target round of dialogue other than the first round, the entities identified in the target round of dialogue and the entities identified in the historical rounds of dialogue from the preset data table, and perform reference resolution on the identified pronouns based on both; wherein the historical rounds of dialogue are the rounds preceding the target round in the multi-round dialogue.
7. An electronic device comprising an input device and an output device, further comprising:
a processor adapted to implement one or more instructions; and
a computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the method of any one of claims 1-4.
8. A computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform the method of any one of claims 1-4.
CN202110458709.6A 2021-04-26 2021-04-26 User portrait extraction method based on dialogue and related device Active CN113051384B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110458709.6A CN113051384B (en) 2021-04-26 2021-04-26 User portrait extraction method based on dialogue and related device


Publications (2)

Publication Number Publication Date
CN113051384A CN113051384A (en) 2021-06-29
CN113051384B CN113051384B (en) 2023-09-19

Family

ID=76520534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110458709.6A Active CN113051384B (en) 2021-04-26 2021-04-26 User portrait extraction method based on dialogue and related device

Country Status (1)

Country Link
CN (1) CN113051384B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115618877A (en) * 2022-12-20 2023-01-17 北京仁科互动网络技术有限公司 User portrait label determination method and device and electronic equipment
CN117556802B (en) * 2024-01-12 2024-04-05 碳丝路文化传播(成都)有限公司 User portrait method, device, equipment and medium based on large language model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110377715A (en) * 2019-07-23 2019-10-25 天津汇智星源信息技术有限公司 Reasoning type accurate intelligent answering method based on legal knowledge map
CN111914076A (en) * 2020-08-06 2020-11-10 平安科技(深圳)有限公司 User image construction method, system, terminal and storage medium based on man-machine conversation
CN112183060A (en) * 2020-09-28 2021-01-05 重庆工商大学 Reference resolution method of multi-round dialogue system
CN112231556A (en) * 2020-10-13 2021-01-15 中国平安人寿保险股份有限公司 User image drawing method, device, equipment and medium based on conversation scene

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9361884B2 (en) * 2013-03-11 2016-06-07 Nuance Communications, Inc. Communicating context across different components of multi-modal dialog applications


Also Published As

Publication number Publication date
CN113051384A (en) 2021-06-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant