CN112989822B - Method, device, electronic equipment and storage medium for recognizing sentence categories in conversation - Google Patents

Info

Publication number
CN112989822B
Authority
CN
China
Prior art keywords
sentence
target
sentences
probability
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110412241.7A
Other languages
Chinese (zh)
Other versions
CN112989822A (en)
Inventor
陈佳豪
丁文彪
刘子韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Century TAL Education Technology Co Ltd
Original Assignee
Beijing Century TAL Education Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Century TAL Education Technology Co Ltd filed Critical Beijing Century TAL Education Technology Co Ltd
Priority to CN202110412241.7A
Publication of CN112989822A
Application granted
Publication of CN112989822B

Classifications

    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 40/216 Parsing using statistical methods
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Abstract

An embodiment of the present application provides a method, an apparatus, an electronic device, and a storage medium for recognizing sentence categories in a dialog. The method comprises: obtaining a dialog text comprising at least two sentences; for each sentence in the dialog text, concatenating speaker information indicating the speaker of the sentence with the sentence to generate a first sentence corresponding to the sentence; obtaining a sentence sequence comprising the first sentence corresponding to each sentence in the dialog text; acquiring context information of a target sentence to be recognized in the dialog text from the sentence sequence, wherein the context information comprises the first sentence corresponding to a preceding sentence and/or a following sentence of the target sentence; and identifying the category of the target sentence according to the first sentence corresponding to the target sentence and the context information. The method and the apparatus can improve the accuracy of recognizing sentence categories in a dialog.

Description

Method, device, electronic equipment and storage medium for recognizing sentence categories in conversation
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for recognizing sentence categories in a dialog, an electronic device, and a storage medium.
Background
With the continuous development of computer technology, speech recognition is widely applied in education, medical care, industrial production, and other fields. When speech recognition is performed in a multi-speaker dialog scenario, recognizing the category of each sentence in the dialog can improve the accuracy of speech recognition, and other functions can be realized based on the recognition result. For example, in a classroom teaching scenario, recognizing whether a sentence spoken by a teacher in a teacher-student dialog is a question, encouragement, humor, or another category makes it possible to determine more accurately what the teacher intends to express.
At present, to recognize the category of a sentence in a dialog, the sentence to be recognized is input into a classification model trained in advance, and the category of the sentence is determined according to the output of the classification model.
Because this method directly recognizes the category of a single sentence in isolation, little reference information is available during recognition, and the accuracy of sentence-category recognition is therefore low.
Disclosure of Invention
In view of this, embodiments of the present application provide a method, an apparatus, an electronic device, and a storage medium for recognizing a sentence category in a dialog, which can improve accuracy of recognizing the sentence category in the dialog.
In a first aspect, a method for identifying a sentence category in a dialog provided by an embodiment of the present application includes:
obtaining a dialog text comprising at least two sentences;
for each sentence in the dialog text, concatenating speaker information indicating the speaker of the sentence with the sentence to generate a first sentence corresponding to the sentence;
obtaining a sentence sequence comprising the first sentence corresponding to each sentence in the dialog text;
acquiring context information of a target sentence to be recognized in the dialog text from the sentence sequence, wherein the context information comprises the first sentence corresponding to a preceding sentence and/or a following sentence of the target sentence;
and identifying the category of the target sentence according to the first sentence corresponding to the target sentence and the context information.
In a second aspect, an embodiment of the present application further provides an apparatus for recognizing a category of a sentence in a dialog, including:
a text acquisition module for acquiring a dialog text including at least two sentences;
an information concatenation module, configured to concatenate, for each sentence in the dialog text acquired by the text acquisition module, speaker information indicating the speaker of the sentence with the sentence, to generate a first sentence corresponding to the sentence;
a sequence acquisition module, configured to acquire a sentence sequence including each of the first sentences generated by the information concatenation module;
a context acquisition module, configured to acquire context information of a target sentence to be recognized in the dialog text from the sentence sequence acquired by the sequence acquisition module, where the context information includes the first sentence corresponding to a preceding sentence and/or a following sentence of the target sentence;
and a category identification module, configured to identify the category of the target sentence according to the first sentence corresponding to the target sentence generated by the information concatenation module and the context information acquired by the context acquisition module.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor and a memory, the processor being connected to the memory, the memory storing a computer program, the processor being configured to execute the computer program to implement the method for recognizing a category of a sentence in a dialog provided by the first aspect.
In a fourth aspect, an embodiment of the present application further provides a computer storage medium, including: the computer storage medium stores a computer program that, when executed by a processor, implements the method for recognizing sentence categories in a dialog provided by the first aspect.
According to the technical solution described above, each sentence in the dialog text is concatenated with its speaker information to obtain a first sentence corresponding to each sentence; for any target sentence to be recognized in the dialog text, the first sentences corresponding to its preceding and/or following sentences are obtained as context information, and the category of the target sentence is then identified according to the first sentence corresponding to the target sentence and the context information. Because the first sentence corresponding to the target sentence includes the speaker information, and the context information includes preceding and/or following sentences of the target sentence, identifying the category of the target sentence can refer not only to the content of the target sentence but also to its speaker and surrounding sentences. Since both the speaker and the context of a sentence can reflect its category, more information is available for reference, and the accuracy of recognizing the sentence category is improved.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and that those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart illustrating a method for identifying sentence categories in a dialog according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of a method for identifying sentence categories in a dialog according to a second embodiment of the present application;
fig. 3 is a schematic diagram of converting sentences into sentence vectors according to the second embodiment of the present application;
FIG. 4 is a diagram of an apparatus for recognizing sentence categories in a dialog according to a third embodiment of the present application;
fig. 5 is a schematic view of an electronic device according to a fourth embodiment of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions in the embodiments of the present application, these solutions will be described clearly and completely below with reference to the drawings in the embodiments. It is obvious that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application shall fall within the scope of protection of the embodiments of the present application.
Example one
Fig. 1 is a flowchart of a method for recognizing sentence categories in a dialog according to an embodiment of the present application. Referring to fig. 1, a method for recognizing sentence categories in a dialog provided in an embodiment of the present application includes the following steps:
101. a dialog text comprising at least two sentences is obtained.
The dialog text is a textual representation of a multi-speaker dialog. It comprises a plurality of sentences, each of which is a textual representation of what one speaker said; different sentences may come from the same or different speakers, but taken together the sentences in the dialog text represent the speech of at least two different speakers.
The multi-speaker dialog may be a video dialog, an audio dialog, or a text dialog. For example, a video dialog is one in which a teacher and students talk by video in an online classroom scenario, an audio dialog is one in which several people talk by telephone, and a text dialog is one conducted directly in text via e-mail, an instant messaging program, or the like. For a text dialog, each speaker corresponds to one data source; the text data of the dialog is acquired from each data source, and the acquired text data is combined to obtain a dialog text comprising a plurality of sentences. For a video or audio dialog, each speaker likewise corresponds to one data source; the video or audio data is acquired from each data source, speech recognition is performed on it to obtain text data, and the text data is then combined to obtain a dialog text comprising a plurality of sentences.
It should be noted that, for a video dialog, if the acquired video data includes subtitles, the subtitle text in the video data may be directly combined to obtain the dialog text without performing speech recognition on the video data.
102. For each sentence in the dialog text, concatenate speaker information indicating the speaker of the sentence with the sentence to generate a first sentence corresponding to the sentence.
When the dialog text is acquired, the speaker information of each sentence may also be acquired; the speaker information indicates the speaker of the corresponding sentence. When the speaker information is concatenated with a sentence to obtain the first sentence, the speaker information may be placed either before or after the sentence.
In one possible implementation, the speaker information is placed before the sentence, and the two are concatenated in the format "speaker: sentence". For example, if a sentence in the dialog text is "As for this matter, I will go and ask Zhao Liu in person" and its speaker is "Li Si", the first sentence obtained by concatenating the speaker information with the sentence is "Li Si: As for this matter, I will go and ask Zhao Liu in person".
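The concatenation described above can be sketched as follows (a minimal illustration; the function name and sample sentence are illustrative, not taken from the patent):

```python
def make_first_sentence(speaker: str, sentence: str) -> str:
    # Concatenate speaker information with the sentence in the
    # "speaker: sentence" format, so every first sentence begins
    # with the speaker information.
    return f"{speaker}: {sentence}"

first = make_first_sentence("Li Si", "As for this matter, I will go and ask Zhao Liu in person.")
print(first)  # Li Si: As for this matter, I will go and ask Zhao Liu in person.
```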
103. A sentence sequence is obtained that includes a first sentence corresponding to each sentence in the dialog text.
After each sentence in the dialog text is concatenated with its speaker information to obtain the corresponding first sentences, the first sentences are sorted according to the order of the sentences, yielding a sentence sequence comprising the sorted first sentences. For example, if the dialog text includes 5 sentences, sentence 1 to sentence 5, which correspond to first sentence 1 to first sentence 5 respectively and were expressed in the order sentence 1-sentence 2-sentence 3-sentence 4-sentence 5, the sentence sequence is first sentence 1-first sentence 2-first sentence 3-first sentence 4-first sentence 5.
104. Acquire context information of the target sentence to be recognized in the dialog text from the sentence sequence, where the context information comprises the first sentence corresponding to a preceding sentence and/or a following sentence of the target sentence.
For any target sentence in the dialog text whose category needs to be identified, a preceding sentence and/or a following sentence of the target sentence is determined from the dialog text, and the first sentence corresponding to the determined preceding and/or following sentence is then obtained from the sentence sequence as the context information of the target sentence.
When the context information of the target sentence is obtained from the sentence sequence, only the first sentences corresponding to preceding sentences of the target sentence may be obtained as its context information, only the first sentences corresponding to following sentences may be obtained, or first sentences corresponding to both preceding and following sentences may be obtained. For example, the 3 first sentences before the first sentence corresponding to the target sentence in the sentence sequence may serve as the context information; or the 5 first sentences after it; or the 3 first sentences before it together with the 4 first sentences after it.
The first sentences serving as the context information of the target sentence may include first sentences adjacent, in the sentence sequence, to the first sentence corresponding to the target sentence, and may also include first sentences that are not adjacent to it. For example, the 7 first sentences before and the 7 first sentences after the first sentence corresponding to the target sentence in the sentence sequence may be determined as the context information of the target sentence.
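The selection of preceding and/or following first sentences can be sketched as follows (a simplified illustration; the list representation, the function name, and the default window sizes are assumptions, not part of the patent):

```python
def get_context(sequence, target_idx, before=3, after=4):
    """Return up to `before` first sentences preceding the target
    and up to `after` first sentences following it as the context
    information of the target sentence."""
    preceding = sequence[max(0, target_idx - before):target_idx]
    following = sequence[target_idx + 1:target_idx + 1 + after]
    return preceding + following

seq = ["fs1", "fs2", "fs3", "fs4", "fs5"]
print(get_context(seq, 2, before=1, after=2))  # ['fs2', 'fs4', 'fs5']
```

Setting `before=0` or `after=0` yields the preceding-only or following-only variants described above.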
105. And identifying the category of the target sentence according to the first sentence corresponding to the target sentence and the context information.
After the first sentence corresponding to the target sentence and the context information of the target sentence are obtained, the category of the target sentence is identified from them, producing a recognition result. The recognition result indicates either which of multiple categories the target sentence belongs to, or whether the target sentence belongs to a target category. For example, when recognizing whether a sentence is humorous, the target category is the humor category, and the recognition result is that the target sentence either belongs or does not belong to the humor category.
It should be noted that, when the category of the target sentence is identified according to the first sentence corresponding to the target sentence and the context information, input data may be constructed according to the first sentence corresponding to the target sentence and the context information, the input data is input into a pre-constructed classification model, and then the category of the target sentence is determined according to the output of the classification model.
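The embodiment does not fix the exact form of the input data. One common convention, shown here purely as an assumption, is to join the context first sentences and the target first sentence with a separator token before feeding the result to the classification model:

```python
def build_model_input(target_first_sentence, context, sep="[SEP]"):
    # Join the context first sentences and the target first sentence
    # with a separator token; the classification model then scores
    # the resulting string. The "[SEP]" token is an assumption
    # borrowed from common text-classifier conventions.
    return f" {sep} ".join(context + [target_first_sentence])

x = build_model_input(
    "Li Si: As for this matter, I will go and ask Zhao Liu in person.",
    ["Li Si: Stop making a fuss."],
)
print(x)
```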
In the method for recognizing sentence categories in a dialog provided by this embodiment, each sentence in the dialog text is concatenated with its speaker information to obtain a corresponding first sentence; for any target sentence to be recognized, the first sentences corresponding to its preceding and/or following sentences are obtained as context information, and the category of the target sentence is identified according to the first sentence corresponding to the target sentence and the context information. Because the first sentence corresponding to the target sentence includes the speaker information, and the context information includes preceding and/or following sentences of the target sentence, the identification can refer not only to the content of the target sentence but also to its speaker and surrounding sentences, both of which can reflect the category of a sentence. With more information available for reference, the accuracy of recognizing sentence categories is improved.
Example two
In the method for recognizing sentence categories in a dialog according to the first embodiment, when the context information of the target sentence is acquired, a plurality of sentence groups may be determined from the sentence sequence through a plurality of windows, so that each sentence group includes one first sentence or a plurality of adjacent first sentences, and each determined sentence group serves as context information of the target sentence. Taking this multi-window determination of context information as an example, the method for recognizing sentence categories in a dialog provided by the embodiments of the present application is described in further detail below.
Fig. 2 is a flowchart of a method for identifying a sentence category in a dialog according to a second embodiment of the present application. Referring to fig. 2, a method for recognizing sentence categories in a dialog according to an embodiment of the present application includes the following steps:
201. a dialog text comprising at least two sentences is obtained.
The dialog text is the content of a multi-speaker dialog and comprises a plurality of sentences; each sentence is content expressed by one speaker, and different sentences may be expressed by different or the same speakers. If the multi-speaker dialog is conducted by video or audio, the dialog text comprising text sentences is obtained by performing speech recognition on the video or audio; if it is conducted by sending text, the dialog text is obtained directly.
Table 1 below shows an example of a dialog text including 5 sentences, which are the contents of a dialog of three speakers.
TABLE 1
(Table 1 is provided as an image in the original publication; it lists the sequence number, speaker, and content of each of the 5 sentences.)
202. Concatenate each sentence in the dialog text with its corresponding speaker information to obtain a first sentence.
For each sentence in the dialog text, speaker information indicating the speaker of the sentence is determined, for example, the name of the speaker, and the determined speaker information is then concatenated with the sentence to obtain the first sentence corresponding to the sentence.
In one possible implementation, when a sentence is concatenated with the speaker information, the speaker information is placed before the sentence content, that is, concatenation follows the format "speaker: sentence content". Because different sentences may have different lengths, placing the speaker information before the sentence content ensures that every first sentence begins with the speaker information, which makes the first sentences easier to process when sentence categories are subsequently identified.
For example, for the 5 sentences in Table 1 above, the first sentence 1 obtained by concatenating the sentence with sequence number 1 and its speaker information is "Zhang San: A was not admitted anywhere"; the first sentence 2, obtained from the sentence with sequence number 2, is "Li Si: Stop making a fuss"; the first sentence 3, obtained from the sentence with sequence number 3, is "Li Si: As for this matter, I will go and ask Zhao Liu in person"; the first sentence 4, obtained from the sentence with sequence number 4, is "Li Si: I have rich experience in asking questions"; and the first sentence 5, obtained from the sentence with sequence number 5, is "Wang Wu: So serious? Then put all your experience in this area to use".
203. And sequencing the first sentences corresponding to the sentences according to the sequence of the sentences in the dialog text to obtain a sentence sequence.
Each sentence in the dialog text has a timestamp which characterizes the time at which the respective sentence was expressed. After the first sentences corresponding to the sentences in the dialog text are obtained, the first sentences corresponding to the sentences can be sequenced according to the time stamps of the sentences in the dialog text and the sequence of the sentences, so that a sentence sequence comprising the sequenced first sentences is obtained.
For a multi-person conversation over video or audio, text content and timestamps may be obtained when speech recognition is performed on the video or audio. For a multi-person conversation by sending text, the sent text carries a timestamp of the sent time.
For example, in table 1 above, the sequence number is the order in which the corresponding sentences are expressed, that is, the order of 5 sentences is sentence 1 (sentence with sequence number 1) -sentence 2 (sentence with sequence number 2) -sentence 3 (sentence with sequence number 3) -sentence 4 (sentence with sequence number 4) -sentence 5 (sentence with sequence number 5), and then according to the order of 5 sentences, the first sentences corresponding to 5 sentences are sorted to obtain a sentence sequence, and the sorting result of 5 first sentences in the sentence sequence is first sentence 1-first sentence 2-first sentence 3-first sentence 4-first sentence 5.
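The timestamp-based ordering of step 203 can be sketched as follows (the field names and sample utterances are illustrative, not from the patent):

```python
utterances = [
    {"speaker": "Li Si",     "text": "Stop making a fuss.", "ts": 12.0},
    {"speaker": "Zhang San", "text": "A was not admitted anywhere.", "ts": 5.5},
]

# Sort the utterances by the time they were expressed, then build the
# sentence sequence from the speaker-prefixed first sentences.
ordered = sorted(utterances, key=lambda u: u["ts"])
sequence = [f'{u["speaker"]}: {u["text"]}' for u in ordered]
print(sequence[0])  # Zhang San: A was not admitted anywhere.
```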
204. And acquiring at least one sentence group corresponding to the target sentence from the sentence sequence.
After the sentence sequence is obtained, at least one sentence group corresponding to the target sentence is determined according to the position, in the sentence sequence, of the first sentence corresponding to the target sentence. A determined sentence group may include at least one first sentence located before the first sentence corresponding to the target sentence, at least one first sentence located after it, or at least one first sentence on each side. In addition, the first sentence corresponding to the target sentence may itself serve as a sentence group, in which case the group includes only that first sentence.
At least one first sentence before and/or after the first sentence corresponding to the target sentence is obtained from the sentence sequence to form at least one sentence group corresponding to the target sentence, and the category of the target sentence is then determined according to these sentence groups. Because the sentence sequence follows the order of the sentences in the dialog text, a sentence group formed from the first sentences before and/or after the first sentence corresponding to the target sentence embodies the context information of the target sentence, which improves the accuracy of the recognition result when the category of the target sentence is recognized based on the sentence groups. Depending on requirements, the first sentences located before, after, or on both sides of the first sentence corresponding to the target sentence can be obtained to form sentence groups, which meets the needs of different application scenarios and improves the applicability of the sentence-category recognition method.
When a sentence group corresponding to the target sentence is obtained from the sentence sequence, a plurality of consecutive first sentences around the first sentence corresponding to the target sentence may be obtained to constitute the group. For example, the 2 first sentences before and the 3 first sentences after the first sentence corresponding to the target sentence are obtained to form a sentence group including 6 first sentences. Alternatively, a plurality of non-adjacent first sentences may be obtained from the sentence sequence according to a set sampling rule; for example, the 1st, 3rd, and 5th first sentences before the first sentence corresponding to the target sentence, the first sentence corresponding to the target sentence itself, and the 1st, 3rd, and 5th first sentences after it are obtained, forming a sentence group comprising 7 first sentences.
In one possible implementation, after the sentence sequence is obtained, the i first sentences located before the first sentence corresponding to the target sentence, the first sentence corresponding to the target sentence, and the j first sentences located after it are combined in order and determined as one sentence group corresponding to the target sentence. By varying the values of i and j, a plurality of windows can be obtained; the combination of the first sentences within one window serves as one sentence group, so a plurality of sentence groups corresponding to the target sentence can be obtained.
When the sentence groups corresponding to the target sentence are determined, a parameter i and a parameter j are introduced: parameter i indicates how many first sentences are taken before the target sentence as its preceding-context information, and parameter j indicates how many first sentences are taken after it as its following-context information. For example, when i = 1 and j = 2, i = 1 indicates that the one first sentence before the first sentence corresponding to the target sentence is taken as the preceding-context information, and j = 2 indicates that the two first sentences after it are taken as the following-context information.
And aiming at a determined combination of i and j, determining i first sentences before the first sentence corresponding to the target sentence as the upper information, determining j first sentences after the first sentence corresponding to the target sentence as the lower information, and sequentially splicing the upper information, the first sentence corresponding to the target sentence and the lower information to obtain a sentence group corresponding to the target sentence. For example, taking the sentence with the sequence number 3 in table 1 as an example of the target sentence, when i =1 and j =2, the first sentence 2 located before the first sentence 3 is determined as the upper information of the target sentence, the first sentence 4 and the first sentence 5 located after the first sentence 3 are determined as the lower information of the target sentence, and the first sentence 2, the first sentence 3, the first sentence 4 and the first sentence 5 are sequentially combined to obtain a sentence group corresponding to the target sentence.
The value range of the parameter i is preset to be [0, m ], the value range of the parameter j is preset to be [0, n ], m and n are positive integers, and the parameter i and the parameter j are integers, so that there are (m +1)(n +1) combinations of the parameter i and the parameter j in total, and (m +1)(n +1) sentence groups can be obtained. The values of m and n may be the same or different, and can be determined according to the number of sentences included in the dialog text and the number of people participating in the dialog: when the dialog text includes many sentences or many participants, larger values are chosen for m and n; when it includes few sentences or few participants, smaller values are chosen. For example, when the values of m and n are both 7, the maximum value of the parameter i and the parameter j is 7, and combining the different values of the parameter i and the parameter j yields a total of 8 × 8 = 64 sentence groups.
It should be noted that, according to the different order in which the target sentences are expressed in the dialog text, the number of first sentences before the first sentence corresponding to the target sentence may be less than m, and the number of first sentences after the first sentence corresponding to the target sentence may also be less than n. When the value of the parameter i is larger than the number of first sentences before the first sentence corresponding to the target sentence, all first sentences before the first sentence corresponding to the target sentence are selected as the upper information of the target sentence, and when the value of the parameter j is larger than the number of first sentences after the first sentence corresponding to the target sentence, all first sentences after the first sentence corresponding to the target sentence are selected as the lower information of the target sentence, so that in a sentence group determined according to a determined combination of i and j, the number of the first sentences may be smaller than i + j +1, and thus the same sentence group may be determined according to different combinations of i and j. For example, taking the sentence with the sequence number 3 in table 1 as an example, when i is 2 to 7, the first sentence 1 and the first sentence 2 are both used as the upper information of the target sentence, and when j is 2 to 7, the first sentence 4 and the first sentence 5 are both used as the lower information of the target sentence, so that only 9 different sentence groups are included in the obtained 64 sentence groups.
Possible values are selected for the parameter i and the parameter j within their preset value ranges and combined, so that a plurality of different i and j combinations are obtained, and a sentence group corresponding to the target sentence is determined for each i and j combination, yielding a plurality of sentence groups corresponding to the target sentence. When the number of first sentences before the first sentence corresponding to the target sentence is smaller than i, all first sentences before it are selected as the upper information of the target sentence; when the number of first sentences after the first sentence corresponding to the target sentence is smaller than j, all first sentences after it are selected as the lower information of the target sentence. With this unified manner of determining sentence groups, the sentence groups corresponding to the target sentence can be determined more quickly, and the efficiency of recognizing sentence categories is improved.
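As an illustrative sketch (Python is used only for illustration; the helper names are hypothetical and not part of the claimed embodiments), the windowing and clamping described above can be expressed as follows, reproducing the running example of a sequence of 5 first sentences with m = n = 7:

```python
def sentence_group(sentences, t, i, j):
    # Take up to i first sentences before index t and up to j after it;
    # when fewer are available, all available context is used (clamping).
    start = max(0, t - i)
    end = min(len(sentences), t + j + 1)
    return tuple(sentences[start:end])

def all_sentence_groups(sentences, t, m=7, n=7):
    # One group per (i, j) combination, (m + 1) * (n + 1) groups in total,
    # ordered by ascending i, then ascending j.
    return [sentence_group(sentences, t, i, j)
            for i in range(m + 1) for j in range(n + 1)]

sequence = ["s1", "s2", "s3", "s4", "s5"]    # five first sentences
groups = all_sentence_groups(sequence, t=2)  # target sentence is sentence 3
# len(groups) == 64, but clamping leaves only 9 distinct groups,
# matching the 9 different sentence groups noted in the example above
```

Because clamping truncates i and j to the available context, different (i, j) combinations can yield the same window, which is exactly why the 64 nominal windows collapse to 9 distinct groups for a 5-sentence sequence.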
In the embodiment of the application, after the sentence sequence is obtained, a plurality of windows are determined through the parameter i and the parameter j, then a first sentence is selected from the sentence sequence through the plurality of windows, a plurality of sentence groups are obtained, and then the category of the target sentence is identified through the obtained plurality of sentence groups. The multiple sentence groups can reflect the relevance between the target sentence and different context sentences, so that when the category of the target sentence is identified according to the multiple sentence groups, the category of the target sentence can be identified according to the relevance between the target sentence and different context sentences, and the accuracy of identifying the category of the target sentence is further improved.
205. And respectively splicing the first sentence corresponding to the target sentence with each sentence group to obtain a second sentence corresponding to the target sentence.
After a plurality of sentence groups corresponding to the target sentence are determined, each sentence group is spliced with the first sentence corresponding to the target sentence to obtain a plurality of second sentences corresponding to the target sentence. In a possible implementation manner, when the first sentence corresponding to the target sentence and a sentence group are spliced, the first sentence corresponding to the target sentence is placed before the sentence group, so that the initial part of the obtained second sentence is the first sentence corresponding to the target sentence. Among the sentence groups corresponding to the target sentence, the position of the target sentence may differ from group to group; placing the first sentence corresponding to the target sentence before every sentence group ensures that the initial part of each second sentence is the first sentence corresponding to the target sentence, which makes it convenient to determine the target sentence to be recognized from the second sentence and helps ensure the accuracy of recognizing the category of the target sentence.
The first sentence corresponding to the target sentence is defined as St, and a sentence group corresponding to the target sentence is defined as Sc; the two are spliced according to the format "St [SEP] Sc" to obtain a second sentence text (i, j). For example, when the target sentence is sentence 3 and i =1, j =2, the first sentence 2, the first sentence 3, the first sentence 4 and the first sentence 5 are sequentially spliced to obtain the sentence group corresponding to the target sentence, and then the first sentence 3 and this sentence group are spliced to obtain the second sentence text (1, 2).
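The "St [SEP] Sc" splicing can be sketched as follows (an illustrative Python helper with hypothetical names; the separator string and whitespace handling are assumptions, since the patent only specifies the format):

```python
SEP = " [SEP] "  # assumed separator token, following the "St [SEP] Sc" format

def build_second_sentence(sentences, t, i, j):
    # Sc: the sentence group for this (i, j) window, clamped to available context.
    start = max(0, t - i)
    end = min(len(sentences), t + j + 1)
    group = " ".join(sentences[start:end])
    # St is placed first, so every second sentence begins with the
    # first sentence corresponding to the target sentence.
    return sentences[t] + SEP + group

sequence = ["s1", "s2", "s3", "s4", "s5"]
text_1_2 = build_second_sentence(sequence, t=2, i=1, j=2)
# text_1_2 == "s3 [SEP] s2 s3 s4 s5"
```

The second sentence always starts with St regardless of where the target sits inside Sc, which is what lets a downstream model locate the sentence to be classified.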
206. And inputting each second sentence into the corresponding first classification model respectively to obtain a first probability output by the first classification model aiming at each second sentence.
In a possible implementation manner, a single first classification model is trained in advance, each second sentence is input into the first classification model, and a first probability output by the first classification model for each second sentence is obtained.
In another possible implementation manner, a plurality of first classification models are trained in advance, and different first classification models correspond to different combinations of i and j. The second sentence is obtained by splicing the first sentence and the sentence group corresponding to the target sentence, and different sentence groups correspond to different combinations of i and j, so that different second sentences also correspond to different combinations of i and j. And for each i and j combination, a first classification model corresponding to the i and j combination is trained in advance, and the first classification model outputs the probability that the target sentence belongs to the target category based on the input second sentence. For example, the value ranges of the parameter i and the parameter j are [0,7], 64 combinations of i and j are counted, and the first classification models are respectively trained in advance for different combinations of i and j to obtain 64 first classification models.
For any determined i and j combination, a plurality of second sentences corresponding to the i and j combination are obtained from the training data set according to the method in the steps 202 to 205, and then a first classification model corresponding to the i and j combination is trained according to the category label of each sentence in the training data set and the obtained second sentences.
And for each acquired second sentence, inputting the second sentence into a corresponding first classification model according to the combination of i and j corresponding to the second sentence, and further acquiring a first probability output by the first classification model, wherein the first probability is used for representing the probability that the target sentence belongs to the target category. For example, the pre-trained classification model is used to identify whether a sentence belongs to a humorous category, and then after a second sentence corresponding to the target sentence is input into the first classification model, a first probability output by the first classification model is used to represent a probability that the target sentence belongs to the humorous category.
In the embodiment of the application, a corresponding first classification model is trained for each i and j combination in advance, after each second sentence corresponding to a target sentence is obtained, the second sentences corresponding to the i and j combinations are input into the first classification models corresponding to the same i and j combinations, a first probability output by each first classification model is obtained, and then the category of the target sentence is identified based on the obtained first probabilities. Because the sentence groups in different second sentences correspond to different i and j combinations, and the sentence groups corresponding to different i and j combinations reflect the correlation between the target sentence and different context sentences, the first classification models corresponding to different i and j combinations are trained in advance, the second sentences corresponding to i and j combinations are input into the first classification models corresponding to the same i and j combinations, the first probability output by the first classification models is obtained, the first probability can be ensured to reflect whether the target sentence belongs to the target category more truly, and then the category of the target sentence can be identified more accurately according to the first probability.
In this embodiment of the application, the first classification model may be implemented by different models; for example, the first classification model may be a Deep Neural Network (DNN), a Support Vector Machine (SVM), a Logistic Regression (LR) model, a random forest, a Naive Bayesian Model (NBM), a Neural Network (NN), a Gradient Boosting Decision Tree (GBDT), a BERT (Bidirectional Encoder Representations from Transformers) model, a Long Short-Term Memory network (LSTM), or the like.
When the first classification model is a model which can directly process texts such as BERT or LSTM, the second sentence is directly input into the first classification model, and the first probability output by the first classification model is obtained. When the first classification model is a model which cannot directly process texts, such as DNN, SVM, LR, random forest, NBM, NN or GBDT, for each second sentence, firstly, performing word segmentation on the second sentence to obtain at least two words, then, according to a pre-established dictionary, respectively mapping each word to a word vector, then, according to each word vector, determining a sentence vector corresponding to the second sentence, and then, inputting the obtained sentence vector into the first classification model to obtain a first probability output by the first classification model. Those skilled in the art will appreciate that the word vectors have the same dimensions to enable averaging of the word vectors, for example, the dimension of a word vector is 200 dimensions, and a sentence vector obtained by averaging the word vectors is also 200 dimensions.
By performing word segmentation on the second sentence, determining the sentence vector corresponding to the second sentence from the word vectors of the divided words, and inputting the sentence vector into the first classification model, the first probability is obtained. Since the sentence vector is determined from the word vectors, it integrates the influence of each word in the second sentence on the sentence category, so that when the category of the target sentence is determined according to the first probability, the accuracy of category identification can be further improved.
In a possible implementation manner, when the sentence vector corresponding to the second sentence is determined from the divided word vectors, the average value of the word vectors can be taken as the sentence vector corresponding to the second sentence. Because the calculation is simple, the sentence vector corresponding to the second sentence can be obtained quickly, and because the average synthesizes the influence of every word on the sentence category, the accuracy of sentence category identification based on the sentence vector is preserved.
Fig. 3 is a schematic diagram of converting a sentence into a sentence vector according to an embodiment of the present application. Referring to fig. 3, a second sentence is "today's weather is good". The second sentence is first segmented to obtain three words, "today", "weather" and "good". The three words are mapped according to a public dictionary: "today" is mapped to a word vector a = [0.2, 0.5, 0.4, …, 0.2], "weather" is mapped to a word vector b = [0.2, 0.6, 0.4, …, 0.2], and "good" is mapped to a word vector c = [0.2, 0.4, 0.4, …, 0.2]. The word vectors a, b and c are then averaged to obtain the sentence vector A = [0.2, 0.5, 0.4, …, 0.2] corresponding to the second sentence.
Those skilled in the art will appreciate that, in addition to the above manner of obtaining a sentence vector by averaging word vectors, the sentence vector of the second sentence may be determined in other manners, such as a weighted average of the word vectors using TF-IDF weights, or a weighted average of the word vectors using SIF (smooth inverse frequency) weighting.
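The averaging step can be sketched with NumPy as follows. The 3-dimensional dictionary here is a hypothetical stand-in for a real pre-built dictionary (real word vectors would share a higher dimension, e.g. 200, as noted above); the numbers mirror the fig. 3 example:

```python
import numpy as np

# Hypothetical dictionary mapping each word to a fixed-dimension vector;
# all vectors must share one dimension so they can be averaged element-wise.
dictionary = {
    "today":   np.array([0.2, 0.5, 0.4]),
    "weather": np.array([0.2, 0.6, 0.4]),
    "good":    np.array([0.2, 0.4, 0.4]),
}

def sentence_vector(words, dictionary):
    # Element-wise mean of the word vectors of the segmented words.
    return np.mean([dictionary[w] for w in words], axis=0)

vec = sentence_vector(["today", "weather", "good"], dictionary)
# vec is approximately [0.2, 0.5, 0.4], matching the fig. 3 result
```

The resulting sentence vector has the same dimension as the word vectors, so it can be fed directly into a vector-input first classification model such as an SVM or LR.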
207. And constructing a probability vector taking each first probability as an element.
And after each second sentence is input into the first classification model and the first probability corresponding to each second sentence is obtained, a probability vector taking each first probability as an element is constructed. In one possible implementation, the first probabilities corresponding to the second sentences are sorted in the ascending order of the corresponding parameter i and the corresponding parameter j, and then the sorted first probabilities are sequentially used as elements in the row vector/column vector to obtain the probability vector in the form of the row vector/column vector.
The second sentence text (i, j) is input into the first classification model corresponding to the combination of i and j to obtain the first probability p(i, j). The first probabilities corresponding to the second sentences are sorted in ascending order of the corresponding parameter i and parameter j, giving the ordering p(0,0), …, p(0,n), p(1,0), …, p(1,n), …, p(m,0), …, p(m,n), and the probability vector obtained from this ordering is B = [p(0,0), …, p(0,n), p(1,0), …, p(1,n), …, p(m,0), …, p(m,n)]. For example, if 64 second sentences are obtained in total, then after each second sentence is input into its corresponding first classification model, 64 first probabilities are obtained, and a 64-dimensional probability vector is constructed.
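The ordering of the probability vector B can be sketched as follows (an illustrative Python helper with hypothetical names; the (i, j)-keyed mapping is an assumption about how the first probabilities are stored):

```python
def probability_vector(first_probs, m, n):
    # first_probs[(i, j)] is the first probability output by the first
    # classification model for the (i, j) combination; elements are
    # ordered by ascending i, then ascending j within each i.
    return [first_probs[(i, j)] for i in range(m + 1) for j in range(n + 1)]

probs = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
vec = probability_vector(probs, m=1, n=1)   # [0.1, 0.2, 0.3, 0.4]
```

With m = n = 7 the same helper yields the 64-dimensional vector of the running example.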
208. And inputting the probability vector into a pre-trained second classification model to obtain a second probability output by the second classification model.
And training a second classification model in advance, wherein the second classification model can output a probability value based on the input vector, and the probability value output by the second classification model is the probability that the target sentence belongs to the target category. And after the probability vector is obtained, inputting the probability vector into a pre-trained second classification model, and obtaining a second probability output by the second classification model.
In an embodiment of the present application, the second classification model may be a logistic regression model (LR).
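The inference step of such a logistic regression second model reduces to a sigmoid over a weighted sum of the probability vector. A minimal sketch, assuming the weights and bias have already been obtained by training (the helper names are illustrative, not from the patent):

```python
import math

def second_probability(prob_vector, weights, bias):
    # Logistic regression inference: sigmoid(w . B + b), where B is the
    # probability vector built from the first probabilities.
    z = sum(w * p for w, p in zip(weights, prob_vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# With zero weights and bias the model is maximally uncertain:
p = second_probability([0.3, 0.7], [0.0, 0.0], 0.0)   # p == 0.5
```

In practice a library implementation (e.g. a standard LR classifier) would be trained on probability vectors labeled with whether the target sentence belongs to the target category.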
209. And judging whether the second probability is larger than a second probability threshold, if so, executing step 210, otherwise, executing step 211.
After the training of the second classification model is completed, a second probability threshold corresponding to the second classification model can be determined, and whether the target sentence belongs to the target category is then determined according to the magnitude relation between the second probability and the second probability threshold. The second probability threshold may be determined based on empirical values, or based on the harmonic mean (F1 value) of the second classification model, such as determining the parameter α that maximizes the F1 value of the second classification model as the second probability threshold. The harmonic mean, also known as the reciprocal mean, is the reciprocal of the arithmetic mean of the reciprocals of a set of values; here, the F1 value is the harmonic mean of the precision and the recall of the model.
In a possible implementation manner, after the second classification model is trained, the parameter α traverses the value range 0-1 with a step size of 0.1, the parameter α that maximizes the F1 value of the second classification model is selected, and the selected parameter α is determined as the second probability threshold. For example, if the determined parameter α is 0.8, step 210 is executed when the second probability output by the second classification model is greater than 0.8, and step 211 is executed when the second probability output by the second classification model is less than or equal to 0.8.
In the embodiment of the application, the second probability threshold is determined according to the F1 value of the second classification model. Since the F1 value is the harmonic mean of precision and recall, it evaluates the precision and the recall of the second classification model comprehensively. The parameter α traverses the range 0-1 in fixed steps, and the parameter α that maximizes the F1 value of the second classification model is determined as the second probability threshold, so that precision and recall are both taken into account, sentence categories can be classified more accurately based on the determined second probability threshold, and the accuracy of identifying sentence categories is further improved.
In the embodiment of the application, the first probabilities are obtained based on the second sentences, that is, each first probability is obtained based on different context information of the target sentence, and the second probability is obtained based on the probability vector formed by the first probabilities. The second classification model therefore synthesizes the results of the first classification models' category identification of the target sentence and outputs the second probability that the target sentence belongs to the target category. Because the second probability synthesizes these results, it reflects the category of the target sentence more accurately, so whether the target sentence belongs to the target category can be determined more accurately according to the second probability, further improving the accuracy of identifying the sentence category.
210. And determining that the target sentence belongs to the target category, and ending the current flow.
In the embodiment of the application, when the second probability output by the second classification model is greater than the second probability threshold, the probability that the target sentence belongs to the target category is high, and it is determined that the target sentence belongs to the target category. For example, the second probability output by the second classification model is the probability that the target sentence belongs to the humorous category, the second probability threshold is 0.8, and when the second probability output by the second classification model is greater than 0.8, the target sentence is determined to belong to the humorous category.
211. Determining that the target sentence does not belong to the target category.
In the embodiment of the application, when the second probability output by the second classification model is less than or equal to the second probability threshold, it is indicated that the probability that the target sentence belongs to the target category is low, and it is further determined that the target sentence does not belong to the target category. For example, the second probability output by the second classification model is the probability that the target sentence belongs to the humor class, the second probability threshold is 0.8, and when the second probability output by the second classification model is less than or equal to 0.8, it is determined that the target sentence does not belong to the humor class.
In the method for recognizing the sentence category in a dialogue provided by the embodiment of the application, a parameter i and a parameter j are introduced, and a window is determined based on a combination of the two. By changing the values of the parameter i and the parameter j, the window slides over the sorted first sentences, and a plurality of sentence groups are determined. The first sentence corresponding to the target sentence is spliced with each sentence group to obtain a plurality of second sentences, each second sentence is input into a first classification model to obtain a plurality of first probabilities, and a probability vector formed by the first probabilities is input into a second classification model to obtain the second probability output by the second classification model. If the second probability is greater than the second probability threshold, the target sentence is determined to belong to the target category; if the second probability is less than or equal to the second probability threshold, the target sentence is determined not to belong to the target category. The sentence groups are determined through window sliding, and the category of the target sentence is determined based on these sentence groups. Because different sentence groups reflect the relevance between the target sentence and different context sentences, identifying the category of the target sentence according to the sentence groups improves the accuracy of identifying the category of the target sentence.
It should be noted that, in another possible implementation manner, when the category of the target sentence is identified according to each first probability obtained in step 206, an average value of each first probability is calculated to obtain an average probability, then the magnitude relationship between the average probability and a preset first probability threshold is compared, and if the average probability is greater than the first probability threshold, the target sentence is determined to belong to the target category.
In the embodiment of the application, the average value of each first probability is calculated to obtain the average probability, the category of the target sentence is determined according to the magnitude relation between the average probability and the preset first probability threshold, only the first classification model for obtaining the first probability needs to be trained, and the second classification model for obtaining the second probability does not need to be trained, so that the sentence category identification method is easier to realize, and the sentence category identification efficiency can be improved due to the fact that the calculated data amount is small.
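The simpler averaging alternative can be sketched in a few lines (illustrative Python; the threshold value is hypothetical):

```python
def classify_by_mean(first_probs, first_threshold):
    # Average the first probabilities obtained from the first classification
    # models and compare against the preset first probability threshold.
    mean_prob = sum(first_probs) / len(first_probs)
    return mean_prob > first_threshold

belongs = classify_by_mean([0.9, 0.7, 0.8], 0.6)   # True: mean 0.8 > 0.6
```

This variant skips the second classification model entirely, trading some accuracy for less training and less computation, as noted above.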
Those skilled in the art will appreciate that, in addition to determining the context information of the target sentence through multiple windows in the embodiment of the present application, the context information of the target sentence may also be determined in other ways, such as directly determining m sentences before the target sentence and n sentences after the target sentence as the context information of the target sentence, and the like.
It should be noted that all the optional technical solutions in the above method embodiments may be combined arbitrarily to form an optional embodiment of the present application, and are not described herein again.
EXAMPLE III
Fig. 4 is a schematic diagram of an apparatus for recognizing sentence types in a dialog according to an embodiment of the present application. Referring to fig. 4, an apparatus for recognizing sentence categories in a dialog provided in an embodiment of the present application includes:
a text acquiring module 401, configured to acquire a dialog text including at least two sentences;
an information splicing module 402, configured to splice, for each sentence in the dialog text acquired by the text acquisition module 401, speaker information used for indicating a speaker of the sentence with the sentence, and generate a first sentence corresponding to the sentence;
a sequence obtaining module 403, configured to obtain a sentence sequence including each first sentence generated by the information splicing module 402;
a context obtaining module 404, configured to obtain context information of a target sentence to be recognized in the dialog text from the sentence sequence obtained by the sequence obtaining module 403, where the context information includes the first sentence corresponding to an upper sentence and/or a lower sentence of the target sentence;
a category identifying module 405, configured to identify a category of the target sentence according to the first sentence corresponding to the target sentence acquired by the information splicing module 402 and the context information acquired by the context acquiring module 404.
In this embodiment, the text obtaining module 401 may be configured to perform the step 101 in the first embodiment, the information splicing module 402 may be configured to perform the step 102 in the first embodiment, the sequence obtaining module 403 may be configured to perform the step 103 in the first embodiment, the context obtaining module 404 may be configured to perform the step 104 in the first embodiment, and the category identifying module 405 may be configured to perform the step 105 in the first embodiment.
In the embodiment of the application, each sentence in the dialog text is spliced with the information of its speaker to obtain a first sentence corresponding to each sentence; for any target sentence to be identified in the dialog text, the first sentence corresponding to an upper sentence and/or a lower sentence of the target sentence is obtained as context information, and the category of the target sentence is then identified according to the first sentence corresponding to the target sentence and the context information. Because the first sentence corresponding to the target sentence includes the speaker information, and the context information of the target sentence includes the upper sentence and/or the lower sentence of the target sentence, when the category of the target sentence is identified, not only the content of the target sentence but also its speaker and context sentences serve as reference information. Since the speaker and the context of a sentence can embody its category, more reference information is available when identifying the category of the target sentence, and the accuracy of identifying the sentence category can be improved.
In a possible implementation manner, the context obtaining module 404 is configured to determine at least one sentence group corresponding to the target sentence according to a position of the first sentence corresponding to the target sentence in the sentence sequence, where the sentence group includes: at least one first sentence which is acquired from the sentence sequence and is positioned before the first sentence corresponding to the target sentence, and/or at least one first sentence which is acquired from the sentence sequence and is positioned after the first sentence corresponding to the target sentence; obtaining the context information including the at least one sentence group.
In a possible implementation manner, the context obtaining module 404 is configured to determine, as one of the sentence groups corresponding to the target sentence, a sequential combination of i first sentences located before the first sentence corresponding to the target sentence, and j first sentences located after the first sentence corresponding to the target sentence in the sentence sequence; and combining the possible value of i in [0, m ] with the possible value of j in [0, n ] to obtain (m +1) (n +1) sentence groups corresponding to the target sentence, wherein m and n are positive integers, and different sentence groups correspond to different combinations of i and j.
In a possible implementation manner, the category identification module 405 is configured to splice the first sentence corresponding to the target sentence with the at least one sentence group included in the context information, respectively, to generate at least one second sentence; for each generated second sentence, inputting the second sentence into a pre-trained first classification model, and obtaining a first probability output by the first classification model for the second sentence; and identifying the category of the target sentence according to the obtained first probabilities.
In one possible implementation, the category identifying module 405 is configured to, for each second sentence generated, perform: determining the first classification model corresponding to the second sentence according to the number of the first sentences which are positioned before and after the first sentence corresponding to the target sentence in the sentence group included in the second sentence; and inputting the second sentence into the corresponding first classification model, and obtaining the first probability which is output by the first classification model and corresponds to the second sentence.
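The per-window model selection described above can be sketched as a lookup keyed by the (i, j) shape of the group. `DummyModel` and its fixed score are placeholders, not the patent's classifier; the patent only requires one pre-trained first classification model per count of first sentences before and after the target.

```python
class DummyModel:
    """Stand-in for a pre-trained first classification model."""

    def __init__(self, i, j):
        self.shape = (i, j)  # (sentences before, sentences after) the target

    def predict_proba(self, second_sentence):
        # Placeholder score; a real model would score the sentence text.
        return 0.5

m, n = 1, 1
# One model per context-window shape (i, j).
models = {(i, j): DummyModel(i, j) for i in range(m + 1) for j in range(n + 1)}

def first_probability(second_sentence, i, j):
    """Route the second sentence to the model trained for its (i, j) shape."""
    return models[(i, j)].predict_proba(second_sentence)
```

Keying models by window shape means each model only ever sees inputs with a consistent amount of preceding and following context, which is one plausible reason for the design.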
In one possible implementation, the category identifying module 405 is configured to, for each second sentence generated, perform: carrying out word segmentation on the second sentence to obtain at least two words; respectively mapping each word into a word vector according to a pre-established dictionary; determining a sentence vector corresponding to the second sentence according to each word vector; and inputting the sentence vector into the first classification model to obtain the first probability output by the first classification model aiming at the second sentence.
In a possible implementation manner, the category identification module 405 is configured to calculate an average value of each word vector as the sentence vector corresponding to the second sentence.
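The word-segmentation and averaging steps above can be sketched as follows. The toy dictionary and whitespace segmentation are assumptions for illustration; the patent's pre-established dictionary would map its own vocabulary to trained word vectors.

```python
import numpy as np

# Hypothetical dictionary mapping words to word vectors; a real system
# would use the vocabulary and embeddings prepared in advance.
dictionary = {
    "it": np.array([1.0, 0.0]),
    "is": np.array([0.0, 1.0]),
    "four": np.array([1.0, 1.0]),
}

def sentence_vector(second_sentence: str) -> np.ndarray:
    """Segment the second sentence into words, map each word to a word
    vector via the dictionary, and average the vectors into one
    sentence vector (the averaging scheme of the implementation above)."""
    words = second_sentence.lower().split()  # whitespace segmentation (assumption)
    vectors = [dictionary[w] for w in words if w in dictionary]
    return np.mean(vectors, axis=0)

vec = sentence_vector("It is four")
```

The resulting vector has the same dimensionality as the word vectors and can be fed directly into the first classification model.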
In a possible implementation manner, the category identification module 405 is configured to calculate an average value of the first probabilities, so as to obtain an average probability; determining that the target sentence belongs to a target category if the average probability is greater than a preset first probability threshold.
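The averaging decision rule above is straightforward; the probabilities and threshold below are example values, not from the patent.

```python
def classify_by_average(first_probabilities, threshold=0.5):
    """Average the first probabilities over all sentence groups; the
    target sentence belongs to the target category when the average
    exceeds the preset first probability threshold."""
    avg = sum(first_probabilities) / len(first_probabilities)
    return avg > threshold, avg

# Four first probabilities, one per (i, j) sentence group.
is_target, avg = classify_by_average([0.9, 0.7, 0.4, 0.8])
```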
In a possible implementation, the category identification module 405 is configured to construct a probability vector comprising at least one dimension by obtaining at least one of the first probabilities; inputting the probability vector into a pre-trained second classification model to obtain a second probability output by the second classification model; and if the second probability is greater than a preset second probability threshold, determining that the target sentence belongs to a target category.
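The second-stage decision above can be sketched with a logistic layer standing in for the second classification model. The patent does not specify the model's form, so the weights, bias, and threshold below are illustrative assumptions.

```python
import math

def second_classification_model(prob_vector, weights, bias):
    """Stand-in second classification model: a logistic layer over the
    probability vector built from the first probabilities. The real
    model's architecture is not specified by the patent."""
    z = sum(w * p for w, p in zip(weights, prob_vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # the second probability

prob_vector = [0.9, 0.7, 0.4, 0.8]           # one first probability per group
second_prob = second_classification_model(prob_vector, [1, 1, 1, 1], -2.0)
is_target = second_prob > 0.5                # preset second probability threshold
```

Compared with plain averaging, a trained second model can learn that some context-window shapes are more reliable than others and weight their probabilities accordingly.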
In one possible implementation, the second probability threshold is the value that maximizes the harmonic mean of the precision and recall (i.e., the F1 score) of the second classification model.
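Reading "the harmonic mean of the second classification model" as the F1 score (the harmonic mean of precision and recall), the threshold can be chosen by sweeping candidate values on held-out data. The data and candidate grid below are illustrative assumptions.

```python
def f1(tp, fp, fn):
    # F1 score: harmonic mean of precision and recall.
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def best_threshold(probs, labels, candidates):
    """Pick the second probability threshold that maximizes F1 on a
    held-out set of second probabilities and true labels."""
    best_t, best_f1 = None, -1.0
    for t in candidates:
        preds = [p > t for p in probs]
        tp = sum(p and y for p, y in zip(preds, labels))
        fp = sum(p and not y for p, y in zip(preds, labels))
        fn = sum((not p) and y for p, y in zip(preds, labels))
        score = f1(tp, fp, fn)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1

probs = [0.9, 0.8, 0.6, 0.4, 0.2]            # held-out second probabilities
labels = [True, True, False, True, False]    # true target-category labels
t_star, f1_star = best_threshold(probs, labels, [0.1, 0.3, 0.5, 0.7])
```

Here the sweep prefers the threshold 0.3, which keeps the low-probability positive at 0.4 while still rejecting the lowest-scoring negative.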
It should be noted that the device for recognizing the sentence category in the dialog provided in the third embodiment of the present application is based on the same concept as the method for recognizing the sentence category in the dialog provided in the first embodiment and the second embodiment, and specific contents thereof can be referred to the description in the first embodiment and the second embodiment, and are not repeated herein.
Example four
Based on the methods for recognizing the sentence category in a dialog described in the first and second embodiments, an embodiment of the present application provides an electronic device for executing the method provided in the first or second embodiment. Fig. 5 is a schematic diagram of an electronic device according to a fourth embodiment of the present application. Referring to fig. 5, the electronic device 50 provided in this embodiment includes: at least one processor (processor) 502, a memory 504, a bus 506, and a communication interface 508. Wherein:
the processor 502, communication interface 508, and memory 504 communicate with each other via a communication bus 506.
A communication interface 508 for communicating with other devices.
The processor 502 is configured to execute the program 510, and may specifically execute the relevant steps in the methods described in the first or second embodiment.
In particular, program 510 may include program code that includes computer operating instructions.
The processor 502 may be a Central Processing Unit (CPU), an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application. The electronic device comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 504 is used for storing the program 510. The memory 504 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
Example five
An embodiment of the present application provides a computer storage medium, including: the computer storage medium stores a computer program which, when executed by a processor, implements a method of identifying a category of a sentence in a dialog as described in any of the embodiments of the present application.
In the embodiment of the application, each sentence in the dialog text is spliced with its speaker information to obtain a first sentence corresponding to that sentence. For any target sentence to be recognized in the dialog text, the first sentences corresponding to the context sentences of the target sentence are obtained as context information, and the category of the target sentence is then recognized according to the first sentence corresponding to the target sentence and the context information. Because the first sentence corresponding to the target sentence includes the speaker information and the context information includes the context sentences of the target sentence, the recognition can draw not only on the content of the target sentence but also on its speaker and context sentences. Since both the speaker and the context of a sentence reflect its category, more information is available when recognizing the category of the target sentence, which improves the accuracy of sentence category recognition.
So far, specific embodiments of the present application have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may be advantageous.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transitory media) such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for identifying sentence classes in a conversation, comprising:
obtaining a dialog text comprising at least two sentences;
for each sentence in the dialog text, splicing the speaker information of the speaker for indicating the sentence with the sentence to generate a first sentence corresponding to the sentence;
obtaining a sentence sequence comprising the first sentence corresponding to each sentence in the dialog text;
acquiring context information of a target sentence to be recognized in the dialog text from the sentence sequence, wherein the context information comprises:
sequentially combining i first sentences located before the first sentence corresponding to the target sentence, the first sentence corresponding to the target sentence and j first sentences located after the first sentence corresponding to the target sentence in the sentence sequence to determine a sentence group corresponding to the target sentence;
combining each possible value of i in [0, m] with each possible value of j in [0, n] to obtain (m+1)(n+1) sentence groups corresponding to the target sentence, wherein m and n are positive integers, and different sentence groups correspond to different combinations of i and j;
obtaining the context information comprising the (m+1)(n+1) sentence groups;
identifying a category of the target sentence according to the first sentence corresponding to the target sentence and the context information, including:
the first sentence corresponding to the target sentence is spliced with at least one sentence group included in the context information respectively to generate at least one second sentence;
for each generated second sentence, inputting the second sentence into a pre-trained first classification model, and obtaining a first probability output by the first classification model for the second sentence;
and identifying the category of the target sentence according to the obtained first probabilities.
2. The method according to claim 1, wherein the step of inputting each generated second sentence into a pre-trained first classification model, and obtaining a first probability output by the first classification model for the second sentence comprises:
for each of the second sentences generated, performing:
determining the first classification model corresponding to the second sentence according to the number of the first sentences which are positioned before and after the first sentence corresponding to the target sentence in the sentence group included in the second sentence;
and inputting the second sentence into the corresponding first classification model, and obtaining the first probability which is output by the first classification model and corresponds to the second sentence.
3. The method according to claim 1, wherein the step of inputting each generated second sentence into a pre-trained first classification model, and obtaining a first probability output by the first classification model for the second sentence comprises:
for each of the second sentences generated, performing:
carrying out word segmentation on the second sentence to obtain at least two words;
respectively mapping each word into a word vector according to a pre-established dictionary;
determining a sentence vector corresponding to the second sentence according to each word vector;
and inputting the sentence vector into the first classification model to obtain the first probability output by the first classification model aiming at the second sentence.
4. The method of claim 3, wherein determining a sentence vector corresponding to the second sentence based on each of the word vectors comprises:
and calculating the average value of each word vector to serve as the sentence vector corresponding to the second sentence.
5. The method according to any one of claims 1 to 4, wherein the identifying the category of the target sentence according to the obtained first probabilities comprises:
calculating the average value of the first probabilities to obtain an average probability;
determining that the target sentence belongs to a target category if the average probability is greater than a preset first probability threshold.
6. The method according to any one of claims 1 to 4, wherein the identifying the category of the target sentence according to the obtained first probabilities comprises:
constructing a probability vector comprising at least one dimension by obtaining at least one of the first probabilities;
inputting the probability vector into a pre-trained second classification model to obtain a second probability output by the second classification model;
and if the second probability is greater than a preset second probability threshold, determining that the target sentence belongs to a target category.
7. The method of claim 6, wherein the second probability threshold is a value that maximizes the harmonic mean of the second classification model.
8. An apparatus for identifying sentence classes in a conversation, comprising:
a text acquisition module for acquiring a dialog text including at least two sentences;
an information splicing module, configured to splice, for each sentence in the dialog text acquired by the text acquisition module, speaker information used for indicating a speaker of the sentence with the sentence, and generate a first sentence corresponding to the sentence;
a sequence acquisition module, configured to acquire a sentence sequence including each of the first sentences generated by the information concatenation module;
a context obtaining module, configured to determine, as a sentence group corresponding to a target sentence, a sequential combination of i first sentences located before the first sentence corresponding to the target sentence and j first sentences located after the first sentence corresponding to the target sentence in the sentence sequence; combine each possible value of i in [0, m] with each possible value of j in [0, n] to obtain (m+1)(n+1) sentence groups corresponding to the target sentence, wherein m and n are positive integers, and different sentence groups correspond to different combinations of i and j; and obtain the context information comprising the (m+1)(n+1) sentence groups;
and a category identification module, configured to splice, according to the first sentence corresponding to the target sentence acquired by the information splicing module, the first sentence with at least one sentence group included in the context information acquired by the context acquisition module, to generate at least one second sentence, for each generated second sentence, input the second sentence into a pre-trained first classification model, obtain a first probability that the first classification model outputs for the second sentence, and identify a category of the target sentence according to each obtained first probability.
9. An electronic device, comprising: a processor and a memory, the processor being coupled to the memory, the memory storing a computer program, the processor being configured to execute the computer program to implement the method of identifying a category of sentences in a dialog as claimed in any of claims 1-7 above.
10. A computer storage medium, comprising: the computer storage medium stores a computer program which, when executed by a processor, implements the method of identifying sentence classes in a conversation of any of claims 1-7 above.
CN202110412241.7A 2021-04-16 2021-04-16 Method, device, electronic equipment and storage medium for recognizing sentence categories in conversation Active CN112989822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110412241.7A CN112989822B (en) 2021-04-16 2021-04-16 Method, device, electronic equipment and storage medium for recognizing sentence categories in conversation

Publications (2)

Publication Number Publication Date
CN112989822A CN112989822A (en) 2021-06-18
CN112989822B true CN112989822B (en) 2021-08-27

Family

ID=76340856

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110412241.7A Active CN112989822B (en) 2021-04-16 2021-04-16 Method, device, electronic equipment and storage medium for recognizing sentence categories in conversation

Country Status (1)

Country Link
CN (1) CN112989822B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113407679B (en) * 2021-06-30 2023-10-03 竹间智能科技(上海)有限公司 Text topic mining method and device, electronic equipment and storage medium
CN113761209B (en) * 2021-09-17 2023-10-10 泰康保险集团股份有限公司 Text splicing method and device, electronic equipment and storage medium
CN114547315A (en) * 2022-04-25 2022-05-27 湖南工商大学 Case classification prediction method and device, computer equipment and storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101076061A (en) * 2007-03-30 2007-11-21 腾讯科技(深圳)有限公司 Robot server and automatic chatting method
CN107003997A (en) * 2014-12-04 2017-08-01 微软技术许可有限责任公司 Type of emotion for dialog interaction system is classified
CN107480122A (en) * 2017-06-26 2017-12-15 迈吉客科技(北京)有限公司 A kind of artificial intelligence exchange method and artificial intelligence interactive device
CN109308319A (en) * 2018-08-21 2019-02-05 深圳中兴网信科技有限公司 File classification method, document sorting apparatus and computer readable storage medium
CN109977208A (en) * 2019-03-22 2019-07-05 北京中科汇联科技股份有限公司 It is a kind of to merge FAQ and task and the actively conversational system of guidance
CN110188190A (en) * 2019-04-03 2019-08-30 阿里巴巴集团控股有限公司 Talk with analytic method, device, server and readable storage medium storing program for executing
CN110874402A (en) * 2018-08-29 2020-03-10 北京三星通信技术研究有限公司 Reply generation method, device and computer readable medium based on personalized information
CN111667811A (en) * 2020-06-15 2020-09-15 北京百度网讯科技有限公司 Speech synthesis method, apparatus, device and medium
CN111767715A (en) * 2020-06-10 2020-10-13 北京奇艺世纪科技有限公司 Method, device, equipment and storage medium for person identification
CN111950275A (en) * 2020-08-06 2020-11-17 平安科技(深圳)有限公司 Emotion recognition method and device based on recurrent neural network and storage medium
CN112100337A (en) * 2020-10-15 2020-12-18 平安科技(深圳)有限公司 Emotion recognition method and device in interactive conversation
CN112233698A (en) * 2020-10-09 2021-01-15 中国平安人寿保险股份有限公司 Character emotion recognition method and device, terminal device and storage medium
CN112270168A (en) * 2020-10-14 2021-01-26 北京百度网讯科技有限公司 Dialogue emotion style prediction method and device, electronic equipment and storage medium
CN112434492A (en) * 2020-10-23 2021-03-02 北京百度网讯科技有限公司 Text labeling method and device and electronic equipment

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8370440B2 (en) * 2008-09-30 2013-02-05 Microsoft Corporation Role-independent context exchange
JP6671020B2 (en) * 2016-06-23 2020-03-25 パナソニックIpマネジメント株式会社 Dialogue act estimation method, dialogue act estimation device and program
US11954719B2 (en) * 2019-05-30 2024-04-09 Ncr Voyix Corporation Personalized voice-based assistance
CN114270337A (en) * 2019-06-17 2022-04-01 得麦股份有限公司 System and method for personalized and multi-modal context-aware human-machine dialog
CN110399461A (en) * 2019-07-19 2019-11-01 腾讯科技(深圳)有限公司 Data processing method, device, server and storage medium
CN110413788B (en) * 2019-07-30 2023-01-31 携程计算机技术(上海)有限公司 Method, system, device and storage medium for predicting scene category of conversation text
US11551676B2 (en) * 2019-09-12 2023-01-10 Oracle International Corporation Techniques for dialog processing using contextual data
CN112270167B (en) * 2020-10-14 2022-02-08 北京百度网讯科技有限公司 Role labeling method and device, electronic equipment and storage medium
CN112270169B (en) * 2020-10-14 2023-07-25 北京百度网讯科技有限公司 Method and device for predicting dialogue roles, electronic equipment and storage medium
CN112270198B (en) * 2020-10-27 2021-08-17 北京百度网讯科技有限公司 Role determination method and device, electronic equipment and storage medium




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant