CN111797631A - Information processing method and device and electronic equipment - Google Patents
- Publication number
- CN111797631A (application CN201910270744.8A)
- Authority
- CN
- China
- Prior art keywords
- semantic
- language unit
- language
- information
- probability score
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
The embodiment of the invention provides an information processing method and device and electronic equipment. The method comprises: acquiring text information to be recognized; sequentially determining a set number of words of the text information as language units; performing semantic recognition processing on the language units; and determining the effective semantic information of the text information according to the semantic recognition results of the language units. Because neither the voice information nor the text information needs to be pre-segmented before semantic recognition, semantic recognition errors caused by segmentation errors are avoided and the accuracy of semantic recognition is improved; in addition, because semantic recognition processing is performed on each language unit in real time, the real-time performance of semantic recognition is also improved.
Description
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to an information processing method and device and electronic equipment.
Background
With the development of human-computer interaction technology, semantic recognition has become increasingly important. Semantic recognition extracts feature information from a human voice signal and determines its linguistic meaning; it mainly comprises a speech recognition process and a semantic understanding process. The speech recognition process converts the human voice signal into text using an acoustic model, and the semantic understanding process determines the meaning of the text using a natural language model.
In the prior art, when a voice signal input by a user is processed, a Voice Activity Detection (VAD) technique is first used to determine the start point and end point of each voice segment in the continuous voice signal so as to segment it, and speech recognition and semantic understanding are then performed on the segmented voice segments to obtain the user's semantics.
However, in practical applications, the speaking speed, speaking habits, and surroundings of different speakers vary, so segmenting sentences by VAD detection is not accurate enough, and the accuracy of semantic recognition is therefore low.
Disclosure of Invention
The embodiment of the invention provides an information processing method, an information processing device and electronic equipment, which are used for improving the accuracy of semantic recognition.
In a first aspect, an embodiment of the present invention provides an information processing method, including:
acquiring text information to be recognized;
and sequentially determining a set number of words of the text information as language units, performing semantic recognition processing on the language units, and determining effective semantic information of the text information according to the semantic recognition results of the language units.
Optionally, the semantic recognition result includes: the semantic integrity probability score and the semantic information, wherein the effective semantic information of the text information is determined according to the semantic recognition result of the language unit, and the effective semantic information comprises the following steps:
and if the semantic integrity probability scores corresponding to the N continuous language units meet a preset condition, taking the semantic information of the N language units as effective semantic information of the text information, wherein N is greater than or equal to 1.
Optionally, if the semantic integrity probability scores corresponding to the N consecutive language units satisfy a preset condition, taking the semantic information of the N language units as effective semantic information of the text information, including:
for any first language unit among the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit preceding the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit meets a set condition, taking the semantic information of the second language unit as effective semantic information of the text information.
Optionally, determining that the semantic integrity probability score of the second language unit meets a set condition according to the following steps:
and if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, determining that the semantic integrity probability score of the second language unit meets a set condition.
Optionally, determining that the semantic integrity probability score of the second language unit meets a set condition according to the following steps:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and a third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
wherein the third language unit is the language unit immediately following the first language unit.
Optionally, determining that the semantic integrity probability score of the second language unit meets a set condition according to the following steps:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability scores of the language units obtained by splicing the second language unit with the language units preceding the fourth language unit are all less than or equal to the semantic integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
wherein the fourth language unit follows the first language unit, with a preset number of language units between the first language unit and the fourth language unit.
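The three alternative "set condition" variants above can be sketched as follows. This is a hypothetical illustration only: the threshold value, function names, and the idea of passing pre-computed scores in directly are assumptions for exposition, not part of the claims.

```python
# Illustrative sketch of the three "set condition" variants. A score is the
# semantic integrity probability score produced by the recognition model.

THRESHOLD = 0.8  # the "preset threshold value" (concrete value assumed)

def meets_condition_basic(score, threshold=THRESHOLD):
    """Variant 1: the score alone reaches the preset threshold."""
    return score >= threshold

def meets_condition_next_unit(score, next_score, threshold=THRESHOLD):
    """Variant 2: the score reaches the threshold AND does not improve when
    the adjacent 'third language unit' is spliced on (next_score)."""
    return score >= threshold and score >= next_score

def meets_condition_window(score, window_scores, threshold=THRESHOLD):
    """Variant 3: the score reaches the threshold AND none of the scores
    obtained by splicing on the following units, up to the 'fourth language
    unit', exceeds it."""
    return score >= threshold and all(s <= score for s in window_scores)
```

Variant 1 decides fastest; variants 2 and 3 trade a little latency (one unit, or a preset window of units, of lookahead) for more confidence that the buffered span really is a complete semantic unit.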
Optionally, the method further includes: and if the semantic integrity probability score of the second language unit meets the set condition, deleting the historical language unit from the cache.
Optionally, the method further includes:
and if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit and caching the historical language unit into a cache.
Optionally, after the semantic information of the second language unit is used as the valid semantic information of the text information, the method further includes:
acquiring cached prediction semantic information and prediction reply information corresponding to the prediction semantic information, wherein the prediction semantic information is obtained by predicting according to the semantic information of the historical language unit;
and if the effective semantic information is consistent with the predicted semantic information, using the predicted reply information as reply information corresponding to the text information.
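The predicted-reply mechanism above can be sketched as a simple cache lookup. This is a minimal illustration, assuming the prediction step has already produced (predicted semantics, predicted reply) pairs; all names and the dict-based cache are hypothetical.

```python
# Minimal sketch of the predicted-reply cache: while incomplete language
# units are buffered, the system predicts the final semantics and pre-fetches
# a reply; a hit on the real effective semantics returns it immediately.

prediction_cache = {}  # predicted semantic info -> predicted reply info

def cache_prediction(predicted_semantics, predicted_reply):
    """Store a reply pre-fetched for the predicted semantic information."""
    prediction_cache[predicted_semantics] = predicted_reply

def reply_for(effective_semantics, fallback):
    """Return the pre-fetched reply when the effective semantics match the
    prediction; otherwise fall back to the normal reply-generation path."""
    return prediction_cache.get(effective_semantics, fallback)
```

The benefit is latency: on a cache hit, the reply is available as soon as the effective semantic information is confirmed, with no extra round trip to the reply-generation backend.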
Optionally, before the obtaining the text information to be recognized, the method further includes:
acquiring voice information input into the intelligent equipment, and performing voice recognition processing on the voice information to obtain text information to be recognized.
Optionally, after determining the valid semantic information of the text information, the method further includes:
acquiring reply information corresponding to the text information according to the effective semantic information;
and controlling the intelligent equipment to output the reply information.
In a second aspect, an embodiment of the present invention provides an information processing apparatus, including:
the acquisition module is used for acquiring text information to be recognized;
the first recognition module is used for sequentially determining a set number of words of the text information as language units, performing semantic recognition processing on the language units, and determining effective semantic information of the text information according to the semantic recognition results of the language units.
Optionally, the semantic recognition result includes: the first recognition module is specifically configured to:
and if the semantic integrity probability scores corresponding to the N continuous language units meet a preset condition, taking the semantic information of the N language units as effective semantic information of the text information, wherein N is greater than or equal to 1.
Optionally, the first identification module is specifically configured to:
for any first language unit among the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit preceding the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit meets a set condition, taking the semantic information of the second language unit as effective semantic information of the text information.
Optionally, the first identification module is specifically configured to:
and if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, determining that the semantic integrity probability score of the second language unit meets a set condition.
Optionally, the first identification module is specifically configured to:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and a third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
wherein the third language unit is the language unit immediately following the first language unit.
Optionally, the first identification module is specifically configured to:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability scores of the language units obtained by splicing the second language unit with the language units preceding the fourth language unit are all less than or equal to the semantic integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
wherein the fourth language unit follows the first language unit, with a preset number of language units between the first language unit and the fourth language unit.
Optionally, the first identification module is further configured to:
and if the semantic integrity probability score of the second language unit meets the set condition, deleting the historical language unit from the cache.
Optionally, the first identification module is further configured to:
and if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit and caching the historical language unit into a cache.
Optionally, the first identification module is further configured to:
acquiring cached prediction semantic information and prediction reply information corresponding to the prediction semantic information, wherein the prediction semantic information is obtained by predicting according to the semantic information of the historical language unit;
and if the effective semantic information is consistent with the predicted semantic information, using the predicted reply information as reply information corresponding to the text information.
Optionally, the apparatus further comprises: a second identification module;
the acquisition module is also used for acquiring voice information input into the intelligent equipment;
and the second recognition module is used for carrying out voice recognition processing on the voice information to obtain text information to be recognized.
Optionally, the first identification module is further configured to:
acquiring reply information corresponding to the text information according to the effective semantic information;
and controlling the intelligent equipment to output the reply information.
In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of any one of the first aspects.
In a fourth aspect, the present invention provides a computer-readable storage medium, in which computer-executable instructions are stored, and when a processor executes the computer-executable instructions, the method according to any one of the first aspect is implemented.
In a fifth aspect, embodiments of the present invention provide a computer program product comprising computer program code which, when run on a computer, causes the computer to perform the method of any of the first aspects above.
In a sixth aspect, an embodiment of the present invention provides a chip, including a memory and a processor, where the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that an electronic device in which the chip is installed performs the method according to any one of the above first aspects.
In the technical scheme provided by the embodiment of the invention, text information to be recognized is acquired, a set number of words of the text information are sequentially determined as language units, semantic recognition processing is performed on the language units, and the effective semantic information of the text information is determined according to the semantic recognition results of the language units. Because neither the voice information nor the text information needs to be pre-segmented before semantic recognition, semantic recognition errors caused by segmentation errors are avoided and the accuracy of semantic recognition is improved; in addition, because semantic recognition processing is performed on each language unit in real time, the real-time performance of semantic recognition is also improved.
Drawings
To illustrate the technical solutions of the embodiments of the present invention or the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a semantic recognition process in the prior art;
fig. 2 is a first schematic flow chart of an information processing method according to an embodiment of the present invention;
fig. 3 is a second schematic flowchart of an information processing method according to an embodiment of the present invention;
FIG. 4 is a first diagram illustrating a semantic recognition process according to an embodiment of the present invention;
FIG. 5 is a second diagram illustrating a semantic recognition process according to an embodiment of the present invention;
fig. 6 is a third schematic flowchart of an information processing method according to an embodiment of the present invention;
FIG. 7 is a first schematic structural diagram of an information processing apparatus according to an embodiment of the present invention;
FIG. 8 is a second schematic structural diagram of an information processing apparatus according to an embodiment of the present invention;
fig. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Fig. 1 is a schematic diagram of a semantic recognition process in the prior art. As shown in fig. 1, when voice information input by a user is processed, a Voice Activity Detection (VAD) technique is first used to determine the start point and end point of each voice segment in the continuous voice information so as to segment it, and speech recognition and semantic understanding are then performed on the segmented voice segments to obtain the user's semantics. Specifically, each voice segment is input into an Automatic Speech Recognition (ASR) model to obtain the corresponding text information, and the text information is then input into a Natural Language Processing (NLP) model to obtain the corresponding semantic information.
However, in practical applications, the speaking speed, speaking habits, and surroundings of different speakers vary, so segmenting sentences by VAD detection is not accurate enough, and the accuracy of semantic recognition is therefore low.
To solve the above problem, an embodiment of the present invention provides an information processing method. In this embodiment, speech recognition is performed on the continuous voice information without segmentation to obtain the text information to be recognized, a set number of words of the text information are sequentially taken as language units for real-time semantic recognition processing, and the effective semantic information of the text information is determined according to the semantic recognition results of the language units. Because neither the voice information nor the text information needs to be pre-segmented, semantic recognition errors caused by segmentation errors are avoided and the accuracy of semantic recognition is improved; in addition, because semantic recognition processing is performed in real time on the language units obtained by speech recognition, the real-time performance of semantic recognition is also improved.
The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 2 is a schematic flowchart of an information processing method according to an embodiment of the present invention. The method may be executed by a server or by a controller of a smart device. The smart device may be any electronic device capable of human-computer interaction with a user, including but not limited to robots, smart speakers, smart home devices, smart wearables, smartphones, and the like.
It should be noted that, for convenience of description, this embodiment and the following embodiments are described by taking execution by the smart device as an example.
As shown in fig. 2, the information processing method may include:
s201: and acquiring text information to be recognized.
The text information to be recognized may be long text information; that is, text information that has not been segmented.
The text information may be input into the smart device by the user. In one possible scenario, the user enters text information directly into the smart device. In another possible scenario, the user inputs voice information into the smart device, and the smart device performs speech recognition on the voice information to obtain the text information.
Based on the second scenario, S201 may specifically include:
acquiring voice information input into the intelligent equipment, and performing voice recognition processing on the voice information to obtain text information to be recognized.
Specifically, the voice information input to the smart device may be collected through a microphone of the smart device, or voice information collected by another device may be received over a network or via Bluetooth. It should be noted that these two possible implementations are merely examples of acquiring the user's voice information; the embodiment of the present invention is not limited thereto.
After the voice information is acquired, voice recognition technology can be adopted to perform voice recognition processing on the voice information in real time to obtain text information. In an alternative embodiment, the speech information is input into an automatic speech recognition ASR model, which outputs the recognized text information.
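The real-time recognition described above can be sketched as follows. This is an illustration only: `fake_asr_stream` is a stand-in for a real streaming ASR engine (it simply yields pre-labeled words), and all names are assumptions rather than an actual ASR API.

```python
# Sketch of un-segmented, streaming speech-to-text: words are consumed as the
# (stubbed) ASR engine emits them, and accumulate into one long text.

def fake_asr_stream(audio_chunks):
    """Pretend each audio chunk decodes to one word; a real engine would run
    acoustic decoding here and emit partial hypotheses incrementally."""
    for chunk in audio_chunks:
        yield chunk["word"]

def recognize_text(audio_chunks):
    """Accumulate the streamed words into the long, un-segmented text
    information to be recognized (step S201)."""
    return " ".join(fake_asr_stream(audio_chunks))
```

Because the stream is consumed word by word, downstream semantic recognition can begin before the user has finished speaking, which is the source of the real-time benefit claimed above.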
This embodiment differs from the prior art in that after the smart device obtains the voice information input by the user, it does not segment the voice information but directly performs speech recognition on it to obtain text information. For example, the recognized text may be a long, unsegmented string containing more than one sentence, such as "the weather is really nice today you see this robot works quite well".
S202: and sequentially determining the set number of vocabularies of the text information as language units, performing semantic recognition processing on the language units, and determining effective semantic information of the text information according to a semantic recognition result of the language units.
In this embodiment, a set number of words sequentially read from text information are referred to as language units. It is understood that the number of words to be processed for semantic recognition at a time may be configured. For example, the set number may be configured to be 1; as another example, the set number may be configured to be 3. In general, the smaller the number of configured setting numbers, the higher the semantic recognition accuracy. However, the smaller the value of the set number to be allocated, the longer the processing time to be consumed.
In the embodiment of the invention, in the process of carrying out voice recognition on voice information, for recognized text information, a set number of words in the text information are sequentially read, and semantic recognition processing is carried out in real time.
In an alternative embodiment, the NLP model is processed using natural language, and a predetermined number of words in the text information are sequentially input to the NLP model to perform semantic recognition processing. Specifically, when the ASR model is used to perform speech recognition on speech information, recognized text information is sequentially input to the NLP model in real time by using a set number of words as language units, and the NLP model performs semantic recognition processing on the set number of words. Because each vocabulary input into the NLP model is input in a running-type real-time manner, the real-time performance of semantic recognition can be improved.
An NLP model can typically process a text segment of a certain length at a time. In one possible implementation, the NLP model performs word segmentation on the input text segment to obtain a keyword sequence, derives context-aware word vectors from the keyword sequence, and inputs the word vectors into a classification model for feature extraction; the classification model then outputs, based on the extracted features, the probability of the semantic category to which the text segment belongs.
Optionally, the classification model in the NLP model may be a deep neural network model.
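As a toy stand-in for the pipeline just described (tokenize, then score), the sketch below replaces the deep neural network with a trivial cue-word rule, so the data flow can be shown end to end. The cue words and score values are purely illustrative assumptions.

```python
# Toy stand-in for the NLP scoring pipeline: tokenize the fragment, then
# emit a pseudo-probability that it is semantically complete. A real system
# would use contextual word vectors and a trained classification model here.

def tokenize(text):
    """Word segmentation step (trivial whitespace split for illustration)."""
    return text.split()

CUE_WORDS = {"weather", "song"}  # hypothetical end-of-thought cue words

def semantic_integrity_score(text):
    """Return 0.9 if the fragment ends with a cue word, else 0.2."""
    tokens = tokenize(text)
    return 0.9 if tokens and tokens[-1] in CUE_WORDS else 0.2
```

The interface is the important part: the model maps a text fragment to a single semantic integrity probability score, which is what the downstream buffering logic consumes.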
For example, assuming the speech information is "how is the weather", the words obtained by sequential recognition are:
"how", "is", "the" and "weather"
Taking a set number of 1 as an example, the words are input into the NLP model one at a time in real time for semantic recognition processing. Specifically, the NLP model performs semantic recognition processing on "how", "how is", "how is the" and "how is the weather" in sequence, and the effective semantic information of the text information is determined according to the semantic recognition results obtained.
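The running prefix construction in this step can be sketched as follows, assuming the set number controls how many newly recognized words are spliced onto the buffered history at a time (function name is illustrative):

```python
# Sketch of splicing newly recognized words onto the buffered history: each
# yielded string is a language-unit prefix handed to the NLP model in turn.

def language_unit_prefixes(words, set_number=1):
    """Yield the successively spliced text fragments sent for recognition."""
    buffer = []
    for i in range(0, len(words), set_number):
        buffer.extend(words[i:i + set_number])
        yield " ".join(buffer)
```

With a set number of 1, every recognized word triggers one semantic recognition pass on the grown prefix; a larger set number trades recognition granularity for fewer model invocations, matching the accuracy/latency trade-off noted above.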
In a possible implementation manner, in the process of performing semantic identification on each language unit of text information, if semantic integrity probability scores corresponding to N consecutive language units satisfy a preset condition, the semantic information of the N language units is used as effective semantic information of the text information, where N is greater than or equal to 1.
In this embodiment, when performing semantic recognition on a set number of words in text information as language units in sequence, if the integrity probability scores corresponding to N consecutive language units satisfy a preset condition, the semantic information of the N language units is used as effective semantic information of the text information. It can be understood that "the semantic integrity probability scores corresponding to the N consecutive language units satisfy the preset condition" means that the semantics corresponding to the N consecutive language units are relatively complete.
It can be understood that when determining whether the semantics are complete according to the semantic integrity probability score of the language unit, various preset conditions can be adopted for determination. The embodiment of the present invention is not particularly limited thereto.
For example, suppose the words in the text information are "how", "is", "the", "weather", "sing", "a", "song". Taking each word as a language unit for semantic recognition, the 1st to 4th language units "how is the weather" express complete semantics, and the 5th to 7th language units "sing a song" express complete semantics; therefore, the semantic information corresponding to these two groups of language units is taken as the effective semantic information of the text information.
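An end-to-end sketch of this buffering scheme, using illustrative English words and a stub scorer (a real system would call the NLP model instead of `stub_score`; names and values are assumptions):

```python
# Words are consumed one by one; the history buffer grows until the integrity
# score crosses the threshold, at which point the buffered span is emitted as
# one piece of effective semantic information and the cache is cleared.

def segment_by_integrity(words, score_fn, threshold=0.8):
    segments, buffer = [], []
    for word in words:
        buffer.append(word)                    # splice new unit onto history
        candidate = " ".join(buffer)
        if score_fn(candidate) >= threshold:   # semantics judged complete
            segments.append(candidate)
            buffer = []                        # delete history from cache
    return segments

# Stub scorer: marks exactly these two spans as semantically complete.
COMPLETE_SPANS = {"how is the weather", "sing a song"}

def stub_score(text):
    return 0.9 if text in COMPLETE_SPANS else 0.1
```

Note that no VAD-style pre-segmentation ever happens: the sentence boundaries fall out of the integrity scores alone, which is the central claim of the method.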
In the information processing method provided by this embodiment, text information to be recognized is acquired, a set number of words of the text information are sequentially determined as language units, semantic recognition processing is performed on the language units, and the effective semantic information of the text information is determined according to the semantic recognition results of the language units. Because neither the voice information nor the text information needs to be pre-segmented before semantic recognition, semantic recognition errors caused by segmentation errors are avoided and the accuracy of semantic recognition is improved; in addition, because semantic recognition processing is performed on each language unit in real time, the real-time performance of semantic recognition is also improved.
Fig. 3 is a flowchart illustrating a second information processing method according to an embodiment of the present invention. This embodiment refines the embodiment shown in fig. 2. As shown in fig. 3, the method of the present embodiment includes:
S301: acquiring voice information input into the intelligent device, and performing voice recognition processing on the voice information to obtain the text information to be recognized.
In this embodiment, the specific implementation of S301 is similar to that of the embodiment shown in fig. 2, and is not described here again.
S302: determining the set number of vocabularies of the text information as language units in sequence, and performing semantic recognition processing on the language units to obtain a semantic recognition result, wherein the semantic recognition result comprises: semantic integrity probability scores and semantic information.
In this embodiment, the semantic recognition result includes: semantic integrity probability scores and semantic information. Specifically, when the NLP model is used for semantic recognition processing, a language unit is input into the NLP model, and the NLP model performs semantic recognition processing on the language unit to output semantic information of the language unit and also output a semantic integrity probability score of the language unit.
The semantic integrity probability score indicates how complete the semantics expressed by a language unit are: the more complete the semantics, the higher the score; the less complete the semantics, the lower the score. For example, the semantic integrity probability score of "今天天气" ("the weather today") is less than that of "今天天气怎么样" ("how is the weather today").
S303: and aiming at any one first language unit in the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit before the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition.
It can be understood that the vocabulary order of the cached historical language units is consistent with the vocabulary order in the original voice information.
In addition, the present embodiment does not specifically limit the cache location of the history language unit. It is understood that the historical language unit may be cached in the cache of the NLP model, or may be cached in a cache outside the NLP model.
S304: and performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit.
It can be understood that the vocabulary order in the spliced second language unit is consistent with the vocabulary order in the original voice information.
S305: if the semantic integrity probability score of the second language unit meets a set condition, taking the semantic information of the second language unit as effective semantic information of the text information, and deleting the historical language unit from a cache; and if the semantic integrity probability score of the second language unit does not meet the set condition, determining the second language unit as the historical language unit and caching the historical language unit into a cache.
Here, the semantic integrity probability score of a language unit failing to meet the set condition means that the score is low, i.e., the semantics expressed by the language unit are incomplete; the score meeting the set condition means that the score is high, i.e., the semantics expressed by the language unit are complete.
The set condition in this embodiment may take various forms and is not particularly limited. In one possible implementation, the set condition is that the semantic integrity probability score is greater than or equal to a preset threshold: when the score of a language unit is greater than or equal to the preset threshold, its semantics are considered complete; when the score is smaller than the preset threshold, its semantics are considered incomplete.
The following description is given by way of example. Long text information is obtained through speech recognition by an ASR model. A set number of vocabularies (denoted language unit 1) is taken from the start of the long text information and input into the NLP model for semantic recognition; because this is the 1st language unit to be recognized and no historical language unit exists in the cache, language unit 1 is input directly to obtain its semantic integrity probability score and semantic information. Two cases follow.
Case 1: if the semantic integrity probability score of language unit 1 is greater than or equal to the preset threshold, the semantics of language unit 1 are complete; therefore, the semantic information of language unit 1 is taken as effective semantic information of the text information. Then a set number of vocabularies (denoted language unit 2) is taken from the current start position of the long text information for semantic recognition, and the recognition process is similar to that of language unit 1.

Case 2: if the semantic integrity probability score of language unit 1 is smaller than the preset threshold, the semantics of language unit 1 are incomplete; therefore, language unit 1 is stored in the cache. In this case, a set number of vocabularies (i.e., language unit 2) is then taken from the current start position of the long text information, the historical language unit (i.e., language unit 1) is obtained from the cache, and language unit 1 and language unit 2 are spliced to obtain a new language unit.

Semantic recognition processing is then performed on the new language unit to obtain its semantic integrity probability score and semantic information. Two further cases arise.

Case 3: if the semantic integrity probability score of the new language unit is greater than or equal to the preset threshold, the semantic information of the new language unit is taken as effective semantic information of the text information. In this case, since the semantic information of language unit 1 is already included in that of the new language unit, language unit 1 is deleted from the cache. Then a set number of vocabularies (denoted language unit 3) is taken from the current start position of the long text information for semantic recognition, and the recognition process is similar to that of language unit 1.

Case 4: if the semantic integrity probability score of the new language unit is smaller than the preset threshold, language unit 2 is stored into the cache as a historical language unit, so that the historical language units comprise language unit 1 and language unit 2. In this case, a set number of vocabularies (i.e., language unit 3) is then taken from the current start position of the long text information, the historical language units (i.e., language unit 1 and language unit 2) are obtained from the cache, and language unit 1, language unit 2, and language unit 3 are spliced to obtain a new language unit. Semantic recognition processing is then performed on the new language unit; the specific process is similar to the above and is not repeated here.
S306: acquiring reply information corresponding to the text information according to the effective semantic information, and controlling the intelligent device to output the reply information.
Specifically, there may be various embodiments for obtaining the reply information corresponding to the text information according to the valid semantic information. In an alternative embodiment, the knowledge base may be queried to obtain the reply information according to the valid semantic information. Wherein, the knowledge base records reply information corresponding to different semantic information.
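One minimal form such a knowledge-base query could take is sketched below; the `KNOWLEDGE_BASE` entries, the intent strings, and the default reply are all invented for illustration and are not part of the patent.

```python
# Hypothetical knowledge base mapping semantic information to reply
# information; entries are invented for illustration.
KNOWLEDGE_BASE = {
    "query_weather(today)": "It is sunny today.",
    "play_song()": "Playing a song for you.",
}

def get_reply(valid_semantic_info, default="Sorry, I didn't catch that."):
    # Query the knowledge base with the recognized effective semantics.
    return KNOWLEDGE_BASE.get(valid_semantic_info, default)
```

In practice the knowledge base would be a service or database rather than an in-memory dictionary; the lookup-by-semantics shape is the same.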
In addition, the reply information output by the intelligent device may be in text form, in multimedia form such as audio, video, or pictures, or in voice form, i.e., TTS (Text To Speech). It can be understood that when the intelligent device outputs the reply information, it may use any one of the foregoing forms or a combination of at least two of them, which is not specifically limited in this embodiment.
It should be noted that, when replying to the text information, this embodiment does not specifically limit the sentence pattern of the text information. Illustratively, the sentence may be a declarative sentence, an interrogative sentence, an exclamatory sentence, or the like. That is, this embodiment replies not only to text information in the interrogative pattern but also to text information in other sentence patterns.
In this embodiment, the voice information does not need to be pre-segmented before semantic recognition, and the text information obtained from voice recognition does not need to be pre-segmented either, which avoids semantic recognition errors caused by segmentation errors and improves the accuracy of semantic recognition. In addition, because a set number of vocabularies in the text information obtained by voice recognition are used as language units for semantic recognition processing in real time, the real-time performance of semantic recognition is improved. Further, because the semantic information of N consecutive language units is used as the effective semantic information of the text information only when their semantic integrity probability scores satisfy the preset condition, the accuracy of semantic recognition is further improved.
The semantic recognition process of this embodiment is described below by taking fig. 4 as an example. Fig. 4 is a first schematic diagram of a semantic recognition process according to an embodiment of the present invention. As shown in fig. 4, assume that the vocabularies in the text information obtained by performing speech recognition on the voice information are the successive characters:

"今", "天", "天", "气", "怎", "么", "样" (i.e., "今天天气怎么样", "how is the weather today")

Each character is taken as a language unit and input into the NLP model sequentially in real time. After the 1st language unit "今" is input into the NLP model, the NLP model calculates and outputs the semantic information (not shown) and the semantic integrity probability score corresponding to "今". As shown in fig. 4, the semantic integrity probability score of "今" is 0.01; since this is below the preset threshold (assumed to be 0.95), the 1st language unit "今" is cached.

For the 2nd language unit to be recognized, the historical language unit "今" is taken out of the cache and spliced with the 2nd language unit to obtain the language unit "今天"; after "今天" is input into the NLP model, the semantic integrity probability score output by the NLP model is 0.1. Since this is still below the preset threshold, the 2nd language unit "天" is also cached.

For the 3rd language unit to be recognized, the historical language units "今" and "天" are first taken out of the cache and spliced with the 3rd language unit to obtain the language unit "今天天"; after "今天天" is input into the NLP model, the semantic integrity probability score output by the NLP model is 0.2. Since the score is still low, the 3rd language unit "天" is also cached.

By analogy, as shown in fig. 4, the semantic integrity probability score corresponding to "今天天气" ("the weather today") is 0.75, that corresponding to "今天天气怎" is 0.8, that corresponding to "今天天气怎么" is 0.9, and that corresponding to "今天天气怎么样" ("how is the weather today") is 0.95.
It can be understood that, in a specific application, a suitable preset threshold may be set: when the semantic integrity probability score is smaller than the preset threshold, the current language unit is cached as context information for subsequent language units; when the semantic integrity probability score is greater than or equal to the preset threshold, the semantics are complete and the current language unit does not need to be cached. Further, the currently recognized semantic information can then be used as the effective semantic information of the text information.
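This cache-and-splice loop can be replayed as a small sketch. Here `nlp_score` is a hypothetical table lookup standing in for the NLP model, assuming the input characters are those of "今天天气怎么样" ("how is the weather today") and using the per-prefix scores quoted in the fig. 4 example.

```python
# Semantic integrity probability scores from the fig. 4 walkthrough; the
# NLP model itself is replaced by a table lookup for illustration.
FIG4_SCORES = {"今": 0.01, "今天": 0.1, "今天天": 0.2, "今天天气": 0.75,
               "今天天气怎": 0.8, "今天天气怎么": 0.9, "今天天气怎么样": 0.95}

def nlp_score(unit):
    return FIG4_SCORES.get(unit, 0.0)

def recognize_stream(text, threshold=0.95):
    history, valid = "", []   # cache of historical language units
    for ch in text:           # each character is one language unit
        spliced = history + ch            # splice history with the unit
        if nlp_score(spliced) >= threshold:
            valid.append(spliced)         # complete: emit and clear cache
            history = ""
        else:
            history = spliced             # incomplete: cache as context
    return valid
```

Running `recognize_stream("今天天气怎么样")` caches every prefix until the full sentence reaches the 0.95 threshold, then emits it as one semantically complete segment.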
In the above example, each character serves as one language unit, so a semantic recognition pass is performed for every character, which is computationally expensive. In an alternative embodiment, to save computing resources, multiple characters may be used as one language unit, for example two or three characters per language unit.
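Grouping the text into language units of a set number of characters is straightforward; a small helper sketch (the function name is hypothetical):

```python
def make_language_units(text, set_number=2):
    # Split the text into successive language units of `set_number`
    # characters each; the final unit may be shorter.
    return [text[i:i + set_number] for i in range(0, len(text), set_number)]

print(make_language_units("今天天气怎么样", 2))
# -> ['今天', '天气', '怎么', '样']
```

With `set_number=2` the seven-character example needs only four recognition passes instead of seven.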
It should be noted that, in the embodiment and the following embodiments, the setting of the semantic integrity probability score and the preset threshold of each language unit is only an example, and the invention is not limited thereto.
In the embodiments shown in fig. 3 and fig. 4, when semantic recognition processing is performed sequentially on language units of a set number of vocabularies, as soon as the semantic integrity probability score of N consecutive language units is greater than or equal to the preset threshold, the semantics of those N language units are considered complete, and the corresponding semantic information is used as the effective semantic information of the text information. In practical applications, to further improve the accuracy of semantic recognition, when the semantic integrity probability score of the N language units is detected to be greater than or equal to the preset threshold, one or more subsequent language units can additionally be examined to judge their contribution to the semantic integrity. This is described in detail below with reference to two alternative embodiments.
In one possible embodiment, it may be determined that the semantic integrity probability score of the second language unit satisfies the set condition according to the following steps:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, and the semantic integrity probability score of the second language unit is greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit and a third language unit, determining that the semantic integrity probability score of the second language unit meets a set condition;
wherein the third language unit is a language unit subsequent to and adjacent to the first language unit.
In this embodiment, when the semantic integrity probability score of the second language unit is detected to be greater than or equal to the preset threshold, the score of the language unit obtained by splicing the second language unit with the third language unit is additionally checked; if that score decreases, it indicates that the semantics of the second language unit are complete and that the third language unit begins to express new semantics. Therefore, the semantic information of the second language unit is used as the effective semantic information of the text information.
Fig. 5 is a schematic diagram of a semantic recognition process according to an embodiment of the present invention. As shown in fig. 5, assume that the vocabularies sequentially input into the NLP model are the characters:

"天", "气", "怎", "么", "样", "效", "果", "不", "错", "吧" (i.e., "天气怎么样" "how is the weather" followed by "效果不错吧" "the effect is good, right"); that is, the set number is 1.

Assume the preset threshold is 0.8. With reference to fig. 5, the semantic integrity probability score corresponding to "天" is 0.1, that corresponding to "天气" ("the weather") is 0.3, that corresponding to "天气怎" is 0.7, that corresponding to "天气怎么" is 0.75, and that corresponding to "天气怎么样" ("how is the weather") is 0.95.

In this embodiment, when the semantic integrity probability score of the first 5 characters exceeds the preset threshold, one more character is examined. Splicing the 6th character yields "天气怎么样效", whose semantic integrity probability score is 0.81. That is, splicing the 6th character onto the first 5 characters causes the semantic integrity probability score to decrease; therefore, the first 5 characters are treated as a text segment with complete semantics, and the semantic information corresponding to the first 5 characters is used as the effective semantic information of the text information.
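The one-unit-lookahead condition of this embodiment can be written as a small predicate. The score table below follows the fig. 5 example values and is a hypothetical stand-in for the NLP model; any prefix not in the table scores 0 for simplicity.

```python
# Semantic integrity probability scores following the fig. 5 example; the
# NLP model is replaced by a table lookup for illustration.
FIG5_SCORES = {"天": 0.1, "天气": 0.3, "天气怎": 0.7, "天气怎么": 0.75,
               "天气怎么样": 0.95, "天气怎么样效": 0.81}

def nlp_score(unit):
    return FIG5_SCORES.get(unit, 0.0)

def is_complete(second, third, threshold=0.8):
    # Set condition with a one-unit lookahead: the second language unit's
    # score meets the threshold AND splicing the next (third) unit does
    # not raise the score any further.
    s2 = nlp_score(second)
    return s2 >= threshold and s2 >= nlp_score(second + third)
```

Under these scores, "天气怎么样" (0.95) followed by "效" (spliced score 0.81) is judged complete, while "天气怎么" (0.75) is not.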
In another alternative embodiment, it may be determined that the semantic integrity probability score of the second language unit satisfies a set condition according to the following steps:
and if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold, and the semantic integrity probability scores of the language units obtained by splicing the second language unit with each language unit up to the fourth language unit are all less than or equal to the semantic integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit meets a set condition.
The fourth language unit is located behind the first language unit, and a preset number of language units are arranged between the fourth language unit and the first language unit.
This embodiment does not limit the number of language units between the fourth language unit and the first language unit; they may be separated by two, three, or more language units. In a specific implementation, the number of intervening language units can be determined according to a preset time threshold. For example, each time the semantic integrity probability score of the second language unit is detected to be greater than or equal to the preset threshold, the N language units arriving within a preset time length are additionally examined; if the semantic integrity probability score does not improve within that time, the semantic information of the second language unit is used as the effective semantic information of the text information.
For example, in conjunction with fig. 5, assume that the language units sequentially input into the NLP model are the characters "天", "气", "怎", "么", "样", "效", "果", "不", "错", "吧"; that is, the set number is 1. The semantic integrity probability score corresponding to the first 5 characters "天气怎么样" ("how is the weather") is 0.95, which is greater than the preset threshold of 0.8.

Assuming N is 3, the subsequent 3 characters "效", "果", and "不" need to be examined. With reference to fig. 5, on the basis of the first 5 characters, the semantic integrity probability score corresponding to "天气怎么样效" obtained after splicing the 6th character is 0.81, that corresponding to "天气怎么样效果" after adding the 7th character is 0.8, and that corresponding to "天气怎么样效果不" after further splicing the 8th character is 0.7. Splicing the 3 subsequent characters therefore never raises the semantic integrity probability score above that of the first 5 characters. Accordingly, the first 5 characters are treated as a text segment with complete semantics, and their semantic information is used as the effective semantic information of the text information. The characters starting from "效" are re-recognized as the next sentence.
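The N-unit lookahead generalizes the one-unit check; a sketch with the fig. 5 score values (a hypothetical table lookup standing in for the NLP model, with unlisted prefixes scoring 0):

```python
# Fig. 5 example scores for the N-unit lookahead check (table lookup
# standing in for the NLP model, for illustration only).
FIG5_SCORES = {"天气怎么样": 0.95, "天气怎么样效": 0.81,
               "天气怎么样效果": 0.8, "天气怎么样效果不": 0.7}

def nlp_score(unit):
    return FIG5_SCORES.get(unit, 0.0)

def is_complete_n(second, following, n=3, threshold=0.8):
    # The second language unit's score meets the threshold, and splicing
    # each of the next n units in turn never raises the score above it.
    s2 = nlp_score(second)
    if s2 < threshold:
        return False
    spliced = second
    for unit in following[:n]:
        spliced += unit
        if nlp_score(spliced) > s2:
            return False  # a later splice improved the score
    return True
```

With these values, "天气怎么样" survives the 3-unit lookahead over "效", "果", "不" and is accepted as complete.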
Compared with the preceding embodiment, this embodiment continues to examine a plurality of language units after a candidate language unit is identified, thereby reducing misjudgments and improving the accuracy of semantic recognition.
In the embodiment shown in fig. 3, after the effective semantic information of the text information is determined, reply information is acquired from the effective semantic information in step S306, and the intelligent device is controlled to output the reply information. That is, in the embodiment shown in fig. 3, the reply information is acquired only after relatively complete semantics have been recognized. In practical applications, sometimes only a few vocabularies need to be recognized before the complete semantic information can be predicted. For example, assume the voice information input by the user is "北京天气怎么样" ("what is the weather in Beijing"); in the actual recognition process, once the semantic information corresponding to the first 4 characters "北京天气" ("Beijing weather") is recognized, it can already be predicted that the user intends to inquire about the weather condition of Beijing. Therefore, the embodiment of the present invention further provides a scheme for acquiring reply information in advance according to predicted semantic information. This is described below in conjunction with fig. 6.
Fig. 6 is a third schematic flowchart of an information processing method according to an embodiment of the present invention. This embodiment is a further refinement of S306 in the embodiment shown in fig. 3. As shown in fig. 6, S306 may specifically include:
S3061: acquiring cached predicted semantic information and predicted reply information corresponding to the predicted semantic information, wherein the predicted semantic information is obtained by prediction according to the semantic information of the historical language units.
S3062: and if the effective semantic information is consistent with the predicted semantic information, using the predicted reply information as reply information corresponding to the text information.
Specifically, in this embodiment, when semantic recognition is performed on each language unit, on the one hand, when the semantic integrity probability score of the language unit does not meet the set condition (that is, its semantics are incomplete), the language unit is cached as a historical language unit; on the other hand, complete semantic information is predicted from the semantic information of the language unit to obtain the predicted semantic information. Further, the predicted reply information corresponding to the predicted semantic information is acquired, and the predicted semantic information and the predicted reply information are cached.
It should be noted that there may be multiple ways to predict complete semantic information from the incomplete semantic information of a language unit; this embodiment does not specifically limit the prediction method, and an existing semantic prediction method may be adopted.
In a possible implementation manner, after a semantic recognition result is obtained for each language unit, as long as the semantics of the language unit are incomplete, prediction is performed according to the incomplete semantic information to obtain complete predicted semantic information.
In another possible implementation, for each language unit, when the semantics of the language unit are incomplete, whether the semantic integrity probability score of the language unit is greater than or equal to a prediction threshold is further determined, and when the semantic integrity probability score of the language unit is greater than or equal to the prediction threshold, semantic prediction is performed, so that the computing resources can be saved. It will be appreciated that the predicted threshold is less than the preset threshold described above.
For example, assume the voice information input by the user is "北京天气怎么样" ("what is the weather in Beijing") and the prediction threshold is 0.5. In the process of sequentially inputting each character as a language unit into the NLP model for recognition, the semantic integrity probability score obtained for the 1st character "北" is 0.01, that for the first 2 characters "北京" ("Beijing") is 0.1, and that for the first 3 characters "北京天" is 0.2. Because the semantic integrity probability scores corresponding to these language units are all smaller than the prediction threshold of 0.5, the semantics are very incomplete, and any predicted semantic information obtained at this stage would have low accuracy. Therefore, no semantic prediction needs to be performed after recognizing these 3 characters.
The semantic integrity probability score obtained for the first 4 characters "北京天气" ("Beijing weather") is 0.6. This score is greater than the prediction threshold, indicating that the semantics of the first 4 characters are relatively complete, and from "北京天气" it can be predicted that the user intends to inquire about the weather in Beijing. Therefore, in this embodiment, after the first 4 characters are recognized, prediction is performed according to their semantic information to obtain predicted semantic information with complete semantics. Beijing weather information is acquired in advance as the predicted reply information, and the predicted semantic information and the predicted reply information are cached.
Further, after the text information "北京天气怎么样" is recognized and the effective semantic information is obtained, since one or more pieces of predicted semantic information and predicted reply information have been cached, the predicted reply information corresponding to the predicted semantic information that is consistent with the effective semantic information can be used as the reply information for the effective semantic information, and the intelligent device is controlled to output it.
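The predict-and-prefetch flow of S3061 and S3062 could be sketched as below. The predictor, the intent string, and the reply text are all invented stand-ins for the semantic prediction method and knowledge-base query, which the patent does not fix.

```python
# Hypothetical predictor and reply source; names, intents, and the reply
# text are invented for illustration.
def predict_semantics(partial_text):
    # Predict complete semantics from an incomplete prefix.
    if partial_text.startswith("北京天气"):
        return "query_weather(Beijing)"
    return None

def fetch_reply(semantics):
    return "Beijing: sunny."  # stand-in for a real knowledge-base query

prediction_cache = {}

def on_partial_unit(partial_text, score, predict_threshold=0.5):
    # While semantics are still incomplete, predict the complete semantics
    # once the score clears the prediction threshold, and prefetch a reply.
    if score >= predict_threshold:
        predicted = predict_semantics(partial_text)
        if predicted is not None:
            prediction_cache[predicted] = fetch_reply(predicted)

def on_valid_semantics(valid):
    # Reuse the prefetched reply when the recognized semantics match a
    # cached prediction; otherwise query normally.
    return prediction_cache.get(valid) or fetch_reply(valid)
```

In this sketch, "北京天气" at score 0.6 triggers the prefetch, so the later complete recognition is answered from the cache without a second query.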
In this embodiment, when incomplete semantic information is recognized from part of the language units, complete semantics are predicted from the incomplete semantic information, and the predicted reply information is acquired and cached in advance; when the complete semantic information is subsequently recognized, the corresponding reply information only needs to be fetched from the cache, which improves the real-time performance of semantic recognition.
Fig. 7 is a first schematic structural diagram of an information processing apparatus according to an embodiment of the present invention. The information processing apparatus of this embodiment may be in the form of software and/or hardware, and the apparatus may be specifically configured in a server, or in an intelligent device.
As shown in fig. 7, the information processing apparatus 700 of the present embodiment includes: an acquisition module 701 and a first identification module 702.
The acquisition module 701 is configured to acquire text information to be recognized;
the first identification module 702 is configured to sequentially determine a set number of vocabularies of the text information as language units, perform semantic identification processing on the language units, and determine effective semantic information of the text information according to a semantic identification result of the language units.
The apparatus of this embodiment may be used to implement the method embodiment shown in fig. 2, and the implementation principle and technical effect are similar, which are not described herein again.
Fig. 8 is a second schematic structural diagram of an information processing apparatus according to an embodiment of the present invention. On the basis of the embodiment shown in fig. 7, the information processing apparatus 700 of this embodiment may further include a second identification module 703.
Optionally, the semantic recognition result includes: the semantic integrity probability score and the semantic information, the first identifying module 702 is specifically configured to:
and if the semantic integrity probability scores corresponding to the N continuous language units meet a preset condition, taking the semantic information of the N language units as effective semantic information of the text information, wherein N is greater than or equal to 1.
Optionally, the first identifying module 702 is specifically configured to:
aiming at any first language unit in the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit before the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not meet a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit and the first language unit to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit meets a set condition, taking the semantic information of the second language unit as effective semantic information of the text information.
Optionally, the first identifying module 702 is specifically configured to:
and if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold value, determining that the semantic integrity probability score of the second language unit meets a set condition.
Optionally, the first identifying module 702 is specifically configured to:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold, and is also greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit with a third language unit, determine that the semantic integrity probability score of the second language unit satisfies the set condition;
where the third language unit is the language unit immediately following the first language unit.
Optionally, the first identifying module 702 is specifically configured to:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold, and the semantic integrity probability scores of the language units obtained by splicing the second language unit with each language unit up to the fourth language unit are all less than or equal to the semantic integrity probability score of the second language unit, determine that the semantic integrity probability score of the second language unit satisfies the set condition;
where the fourth language unit is located after the first language unit, with a preset number of language units between the fourth language unit and the first language unit.
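The three variants of the set condition described above (threshold only, one-unit lookahead, and a lookahead window) can be sketched as simple predicates. Function names and the default threshold are hypothetical:

```python
def condition_threshold(score, threshold=0.8):
    # Variant 1: the score alone reaches the preset threshold.
    return score >= threshold

def condition_next_unit(score, next_spliced_score, threshold=0.8):
    # Variant 2: the score reaches the threshold AND does not improve when
    # the immediately following (third) language unit is spliced on.
    return score >= threshold and score >= next_spliced_score

def condition_window(score, window_scores, threshold=0.8):
    # Variant 3: the score reaches the threshold AND none of the splices up
    # to the fourth language unit (a preset lookahead window) improves on it.
    return score >= threshold and all(s <= score for s in window_scores)

assert condition_threshold(0.9)
assert not condition_next_unit(0.9, 0.95)   # one more unit scores higher
assert condition_window(0.9, [0.85, 0.7, 0.6])
```

The lookahead variants trade latency for precision: waiting for one or more further language units avoids cutting off an utterance whose completeness score would still rise.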
Optionally, the first identifying module 702 is further configured to:
if the semantic integrity probability score of the second language unit satisfies the set condition, delete the historical language unit from the cache.
Optionally, the first identifying module 702 is further configured to:
if the semantic integrity probability score of the second language unit does not satisfy the set condition, determine the second language unit as the historical language unit and store it in the cache.
Optionally, the first identifying module 702 is further configured to:
obtain cached predicted semantic information and predicted reply information corresponding to the predicted semantic information, where the predicted semantic information is predicted from the semantic information of the historical language unit;
and if the effective semantic information is consistent with the predicted semantic information, use the predicted reply information as the reply information corresponding to the text information.
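The predicted-reply shortcut above amounts to a cache lookup: while input is still arriving, the system predicts the final semantic information from the historical language unit and precomputes a reply for it. A minimal sketch, with hypothetical names and without the prediction step itself:

```python
# Hypothetical cache mapping predicted semantic information to a precomputed
# reply, populated while the historical language unit is still incomplete.
prediction_cache = {
    "play music": "Starting playback.",
}

def get_reply(effective_semantics, fallback):
    """Return the precomputed reply when the effective semantic information
    matches the cached prediction; otherwise fall back to normal lookup."""
    if effective_semantics in prediction_cache:
        return prediction_cache[effective_semantics]
    return fallback(effective_semantics)

assert get_reply("play music", lambda s: "computed") == "Starting playback."
assert get_reply("stop music", lambda s: "computed") == "computed"
```

When the prediction hits, the reply is available as soon as the effective semantic information is confirmed, shaving the reply-generation latency off the interaction.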
Optionally, the obtaining module 701 is further configured to obtain voice information input to the intelligent device;
the second recognition module 703 is configured to perform speech recognition processing on the speech information to obtain text information to be recognized.
Optionally, the first identifying module 702 is further configured to:
obtain reply information corresponding to the text information according to the effective semantic information;
and control the intelligent device to output the reply information.
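Put together, the optional speech front end and reply output described above form a pipeline along these lines. Every function here is a hypothetical placeholder for the corresponding module of this embodiment, not an actual implementation:

```python
def speech_to_text(voice_info):
    # Placeholder for the speech recognition processing of module 703.
    return voice_info["transcript"]

def extract_effective_semantics(text):
    # Placeholder for the incremental semantic recognition of module 702.
    return {"intent": "query_weather"} if "weather" in text else None

def reply_for(semantics):
    # Placeholder lookup of reply information from effective semantics.
    if semantics and semantics["intent"] == "query_weather":
        return "Sunny today."
    return None

def handle(voice_info):
    """Voice in, reply out: the flow the intelligent device follows."""
    text = speech_to_text(voice_info)
    semantics = extract_effective_semantics(text)
    return reply_for(semantics)

assert handle({"transcript": "what is the weather"}) == "Sunny today."
assert handle({"transcript": "hello"}) is None
```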
The information processing apparatus provided in this embodiment of the present invention may be configured to execute the technical solution of any of the foregoing method embodiments; the implementation principles and technical effects are similar and are not described here again.
Fig. 9 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention, where the electronic device may be a controller of an intelligent device or a server, and this is not particularly limited in the embodiment of the present invention. As shown in fig. 9, the electronic device 900 of the present embodiment includes: at least one processor 901 and memory 902. The processor 901 and the memory 902 are connected via a bus 903.
In a specific implementation process, the at least one processor 901 executes the computer-executable instructions stored in the memory 902, so that the at least one processor 901 performs the technical solution of any one of the foregoing method embodiments.
For the specific implementation process of the processor 901, reference may be made to the foregoing method embodiments; the implementation principles and technical effects are similar and are not described here again.
In the embodiment shown in fig. 9, it should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor for execution, or performed by a combination of hardware and software modules within the processor.
The memory may include high-speed RAM, and may also include non-volatile memory (NVM), such as at least one magnetic disk memory.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus in the figures of the present application is represented by a single line, but this does not mean that there is only one bus or only one type of bus.
An embodiment of the present invention further provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the technical solution of any one of the foregoing method embodiments is implemented.
The computer-readable storage medium may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. Readable storage media can be any available media that can be accessed by a general purpose or special purpose computer.
An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an application-specific integrated circuit (ASIC). Of course, the processor and the readable storage medium may also reside as discrete components in the apparatus.
An embodiment of the present invention further provides a computer program product, where the computer program product includes a computer program code, and when the computer program code runs on a computer, the computer is caused to execute the technical solution in any of the above method embodiments.
The embodiment of the present invention further provides a chip, which includes a memory and a processor, where the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that an electronic device installed with the chip executes the technical solutions of any of the above method embodiments.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. An information processing method characterized by comprising:
acquiring text information to be recognized;
sequentially determining a set number of vocabularies of the text information as language units, performing semantic recognition processing on the language units, and determining effective semantic information of the text information according to a semantic recognition result of the language units.
2. The method of claim 1, wherein the semantic recognition result comprises: a semantic integrity probability score and semantic information, and determining the effective semantic information of the text information according to the semantic recognition result of the language units comprises:
if the semantic integrity probability scores corresponding to N consecutive language units satisfy a preset condition, taking the semantic information of the N language units as effective semantic information of the text information, wherein N is greater than or equal to 1.
3. The method according to claim 2, wherein taking, if the semantic integrity probability scores corresponding to the N consecutive language units satisfy the preset condition, the semantic information of the N language units as effective semantic information of the text information comprises:
for any first language unit among the language units, obtaining a cached historical language unit, wherein the historical language unit comprises at least one language unit preceding the first language unit, and the semantic integrity probability score corresponding to the historical language unit does not satisfy a set condition;
performing semantic recognition processing on a second language unit obtained by splicing the historical language unit with the first language unit, to obtain a semantic recognition result of the second language unit;
and if the semantic integrity probability score of the second language unit satisfies the set condition, taking the semantic information of the second language unit as effective semantic information of the text information.
4. The method of claim 3, wherein the semantic integrity probability score of the second language unit is determined to satisfy the set condition as follows:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold, determining that the semantic integrity probability score of the second language unit satisfies the set condition.
5. The method of claim 3, wherein the semantic integrity probability score of the second language unit is determined to satisfy the set condition as follows:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold, and is also greater than or equal to the semantic integrity probability score of the language unit obtained by splicing the second language unit with a third language unit, determining that the semantic integrity probability score of the second language unit satisfies the set condition;
wherein the third language unit is the language unit immediately following the first language unit.
6. The method of claim 3, wherein the semantic integrity probability score of the second language unit is determined to satisfy the set condition as follows:
if the semantic integrity probability score of the second language unit is greater than or equal to a preset threshold, and the semantic integrity probability scores of the language units obtained by splicing the second language unit with each language unit up to a fourth language unit are all less than or equal to the semantic integrity probability score of the second language unit, determining that the semantic integrity probability score of the second language unit satisfies the set condition;
wherein the fourth language unit is located after the first language unit, with a preset number of language units between the fourth language unit and the first language unit.
7. The method of claim 3, further comprising:
if the semantic integrity probability score of the second language unit satisfies the set condition, deleting the historical language unit from the cache.
8. An information processing apparatus characterized by comprising:
the acquisition module is used for acquiring text information to be recognized;
the first recognition module is used for sequentially determining a set number of vocabularies of the text information as language units, performing semantic recognition processing on the language units, and determining effective semantic information of the text information according to a semantic recognition result of the language units.
9. An electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of any of claims 1-7.
10. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910270744.8A CN111797631B (en) | 2019-04-04 | 2019-04-04 | Information processing method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111797631A true CN111797631A (en) | 2020-10-20 |
CN111797631B CN111797631B (en) | 2024-06-21 |
Family
ID=72804830
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910270744.8A Active CN111797631B (en) | 2019-04-04 | 2019-04-04 | Information processing method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111797631B (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2006222742A1 (en) * | 2005-09-28 | 2007-04-19 | Marie Louise Fellbaum Korpi | Computational linguistic analysis |
CN101034390A (en) * | 2006-03-10 | 2007-09-12 | 日电(中国)有限公司 | Apparatus and method for verbal model switching and self-adapting |
CN101124537A (en) * | 2004-11-12 | 2008-02-13 | 马克森斯公司 | Techniques for knowledge discovery by constructing knowledge correlations using terms |
CN102236639A (en) * | 2010-04-28 | 2011-11-09 | 三星电子株式会社 | System and method for updating language model |
CN102243625A (en) * | 2011-07-19 | 2011-11-16 | 北京航空航天大学 | N-gram-based semantic mining method for increment of topic model |
CN102609407A (en) * | 2012-02-16 | 2012-07-25 | 复旦大学 | Fine-grained semantic detection method of harmful text contents in network |
CN102662987A (en) * | 2012-03-14 | 2012-09-12 | 华侨大学 | Classification method of web text semantic based on Baidu Baike |
CN105183712A (en) * | 2015-08-27 | 2015-12-23 | 北京时代焦点国际教育咨询有限责任公司 | Method and apparatus for scoring English composition |
CN107092675A (en) * | 2017-04-12 | 2017-08-25 | 新疆大学 | A kind of Uighur semanteme string abstracting method based on statistics and shallow-layer language analysis |
CN107506360A (en) * | 2016-06-14 | 2017-12-22 | 科大讯飞股份有限公司 | A kind of essay grade method and system |
CN108197105A (en) * | 2017-12-28 | 2018-06-22 | 广东欧珀移动通信有限公司 | Natural language processing method, apparatus, storage medium and electronic equipment |
CN108417210A (en) * | 2018-01-10 | 2018-08-17 | 苏州思必驰信息科技有限公司 | A kind of word insertion language model training method, words recognition method and system |
CN108829894A (en) * | 2018-06-29 | 2018-11-16 | 北京百度网讯科技有限公司 | Spoken word identification and method for recognizing semantics and its device |
CN109325242A (en) * | 2018-09-19 | 2019-02-12 | 苏州大学 | It is word-based to judge method, device and equipment that whether sentence be aligned to translation |
CN109473093A (en) * | 2018-12-13 | 2019-03-15 | 平安科技(深圳)有限公司 | Audio recognition method, device, computer equipment and storage medium |
Non-Patent Citations (3)
Title |
---|
ALON TALMOR et al.: "Evaluating Semantic Parsing against a Simple Web-based Question Answering Model", arXiv, pages 1-7 *
WANG Rongbo et al.: "Automatic summarization method based on sentence groups", Journal of Computer Applications, pages 58-62 *
MIAO Decheng: "A formal language model based on category-theoretic methods", Journal of Shaoguan University, pages 9-12 *
Also Published As
Publication number | Publication date |
---|---|
CN111797631B (en) | 2024-06-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111797632B (en) | Information processing method and device and electronic equipment | |
US10937448B2 (en) | Voice activity detection method and apparatus | |
CN105931644B (en) | A kind of audio recognition method and mobile terminal | |
US11817094B2 (en) | Automatic speech recognition with filler model processing | |
CN110444198B (en) | Retrieval method, retrieval device, computer equipment and storage medium | |
CN111880856B (en) | Voice wakeup method and device, electronic equipment and storage medium | |
CN108735201B (en) | Continuous speech recognition method, device, equipment and storage medium | |
CN111710337B (en) | Voice data processing method and device, computer readable medium and electronic equipment | |
JP2020004382A (en) | Method and device for voice interaction | |
CN112151015A (en) | Keyword detection method and device, electronic equipment and storage medium | |
CN112017643B (en) | Speech recognition model training method, speech recognition method and related device | |
CN113053390B (en) | Text processing method and device based on voice recognition, electronic equipment and medium | |
CN112071310B (en) | Speech recognition method and device, electronic equipment and storage medium | |
CN112669842A (en) | Man-machine conversation control method, device, computer equipment and storage medium | |
CN114999463B (en) | Voice recognition method, device, equipment and medium | |
CN115132209A (en) | Speech recognition method, apparatus, device and medium | |
CN110956958A (en) | Searching method, searching device, terminal equipment and storage medium | |
CN114420102B (en) | Method and device for speech sentence-breaking, electronic equipment and storage medium | |
CN113611316A (en) | Man-machine interaction method, device, equipment and storage medium | |
CN111862963B (en) | Voice wakeup method, device and equipment | |
CN111128174A (en) | Voice information processing method, device, equipment and medium | |
CN115858776B (en) | Variant text classification recognition method, system, storage medium and electronic equipment | |
CN114399992B (en) | Voice instruction response method, device and storage medium | |
CN111785259A (en) | Information processing method and device and electronic equipment | |
CN111797631B (en) | Information processing method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||