CN111785259A - Information processing method and device and electronic equipment - Google Patents

Publication number
CN111785259A
Authority
CN
China
Prior art keywords
text
information
semantic
fragment
segment
Prior art date
Legal status
Pending
Application number
CN201910270747.1A
Other languages
Chinese (zh)
Inventor
韩伟
Current Assignee
Beijing Orion Star Technology Co Ltd
Original Assignee
Beijing Orion Star Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Orion Star Technology Co Ltd
Priority to CN201910270747.1A
Publication of CN111785259A

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/04: Segmentation; Word boundary detection
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems

Abstract

The embodiments of the invention provide an information processing method, an information processing apparatus and an electronic device. The method comprises: acquiring text information to be recognized; adding punctuation marks to the text information so as to divide it into at least one text segment; and acquiring effective semantic information of the text information according to a semantic recognition result of the at least one text segment. Thus, when semantic recognition is performed on long text information, punctuation marks are first added to segment the text, and semantic recognition is then performed on the resulting text segments. Because segmenting the text by punctuation marks takes natural language understanding into account, the segmentation is more accurate; and because the effective semantic information is determined from the semantic recognition results of the segmented text, the accuracy of semantic recognition can be improved.

Description

Information processing method and device and electronic equipment
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to an information processing method and device and electronic equipment.
Background
With the development of human-computer interaction technology, the semantic recognition technology shows its importance. Semantic recognition is a process of extracting feature information from a voice signal emitted by a human and determining the language meaning thereof, and mainly includes a voice recognition process and a semantic understanding process. The speech recognition process is a process of converting a human speech signal into text using an acoustic model, and the semantic understanding process is a process of recognizing the meaning of text using a natural language model.
In the prior art, when a voice signal input by a user is processed, a Voice Activity Detection (VAD) technology is first used to determine a start point and an end point of each voice segment in a continuous voice signal, so that the continuous voice signal is segmented into a plurality of voice segments, and then voice recognition and semantic understanding are performed on the segmented voice segments to obtain the semantics of the user.
However, in practice, users differ in speaking rate and speaking habits, and speakers may be in different environments. As a result, segmenting the speech signal by VAD does not split sentences accurately enough, and the accuracy of semantic recognition is low.
Disclosure of Invention
The embodiment of the invention provides an information processing method, an information processing device and electronic equipment, which are used for improving the accuracy of semantic recognition.
In a first aspect, an embodiment of the present invention provides an information processing method, including:
acquiring text information to be identified;
adding punctuation marks to the text information, and dividing the text information into at least one text segment;
and acquiring effective semantic information of the text information according to the semantic recognition result of the at least one text fragment.
Optionally, the semantic recognition result includes: semantic integrity probability scores and semantic information; the obtaining of the effective semantic information of the text information according to the semantic recognition result of the at least one text fragment includes:
and taking the semantic information of the text segment with the semantic integrity probability score meeting the preset condition as the effective semantic information of the text information.
Optionally, the step of taking the semantic information of the text segment with the semantic integrity probability score meeting the preset condition as the effective semantic information of the text information includes:
for each text fragment in the at least one text fragment, if the semantic integrity probability score of the text fragment is greater than or equal to a preset threshold value, taking the semantic information of the text fragment as effective semantic information of the text information; or
for the at least one text fragment, taking the semantic information of the text fragment with the highest semantic integrity probability score as the effective semantic information of the text information.
Optionally, the step of taking the semantic information of the text segment with the semantic integrity probability score meeting the preset condition as the effective semantic information of the text information includes:
for any text fragment in the at least one text fragment, obtaining a cached historical text fragment, wherein the historical text fragment is at least one text fragment that precedes the text fragment and whose semantic integrity probability score does not meet the preset condition;
splicing the historical text fragment and the text fragment to obtain a new text fragment, and performing semantic recognition processing on the new text fragment to obtain a semantic recognition result of the new text fragment;
and if the semantic integrity probability score of the new text fragment is greater than or equal to a preset threshold value, taking the semantic information of the new text fragment as effective semantic information of the text information.
Optionally, the method further includes:
and if the semantic integrity probability score of the new text fragment is greater than or equal to a preset threshold value, deleting the historical text fragment from the cache.
Optionally, the method further includes:
and if the semantic integrity probability score of the new text fragment is smaller than a preset threshold value, storing the new text fragment as a historical text fragment in a cache.
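The caching behaviour described in the preceding optional steps can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the class name `SegmentCache`, the `recognize` callable (standing in for the semantic recognition model and returning semantic information plus a semantic integrity probability score), the default threshold of 0.8, and splicing with a single space are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple


class SegmentCache:
    """Splices incomplete text fragments until their semantics are complete."""

    def __init__(self, recognize: Callable[[str], Tuple[str, float]],
                 threshold: float = 0.8) -> None:
        # `recognize` is a hypothetical stand-in for the semantic recognition
        # model: it returns (semantic information, integrity score).
        self.recognize = recognize
        self.threshold = threshold  # assumed preset threshold
        self.history: List[str] = []  # cached historical text fragments

    def feed(self, fragment: str) -> Optional[str]:
        """Return effective semantic information, or None if still incomplete."""
        # Splice the cached history with the current fragment.
        candidate = " ".join(self.history + [fragment])
        semantics, score = self.recognize(candidate)
        if score >= self.threshold:
            # Semantics are complete: delete the history from the cache.
            self.history.clear()
            return semantics
        # Semantics are incomplete: cache the spliced fragment as history.
        self.history = [candidate]
        return None
```

With a recognizer that scores "how is the weather" low and "how is the weather today?" high, the first call to `feed` returns `None` and caches the fragment, and the second call returns the semantics of the spliced fragment and empties the cache.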
Optionally, the acquiring text information to be recognized includes:
acquiring voice information input into the intelligent equipment;
and carrying out voice recognition on the voice information to obtain text information to be recognized.
Optionally, after obtaining the valid semantic information corresponding to the text information, the method further includes:
acquiring reply information corresponding to the text information according to the effective semantic information;
and controlling the intelligent equipment to output the reply information.
Optionally, the adding punctuation marks to the text information includes:
and inputting the text information into a punctuation model, and acquiring the text information which is output by the punctuation model and added with at least one punctuation mark.
In a second aspect, an embodiment of the present invention provides an information processing apparatus, including:
the acquisition module is used for acquiring text information to be identified;
the segmentation module is used for adding punctuation marks to the text information and dividing the text information into at least one text segment;
and the identification module is used for acquiring the effective semantic information of the text information according to the semantic identification result of the at least one text fragment.
Optionally, the semantic recognition result includes: semantic integrity probability scores and semantic information; the identification module is specifically configured to:
and taking the semantic information of the text segment with the semantic integrity probability score meeting the preset condition as the effective semantic information of the text information.
Optionally, the identification module is specifically configured to:
for each text fragment in the at least one text fragment, if the semantic integrity probability score of the text fragment is greater than or equal to a preset threshold value, taking the semantic information of the text fragment as effective semantic information of the text information; or
for the at least one text fragment, taking the semantic information of the text fragment with the highest semantic integrity probability score as the effective semantic information of the text information.
Optionally, the identification module is specifically configured to:
for any text fragment in the at least one text fragment, obtaining a cached historical text fragment, wherein the historical text fragment is at least one text fragment that precedes the text fragment and whose semantic integrity probability score does not meet the preset condition;
splicing the historical text fragment and the text fragment to obtain a new text fragment, and performing semantic recognition processing on the new text fragment to obtain a semantic recognition result of the new text fragment;
and if the semantic integrity probability score of the new text fragment is greater than or equal to a preset threshold value, taking the semantic information of the new text fragment as effective semantic information of the text information.
Optionally, the identification module is further configured to: if the semantic integrity probability score of the new text fragment is greater than or equal to a preset threshold value, delete the historical text fragment from the cache.
Optionally, the identification module is further configured to:
and if the semantic integrity probability score of the new text fragment is smaller than a preset threshold value, storing the new text fragment as a historical text fragment in a cache.
Optionally, the acquisition module is specifically configured to:
acquiring voice information input into the intelligent equipment;
and carrying out voice recognition on the voice information to obtain text information to be recognized.
Optionally, the apparatus further comprises an output module configured to:
acquiring reply information corresponding to the text information according to the effective semantic information;
and controlling the intelligent equipment to output the reply information.
Optionally, the segmentation module is specifically configured to:
and inputting the text information into a punctuation model, and acquiring the text information which is output by the punctuation model and added with at least one punctuation mark.
In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of any one of the first aspects.
In a fourth aspect, the present invention provides a computer-readable storage medium, in which computer-executable instructions are stored, and when a processor executes the computer-executable instructions, the method according to any one of the first aspect is implemented.
In a fifth aspect, embodiments of the present invention provide a computer program product comprising computer program code which, when run on a computer, causes the computer to perform the method of any of the first aspects above.
In a sixth aspect, an embodiment of the present invention provides a chip, including a memory and a processor, where the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that an electronic device in which the chip is installed performs the method according to any one of the above first aspects.
According to the technical solution provided by the embodiments of the invention, text information to be recognized is acquired, punctuation marks are added to the text information so as to divide it into at least one text segment, and effective semantic information of the text information is acquired according to a semantic recognition result of the at least one text segment. Thus, when semantic recognition is performed on long text information, punctuation marks are first added to segment the text, and semantic recognition is then performed on the resulting text segments to obtain the effective semantic information. Because segmenting the text by punctuation marks takes natural language understanding into account, the segmentation is more accurate; and because the effective semantic information is determined from the semantic recognition results of the segmented text, the accuracy of semantic recognition can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of a semantic recognition process in the prior art;
FIG. 2 is a schematic diagram of a semantic recognition process according to an embodiment of the present invention;
FIG. 3 is a first schematic flowchart of an information processing method according to an embodiment of the present invention;
FIG. 4 is a second schematic flowchart of an information processing method according to an embodiment of the present invention;
FIG. 5 is a second schematic diagram of a semantic recognition process according to an embodiment of the present invention;
FIG. 6 is a third schematic flowchart of an information processing method according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
FIG. 1 is a schematic diagram of a semantic recognition process in the prior art. As shown in FIG. 1, when voice information input by a user is processed, a Voice Activity Detection (VAD) technique is first used to determine the start point and end point of each voice segment in the continuous voice information, so that the continuous voice information is split into a plurality of voice segments; speech recognition and semantic understanding are then performed on each voice segment to obtain the user's semantics. Specifically, each voice segment is input into an Automatic Speech Recognition (ASR) model to obtain the corresponding text information, and the text information is then input into a Natural Language Processing (NLP) model to obtain the corresponding semantic information.
However, in practice, users differ in speaking rate and speaking habits, and speakers may be in different environments. Segmenting sentences by VAD therefore does not split them accurately enough, and the accuracy of semantic recognition is not high.
In order to solve the above problem, an embodiment of the present invention provides an information processing method. Fig. 2 is a schematic diagram of a semantic recognition process according to an embodiment of the present invention. As shown in fig. 2, in this embodiment, continuous speech information is directly input to the ASR model without being segmented to perform speech recognition, so as to obtain text information corresponding to the continuous speech information. And then segmenting the text information to obtain a plurality of text segments, and inputting the text segments into the NLP model to obtain semantic information of the text segments.
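The flow of FIG. 2 described above can be outlined as a small function. This is only a sketch under stated assumptions: the function name `recognize_semantics` and the three callables `asr`, `segment_text` and `nlp` are hypothetical placeholders for the ASR model, the text segmentation step and the NLP model, none of which is specified at code level in this document.

```python
from typing import Callable, List


def recognize_semantics(
    speech: bytes,
    asr: Callable[[bytes], str],
    segment_text: Callable[[str], List[str]],
    nlp: Callable[[str], str],
) -> List[str]:
    """Recognize first, segment afterwards, then understand each segment."""
    # The continuous speech is NOT split beforehand: it is recognized whole.
    text = asr(speech)
    # Segmentation happens on the text, where language cues are available.
    segments = segment_text(text)
    # Each text segment is passed to the semantic model separately.
    return [nlp(segment) for segment in segments]
```

The only difference from the prior-art flow of FIG. 1 is the order of the steps: segmentation is applied to the recognized text rather than to the audio.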
In the embodiment, the speech information is recognized as the text information, and then the text information is segmented to obtain the text segment, so that natural language understanding can be considered in the segmentation process, the sentence segmentation accuracy is improved, and the semantic recognition accuracy is improved.
The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
FIG. 3 is a first schematic flowchart of an information processing method according to an embodiment of the present invention. The method of this embodiment may be executed by a server or by a smart device. When executed by a smart device, the smart device may be any electronic device capable of human-computer interaction with a user, including but not limited to robots, smart speakers, smart home devices, smart wearable devices and smartphones.
It should be noted that, for convenience of description, in this embodiment and the following embodiments, only the smart device is taken as an example for description.
As shown in fig. 3, the information processing method may include:
S301: Acquire text information to be recognized.
The text information to be recognized is long text information, that is, text information that has not yet been segmented.
The text information may be user input into the smart device. In one possible scenario, a user enters textual information directly into the smart device. In another possible scenario, a user inputs voice information into the intelligent device, and then the intelligent device performs voice recognition on the voice information to obtain text information.
In the second scenario, this embodiment differs from the prior art in that, after the smart device obtains the voice information input by the user, it does not segment the voice information but performs speech recognition on it directly to obtain text information. For example, the recognized text information may be "you see this robot is pretty good let's give it a try how is the weather today the effect is really good".
Therefore, the text information acquired in the embodiment is a long text, and the semantics of the text information are difficult to understand. If the text information is directly input into the NLP model for semantic recognition, an ambiguity problem can be caused, and the semantic recognition accuracy is not ideal. In this embodiment, after the text information to be identified is obtained, S302 may be executed to segment the text information to obtain a text segment.
S302: Add punctuation marks to the text information, and divide the text information into at least one text segment.
In this embodiment, the text information is segmented by adding punctuation marks to the text information.
Specifically, there are various ways of adding punctuation marks to text information. In one possible embodiment, the text information is subjected to natural language understanding, and punctuation is added according to the result of the natural language understanding. In another possible implementation, punctuation marks may be added to the text message according to the length of the time interval between adjacent words in the text message. In yet another possible implementation, punctuation marks are added to the text information using a punctuation model.
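As an illustration of the second approach above (punctuation based on the time interval between adjacent words), the following is a minimal sketch. It assumes that each word comes with start and end timestamps in seconds, and the gap thresholds `comma_gap` and `period_gap` are hypothetical values chosen only for the example; the document does not specify them.

```python
from typing import List, Tuple


def punctuate_by_pauses(
    words: List[Tuple[str, float, float]],
    comma_gap: float = 0.3,
    period_gap: float = 0.8,
) -> str:
    """Insert ',' or '.' where the silence between adjacent words is long.

    `words` holds (token, start_sec, end_sec) tuples in time order.
    """
    pieces: List[str] = []
    for i, (token, _start, end) in enumerate(words):
        pieces.append(token)
        if i + 1 < len(words):
            gap = words[i + 1][1] - end  # silence before the next word
            if gap >= period_gap:
                pieces.append(".")
            elif gap >= comma_gap:
                pieces.append(",")
    if pieces:
        pieces.append(".")  # close the final sentence
    text = ""
    for piece in pieces:
        if piece in {",", "."}:
            text += piece  # attach punctuation to the previous word
        else:
            text += (" " if text else "") + piece
    return text
```

A purely pause-based rule like this ignores meaning, which is why a punctuation model that considers the text itself may segment more accurately.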
The following description will be given by taking a punctuation model as an example, where text information is input into the punctuation model, and text information which is output by the punctuation model and added with at least one punctuation mark is obtained, and each punctuation mark divides the text information into at least one text segment.
Specifically, the punctuation model may add any punctuation mark to the text information, including but not limited to commas, periods, question marks, exclamation marks and semicolons. For example, the text information after punctuation marks are added is "You see this robot is pretty good, let's give it a try. How is the weather today? The effect is really good!".
It can be understood that adding punctuation marks to the text information divides it into at least one text segment. For example, after punctuation marks are added to the text information in the above example, the punctuation marks divide it into four text segments:
"You see this robot is pretty good,"
"let's give it a try."
"How is the weather today?"
"The effect is really good!"
In this embodiment, when the text information is divided into text segments according to the punctuation marks, every punctuation mark may be used as a segmentation point; in the above example, the four punctuation marks are used as segmentation points, giving 4 text segments. Alternatively, only preset punctuation marks may be used as segmentation points. For example, if only periods, exclamation marks and question marks are used as segmentation points, the text information is divided into 3 text segments:
"You see this robot is pretty good, let's give it a try."
"How is the weather today?"
"The effect is really good!"
It should be noted that there are various punctuation models in the prior art, and this embodiment is not particularly limited thereto. For example: a punctuation model based on a conditional random field algorithm (CRF), a punctuation model based on maximum entropy, etc.
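The division of punctuated text into text segments described above, using either every punctuation mark or only a preset subset as segmentation points, can be sketched as follows. The function name `split_on_punctuation` and the two mark sets are assumptions made for this illustration; a real system would also need to handle cases such as decimal numbers or abbreviations.

```python
import re
from typing import List

# Assumed mark sets (ASCII and full-width forms); not taken from the source.
ALL_MARKS = ",.?!;，。？！；"
SENTENCE_END_MARKS = ".?!。？！"


def split_on_punctuation(text: str, marks: str = ALL_MARKS) -> List[str]:
    """Split `text` after each character in `marks`, keeping the mark."""
    cls = re.escape(marks)
    # A run of non-mark characters, optionally followed by one mark.
    pattern = "[^" + cls + "]+[" + cls + "]?"
    return [seg.strip() for seg in re.findall(pattern, text) if seg.strip()]


example = ("You see this robot is pretty good, let's give it a try. "
           "How is the weather today? The effect is really good!")
print(split_on_punctuation(example))                      # 4 segments
print(split_on_punctuation(example, SENTENCE_END_MARKS))  # 3 segments
```

Passing a narrower mark set is what turns the comma-separated clauses back into a single segment in the second call.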
S303: Acquire effective semantic information of the text information according to the semantic recognition result of the at least one text segment.
In this embodiment, punctuation marks are added to the text information in step S302, and after the text information is divided into a plurality of text segments, semantic recognition processing may be performed on each text segment, so as to obtain a semantic recognition result. Furthermore, effective semantic information of the text information can be obtained according to the semantic recognition result of each text segment.
In an alternative embodiment, the natural language processing NLP model is used to identify the semantics of each text segment. Specifically, for a current text segment to be recognized, the current text segment is input into the NLP model, and a semantic recognition result of the current text segment is obtained.
NLP models can typically process a text segment of a certain length at a time. As a possible implementation mode, the NLP model carries out word segmentation processing on an input text segment to obtain a keyword sequence, then word vectors with context semantic relation are obtained according to the keyword sequence, then the word vectors are input into a classification model to carry out feature extraction, and the classification model outputs the probability of semantic category to which the text segment belongs according to the extracted features.
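The steps just described (word segmentation, word vectors, feature extraction, class probabilities) can be illustrated with a deliberately tiny toy. Everything here is hypothetical: the two-dimensional vectors in `WORD_VECTORS`, the category names and weights in `CATEGORY_WEIGHTS`, and the use of a plain average followed by a linear layer and softmax. In particular, the static vectors below do not capture the contextual relations mentioned above, and a real classification model may be a deep neural network.

```python
import math
import re
from typing import Dict, List

# Hypothetical toy word vectors (keyword -> 2-dimensional vector).
WORD_VECTORS: Dict[str, List[float]] = {
    "weather": [1.0, 0.0],
    "today": [0.8, 0.1],
    "robot": [0.0, 1.0],
    "good": [0.1, 0.7],
}
# Hypothetical linear classifier: one weight vector per semantic category.
CATEGORY_WEIGHTS: Dict[str, List[float]] = {
    "ask_weather": [2.0, -1.0],
    "comment_on_robot": [-1.0, 2.0],
}


def classify_segment(segment: str) -> Dict[str, float]:
    """Return a probability for each semantic category of a text segment."""
    # 1. Word segmentation: keep only tokens that are known keywords.
    tokens = re.findall(r"[a-z']+", segment.lower())
    vectors = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vectors:
        # No known keyword: fall back to a uniform distribution.
        return {c: 1.0 / len(CATEGORY_WEIGHTS) for c in CATEGORY_WEIGHTS}
    # 2. Feature extraction: average the word vectors.
    dim = len(vectors[0])
    feature = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    # 3. Classification: linear scores followed by a softmax.
    logits = {c: sum(w * f for w, f in zip(ws, feature))
              for c, ws in CATEGORY_WEIGHTS.items()}
    peak = max(logits.values())
    exps = {c: math.exp(z - peak) for c, z in logits.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}
```

For the segment "How is the weather today?", the toy model assigns the higher probability to the hypothetical category `ask_weather`.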
Optionally, the classification model in the NLP model may be a deep neural network model.
In the information processing method provided by this embodiment, text information to be recognized is acquired, punctuation marks are added to the text information so as to divide it into at least one text segment, and effective semantic information of the text information is acquired according to a semantic recognition result of the at least one text segment. Thus, when semantic recognition is performed on long text information, punctuation marks are first added to segment the text, and semantic recognition is then performed on the resulting text segments to obtain the effective semantic information. Because segmenting the text by punctuation marks takes natural language understanding into account, the segmentation is more accurate; and because the effective semantic information is determined from the semantic recognition results of the segmented text, the accuracy of semantic recognition can be improved.
The following describes the detailed process of information processing according to the present invention in detail with reference to a specific embodiment. The following example is a refinement of the example shown in fig. 3.
Fig. 4 is a flowchart illustrating a second information processing method according to an embodiment of the present invention. As shown in fig. 4, the method of this embodiment includes:
S401: Acquire voice information input into the smart device.
Specifically, when acquiring the voice information input to the intelligent device, the voice information of the user may be acquired through a microphone of the intelligent device, or the voice information of the user acquired by other devices may be received through a network or bluetooth. It should be noted that, the embodiment of the present invention is only described by taking the two possible implementation manners as examples to acquire the voice information of the user, but the embodiment of the present invention is not limited thereto.
S402: Perform speech recognition on the voice information to obtain text information to be recognized.
After the voice information is acquired, the voice information can be recognized as text information by adopting a voice recognition technology. In a possible implementation manner, the speech information is input into an ASR model, and text information corresponding to the speech information output by the ASR model is obtained.
The ASR model may specifically include an acoustic model and a language model, and obtains the corresponding text information by recognizing the voice information. The text information is a sequence of characters and/or words.
S403: Add punctuation marks to the text information, and divide the text information into at least one text segment.
In this embodiment, the specific implementation of S403 is similar to S302 in the embodiment shown in fig. 3, and is not described here again.
S404: Take the semantic information of the text segment whose semantic integrity probability score meets the preset condition as the effective semantic information of the text information.
In this embodiment, the semantic recognition result of the text segment includes: semantic integrity probability scores and semantic information. Specifically, when the NLP model is used for semantic recognition processing, the current text segment is input into the NLP model, and the NLP model performs semantic recognition processing on the text segment, outputs semantic information of the text segment, and also outputs a semantic integrity probability score of the text segment.
The semantic integrity probability score indicates how complete the semantics expressed by the text segment are: the more complete the semantics, the higher the score; the less complete the semantics, the lower the score. For example, the semantic integrity probability score of "the weather today" is lower than that of "how is the weather today".
In this embodiment, after the semantic integrity probability score and the semantic information of the text segment are identified and obtained, the semantic information of the text segment whose semantic integrity probability score meets the preset condition is used as the effective semantic information of the text information.
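Two possible preset conditions named elsewhere in this document are a score threshold and the highest score. The sketch below shows both. The `Recognition` record, the function names, the default threshold of 0.8 and the example scores are all hypothetical and serve only to make the selection logic concrete.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Recognition:
    """Semantic recognition result of one text segment (illustrative)."""
    segment: str
    semantics: str
    integrity_score: float  # semantic integrity probability score


def select_by_threshold(results: List[Recognition],
                        threshold: float = 0.8) -> List[str]:
    """Keep the semantics of every segment scoring at or above the threshold."""
    return [r.semantics for r in results if r.integrity_score >= threshold]


def select_highest(results: List[Recognition]) -> str:
    """Keep only the semantics of the highest-scoring segment.

    Assumes at least one result; raises ValueError on an empty list.
    """
    return max(results, key=lambda r: r.integrity_score).semantics
```

The threshold variant may yield several pieces of effective semantic information for one utterance, whereas the highest-score variant always yields exactly one.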
S405: Acquire reply information corresponding to the text information according to the effective semantic information, and control the smart device to output the reply information.
Specifically, there may be various embodiments for obtaining the reply information corresponding to the text information according to the valid semantic information. In an alternative embodiment, the knowledge base may be queried to obtain the reply information according to the valid semantic information. Wherein, the knowledge base records reply information corresponding to different semantic information.
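The knowledge-base query can be pictured as a simple lookup. The sketch below assumes that semantic information is represented as an intent label and that the knowledge base is an in-memory mapping; both representations, the name `get_reply`, and the sample entries are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical knowledge base: semantic information (an intent label here)
# mapped to the reply information recorded for it.
KNOWLEDGE_BASE: Dict[str, str] = {
    "query_weather": "It is sunny today.",
    "greeting": "Hello!",
}


def get_reply(effective_semantic_info: str,
              knowledge_base: Dict[str, str] = KNOWLEDGE_BASE) -> Optional[str]:
    """Return the recorded reply for the semantic information, or None if absent."""
    return knowledge_base.get(effective_semantic_info)
```

Returning `None` for unknown semantic information leaves the fallback behavior to the caller, which the source does not specify.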
In addition, the reply information output by the intelligent device may be in text form, in multimedia form such as audio, video or pictures, or in voice form, that is, TTS (Text To Speech). It can be understood that, in this embodiment, the intelligent device may output the reply information in any one of the foregoing forms or in a combination of at least two of them, which is not specifically limited in this embodiment.
It should be noted that this embodiment does not specifically limit the sentence pattern of the text information being replied to. For example, the text information may be a declarative sentence, an interrogative sentence, an exclamatory sentence, or the like. That is, this embodiment replies not only to text information in question form but also to text information in other sentence patterns.
In this embodiment, the voice information is recognized directly as text information without first being segmented; punctuation marks are then added to the text information to divide it into a plurality of text segments, and semantic recognition is performed on each text segment. Because dividing the text information according to punctuation marks takes natural language understanding into account, the segmentation result is more accurate. The effective semantic information of the text information is then determined according to the semantic integrity probability scores of the resulting text segments, which can improve the accuracy of semantic recognition.
In the above embodiment, S404, in which the semantic information of the text segment whose semantic integrity probability score meets the preset condition is used as the effective semantic information of the text information, may be implemented in multiple specific ways.
Three specific implementations are described below as examples. It should be noted that other implementations may exist in practical applications; they are not listed one by one here.
In a first possible implementation, among the at least one text segment, the semantic information of the text segment with the highest semantic integrity probability score is used as the effective semantic information of the text information.
In this implementation, after the semantic integrity probability score of each of the text segments is obtained, the text segment with the highest score is determined. Because the semantics of that text segment are the most complete, its semantic information can be used as the effective semantic information of the text information.
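The first implementation amounts to picking the maximum by score. The following sketch assumes each recognition result is held in a small record; the `SegmentResult` type, its field names, and `most_complete` are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple


class SegmentResult(NamedTuple):
    """Hypothetical container for one text segment's semantic recognition result."""
    text: str
    semantic_info: str
    integrity_score: float  # semantic integrity probability score


def most_complete(results: List[SegmentResult]) -> SegmentResult:
    """Return the result with the highest score (the first one wins on ties)."""
    if not results:
        raise ValueError("at least one text segment is required")
    return max(results, key=lambda r: r.integrity_score)
```

How ties are broken is not addressed in the source; keeping the earliest segment is simply the behavior of Python's `max`.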
In a second possible implementation, for each text segment of the at least one text segment, if the semantic integrity probability score of the text segment is greater than or equal to a preset threshold, the semantic information of the text segment is used as effective semantic information of the text information.
In this implementation, processing depends on how the semantic integrity probability score compares with the preset threshold.
If the semantic integrity probability score of the current text segment is greater than or equal to the preset threshold, the semantics expressed by the current text segment are complete, and the semantic information of the current text segment can be used as effective semantic information of the text information.
If the semantic integrity probability score of the current text segment is smaller than the preset threshold, the semantics expressed by the current text segment are incomplete; the current text segment can be ignored, and processing continues with the next text segment.
It can be understood that the two implementations above can also be combined in practical applications. For example, the first implementation is used in some scenarios and the second in others.
The second possible implementation described above is explained in detail with reference to fig. 5. Fig. 5 is a second schematic diagram of a semantic recognition process according to an embodiment of the present invention. As shown in fig. 5, the text information to be recognized is "you see this robot is good let's give it a try how is the weather today the effect is really not bad".
The text information is input into a punctuation model, which adds punctuation marks to it, yielding four text segments. The four text segments are then input into the NLP model one by one to obtain the semantic information (not shown in fig. 5) and the semantic integrity probability score of each text segment.
With reference to fig. 5, the semantic integrity probability score obtained after the 1st text segment, "you see this robot is good", is input into the NLP model is 0.2. Since this score is smaller than the preset threshold (assumed here to be 0.75), the semantics of the text segment are considered incomplete, the text segment is ignored, and recognition continues with the next text segment.
The semantic integrity probability score obtained after the 2nd text segment, "let's give it a try", is input into the NLP model is 0.1. Since this score is also smaller than the preset threshold, the semantics of the text segment are considered incomplete, the text segment is ignored, and recognition continues with the next text segment.
The semantic integrity probability score obtained after the 3rd text segment, "how is the weather today", is input into the NLP model is 0.95. Since this score is greater than the preset threshold, the semantics of the text segment are considered complete, and the semantic information of the text segment is used as effective semantic information of the text information to be recognized.
Semantic recognition is then performed on the 4th text segment. The semantic integrity probability score obtained after the 4th text segment, "the effect is really not bad", is input into the NLP model is 0.3. Since this score is smaller than the preset threshold, the semantics of the text segment are considered incomplete, and the text segment is ignored.
It should be noted that the semantic integrity probability scores of the text segments and the preset threshold shown in fig. 5 are merely examples.
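The threshold comparison in the fig. 5 walk-through can be reproduced with a few lines of code. The sketch below uses the four example scores and the assumed threshold of 0.75 quoted in the text; the segment labels and the name `filter_by_threshold` are placeholders, and the NLP model itself is not modeled.

```python
from typing import List, Tuple

# (label, semantic integrity probability score) pairs. The scores are the
# illustrative values quoted for fig. 5; the labels are placeholders.
FIG5_SCORES: List[Tuple[str, float]] = [
    ("segment 1", 0.2),
    ("segment 2", 0.1),
    ("segment 3", 0.95),
    ("segment 4", 0.3),
]


def filter_by_threshold(scored: List[Tuple[str, float]],
                        threshold: float = 0.75) -> List[str]:
    """Keep the segments whose score is greater than or equal to the threshold."""
    return [label for label, score in scored if score >= threshold]


kept = filter_by_threshold(FIG5_SCORES)
```

With the assumed threshold only the third segment is kept, and the other three are ignored, as in the walk-through.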
In the embodiment shown in fig. 5, when the semantic integrity probability score of a text segment is smaller than the preset threshold, the text segment is ignored, and the recognition of the next text segment is continued. In some scenarios, there may be situations where: although the semantic integrity score of the current text segment is low, the current text segment can be used as the context information of the next text segment. That is, the semantics expressed by the current text segment in combination with the next text segment are complete.
Based on the above scenario, in step S404 of this embodiment, a third possible implementation manner may also be adopted. Specifically, the current text fragment may be cached when the semantic integrity probability score of the current text fragment is low. And when the next text segment is subjected to semantic recognition, the cached text segment and the next text segment are combined for recognition so as to improve the accuracy of a semantic recognition result.
The following description is made in conjunction with a specific embodiment. Fig. 6 is a third schematic flowchart of an information processing method according to an embodiment of the present invention, and this embodiment describes a processing procedure of any text fragment as an example. As shown in fig. 6, the method includes:
S601: for any text segment of the at least one text segment, obtain the cached historical text segment, where the historical text segment is at least one text segment that precedes the text segment and whose semantic integrity probability score does not meet the preset condition.
It can be understood that the sentence order of each text segment in the cached history text segment is consistent with the sentence order in the original voice message.
In addition, the present embodiment does not specifically limit the cache position of the history text segment. It is understood that the historical text segment may be cached in a cache of the NLP model, or may be cached in a cache outside the NLP model.
S602: perform semantic recognition processing on a new text segment obtained by splicing the historical text segment and the text segment, to obtain a semantic recognition result of the new text segment.
It will be appreciated that the sentence order of the new text segment is identical to the sentence order in the original speech information.
S603: if the semantic integrity probability score of the new text segment is greater than or equal to a preset threshold, take the semantic information of the new text segment as effective semantic information of the text information, and delete the historical text segment from the cache.
S604: if the semantic integrity probability score of the new text segment is smaller than the preset threshold, store the new text segment in the cache as a historical text segment.
An example follows. Suppose that, after punctuation marks are added, the text information is divided into three text segments: text segment 1, text segment 2, and text segment 3. Semantic recognition is first performed on text segment 1. Because it is the first text segment to be recognized and the cache holds no historical text segment, text segment 1 is input into the NLP model to obtain its semantic integrity probability score and semantic information. Two cases are then possible.
Case 1: the semantic integrity probability score of the text segment 1 is greater than or equal to a preset threshold value, which indicates that the semantics of the text segment 1 are complete, and therefore, the semantic information of the text segment 1 is taken as effective semantic information of the text information. Then, the semantic recognition of the text segment 2 is continued, and the recognition process is similar to the text segment 1.
Case 2: the semantic integrity probability score of the text segment 1 is smaller than a preset threshold value, which indicates that the semantics of the text segment 1 are incomplete, and therefore, the text segment 1 is cached in a cache. In this case, when the text segment 2 is identified, the history text segment (i.e., the text segment 1) is first obtained from the cache, and the text segment 1 and the text segment 2 are spliced to obtain a new text segment.
Semantic recognition processing is then performed on the new text segment to obtain its semantic integrity probability score and semantic information. Two further cases are possible.
Case 3: the semantic integrity probability score of the new text segment is greater than or equal to the preset threshold, so the semantic information of the new text segment is used as effective semantic information of the text information. Because the semantic information of text segment 1 is already contained in that of the new text segment, text segment 1 is deleted from the cache. Semantic recognition then continues with text segment 3, in a process similar to that for text segment 1.
Case 4: the semantic integrity probability score of the new text segment is smaller than the preset threshold, so text segment 2 is also stored in the cache, and the historical text segment now comprises text segment 1 and text segment 2. When text segment 3 is recognized, the historical text segments (text segment 1 and text segment 2) are obtained from the cache and spliced with text segment 3 to obtain a new text segment. Semantic recognition processing is then performed on that new text segment in the same way as described above, which is not repeated here.
In the embodiment shown in fig. 6, the current text segment with a lower semantic integrity probability score is cached as the context information of the next text segment, and the current text segment and the next text segment are subjected to semantic recognition processing together, so that the accuracy of semantic recognition is further improved.
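Steps S601 to S604 can be sketched as a loop over the text segments. This is a minimal illustration under stated assumptions: the recognizer is passed in as a callable returning a pair of semantic information and score, the cache is a plain list, and segments are spliced with a configurable joiner (a space here, an empty string would suit unspaced scripts). The names `Recognizer` and `recognize_with_cache` are hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical recognizer signature: text -> (semantic_info, integrity_score).
Recognizer = Callable[[str], Tuple[str, float]]


def recognize_with_cache(segments: List[str],
                         recognize: Recognizer,
                         threshold: float = 0.75,
                         joiner: str = " ") -> List[str]:
    """Return the effective semantic information found across the segments."""
    cache: List[str] = []   # historical text segments, kept in spoken order
    effective: List[str] = []
    for segment in segments:
        # S601 and S602: splice the cached history with the current segment
        # and recognize the resulting new text segment.
        new_segment = joiner.join(cache + [segment])
        semantic_info, score = recognize(new_segment)
        if score >= threshold:
            # S603: accept the result and delete the history from the cache.
            effective.append(semantic_info)
            cache.clear()
        else:
            # S604: keep the current segment so that the cache now holds
            # every piece of the new text segment.
            cache.append(segment)
    return effective
```

Appending only the current segment in the S604 branch leaves the cache equal to the pieces of the new text segment, which matches Case 4 in the example, where the cache ends up holding text segment 1 and text segment 2.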
Fig. 7 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present invention, where the information processing apparatus according to this embodiment may be in a software and/or hardware form, and the apparatus may be specifically disposed in a server or an intelligent device.
As shown in fig. 7, the information processing apparatus 700 of the present embodiment includes: an acquisition module 701, a segmentation module 702 and an identification module 703.
Wherein:
the acquisition module 701 is configured to acquire text information to be recognized;
the segmentation module 702 is configured to add punctuation marks to the text information and divide the text information into at least one text segment;
the recognition module 703 is configured to obtain effective semantic information of the text information according to a semantic recognition result of the at least one text segment.
Optionally, the semantic recognition result includes a semantic integrity probability score and semantic information, and the recognition module 703 is specifically configured to:
take the semantic information of the text segment whose semantic integrity probability score meets the preset condition as the effective semantic information of the text information.
Optionally, the recognition module 703 is specifically configured to:
for each text segment of the at least one text segment, if the semantic integrity probability score of the text segment is greater than or equal to a preset threshold, take the semantic information of the text segment as effective semantic information of the text information; or
for the at least one text segment, take the semantic information of the text segment with the highest semantic integrity probability score as the effective semantic information of the text information.
Optionally, the recognition module 703 is specifically configured to:
for any text segment of the at least one text segment, obtain a cached historical text segment, where the historical text segment is at least one text segment that precedes the text segment and whose semantic integrity probability score does not meet the preset condition;
perform semantic recognition processing on a new text segment obtained by splicing the historical text segment and the text segment, to obtain a semantic recognition result of the new text segment;
if the semantic integrity probability score of the new text segment is greater than or equal to a preset threshold, take the semantic information of the new text segment as effective semantic information of the text information.
Optionally, the recognition module 703 is further configured to: if the semantic integrity probability score of the new text segment is greater than or equal to the preset threshold, delete the historical text segment from the cache.
Optionally, the recognition module 703 is further configured to:
if the semantic integrity probability score of the new text segment is smaller than the preset threshold, store the new text segment in the cache as a historical text segment.
Optionally, the acquisition module 701 is specifically configured to:
acquire voice information input into the intelligent device;
perform voice recognition on the voice information to obtain the text information to be recognized.
Optionally, as shown in fig. 7, the apparatus may further include an output module 704, and the output module 704 is configured to:
obtain reply information corresponding to the text information according to the effective semantic information;
control the intelligent device to output the reply information.
Optionally, the segmentation module 702 is specifically configured to:
input the text information into a punctuation model, and obtain the text information output by the punctuation model with at least one punctuation mark added.
The information processing apparatus provided in the embodiment of the present invention may be configured to execute the technical solution of any of the method embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
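One way to picture how modules 701 to 703 fit together is a small class that receives a stand-in for each module and chains them. This is only a sketch under stated assumptions: the class name `InformationProcessor`, the callable signatures, and the use of the threshold-based selection are hypothetical choices, and the optional output module 704 is left out.

```python
from typing import Callable, List, Tuple


class InformationProcessor:
    """Chains stand-ins for modules 701 (acquire), 702 (segment), 703 (recognize)."""

    def __init__(self,
                 acquire: Callable[[object], str],
                 segment: Callable[[str], List[str]],
                 recognize: Callable[[str], Tuple[str, float]],
                 threshold: float = 0.75) -> None:
        self.acquire = acquire
        self.segment = segment
        self.recognize = recognize
        self.threshold = threshold

    def process(self, raw_input: object) -> List[str]:
        """Return the effective semantic information for one input."""
        text = self.acquire(raw_input)                    # module 701
        segments = self.segment(text)                     # module 702
        results = [self.recognize(s) for s in segments]   # module 703
        return [info for info, score in results if score >= self.threshold]
```

Injecting the three steps as callables keeps the sketch independent of any particular speech recognizer, punctuation model, or NLP model.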
Fig. 8 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present invention, where the electronic device may be a controller of an intelligent device or a server, and this is not particularly limited in the embodiment of the present invention. As shown in fig. 8, the electronic device 800 of the present embodiment includes: at least one processor 801 and a memory 802. The processor 801 and the memory 802 are connected by a bus 803.
In a specific implementation process, at least one processor 801 executes the computer-executable instructions stored in the memory 802, so that the at least one processor 801 executes the technical solution of any one of the method embodiments described above.
For a specific implementation process of the processor 801, reference may be made to the above method embodiments, which have similar implementation principles and technical effects, and details of this embodiment are not described herein again.
In the embodiment shown in fig. 8, it should be understood that the Processor may be a Central Processing Unit (CPU), other general purpose processors, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor, or in a combination of the hardware and software modules within the processor.
The memory may comprise high-speed RAM and may also include non-volatile memory (NVM), such as at least one disk memory.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.
An embodiment of the present invention further provides a computer-readable storage medium in which computer-executable instructions are stored; when a processor executes the computer-executable instructions, the technical solution of any one of the above method embodiments is implemented.
The computer-readable storage medium may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. Readable storage media can be any available media that can be accessed by a general purpose or special purpose computer.
An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an Application Specific Integrated Circuit (ASIC). Of course, the processor and the readable storage medium may also reside as discrete components in the apparatus.
An embodiment of the present invention further provides a computer program product, where the computer program product includes a computer program code, and when the computer program code runs on a computer, the computer is caused to execute the technical solution in any of the above method embodiments.
The embodiment of the present invention further provides a chip, which includes a memory and a processor, where the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that an electronic device installed with the chip executes the technical solutions of any of the above method embodiments.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be implemented by hardware controlled by program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An information processing method characterized by comprising:
acquiring text information to be identified;
adding punctuation marks to the text information, and dividing the text information into at least one text segment;
and acquiring effective semantic information of the text information according to the semantic recognition result of the at least one text fragment.
2. The method of claim 1, wherein the semantic recognition result comprises: semantic integrity probability scores and semantic information; the obtaining of the effective semantic information of the text information according to the semantic recognition result of the at least one text fragment includes:
and taking the semantic information of the text segment with the semantic integrity probability score meeting the preset condition as the effective semantic information of the text information.
3. The method according to claim 2, wherein the step of regarding semantic information of the text segment whose semantic integrity probability score satisfies the preset condition as valid semantic information of the text information comprises:
for each text fragment in the at least one text fragment, if the semantic integrity probability score of the text fragment is greater than or equal to a preset threshold value, taking the semantic information of the text fragment as effective semantic information of the text information; or
for the at least one text fragment, taking the semantic information of the text fragment with the highest semantic integrity probability score as the effective semantic information of the text information.
4. The method according to claim 2, wherein the step of regarding semantic information of the text segment whose semantic integrity probability score satisfies the preset condition as valid semantic information of the text information comprises:
for any text fragment in the at least one text fragment, obtaining a cached historical text fragment, wherein the historical text fragment is at least one text fragment that precedes the text fragment and whose semantic integrity probability score does not meet the preset condition;
carrying out semantic recognition processing on a new text fragment obtained by splicing the historical text fragment and the text fragment, to obtain a semantic recognition result of the new text fragment;
and if the semantic integrity probability score of the new text fragment is greater than or equal to a preset threshold value, taking the semantic information of the new text fragment as effective semantic information of the text information.
5. The method of claim 4, further comprising:
and if the semantic integrity probability score of the new text fragment is greater than or equal to a preset threshold value, deleting the historical text fragment from the cache.
6. The method of claim 4, further comprising:
and if the semantic integrity probability score of the new text fragment is smaller than a preset threshold value, storing the new text fragment as a historical text fragment in a cache.
7. The method according to any one of claims 1 to 6, wherein after obtaining the valid semantic information corresponding to the text information, the method further comprises:
acquiring reply information corresponding to the text information according to the effective semantic information;
and controlling the intelligent equipment to output the reply information.
8. An information processing apparatus characterized by comprising:
the acquisition module is used for acquiring text information to be identified;
the segmentation module is used for adding punctuation marks to the text information and dividing the text information into at least one text segment;
and the identification module is used for acquiring the effective semantic information of the text information according to the semantic identification result of the at least one text fragment.
9. An electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the at least one processor to perform the method of any of claims 1-7.
10. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1 to 7.
CN201910270747.1A 2019-04-04 2019-04-04 Information processing method and device and electronic equipment Pending CN111785259A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910270747.1A CN111785259A (en) 2019-04-04 2019-04-04 Information processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN111785259A true CN111785259A (en) 2020-10-16

Family

ID=72754977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910270747.1A Pending CN111785259A (en) 2019-04-04 2019-04-04 Information processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111785259A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112700769A (en) * 2020-12-26 2021-04-23 科大讯飞股份有限公司 Semantic understanding method, device, equipment and computer readable storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102486801A (en) * 2011-09-06 2012-06-06 上海博路信息技术有限公司 Method for obtaining publication contents in voice recognition mode
CN105609107A (en) * 2015-12-23 2016-05-25 北京奇虎科技有限公司 Text processing method and device based on voice identification
CN107146618A (en) * 2017-06-16 2017-09-08 北京云知声信息技术有限公司 Method of speech processing and device
CN107146602A (en) * 2017-04-10 2017-09-08 北京猎户星空科技有限公司 A kind of audio recognition method, device and electronic equipment
CN107291690A (en) * 2017-05-26 2017-10-24 北京搜狗科技发展有限公司 Punctuate adding method and device, the device added for punctuate
CN107316643A (en) * 2017-07-04 2017-11-03 科大讯飞股份有限公司 Voice interactive method and device
CN108845979A (en) * 2018-05-25 2018-11-20 科大讯飞股份有限公司 A kind of speech transcription method, apparatus, equipment and readable storage medium storing program for executing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination