CN118379992B - Speech recognition management system based on artificial intelligence - Google Patents

Speech recognition management system based on artificial intelligence

Info

Publication number
CN118379992B
Authority
CN
China
Prior art keywords
voice
sentence
unit
statement
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410805561.2A
Other languages
Chinese (zh)
Other versions
CN118379992A (en
Inventor
林子健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Sizheng Electronic Co ltd
Original Assignee
Guangzhou Sizheng Electronic Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Sizheng Electronic Co ltd filed Critical Guangzhou Sizheng Electronic Co ltd
Priority to CN202410805561.2A priority Critical patent/CN118379992B/en
Publication of CN118379992A publication Critical patent/CN118379992A/en
Application granted granted Critical
Publication of CN118379992B publication Critical patent/CN118379992B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a voice recognition management system based on artificial intelligence, which comprises a voice receiving processing unit, a voice feature slowing unit, a voice function frame selection processing unit, a voice text conversion unit, a defect part summarizing unit, a priority analysis unit, a sentence processing normalization unit, a keyword vector integration unit, a sentence vector logic relation unit, a feature sentence insertion and completion influence unit, a sentence fault tolerance detection and adjustment unit, a complete sentence logic analysis unit, a keyword checking and reporting unit and a feature retrieval analysis unit. The voice receiving processing unit identifies the parts of the voice signal that need adjustment, so that the police receiver can clearly understand the logic of the whole sentence; this improves the police receiver's efficiency once the call is connected and shortens the alarm person's waiting time. The feature retrieval analysis unit reduces the probability of misrecognizing the address segment and improves the efficiency of the whole process.

Description

Speech recognition management system based on artificial intelligence
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a voice recognition management system based on artificial intelligence.
Background
Speech recognition is a technology in which an individual speaker reads text or isolated words into a system, and the system analyzes that individual's specific voice in order to fine-tune its recognition of that speaker's speech; the result is commonly known as the speaker's speech system. Speech systems have spread into many fields, from "siri", which was originally woken by voice, to voice communication technology that can now be applied on different platforms, and they help people solve basic problems in daily life. The technology is, however, rarely applied to operator telephone lines. The calls an operator receives are emergency calls such as 110, 120 or 119. When the other party speaks quickly, or is so nervous that the police receiver cannot follow the logic of what is described, the facts cannot be established; likewise, when the alarm person's tone fluctuates between high and low, what was said during the fluctuation cannot be made out. Furthermore, the address is information that the alarm person must provide, and once the address is wrong, time critical to the alarm person's life is lost. A technology that converts voice into text is therefore needed to help identify what the alarm person says.
Disclosure of Invention
The invention aims to provide a voice recognition management system based on artificial intelligence so as to solve the problems in the background technology.
In order to solve the technical problems, the invention provides the following technical scheme: a voice recognition management system based on artificial intelligence comprises a voice receiving processing unit, a voice feature slowing unit, a voice function frame selection processing unit and a text logic adjusting unit;
The voice receiving processing unit is used for receiving and analyzing the voice signal from the answering device;
The voice feature slowing unit is used for classifying and intercepting the received voice signals, which reduces the interference of background noise for the listener;
The voice function frame selection processing unit is used for automatically starting text logic adjustment according to the length of time over which the police receiver repeatedly asks questions, and for sorting out the language logic of the other party, so that the police receiver can learn the truth of the matter, dispatch help quickly and resolve the problem;
The text logic adjusting unit is used for judging whether text is missing after the received voice signal has been converted into text information, and for completing the language logic arrangement of the received voice signal by applying sentence normalization processing to the text with missing parts.
Further, the text logic adjusting unit comprises a voice text conversion unit, a defect part summarizing unit, a priority analysis unit, a sentence processing normalization unit, a keyword vector integration unit, a sentence vector logic relation unit, a feature sentence insertion and completion influence unit, a sentence fault tolerance detection and adjustment unit, a complete sentence logic analysis unit, a keyword checking and reporting unit and a feature retrieval analysis unit;
The voice text conversion unit is used for analyzing the received voice signals and converting them into text information, which helps the police receiver learn the truth of an event;
The defect part summarizing unit is used for judging which components are missing in the text converted from voice and analyzing whether a sentence is partially missing or completely missing; the missing components can then be filled in, so that the police receiver can grasp the important content of the text;
The priority analysis unit is used for detecting the case where partial and complete omissions occur in the text at the same time, in which case the partial omission has higher priority than the complete omission; the completely missing content can then be handled quickly, which improves the accuracy of the text content;
The sentence processing normalization unit is used for performing word segmentation on the text of discontinuous sentences and deleting repeated words and invalid words in the text, which makes the text content easier to combine; for example, when "the weather" is repeated in "the weather is good today", the repeated occurrence can be deleted;
The keyword vector integration unit is used for forming different sentence vectors from the segmented sentences and combining them into correct sentence components, so that the combined sentence is coherent with its context;
The sentence vector logic relation unit is used for analyzing the logical relationship between a partially missing sentence vector and the sentence vectors to its left and right, and judging whether a complete sentence pattern can be formed;
The complete sentence logic analysis unit is used for judging whether sentence components should be filled in and judging the degree of influence of the added sentence components on the whole sentence;
The feature sentence insertion and completion influence unit is used for evaluating the influence on the combined sentence vectors after sentence vectors are deleted or supplemented, which reduces the number of sentence matching attempts and achieves accurate matching;
The sentence fault tolerance detection and adjustment unit is used for judging whether the current sentence is an address segment; when it is, the unit further detects whether the added or deleted sentence vectors contain a sentence with a high fault tolerance rate (that is, one that is easily misrecognized), and when such a sentence exists, the address is specifically verified with the alarm person;
The keyword checking and reporting unit is used for detecting whether keyword vectors with a high fault tolerance rate exist, reporting them, and marking them on a map according to the number of voice recognition errors, so as to prompt the police receiver to pay attention to these keywords;
The feature retrieval analysis unit is used for detecting, for a given keyword, the adjustment that the police receiver makes to the voice feature part corresponding to that keyword.
Further, the steps of analyzing the voice signal are as follows:
Step D01: checking the voice signals of the police receiver and the alarm person, dividing the voice signals into different voice segment features, and analyzing them segment by segment; when the number of times the voice signal switches from a bass signal to a treble signal is detected to exceed a preset number, jumping to step D02;
Step D02: intercepting the voice signal from step D01 and analyzing whether the number of questions asked by the police receiver is higher than a preset number; if so, further analyzing whether the voice signal lacks sentence vectors and jumping to step D03; if not, the exchange is a normal inquiry;
Step D03: judging, according to the logical relationship between the sentence vector and the sentences before and after it, whether the sentence vector is completely missing or partially missing, and supplementing it accordingly; judging whether a sentence with a high degree of influence lacks sentence components; if so, jumping to step D04;
Step D04: for sentence components with a high degree of influence, retrieving the history records of the missing keywords until the correct keyword can be matched and its correctness verified.
Further, in step D01, the steps of analyzing the voice segment features are:
D010: analyzing the generated voice segment according to its voice amplitude characteristics, and intercepting it into a plurality of low-frequency-band and high-frequency-band voice signals;
D011: judging whether the voice signal segment contains a voice component that switches from the high-frequency band to the low-frequency band; if so, automatically intercepting the voice signal of the corresponding segment.
Further, according to the intercepted voice signal, the signal is divided into a set of voice segments $W = \{w_1, w_2, w_3, \ldots, w_n\}$, where $n$ is the number of intercepted voice segments. The set of high-frequency voice segments is $F = \{f_1, f_2, f_3, \ldots, f_s\}$ and the set of low-frequency voice segments is $L = \{l_1, l_2, l_3, \ldots, l_o\}$, where $s$ and $o$ are the total numbers of high-frequency and low-frequency voice segments. The set of amplitudes corresponding to the high-frequency voice segments is $H = \{h_1, h_2, h_3, \ldots, h_s\}$ and the set of amplitudes corresponding to the low-frequency voice segments is $P = \{p_1, p_2, p_3, \ldots, p_o\}$. When $h_i - p_j > y$ is detected, where $i$ and $j$ are the segment indices in the high-frequency and low-frequency sets respectively and $y$ is the threshold, the current voice segment is analyzed; otherwise the transition is normal.
Further, in steps D02-D03, it is judged whether, within the intercepted transitions between the high-frequency and low-frequency bands, the time interval $[t_a, t_b]$ during which the police receiver asks questions is longer than the standard time $[t_c, t_d]$, and whether the number of questions asked within that interval is greater than a preset number. If both conditions are satisfied, the missing sentence vectors are further analyzed. Here $t_a$ is the start time of the questioning, $t_b$ is the end time of the questioning, $t_c$ is the start of the standard time and $t_d$ is the end of the standard time;
The importance of the currently missing sentence vector within the whole sentence is detected, and the detected sentence vector is segmented into words. The purpose of segmentation is to delete repeated word vectors and logically incoherent word vectors, so that the keyword word vectors can be obtained accurately. Within the sentence, the set of sentence vectors already present is $Y = \{y_1, y_2, \ldots, y_r\}$ and the set of missing sentence vectors is $X = \{x_1, x_2, \ldots, x_e\}$, where $r$ is the number of sentence vectors that need no supplementing and $e$ is the number of sentence vectors that need to be supplemented. The present and missing sentence vectors are merged to form an independent sentence $a$, and once $a$ is formed, its degree of influence on understanding the whole sentence is judged;
After a missing sentence vector is added to $a$, the degree of influence is judged according to the part of the whole sentence that the vector forms:
when the sentence vector forms the main part of the sentence, its degree of influence on the whole sentence is high;
when the sentence vector forms a modifying part of the sentence, its degree of influence on the whole sentence is low;
when the sentence vector forms a complement part of the sentence, its degree of influence on the whole sentence lies between those of the main part and the modifying part;
The judgment uses the formula $ZY = ZY_k + ZY_s$, where $ZY$ is the influence value, $ZY_k$ is the preliminary influence value and $ZY_s$ is the second influence value, which is determined by the influence coefficient and the missing sentence vectors.
When the logical relationship of an added sentence vector within the sentence cannot be identified, further sentence vectors are superimposed or added automatically until the logical relationship of the sentence vector can be clearly understood;
In step D04, it is checked whether the main part containing the retrieved keyword is a detailed address. When a specific address is detected, it is judged whether the area contains place names with similar pronunciation; the similarly pronounced names and their alarm counts are retrieved in time through big data and sorted from high to low by number of occurrences. At the same time, the geographic features of the area are checked to judge whether the alarm person is in an environment with the same geographic features. The retrieved result is then confirmed with the alarm person manually. If the geographic location does not match the address described by the alarm person, place names from other areas are retrieved; if it matches, the alarm count of the area is marked to alert other personnel;
When the correct address has still not been found after several queries, the distribution of the alarm person's tone is examined, and it is analyzed whether the received voice signal cannot be adjusted and processed because the distance between the other party's mouth and the device held by the alarm person is too large;
In a plane, the position of the alarm person's mouth is set as $S = (a, b)$ and the center position of the held device as $L = (c, d)$; the length and width of the held device are $J$ and $G$, and the included angle between the alarm person's mouth and the held device is calculated from these quantities. When the included angle is detected to be smaller than the standard included angle and the amplitude of the voice signal is larger than the preset amplitude, the alarm person's voice is too loud, and the voice amplitude is adjusted downward;
When the included angle is detected to be larger than the standard included angle and the amplitude of the voice signal is smaller than the standard amplitude, the alarm person's voice is too quiet, and the voice amplitude is adjusted upward.
Noise in the voice signal may be reduced by a variety of noise reduction methods.
Compared with the prior art, the invention has the following beneficial effects:
1. The voice receiving processing unit analyzes the received voice signal: after the call is connected it judges the amplitude of the voice signal, extracts and segments the signal, and determines which parts of the signal should be adjusted. The police receiver can thus clearly understand the logic of the whole sentence, which improves the efficiency of call handling and shortens the alarm person's waiting time;
2. When the priority analysis unit determines that fixed sentence components are present, the remaining sentence components are supplemented and words are segmented according to the voice feature vectors identified by the voice recognition technology, so that redundant word vectors can be deleted, the corresponding sentence content can be accurately identified, and overall recognition accuracy is improved. When completely missing content is detected, the logical relationship of the whole sentence vector is adjusted so that the logic of the whole fragment is straightened out. When partial and complete omissions exist at the same time, the partially missing voice feature vectors in the sentence vector are supplemented first, so that the completely missing remainder can be adjusted according to the components already supplemented; this strengthens the logic of the whole sentence and lets the operator learn what specifically has happened;
3. When an address segment with missing content is detected, the feature retrieval analysis unit reduces the occurrence of addresses with a high fault tolerance rate in the address segment and effectively reduces the influence of noise on the whole voice signal during the call, which lowers the probability of misrecognizing the address segment, shortens the alarm person's waiting time and improves the efficiency of the whole process.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a schematic diagram of steps of an artificial intelligence based speech recognition management system according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the present invention provides the following technical solutions:
A voice recognition management system based on artificial intelligence comprises a voice receiving processing unit, a voice feature slowing unit, a voice function frame selection processing unit and a text logic adjusting unit;
The voice receiving processing unit is used for receiving and analyzing the voice signal from the answering device;
The voice feature slowing unit is used for classifying and intercepting the received voice signals, which reduces the interference of background noise for the listener;
The voice function frame selection processing unit is used for automatically starting text logic adjustment according to the length of time over which the police receiver repeatedly asks questions, and for sorting out the language logic of the other party, so that the police receiver can learn the truth of the matter, dispatch help quickly and resolve the problem;
The text logic adjusting unit is used for judging whether text is missing after the received voice signal has been converted into text information, and for completing the language logic arrangement of the received voice signal by applying sentence normalization processing to the text with missing parts;
The text logic adjusting unit comprises a voice text conversion unit, a defect part summarizing unit, a priority analysis unit, a sentence processing normalization unit, a keyword vector integration unit, a sentence vector logic relation unit, a feature sentence insertion and completion influence unit, a sentence fault tolerance detection and adjustment unit, a complete sentence logic analysis unit, a keyword checking and reporting unit and a feature retrieval analysis unit;
The voice text conversion unit is used for analyzing the received voice signals and converting them into text information, which helps the police receiver learn the truth of an event;
The defect part summarizing unit is used for judging which components are missing in the text converted from voice and analyzing whether a sentence is partially missing or completely missing; the missing components can then be filled in, so that the police receiver can grasp the important content of the text;
The priority analysis unit is used for detecting the case where partial and complete omissions occur in the text at the same time, in which case the partial omission has higher priority than the complete omission; the completely missing content can then be handled quickly, which improves the accuracy of the text content;
The sentence processing normalization unit is used for performing word segmentation on the text of discontinuous sentences and deleting repeated words and invalid words in the text, which makes the text content easier to combine; for example, when "the weather" is repeated in "the weather is good today", the repeated occurrence can be deleted;
The keyword vector integration unit is used for forming different sentence vectors from the segmented sentences and combining them into correct sentence components, so that the combined sentence is coherent with its context;
The sentence vector logic relation unit is used for analyzing the logical relationship between a partially missing sentence vector and the sentence vectors to its left and right, and judging whether a complete sentence pattern can be formed;
A sentence is usually composed of a subject, a predicate, complements and so on. When part of this content is missing, the specific meaning of what was said cannot be analyzed adequately, so the composition of the sentence needs to be adjusted to let the police receiver judge the logic of the sentence as quickly as possible;
The complete sentence logic analysis unit is used for judging whether sentence components should be filled in and judging the degree of influence of the added sentence components on the whole sentence;
The feature sentence insertion and completion influence unit is used for evaluating the influence on the combined sentence vectors after sentence vectors are deleted or supplemented, which reduces the number of sentence matching attempts and achieves accurate matching;
The sentence fault tolerance detection and adjustment unit is used for judging whether the current sentence is an address segment; when it is, the unit further detects whether the added or deleted sentence vectors contain a sentence with a high fault tolerance rate (that is, one that is easily misrecognized), and when such a sentence exists, the address is specifically verified with the alarm person;
The keyword checking and reporting unit is used for detecting whether keyword vectors with a high fault tolerance rate exist, reporting them, and marking them on a map according to the number of voice recognition errors, so as to prompt the police receiver to pay attention to these keywords;
The feature retrieval analysis unit is used for detecting, for a given keyword, the adjustment that the police receiver makes to the voice feature part corresponding to that keyword.
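The deletion of repeated and invalid words performed by the sentence processing normalization unit can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `normalize_tokens`, the idea of passing an explicit `invalid_words` set, and the sample tokens are all assumptions, and the text is assumed to be word-segmented already.

```python
def normalize_tokens(tokens, invalid_words):
    """Drop invalid words and immediately repeated words from a token list.

    Hypothetical helper: the source only states that repeated and invalid
    words are deleted after word segmentation.
    """
    cleaned = []
    for tok in tokens:
        if tok in invalid_words:
            continue  # invalid word (for example a filler): skip it
        if cleaned and cleaned[-1] == tok:
            continue  # same word as the previous kept word: skip the repeat
        cleaned.append(tok)
    return cleaned


tokens = ["the", "the", "fire", "um", "is", "is", "on", "Main", "Street"]
print(normalize_tokens(tokens, invalid_words={"um", "uh"}))
# → ['the', 'fire', 'is', 'on', 'Main', 'Street']
```

Only adjacent repeats are removed here, so a word that legitimately appears twice in different positions of the sentence is kept.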
The steps of analyzing the voice signal are as follows:
Step D01: checking the voice signals of the police receiver and the alarm person, dividing the voice signals into different voice segment features, and analyzing them segment by segment; when the number of times the voice signal switches from a bass signal to a treble signal is detected to exceed a preset number, jumping to step D02;
Step D02: intercepting the voice signal from step D01 and analyzing whether the number of questions asked by the police receiver is higher than a preset number; if so, further analyzing whether the voice signal lacks sentence vectors and jumping to step D03; if not, the exchange is a normal inquiry;
Step D03: judging, according to the logical relationship between the sentence vector and the sentences before and after it, whether the sentence vector is completely missing or partially missing, and supplementing it accordingly; judging whether a sentence with a high degree of influence lacks sentence components; if so, jumping to step D04;
Step D04: for sentence components with a high degree of influence, retrieving the history records of the missing keywords until the correct keyword can be matched and its correctness verified.
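The control flow of steps D01 to D04 can be sketched as a chain of threshold checks. The class `CallState`, the function `next_step`, the returned labels and the threshold parameters are hypothetical names introduced for illustration; the source gives no concrete thresholds.

```python
from dataclasses import dataclass


@dataclass
class CallState:
    pitch_switches: int        # D01: bass-to-treble switches counted so far
    questions_asked: int       # D02: questions asked by the police receiver
    high_impact_missing: bool  # D03: a high-influence component is missing


def next_step(state: CallState, max_switches: int, max_questions: int) -> str:
    """Return the step the flow moves to, following D01 to D04 (assumed thresholds)."""
    if state.pitch_switches <= max_switches:
        return "D01: keep monitoring"
    if state.questions_asked <= max_questions:
        return "D02: normal inquiry"
    if not state.high_impact_missing:
        return "D03: fill in missing components"
    return "D04: look up keyword history"


print(next_step(CallState(5, 4, True), max_switches=3, max_questions=2))
# → D04: look up keyword history
```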
In step D01, the steps of analyzing the voice segment features are:
D010: analyzing the generated voice segment according to its voice amplitude characteristics, and intercepting it into a plurality of low-frequency-band and high-frequency-band voice signals;
D011: judging whether the voice signal segment contains a voice component that switches from the high-frequency band to the low-frequency band; if so, automatically intercepting the voice signal of the corresponding segment.
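Step D011 can be illustrated with a small sketch that locates high-to-low band transitions. It assumes that step D010 has already labeled each segment as "high" or "low"; the labels, the function name `high_to_low_transitions` and the sample data are hypothetical.

```python
def high_to_low_transitions(labels):
    """Return the indices i where segment i is 'high' and segment i + 1 is 'low'."""
    return [
        i
        for i in range(len(labels) - 1)
        if labels[i] == "high" and labels[i + 1] == "low"
    ]


labels = ["low", "high", "low", "low", "high", "high", "low"]
print(high_to_low_transitions(labels))
# → [1, 5]
```

Each returned index marks a segment boundary at which the corresponding portion of the signal would be intercepted for further analysis.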
According to the intercepted voice signal, the signal is divided into a set of voice segments $W = \{w_1, w_2, w_3, \ldots, w_n\}$, where $n$ is the number of intercepted voice segments. The set of high-frequency voice segments is $F = \{f_1, f_2, f_3, \ldots, f_s\}$ and the set of low-frequency voice segments is $L = \{l_1, l_2, l_3, \ldots, l_o\}$, where $s$ and $o$ are the total numbers of high-frequency and low-frequency voice segments. The set of amplitudes corresponding to the high-frequency voice segments is $H = \{h_1, h_2, h_3, \ldots, h_s\}$ and the set of amplitudes corresponding to the low-frequency voice segments is $P = \{p_1, p_2, p_3, \ldots, p_o\}$. When $h_i - p_j > y$ is detected, where $i$ and $j$ are the segment indices in the amplitude sets of the high-frequency and low-frequency voice segments respectively and $y$ is the threshold, the current voice segment is analyzed; otherwise the transition is normal.
In this process, when a voice signal is detected, voice segments in different frequency bands are intercepted according to the amplitude of the signal, and the formula above identifies which segments are abnormal, so that the abnormal segments can be analyzed in time.
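The amplitude test $h_i - p_j > y$ can be sketched as below. The source does not say which pairs $(i, j)$ are compared, so this illustration assumes every high-frequency amplitude is compared with every low-frequency amplitude; the function name `abnormal_pairs` and the sample values are hypothetical.

```python
def abnormal_pairs(high_amps, low_amps, y):
    """Return the 1-based (i, j) pairs for which h_i - p_j > y.

    Assumption: all pairs are compared; the source leaves the pairing open.
    """
    return [
        (i, j)
        for i, h in enumerate(high_amps, start=1)
        for j, p in enumerate(low_amps, start=1)
        if h - p > y
    ]


H = [0.9, 0.5]   # amplitudes of the high-frequency segments
P = [0.2, 0.45]  # amplitudes of the low-frequency segments
print(abnormal_pairs(H, P, y=0.4))
# → [(1, 1), (1, 2)]
```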
In steps D02-D03, it is judged whether, within the intercepted transitions between the high-frequency and low-frequency bands, the time interval $[t_a, t_b]$ during which the police receiver asks questions is longer than the standard time $[t_c, t_d]$, and whether the number of questions asked within that interval is greater than a preset number. If both conditions are satisfied, the missing sentence vectors are further analyzed. Here $t_a$ is the start time of the questioning, $t_b$ is the end time of the questioning, $t_c$ is the start of the standard time and $t_d$ is the end of the standard time;
The importance of the currently missing sentence vector within the whole sentence is detected, and the detected sentence vector is segmented into words. The purpose of segmentation is to delete repeated word vectors and logically incoherent word vectors, so that the keyword word vectors can be obtained accurately. Within the sentence, the set of sentence vectors already present is $Y = \{y_1, y_2, \ldots, y_r\}$ and the set of missing sentence vectors is $X = \{x_1, x_2, \ldots, x_e\}$, where $r$ is the number of sentence vectors that need no supplementing and $e$ is the number of sentence vectors that need to be supplemented. The present and missing sentence vectors are merged to form an independent sentence $a$, and once $a$ is formed, its degree of influence on understanding the whole sentence is judged;
After a missing sentence vector is added to $a$, the degree of influence is judged according to the part of the whole sentence that the vector forms:
when the sentence vector forms the main part of the sentence, its degree of influence on the whole sentence is high;
when the sentence vector forms a modifying part of the sentence, its degree of influence on the whole sentence is low;
when the sentence vector forms a complement part of the sentence, its degree of influence on the whole sentence lies between those of the main part and the modifying part;
The judgment uses the formula $ZY = ZY_k + ZY_s$, where $ZY$ is the influence value, $ZY_k$ is the preliminary influence value and $ZY_s$ is the second influence value, which is determined by the influence coefficient and the missing sentence vectors.
When the logical relationship of an added sentence vector within the sentence cannot be identified, further sentence vectors are superimposed or added automatically until the logical relationship of the sentence vector can be clearly understood.
This formula judges the degree of influence of a sentence vector within the whole sentence. $ZY_k$ is the preliminary degree of influence, and the second term $ZY_s$ judges how coherent the logic of the whole sentence is after the remaining sentences are connected, with different influence coefficients representing different influence factors. The degree of influence on the whole sentence is judged through this formula.
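The influence value $ZY = ZY_k + ZY_s$ can be illustrated with a short sketch. The expression for $ZY_s$ is not recoverable from the text, so this example assumes, purely for illustration, that $ZY_s$ is the sum of one influence coefficient per missing sentence vector and that the coefficient depends on the part of the sentence the vector forms. The coefficient values and the names `ROLE_COEFFICIENT` and `influence_value` are hypothetical; only the ordering (main part highest, complement in between, modifier lowest) comes from the description.

```python
# Hypothetical coefficients; only their ordering follows the description.
ROLE_COEFFICIENT = {"main": 1.0, "complement": 0.5, "modifier": 0.2}


def influence_value(zy_k, missing_roles):
    """Compute ZY = ZY_k + ZY_s under the assumed form of ZY_s.

    zy_k: preliminary influence value.
    missing_roles: sentence part formed by each missing sentence vector.
    """
    zy_s = sum(ROLE_COEFFICIENT[role] for role in missing_roles)
    return zy_k + zy_s


print(influence_value(1.0, ["main"]) > influence_value(1.0, ["complement"]))
# → True
```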
In step D04, it is checked whether the main part containing the retrieved keyword is a detailed address. When a specific address is detected, it is judged whether the area contains place names with similar pronunciation; the similarly pronounced names and their alarm counts are retrieved in time through big data and sorted from high to low by number of occurrences. At the same time, the geographic features of the area are checked to judge whether the alarm person is in an environment with the same geographic features. The retrieved result is then confirmed with the alarm person manually. If the geographic location does not match the address described by the alarm person, place names from other areas are retrieved; if it matches, the alarm count of the area is marked to alert other personnel.
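The ordering of similarly pronounced place names by alarm count can be sketched as follows. The candidate names and counts are invented sample data, and `rank_candidates` is a hypothetical helper; how the similar-sounding names are found in the first place is outside this sketch.

```python
def rank_candidates(candidates):
    """Sort (place_name, alarm_count) pairs from highest to lowest alarm count."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)


# Invented sample data: similar-sounding place names with past alarm counts.
candidates = [("Xinhua Road", 3), ("Xinghua Road", 12), ("Xinhuan Road", 7)]
for name, count in rank_candidates(candidates):
    print(name, count)
```

The police receiver would then confirm the candidates with the alarm person in this order, starting with the most frequently reported location.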
When the correct address has not been obtained after several inquiries, the distribution of the alarm person's tone is judged, and it is analyzed whether the received voice signal cannot be adjusted and processed because the distance between the alarm person's mouth and the device held by the alarm person is too large;
In a plane, let the position of the alarm person's mouth be S = (a, b) and the center of the held device be L = (c, d), let the length and width of the held device be J and G, and consider the included angle between the alarm person's mouth and the held device. When the included angle is detected to be smaller than the standard included angle and the amplitude of the voice signal is detected to be larger than the preset amplitude, the alarm person's voice is loud, and the voice amplitude is adjusted downward;
when the included angle is detected to be larger than the standard included angle and the amplitude of the voice signal is detected to be smaller than the standard amplitude, the alarm person's voice is quiet, and the voice amplitude is adjusted upward;
by the formula [formula missing in source], the reason why the alarm person cannot be heard clearly can be judged, including whether it is caused by the mouth being too close to the held device, where J is the length of the held device. Analyzing the included angle between the held device and the mouth allows the alarm person's speech to be identified, and the formula is a distinctive way of judging this angle.
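The two adjustment rules above amount to a small decision function. The sketch below assumes the included angle has already been computed by whatever formula is used, and treats the "preset amplitude" and "standard amplitude" of the text as a single reference value; the function name and the returned labels are hypothetical.

```python
def amplitude_adjustment(angle_deg: float, standard_angle_deg: float,
                         amplitude: float, reference_amplitude: float) -> str:
    """Decide how to adjust the voice amplitude from the angle and amplitude checks.

    Returns "reduce", "raise" or "none" (labels chosen for illustration).
    """
    # Small angle and loud signal: the speaker's voice is loud, so lower it.
    if angle_deg < standard_angle_deg and amplitude > reference_amplitude:
        return "reduce"
    # Large angle and weak signal: the speaker's voice is quiet, so raise it.
    if angle_deg > standard_angle_deg and amplitude < reference_amplitude:
        return "raise"
    # The text does not describe the remaining combinations; leave the signal alone.
    return "none"
```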
Noise in the voice signal can be handled through a variety of noise reduction methods; in this process, noise reduction analysis can be performed on the voice signal by a small noise-reducer chip.
Example 1: setting a telephone for an operator to receive an alarm person, because the emotion of the operator is high, the expressed meaning cannot be known in time, converting the expressed voice into a text through a voice recognition technology, detecting that a missing sentence is partially missing in the recognition process, combining according to the known text to obtain a partially missing content as a main body part, when recognizing that the sentence therein is the main body part, segmenting the content in the main body part, deleting the words with overlapping words therein and words with smaller influence on the whole sentence, for example, deleting redundant words in the address of xxxxxx of the operator to fully understand the whole sentence, and when detecting that the sentence in the main body part is combined with the rest of known sentences and the sentence in the sentence cannot be known, automatically adding the words until the logic of the sentence in the whole part is successfully regulated.
Example 2: when the address in the sentence is not known after the logic of the adjustment sentence, the content detected by the voice recognition comprises a plurality of voice features with similar voice tones, the position of the mouth of the person to be alerted is set as S= (a, b) = (100, 125), the central position of the held device in the plane is set as L= (c, d) = (160, 200), the length and the width of the held device are J=80 cm and G, and the included angle between the mouth of the person to be alerted and the held device is; The standard included angle degree is 30 degrees, but the voice recognition technology still clearly recognizes the voice characteristics, the amplitude of the whole voice signal is further adjusted, and the noise in the voice signal is adjusted through the noise reduction equipment, so that the corresponding voice can be clearly heard.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Finally, it should be noted that the foregoing is only a preferred embodiment of the present invention and does not limit the invention. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims (8)

1. A speech recognition management system based on artificial intelligence is characterized in that: the system comprises a voice receiving processing unit, a voice characteristic slowing unit, a voice function box selection processing unit and a text logic adjusting unit;
The voice receiving processing unit is used for receiving and analyzing the voice signal from the answering device;
The voice characteristic slowing unit is used for classifying and intercepting the received voice signals;
The voice function box selection processing unit is used for determining whether to call the text logic adjusting unit according to the duration corresponding to the number of questions asked by the police receiver;
The text logic adjusting unit is used for judging whether text is missing after the received voice signals are converted into text information, and for completing the language logic arrangement of the received voice signals by performing sentence normalization on the text with missing content;
the text logic adjusting unit comprises a voice text conversion unit, a defect part summarizing unit, a priority analysis unit, a sentence processing and normalizing unit, a keyword vector integration unit, a sentence vector logic relation unit, a characteristic statement insertion complement influence unit, a statement fault tolerance detection and adjustment unit, a complete sentence logic analysis unit, a keyword checking and reporting unit and a feature extraction and analysis unit;
The voice text conversion unit is used for analyzing the received voice signals and converting them into text information, so as to assist the police receiver in understanding the truth of the event;
The defect part summarizing unit is used for judging the missing components in the text converted by voice and analyzing whether the sentence is partially missing or completely missing;
The priority analysis unit is used for detecting when partially missing content and completely missing content occur in the text at the same time, in which case the partially missing content has higher priority than the completely missing content;
The sentence processing and normalizing unit is used for performing word segmentation on the text of incoherent sentences and deleting repeated words and invalid words in the text;
the keyword vector integration unit is used for forming different sentence vectors from the sentences after word segmentation and combining the sentence vectors into correct sentence components;
The sentence vector logic relation unit is used for analyzing the logical relationship between a partially missing sentence vector and the sentence vectors to its left and right, and judging whether a fluent sentence pattern can be formed;
the complete sentence logic analysis unit is used for judging whether the sentence components need to be filled or not and judging the influence degree of the filled sentence components on the whole sentence;
The characteristic statement insertion complement influence unit is used for influencing the combined statement vectors after the statement vectors are deleted or complemented;
The statement fault tolerance detection and adjustment unit is used for judging whether the current statement is an address field; when the current statement is detected to be an address field, further detecting whether the added or deleted statement vector contains a statement with a higher fault tolerance rate; and, when such a statement is detected, carrying out specific verification with the alarm person;
The keyword checking and reporting unit is used for detecting whether keyword vectors with a higher fault tolerance rate exist, reporting those keyword vectors, and marking them on a map according to the number of speech recognition errors;
The feature extraction and analysis unit is used for detecting that the police receiver, for a given keyword, adjusts the voice feature part corresponding to that keyword.
2. An artificial intelligence based speech recognition management system according to claim 1, wherein: the steps of analyzing the speech signal are as follows:
Step D01: checking the voice signals of the police receiver and the alarm person, dividing the voice signals into different voice segment features, and analyzing them by voice segment feature; when the frequency with which the bass signal changes into the treble signal in the voice signal is detected to exceed the preset frequency, jumping to step D02;
Step D02: intercepting the voice signal of step D01 and analyzing whether the frequency of questions asked by the police receiver is higher than the preset frequency; if so, further analyzing whether the voice signal lacks sentence vectors, and jumping to step D03; if not, the exchange is a normal communication inquiry;
Step D03: judging whether the sentence vector is completely missing or partially missing according to the logical relationship between the sentence vector and the following sentence, and making effective additions; judging whether the sentences with a higher degree of influence lack sentence components; if so, jumping to step D04;
Step D04: for sentence components with a higher degree of influence, retrieving the history records of the missing keywords until the correct keyword can be matched and its correctness verified.
3. An artificial intelligence based speech recognition management system according to claim 2, wherein: in the step D01, the step of analyzing the characteristic segment of the speech segment includes:
D010: analyzing the generated voice segment according to voice amplitude characteristics, and intercepting and generating a plurality of low-frequency band and high-frequency band voice signals;
D011: judging whether the voice signal segment contains voice components that change from the high-frequency band to the low-frequency band; if so, automatically intercepting the voice signal of the corresponding segment.
4. A speech recognition management system based on artificial intelligence according to claim 3, wherein: the intercepted voice signal is divided into a set of voice segments W = {w_1, w_2, w_3, ..., w_n}, where n is the number of intercepted voice segments; the set of high-frequency voice segments is F = {f_1, f_2, f_3, ..., f_s} and the set of low-frequency voice segments is L = {l_1, l_2, l_3, ..., l_o}, where s and o are the total numbers of high-frequency and low-frequency voice segments; the set of amplitudes corresponding to the high-frequency voice segments is H = {h_1, h_2, h_3, ..., h_s} and the set of amplitudes corresponding to the low-frequency voice segments is P = {p_1, p_2, p_3, ..., p_o}; when h_i - p_j > y is detected, where i and j index the amplitude sets of the high-frequency and low-frequency voice segments respectively, the current voice segment is analyzed; otherwise, the conversion is normal.
5. An artificial intelligence based speech recognition management system according to claim 2 or 4, wherein: in steps D02-D03, for the intercepted conversions between the high-frequency band and the low-frequency band, it is determined whether the time interval [t_a, t_b] in which the police receiver asks questions is greater than the standard time [t_c, t_d], and whether the number of questions asked within that time interval is greater than a preset number; if the conditions are satisfied, the missing sentence vector is further analyzed, where t_a is the start time of the questioning of the alarm person, t_b is the end time of the questioning of the alarm person, t_c is the start time of the standard time, and t_d is the end time of the standard time;
detecting the degree of importance of the currently missing sentence vector in the whole sentence, and segmenting the detected sentence vector; in the sentence, the set of existing sentence vectors is Y and the set of missing sentence vectors is X, where r is the number of sentence vectors that need no supplementing and e is the number of sentence vectors that need supplementing; an existing sentence vector and a missing sentence vector are merged to form an independent sentence a, and after the independent sentence a is formed, its degree of influence on understanding the whole sentence is judged;
Detecting a combined with the missing sentence vector, and judging its degree of influence as different parts of the whole sentence:
when the detected sentence vector forms the main body part of the sentence, its degree of influence on the whole sentence is high;
when the detected sentence vector forms the modifying part of the sentence, its degree of influence on the whole sentence is low;
when the detected sentence vector is a complement part of the sentence, its degree of influence on the whole sentence lies between the degrees of influence of the main body part and the modifying part;
A judgment is then made using the following formula: ZY = ZY_k + ZY_s, ZY_s = [expression missing in source].
When the logical relationship of the added sentence vector within the sentence cannot be identified, further sentence vectors need to be superimposed or added until the logical relationship of the sentence vector can be clearly understood, where ZY is the influence value, ZY_k is the preliminary influence value, ZY_s is the second influence value, and the two remaining symbols of the formula denote the influence coefficient and the missing sentence vector, respectively.
6. An artificial intelligence based speech recognition management system according to claim 2, wherein: in step D04, it is determined whether the main body part of the retrieved keyword is a detailed address; when a specific address is detected, it is determined whether the address contains place or area names with similar pronunciation; the similar-sounding names and their alarm counts are retrieved promptly through big data and sorted from high to low by number of occurrences; at the same time, the geographic features of the area are checked to determine whether the alarm person is in an environment with the same geographic features; the retrieved result is confirmed manually with the alarm person; if the geographic location is not the same as the address described by the alarm person, names from other areas are retrieved; if it is the same, the number of alarms in the area is marked so as to warn other people.
7. The artificial intelligence based speech recognition management system of claim 6, wherein: when the correct address has not been obtained after several inquiries, the distribution of the alarm person's tone is judged, and it is analyzed whether the received voice signal cannot be adjusted and processed because the distance between the alarm person's mouth and the device held by the alarm person is too large;
In a plane, let the position of the alarm person's mouth be S = (a, b) and the center of the held device be L = (c, d), let the length and width of the held device be J and G, and consider the included angle between the alarm person's mouth and the held device; when the included angle is detected to be smaller than the standard included angle and the amplitude of the voice signal is detected to be larger than the preset amplitude, the alarm person's voice is loud, and the voice amplitude is adjusted downward;
When the included angle is detected to be larger than the standard included angle and the amplitude of the voice signal is detected to be smaller than the standard amplitude, the alarm person's voice is quiet, and the voice amplitude is adjusted upward.
8. An artificial intelligence based speech recognition management system according to claim 7, wherein: noise in the voice signal may be adjusted by a variety of noise reduction methods.
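As an illustration only, and not part of the claims, the two trigger conditions recited in claims 4 and 5 can be sketched in code. The sketch assumes that the comparison of the interval [t_a, t_b] with the standard time [t_c, t_d] is a comparison of durations, and that y acts as an amplitude-difference threshold; the function names are hypothetical.

```python
def segment_needs_analysis(h_i: float, p_j: float, y: float) -> bool:
    """Claim 4 condition: analyze the segment when h_i - p_j exceeds y."""
    return h_i - p_j > y


def should_check_missing_vectors(t_a: float, t_b: float, t_c: float, t_d: float,
                                 question_count: int, preset_count: int) -> bool:
    """Claim 5 condition, read as a comparison of durations (assumption).

    True when the questioning interval is longer than the standard time and
    the number of questions asked in it exceeds the preset number.
    """
    questioning_duration = t_b - t_a
    standard_duration = t_d - t_c
    return questioning_duration > standard_duration and question_count > preset_count
```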

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410805561.2A CN118379992B (en) 2024-06-21 2024-06-21 Speech recognition management system based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410805561.2A CN118379992B (en) 2024-06-21 2024-06-21 Speech recognition management system based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN118379992A CN118379992A (en) 2024-07-23
CN118379992B true CN118379992B (en) 2024-08-27

Family

ID=91902280

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410805561.2A Active CN118379992B (en) 2024-06-21 2024-06-21 Speech recognition management system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN118379992B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116628173A (en) * 2023-07-26 2023-08-22 成都信通信息技术有限公司 Intelligent customer service information generation system and method based on keyword extraction

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1603928A (en) * 1978-04-27 1981-12-02 Dialog Syst Continuous speech recognition method
CN111145510B (en) * 2019-12-31 2021-12-17 清华大学 Alarm receiving processing method, device and equipment
CN113554857A (en) * 2021-07-20 2021-10-26 思必驰科技股份有限公司 Alarm receiving and processing auxiliary method and system for voice call
CN114707951A (en) * 2021-08-18 2022-07-05 南京西三艾电子系统工程有限公司 Alarm situation big data management method, device, equipment and storage medium
CN115495563A (en) * 2022-09-16 2022-12-20 重庆长安汽车股份有限公司 Intelligent session method and server based on table data retrieval

Also Published As

Publication number Publication date
CN118379992A (en) 2024-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant