WO2024027552A1 - Text classification method and apparatus, text recognition method and apparatus, electronic device and storage medium - Google Patents

Text classification method and apparatus, text recognition method and apparatus, electronic device and storage medium

Info

Publication number
WO2024027552A1
WO2024027552A1 PCT/CN2023/109568 CN2023109568W WO2024027552A1 WO 2024027552 A1 WO2024027552 A1 WO 2024027552A1 CN 2023109568 W CN2023109568 W CN 2023109568W WO 2024027552 A1 WO2024027552 A1 WO 2024027552A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
feature
classified
type
value
Prior art date
Application number
PCT/CN2023/109568
Other languages
French (fr)
Chinese (zh)
Inventor
李长林
肖冰
曹磊
罗奇帅
Original Assignee
马上消费金融股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 马上消费金融股份有限公司
Publication of WO2024027552A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Definitions

  • The present disclosure relates to the field of computer technology, and in particular to a text classification method and device, a text recognition method and device, an electronic device, and a storage medium.
  • Text classification refers to the automatic classification of text according to certain standards.
  • Text processing tasks such as sentiment analysis, intent recognition, and question-answer matching can be handled through text classification, which improves text processing capability.
  • The text to be recognized may contain interference introduced by noise, which leads to problems such as semantic incoherence and semantic confusion in the text and makes it impossible to obtain objective text recognition results.
  • The present disclosure provides a text classification method and device, a text recognition method and device, an electronic device, and a storage medium.
  • The present disclosure provides a text classification method.
  • The text classification method includes: obtaining the text to be classified; generating feature values of the text-type features of the text to be classified based on preset text-type features and the text to be classified; and performing text classification processing on the text to be classified based on the feature values of the text-type features to obtain a text classification result, where the text classification result indicates whether a specified type of noise is present.
  • The present disclosure provides a text recognition method.
  • The text recognition method includes: performing sensitive word recognition on the acquired text to be recognized to obtain a sensitive word recognition result; and performing text classification processing on the text to be recognized according to the feature values of the text-type features of the text to be recognized to generate a text classification result.
  • The text classification result indicates whether a specified type of noise is present; based on the sensitive word recognition result and the text classification result, a text recognition result of the text to be recognized is generated.
  • The present disclosure provides a text classification device.
  • The text classification device includes: an acquisition module for acquiring the text to be classified; a feature value generation module for generating feature values of the text-type features of the text to be classified based on preset text-type features and the text to be classified; and a classification determination module for performing text classification processing on the text to be classified according to the feature values of the text-type features to obtain a text classification result, where the text classification result indicates whether a specified type of noise is present.
  • The present disclosure provides a text recognition device.
  • The text recognition device includes: a word recognition module for performing sensitive word recognition on the acquired text to be recognized to obtain a sensitive word recognition result; a classification module for performing text classification processing on the text to be recognized according to the feature values of the text-type features of the text to be recognized to generate a text classification result, where the text classification result indicates whether a specified type of noise is present; and a result generation module for generating a text recognition result of the text to be recognized based on the sensitive word recognition result and the text classification result.
  • The present disclosure provides an electronic device.
  • The electronic device includes: at least one processor; and a memory communicatively connected to the at least one processor.
  • The memory stores one or more computer programs that can be executed by the at least one processor, so that the at least one processor can execute the above-mentioned text classification method or text recognition method.
  • The present disclosure provides a computer-readable storage medium on which a computer program is stored.
  • The computer program implements the above-mentioned text classification method or text recognition method when executed by a processor or processing core.
  • The embodiments provided by the present disclosure can generate feature values of the text-type features of the text to be classified based on the preset text-type features and the text to be classified, and perform text classification processing on the generated feature values to obtain a text classification result, from which it can be determined whether a specified type of noise is present in the text to be classified.
  • Because this text classification method determines from text features whether a specified type of noise is present in the text to be classified, the interference caused by noise data can be reduced during text recognition on the basis of the classification result, which helps to obtain objective text recognition results.
  • Figure 1 is a scene diagram of a voice call service provided by an embodiment of the present disclosure.
  • Figure 2 is a flow chart of a text classification method provided by an embodiment of the present disclosure.
  • Figure 3 is a schematic flowchart of model training and model use provided by an embodiment of the present disclosure.
  • Figure 4 is a flow chart of a text recognition method provided by an embodiment of the present disclosure.
  • Figure 5 is a flow chart of another text recognition method provided by an embodiment of the present disclosure.
  • Figure 6 is a block diagram of a text classification device provided by an embodiment of the present disclosure.
  • Figure 7 is a block diagram of a text recognition device provided by an embodiment of the present disclosure.
  • Figure 8 is a block diagram of an electronic device provided by an embodiment of the present disclosure.
  • Speech recognition technology converts speech data into text information.
  • Speech recognition technology involves many disciplines and technical fields, such as acoustics, phonetics, linguistics, digital signal processing theory, information theory, and computer science. Because speech signals are diverse and complex, the performance of speech processing equipment is easily affected by factors such as the size of the recognition vocabulary, the complexity of the speech, the quality of the speech signal, the number of speakers (single speaker vs. multiple speakers), the quality of the call hardware, and processing power. Under the influence of these factors, the recognition accuracy of speech processing is limited.
  • Figure 1 is a scene diagram of a voice call service provided by an exemplary embodiment of the present disclosure. As shown in Figure 1, this scenario includes: user 10, user communication device 11, customer service agent 20, agent communication device 21, communication network 30 and voice processing device 40. The user communication device 11 establishes a call with the agent communication device 21 through the communication network 30, and the customer service agent 20 provides voice call services to the user 10 during the call, such as answering inquiries and handling business.
  • The voice processing device 40 can obtain, from the communication network 30, the voice data that requires quality inspection among the voice data of both parties to the call, convert that voice data into the corresponding dialogue text through automatic speech recognition, and perform service quality detection based on the dialogue text to obtain service quality detection results.
  • The main content of voice service quality inspection is sensitive word recognition.
  • Sensitive words include words that do not comply with norms such as industry norms, management norms and/or disciplinary norms.
  • A sensitive word database can be established in advance based on the usage scenario.
  • The sensitive word database can include dirty words, abusive words, uncivilized words, threatening words, words related to major events, and other sensitive words defined according to specific norms. Since the same word may or may not be a sensitive word in different language environments, the sensitive word database needs to be updated in a timely manner according to the language environment of the usage scenario.
  • Sensitive words may also include words of interest.
  • The words of interest may include at least one of service evaluation terms, business evaluation terms, and business keywords.
  • The call between the user 10 and the customer service agent 20 may be subject to interference from various types of noise, such as environmental noise, human voice interference, reverberation, echo and other interference sources.
  • The source of environmental noise can be a machine capable of playing meaningful audio signals (such as a radio or an audio player).
  • Reverberation can be understood as an acoustic phenomenon in which a sound signal is repeatedly reflected and absorbed by obstacles during propagation, so that the reflected sound waves superimpose on the original signal.
  • Echo, also called acoustic echo, can be understood as a repeated sound signal formed when the sound played by the speaker of the speech processing device itself propagates and is reflected in the space. This repeated sound signal is transmitted back to the microphone and forms noise interference.
  • This kind of interference is referred to here as call return noise, or simply return noise.
  • This noise interference is usually caused by the hardware characteristics of the communication equipment itself. For example, the customer service agent's communication device may isolate its own transmit and receive loops poorly, or its loudspeaker may be loud and its microphone highly sensitive, so that the sound played by the device's own speaker is picked up by the microphone.
  • The sound picked up by the microphone is then mixed with the voice data of the customer service agent, forming return noise in the dialogue text of the customer service agent.
  • The following schematic dialogue text shows, in the form of a dialogue, part of the dialogue text between a user (hereinafter referred to as a customer) and a customer service agent (hereinafter referred to as an agent).
  • In each line of dialogue, the speaker's identity and the corresponding call content can be separated by a separator symbol (such as a colon): the left side of the colon gives the speaker's identity, and the right side gives the speaker's call content in text form.
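The speaker-and-content layout described above can be parsed with a few lines of code. The following is a minimal sketch in Python, assuming each dialogue line has the form `speaker: content` with a colon as the separator. The names `Utterance` and `parse_dialogue` are hypothetical and do not come from the disclosure.

```python
from typing import List, NamedTuple


class Utterance(NamedTuple):
    speaker: str   # text on the left of the separator, e.g. "Agent"
    content: str   # text on the right of the separator


def parse_dialogue(lines: List[str], separator: str = ":") -> List[Utterance]:
    """Split each dialogue line at the first separator into speaker and content."""
    utterances = []
    for line in lines:
        speaker, found, content = line.partition(separator)
        if not found:
            # Assumption: a line without a separator has no speaker and is skipped.
            continue
        utterances.append(Utterance(speaker.strip(), content.strip()))
    return utterances


dialogue = parse_dialogue([
    "Customer: Please stop calling me during working hours.",
    "Agent: We will make a note here. Goodbye.",
])
```

Splitting at the first separator only means that a colon inside the call content (for example a time of day) stays part of the content.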
  • Example 1: The call content may include the following text information.
  • Agent: A little bit qualified. We will make a note here to reduce the number of calls to you during working hours. Goodbye.
  • Example 2: The call content may include the following text information.
  • Example 3: The call content may include the following text information.
  • In Example 3, the "neurosis" in the agent's call content results from return-noise interference combined with speech recognition errors introduced when automatic speech recognition converts the speech to text. That is to say, when automatic speech recognition is used to convert voice call data into text (also called speech transcription), the return noise data caused by the voice backhaul phenomenon is transcribed as well. Regardless of whether that transcription is correct, it introduces noise interference into the transcription result.
  • Voiceprint recognition technology, also called speaker recognition technology, is a core intelligent voice technology that uses computer systems to identify speakers automatically. It relies on the speaker-specific characteristics contained in the voice data and uses computers and information recognition technology to automatically identify the speaker of the current voice. This technology is used to identify noise data in speech data and remove it, in order to improve the accuracy of automatic speech recognition and thereby the accuracy of speech quality inspection.
  • Voiceprint recognition technology can be used to identify the speaker in the voice call data. Based on the identified current speaker, the voice data that does not belong to the current speaker is eliminated and the voice data of the current speaker is retained, thereby denoising the voice call data.
  • The denoising of voice data through voiceprint recognition technology may include: receiving voice data; automatically identifying each speaker through voiceprint recognition technology, based on the speaker-specific characteristics in the voice data; treating audio data other than the designated speaker's voice data as noise data; removing the noise data from the current voice data while retaining the designated speaker's voice data; and using automatic speech recognition to transcribe the designated speaker's voice data into the designated speaker's dialogue text.
  • Since the audio of the noise data is usually short, voiceprint recognition technology usually cannot correctly identify the speaker of the current voice. When noise data is superimposed on the voice data of both parties to the call, for example when both parties speak at the same time, voiceprint recognition becomes even more difficult.
  • Denoising through voiceprint recognition and then transcribing through speech recognition requires two technologies, so the processing is complicated and cumbersome and the processing efficiency is low. Therefore, in related technologies, the noise data in the call voice data is not easily identified correctly and may even cause the call voice data to be misrecognized, resulting in low recognition accuracy.
  • In the embodiments of the present disclosure, the text classification method can determine whether a specified type of noise is present in speech data; the text recognition method can perform further recognition processing based on the determination of whether the specified type of noise is present, to obtain a recognition result that takes the presence of the noise into account.
  • The text classification method and text recognition method according to the embodiments of the present disclosure can be executed by electronic devices such as terminal devices or servers.
  • The terminal device can be user equipment (UE), a mobile device, a user terminal, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, or another device with data processing capabilities.
  • These methods can be implemented by the processor in the terminal device calling computer-readable program instructions stored in the memory; alternatively, these methods can be executed by a server.
  • Figure 2 is a flow chart of a text classification method provided by an embodiment of the present disclosure.
  • The text classification method includes the following steps S210 to S230.
  • In step S210, the text to be classified is obtained.
  • The processing device may obtain the text to be classified in various ways. For example, it may directly use a text in the conversation text as the text to be classified; or multiple texts may be stored in the processing device in advance, and the processing device obtains each text one by one as the current text to be classified; or, if the processing device needs to classify the current text to be recognized during text recognition processing, the text to be recognized can be used directly as the text to be classified.
  • In step S220, based on the preset text-type features and the text to be classified, feature values of the text-type features of the text to be classified are generated.
  • The text-type features are text-related feature items preset for the text to be classified. These feature items can be used to characterize the probability that sensitive words are present in the conversation text.
  • In step S230, text classification processing is performed on the text to be classified according to the feature values of the text-type features, and a text classification result indicating whether the specified type of noise is present is generated.
  • The specified type of noise may be a sound signal that carries some practical meaning and degrades the quality of the collected voice data.
  • The specified type of noise includes, but is not limited to, the noise described in the above embodiments: noise caused by environmental noise, human voice interference, reverberation, echo and other interference sources.
  • The text classification method can generate feature values of the text-type features of the text to be classified based on the preset text-type features and the text to be classified, and perform text classification processing on the generated feature values to obtain a text classification result.
  • From the text classification result, it can be determined whether the specified type of noise is present in the text to be classified.
  • Because this method determines from text features whether a specified type of noise is present in the text to be classified, the interference caused by noise data can be reduced in the subsequent text recognition process on the basis of the classification result, which helps to obtain objective text recognition results.
  • The text classification method of the embodiment of the present disclosure determines whether a specified type of noise is present in the text to be classified based on the feature values of the text-type features of the text to be classified, and is independent of voiceprint recognition processing. Therefore, when the method determines whether there is noise in the dialogue text, it is not affected by the factors that degrade the accuracy of voiceprint recognition in related technologies, and its accuracy is higher.
  • In the related art, the acquired voice data must first be denoised based on voiceprint recognition, and speech recognition is then performed on the denoised voice data. By contrast, the text classification method in the embodiment of the present disclosure works directly on the acquired text to be classified and determines from text features whether the specified type of noise is present, which improves the accuracy of the classification results, simplifies processing, and improves processing efficiency.
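Steps S210 to S230 can be pictured as a small pipeline. The following Python sketch is illustrative only: the function names, the `ValueRule` type, and the toy rule and toy classifier are assumptions made for this example, and the classifier used by the disclosure is not specified in this passage.

```python
from typing import Callable, Dict

# A value rule maps a text to the numeric value of one text-type feature.
ValueRule = Callable[[str], float]


def generate_feature_values(text: str, rules: Dict[str, ValueRule]) -> Dict[str, float]:
    """Step S220: apply every preset value rule to the text to be classified."""
    return {name: rule(text) for name, rule in rules.items()}


def classify_text(text: str, rules: Dict[str, ValueRule],
                  classifier: Callable[[Dict[str, float]], bool]) -> bool:
    """Steps S220 and S230 for a text obtained in S210.

    Returns True when the classifier judges the specified type of noise present.
    """
    feature_values = generate_feature_values(text, rules)  # S220
    return classifier(feature_values)                       # S230


# Toy stand-ins (assumptions for illustration, not the disclosed features).
customer_text = "Can you be a little more polite?"


def repeats_customer_words(text: str) -> float:
    """1.0 if the whole text, minus a trailing period, occurs in the customer text."""
    return 1.0 if text.lower().rstrip(".") in customer_text.lower() else 0.0


def toy_classifier(values: Dict[str, float]) -> bool:
    return values["repeats_customer_words"] == 1.0


toy_rules = {"repeats_customer_words": repeats_customer_words}
noisy = classify_text("A little more polite.", toy_rules, toy_classifier)
clean = classify_text("We will make a note.", toy_rules, toy_classifier)
```

Keeping the value rules and the classifier as separate arguments mirrors the split between generating feature values and classifying on them.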
  • The preset text-type features include at least one text-type feature.
  • The step of generating feature values of the text-type features of the text to be classified may specifically include the following steps S11 and S12.
  • In step S11, the value rule of each text-type feature in the at least one text-type feature is determined.
  • Each text-type feature is represented as a feature operator, and each feature operator describes the value rule of one text-type feature, that is, the correspondence between the text-type feature and its different feature values.
  • In step S12, based on the value rule of each text-type feature, a feature value of each text-type feature of the text to be classified is generated.
  • The feature value of a text-type feature is a numerical representation of that feature.
  • Text classification based on the feature values of the text-type features of the text to be classified can reflect whether a specified type of noise is objectively present in the dialogue text, and the richer the types of preset text-type features, the more accurate the resulting text classification results.
  • The text to be classified is a text selected from pre-obtained conversation texts.
  • The at least one text-type feature includes at least one of the following: sensitive word distribution features, which characterize the distribution of sensitive words in the conversation texts; predetermined features of the text itself, which characterize predetermined features of the text to be classified; and predetermined features related to the dialogue text, which characterize predetermined features relating the text to be classified to the dialogue text.
  • The conversation text includes the dialogue text of the target object and the dialogue text of the conversation object, both generated during a call between the target object and the conversation object with which the target object talks; the text to be classified is one of the dialogue texts of the target object.
  • For example, the conversation text of the target object is the agent call text, and the conversation text of the conversation object is the customer call text. The agent call text contains multiple agent call texts, and the customer call text contains multiple customer call texts.
  • The at least one text-type feature includes a sensitive word distribution feature.
  • The above-mentioned step S11 may specifically include determining the value rule of at least one of the following text-type features included in the sensitive word distribution features:
  • the value rule of the first text-type feature, which characterizes whether only the text to be classified, among the dialogue texts of the target object, contains sensitive words;
  • the value rule of the second text-type feature, which characterizes whether the sensitive words in the text to be classified appear in the dialogue text of the conversation object;
  • the value rule of the third text-type feature, which characterizes whether there are sensitive words in the dialogue text of the conversation object;
  • the value rule of the fourth text-type feature, which characterizes whether there are sensitive words in a predetermined dialogue text and whether those sensitive words are consistent with the sensitive words in the text to be classified, where the predetermined dialogue text is one of the dialogue texts of the conversation object and is adjacent to the text to be classified.
  • The above step S12 may specifically include: based on at least one of the value rules of the first, second, third and fourth text-type features, generating the feature value of the corresponding at least one of the first, second, third and fourth text-type features of the text to be classified.
  • The feature operator T1 can be constructed through expression (1) to determine the value rule of the first text-type feature.
  • In expression (1), T1 denotes the feature operator applied to the first text-type feature, and the value of T1 is the feature value of the first text-type feature.
  • The first text-type feature characterizes whether only the text to be classified, among the dialogue texts of the target object, contains sensitive words. If so, T1 takes the value 0; if not, T1 takes the value 1.
  • If an agent call text genuinely contains sensitive words, other call texts, including customer call texts, are likely to contain sensitive words as well, so it is relatively unlikely that only the text to be classified contains sensitive words.
  • The above values of T1 are only illustrative. It is sufficient that the value of T1 when the first text-type feature is "yes" is smaller than its value when the feature is "no"; the specific values can be customized according to actual needs.
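The value rule of the first text-type feature can be sketched in Python as follows. This is a hedged illustration: the function names are hypothetical, and detecting a sensitive word by case-insensitive substring match is an assumption of this sketch rather than something the disclosure specifies. The values 0 and 1 follow the description above.

```python
from typing import Iterable, List


def contains_sensitive_word(text: str, sensitive_words: Iterable[str]) -> bool:
    """Assumption: a sensitive word is detected by case-insensitive substring match."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in sensitive_words)


def t1(target_texts: List[str], index: int, sensitive_words: Iterable[str]) -> int:
    """Return 0 if only target_texts[index] contains sensitive words, otherwise 1.

    target_texts holds the dialogue texts of the target object (e.g. the agent),
    and index selects the text to be classified.
    """
    words = list(sensitive_words)  # allow the iterable to be reused
    target_hit = contains_sensitive_word(target_texts[index], words)
    other_hit = any(
        contains_sensitive_word(text, words)
        for i, text in enumerate(target_texts)
        if i != index
    )
    return 0 if (target_hit and not other_hit) else 1


agent_texts = ["Hello.", "You idiot.", "Goodbye."]
value = t1(agent_texts, 1, {"idiot"})  # only the second agent text is flagged
```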
  • The feature operator T2 can be constructed through expression (2) to determine the value rule of the second text-type feature.
  • In expression (2), T2 denotes the feature operator applied to the second text-type feature, and the value of T2 is the feature value of the second text-type feature.
  • The second text-type feature characterizes whether the sensitive words in the text to be classified appear in the dialogue text of the conversation object.
  • For example, when the conversation text of the target object is the agent call text, the text to be classified is any one of the agent call texts, and the conversation text of the conversation object is the customer call text, the second text-type feature means: whether the sensitive words in the text to be classified appear in the customer call text adjacent to the text to be classified.
  • If a sensitive word in the text to be classified also appears in the customer call text adjacent to the text to be classified, the probability that the sensitive word genuinely belongs to the text to be classified is small; if the sensitive words in the text to be classified do not appear in the adjacent customer call text, they are more likely to be real conversation content of the text to be classified.
  • For example, Example 1 is "Customer: You keep calling me during my working hours. Can you be a little bit qualified? Agent: A little bit qualified. We will make a note here to reduce the number of calls to you during working hours. Goodbye." Assume that the text to be classified is the agent call text in Example 1. Since "a little bit qualified" appears not only in the agent call text but also in the customer call text adjacent to it, the probability that this phrase was actually spoken in the text to be classified (that is, the agent call text) is small, and the probability that it is return noise is high.
  • The values of T2 are only illustrative. It is sufficient that the value of T2 when the second text-type feature is "yes" is smaller than its value when the feature is "no"; the specific values can be customized according to actual needs.
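A possible value rule for the second text-type feature is sketched below in Python. The concrete values 0 for "yes" and 1 for "no" are assumptions chosen only to satisfy the stated ordering (the "yes" value is smaller than the "no" value). The function names and the substring-based matching are likewise hypothetical.

```python
from typing import Iterable, List, Set


def find_sensitive_words(text: str, sensitive_words: Iterable[str]) -> Set[str]:
    """Assumption: case-insensitive substring match against the word list."""
    lowered = text.lower()
    return {word for word in sensitive_words if word.lower() in lowered}


def t2(target_text: str, adjacent_customer_texts: List[str],
       sensitive_words: Iterable[str], yes_value: int = 0, no_value: int = 1) -> int:
    """Return yes_value if a sensitive word of the text to be classified also
    appears in an adjacent customer call text, otherwise no_value."""
    hits = find_sensitive_words(target_text, sensitive_words)
    repeated = any(find_sensitive_words(text, hits) for text in adjacent_customer_texts)
    return yes_value if repeated else no_value


words = {"no manners"}
echoed = t2("No manners. We will make a note here.",
            ["You keep calling me at work. Have you no manners?"], words)
not_echoed = t2("No manners. We will make a note here.",
                ["Please stop calling."], words)
```

Under this sketch, a smaller value corresponds to the case the description associates with return noise.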
  • The feature operator T3 can be constructed through expression (3) to determine the value rule of the third text-type feature.
  • In expression (3), T3 denotes the feature operator applied to the third text-type feature, and the value of T3 is the feature value of the third text-type feature. The third text-type feature characterizes whether there are sensitive words in the dialogue text of the conversation object.
  • For example, when the conversation text of the target object is the agent call text, the text to be classified is one of the agent call texts, and the conversation text of the conversation object is the customer call text, the third text-type feature means: whether the customer call text contains sensitive words.
  • If the dialogue text of the conversation object also contains sensitive words (any sensitive words suffice; they may be the same as or different from the sensitive words in the text to be classified), the probability that the sensitive words genuinely belong to the text to be classified is relatively high; if the dialogue text of the conversation object contains no sensitive words, the probability that the sensitive words in the text to be classified are actual dialogue content is small.
  • The values of T3 are only illustrative. It is sufficient that the value of T3 when the third text-type feature is "no" is smaller than its value when the feature is "yes"; the specific values can be customized according to actual needs.
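The third text-type feature only asks whether the customer side of the dialogue contains any sensitive word at all. A minimal Python sketch follows; the values 0 for "no" and 1 for "yes" are assumptions that merely respect the stated ordering, and the function name and substring matching are hypothetical.

```python
from typing import Iterable, List


def t3(customer_texts: List[str], sensitive_words: Iterable[str],
       no_value: int = 0, yes_value: int = 1) -> int:
    """Return yes_value if any customer call text contains any sensitive word
    (assumed: case-insensitive substring match), otherwise no_value."""
    words = [word.lower() for word in sensitive_words]
    found = any(word in text.lower() for text in customer_texts for word in words)
    return yes_value if found else no_value


customer_texts = ["You keep calling me.", "You are a liar."]
with_words = t3(customer_texts, {"liar", "idiot"})
without_words = t3(["Please stop calling."], {"liar", "idiot"})
```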
  • the feature operator T 4 can be constructed through the following expression (4) to determine the value rule of the fourth text type feature.
  • T 4 ( ⁇ 4 ) is referred to as operator T 4
  • ⁇ 4 is the fourth text class feature
  • T 4 ( ⁇ 4 ) is the characteristic value of the fourth text class feature
  • The fourth text-type feature ⁇ 4 is used to characterize whether the predetermined dialogue text contains sensitive words and, if so, whether those sensitive words are consistent with the sensitive words in the text to be classified; here, the predetermined dialogue text is one of the dialogue texts of the dialogue object and is adjacent to the text to be classified.
  • the conversation text of the target object is an agent call text
  • the text to be classified is an agent call text among the agent call texts
  • the conversation text of the conversation object with the target object is a customer call text
  • In this case, the fourth text-type feature indicates whether the customer call text adjacent to the text to be classified contains sensitive words and, if so, whether those sensitive words are consistent with the sensitive words in the text to be classified.
  • During a call, the agent and the customer take turns speaking. If a dispute arises between the two and the agent call text for one of the agent's utterances contains sensitive words such as uncivilized words, the adjacent customer call text (the customer call text immediately before or immediately after that agent call text in the dialogue text) also contains sensitive words with high probability.
  • When the adjacent customer call text does contain sensitive words, they may be the same as or different from those in the agent call text. If they are different, the possibility of return noise is excluded; therefore, compared with the situation in which they are the same, the sensitive words in the text to be classified are more likely to be actual conversation content when they are different.
  • By contrast, the situation in which only the agent call text contains sensitive words such as uncivilized words, while the adjacent customer call text contains none, has a low probability of occurring in real scenarios.
  • The above T 4 ( ⁇ 4 ) is only a schematic illustration: it is sufficient that the value of T 4 ( ⁇ 4 ) when ⁇ 4 is “No” is smaller than its value when ⁇ 4 is “Yes, consistent”; the specific values can be customized according to actual needs.
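  • The value rules for T 3 and T 4 can be sketched in Python as follows. This is only an illustrative sketch: the numeric values, the function names, the substring matching, and the reading of "consistent" as "the same set of matched words" are assumptions, not taken from the expressions in the source. The source fixes only the ordering of the values; placing "inconsistent" above "consistent" follows the reasoning above that differing sensitive words rule out return noise.

```python
# Illustrative values only (assumed); the source requires only the orderings.
T3_VALUES = {"no": 0, "yes": 1}
T4_VALUES = {"no": 0, "yes_consistent": 1, "yes_inconsistent": 2}


def t3(customer_texts, sensitive_words):
    """T3: does any customer call text contain any sensitive word?"""
    hit = any(w in text for text in customer_texts for w in sensitive_words)
    return T3_VALUES["yes" if hit else "no"]


def t4(agent_text, adjacent_customer_texts, sensitive_words):
    """T4: compare sensitive words in adjacent customer texts with the agent's."""
    agent_hits = {w for w in sensitive_words if w in agent_text}
    customer_hits = {
        w for w in sensitive_words
        if any(w in text for text in adjacent_customer_texts)
    }
    if not customer_hits:
        return T4_VALUES["no"]
    if customer_hits == agent_hits:
        return T4_VALUES["yes_consistent"]
    return T4_VALUES["yes_inconsistent"]
```

  • A production system would likely replace the substring test with the same sensitive word recognition that is applied to the text to be classified.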
  • At least one text-type feature includes a predetermined feature of the text itself.
  • The above-mentioned step S11 may specifically include: determining the value rule of at least one of the following text-type features included in the predetermined features of the text itself: the value rule of the fifth text-type feature and the value rule of the sixth text-type feature.
  • The fifth text-type feature is used to represent the sentence integrity information of the text to be classified; the sixth text-type feature is used to represent the total number of times a specific word appears at specified positions in the dialogue text of the target object.
  • The above step S12 may specifically include: based on at least one of the value rule of the fifth text-type feature and the value rule of the sixth text-type feature, generating the feature value, for the text to be classified, of at least one of the fifth text-type feature and the sixth text-type feature.
  • In this way, the predetermined features of the text itself can represent at least one of the following information items: the sentence integrity information of the text to be classified, and the total number of times a specific word appears at specified positions in the dialogue text of the target object.
  • the feature operator T 5 can be constructed through the following expression (5) to determine the value rule of the fifth text type feature.
  • T 5 ( ⁇ 5 ) is referred to as operator T 5
  • ⁇ 5 is the fifth text class feature
  • T 5 ( ⁇ 5 ) is the characteristic value of the fifth text class feature
  • The fifth text-type feature ⁇ 5 is used to characterize the sentence integrity of the text to be classified.
  • Sentence integrity covers the soundness of the sentence structure and the coherence of the sentence semantics; the value of ⁇ 5 can be obtained by scoring the text to be classified with a preset semantic model. The higher the score, the better the sentence integrity.
  • For example, if the sentence completeness score of the text to be classified is less than or equal to 0.5, T 5 ( ⁇ 5 ) takes the value 0; if the score is greater than 0.5 and less than or equal to 0.8, T 5 ( ⁇ 5 ) takes the value 1; if the score is greater than 0.8, T 5 ( ⁇ 5 ) takes the value 2.
  • The probability that sensitive words in the text to be classified are real conversation content increases with the sentence completeness of the text to be classified: the higher the sentence completeness, the higher the probability that the sensitive words in the text to be classified are real conversation content.
  • The above T 5 ( ⁇ 5 ) is only a schematic illustration; the specific values can be customized according to actual needs.
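  • The three-band rule for T 5 described above can be sketched as a small Python function. The thresholds 0.5 and 0.8 and the values 0, 1, 2 come from the example in the text; the function name is hypothetical, and the lowest band is assumed to cover scores up to and including 0.5.

```python
def t5(completeness_score: float) -> int:
    """Map a sentence completeness score to the T5 feature value.

    Assumption: scores <= 0.5 fall in the lowest band (value 0).
    """
    if completeness_score <= 0.5:
        return 0
    if completeness_score <= 0.8:
        return 1
    return 2
```

  • The score itself would come from the preset semantic model, which is outside the scope of this sketch.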
  • The feature operator T 6 can be constructed through the following expression (6) to determine the value rule of the sixth text-type feature.
  • T 6 ( ⁇ 6 ) is referred to as operator T 6
  • ⁇ 6 is the sixth text type feature
  • T 6 ( ⁇ 6 ) is the characteristic value of the sixth text type feature.
  • The probability that a sensitive word in the text to be classified is real conversation content is inversely related to the number of times a specific word appears at the specified positions: the more times the specific word appears at the specified positions, the lower the probability that the sensitive words in the text to be classified are real conversation content; the fewer times it appears, the higher that probability.
  • For example, when the specific word is a polite word, the greater the total number of times polite words appear at all specified positions, the lower the probability that the sensitive words in the text to be classified are real conversation content; the smaller that total, the higher the probability that the sensitive words in the text to be classified are real conversation content.
  • the specific words are polite words
  • the specified positions include at least the beginning position (the first conversation text in the conversation text) and the end position (the last conversation text in the conversation text).
  • The above T 6 ( ⁇ 6 ) is only a schematic illustration; the specific values can be customized according to actual needs.
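  • A possible sketch of the sixth text-type feature in Python is shown below. The polite-word lexicon, the thresholds c1 and c2, the output values, and the function names are all hypothetical; the source states only that the specified positions include at least the first and last conversation texts and that more polite words should yield a lower feature value.

```python
POLITE_WORDS = ("hello", "thank you", "goodbye")  # hypothetical lexicon


def count_polite_at_positions(target_texts, polite_words=POLITE_WORDS):
    """Count polite-word occurrences in the first and last conversation texts."""
    if not target_texts:
        return 0
    positions = [target_texts[0]]
    if len(target_texts) > 1:
        positions.append(target_texts[-1])
    return sum(text.lower().count(w) for text in positions for w in polite_words)


def t6(total_count, c1=1, c2=3):
    """Fewer polite words at the specified positions -> larger feature value.

    c1 and c2 are assumed thresholds, not values from the source.
    """
    if total_count < c1:
        return 2
    if total_count < c2:
        return 1
    return 0
```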
  • At least one text-type feature includes predetermined features related to the dialogue text.
  • The above-mentioned step S11 may specifically include: determining the value rule of at least one of the following text-type features included in the predetermined features related to the dialogue text: the value rule of the seventh text-type feature and the value rule of the eighth text-type feature.
  • The seventh text-type feature is used to characterize the number of text items in the dialogue text to which the text to be classified belongs; the eighth text-type feature is used to characterize the position where the text to be classified appears in the dialogue text.
  • The above step S12 may specifically include: based on at least one of the value rule of the seventh text-type feature and the value rule of the eighth text-type feature, generating the feature value, for the text to be classified, of at least one of the seventh text-type feature and the eighth text-type feature.
  • In this way, the predetermined features related to the dialogue text can represent at least one of the following information items: the number of text items in the dialogue text, and the position where the text to be classified appears in the dialogue text.
  • the feature operator T 7 can be constructed through the following expression (7) to determine the value rule of the seventh text type feature.
  • T 7 ( ⁇ 7 ) is referred to as operator T 7
  • ⁇ 7 is the seventh text type feature
  • the value of T 7 ( ⁇ 7 ) is the characteristic value of the seventh text type feature.
  • The seventh text-type feature ⁇ 7 is used to characterize the total number of texts (conversation turns) contained in the dialogue text to which the text to be classified belongs, that is, how many texts are produced in total during a call between the two parties; K1 is smaller than K2, and K1 and K2 are both integers greater than or equal to 1.
  • The probability that a sensitive word in the text to be classified is real conversation content increases with the total number of texts in the dialogue text to which the text to be classified belongs: the more texts that dialogue text contains (that is, the more rounds the call has), the higher the probability that the sensitive words in the text to be classified are real conversation content.
  • The above T 7 ( ⁇ 7 ) is only a schematic illustration; it needs to be satisfied that the larger ⁇ 7 is, the larger the value of T 7 ( ⁇ 7 ) is.
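  • A minimal sketch of a monotone rule for T 7 with the two thresholds K1 and K2 follows. The default threshold values, the output values, and the function name are assumptions; the source requires only that K1 < K2, that both are integers of at least 1, and that a larger ⁇ 7 gives a larger value.

```python
def t7(num_texts: int, k1: int = 10, k2: int = 30) -> int:
    """Map the total number of texts in the dialogue to the T7 feature value."""
    if not 1 <= k1 < k2:
        raise ValueError("expected integers with 1 <= K1 < K2")
    if num_texts < k1:
        return 0
    if num_texts < k2:
        return 1
    return 2
```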
  • the feature operator T 8 can be constructed through the following expression (8) to determine the value rule of the eighth text type feature.
  • T 8 ( ⁇ 8 ) is referred to as operator T 8
  • x represents the ordinal position of the sentence at which the text to be classified appears in the dialogue text
  • L is the total number of texts contained in the dialogue text
  • The eighth text-type feature is used to represent the position where the text to be classified appears in the dialogue text. Specifically, the cases of the expression indicate, respectively, that the text to be classified appears in the front part, the middle part, or the later part of the dialogue text.
  • The probability that a sensitive word in the text to be classified is real conversation content is related to the position of the text to be classified in the dialogue text: the later the text to be classified appears in the dialogue text, the higher the probability that the sensitive words in the text to be classified are real conversation content.
  • The above T 8 ( ⁇ 8 ) is only a schematic illustration; the specific values can be customized according to actual needs.
  • For each of the above feature operators, a larger value indicates a higher probability that the sensitive word actually exists in the text to be classified.
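  • The position rule for T 8 can be sketched as follows. The boundaries of the three bands did not survive in the text above, so splitting the dialogue into thirds is purely an assumption, as are the output values and the function name; x is taken to be a 1-based index and L the total number of texts.

```python
def t8(x: int, total: int) -> int:
    """Map the position x (1-based) among `total` texts to the T8 feature value.

    Assumed bands: first third -> 0, middle third -> 1, last third -> 2.
    Integer arithmetic avoids floating-point edge cases at the boundaries.
    """
    if not 1 <= x <= total:
        raise ValueError("expected 1 <= x <= total")
    if 3 * x <= total:
        return 0
    if 3 * x <= 2 * total:
        return 1
    return 2
```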
  • the text to be classified belongs to the dialogue text of the target object.
  • In step S230, performing text classification processing on the text to be classified according to the feature value of the text-type feature to obtain the text classification result may specifically include the following steps S21 and S22.
  • step S21 based on the preset portrait features, feature values of the text to be classified corresponding to the portrait features of the target object are obtained.
  • the portrait features are used to characterize the individual characteristics of the target object.
  • step S22 text classification processing is performed on the text to be classified according to the feature values of the text feature and the feature value of the portrait feature to obtain a text classification result.
  • Portrait features can be used to assist the judgment, based on text-type features, of whether the specified type of noise exists.
  • Text classification processing can thus be performed on the text to be classified using both kinds of features, improving the accuracy of the text classification results.
  • The preset portrait features include at least one portrait feature; in step S21, obtaining, based on the preset portrait features, the feature value of the text to be classified corresponding to the portrait features of the target object may specifically include the following steps S31 and S32.
  • step S31 a value rule for each portrait feature is determined based on at least one portrait feature.
  • each portrait feature is represented as a feature operator, and each feature operator is used to describe the value rule of a portrait feature, that is, the correspondence between the portrait feature and different feature values.
  • The feature value of a portrait feature is a numerical representation of that portrait feature; text classification processing based on the feature values of both the text-type features and the portrait features of the text to be classified can more accurately reflect whether the specified type of noise objectively exists in the text to be classified. The richer the types of preset portrait features, the more accurate the subsequent text classification result obtained by combining the feature values of the two kinds of features (text and portrait).
  • The target object is a customer service agent; the individual characteristics are used to characterize at least one of the following information items: agent level, agent length of service, the number of times the agent's speech did not comply with the predetermined speech rules within a predetermined statistical period, and whether there is a historical record of the agent's speech not complying with the predetermined speech rules because of sensitive words contained in the text to be classified.
  • Setting multiple different types of portrait features provides auxiliary evidence for subsequently judging whether the specified type of noise exists in the text to be classified, thereby improving the accuracy of the final classification result.
  • the feature operator S 1 can be constructed through the following expression (9) to determine the value rule for the individual feature of agent service age.
  • S 1 ( ⁇ 1 ) is referred to as operator S 1
  • ⁇ 1 represents the length of service of the call customer service agent, and the unit can be years
  • S 1 ( ⁇ 1 ) represents the value of operator S 1
  • The probability that the sensitive words in the text to be classified are real conversation content is inversely related to the length of service of the call agent: the longer the length of service (for example, ⁇ 1 is greater than A2, such as 10 years), the lower the probability that the sensitive words in the text to be classified are real conversation content; the shorter the length of service (for example, ⁇ 1 is less than A1, such as 1 year), the higher the probability that the sensitive words in the text to be classified are real conversation content.
  • the feature operator S 2 can be constructed through the following expression (10) to determine the value rule of the feature of agent level.
  • S 2 ( ⁇ 2 ) is referred to as operator S 2
  • ⁇ 2 represents the agent level
  • I, II, and III respectively represent three agent levels from high to low: level I is the highest, level II is the second, and level III is the lowest.
  • The probability that sensitive words in the text to be classified are real conversation content is inversely related to the agent level: the higher the agent level, the lower the probability that sensitive words in the text to be classified are real conversation content; the lower the agent level, the higher that probability.
  • The feature operator S 3 can be constructed through the following expression (11) to determine the value rule of the individual feature describing the number of times the agent's speech did not comply with the predetermined speech rules within a predetermined statistical period.
  • S 3 ( ⁇ 3 ) is referred to as operator S 3 ; ⁇ 3 represents the number of times the agent was penalized within the predetermined statistical period (for example, within the last month) because the agent's speech did not comply with the predetermined speech rules; S 3 ( ⁇ 3 ) represents the value of operator S 3 .
  • The probability that the sensitive words in the text to be classified are real conversation content increases with the number of times the agent was penalized in the predetermined statistical period for speech that did not comply with the predetermined speech rules; for example, the more times the agent was penalized for speech violations within the last month, the higher the probability that the sensitive words in the text to be classified are real conversation content.
  • The feature operator S 4 can be constructed through the following expression (12) to determine the value rule of the individual feature describing whether there is a historical record of the agent being penalized because sensitive words contained in the text to be classified caused the agent's speech not to comply with the predetermined speech rules.
  • S 4 ( ⁇ 4 ) is referred to as operator S 4 ; ⁇ 4 indicates whether, in the historical call data, the agent's speech failed to comply with the predetermined speech rules because the text to be classified contains sensitive words.
  • The probability that the sensitive words in the text to be classified are actual conversation content is related to whether the historical call data records such a penalty.
  • If, in the historical call data, the agent has been penalized for a speech violation caused by these sensitive words, the probability that the sensitive words in the text to be classified are actual conversation content is higher; if the agent has never been penalized for a speech violation caused by these sensitive words, that probability is low.
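  • The four portrait operators S 1 to S 4 can be sketched together as below. The thresholds A1 and A2 use the example values mentioned in the text (1 year and 10 years); the thresholds b1 and b2, all output values, and the function names are assumptions chosen only to respect the stated directions (longer service and higher level give lower values; more penalties and an existing penalty record give higher values).

```python
S2_VALUES = {"I": 0, "II": 1, "III": 2}  # level I is the highest level


def s1(years_of_service: float, a1: float = 1.0, a2: float = 10.0) -> int:
    """Longer length of service -> smaller value."""
    if years_of_service < a1:
        return 2
    if years_of_service <= a2:
        return 1
    return 0


def s2(level: str) -> int:
    """Higher agent level -> smaller value."""
    return S2_VALUES[level]


def s3(penalty_count: int, b1: int = 1, b2: int = 3) -> int:
    """More penalties in the statistical period -> larger value (b1, b2 assumed)."""
    if penalty_count < b1:
        return 0
    if penalty_count < b2:
        return 1
    return 2


def s4(has_penalty_record: bool) -> int:
    """An existing penalty record for these sensitive words -> larger value."""
    return 1 if has_penalty_record else 0
```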
  • The portrait features may also include at least one of the following information items: the number of evaluations, the number of complaints received within a predetermined statistical period because the speech did not comply with the predetermined speech rules, and so on.
  • More types of portrait features can be set as needed, which is not specifically limited in the embodiments of this disclosure.
  • Corresponding portrait features can be set to characterize at least one of the following information items: customer service level, customer credit score, number of customer points, and whether the customer has a corresponding bad historical record.
  • the association relationship can be a representation of a function or a model, and the association relationship can be obtained through model training.
  • In step S230, performing text classification processing on the text to be classified according to the feature value of the text-type feature, and generating a text classification result indicating whether the specified type of noise exists, may include the following steps S41 and S42.
  • step S41 the feature values of the text class features are processed by the first classification model to obtain the first text category of the text to be classified.
  • the first classification model is a model pre-trained using sample text.
  • step S42 a text classification result is generated based on a predetermined correspondence between the value of the first text category and whether there is a predetermined type of noise.
  • the first classification model is used to indicate an association between the feature value of the text class feature and the text class of the text to be classified.
  • The feature value of the text-type feature is processed by the first classification model to obtain the first text category of the text to be classified. When the first text category is the first value, for example 1, the corresponding text classification result is that the specified type of noise exists in the text to be classified; when the first text category is the second value, for example 0, the corresponding text classification result is that the specified type of noise does not exist in the text to be classified. Thus, based on the processing result output by the model, whether the specified type of noise exists in the text to be classified can be accurately judged; the processing steps are not cumbersome and the processing efficiency is high.
  • The training data of the first classification model is sample text obtained from the dialogue text corresponding to historical speech data. During training of the first classification model, the text-type feature values of the sample text are obtained, annotation information indicating whether the specified type of noise exists is added to the sample text, and model training is performed using the text-type feature values of the annotated sample text to obtain the first classification model. The first classification model is used to perform text classification processing on the text to be classified and to generate a text classification result indicating whether the specified type of noise exists in the text to be classified, which improves the processing efficiency and accuracy of text classification and recognition based on text-type features.
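  • The training procedure described above can be illustrated with a small logistic regression written in plain Python. This is a sketch under stated assumptions rather than the method of the disclosure: the toy feature rows, the labels, the learning rate, the epoch count, and all function names are hypothetical, and batch gradient descent stands in for whichever training algorithm is actually used. Each row holds eight text-type feature values (T 1 to T 8), and label 1 means the specified type of noise exists.

```python
import math

# Hypothetical toy data: low feature values tend to accompany return noise.
ROWS = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [1, 1, 1, 2, 2, 2, 2, 2],
    [1, 1, 1, 1, 2, 1, 2, 2],
    [0, 1, 1, 2, 1, 2, 1, 2],
]
LABELS = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def score(weights, bias, row):
    """Linear score of one feature row."""
    return bias + sum(w * v for w, v in zip(weights, row))


def train_logistic_regression(rows, labels, learning_rate=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(rows, labels):
            error = sigmoid(score(weights, bias, row)) - label
            grad_b += error
            for j, value in enumerate(row):
                grad_w[j] += error * value
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict_category(weights, bias, row):
    """Return the first text category: 1 (noise exists) or 0 (no noise)."""
    return 1 if sigmoid(score(weights, bias, row)) >= 0.5 else 0
```

  • Calling `train_logistic_regression(ROWS, LABELS)` and then `predict_category` on a new feature row yields the category value that step S42 maps to a text classification result.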
  • the above-mentioned step S22 is the step of performing text classification processing on the text to be classified according to the feature values of the text feature and the feature value of the portrait feature to obtain the text classification result, which may specifically include the following steps S51 and S52.
  • step S51 the feature values of the text feature and the feature value of the portrait feature are processed through the second classification model to obtain the second text category of the text to be classified.
  • the second classification model is a model trained using sample text.
  • step S52 a text classification result is generated based on a predetermined correspondence between the value of the second text category and whether there is a predetermined type of noise.
  • the second classification model is used to indicate the correlation between the feature values of the text-type features and the feature values of the portrait-type features and the text category of the text to be classified.
  • The feature values of the text-type features and the feature values of the portrait features are processed by the second classification model to obtain the second text category of the text to be classified. When the second text category is the first value, for example 1, the corresponding text classification result is that the specified type of noise exists in the text to be classified; when the second text category is the second value, for example 0, the corresponding text classification result is that the specified type of noise does not exist in the text to be classified. Thus, based on the processing result output by the model, whether the specified type of noise exists in the text to be classified can be accurately judged; the processing steps are not cumbersome and the processing efficiency is high.
  • The training data of the second classification model is sample text obtained from the dialogue text corresponding to the historical speech data; during training of the second classification model, the text-type feature values and portrait feature values of the sample text are obtained, and whether the specified type of noise exists in the sample text is annotated.
  • the following takes the second classification model as an example to describe the training process of the second classification model.
  • n represents the number of sample texts, and n is an integer greater than or equal to 1
  • T 1 to T 8 represent text-like features described by the above expressions (1)-(8)
  • S 1 to S 4 represent the portrait features described by the above expressions (9)-(12)
  • The annotation information value is an annotation of whether the specified type of noise exists in each sample text used for training the second classification model.
  • If the annotation information value is 0, the corresponding sample text does not have the specified type of noise; if the annotation information value is 1, the corresponding sample text has the specified type of noise.
  • For each sample text, the feature values of the text-type features and the feature values of the portrait features are determined, and corresponding annotation information is added (annotation information “0” indicates that the specified type of noise does not exist in the text, and annotation information “1” indicates that the specified type of noise exists in the text).
  • The model is trained to learn the correlation between the feature values of the text-type features and portrait features and the return noise.
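  • One way to assemble the training data described above is sketched below: each sample contributes a twelve-value row (T 1 to T 8 followed by S 1 to S 4) plus a 0/1 annotation. The class name, field names, and validation choices are hypothetical and serve only to make the data layout concrete.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class LabeledSample:
    text_features: Sequence[float]      # feature values of T1..T8
    portrait_features: Sequence[float]  # feature values of S1..S4
    has_noise: int                      # 1 = noise exists, 0 = no noise


def build_training_set(
    samples: Sequence[LabeledSample],
) -> Tuple[List[List[float]], List[int]]:
    """Concatenate text and portrait feature values into rows with labels."""
    rows: List[List[float]] = []
    labels: List[int] = []
    for sample in samples:
        if len(sample.text_features) != 8 or len(sample.portrait_features) != 4:
            raise ValueError("expected 8 text features and 4 portrait features")
        if sample.has_noise not in (0, 1):
            raise ValueError("annotation must be 0 or 1")
        rows.append(list(sample.text_features) + list(sample.portrait_features))
        labels.append(sample.has_noise)
    return rows, labels
```

  • The resulting rows and labels can be passed to any binary classifier, such as the LR, TextCNN, or BERT models named in this disclosure.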
  • The construction of text-type features and portrait features, and the return noise judgment based on their feature values, identify and judge noise data at the text level and the portrait level.
  • Voiceprint recognition technology, by contrast, segments and eliminates noise data at the speech level. Both the text classification method and the text recognition method in the embodiments of the present disclosure judge noise at the text level and the portrait level rather than on the voice data, and therefore do not have the problems of low denoising accuracy and complicated processing found in voiceprint recognition technology.
  • FIG 3 is a schematic flowchart of model training and model use provided by an embodiment of the present disclosure.
  • the model training process may include the following steps S301 to S303.
  • step S301 as shown in "Input training data" in Figure 3, the input training data can be obtained using a text classification task.
  • step S302 as shown in "Machine Learning Training" in Figure 3, a predetermined type of machine learning network is used for model training to obtain a trained model.
  • step S303 as shown in "Obtaining the trained text classification model" in Figure 3, the trained model is used as the second classification model.
  • The machine learning network can adopt any of the following models: a logistic regression (Logistic Regression, LR) model, the text classification model TextCNN, the pre-trained language model BERT, or another machine learning model.
  • The LR model is one of the simplest and most commonly used classification models in traditional machine learning.
  • The LR algorithm is simple, efficient, easy to parallelize, and supports online learning; it is very widely used in industry.
  • The TextCNN model obtains a two-dimensional sentence matrix from the word vectors, applies different filters in convolution operations to obtain multiple feature matrices (feature maps), performs a max pooling operation on each feature matrix, concatenates the results, and finally classifies them through a fully connected layer with a softmax classifier.
  • The TextCNN model has the advantage of a simple network structure.
  • The BERT model uses Bidirectional Encoder Representations from Transformers; the pre-trained BERT representation can be fine-tuned with an additional output layer and is suitable for building state-of-the-art models for a wide range of tasks.
  • In actual application scenarios, an appropriate model can be selected according to actual training needs, which is not specifically limited in the embodiments of this disclosure.
  • using the trained model to perform text classification processing may include the following steps S304 to S306.
  • In step S304, the feature values of the text-type features and the feature values of the portrait features of the text to be classified are calculated.
  • In step S305, as shown in "Model Processing" in Figure 3, the trained second classification model is used to process the feature values of the text-type features and the feature values of the portrait features of the text to be classified.
  • In step S306, as shown in "Output Text Category" in Figure 3, if the output text category is "1", it is determined that speech backhaul exists in the text to be recognized; if the output text category is "0", it is determined that no speech backhaul exists in the text to be recognized.
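  • Steps S304 to S306 can be sketched as a short inference function. The model is represented by any callable that returns 0 or 1; the stand-in model, the result strings, and the function names below are assumptions for illustration and do not reproduce the trained second classification model.

```python
from typing import Callable, List, Sequence

CATEGORY_TO_RESULT = {
    1: "speech backhaul present",
    0: "no speech backhaul",
}


def classify(
    text_feature_values: Sequence[float],
    portrait_feature_values: Sequence[float],
    model: Callable[[List[float]], int],
) -> str:
    """Run the three inference steps on precomputed feature values."""
    # S304: gather text-type and portrait feature values into one input row.
    features = list(text_feature_values) + list(portrait_feature_values)
    # S305: let the trained model produce the text category.
    category = model(features)
    # S306: map the category value to the classification result.
    if category not in CATEGORY_TO_RESULT:
        raise ValueError(f"unexpected text category: {category!r}")
    return CATEGORY_TO_RESULT[category]


def toy_model(features: List[float]) -> int:
    """Stand-in for a trained model (assumed rule): low feature sum -> noise."""
    return 1 if sum(features) < 6 else 0
```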
  • the model training process of the first classification model is similar to the training process of the second classification model.
  • the training data of the first classification model is the text class features of the sample text used to train the second classification model. eigenvalues.
  • the sample text used to train the first classification model and the sample text used to train the second classification model may be the same sample text, or they may be different sample texts.
  • the training process of the first classification model please refer to the corresponding content in the training of the second classification model, and will not be described again here.
  • model identification step it is necessary to calculate the feature values of the text-type features of the text to be classified, and use the first classification model obtained by training to process the feature values of the text-type features of the text to be classified to obtain the corresponding text category for use To determine whether there is return noise in the text to be classified.
  • In the embodiments of the present disclosure, the text-type features and portrait features are constructed from features including, but not limited to: the position and completeness of the text used; the presence or absence of sensitive words in the customer's call text and whether they are the same as or different from the agent's sensitive words; the agent's length of service and level; and the number of times the agent has suffered a loss of profits (for example, been penalized) for not complying with the speech rules.
  • During the speech return judgment process, the constructed text-type features and portrait features can be used in advance to obtain the feature values of the text-type features and the feature values of the portrait features of the sample texts used for model training. These feature values are used to generate training data, and the training data together with the annotation results of the sample texts (at least whether a specified type of noise exists) are used for model training to obtain a classification model.
  • The output of the trained model can be used to determine whether speech return exists in the text to be recognized.
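  • The training procedure described above can be sketched as follows. This is only a minimal sketch under stated assumptions: the disclosure does not name a model type, so a small logistic regression trained by stochastic gradient descent is used as a stand-in, and the function names `train_classifier` and `predict_category`, the two-feature layout, and the sample data are all hypothetical.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_classifier(samples, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights on feature-value vectors.

    samples: list of equal-length lists of numeric feature values.
    labels:  list of 0/1 annotations (1 = speech return noise exists).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = _sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def predict_category(model, x):
    """Return text category 1 (noise exists) or 0 (no noise)."""
    weights, bias = model
    p = _sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if p >= 0.5 else 0

# Hypothetical feature layout: [word only in this agent text, word also said by customer]
samples = [[1, 0], [1, 0], [0, 1], [0, 1]]
labels = [0, 0, 1, 1]  # hypothetical annotation results
model = train_classifier(samples, labels)
```

  • In practice the feature vector would hold all constructed text-type (and, for the second classification model, portrait) feature values, and any binary classifier could take the place of this stand-in.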
  • The specified type of noise includes return noise generated by sound return, where sound return refers to sound passing from the speaker of the calling device back into the microphone array during a call.
  • The text classification method of the embodiment of the present disclosure can generate the feature values of the text-type features of the text to be classified based on the preset text-type features and the text to be classified, and perform text classification processing on the generated feature values to obtain a text classification result. The text classification result is used to determine whether call return noise exists in the text to be classified, which improves the accuracy of the classification result.
  • The processing method and processing steps are simple and feasible, which improves the efficiency of obtaining the classification result.
  • Figure 4 is a flow chart of a text recognition method provided by an embodiment of the present disclosure. As shown in Figure 4, the method may include the following steps S410 to S430.
  • In step S410, sensitive word recognition is performed on the acquired text to be recognized, and a sensitive word recognition result is obtained.
  • The text to be recognized may be obtained in advance from the conversation text converted from the acquired call voice.
  • The text to be recognized may be one dialogue text in a specified dialogue text.
  • For example, the text to be recognized may be any dialogue text of the customer service agent among the dialogue texts between the customer service agent and the customer.
  • Sensitive word recognition may be performed on the text to be recognized through named entity recognition (Named Entity Recognition, NER), for example using a Long Short-Term Memory (LSTM) network or a BERT network.
  • If no sensitive word is recognized, a null value or a corresponding prompt message can be output, indicating that the text to be recognized does not contain sensitive words; if a sensitive word is recognized, the recognized sensitive word can be added to the sensitive word list for subsequent processing.
  • The LSTM network is a temporal recurrent neural network that can alleviate the long-term dependency problem of general recurrent neural networks.
  • The LSTM network and the BERT network can achieve good recognition results in named entity recognition.
  • the network for named entity recognition can be selected as needed, and the embodiments of the present disclosure do not impose specific restrictions.
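  • The input and output behaviour of the sensitive word recognition step can be illustrated with a deliberately simplified stand-in. The sketch below swaps the NER network for plain dictionary matching, which is not the technique the disclosure describes; the function name `recognize_sensitive_words` and the lexicon entries are hypothetical.

```python
def recognize_sensitive_words(text, lexicon):
    """Return the lexicon entries found in text, ordered by first appearance.

    An empty list plays the role of the "null value" output when no
    sensitive word is recognized.
    """
    lowered = text.lower()
    hits = [(lowered.find(w.lower()), w) for w in lexicon if w.lower() in lowered]
    return [w for _, w in sorted(hits)]

LEXICON = ["guaranteed approval", "no risk"]  # hypothetical sensitive words

# Recognized words are collected in a sensitive word list for later steps.
sensitive_word_list = recognize_sensitive_words(
    "This loan has guaranteed approval.", LEXICON)
```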
  • In step S420, text classification processing is performed on the text to be recognized according to the feature values of the text-type features of the text to be recognized, and a text classification result is generated.
  • the text classification result is used to indicate whether the specified type of noise exists.
  • When performing text classification processing on the text to be recognized, the text to be recognized is used as the text to be classified, and the text classification method in the above embodiment is executed to obtain the corresponding text classification result.
  • In step S430, a text recognition result of the text to be recognized is generated based on the sensitive word recognition result and the text classification result.
  • Whether a specified type of noise data exists in the dialogue text can be effectively determined from the feature values of the text-type features of the text to be recognized; this determination result can then be used to assist sensitive word recognition, which improves the accuracy of sensitive word recognition.
  • The text recognition method proposed in this disclosure works at the text level: it reduces the adverse effects of speech return noise and transcription errors on text recognition results, effectively reduces misjudgment of sensitive word recognition results when a predetermined type of noise is present, and improves the accuracy of sensitive word recognition.
  • The sensitive word recognition results and the speech return judgment results can be combined to complete sensitive word recognition in a speech quality inspection scenario, which improves the accuracy of sensitive word recognition and reduces erroneous judgments that an agent's speech does not comply with the predetermined speech rules.
  • step S420 may specifically include: using the text classification method of any of the above embodiments of the present disclosure to perform text classification processing on the text to be recognized, to obtain a text classification result.
  • The text to be recognized is one of the dialogue texts of the target object obtained from the dialogue text.
  • Step S420 may specifically include: step S61, obtaining the feature values of the portrait features of the text to be recognized, where the portrait features are used to characterize the individual characteristics of the target object; and step S62, performing text classification processing on the text to be recognized based on the feature values of the text-type features and the feature values of the portrait features, to obtain the text classification result.
  • The text recognition method in the embodiment of the present disclosure calculates the feature values of the text-type features and the feature values of the portrait features of the text to be recognized, uses them to determine whether a specified type of noise (such as speech return noise) exists in the text to be recognized, and identifies the sensitive words in the text to be recognized. Combining the noise determination result with the sensitive word recognition result improves the accuracy of sensitive word recognition in speech quality inspection scenarios.
  • Step S430 may specifically include: if the number of sensitive words identified from the text to be recognized is greater than or equal to 1, and the text classification result indicates that no specified type of noise exists, outputting the identified sensitive words as the text recognition result.
  • If a sensitive word is identified and it is determined that no noise of the specified type exists, the sensitive word recognition result is determined not to be caused by the specified type of noise, so the recognized sensitive word is output as the recognition result. Combining the noise determination result with the sensitive word recognition result improves the accuracy of sensitive word recognition in speech quality inspection scenarios.
  • Step S430 may further include: if the number of sensitive words recognized from the text to be recognized is equal to zero, determining that there are no sensitive words in the text to be recognized, and outputting first prompt information, where the first prompt information is used to indicate that the output sensitive words are empty.
  • If the number of sensitive words identified from the text to be recognized is greater than or equal to 1, and the text classification result indicates that a specified type of noise exists, it is determined whether the identified sensitive words are caused by the specified type of noise.
  • When it is determined that the identified sensitive words are caused by the specified type of noise, second prompt information is output, where the second prompt information is used to indicate that the sensitive words are caused by the specified type of noise.
  • When it is determined that the identified sensitive words are not caused by the specified type of noise, the identified sensitive words are output as the text recognition result.
  • If a sensitive word is identified and it is determined that the specified type of noise exists, it is determined whether the identified sensitive word is caused by the specified type of noise, and corresponding prompt information is output to indicate the processing result. In this way, the noise determination result and the sensitive word recognition result are combined to improve the accuracy of sensitive word recognition in speech quality inspection scenarios.
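  • The branches of step S430 described above can be summarized in a short sketch. It is illustrative only: the function name `generate_recognition_result`, the dictionary-shaped result, and the `caused_by_noise` predicate are hypothetical, and the disclosure does not specify here how to decide whether a word is caused by the noise.

```python
def generate_recognition_result(sensitive_words, noise_exists, caused_by_noise):
    """Combine the sensitive word result with the text classification result.

    sensitive_words: list of words recognized in the text to be recognized.
    noise_exists:    True if the classification result reports the specified noise.
    caused_by_noise: hypothetical predicate deciding whether the recognized
                     words are attributable to the noise.
    """
    if len(sensitive_words) == 0:
        # No sensitive words: output the first prompt information.
        return {"prompt": "first", "words": []}
    if not noise_exists:
        # Words found and no noise: output the words themselves.
        return {"prompt": None, "words": list(sensitive_words)}
    if caused_by_noise(sensitive_words):
        # Words found but attributable to noise: output the second prompt.
        return {"prompt": "second", "words": []}
    # Noise exists but did not cause the words: output the words.
    return {"prompt": None, "words": list(sensitive_words)}
```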
  • FIG. 5 shows a flowchart of a text recognition method according to an exemplary embodiment of the present disclosure.
  • the text recognition method may include the following steps S501 to S509.
  • In step S501, the text to be recognized is input.
  • The text to be recognized is any dialogue text of the customer service agent among the dialogue texts between the customer service agent and the customer.
  • In step S502, it is determined whether the text to be recognized contains sensitive words.
  • If no sensitive word is recognized, step S503 is executed; if a sensitive word is recognized, step S504 is executed.
  • In step S503, the first prompt information is output.
  • the first prompt information may be, for example, a first prompt symbol, indicating that the text to be recognized does not contain sensitive words.
  • the first prompt symbol may be "[]", for example.
  • After step S503, the method may return to step S501 to recognize the next text to be recognized.
  • In step S504, the recognized sensitive words are obtained.
  • The acquired sensitive words can be added to the sensitive word list.
  • In step S505, text classification processing is performed to determine whether return noise exists.
  • If speech return exists in the text to be recognized, step S506 is executed; if no speech return exists in the text to be recognized, step S508 is executed.
  • In step S506, it is determined that speech return exists, and step S507 is executed.
  • In step S507, the second prompt information is output.
  • the second prompt information may be, for example, a second prompt symbol.
  • The second prompt symbol indicates that the sensitive word in the text to be recognized is caused by speech return, so the sensitive word is not output; the second prompt symbol may be a symbol different from the first prompt symbol.
  • For example, the second prompt symbol may be " ⁇ ".
  • After step S507, the method may return to step S501 to recognize the next text to be recognized.
  • In step S508, it is determined that no speech return exists, and step S509 is executed.
  • In step S509, the recognized sensitive words in the text to be recognized are output.
  • The dialogue text includes the dialogue text of the target object and the dialogue text of the dialogue object that converses with the target object, and the text to be recognized is a dialogue text of the target object.
  • After the text recognition result is generated, the method further includes: obtaining a new text to be recognized from the dialogue text converted from the acquired call voice information, and generating a new text recognition result, until the number of acquisitions equals the number of text items in the target object's dialogue text, thereby obtaining the text recognition results of the target object's dialogue text based on whether noise exists.
  • Each text in the dialogue text of the target object can be used in turn as the text to be recognized for the above text recognition processing, until the last text to be recognized has been processed, yielding text recognition results, based on whether noise exists, for all texts in the dialogue text of the target object.
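  • The per-text flow of Figure 5, repeated over every dialogue text of the target object, might look like the sketch below. All names are hypothetical: `recognize` stands in for sensitive word recognition, `has_return_noise` stands in for the text classification step (here a naive rule, not the trained model), and `"<return-noise>"` is a placeholder for the second prompt symbol; only `"[]"` is the first prompt symbol given as an example in the text.

```python
LEXICON = ["guaranteed approval", "no risk"]  # hypothetical sensitive words

def recognize(text):
    """Stand-in for sensitive word recognition (dictionary lookup)."""
    return [w for w in LEXICON if w in text.lower()]

def has_return_noise(dialogue, index):
    """Stand-in for text classification: flags an agent text whose
    sensitive words were also spoken by the customer."""
    words = recognize(dialogue[index][1])
    return any(w in text.lower()
               for speaker, text in dialogue if speaker == "customer"
               for w in words)

def recognize_dialogue(dialogue, first_prompt="[]", second_prompt="<return-noise>"):
    results = []
    for index, (speaker, text) in enumerate(dialogue):
        if speaker != "agent":
            continue                          # only the target object's texts
        words = recognize(text)               # S502
        if not words:
            results.append(first_prompt)      # S503
        elif has_return_noise(dialogue, index):
            results.append(second_prompt)     # S506, S507
        else:
            results.append(words)             # S508, S509
    return results

dialogue = [
    ("customer", "Is it guaranteed approval?"),
    ("agent", "You asked about guaranteed approval."),
    ("agent", "Thanks for calling."),
    ("agent", "It is no risk at all."),
]
results = recognize_dialogue(dialogue)
```

  • The loop ends once one result has been produced per agent text, which corresponds to the number of acquisitions reaching the number of text items of the target object's dialogue text.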
  • The conversation text of the target object is the conversation text of the customer service agent, and the conversation text of the dialogue object is the conversation text of the customer or user.
  • With this text recognition method, sensitive word recognition in a speech quality inspection scenario can be completed. This strategy improves the accuracy of sensitive word recognition and reduces erroneous determinations that an agent used speech that does not comply with the predetermined speech rules.
  • According to the text recognition method of the embodiment of the present disclosure, taking speech return noise as an example of the specified type of noise, text-type features and portrait features are constructed in advance, and the feature values of the text to be recognized for the constructed features are obtained. These feature values are then used to determine whether speech return noise exists. Based on the speech return noise judgment result and the sensitive word recognition result, it is comprehensively determined whether sensitive words exist in the dialogue text of the target object, which improves the accuracy of sensitive word recognition in speech quality inspection scenarios.
  • the present disclosure also provides a text classification device, a text recognition device, electronic equipment, and a computer-readable storage medium.
  • The text classification device can be used to implement any text classification method provided by the present disclosure, and the text recognition device can be used to implement any text recognition method provided by the present disclosure.
  • The electronic device and the computer-readable storage medium can be used to implement any text classification method or any text recognition method provided by the present disclosure.
  • Figure 6 is a block diagram of a text classification device provided by an embodiment of the present disclosure.
  • the text classification device 600 includes the following modules.
  • Obtaining module 610 is used to obtain text to be classified.
  • the feature value generation module 620 is configured to generate feature values of the text feature of the text to be classified based on the preset text feature and the text to be classified.
  • the classification determination module 630 is configured to perform text classification processing on the text to be classified according to the feature value of the text class feature, and obtain a text classification result.
  • the text classification result is used to indicate whether the specified type of noise exists.
  • the preset text-like features include at least one text-like feature.
  • The feature value generation module 620 may specifically include: a rule determination unit, configured to determine the value rule of each text-type feature based on the at least one preset text-type feature; and a value generation unit, configured to generate, based on the value rule of each text-type feature, the feature value corresponding to each text-type feature in the text to be classified.
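  • The split between the rule determination unit and the value generation unit can be pictured as a lookup of per-feature rules followed by their application to the text. This is a minimal sketch; the feature names, the rules in `VALUE_RULES`, and both function names are hypothetical and chosen only for illustration.

```python
# Hypothetical value rules, keyed by hypothetical feature names.
VALUE_RULES = {
    "sentence_complete": lambda text: int(text.rstrip().endswith((".", "?", "!"))),
    "word_count": lambda text: len(text.split()),
}

def determine_rules(feature_names, rules=VALUE_RULES):
    """Rule determination unit: pick the value rule for each preset feature."""
    return {name: rules[name] for name in feature_names}

def generate_values(text, selected_rules):
    """Value generation unit: apply each rule to the text to be classified."""
    return {name: rule(text) for name, rule in selected_rules.items()}

rules = determine_rules(["sentence_complete", "word_count"])
values = generate_values("Is the rate fixed?", rules)
```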
  • the text to be classified is text selected from pre-obtained conversation texts.
  • The text-type features include at least one of the following: sensitive word distribution features, predetermined features of the text itself, and predetermined features related to the dialogue text. The sensitive word distribution features are used to characterize the distribution of sensitive words in the dialogue text; the predetermined features of the text itself are used to characterize predetermined features of the text to be classified itself; and the predetermined features related to the dialogue text are used to characterize predetermined features of the text to be classified that are related to the dialogue text.
  • The dialogue text includes the dialogue text of the target object and the dialogue text of the dialogue object, both generated during a call between the target object and the dialogue object conversing with it.
  • The text to be classified is one of the dialogue texts of the target object.
  • At least one text feature includes a sensitive word distribution feature.
  • The rule determination unit is specifically used to determine the value rule of at least one of the following text-type features included in the sensitive word distribution features: the value rule of the first text-type feature, which characterizes whether, among the dialogue texts of the target object, the sensitive words in the text to be classified exist only in the text to be classified; the value rule of the second text-type feature, which characterizes whether the sensitive words in the text to be classified appear in the dialogue text of the dialogue object;
  • the value rule of the third text-type feature, which characterizes whether sensitive words exist in the dialogue text of the dialogue object; and the value rule of the fourth text-type feature, which characterizes whether sensitive words exist in a predetermined dialogue text and whether the sensitive words in the predetermined dialogue text are consistent with the sensitive words in the text to be classified. The predetermined dialogue text is one of the dialogue texts of the dialogue object and is adjacent to the text to be classified.
  • The value generation unit is specifically configured to: based on at least one of the value rules of the first, second, third, and fourth text-type features, generate the feature value of the text to be classified corresponding to at least one of the first text-type feature, the second text-type feature, the third text-type feature, and the fourth text-type feature.
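  • One possible set of value rules for the four sensitive word distribution features is sketched below. The encoding is an assumption: features one to three take 0 or 1, feature four takes 0 (no sensitive word in the adjacent customer text), 1 (a different sensitive word), or 2 (the same sensitive word), and "adjacent" is taken to mean the immediately preceding text. The function name and lexicon are hypothetical.

```python
def distribution_features(dialogue, index, lexicon):
    """Compute features 1-4 for the agent text at dialogue[index].

    dialogue: list of (speaker, text) pairs, speaker is "agent" or "customer".
    """
    def words_in(text):
        return {w for w in lexicon if w in text.lower()}

    target = words_in(dialogue[index][1])
    other_agent, customer = set(), set()
    for i, (speaker, text) in enumerate(dialogue):
        if speaker == "customer":
            customer |= words_in(text)
        elif i != index:
            other_agent |= words_in(text)

    # Assumed meaning of "adjacent": the immediately preceding customer text.
    adjacent = set()
    if index > 0 and dialogue[index - 1][0] == "customer":
        adjacent = words_in(dialogue[index - 1][1])

    f1 = int(bool(target) and not (target & other_agent))  # only in this text
    f2 = int(bool(target & customer))                       # also said by customer
    f3 = int(bool(customer))                                # customer said any
    f4 = 0 if not adjacent else (2 if adjacent & target else 1)
    return [f1, f2, f3, f4]

dialogue = [
    ("customer", "Is it guaranteed approval?"),
    ("agent", "You asked about guaranteed approval."),
    ("agent", "Thanks for calling."),
]
features = distribution_features(dialogue, 1, ["guaranteed approval", "no risk"])
```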
  • At least one text-like feature includes predetermined features of the text itself.
  • The rule determination unit is specifically used to determine the value rule of at least one of the following text-type features included in the predetermined features of the text itself: the value rule of the fifth text-type feature and the value rule of the sixth text-type feature. The fifth text-type feature is used to characterize the sentence integrity information of the text to be classified; the sixth text-type feature is used to characterize the total number of times a specific word appears at a specified position in the dialogue text of the target object.
  • The value generation unit is specifically configured to: based on at least one of the value rule of the fifth text-type feature and the value rule of the sixth text-type feature, generate the feature value of the text to be classified corresponding to at least one of the fifth text-type feature and the sixth text-type feature.
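  • As an illustration of the fifth and sixth text-type features, the sketch below assumes that sentence integrity is approximated by terminal punctuation and that the "specified position" is the start of a text. Both rules, the default word `"hello"`, and the function name are hypothetical; the disclosure does not fix them in this passage.

```python
def own_text_features(agent_texts, index, specific_word="hello"):
    """Compute features 5 and 6 for agent_texts[index].

    f5: 1 if the text ends with terminal punctuation, else 0 (assumed rule).
    f6: number of agent texts that begin with specific_word (assumed rule).
    """
    text = agent_texts[index].strip()
    f5 = int(text.endswith((".", "?", "!")))
    f6 = sum(1 for t in agent_texts
             if t.strip().lower().startswith(specific_word))
    return [f5, f6]

agent_texts = ["Hello, how can I help?", "hello hello are you", "Thank you."]
features = own_text_features(agent_texts, 1)
```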
  • At least one text-like feature includes predetermined features related to the conversation text.
  • The rule determination unit is specifically used to determine the value rule of at least one of the following text-type features included in the predetermined features related to the dialogue text: the value rule of the seventh text-type feature and the value rule of the eighth text-type feature. The seventh text-type feature is used to characterize the number of text items in the dialogue text to which the text to be classified belongs; the eighth text-type feature is used to characterize the position at which the text to be classified appears in the dialogue text.
  • The value generation unit is specifically configured to: based on at least one of the value rule of the seventh text-type feature and the value rule of the eighth text-type feature, generate the feature value of the text to be classified corresponding to at least one of the seventh text-type feature and the eighth text-type feature.
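  • The seventh and eighth text-type features reduce to simple counts. In the sketch below the position is assumed to be 1-based, and the function name is hypothetical.

```python
def dialogue_related_features(dialogue, index):
    """Return [number of texts in the dialogue, 1-based position of the text]."""
    return [len(dialogue), index + 1]

dialogue = ["Hi.", "Hello, how can I help?", "I have a question."]
features = dialogue_related_features(dialogue, 1)
```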
  • The text to be classified belongs to the dialogue text of the target object. The classification determination module 630 is specifically configured to: based on the preset portrait features, obtain the feature values of the portrait features of the target object corresponding to the text to be classified, where the portrait features are used to characterize the individual characteristics of the target object; and, based on the feature values of the text-type features and the feature values of the portrait features, perform text classification processing on the text to be classified to obtain the text classification result.
  • the preset portrait features include at least one portrait feature.
  • When used to obtain the feature values of the portrait features of the text to be classified based on the preset portrait features, the classification determination module 630 is specifically used to: determine the value rule of each portrait feature based on the at least one preset portrait feature; and, based on the value rule of each portrait feature, obtain the feature value of each portrait feature of the target object corresponding to the text to be classified.
  • The target object is a customer service agent, and the individual characteristics are used to characterize at least one of the following information items: the agent level; the agent's length of service; the number of times, within a predetermined statistical period, that the agent's speech did not comply with the predetermined speech rules; and a history record of whether the agent suffered a loss of profits because the agent's speech, by containing the sensitive words in the text to be classified, did not comply with the predetermined speech rules.
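  • Value rules for the portrait features could be as simple as the mapping below. The level names, their numeric codes, the dictionary keys of the agent record, and the function name are all hypothetical.

```python
LEVEL_VALUES = {"junior": 0, "intermediate": 1, "senior": 2}  # hypothetical codes

def portrait_feature_values(agent):
    """Turn an agent record into a list of portrait feature values."""
    return [
        LEVEL_VALUES[agent["level"]],    # agent level
        agent["months_of_service"],      # length of service
        agent["violations_in_period"],   # rule violations in the statistical period
        int(agent["penalized_before"]),  # history of loss of profits
    ]

agent = {
    "level": "senior",
    "months_of_service": 30,
    "violations_in_period": 1,
    "penalized_before": False,
}
values = portrait_feature_values(agent)
```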
  • The classification determination module 630 is specifically configured to: process the feature values of the text-type features through a first classification model to obtain a first text category of the text to be classified, where the first classification model is a model pre-trained using sample texts; and generate the text classification result according to a predetermined correspondence between the value of the first text category and whether a predetermined type of noise exists.
  • When performing text classification processing on the text to be classified according to the feature values of the text-type features and the feature values of the portrait features to obtain the text classification result, the classification determination module 630 is specifically used to: process the feature values of the text-type features and the feature values of the portrait features through a second classification model to obtain a second text category of the text to be classified, where the second classification model is a model pre-trained using sample texts; and generate the text classification result according to a predetermined correspondence between the value of the second text category and whether a predetermined type of noise exists.
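  • The last two steps, feeding the combined feature values to the second classification model and mapping the resulting category to a noise decision, can be sketched as follows. The mapping of "1" to noise and "0" to no noise follows the description of step S306; `stand_in_model`, its hand-picked weights and threshold, and the function name `classify_text` are hypothetical and do not represent a trained model.

```python
# Predetermined correspondence: category 1 = noise exists, category 0 = no noise.
CATEGORY_TO_NOISE = {1: True, 0: False}

def stand_in_model(values):
    """Hypothetical stand-in for the trained second classification model."""
    weights = [-1.0, 2.0, 0.5, 1.0, -0.5]  # illustrative weights, not learned
    score = sum(w * v for w, v in zip(weights, values))
    return 1 if score > 1.0 else 0

def classify_text(text_values, portrait_values, model=stand_in_model):
    """Concatenate both groups of feature values, classify, and map the category."""
    category = model(list(text_values) + list(portrait_values))
    return CATEGORY_TO_NOISE[category]

noisy = classify_text([1, 1, 1, 2], [2])
clean = classify_text([1, 0, 0, 0], [2])
```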
  • With the text classification device of the embodiment of the present disclosure, the feature values of the text-type features of the text to be classified can be generated based on the preset text-type features and the text to be classified, and text classification processing can be performed on the generated feature values to obtain a text classification result.
  • The text classification result can be used to determine whether a specified type of noise exists in the text to be classified. Because this determination is made from the text-type features, interference caused by noisy data can be reduced during text recognition based on the classification result, which helps obtain objective text recognition results.
  • Figure 7 is a block diagram of a text recognition device provided by an embodiment of the present disclosure.
  • the text recognition device 700 includes the following modules.
  • the word recognition module 710 is used to perform sensitive word recognition on the acquired text to be recognized, and obtain sensitive word recognition results.
  • the classification module 720 is configured to perform text classification processing on the text to be identified based on the feature values of the text-type features of the text to be identified, and generate a text classification result.
  • the text classification result is used to indicate whether a specified type of noise exists.
  • the result generation module 730 is configured to generate a text recognition result of the text to be recognized based on the sensitive word recognition result and the text classification result.
  • the classification module 720 is specifically configured to perform text classification processing on the text to be recognized according to the text classification method in any of the above embodiments of the present disclosure, and obtain a text classification result.
  • The text to be recognized is one of the dialogue texts of the target object obtained from the dialogue text. The classification module 720 is specifically used to: obtain the feature values of the portrait features of the text to be recognized, where the portrait features are used to characterize the individual characteristics of the target object; and, based on the feature values of the text-type features and the feature values of the portrait features, perform text classification processing on the text to be recognized to obtain the text classification result.
  • The result generation module 730 is specifically configured to: if the number of sensitive words identified from the text to be recognized is greater than or equal to 1, and the text classification result indicates that no specified type of noise exists, output the identified sensitive words as the text recognition result.
  • The result generation module 730 is also specifically configured to: if the number of sensitive words recognized from the text to be recognized is equal to zero, determine that there are no sensitive words in the text to be recognized, and output first prompt information, where the first prompt information is used to indicate that the output sensitive words are empty; if the number of sensitive words identified from the text to be recognized is greater than or equal to 1, and the text classification result indicates that a specified type of noise exists, determine whether the identified sensitive words are caused by the specified type of noise; when it is determined that the identified sensitive words are caused by the specified type of noise, output second prompt information, where the second prompt information is used to indicate that the sensitive words are caused by the specified type of noise; and when it is determined that the identified sensitive words are not caused by the specified type of noise, output the identified sensitive words as the text recognition result.
  • The dialogue text includes the dialogue text of the target object and the dialogue text of the dialogue object, both generated during a call between the target object and the dialogue object conversing with it; the text to be recognized is one of the dialogue texts of the target object.
  • The text recognition device also includes an acquisition module for acquiring a new text to be recognized from the conversation text obtained by converting the acquired call voice. The result generation module 730 is also used to generate a text recognition result of the new text to be recognized, until the number of acquisitions equals the number of text items in the target object's dialogue text, thereby obtaining the text recognition result of the dialogue text.
  • the conversation text of the target object is the conversation text of the customer service agent.
  • With the text recognition device of the embodiment of the present disclosure, whether a specified type of noise data exists in the dialogue text can be effectively determined from the feature values of the text-type features of the text to be recognized; this determination result can then be used to assist sensitive word recognition, which improves the accuracy of sensitive word recognition.
  • The text recognition method proposed in this disclosure works at the text level: it reduces the adverse effects of speech return noise and transcription errors on text recognition results, effectively reduces misjudgment of sensitive word recognition results when a predetermined type of noise is present, and improves the accuracy of sensitive word recognition.
  • FIG. 8 is a block diagram of an electronic device provided by an embodiment of the present disclosure.
  • An embodiment of the present disclosure provides an electronic device, which includes: at least one processor 801; at least one memory 802; and one or more I/O interfaces 803 connected between the processor 801 and the memory 802.
  • The memory 802 stores one or more computer programs executable by the at least one processor 801; when the one or more computer programs are executed by the at least one processor 801, the at least one processor 801 can perform any of the above text classification methods or text recognition methods.
  • Embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored.
  • the computer program implements the above text classification method or any text recognition method when executed by a processor/processing core.
  • Computer-readable storage media may be volatile or non-volatile computer-readable storage media.
  • An embodiment of the present disclosure also provides a computer program.
  • When the computer program is run in a processor of an electronic device, the processor in the electronic device executes any of the above text classification methods or text recognition methods.
  • Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable program instructions, data structures, program modules, or other data.
  • Computer storage media include, but are not limited to, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM), static random access memory (SRAM), flash memory or other memory technology, portable compact disc read only memory (CD-ROM), digital versatile disk (DVD) or other optical disk storage, magnetic cassette, magnetic tape, disk storage or other magnetic storage device, or any other medium that can be used to store the desired information and can be accessed by a computer.
  • communication media typically embodies computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery medium.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to various computing/processing devices, or to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage on a computer-readable storage medium in the respective computing/processing device.
  • Computer program instructions for performing operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or instructions in one or more programming languages.
  • The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • An electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), can execute the computer-readable program instructions to implement various aspects of the disclosure.
  • The computer program described here may be implemented through hardware, software, or a combination thereof.
  • The computer program may be embodied as a software product, such as a software development kit (SDK).
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create an apparatus that implements the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium. These instructions cause a computer, a programmable data processing device, and/or other equipment to work in a specific manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • Computer-readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device, causing a series of operational steps to be executed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • Each block in the flowcharts or block diagrams may represent a module, segment, or portion of instructions that contains one or more executable instructions for implementing the specified logical functions.
  • The functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or can be implemented using a combination of special-purpose hardware and computer instructions.
  • Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a general illustrative sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Abstract

The present disclosure provides a text classification method and apparatus, a text recognition method and apparatus, an electronic device, and a storage medium. The text classification method comprises: acquiring a text to be classified; on the basis of a preset text class feature and the text to be classified, generating a feature value of the text class feature of the text to be classified; performing text classification processing on the text to be classified according to the feature value of the text class feature, to obtain a text classification result, the text classification result being used to indicate whether a specified type of noise is present.

Description

Text classification method and apparatus, text recognition method and apparatus, electronic device, and storage medium
Cross-Reference to Related Applications
This patent application claims priority to Chinese patent application No. 202210928633.3, filed with the China National Intellectual Property Administration on August 3, 2022, the disclosure of which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a text classification method, a text recognition method, and corresponding apparatuses, devices, and storage media.
Background
In the field of natural language processing, a large number of text processing tasks can be solved by text classification. Text classification refers to automatically classifying text according to certain criteria. For example, text processing tasks such as sentiment analysis, intent recognition, and question-answer matching can be handled through text classification, which improves text processing capability.
When performing text recognition tasks, the text to be recognized may contain interfering information because of noise. This leads to problems such as semantic incoherence and semantic confusion in the text, and in turn to an inability to obtain an objective text recognition result. Therefore, text needs to be classified according to whether noise data is present in it, so that the interference caused by noise data can be reduced during text recognition on the basis of the classification result.
Summary
The present disclosure provides a text classification method and apparatus, a text recognition method and apparatus, an electronic device, and a storage medium.
In a first aspect, the present disclosure provides a text classification method. The text classification method includes: acquiring a text to be classified; generating, based on a preset text-type feature and the text to be classified, a feature value of the text-type feature of the text to be classified; and performing text classification processing on the text to be classified according to the feature value of the text-type feature to obtain a text classification result, the text classification result being used to indicate whether a specified type of noise is present.
In a second aspect, the present disclosure provides a text recognition method. The text recognition method includes: performing sensitive word recognition on an acquired text to be recognized to obtain a sensitive word recognition result; performing text classification processing on the text to be recognized according to a feature value of a text-type feature of the text to be recognized to generate a text classification result, the text classification result being used to indicate whether a specified type of noise is present; and generating a text recognition result of the text to be recognized according to the sensitive word recognition result and the text classification result.
In a third aspect, the present disclosure provides a text classification apparatus. The text classification apparatus includes: an acquisition module, configured to acquire a text to be classified; a feature value generation module, configured to generate, based on a preset text-type feature and the text to be classified, a feature value of the text-type feature of the text to be classified; and a classification determination module, configured to perform text classification processing on the text to be classified according to the feature value of the text-type feature to obtain a text classification result, the text classification result being used to indicate whether a specified type of noise is present.
In a fourth aspect, the present disclosure provides a text recognition apparatus. The text recognition apparatus includes: a word recognition module, configured to perform sensitive word recognition on an acquired text to be recognized to obtain a sensitive word recognition result; a classification module, configured to perform text classification processing on the text to be recognized according to a feature value of a text-type feature of the text to be recognized to generate a text classification result, the text classification result being used to indicate whether a specified type of noise is present; and a result generation module, configured to generate a text recognition result of the text to be recognized according to the sensitive word recognition result and the text classification result.
In a fifth aspect, the present disclosure provides an electronic device. The electronic device includes: at least one processor; and a memory communicatively connected to the at least one processor, the memory storing one or more computer programs executable by the at least one processor, so that the at least one processor can execute the above text classification method or text recognition method.
In a sixth aspect, the present disclosure provides a computer-readable storage medium on which a computer program is stored. The computer program, when executed by a processor or processing core, implements the above text classification method or text recognition method.
According to the embodiments provided by the present disclosure, a feature value of a text-type feature of a text to be classified can be generated based on a preset text-type feature and the text to be classified, and text classification processing can be performed on the generated feature value to obtain a text classification result, from which it can be determined whether a specified type of noise is present in the text to be classified. The text classification method can thus determine, based on text-type features, whether a specified type of noise is present in the text to be classified, so that the interference caused by noise data can be reduced during text recognition on the basis of the classification result, which helps obtain an objective text recognition result.
It should be understood that what is described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
Brief Description of the Drawings
The accompanying drawings are provided for a further understanding of the present disclosure and constitute a part of the specification. Together with the embodiments of the present disclosure, they serve to explain the present disclosure and do not limit it. The above and other features and advantages will become more apparent to those skilled in the art from the description of detailed example embodiments with reference to the accompanying drawings, in which:
Figure 1 is a scenario diagram of a voice call service provided by an embodiment of the present disclosure;
Figure 2 is a flowchart of a text classification method provided by an embodiment of the present disclosure;
Figure 3 is a schematic flowchart of model training and model use provided by an embodiment of the present disclosure;
Figure 4 is a flowchart of a text recognition method provided by an embodiment of the present disclosure;
Figure 5 is a flowchart of another text recognition method provided by an embodiment of the present disclosure;
Figure 6 is a block diagram of a text classification apparatus provided by an embodiment of the present disclosure;
Figure 7 is a block diagram of a text recognition apparatus provided by an embodiment of the present disclosure;
Figure 8 is a block diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
To enable those skilled in the art to better understand the technical solutions of the present disclosure, exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.
Provided there is no conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for describing particular embodiments only and is not intended to limit the present disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the terms "comprising" and/or "made of," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Words such as "connected" or "coupled" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In practical application scenarios, speech recognition is a technology that converts voice data into text information. It involves many disciplines and technical fields, such as acoustics, phonetics, linguistics, digital signal processing theory, information theory, and computer science. Because speech signals are diverse and complex, the performance of a speech processing device is easily affected by factors such as the size of the recognition vocabulary and the complexity of the speech, the quality of the speech signal, the number of speakers (a single speaker or multiple speakers), and the quality and processing capability of the call hardware. Under the influence of these factors, the recognition accuracy of speech processing is limited to some extent.
Figure 1 is a scenario diagram of a voice call service provided by an exemplary embodiment of the present disclosure. As shown in Figure 1, the scenario includes: a user 10, a user call device 11, a customer service agent 20, an agent call device 21, a communication network 30, and a speech processing device 40. The user call device 11 and the agent call device 21 establish a call through the communication network 30, and the customer service agent 20 provides voice call services to the user 10 during the call, such as answering inquiries and handling business.
To perform service quality inspection on a voice call, the speech processing device 40 can obtain, from the communication network 30, the voice data requiring quality inspection from among the voice data of the two parties to the call, convert that voice data into corresponding dialogue text through automatic speech recognition, and perform service quality inspection based on the dialogue text to obtain a service quality inspection result.
It should be noted that, in the technical solution of the present disclosure, obtaining voice data requiring quality inspection from the communication network 30 requires authorization and confirmation from the users involved in the call. For example, the authorization and consent of the user 10 must be obtained before the above voice data is collected. The acquisition, storage, use, and processing of data in the technical solution of the present disclosure all comply with the relevant provisions of national laws and regulations.
In some embodiments, the main content of voice service quality inspection (voice quality inspection for short) is sensitive word recognition. Specifically, sensitive words include words that do not comply with norms such as industry norms, management norms, and/or disciplinary norms. In some scenarios, a sensitive word lexicon can be established in advance for the usage scenario. The lexicon may contain profanity, abusive words, uncivil words, threatening or intimidating words, words related to major events, and other sensitive words created according to specific norms. Because the same word may or may not be sensitive in different language environments, the sensitive word lexicon needs to be updated promptly according to the language environment of the usage scenario.
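As a rough illustration of a scenario-specific sensitive word lexicon that can be updated over time, the following Python sketch uses plain substring matching. The class name `SensitiveLexicon`, its methods, and the sample words are assumptions made for this example; the disclosure does not specify how the lexicon is stored or matched.

```python
from dataclasses import dataclass, field


@dataclass
class SensitiveLexicon:
    """Hypothetical in-memory sensitive word lexicon for one usage scenario."""
    words: set = field(default_factory=set)

    def update(self, add=(), remove=()):
        # Keep the lexicon in step with the current language environment.
        self.words |= set(add)
        self.words -= set(remove)

    def find(self, text):
        # Plain substring matching; a real system might use a trie or
        # Aho-Corasick automaton, which is not shown here.
        return sorted(w for w in self.words if w in text)


lexicon = SensitiveLexicon({"complaint", "idiot"})
print(lexicon.find("I will file a complaint"))
lexicon.update(remove=["complaint"])
print(lexicon.find("I will file a complaint"))
```

Substring matching is used only to keep the sketch short; it does not handle word boundaries or context, which is one reason the lexicon has to be maintained per scenario.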
In some embodiments, sensitive words may include words of interest. Specifically, words of interest may include at least one of service evaluation terms, business evaluation terms, and business keywords. In some scenarios, recognizing words of interest helps obtain information such as the business information the user is interested in, the user's evaluation of the related business, and the user's evaluation of the related servers.
In a voice quality inspection scenario, the call between the user 10 and the customer service agent 20 may be disturbed by many types of noise, for example noise from interference sources such as environmental noise, interfering human voices, reverberation, and echo. The source of environmental noise may be a machine capable of playing meaningful audio signals (such as a radio or an audio player). Reverberation can be understood as an acoustic phenomenon in which a sound signal is superimposed with the sound waves produced when that signal is repeatedly reflected and absorbed by obstacles during propagation. Echo, also called acoustic echo, can be understood as noise interference formed when the sound played by the loudspeaker of the speech processing device itself propagates and is reflected in space, producing a repeated sound signal that is fed back into the microphone.
The noise interference caused by call return noise (also called script return noise, or simply return noise) is one kind of echo interference. It is usually caused by the hardware characteristics of the call device itself. For example, the transmit-receive loop of the customer service agent's call device may be poorly isolated, or the device's loudspeaker may be loud and its microphone highly sensitive, so that the sound played by the device's own loudspeaker is fed back into the microphone. The sound fed back into the microphone is mixed into the agent's voice data and forms return noise in the agent's dialogue text.
As a concrete example of return noise, the following illustrative dialogue texts show, in dialogue form, part of the conversation between a user (hereinafter also called the customer) and a customer service agent (hereinafter also called the agent). In each turn, a separator (for example, a colon) separates the speaker's identity from the corresponding call content: the speaker's identity is to the left of the colon, and the speaker's call content in text form is to the right.
Example 1: the call content may include the following text.
Customer: You keep calling me during my working hours. Could you have some manners.
Agent: Have some manners. We will make a note on our side to reduce the number of calls to you during working hours. Goodbye.
Example 2: the call content may include the following text.
Customer: If you press me again, I will go and complain about you.
Agent: Go and complain about you. Sir, we are reminding you to pay attention to your credit record, and we hope you will deal with it as soon as possible.
Example 3: the call content may include the following text.
Customer: Hello, yes, I am XXX (XXX is pronounced: shenjingjing).
Agent: Lunatic (神经病, pronounced shenjingbing). We are notifying you that your information is about to become overdue, and we hope you can deal with it as soon as possible.
The text of the above calls shows the following. "Have some manners" in the agent's turn in Example 1 and "Go and complain about you" in the agent's turn in Example 2 are both return noise. "Lunatic" in the agent's turn in Example 3 is return noise compounded by a recognition error introduced when automatic speech recognition converted the speech to text. In other words, when automatic speech recognition is used to convert voice call data into text (also called speech transcription), the return noise produced by the feedback phenomenon is transcribed as well, and whether or not it is transcribed correctly, it introduces noise interference into the transcription result.
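To make the pattern in these examples concrete, the sketch below parses "Speaker: content" turns and checks whether a turn opens with the words that closed the previous turn, which is how return noise appears in Examples 1 and 2. This is only an illustration of the phenomenon: the helper names `parse_turn`, `tokens`, and `echoed_prefix` are hypothetical, and the check is not the text-type feature defined by the disclosure.

```python
import re


def parse_turn(line):
    """Split 'Speaker: content' at the first colon (ASCII or full-width).

    Assumes the line contains a separator; otherwise unpacking fails.
    """
    speaker, content = re.split(r"[:：]", line, maxsplit=1)
    return speaker.strip(), content.strip()


def tokens(text):
    """Lowercased word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def echoed_prefix(previous, current):
    """Longest run of tokens that both ends `previous` and starts `current`."""
    prev, cur = tokens(previous), tokens(current)
    for k in range(min(len(prev), len(cur)), 0, -1):
        if prev[-k:] == cur[:k]:
            return cur[:k]
    return []


_, customer = parse_turn(
    "Customer: If you press me again, I will go and complain about you.")
_, agent = parse_turn(
    "Agent: Go and complain about you. Sir, we are reminding you about your credit record.")
print(echoed_prefix(customer, agent))
```

An exact-match check like this would miss Example 3, where the echoed fragment is also misrecognized, which suggests why a classifier over several text features may be preferable to a single rule.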
In the related art, voiceprint recognition, also called speaker recognition, is a core intelligent speech technology that uses a computer system to identify speakers automatically. Based on the speaker-specific characteristics contained in voice data, it uses computers and modern information recognition technology to automatically determine the identity of the speaker of the current speech. This technology can be used to identify noise data in voice data and remove it, in order to improve the accuracy of automatic speech recognition and, in turn, the accuracy of voice quality inspection.
In some scenarios, voiceprint recognition can be used to identify the speakers in voice call data. Based on the identified current speaker, voice data that does not belong to the current speaker is removed and the current speaker's voice data is retained, thereby denoising the voice call data.
For example, denoising voice data through voiceprint recognition may include: receiving voice data; automatically identifying each speaker through voiceprint recognition, based on the speaker-specific characteristics in the voice data; treating audio data that does not belong to a designated speaker as noise data, removing that noise data from the current voice data, and retaining the designated speaker's voice data; and transcribing the designated speaker's voice data using automatic speech recognition to obtain the designated speaker's dialogue text.
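The related-art pipeline just described can be sketched as follows. The callables `identify_speaker` and `transcribe` stand in for a voiceprint recognizer and a speech recognizer; both are placeholders assumed for this example and do not refer to any particular library.

```python
def denoise_and_transcribe(segments, target_speaker, identify_speaker, transcribe):
    """Keep only the designated speaker's segments, then transcribe them.

    segments         -- iterable of audio segments (any representation)
    target_speaker   -- identity of the designated speaker
    identify_speaker -- placeholder: segment -> speaker identity (voiceprint step)
    transcribe       -- placeholder: segment -> text (speech recognition step)
    """
    # Step 1: voiceprint recognition decides which segments are noise.
    kept = [seg for seg in segments if identify_speaker(seg) == target_speaker]
    # Step 2: speech recognition runs only on the retained segments.
    return [transcribe(seg) for seg in kept]


# Toy stand-ins: each "segment" is a (speaker, text) pair.
demo = [("agent", "hello"), ("customer", "have some manners"), ("agent", "goodbye")]
print(denoise_and_transcribe(demo, "agent",
                             identify_speaker=lambda seg: seg[0],
                             transcribe=lambda seg: seg[1]))
```

The two sequential model calls per segment make the cost of this approach visible: every segment passes through voiceprint recognition before any text is produced.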
在上述语音数据去噪处理过程中,由于噪音数据的声频信息通常较短,通常会导致声纹识别技术不能正确鉴别当前语音对应的说话人身份;噪音数据与通话双方语音数据的叠加,例如通话双方同时讲话时,噪音数据的叠加将进一步增加声纹识别技术的难度;在上述示例中,先通过声纹识别进行去噪处理,再通过语音识别技术进行转译的处理过程,涉及了两种技术的使用,处理过程复杂繁琐,处理效率低下。因此,相关技术中通话语音数据中的噪音数据不易被正确识别,甚至会导致通话语音数据被误识别,识别准确率较低。In the above-mentioned voice data denoising process, since the audio information of the noise data is usually short, the voiceprint recognition technology usually cannot correctly identify the identity of the speaker corresponding to the current voice; the superposition of noise data and the voice data of both parties in the call, such as When both parties speak at the same time, the superposition of noise data will further increase the difficulty of voiceprint recognition technology. In the above example, the process of denoising through voiceprint recognition and then translating through speech recognition technology involves two technologies. The use, processing process is complicated and cumbersome, and the processing efficiency is low. Therefore, in related technologies, the noise data in the call voice data is not easy to be correctly identified, and may even cause the call voice data to be misrecognized, resulting in a low recognition accuracy.
The text classification method and the text recognition method provided by embodiments of the present disclosure are described below with reference to the accompanying drawings and specific embodiments. The text classification method can determine whether a specified type of noise is present in voice data; the text recognition method can perform further recognition processing based on that determination, to obtain a recognition result that takes the presence of the noise into account.
The text classification method and the text recognition method according to embodiments of the present disclosure may be executed by an electronic device such as a terminal device or a server. The terminal device may be a device with data processing capability, such as a vehicle-mounted device, user equipment (UE), a mobile device, a user terminal, a terminal, a personal digital assistant (PDA), a handheld device, or a computing device. The methods may be implemented by a processor in the terminal device calling computer-readable program instructions stored in a memory, or the methods may be executed by a server.
FIG. 2 is a flowchart of a text classification method provided by an embodiment of the present disclosure. Referring to FIG. 2, the text classification method includes the following steps S210 to S230.
In step S210, a text to be classified is acquired.
In this step, the processing device may acquire the text to be classified in various ways. For example, one text in the dialogue text may be used directly as the text to be classified; or multiple texts may be stored in the processing device in advance, and the processing device takes each text in turn as the current text to be classified; or, when the processing device is performing text recognition and the current text to be recognized needs to be classified, the text to be recognized may be used directly as the text to be classified.
In step S220, feature values of the text-type features of the text to be classified are generated based on preset text-type features and the text to be classified.
In this step, the text-type features are text-related feature items preset for the text to be classified; these feature items can be used to characterize the probability that a sensitive word is present in the dialogue text.
In step S230, text classification processing is performed on the text to be classified according to the feature values of the text-type features, and a text classification result indicating whether a specified type of noise is present is generated.
In this step, the specified type of noise may be a sound signal that carries some actual meaning and degrades the quality of the collected voice data. By way of example, the specified type of noise includes, but is not limited to, the noise noted in the above embodiments, which is caused by interference sources such as environmental noise, interfering voices, reverberation, and echo.
According to the text classification method of the embodiments of the present disclosure, feature values of the text-type features of the text to be classified can be generated from the preset text-type features and the text to be classified, and text classification processing can be performed on the generated feature values to obtain a text classification result, from which it can be determined whether the specified type of noise is present in the text to be classified. Because the method decides the presence of the specified type of noise on the basis of text-type features, subsequent text recognition can use the classification result to reduce the interference caused by noise data, which helps to obtain an objective text recognition result.
Unlike denoising voice call data by voiceprint recognition, the text classification method of the embodiments of the present disclosure determines whether the specified type of noise is present in the text to be classified from the feature values of its text-type features, and does not involve voiceprint recognition. The determination of whether noise is present in the dialogue text is therefore not affected by the factors that impair the accuracy of voiceprint recognition in the related art, and its accuracy is higher. Moreover, the related art must first denoise the acquired voice data by voiceprint recognition and then perform speech recognition on the denoised voice data, whereas the text classification method of the embodiments of the present disclosure determines directly, from text-type features of the acquired text to be classified, whether the specified type of noise is present. This improves the accuracy of the classification result while simplifying the processing and improving processing efficiency.
The text classification method according to the embodiments of the present disclosure is described in more detail below.
In some embodiments, the preset text-type features include at least one text-type feature. In step S220 above, generating the feature values of the text-type features of the text to be classified, based on the preset text-type features and the text to be classified, may specifically include the following steps S11 and S12.
In step S11, a value rule is determined for each of the at least one text-type feature.
In this step, each text-type feature is expressed as a feature operator, and each feature operator describes the value rule of one text-type feature, that is, the correspondence between that text-type feature and its different feature values.
In step S12, a feature value of each text-type feature of the text to be classified is generated based on the value rule of that text-type feature.
In this embodiment, the feature value of a text-type feature is its numerical representation. Text classification processing based on the feature values of the text-type features of the text to be classified can reflect whether the specified type of noise is objectively present in the dialogue text, and the richer the set of preset text-type features, the more accurate the resulting text classification.
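Steps S11 and S12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the names `FeatureOperator` and `generate_feature_values` and the two toy operators are hypothetical, and each operator simply encodes one value rule as a function of the text to be classified and the dialogue it was taken from.

```python
from typing import Callable, Dict, List

# Hypothetical type: a feature operator encodes one value rule (step S11).
# It maps (text to be classified, full dialogue text) to a feature value.
FeatureOperator = Callable[[str, List[str]], int]


def generate_feature_values(
    text: str, dialogue: List[str], operators: Dict[str, FeatureOperator]
) -> Dict[str, int]:
    """Step S12: apply every value rule to the text to be classified."""
    return {name: op(text, dialogue) for name, op in operators.items()}


# Two toy operators, for illustration only (not rules from the disclosure).
def is_first_text(text: str, dialogue: List[str]) -> int:
    return 1 if dialogue and dialogue[0] == text else 0


def word_count_bucket(text: str, dialogue: List[str]) -> int:
    return 0 if len(text.split()) < 3 else 1


operators = {"is_first": is_first_text, "length": word_count_bucket}
dialogue = ["Hello, how can I help?", "Stop calling me."]
values = generate_feature_values(dialogue[1], dialogue, operators)
print(values)
```

Keeping the operators in a dictionary makes it easy to enlarge the feature set, which matches the observation above that a richer set of text-type features supports a more accurate classification.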
In some embodiments, the text to be classified is a text selected from a dialogue text acquired in advance. The at least one text-type feature includes at least one of the following: a sensitive-word distribution feature, which characterizes the distribution of sensitive words in the dialogue text; a predetermined feature of the text itself, which characterizes a predetermined property of the text to be classified; and a predetermined feature related to the dialogue text, which characterizes a predetermined relationship between the text to be classified and the dialogue text.
Setting several different kinds of text-type features provides an objective basis for subsequently determining accurately whether the specified type of noise is present in the text to be classified.
In some embodiments, the dialogue text includes the dialogue text of a target object and the dialogue text of a dialogue counterpart, both generated during one call between the target object and the dialogue counterpart, and the text to be classified is one of the target object's dialogue texts.
In some embodiments, when the dialogue text of the target object is the agent call text, the dialogue text of the dialogue counterpart is the customer call text. The agent call text contains multiple individual agent call texts, and the customer call text contains multiple individual customer call texts.
In some embodiments, the at least one text-type feature includes the sensitive-word distribution feature.
Step S11 above may specifically include determining the value rule of at least one of the following text-type features included in the sensitive-word distribution feature: the value rule of a first text-type feature, which characterizes whether, among the target object's dialogue texts, only the text to be classified contains a sensitive word; the value rule of a second text-type feature, which characterizes whether the sensitive word in the text to be classified appears in the dialogue counterpart's dialogue text; the value rule of a third text-type feature, which characterizes whether the dialogue counterpart's dialogue text contains a sensitive word; and the value rule of a fourth text-type feature, which characterizes whether a predetermined dialogue text contains a sensitive word and whether that sensitive word is consistent with the sensitive word in the text to be classified, the predetermined dialogue text being one of the dialogue counterpart's dialogue texts that is adjacent to the text to be classified.
Step S12 above may specifically include: generating, based on at least one of the value rules of the first, second, third, and fourth text-type features, the feature value of the corresponding at least one of the first, second, third, and fourth text-type features of the text to be classified.
As a specific example, the feature operator $T_1$ can be constructed by the following expression (1) to determine the value rule of the first text-type feature.
$$
T_1(\alpha_1)=\begin{cases}0, & \alpha_1=\text{yes}\\ 1, & \alpha_1=\text{no}\end{cases}\tag{1}
$$
In expression (1), $T_1(\alpha_1)$ is abbreviated as operator $T_1$, $\alpha_1$ is the first text-type feature, and the value of $T_1(\alpha_1)$ is the feature value of the first text-type feature. The first text-type feature $\alpha_1$ characterizes whether, among the target object's dialogue texts, only the text to be classified contains a sensitive word: if so, $T_1(\alpha_1)$ takes the value 0; if not, $T_1(\alpha_1)$ takes the value 1.
In the embodiments of the present disclosure, if only the text to be classified among the target object's dialogue texts contains a sensitive word ($\alpha_1$ = yes), the sensitive word in the text to be classified is less likely to be genuine dialogue content ($T_1(\alpha_1)=0$); if the current text to be classified is not the only one of the target object's dialogue texts that contains a sensitive word ($\alpha_1$ = no), the sensitive word in the text to be classified is more likely to be genuine dialogue content ($T_1(\alpha_1)=1$).
As an example, in practical scenarios, if the agent call text genuinely contains sensitive words, then usually several agent call texts (not just one) contain sensitive words, and the customer call text may contain sensitive words as well. A sensitive word that appears in the text to be classified alone is therefore less likely to be genuine.
It should be noted that the above values of $T_1(\alpha_1)$ are merely illustrative. It suffices that the value of $T_1(\alpha_1)$ when $\alpha_1$ is "yes" is smaller than its value when $\alpha_1$ is "no"; the specific values can be set as needed.
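The value rule of the first text-type feature can be sketched as below. This is a hedged illustration: the helper `find_sensitive`, the function name `t1`, the substring-based matching, and the sample lexicon are all assumptions introduced here, since this passage does not specify how sensitive words are detected.

```python
from typing import List, Set


def find_sensitive(text: str, lexicon: Set[str]) -> Set[str]:
    """Return the lexicon entries that occur in the text (substring match)."""
    return {word for word in lexicon if word in text}


def t1(target_texts: List[str], index: int, lexicon: Set[str]) -> int:
    """Feature operator T1 for the target-object text at position `index`.

    Returns 0 when that text is the only target-object text containing a
    sensitive word (alpha_1 = yes), and 1 otherwise (alpha_1 = no).
    """
    here = bool(find_sensitive(target_texts[index], lexicon))
    elsewhere = any(
        find_sensitive(text, lexicon)
        for i, text in enumerate(target_texts)
        if i != index
    )
    only_here = here and not elsewhere
    return 0 if only_here else 1


lexicon = {"idiot"}  # hypothetical sensitive-word list
agent_texts = ["Hello.", "You idiot.", "Goodbye."]
print(t1(agent_texts, 1, lexicon))
```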
As a specific example, the feature operator $T_2$ can be constructed by the following expression (2) to determine the value rule of the second text-type feature.
$$
T_2(\alpha_2)=\begin{cases}0, & \alpha_2=\text{yes}\\ 1, & \alpha_2=\text{no}\end{cases}\tag{2}
$$
In expression (2), $T_2(\alpha_2)$ is abbreviated as operator $T_2$, $\alpha_2$ is the second text-type feature, and the value of $T_2(\alpha_2)$ is the feature value of the second text-type feature. The second text-type feature $\alpha_2$ characterizes whether the sensitive word in the text to be classified appears in the dialogue counterpart's dialogue text. Specifically, if the sensitive word in the text to be classified appears in the dialogue counterpart's dialogue text ($\alpha_2$ = yes), $T_2(\alpha_2)$ takes the value 0; if it does not ($\alpha_2$ = no), $T_2(\alpha_2)$ takes the value 1.
As an example, when the dialogue text of the target object is the agent call text, the text to be classified is any one of the agent call texts, and the dialogue text of the dialogue counterpart is the customer call text, the second text-type feature indicates whether the sensitive word in the text to be classified appears in the customer call text adjacent to the text to be classified.
In the embodiments of the present disclosure, if the sensitive word in the text to be classified also appears in the customer call text adjacent to the text to be classified, the sensitive word is less likely to be genuinely present in the text to be classified; if it does not appear in the adjacent customer call text, the sensitive word in the text to be classified is more likely to be genuine dialogue content.
For example, in the specific examples of return (echo) noise described in the preceding embodiments, Example 1 reads: "Customer: You keep calling me during my working hours. Can't you have some manners? Agent: Have some manners. We will make a note on our side to reduce the number of calls to you during working hours. Goodbye." Suppose the text to be classified is the agent call text in Example 1. Because "have some manners" (有点素质) appears not only in that agent call text but also in the adjacent customer call text, the phrase is less likely to be genuinely present in the text to be classified (the agent call text) and more likely to be return noise.
It should be noted that the above values of $T_2(\alpha_2)$ are merely illustrative. It suffices that the value of $T_2(\alpha_2)$ when $\alpha_2$ is "yes" is smaller than its value when $\alpha_2$ is "no"; the specific values can be set as needed.
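A minimal sketch of the second value rule, applied to an English rendering of Example 1, is given below. The function name `t2`, the substring matching, and the one-entry lexicon are assumptions made for illustration. The caller decides whether `counterpart_texts` holds all customer texts or only the adjacent one.

```python
from typing import List, Set


def t2(text: str, counterpart_texts: List[str], lexicon: Set[str]) -> int:
    """Feature operator T2.

    Returns 0 when a sensitive word found in `text` also occurs in the
    counterpart's texts (alpha_2 = yes, likely return noise), else 1.
    """
    words = {word for word in lexicon if word in text}
    echoed = any(word in other for word in words for other in counterpart_texts)
    return 0 if echoed else 1


lexicon = {"have some manners"}  # hypothetical sensitive phrase
customer = ["You keep calling me during working hours. Can't you have some manners?"]
agent = "have some manners. We will make a note and call less often. Goodbye."
print(t2(agent, customer, lexicon))
```

In this sketch a text without any sensitive word also yields 1, because nothing can be echoed. Whether such texts reach the operator at all is outside what this passage specifies.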
As a specific example, the feature operator $T_3$ can be constructed by the following expression (3) to determine the value rule of the third text-type feature.
$$
T_3(\alpha_3)=\begin{cases}1, & \alpha_3=\text{yes}\\ 0, & \alpha_3=\text{no}\end{cases}\tag{3}
$$
In expression (3), $T_3(\alpha_3)$ denotes the value of operator $T_3$, $\alpha_3$ is the third text-type feature, and the value of $T_3(\alpha_3)$ is the feature value of the third text-type feature. The third text-type feature $\alpha_3$ characterizes whether the dialogue counterpart's dialogue text contains a sensitive word. Specifically, if the dialogue counterpart's dialogue text contains a sensitive word ($\alpha_3$ = yes), $T_3(\alpha_3)$ takes the value 1; if it does not ($\alpha_3$ = no), $T_3(\alpha_3)$ takes the value 0.
As an example, when the dialogue text of the target object is the agent call text, the text to be classified is one of the agent call texts, and the dialogue text of the dialogue counterpart is the customer call text, the third text-type feature indicates whether the customer call text contains a sensitive word.
In the embodiments of the present disclosure, if the dialogue counterpart's dialogue text also contains a sensitive word (any sensitive word suffices; it may be the same as or different from the one in the text to be classified), the sensitive word is more likely to be genuinely present in the text to be classified; if the dialogue counterpart's dialogue text contains no sensitive word, the sensitive word in the text to be classified is less likely to be genuine dialogue content.
It should be noted that the above values of $T_3(\alpha_3)$ are merely illustrative. It suffices that the value of $T_3(\alpha_3)$ when $\alpha_3$ is "no" is smaller than its value when $\alpha_3$ is "yes"; the specific values can be set as needed.
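The third value rule reduces to a single membership check over the counterpart's texts. The sketch below assumes substring matching against a hypothetical lexicon; `t3` is an illustrative name.

```python
from typing import List, Set


def t3(counterpart_texts: List[str], lexicon: Set[str]) -> int:
    """Feature operator T3.

    Returns 1 when any counterpart text contains any sensitive word
    (alpha_3 = yes), else 0 (alpha_3 = no).
    """
    found = any(word in text for text in counterpart_texts for word in lexicon)
    return 1 if found else 0


lexicon = {"idiot", "shut up"}  # hypothetical sensitive-word list
print(t3(["Please call later.", "Oh shut up."], lexicon))
```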
As a specific example, the feature operator $T_4$ can be constructed by the following expression (4) to determine the value rule of the fourth text-type feature.
$$
T_4(\alpha_4)=\begin{cases}0, & \alpha_4=\text{no}\\ 1, & \alpha_4=\text{yes, consistent}\\ 2, & \alpha_4=\text{yes, inconsistent}\end{cases}\tag{4}
$$
In expression (4), $T_4(\alpha_4)$ is abbreviated as operator $T_4$, $\alpha_4$ is the fourth text-type feature, and the value of $T_4(\alpha_4)$ is the feature value of the fourth text-type feature. The fourth text-type feature $\alpha_4$ characterizes whether the predetermined dialogue text contains a sensitive word and, if it does, whether that sensitive word is consistent with the sensitive word in the text to be classified. The predetermined dialogue text is one of the dialogue counterpart's dialogue texts and is adjacent to the text to be classified.
Specifically, if the predetermined dialogue text contains no sensitive word ($\alpha_4$ = no), $T_4(\alpha_4)$ takes the value 0; if the predetermined dialogue text contains a sensitive word that is consistent with the sensitive word in the text to be classified ($\alpha_4$ = yes, consistent), $T_4(\alpha_4)$ takes the value 1; if the predetermined dialogue text contains a sensitive word that is inconsistent with the sensitive word in the text to be classified ($\alpha_4$ = yes, inconsistent), $T_4(\alpha_4)$ takes the value 2.
As an example, when the dialogue text of the target object is the agent call text, the text to be classified is one of the agent call texts, and the dialogue text of the dialogue counterpart is the customer call text, the fourth text-type feature indicates whether the customer call text adjacent to the text to be classified contains a sensitive word and, if so, whether that sensitive word is consistent with the sensitive word in the text to be classified.
In the embodiments of the present disclosure, if the predetermined dialogue text contains no sensitive word (for example, the customer call text adjacent to the text to be classified contains no sensitive word), the sensitive word in the text to be classified is least likely to be genuine dialogue content ($T_4(\alpha_4)=0$). If the predetermined dialogue text contains a sensitive word that is consistent with the sensitive word in the text to be classified (for example, the adjacent customer call text contains the same sensitive word), there is some probability, though a low one, that the sensitive word genuinely came from the target object ($T_4(\alpha_4)=1$). If the predetermined dialogue text contains a sensitive word that is inconsistent with the sensitive word in the text to be classified (for example, the adjacent customer call text contains a different sensitive word), the sensitive word is more likely to have genuinely come from the target object ($T_4(\alpha_4)=2$).
For example, in a call between an agent and a customer, the two take turns speaking. If they argue and the agent call text for one of the agent's utterances contains sensitive words such as abusive language, then the adjacent customer call text (the customer call text immediately before or after that agent call text in the dialogue text) is also likely to contain sensitive words. In addition, return noise must be considered. Compare the case in which the sensitive word in the agent call text differs from the one in the adjacent customer call text with the case in which the two are the same: in the former case the difference rules out return noise, so the sensitive word in the text to be classified is more likely to be genuine dialogue content than in the latter case. Finally, the case in which only the agent call text contains sensitive words such as abusive language while the adjacent customer call text contains none is unlikely to occur in real scenarios.
It should be noted that the above values of $T_4(\alpha_4)$ are merely illustrative. It suffices that the value of $T_4(\alpha_4)$ when $\alpha_4$ is "no" is smaller than its value when $\alpha_4$ is "yes, consistent", and that the value when $\alpha_4$ is "yes, consistent" is smaller than the value when $\alpha_4$ is "yes, inconsistent"; the specific values can be set as needed.
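The three-way rule of the fourth text-type feature can be sketched as follows. Two points are assumptions made here rather than statements of the disclosure: sensitive words are found by substring matching against a hypothetical lexicon, and "consistent" is interpreted as the two texts sharing at least one sensitive word.

```python
from typing import Set


def t4(text: str, adjacent_text: str, lexicon: Set[str]) -> int:
    """Feature operator T4 for a text and one adjacent counterpart text.

    0: the adjacent text has no sensitive word (alpha_4 = no)
    1: it has one that also occurs in `text` (yes, consistent)
    2: it has sensitive words, none shared with `text` (yes, inconsistent)
    """
    own = {word for word in lexicon if word in text}
    adjacent = {word for word in lexicon if word in adjacent_text}
    if not adjacent:
        return 0
    # Assumption: "consistent" means at least one shared sensitive word.
    return 1 if own & adjacent else 2


lexicon = {"idiot", "shut up"}  # hypothetical sensitive-word list
print(t4("You idiot.", "Oh shut up.", lexicon))
```

A stricter reading of "consistent" (for example, identical sets of sensitive words) would only change the final comparison.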
In some embodiments, the at least one text-type feature includes the predetermined feature of the text itself.
Step S11 above may specifically include determining the value rule of at least one of the following text-type features included in the predetermined feature of the text itself: the value rule of a fifth text-type feature and the value rule of a sixth text-type feature. The fifth text-type feature characterizes the sentence completeness of the text to be classified; the sixth text-type feature characterizes the total number of times specific expressions appear at specified positions in the target object's dialogue text.
Step S12 above may specifically include: generating, based on at least one of the value rule of the fifth text-type feature and the value rule of the sixth text-type feature, the feature value of the corresponding at least one of the fifth and sixth text-type features of the text to be classified.
In this embodiment, the predetermined feature of the text itself can characterize at least one of the following items of information: the sentence completeness of the text to be classified, and the total number of times specific expressions appear at specified positions in the target object's dialogue text.
As a specific example, the feature operator $T_5$ can be constructed by the following expression (5) to determine the value rule of the fifth text-type feature.
$$
T_5(\alpha_5)=\begin{cases}0, & \alpha_5\le 0.5\\ 1, & 0.5<\alpha_5\le 0.8\\ 2, & \alpha_5>0.8\end{cases}\tag{5}
$$
In expression (5), $T_5(\alpha_5)$ is abbreviated as operator $T_5$, $\alpha_5$ is the fifth text-type feature, and the value of $T_5(\alpha_5)$ is the feature value of the fifth text-type feature. The fifth text-type feature $\alpha_5$ characterizes the sentence completeness of the text to be classified, which covers the soundness of the sentence structure and the consistency of the sentence semantics. The value of $\alpha_5$ can be obtained by scoring the text to be classified with a preset semantic model; the higher the score, the more complete the sentence.
Specifically, if the sentence completeness score of the text to be classified is less than or equal to 0.5, $T_5(\alpha_5)$ takes the value 0; if the score is greater than 0.5 and less than or equal to 0.8, $T_5(\alpha_5)$ takes the value 1; if the score is greater than 0.8, $T_5(\alpha_5)$ takes the value 2.
In the embodiments of the present disclosure, the probability that the sensitive word in the text to be classified is genuine dialogue content increases with the sentence completeness of the text to be classified: the more complete the sentence, the more likely the sensitive word is genuine dialogue content; the less complete the sentence, the less likely the sensitive word genuinely came from the text to be classified.
It should be noted that the above values of $T_5(\alpha_5)$ are merely illustrative. It suffices that the value of $T_5(\alpha_5)$ does not decrease as $\alpha_5$ increases; the specific values can be set as needed.
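The bucketing of the sentence completeness score can be sketched as below. The score itself is assumed to come from a preset semantic model that is outside this sketch; `t5` is an illustrative name, and the thresholds 0.5 and 0.8 are the illustrative values given above.

```python
def t5(completeness_score: float) -> int:
    """Feature operator T5: bucket a sentence completeness score.

    score <= 0.5 -> 0, 0.5 < score <= 0.8 -> 1, score > 0.8 -> 2.
    """
    if completeness_score <= 0.5:
        return 0
    if completeness_score <= 0.8:
        return 1
    return 2


# Scores would come from the preset semantic model; these are made up.
print([t5(score) for score in (0.3, 0.5, 0.65, 0.8, 0.95)])
```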
As a specific example, the feature operator $T_6$ can be constructed by the following expression (6) to determine the value rule of the sixth text-type feature.
$$
T_6(\alpha_6)=\begin{cases}2, & \alpha_6=0\\ 1, & \alpha_6=n_1\\ 0, & \alpha_6=n_2\end{cases}\tag{6}
$$
In expression (6), $T_6(\alpha_6)$ is abbreviated as operator $T_6$, $\alpha_6$ is the sixth text-type feature, and the value of $T_6(\alpha_6)$ is the feature value of the sixth text-type feature. The sixth text-type feature $\alpha_6$ characterizes the total number of times specific expressions appear at specified positions in the target object's dialogue text. $\alpha_6=0$ means that no specific expression appears at any specified position; $\alpha_6=n_1$ means that specific expressions appear $n_1$ times at the specified positions, where $n_1$ is greater than or equal to 1 and less than a predetermined count threshold; $\alpha_6=n_2$ means that specific expressions appear $n_2$ times at the specified positions, where $n_2$ is greater than or equal to the predetermined count threshold. The predetermined count threshold is greater than or equal to 1 and less than or equal to the total count, which is the number of occurrences when a specific expression appears at every specified position.
在本公开实施例中,待分类文本中的敏感词为真实对话内容的概率,与待分类文本中在规定位置上出现的特定用语的次数成反比;在规定位置上出现特定用语的次数越多,待分类文本中的敏感词为真实对话内容的概率越低;在规定位置上出现特定用语的次数越少,待分类文本中的敏感词为真实对话内容的概率越高。In the embodiment of the present disclosure, the probability that a sensitive word in the text to be classified is a real conversation content is inversely proportional to the number of times a specific word appears in a specified position in the text to be classified; the more times a specific word appears in a specified position, the more , the lower the probability that the sensitive words in the text to be classified are real dialogue content; the fewer times a specific word appears in a specified position, the higher the probability that the sensitive words in the text to be classified are real dialogue content.
例如,特定用语为礼貌用语,礼貌用语在全部规定位置出现的总次数越多,说明待分类文本中的敏感词为真实对话内容的概率越低;礼貌用语在全部规定位置出现的总次数越少,说明待分类文本中的敏 感词为真实对话内容的概率越高。For example, if a specific word is a polite word, the more the total number of times the polite word appears in all specified positions, the lower the probability that the sensitive words in the text to be classified are real conversation content; the less the total number of times the polite word appears in all the specified positions. , indicating the sensitivity in the text to be classified The higher the probability that the testimonials are real conversation content.
需要说明的是,上述T66)的取值仅仅是示意性的说明,需满足α6越大时,T66)的取值越小即可,具体取值可以根据实际需要自定义设置。It should be noted that the above value of T 66 ) is only a schematic explanation. The larger α 6 is, the smaller the value of T 66 ) is. The specific value can be determined according to the actual situation. Requires custom settings.
As an example, in a voice service quality inspection (voice quality inspection for short) scenario, the specific phrases are polite phrases, and the specified positions include at least the opening position (the first text in the dialogue text) and the closing position (the last text in the dialogue text). For example, the specific phrases include at least a polite greeting at the opening position and a polite closing at the closing position, and $\alpha_6$ can represent the number of times polite phrases appear in the dialogue text of the target object. $\alpha_6 = 0$ means that neither the greeting nor the closing appears, in which case the probability that the sensitive word in the text to be classified is genuine conversation content is highest ($T_6(\alpha_6) = 2$). When n1 equals 1, that is, $\alpha_6 = 1$, either the greeting appears once at the opening position only or the closing appears once at the closing position only, in which case the probability is second highest ($T_6(\alpha_6) = 1$). When n2 equals 2, that is, $\alpha_6 = 2$, the greeting appears once at the opening of the call and the closing appears once at the end of the call, in which case the probability is lowest ($T_6(\alpha_6) = 0$).
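The value rule in the quality-inspection example can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function names `count_polite_positions` and `t6`, the substring matching, and the phrase lists are hypothetical, and the return values 2/1/0 simply follow the illustrative values quoted in the example.

```python
from typing import Sequence


def count_polite_positions(dialogue: Sequence[str],
                           greetings: Sequence[str],
                           closings: Sequence[str]) -> int:
    """Return alpha6: how many required positions contain a polite phrase.

    Assumption: the required positions are the first and last texts of the
    dialogue, and a phrase "appears" if it occurs as a substring.
    """
    if not dialogue:
        return 0
    opening_hit = any(g in dialogue[0] for g in greetings)
    closing_hit = any(c in dialogue[-1] for c in closings)
    return int(opening_hit) + int(closing_hit)


def t6(alpha6: int, threshold: int = 2) -> int:
    """Illustrative operator T6: a larger alpha6 yields a smaller value."""
    if alpha6 < 0 or threshold < 1:
        raise ValueError("alpha6 must be >= 0 and threshold >= 1")
    if alpha6 == 0:
        return 2  # no polite phrase at any required position
    if alpha6 < threshold:
        return 1  # alpha6 = n1, with 1 <= n1 < threshold
    return 0      # alpha6 = n2, with n2 >= threshold


# Hypothetical dialogue: a greeting is present, the closing phrase is not.
dialogue = ["Hello, how can I help you?", "I want to check my bill.", "Goodbye."]
alpha6 = count_polite_positions(dialogue, ["Hello"], ["Thank you for calling"])
print(alpha6, t6(alpha6))  # prints: 1 1
```

The `threshold` parameter plays the role of the predetermined count threshold; with two required positions it defaults to 2, so only a dialogue with both the greeting and the closing receives the lowest value.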
In some embodiments, the at least one text-type feature includes predetermined features related to the dialogue text.
Step S11 above may specifically include: determining the value rule of at least one of the following text-type features included in the predetermined features related to the dialogue text: the value rule of a seventh text-type feature and the value rule of an eighth text-type feature, where the seventh text-type feature characterizes the number of texts in the dialogue text to which the text to be classified belongs, and the eighth text-type feature characterizes the position at which the text to be classified appears in the dialogue text.
Step S12 above may specifically include: generating, based on at least one of the value rule of the seventh text-type feature and the value rule of the eighth text-type feature, the feature value of the text to be classified corresponding to at least one of the seventh text-type feature and the eighth text-type feature.
In this example, the predetermined features related to the dialogue text among the text-type features can characterize at least one of the following information items: the number of texts in the dialogue text, and the position at which the text to be classified appears in the dialogue text.
As an example, the feature operator $T_7$ can be constructed through the following expression (7) to determine the value rule of the seventh text-type feature.
In expression (7) above, $T_7(\alpha_7)$ is referred to as operator $T_7$, $\alpha_7$ is the seventh text-type feature, and the value of $T_7(\alpha_7)$ is the feature value of the seventh text-type feature. The seventh text-type feature $\alpha_7$ characterizes the total number of texts (dialogue turns) contained in the dialogue text to which the text to be classified belongs, that is, how many texts were produced in total during one call between the two parties. K1 is less than K2, and K1 and K2 are both integers greater than or equal to 1.
In this embodiment of the present disclosure, the probability that a sensitive word in the text to be classified is genuine conversation content is proportional to the total number of texts in the dialogue text in which the text to be classified is located. For example, the more texts the dialogue text contains (or the more call turns there are), the higher the probability that the sensitive word in the text to be classified is genuine conversation content.
It should be noted that the above values of $T_7(\alpha_7)$ are merely illustrative. It is sufficient that $T_7(\alpha_7)$ increases as $\alpha_7$ increases; the specific values can be customized according to actual needs. The values of K1 and K2 can likewise be set according to actual needs, for example K1 = 10 and K2 = 50.
As an example, the feature operator $T_8$ can be constructed through the following expression (8) to determine the value rule of the eighth text-type feature.
In expression (8) above, $T_8(\alpha_8)$ is referred to as operator $T_8$; x denotes the index of the sentence at which the text to be classified appears in the dialogue text; L is the total number of texts contained in the dialogue text. The eighth text-type feature characterizes the position at which the text to be classified appears in the dialogue text. Specifically, expression (8) distinguishes three cases: the text to be classified appears in the front segment, in the middle segment, or in the rear segment of the dialogue text.
In this embodiment of the present disclosure, the probability that a sensitive word in the text to be classified is genuine conversation content is associated with the position at which the text to be classified appears in the dialogue text: the later that position, the higher the probability that the sensitive word is genuine conversation content.
It should be noted that the above values of $T_8(\alpha_8)$ are merely illustrative. It is sufficient that $T_8(\alpha_8)$ increases as $\alpha_8$ increases; the specific values can be customized according to actual needs.
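The two dialogue-related operators can be sketched as follows. The description fixes only their monotonic behaviour and the example thresholds K1 = 10 and K2 = 50; the output values 0/1/2, the handling of the boundaries at K1 and K2, and the equal-thirds split of the dialogue into front, middle, and rear segments are assumptions made purely for illustration, as are the function names `t7` and `t8`.

```python
def t7(alpha7: int, k1: int = 10, k2: int = 50) -> int:
    """Illustrative operator T7: more dialogue turns yield a larger value."""
    if not 1 <= k1 < k2:
        raise ValueError("require 1 <= K1 < K2")
    if alpha7 < k1:
        return 0
    if alpha7 < k2:
        return 1
    return 2


def t8(x: int, total: int) -> int:
    """Illustrative operator T8: a later position yields a larger value.

    x is the 1-based index of the text within the dialogue and total is L,
    the number of texts in the dialogue. Integer arithmetic avoids
    floating-point comparisons at the segment boundaries.
    """
    if total < 1 or not 1 <= x <= total:
        raise ValueError("require 1 <= x <= L")
    if 3 * x <= total:
        return 0  # front segment (assumed: first third)
    if 3 * x <= 2 * total:
        return 1  # middle segment (assumed: second third)
    return 2      # rear segment


print(t7(8), t7(30), t7(120))             # prints: 0 1 2
print(t8(2, 30), t8(15, 30), t8(29, 30))  # prints: 0 1 2
```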
In expressions (1) to (8) above, the larger the value of a feature operator (that is, the larger the feature value of the corresponding text-type feature of the text to be classified), the greater the probability that the sensitive word genuinely exists in the text to be classified.
In practical applications, further types of sensitive-word distribution features, predetermined features of the text itself, and predetermined features related to the dialogue text can be defined among the text-type features according to actual needs; the embodiments of the present disclosure impose no specific limitation.
In some embodiments, the text to be classified belongs to the dialogue text of the target object.
In step S230 above, the step of performing text classification processing on the text to be classified according to the feature values of the text-type features to obtain a text classification result may specifically include the following steps S21 and S22.
In step S21, based on preset portrait-type features, the feature values of the text to be classified corresponding to the portrait-type features of the target object are obtained, where the portrait-type features characterize individual characteristics of the target object. In step S22, text classification processing is performed on the text to be classified according to the feature values of the text-type features and the feature values of the portrait-type features to obtain the text classification result.
In this embodiment, the portrait-type features serve as an auxiliary basis, on top of the text-type features, for determining whether noise of the specified type is present. Performing text classification processing on the text to be classified by combining the feature values of the text-type features with those of the portrait-type features improves the accuracy of the text classification result.
In some embodiments, the preset portrait-type features include at least one portrait-type feature. In step S21, the step of obtaining, based on the preset portrait-type features, the feature values of the text to be classified corresponding to the portrait-type features of the target object may specifically include the following steps S31 and S32.
In step S31, the value rule of each portrait-type feature is determined according to the at least one portrait-type feature.
In this step, each portrait-type feature is represented as a feature operator, and each feature operator describes the value rule of one portrait-type feature, that is, the correspondence between that portrait-type feature and its different feature values.
In step S32, based on the value rule of each portrait-type feature, the feature value of the text to be classified corresponding to each portrait-type feature of the target object is obtained.
In this embodiment, the feature value of a portrait-type feature is the numerical embodiment of that feature. Text classification processing performed jointly on the feature values of the text-type features and of the portrait-type features of the text to be classified can reflect more accurately whether noise of the specified type is objectively present in the text to be classified. Moreover, the richer the types of preset portrait-type features, the more accurate the subsequent text classification result obtained by combining the feature values of the two types of features (text-type and portrait-type), which helps to further improve the accuracy of the text classification result.
In some embodiments, the target object is a customer service agent, and the individual characteristics characterize at least one of the following information items: agent level, agent length of service, the number of times within a predetermined statistical period that the agent's script did not comply with predetermined script rules, and whether there is a historical record of the agent's script failing to comply with the predetermined script rules because the text to be classified contained a sensitive word.
In this embodiment, setting multiple different types of portrait-type features provides an auxiliary basis for subsequently judging whether noise of the specified type is present in the text to be classified, which improves the accuracy of the final classification result.
As a specific example, the feature operator $S_1$ can be constructed through the following expression (9) to determine the value rule of the individual characteristic of agent length of service.
In expression (9) above, $S_1(\beta_1)$ is referred to as operator $S_1$; $\beta_1$ denotes the length of service of the customer service agent on the call, which may be measured in years; $S_1(\beta_1)$ denotes the value of operator $S_1$. A1 and A2 are both integers greater than or equal to 1, and A2 is greater than A1; for example, A2 = 3 and A1 = 1.
In this embodiment of the present disclosure, the probability that a sensitive word in the text to be classified is genuine conversation content is inversely proportional to the length of service of the agent on the call. The longer the agent's length of service (for example, $\beta_1$ greater than A2, such as 10 years), the lower the probability that the sensitive word is genuine conversation content; the shorter the agent's length of service (for example, $\beta_1$ less than A1, such as 1 year), the higher that probability.
It should be noted that the above values of A1, A2, and $S_1(\beta_1)$ are merely illustrative. It is sufficient that $S_1(\beta_1)$ decreases as $\beta_1$ increases; the specific values can be customized according to actual needs.
As a specific example, the feature operator $S_2$ can be constructed through the following expression (10) to determine the value rule of the agent-level feature.
In expression (10) above, $S_2(\beta_2)$ is referred to as operator $S_2$, and $\beta_2$ denotes the agent level. Ⅰ, Ⅱ, and Ⅲ denote three levels from high to low; for example, among level one, level two, and level three, level one is the highest, level two is next, and level three is the lowest.
In this embodiment of the present disclosure, the probability that a sensitive word in the text to be classified is genuine conversation content is inversely proportional to the agent level: the higher the agent level, the lower the probability that the sensitive word is genuine conversation content; the lower the agent level, the higher that probability.
It should be noted that the number of levels and their notation above are merely illustrative. It is sufficient that the higher the agent level represented by $\beta_2$, the smaller the value of $S_2(\beta_2)$; the specific values can be customized according to actual needs.
As a specific example, the feature operator $S_3$ can be constructed through the following expression (11) to determine the value rule of the individual characteristic of the agent's script failing to comply with the predetermined script rules within the predetermined statistical period.
In expression (11), $S_3(\beta_3)$ is referred to as operator $S_3$; $\beta_3$ denotes the number of times within the predetermined statistical period (for example, the most recent month) that the agent was penalized with a loss of benefits because the agent's script did not comply with the predetermined script rules; $S_3(\beta_3)$ denotes the value of operator $S_3$. C1 and C2 are both integers greater than or equal to 1, and C2 is greater than C1; for example, C1 = 5 and C2 = 10.
In this embodiment of the present disclosure, the probability that a sensitive word in the text to be classified is genuine conversation content is proportional to the number of times within the predetermined statistical period that the agent was penalized with a loss of benefits for a non-compliant script. For example, the more often the agent was penalized for script violations in the most recent month, the higher the probability that the sensitive word in the text to be classified is genuine conversation content.
It should be noted that the above predetermined statistical period is merely illustrative. It is sufficient that the larger the count represented by $\beta_3$, the larger the value of $S_3(\beta_3)$; the specific values of C1 and C2 can be customized according to actual needs.
As a specific example, the feature operator $S_4$ can be constructed through the following expression (12) to determine the value rule of the individual characteristic of whether there is a historical record of the agent being penalized with a loss of benefits because the text to be classified contained a sensitive word that caused the agent's script to violate the predetermined script rules.

$$S_4(\beta_4)=\begin{cases}1, & \beta_4 = \text{yes}\\ 0, & \beta_4 = \text{no}\end{cases}\tag{12}$$
In expression (12) above, $S_4(\beta_4)$ is referred to as operator $S_4$; $\beta_4$ denotes whether, in the historical call data, the agent was penalized with a loss of benefits because the text to be classified contained a sensitive word that caused the agent's script to violate the predetermined script rules; $S_4(\beta_4)$ denotes the value of operator $S_4$. Specifically, if the agent has been penalized with a loss of benefits (for example, including but not limited to at least one of demotion, disqualification from merit evaluation, or financial loss, that is, $\beta_4$ = yes), $S_4(\beta_4)$ takes the value 1; if the agent has not been so penalized (that is, $\beta_4$ = no), $S_4(\beta_4)$ takes the value 0.
In this embodiment of the present disclosure, the probability that a sensitive word in the text to be classified is genuine conversation content is associated with whether, in the historical call data, the agent was penalized with a loss of benefits because the text to be classified contained a sensitive word that caused the agent's script to violate the predetermined script rules.
For example, if the historical call data show that the agent has been penalized for a script violation caused by a sensitive word in the text to be recognized, the probability that the sensitive word in the text to be classified is genuine conversation content is higher; if no such penalty has ever occurred, that probability is lower.
It should be noted that the above values of $S_4(\beta_4)$ are merely illustrative. It is sufficient that the value of $S_4(\beta_4)$ when $\beta_4$ is "no" is smaller than its value when $\beta_4$ is "yes"; the specific values can be customized according to actual needs.
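The four portrait-type operators can be sketched together as follows. This is a sketch under stated assumptions rather than the disclosed expressions: only the 1/0 values of $S_4$ and the example thresholds (A1 = 1, A2 = 3, C1 = 5, C2 = 10) come from the description, while the 0/1/2 output values of `s1`, `s2`, and `s3`, the boundary handling, the ASCII level codes "I", "II", "III", and all function names are hypothetical.

```python
def s1(years: float, a1: int = 1, a2: int = 3) -> int:
    """Illustrative operator S1: a longer length of service yields a smaller value."""
    if not 1 <= a1 < a2:
        raise ValueError("require 1 <= A1 < A2")
    if years > a2:
        return 0
    if years >= a1:
        return 1
    return 2


def s2(level: str) -> int:
    """Illustrative operator S2: a higher agent level yields a smaller value."""
    values = {"I": 0, "II": 1, "III": 2}  # level I is the highest
    if level not in values:
        raise ValueError(f"unknown agent level: {level!r}")
    return values[level]


def s3(penalties: int, c1: int = 5, c2: int = 10) -> int:
    """Illustrative operator S3: more penalties in the period yield a larger value."""
    if not 1 <= c1 < c2:
        raise ValueError("require 1 <= C1 < C2")
    if penalties < c1:
        return 0
    if penalties < c2:
        return 1
    return 2


def s4(penalized_before: bool) -> int:
    """Operator S4 as described: 1 if a matching penalty record exists, else 0."""
    return 1 if penalized_before else 0


def portrait_features(years: float, level: str,
                      penalties: int, penalized_before: bool) -> list:
    """Collect the feature values S1..S4 for one agent."""
    return [s1(years), s2(level), s3(penalties), s4(penalized_before)]


print(portrait_features(0.5, "III", 6, True))  # prints: [2, 2, 1, 1]
print(portrait_features(8, "I", 0, False))     # prints: [0, 0, 0, 0]
```

In this sketch a larger value consistently indicates a higher probability that the sensitive word is genuine conversation content, which matches the direction stated for each operator.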
In some embodiments, the portrait-type features may further include at least one of the following information items: the number of merit awards received, the number of complaints received within a predetermined statistical period because the agent's script did not comply with the predetermined script rules, and so on. In practical applications, further types of portrait-type features can be defined as needed; the embodiments of the present disclosure impose no specific limitation.
In some embodiments, if the target object is a customer, corresponding portrait-type features can be set to characterize at least one of the following information items: customer service level, customer credit score, number of customer points, and whether the customer has a corresponding adverse historical record.
In this embodiment of the present disclosure, there is an association between the feature values of the text-type features of the text to be classified and whether noise of the specified type is present in the text to be classified. When text classification processing is performed on the text to be classified, a text classification result indicating whether noise of the specified type is present can be generated from the feature values of the text-type features on the basis of this association. The association can take the form of a function or a model, and it can be obtained through model training.
In some embodiments, in step S230 above, performing text classification processing on the text to be classified according to the feature values of the text-type features to generate a text classification result indicating whether noise of the specified type is present may specifically include the following steps S41 and S42.
In step S41, the feature values of the text-type features are processed by a first classification model to obtain a first text category of the text to be classified, the first classification model being a model trained in advance on sample texts. In step S42, the text classification result is generated according to a predetermined correspondence between the value of the first text category and whether noise of the predetermined type is present.
In this embodiment, the first classification model represents the association between the feature values of the text-type features and the text category of the text to be classified. The feature values of the text-type features are processed by the first classification model to obtain the first text category of the text to be classified. When the first text category takes a first value, for example 1, the corresponding text classification result is that noise of the specified type is present in the text to be classified; when the first text category takes a second value, for example 0, the corresponding text classification result is that noise of the specified type is absent. Whether noise of the specified type is present can thus be judged accurately from the model output, with few processing steps and high processing efficiency.
In this embodiment of the present disclosure, the training data of the first classification model are sample texts obtained from the dialogue texts corresponding to historical speech data. During training of the first classification model, the text-type feature values of each sample text are obtained, and annotation information indicating whether noise of the specified type is present in the sample text is added. The model is then trained on the text-type feature values of the annotated sample texts to obtain the first classification model, which is used to perform text classification processing on the text to be classified and to generate a text classification result indicating whether noise of the specified type is present in it. This improves processing efficiency and the accuracy of text classification based on text-type features.
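Steps S41 and S42 can be sketched as follows. The description does not name a model type, so a nearest-centroid classifier is used here purely as a stand-in for the first classification model; the function names, the toy feature vectors, and their labels are hypothetical and carry no empirical meaning.

```python
from typing import Dict, List, Sequence


def train_centroids(rows: Sequence[Sequence[float]],
                    labels: Sequence[int]) -> Dict[int, List[float]]:
    """Fit a nearest-centroid model: one mean feature vector per label."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_category(model: Dict[int, List[float]],
                     features: Sequence[float]) -> int:
    """Step S41: return the label whose centroid is closest to the features."""
    def sq_dist(centroid: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def classification_result(category: int) -> str:
    """Step S42: map the category value to presence or absence of the noise."""
    return "noise present" if category == 1 else "noise absent"


# Hypothetical feature values T1..T8 for four annotated sample texts.
train_rows = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0],
    [2, 2, 1, 2, 2, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 2, 2],
]
train_labels = [1, 1, 0, 0]  # 1: noise present, 0: noise absent

model = train_centroids(train_rows, train_labels)
category = predict_category(model, [0, 0, 0, 0, 0, 0, 0, 1])
print(category, classification_result(category))  # prints: 1 noise present
```

Any classifier that accepts a fixed-length numeric feature vector could take the place of the centroid model; the separation between predicting a category (S41) and mapping it to a result (S42) is the part taken from the description.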
In some embodiments, step S22 above, namely performing text classification processing on the text to be classified according to the feature values of the text-type features and the feature values of the portrait-type features to obtain the text classification result, may specifically include the following steps S51 and S52.
In step S51, the feature values of the text-type features and the feature values of the portrait-type features are processed by a second classification model to obtain a second text category of the text to be classified, the second classification model being a model trained on sample texts. In step S52, the text classification result is generated according to a predetermined correspondence between the value of the second text category and whether noise of the predetermined type is present.
In this embodiment, the second classification model represents the association between the feature values of the text-type features and of the portrait-type features on the one hand and the text category of the text to be classified on the other. The feature values of the text-type features and of the portrait-type features are processed by the second classification model to obtain the second text category of the text to be classified. When the second text category takes a first value, for example 1, the corresponding text classification result is that noise of the specified type is present in the text to be classified; when the second text category takes a second value, for example 0, the corresponding text classification result is that noise of the specified type is absent. Whether noise of the specified type is present can thus be judged accurately from the model output, with few processing steps and high processing efficiency.
In this embodiment of the present disclosure, the training data of the second classification model are sample texts obtained from the dialogue texts corresponding to historical speech data. During training of the second classification model, the text-type feature values and portrait-type feature values of each sample text are obtained, and each sample text is annotated according to whether noise of the specified type is present in it. The model is then trained on the text-type and portrait-type feature values of the annotated sample texts to obtain the second classification model, which is used to classify the text to be classified and to generate a classification result indicating whether noise of the specified type is present in it. This improves processing efficiency and the accuracy of text classification based on text-type features.
The training process is described below, taking the second classification model as an example.
The construction of the training data for the second classification model is described first.
As an example, Table 1 below shows illustrative values of the training data of the second classification model.
Table 1: Training data of the second classification model
在上述表1中,n表示样本文本的数量,且n大于等于1的整数,T1至T8表示通过上述表达式(1)-(8)描述的文本类特征,S1至S4表示通过上述表达式(9)-(12)表述的画像类特征,标注信息值是对第二分类模型进行训练的样本文本中的每个样本文本是否存在指定类型噪声的情况的标注,标注信息值为0,表示相应样本文本不存在指定类型噪声;标注信息值为1,表示相应样本文本存在指定类型噪声。In Table 1 above, n is the number of sample texts and is an integer greater than or equal to 1; T1 to T8 are the text-type features described by expressions (1)-(8) above; S1 to S4 are the portrait-type features described by expressions (9)-(12) above; and the label value marks, for each sample text used to train the second classification model, whether the specified type of noise exists. A label value of 0 indicates that the specified type of noise does not exist in the corresponding sample text; a label value of 1 indicates that the specified type of noise exists in the corresponding sample text.
在该步骤中,对于每个样本文本,确定文本类特征的特征值和画像类特征的特征值,并为每个样本文本添加对应的标注信息(标注信息“0”,表示文本中不存在指定类型噪声;标注信息“1”,表示文本中存在指定类型噪声),从而完成训练数据的构造。指定类型噪声为回传噪声时,该模型用于对文本类特征的特征值和画像类特征的特征值与回传噪声的是否存在的关联关系进行训练。In this step, the feature values of the text-type features and of the portrait-type features are determined for each sample text, and the corresponding label is added to each sample text (label "0" indicates that the specified type of noise does not exist in the text; label "1" indicates that it exists), which completes the construction of the training data. When the specified type of noise is return noise, the model is trained to learn the association between the feature values of the text-type and portrait-type features and the presence or absence of return noise.
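The construction of training data described above can be sketched as follows. This is a minimal illustration, assuming that each sample carries eight text-type feature values (T1 to T8), four portrait-type feature values (S1 to S4) and a 0/1 label, as Table 1 describes. The names `Sample` and `build_training_rows` are hypothetical, and the numbers are made up for illustration; they are not values from Table 1.

```python
from dataclasses import dataclass
from typing import List, Tuple

N_TEXT_FEATURES = 8      # T1..T8, as listed for Table 1
N_PORTRAIT_FEATURES = 4  # S1..S4, as listed for Table 1


@dataclass
class Sample:
    """One labeled sample text (hypothetical container, not from the source)."""
    text_features: List[float]      # feature values of T1..T8
    portrait_features: List[float]  # feature values of S1..S4
    label: int                      # 1 = specified noise present, 0 = absent


def build_training_rows(samples: List[Sample]) -> Tuple[List[List[float]], List[int]]:
    """Flatten samples into a feature matrix and a label vector."""
    rows, labels = [], []
    for s in samples:
        if len(s.text_features) != N_TEXT_FEATURES:
            raise ValueError("expected 8 text-type feature values")
        if len(s.portrait_features) != N_PORTRAIT_FEATURES:
            raise ValueError("expected 4 portrait-type feature values")
        if s.label not in (0, 1):
            raise ValueError("label must be 0 or 1")
        rows.append(list(s.text_features) + list(s.portrait_features))
        labels.append(s.label)
    return rows, labels


# Made-up feature values, for illustration only.
samples = [
    Sample([1, 0, 1, 1, 0, 0, 1, 0], [3, 2, 0, 1], 1),
    Sample([0, 0, 0, 0, 1, 1, 0, 0], [5, 3, 0, 0], 0),
]
X, y = build_training_rows(samples)
```

Each row of `X` then corresponds to one row of Table 1 without its label column, and `y` holds the label column.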
在上述实施例的处理过程中,文本类特征和画像类特征的构建和基于构建的文本类特征和画像类特征的特征值进行话术回传判定,是文本层面和画像层面的噪音数据的识别和判定,而相关技术中,声纹识别技术是在语音层面进行噪音数据的分割和剔除。本公开实施例的文本分类方法和文本识别方法均是在文本层面和画像层面进行噪音数据的判定,不存在声纹识别技术去噪准确率低、流程复杂的问题。In the processing of the above embodiments, constructing the text-type and portrait-type features and determining speech return from their feature values amounts to identifying and judging noise data at the text level and the portrait level, whereas voiceprint recognition in the related art segments and removes noise data at the speech level. Because both the text classification method and the text recognition method of the embodiments of the present disclosure judge noise data at the text level and the portrait level, they avoid the low denoising accuracy and complicated process of voiceprint recognition.
下面对第二分类模型的训练过程进行描述。The training process of the second classification model is described below.
图3为本公开实施例提供的模型训练和模型使用的流程示意图。如图3所示,该模型训练过程可以包括如下步骤S301至S303。Figure 3 is a schematic flowchart of model training and model use provided by an embodiment of the present disclosure. As shown in Figure 3, the model training process may include the following steps S301 to S303.
在步骤S301,如图3中“输入训练数据”所示,可以采用文本分类的任务获取输入的训练数据。In step S301, as shown in "Input training data" in Figure 3, the input training data can be obtained using a text classification task.
在步骤S302,如图3中“机器学习训练”所示,使用预定类型的机器学习网络进行模型训练,得到训练后的模型。In step S302, as shown in "Machine Learning Training" in Figure 3, a predetermined type of machine learning network is used for model training to obtain a trained model.
在步骤S303,如图3中的“得到训练后的文本分类模型”所示,将该训练后的模型作为第二分类模型。In step S303, as shown in "Obtaining the trained text classification model" in Figure 3, the trained model is used as the second classification model.
在一些实施例中,机器学习网络可以采用如下机器模型中的任一种:逻辑回归或者逻辑斯蒂克回归(Logistic Regression,LR)模型、文本分类算法模型TextCNN、预训练语言模型Bert等机器学习模型。In some embodiments, the machine learning network may be any of the following machine learning models: a logistic regression (Logistic Regression, LR) model, the text classification model TextCNN, the pre-trained language model Bert, and the like.
LR模型是传统机器学习中的最简单的最常用的分类模型。LR算法简单、高效、易于并行且在线学习的特点,在工业界具有非常广泛的应用;TextCNN模型,可以根据词向量得到一个二维句子矩阵,然后选择不同的过滤器进行卷积操作得到多个特征矩阵(feature map),对每个特征矩阵进行最大池化操作,进而将其拼接起来,最后经过分类器(softmax)全联接层进行分类。TextCNN模型具有网络结构简单的优势,通过引入已经训练好的词向量会有较好的模型训练效果;该模型具有模型参数数目少,计算量少,训练速度快的优点;Bert模型,可以使用转换器(Transformer)的双向编码器表示,预训练的BERT表示可以通过一个额外的输出层进行微调,适用于广泛任务的最先进模型的构建;在实际应用场景中,可以根据实际训练需要选择合适的模型,本公开实施例不做具体限定。The LR model is the simplest and most commonly used classification model in traditional machine learning. The LR algorithm is simple, efficient, easy to parallelize, and supports online learning, so it is very widely used in industry. The TextCNN model builds a two-dimensional sentence matrix from word vectors, applies convolution with different filters to obtain multiple feature maps, applies max pooling to each feature map, concatenates the pooled results, and finally classifies them through a fully connected layer with a softmax classifier. The TextCNN model has a simple network structure and trains well when pre-trained word vectors are introduced; it also has few parameters, a low computational cost, and fast training. The Bert model uses Bidirectional Encoder Representations from Transformers; the pre-trained BERT representations can be fine-tuned with one additional output layer to build state-of-the-art models for a wide range of tasks. In practice, a suitable model can be selected according to the actual training needs, which the embodiments of the present disclosure do not specifically limit.
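As one possible illustration of step S302 with the LR option, the sketch below trains a logistic regression classifier by batch gradient descent using only the Python standard library. It is a from-scratch toy under stated assumptions, not the training procedure of the disclosure: the function names, the learning rate, the epoch count and the two-feature toy data are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lr: float = 0.5,
    epochs: int = 1000,
) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log loss w.r.t. the logit
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return text category 1 (noise present) or 0 (noise absent)."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Toy data (made up): the label simply follows the first feature.
X_toy = [[1, 1], [1, 0], [0, 1], [0, 0]]
y_toy = [1, 1, 0, 0]
w, b = train_logistic_regression(X_toy, y_toy)
predictions = [predict(w, b, x) for x in X_toy]
```

In practice a library implementation would normally replace this loop; the sketch only shows how labeled feature rows turn into a model that outputs category 1 or 0.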
下面对第二分类模型使用过程进行描述。The following describes the process of using the second classification model.
继续参考图3,利用训练后的模型进行文本分类处理可以包括如下步骤S304至S306。Continuing to refer to Figure 3, using the trained model to perform text classification processing may include the following steps S304 to S306.
在步骤S304,如图3中“特征值计算”所示,计算待分类文本的文本类特征的特征值和画像类特征的特征值。In step S304, as shown by "Feature value calculation" in Figure 3, the feature values of the text-type features and of the portrait-type features of the text to be classified are calculated.
在步骤S305,如图3中“模型处理”所示,使用训练得到的第二分类模型,对待分类文本的文本类特征的特征值和画像类特征的特征值进行处理。In step S305, as shown in "Model Processing" in Figure 3, the trained second classification model is used to process the feature values of the text feature and the feature value of the portrait feature of the text to be classified.
在步骤S306,如图3中“输出文本类别”所示,若输出的文本类别为“1”,则判定待识别文本中存在话术回传;如果文本类别为“0”,则判定待识别文本中不存在话术回传。In step S306, as shown by "Output text category" in Figure 3, if the output text category is "1", it is determined that speech return exists in the text to be recognized; if the text category is "0", it is determined that speech return does not exist in the text to be recognized.
通过上述步骤S301至S306,对第二分类模型的训练过程和使用过程进行描述。应理解,第一分类模型的模型训练过程与第二分类模型的训练过程类似,不同之处在于,第一分类模型的训练数据为对第二分类模型进行训练所使用的样本文本的文本类特征的特征值。对第一分类模型进行训练所使用的样本文本与对第二分类模型进行训练所使用的样本文本可以是相同的样本文本,也可以是不同的样本文本。第一分类模型的训练过程中的其他细节内容可参照第二分类模型的训练中的相应内容,在此不再赘述。在模型识别步骤中,需要计算待分类文本的文本类特征的特征值,并利用训练得到的第一分类模型处理,处理待分类文本的文本类特征的特征值,得到相应的文本类别,以用于确定待分类文本中回传噪声的是否存在。Steps S301 to S306 above describe the training and use of the second classification model. It should be understood that the first classification model is trained in a similar way; the difference is that its training data is the feature values of the text-type features of the sample text used to train the second classification model. The sample texts used to train the first classification model may be the same as, or different from, those used to train the second classification model. For other details of training the first classification model, refer to the corresponding description of training the second classification model, which is not repeated here. In the model recognition step, the feature values of the text-type features of the text to be classified are calculated and processed by the trained first classification model to obtain the corresponding text category, which is used to determine whether return noise exists in the text to be classified.
在本公开实施例中,包括但不限于使用文本位置、完整性,客户通话文本敏感词的有无、与坐席敏感词的异同、坐席工龄、级别、不符合话术规则而收到利益损失处理的次数(例如受罚次数)等特征,完成了文本类特征和画像类特征的构建;话术回传判定过程中,可以预先使用构建的文本类特征和画像类特征,获取用于进行模型训练的样本文本中的文本类特征的特征值和画像类特征的特征值,以生成训练数据,并使用该训练数据和样本文本的标注结果(至少是否存在特定类型噪声)进行模型训练,得到分类模型,从而可以通过训练后的模型的输出,判定待识别文本中是否存在话术回传。In the embodiments of the present disclosure, the text-type features and portrait-type features are constructed from features including, but not limited to: the position and completeness of the text, the presence or absence of sensitive words in the customer's call text, whether those sensitive words match the agent's sensitive words, the agent's length of service, the agent's level, and the number of times the agent has received loss-of-benefit handling for violating the script rules (for example, the number of penalties). For speech return determination, the constructed text-type and portrait-type features can be used in advance to obtain the corresponding feature values of the sample texts used for model training, thereby generating training data. The model is trained with this training data and the labels of the sample texts (at least whether a specific type of noise exists) to obtain a classification model, and the output of the trained model is then used to determine whether speech return exists in the text to be recognized.
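Steps S304 to S306 can be sketched as a small function. This is an illustration only: it assumes the feature values have already been computed, treats the trained model as any callable that maps a feature vector to 0 or 1, and uses the hypothetical names `has_speech_return` and `toy_model`.

```python
from typing import Callable, Sequence

# A "model" here is any callable mapping a feature vector to category 0 or 1.
Model = Callable[[Sequence[float]], int]


def has_speech_return(
    text_feature_values: Sequence[float],
    portrait_feature_values: Sequence[float],
    model: Model,
) -> bool:
    """Return True when the model outputs text category 1."""
    # S304: the feature values are assumed to be computed by the caller.
    features = list(text_feature_values) + list(portrait_feature_values)
    # S305: the trained model processes the combined feature values.
    category = model(features)
    if category not in (0, 1):
        raise ValueError("model must output text category 0 or 1")
    # S306: category 1 means speech return exists, 0 means it does not.
    return category == 1


def toy_model(features: Sequence[float]) -> int:
    """Hypothetical stand-in: flags noise when the first feature is set."""
    return 1 if features[0] >= 1 else 0


noisy = has_speech_return([1, 0], [3], toy_model)
clean = has_speech_return([0, 0], [3], toy_model)
```

Passing the model as a callable keeps the sketch independent of whether LR, TextCNN or Bert was chosen in training.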
在一些实施例中,指定类型噪声包括:由声音回传产生的回传噪声;声音回传是指:在通话过程中通话设备的扬声器到麦克风阵列的声音回传。 In some embodiments, the specified type of noise includes: return noise generated by sound return; sound return refers to: sound return from the speaker of the calling device to the microphone array during the call.
本公开实施例的模型分类方法,可以根据预设的文本类特征和待分类文本,生成该待分类文本的文本类特征的特征值,对生成的文本类特征的特征值进行文本分类处理,得到文本分类结果,以确定待分类文本中是否存在通话回传噪声,提高分类结果准确率,处理方式和处理步骤简单可行,提高了分类结果的处理效率。The model classification method of the embodiment of the present disclosure can generate the feature values of the text feature of the text to be classified based on the preset text feature and the text to be classified, and perform text classification processing on the feature value of the generated text feature to obtain Text classification results are used to determine whether there is call return noise in the text to be classified, and the accuracy of the classification results is improved. The processing method and processing steps are simple and feasible, and the processing efficiency of the classification results is improved.
可以理解,本公开提及的上述各个方法实施例,在不违背原理逻辑的情况下,均可以彼此相互结合形成结合后的实施例,限于篇幅,本公开不再赘述。本领域技术人员可以理解,在具体实施方式的上述方法中,各步骤的具体执行顺序应当以其功能和可能的内在逻辑确定。It can be understood that the above-mentioned method embodiments mentioned in this disclosure can be combined with each other to form a combined embodiment without violating the principle logic. Due to space limitations, the details will not be described in this disclosure. Those skilled in the art can understand that in the above-mentioned methods of specific embodiments, the specific execution order of each step should be determined by its function and possible internal logic.
下面通过图4,描述本公开实施例的文本识别方法的处理流程。图4为本公开实施例提供的文本识别方法的流程图。如图4所示,该方法可以包括如下步骤S410至S430。The processing flow of the text recognition method according to the embodiment of the present disclosure is described below through FIG. 4 . Figure 4 is a flow chart of a text recognition method provided by an embodiment of the present disclosure. As shown in Figure 4, the method may include the following steps S410 to S430.
在步骤S410,对获取的待识别文本进行敏感词识别,得到敏感词识别结果。In step S410, sensitive word recognition is performed on the acquired text to be recognized, and a sensitive word recognition result is obtained.
在步骤S410之前,可以预先在由获取的通话语音转换得到的对话文本中,获取待识别文本。该步骤中,输出待识别文本,待识别文本可以为指定对话文本中的一条对话文本。示例性地,待识别文本可以是客服坐席与客户之间的对话文本中的客服坐席的任一条对话文本。Before step S410, the text to be recognized may be obtained in advance from the conversation text converted from the obtained call voice. In this step, the text to be recognized is output, and the text to be recognized can be a dialogue text in the specified dialogue text. For example, the text to be recognized may be any dialogue text of the customer service agent among the dialogue texts between the customer service agent and the customer.
在步骤S410,对待识别文本进行命名实体识别(Named Entity Recognition,NER),根据实体识别结果,确定待识别文本是否存在敏感词;通过NER技术识别待识别文本中的敏感词,NER模型可采用长短期记忆网络(Long Short-Term Memory,LSTM)、Bert网络。如果没有识别到敏感词,则可以输出空值或相应提示信息,表示待识别文本中不包含敏感词;如果识别到敏感词,则可以把识别到的敏感词加入到敏感词列表中,以用于进行后续处理。In step S410, named entity recognition (Named Entity Recognition, NER) is performed on the text to be recognized, and whether the text to be recognized contains sensitive words is determined from the entity recognition result. Sensitive words in the text to be recognized are identified through NER; the NER model may use a long short-term memory (Long Short-Term Memory, LSTM) network or a Bert network. If no sensitive word is recognized, a null value or a corresponding prompt may be output to indicate that the text to be recognized contains no sensitive words; if sensitive words are recognized, they may be added to a sensitive word list for subsequent processing.
LSTM网络是一种时间循环神经网络,可以减轻一般循环神经网络存在的长期依赖问题。基于LSTM网络和Bert网络在命名实体识别时,具体可以取得较好的识别效果,实际应用中,可以根据需要选择进行命名实体识别的网络,本公开实施例不做具体限制。 The LSTM network is a temporal recurrent neural network that can alleviate the long-term dependency problem of general recurrent neural networks. The LSTM network and Bert network can achieve better recognition results in named entity recognition. In practical applications, the network for named entity recognition can be selected as needed, and the embodiments of the present disclosure do not impose specific restrictions.
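The input/output contract of step S410 can be illustrated with a deliberately simple stand-in. Note that the sketch below swaps the NER model for plain lexicon matching, which is not the technique the disclosure describes; the lexicon entries and the function name `recognize_sensitive_words` are hypothetical. What it shows is only the shape of the result: an empty list when no sensitive word is found, otherwise a sensitive word list for later steps.

```python
from typing import List, Sequence

# Hypothetical lexicon; a real system would use a trained NER model instead.
SENSITIVE_LEXICON = ("guaranteed approval", "zero risk")


def recognize_sensitive_words(
    text: str, lexicon: Sequence[str] = SENSITIVE_LEXICON
) -> List[str]:
    """Return the lexicon entries found in the text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in lexicon if term in lowered]


found = recognize_sensitive_words("This loan is Zero Risk.")
none_found = recognize_sensitive_words("Hello, how can I help you?")
```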
在步骤S420,根据待识别文本的文本类特征的特征值,对待识别文本进行文本分类处理,生成文本分类结果,文本分类结果用于指示指定类型噪声是否存在。In step S420, text classification processing is performed on the text to be recognized according to the feature value of the text-type feature of the text to be recognized, and a text classification result is generated. The text classification result is used to indicate whether the specified type of noise exists.
在本公开实施例中,对待识别文本进行文本分类处理时,将待识别文本作为待分类文本,执行上述实施例中的文本分类方法,得到对应的文本分类结果。文本分类的具体过程和具体细节,可参照前述实施例中结合图2-图3描述的文本分类方法的具体步骤,本公开实施例不再赘述。In the embodiment of the present disclosure, when performing text classification processing on the text to be recognized, the text to be recognized is used as the text to be classified, and the text classification method in the above embodiment is executed to obtain the corresponding text classification result. For the specific process and details of text classification, please refer to the specific steps of the text classification method described in conjunction with Figures 2-3 in the previous embodiments, and will not be described again in the embodiments of this disclosure.
在步骤S430,根据敏感词识别结果和文本分类结果,生成待识别文本的文本识别结果。In step S430, a text recognition result of the text to be recognized is generated based on the sensitive word recognition result and the text classification result.
通过该文本识别方法,通过待识别文本的文本类特征的特征值可以对对话文本中的特定类型噪声数据的是否存在进行有效判定;通过特定类型噪声数据是否存在的判定结果来辅助敏感词识别,提升了敏感词识别的准确率。本公开提出的文本识别方法是在文本层面,降低文本识别过程中话术回传噪声和转译错误对文本识别结果的不利影响,有效减少了对预定类型噪声存在前提下敏感词识别结果的错误判定,提升了敏感词识别的准确率。With this text recognition method, the feature values of the text-type features of the text to be recognized make it possible to determine effectively whether a specific type of noise data exists in the dialogue text, and that determination is used to assist sensitive word recognition, improving its accuracy. The text recognition method proposed in the present disclosure works at the text level: it reduces the adverse effect of speech return noise and transcription errors on the text recognition result, effectively reduces misjudgment of sensitive word recognition results when the predetermined type of noise is present, and improves the accuracy of sensitive word recognition.
根据本公开实施例的文本识别方法,在对待识别文本中的敏感词进行识别时,可以结合敏感词识别结果和话术回传判定结果,完成语音质检场景下敏感词的识别,提升敏感词识别的准确率,减少对坐席使用话术不符合预定话术规则的错误判定。According to the text recognition method of the embodiment of the present disclosure, when identifying sensitive words in the text to be recognized, the sensitive word recognition results and the speech return judgment results can be combined to complete the recognition of sensitive words in a speech quality inspection scenario and improve sensitive words. The accuracy of identification reduces the erroneous judgment that the agent's speech does not comply with the predetermined speech rules.
在一些实施例中,步骤S420具体可以包括:利用本公开上述任一实施例的文本分类方法对待识别文本进行文本分类处理,得到文本分类结果。In some embodiments, step S420 may specifically include: using the text classification method of any of the above embodiments of the present disclosure to perform text classification processing on the text to be recognized, to obtain a text classification result.
在一些实施例中,待识别文本是从对话文本中获取的目标对象的对话文本之一。步骤S420具体可以包括:步骤S61,获取待识别文本的画像类特征的特征值,画像类特征用于表征目标对象的个体特征;步骤S62,根据文本类特征的特征值和画像类特征的特征值,对待识别文本进行文本分类处理,得到文本分类结果。In some embodiments, the text to be recognized is one of the dialogue texts of the target object obtained from the dialogue text. Step S420 may specifically include: step S61, obtaining the feature value of the portrait feature of the text to be recognized, which is used to characterize the individual features of the target object; step S62, based on the feature value of the text feature and the feature value of the portrait feature , perform text classification processing on the text to be recognized, and obtain text classification results.
本公开实施例中文本识别方法,计算待识别文本的文本类特征的特征值和画像类特征的特征值,确定待识别文本中是否存在指定类型噪声(例如话术回传噪声),对待识别文本中的敏感词进行识别,结合指定类型噪声的是否存在的判定结果和敏感词识别结果,提升语音质检场景下敏感词识别的准确率。The text recognition method in the embodiments of the present disclosure calculates the feature values of the text-type features and of the portrait-type features of the text to be recognized, determines whether the specified type of noise (for example, speech return noise) exists in the text to be recognized, identifies the sensitive words in the text to be recognized, and combines the noise determination result with the sensitive word recognition result to improve the accuracy of sensitive word recognition in speech quality inspection scenarios.
在一些实施例中,步骤S430具体可以包括:若从待识别文本识别出的敏感词的数量大于或等于1,且文本分类结果为不存在指定类型噪声,则输出识别出的敏感词,作为文本识别结果。In some embodiments, step S430 may specifically include: if the number of sensitive words identified from the text to be recognized is greater than or equal to 1, and the text classification result is that there is no specified type of noise, output the identified sensitive words as text Recognition results.
在该步骤中,若识别出敏感词,且确定不存在指定类型噪声,则确定敏感词的识别结果不是指定类型噪声造成的,因此,输出识别到的敏感词作为识别结果,结合指定类型噪声的是否存在的判定结果和敏感词识别结果,提升语音质检场景下敏感词识别的准确率。In this step, if the sensitive word is identified and it is determined that there is no noise of the specified type, it is determined that the recognition result of the sensitive word is not caused by the noise of the specified type. Therefore, the recognized sensitive word is output as the recognition result, combined with the result of the specified type of noise. The existence determination results and sensitive word recognition results improve the accuracy of sensitive word recognition in speech quality inspection scenarios.
在一些实施例中,步骤S430具体还可以包括:若从待识别文本识别出的敏感词的数量等于零,则确定待识别文本中不存在敏感词,并输出第一提示信息,第一提示信息用于指示敏感词为空;若从待识别文本识别出的敏感词的数量大于或等于1,且文本分类结果为存在指定类型噪声,确定识别出的敏感词是否由指定类型噪声导致,在确定识别出的敏感词是由指定类型噪声导致时输出第二提示信息,第二提示信息用于指示敏感词是由指定类型噪声导致的,并在确定识别出的敏感词不是由指定类型噪声导致时输出识别出的敏感词,作为文本识别结果。In some embodiments, step S430 may further include: if the number of sensitive words recognized from the text to be recognized is zero, determining that the text to be recognized contains no sensitive words and outputting first prompt information, where the first prompt information indicates that the sensitive word result is empty; if the number of sensitive words recognized from the text to be recognized is greater than or equal to 1 and the text classification result is that the specified type of noise exists, determining whether the recognized sensitive words are caused by the specified type of noise, outputting second prompt information when they are, where the second prompt information indicates that the sensitive words are caused by the specified type of noise, and outputting the recognized sensitive words as the text recognition result when they are not.
在该步骤中,若未识别出敏感词,则可以直接确定不存在敏感词,若识别出敏感词,且确定存在指定类型噪声,确定识别出的敏感词是否由指定类型噪声导致,并输出相应的提示信息以用于提示该处理结果,从而可以结合指定类型噪声的是否存在的判定结果和敏感词识别结果,提升语音质检场景下敏感词识别的准确率。In this step, if no sensitive word is identified, it can be directly determined that there is no sensitive word. If the sensitive word is identified and it is determined that the specified type of noise exists, determine whether the identified sensitive word is caused by the specified type of noise, and output the corresponding The prompt information is used to prompt the processing results, so that the determination results of the existence of specified types of noise and the sensitive word recognition results can be combined to improve the accuracy of sensitive word recognition in speech quality inspection scenarios.
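The branches of step S430 described above can be written out as a short decision function. This is a sketch under assumptions: the disclosure does not say here how to decide whether a recognized sensitive word was caused by the noise, so that check is passed in as a hypothetical callable `caused_by_noise`, and the two prompt strings are placeholder symbols.

```python
from typing import Callable, List, Sequence, Union

FIRST_PROMPT = "[]"   # placeholder: no sensitive words in the text
SECOND_PROMPT = "{}"  # placeholder: sensitive words attributed to noise


def text_recognition_result(
    sensitive_words: Sequence[str],
    noise_present: bool,
    caused_by_noise: Callable[[Sequence[str]], bool],
) -> Union[str, List[str]]:
    """Combine the sensitive word result with the noise classification result."""
    if len(sensitive_words) == 0:
        return FIRST_PROMPT
    if noise_present and caused_by_noise(sensitive_words):
        return SECOND_PROMPT
    # Either no noise was detected, or the words were not caused by it.
    return list(sensitive_words)


r_empty = text_recognition_result([], True, lambda words: True)
r_clean = text_recognition_result(["zero risk"], False, lambda words: True)
r_noise = text_recognition_result(["zero risk"], True, lambda words: True)
r_kept = text_recognition_result(["zero risk"], True, lambda words: False)
```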
图5示出本公开示例性实施例的文本识别方法的流程图。如图5所示,该文本识别方法可以包括如下步骤S501至S509。FIG. 5 shows a flowchart of a text recognition method according to an exemplary embodiment of the present disclosure. As shown in Figure 5, the text recognition method may include the following steps S501 to S509.
在步骤S501,输入待识别文本。In step S501, text to be recognized is input.
在一些实施例中,待识别文本为客服坐席与客户之间的对话文本中的客服坐席的任一条对话文本。 In some embodiments, the text to be recognized is any dialogue text of the customer service agent among the dialogue texts between the customer service agent and the customer.
在步骤S502,判断待识别文本中是否包含敏感词。In step S502, determine whether the text to be recognized contains sensitive words.
该步骤中,可以通过NER技术识别待识别文本中的敏感词。如果没有识别到敏感词,则执行步骤S503,若识别出敏感词,则执行步骤S504。In this step, sensitive words in the text to be recognized can be identified through NER technology. If the sensitive word is not recognized, step S503 is executed. If the sensitive word is recognized, step S504 is executed.
在步骤S503,输出第一提示信息。In step S503, the first prompt information is output.
该第一提示信息例如可以是第一提示符号,表示待识别文本中不包含敏感词。第一提示符号例如可以是“[]”。The first prompt information may be, for example, a first prompt symbol, indicating that the text to be recognized does not contain sensitive words. The first prompt symbol may be "[]", for example.
在一些实施例中,步骤S503之后可以返回步骤S501,进行下一条待识别文本的识别。In some embodiments, after step S503, you can return to step S501 to recognize the next text to be recognized.
在步骤S504,得到识别出的敏感词。In step S504, the recognized sensitive words are obtained.
该步骤中,可以将获取到的敏感词加入到敏感词列表中。In this step, the acquired sensitive words can be added to the sensitive word list.
在步骤S505,进行文本分类处理,以确定是否存在回传噪声。In step S505, text classification processing is performed to determine whether there is return noise.
通过文本分类处理,进行待识别文本中话术回传噪声存在与否的判定,即判定待识别文本中是否存在“话术回传”类型的噪声干扰现象。如果待识别文本中存在“话术回传”现象,则执行步骤S506;如果待识别文本中不存在“话术回传”现象,则执行步骤S508。Through text classification processing, it is determined whether speech return noise exists in the text to be recognized, that is, whether noise interference of the "speech return" type exists in the text to be recognized. If the "speech return" phenomenon exists in the text to be recognized, step S506 is executed; if it does not, step S508 is executed.
在步骤S506,确定存在话术回传,并执行步骤S507。In step S506, it is determined that speech return exists, and step S507 is executed.
在步骤S507,输出第二提示信息。In step S507, second prompt information is output.
该第二提示信息例如可以是第二提示符号,第二提示符号表示待识别文本中的敏感词是由于“话术回传”导致的,因此不输出该敏感词;第二提示符号可以与第一提示符号为不同符号,示例性地,第二提示符号例如可以是“{}”。The second prompt information may be, for example, a second prompt symbol, which indicates that the sensitive word in the text to be recognized is caused by "speech return" and is therefore not output. The second prompt symbol may differ from the first prompt symbol; for example, the second prompt symbol may be "{}".
在一些实施例中,步骤S507之后,可以返回步骤S501,进行下一条待识别文本的识别。In some embodiments, after step S507, step S501 may be returned to recognize the next text to be recognized.
在步骤S508,确定不存在话术回传,并执行步骤S509。In step S508, it is determined that speech return does not exist, and step S509 is executed.
在步骤S509,输出识别到的敏感词。In step S509, the recognized sensitive words are output.
在该步骤中,在待识别文本中存在敏感词且确定不存在话术回传的情况下,输出待识别文本中的敏感词。In this step, if there is a sensitive word in the text to be recognized and it is determined that there is no speech return, the sensitive word in the text to be recognized is output.
在一些实施例中,对话文本包括:目标对象的对话文本和与目标对象对话的对话对象的对话文本,待识别文本为目标对象的对话文本之一。In some embodiments, the dialogue text includes the dialogue text of the target object and the dialogue text of the dialogue object conversing with the target object, and the text to be recognized is one of the dialogue texts of the target object.
该文本识别方法还包括:在由获取的通话语音信息转换得到的对话文本中,获取新的待识别文本,生成新的文本识别结果,直到获取次数等于目标对象的对话文本的文本条数,得到目标对象的对话文本的基于噪声是否存在的文本识别结果。The text recognition method further includes: obtaining a new text to be recognized from the dialogue text converted from the acquired call voice information and generating a new text recognition result, until the number of acquisitions equals the number of texts in the target object's dialogue text, thereby obtaining text recognition results, based on whether noise exists, for the target object's dialogue text.
在该实施例中,可以依次将对话文本中的每条文本作为待识别文本进行上述文本识别处理,直到获取最后一条待识别文本进行文本识别处理,得到目标对象的对话文本中全部文本的基于噪声是否存在的文本识别结果,完成语音质检场景下目标对象的对话文本中敏感词的识别,该策略提升了敏感词的识别准确率。In this embodiment, each text in the dialogue text can be used as a text to be recognized for the above-mentioned text recognition processing in turn, until the last text to be recognized is obtained for text recognition processing, and a noise-based noise-based analysis of all texts in the dialogue text of the target object is obtained. Based on the existing text recognition results, the recognition of sensitive words in the dialogue text of the target object in the speech quality inspection scenario is completed. This strategy improves the recognition accuracy of sensitive words.
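Applying the flow of Figure 5 to every text of the target object in turn can be sketched as a loop. This is an illustration under assumptions: the dialogue is represented as a list of dictionaries with hypothetical `speaker` and `text` keys, the sensitive word recognizer and the noise classifier are passed in as callables, and the stand-ins in the example (including the rule that an agent line repeating the previous customer line is an echo) are made up for demonstration.

```python
from typing import Callable, Dict, List, Union

Result = Union[str, List[str]]
Dialogue = List[Dict[str, str]]


def recognize_dialogue(
    dialogue: Dialogue,
    find_sensitive_words: Callable[[str], List[str]],
    has_speech_return: Callable[[Dialogue, int], bool],
    target_speaker: str = "agent",
) -> List[Result]:
    """Produce one recognition result per text of the target speaker."""
    results: List[Result] = []
    for i, turn in enumerate(dialogue):
        if turn["speaker"] != target_speaker:
            continue
        words = find_sensitive_words(turn["text"])  # S502
        if not words:
            results.append("[]")                    # S503: first prompt
        elif has_speech_return(dialogue, i):        # S505
            results.append("{}")                    # S506 -> S507
        else:
            results.append(list(words))             # S508 -> S509
    return results


def find_words(text: str) -> List[str]:
    """Hypothetical recognizer with a single sensitive term."""
    return ["zero risk"] if "zero risk" in text else []


def echo_of_previous_turn(dialogue: Dialogue, i: int) -> bool:
    """Hypothetical classifier: an exact repeat of the other party's last line."""
    if i == 0:
        return False
    prev, cur = dialogue[i - 1], dialogue[i]
    return prev["speaker"] != cur["speaker"] and prev["text"] == cur["text"]


dialogue = [
    {"speaker": "customer", "text": "is it zero risk"},
    {"speaker": "agent", "text": "is it zero risk"},
    {"speaker": "agent", "text": "hello, how can I help"},
    {"speaker": "agent", "text": "yes, zero risk"},
]
results = recognize_dialogue(dialogue, find_words, echo_of_previous_turn)
```

The loop ends once every text of the target speaker has been processed, so the number of results equals the number of the target object's texts.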
在一些实施例中,目标对象的对话文本为客服坐席的对话文本,对话对象的对话文本为客户或用户的对话文本。In some embodiments, the conversation text of the target object is the conversation text of the customer service agent, and the conversation text of the conversation object is the conversation text of the customer or user.
在该实施例中,通过该文本识别方法,可以实现完成语音质检场景下敏感词的识别,该策略提升了敏感词识别的准确率,减少对坐席使用的话术不符合预定话术规则的错误判定。In this embodiment, through this text recognition method, it is possible to complete the recognition of sensitive words in a speech quality inspection scenario. This strategy improves the accuracy of sensitive word recognition and reduces errors when agents use speech techniques that do not comply with predetermined speech rules. determination.
根据本公开实施例的文本识别方法,以指定类型噪声为话术回传噪声为例,通过预先构建文本类特征和画像类特征,获取待识别文本基于所构建各特征的特征值,然后根据上述构建的特征的特征值进行话术回传判定,以确定是否存在话术回传噪声,基于话术回传噪声的情况判定结果和敏感词识别结果,综合判定目标对象的对话文本是否存在敏感词,提升语音质检场景下敏感词识别的准确率。According to the text recognition method of the embodiment of the present disclosure, taking the specified type of noise as speech return noise as an example, by pre-constructing text features and portrait features, the feature values of the text to be recognized based on the constructed features are obtained, and then according to the above The eigenvalues of the constructed features are used to determine whether there is speech return noise to determine whether there is speech return noise. Based on the judgment results of the speech return noise and the sensitive word recognition results, it is comprehensively determined whether there are sensitive words in the dialogue text of the target object. , improve the accuracy of sensitive word recognition in speech quality inspection scenarios.
此外,本公开还提供了文本分类装置、文本识别装置、电子设备、计算机可读存储介质,文本分类装置可用来实现本公开提供的任一种文本分类方法,文本识别装置可用来实现本公开提供的任一种文本识别方法,电子设备、计算机可读存储介质上述均可用来实现本公开提供的任一种文本分类方法或任一种文本识别方法,相应技术方案和描述和参见方法部分的相应记载,不再赘述。In addition, the present disclosure also provides a text classification device, a text recognition device, an electronic device, and a computer-readable storage medium. The text classification device can be used to implement any text classification method provided by the present disclosure, the text recognition device can be used to implement any text recognition method provided by the present disclosure, and the electronic device and the computer-readable storage medium can each be used to implement any text classification method or any text recognition method provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding passages in the method section, which are not repeated here.
图6为本公开实施例提供的文本分类装置的框图。Figure 6 is a block diagram of a text classification device provided by an embodiment of the present disclosure.
参照图6,本公开实施例提供了一种文本分类装置,该文本分类装置600包括如下模块。 Referring to Figure 6, an embodiment of the present disclosure provides a text classification device. The text classification device 600 includes the following modules.
获取模块610,用于获取待分类文本。Obtaining module 610 is used to obtain text to be classified.
特征值生成模块620,用于基于预设的文本类特征和待分类文本,生成待分类文本的文本类特征的特征值。The feature value generation module 620 is configured to generate feature values of the text feature of the text to be classified based on the preset text feature and the text to be classified.
分类确定模块630,用于根据文本类特征的特征值,对待分类文本进行文本分类处理,得到文本分类结果,文本分类结果用于指示指定类型噪声是否存在。The classification determination module 630 is configured to perform text classification processing on the text to be classified according to the feature value of the text class feature, and obtain a text classification result. The text classification result is used to indicate whether the specified type of noise exists.
在一些实施例中,预设的文本类特征包括至少一个文本类特征。特征值生成模块620具体可以包括:规则确定单元,用于根据预设的至少一个文本类特征,确定每个文本类特征的取值规则;值生成单元,用于基于每个文本类特征的取值规则,生成待分类文本中对应每个文本类特征的特征值。In some embodiments, the preset text-like features include at least one text-like feature. The feature value generation module 620 may specifically include: a rule determination unit, configured to determine the value rule for each text feature based on at least one preset text feature; and a value generation unit, configured to determine the value rule based on each text feature. Value rules are used to generate feature values corresponding to each text class feature in the text to be classified.
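The split between the rule determination unit and the value generation unit can be pictured as a lookup followed by an evaluation. The sketch below is only an illustration: the two features in it (`char_count` and `ends_with_punctuation`) and their rules are hypothetical placeholders, not the text-type features or value rules defined by the expressions of the disclosure.

```python
from typing import Callable, Dict, List

Rule = Callable[[str], float]


def determine_rules(feature_names: List[str]) -> Dict[str, Rule]:
    """Rule determination unit: look up a value rule for each preset feature."""
    available: Dict[str, Rule] = {
        # Hypothetical rules, for illustration only.
        "char_count": lambda text: float(len(text)),
        "ends_with_punctuation": lambda text: float(
            text.rstrip().endswith((".", "?", "!"))
        ),
    }
    return {name: available[name] for name in feature_names}


def generate_values(text: str, rules: Dict[str, Rule]) -> Dict[str, float]:
    """Value generation unit: apply each rule to the text to be classified."""
    return {name: rule(text) for name, rule in rules.items()}


rules = determine_rules(["char_count", "ends_with_punctuation"])
values = generate_values("Hello.", rules)
```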
在一些实施例中,待分类文本是从预先获取的对话文本中选取的文本。文本类特征包括如下特征项的至少一项:敏感词分布特征、文本自身预定特征、与对话文本相关的预定特征;其中,敏感词分布特征,用于表征敏感词在对话文本中的分布;文本自身预定特征,用于表征待分类文本自身的预定特征;与对话文本相关的预定特征,用于表征待分类文本的与对话文本相关的预定特征。In some embodiments, the text to be classified is text selected from pre-obtained conversation texts. Text-type features include at least one of the following features: sensitive word distribution features, predetermined features of the text itself, and predetermined features related to the dialogue text; among them, the sensitive word distribution features are used to characterize the distribution of sensitive words in the dialogue text; text The predetermined features of the text itself are used to characterize the predetermined features of the text to be classified; the predetermined features related to the dialogue text are used to characterize the predetermined features of the text to be classified that are related to the dialogue text.
In some embodiments, the dialogue text includes the dialogue texts of a target object and the dialogue texts of a dialogue object conversing with the target object, both generated during a single call between the target object and the dialogue object; the text to be classified is one of the dialogue texts of the target object.
In some embodiments, the at least one text-type feature includes the sensitive word distribution feature. The rule determination unit is specifically configured to determine the value rule of at least one of the following text-type features included in the sensitive word distribution feature: a first text-type feature, which characterizes whether, among the dialogue texts of the target object, sensitive words exist only in the text to be classified; a second text-type feature, which characterizes whether a sensitive word in the text to be classified appears in the dialogue texts of the dialogue object; a third text-type feature, which characterizes whether any sensitive word exists in the dialogue texts of the dialogue object; and a fourth text-type feature, which characterizes whether a sensitive word exists in a predetermined dialogue text and whether the sensitive word in the predetermined dialogue text is consistent with the sensitive word in the text to be classified. The predetermined dialogue text is one of the dialogue texts of the dialogue object and is adjacent to the text to be classified.
The value generation unit is specifically configured to generate, based on at least one of the value rules of the first, second, third, and fourth text-type features, the feature value of the text to be classified for at least one of the first, second, third, and fourth text-type features.
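One possible set of value rules for the first to fourth text-type features is sketched below. The encodings are assumptions for illustration: sensitive words are matched by plain substring search, features one to three are binary, "consistent" is taken to mean that at least one sensitive word is shared, and the fourth feature is encoded as 0 (no sensitive word in an adjacent dialogue-object text), 1 (present but different), or 2 (present and shared). The speaker labels are likewise hypothetical.

```python
from typing import List, Set, Tuple

Turn = Tuple[str, str]  # (speaker, text); speaker labels are hypothetical


def found_words(text: str, lexicon: Set[str]) -> Set[str]:
    """Sensitive words from the lexicon that occur in the text (substring match)."""
    return {word for word in lexicon if word in text}


def sensitive_distribution(turns: List[Turn], idx: int, lexicon: Set[str]) -> List[int]:
    """Return assumed values of the first to fourth text-type features for turns[idx]."""
    target_speaker, text = turns[idx]
    here = found_words(text, lexicon)

    other_target: Set[str] = set()  # sensitive words in the target object's other texts
    counterpart: Set[str] = set()   # sensitive words in the dialogue object's texts
    for i, (speaker, other_text) in enumerate(turns):
        if i == idx:
            continue
        if speaker == target_speaker:
            other_target |= found_words(other_text, lexicon)
        else:
            counterpart |= found_words(other_text, lexicon)

    # Dialogue-object texts directly before or after the text to be classified.
    adjacent: Set[str] = set()
    for j in (idx - 1, idx + 1):
        if 0 <= j < len(turns) and turns[j][0] != target_speaker:
            adjacent |= found_words(turns[j][1], lexicon)

    first = int(bool(here) and not other_target)
    second = int(bool(here & counterpart))
    third = int(bool(counterpart))
    fourth = 0 if not adjacent else (2 if adjacent & here else 1)
    return [first, second, third, fourth]
```

For a three-turn call in which the customer says "that is a scam" and the agent immediately answers "this is not a scam", the agent's reply yields `[1, 1, 1, 2]`: the sensitive word occurs in only one agent text, and it also occurs in the adjacent customer text.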
In some embodiments, the at least one text-type feature includes the predetermined feature of the text itself. The rule determination unit is specifically configured to determine the value rule of at least one of the following text-type features included in the predetermined feature of the text itself: a fifth text-type feature, which characterizes sentence completeness information of the text to be classified; and a sixth text-type feature, which characterizes the total number of times a specific term appears at a specified position in the dialogue texts of the target object.
The value generation unit is specifically configured to generate, based on at least one of the value rules of the fifth and sixth text-type features, the feature value of the text to be classified for at least one of the fifth and sixth text-type features.
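As a rough sketch of the fifth and sixth text-type features, the rules below make two assumptions that this passage does not fix: a sentence is treated as complete when it ends with a terminating punctuation mark, and the "specified position" is taken to be the start of an utterance.

```python
from typing import Sequence

# Assumed sentence terminators (Chinese and ASCII).
TERMINATORS = ("。", "？", "！", ".", "?", "!")


def sentence_completeness(text: str) -> int:
    """Fifth feature (assumed rule): 1 if the text ends with a terminator, else 0."""
    return int(text.rstrip().endswith(TERMINATORS))


def term_at_start_count(target_texts: Sequence[str], terms: Sequence[str]) -> int:
    """Sixth feature (assumed rule): number of target-object texts that
    begin with one of the given specific terms."""
    prefixes = tuple(terms)
    return sum(1 for text in target_texts if text.lstrip().startswith(prefixes))
```

A real value rule for sentence completeness could rely on syntactic analysis instead of punctuation; the punctuation check is only the simplest stand-in.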
In some embodiments, the at least one text-type feature includes the predetermined feature related to the dialogue text. The rule determination unit is specifically configured to determine the value rule of at least one of the following text-type features included in the predetermined feature related to the dialogue text: a seventh text-type feature, which characterizes the number of texts in the dialogue text to which the text to be classified belongs; and an eighth text-type feature, which characterizes the position at which the text to be classified appears in the dialogue text.
The value generation unit is specifically configured to generate, based on at least one of the value rules of the seventh and eighth text-type features, the feature value of the text to be classified for at least one of the seventh and eighth text-type features.
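The seventh and eighth text-type features can be computed directly from the dialogue. In this sketch the position is normalized to the range (0, 1]; that normalization is an assumption, and an absolute index would serve equally well.

```python
from typing import List, Tuple


def dialogue_related_values(dialogue: List[str], idx: int) -> Tuple[int, float]:
    """Return (seventh, eighth) feature values for dialogue[idx].

    seventh: number of texts in the dialogue.
    eighth: 1-based position of the text divided by that number (assumed encoding).
    """
    if not 0 <= idx < len(dialogue):
        raise IndexError("idx must point at a text inside the dialogue")
    count = len(dialogue)
    relative_position = (idx + 1) / count
    return count, relative_position
```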
In some embodiments, the text to be classified belongs to the dialogue texts of the target object. The classification determination module 630 is specifically configured to: obtain, based on preset portrait-type features, feature values of the portrait-type features of the target object corresponding to the text to be classified, where the portrait-type features characterize individual characteristics of the target object; and perform text classification on the text to be classified according to the feature values of the text-type features and the feature values of the portrait-type features, obtaining the text classification result.
In some embodiments, the preset portrait-type features include at least one portrait-type feature. When obtaining the feature values of the portrait-type features based on the preset portrait-type features, the classification determination module 630 is specifically configured to: determine a value rule for each portrait-type feature according to the at least one preset portrait-type feature; and obtain, based on the value rule of each portrait-type feature, the feature value of each portrait-type feature of the target object corresponding to the text to be classified.
In some embodiments, the target object is a customer service agent. The individual characteristics represent at least one of the following items: agent level; agent length of service; the number of times, within a predetermined statistical period, that the agent's script failed to comply with predetermined script rules; and a historical record of whether the agent was penalized with a loss of benefits because sensitive words in the text to be classified caused the agent's script to violate the predetermined script rules.
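The four agent-related information items can be held in a small record and flattened into numeric feature values. The field names and the numeric encoding are illustrative assumptions; the disclosure lists the items but does not prescribe how they are encoded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentProfile:
    """Hypothetical record of the individual characteristics of an agent."""
    level: int                         # agent level
    years_of_service: float            # agent length of service
    violations_in_period: int          # script violations in the statistical period
    penalized_for_sensitive_word: bool # history of a loss-of-benefits penalty


def portrait_values(profile: AgentProfile) -> List[float]:
    """Flatten the profile into portrait-type feature values (assumed encoding)."""
    return [
        float(profile.level),
        float(profile.years_of_service),
        float(profile.violations_in_period),
        1.0 if profile.penalized_for_sensitive_word else 0.0,
    ]
```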
In some embodiments, the classification determination module 630 is specifically configured to: process the feature values of the text-type features with a first classification model to obtain a first text category of the text to be classified, the first classification model being a model trained in advance on sample texts; and generate the text classification result according to a predetermined correspondence between the value of the first text category and whether a predetermined type of noise is present.
In some embodiments, when performing text classification on the text to be classified according to the feature values of the text-type features and the feature values of the portrait-type features to obtain the text classification result, the classification determination module 630 is specifically configured to: process the feature values of the text-type features and the feature values of the portrait-type features with a second classification model to obtain a second text category of the text to be classified, the second classification model being a model trained in advance on sample texts; and generate the text classification result according to a predetermined correspondence between the value of the second text category and whether a predetermined type of noise is present.
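The two-stage mapping described above (feature values to text category, then text category to noise present or absent) can be sketched as follows. The `LinearClassifier` is only a stand-in for a pre-trained model; its weights are made up for the example, and the disclosure does not state which model family is used. The `CATEGORY_TO_NOISE` table is likewise an assumed form of the predetermined correspondence.

```python
from typing import Dict, List, Sequence

# Assumed predetermined correspondence: category value -> noise present?
CATEGORY_TO_NOISE: Dict[int, bool] = {0: False, 1: True}


class LinearClassifier:
    """Stand-in for a classification model trained in advance on sample texts."""

    def __init__(self, weights: Sequence[float], bias: float) -> None:
        self.weights = list(weights)
        self.bias = bias

    def predict(self, features: Sequence[float]) -> int:
        if len(features) != len(self.weights):
            raise ValueError("feature vector length does not match the model")
        score = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return int(score >= 0.0)  # category 1 when the score is non-negative


def classify(model: LinearClassifier,
             text_values: Sequence[float],
             portrait_values: Sequence[float] = ()) -> bool:
    """Return True if the specified type of noise is judged to be present."""
    features: List[float] = list(text_values) + list(portrait_values)
    category = model.predict(features)
    return CATEGORY_TO_NOISE[category]
```

Calling `classify` with text-type values only corresponds to the first classification model; passing portrait-type values as well corresponds to the second one, which expects the longer feature vector.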
The text classification apparatus according to the embodiments of the present disclosure can generate feature values of the text-type features of a text to be classified based on preset text-type features and the text to be classified, and perform text classification on the generated feature values to obtain a text classification result, from which it can be determined whether a specified type of noise is present in the text to be classified. Because the presence of the specified type of noise is determined from text-type features, interference from noisy data can be reduced during text recognition on the basis of the classification result, which helps to obtain an objective text recognition result.
Figure 7 is a block diagram of a text recognition apparatus provided by an embodiment of the present disclosure.
Referring to Figure 7, an embodiment of the present disclosure provides a text recognition apparatus. The text recognition apparatus 700 includes the following modules.
The word recognition module 710 is configured to perform sensitive word recognition on an acquired text to be recognized, obtaining a sensitive word recognition result.
The classification module 720 is configured to perform text classification on the text to be recognized according to the feature values of the text-type features of the text to be recognized, generating a text classification result that indicates whether a specified type of noise is present.
The result generation module 730 is configured to generate a text recognition result for the text to be recognized according to the sensitive word recognition result and the text classification result.
In some embodiments, the classification module 720 is specifically configured to perform text classification on the text to be recognized according to the text classification method of any of the above embodiments of the present disclosure, obtaining the text classification result.
In some embodiments, the text to be recognized is one of the dialogue texts of the target object obtained from the dialogue text. The classification module 720 is specifically configured to: obtain feature values of the portrait-type features of the text to be recognized, where the portrait-type features characterize individual characteristics of the target object; and perform text classification on the text to be recognized according to the feature values of the text-type features and the feature values of the portrait-type features, obtaining the text classification result.
In some embodiments, the result generation module 730 is specifically configured to: if the number of sensitive words recognized from the text to be recognized is greater than or equal to 1 and the text classification result indicates that the specified type of noise is absent, output the recognized sensitive words as the text recognition result.
In some embodiments, the result generation module 730 is further configured to: if the number of sensitive words recognized from the text to be recognized equals zero, determine that the text to be recognized contains no sensitive word and output first prompt information, which indicates that the sensitive word input is empty; and if the number of sensitive words recognized from the text to be recognized is greater than or equal to 1 and the text classification result indicates that the specified type of noise is present, determine whether the recognized sensitive words are caused by the specified type of noise. When the recognized sensitive words are determined to be caused by the specified type of noise, second prompt information is output, indicating that the sensitive words are caused by the specified type of noise; when they are determined not to be caused by the specified type of noise, the recognized sensitive words are output as the text recognition result.
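The branching performed by the result generation module can be summarized in a short function. This passage does not say how the module decides whether a recognized sensitive word was caused by noise, so that check is passed in as a hypothetical callable (`caused_by_noise`); the prompt strings and the dictionary-shaped result are also assumptions.

```python
from typing import Callable, Dict, List, Optional

EMPTY_MESSAGE = "no sensitive word found"                # first prompt information
NOISE_MESSAGE = "sensitive word attributed to noise"     # second prompt information

Result = Dict[str, Optional[object]]


def generate_result(words: List[str],
                    noise_present: bool,
                    caused_by_noise: Callable[[List[str]], bool]) -> Result:
    """Combine the sensitive word recognition result with the classification result."""
    if not words:
        # Zero sensitive words recognized: output the first prompt information.
        return {"words": [], "message": EMPTY_MESSAGE}
    if not noise_present:
        # At least one word and no specified-type noise: output the words.
        return {"words": words, "message": None}
    if caused_by_noise(words):
        # Noise present and responsible for the hit: output the second prompt.
        return {"words": [], "message": NOISE_MESSAGE}
    # Noise present but not responsible: the recognized words still stand.
    return {"words": words, "message": None}
```

The noise-attribution check is only consulted on the branch where the classifier reports noise, so texts classified as noise-free never pay for it.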
In some embodiments, the dialogue text includes the dialogue texts of a target object and the dialogue texts of a dialogue object conversing with the target object, both generated during a single call between the target object and the dialogue object; the text to be recognized is one of the dialogue texts of the target object. The text recognition apparatus further includes an acquisition module, configured to acquire a new text to be recognized from the dialogue text obtained by converting acquired call speech. The result generation module 730 is further configured to generate a text recognition result for each new text to be recognized, until the number of acquisitions equals the number of dialogue texts of the target object, whereby the text recognition result of the dialogue text is obtained.
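The acquisition loop amounts to visiting each of the target object's texts exactly once. In the sketch below, `recognise_one` is a hypothetical callable standing in for the per-text pipeline (word recognition, classification, result generation), and the `(speaker, text)` tuple layout is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str]  # (speaker, text); layout is hypothetical


def recognise_dialogue(turns: List[Turn],
                       target_speaker: str,
                       recognise_one: Callable[[List[Turn], int], Dict]) -> List[Dict]:
    """Run per-text recognition once for every text of the target object."""
    target_indices = [i for i, (speaker, _) in enumerate(turns)
                      if speaker == target_speaker]
    results: List[Dict] = []
    # The loop ends when the number of acquisitions equals the number of
    # dialogue texts of the target object.
    for idx in target_indices:
        results.append(recognise_one(turns, idx))
    return results
```

Passing the whole dialogue together with an index, rather than the isolated text, lets the per-text step compute features that depend on the neighbouring texts.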
In some embodiments, the dialogue texts of the target object are the dialogue texts of a customer service agent.
With this text recognition apparatus, the feature values of the text-type features of the text to be recognized make it possible to determine effectively whether a specific type of noise data is present in the dialogue text, and that determination is then used to assist sensitive word recognition, improving its accuracy. The text recognition method proposed in the present disclosure works at the text level: it reduces the adverse effect that speech return noise and transcription errors have on text recognition results, effectively reduces misjudgments of sensitive word recognition results when a predetermined type of noise is present, and thereby improves the accuracy of sensitive word recognition.
Figure 8 is a block diagram of an electronic device provided by an embodiment of the present disclosure.
Referring to Figure 8, an embodiment of the present disclosure provides an electronic device, which includes: at least one processor 801; at least one memory 802; and one or more I/O interfaces 803 connected between the processor 801 and the memory 802. The memory 802 stores one or more computer programs executable by the at least one processor 801; when executed, the one or more computer programs enable the at least one processor 801 to perform the above text classification method or any of the above text recognition methods.
An embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor or processing core, the computer program implements the above text classification method or any of the above text recognition methods. The computer-readable storage medium may be volatile or non-volatile.
An embodiment of the present disclosure also provides a computer program. When the computer program runs on a processor of an electronic device, the processor performs the above text classification method or any of the above text recognition methods.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and apparatuses disclosed above, may be implemented as software, firmware, hardware, or any suitable combination thereof. In a hardware implementation, the division between the functional modules/units mentioned above does not necessarily correspond to a division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
As is well known to those of ordinary skill in the art, the term computer storage media covers volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable program instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), static random access memory (SRAM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. It is also well known to those of ordinary skill in the art that communication media typically carry computer-readable program instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery medium.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within that computing/processing device.
Computer program instructions for carrying out the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is personalized using state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement aspects of the present disclosure.
The computer program described herein may be implemented in hardware, software, or a combination thereof. In an optional embodiment, the computer program is embodied as a software product, such as a software development kit (SDK).
Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium, where they cause a computer, a programmable data processing apparatus, and/or other devices to operate in a particular manner, so that the computer-readable medium storing the instructions constitutes an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (20)

  1. A text classification method, comprising:
    acquiring a text to be classified;
    generating, based on preset text-type features and the text to be classified, feature values of the text-type features of the text to be classified; and
    performing text classification on the text to be classified according to the feature values of the text-type features to obtain a text classification result, the text classification result indicating whether a specified type of noise is present.
  2. The text classification method according to claim 1, wherein the preset text-type features include at least one text-type feature, and
    wherein generating, based on the preset text-type features and the text to be classified, the feature values of the text-type features of the text to be classified comprises:
    determining, according to the at least one text-type feature, a value rule for each text-type feature of the at least one text-type feature; and
    generating, based on the value rule of each text-type feature, a feature value of each text-type feature of the text to be classified.
  3. The text classification method according to claim 2, wherein the text to be classified is one dialogue text selected from pre-acquired dialogue texts,
    wherein the at least one text-type feature includes at least one of the following text-type features:
    a sensitive word distribution feature, which characterizes the distribution of sensitive words in the dialogue text;
    a predetermined feature of the text itself, which characterizes a predetermined property of the text to be classified itself; and
    a predetermined feature related to the dialogue text, which characterizes a predetermined property of the text to be classified in relation to the dialogue text.
  4. The text classification method according to claim 3, wherein the dialogue text includes dialogue texts of a target object and dialogue texts of a dialogue object conversing with the target object, both generated during a single call between the target object and the dialogue object, and the text to be classified is one of the dialogue texts of the target object.
  5. The text classification method according to claim 4, wherein the at least one text-type feature comprises the sensitive word distribution feature,
    wherein determining, according to the at least one text-type feature, the value rule for each text-type feature of the at least one text-type feature comprises: determining a value rule for at least one of a first text-type feature, a second text-type feature, a third text-type feature and a fourth text-type feature included in the sensitive word distribution feature, wherein the first text-type feature characterizes whether, among the dialogue texts of the target object, only the text to be classified contains a sensitive word; the second text-type feature characterizes whether a sensitive word in the text to be classified appears in the dialogue texts of the dialogue object; the third text-type feature characterizes whether a sensitive word exists in the dialogue texts of the dialogue object; and the fourth text-type feature characterizes whether a sensitive word exists in a predetermined dialogue text and whether the sensitive word in the predetermined dialogue text is consistent with the sensitive word in the text to be classified, the predetermined dialogue text being one of the dialogue texts of the dialogue object and being adjacent to the text to be classified, and
    wherein generating, based on the value rule of each text-type feature, the feature value of each text-type feature of the text to be classified comprises: generating, based on at least one of the value rule of the first text-type feature, the value rule of the second text-type feature, the value rule of the third text-type feature and the value rule of the fourth text-type feature, a feature value of the text to be classified corresponding to at least one of the first text-type feature, the second text-type feature, the third text-type feature and the fourth text-type feature.
  6. The text classification method according to claim 4, wherein the at least one text-type feature comprises the predetermined feature of the text itself,
    wherein determining, according to the at least one text-type feature, the value rule for each text-type feature of the at least one text-type feature comprises: determining a value rule for at least one of a fifth text-type feature and a sixth text-type feature included in the predetermined feature of the text itself, wherein the fifth text-type feature characterizes sentence completeness information of the text to be classified, and the sixth text-type feature characterizes the total number of times that specific expressions appear at specified positions in the dialogue texts of the target object, and
    wherein generating, based on the value rule of each text-type feature, the feature value of each text-type feature of the text to be classified comprises: generating, based on at least one of the value rule of the fifth text-type feature and the value rule of the sixth text-type feature, a feature value of the text to be classified corresponding to at least one of the fifth text-type feature and the sixth text-type feature.
  7. The text classification method according to claim 4, wherein the at least one text-type feature comprises the predetermined feature related to the dialogue text,
    wherein determining, according to the preset at least one text-type feature, the value rule for each text-type feature of the at least one text-type feature comprises: determining a value rule for at least one of a seventh text-type feature and an eighth text-type feature included in the predetermined feature related to the dialogue text, wherein the seventh text-type feature characterizes the total number of texts contained in the dialogue text to which the text to be classified belongs, and the eighth text-type feature characterizes the position at which the text to be classified appears in the dialogue text, and
    wherein generating, based on the value rule of each text-type feature, the feature value of each text-type feature of the text to be classified comprises: generating, based on at least one of the value rule of the seventh text-type feature and the value rule of the eighth text-type feature, a feature value of the text to be classified corresponding to at least one of the seventh text-type feature and the eighth text-type feature.
  8. The text classification method according to claim 1, wherein the text to be classified is one of the dialogue texts of a target object,
    wherein performing text classification processing on the text to be classified according to the feature value of the text-type feature, to obtain the text classification result, comprises:
    obtaining, based on a preset portrait-type feature, a feature value of the portrait-type feature of the text to be classified that corresponds to the target object, the portrait-type feature characterizing individual characteristics of the target object; and
    performing text classification processing on the text to be classified according to the feature value of the text-type feature and the feature value of the portrait-type feature, to obtain the text classification result.
  9. The text classification method according to claim 8, wherein the preset portrait-type feature comprises at least one portrait-type feature, and
    wherein obtaining, based on the preset portrait-type feature, the feature value of the portrait-type feature of the text to be classified that corresponds to the target object comprises:
    determining, according to the at least one portrait-type feature, a value rule for each portrait-type feature; and
    obtaining, based on the value rule of each portrait-type feature, a feature value of each portrait-type feature of the text to be classified that corresponds to the target object.
  10. The text classification method according to claim 1, wherein performing text classification processing on the text to be classified according to the feature value of the text-type feature, to obtain the text classification result, comprises:
    processing the feature value of the text-type feature with a first classification model to obtain a first text category of the text to be classified, the first classification model being a pre-trained model; and
    generating the text classification result according to a predetermined correspondence between the value of the first text category and whether a predetermined type of noise exists.
  11. The text classification method according to claim 8, wherein performing text classification processing on the text to be classified according to the feature value of the text-type feature and the feature value of the portrait-type feature, to obtain the text classification result, comprises:
    processing the feature value of the text-type feature and the feature value of the portrait-type feature with a second classification model to obtain a second text category of the text to be classified, the second classification model being a pre-trained model; and
    generating the text classification result according to a predetermined correspondence between the value of the second text category and whether a predetermined type of noise exists.
  12. A text recognition method, comprising:
    performing sensitive word recognition on acquired text to be recognized, to obtain a sensitive word recognition result;
    performing text classification processing on the text to be recognized according to a feature value of a text-type feature of the text to be recognized, to generate a text classification result, the text classification result indicating whether a specified type of noise exists; and
    generating a text recognition result of the text to be recognized according to the sensitive word recognition result and the text classification result.
  13. The text recognition method according to claim 12, wherein the text classification processing is performed on the text to be recognized using the text classification method according to any one of claims 1 to 11, to obtain the text classification result.
  14. The text recognition method according to claim 12 or 13, wherein generating the text recognition result of the text to be recognized according to the sensitive word recognition result and the text classification result comprises:
    if the number of sensitive words recognized from the text to be recognized is equal to zero, determining that no sensitive word exists in the text to be recognized, and inputting first prompt information, the first prompt information indicating that the sensitive words are empty;
    if the number of sensitive words recognized from the text to be recognized is greater than or equal to 1, and the text classification result is that the specified type of noise does not exist, outputting the recognized sensitive words as the text recognition result; and
    if the number of sensitive words recognized from the text to be recognized is greater than or equal to 1, and the text classification result is that the specified type of noise exists, determining that the recognized sensitive words are caused by the specified type of noise, and outputting second prompt information, the second prompt information indicating that the sensitive words are caused by the specified type of noise.
  15. A text classification apparatus, comprising:
    an acquisition module configured to acquire text to be classified;
    a feature value generation module configured to generate a feature value of a text-type feature of the text to be classified based on a preset text-type feature and the text to be classified; and
    a classification determination module configured to perform text classification processing on the text to be classified according to the feature value of the text-type feature, to obtain a text classification result, the text classification result indicating whether a specified type of noise exists.
  16. A text recognition apparatus, comprising:
    a word recognition module configured to perform sensitive word recognition on acquired text to be recognized, to obtain a sensitive word recognition result;
    a classification module configured to perform text classification processing on the text to be recognized according to a feature value of a text-type feature of the text to be recognized, to generate a text classification result, the text classification result indicating whether a specified type of noise exists; and
    a result generation module configured to generate a text recognition result of the text to be recognized according to the sensitive word recognition result and the text classification result.
  17. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor,
    wherein the memory stores one or more computer programs executable by the at least one processor, so that the at least one processor can perform the text classification method according to any one of claims 1 to 11.
  18. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor,
    wherein the memory stores one or more computer programs executable by the at least one processor, so that the at least one processor can perform the text recognition method according to any one of claims 12 to 14.
  19. A non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the text classification method according to any one of claims 1 to 11.
  20. A non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the text recognition method according to any one of claims 12 to 14.
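
As an illustration only, the value rules recited in claims 5 and 7 might be sketched as follows. The claims do not specify a data model, a sensitive-word matcher or a numeric encoding, so everything here is an assumption: the `Utterance` type, the speaker labels, plain substring matching against a lexicon, the 0/1/2 encoding of the fourth feature, and the choice of the nearest neighbouring utterance of the dialogue object as the "predetermined dialogue text" are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    speaker: str  # hypothetical labels, e.g. "target" or "other"
    text: str


def find_sensitive(text: str, lexicon: set[str]) -> set[str]:
    """Return the lexicon entries occurring in text (assumed substring match)."""
    return {word for word in lexicon if word in text}


def text_features(call: list[Utterance], index: int, lexicon: set[str]) -> dict[str, int]:
    """Compute features 1-4 (claim 5) and 7-8 (claim 7) for call[index]."""
    current = call[index]
    hits = find_sensitive(current.text, lexicon)

    # Sensitive words in the target object's other utterances.
    other_own_hits = [
        find_sensitive(u.text, lexicon)
        for i, u in enumerate(call)
        if u.speaker == current.speaker and i != index
    ]

    # Sensitive words anywhere in the dialogue object's utterances.
    other_party_hits: set[str] = set()
    for u in call:
        if u.speaker != current.speaker:
            other_party_hits |= find_sensitive(u.text, lexicon)

    # Adjacent utterance of the dialogue object: previous one first, else next.
    adjacent_hits: set[str] = set()
    for j in (index - 1, index + 1):
        if 0 <= j < len(call) and call[j].speaker != current.speaker:
            adjacent_hits = find_sensitive(call[j].text, lexicon)
            break

    # Assumed encoding: 0 = no sensitive word, 1 = different words, 2 = same words.
    if not adjacent_hits:
        f4 = 0
    elif adjacent_hits == hits:
        f4 = 2
    else:
        f4 = 1

    return {
        "f1_only_current_has_sensitive": int(bool(hits) and not any(other_own_hits)),
        "f2_sensitive_also_in_other_party": int(bool(hits & other_party_hits)),
        "f3_other_party_has_sensitive": int(bool(other_party_hits)),
        "f4_adjacent_other_party": f4,
        "f7_total_texts": len(call),
        "f8_position": index,
    }
```

The resulting dictionary is the kind of feature-value vector that the first or second classification model of claims 10 and 11 would consume; how the values are actually encoded is left open by the claims.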
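
The three-way decision of claim 14 can likewise be sketched in a few lines. This is a minimal sketch under stated assumptions, not the applicant's implementation: the `RecognitionResult` type, the function name `recognize` and the two prompt strings are hypothetical, and returning no sensitive words in the noise case is one possible reading of "outputting second prompt information".

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical stand-ins for the "first" and "second" prompt information.
EMPTY_PROMPT = "no sensitive words found"
NOISE_PROMPT = "sensitive words attributed to the specified type of noise"


@dataclass(frozen=True)
class RecognitionResult:
    sensitive_words: tuple[str, ...]
    prompt: str | None


def recognize(sensitive_words: list[str], noise_present: bool) -> RecognitionResult:
    """Combine the sensitive word result with the noise classification result."""
    if len(sensitive_words) == 0:
        # No sensitive word was recognized: report that the result is empty.
        return RecognitionResult((), EMPTY_PROMPT)
    if not noise_present:
        # Sensitive words found and no noise: the words are the result.
        return RecognitionResult(tuple(sensitive_words), None)
    # Sensitive words found but noise present: treat them as noise-induced.
    return RecognitionResult((), NOISE_PROMPT)
```

For example, `recognize(["scam"], noise_present=True)` yields only the noise prompt, while the same words with `noise_present=False` are returned as the recognition result.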
PCT/CN2023/109568 2022-08-03 2023-07-27 Text classification method and apparatus, text recognition method and apparatus, electronic device and storage medium WO2024027552A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210928633.3A CN117556040A (en) 2022-08-03 2022-08-03 Text classification method, recognition method and device apparatus, storage medium
CN202210928633.3 2022-08-03

Publications (1)

Publication Number Publication Date
WO2024027552A1 (en) 2024-02-08

Family

ID=89809761

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/109568 WO2024027552A1 (en) 2022-08-03 2023-07-27 Text classification method and apparatus, text recognition method and apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN117556040A (en)
WO (1) WO2024027552A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021174717A1 (en) * 2020-03-05 2021-09-10 苏宁易购集团股份有限公司 Text intent recognition method and apparatus, computer device and storage medium
CN114416989A (en) * 2022-01-17 2022-04-29 马上消费金融股份有限公司 Text classification model optimization method and device
CN114707513A (en) * 2022-03-22 2022-07-05 腾讯科技(深圳)有限公司 Text semantic recognition method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN117556040A (en) 2024-02-13

Similar Documents

Publication Publication Date Title
US10950234B2 (en) Method and apparatus for determining speech interaction satisfaction
US10431205B2 (en) Dialog device with dialog support generated using a mixture of language models combined using a recurrent neural network
US11216510B2 (en) Processing an incomplete message with a neural network to generate suggested messages
US11074416B2 (en) Transformation of chat logs for chat flow prediction
US8972260B2 (en) Speech recognition using multiple language models
CN109767765A (en) Talk about art matching process and device, storage medium, computer equipment
CN1321401C (en) Speech recognition apparatus, speech recognition method, conversation control apparatus, conversation control method
EP1901283A2 (en) Automatic generation of statistical laguage models for interactive voice response applacation
WO2018192186A1 (en) Speech recognition method and apparatus
US11194973B1 (en) Dialog response generation
WO2022252636A1 (en) Artificial intelligence-based answer generation method and apparatus, device, and storage medium
CN112581964B (en) Multi-domain oriented intelligent voice interaction method
CN112530408A (en) Method, apparatus, electronic device, and medium for recognizing speech
WO2021063101A1 (en) Speech breakpoint detection method, apparatus and device based on artificial intelligence
WO2022100691A1 (en) Audio recognition method and device
JP2024502946A (en) Punctuation and capitalization of speech recognition transcripts
CN114416989A (en) Text classification model optimization method and device
WO2021169485A1 (en) Dialogue generation method and apparatus, and computer device
JP2012037797A (en) Dialogue learning device, summarization device, dialogue learning method, summarization method, program
WO2023279691A1 (en) Speech classification method and apparatus, model training method and apparatus, device, medium, and program
WO2024027552A1 (en) Text classification method and apparatus, text recognition method and apparatus, electronic device and storage medium
CN110634486A (en) Voice processing method and device
WO2020162239A1 (en) Paralinguistic information estimation model learning device, paralinguistic information estimation device, and program
Andra et al. Contextual keyword spotting in lecture video with deep convolutional neural network
TWI776296B (en) Voice response system and voice response method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23849260

Country of ref document: EP

Kind code of ref document: A1