CN116501844A - Voice keyword retrieval method and system - Google Patents


Info

Publication number
CN116501844A
Authority
CN
China
Prior art keywords
keyword
voice
target
text content
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310173515.0A
Other languages
Chinese (zh)
Inventor
吴国斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Financial Technology Co Ltd
Original Assignee
Bank of China Financial Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Financial Technology Co Ltd filed Critical Bank of China Financial Technology Co Ltd
Priority to CN202310173515.0A priority Critical patent/CN116501844A/en
Publication of CN116501844A publication Critical patent/CN116501844A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/117 Tagging; Marking up; Designating a block; Setting of attributes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a voice keyword retrieval method and system. The method comprises: converting voice information into target text content; matching the words in the target text content against a plurality of keyword combinations in a keyword lexicon and, if the matching succeeds, determining a target keyword, wherein the keyword lexicon comprises keyword combinations corresponding to a plurality of different voice emotion types, each keyword combination comprises at least one keyword candidate, and each keyword candidate comprises a plurality of keywords; and marking keyword labels on the corresponding words in the target text content based on the target keyword to obtain a keyword retrieval result for the voice information. The invention thereby improves the accuracy of the voice keyword retrieval result.

Description

Voice keyword retrieval method and system
Technical Field
The invention relates to the technical field of voice information retrieval, in particular to a voice keyword retrieval method and a voice keyword retrieval system.
Background
In current calls between a customer and a customer service agent, the agent judges the customer's intention and emotional state from long voice content purely manually, which is highly subjective.
Existing customer service systems can convert the customer's voice content into text information and then extract keywords from it, so that agents can analyze what the customer is calling about. However, the keyword retrieval schemes in the prior art are limited: they generally identify a fixed number of words in the text information as keywords in order to determine the intention of the current call, and their accuracy is insufficient.
Therefore, a method and a system for retrieving voice keywords are needed to solve the above problems.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a voice keyword retrieval method and a voice keyword retrieval system.
The invention provides a voice keyword retrieval method, which comprises the following steps:
converting the voice information into target text content;
matching the word in the target text content with a plurality of groups of keyword combinations in a keyword word stock, and if the matching is successful, determining a target keyword; the keyword word library comprises the keyword combinations corresponding to a plurality of different voice emotion types; each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords;
and marking keyword labels for corresponding words in the target text content based on the target keywords to obtain keyword retrieval results of the voice information.
According to the voice keyword retrieval method provided by the invention, before the step of matching the words in the target text content against the plurality of keyword combinations in the keyword lexicon and determining the target keyword if the matching succeeds, the method further comprises:
performing voice scene classification processing on the sample keywords to obtain a plurality of first sample keyword sets, wherein each first sample keyword set is a keyword set corresponding to different voice scene types;
dividing a plurality of sample keywords in each first sample keyword set into corresponding keyword candidates based on keyword candidate dividing rules corresponding to different voice emotion types, and obtaining divided keyword candidates;
and constructing the keyword combination corresponding to each voice emotion type according to one or more groups of divided keyword candidate items to obtain the keyword word stock.
According to the method for searching the voice keywords provided by the invention, after constructing the keyword combination corresponding to each voice emotion type according to one or more groups of divided keyword candidate items to obtain the keyword word stock, the method further comprises the following steps:
receiving a first input, wherein the first input comprises an operation of inputting a new keyword or an operation of determining a keyword to be deleted;
and responding to the first input, dividing the newly added keywords into corresponding keyword candidates, or deleting the keywords to be deleted from the keyword lexicon.
According to the voice keyword retrieval method provided by the invention, the word in the target text content is matched with a plurality of groups of keyword combinations in a keyword word stock, and if the matching is successful, the target keyword is determined, and the method comprises the following steps:
based on a regular expression, matching words in the target text content with keywords corresponding to each keyword candidate in each keyword combination in the keyword word stock in sequence according to word sequence;
if at least one keyword exists in each keyword candidate in any one keyword combination and is successfully matched with the corresponding word in the target text content, the keyword combination successfully matched is used as a target keyword combination, and the keyword successfully matched in the target keyword combination is used as a target keyword.
According to the method for searching the voice keyword provided by the invention, the keyword label is marked on the corresponding word in the target text content based on the target keyword, so as to obtain the keyword searching result of the voice information, and the method comprises the following steps:
determining text position information of a target word in the target text content, wherein the target word represents a word successfully matched with the target keyword in the target text content;
marking a keyword label for the target word in the target text content according to the text position information to obtain marked text content;
according to the voice emotion type of the target keyword combination, determining voice emotion information corresponding to the target text content;
and generating a keyword search result of the voice information through the voice emotion information and the marked text content.
According to the voice keyword retrieval method provided by the invention, converting the voice information into target text content comprises:
collecting audio data;
extracting voice characteristics in the audio data to obtain the voice information;
performing text conversion processing on the voice information to obtain corresponding text information;
and performing word segmentation processing on the text information to obtain the target text content.
The invention also provides a voice keyword retrieval system, which comprises:
the voice conversion module is used for converting voice information into target text content;
the matching module is used for matching the words in the target text content with a plurality of groups of keyword combinations in a keyword word stock, and if the matching is successful, the target keywords are determined; the keyword word library comprises the keyword combinations corresponding to a plurality of different voice emotion types; each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords;
and the retrieval result output module is used for marking keyword labels for the corresponding words in the target text content based on the target keywords to obtain keyword retrieval results of the voice information.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes any one of the voice keyword retrieval methods when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of retrieving a speech keyword as described in any of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements a method of retrieving a speech keyword as described in any of the above.
According to the voice keyword retrieval method and system, the words in the target text content are matched with the plurality of groups of keywords corresponding to different voice emotion types in the keyword word stock, so that the target keywords are obtained through matching, the keyword labels are marked on the corresponding words in the target text content according to the target keywords, the keyword retrieval result of voice information is obtained, and the accuracy of the voice keyword retrieval result is higher.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for searching voice keywords;
FIG. 2 is a schematic diagram of a voice keyword retrieval system according to the present invention;
FIG. 3 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In current customer service systems, keyword retrieval on voice content generally identifies several fixed words as keywords in the text information converted from the voice content, and cannot support retrieval based on dynamic combinations of keywords. In addition, the content associated with keywords is varied and spans emotion categories such as abusive language, complaints, call purposes and soothing; at present most of this is judged manually and subjectively, so both accuracy and efficiency are insufficient.
According to the invention, a voice keyword lexicon is constructed in a Gbase database, and the voice text information is matched against the keywords in the lexicon by means of regular-expression functions. Keywords are retrieved from the voice text information according to the matching result, and the corresponding words in the voice text information are marked as keywords, so that keywords related to the voice content are retrieved quickly and accurately. It should be noted that the voice keyword retrieval method provided by the invention can be widely applied to keyword recognition and analysis in customer systems in industries such as telecommunications, banking and insurance. Moreover, the invention does not specifically limit the database used to construct the keyword lexicon; other database environments, such as a MySQL database or an Oracle database, can be chosen according to actual requirements, so that the method is compatible with different database platforms.
Fig. 1 is a flow chart of a voice keyword searching method provided by the present invention, and as shown in fig. 1, the present invention provides a voice keyword searching method, including:
Step 101, converting the voice information into target text content.
In the invention, the voice call between the customer and customer service is recorded by a voice acquisition device to obtain a voice call recording, and the recording is then processed so that the voice information is extracted.
Specifically, the voice information is converted into the target text content using existing speech-to-text technology. In one embodiment, during speech-to-text conversion the voice information is processed and stored in a Gbase database table, yielding the converted text content; the text content is stored in a longtext field, which supports call content of up to 64M.
Step 102, matching the words in the target text content with a plurality of groups of keyword combinations in a keyword word stock, and if the matching is successful, determining a target keyword; the keyword word library comprises the keyword combinations corresponding to a plurality of different voice emotion types; each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords.
In the invention, the keyword lexicon comprises a plurality of keyword combinations, and each keyword combination corresponds to one voice emotion type. For example, for the complaint emotion type, the corresponding keyword combination includes one keyword candidate, whose keywords may be: complaint / you cannot understand / want to complain / nuisance call. When any word in the text content corresponding to the customer's voice information appears in the keyword candidate, the matching succeeds, and the matched keyword in the keyword candidate is determined to be the target keyword. For example, if the target text content is "I need to make a complaint", it matches "complaint" in the keyword candidate; the target keyword is therefore "complaint", and the voice emotion type corresponding to the keyword combination containing this keyword candidate is determined to be the complaint emotion.
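The lexicon structure described above can be sketched as a nested mapping. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `LEXICON` and `match_single_candidate` are hypothetical, and the English keywords stand in for the translated examples.

```python
# Hypothetical lexicon: emotion type -> list of keyword combinations;
# each combination is an ordered list of candidates (lists of keywords).
LEXICON = {
    "complaint": [
        [["complaint", "you cannot understand", "want to complain", "nuisance call"]],
    ],
}

def match_single_candidate(text, lexicon):
    """Return (emotion_type, keyword) for the first single-candidate
    combination that has a keyword occurring in text, else None."""
    for emotion, combinations in lexicon.items():
        for combination in combinations:
            if len(combination) != 1:
                continue  # multi-candidate combinations need ordered matching
            for keyword in combination[0]:
                if keyword in text:
                    return emotion, keyword
    return None

result = match_single_candidate("I need to make a complaint", LEXICON)
```

Here `result` carries both the target keyword and the emotion type of the combination it came from, which is the pairing the paragraph above describes.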
Further, the invention retrieves keywords in real time throughout the call: each time the customer finishes a segment of dialogue, keyword retrieval is performed on the text content acquired so far. For example, later in the call the target text content obtained is "I stored the money yesterday, and there is still a short message prompting me". When it is matched against the keyword lexicon, it matches a keyword combination comprising four keyword candidates, in order: (already/before/early/yesterday/evening/morning), (store/play/process/deduct), (say/prompt/remind/how/is/za), (phone/short message). One keyword in each keyword candidate matches a word of the target text content, namely "yesterday" in the first candidate, "store" in the second, "is" in the third and "short message" in the fourth, so the voice emotion type corresponding to this keyword combination is determined to be the complaint emotion.
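A multi-candidate combination of this kind can be sketched as a single regular expression in which each candidate becomes an alternation group. This is only an illustrative sketch: the `.*?` separator between groups is an assumption (the source does not state how candidates are joined), the function names are hypothetical, and the English keywords are stand-ins.

```python
import re

# Candidate groups in the order they must appear (English stand-ins).
CANDIDATES = [
    ["already", "before", "early", "yesterday", "evening", "morning"],
    ["store", "play", "process", "deduct"],
    ["say", "prompt", "remind", "how", "is"],
    ["phone", "short message"],
]

def build_pattern(candidates):
    """Join candidate groups into one regex; '.*?' between groups is an
    assumed separator that allows any intervening text."""
    groups = ["(" + "|".join(re.escape(k) for k in group) + ")" for group in candidates]
    return re.compile(".*?".join(groups))

def match_combination(text, candidates):
    """Return the matched keyword from each group, or None if any group fails."""
    m = build_pattern(candidates).search(text)
    return list(m.groups()) if m else None

hits = match_combination("yesterday I store money, why is there a short message", CANDIDATES)
```

Because the groups are joined in sequence, a text that contains the right words in the wrong order does not match. The sketch matches substrings rather than whole words; word-segmented input would allow stricter matching.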
In another embodiment, keyword retrieval can also be performed on the voice information produced by the customer service agent, so that subsequent flows can be analyzed from the retrieval result, for example to determine whether the service call flow meets the standard. Specifically, after the text content corresponding to a segment of the agent's voice information (hereinafter referred to as customer service text content) is obtained, the words in the customer service text content are matched against the keywords in the keyword lexicon, and the matched keyword combination is: (there/encounter/complaint), (what/what), (aspect/problem/place). In this keyword combination, each of the three keyword candidates has a keyword that matches a word in the customer service text content, so the current voice emotion type for the agent is determined to be inquiring about the purpose of the call (this voice emotion can be understood as neutral). As the call continues, the customer service text content can likewise be searched for keywords in real time; for example, if the voice emotion type of a certain keyword combination is a negative attitude toward the customer, then when words in the customer service text content match that combination, the agent is determined to show negative emotion in the current call.
Step 103, marking keyword labels for corresponding words in the target text content based on the target keywords, and obtaining keyword retrieval results of the voice information.
According to the method, the target text content can be matched with the keywords in the keyword lexicon through the regular expression, if the target text content is matched with the corresponding keywords, the regular expression is utilized to obtain the position information of the words corresponding to the target keywords in the target text content, the keyword labels are marked on the words according to the position information, and the target text content marked with the keyword labels is output to a user side (such as a customer service terminal in a customer service system).
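The position-based labelling described above can be sketched with Python's `re` module. This is a minimal sketch under stated assumptions: the `<kw>` tag format and the function name `tag_keywords` are hypothetical, since the source does not specify what a keyword label looks like.

```python
import re

def tag_keywords(text, keywords, open_tag="<kw>", close_tag="</kw>"):
    """Wrap each keyword occurrence in tags and return (tagged_text, spans).
    Spans are 0-based (start, end) offsets into the original text."""
    # Longer keywords first so that overlapping alternatives prefer the longest.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    spans = [m.span() for m in pattern.finditer(text)]
    pieces, last = [], 0
    for start, end in spans:
        pieces.append(text[last:start])
        pieces.append(open_tag + text[start:end] + close_tag)
        last = end
    pieces.append(text[last:])
    return "".join(pieces), spans

tagged, spans = tag_keywords("I need to file a complaint", ["complaint"])
```

The spans are computed against the untagged text, so they stay valid for storage even after the tags have been inserted for display.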
According to the voice keyword retrieval method provided by the invention, the target keywords are obtained by matching the word in the target text content with the plurality of groups of keywords corresponding to different voice emotion types in the keyword lexicon, and the keyword labels are marked on the corresponding words in the target text content according to the target keywords, so that the keyword retrieval result of voice information is obtained, and the accuracy of the voice keyword retrieval result is higher.
On the basis of the above embodiment, before the step of matching the words in the target text content against the plurality of keyword combinations in the keyword lexicon and determining the target keyword if the matching succeeds, the method further includes:
and carrying out voice scene classification processing on the sample keywords to obtain a plurality of first sample keyword sets, wherein each first sample keyword set is a keyword set corresponding to different voice scene types.
In the invention, the sample keywords can first be classified by voice scene, so that a corresponding keyword set is provided for each voice scene, such as marketing business voice scenes, collection business voice scenes and the like. In another embodiment, the keyword sets may be further classified by industry, for example marketing business voice scenes in the telecommunications industry and marketing business voice scenes in the insurance industry.
Dividing a plurality of sample keywords in each first sample keyword set into corresponding keyword candidates based on keyword candidate dividing rules corresponding to different voice emotion types, and obtaining divided keyword candidates;
and constructing the keyword combination corresponding to each voice emotion type according to one or more groups of divided keyword candidate items to obtain the keyword word stock.
In the invention, the keywords in each first sample keyword set need to be divided. The specific division rule can be determined according to the different voice emotion types and actual requirements. For example, historical text content representing several different ways of expressing the same voice emotion type is obtained from historical service dialogues, and from it the order of the keyword candidates within the keyword combination is determined. Keywords can be assigned to the same keyword candidate according to part of speech and/or synonym relations, which yields the keyword candidates. Finally, the keyword candidates (each containing one or more keywords) are combined in the determined order to obtain a keyword combination, and the keyword combination is given the corresponding voice emotion attribute, so that the keyword lexicon is constructed.
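The construction steps above (classify by scene, divide into candidates, order the candidates, attach an emotion attribute) can be sketched as a grouping pass over labelled samples. Everything here is an assumption made for illustration: the sample tuples, the `slot` index that encodes candidate order, and the function name `build_lexicon` do not come from the source.

```python
from collections import defaultdict

# Hypothetical labelled samples: (keyword, scene, emotion, slot), where slot
# is the position of the keyword's candidate within its combination.
SAMPLES = [
    ("yesterday", "collection", "complaint", 0),
    ("already", "collection", "complaint", 0),
    ("deduct", "collection", "complaint", 1),
    ("short message", "collection", "complaint", 2),
    ("please", "marketing", "request_info", 0),
]

def build_lexicon(samples):
    """Group samples by scene and emotion, then into ordered candidates."""
    slots = defaultdict(lambda: defaultdict(list))
    for keyword, scene, emotion, slot in samples:
        slots[(scene, emotion)][slot].append(keyword)
    lexicon = defaultdict(dict)
    for (scene, emotion), by_slot in slots.items():
        # Sorting by slot index fixes the candidate order in the combination.
        lexicon[scene][emotion] = [by_slot[i] for i in sorted(by_slot)]
    return dict(lexicon)

lexicon = build_lexicon(SAMPLES)
```

In practice the slot assignment would come from the division rule (part of speech, synonym relations, historical dialogues); the sketch simply takes it as given.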
In the invention, a keyword combination comprising only one keyword candidate may also correspond to a voice emotion type. Preferably, in one embodiment, different emotion grades can be set for the same voice emotion type. For example, for the complaint emotion, if during a call the text content currently produced by the customer matches only a keyword combination with a single keyword candidate (for example, the combination "complaint / you do not understand / want to complain / nuisance call"), the current voice emotion can be defined as a general complaint emotion (the customer's emotional state is stable and the customer is stating a complaint request normally). As the voice interaction between the customer and customer service proceeds, if text content produced later by the customer matches another keyword combination, the voice emotion grade is determined to be an abnormal complaint emotion. This other combination may add keyword candidates to the first one, for example "(complaint / you do not understand / want to complain / nuisance call), (just/before)", or may be set according to lexicon construction requirements, for example "(already/before/early/yesterday/evening/early/morning), (save/make/process/deduct), (say/prompt/remind/how/be/make)".
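The grading idea can be sketched by checking the more specific combination first and falling back to the single-candidate one. This is a hedged sketch: the grade labels follow the text, but the `GRADED` structure, the `.*?` separator and the longest-first ordering rule are assumptions made for illustration.

```python
import re

# Hypothetical graded combinations for the complaint emotion type; each
# grade maps to an ordered list of candidate groups.
GRADED = [
    ("general complaint", [["complaint", "nuisance call"]]),
    ("abnormal complaint", [["complaint", "nuisance call"], ["just", "before"]]),
]

def grade_emotion(text, graded):
    """Return the first grade whose candidates all match in order, trying
    combinations with more candidates first."""
    for grade, candidates in sorted(graded, key=lambda g: len(g[1]), reverse=True):
        pattern = ".*?".join(
            "(?:" + "|".join(map(re.escape, group)) + ")" for group in candidates
        )
        if re.search(pattern, text):
            return grade
    return None

first = grade_emotion("I want to file a complaint", GRADED)
later = grade_emotion("this is a nuisance call, you called just now", GRADED)
```

Trying the longer combination first matters because any text that matches it also matches the single-candidate combination.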
For keyword scenes in the customer service voice text, keyword combinations for the corresponding scene can also be constructed according to the voice dialogue scene. For example, the keyword combination (please/trouble/convenience), (provide/say/report), (identity card/mobile phone/number/mailbox) constructs a dialogue scene in which the customer is asked to provide relevant information; as the call proceeds, keyword combinations of different voice emotion grades are further constructed in the keyword lexicon.
In one embodiment, the database table of the keyword lexicon constructed based on Gbase may be referred to as table 1:
TABLE 1
In the invention, a corresponding database table can be set up keyed by the recording number, and the voice data can be stored according to voice emotion type; see Table 2:
TABLE 2
Field name | Chinese name | Standard name (Chinese/English) | Field type
recording_id | Recording id | Recording number | bigint(1)
ts | Collection keyword: complaint tendency | Complaints of | int(1)
ts_loc | Collection keyword: complaint tendency position | Complaint location | varchar(500)
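The layout of Table 2 can be tried out with a small in-memory table. This is only a sketch: SQLite (via Python's `sqlite3` module) stands in for Gbase, the table name `recording_keywords` is hypothetical, and treating `ts` as a 0/1 flag with `ts_loc` as a comma-separated position list is an assumption about how the fields are used.

```python
import sqlite3

# SQLite stands in for Gbase here; column names follow Table 2.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE recording_keywords (
        recording_id BIGINT PRIMARY KEY,  -- recording number
        ts           INT,                 -- assumed: 1 if a complaint keyword matched
        ts_loc       VARCHAR(500)         -- assumed: positions of matched keywords
    )
    """
)

def save_result(recording_id, positions):
    """Store the complaint flag and a comma-separated list of positions."""
    ts_loc = ",".join(str(p) for p in positions)
    conn.execute(
        "INSERT INTO recording_keywords VALUES (?, ?, ?)",
        (recording_id, 1 if positions else 0, ts_loc),
    )

save_result(1001, [18])
row = conn.execute(
    "SELECT ts, ts_loc FROM recording_keywords WHERE recording_id = 1001"
).fetchone()
```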
On the basis of the above embodiment, after constructing the keyword combinations corresponding to each of the speech emotion types according to one or more groups of the divided keyword candidate terms, the method further includes:
receiving a first input, wherein the first input comprises an operation of inputting a new keyword or an operation of determining a keyword to be deleted;
and responding to the first input, dividing the newly added keywords into corresponding keyword candidates, or deleting the keywords to be deleted from the keyword lexicon.
In the invention, the basic information of the keyword library is maintained, and the module realizes the maintenance of the keyword library according to the classification of keywords, and performs new creation, editing, deletion and recovery on the keyword information.
In the invention, the keywords in the keyword lexicon can be dynamically changed, including but not limited to new creation (newly added keywords), editing (modifying keywords in the current keyword candidates), deleting (deleting the current keyword candidates or keywords in the keyword candidates) and recovering (recovering the lexicon to the last version), thereby realizing the dynamic adjustment of the keyword lexicon and improving the retrieval performance of the lexicon.
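The four maintenance operations (new, edit, delete, restore) can be sketched with an in-memory class that keeps one previous version. The class `KeywordLexicon` and its method names are hypothetical, and a single-step restore is an assumption based on the phrase "the last version"; a real system would persist versions in the database.

```python
import copy

class KeywordLexicon:
    """Hypothetical in-memory lexicon: candidate name -> list of keywords,
    with a one-step restore to the previous version."""

    def __init__(self, candidates):
        self.candidates = candidates
        self._previous = None

    def _snapshot(self):
        # Keep a deep copy so later edits cannot alter the saved version.
        self._previous = copy.deepcopy(self.candidates)

    def add(self, candidate, keyword):
        self._snapshot()
        self.candidates.setdefault(candidate, []).append(keyword)

    def edit(self, candidate, old, new):
        self._snapshot()
        keywords = self.candidates[candidate]
        keywords[keywords.index(old)] = new

    def delete(self, candidate, keyword=None):
        self._snapshot()
        if keyword is None:
            del self.candidates[candidate]  # drop the whole candidate
        else:
            self.candidates[candidate].remove(keyword)

    def restore(self):
        if self._previous is not None:
            self.candidates, self._previous = self._previous, None

lex = KeywordLexicon({"time": ["yesterday", "before"]})
lex.add("time", "just")
lex.delete("time", "before")
lex.restore()  # undoes only the most recent change (the delete)
```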
On the basis of the above embodiment, the matching the word in the target text content with the plurality of groups of keywords in the keyword lexicon, if matching is successful, determining the target keyword includes:
based on a regular expression, matching words in the target text content with keywords corresponding to each keyword candidate in each keyword combination in the keyword word stock in sequence according to word sequence;
if at least one keyword exists in each keyword candidate in any one keyword combination and is successfully matched with the corresponding word in the target text content, the keyword combination successfully matched is used as a target keyword combination, and the keyword successfully matched in the target keyword combination is used as a target keyword.
In one embodiment, the Gbase database is used for illustration. A keyword lexicon is constructed in the Gbase database environment, and the keywords in the voice call content between customer and customer service are retrieved with Gbase's regular-expression functions; the text of the voice call record is in JSON format. The retrieval yields the keywords and the positions of the marked keywords in the text (offsets relative to the start of the retrieved text content), marks the voice emotion type corresponding to the voice call content, and provides these to upper-layer applications for analysis (such as display and data analysis on a visual terminal). When keyword retrieval is performed on the words in the target text content, each word in the target text content needs to be matched against each keyword in the keyword lexicon. For example, the target text content is "already paid, yet there is still no express information"; when matched against the keyword lexicon, it matches a keyword combination comprising three keyword candidates: (already/before/yesterday), (deposit/run/pay), (express/postal number). One word (or several words) in each keyword candidate matches the target text content, so the matching is determined to be successful.
When a keyword combination or keyword candidate has not yet been completely matched, matching needs to continue. For example, if a keyword candidate contains 5 keywords and a word in the target text content has been compared with 4 of them without success, whether the current retrieval process is finished can be determined only after all 5 keywords have been tried.
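The exhaustive check can be sketched as a pass over every keyword in a candidate, reporting failure only once all of them have been tried. The function name and the five stand-in keywords are hypothetical.

```python
def matched_keywords(text, candidate):
    """Test every keyword in the candidate against text and return all hits.
    An empty list is reported only after the whole candidate has been tried."""
    return [keyword for keyword in candidate if keyword in text]

# Five keywords: the first four fail, the fifth matches.
candidate = ["already", "before", "yesterday", "earlier", "last night"]
hits = matched_keywords("I paid last night", candidate)
```

Collecting all hits, rather than stopping at the first, also covers the case noted earlier where several keywords of one candidate match the same text.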
In one embodiment, for Gbase-based fuzzy query retrieval, keywords in the text content may be retrieved with the following regular-expression function: regexp_instr, which returns the position of a character string matched by a regular expression under the POSIX standard. After a character string (word) in the text content is successfully matched against a keyword group, the function provides the position of that character string in the text content.
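For readers without a Gbase environment, the position lookup can be approximated in Python. This is an analogue, not the Gbase function: the 1-based result and the 0 return value for "no match" are assumed conventions, and Python's `re` syntax is not identical to POSIX regular expressions.

```python
import re

def regexp_instr(text, pattern):
    """Python analogue of a regexp_instr-style lookup: 1-based position of
    the first match, or 0 when there is no match (assumed convention)."""
    m = re.search(pattern, text)
    return m.start() + 1 if m else 0

pos = regexp_instr("I need to file a complaint", "complaint|nuisance call")
```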
In another embodiment, a substring of the specified character string (i.e., the keyword found in the text content) can be extracted through the regular function regexp_substring, so that the extracted keywords independently form a group of keyword display results, which are highlighted by the front-end display terminal. In one embodiment, the search functions constructed from the regular functions of Gbase may be as shown in Table 1:
TABLE 1
In one embodiment, a Python-based text retrieval mode can also be used; this mode is invoked through the Python runtime environment.
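As a rough illustration of such a Python-based retrieval mode, the sketch below uses the standard `re` module to return both the position of a keyword and the matched substring, which is loosely analogous to what the position and substring regular functions provide. The helper name `find_keyword` is hypothetical, and the offset is 0-based by Python convention; the source does not state which base Gbase uses.

```python
import re
from typing import List, Optional, Tuple


def find_keyword(text: str, keywords: List[str]) -> Optional[Tuple[int, str]]:
    """Return (0-based offset, matched substring) of the earliest keyword hit, or None."""
    if not keywords:
        return None
    # Escape each keyword so it is matched literally, then join them as alternatives.
    pattern = re.compile("|".join(re.escape(keyword) for keyword in keywords))
    match = pattern.search(text)
    if match is None:
        return None
    return match.start(), match.group(0)
```

For example, `find_keyword("the parcel was already paid", ["already", "before"])` locates "already" at offset 15, and a text containing none of the keywords yields `None`.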
On the basis of the above embodiment, marking keyword labels for the corresponding words in the target text content based on the target keyword, to obtain the keyword retrieval result of the voice information, includes:
determining text position information of a target word in the target text content, wherein the target word represents a word successfully matched with the target keyword in the target text content;
marking a keyword label for the target word in the target text content according to the text position information to obtain marked text content;
according to the voice emotion type of the target keyword combination, determining voice emotion information corresponding to the target text content;
and generating a keyword search result of the voice information through the voice emotion information and the marked text content.
In the invention, the text position information of the target words is determined in the target text content through the corresponding regular expressions, and the target words are marked as keywords according to that position information. Finally, the marked text content and the voice emotion type corresponding to the target text content, as obtained through keyword retrieval, are output as the keyword retrieval result and displayed through a visual display terminal.
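The four steps above (locate, mark, attach the emotion type, generate the result) might be sketched in Python as follows. The `<kw>` tag, the result dictionary layout, the function name `build_result`, and the emotion label used in the example are all illustrative assumptions; the source does not specify the label format.

```python
import re
from typing import Dict, List


def build_result(text: str, target_keywords: List[str], emotion_type: str) -> Dict[str, object]:
    """Mark each target keyword in the text and attach the voice emotion type."""
    if not target_keywords:
        return {"emotion": emotion_type, "positions": [], "marked_text": text}
    pattern = re.compile("|".join(re.escape(keyword) for keyword in target_keywords))
    # Text position information: 0-based offset and matched word for every hit.
    positions = [(match.start(), match.group(0)) for match in pattern.finditer(text)]
    # Wrap every hit in a hypothetical <kw> tag that a front end could highlight.
    marked = pattern.sub(lambda match: f"<kw>{match.group(0)}</kw>", text)
    return {"emotion": emotion_type, "positions": positions, "marked_text": marked}
```

Calling `build_result("already paid, no express yet", ["already", "paid", "express"], "complaint")` returns the three offsets together with the text in which each hit is wrapped in `<kw>` tags.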
It should be noted that, in the present invention, for target text content that is not successfully matched with any keyword combination, the words in it that did match a keyword may still be marked as keywords and output to the user end for display, so that the relevant personnel can carry out subsequent analysis manually.
On the basis of the above embodiment, the converting the voice information into the target text content includes:
collecting audio data;
extracting voice characteristics in the audio data to obtain the voice information;
performing text conversion processing on the voice information to obtain corresponding text information;
and performing word segmentation processing on the text information to obtain the target text content.
In the invention, after the audio data of the voice call is collected, the voice features in the audio data are first extracted and the noise and interference data present in it are removed, yielding the voice information. The voice information is then converted into text information using an existing speech-to-text tool, and the text information is segmented into words using a word segmentation tool (such as jieba). In addition, stop words in the text information can be deleted. The result is the target text content, and this preprocessing improves the efficiency of the subsequent keyword retrieval.
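The last two preprocessing steps (word segmentation and stop-word removal) can be sketched as below. To stay self-contained, the sketch uses a whitespace splitter as a stand-in segmenter; for Chinese text a tool such as jieba would be passed in instead. The stop-word set and the function names are hypothetical.

```python
from typing import Callable, Iterable, List, Set

# Hypothetical stop-word list; a real deployment would load a curated one.
STOP_WORDS: Set[str] = {"the", "a", "is", "or", "of"}


def whitespace_segment(text: str) -> List[str]:
    """Stand-in segmenter for illustration; not suitable for unspaced Chinese text."""
    return text.split()


def to_target_content(
    text: str,
    segment: Callable[[str], Iterable[str]] = whitespace_segment,
    stop_words: Set[str] = STOP_WORDS,
) -> List[str]:
    """Segment the transcribed text and drop stop words to obtain the target text content."""
    return [word for word in segment(text) if word.lower() not in stop_words]
```

Keeping the segmenter as a parameter means the stop-word filtering stays unchanged whichever segmentation tool is plugged in.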
The voice keyword search system provided by the invention is described below, and the voice keyword search system described below and the voice keyword search method described above can be referred to correspondingly.
Fig. 2 is a schematic structural diagram of the voice keyword retrieval system provided by the present invention. As shown in Fig. 2, the system includes a voice conversion module 201, a matching module 202, and a retrieval result output module 203. The voice conversion module 201 is configured to convert voice information into target text content. The matching module 202 is configured to match the words in the target text content with a plurality of groups of keyword combinations in a keyword lexicon and, if the matching is successful, determine a target keyword; the keyword lexicon comprises the keyword combinations corresponding to a plurality of different voice emotion types, each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords. The retrieval result output module 203 is configured to mark keyword labels for the corresponding words in the target text content based on the target keyword, so as to obtain a keyword retrieval result of the voice information.
According to the voice keyword retrieval system provided by the invention, the words in the target text content are matched with the plurality of groups of keywords corresponding to different voice emotion types in the keyword lexicon, so that the target keywords are obtained through matching, and the keyword labels are marked on the corresponding words in the target text content according to the target keywords, so that the keyword retrieval result of voice information is obtained, and the accuracy of the voice keyword retrieval result is higher.
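One way to picture how the three modules fit together is the following Python sketch. It is an assumption-laden outline rather than the claimed system: the class name `RetrievalSystem`, the bracket-style label, and the lexicon layout (emotion type mapped to one keyword combination) are invented for illustration, and the conversion step takes an already transcribed string instead of audio.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical lexicon layout: emotion type -> keyword combination (list of candidate groups).
Lexicon = Dict[str, List[List[str]]]


@dataclass
class RetrievalSystem:
    convert: Callable[[str], List[str]]  # stands in for voice conversion module 201
    lexicon: Lexicon

    def match(self, words: List[str]) -> Optional[Tuple[str, List[str]]]:
        """Matching module 202: first combination in which every group has a hit."""
        for emotion, combination in self.lexicon.items():
            hits = [[kw for kw in group if kw in words] for group in combination]
            if all(hits):  # an empty hit list marks a group without a match
                return emotion, [kw for group in hits for kw in group]
        return None

    def retrieve(self, transcript: str) -> Dict[str, object]:
        """Result output module 203: label matched words and report the emotion type."""
        words = self.convert(transcript)
        found = self.match(words)
        emotion, targets = found if found else (None, [])
        tagged = [f"[{word}]" if word in targets else word for word in words]
        return {"emotion": emotion, "tagged_words": tagged}
```

A usage example: `RetrievalSystem(convert=str.split, lexicon={"complaint": [["already", "before"], ["paid", "deposit"], ["express"]]}).retrieve("already paid no express information")` labels "already", "paid", and "express" and reports the hypothetical emotion type "complaint".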
The system provided by the invention is used to execute the above method embodiments; for the specific flow and details, refer to those embodiments, which are not repeated here.
Fig. 3 is a schematic structural diagram of an electronic device according to the present invention. As shown in Fig. 3, the electronic device may include a processor 301, a communications interface 302, a memory 303, and a communication bus 304, wherein the processor 301, the communications interface 302, and the memory 303 communicate with each other through the communication bus 304. The processor 301 may invoke logic instructions in the memory 303 to perform a voice keyword retrieval method comprising: converting the voice information into target text content; matching the words in the target text content with a plurality of groups of keyword combinations in a keyword lexicon, and if the matching is successful, determining a target keyword; the keyword lexicon comprises the keyword combinations corresponding to a plurality of different voice emotion types; each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords; and marking keyword labels for the corresponding words in the target text content based on the target keyword to obtain a keyword retrieval result of the voice information.
Further, the logic instructions in the memory 303 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the voice keyword retrieval method provided by the above method embodiments, the method comprising: converting the voice information into target text content; matching the words in the target text content with a plurality of groups of keyword combinations in a keyword lexicon, and if the matching is successful, determining a target keyword; the keyword lexicon comprises the keyword combinations corresponding to a plurality of different voice emotion types; each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords; and marking keyword labels for the corresponding words in the target text content based on the target keyword to obtain a keyword retrieval result of the voice information.
In still another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the voice keyword retrieval method provided in the above embodiments, the method comprising: converting the voice information into target text content; matching the words in the target text content with a plurality of groups of keyword combinations in a keyword lexicon, and if the matching is successful, determining a target keyword; the keyword lexicon comprises the keyword combinations corresponding to a plurality of different voice emotion types; each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords; and marking keyword labels for the corresponding words in the target text content based on the target keyword to obtain a keyword retrieval result of the voice information.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for retrieving a voice keyword, comprising:
converting the voice information into target text content;
matching the word in the target text content with a plurality of groups of keyword combinations in a keyword word stock, and if the matching is successful, determining a target keyword; the keyword word library comprises the keyword combinations corresponding to a plurality of different voice emotion types; each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords;
and marking keyword labels for corresponding words in the target text content based on the target keywords to obtain keyword retrieval results of the voice information.
2. The method for searching for a voice keyword according to claim 1, wherein, before the matching of the words in the target text content with the plurality of groups of keyword combinations in the keyword lexicon, the method further comprises:
performing voice scene classification processing on the sample keywords to obtain a plurality of first sample keyword sets, wherein each first sample keyword set is a keyword set corresponding to different voice scene types;
dividing a plurality of sample keywords in each first sample keyword set into corresponding keyword candidates based on keyword candidate dividing rules corresponding to different voice emotion types, and obtaining divided keyword candidates;
and constructing the keyword combination corresponding to each voice emotion type according to one or more groups of divided keyword candidate items to obtain the keyword word stock.
3. The method for retrieving a speech keyword according to claim 2, wherein after said constructing the keyword combination corresponding to each of the speech emotion types according to one or more groups of the divided keyword candidates, the method further comprises:
receiving a first input, wherein the first input comprises an operation of inputting a new keyword or an operation of determining a keyword to be deleted;
and responding to the first input, dividing the newly added keywords into corresponding keyword candidates, or deleting the keywords to be deleted from the keyword lexicon.
4. The method for searching for a voice keyword according to claim 1, wherein the matching the word in the target text content with the plurality of sets of keywords in the keyword lexicon includes:
based on a regular expression, matching words in the target text content with keywords corresponding to each keyword candidate in each keyword combination in the keyword word stock in sequence according to word sequence;
if at least one keyword exists in each keyword candidate in any one keyword combination and is successfully matched with the corresponding word in the target text content, the keyword combination successfully matched is used as a target keyword combination, and the keyword successfully matched in the target keyword combination is used as a target keyword.
5. The method for retrieving a keyword according to claim 4, wherein the step of labeling the keyword tag for the corresponding word in the target text content based on the target keyword to obtain the keyword retrieval result of the voice information includes:
determining text position information of a target word in the target text content, wherein the target word represents a word successfully matched with the target keyword in the target text content;
marking a keyword label for the target word in the target text content according to the text position information to obtain marked text content;
according to the voice emotion type of the target keyword combination, determining voice emotion information corresponding to the target text content;
and generating a keyword search result of the voice information through the voice emotion information and the marked text content.
6. The voice keyword retrieval method of any one of claims 1 to 5, wherein the converting the voice information into the target text content includes:
collecting audio data;
extracting voice characteristics in the audio data to obtain the voice information;
performing text conversion processing on the voice information to obtain corresponding text information;
and performing word segmentation processing on the text information to obtain the target text content.
7. A voice keyword retrieval system, comprising:
the voice conversion module is used for converting voice information into target text content;
the matching module is used for matching the words in the target text content with a plurality of groups of keyword combinations in a keyword word stock, and if the matching is successful, the target keywords are determined; the keyword word library comprises the keyword combinations corresponding to a plurality of different voice emotion types; each keyword combination comprises at least one group of keyword candidates, and each keyword candidate comprises a plurality of corresponding keywords;
and the retrieval result output module is used for marking keyword labels for the corresponding words in the target text content based on the target keywords to obtain keyword retrieval results of the voice information.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the speech keyword retrieval method as claimed in any one of claims 1 to 6.
9. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the speech keyword retrieval method of any one of claims 1 to 6.
10. A computer program product comprising a computer program which, when executed by a processor, implements the speech keyword retrieval method of any one of claims 1 to 6.
CN202310173515.0A 2023-02-28 2023-02-28 Voice keyword retrieval method and system Pending CN116501844A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310173515.0A CN116501844A (en) 2023-02-28 2023-02-28 Voice keyword retrieval method and system

Publications (1)

Publication Number Publication Date
CN116501844A true CN116501844A (en) 2023-07-28

Family

ID=87321988

Country Status (1)

Country Link
CN (1) CN116501844A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117669513A (en) * 2024-01-30 2024-03-08 江苏古卓科技有限公司 Data management system and method based on artificial intelligence
CN117669513B (en) * 2024-01-30 2024-04-12 江苏古卓科技有限公司 Data management system and method based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination