CN114661862A - Voice data based search method and device, computer equipment and storage medium - Google Patents
- Publication number
- CN114661862A (application CN202210195811.6A)
- Authority
- CN
- China
- Prior art keywords
- slot position
- position value
- slot
- value
- candidate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
Embodiments of the application provide a voice-data-based search method and apparatus, a computer device, and a storage medium. The method first performs text conversion on the voice data to be processed to obtain initial text data, then performs semantic analysis on the initial text data to determine a corresponding field type and an initial slot position value. A slot rewriting index table corresponding to the field type is then obtained; the table records, for that field type, un-rewritten first slot position values and rewritten second slot position values. The initial slot position value is adjusted according to the slot rewriting index table to obtain a target slot position value, and a search is performed based on the target slot position value to obtain a search result. Because adjusting the initial slot position value through the slot rewriting index table makes the target slot position value more accurate, cases in which searching with the initial slot position value fails to recall the keyword are reduced, and the recall rate and accuracy of the search results are improved.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a search method and apparatus based on voice data, a computer device, and a storage medium.
Background
With the development of artificial intelligence technology, intelligent voice devices such as smart speakers are increasingly part of daily life, and users can issue requests by voice to obtain search results. In the prior art, a user sends a voice request, which is recognized through ASR (Automatic Speech Recognition) and NLU (Natural Language Understanding). In actual application scenarios, however, the voice request may contain wrong words, missing words, abbreviations, aliases, or accents, which cause speech recognition errors. The returned search results then have low relevance to the content the user wants to find, which reduces the search recall rate and degrades the search effect and user experience.
Disclosure of Invention
The embodiment of the application provides a searching method and device based on voice data, computer equipment and a storage medium, and aims to solve the technical problems that the searching recall rate is reduced and the searching effect and the user experience are influenced due to the low recognition accuracy of the voice data.
In one aspect, the present application provides a search method based on voice data, including:
acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data;
performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot position value;
acquiring a corresponding slot rewriting index table according to the field type, wherein the slot rewriting index table records un-rewritten first slot position values and rewritten second slot position values of the corresponding field type;
adjusting the initial slot position value according to the slot rewriting index table to obtain a target slot position value;
and searching based on the target slot position value to obtain a search result.
In another aspect, the present application provides a search apparatus based on voice data, including:
a conversion module, configured to acquire voice data to be processed and perform text conversion on it to obtain initial text data;
an analysis module, configured to perform semantic analysis on the initial text data and determine a corresponding field type and an initial slot position value;
an acquisition module, configured to acquire a corresponding slot rewriting index table according to the field type, where the slot rewriting index table records un-rewritten first slot position values and rewritten second slot position values of the corresponding field type;
a rewriting module, configured to adjust the initial slot position value according to the slot rewriting index table to obtain a target slot position value;
and a search module, configured to search based on the target slot position value to obtain a search result.
In another aspect, the present application provides a computer device including a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the above voice-data-based search method when executing the computer program.
In another aspect, the present application provides a computer-readable medium storing a computer program which, when executed by a processor, implements the steps of the above voice-data-based search method.
The embodiment of the application provides a search method based on voice data. The method first performs text conversion on the voice data to be processed to obtain initial text data, then performs semantic analysis on the initial text data to determine a corresponding field type and an initial slot position value. A slot rewriting index table corresponding to the field type is then obtained; the table records un-rewritten first slot position values and rewritten second slot position values of that field type. The initial slot position value is adjusted according to the slot rewriting index table to obtain a target slot position value, and a search is performed based on the target slot position value to obtain a search result. Adjusting the initial slot position value through the slot rewriting index table makes the target slot position value more accurate, which reduces cases in which searching with the initial slot position value fails to recall the keyword and improves the recall rate and accuracy of the search results.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Wherein:
FIG. 1 is a flow diagram of a method for voice data based searching in one embodiment;
FIG. 2 is a flow diagram of a method for generating a slot rewriting index table in one embodiment;
FIG. 3 is a flow diagram of a method for determining scores of first candidate slot position values and second candidate slot position values in one embodiment;
FIG. 4 is a flow diagram of a search result determination method in one embodiment;
FIG. 5 is a block diagram showing the structure of a speech data based search apparatus according to an embodiment;
FIG. 6 is a block diagram of a computer device in one embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, in one embodiment, a search method based on voice data is provided. The method can be applied to either a terminal or a server; this embodiment takes application to a server as an example. The method specifically includes the following steps:
Step 102, acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data.
The voice data to be processed is the voice data of a query instruction issued by a user. Specifically, the user may send real-time voice data of the query instruction to the server, so that the voice data to be processed is obtained directly at the server; alternatively, the voice data of the query instruction may be stored on the server in advance and retrieved from it. After the voice data to be processed is obtained, it can be converted into text through speech recognition to obtain the initial text data. The conversion proceeds as follows: Voice Activity Detection (VAD) is performed on the voice data to be processed to remove silence; the remaining voice data is divided into frames; features such as Linear Prediction Cepstral Coefficients (LPCC) or Mel-Frequency Cepstral Coefficients (MFCC) are extracted from the voice frames, converting the voice data into multidimensional vectors; the multidimensional vectors are input into an Acoustic Model (AM), which outputs phoneme data; the phoneme data is converted into corresponding characters using a dictionary; finally, the characters are input into a Language Model (LM) to obtain the probability of each character or word, and the characters that meet a probability threshold are output as the initial text data. In this embodiment, converting the voice data to be processed into initial text data allows subsequent processing to operate efficiently on text and avoids the complexity of processing voice data directly.
Step 104, performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot position value.
The field type (domain type) is the category of the domain to which the initial text data belongs, such as the movie and television domain or the news domain. The initial slot position value is a keyword or key phrase contained in the initial text data that has a clear definition or intent. Specifically, semantic analysis can be performed on the initial text data through a preset rule template, or through a text classification model based on statistical machine learning or on deep learning, to extract the corresponding field type and initial slot position value. A preset rule template is a rule template summarized by manually analyzing representative text data under the relevant intents or keywords in each field type. With this approach, the initial text data undergoes a series of processing steps such as word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and semantic classification, and the processed text is then matched against the preset rule templates to extract the corresponding field type and initial slot position value. Semantic analysis based on statistical machine learning extracts features of the initial text data, such as n-gram features, part-of-speech features, and entity type features; after feature extraction, a tf-idf (term frequency-inverse document frequency) vectorized representation is built, and algorithms such as support vector machines, logistic regression, or random forests are trained to obtain a semantic analysis model. The initial text data is input into the semantic analysis model to obtain the corresponding field type and initial slot position value. For example, if the initial text data is "i want to see a pig wearing", the corresponding field type is the movie domain and the corresponding initial slot position value is "pig wearing".
It can be understood that, in this embodiment, determining the field type and the initial slot position value corresponding to the initial text data extracts the keywords of the initial text data and reduces interference from irrelevant information, which helps improve the efficiency of subsequent text processing.
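As a minimal sketch of the rule-template approach described above, the following Python snippet matches text against per-field-type patterns and returns the field type and initial slot position value. The template patterns, the field type labels, and the function name `analyze` are illustrative assumptions and do not come from the application; a real system would also apply the word segmentation and entity recognition steps mentioned above.

```python
import re
from typing import Optional, Tuple

# Hypothetical rule templates: (field type, pattern with a named "slot" group).
RULE_TEMPLATES = [
    ("movie", re.compile(r"^i want to (?:see|watch) (?P<slot>.+)$", re.IGNORECASE)),
    ("news", re.compile(r"^(?:show|read) me the news about (?P<slot>.+)$", re.IGNORECASE)),
]


def analyze(text: str) -> Optional[Tuple[str, str]]:
    """Return (field_type, initial_slot_value) for the first matching template."""
    # Light normalization: trim whitespace and trailing sentence punctuation.
    normalized = text.strip().rstrip(".!?")
    for field_type, pattern in RULE_TEMPLATES:
        match = pattern.match(normalized)
        if match:
            return field_type, match.group("slot").strip()
    # No template matched; a caller might fall back to a statistical model here.
    return None
```

Returning `None` when nothing matches leaves room for the statistical or deep learning analysis to act as a fallback.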
Step 106, acquiring a corresponding slot rewriting index table according to the field type, wherein the slot rewriting index table records un-rewritten first slot position values and rewritten second slot position values of the corresponding field type.
The slot rewriting index table is a table configured in advance for rewriting initial slot position values of the corresponding field type; each field type corresponds to one slot rewriting index table. The table records the un-rewritten first slot position values and the rewritten second slot position values of the corresponding field type. A first slot position value is an un-rewritten slot position value that contains an error (such as a wrong word, a missing word, a replaced word, an alias, or an abbreviation) or is difficult to recognize (such as a newly coined word). A second slot position value is a preconfigured slot position value used to rewrite the first slot position value in order to improve search efficiency, or a preconfigured newly coined word, for example the slot position value of a correct movie name. As another example, a first slot position value is "piggy-pack" and the corresponding second slot position value is "piggy-pack". In this embodiment, obtaining the corresponding slot rewriting index table in advance allows the initial slot position value to be rewritten, which improves the accuracy of the initial slot position value and, in turn, the efficiency and recall rate of the subsequent search.
Step 108, adjusting the initial slot position value according to the slot rewriting index table to obtain a target slot position value.
The target slot position value is the slot position value obtained after the initial slot position value is adjusted. Specifically, the slot rewriting index table is searched using the initial slot position value. When a first slot position value identical to the initial slot position value is found, the initial slot position value is adjusted according to the second slot position value corresponding to that first slot position value, yielding the target slot position value. This makes the target slot position value more accurate, resolves problems such as wrong or missing characters in the voice data to be processed, and helps improve the search recall rate.
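The lookup described above can be sketched as a per-field-type dictionary that maps first slot position values to second slot position values. The table contents and the names `SLOT_REWRITE_INDEX` and `rewrite_slot` are hypothetical placeholders; the application does not specify how the table is stored.

```python
from typing import Dict

# Hypothetical tables, one per field type: first slot value -> second slot value.
SLOT_REWRITE_INDEX: Dict[str, Dict[str, str]] = {
    "movie": {"misheard title": "correct title"},
}


def rewrite_slot(field_type: str, initial_slot: str) -> str:
    """Return the target slot value for an initial slot value.

    If the table for the field type holds a first slot value identical to
    the initial slot value, its second slot value is returned; otherwise
    the initial slot value is kept unchanged.
    """
    table = SLOT_REWRITE_INDEX.get(field_type, {})
    return table.get(initial_slot, initial_slot)
```

Falling back to the initial slot position value when no entry matches is a design choice of this sketch, so that values that are already correct still reach the search step.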
Step 110, searching based on the target slot position value to obtain a search result.
Specifically, the target slot position value is used as the search keyword to obtain the search result. Because the target slot position value is the corrected slot position value, cases in which searching with the initial slot position value fails to recall the keyword are reduced, and the recall rate and accuracy of the search result are improved.
In the above voice-data-based search method, text conversion is first performed on the voice data to be processed to obtain initial text data, and semantic analysis is then performed on the initial text data to determine the corresponding field type and initial slot position value. A corresponding slot rewriting index table is obtained according to the field type; the table records un-rewritten first slot position values and rewritten second slot position values of that field type. The initial slot position value is adjusted according to the slot rewriting index table to obtain a target slot position value, and a search based on the target slot position value yields the search result. Adjusting the initial slot position value through the slot rewriting index table makes the target slot position value more accurate, which reduces cases in which searching with the initial slot position value fails to recall the keyword and improves the recall rate and accuracy of the search results.
In one embodiment, before the step of acquiring the corresponding slot rewriting index table according to the field type, the method further includes: generating a slot rewriting index table for each field type using a preset matching mode.
Specifically, for each field type, a plurality of search keywords of that field type are obtained from users' search logs and used as first slot position values, and the standard slot position values of that field type are obtained. The standard slot position values may be collected in advance from slot position values related to the field type, or may be determined from search results. The preset matching mode is a preconfigured way of matching a standard slot position value to a search keyword. It may be a matching mode based on model conversion, a matching mode based on plug-in conversion, or a combination of the two. As a preferred option in this embodiment, the two matching modes are combined to ensure that matching is accurate and reasonable, which improves the accuracy with which first slot position values are matched to standard slot position values in the slot rewriting index table.
As shown in FIG. 2, in one embodiment, the step of generating the slot rewriting index table for each field type using a preset matching mode includes:
Step 112C, selecting the standard slot position values corresponding to the top K first similarities as first candidate slot position values, and selecting the standard slot position values corresponding to the top L second similarities as second candidate slot position values, where K and L are natural numbers greater than 1;
Step 112E, determining, according to the scores, the second slot position value corresponding to the first slot position value from the first candidate slot position values and the second candidate slot position values.
The preset sample database stores the first slot position values, and the preset search database stores the standard slot position values. A standard slot position value is a rewritten, and therefore accurate, slot position value; the second slot position values are a subset of the standard slot position values. Specifically, for each first slot position value, the semantic similarity between the first slot position value and each standard slot position value is calculated using a first distance calculation method to obtain first similarities, and calculated again using a second distance calculation method to obtain second similarities. The first and second distance calculation methods may each be any similarity measure such as the Euclidean distance, cosine distance, Manhattan distance, Hamming distance, Mahalanobis distance, or Chebyshev distance. The first candidate slot position values are the K standard slot position values with the highest first similarity, and the second candidate slot position values are the L standard slot position values with the highest second similarity; K may be equal to L.
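Selecting the top K (or top L) standard slot position values by similarity can be sketched as follows. The function name `top_candidates` and the input shape (a mapping from standard slot position value to similarity) are assumptions made for illustration.

```python
import heapq
from typing import Dict, List


def top_candidates(similarities: Dict[str, float], k: int) -> List[str]:
    """Return the k standard slot values with the highest similarity.

    `similarities` maps each standard slot value to its similarity with
    one first slot value; the result is ordered from most to least similar.
    """
    return heapq.nlargest(k, similarities, key=similarities.get)
```

Calling the function once with the first similarities and K, and once with the second similarities and L, yields the first and second candidate slot position values respectively.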
The preset scoring rule is a preconfigured rule for evaluating how well each first candidate slot position value and each second candidate slot position value matches the first slot position value. The scoring rule may score each candidate according to how frequently users use it or how popular it is, in combination with its first similarity or second similarity, which makes the scoring of the first and second candidate slot position values more objective and accurate. It can be understood that, in this embodiment, scoring the first and second candidate slot position values quantifies the correlation between the standard slot position values and the first slot position value, and determining the second slot position value corresponding to the first slot position value from the candidates according to the scores improves the efficiency of obtaining the second slot position value.
In one embodiment, the step of calculating the semantic similarity between each first slot position value and each standard slot position value by using a first distance calculation method for each first slot position value to obtain a first similarity includes: converting the first slot position value into a first pinyin text by using the pinyin plug-in, and converting each standard slot position value into a standard pinyin text; and calculating Euclidean distances between the first pinyin text and each standard pinyin text, and determining the first similarity according to the Euclidean distances.
The pinyin plug-in (pinyin) is an Elasticsearch plug-in that converts text data into pinyin. For example, if the first slot position value is "piggy-wear", the first pinyin text is "xiaozhupeipeeii". Specifically, the first slot position value is converted into the first pinyin text using the pinyin plug-in, and each standard slot position value is converted into a standard pinyin text. The first pinyin text and each standard pinyin text are then converted into multidimensional vectors, and the Euclidean distance between the vector of the first pinyin text and the vector of each standard pinyin text is calculated. The specific calculation formula is as follows:
$$d(x_i, y_i) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}$$
where d(x_i, y_i) is the Euclidean distance between the first pinyin text and a standard pinyin text, x_i is the i-th component of the vector of the first pinyin text, y_i is the i-th component of the vector of the standard pinyin text, and N is the dimension of the vectors of the first pinyin text and of each standard pinyin text. The first similarity is then determined from the Euclidean distance, and the calculation formula is as follows:
where sim1(x_i, y_i) is the first similarity. In this embodiment, converting the first slot position value and each standard slot position value through the pinyin plug-in makes it possible to match text data that contains pronunciation errors, and determining the first similarity between the converted first slot position value and each converted standard slot position value through the Euclidean distance keeps the calculation simple and fast, so the similarity between the first slot position value and each standard slot position value is measured efficiently.
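A minimal sketch of the pinyin-based first similarity follows. Two parts are assumptions rather than the application's method: the vectorization (letter counts over a to z, since the text does not say how pinyin text becomes a vector) and the mapping from distance to similarity (here 1 / (1 + d), chosen only because it decreases as the distance grows). The pinyin strings are assumed to have been produced already, for example by the pinyin plug-in.

```python
import math
from collections import Counter
from string import ascii_lowercase
from typing import List, Sequence


def pinyin_vector(pinyin: str) -> List[float]:
    """Assumed vectorization: count of each letter a-z in the pinyin text."""
    counts = Counter(pinyin.lower())
    return [float(counts[ch]) for ch in ascii_lowercase]


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two vectors of the same dimension N."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def first_similarity(first_pinyin: str, standard_pinyin: str) -> float:
    """Assumed similarity 1 / (1 + d): equals 1.0 when the distance is 0."""
    d = euclidean_distance(pinyin_vector(first_pinyin), pinyin_vector(standard_pinyin))
    return 1.0 / (1.0 + d)
```

Letter counts ignore character order, so a production system would more likely use a vectorization that preserves syllable order; the sketch only illustrates the distance and similarity steps.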
In one embodiment, the step of calculating the semantic similarity between each first slot position value and each standard slot position value by using a second distance calculation method to obtain a second similarity includes: inputting the first slot position value into a trained text analysis model to obtain a first embedding, and respectively inputting each standard slot position value into the trained text analysis model to obtain each standard embedding; and calculating cosine distances between the first embedding and each standard embedding, and determining a second similarity according to the cosine distances.
The trained text analysis model is a pre-trained machine learning model that converts text into vectors, for example a word2vec model, and an embedding is the feature vector of a text. Specifically, the first slot position value is input into the trained text analysis model and the output of the model is taken as the first embedding; each standard slot position value is input into the trained text analysis model and the output of the model is taken as the corresponding standard embedding. The cosine distance between the first embedding and each standard embedding is then calculated through the following formula:
where T(a_j, b_j) is the cosine distance between the first embedding and a standard embedding, a_j is the j-th component of the first embedding, b_j is the j-th component of the standard embedding, and M is the dimension of the first embedding and of each standard embedding. The second similarity is then determined from the cosine distance, and the calculation formula is as follows:
$$\mathrm{sim2}(a_j, b_j) = 1 - T(a_j, b_j),$$
where sim2(a_j, b_j) is the second similarity. In this embodiment, converting the first slot position value and each standard slot position value through the trained text analysis model makes it possible to match un-rewritten text data that contains wrong characters, missing characters, and similar errors, and determining the second similarity between the converted first slot position value and each converted standard slot position value through the cosine distance keeps the calculation simple and fast, so the similarity between the first slot position value and each standard slot position value is measured efficiently.
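The second similarity can be sketched as below. The formula for T is not reproduced in this text, so the sketch assumes the conventional definition of cosine distance, one minus the cosine of the angle between the two embeddings; under that assumption sim2 = 1 - T is the ordinary cosine similarity. The embeddings are taken as plain lists of floats, standing in for the output of a trained model such as word2vec.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed definition: T = 1 - (a . b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(aj * bj for aj, bj in zip(a, b))
    norm_a = math.sqrt(sum(aj * aj for aj in a))
    norm_b = math.sqrt(sum(bj * bj for bj in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def second_similarity(first_embedding: Sequence[float],
                      standard_embedding: Sequence[float]) -> float:
    """sim2 = 1 - T, as stated in the text."""
    return 1.0 - cosine_distance(first_embedding, standard_embedding)
```

Because the cosine measure depends only on direction, two embeddings that differ only in magnitude receive a similarity of 1.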
As shown in fig. 3, in an embodiment, the step of determining the score of each first candidate slot position value and each second candidate slot position value according to a preset scoring rule includes:
Step 112D1, determining a usage frequency score for each first candidate slot position value and each second candidate slot position value according to its usage frequency;
Step 112D2, determining a popularity score for each first candidate slot position value and each second candidate slot position value according to its popularity;
Step 112D3, inputting each first candidate slot position value into a preset verification classifier to determine a first verification result for each first candidate slot position value, and obtaining a first accuracy from the first verification results; inputting each second candidate slot position value into the preset verification classifier to determine a second verification result for each second candidate slot position value, and obtaining a second accuracy from the second verification results;
Step 112D4, determining the score of each first candidate slot position value according to the first accuracy, the first similarity of the first candidate slot position value, the usage frequency score, the popularity score, the frequency weight corresponding to the usage frequency, and the heat weight corresponding to the popularity (heat); and determining the score of each second candidate slot position value according to the second accuracy, the second similarity of the second candidate slot position value, the usage frequency score, the popularity score, the frequency weight corresponding to the usage frequency, and the heat weight corresponding to the popularity (heat).
The usage frequency refers to how often a standard slot position value occurs in searches, and may be counted from user search logs within a preset time period (e.g. one year). The usage frequency score of each first candidate slot position value and each second candidate slot position value is then determined by rank. For example, for the K first candidate slot position values, the usage frequency score of the first candidate slot position value with the highest usage frequency is set to 1, and the usage frequency scores of the remaining first candidate slot position values are set to (K-1)/K, (K-2)/K, ..., 1/K in descending order of usage frequency; for the L second candidate slot position values, the usage frequency score of the second candidate slot position value with the highest usage frequency is set to 1, and the usage frequency scores of the remaining second candidate slot position values are set to (L-1)/L, (L-2)/L, ..., 1/L in descending order of usage frequency. The heat reflects how many requests for a standard slot position value occur across different users, that is, how likely the standard slot position value is to be searched, and may be determined by counting the number of times the standard slot position value is searched within a preset time period (e.g. 3 months). The heat score is assigned in the same way: for the K first candidate slot position values, the heat score of the first candidate slot position value with the highest heat is set to 1, and the heat scores of the remaining first candidate slot position values are set to (K-1)/K, (K-2)/K, ..., 1/K in descending order of heat; for the L second candidate slot position values, the heat score of the second candidate slot position value with the highest heat is set to 1, and the heat scores of the remaining second candidate slot position values are set to (L-1)/L, (L-2)/L, ..., 1/L in descending order of heat.
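The rank-based scoring described above can be sketched as follows. This is only an illustrative sketch: the function name `rank_scores` and the dictionary input are hypothetical, and the handling of ties (candidates with equal counts keep their input order) is an assumption, since the text does not address ties.

```python
def rank_scores(counts):
    """Turn raw counts into rank scores n/n, (n-1)/n, ..., 1/n.

    `counts` maps each candidate slot position value to its usage
    frequency (or heat) count. The candidate with the largest count
    receives 1, the next (n-1)/n, and so on down to 1/n.
    """
    n = len(counts)
    # sorted() is stable, so equal counts keep their input order (assumption).
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {cand: (n - rank) / n for rank, cand in enumerate(ordered)}
```

The same helper can serve for both the usage frequency score and the heat score, because the two differ only in which counts are passed in.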
The preset verification classifier is a classifier, such as an SVM or a decision tree, that is trained in advance and is used to determine the probability that the first similarity and the second similarity are correct. The frequency weight is the weight of the usage frequency and the heat weight is the weight of the heat; the two weights sum to 1, for example a frequency weight of 0.6 and a heat weight of 0.4. The score of each first candidate slot position value and the score of each second candidate slot position value may be calculated by the following formulas:
Z_{1p} = x * s_{1p} * (α * F_{1p} + β * H_{1p})
Z_{2q} = y * s_{2q} * (α * F_{2q} + β * H_{2q}),
where Z_{1p} is the score of the p-th first candidate slot position value, p ∈ [1, K]; x is the first accuracy; s_{1p} is the first similarity of the p-th first candidate slot position value; α is the frequency weight; β is the heat weight; F_{1p} is the usage frequency score of the p-th first candidate slot position value; H_{1p} is the heat score of the p-th first candidate slot position value; Z_{2q} is the score of the q-th second candidate slot position value, q ∈ [1, L]; y is the second accuracy; s_{2q} is the second similarity of the q-th second candidate slot position value; F_{2q} is the usage frequency score of the q-th second candidate slot position value; and H_{2q} is the heat score of the q-th second candidate slot position value. It can be understood that, in this embodiment, each first candidate slot position value and each second candidate slot position value is scored along two dimensions, usage frequency and heat, and the score is further combined with the accuracy of the corresponding similarity. The scores of the candidate slot position values are thereby quantized accurately, which improves matching accuracy and matching efficiency and, in turn, the efficiency of generating the slot position rewriting index table.
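A worked instance of the scoring formulas above may help. The sketch below evaluates accuracy * similarity * (α * F + β * H) for a single candidate; the function and parameter names are hypothetical, and the default weights 0.6 and 0.4 are simply the example values mentioned in the text.

```python
def candidate_score(accuracy, similarity, freq_score, heat_score,
                    freq_weight=0.6, heat_weight=0.4):
    """Score of one candidate: accuracy * similarity * (alpha*F + beta*H)."""
    # The text requires the frequency weight and heat weight to sum to 1.
    if abs(freq_weight + heat_weight - 1.0) > 1e-9:
        raise ValueError("freq_weight and heat_weight must sum to 1")
    weighted = freq_weight * freq_score + heat_weight * heat_score
    return accuracy * similarity * weighted


# Example with made-up inputs: accuracy 0.9, similarity 0.8,
# usage frequency score 1.0, heat score 0.5.
# 0.6 * 1.0 + 0.4 * 0.5 = 0.8, and 0.9 * 0.8 * 0.8 = 0.576.
example = candidate_score(0.9, 0.8, 1.0, 0.5)
```

The same function covers both formulas: pass the first accuracy and first similarity for a first candidate, or the second accuracy and second similarity for a second candidate.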
As shown in fig. 4, in an embodiment, the step of performing a search based on the target slot location value to obtain a search result includes:
step 110B, if search information corresponding to the target slot position value is found, taking the search information as the search result;
and step 110C, if no search information corresponding to the target slot position value is found, acquiring corresponding recommendation information by adopting a preset recommendation method based on the target slot position value, and determining the recommendation information as the search result.
Specifically, the target slot position value is used as a search keyword, and a search is performed in a preset search resource library. When search information corresponding to the target slot position value is found, the search information is taken as the search result; it is worth noting that the search result can be converted into voice data and played back, which improves the user experience. When no search information corresponding to the target slot position value is found, a preset recommendation method is used to obtain corresponding recommendation information according to the target slot position value, and the recommendation information is determined as the search result. The preset recommendation method may, for example, take the information with the highest heat in the corresponding field type, determined from the field type of the target slot position value, as the recommendation information, which improves the user's search experience.
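The search-then-recommend behaviour described above can be sketched as follows. The sketch assumes, purely for illustration, that the search resource library and the per-field "hottest" recommendations are in-memory dictionaries; the function name and both data structures are hypothetical.

```python
def search_with_fallback(target_slot_value, field_type,
                         resource_library, hottest_by_field):
    """Return search hits, or fall back to the hottest items of the field.

    resource_library: hypothetical dict, slot value -> list of results.
    hottest_by_field: hypothetical dict, field type -> list of
        recommended items, assumed to be pre-sorted by heat.
    """
    hits = resource_library.get(target_slot_value)
    if hits:
        # Search information was found: use it as the search result.
        return hits
    # Nothing found: recommend the hottest content of the same field type.
    return hottest_by_field.get(field_type, [])
```

For instance, a music query whose target slot position value has no entry in the library would return the hottest music items instead of an empty result.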
As shown in fig. 5, in one embodiment, a search apparatus based on voice data is provided, including:
a conversion module 502, configured to obtain voice data to be processed, and perform text conversion on the voice data to be processed to obtain initial text data;
an analysis module 504, configured to perform semantic analysis on the initial text data, and determine a corresponding domain type and an initial slot position value;
an obtaining module 506, configured to obtain a corresponding slot position rewriting index table according to the domain type, where the slot position rewriting index table records, for the corresponding domain type, a first slot position value which is not rewritten and a second slot position value which is rewritten;
a rewriting module 508, configured to adjust the initial slot position value according to the slot position rewriting index table to obtain a target slot position value;
and a searching module 510, configured to perform a search based on the target slot value to obtain a search result.
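The role of the obtaining and rewriting modules can be illustrated with a small lookup sketch. It assumes the slot position rewriting index table is a per-domain mapping from an unrewritten first slot position value to its rewritten second slot position value, and that a value absent from the table is passed through unchanged; both the data layout and the pass-through behaviour are assumptions, and the sample entry is invented for illustration.

```python
def rewrite_slot_value(initial_slot_value, domain_type, index_tables):
    """Adjust an initial slot value using the domain's rewrite index table.

    index_tables: hypothetical dict, domain type -> {first value: second value}.
    """
    table = index_tables.get(domain_type, {})
    # Assumption: values with no entry in the table are left as they are.
    return table.get(initial_slot_value, initial_slot_value)


# Invented example: a misrecognised singer name mapped to the standard one.
tables = {"music": {"周杰轮": "周杰伦"}}
target = rewrite_slot_value("周杰轮", "music", tables)
```

The resulting target slot position value is what the searching module would then use as its keyword.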
In one embodiment, the apparatus for searching based on voice data further comprises a generating module, configured to generate the slot position rewriting index table for each domain type by adopting a preset matching manner.
In one embodiment, the generating module comprises:
the acquisition unit is used for acquiring the first slot position value corresponding to the field type from a preset sample database and acquiring a standard slot position value corresponding to the field type from a preset search database, wherein the second slot position value is a subset of the standard slot position value;
a calculating unit, configured to calculate, for each first slot position value, a semantic similarity between each first slot position value and each standard slot position value by using a first distance calculation method to obtain a first similarity, and calculate, for each first slot position value, a semantic similarity between each first slot position value and each standard slot position value by using a second distance calculation method to obtain a second similarity;
the selecting unit is used for selecting the top K standard slot position values ranked by the first similarity as first candidate slot position values, and selecting the top L standard slot position values ranked by the second similarity as second candidate slot position values, wherein K and L are natural numbers larger than 1;
a first determining unit, configured to determine scores of the first candidate slot position values and the second candidate slot position values according to a preset scoring rule;
a second determining unit, configured to determine, according to each score, the second slot position value corresponding to the first slot position value from the first candidate slot position value and the second candidate slot position value.
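The selecting and second determining units can be sketched as below. This is a hedged illustration only: it assumes that "top K" means the K largest similarities, and that the second slot position value is the single highest-scoring candidate across both candidate lists. The text only says the choice is made "according to each score", so the merge rule and all names here are assumptions.

```python
import heapq


def top_candidates(similarities, k):
    """Return the k standard slot values with the largest similarity."""
    return heapq.nlargest(k, similarities, key=similarities.get)


def pick_second_slot_value(first_scores, second_scores):
    """Pick the best-scoring candidate across both candidate lists.

    Assumption: a candidate present in both lists keeps its higher score.
    """
    merged = {}
    for scores in (first_scores, second_scores):
        for cand, score in scores.items():
            merged[cand] = max(score, merged.get(cand, float("-inf")))
    return max(merged, key=merged.get)
```

Running this once per first slot position value would yield the pairs recorded in the slot position rewriting index table.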
In one embodiment, the computing unit comprises:
the conversion subunit is used for converting the first slot position value into a first pinyin text by using a pinyin plug-in and converting each standard slot position value into a standard pinyin text;
and the first calculating subunit is used for calculating the Euclidean distance between the first pinyin text and each standard pinyin text to obtain the first similarity.
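One possible way to compare pinyin texts by Euclidean distance is sketched below. This section does not specify how a pinyin text is turned into a vector, so the letter-count vectors used here are an assumption, as is the mapping of distance d to similarity 1/(1+d). The pinyin plug-in itself is not modelled: the inputs are assumed to be pinyin strings already.

```python
import math
import string
from collections import Counter


def letter_vector(pinyin_text):
    """Count of each letter a-z in the pinyin text (assumed vectorisation)."""
    counts = Counter(ch for ch in pinyin_text.lower()
                     if ch in string.ascii_lowercase)
    return [counts[ch] for ch in string.ascii_lowercase]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def first_similarity(first_pinyin, standard_pinyin):
    """Map distance d to a similarity in (0, 1] via 1 / (1 + d) (assumption)."""
    d = euclidean_distance(letter_vector(first_pinyin),
                           letter_vector(standard_pinyin))
    return 1.0 / (1.0 + d)
```

Two slot position values that are written with different characters but pronounced identically, such as homophones produced by speech recognition, yield the same pinyin string and therefore a similarity of 1.0 under this sketch.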
In one embodiment, the computing unit further comprises:
the analysis subunit is used for inputting the first slot position value into a trained text analysis model to obtain a first embedding, and respectively inputting each standard slot position value into the trained text analysis model to obtain each standard embedding;
and the second calculating subunit is used for calculating cosine distances between the first embedding and each standard embedding, and determining the second similarity according to the cosine distances.
In one embodiment, the first determination unit includes:
a first determining subunit, configured to determine, according to usage frequencies of the first candidate slot position values and the second candidate slot position values, usage frequency scores of the first candidate slot position values and the second candidate slot position values;
a second determining subunit, configured to determine a heat score of each of the first candidate slot position value and the second candidate slot position value according to the heat of each of the first candidate slot position value and the second candidate slot position value;
a third determining subunit, configured to input each of the first candidate slot position values into a preset verification classifier, determine a first verification result of each of the first candidate slot position values, obtain a first accuracy according to the first verification result, input each of the second candidate slot position values into the preset verification classifier, determine a second verification result of each of the second candidate slot position values, and obtain a second accuracy according to the second verification result;
a fourth determining subunit, configured to determine a score of each of the first candidate slot position values according to the first accuracy, the first similarity of each of the first candidate slot position values, the usage frequency score, a frequency weight corresponding to the usage frequency, and a heat weight corresponding to the heat, and determine a score of each of the second candidate slot position values according to the second accuracy, the second similarity of each of the second candidate slot position values, the usage frequency score, the frequency weight corresponding to the usage frequency, and the heat weight corresponding to the heat.
In one embodiment, the search module comprises:
the searching unit is used for searching in a preset searching resource library according to the target slot position value;
a first result obtaining unit, configured to, if search information corresponding to the target slot position value is found, take the search information as the search result;
and the second result acquisition unit is used for acquiring corresponding recommendation information by adopting a preset recommendation method based on the target slot position value if no search information corresponding to the target slot position value is found, and determining the recommendation information as the search result.
FIG. 6 is a diagram illustrating an internal structure of a computer device in one embodiment. The computer device may specifically be a server, including, but not limited to, a high performance computer and a cluster of high performance computers. As shown in fig. 6, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement a search method based on voice data. The internal memory may also have stored therein a computer program that, when executed by the processor, causes the processor to perform a search method based on voice data. Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, the search method based on voice data provided by the present application can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in fig. 6. The memory of the computer device may store therein the respective program modules constituting the voice data-based search apparatus, for example, the conversion module 502, the analysis module 504, the obtaining module 506, the rewriting module 508, and the search module 510.
A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data; performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot position value; acquiring a corresponding slot position rewriting index table according to the field type, wherein the slot position rewriting index table records a first slot position value which is not rewritten and a second slot position value which is rewritten of the corresponding field type; adjusting the initial slot position value according to the slot position rewriting index table to obtain a target slot position value; and searching based on the target slot position value to obtain a search result.
A computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of: acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data; performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot position value; acquiring a corresponding slot position rewriting index table according to the field type, wherein the slot position rewriting index table records a first slot position value which is not rewritten and a second slot position value which is rewritten of the corresponding field type; adjusting the initial slot position value according to the slot position rewriting index table to obtain a target slot position value; and searching based on the target slot position value to obtain a search result.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium, and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of these technical features, they should be considered to be within the scope of this specification.
The above-mentioned embodiments only express several embodiments of the present application, and their description is specific and detailed, but should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A method for searching based on voice data, comprising:
acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data;
performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot value;
acquiring a corresponding slot position rewriting index table according to the field type, wherein the slot position rewriting index table records a first slot position value which is not rewritten and a second slot position value which is rewritten of the corresponding field type;
adjusting the initial slot position value according to the slot position rewriting index table to obtain a target slot position value;
and searching based on the target slot position value to obtain a search result.
2. The method of claim 1, wherein before the step of acquiring the corresponding slot position rewriting index table according to the field type, the method further comprises:
and generating the slot position rewriting index table by adopting a preset matching mode aiming at each field type.
3. The method for searching based on voice data according to claim 1, wherein the step of generating the slot position rewriting index table for each of the field types by adopting a preset matching mode comprises:
acquiring the first slot position value corresponding to the field type from a preset sample database, and acquiring a standard slot position value corresponding to the field type from a preset search database, wherein the second slot position value is a subset of the standard slot position value;
aiming at each first slot position value, calculating the semantic similarity between each first slot position value and each standard slot position value by adopting a first distance calculation method to obtain a first similarity, and calculating the semantic similarity between each first slot position value and each standard slot position value by adopting a second distance calculation method to obtain a second similarity;
selecting the first K standard slot position values corresponding to the first similarity to determine as a first candidate slot position value, and selecting the first L standard slot position values corresponding to the second similarity to determine as a second candidate slot position value, wherein K and L are both natural numbers larger than 1;
respectively determining scores of each first candidate slot position value and each second candidate slot position value according to a preset scoring rule;
according to each score, determining the second slot value corresponding to the first slot value from the first candidate slot value and the second candidate slot value.
4. The method as claimed in claim 3, wherein the step of calculating the semantic similarity between each of the first slot values and each of the standard slot values by using a first distance calculation method for each of the first slot values to obtain a first similarity comprises:
converting the first slot position value into a first pinyin text by using a pinyin plug-in, and converting each standard slot position value into a standard pinyin text;
and calculating the Euclidean distance between the first pinyin text and each standard pinyin text, and determining the first similarity according to the Euclidean distance.
5. The method for searching based on voice data according to claim 3, wherein the step of calculating the semantic similarity between each of the first slot position values and each of the standard slot position values by using the second distance calculation method to obtain the second similarity comprises:
inputting the first slot position value into a trained text analysis model to obtain a first embedding, and respectively inputting each standard slot position value into the trained text analysis model to obtain each standard embedding;
and calculating cosine distances between the first embedding and each standard embedding, and determining the second similarity according to the cosine distances.
6. The method of claim 3, wherein the step of determining the score of each of the first candidate slot position values and each of the second candidate slot position values according to a preset scoring rule comprises:
determining a usage frequency score of each of the first candidate slot position value and the second candidate slot position value according to the usage frequency of each of the first candidate slot position value and the second candidate slot position value;
determining a heat degree score of each of the first candidate slot position value and the second candidate slot position value according to the heat degree of each of the first candidate slot position value and the second candidate slot position value;
respectively inputting each first candidate slot position value into a preset verification classifier, determining a first verification result of each first candidate slot position value, obtaining a first accuracy according to the first verification result, respectively inputting each second candidate slot position value into the preset verification classifier, determining a second verification result of each second candidate slot position value, and obtaining a second accuracy according to the second verification result;
determining the score of each first candidate slot position value according to the first accuracy, the first similarity of each first candidate slot position value, the score of the use frequency, the frequency weight corresponding to the use frequency and the heat weight corresponding to the heat, and determining the score of each second candidate slot position value according to the second accuracy, the second similarity of each second candidate slot position value, the score of the use frequency, the frequency weight corresponding to the use frequency and the heat weight corresponding to the heat.
7. The method of claim 6, wherein the step of searching based on the target slot value to obtain the search result comprises:
searching in a preset search resource library according to the target slot position value;
if search information corresponding to the target slot position value is found, taking the search information as the search result;
if no search information corresponding to the target slot position value is found, acquiring corresponding recommendation information by adopting a preset recommendation method based on the target slot position value, and determining the recommendation information as the search result.
8. A search apparatus based on voice data, comprising:
the conversion module is used for acquiring voice data to be processed and performing text conversion on the voice data to be processed to obtain initial text data;
the analysis module is used for performing semantic analysis on the initial text data and determining a corresponding field type and an initial slot position value;
the acquisition module is used for acquiring a corresponding slot position rewriting index table according to the field type, and the slot position rewriting index table records a first slot position value which is not rewritten and a second slot position value which is rewritten of the corresponding field type;
the rewriting module is used for adjusting the initial slot position value according to the slot position rewriting index table to obtain a target slot position value;
and the searching module is used for searching based on the target slot position value to obtain a searching result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method for searching based on voice data according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of a method for searching based on speech data according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210195811.6A CN114661862A (en) | 2022-03-01 | 2022-03-01 | Voice data based search method and device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210195811.6A CN114661862A (en) | 2022-03-01 | 2022-03-01 | Voice data based search method and device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114661862A (en) | 2022-06-24 |
Family
ID=82028363
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210195811.6A Pending CN114661862A (en) | 2022-03-01 | 2022-03-01 | Voice data based search method and device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114661862A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115376153A (en) * | 2022-08-31 | 2022-11-22 | 南京擎盾信息科技有限公司 | Contract comparison method and device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||