CN114661862A - Voice data based search method and device, computer equipment and storage medium - Google Patents

Publication number
CN114661862A
CN114661862A CN202210195811.6A CN202210195811A CN114661862A CN 114661862 A CN114661862 A CN 114661862A CN 202210195811 A CN202210195811 A CN 202210195811A CN 114661862 A CN114661862 A CN 114661862A
Authority
CN
China
Prior art keywords
slot position
position value
slot
value
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210195811.6A
Other languages
Chinese (zh)
Inventor
孙瑜希
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL New Technology Co Ltd
Priority to CN202210195811.6A
Publication of CN114661862A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The embodiments of the application provide a voice data based search method and device, computer equipment and a storage medium. The method first performs text conversion on voice data to be processed to obtain initial text data, and then performs semantic analysis on the initial text data to determine a corresponding field type and an initial slot position value. A corresponding slot rewriting index table is then acquired according to the field type; the slot rewriting index table records, for the corresponding field type, first slot position values that have not been rewritten and second slot position values that have been rewritten. The initial slot position value is adjusted according to the slot rewriting index table to obtain a target slot position value, and a search is performed based on the target slot position value to obtain a search result. Adjusting the initial slot position value through the slot rewriting index table makes the target slot position value more accurate, which alleviates the problem that keywords cannot be recalled when searching with the initial slot position value and improves the recall rate and accuracy of the search results.

Description

Voice data based search method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a search method and apparatus based on voice data, a computer device, and a storage medium.
Background
With the development of artificial intelligence technology, intelligent voice devices such as smart speakers are gradually entering daily life, and users can input requests by voice to obtain search results. In the prior art, a user issues a voice request, which is recognized through ASR (Automatic Speech Recognition) and NLU (Natural Language Understanding). In actual application scenarios, however, the voice request may contain wrong words, missing words, abbreviations, aliases, or accents, which cause speech recognition errors. The returned search results then have low relevance to the content the user wants, which reduces the recall rate of the search and harms the search effect and the user experience.
Disclosure of Invention
The embodiment of the application provides a searching method and device based on voice data, computer equipment and a storage medium, and aims to solve the technical problems that the searching recall rate is reduced and the searching effect and the user experience are influenced due to the low recognition accuracy of the voice data.
In one aspect, the present application provides a search method based on voice data, including:
acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data;
performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot position value;
acquiring a corresponding slot rewriting index table according to the field type, wherein the slot rewriting index table records, for the corresponding field type, a first slot position value which is not rewritten and a second slot position value which is rewritten;
adjusting the initial slot position value according to the slot rewriting index table to obtain a target slot position value;
and searching based on the target slot position value to obtain a search result.
In one aspect, the present application provides a search apparatus based on voice data, including:
the conversion module is used for acquiring voice data to be processed and performing text conversion on the voice data to be processed to obtain initial text data;
the analysis module is used for carrying out semantic analysis on the initial text data and determining a corresponding field type and an initial slot position value;
the acquisition module is used for acquiring a corresponding slot rewriting index table according to the field type, wherein the slot rewriting index table records, for the corresponding field type, a first slot position value which is not rewritten and a second slot position value which is rewritten;
the rewriting module is used for adjusting the initial slot position value according to the slot rewriting index table to obtain a target slot position value;
and the searching module is used for searching based on the target slot position value to obtain a searching result.
In one aspect, the present application provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps in the above-mentioned search method based on voice data when executing the computer program.
In one aspect, the present application provides a computer readable medium storing a computer program, which when executed by a processor, implements the steps in the above-mentioned voice data-based search method.
The embodiments of the application provide a search method based on voice data. The method first performs text conversion on voice data to be processed to obtain initial text data, and then performs semantic analysis on the initial text data to determine a corresponding field type and an initial slot position value. A corresponding slot rewriting index table is then acquired according to the field type; the slot rewriting index table records, for the corresponding field type, first slot position values that have not been rewritten and second slot position values that have been rewritten. The initial slot position value is adjusted according to the slot rewriting index table to obtain a target slot position value, and a search is performed based on the target slot position value to obtain a search result. Adjusting the initial slot position value through the slot rewriting index table makes the target slot position value more accurate, which alleviates the problem that keywords cannot be recalled when searching with the initial slot position value and improves the recall rate and accuracy of the search results.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Wherein:
FIG. 1 is a flow diagram of a method for voice data based searching in one embodiment;
FIG. 2 is a flow diagram of a method for generating the slot rewriting index table in one embodiment;
FIG. 3 is a flow diagram of a method for determining the scores of the first candidate slot position values and the second candidate slot position values in one embodiment;
FIG. 4 is a flow diagram of a search result determination method in one embodiment;
FIG. 5 is a block diagram showing the structure of a voice data based search apparatus according to an embodiment;
FIG. 6 is a block diagram of a computer device in one embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, in an embodiment, a search method based on voice data is provided, and the search method based on voice data can be applied to both a terminal and a server, and the embodiment is exemplified by being applied to the server. The searching method based on the voice data specifically comprises the following steps:
Step 102, acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data.
The voice data to be processed refers to the voice data of a query instruction issued by a user. Specifically, the user sends the voice data of the query instruction to the server in real time, so that the voice data to be processed is obtained directly at the server; alternatively, the voice data of the query instruction is stored in the server in advance and is then obtained from the server. After the voice data to be processed is obtained, it can be converted into text data through a speech recognition technology to obtain the initial text data. The specific conversion process is as follows: Voice Activity Detection (VAD) is performed on the voice data to be processed to remove silence; the voice data with silence removed is then divided into voice frames; features such as Linear Prediction Cepstral Coefficients (LPCC) or Mel-Frequency Cepstral Coefficients (MFCC) are extracted from the voice frames, converting the voice data into multidimensional vectors; the multidimensional vectors are input into an Acoustic Model (AM), which outputs phoneme data; the phoneme data is converted into corresponding characters using a dictionary; finally, the characters are input into a Language Model (LM) to obtain the probability of each character or word, and the characters that meet a probability threshold are output as the initial text data. In this embodiment, converting the voice data to be processed into initial text data allows subsequent processing to be performed efficiently on text data and avoids the complexity of processing voice data directly.
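As a minimal illustration of the framing and silence-removal stages described above, the following Python sketch splits a sample sequence into overlapping frames and discards low-energy frames. It is only a sketch under stated assumptions: the function names, the mean-energy test and the threshold are hypothetical, and the energy test is applied per frame as a simplification, whereas the embodiment performs VAD before framing. Feature extraction, the acoustic model and the language model are not shown.

```python
from typing import List


def frame_signal(samples: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split a sample sequence into fixed-length frames taken every `hop` samples."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def drop_silent_frames(frames: List[List[float]], energy_threshold: float) -> List[List[float]]:
    """Keep only frames whose mean energy exceeds the threshold (a crude, assumed VAD)."""
    kept = []
    for frame in frames:
        energy = sum(s * s for s in frame) / len(frame)
        if energy > energy_threshold:
            kept.append(frame)
    return kept


# Hypothetical input: 8 silent samples followed by 8 loud samples.
samples = [0.0] * 8 + [1.0] * 8
frames = frame_signal(samples, frame_len=4, hop=4)
voiced = drop_silent_frames(frames, energy_threshold=0.1)
```

In a real system the retained frames would then be passed to a feature extractor such as MFCC; that step depends on the chosen signal-processing library and is left out here.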
Step 104, performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot position value.
The field type refers to the category of the field to which the initial text data belongs, such as the movie and television field or the news field. The initial slot position value refers to a keyword or key phrase contained in the initial text data that is clearly defined or expresses an intent. Specifically, semantic analysis may be performed on the initial text data through a preset rule template, or through a text classification model based on statistical machine learning or deep learning, to extract the corresponding field type and initial slot position value. The preset rule template is a rule template summarized by manually analyzing representative text data for the relevant intents or keywords in each field type. The initial text data is subjected to a series of processing steps such as word segmentation, part-of-speech tagging, named entity recognition, dependency syntax analysis and semantic classification, and the processed initial text data is matched against the preset rule template to extract the corresponding field type and initial slot position value. Semantic analysis based on statistical machine learning extracts features of the initial text data, such as n-gram features, part-of-speech features and entity type features; after the features are extracted, they are vectorized with tf-idf (term frequency-inverse document frequency); algorithms such as a support vector machine, logistic regression or random forest are then trained to obtain a semantic analysis model, and the initial text data is input into the semantic analysis model to obtain the corresponding field type and initial slot position value. For example, if the initial text data is "I want to see a pig wearing", the corresponding field type is the movie field, and the corresponding initial slot position value is "pig wearing".
It can be understood that, in this embodiment, determining the field type and the initial slot position value corresponding to the initial text data extracts the keywords of the initial text data, reduces interference from irrelevant information, and helps improve the efficiency of subsequent text processing.
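The rule-template variant of the semantic analysis can be sketched as follows. This is a minimal Python sketch, assuming English templates expressed as regular expressions; the template list, the field type labels and the function name `analyze` are hypothetical and are not taken from the embodiment, which also applies word segmentation, part-of-speech tagging and other preprocessing before matching.

```python
import re
from typing import Optional, Tuple

# Hypothetical rule templates: (field type, pattern with a named "slot" group).
RULE_TEMPLATES = [
    ("movie", re.compile(r"^i want to (?:see|watch) (?P<slot>.+)$", re.IGNORECASE)),
    ("news", re.compile(r"^(?:show|read) me (?:the )?news about (?P<slot>.+)$", re.IGNORECASE)),
]


def analyze(text: str) -> Optional[Tuple[str, str]]:
    """Return (field type, initial slot value) for the first matching template, else None."""
    cleaned = text.strip()
    for field_type, pattern in RULE_TEMPLATES:
        match = pattern.match(cleaned)
        if match:
            return field_type, match.group("slot").strip()
    return None


result = analyze("I want to watch Peppa Pig")  # matches the hypothetical "movie" template
```

Templates are tried in order, so more specific patterns would need to precede more general ones; a statistical or deep learning classifier would replace the loop entirely.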
Step 106, acquiring a corresponding slot rewriting index table according to the field type, wherein the slot rewriting index table records, for the corresponding field type, a first slot position value which is not rewritten and a second slot position value which is rewritten.
The slot rewriting index table is a table configured in advance for rewriting the initial slot position value of the corresponding field type, and each field type corresponds to one slot rewriting index table. The slot rewriting index table records the un-rewritten first slot position values and the rewritten second slot position values of the corresponding field type. A first slot position value is an un-rewritten slot position value that contains an error (such as a wrong word, a missing word, a replaced word, an alias or an abbreviation) or is difficult to recognize (such as a newly coined word). A second slot position value is a slot position value configured in advance that is used to rewrite the first slot position value to improve search efficiency, or a newly coined word configured in advance, such as a slot value of a correct movie name with the same circumferential depth. As another example, a first slot position value is "piggy-pack" and the corresponding second slot position value is "piggy-pack". In this embodiment, the corresponding slot rewriting index table is acquired in advance so that the initial slot position value can be rewritten, which improves the accuracy of the initial slot position value and in turn improves the subsequent search efficiency and the recall rate.
Step 108, adjusting the initial slot position value according to the slot rewriting index table to obtain a target slot position value.
The target slot position value is the slot position value obtained after the initial slot position value is adjusted. Specifically, the slot rewriting index table is searched using the initial slot position value; when a first slot position value consistent with the initial slot position value is found, the initial slot position value is adjusted according to the second slot position value corresponding to that first slot position value to obtain the target slot position value. This makes the target slot position value more accurate, addresses problems such as wrong or missing characters in the voice data to be processed, and helps improve the recall rate of the search.
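The lookup in steps 106 and 108 can be sketched as a per-field-type dictionary. This is a minimal Python sketch under stated assumptions: the table contents, the name `SLOT_REWRITE_TABLES`, the function `rewrite_slot` and the lowercase normalization are all hypothetical, and falling back to the initial slot value when no entry matches is an assumed behavior that the description does not spell out.

```python
from typing import Dict

# Hypothetical tables: field type -> {un-rewritten first slot value: rewritten second slot value}.
SLOT_REWRITE_TABLES: Dict[str, Dict[str, str]] = {
    "movie": {"pepa pig": "Peppa Pig", "peppa": "Peppa Pig"},
    "news": {"wether": "weather"},
}


def rewrite_slot(field_type: str, initial_slot: str) -> str:
    """Return the target slot value; keep the initial value if no table entry matches."""
    table = SLOT_REWRITE_TABLES.get(field_type, {})
    key = initial_slot.strip().lower()  # assumed normalization, not from the source
    return table.get(key, initial_slot)


target = rewrite_slot("movie", "Pepa Pig")
```

Keeping one table per field type means the same misrecognized string can be rewritten differently in different fields, which matches the one-table-per-field-type arrangement described above.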
And step 110, searching based on the target slot position value to obtain a search result.
Specifically, the target slot position value is used as the search keyword to perform the search and obtain the search result. Because the target slot position value is the corrected slot position value, the problem that keywords cannot be recalled when searching with the initial slot position value is alleviated, and the recall rate and accuracy of the search result are improved.
In the above search method based on voice data, text conversion is first performed on the voice data to be processed to obtain initial text data, and semantic analysis is then performed on the initial text data to determine the corresponding field type and initial slot position value. A corresponding slot rewriting index table is acquired according to the field type; the slot rewriting index table records, for the corresponding field type, first slot position values that have not been rewritten and second slot position values that have been rewritten. The initial slot position value is adjusted according to the slot rewriting index table to obtain a target slot position value, and a search is performed based on the target slot position value to obtain a search result. Adjusting the initial slot position value through the slot rewriting index table makes the target slot position value more accurate, which alleviates the problem that keywords cannot be recalled when searching with the initial slot position value and improves the recall rate and accuracy of the search results.
In one embodiment, before the step of acquiring the corresponding slot rewriting index table according to the field type, the method further includes: generating a slot rewriting index table for each field type by using a preset matching mode.
Specifically, for each field type, a plurality of search keywords of the corresponding field type are obtained from user search logs and used as first slot position values, and the standard slot position values of the corresponding field type are obtained. The standard slot position values may be obtained by collecting slot position values related to the corresponding field type in advance, or may be determined according to search results. The preset matching mode is a preconfigured way of matching a standard slot position value to a search keyword. It may be a matching mode based on model conversion, a matching mode based on plug-in conversion, or a combination of the two. As a preferred option of this embodiment, the two matching modes are combined to ensure that the matching is accurate and reasonable, which improves the accuracy with which first slot position values are matched to standard slot position values in the slot rewriting index table.
As shown in fig. 2, in an embodiment, the step of generating the slot rewriting index table for each field type by using a preset matching mode includes:
step 112A, acquiring first slot position values corresponding to the field type from a preset sample database, and acquiring standard slot position values corresponding to the field type from a preset search database, wherein the second slot position values are a subset of the standard slot position values;
step 112B, for each first slot position value, calculating the semantic similarity between the first slot position value and each standard slot position value by using a first distance calculation method to obtain first similarities, and calculating the semantic similarity between the first slot position value and each standard slot position value by using a second distance calculation method to obtain second similarities;
step 112C, selecting the standard slot position values corresponding to the top K first similarities as first candidate slot position values, and selecting the standard slot position values corresponding to the top L second similarities as second candidate slot position values, wherein K and L are natural numbers larger than 1;
step 112D, respectively determining scores of each first candidate slot position value and each second candidate slot position value according to a preset scoring rule;
step 112E, according to the scores, determining a second slot position value corresponding to the first slot position value from the first candidate slot position values and the second candidate slot position values.
The preset sample database is used to store the first slot position values, and the preset search database is used to store the standard slot position values. A standard slot position value is a rewritten, and therefore accurate, slot position value; the second slot position values are a subset of the standard slot position values. Specifically, for each first slot position value, the semantic similarity between the first slot position value and each standard slot position value is calculated by the first distance calculation method to obtain the first similarities, and by the second distance calculation method to obtain the second similarities. The first distance calculation method and the second distance calculation method may each be any similarity measure based on the Euclidean distance, cosine distance, Manhattan distance, Hamming distance, Mahalanobis distance or Chebyshev distance. The first candidate slot position values are the K standard slot position values with the highest first similarity, and the second candidate slot position values are the L standard slot position values with the highest second similarity; K may be equal to L.
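The candidate selection in step 112C amounts to a top-K selection over similarity values. The following Python sketch shows one way to do this with the standard library; the function `top_candidates` and the placeholder similarity `char_jaccard` (a character-set overlap ratio) are hypothetical and stand in for whichever first or second distance calculation method is actually used.

```python
import heapq
from typing import Callable, List, Tuple


def char_jaccard(a: str, b: str) -> float:
    """Placeholder similarity (assumption): overlap ratio of the two character sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def top_candidates(
    first_slot: str,
    standard_slots: List[str],
    similarity: Callable[[str, str], float],
    k: int,
) -> List[Tuple[float, str]]:
    """Return the k standard slot values most similar to first_slot, best first."""
    scored = [(similarity(first_slot, std), std) for std in standard_slots]
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Toy data: select the top 2 standard slot values for a first slot value.
candidates = top_candidates("abc", ["abd", "xyz", "abc"], char_jaccard, k=2)
```

Calling the function once with the first similarity and K, and once with the second similarity and L, would yield the two candidate sets that are scored in step 112D.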
The preset scoring rule is a preset rule for evaluating how well each first candidate slot position value and each second candidate slot position value matches the first slot position value. The scoring rule may score each candidate according to its usage frequency or its popularity (heat) among users, combined with its first similarity or second similarity, to improve the objectivity and accuracy of the scores. It can be understood that, in this embodiment, scoring the first candidate slot position values and the second candidate slot position values quantifies the correlation between the standard slot position values and the first slot position value; determining the second slot position value corresponding to the first slot position value from the candidates according to the scores improves the efficiency of obtaining the second slot position value.
In one embodiment, the step of calculating the semantic similarity between each first slot position value and each standard slot position value by using a first distance calculation method for each first slot position value to obtain a first similarity includes: converting the first slot position value into a first pinyin text by using the pinyin plug-in, and converting each standard slot position value into a standard pinyin text; and calculating Euclidean distances between the first pinyin text and each standard pinyin text, and determining the first similarity according to the Euclidean distances.
The pinyin plug-in (pinyin) is an Elasticsearch plug-in for converting text data into pinyin. For example, if the first slot position value is "piggy-wear", the first pinyin text is "xiaozhupeipeeii". Specifically, the first slot position value is converted into a first pinyin text by using the pinyin plug-in, and each standard slot position value is converted into a standard pinyin text. The first pinyin text and each standard pinyin text are then converted into multidimensional vectors, and the Euclidean distance between the vector of the first pinyin text and the vector of each standard pinyin text is calculated by the following formula:
$$d(x_i, y_i) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}$$
where $d(x_i, y_i)$ is the Euclidean distance between the first pinyin text and one of the standard pinyin texts, $x_i$ is the i-th component of the vector of the first pinyin text, $y_i$ is the i-th component of the vector of that standard pinyin text, and $N$ is the dimension of the multidimensional vectors of the first pinyin text and of each standard pinyin text. Then, the first similarity is determined according to the Euclidean distance, and the calculation formula is as follows:
[Formula image BDA0003527133830000081 not reproduced: $sim1(x_i, y_i)$ expressed as a function of $d(x_i, y_i)$]
where $sim1(x_i, y_i)$ is the first similarity. In this embodiment, converting the first slot position value and each standard slot position value through the pinyin plug-in allows text data with pronunciation errors to be matched, and the first similarity between the converted first slot position value and each standard slot position value is determined through the Euclidean distance. The calculation is simple and fast, so the similarity between the first slot position value and each standard slot position value is measured efficiently.
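The Euclidean-distance comparison of pinyin texts can be sketched in Python as follows. Two points are assumptions and are not taken from the source: the pinyin texts are vectorized as 26-dimensional letter counts (the description does not say how the vectors are built), and the distance is mapped to a similarity with 1 / (1 + d), since the original similarity formula is not legible here. The pinyin strings are supplied directly; conversion from Chinese text by the pinyin plug-in is not shown.

```python
import math
import string
from typing import List, Sequence


def letter_count_vector(pinyin_text: str) -> List[int]:
    """Assumed vectorization: count of each letter a-z in the pinyin text."""
    text = pinyin_text.lower()
    return [text.count(ch) for ch in string.ascii_lowercase]


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """d = sqrt(sum_i (x_i - y_i)^2) over vectors of equal dimension N."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def first_similarity(pinyin_a: str, pinyin_b: str) -> float:
    """Assumed mapping from distance to similarity: 1 / (1 + d)."""
    d = euclidean_distance(letter_count_vector(pinyin_a), letter_count_vector(pinyin_b))
    return 1.0 / (1.0 + d)


same = first_similarity("xiaozhu", "xiaozhu")   # identical pinyin, distance 0
other = first_similarity("peiqi", "beijing")    # different pinyin, larger distance
```

Because homophones share the same pinyin, two slot values that differ only in a wrongly recognized character map to the same vector and receive the maximum similarity under this sketch.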
In one embodiment, the step of calculating the semantic similarity between each first slot position value and each standard slot position value by using a second distance calculation method to obtain a second similarity includes: inputting the first slot position value into a trained text analysis model to obtain a first embedding, and respectively inputting each standard slot position value into the trained text analysis model to obtain each standard embedding; and calculating cosine distances between the first embedding and each standard embedding, and determining a second similarity according to the cosine distances.
The trained text analysis model is a pre-trained machine learning model for converting text into vectors, for example a word2vec model, and an embedding is the feature vector of a text. Specifically, the first slot position value is input into the trained text analysis model, and the output of the model is used as the first embedding; each standard slot position value is input into the trained text analysis model, and the output of the model is used as the corresponding standard embedding. Then, the cosine distance between the first embedding and each standard embedding is calculated by the following formula:
[Formula image BDA0003527133830000082 not reproduced: $T(a_j, b_j)$, the cosine distance computed over the $M$ dimensions of the two embeddings]
where $T(a_j, b_j)$ is the cosine distance between the first embedding and one of the standard embeddings, $a_j$ is the j-th component of the first embedding, $b_j$ is the j-th component of that standard embedding, and $M$ is the dimension of the first embedding and of each standard embedding. Then, the second similarity is determined according to the cosine distance, and the calculation formula is as follows:
$$sim2(a_j, b_j) = 1 - T(a_j, b_j)$$
where $sim2(a_j, b_j)$ is the second similarity. In this embodiment, converting the first slot position value and each standard slot position value through the trained text analysis model allows un-rewritten text data containing wrong or missing characters to be matched, and the second similarity between the converted first slot position value and each standard slot position value is determined through the cosine distance. The calculation is simple and fast, so the similarity between the first slot position value and each standard slot position value is measured efficiently.
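The second similarity can be sketched in Python as follows. The sketch assumes that the cosine distance T is defined as one minus the cosine of the angle between the two embeddings, so that sim2 = 1 - T equals the cosine similarity; the original formula for T is not legible here, so this definition is an assumption. The embeddings are passed in as plain lists of floats; producing them with a trained model such as word2vec is not shown.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed definition: T = 1 - (a . b) / (|a| * |b|)."""
    dot = sum(aj * bj for aj, bj in zip(a, b))
    norm_a = math.sqrt(sum(aj * aj for aj in a))
    norm_b = math.sqrt(sum(bj * bj for bj in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def second_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """sim2 = 1 - T, as stated in the description."""
    return 1.0 - cosine_distance(a, b)


aligned = second_similarity([1.0, 0.0], [2.0, 0.0])     # same direction
orthogonal = second_similarity([1.0, 0.0], [0.0, 1.0])  # perpendicular directions
```

Unlike the Euclidean distance, this measure ignores vector length, so it compares only the direction of the two embeddings.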
As shown in fig. 3, in an embodiment, the step of determining the score of each first candidate slot position value and each second candidate slot position value according to a preset scoring rule includes:
step 112D1, determining a usage frequency score of each of the first candidate slot position value and the second candidate slot position value according to the usage frequency of each of the first candidate slot position value and the second candidate slot position value;
step 112D2, determining a popularity score of each of the first candidate slot position value and the second candidate slot position value according to the popularity of each of the first candidate slot position value and the second candidate slot position value;
step 112D3, inputting each first candidate slot position value into a preset verification classifier respectively, determining a first verification result of each first candidate slot position value, obtaining a first accuracy according to the first verification result, inputting each second candidate slot position value into the preset verification classifier respectively, determining a second verification result of each second candidate slot position value, and obtaining a second accuracy according to the second verification result;
step 112D4, determining the score of each first candidate slot position value according to the first accuracy, the first similarity of each first candidate slot position value, the usage frequency score, the heat score, the frequency weight corresponding to the usage frequency and the heat weight corresponding to the heat, and determining the score of each second candidate slot position value according to the second accuracy, the second similarity of each second candidate slot position value, the usage frequency score, the heat score, the frequency weight corresponding to the usage frequency and the heat weight corresponding to the heat.
The usage frequency refers to how often a standard slot position value appears in searches, and may be counted from user search logs within a preset time period (e.g., one year) to determine the usage frequency score of each first candidate slot position value and each second candidate slot position value. For example, for the K first candidate slot position values, the usage frequency score of the first candidate slot position value with the highest usage frequency is set to 1, and the usage frequency scores of the remaining first candidate slot position values are set, in descending order of usage frequency, to (K-1)/K, (K-2)/K, ..., 1/K. For the L second candidate slot position values, the usage frequency score of the second candidate slot position value with the highest usage frequency is set to 1, and the usage frequency scores of the remaining second candidate slot position values are set, in descending order of usage frequency, to (L-1)/L, (L-2)/L, ..., 1/L. The heat refers to how widely a standard slot position value is requested across different users, that is, how popular it is as a search target; the heat score may be determined by counting the number of times the standard slot position value is searched within a preset time period (e.g., 3 months). For example, for the K first candidate slot position values, the heat score of the first candidate slot position value with the highest heat is set to 1, and the heat scores of the remaining first candidate slot position values are set, in descending order of heat, to (K-1)/K, (K-2)/K, ..., 1/K. For the L second candidate slot position values, the heat score of the second candidate slot position value with the highest heat is set to 1, and the heat scores of the remaining second candidate slot position values are set, in descending order of heat, to (L-1)/L, (L-2)/L, ..., 1/L.
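The rank-based scoring described above can be sketched in Python as follows: the candidate with the highest raw count gets score 1, and each following candidate gets a score one step of 1/n lower, down to 1/n. The function name `rank_scores` and the sample counts are hypothetical; ties are broken by the order in which candidates appear in the input, which is an assumption the description does not address.

```python
from typing import Dict


def rank_scores(counts: Dict[str, float]) -> Dict[str, float]:
    """Map raw counts (usage frequency or heat) to scores 1, (n-1)/n, ..., 1/n by rank."""
    ordered = sorted(counts, key=lambda cand: counts[cand], reverse=True)
    n = len(ordered)
    return {cand: (n - rank) / n for rank, cand in enumerate(ordered)}


# Hypothetical search-log counts for K = 3 first candidate slot values.
usage_counts = {"candidate_a": 10, "candidate_b": 30, "candidate_c": 20}
usage_scores = rank_scores(usage_counts)
```

The same function serves for both the usage frequency score and the heat score, since the two differ only in the counts and the time window they are computed from.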
The preset verification classifier is a classifier obtained in advance by training a classifier such as an SVM or a decision tree, and is used to determine the probability that the first similarity and the second similarity are correct. The frequency weight is the weight of the usage frequency, the heat weight is the weight of the heat, and the two weights sum to 1; for example, the frequency weight is 0.6 and the heat weight is 0.4. The score of each first candidate slot position value and the score of each second candidate slot position value may be calculated by the following formulas:
$$Z_{1p} = x \cdot s_{1p} \cdot (\alpha \cdot F_{1p} + \beta \cdot H_{1p})$$
$$Z_{2q} = y \cdot s_{2q} \cdot (\alpha \cdot F_{2q} + \beta \cdot H_{2q})$$
where $Z_{1p}$ is the score of the p-th first candidate slot position value, $p \in [1, K]$; $x$ is the first accuracy; $s_{1p}$ is the first similarity of the p-th first candidate slot position value; $\alpha$ is the frequency weight; $\beta$ is the heat weight; $F_{1p}$ is the usage frequency score of the p-th first candidate slot position value; $H_{1p}$ is the heat score of the p-th first candidate slot position value; $Z_{2q}$ is the score of the q-th second candidate slot position value, $q \in [1, L]$; $y$ is the second accuracy; $s_{2q}$ is the second similarity of the q-th second candidate slot position value; $F_{2q}$ is the usage frequency score of the q-th second candidate slot position value; and $H_{2q}$ is the heat score of the q-th second candidate slot position value. It can be understood that, in this embodiment, each first candidate slot position value and each second candidate slot position value is scored along two dimensions, usage frequency and heat, and the accuracy of each similarity is combined to determine the score. The scores are thereby accurately quantified, which improves matching accuracy and matching efficiency and, in turn, the generation efficiency of the slot position rewriting index table.
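As a worked illustration of the scoring formula, the sketch below computes the score of a single candidate. The function name `candidate_score` and the sample inputs are hypothetical; the default weights 0.6 and 0.4 are the example values given in the text.

```python
def candidate_score(
    accuracy: float,
    similarity: float,
    freq_score: float,
    heat_score: float,
    freq_weight: float = 0.6,
    heat_weight: float = 0.4,
) -> float:
    """Score = accuracy * similarity * (alpha * F + beta * H)."""
    # The text requires the frequency weight and heat weight to sum to 1.
    if abs(freq_weight + heat_weight - 1.0) > 1e-9:
        raise ValueError("freq_weight and heat_weight must sum to 1")
    return accuracy * similarity * (freq_weight * freq_score + heat_weight * heat_score)


# Hypothetical inputs: x = 0.9, s = 0.8, F = 1.0, H = 0.5.
example_score = candidate_score(0.9, 0.8, 1.0, 0.5)
```

The same function covers both formulas: pass the first accuracy and first similarity for a first candidate, or the second accuracy and second similarity for a second candidate.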
As shown in fig. 4, in an embodiment, the step of performing a search based on the target slot location value to obtain a search result includes:
step 110A, searching in a preset search resource library according to a target slot position value;
step 110B, if search information corresponding to the target slot position value is found, taking the search information as the search result;
and step 110C, if no search information corresponding to the target slot position value is found, acquiring corresponding recommendation information by adopting a preset recommendation method based on the target slot position value, and determining the recommendation information as the search result.
Specifically, the target slot position value is used as a search keyword to search a preset search resource library. When search information corresponding to the target slot position value is found, the search information is taken as the search result; notably, the search result may be converted into voice data and played back, which improves the user experience. When no search information corresponding to the target slot position value is found, a preset recommendation method is used to obtain corresponding recommendation information based on the target slot position value, and the recommendation information is determined as the search result. For example, the preset recommendation method may take the information with the highest heat within the field type of the target slot position value as the recommendation information, which improves the user's search experience.
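The search-then-recommend fallback can be sketched as follows. This is only an illustration under stated assumptions: the resource library is modeled as a hypothetical in-memory mapping from field type to items and their heat counts, a "hit" is an exact key match, and the function name `search_or_recommend` is invented for the example.

```python
from typing import Dict, Optional

# Hypothetical resource library: field type -> {item name: heat count}.
Library = Dict[str, Dict[str, int]]


def search_or_recommend(target_value: str, field_type: str, library: Library) -> Optional[str]:
    """Return the matching item, else the hottest item of the same field type."""
    items = library.get(field_type, {})
    if target_value in items:
        # Step 110B: search information was found.
        return target_value
    if not items:
        # Nothing to recommend for this field type (case not covered by the text).
        return None
    # Step 110C: recommend the item with the highest heat in the field type.
    return max(items, key=lambda name: items[name])


demo_library: Library = {"movie": {"film A": 900, "film B": 300}}
found = search_or_recommend("film B", "movie", demo_library)
recommended = search_or_recommend("film X", "movie", demo_library)
```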
As shown in fig. 5, in one embodiment, a search apparatus based on voice data is provided, including:
a conversion module 502, configured to obtain voice data to be processed, and perform text conversion on the voice data to be processed to obtain initial text data;
an analysis module 504, configured to perform semantic analysis on the initial text data, and determine a corresponding domain type and an initial slot position value;
an obtaining module 506, configured to obtain a corresponding slot position rewriting index table according to the domain type, where the slot position rewriting index table records, for the corresponding domain type, a first slot position value which is not rewritten and a second slot position value which is rewritten;
a rewriting module 508, configured to adjust the initial slot position value according to the slot position rewriting index table to obtain a target slot position value;
and a searching module 510, configured to perform a search based on the target slot value to obtain a search result.
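The data flow through the five modules can be sketched as a single function. This is a hedged outline, not the apparatus itself: speech recognition, semantic analysis and search are passed in as hypothetical callables, the rewriting index tables are modeled as plain dictionaries, and leaving an initial slot position value unchanged when it has no table entry is an assumption.

```python
from typing import Callable, Dict, Tuple


def voice_search(
    audio: bytes,
    to_text: Callable[[bytes], str],
    analyze: Callable[[str], Tuple[str, str]],
    rewrite_tables: Dict[str, Dict[str, str]],
    search: Callable[[str], str],
) -> str:
    text = to_text(audio)                    # conversion module 502
    domain, initial_value = analyze(text)    # analysis module 504
    table = rewrite_tables.get(domain, {})   # obtaining module 506
    # Rewriting module 508; assumption: values without an entry stay unchanged.
    target_value = table.get(initial_value, initial_value)
    return search(target_value)              # searching module 510


# Stub components and made-up data, for illustration only.
demo_tables = {"music": {"jay chow": "Jay Chou"}}
demo_result = voice_search(
    b"jay chow",
    to_text=lambda data: data.decode("utf-8"),
    analyze=lambda text: ("music", text),
    rewrite_tables=demo_tables,
    search=lambda value: "result:" + value,
)
```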
In one embodiment, the apparatus for searching based on voice data further comprises: a generating module, configured to generate the slot position rewriting index table for each field type by adopting a preset matching mode.
In one embodiment, the generating module comprises:
the acquisition unit is used for acquiring the first slot position value corresponding to the field type from a preset sample database and acquiring a standard slot position value corresponding to the field type from a preset search database, wherein the second slot position value is a subset of the standard slot position value;
a calculating unit, configured to calculate, for each first slot position value, a semantic similarity between each first slot position value and each standard slot position value by using a first distance calculation method to obtain a first similarity, and calculate, for each first slot position value, a semantic similarity between each first slot position value and each standard slot position value by using a second distance calculation method to obtain a second similarity;
a selecting unit, configured to select the top K standard slot position values ranked by the first similarity as first candidate slot position values, and select the top L standard slot position values ranked by the second similarity as second candidate slot position values, wherein K and L are natural numbers larger than 1;
a first determining unit, configured to determine scores of the first candidate slot position values and the second candidate slot position values according to a preset scoring rule;
a second determining unit, configured to determine, according to each score, the second slot position value corresponding to the first slot position value from the first candidate slot position value and the second candidate slot position value.
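The units of the generating module can be read as the following loop. This is a sketch under assumptions: the similarity and scoring functions are supplied by the caller, the candidate with the highest score is assumed to become the second slot position value (the text only says the choice is made "according to each score"), and `char_overlap` is a toy similarity used purely so the demo runs. All names and sample values are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Pairwise = Callable[[str, str], float]


def top_n(value: str, standards: List[str], similarity: Pairwise, n: int) -> List[str]:
    """Return the n standard values most similar to `value`."""
    return sorted(standards, key=lambda s: similarity(value, s), reverse=True)[:n]


def build_rewrite_table(
    first_values: Iterable[str],
    standard_values: Iterable[str],
    sim1: Pairwise,
    sim2: Pairwise,
    score1: Pairwise,
    score2: Pairwise,
    top_k: int = 2,
    top_l: int = 2,
) -> Dict[str, str]:
    standards = list(standard_values)
    table: Dict[str, str] = {}
    for value in first_values:
        scored: List[Tuple[float, str]] = []
        # First candidates come from the first similarity, second from the second.
        scored += [(score1(value, c), c) for c in top_n(value, standards, sim1, top_k)]
        scored += [(score2(value, c), c) for c in top_n(value, standards, sim2, top_l)]
        if scored:
            # Assumption: the highest-scoring candidate is the rewritten value.
            table[value] = max(scored, key=lambda pair: pair[0])[1]
    return table


def char_overlap(a: str, b: str) -> float:
    """Toy similarity for the demo only: Jaccard overlap of lowercase characters."""
    sa, sb = set(a.lower()), set(b.lower())
    return len(sa & sb) / len(sa | sb)


demo_table = build_rewrite_table(
    ["jay chow"], ["Jay Chou", "Jay Z", "Adele"],
    char_overlap, char_overlap, char_overlap, char_overlap,
)
```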
In one embodiment, the calculating unit comprises:
the conversion subunit is used for converting the first slot position value into a first pinyin text by using a pinyin plug-in and converting each standard slot position value into a standard pinyin text;
and the first calculating subunit is used for calculating the Euclidean distance between the first pinyin text and each standard pinyin text to obtain the first similarity.
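One possible reading of the pinyin-based first similarity is sketched below. The text does not say how a pinyin text is turned into a vector, so this example assumes letter-count vectors, and it assumes the similarity is derived from the distance as 1 / (1 + d); both choices are illustrative only. The inputs are taken to be pinyin strings already produced by a pinyin plug-in, and the function names are hypothetical.

```python
import math
from collections import Counter


def pinyin_distance(pinyin_a: str, pinyin_b: str) -> float:
    """Euclidean distance between letter-count vectors of two pinyin texts."""
    # Assumption: each text is represented by how often each letter occurs.
    count_a = Counter(pinyin_a.replace(" ", ""))
    count_b = Counter(pinyin_b.replace(" ", ""))
    letters = set(count_a) | set(count_b)
    # A Counter returns 0 for letters that do not occur.
    return math.sqrt(sum((count_a[ch] - count_b[ch]) ** 2 for ch in letters))


def first_similarity(pinyin_a: str, pinyin_b: str) -> float:
    """Assumed mapping from distance to similarity: 1 / (1 + distance)."""
    return 1.0 / (1.0 + pinyin_distance(pinyin_a, pinyin_b))
```

With this representation, identical pinyin texts have distance 0 and similarity 1, and the similarity decreases as the letter counts diverge. Because letter counts ignore order, a production system would likely need a richer representation.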
In one embodiment, the calculating unit further comprises:
the analysis subunit is used for inputting the first slot position value into a trained text analysis model to obtain a first embedding, and respectively inputting each standard slot position value into the trained text analysis model to obtain each standard embedding;
and the second calculating subunit is used for calculating cosine distances between the first embedding and each standard embedding, and determining the second similarity according to the cosine distances.
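The embedding comparison can be sketched with plain Python lists standing in for the model outputs. This example assumes the second similarity is the cosine similarity itself (that is, 1 minus the cosine distance); the text only says the similarity is determined "according to" the cosine distance. The function name is hypothetical, and the trained text analysis model is outside the scope of the sketch.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Convention for this sketch: a zero vector has similarity 0 with anything.
    return dot / norm if norm else 0.0
```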
In one embodiment, the first determination unit includes:
a first determining subunit, configured to determine, according to usage frequencies of the first candidate slot position values and the second candidate slot position values, usage frequency scores of the first candidate slot position values and the second candidate slot position values;
a second determining subunit, configured to determine a popularity score of each of the first candidate slot position value and the second candidate slot position value according to popularity of each of the first candidate slot position value and the second candidate slot position value;
a third determining subunit, configured to input each of the first candidate slot position values into a preset verification classifier, determine a first verification result of each of the first candidate slot position values, obtain a first accuracy according to the first verification result, input each of the second candidate slot position values into the preset verification classifier, determine a second verification result of each of the second candidate slot position values, and obtain a second accuracy according to the second verification result;
a fourth determining subunit, configured to determine a score of each of the first candidate slot positions according to the first accuracy, the first similarity of each of the first candidate slot positions, a score of a usage frequency, a frequency weight corresponding to the usage frequency, and a heat weight corresponding to the heat, and determine a score of each of the second candidate slot positions according to the second accuracy, the second similarity of each of the second candidate slot positions, a score of a usage frequency, a frequency weight corresponding to the usage frequency, and a heat weight corresponding to the heat.
In one embodiment, the search module comprises:
the searching unit is used for searching in a preset searching resource library according to the target slot position value;
a first result obtaining unit, configured to, if search information corresponding to the target slot position value is found, take the search information as the search result;
and a second result obtaining unit, configured to, if no search information corresponding to the target slot position value is found, acquire corresponding recommendation information by adopting a preset recommendation method based on the target slot position value, and determine the recommendation information as the search result.
FIG. 6 is a diagram illustrating the internal structure of a computer device in one embodiment. The computer device may specifically be a server, including but not limited to a high-performance computer or a cluster of high-performance computers. As shown in fig. 6, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program that, when executed by the processor, causes the processor to implement the search method based on voice data. The internal memory may also store a computer program that, when executed by the processor, causes the processor to perform the search method based on voice data. Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution is applied; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, the search method based on voice data provided by the present application can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in fig. 6. The memory of the computer device may store the program modules constituting the voice data-based search apparatus, for example, the conversion module 502, the analysis module 504, the obtaining module 506, the rewriting module 508, and the search module 510.
A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data; performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot position value; acquiring a corresponding slot position rewriting index table according to the field type, wherein the slot position rewriting index table records a first slot position value which is not rewritten and a second slot position value which is rewritten of the corresponding field type; adjusting the initial slot position value according to the slot position rewriting index table to obtain a target slot position value; and searching based on the target slot position value to obtain a search result.
A computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of: acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data; performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot position value; acquiring a corresponding slot position rewriting index table according to the field type, wherein the slot position rewriting index table records a first slot position value which is not rewritten and a second slot position value which is rewritten of the corresponding field type; adjusting the initial slot position value according to the slot position rewriting index table to obtain a target slot position value; and searching based on the target slot position value to obtain a search result.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features involves no contradiction, it should be considered to be within the scope of the present specification.
The above embodiments express only several implementations of the present application, and their description is specific and detailed, but they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method for searching based on voice data, comprising:
acquiring voice data to be processed, and performing text conversion on the voice data to be processed to obtain initial text data;
performing semantic analysis on the initial text data, and determining a corresponding field type and an initial slot value;
acquiring a corresponding slot position rewriting index table according to the field type, wherein the slot position rewriting index table records a first slot position value which is not rewritten and a second slot position value which is rewritten of the corresponding field type;
adjusting the initial slot position value according to the slot position rewriting index table to obtain a target slot position value;
and searching based on the target slot position value to obtain a search result.
2. The method of claim 1, wherein before the step of obtaining the corresponding slot position rewriting index table according to the domain type, the method further comprises:
and generating the slot position rewriting index table by adopting a preset matching mode aiming at each field type.
3. The method for searching based on voice data according to claim 1, wherein the step of generating the slot position rewriting index table for each of the domain types by using a preset matching manner comprises:
acquiring the first slot position value corresponding to the field type from a preset sample database, and acquiring a standard slot position value corresponding to the field type from a preset search database, wherein the second slot position value is a subset of the standard slot position value;
aiming at each first slot position value, calculating the semantic similarity between each first slot position value and each standard slot position value by adopting a first distance calculation method to obtain a first similarity, and calculating the semantic similarity between each first slot position value and each standard slot position value by adopting a second distance calculation method to obtain a second similarity;
selecting the first K standard slot position values corresponding to the first similarity to determine as a first candidate slot position value, and selecting the first L standard slot position values corresponding to the second similarity to determine as a second candidate slot position value, wherein K and L are both natural numbers larger than 1;
respectively determining scores of each first candidate slot position value and each second candidate slot position value according to a preset scoring rule;
according to each score, determining the second slot value corresponding to the first slot value from the first candidate slot value and the second candidate slot value.
4. The method as claimed in claim 3, wherein the step of calculating the semantic similarity between each of the first slot values and each of the standard slot values by using a first distance calculation method for each of the first slot values to obtain a first similarity comprises:
converting the first slot position value into a first pinyin text by using a pinyin plug-in, and converting each standard slot position value into a standard pinyin text;
and calculating the Euclidean distance between the first pinyin text and each standard pinyin text, and determining the first similarity according to the Euclidean distance.
5. The method for searching based on voice data according to claim 3, wherein the step of calculating the semantic similarity between each of the first slot position values and each of the standard slot position values by using the second distance calculation method to obtain the second similarity comprises:
inputting the first slot position value into a trained text analysis model to obtain a first embedding, and respectively inputting each standard slot position value into the trained text analysis model to obtain each standard embedding;
and calculating cosine distances between the first embedding and each standard embedding, and determining the second similarity according to the cosine distances.
6. The method of claim 3, wherein the step of determining the score of each of the first candidate slot position values and each of the second candidate slot position values according to a preset scoring rule comprises:
determining a usage frequency score of each of the first candidate slot position value and the second candidate slot position value according to the usage frequency of each of the first candidate slot position value and the second candidate slot position value;
determining a heat degree score of each of the first candidate slot position value and the second candidate slot position value according to the heat degree of each of the first candidate slot position value and the second candidate slot position value;
respectively inputting each first candidate slot position value into a preset verification classifier, determining a first verification result of each first candidate slot position value, obtaining a first accuracy according to the first verification result, respectively inputting each second candidate slot position value into the preset verification classifier, determining a second verification result of each second candidate slot position value, and obtaining a second accuracy according to the second verification result;
determining the score of each first candidate slot position value according to the first accuracy, the first similarity of each first candidate slot position value, the score of the use frequency, the frequency weight corresponding to the use frequency and the heat weight corresponding to the heat, and determining the score of each second candidate slot position value according to the second accuracy, the second similarity of each second candidate slot position value, the score of the use frequency, the frequency weight corresponding to the use frequency and the heat weight corresponding to the heat.
7. The method of claim 6, wherein the step of searching based on the target slot value to obtain the search result comprises:
searching in a preset search resource library according to the target slot position value;
if search information corresponding to the target slot position value is found, taking the search information as the search result;
if the search information corresponding to the target slot position value is not searched, acquiring corresponding recommendation information by adopting a preset recommendation method based on the target slot position value, and determining the recommendation information as the search result.
8. A search apparatus based on voice data, comprising:
the conversion module is used for acquiring voice data to be processed and performing text conversion on the voice data to be processed to obtain initial text data;
the analysis module is used for performing semantic analysis on the initial text data and determining a corresponding field type and an initial slot position value;
the acquisition module is used for acquiring a corresponding slot position rewriting index table according to the field type, and the slot position rewriting index table records a first slot position value which is not rewritten and a second slot position value which is rewritten of the corresponding field type;
the rewriting module is used for adjusting the initial slot position value according to the slot position rewriting index table to obtain a target slot position value;
and the searching module is used for searching based on the target slot position value to obtain a searching result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method for searching based on voice data according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of a method for searching based on speech data according to any one of claims 1 to 7.
CN202210195811.6A 2022-03-01 2022-03-01 Voice data based search method and device, computer equipment and storage medium Pending CN114661862A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210195811.6A CN114661862A (en) 2022-03-01 2022-03-01 Voice data based search method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210195811.6A CN114661862A (en) 2022-03-01 2022-03-01 Voice data based search method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114661862A true CN114661862A (en) 2022-06-24

Family

ID=82028363

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210195811.6A Pending CN114661862A (en) 2022-03-01 2022-03-01 Voice data based search method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114661862A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115376153A (en) * 2022-08-31 2022-11-22 南京擎盾信息科技有限公司 Contract comparison method and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination