CN111640454A - Spectrogram matching method, device and equipment and computer readable storage medium - Google Patents


Publication number
CN111640454A
CN111640454A CN202010405211.9A CN202010405211A CN111640454A CN 111640454 A CN111640454 A CN 111640454A CN 202010405211 A CN202010405211 A CN 202010405211A CN 111640454 A CN111640454 A CN 111640454A
Authority
CN
China
Prior art keywords
sample
spectrogram
phoneme
information
acquiring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010405211.9A
Other languages
Chinese (zh)
Other versions
CN111640454B (en)
Inventor
郑琳琳
龙洪锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Speakin Intelligent Technology Co ltd
Original Assignee
Guangzhou Speakin Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Speakin Intelligent Technology Co ltd
Priority to CN202010405211.9A
Publication of CN111640454A
Application granted
Publication of CN111640454B
Legal status: Active

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a spectrogram matching method, a device, equipment and a computer readable storage medium, wherein the method comprises the following steps: acquiring a specimen spectrogram and acquiring a sample spectrogram; acquiring specimen phoneme information of a target phoneme in the specimen spectrogram, and acquiring sample phoneme information of the target phoneme in the sample spectrogram; and calculating a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determining the spectrogram matching degree between the specimen spectrogram and the sample spectrogram according to the phoneme similarity. The invention can realize intelligent matching of spectrograms and improve the efficiency of spectrogram matching.

Description

Spectrogram matching method, device and equipment and computer readable storage medium
Technical Field
The present invention relates to the field of speech processing technologies, and in particular, to a spectrogram matching method, apparatus, device, and computer-readable storage medium.
Background
With the continuing development of society, speech processing technology is being applied in a growing number of fields. A spectrogram is a common way of representing speech data and is often used in speech processing; tasks such as speech recognition and identity recognition are carried out by matching spectrograms against one another.
The traditional spectrogram matching method judges how well spectrograms match by manually comparing their differences, which is time-consuming and inefficient.
Disclosure of Invention
The invention mainly aims to provide a spectrogram matching method, a spectrogram matching device, spectrogram matching equipment and a computer-readable storage medium, and aims to realize intelligent matching of spectrograms and improve matching efficiency of the spectrograms.
In order to achieve the above object, an embodiment of the present invention provides a spectrogram matching method, including:
acquiring a specimen spectrogram and acquiring a sample spectrogram;
acquiring specimen phoneme information of a target phoneme in the specimen spectrogram, and acquiring sample phoneme information of the target phoneme in the sample spectrogram;
and calculating a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determining the spectrogram matching degree between the specimen spectrogram and the sample spectrogram according to the phoneme similarity.
Optionally, before the step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram, the method further includes:
correcting the specimen spectrogram to obtain a corrected specimen spectrogram;
the step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram comprises:
acquiring the specimen phoneme information of the target phoneme in the corrected specimen spectrogram.
Optionally, the step of correcting the specimen spectrogram to obtain a corrected specimen spectrogram includes:
acquiring amplitude curve information corresponding to each specimen phoneme in the specimen spectrogram, and calculating corresponding deviation information according to the amplitude curve information;
and correcting the specimen spectrogram based on the deviation information to obtain the corrected specimen spectrogram.
Optionally, before the step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram, the method further includes:
detecting whether the peaks and troughs in the specimen spectrogram are labeled;
if the peaks and troughs in the specimen spectrogram are labeled, executing the step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram;
if the peaks and troughs in the specimen spectrogram are not labeled, labeling the peaks and troughs in the specimen spectrogram to obtain a labeled specimen spectrogram;
the step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram comprises:
acquiring the specimen phoneme information of the target phoneme in the labeled specimen spectrogram.
Optionally, the step of acquiring a specimen spectrogram and acquiring a sample spectrogram comprises:
acquiring a specimen audio, and converting the specimen audio into the specimen spectrogram based on a preset rule;
and acquiring a corresponding sample audio from a preset sample database according to the specimen audio, and acquiring the sample spectrogram corresponding to the sample audio.
Optionally, the step of calculating a phoneme similarity according to the specimen phoneme information and the sample phoneme information includes:
converting the specimen phoneme information into a corresponding specimen phoneme vector, and converting the sample phoneme information into a corresponding sample phoneme vector;
and calculating the vector similarity between the specimen phoneme vector and the sample phoneme vector, and determining the phoneme similarity according to the vector similarity.
Optionally, after the step of calculating a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determining the spectrogram matching degree between the specimen spectrogram and the sample spectrogram according to the phoneme similarity, the method further includes:
judging whether the spectrogram matching degree is greater than a preset threshold;
and if the spectrogram matching degree is greater than the preset threshold, acquiring sample identity information corresponding to the sample spectrogram, and determining specimen identity information of the specimen spectrogram according to the sample identity information.
In addition, to achieve the above object, an embodiment of the present invention further provides a spectrogram matching apparatus, including:
a first acquisition module, configured to acquire a specimen spectrogram and acquire a sample spectrogram;
a second acquisition module, configured to acquire specimen phoneme information of a target phoneme in the specimen spectrogram, and acquire sample phoneme information of the target phoneme in the sample spectrogram;
and a matching degree determining module, configured to calculate a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determine the spectrogram matching degree between the specimen spectrogram and the sample spectrogram according to the phoneme similarity.
In addition, to achieve the above object, an embodiment of the present invention further provides spectrogram matching equipment, which includes a processor, a memory, and a spectrogram matching program stored on the memory and executable by the processor, wherein when the spectrogram matching program is executed by the processor, the steps of the spectrogram matching method described above are implemented.
In addition, to achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, on which a spectrogram matching program is stored, wherein when the spectrogram matching program is executed by a processor, the steps of the spectrogram matching method are implemented as described above.
The invention provides a spectrogram matching method, a device, equipment and a computer-readable storage medium, which acquire a specimen spectrogram and a sample spectrogram; acquire specimen phoneme information of a target phoneme in the specimen spectrogram, and acquire sample phoneme information of the target phoneme in the sample spectrogram; and calculate a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determine the spectrogram matching degree between the specimen spectrogram and the sample spectrogram according to the phoneme similarity. In this way, the specimen phoneme information of the target phoneme in the specimen spectrogram and the sample phoneme information of the target phoneme in the sample spectrogram are obtained, and a similarity calculation on the two determines the spectrogram matching degree, so that intelligent matching of spectrograms is realized.
Drawings
FIG. 1 is a schematic diagram of the hardware structure of the spectrogram matching equipment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a spectrogram matching method according to a first embodiment of the present invention;
FIG. 3 is a flowchart illustrating a spectrogram matching method according to a second embodiment of the present invention;
FIG. 4 is a functional block diagram of a spectrogram matching apparatus according to a first embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The spectrogram matching method according to the embodiment of the present invention is mainly applied to spectrogram matching equipment, which may be equipment with a data processing function, such as a Personal Computer (PC), a notebook Computer, and a mobile terminal.
Referring to FIG. 1, FIG. 1 is a schematic diagram of the hardware structure of the spectrogram matching equipment according to an embodiment of the present invention. In this embodiment, the spectrogram matching equipment may include a processor 1001 (e.g., a central processing unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 provides the connections and communication among these components; the user interface 1003 may include a display screen and an input unit such as a keyboard; the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface); the memory 1005 may be a random access memory (RAM) or a non-volatile memory, such as a magnetic disk memory, and may optionally be a storage device independent of the processor 1001. Those skilled in the art will appreciate that the hardware structure shown in FIG. 1 does not limit the present invention; the equipment may include more or fewer components than shown, combine some components, or arrange the components differently.
With continued reference to fig. 1, the memory 1005 of fig. 1, which is a computer-readable storage medium, may include an operating system, a network communication module, and a spectrogram matching program. In fig. 1, the network communication module may be configured to connect to a preset database, and perform data communication with the database; the processor 1001 may call a spectrogram matching program stored in the memory 1005, and execute the spectrogram matching method according to the embodiment of the present invention.
Based on the hardware architecture, embodiments of the spectrogram matching method of the present invention are provided.
The embodiment of the invention provides a spectrogram matching method.
Referring to fig. 2, fig. 2 is a flowchart illustrating a spectrogram matching method according to a first embodiment of the present invention.
In this embodiment, the spectrogram matching method includes the following steps:
Step S10, acquiring a specimen spectrogram and acquiring a sample spectrogram;
A spectrogram is a common way of representing speech data and is often used in speech processing; tasks such as speech recognition and identity recognition are carried out by matching spectrograms against one another. In a spectrogram, the x-axis represents frequency and the y-axis represents amplitude. The traditional spectrogram matching method judges how well spectrograms match by manually comparing their differences, which is time-consuming and inefficient. In view of this, this embodiment provides a spectrogram matching method that acquires specimen phoneme information of a target phoneme in a specimen spectrogram, acquires sample phoneme information of the target phoneme in a sample spectrogram, and performs a similarity calculation on the specimen phoneme information and the sample phoneme information to determine the spectrogram matching degree, thereby realizing intelligent matching of spectrograms.
The spectrogram matching method in this embodiment is implemented by spectrogram matching equipment, which may be a personal computer, a notebook computer, a mobile terminal (e.g., a mobile phone), and the like; in this embodiment, a computer is taken as an example. The computer first acquires a specimen spectrogram, which is the target object that currently needs to be processed. The computer's database stores a plurality of sample spectrograms collected in advance; when the matching process starts, the sample spectrograms are retrieved from the database and matched against the specimen spectrogram. Alternatively, the sample spectrogram may be obtained by converting similar sample audio, where that sample audio is matched from a preset sample database in some way (e.g., voiceprint recognition) based on the specimen audio corresponding to the specimen spectrogram.
Specifically, step S10 includes:
Step a1, acquiring a specimen audio, and converting the specimen audio into a specimen spectrogram based on a preset rule;
Step a2, acquiring a corresponding sample audio from a preset sample database according to the specimen audio, and acquiring a sample spectrogram corresponding to the sample audio.
It should be noted that, in this embodiment, the computer initially acquires only the specimen audio. To allow the subsequent matching, the computer converts the specimen audio, once acquired, into the corresponding specimen spectrogram based on a preset rule. The conversion may be based on a Fourier transform; other software or programs may also be used, and the specific conversion method may follow the prior art and is not described here. In this way, even if only a piece of audio is obtained, for example a recording of a certain person's speech, the audio can be converted into a specimen spectrogram and then processed.
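As an illustration of the Fourier-based conversion mentioned above, the following is a minimal sketch that computes a frequency-versus-amplitude spectrum with a naive discrete Fourier transform. The function name `amplitude_spectrum`, the test signal, and the amplitude scaling are assumptions made for this example; the source does not specify the preset rule, and a practical implementation would normally use a fast Fourier transform library.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate):
    """Return (frequencies_hz, amplitudes) of a real signal via a naive DFT.

    Hypothetical helper: only the non-negative frequency bins are returned,
    and each amplitude is the DFT magnitude divided by the signal length.
    """
    n = len(samples)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        freqs.append(k * sample_rate / n)
        amps.append(abs(coeff) / n)
    return freqs, amps


# Assumed input: one second of a 5 Hz sine wave sampled at 40 Hz.
sample_rate = 40
signal = [math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(sample_rate)]
freqs, amps = amplitude_spectrum(signal, sample_rate)

# The bin with the largest amplitude should sit at the sine's frequency.
peak_index = max(range(len(amps)), key=amps.__getitem__)
```

The naive transform costs O(n²) operations, so it is suitable only for short illustrative signals.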
To acquire the sample spectrogram, a corresponding sample audio may be retrieved from a preset sample database according to the specimen audio, and the sample spectrogram corresponding to that sample audio is then acquired. The sample audio that is similar to the specimen audio may be found by means such as voiceprint recognition or voice feature matching. In practical applications where identity is matched through spectrogram matching, this reduces the number of sample spectrograms that need to be matched and so improves the efficiency of identity matching. In addition, the sample spectrogram may be obtained by converting the sample audio at matching time, or obtained directly if the sample audio was converted in advance.
Step S20, acquiring specimen phoneme information of a target phoneme in the specimen spectrogram, and acquiring sample phoneme information of the target phoneme in the sample spectrogram;
In this embodiment, once the specimen spectrogram and the sample spectrogram are obtained, the specimen phoneme information of the target phoneme in the specimen spectrogram is acquired, and the sample phoneme information of the target phoneme in the sample spectrogram is acquired. The target phoneme may be preset by the system or selected by a user, and there may be one target phoneme or several (i.e., two or more); the specimen phoneme information and the sample phoneme information may include, but are not limited to, frequency, amplitude, peak value, trough value, and period.
Specifically, the specimen phoneme information and the sample phoneme information are acquired as follows:
The phoneme points corresponding to the target phonemes are labeled in the specimen spectrogram and in the sample spectrogram, yielding specimen phoneme points and sample phoneme points. Because the specimen phoneme points are recorded in the specimen spectrogram, the computer can read from the specimen spectrogram the specimen phoneme information corresponding to each specimen phoneme point, such as frequency, amplitude, peak value, trough value, and period; similarly, because the sample phoneme points are recorded in the sample spectrogram, the computer can read from the sample spectrogram the sample phoneme information corresponding to each sample phoneme point, such as frequency, amplitude, peak value, trough value, and period.
It is understood that the same target phoneme may appear one or more times in a piece of audio and may therefore correspond to one or more points on the spectrogram. When obtaining the specimen phoneme information, the specimen phoneme information of the several phoneme points corresponding to the same target phoneme may be averaged to obtain the final specimen phoneme information. For example, suppose the phoneme r has three specimen phoneme points 1, 2, and 3 in the specimen spectrogram. After the amplitudes $A_{r1}$, $A_{r2}$, and $A_{r3}$ at specimen phoneme points 1, 2, and 3 are obtained, their average can be calculated as $A_r = (A_{r1} + A_{r2} + A_{r3})/3$.
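The averaging over several phoneme points of the same target phoneme can be sketched as follows. The dictionary layout, the attribute names, and the numeric values are assumptions chosen for illustration; the source only states that the attributes of the phoneme points are averaged.

```python
from statistics import mean


def average_phoneme_info(points):
    """Average every attribute over all phoneme points of one target phoneme.

    `points` is a list of dicts that share the same keys (hypothetical layout).
    """
    keys = points[0].keys()
    return {key: mean(point[key] for point in points) for key in keys}


# Assumed example: three phoneme points recorded for the phoneme r.
points_r = [
    {"frequency": 300.0, "amplitude": 0.8},
    {"frequency": 310.0, "amplitude": 0.6},
    {"frequency": 320.0, "amplitude": 0.7},
]
info_r = average_phoneme_info(points_r)
```

Here `info_r["amplitude"]` plays the role of the averaged amplitude of phoneme r in the example above.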
Step S30, calculating a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determining the spectrogram matching degree between the specimen spectrogram and the sample spectrogram according to the phoneme similarity.
In this embodiment, once the specimen phoneme information and the sample phoneme information are obtained, the computer can calculate the phoneme similarity from them. The phoneme similarity represents how much the pronunciation of the same phoneme differs between the specimen spectrogram and the sample spectrogram: the larger the phoneme similarity, the smaller the difference in pronunciation of the same phoneme between the two spectrograms; the smaller the phoneme similarity, the larger the difference.
Specifically, the step of calculating the phoneme similarity according to the specimen phoneme information and the sample phoneme information includes:
Step b1, converting the specimen phoneme information into a corresponding specimen phoneme vector, and converting the sample phoneme information into a corresponding sample phoneme vector;
In this embodiment, to calculate the phoneme similarity, the computer first converts the specimen phoneme information into a corresponding specimen phoneme vector and converts the sample phoneme information into a corresponding sample phoneme vector. The two conversions should follow the same rules. Specifically, suppose a vector is defined to contain three types of phoneme information a, b, and c, arranged in the order a, b, c. The three attributes a, b, and c are first taken from the specimen phoneme information and mapped to corresponding attribute values according to a given numerical mapping relationship; the values are then arranged in the order a, b, c to obtain the specimen phoneme vector corresponding to the specimen phoneme information. The sample phoneme information is converted into the corresponding sample phoneme vector in the same way. The types of phoneme information included in the vector, their order, and the numerical mapping between attributes and values can be set according to actual conditions.
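A minimal sketch of the conversion from phoneme information to a vector is given below. The attribute names in `ATTRIBUTE_ORDER` and the identity mapping from attribute to number are assumptions; the source leaves the attribute types, their order, and the numerical mapping to be set according to actual conditions.

```python
# Assumed attribute order, standing in for the a, b, c ordering in the text.
ATTRIBUTE_ORDER = ("frequency", "amplitude", "period")


def to_phoneme_vector(info, order=ATTRIBUTE_ORDER):
    """Arrange the selected attributes of `info` in a fixed order.

    The numerical mapping used here is simply float(value), an assumption.
    """
    return [float(info[name]) for name in order]


# The key order of the input dict does not affect the resulting vector.
vector = to_phoneme_vector({"period": 0.004, "frequency": 310, "amplitude": 0.7})
```

Using one shared `order` for both spectrograms keeps the two conversions consistent, as the text requires.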
Step b2, calculating the vector similarity between the specimen phoneme vector and the sample phoneme vector, and determining the phoneme similarity according to the vector similarity.
Once the specimen phoneme vector and the sample phoneme vector are obtained, the computer can calculate the vector similarity between them and then determine the phoneme similarity from the vector similarity; for example, the vector similarity may be used directly as the phoneme similarity, or it may first undergo a linear transformation. The vector similarity may be calculated with different formulas according to actual needs, for example the cosine similarity formula, the Euclidean distance formula, or the Chebyshev distance formula. In this way, the specimen phoneme information and the sample phoneme information are each converted into vectors that are then used for the similarity calculation, so that the similarity between phonemes is expressed as a number, which makes it convenient to describe the matching relationship between spectrograms quantitatively.
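The three measures named above can be sketched with the standard library as follows. The function names are chosen for this example. Note that the Euclidean and Chebyshev formulas yield distances, where a smaller value means the vectors are closer; how a distance is turned into a similarity is not specified in the source and is left open here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def chebyshev_distance(u, v):
    """Largest absolute difference along any single coordinate."""
    return max(abs(a - b) for a, b in zip(u, v))
```

If the attributes differ greatly in scale (for example frequency in hertz against period in seconds), the largest attribute dominates all three measures, so some normalization in the numerical mapping step may be worth considering.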
Because the calculated phoneme similarity represents how much the pronunciation of the same phoneme differs between the specimen spectrogram and the sample spectrogram, the spectrogram matching degree between the two spectrograms can be determined from it, so that the matching relationship between spectrograms is described quantitatively through the phoneme information of the same phoneme in different spectrograms. When determining the spectrogram matching degree from the phoneme similarity, the phoneme similarity may be used directly as the spectrogram matching degree; or the phoneme similarity may undergo a linear transformation to obtain the spectrogram matching degree; or different similarity ranges may be set in advance, each corresponding to a different spectrogram matching degree, and the spectrogram matching degree is then determined by the range into which the phoneme similarity falls.
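The range-based option can be sketched as a small lookup. The band boundaries and the labels "high", "medium", and "low" are hypothetical values chosen for illustration; the source only says that preset similarity ranges correspond to different matching degrees.

```python
# Hypothetical bands: (inclusive lower bound, matching degree), highest first.
BANDS = [(0.9, "high"), (0.7, "medium"), (0.0, "low")]


def matching_degree_from_ranges(similarity, bands=BANDS):
    """Return the matching degree of the first band whose lower bound is met."""
    for lower_bound, degree in bands:
        if similarity >= lower_bound:
            return degree
    raise ValueError("similarity lies below every configured range")
```

With these assumed bands, a phoneme similarity of 0.95 maps to "high" and 0.7 maps to "medium".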
The embodiment of the invention provides a spectrogram matching method, which comprises: acquiring a specimen spectrogram and acquiring a sample spectrogram; acquiring specimen phoneme information of a target phoneme in the specimen spectrogram, and acquiring sample phoneme information of the target phoneme in the sample spectrogram; and calculating a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determining the spectrogram matching degree between the specimen spectrogram and the sample spectrogram according to the phoneme similarity. In this way, the embodiment obtains the specimen phoneme information and the sample phoneme information of the target phoneme and determines the spectrogram matching degree through a similarity calculation on the two, which realizes intelligent matching of spectrograms and improves the efficiency of spectrogram matching.
Based on the first embodiment of the spectrogram matching method, a second embodiment of the spectrogram matching method of the present invention is provided.
Referring to fig. 3, fig. 3 is a flowchart illustrating a spectrogram matching method according to a second embodiment of the present invention.
In this embodiment, before the step S20, the method further includes:
Step S40, correcting the specimen spectrogram to obtain a corrected specimen spectrogram;
Because the specimen audio is affected by the environment in which it was collected, the generated spectrogram may be irregular and contain deviations, which affects the accuracy of the spectrogram matching result. In this embodiment, therefore, the specimen spectrogram is corrected after it is acquired, so that the specimen spectrogram used in the subsequent matching is regular and the accuracy of the matching result is not impaired.
Specifically, after the specimen spectrogram is acquired, it is corrected to obtain a corrected specimen spectrogram.
Specifically, step S40 includes:
Step c1, acquiring amplitude curve information corresponding to each specimen phoneme in the specimen spectrogram, and calculating corresponding deviation information according to the amplitude curve information;
Step c2, correcting the specimen spectrogram based on the deviation information to obtain a corrected specimen spectrogram.
The specimen spectrogram is corrected as follows. First, the amplitude curve information corresponding to each specimen phoneme in the specimen spectrogram is acquired; the amplitude curve information may include the peak frequency corresponding to a peak, the trough frequency corresponding to a trough, the center point frequency corresponding to the center point value, the peak value, and the trough value. Next, the corresponding deviation information is calculated from the amplitude curve information; the deviation information includes horizontal-axis deviation information and vertical-axis deviation information. Specifically, the average frequency corresponding to each specimen phoneme may be calculated in the manner described in the first embodiment; the horizontal-axis deviation information is then determined by checking, for each phoneme point, whether the difference between the peak frequency and the center point frequency differs from the corresponding average frequency, and whether the difference between the trough frequency and the center point frequency differs from the corresponding average frequency. Likewise, the average amplitude corresponding to each specimen phoneme may be calculated in the manner described in the first embodiment, and the vertical-axis deviation information is determined by checking whether the absolute value of the peak value or trough value of each phoneme point differs from the corresponding average amplitude. Finally, the specimen spectrogram is corrected based on the deviation information to obtain the corrected specimen spectrogram.
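The vertical-axis comparison described above can be sketched as follows. The field names `amplitude` and `extreme_value`, the use of a signed difference as the deviation, and the example values are all assumptions; the source does not define the form of the deviation information or how the correction itself is applied, so this sketch stops at computing the deviations.

```python
from statistics import mean


def vertical_axis_deviations(points):
    """Compare each point's |peak or trough value| with the average amplitude.

    `points` holds the phoneme points of one phoneme (hypothetical layout).
    A result of 0.0 means the point shows no vertical-axis deviation.
    """
    average_amplitude = mean(point["amplitude"] for point in points)
    return [abs(point["extreme_value"]) - average_amplitude for point in points]


# Assumed example: three phoneme points, the second being a trough.
points = [
    {"amplitude": 1.0, "extreme_value": 1.0},
    {"amplitude": 3.0, "extreme_value": -3.0},
    {"amplitude": 2.0, "extreme_value": 2.0},
]
deviations = vertical_axis_deviations(points)
```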
At this time, step S20 includes:
acquiring specimen phoneme information of the target phoneme in the corrected specimen spectrogram.
Then, the specimen phoneme information of the target phoneme in the corrected specimen spectrogram is acquired, and the sample phoneme information of the target phoneme in the sample spectrogram is acquired, so that the subsequent steps can be performed.
In this embodiment, the specimen spectrogram is corrected after it is acquired, so that the specimen spectrogram used in the subsequent matching is regular and the accuracy of the spectrogram matching result is not impaired.
Based on the first embodiment of the spectrogram matching method, a third embodiment of the spectrogram matching method of the present invention is provided.
In this embodiment, before the step S20, the method further includes:
Step A: detecting whether the peaks and troughs in the specimen spectrogram have been labeled;
In this embodiment, to facilitate the subsequent acquisition of specimen phoneme information, after the specimen spectrogram is acquired it may first be detected whether its peaks and troughs have been labeled.
If the peaks and troughs in the specimen spectrogram have been labeled, step S20 is executed: acquiring specimen phoneme information of a target phoneme in the specimen spectrogram;
That is, if the peaks and troughs have already been labeled, the specimen phoneme information of the target phoneme is obtained directly from the specimen spectrogram, and the subsequent steps are then performed.
If the peaks and troughs in the specimen spectrogram have not been labeled, step B is executed: labeling the peaks and troughs in the specimen spectrogram to obtain a labeled specimen spectrogram;
At this time, step S20 includes:
acquiring specimen phoneme information of the target phoneme in the labeled specimen spectrogram, and acquiring sample phoneme information of the target phoneme in the sample spectrogram.
If the peaks and troughs in the specimen spectrogram have not been labeled, they are labeled to obtain a labeled specimen spectrogram. The labeling may proceed as follows: the computer acquires the ordinate corresponding to each time point in chronological order and compares the ordinates of adjacent time points. When the ordinate y1 at time t1 is smaller than the ordinate y2 at time t2, and y2 is larger than the ordinate y3 at time t3, the point corresponding to time t2 is determined to be a peak; when the ordinate y4 at time t4 is greater than the ordinate y5 at time t5, and y5 is less than the ordinate y6 at time t6, the point corresponding to time t5 is determined to be a trough. The peaks and troughs are then marked according to the detection result. Alternatively, in a specific embodiment, the specimen spectrogram may be sent to a corresponding working end so that a worker manually labels its peaks and troughs, after which the labeled specimen spectrogram returned by the working end is received. Then, the specimen phoneme information of the target phoneme in the labeled specimen spectrogram is obtained, and the sample phoneme information of the target phoneme in the sample spectrogram is obtained, so that the subsequent steps can be performed.
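The automatic labeling described above can be illustrated with a minimal sketch. It assumes the amplitude curve is already available as a list of ordinates ordered by time; the function name `label_peaks_and_troughs` and the example values are hypothetical and do not come from the patent. A point is labeled a peak when its ordinate is strictly larger than both neighbours and a trough when it is strictly smaller than both, so flat plateaus are left unlabeled in this sketch.

```python
from typing import List, Tuple


def label_peaks_and_troughs(ordinates: List[float]) -> Tuple[List[int], List[int]]:
    """Return (peak_indices, trough_indices) for a time-ordered list of ordinates.

    Hypothetical helper: each interior point is compared with the points at
    the adjacent earlier and later time points, as in the described procedure.
    """
    peaks: List[int] = []
    troughs: List[int] = []
    for i in range(1, len(ordinates) - 1):
        prev_y, cur_y, next_y = ordinates[i - 1], ordinates[i], ordinates[i + 1]
        if prev_y < cur_y and cur_y > next_y:
            peaks.append(i)      # y1 < y2 and y2 > y3: point at t2 is a peak
        elif prev_y > cur_y and cur_y < next_y:
            troughs.append(i)    # y4 > y5 and y5 < y6: point at t5 is a trough
    return peaks, troughs


# Illustrative values only.
example = [0.0, 2.0, 1.0, -1.0, 0.5, 3.0, 2.5]
peak_idx, trough_idx = label_peaks_and_troughs(example)
print(peak_idx, trough_idx)
```

The first and last points are never labeled because each has only one neighbour; a real implementation might also need smoothing or a tolerance to cope with noise, which the text does not specify.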
Based on the embodiments of the spectrogram matching method, a fourth embodiment of the spectrogram matching method of the present invention is provided.
In this embodiment, after step S30, the method further includes:
Step C: judging whether the spectrogram matching degree is greater than a preset threshold;
The spectrogram matching process of this embodiment may be applied to voice identity recognition, that is, a specimen spectrogram is matched against a sample spectrogram so that the identity corresponding to the specimen spectrogram can be determined from the spectrogram matching degree. Specifically, once the spectrogram matching degree between the specimen spectrogram and the sample spectrogram has been obtained, the computer may compare it with a preset threshold and judge whether it is greater than that threshold.
Step D: if the spectrogram matching degree is greater than the preset threshold, acquiring sample identity information corresponding to the sample spectrogram, and determining the specimen identity information of the specimen spectrogram according to the sample identity information.
In this embodiment, if the spectrogram matching degree is less than or equal to the preset threshold, the specimen spectrogram and the sample spectrogram are considered to match poorly and not to belong to the same identity. If the spectrogram matching degree is greater than the preset threshold, the two are considered to match well and to belong to the same identity; the computer can then acquire the sample identity information corresponding to the sample spectrogram and determine the specimen identity information of the specimen spectrogram from it, that is, determine the identity corresponding to the specimen spectrogram.
In this manner, the spectrogram matching process of this embodiment can be applied to voice identity recognition: when the spectrogram matching degree exceeds the preset threshold, the identity associated with the sample spectrogram is assigned to the specimen spectrogram.
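The decision in steps C and D can be sketched as a simple threshold comparison. The function name `identify_specimen`, the default threshold of 0.8, and the identity string are assumptions made purely for illustration; the text only states that the threshold is preset.

```python
from typing import Optional


def identify_specimen(matching_degree: float,
                      sample_identity: str,
                      threshold: float = 0.8) -> Optional[str]:
    """Return the sample identity if the matching degree exceeds the threshold.

    Hypothetical helper: a matching degree less than or equal to the threshold
    is treated as "not the same identity" and yields None.
    """
    if matching_degree > threshold:
        return sample_identity
    return None


# Illustrative values only.
print(identify_specimen(0.92, "speaker_A"))
print(identify_specimen(0.80, "speaker_A"))
```

The comparison is strict, so a matching degree exactly equal to the threshold is not accepted, which follows the "less than or equal to" case described in the text.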
In addition, the embodiment of the invention also provides a spectrogram matching device.
Referring to fig. 4, fig. 4 is a functional block diagram of a spectrogram matching apparatus according to a first embodiment of the present invention.
In this embodiment, the spectrogram matching apparatus includes:
a first obtaining module 10, configured to obtain a specimen spectrogram and obtain a sample spectrogram;
a second obtaining module 20, configured to obtain specimen phoneme information of a target phoneme in the specimen spectrogram, and obtain sample phoneme information of the target phoneme in the sample spectrogram;
a matching degree determining module 30, configured to calculate a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determine a spectrogram matching degree between the specimen spectrogram and the sample spectrogram according to the phoneme similarity.
Each virtual function module of the spectrogram matching apparatus is stored in the memory 1005 of the spectrogram matching device shown in fig. 1, and is used for implementing all functions of a spectrogram matching program; when executed by the processor 1001, the modules may perform spectrogram matching functions.
Further, the spectrogram matching apparatus further comprises:
a correction module, configured to correct the specimen spectrogram to obtain a corrected specimen spectrogram;
the second obtaining module 20 is specifically configured to acquire the specimen phoneme information of the target phoneme in the corrected specimen spectrogram.
Further, the correction module includes:
a first calculating unit, configured to acquire amplitude curve information corresponding to each specimen phoneme in the specimen spectrogram, and calculate corresponding deviation information according to the amplitude curve information;
a correcting unit, configured to correct the specimen spectrogram based on the deviation information to obtain a corrected specimen spectrogram.
Further, the spectrogram matching apparatus further comprises:
a detection module, configured to detect whether the peaks and troughs in the specimen spectrogram have been labeled;
the second obtaining module 20 is specifically configured to: if the peaks and troughs in the specimen spectrogram have been labeled, acquire specimen phoneme information of the target phoneme in the specimen spectrogram;
a labeling module, configured to label the peaks and troughs in the specimen spectrogram if they have not been labeled, so as to obtain a labeled specimen spectrogram;
the second obtaining module 20 is specifically configured to acquire the specimen phoneme information of the target phoneme in the labeled specimen spectrogram.
Further, the first obtaining module 10 includes:
a conversion unit, configured to acquire specimen audio and convert the specimen audio into a specimen spectrogram based on a preset rule;
an acquisition unit, configured to acquire corresponding sample audio from a preset sample database according to the specimen audio, and acquire a sample spectrogram corresponding to the sample audio.
Further, the matching degree determination module 30 includes:
a vector conversion unit, configured to convert the specimen phoneme information into a corresponding specimen phoneme vector, and convert the sample phoneme information into a corresponding sample phoneme vector;
a second calculating unit, configured to calculate the vector similarity between the specimen phoneme vector and the sample phoneme vector, and determine the phoneme similarity according to the vector similarity.
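As a hedged illustration of the vector similarity step, the sketch below uses cosine similarity. This is an assumption: the text does not name the similarity measure, nor does it say how phoneme information is encoded as a vector. The function name `cosine_similarity` and the example vectors are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(specimen_vec: Sequence[float],
                      sample_vec: Sequence[float]) -> float:
    """Cosine similarity between two equal-length phoneme vectors.

    Assumed measure for illustration; returns 0.0 if either vector is all zeros.
    """
    if len(specimen_vec) != len(sample_vec):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(specimen_vec, sample_vec))
    norm_a = math.sqrt(sum(a * a for a in specimen_vec))
    norm_b = math.sqrt(sum(b * b for b in sample_vec))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative vectors only; real vectors would be derived from phoneme information.
specimen_phoneme_vector = [1.0, 0.0, 2.0]
sample_phoneme_vector = [1.0, 0.0, 2.0]
print(cosine_similarity(specimen_phoneme_vector, sample_phoneme_vector))
```

Under this assumption the phoneme similarity could be taken directly as the cosine value, or mapped from it by some further rule; the text leaves that mapping open.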
Further, the spectrogram matching apparatus further comprises:
a matching degree judging module, configured to judge whether the spectrogram matching degree is greater than a preset threshold;
an information determining module, configured to acquire sample identity information corresponding to the sample spectrogram if the spectrogram matching degree is greater than the preset threshold, and determine the specimen identity information of the specimen spectrogram according to the sample identity information.
The function implementation of each module in the spectrogram matching apparatus corresponds to each step in the spectrogram matching method embodiment, and the function and implementation process thereof are not described in detail herein.
In addition, the embodiment of the invention also provides a computer readable storage medium.
The computer-readable storage medium of the present invention stores a spectrogram matching program, wherein the spectrogram matching program, when executed by a processor, implements the steps of the spectrogram matching method as described above.
The method implemented when the spectrogram matching program is executed may refer to various embodiments of the spectrogram matching method of the present invention, which are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A spectrogram matching method, comprising:
acquiring a specimen spectrogram and acquiring a sample spectrogram;
acquiring specimen phoneme information of a target phoneme in the specimen spectrogram, and acquiring sample phoneme information of the target phoneme in the sample spectrogram;
and calculating a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determining the spectrogram matching degree of the specimen spectrogram and the sample spectrogram according to the phoneme similarity.
2. The spectrogram matching method of claim 1, wherein before said step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram, the method further comprises:
correcting the specimen spectrogram to obtain a corrected specimen spectrogram;
the step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram comprises:
acquiring the specimen phoneme information of the target phoneme in the corrected specimen spectrogram.
3. The spectrogram matching method of claim 2, wherein said step of correcting said specimen spectrogram to obtain a corrected specimen spectrogram comprises:
acquiring amplitude curve information corresponding to each specimen phoneme in the specimen spectrogram, and calculating corresponding deviation information according to the amplitude curve information;
and correcting the specimen spectrogram based on the deviation information to obtain a corrected specimen spectrogram.
4. The spectrogram matching method of claim 1, wherein before said step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram, the method further comprises:
detecting whether the peaks and troughs in the specimen spectrogram have been labeled;
if the peaks and troughs in the specimen spectrogram have been labeled, executing the step of: acquiring specimen phoneme information of a target phoneme in the specimen spectrogram;
if the peaks and troughs in the specimen spectrogram have not been labeled, labeling the peaks and troughs in the specimen spectrogram to obtain a labeled specimen spectrogram;
the step of acquiring the specimen phoneme information of the target phoneme in the specimen spectrogram comprises:
acquiring the specimen phoneme information of the target phoneme in the labeled specimen spectrogram.
5. The spectrogram matching method of claim 1, wherein the step of acquiring a specimen spectrogram and acquiring a sample spectrogram comprises:
acquiring specimen audio, and converting the specimen audio into a specimen spectrogram based on a preset rule;
and acquiring corresponding sample audio from a preset sample database according to the specimen audio, and acquiring a sample spectrogram corresponding to the sample audio.
6. The spectrogram matching method of claim 1, wherein said step of calculating a phoneme similarity according to said specimen phoneme information and said sample phoneme information comprises:
converting the specimen phoneme information into a corresponding specimen phoneme vector, and converting the sample phoneme information into a corresponding sample phoneme vector;
and calculating the vector similarity of the specimen phoneme vector and the sample phoneme vector, and determining the phoneme similarity according to the vector similarity.
7. The spectrogram matching method of any one of claims 1 to 6, wherein after the step of calculating a phoneme similarity according to the specimen phoneme information and the sample phoneme information and determining the spectrogram matching degree of the specimen spectrogram and the sample spectrogram according to the phoneme similarity, the method further comprises:
judging whether the spectrogram matching degree is greater than a preset threshold;
and if the spectrogram matching degree is greater than the preset threshold, acquiring sample identity information corresponding to the sample spectrogram, and determining the specimen identity information of the specimen spectrogram according to the sample identity information.
8. A spectrogram matching apparatus, comprising:
a first obtaining module, configured to obtain a specimen spectrogram and obtain a sample spectrogram;
a second obtaining module, configured to obtain specimen phoneme information of a target phoneme in the specimen spectrogram, and obtain sample phoneme information of the target phoneme in the sample spectrogram;
and a matching degree determining module, configured to calculate a phoneme similarity according to the specimen phoneme information and the sample phoneme information, and determine the spectrogram matching degree of the specimen spectrogram and the sample spectrogram according to the phoneme similarity.
9. A spectrogram matching device comprising a processor, a memory, and a spectrogram matching program stored on the memory and executable by the processor, wherein the spectrogram matching program, when executed by the processor, implements the steps of the spectrogram matching method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a spectrogram matching program is stored, wherein the spectrogram matching program, when executed by a processor, implements the steps of the spectrogram matching method of any one of claims 1 to 7.
CN202010405211.9A 2020-05-13 2020-05-13 Spectrogram matching method, device, equipment and computer readable storage medium Active CN111640454B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010405211.9A CN111640454B (en) 2020-05-13 2020-05-13 Spectrogram matching method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010405211.9A CN111640454B (en) 2020-05-13 2020-05-13 Spectrogram matching method, device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111640454A true CN111640454A (en) 2020-09-08
CN111640454B CN111640454B (en) 2023-08-11

Family

ID=72333212

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010405211.9A Active CN111640454B (en) 2020-05-13 2020-05-13 Spectrogram matching method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111640454B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114267375A (en) * 2021-11-24 2022-04-01 北京百度网讯科技有限公司 Phoneme detection method and device, training method and device, equipment and medium
CN114429770A (en) * 2022-04-06 2022-05-03 北京普太科技有限公司 Sound data testing method and device of tested equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102664017A (en) * 2012-04-25 2012-09-12 武汉大学 Three-dimensional (3D) audio quality objective evaluation method
CN105355212A (en) * 2015-10-14 2016-02-24 天津大学 Firm underdetermined blind separation source number and hybrid matrix estimating method and device
CN109448707A (en) * 2018-12-18 2019-03-08 北京嘉楠捷思信息技术有限公司 Voice recognition method and device, equipment and medium
CN109817223A (en) * 2019-01-29 2019-05-28 广州势必可赢网络科技有限公司 Phoneme marking method and device based on audio fingerprints
CN110189749A (en) * 2019-06-06 2019-08-30 四川大学 Voice keyword automatic identifying method
CN110223673A (en) * 2019-06-21 2019-09-10 龙马智芯(珠海横琴)科技有限公司 The processing method and processing device of voice, storage medium, electronic equipment
CN110634490A (en) * 2019-10-17 2019-12-31 广州国音智能科技有限公司 Voiceprint identification method, device and equipment
CN111063342A (en) * 2020-01-02 2020-04-24 腾讯科技(深圳)有限公司 Speech recognition method, speech recognition device, computer equipment and storage medium
CN111108552A (en) * 2019-12-24 2020-05-05 广州国音智能科技有限公司 Voiceprint identity identification method and related device
CN111133508A (en) * 2019-12-24 2020-05-08 广州国音智能科技有限公司 Method and device for selecting comparison phonemes

Also Published As

Publication number Publication date
CN111640454B (en) 2023-08-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant