CN112967717B - Fuzzy matching training method for English speech translation with high accuracy - Google Patents


Info

Publication number
CN112967717B
CN112967717B CN202110223114.2A
Authority
CN
China
Prior art keywords
voice
programmable device
fuzzy
matching
online
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110223114.2A
Other languages
Chinese (zh)
Other versions
CN112967717A (en)
Inventor
王晓靖
张敏
李琦
丁桂芝
牛明敏
张晨曦
郭晓斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Railway Vocational and Technical College
Original Assignee
Zhengzhou Railway Vocational and Technical College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Railway Vocational and Technical College filed Critical Zhengzhou Railway Vocational and Technical College
Priority to CN202110223114.2A priority Critical patent/CN112967717B/en
Publication of CN112967717A publication Critical patent/CN112967717A/en
Application granted granted Critical
Publication of CN112967717B publication Critical patent/CN112967717B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/33 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using fuzzy logic
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a high-accuracy fuzzy matching method for English speech translation, comprising the following steps: S1, acquiring voice information and converting it into a digital voice signal; S2, performing voice detection on the digital voice signal with fuzzy sound-image rules to obtain the corresponding matching parameters, and deriving a reconfiguration data stream for the programmable device from those parameters with a genetic algorithm; S3, reconfiguring the programmable device with the reconfiguration data stream; S4, the online voice receiving module retrieves online voice data guided by the matching parameters and roughly sorts it, and the retrieved online voice data is input to the programmable device for fuzzy matching; S5, the best-matched online voice data is found and output.

Description

Fuzzy matching training method for English speech translation with high accuracy
Technical Field
The invention relates to the field of English translation, in particular to a fuzzy matching method for English speech translation with high accuracy.
Background
With the development of internet technology, English translation has grown steadily more intelligent: computer-aided translation systems now handle both text and speech translation, and handheld devices such as mobile-phone apps can even perform online speech translation.
In the prior art, speech is generally converted to text, an algorithm then matches the text against sentences in a translation library, and the result with the highest similarity is output. For example, CN201710532235.9 applies information-retrieval techniques to build an index over a large-scale translation memory and uses a coarse-then-fine strategy: a matched subset is first drawn from the index according to the input sentence to be translated, the final translation is then selected by fuzzy matching with a linear combination of semantic-vector similarity and sentence edit distance, and manually edited translations together with their source-language segments are finally written back to the translation memory for incremental updating.
Such schemes are not efficient. The text recognition rate also depends on how accurately the English is pronounced, and after text matching the text information must be checked again to obtain the translation and its meaning re-interpreted, which hurts communication efficiency.
Disclosure of Invention
To solve the problems described in the background, the invention provides the following technical scheme: a high-accuracy fuzzy matching method for English speech translation, comprising the following steps:
s1, acquiring voice information and converting the voice information into a digital voice signal;
s2, carrying out voice detection on the digital voice signal by adopting a fuzzy sound image rule to obtain corresponding matching parameters, and obtaining a reconfiguration data stream of the programmable device by adopting genetic algorithm operation according to the matching parameters;
s3, reconfiguring the programmable device by adopting the reconfiguration data stream;
S4, the online voice receiving module retrieves online voice data guided by the matching parameters and roughly sorts it; the retrieved online voice data is input to the programmable device for fuzzy matching;
S5, the best-matched online voice data is found and output.
The fuzzy matching algorithm in step S4 is as follows: several matching parameters serve as fuzzy rules R, and the digitized voice serves as the input quantity X; when X activates several fuzzy rules R, the output U is determined by combining the outputs of those rules.
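The rule-combination scheme just described can be sketched as follows. The patent does not specify a membership function or a defuzzification method, so the Gaussian membership and activation-weighted average below are illustrative assumptions; the `rules` values are made-up matching parameters.

```python
import math

def gaussian_membership(x, center, width):
    """Degree to which input x activates a rule centered at `center` (assumed form)."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def fuzzy_match(x, rules, width=1.0):
    """Combine the outputs of all rules activated by x into a single output U.

    `rules` is a list of (center, output) pairs: each center plays the role
    of a matching parameter, each output is the rule's consequent. U is the
    activation-weighted average of the rule outputs.
    """
    weights = [gaussian_membership(x, c, width) for c, _ in rules]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * out for w, (_, out) in zip(weights, rules)) / total

# three hypothetical rules; an input near a rule center is dominated by that rule
rules = [(0.0, 10.0), (5.0, 20.0), (10.0, 30.0)]
u = fuzzy_match(5.0, rules)
```

An input midway between two rule centers activates both equally, so U falls midway between their outputs, which is the behavior the "output of a plurality of fuzzy rules" wording implies.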
The genetic algorithm in the step S2 is specifically as follows:
s21, randomly generating chromosome individuals;
s22, calculating the fitness value of the individual;
s23, randomly performing mutation operation on the individuals to generate offspring individuals;
S24, perform the selection operation: if the offspring's fitness value is higher than the parent's, copy the offspring to the next generation; otherwise copy it to the next generation only with a small probability. Repeat until the termination condition is met.
The fuzzy sound-image rules in step S2 describe sound-image features through rules built from mel-frequency cepstral coefficient (MFCC) analysis, short-time energy and short-time average zero-crossing rate statistics, and formant extraction based on spectral analysis; the rule outputs, namely the MFCCs, short-time energy, short-time average zero-crossing rate, and formants, are used as the matching parameters.
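Two of these matching parameters, short-time energy and zero-crossing rate, can be computed in a few lines; this is a generic illustration of the named features, not the patent's implementation, and MFCC and formant extraction (which need an FFT front end) are omitted.

```python
import math

def frame_features(frame):
    """Short-time energy and short-time zero-crossing rate for one analysis frame."""
    energy = sum(s * s for s in frame)
    zero_crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = zero_crossings / (len(frame) - 1)
    return energy, zcr

# a 100-sample frame of a pure tone: 5 cycles, so it crosses zero regularly
frame = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
energy, zcr = frame_features(frame)
```

Voiced speech shows high energy and low zero-crossing rate, unvoiced speech the opposite, which is what makes these two statistics useful voice-detection parameters.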
The configurable logic blocks (lookup tables, LUTs) in the programmable device serve as chromosomes; the genetic algorithm finds the optimal chromosome, from which a reconfiguration data stream is generated and loaded onto the programmable device. The circuit is thus reconfigured as the input voice changes, so that voice data is matched more effectively.
The invention also provides an English speech translation fuzzy matching system that translates more accurately. One technical scheme is as follows: an English speech translation fuzzy matching system comprises a processor, a memory, a programmable device, an online voice receiving module, a voice acquisition module, and a display and playback module;
the processor and the programmable device are respectively connected with the voice acquisition module; the processor is electrically connected with the memory, the programmable device, the online voice receiving module, the voice acquisition module and the display playing module respectively;
the online voice receiving module and the memory are connected with the programmable device;
the voice acquisition module acquires voice information and converts the voice information into a digital voice signal.
The processor receives the digital voice signal, performs voice detection on it with the fuzzy sound-image rules to obtain the corresponding matching parameters, derives the reconfiguration data stream for the programmable device from those parameters with a genetic algorithm, stores the stream in the memory, and drives the memory and the programmable device to carry out the reconfiguration.
The matching parameters also guide retrieval: the online voice receiving module retrieves online voice data using the matching parameters as guidance and roughly sorts it; the retrieved online voice data is input to the programmable device for fuzzy matching.
The programmable device is configured to implement the fuzzy matching algorithm: several matching parameters serve as fuzzy rules R and the digitized voice serves as the input quantity X; when X activates several fuzzy rules R, the output U is determined by combining the outputs of those rules.
Taking the digital voice signal and the online voice data as input, the programmable device runs the fuzzy matching algorithm, finds the best-matched online voice data, and outputs it to the processor, which combines it with the text translation and sends both to the display and playback module for playing and display.
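The end-to-end selection step can be sketched in software terms: compare the input's feature vector against each retrieved candidate and output the closest one. The inverse-distance similarity below stands in for the fuzzy-rule evaluation the device performs; the library entries and feature values are invented for illustration.

```python
def best_match(input_features, candidates):
    """Pick the online voice entry whose stored features best match the input.

    `candidates` maps an entry name to its feature vector (e.g. energy,
    zero-crossing rate, first-formant frequency).
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(candidates, key=lambda name: distance(input_features, candidates[name]))

# hypothetical online voice library: name -> (energy, zcr, formant Hz)
library = {
    "how are you": (0.52, 0.11, 690.0),
    "who are you": (0.48, 0.19, 720.0),
    "where are you": (0.61, 0.15, 655.0),
}
match = best_match((0.50, 0.12, 700.0), library)
```

In the patented system this comparison runs on the reconfigured device rather than in software, so the distance metric itself can change with the input voice.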
Matching parameters are obtained from the characteristics of the received voice information; a genetic algorithm then finds the optimal matching strategy and generates the reconfiguration data stream for the programmable device, which eases matching of online voice data against the input digital voice signal. Finally, the voice signals are matched directly by the fuzzy matching algorithm, and once accurately matched data is obtained according to the fuzzy rules, the translated voice and text are output. The best-matched voice translation can therefore be obtained from the characteristics of the voice input, and voice output is achieved quickly.
Drawings
FIG. 1 is a block diagram of a translation fuzzy matching system.
FIG. 2 is a schematic diagram of a translation fuzzy matching system.
Fig. 3 is a step diagram of a translation fuzzy matching system for english speech.
FIG. 4 is a diagram of a genetic algorithm implementation step.
Detailed Description
The embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described are evidently only some, not all, embodiments of the invention; all other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
Example 1
As shown in fig. 3, the present invention provides a specific embodiment: a fuzzy matching method for English speech translation with high accuracy comprises the following specific steps:
s1, acquiring voice information and converting the voice information into a digital voice signal;
s2, carrying out voice detection on the digital voice signal by adopting a fuzzy sound image rule to obtain corresponding matching parameters, and obtaining a reconfiguration data stream of the programmable device by adopting genetic algorithm operation according to the matching parameters;
s3, reconfiguring the programmable device by adopting the reconfiguration data stream;
S4, the online voice receiving module retrieves online voice data guided by the matching parameters and roughly sorts it; the retrieved online voice data is input to the programmable device for fuzzy matching;
S5, the best-matched online voice data is found and output.
The fuzzy matching algorithm in S4 is as follows: several matching parameters serve as fuzzy rules R, and the digitized voice serves as the input quantity X; when X activates several fuzzy rules R, the output U is determined by combining the outputs of those rules.
The genetic algorithm in S2 is specifically as follows:
s21, randomly generating chromosome individuals;
s22, calculating the fitness value of the individual;
s23, randomly performing mutation operation on the individuals to generate offspring individuals;
S24, perform the selection operation: if the offspring's fitness value is higher than the parent's, copy the offspring to the next generation; otherwise copy it to the next generation only with a small probability. Repeat until the termination condition is met.
The fuzzy sound-image rules in S2 describe sound-image features through rules built from mel-frequency cepstral coefficient (MFCC) analysis, short-time energy and short-time average zero-crossing rate statistics, and formant extraction based on spectral analysis; the rule outputs, namely the MFCCs, short-time energy, short-time average zero-crossing rate, and formants, are used as the matching parameters.
The configurable logic blocks (lookup tables, LUTs) in the programmable device 3 serve as chromosomes; the genetic algorithm finds the optimal chromosome, from which a reconfiguration data stream is generated and loaded onto the programmable device 3. The circuit is thus reconfigured as the input voice changes, so that voice data is matched more effectively.
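The chromosome-to-data-stream step can be pictured as packing LUT truth-table bits into configuration bytes. Real FPGA bitstream formats are proprietary and far more involved, so the framing below is purely illustrative; only the idea that a bit-string chromosome becomes a byte stream for the device comes from the text.

```python
def lut_chromosome_to_stream(bits, lut_width=4):
    """Pack a chromosome of LUT truth-table bits into configuration bytes.

    Each group of 2**lut_width bits is one LUT's truth table; the packed
    bytes stand in for the reconfiguration data stream written to the
    device. Illustrative only, not a real bitstream format.
    """
    assert len(bits) % 8 == 0, "chromosome length must be byte-aligned"
    stream = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        stream.append(byte)
    return bytes(stream)

# a 32-bit chromosome: two 16-entry LUT truth tables with alternating bits
chromosome = [1, 0, 1, 0, 1, 0, 1, 0] * 4
stream = lut_chromosome_to_stream(chromosome)
```

The genetic algorithm of steps S21 to S24 would mutate `bits` and score each candidate by how well the resulting circuit matches voice data, then the winning chromosome is packed and loaded.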
Example 2
As shown in fig. 1, a specific embodiment of the present invention is a translation fuzzy matching system for english language, which includes a processor 1, a memory 2, a programmable device 3, an online speech receiving module 4, a speech acquiring module 5, and a display playing module 6.
The processor 1 and the programmable device 3 are respectively connected with the voice acquisition module 5; the processor 1 is electrically connected with the memory 2, the programmable device 3, the online voice receiving module 4, the voice obtaining module 5 and the display playing module 6 respectively.
The online voice receiving module 4 and the memory 2 are connected with the programmable device 3;
the voice acquisition module 5 acquires voice information and converts the voice information into a digital voice signal;
the processor 1 receives the digital voice signal, performs voice detection on the digital voice signal by adopting a fuzzy sound image rule to obtain corresponding matching parameters, adopts genetic algorithm operation to obtain a reconfiguration data stream of the programmable device 3 according to the matching parameters, and stores the reconfiguration data stream into the memory 2, and the processor 1 controls the memory 2 and the programmable device 3 to reconfigure the programmable device 3.
The matching parameters also guide retrieval: the online voice receiving module 4 retrieves online voice data using the matching parameters as guidance and roughly sorts it; the retrieved online voice data is input to the programmable device 3 for fuzzy matching.
The programmable device 3 is configured to implement the fuzzy matching algorithm: several matching parameters serve as fuzzy rules R and the digitized voice serves as the input quantity X; when X activates several fuzzy rules R, the output U is determined by combining the outputs of those rules.
Taking the digital voice signal and the online voice data as input, the programmable device 3 runs the fuzzy matching algorithm, finds the best-matched online voice data, and outputs it to the processor 1, which combines it with the text translation and sends both to the display and playback module 6 for playing and display.
As shown in fig. 2, in a preferred embodiment the programmable device 3 is implemented with an FPGA 3': at start-up, the processor 1 powers up the FPGA 3' chip and loads the configuration data stream stored in the memory to complete configuration. The processor 1 can also regenerate the reconfiguration data stream at run time to reconfigure the FPGA 3', personalizing translation for different English voices.
In a preferred embodiment, the genetic algorithm may be a trend compact genetic algorithm.
as a preferred embodiment, the fuzzy sound image rule includes a feature description of a sound image implemented by a rule composed of a mel-frequency cepstrum coefficient analysis method, a short-time energy and short-time average zero-crossing rate statistical method, a formant extraction method obtained based on spectrum analysis and the like, and outputs of the rule such as the mel-frequency cepstrum coefficient, the short-time energy, the short-time average zero-crossing rate, the formant and the like are used as matching parameters.
In a preferred embodiment, the processor 1 may be implemented with an STM32-series microcontroller 1' or a DSP.
In a preferred embodiment, the online voice receiving module 4 remotely accesses an online voice feature library on a server and retrieves online voice features according to the matching parameters and the text converted from the digital voice signal; the search results are sorted and then sent to the programmable device 3 for fuzzy matching. A WebSocket server 4' can be set up to work with the processor 1.
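The "roughly sort, then fine-match" split can be sketched as a cheap coarse ranking of retrieved entries by how close their stored matching parameters are to the input's, before the device performs the expensive fuzzy matching. The L1 scoring and the candidate values below are illustrative assumptions.

```python
def rough_sort(candidates, params):
    """Coarsely rank retrieved entries by closeness of their stored
    matching parameters to the input's parameters (L1 distance, assumed).

    `candidates` is a list of (name, parameter-vector) pairs.
    """
    def score(entry):
        _, vec = entry
        return sum(abs(a - b) for a, b in zip(vec, params))
    return sorted(candidates, key=score)

# hypothetical retrieved clips with (energy, zero-crossing-rate) parameters
results = rough_sort(
    [("clip-a", (0.9, 0.3)), ("clip-b", (0.5, 0.1)), ("clip-c", (0.4, 0.25))],
    params=(0.45, 0.15),
)
```

Only the top of this coarse ranking needs to be streamed into the programmable device, which is what makes the two-stage arrangement efficient.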
In this embodiment, matching parameters are obtained from the characteristics of the received voice information; a genetic algorithm finds the optimal matching strategy and generates the reconfiguration data stream for the programmable device 3, which eases matching of online voice data against the input digital voice signal. Finally, the voice signals are matched directly by the fuzzy matching algorithm, and once accurately matched data is obtained according to the fuzzy rules, the translated voice and text are output. The best-matched voice translation can therefore be obtained from the characteristics of the voice input, and voice output is achieved quickly.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (1)

1. A fuzzy matching method for English speech translation with high accuracy is characterized in that: the matching system adopted by the matching method comprises a processor, a memory, a programmable device, an online voice receiving module, a voice acquisition module and a display playing module; the processor and the programmable device are respectively connected with the voice acquisition module; the processor is electrically connected with the memory, the programmable device, the online voice receiving module, the voice acquisition module and the display playing module respectively; the online voice receiving module and the memory are connected with the programmable device;
the fuzzy matching method comprises the following steps:
s1, the voice acquisition module acquires voice information and converts the voice information into a digital voice signal;
S2, performing voice detection on the digital voice signal with fuzzy sound-image rules to obtain corresponding matching parameters, and deriving a reconfiguration data stream for the programmable device from the matching parameters with a genetic algorithm; deriving the reconfiguration data stream with the genetic algorithm is specifically as follows: the configurable logic blocks in the programmable device serve as chromosomes, the genetic algorithm finds the optimal chromosome, and a reconfiguration data stream is generated from it and loaded onto the programmable device, so that the circuit is reconfigured as the input voice changes and voice data is matched more effectively; the fuzzy sound-image rules describe sound-image features through rules built from mel-frequency cepstral coefficient analysis, short-time energy and short-time average zero-crossing rate statistics, and formant extraction based on spectral analysis, and the rule outputs, namely the mel cepstrum coefficients, short-time energy, short-time average zero-crossing rate, and formants, are used as the matching parameters;
the genetic algorithm comprises the following specific implementation steps:
s21, randomly generating chromosome individuals;
s22, calculating the fitness value of the individual;
s23, randomly performing mutation operation on the individuals to generate offspring individuals;
s24, executing selection operation, if the fitness value of the child individual is higher than that of the individual, copying the child individual to the next generation, otherwise copying the child individual to the next generation with a small probability, and then repeating the steps until the termination condition is met;
s3, adopting the reconfiguration data flow to reconfigure the programmable device to realize the change of the programmable device according to the change of the input voice so as to match the voice data;
S4, the online voice receiving module retrieves online voice data guided by the matching parameters and roughly sorts it; the retrieved online voice data is input to the programmable device for fuzzy matching; the programmable device is configured to realize a fuzzy matching algorithm in which a plurality of matching parameters serve as fuzzy rules R and the digital voice signal serves as an input quantity X, and when the input quantity X activates a plurality of fuzzy rules R, an output U is determined by the outputs of those rules; the programmable device takes the digital voice signal and the online voice data as input, finds the best-matched online voice data through the fuzzy matching algorithm, outputs it to the processor for combination and text translation, and sends the result to the display and playback module for playing and display;
s5, outputting the most matched online voice data.
CN202110223114.2A 2021-03-01 2021-03-01 Fuzzy matching training method for English speech translation with high accuracy Active CN112967717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110223114.2A CN112967717B (en) 2021-03-01 2021-03-01 Fuzzy matching training method for English speech translation with high accuracy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110223114.2A CN112967717B (en) 2021-03-01 2021-03-01 Fuzzy matching training method for English speech translation with high accuracy

Publications (2)

Publication Number Publication Date
CN112967717A CN112967717A (en) 2021-06-15
CN112967717B true CN112967717B (en) 2023-08-22

Family

ID=76277532

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110223114.2A Active CN112967717B (en) 2021-03-01 2021-03-01 Fuzzy matching training method for English speech translation with high accuracy

Country Status (1)

Country Link
CN (1) CN112967717B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101464896A (en) * 2009-01-23 2009-06-24 安徽科大讯飞信息科技股份有限公司 Voice fuzzy retrieval method and apparatus
CN103811008A (en) * 2012-11-08 2014-05-21 中国移动通信集团上海有限公司 Audio frequency content identification method and device
CN105118501A (en) * 2015-09-07 2015-12-02 徐洋 Speech recognition method and system
CN106528599A (en) * 2016-09-23 2017-03-22 深圳凡豆信息科技有限公司 A rapid fuzzy matching algorithm for strings in mass audio data
CN107329961A (en) * 2017-07-03 2017-11-07 西安市邦尼翻译有限公司 A kind of method of cloud translation memory library Fast incremental formula fuzzy matching
CN109074804A (en) * 2018-07-18 2018-12-21 深圳魔耳智能声学科技有限公司 Voice recognition processing method, electronic equipment and storage medium based on accent
CN109471953A (en) * 2018-10-11 2019-03-15 平安科技(深圳)有限公司 A kind of speech data retrieval method and terminal device
CN109616120A (en) * 2019-02-20 2019-04-12 上海昊沧系统控制技术有限责任公司 The interior exchange method of the voice-based application of one kind and system
CN112101111A (en) * 2020-08-13 2020-12-18 吕梁学院 English recognition and translation method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101537370B1 (en) * 2013-11-06 2015-07-16 주식회사 시스트란인터내셔널 System for grasping speech meaning of recording audio data based on keyword spotting, and indexing method and method thereof using the system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fuzzy semantic selection techniques in English semantic machine translation; Pi Jinyu; Modern Electronics Technique (Issue 22); full text *

Also Published As

Publication number Publication date
CN112967717A (en) 2021-06-15

Similar Documents

Publication Publication Date Title
CN110491382B (en) Speech recognition method and device based on artificial intelligence and speech interaction equipment
CN107369439B (en) Voice awakening method and device
CN101937431A (en) Emotional voice translation device and processing method
JP2007249212A (en) Method, computer program and processor for text speech synthesis
CN106935239A (en) The construction method and device of a kind of pronunciation dictionary
CN110459202B (en) Rhythm labeling method, device, equipment and medium
CN110782880B (en) Training method and device for prosody generation model
CN112349289B (en) Voice recognition method, device, equipment and storage medium
CN110853616A (en) Speech synthesis method, system and storage medium based on neural network
CN110010136A (en) The training and text analyzing method, apparatus, medium and equipment of prosody prediction model
KR101424193B1 (en) System And Method of Pronunciation Variation Modeling Based on Indirect data-driven method for Foreign Speech Recognition
Narendra et al. Optimal weight tuning method for unit selection cost functions in syllable based text-to-speech synthesis
WO2023245389A1 (en) Song generation method, apparatus, electronic device, and storage medium
CN112530402B (en) Speech synthesis method, speech synthesis device and intelligent equipment
CN112967717B (en) Fuzzy matching training method for English speech translation with high accuracy
CN112966528B (en) English speech translation fuzzy matching system
US20040181407A1 (en) Method and system for creating speech vocabularies in an automated manner
Dumpala et al. Analysis of constraints on segmental DTW for the task of query-by-example spoken term detection
KR100259777B1 (en) Optimal synthesis unit selection method in text-to-speech system
CN114783424A (en) Text corpus screening method, device, equipment and storage medium
Sefara et al. Web-based automatic pronunciation assistant
JP3981640B2 (en) Sentence list generation device for phoneme model learning and generation program
CN113870826A (en) Pronunciation duration prediction method based on duration prediction model and related equipment
Lu et al. The USTC system for blizzard challenge 2009
JP2001092482A (en) Speech synthesis system and speech synthesis method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant