CN108628837A

CN108628837A - Simultaneous interpretation box using a convolutional neural network algorithm to translate Spanish and Sichuan-accented speech

Info

Publication number: CN108628837A
Application number: CN201710172489.4A
Authority: CN
Inventors: 邱念
Original assignee: Hunan Original Culture Development Co Ltd
Current assignee: Hunan Original Culture Development Co Ltd
Priority date: 2017-03-22
Filing date: 2017-03-22
Publication date: 2018-10-09

Abstract

The invention discloses a simultaneous interpretation box that uses a convolutional neural network algorithm to translate Spanish and Sichuan-accented speech. It consists of: component one, the main body of the simultaneous interpretation box and its accessories; component two, a large database in cloud storage; and component three, a convolutional neural network model hosted in a cloud computing center. With these components, the invention can take the place of a professional senior simultaneous interpreter and translate for the user. The benefits are: long translation sessions do not lead to errors caused by fatigue; the cost of hiring simultaneous interpreters is greatly reduced; and, because no person needs to sit inside the box, its volume is very small, which saves space at international conference venues so that more seats can be placed and more participants can attend.

Description

Simultaneous interpretation box using a convolutional neural network algorithm to translate Spanish and Sichuan-accented speech

Technical field

The present invention relates to intelligent simultaneous interpretation boxes, and more particularly to a simultaneous interpretation box that uses a convolutional neural network algorithm to translate Spanish and Sichuan-accented speech.

Background art

As internationalization accelerates, the demand for simultaneous interpretation keeps growing, and the simultaneous interpretation box (booth) is an indispensable translation tool at international conferences. A traditional simultaneous interpretation box is a closed glass enclosure in which the interpreter sits and, through a microphone, interprets what the speakers at the venue are saying. Traditional boxes have several drawbacks. Their overall volume is large and takes up venue space. Their closed structure is poorly ventilated and can easily leave the interpreter short of oxygen; under high-intensity work, the interpreter tires as the meeting goes on, which affects translation accuracy. Hiring professional simultaneous interpreters for an international conference is also expensive. If a speaker at the venue has a Sichuan accent, an interpreter who understands the Sichuan dialect must be provided as well, and if the interpreters are supplied by the foreign party, it is difficult to find anyone who can interpret simultaneously between the Sichuan dialect and Spanish.

Summary of the invention

The main technical problem solved by the invention is to provide a simultaneous interpretation box that uses a convolutional neural network algorithm to translate Spanish and Sichuan-accented speech. It can replace highly paid senior simultaneous interpreters, gives the user translation that is free of the fatigue errors caused by long sessions, and can recognize the user's Sichuan accent, avoiding the awkward situation in which the user does not speak standard Mandarin and the interpreter does not understand the Sichuan accent.

To solve the above technical problem, the invention adopts the following technical solution: a simultaneous interpretation box using a convolutional neural network algorithm to translate Spanish and Sichuan-accented speech, characterized in that it consists of: component one, the main body of the simultaneous interpretation box and its accessories; component two, a large database in cloud storage; and component three, a convolutional neural network model hosted in a cloud computing center.

A convolutional neural network is a multilayer perceptron. Each layer consists of two-dimensional planes, and each plane consists of multiple independent neurons. The network contains simple units and complex units, denoted C units and S units respectively; C units are grouped together to form convolutional layers, and S units are grouped together to form down-sampling layers. The input Sichuan-accent audio or Spanish audio is convolved with a filter and an additive bias, producing N feature maps at the C layer (N can be set manually). Each feature map is then summed, weighted and biased, and passed through an activation function (usually the Sigmoid function) to obtain the S-layer feature map. This process is repeated in sequence according to the manually set number of C layers and S layers. Finally, the last down-sampling layer is fully connected to the output layer, which yields the final translation output.
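The C-layer, S-layer and fully connected steps described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used by the invention: the function names, array shapes, the single shared weight and bias per S layer, and the use of a square pooling neighborhood are all hypothetical choices, and the "convolution" is written as the sliding-window product commonly used in CNN code.

```python
import numpy as np


def sigmoid(x):
    """Sigmoid activation, the function the description says is usually chosen."""
    return 1.0 / (1.0 + np.exp(-x))


def conv_layer(image, filters, biases):
    """C layer: slide each of the N filters over one 2-D input and add its bias.

    image:   (H, W) array, e.g. a rendered waveform image (assumed input form)
    filters: (N, k, k) array of trainable filters (N is set manually)
    biases:  (N,) array of additive biases
    """
    n, k, _ = filters.shape
    h, w = image.shape
    out = np.empty((n, h - k + 1, w - k + 1))
    for f in range(n):
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                out[f, i, j] = np.sum(image[i:i + k, j:j + k] * filters[f]) + biases[f]
    return out


def subsample_layer(maps, size, weight, bias):
    """S layer: sum each size x size neighborhood, weight it, add a bias, apply sigmoid."""
    n, h, w = maps.shape
    h2, w2 = h // size, w // size
    # Crop to a multiple of `size`, then split each map into size x size blocks.
    blocks = maps[:, :h2 * size, :w2 * size].reshape(n, h2, size, w2, size)
    pooled = blocks.sum(axis=(2, 4))
    return sigmoid(weight * pooled + bias)


def fully_connected(maps, weights, bias):
    """Output layer: full connection from the last S layer to the output units."""
    return weights @ maps.ravel() + bias
```

Calling `conv_layer`, then `subsample_layer`, and repeating that pair as many times as there are C and S layers before `fully_connected` mirrors the order given in the text; with N input maps, `subsample_layer` returns N smaller maps.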

The main body of the simultaneous interpretation box of the invention is a box structure. Unlike a traditional simultaneous interpretation box, the invention uses machine translation based on a convolutional neural network algorithm, so no person needs to sit in the box at the conference site, and its volume is therefore much smaller than that of a traditional box. The main body is provided with a main junction box to which several audio transmission lines connect, a network switch with a fiber-optic connection, a switch controlling each audio channel, and a master switch. The accessories of the simultaneous interpretation box include: the audio input and output lines connected to each seat at the venue, the seat microphones, the seat earphones, the translation switch at each seat, and the optical fiber.

Claims

1. A simultaneous interpretation box using a convolutional neural network algorithm to translate Spanish and Sichuan-accented speech, characterized in that it consists of: component one, the main body of the simultaneous interpretation box and its accessories; component two, a large database in cloud storage; and component three, a convolutional neural network model hosted in a cloud computing center.

Component one according to claim 1, characterized in that: the main body of the simultaneous interpretation box is a box structure; unlike a traditional simultaneous interpretation box, the invention uses machine translation based on a convolutional neural network algorithm, so no person needs to sit in the box at the conference site, and its volume is therefore much smaller than that of a traditional box; the main body is provided with a main junction box to which several audio transmission lines connect, a network switch with a fiber-optic connection, a switch controlling each audio channel, and a master switch; the accessories of the simultaneous interpretation box include: the audio input and output lines connected to each seat at the venue, the seat microphones, the seat earphones, the translation switch at each seat, and the optical fiber.

2. Component two according to claim 1, characterized in that: the configured cloud storage space must not be less than 100 TB; within the cloud storage space, the big data is classified, organized and stored in separate libraries; the large database in the cloud storage space must include at least the following libraries: 1) a Spanish speech database (recorded by no fewer than 100 speakers, male and female speakers each recording the same Spanish sentences; the total number of Spanish sentences with distinct content is not less than 100,000); 2) a Sichuan-accent speech database (recorded by no fewer than 100 speakers, male and female speakers each recording the same Sichuan-accent sentences; the total number of Sichuan-accent sentences with distinct content is not less than 100,000); 3) a database of Spanish grammar and pronunciation rules; 4) a database of Chinese grammar and Sichuan-accent pronunciation rules.
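The numeric minimums in this claim can be expressed as a simple check. The sketch below is illustrative only: the `SpeechCorpus` type, the `check_requirements` function and the constant names are hypothetical, and only the thresholds (100 TB, 100 speakers, 100,000 sentences) come from the claim text.

```python
from dataclasses import dataclass

# Thresholds taken from the claim; the names are illustrative.
MIN_STORAGE_TB = 100
MIN_SPEAKERS = 100
MIN_SENTENCES = 100_000


@dataclass
class SpeechCorpus:
    """One speech library, e.g. the Spanish or the Sichuan-accent database."""
    name: str
    speakers: int   # number of people who recorded the sentences
    sentences: int  # number of sentences with distinct content


def check_requirements(storage_tb, corpora):
    """Return a list of human-readable violations; an empty list means all minimums are met."""
    problems = []
    if storage_tb < MIN_STORAGE_TB:
        problems.append(f"cloud storage {storage_tb} TB is below {MIN_STORAGE_TB} TB")
    for corpus in corpora:
        if corpus.speakers < MIN_SPEAKERS:
            problems.append(f"{corpus.name}: {corpus.speakers} speakers is below {MIN_SPEAKERS}")
        if corpus.sentences < MIN_SENTENCES:
            problems.append(f"{corpus.name}: {corpus.sentences} sentences is below {MIN_SENTENCES}")
    return problems
```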

3. Component three according to claim 1, characterized in that: before audio big data is input to the convolutional neural network model, a waveform image must be drawn for each audio recording, and the drawn waveform image is then input to component three.
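The claim does not say how the waveform image is drawn. One simple possibility, shown here purely as an assumption, is to rasterize the amplitude trace into a fixed-size binary image with one lit pixel per column. The function name `waveform_to_image` and the default image size are hypothetical.

```python
import numpy as np


def waveform_to_image(samples, height=64, width=256):
    """Rasterize a 1-D audio signal into a (height, width) binary waveform image.

    Assumed rendering: pick `width` evenly spaced samples, normalize them to
    [-1, 1], and light the pixel whose row corresponds to each amplitude
    (row 0 is the top of the image, i.e. the largest positive amplitude).
    """
    samples = np.asarray(samples, dtype=float)
    idx = np.linspace(0, len(samples) - 1, width).astype(int)
    cols = samples[idx]
    peak = np.max(np.abs(cols))
    if peak > 0:
        cols = cols / peak
    rows = np.round((1.0 - cols) / 2.0 * (height - 1)).astype(int)
    image = np.zeros((height, width))
    image[rows, np.arange(width)] = 1.0
    return image
```

The resulting 2-D array has the form a convolutional layer expects as input; a spectrogram would be another common choice, but the source specifies only a waveform image.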

4. Component three according to claim 1, characterized in that: the convolutional neural network is a multilayer perceptron; each layer consists of two-dimensional planes, and each plane consists of multiple independent neurons; the network contains simple units and complex units, denoted C units and S units respectively; C units are grouped together to form convolutional layers, and S units are grouped together to form down-sampling layers; the input Sichuan-accent audio or Spanish audio is convolved with a filter and an additive bias, producing N feature maps at the C layer (N can be set manually); each feature map is then summed, weighted and biased, and passed through an activation function (usually the Sigmoid function) to obtain the S-layer feature map.

5. The above process is repeated in sequence according to the manually set number of C layers and S layers; finally, the last down-sampling layer is fully connected to the output layer, which yields the final translation output.

6. Component three according to claim 1, characterized by the convolution process: a trainable filter fx is convolved with an input audio waveform image (at layer C1 the input is the input image; for later convolutional layers the input is the convolutional feature map of the preceding layer), the result is passed through an activation function (generally the Sigmoid function), and a bias bx is then added, yielding convolutional layer Cx.

7. The specific operation is given by a formula (not reproduced in this text) in which Mj denotes the input feature maps.
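The formula itself is missing from this text. A commonly used textbook form of the convolutional-layer operation that is consistent with the description is given below as an assumption, not as the patent's own equation. Only M_j comes from the source; the symbols x_j^l (the j-th feature map of layer l), k_ij^l (the trainable filter, called fx above), b_j^l (the bias, called bx above) and f (the Sigmoid activation) are introduced for illustration. This form adds the bias before applying the activation, which is the usual convention.

```latex
x_j^{\ell} = f\Bigl(\sum_{i \in M_j} x_i^{\ell-1} * k_{ij}^{\ell} + b_j^{\ell}\Bigr),
\qquad f(z) = \frac{1}{1 + e^{-z}}
```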

The sub-sampling process is as follows: each neighborhood of m pixels (m is set manually) is summed into a single pixel, weighted by the scalar Wx+1, and increased by the bias bx+1; the feature map is then generated by the Sigmoid activation function. The mapping from one plane to the next can be regarded as a convolution operation, and the S layer can be regarded as a blurring filter that performs secondary feature extraction. The spatial resolution decreases from one hidden layer to the next while the number of planes in each layer increases, which allows more feature information to be detected. A sub-sampling layer with N input feature maps has N output feature maps; only the size of each feature map changes. The specific operation is given by a formula (not reproduced in this text) in which down() denotes the down-sampling function.
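The sub-sampling formula is also missing from this text. A standard form that matches the description is sketched below as an assumption. Here down(·) is the down-sampling function named in the source, which sums each neighborhood of m pixels; beta_j^l stands for the scalar weight (Wx+1 above), b_j^l for the bias (bx+1 above), x_j^l for the j-th feature map of layer l, and f for the Sigmoid activation. All symbols other than down(·) are introduced for illustration.

```latex
x_j^{\ell} = f\bigl(\beta_j^{\ell}\,\mathrm{down}(x_j^{\ell-1}) + b_j^{\ell}\bigr),
\qquad f(z) = \frac{1}{1 + e^{-z}}
```

Because each output map x_j^l depends only on the input map with the same index j, N input maps yield N output maps, as the text states.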
