CN110516266A - Video caption automatic translating method, device, storage medium and computer equipment - Google Patents
Video caption automatic translating method, device, storage medium and computer equipment
- Publication number
- CN110516266A (application CN201910894066.2A)
- Authority
- CN
- China
- Prior art keywords
- subtitle file
- video
- file
- original
- caption
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
Abstract
The invention belongs to the technical field of data processing, and discloses a video caption automatic translation method, device, storage medium and computer equipment. The device of the present invention includes a data obtaining module, a preprocessing module, a translation module and a storage module. By translating and outputting the subtitles of a video after obtaining a caption translation request from the viewer, the present invention realizes the function of automatic video caption translation, avoids the high cost and limited language coverage of human translation, saves the production time of video caption translation, allows languages to be switched freely according to the actual demand of the viewer, improves the viewing experience, and has good practical value.
Description
Technical field
The invention belongs to the technical field of data processing, and in particular relates to a video caption automatic translation method, device, storage medium and computer equipment.
Background technique
As large numbers of videos are produced, published and propagated through digital media, and devices for watching video such as computers and home theaters become increasingly widespread, people can quickly come into contact with videos from different countries, regions and languages on a large scale, and the propagation of video has become more convenient and rapid. The main problem that follows is that, for videos in a foreign language, most viewers have difficulty understanding them even with the help of audio and subtitles. Taking films and television dramas, the most widely propagated videos, as an example: a viewer follows the plot mainly by understanding the characters' dialogue and monologue, so a viewer with a language barrier will largely fail to understand the work. For the video producer, the difficulty of language understanding therefore means losing a potentially large audience; for the viewer, it means losing the chance to appreciate a large number of outstanding video works.
In the existing video production process, the translation and production of subtitles in different languages mainly relies on manual work. Although this can solve the viewing obstacle caused by the language of the subtitles, human-translated subtitles are usually provided in only a few commonly used languages because of labor cost and similar constraints, and it is difficult to meet the demand of different viewers for subtitles in the language that suits them, which affects the viewing experience.
Summary of the invention
In order to solve the above problems in the prior art, the object of the present invention is to provide a video caption automatic translation method, device, storage medium and computer equipment. The present invention saves the production time of video caption translation and allows languages to be switched freely according to the actual demand of the viewer, and thus has good practical value.
The technical solution adopted by the invention is as follows:
A video caption automatic translation method, comprising the following steps:
detecting in real time whether a caption translation request exists; if so, obtaining the caption translation request and determining sensing words according to it, wherein the sensing words include a source language, a target language and a video position;
extracting the video file corresponding to the current video position, and then obtaining the original subtitle file corresponding to the current video file;
judging whether the language of the current original subtitle file is the source language; if so, splitting the current original subtitle file to generate multiple phrases; if not, outputting a translation-request error message;
extracting the language model of the target language, and inputting the split phrases into the language model of the target language in sequence to obtain a target subtitle file;
matching the current target subtitle file with the original subtitle file, outputting the current target subtitle file to the current video file, and outputting translation-completed information.
Preferably, the caption translation request is a voice request and/or a text request. When the caption translation request is a voice request, the specific steps of determining the sensing words are as follows:
performing a preprocessing operation on the current voice request, wherein the preprocessing operation includes noise elimination and grammatical segmentation;
converting the preprocessed voice request to obtain a text request, and then determining the sensing words by recognizing the text request.
Preferably, the specific steps of obtaining the original subtitle file corresponding to the current video file are as follows:
judging whether the original subtitle file corresponding to the current video file is prestored;
if so, continuing to judge the language of the current original subtitle file;
if not, obtaining the video frames of the current video file, recognizing the subtitles of all the video frames in turn to form original documents, and arranging all the original documents according to the order of the corresponding video frames to form the original subtitle file.
Preferably, when matching the current target subtitle file with the original subtitle file:
A1. traversing the current target subtitle file and judging whether sensitive words exist in it; if so, performing a replacement operation and/or a deletion operation on the sensitive words, and then inputting the current target subtitle file into the syntactic model of the target language for a smoothness check; if not, directly inputting the current target subtitle file into the syntactic model of the target language for a smoothness check;
A2. judging whether the smoothness rate of the check is higher than a threshold; if so, outputting the final target subtitle file, matching the final target subtitle file with the original subtitle file, and finally outputting the current final target subtitle file to the current video file; if not, marking the phrases whose smoothness rate is lower than the threshold, re-entering the marked target subtitle file into the language model of the target language to obtain a check target subtitle file, and then repeating step A1 on the check target subtitle file.
Preferably, when matching the final target subtitle file with the original subtitle file, the timestamps of the final target subtitle file are matched with the timestamps of the original subtitle file.
Preferably, the original subtitle file includes original caption text and scene information corresponding to each piece of original caption text. When the current original subtitle file is split, the original caption text is divided into multiple phrases, each with a corresponding piece of scene information. After the language model of the target language is extracted, the split phrases and the corresponding scene information are input into the language model of the target language in sequence to obtain the target subtitle file.
A device based on the above video caption automatic translation method, including a data obtaining module, a preprocessing module, a translation module and a storage module.
The data obtaining module is used for obtaining the caption translation request, the video file and the original subtitle file corresponding to the video file, and is also used for determining the sensing words according to the caption translation request, wherein the sensing words include the source language, the target language and the video position.
The preprocessing module is used for splitting the original subtitle file, and is also used for performing language judgement on the split original subtitle file.
The storage module is used for storing the video files, the original subtitle files and the language models of multiple languages.
The translation module is used for calling the language model corresponding to the target language in the sensing words and then outputting the target subtitle file.
Preferably, the above video caption automatic translation device further includes a check module, and the storage module is also used for storing the syntactic models of multiple languages.
The check module is used for calling the syntactic model corresponding to the target language in the sensing words, performing a smoothness check on the target subtitle file, and outputting the smoothness rate result of the check.
A computer-readable storage medium, characterized in that it stores a computer program; when the computer program is executed by a processor, the above video caption automatic translation method is realized.
Computer equipment, including one or more processors, a memory, and one or more computer programs; the one or more computer programs are stored in the memory and configured to be executed by the one or more processors; the one or more computer programs are configured to execute the above video caption automatic translation method.
The invention has the following beneficial effects:
By translating and outputting the subtitles of a video after obtaining the caption translation request of the viewer, the function of automatic video caption translation is realized, the high cost and limited language coverage of human translation are avoided, the production time of video caption translation is saved, languages can be switched freely according to the actual demand of the viewer, the viewing experience is improved, and good practical value is achieved.
Other beneficial effects of the invention will be described in detail in the specific embodiments.
Detailed description of the invention
In order to explain the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flow diagram of the video caption automatic translation method in Embodiment 1.
Specific embodiment
The present invention is further elaborated below with reference to the drawings and specific embodiments. It should be noted that the explanation of these exemplary embodiments is intended to help understand the present invention, but does not constitute a limitation of it. The functional details disclosed herein are only used for describing example embodiments of the present invention; the invention can be embodied in many alternative forms and should not be construed as limited to the embodiments illustrated here.
It should be understood that the terminology used in the present invention is only for describing specific embodiments and is not intended to limit the example embodiments. When the terms "includes", "including", "comprises" and/or "comprising" are used herein, they specify the existence of the stated features, integers, steps, operations, units and/or components, and do not exclude the existence or addition of one or more other features, integers, steps, operations, units, components and/or combinations thereof.
It should also be noted that in some alternative embodiments the functions or actions may occur in an order different from that shown in the drawings. For example, depending on the functions or actions involved, two figures shown in succession may actually be executed substantially concurrently or sometimes in the reverse order.
It should be understood that specific details are provided in the following description for a complete understanding of the example embodiments. However, those of ordinary skill in the art will understand that the example embodiments can be implemented without these specific details. For example, a system may be shown in a block diagram to avoid obscuring the example with unnecessary detail; in other instances, well-known processes, structures and techniques may be shown without unnecessary detail so as not to obscure the example embodiments.
Embodiment 1:
As shown in Fig. 1, the present embodiment provides a video caption automatic translation method comprising the following steps:
detecting in real time whether a caption translation request exists; if so, obtaining the caption translation request and determining sensing words according to it, wherein the sensing words include the source language, the target language and the video position;
extracting the video file corresponding to the current video position, and then obtaining the original subtitle file corresponding to the current video file;
judging whether the language of the current original subtitle file is the source language; if so, splitting the current original subtitle file to generate multiple phrases; if not, outputting a translation-request error message;
extracting the language model of the target language, and inputting the split phrases into the language model of the target language in sequence to obtain a target subtitle file;
matching the current target subtitle file with the original subtitle file, outputting the current target subtitle file to the current video file, and outputting translation-completed information.
In the present embodiment, by translating and outputting the subtitles of the video after obtaining the caption translation request of the viewer, the function of automatic video caption translation is realized, the high cost and limited language coverage of human translation are avoided, the production time of video caption translation is saved, languages can be switched freely according to the actual demand of the viewer, the viewing experience is improved, and good practical value is achieved.
For example, a user can issue a caption translation request orally or through a human-machine interface. The caption translation request may contain sensing words designating a translation between two languages, such as English-to-Chinese, Chinese-to-English, English-to-German, English-to-Italian, English-to-Korean, English-to-Russian, Chinese-to-Russian, Chinese-to-Korean, Chinese-to-Japanese and other two-language translation directives, so that users with different language habits can independently switch the language they need while watching a video. The translated subtitles can then appear in the video picture together with the original subtitles, or replace the original subtitles in the video picture, thereby promoting information and cultural exchange between languages.
As one preferred embodiment, the language model of each language is pre-trained based on a CNN (convolutional neural network), an RNN (recurrent neural network) or the like, so that data processing such as caption translation is more efficient, thereby realizing automatic real-time translation of video captions.
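The overall flow of Embodiment 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the in-memory subtitle store and the dictionary-lookup "language model" are all stand-ins for the pre-trained models the embodiment describes.

```python
from dataclasses import dataclass

@dataclass
class SensingWords:
    # Sensing words extracted from a caption translation request.
    source_lang: str
    target_lang: str
    video_position: int

# Stub "language models" keyed by target language; a real system would
# load a pre-trained CNN/RNN model here instead of a dictionary.
LANGUAGE_MODELS = {
    "en": lambda phrase: {"你好": "hello", "世界": "world"}.get(phrase, phrase),
}

# video position -> (language of original subtitle file, split phrases)
SUBTITLE_STORE = {0: ("zh", ["你好", "世界"])}

def translate_captions(sensing: SensingWords):
    lang, phrases = SUBTITLE_STORE[sensing.video_position]
    # Judge whether the original subtitle language matches the source language.
    if lang != sensing.source_lang:
        return None, "translation request error"
    model = LANGUAGE_MODELS[sensing.target_lang]
    # Feed the split phrases into the target-language model in sequence.
    target = [model(p) for p in phrases]
    return target, "translation completed"
```

Under these assumptions, a matching request yields the target subtitle file and a completion message, while a source-language mismatch yields the claimed error message.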
Embodiment 2
The present embodiment is a further improvement made on the basis of Embodiment 1, and differs from Embodiment 1 in the following:
In the present embodiment, the caption translation request is a voice request and/or a text request. When the caption translation request is a voice request, the specific steps of determining the sensing words are as follows:
performing a preprocessing operation on the current voice request, wherein the preprocessing operation includes noise elimination and grammatical segmentation; the grammatical segmentation includes punctuation prejudgement, screening out of spoken repetitions, and word segmentation;
converting the preprocessed voice request to obtain a text request, and then determining the sensing words by recognizing the text request.
As one preferred embodiment, when the voice request is in Chinese, the Python third-party library jieba is used to segment the words of the text request converted from the current voice request; its recognition speed is fast, so that the recognition accuracy of the voice request is significantly improved.
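Determining sensing words from a recognized text request could look like the sketch below. The directive table and the numeric-token rule for the video position are assumptions for illustration only; in the preferred embodiment a Chinese request would first be word-segmented with jieba (e.g. jieba.lcut), which is omitted here to keep the sketch dependency-free.

```python
import re

# Hypothetical directive table mapping Chinese translation directives
# to (source, target) language codes; not an exhaustive list.
DIRECTIVES = {
    "英译汉": ("en", "zh"),  # English-to-Chinese
    "汉译英": ("zh", "en"),  # Chinese-to-English
    "英译俄": ("en", "ru"),  # English-to-Russian
}

def parse_text_request(text: str):
    """Return (source_lang, target_lang, video_position) sensing words, or None."""
    for phrase, (src, tgt) in DIRECTIVES.items():
        if phrase in text:
            # Assume the video position is the first numeric token, if any.
            m = re.search(r"\d+", text)
            position = int(m.group()) if m else 0
            return src, tgt, position
    return None  # no translation directive recognized
```

A request such as "请在120秒处英译汉" ("English-to-Chinese at 120 seconds") would then yield the sensing words ("en", "zh", 120).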
Embodiment 3
The present embodiment is a further improvement made on the basis of Embodiment 1 or Embodiment 2, and differs from them in the following:
In the present embodiment, the specific steps of obtaining the original subtitle file corresponding to the current video file are as follows:
judging whether the original subtitle file corresponding to the current video file is prestored;
if so, continuing to judge the language of the current original subtitle file;
if not, obtaining the video frames of the current video file, recognizing the subtitles of all the video frames in turn to form original documents, and arranging all the original documents according to the order of the corresponding video frames to form the original subtitle file.
As one preferred embodiment, an OCR recognition algorithm is used when recognizing the subtitles of the video frames, which conveniently and rapidly converts image information into text information.
As another preferred embodiment, the lower part of the picture in each video frame is preset as the caption area, and only the caption area of each video frame is recognized during subtitle recognition. This avoids subtitle errors caused by also recognizing text such as billboards appearing elsewhere in the video picture.
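Assembling an original subtitle file from per-frame OCR results might be sketched as below. The OCR engine itself is out of scope; the lower-fifth crop ratio and the rule of merging consecutive identical frame texts are assumptions added for illustration, not values given by the patent.

```python
def caption_region(frame_height: int, frame_width: int):
    """Return the preset caption area (y0, x0, y1, x1): the lower strip
    of the frame, the only region scanned for subtitle text."""
    top = int(frame_height * 0.8)  # assumed ratio; not from the patent
    return (top, 0, frame_height, frame_width)

def assemble_subtitle_file(frame_texts):
    """frame_texts: list of (frame_index, recognized_text), any order.
    Arrange by frame order and merge runs of identical recognized text,
    since one subtitle persists across many consecutive frames."""
    lines = []
    last = None
    for _, text in sorted(frame_texts):  # order by video frame
        if text and text != last:        # skip blanks and repeats
            lines.append(text)
        last = text
    return lines
```

For example, frames 0 and 1 both showing "A" followed by frame 2 showing "B" collapse into the two original documents "A" and "B", in frame order.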
Embodiment 4
The present embodiment is a further improvement made on the basis of any one of Embodiments 1-3, and differs from them in the following:
In the present embodiment, when matching the current target subtitle file with the original subtitle file:
A1. traversing the current target subtitle file and judging whether sensitive words exist in it; if so, performing a replacement operation and/or a deletion operation on the sensitive words, and then inputting the current target subtitle file into the syntactic model of the target language for a smoothness check; if not, directly inputting the current target subtitle file into the syntactic model of the target language for a smoothness check;
A2. judging whether the smoothness rate of the check is higher than a threshold; if so, outputting the final target subtitle file, matching the final target subtitle file with the original subtitle file, and finally outputting the current final target subtitle file to the current video file; if not, marking the phrases whose smoothness rate is lower than the threshold, re-entering the marked target subtitle file into the language model of the target language to obtain a check target subtitle file, and then repeating step A1 on the check target subtitle file.
The sensitive-word detection and the smoothness check make the translated subtitles more accurate and more compliant, and avoid harmful effects on the video itself.
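Steps A1 and A2 can be sketched as the loop below. The sensitive-word table, the length-based smoothness score, the 0.5 threshold and the round limit are all stand-ins: a real system would query the target language's syntactic model for the smoothness rate.

```python
SENSITIVE = {"badword": "***"}  # hypothetical sensitive-word table

def filter_sensitive(phrases):
    # A1: replacement operation on sensitive words (deletion would drop them).
    return [SENSITIVE.get(p, p) for p in phrases]

def smoothness(phrase) -> float:
    # Stand-in score: pretend longer phrases read more smoothly.
    return min(1.0, len(phrase) / 5)

def verify(phrases, retranslate, threshold=0.5, max_rounds=3):
    phrases = filter_sensitive(phrases)
    for _ in range(max_rounds):  # repeat A1/A2 on the check file
        low = [i for i, p in enumerate(phrases) if smoothness(p) < threshold]
        if not low:              # A2: every phrase clears the threshold
            return phrases
        for i in low:            # mark and re-enter low-scoring phrases
            phrases[i] = retranslate(phrases[i])
    return phrases
```

With a re-translation callback that lengthens a phrase, a short low-scoring phrase is revised while phrases already above the threshold pass through unchanged.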
Embodiment 5
The present embodiment is a further improvement made on the basis of Embodiment 4, and differs from Embodiment 4 in the following:
In the present embodiment, when matching the final target subtitle file with the original subtitle file, the timestamps of the final target subtitle file are matched with the timestamps of the original subtitle file, thereby avoiding the viewing discomfort caused by misplaced subtitles.
As one preferred embodiment, the timestamp of the original subtitle file is the play time of the subtitle in the video; during caption translation, each subtitle is bound to a unique corresponding play time.
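The timestamp binding could be sketched as below. The (start, end, text) triple format is an SRT-like assumption, not a format mandated by the patent; the point is only that each translated line inherits the play time of its original counterpart one-to-one.

```python
def bind_timestamps(original, translated):
    """original: list of (start_sec, end_sec, text) cues from the
    original subtitle file; translated: list of translated texts.
    Each translated line is bound to the play time of its original."""
    if len(original) != len(translated):
        raise ValueError("subtitle counts must match for 1:1 binding")
    return [(start, end, text)
            for (start, end, _), text in zip(original, translated)]
```

This guarantees the translated subtitles cannot drift relative to the video, since their timing is copied rather than recomputed.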
Embodiment 6
The present embodiment is a further improvement made on the basis of any one of Embodiments 1-5, and differs from them in the following:
In the present embodiment, the original subtitle file includes original caption text and scene information corresponding to each piece of original caption text. When the current original subtitle file is split, the original caption text is divided into multiple phrases, each with a corresponding piece of scene information. After the language model of the target language is extracted, the split phrases and the corresponding scene information are input into the language model of the target language in sequence to obtain the target subtitle file.
In the present embodiment, the preset scene information makes the caption translation more accurate. At the same time, scene information can be added during the training of the language model, so that during caption translation the language model can translate according to the scene of the current subtitle.
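The effect of scene information on translation can be illustrated with the toy model below. The dictionary keyed on (phrase, scene) pairs is a stand-in for a language model trained with scene information; the example word and scene labels are assumptions chosen to show why the same phrase can need different translations.

```python
# Hypothetical scene-aware "model": the Chinese word 打 translates
# differently depending on the scene it appears in.
SCENE_MODEL = {
    ("打", "fight"): "hit",
    ("打", "phone"): "dial",
}

def translate_with_scene(phrases_with_scenes):
    """phrases_with_scenes: list of (phrase, scene_info) pairs, fed to
    the model in sequence as the embodiment describes."""
    return [SCENE_MODEL.get((p, s), p) for p, s in phrases_with_scenes]
```

Without the scene label the two occurrences of the phrase would be indistinguishable, which is exactly the ambiguity the preset scene information is meant to resolve.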
Embodiment 7
The present embodiment is made on the basis of any one of Embodiments 1-6, and provides a device based on the video caption automatic translation method of any one of Embodiments 1-6, including a data obtaining module, a preprocessing module, a translation module and a storage module.
The data obtaining module is used for obtaining the caption translation request, the video file and the original subtitle file corresponding to the video file, and is also used for determining the sensing words according to the caption translation request, wherein the sensing words include the source language, the target language and the video position.
The preprocessing module is used for splitting the original subtitle file, and is also used for performing language judgement on the split original subtitle file.
The storage module is used for storing the video files, the original subtitle files and the language models of multiple languages.
The translation module is used for calling the language model corresponding to the target language in the sensing words and then outputting the target subtitle file.
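A class skeleton mirroring the claimed modules might look as follows. The patent only names the modules and their responsibilities; the method names, the whitespace-based phrase splitting and the stub language judgement here are illustrative assumptions.

```python
class StorageModule:
    """Stores subtitle files and the language models of multiple languages."""
    def __init__(self):
        self.subtitles = {}  # video file -> original subtitle file
        self.models = {}     # language code -> language model

class PreprocessingModule:
    """Splits the original subtitle file and judges its language."""
    def split_and_check(self, subtitle_file, source_lang):
        phrases = subtitle_file.split()  # naive split into phrases
        # Stub language judgement: assume stored files are Chinese.
        return phrases if source_lang == "zh" else None

class TranslationModule:
    """Calls the model for the target language in the sensing words."""
    def __init__(self, storage):
        self.storage = storage
    def translate(self, phrases, target_lang):
        model = self.storage.models[target_lang]
        return [model(p) for p in phrases]

# Minimal wiring with a dictionary-lookup stand-in for a trained model.
storage = StorageModule()
storage.models["en"] = lambda p: {"你好": "hello"}.get(p, p)
device = TranslationModule(storage)
```

The data obtaining module is omitted here since its request parsing was sketched under Embodiment 2; in a full device it would feed the sensing words into this wiring.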
Embodiment 8
The present embodiment is a further improvement made on the basis of Embodiment 7, and differs from Embodiment 7 in the following:
In the present embodiment, the above video caption automatic translation device further includes a check module, and the storage module is also used for storing the syntactic models of multiple languages.
The check module is used for calling the syntactic model corresponding to the target language in the sensing words, performing a smoothness check on the target subtitle file, and outputting the smoothness rate result of the check.
Embodiment 9
The present embodiment is made on the basis of any one of Embodiments 1-6, and provides a computer-readable storage medium, characterized in that it stores a computer program; when the computer program is executed by a processor, the above video caption automatic translation method is realized.
The computer-readable storage medium described in the present embodiment includes but is not limited to any kind of disk (including floppy disks, hard disks, optical disks, CD-ROMs and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards or optical cards. That is, the storage device includes any medium that can store or transmit information in a form readable by a device (for example, a computer or a mobile phone), such as read-only memory, a disk or an optical disk; the memory disclosed in the present embodiment is given only as an example and not as a restriction.
The computer-readable storage medium provided in this embodiment of the present invention can implement the video caption automatic translation method of any one of Embodiments 1-7; for the realization of the concrete functions, refer to the explanations in any one of Embodiments 1-7, which are not repeated here.
Embodiment 10
The present embodiment is made on the basis of any one of Embodiments 1-6, and provides computer equipment including one or more processors, a memory, and one or more computer programs; the one or more computer programs are stored in the memory and configured to be executed by the one or more processors; the one or more computer programs are configured to execute the above video caption automatic translation method.
The computer equipment described in the present embodiment can be a server, a personal computer, a network device or similar equipment. Those skilled in the art will understand that the device structure described in the present embodiment does not constitute a limitation on all such devices, which may include more or fewer components than illustrated, or combine certain components. The memory can be used for storing the computer program of the video caption automatic translation method and its functional modules, and the processor runs the computer program stored in the memory to execute the various functional applications and data processing of the device.
The embodiments described above are only schematic. Units illustrated as separate components may or may not be physically separated, and components shown as units may or may not be physical units: they can be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment, and those of ordinary skill in the art can understand and implement them without creative effort.
The above embodiments are merely illustrative of the technical solutions of the present invention and do not limit them. Although the invention has been explained in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced, and such modifications or replacements do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present invention.
The present invention is not limited to the above optional embodiments; anyone may derive products of various other forms under the inspiration of the present invention. The above specific embodiments should not be understood as limiting the protection scope of the present invention, which shall be defined by the claims, with the specification usable for interpreting the claims.
Claims (10)
1. A video caption automatic translation method, characterized by comprising the following steps:
detecting in real time whether a caption translation request exists; if so, obtaining the caption translation request and determining sensing words according to it, wherein the sensing words include a source language, a target language and a video position;
extracting the video file corresponding to the current video position, and then obtaining the original subtitle file corresponding to the current video file;
judging whether the language of the current original subtitle file is the source language; if so, splitting the current original subtitle file to generate multiple phrases; if not, outputting a translation-request error message;
extracting the language model of the target language, and inputting the split phrases into the language model of the target language in sequence to obtain a target subtitle file;
matching the current target subtitle file with the original subtitle file, outputting the current target subtitle file to the current video file, and outputting translation-completed information.
2. The video caption automatic translation method according to claim 1, characterized in that: the caption translation request is a voice request and/or a text request; when the caption translation request is a voice request, the specific steps of determining the sensing words are as follows:
performing a preprocessing operation on the current voice request, wherein the preprocessing operation includes noise elimination and grammatical segmentation;
converting the preprocessed voice request to obtain a text request, and then determining the sensing words by recognizing the text request.
3. The video caption automatic translation method according to claim 2, characterized in that obtaining the original subtitle file corresponding to the current video file comprises the following steps:
judging whether an original subtitle file corresponding to the current video file is pre-stored;
if so, proceeding to judge the language of the current original subtitle file;
if not, obtaining the video frames of the current video file, recognizing the subtitles of all video frames in sequence to form initial files, and arranging all initial files in the order of their corresponding video frames to form the original subtitle file.
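The cache-or-recognize logic of claim 3 can be sketched as follows. The `ocr` function is a stand-in (a real system might call an OCR engine such as Tesseract), and the frame structure is an illustrative assumption.

```python
# Sketch of claim 3: reuse a pre-stored original subtitle file when one
# exists; otherwise recognize the subtitle of every video frame and arrange
# the results in frame order.

subtitle_cache = {}

def ocr(frame):
    # Hypothetical OCR stand-in: frames are dicts carrying their own text.
    return frame.get("caption", "")

def get_original_subtitles(video_id, frames):
    if video_id in subtitle_cache:              # pre-stored subtitle file
        return subtitle_cache[video_id]
    # Recognize frame by frame, then arrange by frame order (claim 3).
    ordered = sorted(frames, key=lambda f: f["index"])
    subtitle_file = [ocr(f) for f in ordered if ocr(f)]
    subtitle_cache[video_id] = subtitle_file
    return subtitle_file

frames = [{"index": 1, "caption": "world"}, {"index": 0, "caption": "hello"}]
print(get_original_subtitles("v1", frames))  # ['hello', 'world']
```

The second lookup for the same `video_id` is served from the cache, mirroring the "pre-stored" branch of the claim.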
4. The video caption automatic translation method according to claim 3, characterized in that pairing the current target subtitle file with the original subtitle file comprises the following steps:
A1. traversing the current target subtitle file and judging whether sensitive words exist in it; if so, performing a replacement operation and/or a deletion operation on the sensitive words and then inputting the current target subtitle file into the syntax model of the target language for a fluency check; if not, inputting the current target subtitle file directly into the syntax model of the target language for the fluency check;
A2. judging whether the fluency rate of the fluency check is above a threshold; if so, outputting the final target subtitle file, pairing the final target subtitle file with the original subtitle file, and finally outputting the current final target subtitle file to the current video file; if not, marking the phrases whose fluency rate is below the threshold, re-inputting the marked target subtitle file into the language model of the target language to obtain a check target subtitle file, and then repeating step A1 on the check target subtitle file.
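The A1/A2 loop of claim 4 can be sketched as below. The fluency score, the 0.5 threshold, the sensitive-word list, and the bounded retry count are all illustrative assumptions; the patent does not specify them.

```python
# Sketch of claim 4's check loop: remove sensitive words, score each phrase's
# fluency with a stand-in "syntax model", and re-translate phrases below the
# threshold. The scoring function and the threshold are assumptions.

SENSITIVE = {"badword"}
THRESHOLD = 0.5

def fluency(phrase, syntax_model):
    # Stand-in score: fraction of the phrase's words the syntax model knows.
    words = phrase.split()
    return sum(w in syntax_model for w in words) / max(len(words), 1)

def check_subtitles(phrases, syntax_model, retranslate, max_rounds=3):
    phrases = [p for p in phrases if p not in SENSITIVE]   # deletion operation
    for _ in range(max_rounds):                            # A1/A2 loop
        low = {p for p in phrases if fluency(p, syntax_model) < THRESHOLD}
        if not low:
            return phrases                                 # final target file
        # Mark low-fluency phrases and re-run them through the model.
        phrases = [retranslate(p) if p in low else p for p in phrases]
    return phrases

syntax_model = {"good", "day", "hello"}
fixups = {"zz qq": "good day"}
result = check_subtitles(["hello", "badword", "zz qq"],
                         syntax_model, lambda p: fixups.get(p, p))
print(result)  # ['hello', 'good day']
```

The `max_rounds` cap is an added safeguard so a phrase the model can never fix does not loop forever; the claim itself describes an unbounded repeat of step A1.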
5. The video caption automatic translation method according to claim 4, characterized in that: when pairing the final target subtitle file with the original subtitle file, the timestamps of the final target subtitle file are paired with the timestamps of the original subtitle file.
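The timestamp pairing of claim 5 amounts to copying each original cue's timing onto the translated text so the translation stays in sync with the video. The cue structure (`start`/`end`/`text`) is an illustrative choice, not the patent's file format.

```python
# Sketch of claim 5: pair each translated cue with the original cue's
# timestamps. The cue structure is an illustrative assumption.

def align_timestamps(original_cues, translated_texts):
    """Copy each original cue's start/end times onto the translated text."""
    assert len(original_cues) == len(translated_texts)
    return [{"start": cue["start"], "end": cue["end"], "text": text}
            for cue, text in zip(original_cues, translated_texts)]

orig_cues = [{"start": 0.0, "end": 2.5, "text": "hola"},
             {"start": 2.5, "end": 4.0, "text": "mundo"}]
aligned = align_timestamps(orig_cues, ["hello", "world"])
print(aligned[0])  # {'start': 0.0, 'end': 2.5, 'text': 'hello'}
```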
6. The video caption automatic translation method according to any one of claims 1 to 5, characterized in that: the original subtitle file includes original caption text and scene information corresponding to each piece of original caption text; when segmenting the current original subtitle file, the original caption text is segmented into a plurality of phrases, each phrase corresponding to one piece of scene information; after extracting the language model of the target language, the segmented phrases and their corresponding scene information are sequentially input into the language model of the target language to obtain the target subtitle file.
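Claim 6's pairing of phrases with scene information lets context disambiguate a translation. A scene-keyed lookup table stands in for a real context-conditioned model in this sketch; the table entries and function name are hypothetical.

```python
# Sketch of claim 6: hand (phrase, scene) pairs to the model so scene
# information can steer the translation. The lookup table is a stand-in
# for a real scene-conditioned language model.

def translate_with_scene(pairs, model):
    """model maps (phrase, scene) to a translation; fall back to a
    scene-free entry, then to the phrase itself."""
    return [model.get((phrase, scene), model.get((phrase, None), phrase))
            for phrase, scene in pairs]

model = {("banco", "finance"): "bank", ("banco", "park"): "bench"}
print(translate_with_scene([("banco", "finance"), ("banco", "park")], model))
# ['bank', 'bench']
```

The same source phrase yields different translations depending on the attached scene, which is the point of carrying scene information through segmentation.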
7. A video caption automatic translation device, characterized by comprising a data obtaining module, a preprocessing module, a translation module, and a storage module;
the data obtaining module is configured to obtain the caption translation request, the video file, and the original subtitle file corresponding to the video file, and is further configured to determine pointer words from the caption translation request, wherein the pointer words include a source language, a target language, and a video position;
the preprocessing module is configured to segment the original subtitle file and is further configured to judge the language of the segmented original subtitle file;
the storage module is configured to store the video files, the original subtitle files, and the language models of multiple languages;
the translation module is configured to call the language model corresponding to the target language in the pointer words and then output the target subtitle file.
8. The video caption automatic translation device according to claim 7, characterized by further comprising a check module; the storage module is further configured to store the syntax models of multiple languages; the check module is configured to call the syntax model corresponding to the target language in the pointer words, perform a fluency check on the target subtitle file, and output the fluency-rate result of the check.
9. A computer-readable storage medium, characterized in that a computer program is stored thereon; when the computer program is executed by a processor, the video caption automatic translation method of any one of claims 1 to 6 is implemented.
10. A computer device, characterized by comprising one or more processors, a memory, and one or more computer programs; the one or more computer programs are stored in the memory and configured to be executed by the one or more processors; the one or more computer programs are configured to perform the video caption automatic translation method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910894066.2A CN110516266A (en) | 2019-09-20 | 2019-09-20 | Video caption automatic translating method, device, storage medium and computer equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110516266A true CN110516266A (en) | 2019-11-29 |
Family
ID=68633107
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910894066.2A Pending CN110516266A (en) | 2019-09-20 | 2019-09-20 | Video caption automatic translating method, device, storage medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516266A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101090461A (en) * | 2006-06-13 | 2007-12-19 | 中国科学院计算技术研究所 | Automatic translation method for digital video captions |
CN102104741A (en) * | 2009-12-16 | 2011-06-22 | 新奥特(北京)视频技术有限公司 | Method and device for arranging multi-language captions |
CN104391839A (en) * | 2014-11-13 | 2015-03-04 | 百度在线网络技术(北京)有限公司 | Method and device for machine translation |
CN110134973A (en) * | 2019-04-12 | 2019-08-16 | 深圳壹账通智能科技有限公司 | Video caption real time translating method, medium and equipment based on artificial intelligence |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111368557A (en) * | 2020-03-06 | 2020-07-03 | 北京字节跳动网络技术有限公司 | Video content translation method, device, equipment and computer readable medium |
CN111901675A (en) * | 2020-07-13 | 2020-11-06 | 腾讯科技(深圳)有限公司 | Multimedia data playing method and device, computer equipment and storage medium |
CN111901675B (en) * | 2020-07-13 | 2021-09-21 | 腾讯科技(深圳)有限公司 | Multimedia data playing method and device, computer equipment and storage medium |
CN112511910A (en) * | 2020-11-23 | 2021-03-16 | 浪潮天元通信信息系统有限公司 | Real-time subtitle processing method and device |
CN112995749A (en) * | 2021-02-07 | 2021-06-18 | 北京字节跳动网络技术有限公司 | Method, device and equipment for processing video subtitles and storage medium |
CN112995749B (en) * | 2021-02-07 | 2023-05-26 | 北京字节跳动网络技术有限公司 | Video subtitle processing method, device, equipment and storage medium |
CN113038184A (en) * | 2021-03-01 | 2021-06-25 | 北京百度网讯科技有限公司 | Data processing method, device, equipment and storage medium |
CN113438542A (en) * | 2021-05-28 | 2021-09-24 | 北京智慧星光信息技术有限公司 | Subtitle real-time translation method, system, electronic equipment and storage medium |
CN113438542B (en) * | 2021-05-28 | 2022-11-08 | 北京智慧星光信息技术有限公司 | Subtitle real-time translation method, system, electronic equipment and storage medium |
CN113660432A (en) * | 2021-08-17 | 2021-11-16 | 安徽听见科技有限公司 | Translation subtitle production method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110516266A (en) | Video caption automatic translating method, device, storage medium and computer equipment | |
KR101990023B1 (en) | Method for chunk-unit separation rule and display automated key word to develop foreign language studying, and system thereof | |
CN110517689B (en) | Voice data processing method, device and storage medium | |
EP4099709A1 (en) | Data processing method and apparatus, device, and readable storage medium | |
CN114465737B (en) | Data processing method and device, computer equipment and storage medium | |
CN109256133A (en) | A kind of voice interactive method, device, equipment and storage medium | |
CN111488487B (en) | Advertisement detection method and detection system for all-media data | |
CN112399258A (en) | Live playback video generation playing method and device, storage medium and electronic equipment | |
CN113392273A (en) | Video playing method and device, computer equipment and storage medium | |
CN113035199A (en) | Audio processing method, device, equipment and readable storage medium | |
CN111526405A (en) | Media material processing method, device, equipment, server and storage medium | |
CN113703579B (en) | Data processing method, device, electronic equipment and storage medium | |
CN111460094A (en) | Method and device for optimizing audio splicing based on TTS (text to speech) | |
Ham et al. | Ksl-guide: A large-scale korean sign language dataset including interrogative sentences for guiding the deaf and hard-of-hearing | |
WO2021169825A1 (en) | Speech synthesis method and apparatus, device and storage medium | |
CN116567351B (en) | Video processing method, device, equipment and medium | |
US20230326369A1 (en) | Method and apparatus for generating sign language video, computer device, and storage medium | |
CN115690280B (en) | Three-dimensional image pronunciation mouth shape simulation method | |
CN116366872A (en) | Live broadcast method, device and system based on man and artificial intelligence | |
CN109905756A (en) | TV subtitling dynamic creation method and relevant device based on artificial intelligence | |
CN113033357B (en) | Subtitle adjusting method and device based on mouth shape characteristics | |
CN113761865A (en) | Sound and text realignment and information presentation method and device, electronic equipment and storage medium | |
Xu et al. | Gabor based lipreading with a new audiovisual mandarin corpus | |
Virkkunen | Automatic speech recognition for the hearing impaired in an augmented reality application | |
CN111160051A (en) | Data processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||