CN103297710B

CN103297710B - Audio-video recording and broadcasting device that automatically annotates Chinese speech with Chinese and foreign-language subtitles in real time

Info

Publication number: CN103297710B
Application number: CN201310243550.1A
Authority: CN
Inventors: Inventor not disclosed
Original assignee: Qinghai Hanla Information Technology Co Ltd
Current assignee: QINGHAI HANLA INFORMATION TECHNOLOGY CO., LTD.
Priority date: 2013-06-19
Filing date: 2013-06-19
Publication date: 2016-08-17
Anticipated expiration: 2033-06-19
Also published as: CN103297710A

Abstract

This technical solution is an audio-video recording and broadcasting device that automatically annotates Chinese speech with Chinese and foreign-language subtitles in real time. It belongs to the technical field of voice and image data processing devices. The device comprises: a microphone and camera module (1), an audio-video synchronization signal marking module (2), a spoken-language audio signal extraction module (3), a Chinese speech recognition module (4), a Chinese-to-foreign-language machine translation module (5), a subtitle overlay module for video pictures or image frames (6), an audio/video encoding and compression module (7), a network transmission module A (8), a server module with audio/video decoding and decompression software (9), a network transmission module B (10), and a client module with audio-video playback software (11). With this technical solution, the Chinese language can spread more widely and effectively throughout the world, promoting exchange between Chinese culture and the cultures of the world.

Description

Audio-video recording and broadcasting device that automatically annotates Chinese speech with Chinese and foreign-language subtitles in real time

Technical field

This technical solution belongs to the technical field of voice and image data processing devices.

Background technology

At present, when Chinese-character subtitles, foreign-language subtitles, or side-by-side comparison subtitles are overlaid on Chinese-language audio-video material, the Chinese speech in the material is generally converted into Chinese characters or a foreign language by hand. The result is then passed to a subtitle overlay machine, which superimposes the Chinese-character or foreign-language subtitles expressing the Chinese meaning on the video pictures or image frames. A large amount of real-time and non-real-time Chinese audio-video material exists, including television recordings and films, so manual conversion alone is very time-consuming and makes real-time transmission difficult. With the emergence of digital audio-video technology, and especially of computer systems for processing video image data, there is a growing need for a technology that automatically converts Chinese-speech audio-video material, in real time, into material that carries the Chinese speech together with Chinese and foreign-language subtitles. Such a technology should run not only on computer systems that have a Chinese-character system, but also on Western computer systems, typified by those of the United States, that have no Chinese-character system and offer only the 128-character ASCII code. This would meet the needs created by the increasingly widespread use of the World Wide Web, the emergence of cloud computing and the Internet of Things, the worldwide enthusiasm for learning Chinese, and increasingly frequent cultural exchange between China and the West.

Summary of the invention

This technical solution is proposed to solve the problems described above. Specifically, it solves them with the following audio-video recording and broadcasting device, which automatically annotates Chinese speech with Chinese and foreign-language subtitles in real time:

The audio-video recording and broadcasting device used in this technical solution comprises: a microphone and camera module 1, an audio-video synchronization signal marking module 2, a spoken-language audio signal extraction module 3, a Chinese speech recognition module 4, a Chinese-to-foreign-language machine translation module 5, a subtitle overlay module 6 for video pictures or image frames, an audio/video encoding and compression module 7, a network transmission module A8, a server module 9 with audio/video decoding and decompression software, a network transmission module B10, and a client module 11 with audio-video playback software.

During live, real-time audio-video recording and broadcasting of Chinese speech, the device works in the following steps. Through the microphone and camera module 1, the device records the Chinese speech and the live scene and stores them in its system. The computer in the system first uses the audio-video synchronization signal marking module 2 to apply synchronization markers that tie the video pictures or image frames produced by the camera to the corresponding Chinese spoken-language audio signal recorded by the microphone, and stores the result in the storage system of the device. The spoken-language audio signal extraction module 3 then extracts the marked Chinese audio signal and passes it to the Chinese speech recognition module 4 in the computer. Module 4 recognizes the Chinese speech as Chinese phonetic code, written in the 26 Latin letters and carrying the same synchronization marker as the recognized speech. The Chinese-to-foreign-language machine translation module 5 then translates the phonetic code into a sentence of the specified foreign language, also written in the 26 Latin letters and carrying the same synchronization marker as the corresponding phonetic-code sentence.

The marked phonetic-code subtitles, foreign-language subtitles, or side-by-side comparison subtitles are then sent to an existing subtitle overlay module 6 for video pictures or image frames, which superimposes the subtitle information on the video pictures or image frames according to the correspondence between the subtitle markers and the frame markers. The audio/video encoding and compression module 7 encodes and compresses the result and passes it to the network transmission module A8, which sends the encoded and compressed video pictures or image frames, with their Chinese speech and their Chinese and foreign-language subtitles bearing the same synchronization markers, to a broadband network. The broadband network delivers them to the designated server module 9 with audio/video decoding and decompression software, where they are stored. The client module 11 with audio-video playback software logs in to the server module 9 through the network transmission module B10 and can then watch, in real time, the live video with Chinese and foreign-language subtitles and Chinese speech.

The Chinese speech recognition module 4 and the Chinese-to-foreign-language machine translation module 5 each embed a bidirectional conversion module between Chinese characters, Hanyu Pinyin, and the Chinese phonetic code.

The network transmission module A8 and the network transmission module B10 are each either a wired network transmission module or a wireless network transmission module. With a wired network transmission module the broadband network is a wired broadband network; with a wireless network transmission module it is a wireless broadband network.

The wireless network transmission module is any one of 3G, 4G, Wi-Fi, WiMAX, or Bluetooth.

On a computer with a Chinese-character system, the Chinese phonetic code can be converted into Chinese characters or Hanyu Pinyin by the bidirectional conversion module between Chinese characters, Pinyin, and phonetic code. The Chinese characters, the phonetic code, and the Pinyin can each be displayed, stored, and output on their own, or the phonetic code can be shown side by side with the Chinese characters, Pinyin, and foreign-language text of the same meaning.

The Chinese phonetic code takes the word as its unit; a single Chinese character is treated here as a monosyllabic word. For each syllable of a word, following its spelling in the Scheme for the Chinese Phonetic Alphabet (Hanyu Pinyin), the initial, medial, final, and tone are first encoded using only the 26 Latin letters, and the codes are then written in the order "initial code + medial code + rhyme code + tone code (which doubles as the syllable separator)". The resulting phonetic code expresses Chinese information directly, and when it does so its punctuation follows English usage. The syllables of one word are written together without spaces, and words are separated by spaces.

Because this technical solution expresses Chinese information in a Chinese phonetic code written with the 26 Latin letters, and because punctuation follows English usage when the phonetic code expresses Chinese information directly, the representation of Chinese information, punctuation included, is fully consistent with ASCII, that is, 100% ASCII-compatible. The Chinese speech recognition module, the machine translation module, and the speech synthesis module all process Chinese information represented in this ASCII-consistent phonetic code, so they can run on a computer with an ASCII code system. Since every module that makes up the system can run on such a computer, so can the system as a whole.
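The ASCII-compatibility claim can be checked with a minimal Python sketch. The variable names are illustrative and not part of the patent; the sentence is the phonetic-code sentence from the worked example later in this description.

```python
# Phonetic-code sentence taken from the worked example in this description.
code_sentence = "wovmno mwvtisa xrvydu laadqawnv."

# Encoding with the strict "ascii" codec raises UnicodeEncodeError if any
# character falls outside the 128-character ASCII set.
ascii_bytes = code_sentence.encode("ascii")
round_trip = ascii_bytes.decode("ascii")

# One byte per character, and every byte value is below 128.
all_below_128 = all(b < 128 for b in ascii_bytes)
```

The same check applied to a sentence written in Chinese characters or in tone-marked Pinyin would raise an error at the `encode` step, which is the practical difference the paragraph above describes.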

With this technical solution, Chinese information can be transmitted and processed without obstruction in computer information systems that use either a Chinese-character internal code or the ASCII internal code of non-Chinese-character systems. As the World Wide Web is used ever more widely and as cloud computing, the Internet of Things, and worldwide enthusiasm for learning Chinese emerge, this greatly facilitates the mutual viewing and exchange of real-time and non-real-time audio-video material between China and the countries of the world, typified by the English-speaking ones. In particular, foreign viewers can conveniently watch Chinese news through China's real-time video data while learning spoken Chinese, Chinese characters, Hanyu Pinyin, and the Chinese phonetic code. The Chinese language can thus spread more widely and effectively throughout the world, promoting exchange between Chinese culture and the cultures of the world.

Brief description of the drawing

The accompanying drawing is a schematic diagram of the system of the audio-video recording and broadcasting device of the present invention, which automatically annotates Chinese speech with Chinese and foreign-language subtitles in real time.

Detailed description of the invention

Specific embodiments of the present invention are further described below with reference to an example.

(1) The initial, final, and tone of each syllable of the Chinese phonetic code used in this technical solution are encoded as follows:

Note: in each entry, the symbol after the dash "—" is the Hanyu Pinyin symbol, and the letter before the dash is the code used for the corresponding initial, final, or tone. This applies throughout, and the tables below are referred to as the code table.

It is worth noting that when Chinese phonetic code and the punctuation of phonetic-code sentences are typed on a keyboard, the 26 Latin letters that make up the code correspond exactly to the 26 letter keys of a Western QWERTY keyboard, and the punctuation marks correspond exactly to its punctuation keys. To enter phonetic-code letters and punctuation, one simply presses the corresponding keys of the Western QWERTY keyboard.

1. The initial code uses letters that are largely the same as the initials of the Scheme for the Chinese Phonetic Alphabet, for example in the following form:

b—b; p—p; m—m; f—f; d—d; t—t;

n—n; l—l; g—g; k—k; h—h;

j—zh, j; q—ch, q; x—sh, x; r—r;

z—z; c—c; s—s; y—y; w—w.

2. The Pinyin medial ü is represented by one of the 26 Latin letters; the medial code takes, for example, the following form:

i—i; u—u; y—ü.

3. Rhyme code: except for the simple final ü, which is represented by one of the 26 Latin letters, the simple finals use the same letters as Hanyu Pinyin. The compound finals may keep the same form as in the Scheme for the Chinese Phonetic Alphabet, or may each be encoded with a single consonant letter, for example as follows:

a—a; o—o; e—e; i—i; u—u; y—ü;

k—ao; c—ai; s—an; x—ou; w—ei; n—en;

z—ua; l—uo; b—ang; d—ong; p—eng;

q—ing; g—ng; er—er;

r—i: the code r represents the Pinyin i only when i is spelled with the Pinyin initials zh, ch, or sh. That is, the Pinyin syllables zhi, chi, and shi are written jr, qr, and xr in the phonetic code. To type jr, qr, xr, or er on a keyboard, press the two keys J and R, Q and R, X and R, or E and R respectively.

4. Tone code: the third tone (ˇ) of Hanyu Pinyin is represented by v, a consonant letter that Chinese does not otherwise use; the other tones are represented by vowel letters, for example as follows:

a—ˉ (first tone, high and level); e—ˊ (second tone, rising); v—ˇ (third tone); u—ˋ (fourth tone, falling);

o—neutral tone (Pinyin leaves the neutral tone unmarked).
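The code table can be exercised with a short Python sketch that decodes one syllable code back into Pinyin with a tone number. Several points here are assumptions rather than part of the patent text: the function name `decode_syllable`, the use of 5 for the neutral tone, and the rule that j, q, x are read as Pinyin j, q, x only before i or ü and as zh, ch, sh otherwise (the text lists both readings without stating how they are told apart). Zero-initial syllables and the two-letter code er are not handled, and the output is a plain concatenation of table entries rather than standard Pinyin orthography.

```python
# Tables transcribed from the code table above (code letter -> Pinyin).
INITIALS = set("bpmfdtnlgkhrzcsyw")          # written the same as in Pinyin
SHARED = {"j": "zh", "q": "ch", "x": "sh"}   # j, q, x also stand for zh, ch, sh
MEDIALS = {"i": "i", "u": "u", "y": "ü"}
RHYMES = {
    "a": "a", "o": "o", "e": "e", "i": "i", "u": "u", "y": "ü",
    "k": "ao", "c": "ai", "s": "an", "x": "ou", "w": "ei", "n": "en",
    "z": "ua", "l": "uo", "b": "ang", "d": "ong", "p": "eng",
    "q": "ing", "g": "ng", "r": "i",
}
TONES = {"a": 1, "e": 2, "v": 3, "u": 4, "o": 5}  # 5 = neutral (assumed numbering)


def decode_syllable(code):
    """Decode one syllable code, e.g. 'tisa' -> 'tian1'."""
    code = code.lower()
    # initial + [medial] + rhyme + tone gives 3 or 4 letters.
    if len(code) not in (3, 4) or code[-1] not in TONES:
        raise ValueError("not a complete syllable code: %r" % code)
    initial, body, tone = code[0], code[1:-1], code[-1]
    # One body letter is a rhyme; two letters are medial + rhyme.
    medial, rhyme = ("", body) if len(body) == 1 else (body[0], body[1])
    if (medial and medial not in MEDIALS) or rhyme not in RHYMES:
        raise ValueError("unknown medial or rhyme in %r" % code)
    if initial in SHARED:
        # Assumption: Pinyin j/q/x occur only before i or ü.
        front = (medial or rhyme) in ("i", "y")
        head = initial if front else SHARED[initial]
    elif initial in INITIALS:
        head = initial
    else:
        raise ValueError("unknown initial in %r" % code)
    return head + MEDIALS.get(medial, "") + RHYMES[rhyme] + str(TONES[tone])
```

For instance, `decode_syllable("xrv")` reads x as sh because the rhyme is r, and returns `shi3`.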

(2) Chinese information is represented with the phonetic code defined above as follows:

The word is the unit; a single Chinese character is treated here as a monosyllabic word. For each syllable of a word, following its spelling in the Scheme for the Chinese Phonetic Alphabet, the codes are written in the order "initial code + medial code + rhyme code + tone code (which doubles as the syllable separator)". The syllables of one word are written together without spaces, and the codes of different words are separated by spaces. When Chinese information is expressed in phonetic code, the six kinds of stop marks, the seven kinds of indicator marks, and the hyphen take the same forms as in English.

Because a Chinese character used on its own is treated as a monosyllabic word, Chinese characters are encoded in the same way as the syllables of Chinese words: the code of a word is obtained by writing its syllable codes together. A group of several words is called a phrase, and phrases are encoded in the same way as Chinese sentences. Since words make up both phrases and sentences, the codes of phrases and sentences can be built from the codes of words, and no separate dedicated code needs to be defined for them. In general, when a whole sentence or text expresses Chinese information in units of words, the reader does not need to choose between homophones to understand it; a sentence that is in principle unambiguous when heard is also unambiguous when expressed in the code.

The implementation steps of this technical solution are described below, taking as an example the speech of one Chinese sentence input through the microphone:

During live, real-time audio-video recording and broadcasting of Chinese speech, the device records the Chinese speech and the live scene through the microphone and camera module 1 and stores them in its system. The computer in the system first uses the audio-video synchronization signal marking module 2 to apply synchronization markers that tie the video pictures or image frames produced by the camera to the corresponding Chinese spoken-language audio signal recorded by the microphone, and stores the result in the storage system of the device. Module 2 can produce the synchronization markers with existing techniques for timestamp marking of video pictures or image frames and audio.

The spoken-language audio signal extraction module 3 then extracts the marked Chinese audio signal. The digital audio signal stored by the system can be extracted directly, or it can first be converted into an analog signal by a D/A converter and then extracted. A more primitive method is to play the audio signal through a loudspeaker and capture the resulting sound. Further methods are not listed one by one.

After extraction, the marked Chinese audio signal is passed to the Chinese speech recognition module 4 in the computer, which recognizes the Chinese speech as Chinese phonetic code, written in the 26 Latin letters and carrying the same synchronization marker as the recognized speech.

When the Chinese phonetic code speech recognition module 4 performs Chinese speech recognition, it uses the Chinese syllable as the recognition unit. It looks up the Chinese syllable sound templates and the Chinese syllable code comparison table stored in advance in the computer system and, after matching, identifies the corresponding syllable codes. Continuous speech input therefore yields a continuous sequence of syllable codes. This syllable code string is segmented into words by looking up a word dictionary. Where several segmentations are possible, the choice is made using Chinese lexical, syntactic, and contextual relations and statistical regularities. In the segmented result, the syllables of one word are written together and words are separated by spaces.

An example of recognizing Chinese speech as Chinese phonetic code with the method of the invention follows:

1. Converting Chinese speech into Chinese phonetic code:

For example, suppose the speech of the following Chinese sentence is extracted from the audio-video data:

"We use Latin every day."

(1) By looking up the Chinese syllable sound templates and the Chinese syllable code comparison table stored in advance in the computer system, the corresponding sequence of syllable codes is identified after matching:

Wov mno mwv tisa xrv ydu laa dqa wnv. (with spaces between syllables)

Or wovmnomwvtisaxrvydulaadqawnv. (without spaces between syllables)

(Once the reader is proficient, the neutral-tone symbol o, as in mno, may be omitted where this causes no confusion between syllables; the same applies throughout.)

To make them easy to see, the letters that represent tones are shown underlined here. The tone letter in the phonetic code also acts as the syllable separator. In actual phonetic code the tone letters are not underlined, and a reader familiar with the code can readily tell syllables apart by the tone letters.
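Because every syllable ends in a tone letter, an unspaced code string can be cut back into syllables mechanically. The following Python sketch uses a regular expression for this. The pattern and the name `split_syllables` are assumptions for illustration: the pattern requires an initial letter, so zero-initial syllables and the code er are not covered, omitted neutral-tone letters are not supported, and where a string could be cut in more than one way the regular expression simply takes its first match.

```python
import re

# initial + optional medial + one rhyme letter + tone letter (a, e, v, u, o)
SYLLABLE = re.compile(
    r"[bpmfdtnlgkhjqxrzcsyw][iuy]?[aoeiuykcsxwnzlbdpqgr][aevuo]"
)


def split_syllables(text):
    """Split an unspaced phonetic-code string into syllable codes."""
    text = text.lower().rstrip(".")
    syllables = SYLLABLE.findall(text)
    # Reject input that the pattern could not cover completely.
    if "".join(syllables) != text:
        raise ValueError("cannot split into syllables: %r" % text)
    return syllables
```

Applied to the unspaced string shown above, `split_syllables("wovmnomwvtisaxrvydulaadqawnv.")` yields the nine syllable codes of the spaced form.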

This completes a pure speech recognition process whose complexity is independent of the size of the system's dictionary.

(2) The syllable sequence is segmented into words, completing the conversion into phonetic code in units of words.

By looking up the word-segmented Chinese phonetic code dictionary stored in advance in the computer system, the syllables of each word are written together and words are separated by spaces, giving the phonetic code that is finally needed:

Wovmno mwvtisa xrvydu laadqawnv.
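The dictionary lookup in step (2) can be sketched in Python as follows. The patent only says that a stored word dictionary is consulted; forward maximum matching is a common segmentation strategy substituted here for illustration, and the four-entry dictionary, the three-syllable limit, and the name `segment` are all hypothetical.

```python
# Hypothetical miniature word dictionary holding the words of the example.
WORD_DICTIONARY = {"wovmno", "mwvtisa", "xrvydu", "laadqawnv"}
MAX_WORD_SYLLABLES = 3  # assumed upper bound on syllables per word


def segment(syllables, dictionary=WORD_DICTIONARY, max_len=MAX_WORD_SYLLABLES):
    """Group syllable codes into words by forward maximum matching."""
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = "".join(syllables[i:i + n])
            # A single syllable is accepted even if it is not in the dictionary.
            if candidate in dictionary or n == 1:
                words.append(candidate)
                i += n
                break
    return " ".join(words)
```

With the nine syllables of the example as input, the function returns `wovmno mwvtisa xrvydu laadqawnv`. A plain greedy matcher like this does not resolve competing segmentations; the contextual and statistical discrimination described in the text would replace the simple "longest match wins" rule.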

The Chinese-to-foreign-language machine translation module 5 then translates this phonetic code into a sentence of the specified foreign language, written in the 26 Latin letters and carrying the same synchronization marker as the corresponding phonetic-code sentence:

The Chinese-to-foreign-language machine translation module 5 is called to convert the Chinese information, represented in phonetic code, into a foreign language. English is used as the example here; other foreign languages are handled in the same way and are not illustrated one by one.

(Note: the Chinese characters shown alongside the phonetic code above and below are included only to make the meaning of the phonetic code easier to follow. They do not appear when the system actually runs on a pure ASCII system. The same applies throughout.)

For example, take the Chinese information obtained above, represented in phonetic code:

wovmno mwvtisa xrvydu laadqawnv.

Calling the Chinese-to-foreign-language machine translation module 5 on it gives the following conversion steps and, finally, the English sentence corresponding to the phonetic code:

1. wovmno mwvtisa xrvydu laadqawnv. (Chinese information represented in phonetic code)

We use Latin every day. (Chinese information represented in Chinese characters)

a) Look up the part-of-speech-tagged Chinese dictionary stored in advance in the computer system to build the part-of-speech string of the sentence (the part in parentheses is the part of speech; the same applies below):

wovmno (personal pronoun 1) + mwvtisa (time noun 1) + xrvydu (verb 1) + laadqawnv (noun 2).

we (personal pronoun 1) + every day (time noun 1) + use (verb 1) + Latin (noun 2).

b) Using the part-of-speech string obtained above, look up the table stored in advance in the computer system to obtain the Chinese sentence pattern stored in it:

(A sentence pattern is the string of parts of speech together with the sentence component that each word serves as; the same applies below.)

wovmno (personal pronoun 1, subject) + mwvtisa (time noun 1, time adverbial) + xrvydu (verb 1, predicate) + laadqawnv (noun 2, object)

we (personal pronoun 1, subject) + every day (time noun 1, time adverbial) + use (verb 1, predicate) + Latin (noun 2, object)

c) Using the Chinese sentence pattern obtained above, look up the table to obtain the corresponding English sentence pattern stored in it:

wovmno (personal pronoun 1, subject) + xrvydu (verb 1, predicate) + laadqawnv (noun 2, object) + mwvtisa (time noun 1, time adverbial)

we (personal pronoun 1, subject) + use (verb 1, predicate) + Latin (noun 2, object) + every day (time noun 1, time adverbial)

Looking up the Chinese-English dictionary stored in advance in the computer system to convert the meaning of each word or phrase, and outputting the results in the order of this sentence pattern, completes the translation from Chinese into English. To show that this machine translation process can be bidirectional, the following further conversions are made:

d) Using the English sentence pattern obtained above, look up the table to obtain the part-of-speech string that matches the parts of speech of the corresponding English words or phrases (this part-of-speech string can also be extracted from the target-language sentence pattern already obtained; the same applies below):

wovmno (personal pronoun 1) + xrvydu (verb 1) + laadqawnv (noun 2) + mwvtisa (time noun 1).

we (personal pronoun 1) + use (verb 1) + Latin (noun 2) + every day (time noun 1).

e) Look up the Chinese-English dictionary stored in advance in the computer system to convert the meaning of each word or phrase, and output the result in the order of the English sentence pattern obtained above:

we (personal pronoun 1) use (verb 1) Latin (noun 2) every day (time noun 1).

We use Latin every day.

This completes the conversion from Chinese into English.
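Steps a) to e) amount to a chain of table lookups, which the following Python sketch reproduces for the single example sentence. All four tables are hypothetical miniatures holding only what the example needs, the numeric suffixes on the parts of speech are dropped, and the names are illustrative; a real system would need far larger tables and a way to handle sentences that match no stored pattern.

```python
# a) Part-of-speech dictionary, keyed by phonetic-code word (hypothetical).
POS_DICTIONARY = {
    "wovmno": "personal pronoun",
    "mwvtisa": "time noun",
    "xrvydu": "verb",
    "laadqawnv": "noun",
}
# b) Chinese sentence patterns: part-of-speech string -> sentence components.
CHINESE_PATTERNS = {
    ("personal pronoun", "time noun", "verb", "noun"):
        ("subject", "time adverbial", "predicate", "object"),
}
# c) Corresponding English sentence patterns: the component order in English.
ENGLISH_PATTERNS = {
    ("subject", "time adverbial", "predicate", "object"):
        ("subject", "predicate", "object", "time adverbial"),
}
# e) Chinese-English word dictionary (hypothetical).
CHINESE_ENGLISH = {
    "wovmno": "we", "mwvtisa": "every day",
    "xrvydu": "use", "laadqawnv": "Latin",
}


def translate(code_sentence):
    """Translate one phonetic-code sentence into English by table lookup."""
    words = code_sentence.lower().rstrip(".").split()
    pos_string = tuple(POS_DICTIONARY[w] for w in words)      # step a
    components = CHINESE_PATTERNS[pos_string]                 # step b
    english_order = ENGLISH_PATTERNS[components]              # step c
    word_by_component = dict(zip(components, words))
    reordered = [word_by_component[c] for c in english_order]
    english = " ".join(CHINESE_ENGLISH[w] for w in reordered)  # step e
    return english[0].upper() + english[1:] + "."
```

Calling `translate("Wovmno mwvtisa xrvydu laadqawnv.")` moves the time adverbial to the end and returns `We use Latin every day.`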

Furthermore, once the phonetic code has been obtained, it can be converted when needed into Chinese characters or Hanyu Pinyin by the bidirectional conversion module between Chinese characters, Pinyin, and phonetic code. This conversion module can be embedded in the Chinese speech recognition module 4; in that case the whole system must run on a computer with a Chinese-character system. The phonetic code, the Chinese characters, and the Pinyin can each be displayed, stored, and output on their own, or the phonetic code can be shown side by side with Chinese characters, Pinyin, and foreign-language text of the same meaning. The detailed process is as follows:

The computer calls the bidirectional phonetic-code/Chinese-character conversion module and converts the phonetic code into Chinese characters by the following steps:

By looking up the word-level comparison tables between phonetic code and Chinese characters and between phonetic code and Pinyin, the phonetic code can easily be converted into Chinese characters and Pinyin. For example:

wovmno is looked up in the comparison table of initial, medial, rhyme, and tone codes against Hanyu Pinyin, or in a syllable-level or word-level comparison table generated from it, giving wǒmen; wǒmen is then used to find the word-level Chinese characters. Once the word-level phonetic code has been linked to the word-level Chinese characters through the word-level Pinyin, the phonetic code can from then on be mapped directly to the Chinese characters without going through Pinyin. For example, wovmno can be converted to wǒmen, and wǒmen to the Chinese characters for "we", so wovmno and "we" are directly linked. When needed, a reversible two-way conversion between wovmno and "we" can then be performed directly, without passing through the Pinyin wǒmen.
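The chaining of the two lookups into one direct, reversible link can be sketched in Python as below. The table holds a single hypothetical row; the Chinese characters 我们 are a reconstruction from the Pinyin wǒmen, since this text shows only the English gloss "we", and the variable names are illustrative.

```python
# One row per word: (phonetic code, Pinyin, Chinese characters).
# The characters are reconstructed from the Pinyin; this is an assumption.
WORD_TABLE = [("wovmno", "wǒmen", "我们")]

# The two lookups described in the text.
CODE_TO_PINYIN = {code: pinyin for code, pinyin, _ in WORD_TABLE}
PINYIN_TO_HANZI = {pinyin: hanzi for _, pinyin, hanzi in WORD_TABLE}

# Chain them once to get a direct link that skips Pinyin afterwards.
CODE_TO_HANZI = {
    code: PINYIN_TO_HANZI[pinyin] for code, pinyin in CODE_TO_PINYIN.items()
}
# Invert the direct link for the reverse direction.
HANZI_TO_CODE = {hanzi: code for code, hanzi in CODE_TO_HANZI.items()}
```

A one-to-one dictionary like this only works while each code maps to a single written word; homophones need a one-to-many table plus a selection step.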

When homophones are encountered, the word-level Chinese characters are selected after discrimination based on Chinese lexical, syntactic, and contextual relations and statistical regularities. For example: "ysvlune is loaded with mailbags." "ysvlune is loaded with crude oil." From the context, the "ysvlune" in the first sentence means a cruise ship and the "ysvlune" in the second means an oil tanker, so the two sentences are converted to "The cruise ship is loaded with mailbags" and "The oil tanker is loaded with crude oil" respectively. Other words are handled in the same way.
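One very simple form of contextual discrimination is to count how many cue words for each candidate appear in the sentence. The Python sketch below does this for the example; it is only an illustration of the idea, not the method of the patent. The cue sets are invented, the context is given as English glosses because this text shows only glosses (a real system would work on phonetic-code words), and ties fall to the first candidate listed.

```python
# Hypothetical homophone table: code word -> {sense: cue words for that sense}.
HOMOPHONES = {
    "ysvlune": {
        "cruise ship": {"mailbags", "passengers"},
        "oil tanker": {"crude", "oil"},
    },
}


def choose_sense(code_word, context_words):
    """Pick the sense whose cue words overlap most with the context."""
    candidates = HOMOPHONES[code_word]
    context = {w.lower().strip(".,") for w in context_words}
    return max(candidates, key=lambda sense: len(candidates[sense] & context))
```

For the two example sentences, `choose_sense("ysvlune", "ysvlune is loaded with mailbags.".split())` selects the cruise ship and the crude-oil sentence selects the oil tanker.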

The results of this bidirectional reversible conversion can be displayed individually or side by side, for example:

Original sentence: "We use Chinese characters and Latin every day." With the method of the invention, the computer can reversibly convert it into the following forms:

1.“Wǒmen měitiān shǐyòng lādīngwěn。”

2.“wovmno mwvtisa xrvydu laadqawnv.”

3.“Wǒmen měitiān shǐyòng lādīngwěn。”

We use Latin every day.

4.“wovmno mwvtisa xrvydu laadqawnv.”

We use Latin every day.

5. “Wǒmen měitiān shǐyòng lādīngwěn。”

“wovmno mwvtisa xrvydu laadqawnv.”

To make it easier for foreigners or members of China's ethnic minorities to understand the meaning of the Chinese and to learn Chinese, a corresponding foreign-language or minority-language word can also be inserted after each word in the comparison. For example, English words are added below as annotations of the Chinese meaning:

“wovmno Wǒmen mwvtisa měitiān xrvydu shǐyòng laadqawnv lādīngwěn 。”

we (We) every day (every day) use (use) Latin (Latin).

Accordingly, the Chinese subtitles mentioned above and below in this technical solution may be in Chinese phonetic code, Chinese characters, or Hanyu Pinyin.

The marked phonetic-code subtitles, foreign-language subtitles, or side-by-side comparison subtitles are then sent to a conventional subtitle overlay module 6 for video pictures or image frames. According to the correspondence between the subtitle markers and the synchronization markers of the video pictures or image frames, the subtitle information is superimposed on the video pictures or image frames, and the two are composited for storage or synchronized output.
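Matching subtitles to frames by their synchronization markers can be sketched in Python as follows, assuming the markers are timestamps (the description mentions timestamp marking as one existing technique). The millisecond fields, the `Subtitle` class, and the function name are assumptions for illustration; drawing the text onto the picture is left to the overlay module and is not shown.

```python
from dataclasses import dataclass


@dataclass
class Subtitle:
    start_ms: int  # marker of the first frame the subtitle belongs to
    end_ms: int    # marker at which the subtitle stops being shown
    text: str      # phonetic-code, foreign-language, or comparison text


def subtitle_for_frame(frame_ms, subtitles):
    """Return the subtitle text whose marker interval covers the frame."""
    for sub in subtitles:
        if sub.start_ms <= frame_ms < sub.end_ms:
            return sub.text
    return ""  # no subtitle is due for this frame
```

A frame stamped 1000 ms inside a subtitle marked 0 to 2000 ms receives that subtitle's text, while a frame at 2000 ms or later receives none.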

So we use said method to achieve Chinese speech real-time imaging data is transformed into Chinese speech and is filled The real-time imaging data of middle foreign language caption, in like manner can also use identical method to realize above process and knot other foreign language Really, the most tired state.

Finally and by the real-time imaging data of foreign language caption in Chinese speech obtained above filling through audio frequency and video Compression coding module 7 encodes and compresses, and is transmitted further to network transmission module A8 after above-mentioned coding and compression, then by network Transport module A8 is by foreign language caption and Chinese speech in the above-mentioned band with identical synchronizing signal labelling after coding and compression Video pictures or image frame are transferred to broadband network, and the band audio/video decoding that broadband network is transmitted to specify decompresses soft Storing on part server module 9, the client modules 11 of band audio frequency and video phonotape and videotape playout software is by network transmission module B10 Log on above-mentioned band audio/video decoding PKUNZIP server module 9 and just can watch foreign language during above-mentioned scene carries in real time in real time The video image data picture of captions and Chinese speech, so we just complete real-time Chinese speech by the equipment of this technology Phonotape and videotape is converted into the recorded broadcast process of the audio and video files of foreign language datum in real-time Chinese speech filling.

By analogy, the above method can also convert Chinese into other foreign languages and their corresponding subtitles, superimpose these subtitles on the synchronized video pictures or image frames for storage, and, through the network transmission, server and client described above, allow the converted audio-video pictures with Chinese speech and Chinese and foreign-language subtitles to be watched in real time. If required, these audio-video files can also be downloaded from the server and converted into various formats convenient for playback by television stations or multimedia players.

Finally, it is worth noting that the machine translation module 5 for translating Chinese into a foreign language may be a bidirectional, reversible Chinese and foreign-language machine translation module that uses the Chinese phonetic code, and that both kinds of machine translation module may embed a bidirectional conversion module among Chinese characters, Pinyin and the Chinese phonetic code.

The wireless network transmission module is any one of 3G, 4G, WiFi, WiMAX and Bluetooth. Since the above networks are all prior art, specific examples are not described one by one.

Claims

1. An audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles, characterized in that it comprises: a microphone and camera module (1), an audio-video synchronization signal marking module (2), a speech audio signal extraction module (3), a Chinese speech recognition module (4), a machine translation module (5) for translating Chinese into a foreign language, a video picture or image frame subtitle superimposing module (6), an audio/video encoding and compression module (7), network transmission module A (8), a server module (9) with audio/video decoding and decompression software, network transmission module B (10), and a client module (11) with audio/video playback software;

the device operates in the following sequence: when Chinese-speech audio and video is recorded and broadcast live in real time, the recording and broadcasting device records the Chinese speech and the live scene through the microphone and camera module (1) and stores them in the system of the device; the computer in the system first uses the audio-video synchronization signal marking module (2) to apply synchronization signal marks to the video pictures or image frames in the image data produced by the camera and to the corresponding Chinese speech audio signal recorded by the microphone, and stores them in the storage system of the device; the Chinese speech audio signal carrying the synchronization signal marks is then extracted by the speech audio signal extraction module (3) and passed to the Chinese speech recognition module (4) in the computer; the Chinese speech recognition module (4) recognizes the Chinese speech as Chinese phonetic code, which is represented with the 26 Latin letters and carries the same synchronization signal marks as the recognized Chinese speech; the machine translation module (5) for translating Chinese into a foreign language then translates the Chinese phonetic code into sentences of a designated foreign language, which are represented with the 26 Latin letters and carry the same synchronization signal marks as the corresponding Chinese phonetic code sentences; the Chinese phonetic code subtitles, the foreign-language subtitles, or parallel subtitles combining the two, each carrying the synchronization signal marks, are then transmitted to the existing video picture or image frame subtitle superimposing module (6), which superimposes the subtitle information on the video pictures or image frames according to the correspondence between the synchronization signal marks of the subtitles and those of the video pictures or image frames; the result is encoded and compressed by the audio/video encoding and compression module (7) and then passed to network transmission module A (8), which transmits the encoded and compressed video pictures or image frames, carrying the Chinese and foreign-language subtitles and the Chinese speech with identical synchronization signal marks, to a broadband network; the broadband network delivers them to the designated server module (9) with audio/video decoding and decompression software, where they are stored; and the client module (11) with audio/video playback software logs on to the server module (9) through network transmission module B (10) and can then watch, in real time, the live video pictures with Chinese and foreign-language subtitles and Chinese speech.

2. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 1, characterized in that: the Chinese speech recognition module (4) and the machine translation module (5) for translating Chinese into a foreign language each embed a bidirectional conversion module among Chinese characters, Pinyin and the Chinese phonetic code.

3. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 1, characterized in that: network transmission module A (8) or network transmission module B (10) is either a wired network transmission module or a wireless network transmission module; when a wired network transmission module is used, the broadband network is a wired broadband network, and when a wireless network transmission module is used, the broadband network is a wireless broadband network.

4. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 3, characterized in that: the wireless network transmission module is any one of 3G, 4G, WiFi, WiMAX and Bluetooth.

5. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 1, characterized in that: the Chinese phonetic code takes the word as its unit, a single Chinese character being regarded here as one syllable; for each syllable making up the word, based on its spelling in the "Scheme for the Chinese Phonetic Alphabet" and using only the 26 Latin letters, the initial, medial, final and tone of the Pinyin syllable are encoded in turn and spelled in the order "initial code + medial code + final code + tone code, which also serves as the syllable separator"; the resulting phonetic code directly expresses the Chinese information; when the syllable codes of words are used directly to express Chinese information, punctuation marks are used in the same way as in English; and, during encoding, the syllables of the same word are encoded continuously without spaces, while words are separated from one another by spaces.

6. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 5, characterized in that: all initials of the Chinese phonetic code are represented by consonant Latin letters; among the initials of the phonetic code used to express Chinese information, the initials zh, ch and sh of the "Scheme for the Chinese Phonetic Alphabet" are represented by the three consonant Latin letters j, q and x respectively, and the remaining initials are represented by the same consonant Latin letters as in the "Scheme for the Chinese Phonetic Alphabet"; zhi, chi and shi of the "Scheme for the Chinese Phonetic Alphabet" are represented by jr, qr and xr of the phonetic code respectively, and er of the "Scheme for the Chinese Phonetic Alphabet" is represented by er of the phonetic code; and, for keyboard input, jr, qr, xr or er is entered by pressing the two keys J and R, Q and R, X and R, or E and R respectively.

7. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 5, characterized in that: the Chinese phonetic code uses the letter y, one of the 26 letters, to represent ü among the simple finals and medials of the original "Scheme for the Chinese Phonetic Alphabet", and the remaining simple finals and medials are encoded with the same symbols as the simple finals and medials of the "Scheme for the Chinese Phonetic Alphabet".

8. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 5, characterized in that: the compound finals of the Chinese phonetic code, in addition to being represented by the same symbols as in the "Scheme for the Chinese Phonetic Alphabet", are represented by a consonant letter.

9. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 5, characterized in that: the tone codes of the Chinese phonetic code are represented by four vowel letters and the letter v, which Pinyin does not use; the Latin letters a, e, v, u and o respectively represent the tones of the "Scheme for the Chinese Phonetic Alphabet" as follows: a for the first (high level) tone, marked -; e for the second (rising) tone, marked /; v for the third (falling-rising) tone, marked ∨; u for the fourth (falling) tone, marked \; and o for the neutral tone, which is not marked.

10. The audio-video recording and broadcasting device for real-time annotation of Chinese speech with Chinese and foreign-language subtitles according to claim 2, characterized in that: in a computer with a Chinese character system, the Chinese phonetic code can be converted by the bidirectional conversion module among Chinese characters, Pinyin and the Chinese phonetic code into Chinese characters, Chinese phonetic code or Pinyin, and the Chinese characters can be displayed, stored and output either alone or in parallel with the Chinese phonetic code, Pinyin and a foreign-language text of the same meaning.
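The encoding rules recited in claims 5 to 7 and 9 can be illustrated with a short sketch. It is not the implementation of the conversion module, only one possible reading of those claims, under these assumptions: the input is tone-numbered Pinyin (for example `zhong1`), which is a convention chosen here and not taken from the claims; the function names `encode_syllable` and `encode_sentence` are hypothetical; compound finals keep their Pinyin spelling, because the consonant alternative of claim 8 is not spelled out; and only an explicitly written ü is mapped to y, since the claims do not say how syllables such as ju, qu and xu, where Pinyin writes ü as u, are to be treated.

```python
import re
from typing import List

# Tone number -> tone letter per claim 9; a missing digit or 5 is the neutral tone.
TONE_CODE = {"1": "a", "2": "e", "3": "v", "4": "u", "5": "o", "": "o"}
# Initials zh, ch, sh are written j, q, x per claim 6.
INITIAL_CODE = {"zh": "j", "ch": "q", "sh": "x"}
# Whole syllables that claim 6 gives a dedicated two-letter code.
SYLLABLE_CODE = {"zhi": "jr", "chi": "qr", "shi": "xr", "er": "er"}

_SYLLABLE = re.compile(r"([a-zü]+)([1-5]?)")


def encode_syllable(syllable: str) -> str:
    """Encode one tone-numbered Pinyin syllable, e.g. 'zhong1'."""
    match = _SYLLABLE.fullmatch(syllable.lower())
    if match is None:
        raise ValueError(f"not a tone-numbered Pinyin syllable: {syllable!r}")
    body, tone = match.groups()
    if body in SYLLABLE_CODE:
        code = SYLLABLE_CODE[body]
    else:
        if body[:2] in INITIAL_CODE:
            body = INITIAL_CODE[body[:2]] + body[2:]
        code = body.replace("ü", "y")  # claim 7: ü is written y
    # The tone letter closes the syllable and doubles as its separator.
    return code + TONE_CODE[tone]


def encode_sentence(words: List[List[str]]) -> str:
    """Join the syllables of a word without spaces; separate words by spaces."""
    return " ".join("".join(encode_syllable(s) for s in word) for word in words)
```

For example, `encode_sentence([["zhong1", "guo2"], ["ren2"]])` returns `"jongaguoe rene"` in this sketch: the two syllables of the first word run together, each closed by its tone letter, and a space separates the two words.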